Chapter 12
From The SDI Cookbook
Chapter 12: Terminology
Editor: Greg Yetman, CIESIN, Columbia University gyetman {[at]} ciesin.columbia.edu
INTRODUCTION
‘If we are to understand each other, we must comprehend a common language.’
The truth in this statement would be apparent to anyone who has visited a foreign country for the first time. The initial encounter with an unfamiliar national language can be a bewildering and threatening experience. The sudden inability to communicate effectively quickly frustrates even the simplest tasks and pleasures. A single burning question repeatedly goes through your mind: ‘Why didn’t I take those language lessons before I left?’
Clearly a common language is an essential prerequisite to effective communication between any two people or cultures. However, a simple knowledge of a language’s vocabulary is not sufficient to guarantee effective communication. A word can have several meanings depending on the context in which it is used. Similarly, a concept can be referenced by several words, each communicating a different connotation or level of severity.
A comprehension of a language’s subtleties and nuances is therefore needed if it is to be used effectively and unambiguously. The use of the wrong word can offend or mislead, leading to the classic ‘failure to communicate’. This in turn can cause misunderstanding, dysfunctional outcomes and even hostility. The precise use and comprehension of words by both communicating parties is vital.
The issues associated with the correct use of a language can extend far beyond day-to-day communication. Every field of endeavor, from engineering to cookery, has its own language and vocabulary. In order to participate in discussions on the subject, it is necessary to understand both the terms and the context in which they are to be used. The imprecise use of a technical or professional language (for example, by using two terms interchangeably when, in fact, they have distinctly different connotations) gives rise to the same traps and dangers associated with the inappropriate use of a spoken language.
The risks in failing to have a common understanding of both spoken and technical languages are therefore clear. However, such risks can compound considerably when it is necessary to translate a technical term from one language (for example, English) into a totally different language (for example, Mandarin Chinese). The different cultures, language structures and character sets give rise to some very real problems in ensuring that the term has precisely the same meaning in both languages. The issue becomes one of mapping the term in both languages to a clearly identified unique common concept. This, in turn, places considerable emphasis on the philosophy of concepts and the progressive decomposition of complex concepts into their base conceptual components.
The following paragraphs will consider the development and management of terminology in the field of Geographic Information. The discussion will consider the principles that are applied when selecting and defining concepts, terms and definitions, with particular emphasis on the requirements of the International Organization for Standardization. This will be followed by instances of terminology implementation in practice.
THE CONTEXT AND RATIONALE OF TERMINOLOGY
The development of terminology involves the simultaneous consideration of three inextricably linked processes, being:
The identification of a concept
The nomination of a term for that concept
The construction of a definition for that term that unambiguously describes the concept
The three processes are guided by the objective that, for each concept, there will be a single term (and vice versa) and for each term there will be a single definition (and vice versa).
From the outset it should be stated that it is not the objective of the terminology process to ‘reinvent the wheel’. There are terms and concepts that are found in general language dictionaries and have definitions that correspond to definitions in the field of geographic information. Similarly, there are terms and concepts that have already been defined in International Standards or can be found in similar documentation. These should be adopted whenever possible, avoiding the unnecessary proliferation or duplication of terms.
Quite often, however, there are instances where the definitions in general language dictionaries are insufficiently rigorous or concise to describe the concept. In such cases, it is appropriate to refine or adapt the concept, term and definition as appropriate.
Identification of Concepts
The identification of concepts is arguably the most important part of the terminology process. It is also the most complex and demanding part. The complexity stems from the fact that a concept rarely exists in isolation. It is very often built on a number of simpler concepts, giving rise to a hierarchical concept system.
Consider, for example, the concept of:
spatial referencing by coordinates,
which is
the description of position by means of 1-, 2- or 3-dimensional coordinates.
This is dependent on the concept of a:
coordinate reference system,
which is
a coordinate system which is related to the real world by a datum.
This, in turn, combines the concepts of:
coordinate system
which is
a set of mathematical rules for specifying how coordinates are to be assigned to points.
and
datum
which is
a set of parameters that defines the position of the origin, the scale and the orientation of the coordinate axes
Further decomposition of ‘coordinate system’ and ‘datum’ into component concepts is possible (for example, into ‘coordinate’, ‘origin’, ‘scale’, ‘axis’) as is aggregation into other more complex concepts (for example, ‘Cartesian coordinate system’, ‘compound coordinate reference system’).
A concept system, therefore, comprises a set of concepts that are distinct but closely related to each other. Each concept is capable of separate description and may also be capable of further decomposition. Collectively, however, they are components of a broader concept.
The concise decomposition and identification of concepts is an essential precursor to the allocation of terms and the articulation of definitions. The development of a concept system usually proceeds in a top-down fashion, starting with the identification of the broader concept (for example, spatial referencing by coordinates). The process of decomposition ceases when the concepts become so basic that they do not need to be defined.
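The top-down decomposition described above can be sketched in code. The following Python fragment is a minimal illustration only, not part of any standard. It records the concepts from the example as a mapping from each concept to its component concepts and walks the hierarchy from the broadest concept downwards. The names CONCEPT_SYSTEM, decompose and basic_concepts are hypothetical, and the assignment of ‘coordinate’, ‘origin’, ‘scale’ and ‘axis’ to particular parent concepts is an assumption made for illustration.

```python
# Hypothetical sketch: a concept system as a mapping from each concept to the
# simpler concepts it is built on. Which basic concept belongs under which
# parent is an assumption made for this illustration.
CONCEPT_SYSTEM = {
    "spatial referencing by coordinates": ["coordinate reference system"],
    "coordinate reference system": ["coordinate system", "datum"],
    "coordinate system": ["coordinate"],
    "datum": ["origin", "scale", "axis"],
}


def decompose(concept, system, depth=0):
    """Return (depth, concept) pairs in top-down order.

    Assumes the concept system contains no cycles.
    """
    rows = [(depth, concept)]
    for component in system.get(concept, []):
        rows.extend(decompose(component, system, depth + 1))
    return rows


def basic_concepts(concept, system):
    """Concepts at which decomposition stops (no components are listed)."""
    return sorted(
        {name for _, name in decompose(concept, system) if name not in system}
    )


# Print the hierarchy as an indented outline.
for depth, name in decompose("spatial referencing by coordinates", CONCEPT_SYSTEM):
    print("  " * depth + name)
```

In this sketch a concept with no entry in the mapping is treated as basic, mirroring the point at which decomposition ceases.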
Terms
The objective of the terminology process is to identify a single term for each concept. The term is referred to as the ‘Preferred Term’ and is adopted as being the primary descriptor for the given concept. Sometimes there may also be a shortened form of the Preferred Term, referred to as the Abbreviated Term. This is an equivalent but more convenient version of the term formed by omitting words or letters from the full name.
Three other classifications also need to be mentioned, being ‘Admitted Term’, ‘Deprecated Term’ and ‘Obsolete Term’. An ‘Admitted Term’ is a synonym for a preferred term. Typically such terms are national variants of the preferred term and should be identified as such in any register or dictionary.
A ‘Deprecated Term’ is one that has been judged undesirable for use in relation to a particular concept. An ‘Obsolete Term’ is one that is no longer in common use.
The selection of terms needs some care. A term should not be a trade name or the name of a research project. Similarly, it should not be a colloquial term (i.e. a local informal term used to describe a formal term).
To avoid ambiguity, there should be a single definition associated with each concept. It may be necessary to refine the terminology in some instances to ensure that its field of application is understood. Consider, for example, the term ‘object’ which has broad application in the information technology field. It is sometimes necessary to identify a specific type of object that is characterised by particular attributes, relationships or behaviour. In such cases, the term can be adapted to ensure that it is specific to the particular concept. In the case of ‘object’, two adaptations might be:
spatial object - object used for representing a spatial characteristic of a feature
and
geometric object - spatial object representing a geometric set
The realization of the one-to-one correspondence between concept, term and definition is not always immediately possible, particularly in instances when multiple terms have been used interchangeably for long periods of time. An example is provided by the terms geodetic height and ellipsoidal height. Both terms have the same definition (distance of a point from the ellipsoid measured along the perpendicular from the ellipsoid to this point, positive if upwards or outside of the ellipsoid). The two terms continue to be used interchangeably and there appears to be no consensus on which is preferred.
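Departures from the one-to-one correspondence can be detected mechanically once terms and definitions are held as data. The following Python fragment is a sketch under stated assumptions: the function name correspondence_problems is hypothetical, definitions are compared only after normalising case and whitespace, and the second entry in the sample data is included simply to show an unaffected term.

```python
from collections import defaultdict


def correspondence_problems(entries):
    """Check (term, definition) pairs for one-to-one correspondence.

    Returns two dictionaries: definitions shared by several terms, and
    terms that carry several definitions.
    """
    terms_by_definition = defaultdict(set)
    definitions_by_term = defaultdict(set)
    for term, definition in entries:
        key = " ".join(definition.lower().split())  # normalise case and spacing
        terms_by_definition[key].add(term)
        definitions_by_term[term].add(key)
    shared = {d: sorted(t) for d, t in terms_by_definition.items() if len(t) > 1}
    ambiguous = {t: sorted(d) for t, d in definitions_by_term.items() if len(d) > 1}
    return shared, ambiguous


HEIGHT_DEFINITION = (
    "distance of a point from the ellipsoid measured along the perpendicular "
    "from the ellipsoid to this point, positive if upwards or outside of the "
    "ellipsoid"
)

entries = [
    ("geodetic height", HEIGHT_DEFINITION),
    ("ellipsoidal height", HEIGHT_DEFINITION),
    ("dataset", "identifiable collection of data"),
]

shared, ambiguous = correspondence_problems(entries)
for definition, terms in shared.items():
    print("Terms sharing one definition:", ", ".join(terms))
```

A check of this kind only flags candidates. Deciding which term becomes preferred and which is admitted or deprecated remains a matter for consensus.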
Definitions
The role of a definition is to precisely describe the content of an identified concept. It should be as brief as possible, containing only that information that makes the concept unique. It should also focus on what the concept encapsulates rather than what it excludes. Thus the following definition for lexical language would be considered unsatisfactory.
language whose syntax is expressed in terms of symbols defined as character strings rather than letters from the Greek alphabet
Deleting the final seven words provides a more satisfactory outcome.
language whose syntax is expressed in terms of symbols defined as character strings
A definition should be neither too broad nor too narrow and should only describe a single concept. It may be complex, referring to other concepts (either basic or elsewhere defined) through their terms. However, it should not include the characteristics of other concepts as part of its text. Should this happen, then the decomposition process has not been undertaken correctly and must be reviewed. For example, consider the following proposed definition for data quality element.
quantitative component documenting the quality of an identifiable collection of data.
It does define the concept. However, it also describes a second concept through the words ‘identifiable collection of data’. This should be given its own term and definition, resulting in the following:
dataset - identifiable collection of data
data quality element - quantitative component documenting the quality of a dataset
The relationships between concepts should be evident in the structure of the definitions. In particular, the structures should reflect the connections between the concepts and the delimitations that distinguish them from each other. Consider the following terms and definitions:
conformance assessment process - process for assessing the conformance of an implementation to an International Standard
conformance clause - clause defining what is necessary in order to meet the requirements of the International Standard
conformance testing - testing of a product to determine the extent to which the product is a conforming implementation
conformance test report - summary of the conformance to the International Standard as well as all the details of the testing that supports the given overall summary
All four are concerned with quality assessment. Conformance assessment process is the top-level concept, being the process for assessing the conformance of an implementation to an International Standard. The other three terms identify distinct lower-level concepts that are incorporated into the process, being a statement of requirements, the test itself and the subsequent report. The relationships and structures are evident in the terms and associated definitions.
The validity of a definition can be tested through application of the substitution principle. This involves replacing the term by its definition in the body of a text in which it is used. If the substitution does not affect the meaning of the text, the definition is valid. If such is not the case, the definition needs to be reconsidered.
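The mechanical part of the substitution principle is easily illustrated. The Python fragment below is a minimal sketch: the function name substitute and the sample sentence are hypothetical, while the term and definition are those given earlier for dataset. The code performs only the replacement; judging whether the meaning of the text is preserved remains a task for the reader.

```python
import re


def substitute(text, term, definition):
    """Replace whole-word occurrences of `term` in `text` with `definition`."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", flags=re.IGNORECASE)
    # A function is used as the replacement so that any backslashes in the
    # definition are inserted literally rather than read as escape sequences.
    return pattern.sub(lambda _match: definition, text)


# Hypothetical sentence, invented for this illustration.
sentence = "Each dataset is described by a metadata record."
print(substitute(sentence, "dataset", "identifiable collection of data"))
# prints: Each identifiable collection of data is described by a metadata record.
```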
The substitution principle can be particularly useful for identifying instances of circularity in definitions. If one concept is defined using a second concept, and that second concept is defined using the term or elements of the term designating the first concept, the resulting definitions are said to be circular. Such instances do not clarify the understanding of the concepts involved and must be avoided.
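Circularity can also be searched for automatically when a vocabulary is held as data. The following Python fragment is a sketch under stated assumptions: the function names references and find_cycle are hypothetical, a definition is taken to use another concept only when that concept's full term appears in its text (so references through elements of a term are not detected), and the circular pair of definitions is invented purely for illustration.

```python
import re


def references(definitions):
    """Map each term to the other defined terms named in its definition."""
    refs = {}
    for term, text in definitions.items():
        refs[term] = [
            other
            for other in definitions
            if other != term
            and re.search(r"\b" + re.escape(other) + r"\b", text, re.IGNORECASE)
        ]
    return refs


def find_cycle(definitions):
    """Return one circular chain of terms as a list, or None if none exists."""
    refs = references(definitions)
    visiting, done = [], set()

    def visit(term):
        if term in done:
            return None
        if term in visiting:
            # The term was reached again while still being explored.
            return visiting[visiting.index(term):] + [term]
        visiting.append(term)
        for other in refs[term]:
            cycle = visit(other)
            if cycle:
                return cycle
        visiting.pop()
        done.add(term)
        return None

    for term in definitions:
        cycle = visit(term)
        if cycle:
            return cycle
    return None


# Definitions taken from the text above: no circularity.
sound = {
    "dataset": "identifiable collection of data",
    "data quality element": "quantitative component documenting the quality of a dataset",
}

# Invented definitions that refer to each other, for illustration only.
circular = {
    "datum": "parameters that fix a coordinate reference system",
    "coordinate reference system": "coordinate system related to the real world by a datum",
}

print(find_cycle(sound))
print(find_cycle(circular))
```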
THE ISO 19100-SERIES STANDARDS
The International Organization for Standardization, through its technical committee ISO/TC 211, is developing a family of International Standards for geographic information. The standards are collectively referred to as the ISO 19100-series. A member of the series, ISO 19104 Geographic Information – Terminology, will provide rules for writing definitions and for the structuring of terminology records. These are being applied in all other members of the series.
ISO 19104 defines twelve fields that may be included in a terminology record. Five of the fields are mandatory and must be included in all conforming implementations. The remainder may be excluded from profiles of the standard or simply not populated if it is appropriate to do so. The fields are as follows:
entry number [mandatory] – an arbitrary value implying no structure or hierarchy;
preferred term [mandatory] – the term to be associated with the concept;
abbreviated term – if preferred, the abbreviated term shall precede the full form, otherwise an abbreviated form shall follow the full form;
admitted term(s) – national variants shall be followed by a country code as defined in ISO 3166-2; the numeric 3-digit code is used for the IT interface (i.e. stored in the database), while the meaning of this code is presented in the human language used by the user (i.e. the human interface);
definition [mandatory] – if taken from another normative document, a reference shall be added in square brackets after the definition; or, if referring to another concept in the vocabulary, then that concept shall be named by its preferred term and presented in bold face characters;
deprecated or obsolete terms (in alphabetical order);
references to related entries;
examples of term usage;
notes – may be used to provide additional information, (if a definition has been adapted from a source, this may be explained in a note);
beginning date of the instance [mandatory];
terminological data type [mandatory];
ending date of the instance.
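The twelve fields listed above map naturally onto a simple record structure. The following Python fragment is a sketch only and is not drawn from ISO 19104 itself: the class name TerminologyRecord, the attribute names, the choice of data types, the sample values for the entry number, date and terminological data type, and the check that an ending date does not precede the beginning date are all assumptions made for illustration.

```python
from dataclasses import dataclass, field, fields
from datetime import date
from typing import List, Optional


@dataclass
class TerminologyRecord:
    # The five mandatory fields.
    entry_number: str
    preferred_term: str
    definition: str
    beginning_date: date
    terminological_data_type: str
    # The seven remaining fields, which may be left unpopulated.
    abbreviated_term: Optional[str] = None
    admitted_terms: List[str] = field(default_factory=list)
    deprecated_or_obsolete_terms: List[str] = field(default_factory=list)
    related_entries: List[str] = field(default_factory=list)
    examples: List[str] = field(default_factory=list)
    notes: List[str] = field(default_factory=list)
    ending_date: Optional[date] = None

    def __post_init__(self):
        for name in ("entry_number", "preferred_term", "definition",
                     "terminological_data_type"):
            if not str(getattr(self, name)).strip():
                raise ValueError(f"mandatory field {name!r} is empty")
        # Deprecated or obsolete terms are listed in alphabetical order.
        self.deprecated_or_obsolete_terms = sorted(
            self.deprecated_or_obsolete_terms, key=str.lower
        )
        # Assumption: an instance cannot end before it begins.
        if self.ending_date is not None and self.ending_date < self.beginning_date:
            raise ValueError("ending date precedes beginning date")


# Placeholder values for entry number, date and data type.
record = TerminologyRecord(
    entry_number="1",
    preferred_term="dataset",
    definition="identifiable collection of data",
    beginning_date=date(2002, 9, 1),
    terminological_data_type="term",
)
print(len(fields(TerminologyRecord)), "fields;", record.preferred_term)
```

Making the mandatory fields positional arguments without defaults means that a record cannot be constructed without them, while the optional fields default to empty values.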
ISO 19104 also makes allowance for the designation of term equivalents, these being the preferred, admitted and abbreviated terms in languages other than their definition language. Such equivalents shall be preceded by:
the numeric 3-digit country code as defined in ISO 3166-2 if needed; and
the 3-letter alphabetic Terminology (T) language code as defined in ISO 639-2 (e.g. "fra" for French, "deu" for German).
IMPLEMENTATION APPROACHES
Some Current Implementation Instances
The most commonly encountered approach to terminology implementation is the provision of a glossary of terms as part of a publication or through a web site. Typically the glossary will list the terms and definitions and may provide references to the sources of the definitions in some instances.
There are many examples of such listings (including the Glossary within this document). For example, the Digital Geographic Information Exchange Standard (DIGEST) version 2.1 includes a terminology listing in Part 1 of its documentation. Similarly, the Association for Geographic Information and the University of Edinburgh Department of Geography host an on-line dictionary of GIS terms. The dictionary includes definitions for 980 terms compiled from a variety of sources which either relate directly to GIS or which GIS users may come across in the course of their work. It includes definitions, references to related terms plus references and further reading. Searching can be done from an alphabetic list or through a search by category. A list of acronyms is included.
Clause 4 in each of the ISO 19100-series standards contains the terminology for concepts that are used or developed within that standard. The clauses are fully compliant with the provisions of ISO 19104 Geographic Information – Terminology. In addition, ISO/TC 211 have sponsored development of an on-line terminology repository that can be freely accessed via the Internet. The repository lists all terms, definitions, notes and examples included in the ISO 19100-series standards. It is an attempt to make the terminology as widely available as possible and thus promote the consistent use of terms and concepts.
Registries and the Need for Unique Identification
In the preceding sections, considerable emphasis has been placed on the principle that there should be a one-to-one relationship between a concept, its term and its definition. In the vast majority of instances where this is possible, it is tempting to consider the term to be the unique identifier for the concept. The term and the concept are both unique and are closely linked to each other. Why shouldn’t the term be considered to be a unique identifier?
In fact, there is no reason at all why this should not be the case provided the term never needs to be translated into a different language. If, however, translation is required, it then becomes necessary to ensure that the original and translated terms can both be unambiguously linked to the original concept. The use of a unique identifier that is associated with all translations of the term provides a mechanism for doing this. The original term provided through the authoring language is not suitable as the identifier.
At the time of writing, ISO/TC 211 is considering the issue of unique identification as part of its deliberations on Cultural and Linguistic Adaptability. In particular, it is considering the establishment of a terminology register in which all listed terms would have a unique registration identifier. A number of options for unique identification have been proposed, ranging from a sequential number based on the order of registration, through to more complex numbering schemes. The main consideration, however, is that the identifier be unique and that its association with its concept never change.
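The simplest of the options mentioned, a sequential number based on the order of registration, can be sketched as follows. The Python fragment is illustrative only and does not describe the register under consideration by ISO/TC 211: the class name TerminologyRegister and its methods are hypothetical, and the French and German equivalents shown are sample translations chosen for the example, not entries from any official register. Languages are identified by ISO 639-2 codes.

```python
from itertools import count


class TerminologyRegister:
    """Toy register: each concept receives a sequential identifier that is
    never reassigned, and all translations of its term share that identifier."""

    def __init__(self):
        self._next_id = count(1)
        self._terms = {}  # identifier -> {language code: term}

    def register(self, language, term):
        """Register a new concept under its term in the authoring language."""
        identifier = next(self._next_id)
        self._terms[identifier] = {language: term}
        return identifier

    def add_equivalent(self, identifier, language, term):
        """Attach a translated term to an existing concept identifier."""
        self._terms[identifier][language] = term

    def lookup(self, language, term):
        """Return the identifiers of concepts carrying `term` in `language`."""
        return [i for i, terms in self._terms.items()
                if terms.get(language) == term]

    def term_for(self, identifier, language):
        """Return the term for a concept in `language`, or None if absent."""
        return self._terms[identifier].get(language)


register = TerminologyRegister()
dataset_id = register.register("eng", "dataset")
# Sample translations, for illustration only.
register.add_equivalent(dataset_id, "fra", "jeu de données")
register.add_equivalent(dataset_id, "deu", "Datensatz")

# Both translations resolve to the same concept identifier.
print(register.lookup("fra", "jeu de données") == register.lookup("deu", "Datensatz"))
```

Because the identifier rather than the authoring-language term links the entries, a term can be revised or translated without disturbing its association with the concept.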
REFERENCES AND LINKAGES
ISO 704:2000, Terminology Work – Principles and Methods
ISO/TC 211 N 1320: Text for DIS 19104, Geographic Information – Terminology, as sent to ISO Central Secretariat for issuing as Draft International Standards, September 2002.
The Digital Geographic Information Exchange Standard (DIGEST), Edition 2.1, produced and issued by the Digital Geographic Information Working Group (DGIWG), September 2000.
