Many expert vocabularies have emerged from specific, narrowly bounded scientific fields such as medicine and botany. They aim to achieve precise understanding between experts in these fields, based on exact definitions of the terms used and, in their earliest examples, on the widespread use of Arabic or Latin as international scientific languages.
LandVoc is an example of what could be called a broad spectrum vocabulary. People involved with land governance may be qualified in any of a range of technical disciplines, from the natural to the social sciences, or in professions such as management and law. They include people with a variety of social and educational backgrounds. The work often requires a high degree of interdisciplinarity. The meanings of the concepts applied are often subtly different between disciplines and, as importantly, from one language to another. Quite obviously, there are significant differences between developing a vocabulary intended to serve similarly trained experts in a single speciality and developing one aimed at building bridges between the knowledge silos of a heterogeneous community. This is a challenge not simply for LandVoc but for any interdisciplinary or ‘cross-boundary’ description of data and information.
Expert vocabularies are often known as ‘controlled’ vocabularies. ‘Control’ can imply the authority to impose a structure and definition on the whole community. This, we would argue, is unlikely to work and might well exacerbate existing divisions rather than heal them. ‘Control’ can also mean to order and/or to manage. This, we believe, is a more productive approach to the challenges of heterogeneous contexts. It also links to the idea of participatory linguistic mapping as a process for exploring and recording differences, so that other interpretations are not lost or suppressed. Such a process not only offers a means of continuous improvement to the vocabulary but also offers a route to researching differences of meaning, which may contribute directly to a better understanding of the underlying subjects. In this blog we therefore explore the potential of this understanding of ‘control’ to achieve a functioning ‘controlled’ vocabulary, using LandVoc as an example.
The Land Portal, which initiated LandVoc, understands its work as contributing to a healthy information ecology from which all stakeholders in land governance can benefit and to which all contribute. An ecological approach includes an understanding of overlapping fields. Hence a recent report on the ‘State of Land Information in South Africa’ found that, depending on the role of the relevant institutions, data on land governance sat alongside that on the marine environment, climate change, urbanisation or a host of other important issues. There are also issues of vertical positioning. LandVoc deliberately places itself as a subset of AGROVOC, which covers a far wider range of rural development issues. It also needs to be aware of (and link clearly to) more specialist independent tools, such as the Cadastre and Land Administration Thesaurus (CaLAThe) or FAOLEX (a legislative and policy database, maintained by FAO), which lie within its own field of interest.
‘Concepts’ or ‘terms’ are at the heart of every controlled vocabulary. If the vocabulary is to work, they have to be used consistently. What is ‘controlled’ is usually their relationships with each other and, more problematically in our case, their definitions.
Every vocabulary needs a structure which is logically appropriate to how it will be used. Often, this consists of branching concept trees in which a few main concepts each have further levels of subdivision arranged in hierarchical order. Vocabularies covering more complex, real-life contexts may need to be less simple. The SKOS standard for the semantic web in fact allows considerable flexibility as to the relationships between concepts. AGROVOC’s encouragement of sub-schemas combines a wish to see some fields developed in greater depth with an acknowledgement that this may lead to hierarchies and other relationships between concepts that differ from those it uses itself.
To give an example, LandVoc relies on both vertical and horizontal relationships between concepts. It has a number of high-level concepts, such as land administration and land rights, which map fairly cleanly onto well-defined, core areas of land governance work, such as land measurement, law and policy. Some concepts in these areas can apply to more than one branch, or be closely related to concepts in other branches, and these relationships can be recorded. For the most part, however, within these defined areas of work the relationships are vertical, with higher-level concepts constructed from component parts. The high-level concept of ‘Land Equity’ requires a different approach. It does have some hierarchical elements, in that it is constituted from attention to the specific issues of gender, the inclusion of indigenous people in the discourse and, as our consultations suggested, concerns about ethnicity, migration and LGBT experiences. However, the crux of these issues is usually how they are experienced within the contexts of the other high-level concepts. How are they addressed in policy or in law? To show this, the vocabulary needs to give serious attention to its horizontal structure.
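The difference between vertical and horizontal relationships can be made concrete with a small sketch. What follows is only an illustration in Python, not LandVoc's actual data model: the `Concept` and `Vocabulary` classes are hypothetical, and the particular links drawn between concepts are assumptions chosen to mirror the example above. The `broader` and `related` fields loosely follow the SKOS properties `skos:broader` and `skos:related`.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    """A vocabulary concept with SKOS-style links (hypothetical model)."""
    label: str
    broader: set[str] = field(default_factory=set)  # vertical: parent concepts
    related: set[str] = field(default_factory=set)  # horizontal: associated concepts


class Vocabulary:
    """A minimal in-memory vocabulary keyed by concept label."""

    def __init__(self) -> None:
        self.concepts: dict[str, Concept] = {}

    def add(self, label: str, broader: tuple[str, ...] = ()) -> None:
        self.concepts[label] = Concept(label, broader=set(broader))

    def relate(self, a: str, b: str) -> None:
        # An associative link is symmetric, so record it in both directions.
        self.concepts[a].related.add(b)
        self.concepts[b].related.add(a)

    def narrower(self, label: str) -> list[str]:
        # Derive the children of a concept from the 'broader' links.
        return sorted(c.label for c in self.concepts.values() if label in c.broader)


# Illustrative content only: the placement of these concepts is an assumption.
voc = Vocabulary()
for top in ("land administration", "land rights", "land equity"):
    voc.add(top)
voc.add("gender", broader=("land equity",))
voc.add("indigenous people", broader=("land equity",))

# Horizontal links: how an equity issue is experienced in other branches.
voc.relate("gender", "land rights")
voc.relate("gender", "land administration")

print(voc.narrower("land equity"))             # ['gender', 'indigenous people']
print(sorted(voc.concepts["gender"].related))  # ['land administration', 'land rights']
```

In this sketch the hierarchy answers the question of what ‘Land Equity’ is made up of, while the horizontal links answer the question of where each equity issue is actually encountered.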
Consistency in the definition of meaning is a core building block of any controlled vocabulary. Why this should be the case is pretty obvious. Nonetheless, in a multi-disciplinary setting it is a problematic requirement.
First, there are what might be called, using the term in its post-colonial anthropological sense, ontological issues. As a concept relating to daily activity, ‘land stakeholders’ can be organised in both vertical and horizontal relationships with other concepts in straightforward ways. However, as feedback to our concept development pilot from researchers in Brazil and South Africa made very clear, what the concept of land itself means to stakeholders in many places exposes profound differences. To investors, land may simply be regarded as a commodity. To others, it may be a natural resource, with varying understandings of what rights they have to access it or of the obligations this imposes on them for its care. To yet others, it may be a foundational element of their being, their relationship with ancestors, their relationship with nature. Without some explanation of such differences, it is impossible to understand land governance issues in any depth in many parts of the world.
At a more mundane level, many words have more than one meaning even in a single language. How this fact is accommodated varies in different settings. For example, the editors of ‘The New Oxford Dictionary of English’ take issue with the longer established ‘Oxford English Dictionary’, claiming that theirs is a dictionary of ‘current English’, focused on what is ‘central and typical’. The newer dictionary offers two definitions of the word ‘knowledge’, compared to the fourteen offered by the OED. This is all well and good for daily use, but one of the contexts of controlled vocabularies is that of specialist areas of work, which often make use of specialist terminologies. How are these to be clearly understood and differentiated, not to mention identified and translated, in multi-disciplinary contexts? At another level, the context is that of overlapping vocabularies. Even if it is possible to secure a single meaning across a broad field of work, how is that to be insisted on in neighbouring and overlapping fields?
An important principle of the Land Portal is that it is not an advocate for one position or another in land governance debates. It aims to provide information as fairly and as inclusively (in terms of voice and access) as possible. It does not assume a right to opine on which worldview is superior to another. It follows the librarian’s path in ensuring that all relevant sources are on the shelf, so that the readers can form their own opinion.
In my view this implies that the full development of semantically operable vocabularies beyond narrowly defined subject areas cannot rely on single definitions of concepts. They can and should insist on the meaning of any concept being clear but this could be achieved by having a curated and separately identified list of definitions for any one concept and the vocabulary editor choosing and stating which one applies. Likewise, the relationships between concepts will be remodelled according to the position and priorities of the users. All of this makes developing multi-stakeholder and multi-disciplinary controlled vocabularies challenging. It does not, at least in our experience with LandVoc so far, make it impossible.
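To make this suggestion more tangible, here is a minimal Python sketch of a concept that carries a curated list of separately identified definitions, with each vocabulary edition stating which one it applies. It is an illustration under stated assumptions, not a description of how LandVoc or AGROVOC is implemented: the class names, the definition keys and the wording of the definitions are all hypothetical, the latter paraphrased from the discussion of ‘land’ above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Definition:
    key: str     # identifier for this reading of the concept
    text: str    # the definition itself (illustrative wording)
    source: str  # the community holding this understanding (placeholder)


@dataclass
class ConceptEntry:
    """A concept with a curated set of separately identified definitions."""
    label: str
    definitions: dict[str, Definition] = field(default_factory=dict)

    def add_definition(self, definition: Definition) -> None:
        self.definitions[definition.key] = definition


@dataclass
class VocabularyEdition:
    """One editor's view: records which definition applies to each concept."""
    name: str
    chosen: dict[str, str] = field(default_factory=dict)  # label -> definition key

    def choose(self, concept: ConceptEntry, key: str) -> None:
        # Only curated definitions may be selected, so the meaning stays explicit.
        if key not in concept.definitions:
            raise KeyError(f"{key!r} is not a curated definition of {concept.label!r}")
        self.chosen[concept.label] = key

    def definition_of(self, concept: ConceptEntry) -> Definition:
        return concept.definitions[self.chosen[concept.label]]


# Hypothetical entries, paraphrasing the three understandings of 'land' above.
land = ConceptEntry("land")
land.add_definition(Definition("commodity", "Land as a tradable asset.", "investors"))
land.add_definition(Definition("natural-resource", "Land as a resource carrying rights of access and duties of care.", "resource users"))
land.add_definition(Definition("foundation-of-being", "Land as a foundation of identity, ancestry and the relationship with nature.", "other stakeholders"))

investment_view = VocabularyEdition("investment edition")
investment_view.choose(land, "commodity")

print(investment_view.definition_of(land).text)  # Land as a tradable asset.
print(sorted(land.definitions))                  # the other readings remain on record
```

The design choice worth noting is that selecting one definition does not delete the others: every reading stays on the shelf, and each edition simply declares which one it is using.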
LandVoc is a thesaurus covering concepts related to land governance. Currently in its second version, it consists of 310 concepts organised hierarchically and is available in many languages, including English, French, Spanish, Portuguese, Khmer, Vietnamese, Burmese, Thai, Swahili and Arabic.
LandVoc is an AGROVOC sub-vocabulary hosted by FAO.