Without Metadata, Your Data is Invisible

Metadata is at the heart of what we do at the Land Portal. What an odd statement to make for an organization in the land sector! But remember our mission, and you'll understand why creating, curating, and enriching metadata is important to us, and should be to any organization with data at its core.

We believe that access to information is crucial for achieving good land governance and securing land rights for landless and vulnerable people. When we established the Land Portal 10 years ago as a global gateway for land governance issues, we wanted to collect as much of the scattered data as possible and make it less hidden. We do that through our focus on metadata.

Data in its purest state is raw material: underground and hard to use. Even after raw data becomes information and knowledge with the help of information scientists and statisticians, it is difficult for humans to absorb it all. We use machines to bring the most relevant data to our fingertips and, in the process, benefit from the power of metadata.

In the simplest terms, metadata is the data that describes data. The descriptions include author, title, date, geography, abstract, keywords, and much more. These descriptions are the primary tool for linking, organizing, and connecting data that are generated in different geographies, different languages, and different industries. Metadata is what makes information machine-readable and therefore discoverable. It is how we demonstrate relevance to search engines. Choosing the right keywords for the metadata of your article, for instance, is what lets both people and search engines understand what your resource is about.

The difference between good and less-good metadata is something called “standards”

It is not enough, however, to tag data and information freely with keywords. Nor is it wise to have unlimited metadata terms with no relationship to each other. These practices make it harder for machines to use metadata. Let me give a concrete example. If someone tags Article 1 about slums as “favela” and another tags a similar Article 2 with “informal settlements,” a researcher searching for “slums” might not see the second article unless the two terms are related to each other in some way. A human-controlled list of terms, together with the relations each term has with other terms, is what we call a “standard” vocabulary. When a standard is in place, and used in your metadata, the machine is directed to recognize “slums” as similar to “favela” and “informal settlements.”
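
The slums example can be sketched in a few lines of Python. This is only an illustration, not how the Land Portal or LandVoc is implemented: the article tags, the RELATED_TERMS grouping, and the expand and search functions are all assumptions made up for the sketch. It matches on tags only, so without the vocabulary a search for "slums" finds neither article.

```python
# Hypothetical articles and their free-text tags, for illustration only.
ARTICLES = {
    "Article 1": {"favela"},
    "Article 2": {"informal settlements"},
}

# A toy "standard": each group lists terms a curator has declared related.
# This grouping is an assumption, not actual LandVoc content.
RELATED_TERMS = [
    {"slums", "favela", "informal settlements"},
]


def expand(term):
    """Return the term plus every term the vocabulary relates to it."""
    expanded = {term}
    for group in RELATED_TERMS:
        if term in group:
            expanded |= group
    return expanded


def search(term, use_vocabulary):
    """Return titles of articles whose tags match the (optionally expanded) term."""
    wanted = expand(term) if use_vocabulary else {term}
    return sorted(title for title, tags in ARTICLES.items() if tags & wanted)


print(search("slums", use_vocabulary=False))  # []
print(search("slums", use_vocabulary=True))   # ['Article 1', 'Article 2']
```

The only difference between the two searches is the curated list of related terms. The articles and their tags are unchanged.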

We saw a gap for such a standard vocabulary in the land governance sector, and we co-built LandVoc in 2012 in response.

LandVoc is not a random set of land-related terms that exists in a vacuum. It is a "controlled vocabulary" containing 300-plus carefully selected terms that are linked and curated by a community of experts. Controlled vocabularies like LandVoc work with a unique ID for each concept. Several pieces of information can be attached to that ID: the preferred term, translations into any number of languages, and relationships to other terms. This way the machine can form relationships across languages and across the nuances we use within each language. It can then help retrieve the most relevant and to-the-point information in response to a user’s query.
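
As a rough sketch of the idea of one ID carrying many labels, the Python below models a concept as a record with a preferred label per language, optional alternative labels, and links to related concepts. The Concept class, the IDs c_0001 and c_0002, and the sample labels are hypothetical and are not taken from LandVoc itself.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    concept_id: str                                  # unique ID (hypothetical)
    pref_labels: dict                                # language code -> preferred term
    alt_labels: dict = field(default_factory=dict)   # language code -> list of synonyms
    related: set = field(default_factory=set)        # IDs of related concepts


# Illustrative entries only; not actual LandVoc concepts or IDs.
VOCAB = {
    "c_0001": Concept(
        "c_0001",
        pref_labels={"en": "slums", "fr": "bidonvilles"},
        alt_labels={"en": ["informal settlements"], "pt": ["favela"]},
        related={"c_0002"},
    ),
    "c_0002": Concept(
        "c_0002",
        pref_labels={"en": "urban areas", "fr": "zones urbaines"},
    ),
}


def build_index(vocab):
    """Map every label, in every language, to the ID of its concept."""
    index = {}
    for concept in vocab.values():
        for label in concept.pref_labels.values():
            index[label.lower()] = concept.concept_id
        for labels in concept.alt_labels.values():
            for label in labels:
                index[label.lower()] = concept.concept_id
    return index


INDEX = build_index(VOCAB)


def resolve(label):
    """Return the concept ID for a label in any language, or None if unknown."""
    return INDEX.get(label.lower())


print(resolve("Favela"))       # c_0001
print(resolve("bidonvilles"))  # c_0001
print(VOCAB[resolve("slums")].pref_labels["fr"])  # bidonvilles
```

Because content is tagged with the ID rather than a word, a query in one language can reach resources described in another.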

LandVoc is almost invisible when it's doing its job, the way metadata is, but it is in fact what helps make our data more visible and discoverable.

Democratizing the information ecosystem through effective metadata usage

LandVoc and other metadata standards, such as geographical standards for countries, can be extremely powerful tools for making land data and information more discoverable by connecting isolated pieces of data, such as news articles, blog posts, or primary datasets. They can connect knowledge and experiences from across the world, bridging both language and cultural barriers.

A core Land Portal objective is to democratize the information ecosystem and strengthen flows of land governance data from all perspectives and all levels. We still have a long way to go, but investing in LandVoc and standard metadata models is one of the best ways we can do that.

Importantly, LandVoc is meant to be re-used by others. That is how the land information ecosystem grows more robust. LandVoc is intended to be an unbranded linking tool between the different classification and tagging systems that information providers in the land sector use.

I encourage our partners, contributors, and publishers everywhere to reach out to us to learn more about using LandVoc with their own content metadata and about how we can strengthen our data and information practices together.
