Data Model for statistical data (LOD)

Overview

The Data Model for statistical data (LOD) is organized around country-based datasets and indicators, following widely adopted standards for statistical data.

The Data Model for statistical data (LOD) (part of the whole Land Portal LOD data model) is designed on top of the following existing vocabularies:

  • Dublin Core, for properties common to most resources.
  • RDF Data Cube, for publishing multi-dimensional data, such as statistics, on the web so that it can be linked to related datasets and concepts using RDF.
  • Computex (Computing Statistical Indexes), an extension of the RDF Data Cube vocabulary for handling statistical indexes.
  • SDMX (Statistical Data and Metadata eXchange), an ISO standard for exchanging and sharing statistical data and metadata among organizations.
  • OWL-Time, an ontology of temporal concepts for describing the temporal properties of resources.
  • Schema.org, for properties of all relevant entities (creative works, persons, organizations, events, places).
  • SKOS, for all related concepts.

Table 1. Namespaces used in the Data Model for statistical data (LOD)

Prefix | Namespace
cex | http://purl.org/weso/computex/ontology#
dct | http://purl.org/dc/terms/
lb | http://purl.org/weso/landbook/ontology#
owl | http://www.w3.org/2002/07/owl#
qb | http://purl.org/linked-data/cube#
rdf | http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs | http://www.w3.org/2000/01/rdf-schema#
schema | http://schema.org/
sdmx-attribute | http://purl.org/linked-data/sdmx/2009/attribute#
skos | http://www.w3.org/2004/02/skos/core#
time | http://www.w3.org/2006/time#
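
As an illustration of how the prefixes in Table 1 are used, the following minimal sketch expands a compact name such as qb:DataSet into a full IRI. The names NAMESPACES and expand_curie are hypothetical helpers introduced here and are not part of the Land Portal tooling.

```python
# Prefix-to-namespace map copied from Table 1.
NAMESPACES = {
    "cex": "http://purl.org/weso/computex/ontology#",
    "dct": "http://purl.org/dc/terms/",
    "lb": "http://purl.org/weso/landbook/ontology#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "qb": "http://purl.org/linked-data/cube#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "schema": "http://schema.org/",
    "sdmx-attribute": "http://purl.org/linked-data/sdmx/2009/attribute#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "time": "http://www.w3.org/2006/time#",
}


def expand_curie(curie: str) -> str:
    """Expand a compact name such as 'qb:DataSet' into a full IRI.

    Hypothetical helper: raises KeyError when the prefix is missing
    or is not listed in Table 1.
    """
    prefix, separator, local_name = curie.partition(":")
    if not separator or prefix not in NAMESPACES:
        raise KeyError(f"Unknown or missing prefix in {curie!r}")
    return NAMESPACES[prefix] + local_name


print(expand_curie("qb:DataSet"))
print(expand_curie("cex:Indicator"))
```

Prefixes that appear elsewhere in this model but are not listed in Table 1 (for example dcat) would raise KeyError in this sketch and would need to be added to the map first.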

Entity: Dataset

A dataset is a collection of data, published or curated by a single agent (source), and available for access or download in one or more formats (definition from DCAT).

The fields of a dataset are:

  • Label: Label of the dataset.
  • Description: Description of the dataset.
  • ID: Internal ID of the dataset.
  • Logo: The logo of the dataset.
  • License: The license of the dataset.
  • Copyright details: Detailed copyright statements to be highlighted.
  • Organization: Organization that publishes the dataset.
  • Related Themes: Themes related to the dataset.
  • Related LandVoc Concepts: LandVoc concepts related to the dataset.

RDF types:

skos:Concept, qb:DataSet, dcat:Dataset

 

URI pattern:

http://data.landportal.info/dataset/{dataset-ID}

Values: Taxonomy: Dataset

Properties | RDF predicates | Predicate type (details)
Label | skos:prefLabel, dct:title, rdfs:label | literal
Description | skos:definition, dct:description, rdfs:comment | literal
ID | skos:notation, dct:identifier | literal
Organization (publisher) | dct:publisher | resource (Entity: Organization)
Themes | dct:subject, schema:about (Drupal) | resource (Entity: LandVoc Theme)
Related concepts (LandVoc) | dct:subject, schema:about (Drupal) | resource (Entity: LandVoc Concept)
Logo | schema:image, schema:logo | resource
License | dct:license, schema:license | resource (Entity: License)
Copyright details | dc:rights | literal
See also | rdfs:seeAlso | resource (Drupal)
Spatial coverage | dct:spatial | resource (Entity: Region). Spatial coverage of the dataset, calculated from observations.
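
The mapping above can be sketched as a small Turtle serializer. This is a minimal illustration under stated assumptions, not the Land Portal's own export code: the function names, the dataset ID EXAMPLE-DS, and the publisher URI are hypothetical, percent-encoding of IDs is an assumption, and the @prefix declarations that a complete Turtle document needs are omitted.

```python
from urllib.parse import quote

BASE_URI = "http://data.landportal.info"


def dataset_uri(dataset_id: str) -> str:
    """Fill the documented URI pattern for a dataset."""
    # Assumption: IDs are percent-encoded; the source does not say
    # which characters a dataset ID may contain.
    return f"{BASE_URI}/dataset/{quote(dataset_id, safe='')}"


def turtle_literal(text: str) -> str:
    """Quote a Python string as a short Turtle string literal."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def dataset_to_turtle(dataset_id: str, label: str,
                      description: str, publisher_uri: str) -> str:
    """Emit the label, description, ID, and publisher rows as Turtle."""
    subject = f"<{dataset_uri(dataset_id)}>"
    label_lit = turtle_literal(label)
    description_lit = turtle_literal(description)
    id_lit = turtle_literal(dataset_id)
    return "\n".join([
        f"{subject} a skos:Concept, qb:DataSet, dcat:Dataset ;",
        f"    skos:prefLabel {label_lit} ;",
        f"    dct:title {label_lit} ;",
        f"    rdfs:label {label_lit} ;",
        f"    skos:definition {description_lit} ;",
        f"    dct:description {description_lit} ;",
        f"    rdfs:comment {description_lit} ;",
        f"    skos:notation {id_lit} ;",
        f"    dct:identifier {id_lit} ;",
        f"    dct:publisher <{publisher_uri}> .",
    ])


# Hypothetical example values, for illustration only.
print(dataset_to_turtle(
    "EXAMPLE-DS",
    "Example dataset",
    "A made-up dataset used to illustrate the mapping.",
    "http://example.org/organization/example-org",
))
```

Each property is written with every predicate listed for it, so consumers that understand only SKOS, only Dublin Core, or only RDFS still find a label, description, and identifier.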


Entity: Indicator

A statistical indicator is a data element that represents statistical data for a specified time, place, and other characteristics (definition from OECD). Currently, the place is limited to the country level.

The fields of an indicator are:

  • Label: Label of the indicator.
  • Description: Description of the indicator.
  • Picture: Image to describe the indicator.
  • Dataset: Dataset that contains the data for this indicator.
  • ID: Internal ID of the indicator.
  • Min: Minimum possible value (integer) of the indicator.
  • Max: Maximum possible value (integer) of the indicator.
  • Measurement unit: Measurement unit, like % or hectares, of the indicator.
  • has Coded Value: The values for this indicator are taken from a controlled term list (characters, colors, strings, numbers, and so on).
  • High / Low: High means it is better to have a high value, low means the best value is the lowest one (like in rankings).
  • Related Themes: Themes related to the indicator.
  • Related LandVoc Concepts: LandVoc concepts related to the indicator.
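
To make the Min, Max, and High / Low fields concrete, here is a minimal sketch of how a consumer might apply them. The class IndicatorRange and the functions in_range and better_value are hypothetical names introduced for illustration, and treating a missing Min or Max as "unbounded" is an assumption not stated in the model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IndicatorRange:
    """Hypothetical container for the Min, Max, and High / Low fields."""
    minimum: Optional[int]   # None is assumed to mean "no lower bound"
    maximum: Optional[int]   # None is assumed to mean "no upper bound"
    high_is_better: bool     # True for "High", False for "Low"


def in_range(indicator: IndicatorRange, value: float) -> bool:
    """Check a value against the indicator's Min and Max, when set."""
    if indicator.minimum is not None and value < indicator.minimum:
        return False
    if indicator.maximum is not None and value > indicator.maximum:
        return False
    return True


def better_value(indicator: IndicatorRange, a: float, b: float) -> float:
    """Return whichever value is preferable under High / Low."""
    return max(a, b) if indicator.high_is_better else min(a, b)


# A percentage where higher is better, and a ranking where lower is better.
percentage = IndicatorRange(minimum=0, maximum=100, high_is_better=True)
ranking = IndicatorRange(minimum=1, maximum=None, high_is_better=False)

print(in_range(percentage, 42.0))
print(better_value(percentage, 10, 20))
print(better_value(ranking, 3, 7))
```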

RDF types:

skos:Concept, cex:Indicator

URI pattern:

http://data.landportal.info/indicator/{indicator-ID}

Values: Taxonomy: Indicator

Properties | RDF predicates | Predicate type (details)
Label | skos:prefLabel, dct:title, rdfs:label | literal
Description | skos:definition, dct:description, rdfs:comment | literal
ID | skos:notation, dct:identifier | literal
Dataset | dct:source | resource (Entity: Dataset)
Measurement unit | sdmx-attribute:unitMeasure | literal
Geographical focus | dct:spatial, schema:spatialCoverage | resource (Entity: Region)
Themes | dct:subject, schema:about | resource (Entity: LandVoc Theme)
Related concepts (LandVoc) | dct:subject, schema:about | resource (Entity: LandVoc Concept)
Picture | schema:image | resource
See also | rdfs:seeAlso | resource (Drupal)

 


Entity: Observation

An Observation represents a single indicator value for a given year and area.

We consider three main dimensions for each observation:

  • Indicator: The reference indicator
  • Area/Country: A geographic area (mainly a country, but it could be a region/continent)
  • Time/Year: The time the observation refers to (usually a year or a time interval), expressed using the Time Ontology in OWL (OWL-Time).

And each observation has a value:

  • Value: May be numeric or non-numeric (xsd:integer, xsd:double, xsd:string).

Also each observation has:

  • Label: A label generated using the pattern "Value of {Region} in {Time} for indicator {Indicator}"@en
  • Note: An optional comment or note about the observation.
  • Dataset: Dataset to which the observation belongs.
  • Timestamp: Date when the observation was generated.
  • Computation: The computation from which the observation was obtained, taken from a closed list (with rdf:type cex:ObsStatus, like cex:Raw).
  • Observation status: Observation status code, taken from a closed list.
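
The label pattern above can be sketched as a simple string template. The function names are hypothetical, and whether {Region}, {Time}, and {Indicator} are filled with display names or with IDs is not stated in the model; the example below assumes display names.

```python
def observation_label(region: str, time: str, indicator: str) -> str:
    """Fill the documented pattern for an observation label."""
    return f"Value of {region} in {time} for indicator {indicator}"


def observation_label_literal(region: str, time: str, indicator: str) -> str:
    """Return the label as a Turtle literal tagged as English (@en)."""
    label = observation_label(region, time, indicator)
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@en'


# Hypothetical example values, for illustration only.
print(observation_label_literal("Kenya", "2015", "Example indicator"))
```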

 

RDF types:

qb:Observation

URI pattern:

http://data.landportal.info/dataset/{dataset-ID}/observation/{observation-ID}

 

Properties | RDF predicates | Predicate type (details)
Indicator | cex:ref-indicator | resource (Entity: Indicator)
Area | cex:ref-area | resource (Entity: Region)
Time | cex:ref-time | resource (time:DateTimeInterval)
Value | cex:value | literal (xsd:integer, xsd:double, xsd:string)
Label | rdfs:label | literal
Note | rdfs:comment | literal
Dataset | qb:dataSet | resource (Entity: Dataset)
Timestamp | dct:issued | literal (xsd:date)
Computation obtained from | cex:computation | resource (with rdf:type cex:ObsStatus)
Observation status codes | sdmx-concept:obsStatus | literal (code list)
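
The following minimal sketch shows one way to pick the datatype for cex:value and to assemble a few of the observation's triples as Turtle. The function names and the IDs EXAMPLE-DS, OBS-1, and EXAMPLE-IND are hypothetical, percent-encoding of IDs is an assumption, and the time dimension is left out because the model does not specify how time:DateTimeInterval resources are identified. The @prefix declarations are also omitted.

```python
import math
from urllib.parse import quote

BASE_URI = "http://data.landportal.info"


def observation_uri(dataset_id: str, observation_id: str) -> str:
    """Fill the documented URI pattern for an observation."""
    # Assumption: both IDs are percent-encoded.
    return (f"{BASE_URI}/dataset/{quote(dataset_id, safe='')}"
            f"/observation/{quote(observation_id, safe='')}")


def value_literal(value) -> str:
    """Map a Python value to one of the three documented datatypes."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so reject it explicitly.
        raise TypeError("booleans are not a documented value type")
    if isinstance(value, int):
        return f'"{value}"^^xsd:integer'
    if isinstance(value, float):
        if not math.isfinite(value):
            raise ValueError("NaN and infinities are not handled in this sketch")
        return f'"{value!r}"^^xsd:double'
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"^^xsd:string'
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def observation_to_turtle(dataset_id: str, observation_id: str,
                          indicator_id: str, iso3: str, value) -> str:
    """Emit the indicator, area, value, and dataset rows as Turtle."""
    subject = f"<{observation_uri(dataset_id, observation_id)}>"
    return "\n".join([
        f"{subject} a qb:Observation ;",
        f"    cex:ref-indicator <{BASE_URI}/indicator/{quote(indicator_id, safe='')}> ;",
        f"    cex:ref-area <{BASE_URI}/geo/{iso3}> ;",
        f"    cex:value {value_literal(value)} ;",
        f"    qb:dataSet <{BASE_URI}/dataset/{quote(dataset_id, safe='')}> .",
    ])


# Hypothetical example values, for illustration only.
print(observation_to_turtle("EXAMPLE-DS", "OBS-1", "EXAMPLE-IND", "KEN", 12.5))
```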

Entity: Region

A Region represents a country (in most cases) or a continent or subcontinent.

RDF types:

lb:Country

URI pattern:

http://data.landportal.info/geo/{ISO 3166-1 alpha-3 code (aka ISO3)}

Values: Taxonomy: Regions (only with ISO3 code)

Properties | RDF predicates | Predicate type
Name | dct:title, rdfs:label | literal
ISO 3166-1 alpha-3 code (aka ISO3) | dct:identifier, skos:notation, geonames:countryCode | literal
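
As a minimal sketch of the Region URI pattern, the helper below builds a URI from an ISO3 code. The name region_uri is hypothetical, and normalizing the input to upper case is an assumption. The check covers only the shape of the code (three letters A to Z); it does not verify that the code is actually assigned in ISO 3166-1.

```python
import re

BASE_URI = "http://data.landportal.info"

# Shape check only: three ASCII upper-case letters.
_ISO3_SHAPE = re.compile(r"[A-Z]{3}")


def region_uri(iso3: str) -> str:
    """Fill the documented URI pattern for a region."""
    code = iso3.strip().upper()
    if not _ISO3_SHAPE.fullmatch(code):
        raise ValueError(f"not a three-letter code: {iso3!r}")
    return f"{BASE_URI}/geo/{code}"


print(region_uri("KEN"))
```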