DCAT-US - Version 3

Data Catalog Application Profile for the United States of America
Candidate Recommendation Snapshot

The DCAT-US 3.0 Profile (DCAT-US 3.0) is an updated specification designed to facilitate data cataloging, discovery, and interoperability among US government agencies. Building on the Project Open Data (POD) 1.1 standard (also known as [[DCAT-US-1.1]]), this profile aligns with the Data Catalog Vocabulary (DCAT) - Version 3 (DCAT 3) [[VOCAB-DCAT-3]] published by the World Wide Web Consortium (W3C) and upholds the FAIR principles: Findability, Accessibility, Interoperability, and Reusability. It also maintains compatibility with the existing POD 1.1 standard to ease the transition.

DCAT-US 3.0 serves as a bridge between the well-established DCAT-US 1.1 and DCAT 3, uniting them under a single, standardized approach for describing and exchanging datasets. It harmonizes the most significant attributes of both standards and addresses the metadata requirements specific to the US context. It also adds specialized properties for geospatial and statistical datasets, reusing established vocabularies to improve data sharing and reuse.

DCAT-US 3.0 uses the Shapes Constraint Language (SHACL) [[?SHACL]] for structural and semantic validation, providing an interoperable and extensible framework for describing and validating dataset metadata. Precise metadata description supports the efficient flow of information and lays the groundwork for sustained innovation.

Background

The FAIRness Project is introducing a draft update to the Data Catalog (DCAT) standard for the United States. This update, “DCAT-US 3.0 Schema,” builds upon the requirements we received from agencies and from data creators, providers, and users; Data Inventory statutory requirements; and the lessons learned over ten years of successful implementation of the Project Open Data Metadata Standard (DCAT-US v1.1) used by Data.gov.

We need your help to review and comment on this draft so that it meets agencies' data inventory needs and those of cross-government programs like Data.gov, GeoPlatform, and the Standard Application Process Portal.

Once approved and implemented, the update will improve the FAIRness, or Findability, Accessibility, Interoperability, and Reusability of all types of federal data. DCAT-US 3.0 will provide a *single* metadata standard able to support most requirements for documentation of business, technical, statistical, and geospatial data consistently.

The DCAT-US 3.0 Schema introduces the following key enhancements:

Please review the documentation below and provide feedback to help make this standard as useful as possible to you and the broader federal data user community.

Please follow the instructions found here to submit your comments and issues with the current draft schema specification.

Overview

The DCAT-US 3.0 Profile is a comprehensive update to the Project Open Data (POD) 1.1 standard, designed to meet the evolving needs of data exchange and interoperability among US government agencies. This profile builds on the foundation laid by POD 1.1 and is aligned with the latest DCAT 3 standard from the World Wide Web Consortium (W3C). In addition, the profile aims to embody the FAIR principles, ensuring that data is Findable, Accessible, Interoperable, and Reusable. This introduction will provide an overview of the purpose of this profile, highlight the gaps between POD 1.1 and DCAT 3, and elaborate on the differences and enhancements offered by the DCAT-US 3.0 profile.

Purpose and Evolution

The purpose of the DCAT-US 3.0 Profile is to improve data discoverability, accessibility, and interoperability among US government agencies. By adhering to the FAIR principles, the profile promotes more effective data sharing and reuse. The FAIR principles emphasize that data should be Findable, Accessible, Interoperable, and Reusable.

The DCAT-US 3.0 Profile bridges the gap between the POD 1.1 and DCAT 3 standards by incorporating the best features of both while also addressing specific metadata requirements unique to the US context. It offers a standardized approach for describing and exchanging datasets, thereby enabling more efficient data sharing and reuse.

Data Structure

The Application Profile specified in this document is based on the specification of the Data Catalog Vocabulary Version 3 (DCAT 3) [[VOCAB-DCAT-3]] developed under the responsibility of the W3C Dataset Exchange Working Group (DXWG). DCAT is an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web. Additional classes and properties from other well-known vocabularies are re-used where necessary.

The DCAT vocabulary consists of classes and properties.

Classes and properties are used to deliver the metadata in a structured way.

Application Areas

The DCAT Application Profile for data portals in the United States (DCAT-US) is an Application Profile of the DCAT vocabulary.

Gaps with DCAT-US 1.1

The DCAT-US 1.1 standard, while effective for its time, had some limitations that the DCAT 3 standard has addressed. The key differences between the two standards include:

DCAT-US Features

The DCAT-US 3.0 Profile not only incorporates the enhancements provided by DCAT 3 but also maintains the US-specific metadata requirements defined in POD 1.1. This profile offers a harmonized approach to data cataloging that accounts for the unique needs of US agencies.

One of the key features of this profile is its use of reference controlled vocabularies. These vocabularies enable better interoperability between US agencies by providing a common language for describing datasets. The profile also introduces new properties to handle geospatial data and statistical datasets, leveraging established vocabularies in these domains.

The DCAT-US specification introduces several key features designed to enhance the accessibility, interoperability, and effectiveness of data cataloging practices. Below, we outline the advantages of adopting DCAT-US over traditional document-centric metadata standards, such as ISO 19115, in meeting the needs of modern data ecosystems.

In conclusion, DCAT-US moves beyond the silos created by rigid, document-centric metadata standards. Its design and features cater to the demands of contemporary data management and publishing, making data assets more visible, accessible, and valuable to users across the data ecosystem.

Profile Encoding

The encoding of the DCAT-US profile involves the technical aspects of how data is represented and exchanged, addressing questions about data format and interoperability. While the DCAT-US 3.0 conformance does not strictly mandate the use of RDF serialization for data exchange, it emphasizes the importance of ensuring that the exchanged format can be unambiguously transformed into RDF. This flexibility allows for interoperability while accommodating various data exchange requirements.

One prevalent format for data exchange between systems is JSON (JavaScript Object Notation), which is widely used due to its simplicity and human-readable nature. To facilitate data exchange in JSON while adhering to the DCAT-US profile, a dedicated mechanism is provided: the JSON-LD context file. JSON-LD 1.1 (JSON for Linked Data) is a W3C Recommendation [[?JSON-LD]] that establishes a standardized approach for interpreting JSON structures as RDF, enhancing the potential for semantic integration and interoperability.

The DCAT-US profile offers a [[?JSON-LD]] context file that implementers can utilize as a foundation for their data exchange processes. By incorporating this JSON-LD context file, implementers can ensure that their data adheres to the DCAT-US standards while being exchanged in a JSON format. This allows for a coherent and consistent representation of the data that aligns with the RDF model, promoting interoperability among different systems and tools.

It's important to note that the provided JSON-LD context file is not normative: other JSON-LD contexts can also be used to establish a conformant DCAT-US data exchange. This flexibility accommodates various implementation scenarios and data requirements while still adhering to the overarching principles of the DCAT-US profile.
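
As a rough illustration of what a JSON-LD context contributes, the sketch below resolves plain JSON keys to full IRIs using term and prefix definitions. The inline context and the `expand_term` helper are assumptions made for this example; they do not reproduce the context file published with the profile, and a real implementation would rely on a JSON-LD processor rather than this hand-written expansion.

```python
import json

# Illustrative inline context (an assumption for this sketch, not the
# profile's published context file). The two namespace IRIs are the
# standard ones for DCAT and Dublin Core Terms.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dcterms": "http://purl.org/dc/terms/",
    "title": "dcterms:title",
    "description": "dcterms:description",
}


def expand_term(term, context):
    """Resolve a JSON key or compact IRI to a full IRI (hypothetical helper)."""
    value = context.get(term, term)
    prefix, sep, local = value.partition(":")
    if sep and prefix in context and not local.startswith("//"):
        return context[prefix] + local
    return value


record = {
    "@context": CONTEXT,
    "@type": "dcat:Dataset",
    "title": "Example dataset",
    "description": "A hypothetical record used only for illustration.",
}

# Expand every non-keyword key so the record reads as property IRI -> value.
expanded = {
    expand_term(key, CONTEXT): value
    for key, value in record.items()
    if not key.startswith("@")
}
print(json.dumps(expanded, indent=2))
```

Once keys resolve to IRIs in this way, each key/value pair can be read as an RDF statement about the record, which is the property that makes a JSON exchange format unambiguously transformable into RDF.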

Profile Validation

While the JSON Schema approach used in POD 1.1 was effective in certain scenarios, it has limitations when compared to using SHACL for defining data models and constraints:

Considering these limitations, the DCAT-US 3.0 Profile has chosen SHACL as the foundation for its data modeling and validation, ensuring a more expressive, interoperable, and future-proof framework for defining dataset metadata.

The DCAT-US 3.0 Profile is defined using the Shapes Constraint Language (SHACL), which offers several advantages over previous approaches:

By using [[?SHACL]], the DCAT-US 3.0 Profile ensures a robust and extensible foundation for future updates, as well as compatibility with a wide range of data processing tools and applications.
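
To give a feel for the kind of constraint a SHACL shape expresses, the following sketch hand-rolls two of them, the equivalents of sh:minCount and sh:maxCount. It is not a SHACL processor and does not use the profile's actual shapes; the `AGENT_SHAPE` dictionary and `check_counts` function are hypothetical, with the counts taken from the foaf:Agent property summary in this document.

```python
# Hand-written stand-in for sh:minCount / sh:maxCount checks.
# The shape mirrors the foaf:Agent summary table (name 1..1, type 0..1).
AGENT_SHAPE = {
    "foaf:name": {"minCount": 1, "maxCount": 1},
    "dcterms:type": {"minCount": 0, "maxCount": 1},
}


def check_counts(node, shape):
    """Return violation messages for a node given as {property: [values]}."""
    violations = []
    for path, rule in shape.items():
        count = len(node.get(path, []))
        if count < rule["minCount"]:
            violations.append(
                f"{path}: expected at least {rule['minCount']}, found {count}"
            )
        if count > rule["maxCount"]:
            violations.append(
                f"{path}: expected at most {rule['maxCount']}, found {count}"
            )
    return violations


print(check_counts({"foaf:name": ["Harvester bot"]}, AGENT_SHAPE))
print(check_counts({"dcterms:type": ["a", "b"]}, AGENT_SHAPE))
```

In practice the shapes would be expressed in RDF and evaluated by a SHACL engine, which also covers value types, class membership, and other constraints that this sketch leaves out.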

Document Status

Candidate Recommendation Snapshot

Data Provider requirements

In order to conform to this Application Profile, an application that provides metadata MUST: For the properties listed in the table in section [[[#controlled-vocabularies]]], the associated controlled vocabularies MUST be used. Additional controlled vocabularies MAY be used. In addition to the mandatory properties, any of the recommended and optional properties defined in each class description MAY be provided.

Receiver requirements

In order to conform to this Application Profile, an application that receives metadata MUST be able to: "Processing" means that receivers must accept incoming data and transparently provide these data to applications and services. It neither implies nor prescribes what applications and services finally do with the data (parse, convert, store, make searchable, display to users, etc.).

DCAT-US Classes

This section displays the classes for the DCAT-US 3 profile. We distinguish core classes, which represent the primary business entities that the application profile is concerned with, from supporting classes, which are used to provide additional context, metadata, or structure to the core classes.

The following table summarizes the types of changes in the DCAT-US 3.0 Application Profile and shows how class definitions have evolved. Change types range from new classes specific to DCAT-US 3.0 to classes adopted from the broader DCAT specifications (DCAT 1, DCAT 2, and DCAT 3). Understanding these changes helps data practitioners see how the profile aligns with the various DCAT versions, supporting more effective data management and interoperability.

Change Type | Description
New! | New DCAT-US 3.0 specific class that is not referenced in the DCAT specifications
Aligned | Class introduced in the DCAT specifications that does not exist in DCAT-US 1.1

Core Classes

The DCAT-US Application Profile (“DCAT-US”) is structured around the following main classes:

Class name | Usage note for the Application Profile | URI and Reference | Changes from DCAT-US 1.1
Catalog | A catalog or repository that hosts the Datasets or Data Services being described. | dcat:Catalog | Aligned
Catalog Record | A record in a catalog, describing the registration of a single dcat:Resource. | dcat:CatalogRecord | Aligned
Dataset | A conceptual entity that represents the information published. | dcat:Dataset | Aligned
Distribution | A physical embodiment of the Dataset in a particular format. | dcat:Distribution | Aligned
Data Service | A collection of operations that provides access to one or more datasets or data processing functions. | dcat:DataService | Aligned
Dataset Series | A collection of datasets that are published separately, but share some characteristics that group them. | dcat:DatasetSeries | Aligned
Figure: UML model for the core classes of DCAT-US 3.0

Supporting Classes

Class name Usage note for the Application Profile URI and Reference Changes from DCAT-US 1.1

AccessRestriction

The "AccessRestriction" class used by NARA represents limitations placed on accessing specific records or information within their archives, ensuring controlled and responsible access based on legal, ethical, or security considerations.

dcat-us:AccessRestriction

New!

Activity

An activity carried out by an Agent over an entity, according to a plan, and generating another entity.

prov:Activity

Aligned

Address (Location)

A postal address for a Location.

locn:Address

New!

Address (Contact Point)

A postal address for Contact Point.

vcard:Address

Aligned
Agent

An entity (e.g., an individual or an organization) that is associated with Catalogs, Catalog Records, Data Services, or Datasets. If the Agent is an organization, the use of the Organization Ontology [[VOCAB-ORG]] is recommended.

foaf:Agent

Aligned

Attribution

A responsibility of an Agent for a resource.

prov:Attribution

Aligned
Concept

A subject of a Catalog, Dataset, or Data Service.

skos:Concept

Aligned
Concept scheme

A concept collection (e.g. controlled vocabulary) in which the Concept is defined.

skos:ConceptScheme

Aligned
Checksum

A value that allows the contents of a file to be authenticated. This class allows the results of a variety of checksum and cryptographic message digest algorithms to be represented.

spdx:Checksum

Aligned
Contact

A description following the [[VCARD-RDF]] specification, e.g. to provide telephone number and e-mail address for a contact point using vcard:Kind.

vcard:Kind

Aligned

CUI Restriction

Controlled Unclassified Information (CUI) is information that requires safeguarding or dissemination controls pursuant to and consistent with applicable law, regulations, and government-wide policies but is not classified.

dcat-us:CuiRestriction

New!
Document

A textual resource intended for human consumption that contains information, e.g., a Web page about a Dataset, a publication, a book chapter, a technical report, but also a blog post.

foaf:Document

Aligned

Geographic Bounding Box

GeographicBoundingBox describes the spatial extent of the domain of application of a resource and is standardized in the WGS 84 latitude/longitude coordinate system.

dcat-us:GeographicBoundingBox

New!
Identifier

An identifier in a particular context, consisting of the string that is the identifier; an optional identifier for the identifier scheme; an optional identifier for the version of the identifier scheme; an optional identifier for the agency that manages the identifier scheme

adms:Identifier

New!
LiabilityStatement

A formal declaration accompanying a dataset which outlines the responsibilities and limitations of the data provider in terms of the accuracy, completeness, and potential use of the data. It often serves to limit the legal exposure of the data provider by defining the scope of allowed uses and disclaiming warranties or guarantees.

dcat-us:LiabilityStatement

New!
License Document

A legal document giving official permission to do something with a resource.

dcterms:LicenseDocument

Aligned
Location

A spatial region or named place. It can be represented using a controlled vocabulary or with geographic coordinates.

dcterms:Location

Aligned

Media type

A media type, e.g. the format of a computer file.

dcterms:MediaType

Aligned

Metric

Represents a standard to measure a quality dimension. An observation (instance of dqv:QualityMeasurement) assigns a value in a given unit to a Metric.

In DCAT-US, this class is used to define individuals corresponding to the different types of spatial resolution.

dqv:Metric

Aligned
Organization

Represents a collection of people organized together into a community or other social, commercial or political structure. The group has some common purpose or reason for existence which goes beyond the set of people belonging to it and can act as an Agent. Organizations are often decomposable into hierarchical structures.

org:Organization Aligned
Period of time

An interval of time that is named or defined by its start and end dates.

dcterms:PeriodOfTime

Aligned
Person

This class represents an individual human being or a person. It can be used to provide information about individuals, such as their name, email address, homepage URL, and other personal details.

foaf:Person Aligned
Provenance Statement

A statement of any changes in ownership and custody of a resource since its creation that are significant for its authenticity, integrity, and interpretation

dcterms:ProvenanceStatement

New!

Quality Measurement

Represents the evaluation of a given resource (as a Data Service, Dataset, or Distribution) against a specific quality metric.

dqv:QualityMeasurement

Aligned
Relationship

An association class for attaching additional information to a relationship between DCAT Resources

dcat:Relationship

Aligned
Rights statement

A statement about the intellectual property rights (IPR) held in or over a resource, a legal document giving official permission to do something with a resource, or a statement about access rights.

dcterms:RightsStatement

Aligned
Role

A role is the function of a resource or agent with respect to another resource, in the context of resource attribution or resource relationships.

Note it is a subclass of skos:Concept.

dcat:Role

Aligned
Standard

A standard or other specification to which a Catalog, Catalog Record, Data Service, Dataset, or Distribution conforms.

dcterms:Standard

Aligned

Use Restriction

A UseRestriction is a set of rules, guidelines, or legal provisions that dictate how a particular resource, asset, information, or object can be utilized. Use restrictions may encompass limitations on access, distribution, reproduction, modification, or sharing, and they are often put in place to protect privacy, intellectual property rights, security, or compliance with legal or ethical standards.

dcat-us:UseRestriction

New!
Figure: UML model for the supporting classes of DCAT-US 3.0

Properties per Class

Overview

Requirement levels

DCAT-US defines four requirement levels for data receivers and senders:

  • Mandatory property: a receiver MUST be able to process the information for that property; a sender MUST provide the information for that property.
  • Recommended property: a receiver MUST be able to process the information for that property; a sender SHOULD provide the information for that property if it is available.
  • Optional property: a receiver MUST be able to process the information for that property; a sender MAY provide the information for that property but is not obliged to do so.
  • Deprecated property: a receiver SHOULD be able to process information about instances of that property; a sender SHOULD NOT provide the information about instances of that property.

The meaning of the terms MUST, MUST NOT, SHOULD, SHOULD NOT and MAY in this section and in the following sections is as defined in RFC 2119.

In the given context, the term "processing" means that receivers MUST accept incoming data and transparently provide these data to applications and services. It neither implies nor prescribes what applications and services finally do with the data (parse, convert, store, make searchable, display to users, etc.).
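
The sender-side obligations above can be sketched as a small check that separates hard failures from advisories. The level codes M, R and O follow the property tables in this document; the code "D" for deprecated properties, the `sender_report` function, and the choice to report unmet SHOULD-level obligations as warnings are assumptions made for this example.

```python
# Requirement levels for the AccessRestriction properties, taken from the
# summary table in this document. "D" (deprecated) is an assumed code.
LEVELS = {
    "dcat-us:restrictionStatus": "M",
    "dcat-us:specificRestriction": "R",
    "dcat-us:restrictionNote": "O",
}


def sender_report(provided, levels):
    """Classify a sender's record against the requirement levels.

    Missing mandatory properties are errors (MUST). Missing recommended
    properties and supplied deprecated properties are warnings (SHOULD /
    SHOULD NOT). Optional properties never produce a finding.
    """
    report = {"errors": [], "warnings": []}
    for prop, level in levels.items():
        present = prop in provided
        if level == "M" and not present:
            report["errors"].append(f"missing mandatory property {prop}")
        elif level == "R" and not present:
            report["warnings"].append(f"missing recommended property {prop}")
        elif level == "D" and present:
            report["warnings"].append(f"deprecated property {prop} provided")
    return report


# A record that supplies only the optional note fails on the mandatory
# status and draws a warning for the recommended specific restriction.
print(sender_report({"dcat-us:restrictionNote": "Contact the archive."}, LEVELS))
```

A receiver, by contrast, has to accept all three non-deprecated levels, so a check of this kind is only meaningful on the providing side.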

Notations

  • Property: denotes the name that the class or property is given in DCAT-US.
  • URI: denotes the property URI.
  • Range: specifies the range of values that is expected for the property.
  • ReqLevel (“Requirement level”): denotes whether the class / property is mandatory, recommended or optional.
  • Card (“Cardinality”): specifies the minimum number of values that MUST be provided for that property and the maximum number of values that MAY be provided.
  • Usage note: specifies custom usage instructions and provides background information.
  • CV (“Controlled Vocabulary”): defines which controlled vocabulary SHOULD be used.
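
The Card notation used in the tables ("1", "0..1", "1..n") can be read mechanically. The sketch below is one possible interpretation, assuming that a single number means exactly that many values and that "n" means no upper bound; the function names are hypothetical.

```python
def parse_cardinality(card):
    """Parse 'min..max' or a single number into (minimum, maximum).

    A maximum of None stands for 'n', i.e. unbounded.
    """
    low, sep, high = card.partition("..")
    if not sep:
        high = low  # a bare "1" is read as exactly one value
    minimum = int(low)
    maximum = None if high == "n" else int(high)
    return minimum, maximum


def within(card, count):
    """Report whether a number of supplied values satisfies the cardinality."""
    minimum, maximum = parse_cardinality(card)
    return count >= minimum and (maximum is None or count <= maximum)


print(parse_cardinality("1..n"))
print(within("0..1", 2))
```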

Property Evolution in DCAT-US 3.0

The following table provides an overview of the various types of changes and updates within the DCAT-US specifications, shedding light on the evolution and adaptation of the data catalog standard. Each change type is categorized, and its significance is explained, ranging from the introduction of new properties to updates that align with the latest DCAT specifications. Understanding these changes is essential for data practitioners and stakeholders seeking to keep pace with the evolving landscape of data cataloging and data sharing standards.

Change Type | Description
New! | New DCAT-US 3.0 specific property that is not referenced in the DCAT specifications
Aligned | Property aligned with the latest DCAT 3 specification that does not exist in DCAT-US 1.1
Fixed | Property corrected because it was inconsistent with the DCAT specification
No Change | No change from the DCAT-US 1.1 profile
Multilingual Support | Extension of a DCAT-US property to support multilingual values

AccessRestriction

The class "AccessRestriction" used by the National Archives and Records Administration (NARA) refers to a categorization or specification that denotes limitations or conditions imposed on the accessibility of certain records, documents, or information within their archival holdings. Access restrictions are employed to regulate and control access to sensitive or confidential content based on legal, ethical, security, or other relevant considerations. These restrictions may pertain to who can access the information, the purposes for which it can be accessed, and the conditions under which it can be utilized. The "AccessRestriction" class provides a structured framework for classifying and managing these access limitations within NARA's archival context, contributing to the proper governance and responsible dissemination of historical records and data.

RDF Class: dcat-us:AccessRestriction
Definition: The "AccessRestriction" class used by NARA represents limitations placed on accessing specific records or information within their archives, ensuring controlled and responsible access based on legal, ethical, or security considerations.
Usage note

The "AccessRestriction" class serves as a valuable tool within NARA's archival framework, enabling the organization to effectively manage and communicate access limitations for specific records or information. By employing this class, NARA can categorize and enforce controlled access to sensitive content, safeguarding confidentiality, adhering to legal requirements, and preserving the integrity of historical data. Researchers, archivists, and authorized users can rely on "AccessRestriction" to navigate and understand the accessibility parameters associated with archived materials, facilitating responsible information dissemination and usage.

Rationale The "AccessRestriction" class in the DCAT-US application profile is essential for categorizing and managing access restrictions according to NARA standards, ensuring responsible access to sensitive historical records. It enhances transparency, aiding researchers and authorized users in understanding and navigating access parameters for archived materials.

Properties Summary

Property | URI | Range | ReqLevel | Card
restriction status | dcat-us:restrictionStatus | skos:Concept | M | 1..1
specific restriction | dcat-us:specificRestriction | skos:Concept | R | 0..1
restriction note | dcat-us:restrictionNote | rdfs:Literal | O | 0..1

Mandatory Properties

Property: restriction status

Property restriction status
Requirement level Mandatory
Cardinality 1
URI dcat-us:restrictionStatus
Range skos:Concept
Definition The indication of whether or not there are access restrictions on the data.

Optional Properties

Property: restriction note

Property restriction note
Requirement level Optional
Cardinality 0..1
URI dcat-us:restrictionNote
Range rdfs:Literal
Definition A note related to the access restriction

Example
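
As an illustrative sketch (not the specification's own example), an AccessRestriction could be written as the following JSON-LD-style structure. The concept IRI is a made-up placeholder, since the controlled vocabulary for restriction status is not given in this section, and the compact `dcat-us:` names assume a context that binds that prefix.

```python
import json

# Hypothetical AccessRestriction instance; the prefix binding and the
# concept IRI below are placeholders, not values defined by the profile.
access_restriction = {
    "@type": "dcat-us:AccessRestriction",
    # Mandatory (1..1), range skos:Concept.
    "dcat-us:restrictionStatus": {
        "@id": "https://example.org/restriction-status/restricted"
    },
    # Optional (0..1), range rdfs:Literal.
    "dcat-us:restrictionNote": "Access requires prior approval.",
}

print(json.dumps(access_restriction, indent=2))
```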

Activity

RDF Class: prov:Activity
Definition: An activity is something that occurs over a period of time and acts upon or with entities; it may include consuming, processing, transforming, modifying, relocating, using, or generating entities.
Usage note The activity associated with the generation of a dataset will typically be an initiative, project, mission, survey, or ongoing activity ("business as usual"). Multiple prov:wasGeneratedBy properties can be used to indicate the dataset production context at various levels of granularity. Details about how to describe the activity that generated a dataset are out of scope for this application profile. prov:Activity provides a minimal set of basic properties for labeling and classifying activities.
Rationale:

Integrating prov:Activity into the DCAT-US schema offers a streamlined, generic class to represent a myriad of operations, such as initiatives, projects, and ongoing activities, without the complexity of managing numerous specialized classes. This inclusion not only simplifies the representation of varied activities under a unified semantic framework but also enhances data provenance tracking and interoperability across diverse systems and domains. Consequently, it provides a flexible, future-proof approach to accommodate evolving types of activities without necessitating continual schema modifications.

Properties Summary

Property | URI | Range | ReqLevel | Card
label | rdfs:label | xsd:string | M | 1..n
category | dcterms:type | skos:Concept | O | 0..1

Mandatory Properties

Property: label

Property label
Requirement level Mandatory
Cardinality 1..n
URI rdfs:label
Range xsd:string
Usage note This property is used to give a human-readable label for the activity.

Optional Properties

Property: category

Property category
Requirement level Optional
Cardinality 0..n
URI dcterms:type
Range skos:Concept
Usage note

Example
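
As an illustrative sketch (not the specification's own example), the structure below attaches two activities to one dataset through prov:wasGeneratedBy, describing the production context at two levels of granularity. All titles and labels are invented, and the `activity_labels` helper is hypothetical.

```python
# Hypothetical dataset generated by a long-running program and by one
# specific collection campaign within it.
dataset = {
    "@type": "dcat:Dataset",
    "dcterms:title": "Hypothetical survey results",
    "prov:wasGeneratedBy": [
        {"@type": "prov:Activity", "rdfs:label": ["National Example Survey program"]},
        {"@type": "prov:Activity", "rdfs:label": ["2023 field collection"]},
    ],
}


def activity_labels(ds):
    """Collect the human-readable labels of all generating activities."""
    return [
        label
        for activity in ds.get("prov:wasGeneratedBy", [])
        for label in activity["rdfs:label"]
    ]


print(activity_labels(dataset))
```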

Address (Contact Point)

RDF Class: vcard:Address
Obligation Optional
Definition: Specify the components of the delivery address for a contact point
Usage note This class is used only to associate an address with a contact point. When incorporating [[VCARD-RDF]] vcard:Address within DCAT-US, use its properties, such as vcard:street-address, vcard:locality, and vcard:country-name, to provide comprehensive and accurate address details for entities like organizations or publishers. Adhere to consistent formatting across the catalog, be mindful of privacy considerations, especially for individual addresses, and validate the data regularly to maintain its accuracy and reliability.
Rationale: Integrating [[VCARD-RDF]]'s contact point address into DCAT-US ensures a standardized, interoperable format for presenting address data.

Properties Summary

Property | URI | Range | ReqLevel | Card
administrative area | vcard:region | rdfs:Literal | R | 0..1
city | vcard:locality | rdfs:Literal | R | 0..1
country name | vcard:country-name | rdfs:Literal | R | 0..1
postal code | vcard:postal-code | rdfs:Literal | R | 0..1
street address | vcard:street-address | rdfs:Literal | R | 0..1

Example
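
As an illustrative sketch (not the specification's own example), a contact point address can be modeled with one field per vCard property from the summary table. The `ContactAddress` class is hypothetical, and the address values are invented; since all five properties are recommended with cardinality 0..1, fields that are not set are simply left out of the output.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ContactAddress:
    """Hypothetical holder for the five vcard:Address properties (each 0..1)."""

    street_address: Optional[str] = None
    locality: Optional[str] = None
    region: Optional[str] = None
    postal_code: Optional[str] = None
    country_name: Optional[str] = None

    def to_vcard(self):
        """Emit only the properties that were provided, keyed by vCard name."""
        props = {"@type": "vcard:Address"}
        for field_name, value in asdict(self).items():
            if value is not None:
                # street_address -> vcard:street-address, and so on.
                props["vcard:" + field_name.replace("_", "-")] = value
        return props


address = ContactAddress(
    street_address="123 Example Avenue",
    locality="Springfield",
    region="VA",
    postal_code="00000",
    country_name="United States",
)
print(address.to_vcard())
```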

Address (Location)

RDF Class: locn:Address
Definition: The address of a location.
Usage note This class is used to describe a location by its address. It should be used only with the property dcterms:spatial, not the contact point address property.
Rationale: Incorporating locn:Address from the W3C Location ontology [[LOCN]] into DCAT-US provides a standardized, structured, and extensible format to represent physical addresses, facilitating consistent, interoperable, and precise sharing of location information across various datasets and digital platforms.

Properties Summary

Property | URI | Range | ReqLevel | Card
administrative area | locn:adminUnitL2 | rdfs:Literal | R | 0..1
city | locn:postName | rdfs:Literal | R | 0..1
country | locn:adminUnitL1 | rdfs:Literal | R | 0..1
postal code | locn:postCode | rdfs:Literal | R | 0..1
street address | locn:thoroughfare | rdfs:Literal | R | 0..1

Agent

RDF Class: foaf:Agent
Definition: An entity that acts on something (e.g., a person, group, software, or physical artifact).
Usage note
  • Use this class when referring to a software agent that is associated with Catalogs, Catalog Records, Data Services, or Datasets.
  • If the Agent is an organization, the use of org:Organization is recommended.
  • If the Agent is a person, the use of foaf:Person is recommended.
Rationale: The addition of the foaf:Agent class in DCAT-US 3.0 serves a dual purpose. Firstly, it allows for the representation of software agents, aligning with modern data automation needs. Secondly, it acts as an abstract class for both foaf:Person and org:Organization, promoting consistency and interoperability while simplifying resource descriptions within the dataset catalog.

Properties Summary

Property | URI | Range | ReqLevel | Card
name | foaf:name | xsd:string | M | 1..1
type | dcterms:type | skos:Concept | R | 0..1

Mandatory Properties

Property: name

Property name
Requirement level Mandatory
Cardinality 1
URI foaf:name
Range xsd:string
Definition The name of the software agent

Example

Attribution

RDF Class: prov:Attribution
Definition: A responsibility of an Agent for a resource.
Usage note Used to link to an Agent where the nature of the relationship is known but does not match one of the standard [[DCTERMS]] properties (dcterms:creator, dcterms:contributor, dcterms:rightsHolder, and dcterms:publisher). Use dcat:hadRole on the prov:Attribution to capture the responsibility of the Agent with respect to the Resource.
Rationale The inclusion of prov:Attribution in the DCAT profile enables clear data source attribution, promoting responsible data sharing and proper citation practices. It aligns the profile with data provenance best practices for accurate attribution in data sharing.

Properties Summary

Property | URI | Range | ReqLevel | Card
agent | prov:agent | foaf:Agent | M | 1
role | dcat:hadRole | dcat:Role | M | 1

Mandatory Properties

Property: agent

Property agent
Requirement level Mandatory
Cardinality 1
URI prov:agent
Range foaf:Agent
Definition The prov:agent property references an Agent that plays a role in the resource

Property: role

Property role
Requirement level Mandatory
Cardinality 1
URI dcat:hadRole
Range dcat:Role
Definition The function of an entity or agent with respect to another entity or resource.

Example
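
As an illustrative sketch (not the specification's own example), the list below holds two qualified attributions, each pairing one agent with one role as the mandatory properties require. The agent names and role IRIs are invented placeholders, and `agents_with_role` is a hypothetical helper showing how a consumer might look up agents by responsibility.

```python
# Hypothetical prov:Attribution nodes; role IRIs are placeholders and do
# not come from any controlled vocabulary named in this document.
attributions = [
    {
        "@type": "prov:Attribution",
        "prov:agent": {"@type": "foaf:Agent", "foaf:name": "Example Data Office"},
        "dcat:hadRole": {"@id": "https://example.org/roles/custodian"},
    },
    {
        "@type": "prov:Attribution",
        "prov:agent": {"@type": "foaf:Agent", "foaf:name": "Example Survey Team"},
        "dcat:hadRole": {"@id": "https://example.org/roles/originator"},
    },
]


def agents_with_role(attrs, role_iri):
    """Return the names of agents attributed with the given role."""
    return [
        attribution["prov:agent"]["foaf:name"]
        for attribution in attrs
        if attribution["dcat:hadRole"]["@id"] == role_iri
    ]


print(agents_with_role(attributions, "https://example.org/roles/custodian"))
```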

Catalog

A Catalog or repository that hosts the Datasets or Data Services being described.

DCAT-US allows Catalogs that contain only Datasets as well as Catalogs that contain only Data Services.

RDF Class: dcat:Catalog
Definition: A curated collection of metadata about resources (e.g., datasets and data services in the context of a data catalog)
Sub-class of: dcat:Dataset
Usage note
  • A Web-based data catalog is typically represented as a single instance of this class.
  • Populate metadata within the dcat:Catalog to facilitate resource discovery, including title, description, classifiers and other relevant information.
  • Specify the resources hosted within the catalog by linking them as dcat:dataset or dcat:service.
Rationale The update of the dcat:Catalog class aligns with the generalization of catalog scope in DCAT-US 3.0, accommodating catalogs of datasets, data services, or a mixture of both. It reflects the evolving landscape of data publication and discovery, allowing data publishers to describe and share their resources effectively. Additionally, by making Catalog a subclass of Dataset, DCAT-US promotes consistency in metadata representation and enables catalogs to be composed of other catalogs, promoting modularity and extensibility in the data catalog ecosystem.
Property URI Range ReqLevel Card Changes from DCAT-US 1.1
title dcterms:title rdfs:Literal M 1..n Multilingual support
description dcterms:description rdfs:Literal M 1..n Multilingual support
publisher dcterms:publisher foaf:Agent M 1..1 Aligned
dataset dcat:dataset dcat:Dataset M 1..n No Change
homepage foaf:homepage foaf:Document R 0..1 Aligned
language dcterms:language dcterms:LinguisticSystem R 0..n Aligned
license dcterms:license dcterms:LicenseDocument R 0..1 Aligned
release date dcterms:issued rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) R 0..1 Aligned
rights dcterms:rights dcterms:RightsStatement R 0..n Aligned
spatial/geographic coverage dcterms:spatial dcterms:Location R 0..n Aligned
themes dcat:themeTaxonomy skos:ConceptScheme R 0..n Aligned
update/modification date dcterms:modified rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) R 0..1 Aligned
schema version dcterms:conformsTo dcterms:Standard R 0..1 No Change
creator dcterms:creator dcterms:Agent O 0..n Aligned
access rights dcterms:accessRights dcterms:RightsStatement O 0..1 Aligned
catalog dcat:catalog dcat:Catalog O 0..n Aligned
contact point dcat:contactPoint vcard:Kind O 0..n Aligned
keyword/tag dcat:keyword rdfs:Literal O 0..n Aligned
has part dcterms:hasPart dcat:Catalog O 0..n Aligned
catalog record dcat:record dcat:CatalogRecord O 0..n Aligned
service dcat:service dcat:DataService O 0..n Aligned
theme/category dcat:theme skos:Concept O 0..n Aligned
identifier dcterms:identifier rdfs:Literal O 0..n Aligned
rights holder dcterms:rightsHolder org:Organization O 0..1 New!
subject dcterms:subject skos:Concept O 0..n New!
temporal coverage dcterms:temporal dcterms:PeriodOfTime O 0..n Aligned
qualified attribution prov:qualifiedAttribution prov:Attribution O 0..n Aligned
category dcterms:type skos:Concept O 0..1 New!

Mandatory Properties

Property: title
Property title
Requirement level Mandatory
Cardinality 1..n
URI dcterms:title
Range rdfs:Literal
Usage note
  • The title of the catalog in the indicated language.
  • This property can be repeated for parallel language versions of the title.

Property: description

Property description
Requirement level Mandatory
Cardinality 1..n
URI dcterms:description
Range rdfs:Literal
Definition Free-text description of the catalog (in the language indicated in the attribute).
Usage note
  • This property contains a free-text account of the data Catalog (in the language indicated in the attribute).
  • This property can be repeated for parallel language versions of the description.

Property: publisher

Property publisher
Requirement level Mandatory
Cardinality 1..1
URI dcterms:publisher
Range foaf:Agent
Definition Entity responsible for making the catalog available.
Usage note
  • This property refers to an entity (organization) responsible for making the Catalog available.

Property: dataset

Property dataset
Requirement level Mandatory
Cardinality 1..n
URI dcat:dataset
Range dcat:Dataset
Definition Dataset that is part of the catalog.
Usage note
  • This property links the Catalog with a Dataset that is part of the Catalog.
  • As empty Catalogs are usually indications of problems, this property SHOULD be combined with the property service to implement an empty Catalog check.
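
The empty Catalog check mentioned above could be sketched as follows, assuming the Catalog is available as a JSON-LD-style Python dictionary keyed by prefixed property names. The function name is_empty_catalog and the dictionary layout are assumptions made for illustration.

```python
def is_empty_catalog(catalog):
    """Return True when the catalog lists neither datasets nor data services."""
    datasets = catalog.get("dcat:dataset") or []
    services = catalog.get("dcat:service") or []
    return len(datasets) == 0 and len(services) == 0


# Placeholder catalogs; the example.gov URI is not a real identifier.
populated = {"dcat:dataset": [{"@id": "https://example.gov/dataset/1"}]}
empty = {"dcat:dataset": [], "dcat:service": []}

print(is_empty_catalog(populated))  # False
print(is_empty_catalog(empty))      # True
```

A harvester might log a warning, rather than reject the Catalog outright, when this check returns True.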

Optional Properties

Property: creator

Property creator
Requirement level Optional
Cardinality 0..n
URI dcterms:creator
Range dcterms:Agent
Definition: The entity responsible for producing the resource.
Usage note
  • Resources of type foaf:Agent are recommended as values for this property.

Property: access rights

Property access rights
Requirement level Optional
Cardinality 0..1
URI dcterms:accessRights
Range dcterms:RightsStatement
Usage note
  • This property refers to information that indicates whether the Catalog is open data, has access restrictions or is not public.
  • CV to be used: [[?DATA-GOV-AR]].

Property: catalog

Property catalog
Requirement level Optional
Cardinality 0..n
URI dcat:catalog
Range dcat:Catalog
Usage note
  • This property refers to a catalog whose contents are of interest in the context of this catalog.

Property: contact point

Property contact point
Requirement level Optional
Cardinality 0..n
URI dcat:contactPoint
Range vcard:Kind
Usage note
  • Relevant contact information for the cataloged resource. Use of vCard is recommended.

Property: keyword/tag

Property keyword/tag
Requirement level Optional
Cardinality 0..n
URI dcat:keyword
Range rdfs:Literal
Usage note
  • A keyword or tag describing the resource.

Property: has part

Property has part
Requirement level Optional
Cardinality 0..n
URI dcterms:hasPart
Range dcat:Catalog
Usage note
  • This property refers to a related catalog that is part of the described catalog.

Property: catalog record

Property catalog record
Definition: A record describing the registration of a single resource (e.g., a dataset, a data service) that is part of the catalog.
Requirement level Optional
Cardinality 0..n
URI dcat:record
Range dcat:CatalogRecord

Property: service

Property service
Requirement level Optional
Cardinality 0..n
URI dcat:service
Range dcat:DataService
Usage note
  • This property refers to a site or end-point (Data Service) that is listed in the Catalog.
  • As empty Catalogs are usually indications of problems, this property SHOULD be combined with the property dataset to implement an empty Catalog check.

Property: theme/category

Property theme/category
Requirement level Optional
Cardinality 0..n
URI dcat:theme
Range skos:Concept
Usage note
  • This property refers to a category of the Catalog. A Catalog may be associated with multiple themes.
  • CV to be used: [[?DATA-GOV-THEME]]

Property: identifier

Property identifier
Requirement level Optional
Cardinality 0..n
URI dcterms:identifier
Range rdfs:Literal
Usage note
  • This property contains the main identifier for the Catalog, e.g. the URI or other unique identifier.

Property: rights holder

Property rights holder
Requirement level Optional
Cardinality 0..n
URI dcterms:rightsHolder
Range org:Organization
Usage note
  • This property refers to an organization holding rights on the Catalog.

Property: subject

Property subject
Requirement level Optional
Cardinality 0..n
URI dcterms:subject
Range skos:Concept

Property: temporal coverage

Property temporal coverage
Requirement level Optional
Cardinality 0..n
URI dcterms:temporal
Range dcterms:PeriodOfTime
Usage note
  • This property refers to a temporal period that the Catalog covers.

Property: qualified attribution

Property qualified attribution
Requirement level Optional
Cardinality 0..n
URI prov:qualifiedAttribution
Range prov:Attribution
Usage note
  • This property refers to a link to an Agent having some form of responsibility for the Catalog.

Property: category

Property category
Requirement level Optional
Cardinality 0..1
URI dcterms:type
Range skos:Concept
Usage note
  • The category of the Catalog

Example
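
The sketch below shows one way to check the mandatory Catalog properties and their cardinalities against a JSON-LD-style Python dictionary. The catalog content, the example.gov URIs, and the helper names as_list and cardinality_errors are hypothetical; a production setup would more likely rely on the SHACL shapes that accompany the profile.

```python
# (min, max) cardinalities of the mandatory Catalog properties; None means unbounded.
CATALOG_MANDATORY = {
    "dcterms:title": (1, None),
    "dcterms:description": (1, None),
    "dcterms:publisher": (1, 1),
    "dcat:dataset": (1, None),
}


def as_list(value):
    """Normalize a missing, single, or repeated value to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def cardinality_errors(node, rules):
    """Return the properties whose number of values falls outside the rules."""
    errors = []
    for prop, (low, high) in rules.items():
        count = len(as_list(node.get(prop)))
        if count < low or (high is not None and count > high):
            errors.append(prop)
    return errors


# Placeholder catalog; titles, descriptions, and URIs are invented for illustration.
catalog = {
    "@type": "dcat:Catalog",
    "dcterms:title": [{"@value": "Example Agency Data Catalog", "@language": "en"}],
    "dcterms:description": [{"@value": "Datasets published by Example Agency.", "@language": "en"}],
    "dcterms:publisher": {"@id": "https://example.gov/agency"},
    "dcat:dataset": [{"@id": "https://example.gov/dataset/1"}],
}

print(cardinality_errors(catalog, CATALOG_MANDATORY))  # []
```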

Catalog Record

RDF Class: dcat:CatalogRecord
Definition: A record in a catalog, describing the registration of a single dcat:Resource.
Usage note This class is optional and not all catalogs will use it. It exists for catalogs where a distinction is made between metadata about a dataset or service and metadata about the entry in the catalog about the dataset or service. For example, the publication date property of the dataset reflects the date when the information was originally made available by the publishing agency, while the publication date of the catalog record is the date when the dataset was added to the catalog. In cases where both dates differ, or where only the latter is known, the publication date SHOULD only be specified for the catalog record. Notice that the W3C PROV Ontology [[PROV-O]] allows describing further provenance information such as the details of the process and the agent involved in a particular change to a dataset or its registration.
Rationale While its use is not mandatory, the incorporation of dcat:CatalogRecord into DCAT-US 3.0 holds significant value. It enables catalogs to distinguish between metadata describing datasets or services and the actual catalog entries. This differentiation proves especially advantageous for ensuring adherence to application profiles that demand specific metadata for catalog records. Furthermore, it streamlines resource lifecycle management, empowering catalogs to monitor alterations and revisions to their entries, ultimately bolstering data governance and quality assurance protocols.

Properties Summary

Property URI Range ReqLevel Card
application profile dcterms:conformsTo dcterms:Standard R 0..1
change type adms:status skos:Concept R 0..1
description dcterms:description rdfs:Literal O 0..n
language dcterms:language dcterms:LinguisticSystem O 0..n
listing date dcterms:issued rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) O 0..n
update/modification date dcterms:modified rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) M 1..1
primary topic foaf:primaryTopic dcat:Resource M 1..1
source metadata dcterms:source dcat:Resource O 0..1
title dcterms:title rdfs:Literal O 0..n

Mandatory Properties

Property: update/modification date

Property update/modification date
Requirement level Mandatory
Cardinality 1..1
URI dcterms:modified
Range rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Definition The most recent date on which the Catalog Record's entry was changed or modified.

Property: primary topic

Property primary topic
Requirement level Mandatory
Cardinality 1..1
URI foaf:primaryTopic
Range dcat:Resource
Definition A link to the Dataset, Data service or Catalog described in the Catalog Record.
Usage note A catalog record refers to exactly one entity in a catalog. This can be either a Dataset or a Data Service. To ensure an unambiguous reading of the cardinality, the range is set to Cataloged Resource. However, this range is not intended to require the explicit use of the class Cataloged Resource. Because it is an abstract class, a subclass should be used.

Optional Properties

Property: description

Property description
Requirement level Optional
Cardinality 0..n
URI dcterms:description
Range rdfs:Literal
Definition A free-text account of the Catalog Record. This property can be repeated for parallel language versions of the description.

Property: language

Property language
Requirement level Optional
Cardinality 0..n
URI dcterms:language
Range dcterms:LinguisticSystem
Definition A language used in the textual metadata describing titles, descriptions, etc. of the members of the catalog.
Usage note Resources defined by the Library of Congress [[ISO 639-1]] SHOULD be used.
Usage note This property can be repeated if the metadata is provided in multiple languages.

Property: listing date

Property listing date
Requirement level Optional
Cardinality 0..n
URI dcterms:issued
Range rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Definition The date on which the description of the Dataset was included in the Catalog.

Property: source metadata

Property source metadata
Requirement level Optional
Cardinality 0..1
URI dcterms:source
Range dcat:Resource
Definition The original metadata that was used in creating metadata for the datasets, data services, or catalogs in the Catalog Record.

Property: title

Property title
Requirement level Optional
Cardinality 0..n
URI dcterms:title
Range rdfs:Literal
Definition A name given to the Catalog Record.
Usage note This property can be repeated for parallel language versions of the name.

Example
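
The usage note for this class distinguishes the publication date of a dataset from the date on which it was listed in the catalog. The following sketch illustrates that distinction with a hypothetical dataset and Catalog Record. All dates, the example.gov URI, and the helper name listing_lag_days are invented for illustration, and the helper assumes both dates are xsd:date values.

```python
from datetime import date

# Placeholder dataset: dcterms:issued is when the agency first published the data.
dataset = {
    "@id": "https://example.gov/dataset/123",
    "dcterms:issued": "2021-03-01",
}

# Placeholder record: dcterms:issued is when the dataset was added to the catalog.
record = {
    "@type": "dcat:CatalogRecord",
    "foaf:primaryTopic": {"@id": dataset["@id"]},
    "dcterms:issued": "2021-04-15",
    "dcterms:modified": "2023-09-30",
}


def listing_lag_days(dataset, record):
    """Days between the dataset's publication and its listing in the catalog."""
    published = date.fromisoformat(dataset["dcterms:issued"])
    listed = date.fromisoformat(record["dcterms:issued"])
    return (listed - published).days


print(listing_lag_days(dataset, record))
```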

Checksum

RDF Class: spdx:Checksum
Definition: A Checksum is a value that allows the integrity of the contents of a file to be checked. Even small changes to the content of the file will change its checksum. This class allows the results of a variety of checksum and cryptographic message digest algorithms to be represented [[SPDX]].
Usage note
  • The Checksum includes the algorithm (spdx:algorithm) and value (spdx:checksumValue) that allows the integrity of a file to be verified to ensure no errors occurred in transmission or storage.
Rationale: Introducing the spdx:Checksum class in DCAT-US bolsters data integrity and trust by ensuring datasets remain unaltered during transfers. This standardized approach promotes consistency across catalogs, facilitates error detection, and adapts to evolving cryptographic needs, enhancing the utility of automated tools.

Properties Summary

Property URI Range ReqLevel Card
algorithm spdx:algorithm spdx:ChecksumAlgorithm M 1..1
checksum value spdx:checksumValue xsd:hexBinary M 1..1

Mandatory Properties

Property: algorithm

Property algorithm
Requirement level Mandatory
Cardinality 1..1
URI spdx:algorithm
Range spdx:ChecksumAlgorithm
Definition The algorithm used to produce the subject Checksum.

Property: checksum value

Property checksum value
Requirement level Mandatory
Cardinality 1..1
URI spdx:checksumValue
Range xsd:hexBinary
Definition A lower case hexadecimal encoded digest value produced using a specific algorithm.

Example
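
A minimal sketch of producing a Checksum description with Python's standard hashlib module follows. The choice of SHA-256, the value spdx:checksumAlgorithm_sha256 for the algorithm, and the helper name sha256_checksum are assumptions made for illustration; the profile itself does not prescribe a particular algorithm here.

```python
import hashlib


def sha256_checksum(data: bytes) -> dict:
    """Describe the SHA-256 digest of data as a JSON-LD-style spdx:Checksum."""
    return {
        "@type": "spdx:Checksum",
        # Assumed identifier for the SHA-256 member of spdx:ChecksumAlgorithm.
        "spdx:algorithm": {"@id": "spdx:checksumAlgorithm_sha256"},
        # hexdigest() returns a lower case hexadecimal string.
        "spdx:checksumValue": hashlib.sha256(data).hexdigest(),
    }


original = sha256_checksum(b"id,name\n1,alpha\n")
altered = sha256_checksum(b"id,name\n1,alphb\n")

# A one-character change in the file content yields a different checksum value.
print(original["spdx:checksumValue"] != altered["spdx:checksumValue"])  # True
```

A consumer would recompute the digest of the downloaded file and compare it with the published checksum value.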

Concept

RDF Class: skos:Concept
Definition: A controlled vocabulary term used to classify Catalog, Dataset, or Data Service.
Usage note
  • Following FAIR Vocabulary principles, Concept URIs should be made resolvable and accessible using SKOS encoding and provided in a Linked Data format (RDF/XML, Turtle, JSON-LD, N-Triples).
  • Ensure FAIR Resolvability: Make Concept URIs resolvable using FAIR principles, allowing them to be Findable, Accessible, Interoperable, and Reusable. This ensures that skos:Concept instances can be easily discovered, accessed, integrated with other resources, and reused across the DCAT-US ecosystem, promoting data interoperability and accessibility.
  • To enhance data interoperability and consistency, it is advisable to reuse established controlled vocabularies such as Global Change Master Directory (GCMD) [[?GCMD]], Agrovoc, and NAICS for data description.
Rationale: The inclusion of skos:Concept in DCAT-US 3.0 enhances semantic search in catalogs, enabling more accurate discovery of Catalogs, Datasets, and Data Services. It improves user experience, promotes data discoverability, and supports better resource utilization. Additionally, it aligns with international standards like SKOS, ensuring compatibility and adherence to recognized controlled vocabulary practices.

Properties Summary

Property URI Range ReqLevel Card
alternate label skos:altLabel rdfs:Literal O 0..n
definition skos:definition rdfs:Literal R 0..n
in scheme skos:inScheme skos:ConceptScheme M 1..1
notation skos:notation xsd:string O 0..n
preferred label skos:prefLabel rdfs:Literal M 1..n

Mandatory Properties

Property: preferred label

Property preferred label
Requirement level Mandatory
Cardinality 1..n
URI skos:prefLabel
Range rdfs:Literal
Definition Preferred label for the controlled vocabulary term (one per language).

Property: in scheme

Property in scheme
Requirement level Mandatory
Cardinality 1
URI skos:inScheme
Range skos:ConceptScheme
Definition Concept scheme defining the concept.

Optional Properties

Property: alternate label

Property alternate label
Requirement level Optional
Cardinality 0..n
URI skos:altLabel
Range rdfs:Literal
Definition Alternative labels for a concept.

Property: notation

Property notation
Requirement level Optional
Cardinality 0..n
URI skos:notation
Range xsd:string
Definition Abbreviations or codes from code lists for an organization.

Example
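
The preferred label definition above allows one label per language. A sketch of a Concept as a JSON-LD-style Python dictionary, together with a check for languages that carry more than one preferred label, is shown below. The labels, the example.gov scheme URI, and the helper name duplicate_label_languages are hypothetical.

```python
from collections import Counter

# Placeholder concept; labels and the scheme URI are invented for illustration.
concept = {
    "@type": "skos:Concept",
    "skos:prefLabel": [
        {"@value": "Agriculture", "@language": "en"},
        {"@value": "Agricultura", "@language": "es"},
    ],
    "skos:inScheme": {"@id": "https://example.gov/vocab/themes"},
}


def duplicate_label_languages(node, prop="skos:prefLabel"):
    """Return the language tags that occur on more than one label of prop."""
    counts = Counter(label.get("@language", "") for label in node.get(prop, []))
    return sorted(lang for lang, n in counts.items() if n > 1)


print(duplicate_label_languages(concept))  # []
```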

Concept Scheme

RDF Class: skos:ConceptScheme
Definition: A concept collection (e.g. controlled vocabulary) in which a concept is defined.
Usage note
  • Following FAIR Vocabulary principles, Concept Scheme URIs should be made resolvable and accessible using SKOS encoding and provided in a Linked Data format (RDF/XML, Turtle, JSON-LD, N-Triples).
  • To enhance data interoperability and consistency, it is advisable to reuse established controlled vocabularies such as Global Change Master Directory (GCMD) [[?GCMD]], Agrovoc, and NAICS for data description.
Rationale: The introduction of skos:ConceptScheme in DCAT-US 3.0 enhances data resource organization, categorization, and accessibility. It provides a structured framework for controlled vocabularies, aligning with FAIR Vocabulary principles for improved data interoperability and discoverability.

Properties Summary

Property URI Range ReqLevel Card
title dcterms:title rdfs:Literal M 1..n
description dcterms:description rdfs:Literal R 0..n
creation date dcterms:created rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) O 0..1
publication date dcterms:issued rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) O 0..1
update/modification date dcterms:modified rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) O 0..1
version info dcat:version xsd:string O 0..1

Mandatory Properties

Property: title

Property title
Requirement level Mandatory
Cardinality 1..n
URI dcterms:title
Range rdfs:Literal
Definition The title of the concept scheme in the indicated language.
Usage note Only one title per language.

Optional Properties

Property: creation date

Property creation date
Requirement level Optional
Cardinality 0..1
URI dcterms:created
Range rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Definition This property contains the date on which the Concept Scheme has been first created.

Property: publication date

Property publication date
Requirement level Optional
Cardinality 0..1
URI dcterms:issued
Range rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Definition This property contains the date of formal issuance (e.g., publication) of the Concept Scheme.

Property: update/modification date

Property update/modification date
Requirement level Optional
Cardinality 0..1
URI dcterms:modified
Range rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Definition This property contains the most recent date at which the Concept Scheme was changed or modified.

Property: version info

Property version info
Requirement level Optional
Cardinality 0..1
URI dcat:version
Range xsd:string
Definition This property contains a version number or other version designation of the Concept Scheme.

Examples
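
The date properties above accept literals typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth. The sketch below guesses the datatype from the lexical form of a value. The patterns are deliberately simplified assumptions: they ignore the optional time zone that XML Schema permits on xsd:date, xsd:gYear and xsd:gYearMonth, and they do not range-check months or days. The helper name guess_date_datatype is hypothetical.

```python
import re

# Simplified lexical patterns, tried from most to least specific.
PATTERNS = [
    ("xsd:dateTime", r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?"),
    ("xsd:date", r"\d{4}-\d{2}-\d{2}"),
    ("xsd:gYearMonth", r"\d{4}-\d{2}"),
    ("xsd:gYear", r"\d{4}"),
]


def guess_date_datatype(value):
    """Return the first datatype whose pattern matches the whole value, else None."""
    for datatype, pattern in PATTERNS:
        if re.fullmatch(pattern, value):
            return datatype
    return None


print(guess_date_datatype("2023"))        # xsd:gYear
print(guess_date_datatype("2023-07-04"))  # xsd:date
print(guess_date_datatype("July 2023"))   # None
```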

Contact

RDF Class: vcard:Kind
Definition: Point of Contact information
Rationale: The introduction of vcard:Kind in DCAT-US 3.0 is driven by the need for standardized, reliable, and interoperable Point of Contact information, ultimately improving the accessibility and usability of data resources within the DCAT-US ecosystem.

Properties Summary

Property URI Range ReqLevel Card
formatted name vcard:fn xsd:string M 1
email vcard:hasEmail rdfs:Resource M 1
telephone vcard:tel rdfs:Resource O 0..1
organization name vcard:organization-name xsd:string O 0..1
family name vcard:family-name xsd:string O 0..1
given name vcard:given-name xsd:string O 0..1
position title vcard:title xsd:string O 0..1
has UID vcard:hasUID xsd:string O 0..1
address vcard:address vcard:Address O 0..n

Mandatory Properties

Property: formatted name

Property formatted name
Requirement level Mandatory
Cardinality 1
URI vcard:fn
Range xsd:string
Definition The formatted text corresponding to the name of the contact

Property: email

Property email
Requirement level Mandatory
Cardinality 1
URI vcard:hasEmail
Range rdfs:Resource
Definition The email address of the contact.
Usage note Use an email address tied to a function rather than to an individual (e.g. support). The email address should be formatted as a URL using the "mailto:" scheme.

Optional Properties

Property: telephone

Property telephone
Requirement level Optional
Cardinality 0..1
URI vcard:tel
Range rdfs:Resource
Definition This property specifies the telephone number for telephony communication with the person or organization.

Property: organization name

Property organization name
Requirement level Optional
Cardinality 0..1
URI vcard:organization-name
Range xsd:string
Definition This property specifies the name of the organization to contact

Property: family name

Property family name
Requirement level Optional
Cardinality 0..1
URI vcard:family-name
Range xsd:string
Definition This property specifies the family name of the person to contact

Property: given name

Property given name
Requirement level Optional
Cardinality 0..1
URI vcard:given-name
Range xsd:string
Definition This property specifies the given name of the person to contact

Property: position title

Property position title
Requirement level Optional
Cardinality 0..1
URI vcard:title
Range xsd:string
Definition This property specifies the position role of the person to contact

Property: has UID

Property has UID
Requirement level Optional
Cardinality 0..1
URI vcard:hasUID
Range xsd:string
Definition This property specifies a value that represents a globally unique identifier corresponding to the contact (could also be used as URI component of the contact)
Usage note The hasUID property is used to assign a unique identifier to a contact associated with a dataset or catalog. This identifier, which is optional and should be a string, ensures that each contact can be distinctly recognized and referenced. The utility of this property is particularly evident in scenarios where contacts need to be uniquely identified across different datasets or catalogs, preventing any ambiguity. It can also serve as a part of a URI for a contact, providing a consistent and resolvable identifier. Implementers are encouraged to use a globally unique string value, such as an ORCID or a URI that is guaranteed to be unique, to facilitate unambiguous identification and referencing of contacts.

Property: address

Property address
Requirement level Optional
Cardinality 0..n
URI vcard:address
Range vcard:Address
Definition This property specifies the address of the contact

Example
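
A sketch of a Contact as a JSON-LD-style Python dictionary follows, with the email address normalized to the "mailto:" form described in the usage note for the email property. The contact details, the example.gov address, and the helper name to_mailto are hypothetical.

```python
def to_mailto(address: str) -> str:
    """Return address as a mailto: URL, leaving existing mailto: values untouched."""
    address = address.strip()
    if address.lower().startswith("mailto:"):
        return address
    return "mailto:" + address


# Placeholder contact using a functional mailbox rather than an individual's address.
contact = {
    "@type": "vcard:Kind",
    "vcard:fn": "Open Data Support Team",
    "vcard:hasEmail": {"@id": to_mailto("support@example.gov")},
}

print(contact["vcard:hasEmail"]["@id"])  # mailto:support@example.gov
```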

CUI Restriction

Controlled Unclassified Information (CUI) is information that requires safeguarding or dissemination controls pursuant to and consistent with applicable law, regulations, and government-wide policies but is not classified.

RDF Class: dcat-us:CuiRestriction
Definition: Represents Controlled Unclassified Information (CUI), which is information that requires safeguarding or dissemination controls in accordance with applicable laws, regulations, and government-wide policies but is not classified.
Usage note
  • The CUI Restriction class is designed to capture information related to Controlled Unclassified Information (CUI) in accordance with NARA guidelines.
  • Users of this class must provide the mandatory properties, i.e., the CUI banner marking and designation indicator, to accurately describe the CUI status of a resource.
  • The optional property, "required indicator per authority," allows for additional information or context about CUI restrictions, providing flexibility for specific use cases.
Rationale: The introduction of the dcat-us:CuiRestriction class in DCAT-US 3.0 is driven by the need for compliance with National Archives and Records Administration (NARA) guidelines regarding Controlled Unclassified Information (CUI). This addition ensures that DCAT-US aligns with NARA's standards, promotes transparency, facilitates compliance audits, and supports efficient resource management. Ultimately, it enhances data interoperability and security within the government data ecosystem.

Properties Summary

Property URI Range ReqLevel Card
CUI banner marking dcat-us:cuiBannerMarking xsd:string M 1..1
CUI designation indicator dcat-us:designationIndicator xsd:string M 1..1
required indicator per authority dcat-us:requiredIndicatorPerAuthority xsd:string O 0..n

Mandatory Properties

Property: CUI banner marking

Property CUI banner marking
Requirement level Mandatory
Cardinality 1
URI dcat-us:cuiBannerMarking
Range xsd:string
Definition CUI (Controlled Unclassified Information) banner marking is required for any unclassified information that is deemed sensitive and requires protection.

Property: CUI designation indicator

Property CUI designation indicator
Requirement level Mandatory
Cardinality 1
URI dcat-us:designationIndicator
Range xsd:string
Definition The designation indicator shows which agency designated the document as CUI.
Usage note
  • Free text per NARA Marking Guidebook and DODI 5200.48 (should have at least "Controlled by:").
  • It is best practice to include contact information.

Optional Properties

Property: required indicator per authority

Property required indicator per authority
Requirement level Optional
Cardinality 0..n
URI dcat-us:requiredIndicatorPerAuthority
Range xsd:string
Definition Free text (e.g., text of the category description or the distribution statement).

Example
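
The sketch below describes a CUI Restriction as a JSON-LD-style Python dictionary and checks the two mandatory properties, including the "Controlled by:" text that the usage note expects in the designation indicator. The marking text, the agency name, the contact address, and the helper name cui_problems are invented for illustration.

```python
# Placeholder CUI Restriction; values are illustrative, not authoritative markings.
cui = {
    "@type": "dcat-us:CuiRestriction",
    "dcat-us:cuiBannerMarking": "CUI",
    "dcat-us:designationIndicator": "Controlled by: Example Agency, Office of Data, data@example.gov",
}

MISSING_BANNER = "missing CUI banner marking"
MISSING_INDICATOR = "missing designation indicator"
NO_CONTROLLED_BY = 'designation indicator lacks "Controlled by:"'


def cui_problems(node):
    """Return a list of problems found in a CUI Restriction description."""
    problems = []
    if not node.get("dcat-us:cuiBannerMarking"):
        problems.append(MISSING_BANNER)
    indicator = node.get("dcat-us:designationIndicator", "")
    if not indicator:
        problems.append(MISSING_INDICATOR)
    elif "controlled by:" not in indicator.lower():
        problems.append(NO_CONTROLLED_BY)
    return problems


print(cui_problems(cui))  # []
```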

Data Service

RDF Class: dcat:DataService
Definition: A collection of operations that provides access to one or more datasets or data processing functions.
Sub-class of: dcat:Resource
Sub-class of: dctype:Service
Usage note
  • If a dcat:DataService is bound to one or more specified Datasets, they are indicated by the dcat:servesDataset property.
  • The kind of service can be indicated using the dcterms:type property. Its value may be taken from a controlled vocabulary such as the Data.GOV spatial data service type code list [[?DATA-GOV-SDST]].
Rationale: Introducing dcat:DataService is essential as it clarifies the representation of data services, addressing the confusion caused by using dcat:Distribution to describe services in DCAT 1. This addition promotes clear communication of service-related information, improving discoverability, and facilitating seamless integration and usage by data consumers and applications.
Property URI Range ReqLevel Card
endpoint URL dcat:endpointURL rdfs:Resource M 1..n
contact point dcat:contactPoint vcard:Kind M 1..n
publisher dcterms:publisher foaf:Agent M 1..1
title dcterms:title rdfs:Literal M 1..n
endpoint description dcat:endpointDescription rdfs:Resource R 0..n
license dcterms:license dcterms:LicenseDocument R 0..1
serves dataset dcat:servesDataset dcat:Dataset R 0..n
keyword/tag dcat:keyword rdfs:Literal O 0..n
spatial resolution in meters dcat:spatialResolutionInMeters rdfs:Literal typed as xsd:decimal O 0..n
temporal resolution dcat:temporalResolution rdfs:Literal typed as xsd:duration O 0..n
theme/category dcat:theme skos:Concept O 0..n
access rights dcterms:accessRights dcterms:RightsStatement O 0..1
conforms to dcterms:conformsTo dcterms:Standard O 0..n
creation date dcterms:created rdfs:Literal typed as xsd:date or xsd:dateTime O 0..1
creator dcterms:creator dcterms:Agent O 0..n
description dcterms:description rdfs:Literal O 0..n
identifier dcterms:identifier rdfs:Literal O 0..n
language dcterms:language dcterms:LinguisticSystem O 0..n
update/modification date dcterms:modified rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) O 0..1
rights dcterms:rights dcterms:RightsStatement O 0..n
rights holder dcterms:rightsHolder org:Organization O 0..n
spatial/geographic coverage dcterms:spatial dcterms:Location O 0..n
status adms:status skos:Concept O 0..1
temporal coverage dcterms:temporal dcterms:PeriodOfTime O 0..n
category dcterms:type skos:Concept O 0..1
quality measurement dqv:hasQualityMeasurement dqv:QualityMeasurement O 0..n
qualified attribution prov:qualifiedAttribution prov:Attribution O 0..n
was used by prov:wasUsedBy prov:Activity O 0..n
geographic bounding box dcat-us:geographicBoundingBox dcat-us:GeographicBoundingBox O 0..n

Mandatory Properties

Property: endpoint URL

Property endpoint URL
Requirement level Mandatory
Cardinality 1..n
URI dcat:endpointURL
Range rdfs:Resource
Usage note The root location or primary endpoint of the service (a Web-resolvable IRI)

Property: contact point

Property contact point
Requirement level Mandatory
Cardinality 1..n
URI dcat:contactPoint
Range vcard:Kind
Definition This property contains contact information that can be used for sending comments about the Data Service.
Usage note
  • This property MUST contain an email address that is continuously monitored by the data publisher.
  • If there are several contributors involved in the publication of the Data Service, the property can be used multiple times.
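
The requirement that each contact point carries an email address could be checked along these lines, assuming the Data Service is available as a JSON-LD-style Python dictionary. The service content, the example.gov address, and the helper name contact_points_without_email are hypothetical. Whether the mailbox is actually monitored cannot be verified from metadata alone.

```python
def contact_points_without_email(service):
    """Return the contact points of a Data Service that carry no vcard:hasEmail."""
    contacts = service.get("dcat:contactPoint") or []
    if isinstance(contacts, dict):
        contacts = [contacts]
    return [c for c in contacts if not c.get("vcard:hasEmail")]


# Placeholder Data Service with one complete and one incomplete contact point.
service = {
    "@type": "dcat:DataService",
    "dcat:contactPoint": [
        {"vcard:fn": "API Support", "vcard:hasEmail": {"@id": "mailto:api@example.gov"}},
        {"vcard:fn": "Help Desk"},
    ],
}

print([c["vcard:fn"] for c in contact_points_without_email(service)])  # ['Help Desk']
```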

Property: publisher

Property publisher
Requirement level Mandatory
Cardinality 1..1
URI dcterms:publisher
Range foaf:Agent
Definition This property refers to an entity (organization) responsible for making the Data Service available.
Usage note This property refers to an entity (organization) responsible for making the Data Service available.

Property: title

Property title
Requirement level Mandatory
Cardinality 1..n
URI dcterms:title
Range rdfs:Literal
Usage note
  • The title of the Data Service in the indicated language.
  • This property can be repeated for parallel language versions of the title.

Optional Properties

Property: keyword/tag

Property keyword/tag
Requirement level Optional
Cardinality 0..n
URI dcat:keyword
Range rdfs:Literal
Definition This property contains a keyword or tag describing the Data Service.

Property: spatial resolution in meters

Property spatial resolution in meters
Requirement level Optional
Cardinality 0..n
URI dcat:spatialResolutionInMeters
Range rdfs:Literal typed as xsd:decimal
Definition This property refers to the minimum spatial separation resolvable in a Data Service, measured in meters.

Property: temporal resolution

Property temporal resolution
Requirement level Optional
Cardinality 0..n
URI dcat:temporalResolution
Range rdfs:Literal typed as xsd:duration
Definition The minimum time period resolvable by the Data Service.
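
Temporal resolution values are typed as xsd:duration, for example "PT15M" for fifteen minutes or "P1D" for one day. The sketch below checks the lexical form with a regular expression. The pattern is a simplified assumption based on the XML Schema duration syntax, and the helper name is_xsd_duration is hypothetical.

```python
import re

# Simplified xsd:duration pattern: optional sign, "P", date parts, then an
# optional "T" followed by at least one time part.
DURATION = re.compile(
    r"-?P(?=.)(\d+Y)?(\d+M)?(\d+D)?(T(?=.)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?"
)


def is_xsd_duration(value):
    """Return True when value matches the simplified xsd:duration pattern."""
    return DURATION.fullmatch(value) is not None


# Placeholder Data Service fragment with a typed temporal resolution literal.
service = {
    "dcat:temporalResolution": {"@value": "PT15M", "@type": "xsd:duration"},
}

print(is_xsd_duration(service["dcat:temporalResolution"]["@value"]))  # True
print(is_xsd_duration("15 minutes"))                                  # False
```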

Property: theme/category

Property theme/category
Requirement level Optional
Cardinality 0..n
URI dcat:theme
Range skos:Concept
Definition This property refers to a theme of the Data Service. A Data Service may be associated with multiple themes.
Usage note CV to be used: [[?DATA-GOV-THEME]]

Property: access rights

Property access rights
Requirement level Optional
Cardinality 0..1
URI dcterms:accessRights
Range dcterms:RightsStatement
Definition This property MAY include information regarding access or restrictions based on privacy, security, or other policies.
Usage note CV must be used: [[?DATA-GOV-AR]]

Property: conforms to

Property conforms to
Requirement level Optional
Cardinality 0..n
URI dcterms:conformsTo
Range dcterms:Standard
Definition This property is used to indicate the general standard or specification that the Data Service endpoints implement.

Property: creation date

Property creation date
Requirement level Optional
Cardinality 0..1
URI dcterms:created
Range rdfs:Literal typed as xsd:date or xsd:dateTime
Definition This property contains the date on which the Data Service has been first created.

Property: creator

Property creator
Requirement level Optional
Cardinality 0..n
URI dcterms:creator
Range foaf:Agent
Usage note This property refers to the Agent primarily responsible for producing the Data Service.

Property: description

Property description
Requirement level Optional
Cardinality 0..n
URI dcterms:description
Range rdfs:Literal
Definition This property contains a free-text account of the Data Service.
Usage note This property can be repeated for parallel language versions of the description. On the user interface of data portals, the content of the element whose language corresponds to the display language selected by the user is displayed.

Property: identifier

Property identifier
Requirement level Optional
Cardinality 0..n
URI dcterms:identifier
Range rdfs:Literal
Definition This property contains the main identifier for the Data Service, e.g. the URI or other unique identifier in the context of the Catalog.

Property: language

Property language
Requirement level Optional
Cardinality 0..n
URI dcterms:language
Range dcterms:LinguisticSystem
Definition This property refers to a language supported by the Data Service. This property can be repeated if multiple languages are supported in the Data Service.
Usage note Resources defined by the Library of Congress ([[ISO 639-1]]) SHOULD be used.
Usage note This property can be repeated if the service is provided in multiple languages (e.g., a map service rendering maps in Spanish or English).

Property: update/modification date

Property update/modification date
Requirement level Optional
Cardinality 0..1
URI dcterms:modified
Range rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Definition This property contains the most recent date on which the Data Service was changed or modified.

Property: rights

Property rights
Requirement level Optional
Cardinality 0..n
URI dcterms:rights
Range dcterms:RightsStatement
Definition A statement that concerns all rights for the Data Service not addressed with dcterms:license or dcterms:accessRights, such as copyright statements.

Property: rights holder

Property rights holder
Requirement level Optional
Cardinality 0..n
URI dcterms:rightsHolder
Range org:Organization
Definition This property refers to an Agent (organization) holding rights on the Data Service.

Property: spatial/geographic coverage

Property spatial/geographic coverage
Requirement level Optional
Cardinality 0..n
URI dcterms:spatial
Range dcterms:Location
Definition This property refers to a geographic region that is covered by the Data Service.
Usage note TO DISCUSS: Conventions to be used: The Vocabularies Name Authority Lists MUST be used for continents, countries and places that are in those lists; if a particular location is not in one of the mentioned Named Authority Lists, Geonames URIs MUST be used: [[?DATA-GOV-CONT]], [[?DATA-GOV-COUNTRY]], [[?DATA-GOV-PLACE]], [[GEONAMES]]

Property: temporal coverage

Property temporal coverage
Requirement level Optional
Cardinality 0..n
URI dcterms:temporal
Range dcterms:PeriodOfTime
Definition This property refers to a temporal period that the Data Service covers.

Property: category

Property category
Requirement level Optional
Cardinality 0..1
URI dcterms:type
Range skos:Concept
Definition Category of the data service
Usage note This property SHOULD take as value one of the URIs of a concept defined in service type taxonomy or code list.

Property: quality measurement

Property quality measurement
Requirement level Optional
Cardinality 0..n
URI dqv:hasQualityMeasurement
Range dqv:QualityMeasurement
Definition Refers to the performed quality measurements. It represents the evaluation of a given dataset against a specific quality metric.
Usage note Use for quality measurements of data services (availability, response time, reliability).

Property: qualified attribution

Property qualified attribution
Requirement level Optional
Cardinality 0..n
URI prov:qualifiedAttribution
Range prov:Attribution
Definition This property refers to a link to an Agent having some form of responsibility for the Data Service.

Property: status

Property status
Requirement level Optional
Cardinality 0..1
URI adms:status
Range skos:Concept
Usage note This property refers to the maturity of the Data Service. It MUST take one of the values Completed, Deprecated, Under Development, Withdrawn from the ADMS status [[VOCAB-ADMS-SKOS]] vocabulary.

Property: was used by

Property was used by
Requirement level Optional
Cardinality 0..n
URI prov:wasUsedBy
Range prov:Activity
Definition This property refers to an Activity that used the Data Service.
Usage note This property MAY be used to specify a testing Activity over a Data Service, against a given Standard, producing as output a conformance degree.

Property: geographic bounding box

Property geographic bounding box
Requirement level Optional
Cardinality 0..n
URI dcat-us:geographicBoundingBox
Range dcat-us:GeographicBoundingBox
Definition This property describes the spatial extent of the domain of application of a data service and is standardized in the WGS 84 Lat/Long coordinate system.

Example

Dataset

A Dataset is a collection of data, published or curated by a single source and related by a common idea or concept. In contrast to a Data Service, a Dataset is expected to be a collection of data that is available for access or download in one or more formats, as Distributions. Distributions belonging to the same Dataset should not differ with regard to the idea of the data that they represent. They may differ in the physical representation of the data, such as format or resolution, or they may split the data of the Dataset into portions of comparable size, such as data per time period or location.

DCAT 3 provides guidelines about the usage of Data Services and Distributions in relation to Datasets [[VOCAB-DCAT-3]].

RDF Class: dcat:Dataset
Definition: A collection of data, published or curated by a single agent, and available for access or download in one or more representations.
Subclass Of: dcat:Resource
Usage note
  • This class describes the conceptual dataset. One or more representations might be available, with differing schematic layouts and formats or serializations.
  • This class describes the actual dataset as published by the dataset provider. In cases where a distinction between the actual dataset and its entry in the catalog is necessary (because metadata such as modification date might differ), the dcat:CatalogRecord class can be used for the latter.
  • The notion of dataset in DCAT is broad and inclusive, with the intention of accommodating resource types arising from all communities. Data comes in many forms including numbers, text, pixels, imagery, sound and other multi-media, and potentially other types, any of which might be collected into a dataset.
Rationale: The update of dcat:Dataset is crucial as it aligns the DCAT profile with international standards, offering a standardized and widely recognized way to describe datasets. This alignment enhances data interoperability and discoverability, enabling data publishers to provide structured metadata, improving data sharing, and facilitating seamless integration for users and applications.
Property URI Range ReqLevel Card Changes from DCAT-US 1.1
title dcterms:title rdfs:Literal M 1..n Multilingual support
description dcterms:description rdfs:Literal M 1..n Multilingual support
contact point dcat:contactPoint vcard:Kind R 0..n No Change
data dictionary dcat-us:describedBy dcat:Distribution R 0..1 Fixed
dataset distribution dcat:distribution dcat:Distribution R 0..n No Change
identifier dcterms:identifier rdfs:Literal R 0..n Fixed
spatial/geographic coverage dcterms:spatial dcterms:Location R 0..n Fixed
keyword/tag dcat:keyword rdfs:Literal R 0..n No Change
landing page dcat:landingPage foaf:Document R 0..n No Change
update/modification date dcterms:modified rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) R 0..1 No Change
publisher dcterms:publisher foaf:Agent R 0..1 No Change
geographic bounding box dcat-us:geographicBoundingBox dcat-us:GeographicBoundingBox R 0..n New!
temporal coverage dcterms:temporal dcterms:PeriodOfTime R 0..n Fixed
theme/category dcat:theme skos:Concept R 0..n Fixed
access rights dcterms:accessRights dcterms:RightsStatement O 0..1 Aligned
conforms to dcterms:conformsTo dcterms:Standard O 0..n No Change
contributor dcterms:contributor dcterms:Agent O 0..n New!
creator dcterms:creator dcterms:Agent O 0..n Aligned
documentation foaf:page foaf:Document O 0..n New!
frequency dcterms:accrualPeriodicity dcterms:Frequency O 0..1 Fixed
has version dcat:hasVersion dcat:Dataset O 0..n Aligned
image schema:image schema:url or schema:ImageObject O 0..n New!
inSeries dcat:inSeries dcat:DatasetSeries O 0..n Aligned
is referenced by dcterms:isReferencedBy rdfs:Resource O 0..n Aligned
language dcterms:language dcterms:LinguisticSystem O 0..n Fixed
liability statement dcat-us:liabilityStatement dcat-us:LiabilityStatement O 0..1 New!
metadata distribution dcat-us:metadataDistribution dcat:Distribution O 0..n New!
next dcat:next dcat:Dataset O 0..1 Aligned
other identifier adms:identifier adms:Identifier O 0..n New!
purpose dcat-us:purpose rdfs:Literal O 0..n New!
prev dcat:prev dcat:Dataset O 0..1 Aligned
provenance dcterms:provenance dcterms:ProvenanceStatement O 0..n New!
qualified attribution prov:qualifiedAttribution prov:Attribution O 0..n Aligned
qualified relation dcat:qualifiedRelation dcat:Relationship O 0..n Aligned
related resource dcterms:relation rdfs:Resource O 0..n Aligned
release date dcterms:issued rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) O 0..1 No Change
rights dcterms:rights dcterms:RightsStatement O 0..n Fixed
sample adms:sample dcat:Distribution O 0..n New!
scope note skos:scopeNote rdfs:Literal O 0..n New!
source dcterms:source dcat:Dataset O 0..n New!
status adms:status skos:Concept O 0..1 Aligned
subject dcterms:subject skos:Concept O 0..n New!
quality measurement dqv:hasQualityMeasurement dqv:QualityMeasurement O 0..n Aligned
spatial resolution in meters dcat:spatialResolutionInMeters rdfs:Literal (typed as xsd:decimal) O 0..n Aligned
temporal resolution dcat:temporalResolution rdfs:Literal (typed as xsd:duration) O 0..n Aligned
category dcterms:type skos:Concept O 0..1 Aligned
version dcat:version rdfs:Literal O 0..n Aligned
version notes adms:versionNotes rdfs:Literal O 0..n New!
was generated by prov:wasGeneratedBy prov:Activity O 0..n New!
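
The cardinalities in the table above can be checked mechanically. The following is a minimal sketch and not part of the profile, which relies on SHACL for validation. It assumes a dataset description held as a plain Python dict that maps property names to lists of values. The `CARDINALITY` table, the `check_cardinality` function, and the sample record are hypothetical names introduced for illustration, and only a handful of rows from the table are included.

```python
# Hypothetical subset of the Dataset property table.
# Each entry is (minimum, maximum); None means unbounded ("n").
CARDINALITY = {
    "dcterms:title": (1, None),        # M 1..n
    "dcterms:description": (1, None),  # M 1..n
    "dcterms:publisher": (0, 1),       # R 0..1
    "dcterms:modified": (0, 1),        # R 0..1
    "dcat:keyword": (0, None),         # R 0..n
}


def check_cardinality(record):
    """Return a list of readable violations for the properties in CARDINALITY."""
    problems = []
    for prop, (lo, hi) in CARDINALITY.items():
        count = len(record.get(prop, []))
        if count < lo:
            problems.append(f"{prop}: expected at least {lo}, found {count}")
        if hi is not None and count > hi:
            problems.append(f"{prop}: expected at most {hi}, found {count}")
    return problems


# Illustrative record: title present, description missing, two publishers.
sample = {
    "dcterms:title": ["Example dataset"],
    "dcterms:publisher": ["ex:agencyA", "ex:agencyB"],
}
violations = check_cardinality(sample)
for line in violations:
    print(line)
```

A check like this only counts values; it does not verify ranges or controlled vocabularies.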

Mandatory Properties

Property: title

Property title
Requirement level Mandatory
Cardinality 1..n
URI dcterms:title
Range rdfs:Literal
Definition This property contains a name given to the Dataset.
Usage note This property can be repeated for parallel language versions of the title (see Multilingualism).

Property: description

Property description
Requirement level Mandatory
Cardinality 1..n
URI dcterms:description
Range rdfs:Literal
Definition This property contains a free-text account of the Dataset.
Usage note This property can be repeated for parallel language versions of the description (see Multilingualism). On the user interface of data portals, the content of the element whose language corresponds to the display language selected by the user is displayed.
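
The display rule in the usage note above (show the value whose language matches the display language chosen by the user) can be sketched as follows. This is an illustration under stated assumptions, not behavior mandated by the profile: literals are modeled as `(text, language tag)` pairs, and the `pick_literal` function, its fallback to English, and the sample descriptions are hypothetical.

```python
def pick_literal(values, display_lang, fallback_lang="en"):
    """Pick the text whose language tag matches display_lang.

    values is a list of (text, lang) pairs; lang may be None for untagged
    literals. Falls back to fallback_lang, then to the first value, then None.
    """
    by_lang = {}
    for text, lang in values:
        # Keep the first literal seen per tag; compare tags case-insensitively.
        by_lang.setdefault((lang or "").lower(), text)
    for wanted in (display_lang.lower(), fallback_lang.lower()):
        if wanted in by_lang:
            return by_lang[wanted]
    return values[0][0] if values else None


# Hypothetical parallel language versions of one dcterms:description.
descriptions = [
    ("Population counts by county.", "en"),
    ("Recuentos de población por condado.", "es"),
]
print(pick_literal(descriptions, "es"))
```

The fallback order is a design choice of this sketch; a portal may prefer a different default language.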

Optional Properties

Property: access rights

Property access rights
Requirement level Optional
Cardinality 0..1
URI dcterms:accessRights
Range dcterms:RightsStatement
Usage note
  • This property refers to information that indicates whether the Dataset is open data, has access restrictions or is not public.
  • CV to be used: [[?DATA-GOV-AR]].

Property: conforms to

Property conforms to
Requirement level Optional
Cardinality 0..n
URI dcterms:conformsTo
Range dcterms:Standard
Usage note
  • This property refers to an implementing rule or other specification.
  • This property SHOULD be used to indicate the model, schema, ontology, view or profile that this representation of a Dataset conforms to. This is (generally) a complementary concern to the media-type or format.

Property: contributor

Property contributor
Requirement level Optional
Cardinality 0..n
URI dcterms:contributor
Range foaf:Agent
Usage note This property refers to an agent contributing to the Dataset.

Property: creator

Property creator
Requirement level Optional
Cardinality 0..1
URI dcterms:creator
Range foaf:Agent
Usage note This property refers to an entity responsible for producing the dataset.

Property: data dictionary

Property data dictionary
Requirement level Recommended
Cardinality 0..1
URI dcat-us:describedBy
Range dcat:Distribution
Usage note This is used to specify a data dictionary or schema that defines fields (variables, dimensions, measures, attributes) in the dataset.

Property: documentation

Property documentation
Requirement level Optional
Cardinality 0..n
URI foaf:page
Range foaf:Document
Usage note This property refers to a page or document about this Dataset.

Property: frequency

Property frequency
Requirement level Optional
Cardinality 0..1
URI dcterms:accrualPeriodicity
Range dcterms:Frequency
Usage note
  • This property refers to the frequency at which the Dataset is updated.
  • CV to be used: [[CLD-FREQ]].

Property: quality measurement

Property quality measurement
Requirement level Optional
Cardinality 0..n
URI dqv:hasQualityMeasurement
Range dqv:QualityMeasurement
Definition Refers to the performed quality measurements. It represents the evaluation of a given dataset against a specific quality metric.
Usage note Use for quality measurements other than spatial resolution in meters (use dcat:spatialResolutionInMeters). Examples of quality measurements include completeness, accuracy, timeliness, and granularity.

Property: has version

URI dcat:hasVersion
Definition: This resource has a more specific, versioned resource [[?PAV]].
Equivalent property: pav:hasVersion
Sub-property of: dcterms:hasVersion
Sub-property of: prov:generalizationOf
Usage note

A related Dataset that is a version, edition, or adaptation of the described Dataset.

Property: inSeries

Property inSeries
Requirement level Optional
Cardinality 0..n
URI dcat:inSeries
Range dcat:DatasetSeries
Usage note The datasets are linked to the dataset series by using the property dcat:inSeries. Note that a dataset series can also be hierarchical, and a dataset series can be a member of another dataset series.
Definition A dataset series of which the dataset is part.

Property: is referenced by

Property is referenced by
Requirement level Optional
Cardinality 0..n
URI dcterms:isReferencedBy
Range rdfs:Resource
Usage note This property is about a related resource, such as a publication, that references, cites, or otherwise points to the Dataset.

Property: language

Property language
Requirement level Optional
Cardinality 0..n
URI dcterms:language
Range dcterms:LinguisticSystem
Definition A language of the dataset. This refers to the natural language used for textual metadata (i.e., titles, descriptions, etc.) of a dataset.
Usage note Resources defined by the Library of Congress ([[ISO 639-1]]) SHOULD be used.
Usage note The value(s) provided for members of a catalog (i.e., dataset or service) override the value(s) provided for the catalog if they conflict.
Usage note If representations of a dataset are available for each language separately, define an instance of dcat:Distribution for each language and describe the specific language of each distribution using dcterms:language (i.e., the dataset will have multiple dcterms:language values and each distribution will have just one as the value of its dcterms:language property). In case of multilingual distributions, the distributions will have multiple dcterms:language values.
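
The relationship described in the last usage note (each single-language distribution carries one dcterms:language value, while the dataset carries all of them) can be sketched as follows. This is a minimal illustration, assuming distributions are held as plain dicts. The `dataset_languages` function, the sample titles, and the short language codes are hypothetical stand-ins for the dcterms:LinguisticSystem resources the profile expects.

```python
def dataset_languages(distributions):
    """Collect distinct dcterms:language values across distributions, in first-seen order."""
    seen = []
    for dist in distributions:
        for lang in dist.get("dcterms:language", []):
            if lang not in seen:
                seen.append(lang)
    return seen


# Two single-language distributions and one multilingual distribution.
distributions = [
    {"dcterms:title": "CSV (English)", "dcterms:language": ["en"]},
    {"dcterms:title": "CSV (Spanish)", "dcterms:language": ["es"]},
    {"dcterms:title": "Bilingual PDF", "dcterms:language": ["en", "es"]},
]
print(dataset_languages(distributions))
```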

Property: next

Property next
Requirement level Optional
Cardinality 0..1
URI dcat:next
Range dcat:Dataset
Definition The following resource (after the current one) in an ordered collection or series of resources.

Property: other identifier

Property other identifier
Requirement level Optional
Cardinality 0..n
URI adms:identifier
Range adms:Identifier
Usage note A secondary identifier of the Dataset, such as MAST/ADS, DataCite, DOI, EZID or W3ID.

Property: prev

Property prev
Requirement level Optional
Cardinality 0..1
URI dcat:prev
Range dcat:Dataset
Usage note Unless the dataset is the first in the chain, a dataset in a collection must have a previous one.
Definition The previous resource (before the current one) in an ordered collection or series of resources.

Property: provenance

Property provenance
Requirement level Optional
Cardinality 0..n
URI dcterms:provenance
Range dcterms:ProvenanceStatement
Definition
  • A statement of any changes in ownership and custody of the resource since its creation that are significant for its authenticity, integrity, and interpretation.
Usage note This property contains a statement about the lineage of a Dataset.

Property: qualified attribution

Property qualified attribution
Requirement level Optional
Cardinality 0..n
URI prov:qualifiedAttribution
Range prov:Attribution
Usage note This property refers to a link to an Agent having some form of responsibility for the resource.

Property: qualified relation

Property qualified relation
Requirement level Optional
Cardinality 0..n
URI dcat:qualifiedRelation
Range dcat:Relationship
Usage note
  • This property provides a link to a description of a relationship with another resource and it is especially meant for relationships between Datasets.
  • It replaces the property rdfs:seeAlso of DCAT-US v1.
  • See here for examples on how to use it: dcat:qualifiedRelation.

Property: release date

Property release date
Requirement level Optional
Cardinality 0..1
URI dcterms:issued
Range rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Usage note
  • This property contains the date of formal issuance (e.g., first publication) of the Dataset.
  • If this date is not known, the date of the first referencing of the data collection in the Catalog can be entered.
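
Because the range allows four datatypes, a publisher may need to decide which one a given value should be typed as. The following is a rough sketch that guesses the datatype from the lexical form alone. The `guess_xsd_type` function and its patterns are hypothetical and deliberately simplified: they accept an optional timezone suffix but do not check that months, days, or times are in range.

```python
import re

# Optional timezone suffix shared by all four lexical forms.
_TZ = r"(Z|[+-]\d{2}:\d{2})?"

# Simplified lexical patterns for the datatypes named in the range.
_PATTERNS = [
    ("xsd:dateTime", re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?" + _TZ)),
    ("xsd:date", re.compile(r"\d{4}-\d{2}-\d{2}" + _TZ)),
    ("xsd:gYearMonth", re.compile(r"\d{4}-\d{2}" + _TZ)),
    ("xsd:gYear", re.compile(r"\d{4}" + _TZ)),
]


def guess_xsd_type(value):
    """Return the name of the first datatype whose pattern matches the whole value, else None."""
    for name, pattern in _PATTERNS:
        if pattern.fullmatch(value):
            return name
    return None


for sample in ("2020", "2020-04", "2020-04-01", "2020-04-01T12:00:00Z"):
    print(sample, guess_xsd_type(sample))
```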

Property: rights

Property rights
Requirement level Recommended
Cardinality 0..n
URI dcterms:rights
Range dcterms:RightsStatement
Usage note This property refers to a statement that specifies copyrights associated with the Dataset.

Property: sample

Property sample
Requirement level Optional
Cardinality 0..n
URI adms:sample
Range dcat:Distribution
Definition
  • Links to a sample of a Dataset, which is a dcat:Distribution.

Property: usage note

Property usage note
Requirement level Optional
Cardinality 0..n
URI skos:scopeNote

Property: source

Property source
Requirement level Optional
Cardinality 0..n
URI dcterms:source
Range dcat:Dataset
Usage note A related Dataset from which the described Dataset is derived.

Property: subject

Property subject
Requirement level Optional
Cardinality 0..n
URI dcterms:subject
Range skos:Concept
Definition Primary Subjects of the Dataset.
Usage note Primary Subjects of the Dataset defined in a controlled vocabulary. Subjects are typically narrower in meaning than dcat:theme.

Property: status

Property status
Requirement level Optional
Cardinality 0..1
URI adms:status
Range skos:Concept
Usage note This property refers to the maturity of the Dataset. It MUST take one of the values Completed, Deprecated, Under Development, Withdrawn from the ADMS status [[VOCAB-ADMS-SKOS]] vocabulary.
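
The constraint above amounts to a membership test against four values. A minimal sketch follows, comparing human-readable labels only. The `ALLOWED_STATUS_LABELS` set and the `is_allowed_status` function are hypothetical, and a real check would compare the concept URIs from the ADMS status vocabulary rather than labels.

```python
# Labels as listed in the usage note; a real check would use the concept URIs.
ALLOWED_STATUS_LABELS = {"Completed", "Deprecated", "Under Development", "Withdrawn"}


def is_allowed_status(label):
    """Return True if the label, ignoring surrounding whitespace, is one of the four values."""
    return label.strip() in ALLOWED_STATUS_LABELS


print(is_allowed_status("Under Development"))
print(is_allowed_status("Draft"))
```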

Property: spatial resolution in meters

Property spatial resolution in meters
Requirement level Optional
Cardinality 0..n
URI dcat:spatialResolutionInMeters
Range rdfs:Literal (typed as xsd:decimal)
Usage note
  • If the dataset is an image or grid this should correspond to the spacing of items. For other kinds of spatial datasets, this property will usually indicate the smallest distance between items in the dataset.
  • The range of this property is a decimal number representing a length in meters. This is intended to provide a summary indication of the spatial resolution of the data as a single number. More complex descriptions of various aspects of spatial precision, accuracy, resolution and other statistics can be provided using the Data Quality Vocabulary [[VOCAB-DQV]].

Property: temporal resolution

Property temporal resolution
Requirement level Optional
Cardinality 0..n
URI dcat:temporalResolution
Range rdfs:Literal (typed as xsd:duration)
Usage note
  • If the dataset is a time-series this should correspond to the spacing of items in the series. For other kinds of dataset, this property will usually indicate the smallest time difference between items in the dataset.
  • This is intended to provide a summary indication of the temporal resolution of the dataset as a single value. More complex descriptions of various aspects of temporal precision, accuracy, resolution and other statistics can be provided using the Data Quality Vocabulary [[VOCAB-DQV]].
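
Since the range is typed as xsd:duration, the spacing between items has to be written in that lexical form (for example `PT15M` for fifteen minutes). The following is a small sketch of one way to produce such a string from a Python `timedelta`. The `to_xsd_duration` function is a hypothetical helper: it uses only days, hours, minutes, and seconds, and it drops fractions of a second.

```python
from datetime import timedelta


def to_xsd_duration(delta):
    """Format a non-negative timedelta as an xsd:duration string (whole seconds only)."""
    total = int(delta.total_seconds())
    if total < 0:
        raise ValueError("negative durations are not handled in this sketch")
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)

    out = "P"
    if days:
        out += f"{days}D"
    time_part = ""
    if hours:
        time_part += f"{hours}H"
    if minutes:
        time_part += f"{minutes}M"
    if seconds:
        time_part += f"{seconds}S"
    if time_part:
        out += "T" + time_part
    # A zero-length duration still needs at least one component.
    return out if out != "P" else "PT0S"


print(to_xsd_duration(timedelta(minutes=15)))
print(to_xsd_duration(timedelta(days=1, hours=6)))
```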

Property: category

Property category
Requirement level Optional
Cardinality 0..1
URI dcterms:type
Range skos:Concept
Usage note
  • A type of the Dataset.
  • A recommended controlled vocabulary data-type is foreseen.

Property: version

Property version
Requirement level Optional
Cardinality 0..n
URI dcat:version
Range rdfs:Literal
Usage note The version indicator (name or identifier) of a resource.

Property: version notes

Property version notes
Requirement level Optional
Cardinality 0..n
URI adms:versionNotes
Range rdfs:Literal
Usage note
  • A description of the differences between this version and a previous version of the Dataset.
  • This property can be repeated for parallel language versions of the version notes.

Property: was generated by

Property was generated by
Requirement level Optional
Cardinality 0..n
URI prov:wasGeneratedBy
Range prov:Activity
Usage note An activity that generated, or provides the business context for, the creation of the dataset.
Example

Property: metadata distribution

Property metadata distribution
Requirement level Optional
Cardinality 0..n
URI dcat-us:metadataDistribution
Range dcat:Distribution
Definition This property refers to a metadata document distribution from which this dataset is derived.
Usage note A Distribution of the "original" metadata document from which the dataset is derived.

Property: liability statement

Property liability statement
Requirement level Optional
Cardinality 0..1
URI dcat-us:liabilityStatement
Range dcat-us:LiabilityStatement
Usage note A liability statement about the dataset

Property: purpose

Property purpose
Requirement level Optional
Cardinality 0..n
URI dcat-us:purpose
Range rdfs:Literal
Usage note The purpose of the dataset

Property: image

Property image
Requirement level Optional
Cardinality 0..3
URI schema:image
Range schema:url or schema:ImageObject
Definition A thumbnail picture illustrating the content of the dataset.
Usage note
  • A thumbnail picture illustrating the content of the Dataset.
  • For distributions that consist of visual content (photographs, videos, maps, etc.) it makes sense to add a limited number of thumbnails to the metadata.
  • It’s a DCAT-US Custom Class

Dataset Series

The DatasetSeries concept in the DCAT-US specification serves a dual purpose. Primarily, it represents a collection of related datasets that share common characteristics and are published as a series, facilitating the organization and discovery of datasets that evolve over time or are updated regularly. Beyond this, DatasetSeries also provides a mechanism for grouping datasets into thematic collections, regardless of whether these collections form a temporal series. This flexibility enhances the specification's utility by supporting a wider range of data publication practices, enabling users to effectively discover and understand datasets grouped by series or thematic similarity.

RDF Class: dcat:DatasetSeries
Definition: A collection of datasets that are published separately, but share some characteristics that group them.
Subclass Of: dcat:Dataset
Usage note
  • Dataset series can be also soft-typed via property dcterms:type as in the approach used in [[?GeoDCAT-AP]]
  • Common scenarios for dataset series include: time series composed of periodically released subsets; map-series composed of items of the same type or theme but with differing spatial footprints.
Rationale: Incorporating dcat:DatasetSeries is essential to enable the structured grouping and presentation of related datasets, ensuring that data publishers can convey meaningful collections of data. This facilitates efficient data organization and discovery for users, aligning the DCAT profile with international standards for dataset series representation.
Property URI Range ReqLevel Card
title dcterms:title rdfs:Literal M 1..n
description dcterms:description rdfs:Literal M 1..n
contact point dcat:contactPoint vcard:Kind R 0..n
first dcat:first dcat:Dataset R 0..1
geographic bounding box dcat-us:geographicBoundingBox dcat-us:GeographicBoundingBox R 0..n
spatial/geographic coverage dcterms:spatial dcterms:Location R 0..n
last dcat:last dcat:Dataset R 0..1
update/modification date dcterms:modified rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) R 0..1
publisher dcterms:publisher foaf:Agent R 0..1
series member dcat:seriesMember dcat:Dataset R 0..1
temporal coverage dcterms:temporal dcterms:PeriodOfTime R 0..n
frequency dcterms:accrualPeriodicity dcterms:Frequency O 0..1
release date dcterms:issued rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) O 0..1

Mandatory Properties

Property: title

Property title
Requirement level Mandatory
Cardinality 1..n
URI dcterms:title
Range rdfs:Literal
Usage note
  • This property contains a name given to the Dataset Series.
  • This property can be repeated for parallel language versions of the name (see Multilingualism).

Property: description

Property description
Requirement level Mandatory
Cardinality 1..n
URI dcterms:description
Range rdfs:Literal
Usage note
  • This property contains a free-text account of the Dataset Series.
  • This property can be repeated for parallel language versions of the description (see Multilingualism). It is recommended to provide an indication of the dimensions along which the Dataset Series evolves.

Optional Properties

Property: frequency

Property frequency
Requirement level Optional
Cardinality 0..1
URI dcterms:accrualPeriodicity
Range dcterms:Frequency
Usage note
  • This property refers to the frequency at which the Dataset Series is updated.
  • The frequency of a dataset series is not equal to the frequency of the dataset in the collection.
  • CV to be used: [[CLD-FREQ]].

Property: release date

Property release date
Requirement level Optional
Cardinality 0..1
URI dcterms:issued
Range rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Usage note
  • This property contains the date of formal issuance (e.g., publication) of the Dataset Series.
  • The moment when the dataset series was established as a managed resource. This is not equal to the release date of the oldest dataset in the collection of the dataset series.

Example

In this example, ex:populationCensus represents a series of datasets related to the US Population Census Data, which is issued every 10 years (decennial). Individual datasets for specific years (e.g., ex:populationCensus-1950) are also defined, each pointing to the next dataset in the series using dcat:next.
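
The linking pattern described above can be sketched in Python with plain dicts. This is an illustration only, not a normative serialization: the `build_series` function is hypothetical, the `@id` and `@type` keys merely mimic a JSON-LD-like shape, and the identifiers reuse the `ex:populationCensus` naming from the description.

```python
def build_series(series_id, years):
    """Return (series, datasets) as plain dicts keyed by the property names used in this profile."""
    if not years:
        raise ValueError("a series needs at least one member in this sketch")
    ids = [f"{series_id}-{year}" for year in years]
    datasets = []
    for i, dataset_id in enumerate(ids):
        dataset = {"@id": dataset_id, "@type": "dcat:Dataset", "dcat:inSeries": series_id}
        if i > 0:
            dataset["dcat:prev"] = ids[i - 1]  # every member but the first has a predecessor
        if i < len(ids) - 1:
            dataset["dcat:next"] = ids[i + 1]  # every member but the last has a successor
        datasets.append(dataset)
    series = {
        "@id": series_id,
        "@type": "dcat:DatasetSeries",
        "dcat:first": ids[0],
        "dcat:last": ids[-1],
    }
    return series, datasets


series, datasets = build_series("ex:populationCensus", [1950, 1960, 1970])
for dataset in datasets:
    print(dataset["@id"], "->", dataset.get("dcat:next"))
```

Titles, descriptions, and the frequency value are omitted here to keep the focus on the dcat:inSeries, dcat:prev, and dcat:next links.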

Distribution

In the context of the DCAT-US profile, a metadata entry of this class serves to characterize a distribution of data, which constitutes a specific representation of a Dataset. Datasets within this profile may offer multiple serializations, each potentially differing in various aspects, including natural language, media type or format, schematic organization, temporal and spatial resolution, level of detail, or profiles that specify any combination of these attributes.

A distribution may encompass the entirety of the Dataset's data or only a subset thereof. For example, it could encompass all data related to the population in the United States or focus exclusively on a specific year, such as 2020. Alternatively, it might provide the data in an alternate format, such as a graphical representation covering the years 2010 through 2020.

Within the DCAT-US profile, various relationships between Datasets and their distributions are represented. The most straightforward relationship involves aggregating different physical representations of data, referred to as "Distributions," into a single Dataset. An example of such a Dataset is a time series, where each distribution corresponds to one year of data, and the Dataset spans multiple years.

In the DCAT vocabulary, dcat:Distribution is employed to characterize the diverse representations and formats in which a dataset is disseminated, facilitating the description of different versions or media types of the same data, and often includes properties like dcat:downloadURL for direct download links. On the other hand, dcat:DataService serves the purpose of detailing data access services, such as APIs and endpoints, enabling programmatic or interactive data retrieval, with key properties like dcat:endpointURL specifying service endpoints and dcat:serviceType indicating the type of service, thus distinguishing between the description of data formats and the specification of data access services within the DCAT framework.

RDF Class: dcat:Distribution
Definition: A specific representation of a dataset. A dataset might be available in multiple serializations that may differ in various ways, including natural language, media-type or format, schematic organization, temporal and spatial resolution, level of detail or profiles (which might specify any or all of the above).
Subclass Of: dcat:Resource
Usage note
  • This represents a general availability of a dataset. It implies no information about the actual access method of the data, i.e., whether by direct download, API, or through a Web page. The use of dcat:downloadURL property indicates directly downloadable distributions.
Rationale: The update to DCAT 3 dcat:Distribution is of paramount significance as it greatly enhances data accessibility. It introduces a more comprehensive and structured approach to describing data distributions, ensuring that data consumers can easily understand and access the data in the format that best suits their needs, ultimately fostering greater data utilization and dissemination.
Property URI Range ReqLevel Card Changes from DCAT-US 1.1
license dcterms:license dcterms:LicenseDocument M 1..1 Aligned
access URL dcat:accessURL rdfs:Resource R 0..1 No Change
format dcterms:format dcterms:MediaType R 0..1 Fixed
rights dcterms:rights dcterms:RightsStatement R 0..n Aligned
access Restriction dcat-us:accessRestriction dcat-us:AccessRestriction R 0..n New!
usage restriction dcat-us:useRestriction dcat-us:UseRestriction R 0..n New!
cui Restriction dcat-us:cuiRestriction dcat-us:CuiRestriction R 0..1 New!
data dictionary dcat-us:describedBy dcat:Distribution R 0..1 Fixed
title dcterms:title rdfs:Literal R 0..n Multilingual support
update/modification date dcterms:modified rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) R 0..1 Aligned
representation technique adms:representationTechnique skos:Concept O 0..1 New!
status adms:status skos:Concept O 0..1 Aligned
character encoding cnt:characterEncoding rdfs:Literal O 0..n New!
compression format dcat:compressFormat dcterms:MediaType O 0..1 Aligned
spatial resolution in meters dcat:spatialResolutionInMeters xsd:decimal O 0..1 Aligned
quality measurement dqv:hasQualityMeasurement dqv:QualityMeasurement O 0..n Aligned
access rights dcterms:accessRights dcterms:RightsStatement O 0..1 Aligned
access service dcat:accessService dcat:DataService O 0..n Aligned
byte size dcat:byteSize xsd:nonNegativeInteger O 0..1 Aligned
checksum spdx:checksum spdx:Checksum O 0..1 Aligned
documentation foaf:page foaf:Document O 0..n New!
download URL dcat:downloadURL rdfs:Resource O 0..1 No Change
identifier dcterms:identifier rdfs:Literal O 0..1 Aligned
image schema:image schema:url or schema:ImageObject O 0..3 New!
language dcterms:language dcterms:LinguisticSystem O 0..n Aligned
conforms to dcterms:conformsTo dcterms:Standard O 0..n No Change
media type dcat:mediaType dcterms:MediaType O 0..1 Fixed
packaging format dcat:packageFormat dcterms:MediaType O 0..1 Aligned
release date dcterms:issued rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) R 0..1 Aligned
temporal resolution dcat:temporalResolution xsd:duration R 0..1 Aligned

Optional Properties

Property: representation technique

Property representation technique
Requirement level Optional
Cardinality 0..1
URI adms:representationTechnique
Range skos:Concept
Definition More information about the format in which a Distribution is released. This is different from the file format as, for example, a XML file (file format) could contain an XML schema (representation technique).
Usage note adms:representationTechnique in DCAT-US metadata details the specific schema, standard, or method used to structure data within a dataset, such as RFC 4180 for CSV, GeoJSON for JSON, OWL for RDF, or XML Schema for XML. This contrasts with dcterms:format, which broadly identifies the file format (e.g., CSV, JSON, RDF, XML), providing a general idea of the data's structure and syntax. Meanwhile, dcat:mediaType complements these by defining the MIME type, such as 'text/csv' or 'application/json', which matters for software processing and data transmission. The detail provided by adms:representationTechnique serves users who need to know how the dataset is organized internally and how to interpret it, which goes beyond the basic format or MIME type indicated by dcterms:format and dcat:mediaType.
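
The three properties discussed above can be seen side by side in a short sketch. The records below are illustrative only: plain strings stand in for the skos:Concept and dcterms:MediaType resources the profile expects, and the `describe` function is a hypothetical helper.

```python
# Plain strings stand in for the concept and media type resources (an assumption of this sketch).
distributions = [
    {"dcterms:format": "CSV", "dcat:mediaType": "text/csv",
     "adms:representationTechnique": "RFC 4180"},
    {"dcterms:format": "JSON", "dcat:mediaType": "application/json",
     "adms:representationTechnique": "GeoJSON"},
    {"dcterms:format": "XML", "dcat:mediaType": "application/xml",
     "adms:representationTechnique": "XML Schema"},
]


def describe(dist):
    """Summarize file format, MIME type, and representation technique in one line."""
    return (
        f'{dist["dcterms:format"]} ({dist["dcat:mediaType"]}), '
        f'structured per {dist["adms:representationTechnique"]}'
    )


for dist in distributions:
    print(describe(dist))
```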

Property: status

Property status
Requirement level Optional
Cardinality 0..1
URI adms:status
Range skos:Concept
Usage note This property refers to the maturity of the Distribution. It MUST take one of the values Completed, Deprecated, Under Development, Withdrawn from the ADMS status [[VOCAB-ADMS-SKOS]] vocabulary.

Property: character encoding

Property character encoding
Requirement level Optional
Cardinality 0..n
URI cnt:characterEncoding
Range rdfs:Literal
Usage note This property SHOULD be used to specify the character encoding of the Distribution, by using as value the character set names in the IANA register [[IANA-CHARSETS]].

Property: compression format

Property compression format
Requirement level Optional
Cardinality 0..1
URI dcat:compressFormat
Range dcterms:MediaType
Usage note This property refers to the format of the file in which the data is contained in a compressed form, e.g., to reduce the size of the downloadable file. It SHOULD be expressed using a media type as defined in the official register of media types managed by IANA [[IANA-MEDIA-TYPES]].

Property: spatial resolution in meters

Property spatial resolution in meters
Requirement level Optional
Cardinality 0..n
URI dcat:spatialResolutionInMeters
Range xsd:decimal
Usage note
  • This property refers to the minimum spatial separation resolvable in a Distribution, measured in meters.

Property: quality measurement

Property quality measurement
Requirement level Optional
Cardinality 0..n
URI dqv:hasQualityMeasurement
Range dqv:QualityMeasurement
Definition Refers to the quality measurements performed on a distribution. It represents the evaluation of a given distribution against a specific quality metric.
Usage note Use for quality measurements other than dcat:spatialResolutionInMeters or dcat:temporalResolution. Examples of quality measurements include completeness, accuracy, timeliness, and granularity.

Property: access rights

Property access rights
Requirement level Optional
Cardinality 0..1
URI dcterms:accessRights
Range dcterms:RightsStatement
Usage note
  • This property MAY include information regarding access or restrictions based on privacy, security, or other policies.

Property: access service

Property access service
Requirement level Optional
Cardinality 0..n
URI dcat:accessService
Range dcat:DataService
Usage note This property refers to a data service that gives access to the distribution of the Dataset.

Property: byte size

Property byte size
Requirement level Optional
Cardinality 0..1
URI dcat:byteSize
Range xsd:nonNegativeInteger
Definition The size of a distribution in bytes.
Usage note The size in bytes can be approximated (as a non-negative integer) when the precise size is not known.

Property: checksum

Property checksum
Requirement level Optional
Cardinality 0..1
URI spdx:checksum
Range spdx:Checksum
Usage note
  • This property provides a mechanism that can be used to verify that the contents of a distribution have not changed.
  • The checksum is related to the downloadURL.
  • Property added in [[VOCAB-DCAT-3]]: spdx:checksum
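
The byte size and checksum properties above can both be derived from the bytes served at the download URL. The following is a minimal sketch, assuming SHA-256 as the digest and assuming that the checksum object carries spdx:algorithm and spdx:checksumValue entries; those two names, the "sha256" label, and the helper name describe_payload are illustrative assumptions that this section does not define.

```python
import hashlib


def describe_payload(payload: bytes) -> dict:
    """Return byte-size and checksum metadata for a distribution's content.

    Hypothetical helper: the keys inside "spdx:checksum" are assumptions.
    """
    digest = hashlib.sha256(payload).hexdigest()
    return {
        # dcat:byteSize is a non-negative integer number of bytes.
        "dcat:byteSize": len(payload),
        "spdx:checksum": {
            "spdx:algorithm": "sha256",      # assumed algorithm label
            "spdx:checksumValue": digest,    # lowercase hex digest
        },
    }


record = describe_payload(b"id,name\n1,example\n")
print(record["dcat:byteSize"])
```

A consumer can recompute the digest after downloading the file and compare it with the published value to confirm that the content has not changed.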

Property: coverage

Property coverage
Requirement level Optional
Cardinality 0..n
URI dcterms:coverage
Range dcterms:LocationPeriodOrJurisdiction
Usage note
  • If a dataset contains distributions whose content differs beyond format or resolution, this property can be used to specify the temporal or spatial coverage of the data that the distribution contains.

Property: documentation

Property documentation
Requirement level Optional
Cardinality 0..n
URI foaf:page
Range foaf:Document
Usage note This property refers to a page or document about this Distribution.

Property: download URL

Property download URL
Requirement level Optional
Cardinality 0..1
URI dcat:downloadURL
Range rdfs:Resource
Usage note This must be the direct download URL. Other means of accessing the dataset should be expressed using accessURL. This should always be accompanied by mediaType.

Property: identifier

Property identifier
Requirement level Optional
Cardinality 0..1
URI dcterms:identifier
Range rdfs:Literal
Usage note An identifier for the distribution that identifies it as a resource, mainly for use by the organization publishing the data.

Property: image

Property image
Requirement level Optional
Cardinality 0..3
URI schema:image
Range schema:ImageObject
Usage note

This property is for associating thumbnail images that visually represent the Distribution's content, especially beneficial for visual content like photographs, videos, maps, etc. Thumbnails should effectively illustrate or summarize the content, enhancing metadata richness and utility. While typically only URLs pointing directly to downloadable images are allowed, for more detailed representation, additional fields from schema:ImageObject, such as schema:caption, can be utilized to provide further context or descriptions. This approach ensures the "image" property not only aids in content identification but also enriches the user's understanding and interaction with the metadata.

Property: language

Property language
Requirement level Optional
Cardinality 0..n
URI dcterms:language
Range rdfs:Literal
Definition A language of the resource. This refers to the natural language used for textual metadata (i.e., titles, descriptions, etc.) or for the textual values of a dataset distribution.
Usage note

Resources defined by the Library of Congress ([[ISO 639-1]]) SHOULD be used.

Usage Note For datasets available in separate languages, create a dcat:Distribution instance for each language version. Assign a unique dcterms:language value to each distribution to specify its language. Distributions with multiple languages should list several dcterms:language values.

Property: conforms to

Property conforms to
Requirement level Optional
Cardinality 0..n
URI dcterms:conformsTo
Range dcterms:Standard (A basis for comparison; a reference point against which other things can be evaluated.)
Definition An established standard to which the distribution conforms.
Usage note This is used to identify a standardized specification the distribution conforms to. It's recommended that this be a URI that serves as a unique identifier for the standard. The URI may or may not also be a URL that provides documentation of the specification. This property SHOULD be used to indicate the model, schema, ontology, view or profile that this representation of a dataset conforms to. This is (generally) a complementary concern to the media-type or format.

Property: media type

Property media type
Requirement level Optional
Cardinality 0..1
URI dcat:mediaType
Range dcterms:MediaType
Definition This property refers to the media type of the Distribution as defined in the official register of media types managed by IANA [[IANA-MEDIA-TYPES]].
Usage note

The mediaType property specifies the media type (MIME type) of the distribution. It should be used when the distribution's format corresponds to a standard media type registered with the IANA Media Types [[IANA-MEDIA-TYPES]]. This property provides a precise technical descriptor of the data format (e.g., application/json, text/csv).

Usage note The JSON-LD encoding allows the MIME type to be given without the full URL (e.g., text/csv). The JSON-LD context processor automatically expands it to the full URI in RDF using the base URI https://www.iana.org/assignments/media-types/. This preserves backward compatibility with DCAT-US 1.1.
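
The expansion described above can be sketched in a few lines. This is an illustration of the rule as stated in the usage note, not the behavior of any particular JSON-LD processor; the function name expand_media_type is hypothetical.

```python
IANA_BASE = "https://www.iana.org/assignments/media-types/"


def expand_media_type(value: str) -> str:
    """Expand a bare MIME type such as 'text/csv' to a full IANA URI.

    Values that are already absolute HTTP(S) URIs are returned unchanged.
    """
    value = value.strip()
    if value.startswith(("http://", "https://")):
        return value
    return IANA_BASE + value


print(expand_media_type("text/csv"))
```

Under this sketch a DCAT-US 1.1 record that carries the short form and a DCAT-US 3.0 record that carries the full URI resolve to the same identifier.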

Property: packaging format

Property packaging format
Requirement level Optional
Cardinality 0..1
URI dcat:packageFormat
Range dcterms:MediaType
Usage note
  • This property refers to the format of the file in which one or more data files are grouped together, e.g. to enable a set of related files to be downloaded together.
  • It SHOULD be expressed using a media type as defined in the official register of media types managed by IANA.

Property: release date

Property release date
Requirement level Optional
Cardinality 0..1
URI dcterms:issued
Range rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Usage note
  • This property contains the date of formal issuance (e.g., publication) of the Distribution.
  • Use the date on which the Distribution was first issued.
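
Because the range allows four different XSD types, a publisher has to decide which type to attach to a given date string. The sketch below infers the type from the lexical form. It is deliberately simplified: it assumes four-digit years and does not check that months, days, or times are in range, and the function name typed_date_literal and the "@value"/"@type" output shape are illustrative assumptions.

```python
import re

# Ordered from most to least specific lexical form.
_XSD_DATE_PATTERNS = [
    ("xsd:dateTime", re.compile(
        r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?")),
    ("xsd:date", re.compile(r"\d{4}-\d{2}-\d{2}(Z|[+-]\d{2}:\d{2})?")),
    ("xsd:gYearMonth", re.compile(r"\d{4}-\d{2}")),
    ("xsd:gYear", re.compile(r"\d{4}")),
]


def typed_date_literal(value: str) -> dict:
    """Return a typed literal for a release date, or raise ValueError."""
    for xsd_type, pattern in _XSD_DATE_PATTERNS:
        if pattern.fullmatch(value):
            return {"@value": value, "@type": xsd_type}
    raise ValueError(f"not a supported date form: {value!r}")


print(typed_date_literal("2024-03"))
```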

Property: temporal resolution

Property temporal resolution
Requirement level Optional
Cardinality 0..1
URI dcat:temporalResolution
Range xsd:duration
Usage note This property refers to the minimum time period resolvable in the Dataset distribution.
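
Values of xsd:duration use the ISO 8601 duration syntax, for example P1D for daily data or PT15M for 15-minute data. The sketch below converts a Python timedelta into that form. It is a simplified illustration: it emits only days, hours, minutes, and whole seconds (fractional seconds are truncated, and years and months are not produced), and the function name to_xsd_duration is hypothetical.

```python
from datetime import timedelta


def to_xsd_duration(delta: timedelta) -> str:
    """Format a non-negative timedelta as an xsd:duration string."""
    if delta < timedelta(0):
        raise ValueError("negative durations are not handled in this sketch")
    total_seconds = int(delta.total_seconds())  # truncates fractions
    days, remainder = divmod(total_seconds, 86400)
    hours, remainder = divmod(remainder, 3600)
    minutes, seconds = divmod(remainder, 60)

    date_part = f"{days}D" if days else ""
    time_part = "".join(
        f"{amount}{unit}"
        for amount, unit in ((hours, "H"), (minutes, "M"), (seconds, "S"))
        if amount
    )
    if not date_part and not time_part:
        return "PT0S"  # zero-length duration
    return "P" + date_part + ("T" + time_part if time_part else "")


print(to_xsd_duration(timedelta(minutes=15)))
```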

Example

Document

RDF Class: foaf:Document
Obligation Optional
Definition: A publication, such as a scientific paper, a technical report, a book, or a book chapter, but also a blog post.
Usage note Depending on whether or not a catalog supports publications as first-class citizens, a publication can be fully described or simply denoted by its URI.
Rationale: The introduction of foaf:Document significantly improves the representation of documents within the DCAT-US profile. It ensures that metadata about documents, such as title, format, language, and access options, are clearly defined and standardized. This alignment with global data standards fosters interoperability and eases document integration into various data ecosystems, benefiting both publishers and consumers.
Reference

§ Class: foaf:Document [FOAF]

Properties Summary

Property URI Range ReqLevel Card
title dcterms:title rdfs:Literal M 1..n
individual author dcterms:creator foaf:Person R 0..n
corporate author dcterms:creator org:Organization R 0..n
author(s) as literal dc:creator rdfs:Literal R 0..n
publisher organization dcterms:publisher org:Organization R 0..1
publisher(s) as literal dc:publisher rdfs:Literal R 0..n
identifier dcterms:identifier rdfs:Literal R 0..1
publication date dcterms:issued rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) R 0..1
bibliographic citation dcterms:bibliographicCitation rdfs:Literal R 0..1
document type dcterms:type skos:Concept R 0..1
abstract dcterms:abstract rdfs:Literal O 0..n
description dcterms:description rdfs:Literal O 0..n
conforms to dcterms:conformsTo dcterms:Standard O 0..n
media type dcterms:mediaType dcterms:MediaType O 0..n

Mandatory Properties

Property: title

Property title
Requirement level Mandatory
Cardinality 1..n
URI dcterms:title
Range rdfs:Literal
Usage note

Optional Properties

Property: abstract

Property abstract
Requirement level Optional
Cardinality 0..n
URI dcterms:abstract
Range rdfs:Literal
Usage note

Property: individual author

Property individual author
Requirement level Optional
Cardinality 0..n
URI dcterms:creator
Range foaf:Person
Usage note

Property: corporate author

Property corporate author
Requirement level Optional
Cardinality 0..n
URI dcterms:creator
Range org:Organization
Usage note

Property: conforms to

Property conforms to
Requirement level Optional
Cardinality 0..n
URI dcterms:conformsTo
Range dcterms:Standard
Usage note An implementing rule or other specification.

Property: media type

Property media type
Requirement level Optional
Cardinality 0..n
URI dcterms:mediaType
Range dcterms:MediaType
Usage note This property refers to the media type of the Document.

Example

Geographic Bounding Box

GeographicBoundingBox describes the spatial extent of a resource's domain of application, expressed in the WGS 84 latitude/longitude coordinate system.

RDF Class: dcat-us:GeographicBoundingBox
Definition: GeographicBoundingBox describes the spatial extent of a resource's domain of application, expressed in the WGS 84 latitude/longitude coordinate system.
Usage note Strongly recommended for geospatial data
Rationale The community has no consensus on a common vocabulary for describing a spatial bounding box. GML Envelope was proposed, but it is too cumbersome to process. This profile introduces four separate fields, one for each bound (west, east, north, and south), which removes any ambiguity and makes bounding boxes easy to index and query.

Properties Summary

Property URI Range ReqLevel Card
west bounding longitude dcat-us:westBoundingLongitude xsd:decimal M 1
east bounding longitude dcat-us:eastBoundingLongitude xsd:decimal M 1
south bounding latitude dcat-us:southBoundingLatitude xsd:decimal M 1
north bounding latitude dcat-us:northBoundingLatitude xsd:decimal M 1

Mandatory Properties

Property: west bounding longitude

Property west bounding longitude
Requirement level Mandatory
Cardinality 1
URI dcat-us:westBoundingLongitude
Range xsd:decimal
Definition West bound longitude in decimal degrees

Property: east bounding longitude

Property east bounding longitude
Requirement level Mandatory
Cardinality 1
URI dcat-us:eastBoundingLongitude
Range xsd:decimal
Definition East bound longitude in decimal degrees

Property: south bounding latitude

Property south bounding latitude
Requirement level Mandatory
Cardinality 1
URI dcat-us:southBoundingLatitude
Range xsd:decimal
Definition South bound latitude in decimal degrees

Property: north bounding latitude

Property north bounding latitude
Requirement level Mandatory
Cardinality 1
URI dcat-us:northBoundingLatitude
Range xsd:decimal
Definition North bound latitude in decimal degrees

Example
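
The four mandatory bounds can be checked before a record is published. The following is a minimal sketch, assuming the usual WGS 84 ranges (longitude within [-180, 180], latitude within [-90, 90]) and assuming that south must not exceed north. West is allowed to exceed east so that boxes crossing the antimeridian remain expressible. The class, its method, the "@type" key, and the sample coordinates are illustrative assumptions, not part of the profile.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class GeographicBoundingBox:
    """Hypothetical helper mirroring the four mandatory bounding properties."""

    west: Decimal
    east: Decimal
    south: Decimal
    north: Decimal

    def __post_init__(self) -> None:
        for name in ("west", "east"):
            if not Decimal("-180") <= getattr(self, name) <= Decimal("180"):
                raise ValueError(f"{name} longitude must be within [-180, 180]")
        for name in ("south", "north"):
            if not Decimal("-90") <= getattr(self, name) <= Decimal("90"):
                raise ValueError(f"{name} latitude must be within [-90, 90]")
        if self.south > self.north:
            raise ValueError("south latitude must not exceed north latitude")

    def to_properties(self) -> dict:
        """Return the bounds keyed by the property URIs listed above."""
        return {
            "@type": "dcat-us:GeographicBoundingBox",  # assumed JSON-LD key
            "dcat-us:westBoundingLongitude": str(self.west),
            "dcat-us:eastBoundingLongitude": str(self.east),
            "dcat-us:southBoundingLatitude": str(self.south),
            "dcat-us:northBoundingLatitude": str(self.north),
        }


# Illustrative coordinates only.
box = GeographicBoundingBox(
    west=Decimal("-77.12"), east=Decimal("-76.91"),
    south=Decimal("38.79"), north=Decimal("39.00"),
)
print(box.to_properties())
```

Using Decimal rather than float keeps the values exactly as entered, which matches the xsd:decimal range of the four properties.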

Identifier

RDF Class: adms:Identifier
Obligation Optional
Definition: This is based on the UN/CEFACT Identifier class.
Usage note An identifier in a particular context, consisting of the
  • content string that is the identifier;
  • an optional identifier for the identifier scheme;
  • an optional identifier for the version of the identifier scheme;
  • an optional identifier for the agency that manages the identifier scheme.
Reference

§ Term name: Identifier [ADMS]

Rationale Incorporating adms:Identifier in the DCAT-US profile fosters a culture of data governance and trust by transparently documenting the authority behind each identifier. This enhances data reliability and credibility, boosting confidence for DCAT-US users. Additionally, it enables versatile data access using multiple identifiers, enhancing overall data accessibility and usability for diverse stakeholders.

Properties Summary

Property URI Range ReqLevel Card
notation skos:notation xsd:string R 0..1
creator dcterms:creator dcterms:Agent O 0..1
schema agency adms:schemaAgency rdfs:Literal O 0..1
version dcat:version rdfs:Literal O 0..1
issued dcterms:issued rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) O 0..1

Optional Properties

Property: creator

Property creator
Requirement level Optional
Cardinality 0..1
URI dcterms:creator
Range dcterms:Agent

Property: schema agency

Property schema agency
Requirement level Optional
Cardinality 0..1
URI adms:schemaAgency
Range rdfs:Literal

Property: version

Property version
Requirement level Optional
Cardinality 0..1
URI dcat:version
Range rdfs:Literal

Property: issued

Property issued
Requirement level Optional
Cardinality 0..1
URI dcterms:issued
Range rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)

Example

Liability Statement

RDF Class: dcat-us:LiabilityStatement
Definition: A formal declaration accompanying a dataset intended to limit the legal exposure of the data provider by disclaiming warranties or guarantees.
Usage note
  • This statement often includes information on the following aspects:
    • Limitation of Responsibility: Clarifying that the publisher or provider is not responsible for any errors in the data, and any consequences resulting from its use.
    • No Guarantee of Validity: Indicating that there is no guarantee of the accuracy, reliability, or completeness of the data provided.
    • Absence of Endorsement: Stating that inclusion of the data in the catalog does not imply endorsement by the publisher or provider.
    • Use at Own Risk: Advising users that they use the data at their own risk and are responsible for ensuring its appropriateness for their intended purposes.
  • The statement may be provided as a literal text or as a URL pointing to a detailed liability statement.
  • Utilizing the LiabilityStatement helps in setting clear expectations for consumers of the dataset and limits potential legal exposures of the data provider.
Rationale Introducing dcat-us:LiabilityStatement in DCAT-US clarifies data provider responsibilities and limitations, reducing legal risks by defining acceptable uses and disclaiming warranties. This ensures transparency and legal compliance in data sharing within the United States.

Properties Summary

Property URI Range ReqLevel Card
liability statement text rdfs:label rdfs:Literal O 0..n

Optional Properties

Property: liability statement text

Property liability statement text
Requirement level Optional
Cardinality 0..n
URI rdfs:label
Range rdfs:Literal
Definition Full text of the liability statement.
Usage note The property rdfs:label MAY be used only to specify the text of the liability statement. This property can be repeated for parallel language versions of the statement.

Example

LicenseDocument

RDF Class: dcterms:LicenseDocument
Obligation Optional
Definition: A legal document giving official permission to do something with a resource.
Usage note A license document SHOULD be specified only with URIs from an endorsed Data.gov registry. The property spdx:licenseText MAY be used only to specify license information in legacy metadata records that do not use a standard license from an endorsed Data.gov registry.
Rationale: The introduction of dcterms:LicenseDocument in the DCAT profile enables the customization of license text. This flexibility empowers data publishers to tailor license terms to specific dataset requirements, facilitating clear communication of licensing conditions and promoting responsible data sharing and usage while adhering to established international standards.
Reference

§ Term name: LicenseDocument [DCTERMS]

Properties Summary

Property URI Range ReqLevel Card
license text spdx:licenseText rdfs:Literal O 0..n

Optional Properties

Property: license text

Property license text
Requirement level Optional
Cardinality 0..n
URI spdx:licenseText
Range rdfs:Literal
Definition Full text of the license.
Usage note The property spdx:licenseText MAY be used only to specify license information in legacy metadata records that do not use a standard license from an endorsed registry. This property can be repeated for parallel language versions of the license text.

Example

Location

A spatial region or named place.

RDF Class: dcterms:Location
Definition: A spatial region or named place. It can be represented using a controlled vocabulary or with geographic coordinates.
Usage note
  • For an extensive geometry (i.e., a set of coordinates denoting the vertices of the relevant geographic area), the property locn:geometry [[LOCN]] SHOULD be used.
  • For a geographic bounding box delimiting a spatial area the property dcat:bbox SHOULD be used.
  • For the geographic center of a spatial area, or another characteristic point, the property dcat:centroid SHOULD be used.
Rationale: The introduction of dcterms:Location in DCAT-US 3.0 is driven by the need to restore compatibility with the DCAT standard. DCAT-US 1.1 deviated from the standard by using strings for location in the dcterms:spatial property, which was incompatible. This addition aligns DCAT-US with recognized geospatial standards (e.g., GeoSPARQL, WKT, GeoJSON, W3C Location) for representing geometries, addresses, and location names, ensuring data compatibility, discoverability, and integration while adhering to international data management practices.

Properties Summary

Property URI Range ReqLevel Card
bounding box dcat:bbox rdfs:Literal typed as gsp:wktLiteral (preferred) or gsp:gmlLiteral or gsp:geoJSONLiteral R 0..1
centroid dcat:centroid rdfs:Literal typed as gsp:wktLiteral or gsp:gmlLiteral. O 0..1
geographic identifier dcterms:identifier rdfs:Literal O 0..n
geometry locn:geometry locn:Geometry typed as gsp:wktLiteral (preferred) or gsp:gmlLiteral or gsp:geoJSONLiteral O 0..1
gazetteer skos:inScheme skos:ConceptScheme O 0..1
geographic name skos:prefLabel rdfs:Literal R 0..n
alternate geographic name skos:altLabel rdfs:Literal O 0..n

Optional Properties

Property: alternate geographic name

RDF Property skos:altLabel
Requirement level Optional
Cardinality 0..n
Range rdfs:Literal
Definition Alternate toponyms for the location
Usage note This property contains alternate labels of the Location. This property can be repeated for parallel language versions of the label.

Property: centroid

Property centroid
Requirement level Optional
Cardinality 0..1
URI dcat:centroid
Range rdfs:Literal typed as geosparql:wktLiteral or geosparql:gmlLiteral
Usage note
  • The range of this property (rdfs:Literal) is intentionally generic, with the purpose of allowing different geometry literal encodings. E.g., the geometry could be encoded as a WKT literal (geosparql:wktLiteral)
  • Please note that the order of usage is as follows: use the most specific geospatial relationship by preference. E.g. if the spatial description is a bbox, use dcat:bbox, otherwise use locn:geometry
  • The WKT encoding supports geospatial positions expressed in coordinate reference systems other than WGS84.

Property: geographic identifier

Property geographic identifier
Requirement level Optional
Cardinality 0..n
URI dcterms:identifier
Range rdfs:Literal
Usage note This property contains the geographic identifier for the Location, e.g., the URI or other unique identifier in the context of the relevant gazetteer.

Property: geometry

Property geometry
Requirement level Optional
Cardinality 0..1
URI locn:geometry
Range locn:Geometry
Definition: Associates a spatial thing [[?SDW-BP]] with a corresponding geometry.
Usage note The range of this property (locn:Geometry) allows for any type of geometry specification. E.g., the geometry could be encoded by a literal, as WKT (geosparql:wktLiteral [[GeoSPARQL]]), or represented by a class, as geosparql:Geometry (or any of its subclasses) [[GeoSPARQL]].

Property: gazetteer

RDF Property skos:inScheme
Requirement level Optional
Cardinality 0..1
Range skos:ConceptScheme
Usage note: This property MAY be used to specify the gazetteer to which the Location belongs.

Example
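
A bounding box given as four bounds can be turned into the WKT literal preferred for dcat:bbox. The sketch below builds a closed rectangular ring in longitude/latitude order, which is an assumption based on the default coordinate reference system of GeoSPARQL WKT literals. The function names and the "@value"/"@type" output shape are illustrative, and the coordinates are sample values only.

```python
def bbox_to_wkt(west: float, south: float, east: float, north: float) -> str:
    """Return a WKT POLYGON for a rectangular box (longitude latitude order)."""
    ring = [
        (west, south),
        (east, south),
        (east, north),
        (west, north),
        (west, south),  # repeat the first vertex to close the ring
    ]
    coordinates = ", ".join(f"{lon} {lat}" for lon, lat in ring)
    return f"POLYGON(({coordinates}))"


def bbox_literal(west: float, south: float, east: float, north: float) -> dict:
    """Wrap the WKT string as a typed literal for dcat:bbox."""
    return {
        "@value": bbox_to_wkt(west, south, east, north),
        "@type": "gsp:wktLiteral",
    }


location = {
    "@type": "dcterms:Location",
    "dcat:bbox": bbox_literal(-77.12, 38.79, -76.91, 39.0),
}
print(location["dcat:bbox"]["@value"])
```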

MediaType

RDF Class: dcterms:MediaType
Obligation Optional
Definition: A media type, e.g. the format of a computer file.
Usage note Data publishers should consider using well-established IANA [[IANA-MEDIA-TYPES]] URLs for media types whenever possible to enhance compatibility and interoperability. However, the ability to create custom media types using labels provides flexibility for unique data requirements. When creating custom media types, it's advisable to provide clear and concise definitions to ensure transparency and understanding for data consumers. Striking a balance between standardized and custom media types optimizes data sharing within the DCAT-US framework.
Rationale: Incorporating dcterms:MediaType in DCAT-US combines the use of established IANA [[IANA-MEDIA-TYPES]] URLs for standardized media types with the flexibility to create custom types using labels. This dual approach ensures compatibility with recognized media types while allowing adaptability to specific needs, promoting both data interoperability and flexibility in data sharing and dissemination.
Reference

§ Term name: MediaType [DCTERMS]

Properties Summary

Property URI Range ReqLevel Card
label rdfs:label xsd:string R 0..1

Example

Metric

RDF Class: dqv:Metric
Obligation Optional
Definition: Represents a standard to measure a quality dimension. An observation (instance of dqv:QualityMeasurement) assigns a value in a given unit to a Metric.
Usage note The concept of a metric is used to define and measure specific aspects or dimensions of data quality within a given context, providing a standardized and quantifiable way to assess the quality of data. It allows for the comparison and evaluation of data quality across different resources and enables the development of consistent quality assessment frameworks and methodologies.
Rationale: Introducing dqv:Metric in the DCAT-US profile enhances dataset quality assessment and management by aligning with international data quality standards. It allows data publishers to systematically define and communicate dataset quality characteristics, promoting transparency and informed data utilization, fostering trust, and supporting responsible data sharing within the DCAT-US ecosystem.
Reference

§ 4.1 Class: Metric [VOCAB-DQV]

Properties Summary

Property URI Range ReqLevel Card
in dimension dqv:inDimension dqv:Dimension M 1
expected DataType dqv:expectedDataType xsd:anySimpleType M 1
definition skos:definition rdfs:Literal R 0..n

Mandatory Properties

Property: in dimension

Property in dimension
Requirement level Mandatory
Cardinality 1
URI dqv:inDimension
Range dqv:Dimension
Definition Represents the dimensions a quality metric, certificate and annotation allow a measurement of.

Property: expected datatype

Property expected datatype
Requirement level Mandatory
Cardinality 1
URI dqv:expectedDataType
Range xsd:anySimpleType
Definition Represents the expected data type for the metric's observed value (e.g., xsd:boolean, xsd:double, etc.)

Example

Organization

RDF Class: org:Organization
Definition: Represents a collection of people organized together into a community or other social, commercial or political structure. The group has some common purpose or reason for existence which goes beyond the set of people belonging to it and can act as an Agent. Organizations are often decomposable into hierarchical structures.
Subclass Of: foaf:Agent
Usage note When utilizing the org:Organization class in DCAT-US 3.0, data publishers are encouraged to provide the preferred label (skos:prefLabel) for the organization, along with any relevant alternative labels (skos:altLabel) and abbreviations (skos:notation). This usage is consistent with the W3C Organization Recommendation standard [[VOCAB-ORG]]. This practice ensures comprehensive and flexible organization identification, improving data discoverability and search accuracy. Data publishers should strive to maintain consistency in naming conventions while considering variations and common aliases used to refer to organizations. By providing a well-rounded representation of organizations, DCAT-US 3.0 enhances data usability and transparency, facilitating efficient data search and retrieval.
Rationale: Improving the org:Organization class in DCAT-US 3.0 by supporting prefLabel, alternative labels, and abbreviations is essential to enhance organization representation. This enhancement accommodates variations in organization naming, promotes data interoperability, and improves discoverability within datasets. By incorporating these features, DCAT-US 3.0 aligns with best practices in data representation, enhances data search and transparency, and optimizes the overall usability of data resources.

Properties Summary

Property URI Range ReqLevel Card Changes from DCAT-US 1.1
name foaf:name xsd:string M 1..1 No Change
preferred label skos:prefLabel xsd:string O 0..1 Aligned
alternative label skos:altLabel xsd:string O 0..n Aligned
notation skos:notation xsd:string O 0..n Aligned
subOrganizationOf org:subOrganizationOf org:Organization O 0..1 No Change

Mandatory Properties

Property: name

Property name
Requirement level Mandatory
Cardinality 1
URI foaf:name
Range xsd:string
Definition The name of the Organization

Optional Properties

Property: preferred label

Property preferred label
Requirement level Optional
Cardinality 0..1
URI skos:prefLabel
Range xsd:string
Definition The legal name or preferred name of the Organization

Property: alternate label

Property alternate label
Requirement level Optional
Cardinality 0..n
URI skos:altLabel
Range xsd:string
Definition Alternative names (trading names, colloquial names) for an organization

Property: notation

Property notation
Requirement level Optional
Cardinality 0..n
URI skos:notation
Range xsd:string
Definition Abbreviations or codes from code lists for an organization (e.g., DOI, DOD)

Property: suborganization of

Property sub organization of
Requirement level Optional
Cardinality 0..n
URI org:subOrganizationOf
Range org:Organization
Definition Represents hierarchical containment of Organizations or OrganizationalUnits; indicates an Organization which contains this Organization.

Example

Period of Time

PeriodOfTime represents a period of time with a start date and an end.

RDF Class: dcterms:PeriodOfTime
Definition: PeriodOfTime represents a period of time with a start date and an end.
Usage note The start and end of the interval SHOULD be given by using properties dcat:startDate, and dcat:endDate, respectively. The interval can also be open - i.e., it can have just a start or just an end.
Rationale: The introduction of dcterms:PeriodOfTime in DCAT-US 3.0 is pivotal for harmonizing with international standards and rectifying the inconsistency with DCAT 1. In DCAT-US 1.1, [[ISO8601-1]] was used for interval representation in dcterms:temporal, diverging from DCAT 1's requirement of dcterms:PeriodOfTime. This alignment with DCAT 3 standards in DCAT-US 3.0 not only resolves discrepancies but also streamlines data processing, simplifying parsing and indexing of time intervals. By adopting dcterms:PeriodOfTime, DCAT-US 3.0 promotes ease of implementation, ensuring uniformity, flexibility, accuracy, and enhanced interoperability in handling time-related data, ultimately benefiting data usability and exchange.
Property URI Range ReqLevel Card
start date dcat:startDate rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) R 0..1
end date dcat:endDate rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) R 0..1

Example
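
The start and end properties can be assembled as follows. This is a minimal sketch, assuming xsd:date precision and assuming that a start date must not fall after the end date; open intervals are produced by omitting either bound, as the usage note allows. The function name period_of_time and the "@type"/"@value" keys are illustrative assumptions.

```python
from datetime import date
from typing import Optional


def period_of_time(start: Optional[date] = None,
                   end: Optional[date] = None) -> dict:
    """Build a dcterms:PeriodOfTime description; either bound may be omitted."""
    if start is not None and end is not None and start > end:
        raise ValueError("start date must not be after end date")
    period: dict = {"@type": "dcterms:PeriodOfTime"}
    if start is not None:
        period["dcat:startDate"] = {
            "@value": start.isoformat(), "@type": "xsd:date"}
    if end is not None:
        period["dcat:endDate"] = {
            "@value": end.isoformat(), "@type": "xsd:date"}
    return period


closed = period_of_time(date(2020, 1, 1), date(2020, 12, 31))
open_ended = period_of_time(start=date(2020, 1, 1))
print(closed)
```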

Person

RDF Class: foaf:Person
Definition: This class represents an individual human being or a person. It can be used to provide information about individuals, such as their name, email address, homepage URL, and other personal details.
Subclass Of: foaf:Agent
Usage note
Rationale: The foaf:Person class is enhanced in DCAT-US 3.0 to provide a more comprehensive and standardized representation of individuals within datasets. In earlier versions, like DCAT-US 1.1, only a single "name" property was available for describing persons, limiting the richness of personal data representation. By introducing properties like "firstName," "givenName," and "affiliation," DCAT-US 3.0 aligns with best practices in data representation, allowing data publishers to provide more detailed information about individuals and their affiliations with organizations. This improves data usability and transparency.
Property URI Range ReqLevel Card
name foaf:name xsd:string M 1..1
given name foaf:givenName xsd:string O 0..1
first name foaf:firstName xsd:string R 0..1
member of org:memberOf org:Organization O 0..n

Mandatory Properties

Property: name

Property name
Requirement level Mandatory
Cardinality 1
URI foaf:name
Range xsd:string
Definition The full name of the Person

Optional Properties

Property: affiliation

Property affiliation
Requirement level Optional
Cardinality 0..n
URI org:memberOf
Range org:Organization
Definition This property MAY be used to specify the affiliation of the Person to an organization.

Example

Provenance Statement

RDF Class: dcterms:ProvenanceStatement
Obligation Optional
Definition: Any changes in ownership and custody of a resource since its creation that are significant for its authenticity, integrity, and interpretation.
Usage note The dcterms:ProvenanceStatement in DCAT-US 3.0 offers flexibility in how it can be referenced. It can either be referred to by a URL or included in-line by using a label. This versatility allows data publishers to choose the most suitable method for providing information about significant changes in ownership and custody, enhancing the accessibility and usability of provenance details within datasets.
Rationale: Introducing dcterms:ProvenanceStatement in DCAT-US 3.0 enhances dataset transparency and trustworthiness. It allows data publishers to provide structured information about significant changes in ownership and custody, aligning with international data quality and provenance standards. This flexibility ensures greater confidence in dataset authenticity and interpretation, promoting responsible data usage within DCAT-US.
Reference

§ Term name: ProvenanceStatement [DCTERMS]

Properties Summary

Property URI Range ReqLevel Card
provenance statement text rdfs:label xsd:string R 0..n
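
As an illustrative sketch of the two referencing styles described in the usage note, a provenance statement may be given in-line with a label or referenced by URL. The ex: namespace, the dataset identifiers, the provenance URL, and the label text are all hypothetical.

```turtle
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:      <https://example.gov/id/> .   # hypothetical namespace

# In-line provenance statement carrying a label
ex:dataset-001 a dcat:Dataset ;
    dcterms:provenance [
        a dcterms:ProvenanceStatement ;
        rdfs:label "Custody transferred from the originating office to the agency data archive."
    ] .

# Provenance statement referenced by URL
ex:dataset-002 a dcat:Dataset ;
    dcterms:provenance <https://example.gov/provenance/dataset-002> .
```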

Role

RDF Class: dcat:Role
Obligation Optional
Definition: A role is the function of a resource or agent with respect to another resource, in the context of resource attribution or resource relationships.
Usage note Used in a qualified-attribution to specify the role of an Agent with respect to an Entity. It is recommended that the values be managed as a controlled vocabulary of agent roles, such as [[?ISO-19115-1]] CI_RoleCode.
Rationale: Integrating dcat:Role within dcat:Relationship in DCAT-US enriches data networks by providing clear, navigable, and semantically transparent relationships among datasets, thereby enhancing data discoverability, usability, and integration across various applications and use cases by precisely depicting complex data dependencies and hierarchies.

Properties Summary

Property URI Range ReqLevel Card
alternate label skos:altLabel rdfs:Literal O 0..n
definition skos:definition rdfs:Literal R 0..n
in scheme skos:inScheme skos:ConceptScheme M 1..1
notation skos:notation xsd:string O 0..n
preferred label skos:prefLabel rdfs:Literal M 1..n

Mandatory Properties

Property: preferred label

Property preferred label
Requirement level Mandatory
Cardinality 1..n
URI skos:prefLabel
Range rdfs:Literal
Definition Preferred label for the controlled vocabulary term (one per language).

Property: concept scheme

Property in scheme
Requirement level Mandatory
Cardinality 1
URI skos:inScheme
Range skos:ConceptScheme
Definition Concept scheme defining the role

Optional Properties

Property: alternate label

Property alternate label
Requirement level Optional
Cardinality 0..n
URI skos:altLabel
Range rdfs:Literal
Definition Alternative labels for the role.

Property: notation

Property notation
Requirement level Optional
Cardinality 0..n
URI skos:notation
Range xsd:string
Definition Abbreviations or codes for the role.
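
A role can be sketched as a term in a controlled vocabulary, as suggested by the usage note. The ex: namespace, the concept scheme, and all label and definition texts below are hypothetical; a real catalog would normally reuse an existing role vocabulary.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <https://example.gov/id/> .     # hypothetical namespace

# Hypothetical concept scheme holding the agent roles
ex:agent-roles a skos:ConceptScheme ;
    skos:prefLabel "Agent roles"@en .

# A role with both mandatory properties (prefLabel, inScheme)
ex:role-custodian a dcat:Role ;
    skos:prefLabel "custodian"@en ;           # one preferred label per language
    skos:altLabel "data custodian"@en ;
    skos:definition "Party that accepts responsibility for the care of the resource."@en ;
    skos:notation "custodian" ;
    skos:inScheme ex:agent-roles .
```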

Quality Measurement

RDF Class: dqv:QualityMeasurement
Obligation Optional
Definition: Represents the evaluation of a given dataset (or dataset distribution) against a specific quality metric.
Usage note Represents the evaluation of a given resource (as a Data Service, Dataset, or Distribution) against a specific quality metric, such as spatial resolution in scale, angle or metric.
Rationale: The inclusion of dqv:QualityMeasurement in DCAT-US assists end-users in better evaluating the fitness of use of resources. This optional class enhances data quality assessment, aligns with international standards (DQV), and enables more precise evaluation against specific quality metrics, ultimately improving data usability and adherence to recognized quality assessment practices.
Reference

§ 4.1 Class: Quality Measurement [VOCAB-DQV]

Properties Summary

Property URI Range ReqLevel Card
is measurement of dqv:isMeasurementOf dqv:Metric M 1
value dqv:value rdfs:Literal M 1
unit of measure sdmx-attribute:unitMeasure rdfs:Resource O 0..1

Mandatory Properties

Property: is measurement of

Property is measurement of
Requirement level Mandatory
Cardinality 1
URI dqv:isMeasurementOf
Range dqv:Metric
Definition Indicates the metric being observed.

Property: value

Property value
Requirement level Mandatory
Cardinality 1
URI dqv:value
Range rdfs:Literal
Definition Refers to values computed by metric.

Optional Properties

Property: unit of measure

Property unit of measure
Requirement level Optional
Cardinality 0..1
URI sdmx-attribute:unitMeasure
Range rdfs:Resource
Definition Unit of measure associated with the value

Example
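A minimal sketch of a quality measurement attached to a dataset is shown below, using spatial resolution as the metric. The ex: namespace, the metric, and the value are hypothetical; the unit is given as a QUDT unit IRI purely as an illustration of an rdfs:Resource value, and dqv:hasQualityMeasurement is the linking property defined by DQV.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dqv:  <http://www.w3.org/ns/dqv#> .
@prefix sdmx-attribute: <http://purl.org/linked-data/sdmx/2009/attribute#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <https://example.gov/id/> .     # hypothetical namespace

ex:dataset-001 a dcat:Dataset ;
    dqv:hasQualityMeasurement ex:measurement-001 .

# The measurement: which metric was observed, its value, and the unit
ex:measurement-001 a dqv:QualityMeasurement ;
    dqv:isMeasurementOf ex:spatial-resolution-as-distance ;
    dqv:value "30"^^xsd:decimal ;
    sdmx-attribute:unitMeasure <http://qudt.org/vocab/unit/M> .   # metre (illustrative)

# Hypothetical metric definition
ex:spatial-resolution-as-distance a dqv:Metric .
```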

Relationship

RDF Class: dcat:Relationship
Definition: An association class for attaching additional information to a relationship between DCAT Resources.
Usage note Use to characterize a relationship between datasets, and potentially other resources, where the nature of the relationship is known but is not adequately characterized by the standard [[?DCTERMS]] properties (dcterms:hasPart, dcterms:isPartOf, dcterms:conformsTo, dcterms:isFormatOf, dcterms:hasFormat, dcterms:isVersionOf, dcterms:hasVersion, dcterms:replaces, dcterms:isReplacedBy, dcterms:references, dcterms:isReferencedBy, dcterms:requires, dcterms:isRequiredBy) or [[PROV-O]] properties (prov:wasDerivedFrom, prov:wasInfluencedBy, prov:wasQuotedFrom, prov:wasRevisionOf, prov:hadPrimarySource, prov:alternateOf, prov:specializationOf).
Rationale: The introduction of dcat:Relationship in DCAT-US serves to enhance the representation and description of relationships between datasets and other resources. This class allows for the attachment of additional information to relationships that are not adequately characterized by standard properties, promoting a more comprehensive understanding of dataset connections. By accommodating nuanced relationship types beyond existing standards like [[DCTERMS]] and [[PROV-O]] properties, DCAT-US ensures greater flexibility and precision in documenting dataset relationships, facilitating more informed data discovery and utilization.

Properties Summary

Property URI Range ReqLevel Card
relation dcterms:relation dcat:Resource M 1
role dcat:hadRole dcat:Role M 1

Mandatory Properties

Property: relation

Property relation
Requirement level Mandatory
Cardinality 1
URI dcterms:relation
Range dcat:Resource
Definition The resource related to the source resource.

Property: role

Property role
Requirement level Mandatory
Cardinality 1
URI dcat:hadRole
Range dcat:Role
Definition The function of an entity or agent with respect to another entity or resource.

Example
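The sketch below shows one way a qualified relationship between two datasets might be expressed, using dcat:qualifiedRelation from DCAT to attach the relationship to the source dataset. The ex: namespace, the dataset identifiers, and the role and its scheme are hypothetical.

```turtle
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:      <https://example.gov/id/> .  # hypothetical namespace

# dataset-001 is related to dataset-002 in the role "supplement"
ex:dataset-001 a dcat:Dataset ;
    dcat:qualifiedRelation [
        a dcat:Relationship ;
        dcterms:relation ex:dataset-002 ;     # the related resource
        dcat:hadRole ex:role-supplement       # the nature of the relationship
    ] .

ex:dataset-002 a dcat:Dataset .

# Hypothetical role term
ex:role-supplement a dcat:Role ;
    skos:prefLabel "supplement"@en ;
    skos:inScheme ex:relationship-roles .

ex:relationship-roles a skos:ConceptScheme .
```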

RightsStatement

RDF Class: dcterms:RightsStatement
Obligation Optional
Definition: A statement about the intellectual property rights (IPR) held in or over a resource, a legal document giving official permission to do something with a resource, or a statement about access rights.
Usage note Information about rights SHOULD be provided on the level of Distribution. Information about rights MAY be provided for a Dataset in addition to but not instead of the information provided for the Distributions of that Dataset. Providing rights information for a Dataset that is different from information provided for a Distribution of that Dataset SHOULD be avoided as this can create legal conflicts.
Rationale: The introduction of dcterms:RightsStatement in DCAT-US is vital for standardizing the conveyance of intellectual property rights (IPR) and access permissions. This optional class accommodates URL references and custom rights statements via attribution text, promoting transparency and compliance. By encouraging consistent rights information at the Distribution and optional Dataset levels, DCAT-US enhances data sharing while reducing legal conflict risks.
Reference

§ Term name: RightsStatement [DCTERMS]

Properties Summary

Property URI Range ReqLevel Card
label rdfs:label rdfs:Literal R 0..n
attribution text odrs:attributionText rdfs:Literal R 0..n

Example
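As a sketch under stated assumptions, rights may be given for a Distribution either as an in-line statement with a label and attribution text or as a URL reference. The ex: namespace, the rights URL, and the literal texts are hypothetical; the odrs: prefix is assumed to resolve to the Open Data Rights Statement vocabulary namespace.

```turtle
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix odrs:    <http://schema.theodi.org/odrs#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:      <https://example.gov/id/> .  # hypothetical namespace

# Custom rights statement given in-line on the Distribution
ex:distribution-001 a dcat:Distribution ;
    dcterms:rights [
        a dcterms:RightsStatement ;
        rdfs:label "Free to reuse with attribution."@en ;
        odrs:attributionText "Source: Example Agency"@en
    ] .

# Rights statement referenced by URL
ex:distribution-002 a dcat:Distribution ;
    dcterms:rights <https://example.gov/rights/open> .
```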

Standard

RDF Class: dcterms:Standard
Obligation Optional
Definition: A standard or other specification to which a Dataset or Distribution conforms.
Usage note A standard or other specification to which a Catalog, Catalog Record, Data Service, Dataset, or Distribution conforms
Rationale: The inclusion of dcterms:Standard in DCAT-US accommodates standard references through URLs or custom, detailed descriptions when specific standards are not available, promoting flexibility and completeness in resource metadata.
Reference

§ Term name: Standard [DCTERMS]

Properties Summary

Property URI Range ReqLevel Card
description dcterms:description rdfs:Literal R 0..n
identifier dcterms:identifier xsd:string R 0..n
issued dcterms:issued xsd:date R 0..1
title dcterms:title rdfs:Literal R 0..n
type dcterms:type skos:Concept R 0..n
version dcat:version xsd:string R 0..1
in scheme skos:inScheme skos:ConceptScheme O 0..1
creation date dcterms:created xsd:date O 0..1
update/modification date dcterms:modified xsd:date O 0..1

Optional Properties

Property: creation date

Property creation date
Requirement level Optional
Cardinality 0..1
URI dcterms:created
Range xsd:date
Definition This property contains the date on which the Standard has been first created.

Property: update/modification date

Property update/modification date
Requirement level Optional
Cardinality 0..1
URI dcterms:modified
Range xsd:date
Definition This property contains the most recent date on which the Standard was changed or modified.

Examples
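The following sketch describes a standard in-line and links a dataset to it with dcterms:conformsTo. The standard, its identifier, dates, and version are invented for illustration, as is the ex: namespace.

```turtle
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:      <https://example.gov/id/> .  # hypothetical namespace

ex:dataset-001 a dcat:Dataset ;
    dcterms:conformsTo ex:standard-address-data .

# Hypothetical standard described with the recommended and optional properties
ex:standard-address-data a dcterms:Standard ;
    dcterms:title "Example Agency Address Data Standard"@en ;
    dcterms:description "Content and exchange rules for address records."@en ;
    dcterms:identifier "EA-ADS" ;
    dcat:version "2.1" ;
    dcterms:issued "2020-01-15"^^xsd:date ;
    dcterms:created "2019-06-01"^^xsd:date ;
    dcterms:modified "2021-03-10"^^xsd:date .
```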

UseRestriction

RDF Class: dcat-us:UseRestriction
Definition: A UseRestriction is a set of rules, guidelines, or legal provisions that dictate how a particular resource, asset, information, or object can be utilized. Use restrictions may encompass limitations on access, distribution, reproduction, modification, or sharing, and they are often put in place to protect privacy, intellectual property rights, security, or compliance with legal or ethical standards.
Usage note When utilizing the dcat-us:UseRestriction class, data publishers are encouraged to provide comprehensive and precise details regarding the specific use restrictions applied to a resource. This may include information on access limitations, distribution rules, reproduction guidelines, modification constraints, and any other pertinent restrictions. Adherence to NARA guidelines and standards should be a priority when defining use restrictions, ensuring that data resources align with archival and preservation practices. By offering clear and concise use restriction information, data consumers can make informed decisions about the utilization of these resources while complying with NARA's requirements.
Rationale: The introduction of dcat-us:UseRestriction in DCAT-US 3.0, aligned with NARA (National Archives and Records Administration) guidelines, enhances compliance and interoperability with NARA-specific use restriction standards. This enables organizations to accurately convey NARA-specific restrictions on data resources, ensuring adherence to archival and data preservation requirements, and promoting consistent data management practices within the DCAT-US framework.

Properties Summary

Property URI Range ReqLevel Card
restriction status dcat-us:restrictionStatus skos:Concept M 1..1
specific restriction dcat-us:specificRestriction skos:Concept R 0..1
restriction note dcat-us:restrictionNote rdfs:Literal O 0..1

Mandatory Properties

Property: restriction status

Property restriction status
Requirement level Mandatory
Cardinality 1
URI dcat-us:restrictionStatus
Range skos:Concept
Definition Indication of whether or not there are use restrictions on the archival materials

Optional Properties

Property: restriction note

Property restriction note
Requirement level Optional
Cardinality 0..1
URI dcat-us:restrictionNote
Range rdfs:Literal
Definition Significant information pertaining to the use or reproduction of the data.

Example
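A minimal sketch of a use restriction is given below. The dcat-us: namespace IRI is a placeholder and should be replaced with the namespace declared by the profile; the ex: namespace, the concept identifiers, their labels, and the note text are likewise hypothetical. The property that links a resource to its use restriction is not shown here.

```turtle
@prefix dcat-us: <https://example.org/ns/dcat-us#> .   # placeholder namespace IRI
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:      <https://example.gov/id/> .           # hypothetical namespace

ex:use-restriction-001 a dcat-us:UseRestriction ;
    dcat-us:restrictionStatus ex:status-restricted ;        # mandatory, exactly one
    dcat-us:specificRestriction ex:restriction-copyright ;
    dcat-us:restrictionNote "Reproduction requires permission from the rights holder."@en .

# Hypothetical controlled-vocabulary terms
ex:status-restricted a skos:Concept ;
    skos:prefLabel "Restricted"@en .

ex:restriction-copyright a skos:Concept ;
    skos:prefLabel "Copyright"@en .
```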

Usage Guidelines

Dereferenceable identifiers

The FAIR principles, under the Findability and the Accessibility chapters respectively, state that:

  • F1: (Meta)data are assigned a globally unique and persistent identifier.
  • A1: (Meta)data are retrievable by their identifier using a standardised communications protocol.

The ability to unambiguously identify and access resources is foundational to digital data management. Guided by the FAIR principles, this section covers generating resolvable URLs, the importance of URI resolution, the roles of various identifier resolution services, and the distinctions between alternate identifier properties, so that data is not only uniquely identifiable but also consistently accessible.

Generating Resolvable URLs

In the context of FAIR data, resources on the web must have unique, persistent, and resolvable identifiers. To support persistence, resource identifiers must comply with the IETF RFC 3986 standard for URIs (or with IRIs, which extend URIs to cope with Unicode). This means that an identifier must comprise the following components:

  • a scheme: http or https
  • an authority: www.example.com
  • optionally a path: /dataset-name/
  • a local identifier (such as a database accession number, for example P12133 from UniProt) or a globally unique identifier (such as a UUID or hash code).
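
Putting the components above together, a resource identifier might look like the following sketch. The authority and path are the placeholder values from the list, and typing the resource as a dcat:Dataset is an assumption made only for illustration.

```turtle
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .

# scheme:           https
# authority:        www.example.com
# path:             /dataset-name/
# local identifier: P12133
<https://www.example.com/dataset-name/P12133> a dcat:Dataset ;
    dcterms:identifier "P12133" .
```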

Identifier Resolution

URI resolution is a fundamental process that involves directing requests to the appropriate identified entity. The standard approach typically entails resolving an HTTP GET request through content negotiation, enabling the selection of different representations of the desired resource.

A PURL, or persistent URL, serves as a permanent address for accessing web resources. To grasp the concept of PURLs, it's essential to first understand the concept of URL indirection (also known as URL redirect or URL forwarding). This practice involves providing a stable and fixed web address/URL that is configured to point to different content, which might undergo periodic modifications.

When a user accesses a PURL, they are automatically redirected to the current location of the resource. This means that when an author decides to relocate a page, they can easily update the PURL to direct it to the new location.

The practice of indirection proves beneficial as it ensures a consistent URL address for resources that are prone to change, such as due to version updates or ownership changes.

A concrete example of this practice can be observed in the use of PURLs for identifying OBO Foundry resources. For instance, the URL http://purl.obolibrary.org/obo/stato.owl redirects to the latest release of the file, which can be found at https://raw.githubusercontent.com/ISA-tools/stato/dev/releases/latest_release/stato.owl.

PURLs sharing a common prefix are organized into domains, each managed by a single maintainer. The maintainer has the authority to add new PURLs to the domain and make modifications to existing PURLs within that domain.

According to FAIR Principle A1, it is essential for (meta)data to be retrievable using its identifier. When the identifier itself is not a resolvable URL, Identifier Resolution Services are required. These services possess the capability to map an Internationalized Resource Identifier (IRI) to a specific location where the corresponding data can be accessed.

Identifier Resolution services

Identifier Resolution Services provide unique and persistent identifiers for digital objects and entities, which helps keep access to resources consistent over time. This section describes several prominent services and their functions. It is not an exhaustive list but a selection of popular examples intended to illustrate the diversity of such services.

purl.org
The PURL system is a service of the Internet Archive that provides an interface for administering PURL domains. For more information about the service, visit https://archive.org/services/purl/help
w3id.org

w3id.org provides persistent identifiers for Linked Data resources. These identifiers can be used in DCAT to uniquely identify datasets and data services, which can help to improve their discoverability and interoperability.

To request a redirect, send a message to the public-perma-id@w3.org mailing list. Include the URL that you want on w3id.org, the URL that it should redirect to, and the HTTP status code to use when redirecting. An administrator will then create the redirect for you.

doi.org
DOI.org is a digital identifier system that assigns unique and persistent identifiers to digital objects. These identifiers can be used to cite, share, and track digital objects across different platforms and systems. DOI.org identifiers can be used in DCAT to uniquely identify datasets and data services. This can help to improve the discoverability and interoperability of datasets and data services.
orcid.org
ORCID (Open Researcher and Contributor ID) is a global, non-profit organization that provides a unique and persistent identifier for researchers. ORCID IDs are used to link researchers to their professional activities, such as publications, grants, and affiliations. This helps to ensure that researchers are properly credited for their work and that their work is more easily discoverable. ORCID is a valuable tool for researchers, and it is becoming increasingly important as the research landscape becomes more complex.
arxiv.org
ArXiv identifiers are globally unique identifiers (GUIDs) assigned to scholarly articles submitted to the arXiv preprint server. These identifiers can be used in DCAT (Data Catalog Vocabulary) to uniquely identify authors and their publications. This can help to improve the discoverability and interoperability of research data.
Identifiers.org
The Identifiers.org Resolution Service provides consistent access to life science data using Compact Identifiers. Compact Identifiers consist of an assigned unique prefix and a local provider designated accession number (prefix:accession). The resolving location of Compact Identifiers is determined using information that is stored in the Identifiers.org Registry.

Alternate identifiers

In the realm of data cataloging, identifiers play a pivotal role in ensuring the uniqueness, traceability, and interoperability of resources. Different namespaces and vocabularies offer distinct properties to denote identifiers. Here, we discuss three such properties: dcterms:identifier, adms:identifier, and skos:notation, shedding light on their distinct usages and nuances.

dcterms:identifier

Originating from the Dublin Core Metadata Terms (DCTERMS), dcterms:identifier is a broad and general property used to denote a unique reference for a resource. It does not impose any constraint on the format or nature of the identifier. In essence, it's a flexible property that can be employed across various domains and for diverse types of resources, be they digital documents, physical artifacts, or abstract concepts.

adms:identifier

The Asset Description Metadata Schema ([[VOCAB-ADMS]]) introduces adms:identifier. Unlike the more generic dcterms:identifier, this property is more structured. It's designed to link a resource to its identifier, which is itself described using further properties. This allows for a richer description of the identifier, such as specifying its type (e.g., ISBN, DOI), its status, and its version. It's particularly useful in contexts where there's a need to provide additional metadata about the identifier itself, beyond just its value.

skos:notation

skos:notation is a property from the Simple Knowledge Organization System ([[SKOS-REFERENCE]]) vocabulary. It's used to provide a symbolic string notation for a concept. While it can function similarly to an identifier, its primary intention is to give a machine-readable, often standardized, symbolic name to a concept, especially when such a notation exists in a legacy or external system. For example, in a controlled vocabulary, each concept might have a notation that denotes its code in a classification scheme.
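
The difference between the three properties can be sketched as follows. The ex: namespace, the dataset, the DOI value (which uses a test prefix), the agency name, and the classification code are all hypothetical and serve only to illustrate the shape of each property.

```turtle
@prefix adms:    <http://www.w3.org/ns/adms#> .
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:      <https://example.gov/id/> .  # hypothetical namespace

ex:dataset-001 a dcat:Dataset ;
    # dcterms:identifier: a plain, unconstrained identifier value
    dcterms:identifier "dataset-001" ;
    # adms:identifier: the identifier is itself a described resource
    adms:identifier [
        a adms:Identifier ;
        skos:notation "10.5555/example-001" ;           # the identifier string
        adms:schemaAgency "Hypothetical DOI registrant"
    ] .

# skos:notation: a code for a concept within a classification scheme
ex:theme-environment a skos:Concept ;
    skos:prefLabel "Environment"@en ;
    skos:notation "ENV" .
```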

Multilingualism

From a technical perspective multilingualism SHOULD be handled as follows:

The table lists multilingual properties of DCAT-US and the translation strategies that apply to them:

Property RDF property Range Multilingual Support
Catalog title dcterms:title rdfs:Literal Language encoded string
Catalog description dcterms:description rdfs:Literal Language encoded string
Dataset title dcterms:title rdfs:Literal Language encoded string
Dataset description dcterms:description rdfs:Literal Language encoded string
Dataset keyword dcat:keyword rdfs:Literal Language encoded string
Catalog homepage foaf:homepage foaf:Document Content negotiation
Dataset landing page dcat:landingPage foaf:Document Content negotiation
Catalog publisher dcterms:publisher foaf:Agent Content negotiation for the URI and language encoded string for the name
Dataset publisher dcterms:publisher foaf:Agent Content negotiation for the URI and language encoded string for the name
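
As a sketch of the "language encoded string" strategy in the table above, literal-valued properties carry one language-tagged value per language, while document-valued properties point to a single URL that is expected to serve the appropriate language through content negotiation. The ex: namespace, the dataset, its texts, and the landing page URL are hypothetical.

```turtle
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf:    <http://xmlns.com/foaf/0.1/> .
@prefix ex:      <https://example.gov/id/> .  # hypothetical namespace

ex:dataset-001 a dcat:Dataset ;
    # Language encoded strings: one value per language tag
    dcterms:title "Air quality measurements"@en ,
                  "Mediciones de la calidad del aire"@es ;
    dcterms:description "Hourly air quality readings."@en ,
                        "Lecturas horarias de la calidad del aire."@es ;
    dcat:keyword "air quality"@en , "calidad del aire"@es ;
    # Content negotiation: a single URL for all languages
    dcat:landingPage <https://example.gov/datasets/air-quality> ;
    dcterms:publisher ex:agency .

# Publisher: one URI, with the name given per language
ex:agency a foaf:Agent ;
    foaf:name "Example Agency"@en , "Agencia de Ejemplo"@es .
```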

Stakeholders

In the realm of data cataloging and management, understanding the entities involved in the creation, curation, and maintenance of datasets is paramount. This section delves into the intricate details of these entities, categorizing them into distinct classes such as "Agent," "Person," and "Organization." Each class provides a structured framework to represent various stakeholders, from individuals to software agents and organizations, ensuring that data provenance is transparent and traceable. As we navigate through this section, we'll gain insights into the properties, roles, and significance of these agent representations within the DCAT-US 3.0 context, highlighting their pivotal role in enhancing data discoverability, interoperability, and usability.

Using globally unique, resolvable URLs and Persistent Identifiers (PIDs) is essential to the integrity and usability of data ecosystems, especially for identifying agents. This practice ensures unambiguous identification, avoiding the duplication and inconsistency that arise when one agent has multiple URIs, and improves data discoverability and accessibility. By employing a single, stable identifier per agent, data practitioners guard against data misinterpretation and keep data lineage coherent and traceable, which strengthens provenance and trust across platforms and datasets. Furthermore, using reference registries such as ORCID or the Research Organization Registry (ROR) aligns with global data management standards and supports collaborative research and data sharing. More details about identifiers are provided in the Dereferenceable identifiers section.

Agent

The foaf:Agent class in the Friend of a Friend [[FOAF]] ontology serves a dual-purpose role, particularly in the context of data cataloging and management.

Firstly, it acts as an abstract class for both org:Organization and foaf:Person, providing a generalized representation that encompasses various entities involved in dataset production and management. This abstraction facilitates the encapsulation of common properties and behaviors, enabling a unified approach to handling different entity types in data documentation and interoperability.

Secondly, foaf:Agent is utilized as a class to represent autonomous software agents, which are self-operating software entities capable of performing tasks and making decisions without direct human intervention.

This dual functionality of foaf:Agent not only streamlines the representation of human and non-human actors in data management processes but also provides a flexible and semantically rich framework to describe and interlink various entities within the DCAT-US schema, thereby enhancing data discoverability and usability.

Person

A person agent represents an individual involved in producing or managing datasets. It provides information about the person and their associated contact details.

In the context of DCAT-US 3.0, the foaf:Person class plays a crucial role. It is used to represent individuals who are associated with or responsible for the datasets or resources described within the DCAT profile.

Let's break down the specific properties associated with foaf:Person and their significance within this context:

  • foaf:name: This property represents the full name of a person. It can be used to specify the full name of individuals associated with datasets. For example, if a person's full name is "John Smith," you can use this property to provide their complete name. This property is the only property mandatory for describing a person and is typically used for display.
  • foaf:firstName: This optional property represents the first name of a person. It is used to provide the first name of individuals associated with or responsible for resources in the DCAT-US profile. It can be used to provide structured information about an individual's first name.
  • foaf:givenName: This optional property represents the given name of a person. It can be used to provide structured information about an individual's name.
  • org:memberOf: Although not a FOAF property, this property indicates the organization or group with which a person is affiliated. In the context of a DCAT-US profile, it can be used to specify the organization or entity with which an individual is affiliated in relation to the described resources. For instance, if a person is a member of the Department of the Interior, this property can link them to that organization, identified by http://www.doi.gov.

Organization

The Organization agent plays a pivotal role in representing an organization or institution that is instrumental in the production or management of a resource. It encapsulates information about the organization accountable for the resource, along with its pertinent contact details, thereby acting proficiently as an Agent. Furthermore, it can be hierarchically decomposed into sub-organizations, offering a structured view of the organizational layers.

When employing org:Organization within DCAT-US 3.0, adherence to the following guidelines is imperative:

  • Use Recognized URL Identifiers: It is strongly recommended to utilize well-known URL identifiers for organizations that are centrally managed by a government registry. This practice ensures the unambiguous identification of organizations and fosters consistency and reliability in organizational referencing across cataloged resources.
  • Ensure Consistency: Employ foaf:name to furnish a consistent and recognizable name for the organization, thereby maintaining a uniform identity across various platforms.
  • Enhance Discoverability: Leverage skos:prefLabel to designate the preferred label, ensuring that the organization is effortlessly discoverable and identifiable across diverse search scenarios.
  • Accommodate Variations: Utilize skos:altLabel to incorporate alternative names, acronyms, or aliases, thereby enhancing searchability and augmenting user-friendliness by accommodating various naming conventions.
  • Provide Abbreviations: Employ skos:notation to document any abbreviations or short forms that are commonly associated with the organization, facilitating users in recognizing and associating the organization with its widely-used abbreviations.
  • Represent Hierarchy: Optionally, utilize org:subOrganizationOf to depict hierarchical relationships, providing a structured and layered view of the organization and its sub-entities.
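
The guidelines above can be sketched as follows for a bureau and its parent department. The IRIs are used here only as illustrative identifiers and are not taken from an official government registry; the labels and abbreviation are given as examples.

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix org:  <http://www.w3.org/ns/org#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

# Sub-organization with name, preferred and alternative labels, and abbreviation
<https://www.usgs.gov> a org:Organization ;
    foaf:name "U.S. Geological Survey" ;
    skos:prefLabel "U.S. Geological Survey"@en ;
    skos:altLabel "United States Geological Survey"@en ;
    skos:notation "USGS" ;
    org:subOrganizationOf <http://www.doi.gov> .   # hierarchy (optional)

# Parent organization
<http://www.doi.gov> a org:Organization ;
    foaf:name "Department of the Interior" ;
    skos:prefLabel "Department of the Interior"@en ;
    skos:notation "DOI" .
```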

Contact Point

The Contact Point serves as a crucial element in data cataloging, providing a reference for users to seek additional information, clarifications, or support regarding a resource published in a catalog. In the DCAT-US profile, contact point information is encoded using the widely used [[VCARD-RDF]] vocabulary, ensuring standardized representation and interoperability of contact details across various platforms and applications.

A contact point may refer to an individual, a team, or an organization responsible for the resource (dcat:Dataset, dcat:DataService, dcat:DatasetSeries, dcat:Catalog) and is typically characterized by properties such as name, email, and telephone number. The inclusion of address details, role or title, and associated organizational details further enriches the contact information, providing users with multiple avenues to facilitate communication.

It is imperative to ensure that the contact point information is accurate, up-to-date, and reliable to foster trust and facilitate efficient communication between data providers and consumers. The following sub-sections provide detailed guidance on encoding contact point information, defining associated address details, and linking the contact point to the resources in the DCAT-US profile.

Encoding Contact Information

The contact information is encoded using the vcard:Kind class. If the contact information is reused in many resources, it is recommended to identify it with a URI to avoid duplicate entries. The vcard:fn (formatted name) and vcard:email (email address) properties are mandatory to ensure basic contactability. Additional properties such as vcard:tel (telephone number) and vcard:title (role or title) can be used to provide comprehensive details about the contact point. If the contact is a person, the properties vcard:given-name and vcard:family-name can be used.

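              @prefix vcard: <http://www.w3.org/2006/vcard/ns#> .
              @prefix : <https://example.org/> .   # placeholder namespace for the examples

              # Contact point identified by a URI so that it can be reused
              :vcard123 a vcard:Kind ;
                  vcard:fn "John Doe" ;
                  vcard:email <mailto:john.doe@example.com> ;
                  vcard:tel <tel:+123456789> ;
                  vcard:family-name "Doe" ;
                  vcard:given-name "John" ;
                  vcard:title "Data Manager" ;
                  vcard:hasAddress :address456 ;
              .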
          

Defining Address Details

Address details, when applicable, are encoded using the vcard:Address class and linked to the contact point using the vcard:hasAddress property. The address does not need a URI if it is not reused anywhere else in the catalog. The address class can include properties such as vcard:street-address, vcard:locality, vcard:region, vcard:postal-code, and vcard:country-name to provide detailed location information about the contact point.

The following example illustrates how to define and encode address details, ensuring clarity and usability for data consumers.

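            @prefix vcard: <http://www.w3.org/2006/vcard/ns#> .
            @prefix : <https://example.org/> .   # placeholder namespace for the examples

            :address456 a vcard:Address ;
                vcard:street-address "123 Main Street" ;
                vcard:locality "Anytown" ;
                vcard:region "CA" ;
                vcard:postal-code "12345" ;
                vcard:country-name "USA" ;
            .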
           

Linking Contact Point to Resource

The contact point is associated with the dataset using the dcat:contactPoint property. This linkage ensures that users can easily identify and communicate with the responsible entity for additional information, support, or inquiries regarding the dataset.

The following example illustrates how to link the defined contact point to the dataset, ensuring clarity and facilitating user navigation and communication.

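            @prefix dcat: <http://www.w3.org/ns/dcat#> .
            @prefix dcterms: <http://purl.org/dc/terms/> .
            @prefix : <https://example.org/> .   # placeholder namespace for the examples

            # Title and description are Dublin Core terms, not DCAT terms
            :MyDataset a dcat:Dataset ;
                dcterms:title "My Example Dataset" ;
                dcterms:description "This dataset includes example data for demonstration purposes." ;
                dcat:contactPoint :vcard123 ;
            .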
                        

Resource Attributions

Attribution in data catalogs pertains to the systematic association of a resource (such as a dataset or service) with a responsible entity, termed an "agent". Agents, which can be individuals, organizations, or services, may contribute to, create, publish, or interact significantly with the data. The roles of agents, such as contributor, creator, publisher, funder, distributor, custodian, or editor, are crucial in understanding the lineage and responsibility of data management.

Attributions hold paramount importance in data catalog searches for several pivotal reasons:

  1. Provenance and Trustworthiness: Understanding the entities (agents) that have created or interacted with the data can significantly inform assessments of its quality and trustworthiness. Data originating from or managed by reputable and trusted organizations or individuals may be deemed more reliable and credible.
  2. Credit and Accountability: Proper attributions ensure that all contributing individuals or organizations are aptly acknowledged for their work or data. This practice not only adheres to ethical guidelines and potentially legal requirements but also fosters a culture of recognition and accountability in data management and sharing.
  3. Search and Discovery: Attributions serve as a valuable criterion in data search and discovery processes. Users may seek datasets created, managed, or contributed to by specific researchers, organizations, or other agents, thereby making attributions a vital component in filtering and locating data resources.
  4. Collaboration and Networking: Identifying and acknowledging the agents associated with datasets can pave the way for new collaborative opportunities. It enables users and researchers to identify and connect with individuals or organizations possessing relevant expertise or shared research interests.
  5. Issue Resolution: When users encounter issues or have queries about a dataset, attributions provide a clear pathway to seek clarifications, report issues, or obtain additional information. This ensures that data reliability and integrity are maintained through active resolution of issues and continuous improvement.

Standard Attributions and Roles

Employing standard properties such as dcterms:creator, dcterms:contributor, dcterms:rightsHolder, and dcterms:publisher, along with the generic prov:wasAttributedTo from [[!PROV-O]], facilitates the basic associations of responsible agents with a cataloged resource, ensuring clarity and standardization in data attribution.

Extended Attributions and Diverse Roles

Many other roles are significant in relation to cataloged resources, such as funder, distributor, custodian, and editor. Some of these roles are enumerated in the CI_RoleCode values from [[?ISO-19115-1]], in the [[?DataCite]] metadata schema, and in the MARC relators.

A general method for assigning an agent to a resource with a specified role is provided by prov:qualifiedAttribution from [[PROV-O]]. This method is particularly useful when the nature of the relationship is known but does not correspond with one of the standard attribution property roles.

The range of prov:qualifiedAttribution is prov:Attribution. The relevant Agent is specified via property prov:agent, whereas the role is specified with property dcat:hadRole, which takes as value a skos:Concept describing that role, such as those included in the relevant code list operated by a US Government-controlled Registry.

The prov:qualifiedAttribution property is utilized to provide more detailed and structured information about the attribution of a resource, allowing for the specification of additional attributes, such as the role or position of the attributed entity, the date of attribution, or other relevant details.

provides an illustration of the usage of attribution properties:
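As a minimal sketch of a qualified attribution, the Python snippet below assembles a JSON-LD description in which one agent is the publisher and a second agent is attached with an explicit role. The example.org IRIs, the role IRI, and the helper name qualified_attribution are hypothetical; an inline prefix context is used so the snippet does not depend on the contents of the DCAT-US context file.

```python
import json


def qualified_attribution(agent_iri: str, role_iri: str) -> dict:
    """Build a prov:Attribution node linking an agent to a role concept."""
    return {
        "@type": "prov:Attribution",
        "prov:agent": {"@id": agent_iri},
        # dcat:hadRole points at a skos:Concept from a role code list.
        "dcat:hadRole": {"@id": role_iri},
    }


dataset = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dcterms": "http://purl.org/dc/terms/",
        "prov": "http://www.w3.org/ns/prov#",
    },
    "@id": "https://example.org/dataset/1",  # hypothetical dataset IRI
    "@type": "dcat:Dataset",
    # Standard attribution: a role covered by a dedicated property.
    "dcterms:publisher": {"@id": "https://example.org/agency/a"},
    # Qualified attribution: a role with no dedicated property.
    "prov:qualifiedAttribution": [
        qualified_attribution(
            "https://example.org/agency/b",
            "https://example.org/roles/funder",  # hypothetical role concept
        )
    ],
}

print(json.dumps(dataset, indent=2))
```

The standard property and the qualified form can coexist on the same resource; the qualified form is only needed for roles that no dedicated property expresses.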

Resource Classification

Controlled vocabularies, including taxonomies and thesauri, dramatically enhance data searchability. Utilizing these vocabularies allows datasets to be systematically classified, tagged, and described with standardized terms, aiding users in retrieving relevant datasets, even when using varied terms or synonyms.

Employing controlled vocabularies enables semantic search, which comprehends the context and relationships behind search terms. This approach enhances search results, for example, linking "automobiles" with related terms like "cars" or "vehicles".

This enriched search experience is crucial for navigating vast, diverse datasets, ensuring comprehensive and relevant results, and bridging the gap between user intent and dataset content.

The DCAT-US profile utilizes properties from the DCAT 3 framework for resource classification, providing flexibility in the choice of controlled vocabularies to meet the specific needs of various communities or agencies.

Spatial Metadata

Spatial metadata play a vital role in the context of geospatial data within the US Government by providing essential information about data quality, facilitating data discovery and interoperability, and ensuring responsible data governance. They describe the characteristics, source, and limitations of geospatial datasets, enabling informed decision-making based on data credibility. Spatial metadata support efficient data discovery, retrieval, and sharing, reducing duplication and promoting collaboration. They also promote interoperability by adhering to standardized metadata schemas and facilitate compliance with legal and regulatory requirements, ensuring accountable data stewardship. Spatial metadata are essential for maximizing the value and effective utilization of geospatial data within the US Government.

The Data Catalog Vocabulary (DCAT) specification provides a standardized way to represent metadata about datasets and services, including information about their spatial properties. In the context of DCAT-US, which is a profile tailored specifically for the United States, several spatial properties are relevant for describing resources. This section provides an overview of these spatial properties and their usage within the DCAT-US framework.

Geographic Bounding Box

A bounding box represents the minimum and maximum coordinates that enclose a specific geographic area. In DCAT-US, the dcat-us:geographicBoundingBox property and the class dcat-us:GeographicBoundingBox are introduced and utilized to define the spatial extent of a resource. This class consists of four numerical properties: the west (dcat-us:westBoundingLongitude) and east longitude (dcat-us:eastBoundingLongitude), followed by the north (dcat-us:northBoundingLatitude) and south latitude (dcat-us:southBoundingLatitude), which are based on the WGS84 coordinate system.

By specifying a bounding box, datasets can be associated with a particular geographic region. If the west bounding longitude is greater than the east bounding longitude, then the box spans the antimeridian.

Figure: Geographic Bounding Box crossing the antimeridian
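The antimeridian rule can be sketched in a few lines of Python. The BoundingBox class and its field names are illustrative stand-ins for the four dcat-us bounding properties, and the coordinates are rough values chosen only to produce a box that wraps across 180 degrees.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """WGS84 bounding box; field names are illustrative, not DCAT-US terms."""
    west: float
    east: float
    south: float
    north: float

    def crosses_antimeridian(self) -> bool:
        # West greater than east means the box wraps across 180 degrees.
        return self.west > self.east

    def contains(self, lon: float, lat: float) -> bool:
        if not (self.south <= lat <= self.north):
            return False
        if self.crosses_antimeridian():
            # The box covers [west, 180] together with [-180, east].
            return lon >= self.west or lon <= self.east
        return self.west <= lon <= self.east


# Rough, illustrative extent that wraps across the antimeridian.
wrapping = BoundingBox(west=177.0, east=-178.0, south=-20.0, north=-12.0)
print(wrapping.crosses_antimeridian())
print(wrapping.contains(179.5, -17.0), wrapping.contains(0.0, -17.0))
```

A consumer that ignores the west-greater-than-east case would treat such a box as empty or as spanning almost the whole globe, so the check is worth making explicit.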

Defining a common reference system is of utmost importance when searching for geospatial data. Geospatial datasets are typically represented using different coordinate systems, projections, and datums, which can lead to challenges in interoperability and data integration. A common reference system ensures that data from diverse sources can be accurately aligned and combined, enabling effective analysis, visualization, and decision-making.

The introduction of the dcat-us:geographicBoundingBox property in the DCAT-US profile addresses this challenge by providing a standardized way to express the spatial extent of a resource. Unlike a Polygon, which requires explicit geometric coordinates, the dcat-us:geographicBoundingBox offers a simpler and more interoperable approach. Here are a few reasons why the dcat-us:geographicBoundingBox is advantageous:

  • Consistent Spatial Representation: Geospatial datasets can be represented in various coordinate systems, projections, and datums. Without a common reference system, it becomes difficult to align and compare datasets accurately. By establishing a common reference system, data publishers and consumers can ensure consistent spatial representation, enabling seamless integration and analysis of geospatial data from different sources.
  • Interoperability and Integration: The use of a common reference system enhances interoperability among geospatial datasets and systems. It enables data from diverse sources to be combined and used together seamlessly, facilitating cross-domain analysis and decision-making. With a common reference system, data publishers can provide metadata that adheres to a standard, making it easier for data consumers to understand and utilize the data.
  • Simplified Search and Discovery: The dcat-us:geographicBoundingBox property simplifies the search and discovery process for geospatial data. Instead of relying on complex geometric representations like polygons, users can specify a bounding box by defining the minimum and maximum values of latitude and longitude. Filtering geospatial data using a bounding box involves numeric comparisons, where the latitude and longitude values of data points are compared to the minimum and maximum values of the bounding box. This approach efficiently eliminates data points outside the specified spatial extent by performing simple numeric operations. It leverages the inherent numerical properties of latitude and longitude values, making it computationally efficient and compatible with spatial indexing and query optimization techniques. By using numeric comparison, geospatial data can be filtered and retrieved faster, optimizing the search process in various geospatial applications. This makes it easier for users to define their area of interest and retrieve relevant datasets that intersect with that spatial extent.
  • Query Efficiency and Performance: The use of dcat-us:GeographicBoundingBox enables efficient spatial querying of datasets. Data consumers can quickly filter and retrieve resources based on their spatial extent, reducing the need to process unnecessary data. This improves search performance and query efficiency, particularly when dealing with large-scale geospatial data collections.
  • Compatibility with Existing Tools and Standards: The adoption of the dcat-us:geographicBoundingBox property aligns wiofficetropolitan statistical areas, employing multiple bounding boxes for each area helps retrieve data specific to each metropolitan region, ensuring more accurate and focused results. Furthermore, non-contiguous states like Alaska and Hawaii require separate bounding boxes to accurately capture their unique spatial coverage. The inclusion of multiple bounding boxes in geospatial searches improves the accuracy and relevance of the retrieved datasets, facilitating more effective decision-making and analysis in various applications and domains.
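The numeric-comparison filtering described above, applied to a resource that carries several bounding boxes, might look like the following sketch. The tuple layout, the function names, and the approximate extents are assumptions made for illustration; boxes that cross the antimeridian are split into two ordinary boxes before comparison.

```python
from typing import List, Tuple

# (west, south, east, north) in WGS84 degrees; layout is an assumption.
Box = Tuple[float, float, float, float]


def split_at_antimeridian(box: Box) -> List[Box]:
    """Split a wrapping box (west > east) into two non-wrapping boxes."""
    west, south, east, north = box
    if west > east:
        return [(west, south, 180.0, north), (-180.0, south, east, north)]
    return [box]


def boxes_intersect(a: Box, b: Box) -> bool:
    """Interval-overlap test on longitude and latitude."""
    for aw, a_s, ae, an in split_at_antimeridian(a):
        for bw, b_s, be, bn in split_at_antimeridian(b):
            if aw <= be and bw <= ae and a_s <= bn and b_s <= an:
                return True
    return False


def matches_any(query: Box, extents: List[Box]) -> bool:
    """True if the query box overlaps at least one of a resource's boxes."""
    return any(boxes_intersect(query, extent) for extent in extents)


# Rough, illustrative extents for a resource covering the whole country.
us_extents = [
    (-125.0, 24.0, -66.0, 50.0),   # contiguous states (approximate)
    (172.0, 51.0, -130.0, 71.5),   # Alaska, wrapping the antimeridian
    (-160.5, 18.5, -154.5, 22.5),  # Hawaii (approximate)
]

honolulu_area = (-158.5, 21.0, -157.5, 21.8)
print(matches_any(honolulu_area, us_extents))
```

Using separate boxes keeps the large stretch of ocean between the contiguous states, Alaska, and Hawaii out of the match, which a single enclosing box would not.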

Spatial Coverage

In DCAT 3, the use of the dcterms:spatial property is intended to provide information about the spatial coverage or location of a resource. This property allows for the description of the spatial aspect of a dataset, dataset distribution, or data service in a standardized manner.

The dcterms:spatial property can be used to represent spatial coverage using various representations, such as coordinates, polygons, or place names. This flexibility allows data publishers to express the spatial extent of their resources in a way that is most appropriate for the given context.

For example, the dcterms:spatial property can be used to indicate the geographic bounding box that represents the extent of a dataset. This can be expressed using minimum and maximum latitude and longitude values, providing a rectangular approximation of the resource’s coverage area. Alternatively, a more precise polygon can be used to describe complex or irregularly shaped spatial extents.

By including the dcterms:spatial property in DCAT 3, datasets can provide explicit information about their spatial coverage. This enables data consumers and applications to understand the geographic scope of a resource and determine its relevance for their specific use cases. It supports efficient searching, discovery, and integration of geospatial datasets across different platforms and systems.

Furthermore, the use of standardized properties like dcterms:spatial enhances interoperability and data exchange among different data catalogs and applications. By conforming to the DCAT 3 specification, data publishers ensure that spatial information is consistently represented and interpreted, facilitating seamless data integration and interoperability within the geospatial community.

Spatial Resolution

Spatial resolution is a characteristic of geospatial datasets that describes the level of detail or granularity in the spatial representation. In DCAT 3, the dcat:spatialResolutionInMeters property is used to specify the spatial resolution of a resource, measured in meters. This property helps users understand the level of detail provided by the dataset and assess its suitability for their specific needs. Applications benefit from this property in various ways. For instance, in remote sensing and satellite imagery, users can determine if the dataset captures the required level of detail for their analysis. In cartography and mapping, spatial resolution influences the clarity and accuracy of displayed features. Environmental modeling relies on appropriate resolution for accurate simulations, and emergency management requires datasets that support informed decision-making. The dcat:spatialResolutionInMeters property supports data integration, ensuring compatibility between datasets with different resolutions. Overall, this property enhances the usability and effectiveness of geospatial datasets across diverse domains.

Handling Map Projections and Coordinate Systems

Geographic datasets in DCAT are commonly referenced using latitude and longitude coordinates based on the WGS84 datum. This is facilitated by the recommended use of the dcat-us:geographicBoundingBox property and the corresponding class dcat-us:GeographicBoundingBox to establish a uniform reference system for searching and indexing. However, the diverse nature of geographic data often necessitates the use of various map projections and coordinate systems.

The dcterms:conformsTo property in DCAT is integral in specifying the Coordinate Reference System (CRS) utilized by a dataset or a distribution. Accurately defining the CRS is essential for understanding the spatial context, enabling precise geographic analysis and ensuring data interoperability.

Additionally, the dcterms:type property is employed alongside dcterms:conformsTo to delineate the type of reference system, be it spatial or temporal. For spatial datasets, dcterms:type typically points to a spatial reference system, as defined by URIs like http://resources.data.gov/categories/SpatialReferenceSystem.

Utilizing URIs to reference EPSG standards ensures a clear and unambiguous specification of the CRS. For example, the URI http://www.opengis.net/def/crs/EPSG/0/4269 explicitly denotes adherence to the NAD 83 CRS. Standardized references like these enhance data consistency and facilitate interoperability across various platforms and applications.

The reference system identifier SHOULD preferably be represented with an HTTP URI. In particular, spatial reference systems should be specified by using the corresponding URIs from the “EPSG coordinate reference systems” register operated by the Open Geospatial Consortium [[?OGC-EPSG]]. This registry is crucial for the precise identification of CRSs, thereby ensuring that spatial data referenced in DCAT are compatible and functional across a multitude of geospatial applications.

Example: Specifying a CRS using an EPSG code for a geographic dataset
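As a minimal sketch of such a description, the Python snippet below builds a JSON-LD record in which a dataset conforms to the NAD 83 CRS. The URI pattern and the SpatialReferenceSystem category URI are taken from the text above; the dataset IRI, the helper names, and typing the CRS node as dcterms:Standard are assumptions made for illustration.

```python
import json

EPSG_BASE = "http://www.opengis.net/def/crs/EPSG/0/"
SRS_TYPE = "http://resources.data.gov/categories/SpatialReferenceSystem"


def epsg_uri(code: int) -> str:
    """Return the OGC register URI for an EPSG code."""
    return f"{EPSG_BASE}{code}"


def crs_node(code: int) -> dict:
    """Describe a CRS and mark it as a spatial reference system."""
    return {
        "@id": epsg_uri(code),
        "@type": "dcterms:Standard",  # assumption: CRS typed as a standard
        "dcterms:type": {"@id": SRS_TYPE},
    }


dataset = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dcterms": "http://purl.org/dc/terms/",
    },
    "@id": "https://example.org/dataset/roads",  # hypothetical dataset IRI
    "@type": "dcat:Dataset",
    "dcterms:conformsTo": crs_node(4269),  # EPSG:4269 is NAD 83
}

print(json.dumps(dataset, indent=2))
```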

Clearly defining the CRS is paramount for the effective use and integration of DCAT datasets, facilitating their application in a broad spectrum of spatial data uses.

Temporal Metadata

Temporal metadata is crucial for understanding and utilizing datasets effectively. This section is divided into three main categories to cover key aspects: Lifecycle Temporal Properties, Temporal Coverage, and Temporal Resolution. Additionally, we provide insights into handling these temporal aspects in JSON-LD format. Accurate temporal metadata ensures datasets are relevant and reliable, especially for time-sensitive analyses.

Support for multiple temporal datatypes, such as xsd:date, xsd:dateTime, xsd:gYear, and xsd:gYearMonth, is essential. Together they provide precision, flexibility, and interoperability, and they let each dataset express time at the level of detail appropriate to its context.

Lifecycle Temporal Properties

Lifecycle temporal properties document the timeline of the dataset's creation, updates, and publication. These properties are crucial for understanding the dataset's history and currency.

  • Release Time (dcterms:issued): Indicates the date when the dataset was first made available. Formats include xsd:date, xsd:dateTime, xsd:gYear, and xsd:gYearMonth.
  • Revision/Update Time (dcterms:modified): Shows when the dataset was last updated, using the same formats as the release time.
  • Update Schedule (dcterms:accrualPeriodicity): Describes the frequency of dataset updates. Terms are taken from the Dublin Core Collection Description Frequency Vocabulary. Multiple formats allow for precise scheduling, whether regular or irregular. The Frequency Coding Guide section provides a guide to coding various standard frequencies as per ISO 19115, ISO-8601, and the Dublin Core standards.
  • Record Creation Time (dcterms:created): Specifies the date when the catalog record itself was created, separate from the dataset it catalogs. This property uses the xsd:dateTime format.

Temporal Coverage of the Dataset

Temporal coverage refers to the time period the data within a dataset covers or relates to, as opposed to lifecycle properties like creation or update dates. This concept is central in data management for understanding the relevance and applicability of the dataset's content.

In DCAT, temporal coverage is defined using the property dcterms:temporal associated with the dcterms:PeriodOfTime class. This class allows for a clear specification of the coverage period through defined start and end dates. For detailed representation, formats such as xsd:date, xsd:dateTime, xsd:gYear, or xsd:gYearMonth can be used. For instance, a dataset on a year-long project might use "2023" (xsd:gYear), whereas a dataset with specific event dates might use "2023-03-15T13:00:00" (xsd:dateTime).

Marking these timeframes is typically done using dcat:startDate and dcat:endDate, offering flexibility for either fixed or open-ended periods. For example, a dataset about historical weather patterns might span from "1950-01-01" to "2000-12-31".
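Choosing the datatype that matches a temporal value's lexical form can be sketched as follows. The patterns are deliberately simplified (they do not check month or day ranges, negative years, or time zones on dates), and the function names and compact-IRI keys are assumptions for illustration rather than part of the profile.

```python
import re

# Ordered from most to least specific; simplified lexical patterns.
_PATTERNS = [
    ("xsd:dateTime",
     re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?")),
    ("xsd:date", re.compile(r"\d{4}-\d{2}-\d{2}")),
    ("xsd:gYearMonth", re.compile(r"\d{4}-\d{2}")),
    ("xsd:gYear", re.compile(r"\d{4}")),
]


def infer_xsd_type(value: str) -> str:
    """Return the XSD datatype whose lexical form matches the value."""
    for name, pattern in _PATTERNS:
        if pattern.fullmatch(value):
            return name
    raise ValueError(f"unrecognized temporal value: {value!r}")


def typed_literal(value: str) -> dict:
    """Wrap a value as a JSON-LD typed literal."""
    return {"@value": value, "@type": infer_xsd_type(value)}


def period_of_time(start: str, end: str) -> dict:
    """Build a dcterms:PeriodOfTime node with typed start and end dates."""
    return {
        "@type": "dcterms:PeriodOfTime",
        "dcat:startDate": typed_literal(start),
        "dcat:endDate": typed_literal(end),
    }


print(period_of_time("1950-01-01", "2000-12-31"))
```

For an open-ended period, a publisher would simply omit one of the two bounds; this sketch always sets both for brevity.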

Adopting dcterms:PeriodOfTime in DCAT-US 3.0 aligns it with international DCAT 3 standards, improving data interoperability and ensuring consistent handling of time-related data. This alignment rectifies previous inconsistencies and enhances the usability and exchange of data.

Temporal Resolution in Datasets and Distributions

The property dcat:temporalResolution in a dcat:Dataset, or dcat:Distribution, refers to the smallest time interval that can be discerned in the data. This property is essential for understanding the granularity and frequency of data recording within the dataset or its specific distributions.

Temporal resolution is particularly relevant in datasets where time plays a crucial role, such as time-series data. It indicates the level of detail at which changes or updates in the data are recorded and presented. For instance, a dataset with daily weather observations might have a temporal resolution of one day, represented as "P1D" in XML Schema duration format.

In the context of dcat:Dataset, specifying dcat:temporalResolution helps users understand the overall temporal granularity of the dataset. Conversely, when applied to dcat:Distribution, it provides resolution details specific to each distribution format, acknowledging that different formats might be updated at different frequencies.

This distinction is important for datasets available in multiple formats or distributions, as each might have different temporal characteristics. For example, a high-resolution version of a dataset updated every minute would be suitable for detailed, time-sensitive analyses, while a lower-resolution version updated annually might be better suited for long-term trend analyses.

Examples of encoding durations in XML Schema duration format:

  • Daily Resolution: A dataset with one observation per day would use "P1D", indicating a resolution of one day.
  • Hourly Resolution: For data recorded hourly, such as a traffic flow dataset, the encoding would be "PT1H", indicating a resolution of one hour.

The dcat:temporalResolution can be specified using various time units such as seconds, minutes, hours, days, or years, depending on the nature of the dataset or distribution. This specification aids in aligning the dataset or distribution with user expectations and analytical requirements.
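A lightweight check of duration values before publishing can be sketched as below. The parser covers only non-negative xsd:duration values, and the function name and returned dictionary shape are assumptions. Note that xsd:duration has no week designator, so a value such as "P1W" is rejected here even though ISO 8601 allows it.

```python
import re

# Non-negative xsd:duration: PnYnMnDTnHnMnS, every component optional.
_DURATION = re.compile(
    r"P(?:(?P<years>\d+)Y)?(?:(?P<months>\d+)M)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?"
)


def parse_duration(text: str) -> dict:
    """Return the components present in an xsd:duration string."""
    match = _DURATION.fullmatch(text)
    if match is None:
        raise ValueError(f"not an xsd:duration: {text!r}")
    parts = {k: v for k, v in match.groupdict().items() if v is not None}
    # "P" alone, or a trailing "T" with no time component, is invalid.
    if not parts or text.endswith("T"):
        raise ValueError(f"not an xsd:duration: {text!r}")
    return {k: float(v) if k == "seconds" else int(v) for k, v in parts.items()}


print(parse_duration("P1D"))
print(parse_duration("PT1H"))
```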

Frequency Coding Guide

The following table provides a guide to coding various standard frequencies as per ISO 19115, [[ISO8601-1]], and the Dublin Core standards.

ISO 19115 - MD_MaintenanceFrequencyCode   ISO-8601             Dublin Core Collection Description Frequency Vocabulary [[CLD-FREQ]]
continual                                 R/PT1S               continuous
daily                                     R/P1D                daily
weekly                                    R/P1W                weekly
fortnightly                               R/P2W or R/P0.5W     biweekly
monthly                                   R/P1M                monthly
quarterly                                 R/P3M                quarterly
biannually                                R/P6M                semiannual
annually                                  R/P1Y                annual
asNeeded                                  -                    -
irregular                                 -                    irregular
notPlanned                                -                    -
unknown                                   -                    -
-                                         R/P3Y                triennial
-                                         R/P2Y                biennial
-                                         R/P4M                threeTimesAYear
-                                         R/P2M or R/P0.5M     bimonthly
-                                         R/P0.5M              semimonthly
-                                         R/P0.33M             threeTimesAMonth
-                                         R/P3.5D              semiweekly
-                                         R/P0.33W             threeTimesAWeek
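For harvesters that receive ISO 19115 maintenance codes, the unambiguous rows of the table can be turned into a small lookup. This is a sketch: the dictionary and function names are hypothetical, and only rows that have a value in all three columns with a single ISO-8601 encoding are included.

```python
from typing import Optional

# ISO 19115 code -> (ISO-8601 repeating interval, Dublin Core frequency term)
FREQUENCIES = {
    "continual": ("R/PT1S", "continuous"),
    "daily": ("R/P1D", "daily"),
    "weekly": ("R/P1W", "weekly"),
    "monthly": ("R/P1M", "monthly"),
    "quarterly": ("R/P3M", "quarterly"),
    "biannually": ("R/P6M", "semiannual"),
    "annually": ("R/P1Y", "annual"),
}


def to_iso8601(iso19115_code: str) -> Optional[str]:
    """Return the ISO-8601 interval, or None if no single mapping exists."""
    entry = FREQUENCIES.get(iso19115_code)
    return entry[0] if entry else None


def to_dublin_core(iso19115_code: str) -> Optional[str]:
    """Return the Dublin Core frequency term, or None if unmapped."""
    entry = FREQUENCIES.get(iso19115_code)
    return entry[1] if entry else None


print(to_iso8601("quarterly"), to_dublin_core("biannually"))
```

Codes such as asNeeded or unknown return None, which leaves the decision about how to record them to the publisher.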

Handling Temporal Formats in JSON-LD for DCAT Datasets

When dealing with a dcat:Dataset in JSON-LD, different temporal formats can be effectively represented using the JSON-LD @type attribute. This ensures that each temporal aspect of the dataset is accurately interpreted.

Example using xsd:date for a dataset's last update time (dcterms:modified):

          {
            "@context": "https://raw.githubusercontent.com/DOI-DO/dcat-us/main/context/dcat-us-3.0.jsonld",
            "@type": "dcat:Dataset",
            "title": "Annual Financial Report",
            "modified": {
              "@value": "2023-03-31",
              "@type": "xsd:date"
            }
          }
        

Example using xsd:dateTime for a dataset's precise creation time:

          {
            "@context": "https://raw.githubusercontent.com/DOI-DO/dcat-us/main/context/dcat-us-3.0.jsonld",
            "@type": "dcat:Dataset",
            "title": "Real-Time Traffic Data",
            "created": {
              "@value": "2023-03-31T15:00:00",
              "@type": "xsd:dateTime"
            }
          }
        

Example using xsd:gYear for a dataset's publication year (dcterms:issued):

          {
            "@context": "https://raw.githubusercontent.com/DOI-DO/dcat-us/main/context/dcat-us-3.0.jsonld",
            "@type": "dcat:Dataset",
            "title": "Decadal 2020 Census Data",
            "issued": {
              "@value": "2020",
              "@type": "xsd:gYear"
            }
          }
        

Example using xsd:gYearMonth for representing the temporal coverage of a dataset:

          {
            "@context": "https://raw.githubusercontent.com/DOI-DO/dcat-us/main/context/dcat-us-3.0.jsonld",
            "@type": "dcat:Dataset",
            "title": "Quarterly Weather Observations",
            "temporal": {
              "@type": "dcterms:PeriodOfTime",
              "startDate": {
                "@value": "2023-01",
                "@type": "xsd:gYearMonth"
              },
              "endDate": {
                "@value": "2023-03",
                "@type": "xsd:gYearMonth"
              }
            }
          }
        

These examples illustrate the flexible use of @type to accurately represent various temporal aspects within a dcat:Dataset. By specifying the datatype, datasets can convey precise temporal information, enhancing data usability and interpretation.

Additionally, specifying the dcat:temporalResolution in JSON-LD is straightforward. Since the @type for dcat:temporalResolution is predefined in the JSON-LD context, it's not necessary to explicitly declare it.

Here's an example of how dcat:temporalResolution might be used in a JSON-LD representation of a dataset with daily updates:

          {
            "@context": "https://raw.githubusercontent.com/DOI-DO/dcat-us/main/context/dcat-us-3.0.jsonld",
            "@type": "dcat:Dataset",
            "title": "Daily Temperature Observations",
            "temporalResolution": "P1D"
          }
        

In this example, the dataset is defined with a temporal resolution of one day, indicated by "P1D". This notation follows the XML Schema duration format and is understood in the JSON-LD context without requiring an additional @type declaration for the resolution.

The dcterms:accrualPeriodicity property in JSON-LD specifies the frequency at which a dataset is updated or new data is added. This property is vital for users to understand how often the dataset's information is refreshed.

Example of a dataset updated daily:

          {
            "@context": "https://raw.githubusercontent.com/DOI-DO/dcat-us/main/context/dcat-us-3.0.jsonld",
            "@type": "dcat:Dataset",
            "title": "Daily Air Quality Index",
            "accrualPeriodicity": "daily"
          }
        

Example of a dataset updated monthly:

          {
            "@context": "https://raw.githubusercontent.com/DOI-DO/dcat-us/main/context/dcat-us-3.0.jsonld",
            "@type": "dcat:Dataset",
            "title": "Monthly Employment Statistics",
            "accrualPeriodicity": "monthly"
          }
        

These examples illustrate the use of dcterms:accrualPeriodicity in JSON-LD to clearly represent the update frequency of datasets. By specifying this property, users can easily determine the refresh rate of the dataset's data, which is crucial for its application and relevance.

Provenance Metadata

Provenance and data lineage are crucial aspects of data management and transparency, ensuring that data consumers understand the origins, transformations, and utility of the data. In DCAT-US, leveraging [[DCTERMS]] (Dublin Core Terms) and [[PROV-O]] (W3C PROV Ontology) properties can effectively represent these aspects. This section outlines best practices for utilizing these properties to detail the provenance and data lineage within DCAT datasets.

Basic Provenance Metadata

Within DCAT-US, the Dublin Core Terms [[DCTERMS]] vocabulary offers properties that allow data publishers to articulate basic provenance information effectively. Particularly, dcterms:source and dcterms:provenance are pivotal in this context.

dcterms:source

dcterms:source is used in the following contexts:

  • Property source metadata (dcterms:source), optional, non-repeatable property for Catalog Record, that refers to the original metadata that was used in creating metadata for the Dataset.
  • Property source (dcterms:source), optional, repeatable property for Dataset, that refers to a related Dataset from which the described Dataset is derived.

The dcterms:source property is utilized to denote the original source from which the current dataset is derived. It can be a URI directly pointing to the original dataset or, in the absence of a URI, a descriptive reference that sufficiently identifies the original source. It's imperative to ensure that the source referenced is the most immediate or direct source from which the data was derived and to utilize persistent URIs when available to ensure stable and long-term linkage to the source.

dcterms:provenance

On the other hand, dcterms:provenance provides a mechanism to describe the history or lineage of the dataset. This property allows publishers to detail the dataset's historical context and sequence of events or processes that have influenced its formation or transformation. The provenance statement should be concise yet comprehensive, providing a clear and adequate understanding of the dataset's history and lineage. Employing standardized nomenclature and terminologies ensures clarity and consistency across provenance statements.

In the context of data cataloging and transparency, embedding provenance information is vital to elucidate the origin and historical context of a dataset. The dcterms:ProvenanceStatement from the Dublin Core Terms (DCTerms) vocabulary provides a structured way to incorporate this information within the DCAT-US framework.

The dcterms:ProvenanceStatement is designed to convey a human-readable explanation or record of the history or lineage of a dataset. It can be utilized to describe the dataset's origins, transformations, ownership, and any other changes it might have undergone, thereby providing a clear and comprehensive historical record.

Property provenance (dcterms:provenance), optional, repeatable property for Dataset, that contains a statement about the lineage of a Dataset.

This property can be expressed in two primary ways within DCAT-US:

  • By URI: A Uniform Resource Identifier (URI) can be used to refer to a dcterms:ProvenanceStatement that is hosted externally. This method is beneficial when the provenance information is extensive or when it is standardized and used across multiple datasets.
  • Using Free Text with rdfs:label: Alternatively, a dcterms:ProvenanceStatement can be expressed as free text using the rdfs:label property. This approach is suitable for providing concise, readable provenance information directly within the dataset's metadata.
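The two ways of expressing a provenance statement can be sketched as a small helper that emits the matching JSON-LD value. The function name, the example IRI, and the free-text wording are hypothetical, and compact IRIs are used for the keys so that no particular context file is assumed.

```python
from typing import Optional


def provenance_statement(text: Optional[str] = None,
                         iri: Optional[str] = None) -> dict:
    """Return a dcterms:provenance value given free text or a URI."""
    if (text is None) == (iri is None):
        raise ValueError("provide exactly one of text or iri")
    if iri is not None:
        # By URI: reference an externally hosted provenance statement.
        return {"@id": iri}
    # Free text: an inline statement labelled with rdfs:label.
    return {"@type": "dcterms:ProvenanceStatement", "rdfs:label": text}


dataset = {
    "@type": "dcat:Dataset",
    "dcterms:provenance": [
        provenance_statement(text="Compiled from monthly field surveys."),
        provenance_statement(iri="https://example.org/provenance/survey-2023"),
    ],
}
print(dataset["dcterms:provenance"])
```

Because the property is repeatable, a short inline summary and a link to a fuller external statement can be given side by side.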

Detailed Data Lineage

Data lineage, which traces the discrete steps involving data as it moves through the various stages of a workflow, is crucial for understanding data's origins and transformations. The W3C PROV Ontology [[PROV-O]] provides a rich set of properties to describe detailed data lineage in a standardized manner, ensuring interoperability and clarity in data documentation.

Key PROV-O properties include:

The prov:Activity class in the PROV-O ontology plays a pivotal role in representing processes or actions taken upon or with entities, thereby providing a structured framework to document the transformations, analyses, or other actions that data undergoes. An instance of prov:Activity is utilized to describe a particular occurrence of an action or process, which can involve the consumption, production, or transformation of entities. By associating activities with entities through properties such as prov:wasGeneratedBy, a detailed account of the data's journey, from its origin through various transformations to its current state, can be articulated. This not only enhances the transparency of the data but also provides a robust mechanism to trace back through the steps involved in data creation and processing, thereby contributing to verifiable and trustworthy data lineage. Furthermore, prov:Activity can be associated with prov:Agent through properties like prov:wasAssociatedWith, offering insights into the roles of different agents (e.g., organizations, people, or software) in data processing activities, thereby enriching the data provenance and lineage documentation.
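To make the chain of entities, activities, and agents concrete, the sketch below walks lineage links backwards from a derived dataset to its sources. The three dictionaries stand in for prov:wasGeneratedBy, prov:used, and prov:wasAssociatedWith statements; prov:used is a PROV-O property not discussed above, and all ex: identifiers are hypothetical.

```python
from typing import List, Optional, Tuple

# prov:wasGeneratedBy (entity -> activity)
WAS_GENERATED_BY = {"ex:summary": "ex:aggregate", "ex:cleaned": "ex:clean"}
# prov:used (activity -> input entities)
USED = {"ex:aggregate": ["ex:cleaned"], "ex:clean": ["ex:raw"]}
# prov:wasAssociatedWith (activity -> agent)
WAS_ASSOCIATED_WITH = {"ex:aggregate": "ex:statsOffice", "ex:clean": "ex:dataTeam"}

Step = Tuple[str, str, str, Optional[str]]


def lineage(entity: str) -> List[Step]:
    """Return (output, activity, input, agent) steps back to the sources."""
    steps: List[Step] = []
    stack = [entity]
    seen = set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against cycles in malformed metadata
        seen.add(current)
        activity = WAS_GENERATED_BY.get(current)
        if activity is None:
            continue  # an original source with no recorded activity
        agent = WAS_ASSOCIATED_WITH.get(activity)
        for source in USED.get(activity, []):
            steps.append((current, activity, source, agent))
            stack.append(source)
    return steps


for step in lineage("ex:summary"):
    print(step)
```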

By adhering to these practices and effectively utilizing [[PROV-O]] properties, data publishers can enhance transparency and facilitate informed data usage among consumers by providing a clear view of data sourcing, processing, and transformation.

Distribution Metadata

In the realm of data sharing and management, dcat:Distribution plays a pivotal role as the tangible representation of datasets. A Distribution within the DCAT framework is more than just a link to a dataset; it is the embodiment of the dataset in a practical, accessible format, adhering to the W3C standards. It is the dataset manifested in a specific format, ranging from CSV files to complex databases, inherently tied to its parent dataset. This relationship underscores the fact that a Distribution does not exist in isolation but as a practical form of the dataset, prepared and published by data providers for end users. The core attributes of a Distribution focus on its file-centric properties like download URLs, media types, file formats, byte sizes, character encodings, and checksums, emphasizing its primary function: efficient and reliable data delivery.
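Two of these file-centric values, byte size and checksum, are computed from the file itself. The following sketch derives both with the Python standard library and places them in a JSON-LD fragment; the helper name and sample bytes are hypothetical, and the spdx: terms follow the checksum pattern used in DCAT 3 examples rather than anything stated in this section.

```python
import hashlib


def distribution_file_metadata(data: bytes, media_type_iri: str) -> dict:
    """Derive byte size and a SHA-256 checksum for a distribution file."""
    return {
        "@type": "dcat:Distribution",
        "dcat:byteSize": len(data),
        "dcat:mediaType": {"@id": media_type_iri},
        "spdx:checksum": {
            "@type": "spdx:Checksum",
            "spdx:algorithm": {"@id": "spdx:checksumAlgorithm_sha256"},
            "spdx:checksumValue": hashlib.sha256(data).hexdigest(),
        },
    }


sample = b"station,date,temp_c\nA1,2023-03-31,12.5\n"  # stand-in file content
record = distribution_file_metadata(
    sample, "https://www.iana.org/assignments/media-types/text/csv"
)
print(record["dcat:byteSize"], record["spdx:checksum"]["spdx:checksumValue"])
```

Consumers can recompute the digest after download and compare it with the published value to confirm that the file arrived intact.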

Guidelines for Creating DCAT Distributions

The following guidelines are designed to help determine the most effective way to structure DCAT distributions, whether as a single file, a multi-file package, or multiple distributions. The choice depends on the dataset's characteristics, user needs, and the data's intended use. Consider these guidelines to ensure your distributions are user-friendly, accessible, and align with best practices in data management.

  • Single-File Distribution: Ideal for datasets that are cohesive and standalone, typically encapsulated in a single format like CSV or XML. This approach is beneficial for smaller or comprehensive datasets, simplifying access and use. The key is to choose a file format that effectively represents all necessary data.

  • Multi-File Packaged Distribution: Essential for complex datasets, such as ArcGIS shapefiles, which require multiple interdependent files. Packaging related files together is useful for large or component-rich datasets. It's crucial to include all essential components and ensure the package facilitates easy download and usage.

  • Multiple Distributions in a Dataset: Suitable for datasets that can be logically segmented or offered in different formats. This method allows targeted access to specific data parts and enables selective updating. Clear documentation of each distribution is important for user navigation.

When selecting a distribution format, it is important to consider factors such as the interdependence of files, the ease of user accessibility, the size and downloadability of the data, the frequency of updates, and the diversity of formats required. A thoughtful approach to these criteria will help in creating a distribution strategy that is both practical for data providers and beneficial for end-users, enhancing the overall effectiveness of data sharing and utilization.

File-centric Properties

This section focuses on the properties central to the file-centric aspects of dcat:Distribution. These properties are crucial for ensuring datasets are accessible and usable in their practical forms, addressing the aspects of data encoding, structure, packaging, presentation, media type, and language.

  • dcat:downloadURL: This property is preferred for direct links to downloadable resources. It is the most straightforward way to provide access to a distribution, allowing users to directly download the dataset in its entirety without any intermediate steps or interactions.
  • dcat:accessURL: This property should be used for the URL of a service or location that provides access to the distribution, typically through a web form, query, or API call. It is ideal for scenarios where the distribution is accessed via an interactive mechanism rather than direct download. For example, when accessing datasets that require specific queries or are provided through a web service.
  • dcat:mediaType: This property specifies the Internet Media Type (also known as MIME type) of the distribution. Media types are standardized identifiers for labeling the format of documents, files, or data transmitted via the Internet. The property is particularly useful where the distribution format aligns with a media type registered by the Internet Assigned Numbers Authority (IANA) [[IANA-MEDIA-TYPES]], ensuring standardization and facilitating automated processing.
  • dcterms:format: This property is applicable in scenarios not covered by dcat:mediaType, particularly when aligning with file formats recognized by central authorities. The role of dcterms:format is to offer a detailed description of the distribution's file format or physical medium. For instance, in the geospatial domain, this could include formats like “Shapefile” or “GeoJSON”. These descriptions are crucial for providing human-readable information about the distribution's format, enhancing user understanding and aiding in the effective presentation within data catalogs.
  • dcterms:conformsTo: This property indicates the standards or specifications to which the distribution conforms. Allowing for multiple standards acknowledges that datasets may adhere to more than one set of specifications, either due to the nature of the data or to meet various user needs and compliance requirements. For instance, a dataset might conform to both an industry-specific standard and a general data format standard. Documenting each applicable standard enhances the dataset's interoperability and usability, making it clear to users what to expect in terms of data structure and quality.
  • dcat:compressFormat: This property is to be used when the files in the distribution are compressed, e.g., in a ZIP file. The format SHOULD be expressed using a media type as defined by IANA [[IANA-MEDIA-TYPES]], if available.
  • dcat:packageFormat: This property should be employed when the files within a distribution are packaged together, such as in formats like TAR, ZIP, Frictionless Data Package, or BagIt files. The format SHOULD be expressed using an appropriate media type as defined by IANA [[IANA-MEDIA-TYPES]], if available, to ensure standardization and broader recognition of the format.
  • dcat:byteSize: Indicates the size of the distribution, important for understanding download requirements and storage planning. The size SHOULD be given as an integer.
  • spdx:checksum: This optional property is used to provide a spdx:Checksum instance for ensuring data integrity during transfer. It serves as a mechanism to verify that the contents of a file or package have not been altered. The checksum should be specified using the spdx:checksumValue property. To indicate the algorithm used for generating the checksum, use the property spdx:algorithm with URIs defined in the SPDX specification, such as spdx:checksumAlgorithm_sha1, spdx:checksumAlgorithm_sha256, or spdx:checksumAlgorithm_sha512, depending on the algorithm employed.
  • adms:representationTechnique: This property can be used to specify the technique or method by which the data is represented in the distribution. This is different from the file format as, for example, a ZIP file (file format) could contain an XML schema (representation technique). It can help users understand the underlying structure or visualization method of the dataset. For example, for spatial datasets, this property SHOULD be used to express the spatial representation type (grid, vector, tin), by using the URIs from a code list managed in a registry.
  • cnt:characterEncoding: This property SHOULD be used to specify the character encoding of the Distribution, using as its value a character set name from the IANA Character Sets register [[IANA-CHARSETS]]. Character encoding in [[?ISO-19115-1]] metadata is specified with a code list that can be mapped to the corresponding codes in [[IANA-CHARSETS]], as shown in the following table (entries with 1-to-many mappings are in italic).
    ISO 19115 - MD_CharacterSetCode | Description | IANA
    ucs2 | 16-bit fixed size Universal Character Set, based on ISO/IEC 10646 | ISO-10646-UCS-2
    ucs4 | 32-bit fixed size Universal Character Set, based on ISO/IEC 10646 | ISO-10646-UCS-4
    utf7 | 7-bit variable size UCS Transfer Format, based on ISO/IEC 10646 | UTF-7
    utf8 | 8-bit variable size UCS Transfer Format, based on ISO/IEC 10646 | UTF-8
    utf16 | 16-bit variable size UCS Transfer Format, based on ISO/IEC 10646 | UTF-16
    8859part1 | ISO/IEC 8859-1, Information technology - 8-bit single byte coded graphic character sets - Part 1 : Latin alphabet No.1 | ISO-8859-1
    8859part2 | ISO/IEC 8859-2, Information technology - 8-bit single byte coded graphic character sets - Part 2 : Latin alphabet No.2 | ISO-8859-2
    8859part3 | ISO/IEC 8859-3, Information technology - 8-bit single byte coded graphic character sets - Part 3 : Latin alphabet No.3 | ISO-8859-3
    8859part4 | ISO/IEC 8859-4, Information technology - 8-bit single byte coded graphic character sets - Part 4 : Latin alphabet No.4 | ISO-8859-4
    8859part5 | ISO/IEC 8859-5, Information technology - 8-bit single byte coded graphic character sets - Part 5 : Latin/Cyrillic alphabet | ISO-8859-5
    8859part6 | ISO/IEC 8859-6, Information technology - 8-bit single byte coded graphic character sets - Part 6 : Latin/Arabic alphabet | ISO-8859-6
    8859part7 | ISO/IEC 8859-7, Information technology - 8-bit single byte coded graphic character sets - Part 7 : Latin/Greek alphabet | ISO-8859-7
    8859part8 | ISO/IEC 8859-8, Information technology - 8-bit single byte coded graphic character sets - Part 8 : Latin/Hebrew alphabet | ISO-8859-8
    8859part9 | ISO/IEC 8859-9, Information technology - 8-bit single byte coded graphic character sets - Part 9 : Latin alphabet No.5 | ISO-8859-9
    8859part10 | ISO/IEC 8859-10, Information technology - 8-bit single byte coded graphic character sets - Part 10 : Latin alphabet No.6 | ISO-8859-10
    8859part11 | ISO/IEC 8859-11, Information technology - 8-bit single byte coded graphic character sets - Part 11 : Latin/Thai alphabet | ISO-8859-11
    8859part13 | ISO/IEC 8859-13, Information technology - 8-bit single byte coded graphic character sets - Part 13 : Latin alphabet No.7 | ISO-8859-13
    8859part14 | ISO/IEC 8859-14, Information technology - 8-bit single byte coded graphic character sets - Part 14 : Latin alphabet No.8 (Celtic) | ISO-8859-14
    8859part15 | ISO/IEC 8859-15, Information technology - 8-bit single byte coded graphic character sets - Part 15 : Latin alphabet No.9 | ISO-8859-15
    8859part16 | ISO/IEC 8859-16, Information technology - 8-bit single byte coded graphic character sets - Part 16 : Latin alphabet No.10 | ISO-8859-16
    jis | japanese code set used for electronic transmission | JIS_Encoding
    shiftJIS | japanese code set used on MS-DOS machines | Shift_JIS
    eucJP | japanese code set used on UNIX based machines | EUC-JP
    usAscii | United States ASCII code set (ISO 646 US) | US-ASCII
    ebcdic | IBM mainframe code set | IBM037
    eucKR | Korean code set | EUC-KR
    big5 | traditional Chinese code set used in Taiwan, Hong Kong of China and other areas | Big5
    GB2312 | simplified Chinese code set | GB2312
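
When converting ISO 19115 records, the mapping in the table above could be applied with a simple lookup. The following is a minimal sketch, not part of the profile; the names ISO_TO_IANA and iana_charset are hypothetical, and the values are copied from the table.

```python
# Mapping from ISO 19115 MD_CharacterSetCode values to IANA character set names,
# copied from the table above.
ISO_TO_IANA = {
    "ucs2": "ISO-10646-UCS-2",
    "ucs4": "ISO-10646-UCS-4",
    "utf7": "UTF-7",
    "utf8": "UTF-8",
    "utf16": "UTF-16",
    "jis": "JIS_Encoding",
    "shiftJIS": "Shift_JIS",
    "eucJP": "EUC-JP",
    "usAscii": "US-ASCII",
    "ebcdic": "IBM037",
    "eucKR": "EUC-KR",
    "big5": "Big5",
    "GB2312": "GB2312",
}
# The table lists ISO/IEC 8859 parts 1 to 11 and 13 to 16 (there is no part 12 entry).
ISO_TO_IANA.update(
    {f"8859part{n}": f"ISO-8859-{n}" for n in (*range(1, 12), *range(13, 17))}
)


def iana_charset(iso_code: str) -> str:
    """Return the IANA name for an ISO 19115 code, or raise ValueError if unlisted."""
    try:
        return ISO_TO_IANA[iso_code]
    except KeyError:
        raise ValueError(f"No IANA mapping listed for {iso_code!r}") from None


print(iana_charset("utf8"))
```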

Effective utilization of these properties enhances data discoverability, interoperability, and the overall user experience in accessing and working with datasets.
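
The file-centric properties above can be derived largely from the file itself. The following minimal sketch computes dcat:byteSize and a SHA-256 spdx:checksum with the Python standard library and assembles a JSON-LD description. The function name describe_distribution and the example.gov URL are hypothetical, and expressing the media type as an IRI under https://www.iana.org/assignments/media-types/ is an assumption based on common DCAT practice.

```python
import hashlib
import json


def describe_distribution(payload: bytes, download_url: str, media_type: str) -> dict:
    """Build a JSON-LD description of a single-file distribution."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "spdx": "http://spdx.org/rdf/terms#",
        },
        "@type": "dcat:Distribution",
        "dcat:downloadURL": {"@id": download_url},
        # Assumption: media types are referenced as IANA registry IRIs.
        "dcat:mediaType": {
            "@id": f"https://www.iana.org/assignments/media-types/{media_type}"
        },
        # Size in bytes, given as an integer.
        "dcat:byteSize": len(payload),
        "spdx:checksum": {
            "@type": "spdx:Checksum",
            "spdx:algorithm": {"@id": "spdx:checksumAlgorithm_sha256"},
            "spdx:checksumValue": hashlib.sha256(payload).hexdigest(),
        },
    }


example = describe_distribution(
    b"id,name\n1,stop\n",
    "https://example.gov/files/stops.csv",  # hypothetical URL
    "text/csv",
)
print(json.dumps(example, indent=2))
```

A consumer can recompute the digest after download and compare it with spdx:checksumValue to confirm that the file was not altered in transit.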

Data Quality

The quality of a dataset plays a pivotal role in shaping trust, reusability, and the overall performance of applications that rely on it. As a result, it is imperative to integrate data quality information seamlessly into both the data publishing and consumption processes. This inclusion allows for a thorough evaluation of a dataset's quality, thereby determining its suitability for a particular application.

Thorough documentation of data quality significantly streamlines the dataset selection process, enhancing the likelihood of reuse. Regardless of domain-specific nuances, documenting data quality and explicitly stating known quality issues in metadata are fundamental practices. Typically, assessing quality involves multiple dimensions, each encapsulating characteristics of importance to both data publishers and consumers.

The Data Quality Vocabulary (DQV) defines machine-readable concepts such as measurements and criteria to assess quality across various dimensions [[VOCAB-DQV]]. Tailored heuristics designed for specific assessment scenarios rely on quality indicators, which encompass data content, metadata, and human ratings. These indicators offer valuable insights into the dataset's suitability for its intended purpose.

In the context of integrating data quality information into DCAT resources (Dataset, Distribution, Data Service, Dataset Series), the Data Quality Vocabulary [[VOCAB-DQV]] provides a structured and standardized way to represent and assess quality information for fitness of use. The key components of DQV relevant to this discussion are dqv:QualityMeasurement, dqv:Metric, dqv:Dimension, and the property dqv:hasQualityMeasurement. Here's how each of these elements is used:

Using these DQV elements, data publishers can document the quality of their datasets in a structured and meaningful way. This documentation includes specific measurements of quality, the criteria used for these assessments, and the quality dimensions they relate to. The use of DQV thus enhances transparency and helps data consumers make informed decisions about the suitability of a dataset for their specific needs.
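
To make the roles of these DQV elements concrete, the following minimal sketch computes a simple completeness ratio and records it as a dqv:QualityMeasurement attached to a dataset. The metric, dimension, and dataset IRIs under https://example.gov/ and the function names are hypothetical, and the completeness calculation is only one possible heuristic.

```python
import json


def completeness(rows):
    """Share of non-empty cells across all rows (a simple illustrative heuristic)."""
    cells = [cell for row in rows for cell in row]
    filled = sum(1 for cell in cells if cell not in ("", None))
    return filled / len(cells)


def quality_graph(dataset_iri, metric_iri, dimension_iri, value):
    """Attach one measurement to a dataset and declare its metric and dimension."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dqv": "http://www.w3.org/ns/dqv#",
        },
        "@graph": [
            {
                "@id": dataset_iri,
                "@type": "dcat:Dataset",
                "dqv:hasQualityMeasurement": {
                    "@type": "dqv:QualityMeasurement",
                    "dqv:computedOn": {"@id": dataset_iri},
                    "dqv:isMeasurementOf": {"@id": metric_iri},
                    "dqv:value": value,
                },
            },
            {
                "@id": metric_iri,
                "@type": "dqv:Metric",
                "dqv:inDimension": {"@id": dimension_iri},
            },
            {"@id": dimension_iri, "@type": "dqv:Dimension"},
        ],
    }


sample_rows = [["1", "Main St"], ["2", ""]]  # one of four cells is empty
graph = quality_graph(
    "https://example.gov/dataset/bus-stops",         # hypothetical
    "https://example.gov/quality/cellCompleteness",  # hypothetical metric
    "https://example.gov/quality/completeness",      # hypothetical dimension
    completeness(sample_rows),
)
print(json.dumps(graph, indent=2))
```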

The use of shareable controlled vocabularies for dqv:Metric and dqv:Dimension is highly encouraged within communities. These standardized vocabularies facilitate consistent and precise communication of data quality aspects across different datasets and applications. By adopting such vocabularies, communities can ensure that their data quality metrics and dimensions are universally understood, enhancing interoperability and the effective use of data across diverse systems and contexts.

Versioning

Versioning is a concept used to describe the relationship between an original resource and its variations, updates, or translations. This section explores how versions resulting from updates or modifications throughout a resource's lifecycle are handled in the DCAT-US 3.0 profile.

DCAT-US 3.0 relies on established vocabularies, including the versioning section of the PAV ontology and terms from [[?PAV]], [[DCTERMS]], [[OWL2-OVERVIEW]], and [[VOCAB-ADMS]].

Versioning applies to all primary DCAT resources, such as Catalogs, Catalog Records, Datasets, and Distributions.

The versioning methodology detailed in DCAT-US 3.0 is designed to enhance and work alongside existing versioning practices specific to certain resource types (for instance, versioning properties for ontologies are detailed in [[OWL2-OVERVIEW]]) and customary in various domains and communities. Refer to section 11.4 for an analysis of how DCAT's versioning approach aligns with other vocabularies.

Handling Dataset Changes

Web-based datasets are inherently dynamic, with some undergoing scheduled updates and others evolving due to advancements in data collection techniques. To address these varying changes, the creation of new dataset versions is often necessary. The decision to classify changes as a new dataset or a new version of an existing dataset, however, is not universally agreed upon. The following examples illustrate typical scenarios where a new version is generally warranted:

It's important to note that datasets representing time or spatial series (like annual regional data or weekly weather forecasts) are usually considered separate datasets, each capturing unique observations.

While Scenarios 1 and 2 might lead to significant version updates, Scenario 3 typically results in a minor update. The key is not the scale of the change, but the clarity in marking these changes through version numbering. Keeping a detailed version history is crucial for the integrity of the dataset, especially considering its potential ongoing use by various stakeholders. Publishers are advised to inform users proactively about new versions, particularly for datasets undergoing real-time updates, where automated timestamps can aid in version identification. Ultimately, maintaining a systematic and transparent versioning approach, including the use of semantic versioning, is vital for enabling users to navigate and utilize these evolving datasets effectively.
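
Where semantic versioning is adopted, the version label can be incremented mechanically once the publisher has classified the change. The following is a minimal sketch; the function name bump and the change labels "major", "minor", and "patch" are hypothetical, and deciding which kind of change applies remains the publisher's judgment.

```python
def bump(version: str, change: str) -> str:
    """Return the next semantic version (MAJOR.MINOR.PATCH) for a classified change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":
        return f"{major + 1}.0.0"
    if change == "minor":
        return f"{major}.{minor + 1}.0"
    if change == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown change type: {change!r}")


print(bump("1.4.2", "minor"))
```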

Version Information

The DCAT-US profile recognizes the importance of associating versioned resources with further details. These details can include aspects like the differences from the original resource (referred to as the version "delta"), the version's name or identifier, and its release date.

To accommodate these details, the DCAT-US 3.0 profile employs several specific properties:

Dataset Versions

The versioning of datasets is an essential aspect of data management, facilitating the tracking of changes and updates over time. In DCAT-US 3.0, dataset versioning is primarily managed through the use of properties that identify and describe different versions of a dataset including:

These properties ensure users can easily track dataset evolutions, access different versions, and understand the changes made across versions. Implementing these versioning properties in the DCAT-US profile enhances data discoverability and usability, aligning with best practices in data management.

Version Chains and Hierarchies

The DCAT-US 3.0 profile facilitates the management of version histories and hierarchies through specific properties. These properties help in establishing and navigating the relationships between different versions of a dataset.

The key properties for defining version chains and hierarchies include:

Additionally, the dcat:isVersionOf property (inverse of dcat:hasVersion) can be used to provide a backward link from a version to its abstract resource. The utilization of these properties depends on the specific requirements of the use case.

It's important to note that the essential properties for specifying a version chain and hierarchy are dcat:previousVersion and dcat:hasVersion. The choice to use additional properties is determined by the needs of the relevant use case.

For further guidance on specifying a resource's status, refer to the Resource Life-Cycle section.

The following example, adapted from § 8.6 Data Versioning of [[?DWBP]], demonstrates how to specify a version chain and hierarchy for a bus stops dataset using the properties described in this section.
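
As a minimal sketch of such a version chain and hierarchy (an illustration written for this section, not the example from [[?DWBP]]), the following Python snippet describes an abstract bus stops dataset with three versions and walks the chain backwards through dcat:previousVersion. All IRIs, version labels, and the helper name version_history are hypothetical.

```python
BASE = "https://example.gov/dataset/"  # hypothetical namespace
ABSTRACT = BASE + "bus-stops"
V1, V2, V3 = (BASE + f"bus-stops-v{n}" for n in (1, 2, 3))

nodes = {
    # The abstract dataset points to all versions and to the current one.
    ABSTRACT: {
        "@type": "dcat:Dataset",
        "dcat:hasVersion": [{"@id": V1}, {"@id": V2}, {"@id": V3}],
        "dcat:hasCurrentVersion": {"@id": V3},
    },
    V1: {"@type": "dcat:Dataset", "dcat:version": "1.0.0"},
    V2: {
        "@type": "dcat:Dataset",
        "dcat:version": "1.1.0",
        "dcat:previousVersion": {"@id": V1},
    },
    V3: {
        "@type": "dcat:Dataset",
        "dcat:version": "2.0.0",
        "dcat:previousVersion": {"@id": V2},
    },
}


def version_history(graph, abstract_id):
    """List version IRIs from the current version back to the first one."""
    history = []
    node_id = graph[abstract_id]["dcat:hasCurrentVersion"]["@id"]
    while node_id is not None:
        history.append(node_id)
        previous = graph[node_id].get("dcat:previousVersion")
        node_id = previous["@id"] if previous else None
    return history


print(version_history(nodes, ABSTRACT))
```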

Versions Replaced by Other Ones

In the DCAT-US 3.0 profile, a significant type of relationship is the one where a given version replaces or supersedes another. To represent this, DCAT adopts the relevant properties from [[DCTERMS]]:

It's important to note that these properties do not necessarily indicate a version chain. That is, a version does not automatically replace its immediate predecessor.

To illustrate how these properties can be applied in DCAT-US 3.0, the following example reuses the description of the MyCity bus stop dataset to show how replaced versions can be specified in DCAT.
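
The distinction between a version chain and a replacement can be sketched as follows, assuming the [[DCTERMS]] properties dcterms:replaces and dcterms:isReplacedBy. In this hypothetical scenario, version 2 corrects and replaces version 1, while version 3 follows version 2 in the chain without replacing it. The IRIs and the helper name add_inverse_links are illustrative only.

```python
BASE = "https://example.gov/dataset/"  # hypothetical namespace
V1, V2, V3 = (BASE + f"bus-stops-v{n}" for n in (1, 2, 3))

versions = {
    V1: {"dcat:version": "1.0.0"},
    # Version 2 both follows and replaces version 1.
    V2: {
        "dcat:version": "1.0.1",
        "dcat:previousVersion": {"@id": V1},
        "dcterms:replaces": {"@id": V1},
    },
    # Version 3 follows version 2 but does not replace it.
    V3: {"dcat:version": "1.1.0", "dcat:previousVersion": {"@id": V2}},
}


def add_inverse_links(graph):
    """Add dcterms:isReplacedBy to every node that another node replaces."""
    for node_id, node in list(graph.items()):
        target = node.get("dcterms:replaces")
        if target:
            graph[target["@id"]]["dcterms:isReplacedBy"] = {"@id": node_id}
    return graph


linked = add_inverse_links(versions)
print(linked[V1]["dcterms:isReplacedBy"])
```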

Resource Life-Cycle

The life-cycle of a resource, while distinct from versioning, is often closely related to it. The evolution of a resource through its life-cycle stages—conception, creation, publication—may lead to new versions, though not invariably (e.g., resources passing through an approval workflow without revisions). Conversely, creating a new version does not always signify a life-cycle status change, such as in cases of minor updates or resources still under development.

The life-cycle status of a resource holds significant value, informing data consumers about its developmental stage, deprecation, or withdrawal, and indicating whether a new version is available. For data providers, marking a resource with its life-cycle status is crucial for managing data workflows, such as ensuring a resource is stable and appropriately flagged before publication.

Resource life-cycle management varies depending on community practices, data management policies, and workflows. This variation extends to different resource types (e.g., datasets vs. catalog records), which may follow distinct life-cycle statuses.

DCAT utilizes the adms:status property [[VOCAB-ADMS]] to specify life-cycle statuses, supplemented by relevant [[DCTERMS]] time-related properties (e.g., dcterms:created, dcterms:dateSubmitted). However, the DCAT-US profile does not mandate specific life-cycle statuses, instead deferring to standards and practices suitable for each application scenario and community of practice.

Dataset Series

A Dataset Series is a collection of related datasets that share common characteristics, making them part of a cohesive group. This section provides guidance on the effective use of Dataset Series within data catalogs, emphasizing the benefits and considerations for publishers and users alike.

A Dataset Series is a way for publishers to convey that a dataset is evolving across specific dimensions and is available as a set of related datasets. However, choosing to group datasets this way depends on the use case. Since it demands extra metadata management from the publisher, it's optional. For instance, a dataset updated frequently via an API may not require individual records for each yearly snapshot unless the publisher wishes to share each snapshot's lifecycle.

Why Use Dataset Series?

Implementing Dataset Series offers several advantages:

Guidelines for Implementing Dataset Series

When using Dataset Series, consider the following best practices:

Expressing Relationships and Connections

Articulating the interconnections between datasets in a series is crucial for user understanding and data management:

Impact on Metadata

Being part of a Dataset Series may necessitate specific metadata considerations:

How to specify dataset series

The DCAT-US profile makes dataset series first-class citizens of data catalogs by using the new [[VOCAB-DCAT-3]] class dcat:DatasetSeries, defined as a subclass of dcat:Dataset. Datasets are linked to the dataset series by using the property dcat:inSeries. Note that dataset series can also be hierarchical: a dataset series can be a member of another dataset series.

Dataset series may evolve over time by acquiring new datasets. For example, a dataset series about yearly budget data will acquire a new child dataset every year. In such cases, it might be important to link the yearly releases with relationships specifying the first, previous, and latest ones. For this purpose, DCAT provides the properties dcat:first, dcat:prev, and dcat:last, respectively. The next release can be obtained by following dcat:prev in the inverse direction.
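
The linking pattern described above can be sketched for a yearly budget series as follows. The function name link_series and the example.gov IRIs are hypothetical; the sketch assumes the releases are supplied in chronological order.

```python
import json


def link_series(series_id, release_ids):
    """Return JSON-LD nodes for a series and its releases (ordered oldest to newest)."""
    series = {
        "@id": series_id,
        "@type": "dcat:DatasetSeries",
        "dcat:first": {"@id": release_ids[0]},
        "dcat:last": {"@id": release_ids[-1]},
    }
    datasets = []
    for index, release_id in enumerate(release_ids):
        dataset = {
            "@id": release_id,
            "@type": "dcat:Dataset",
            "dcat:inSeries": {"@id": series_id},
        }
        # Every release except the first points back to its predecessor.
        if index > 0:
            dataset["dcat:prev"] = {"@id": release_ids[index - 1]}
        datasets.append(dataset)
    return [series] + datasets


BASE = "https://example.gov/dataset/"  # hypothetical namespace
series_graph = link_series(
    BASE + "budget", [BASE + f"budget-{year}" for year in (2021, 2022, 2023)]
)
print(json.dumps(series_graph, indent=2))
```

When a new yearly release is published, only the series' dcat:last link and the new dataset's dcat:prev link need to be added or updated.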

Controlled Vocabularies

Importance of Controlled Vocabularies

Controlled vocabularies are predetermined sets of terms that have been carefully curated to ensure consistency, accuracy, and standardized representation of concepts within a specific domain. In the context of DCAT-US, controlled vocabularies are used to define and constrain the values of specific metadata elements. These vocabularies enable the creation of a common language for describing datasets, facilitating data integration and harmonization across different repositories.

The use of controlled vocabularies in DCAT-US offers several key benefits:

  • Consistency: By providing a predefined list of terms, controlled vocabularies ensure consistent representation and labeling of metadata elements. This consistency promotes data interoperability and simplifies data integration efforts, as different datasets can be mapped to a shared set of controlled terms.
  • Enhanced search and discovery: Controlled vocabularies enable more effective search and discovery of datasets. By aligning metadata elements with standardized terms, users can easily navigate and explore datasets based on their specific domain knowledge. Furthermore, controlled vocabularies facilitate the development of advanced search capabilities, such as faceted search, which allows users to refine search results based on predefined categories or facets.
  • Data harmonization: In a diverse data landscape where multiple agencies and organizations produce and manage datasets, controlled vocabularies help in harmonizing the data representation. By agreeing on a set of controlled terms, data publishers can ensure that similar concepts are represented consistently across different datasets. This harmonization promotes data integration and interoperability, enabling meaningful analysis and comparison of data from various sources.

Requirements for controlled vocabularies

The following is a list of requirements that were identified for the controlled vocabularies to be recommended in this Application Profile.

Controlled vocabularies SHOULD:

  • Be published under an open license.
  • Be operated and/or maintained by an agency of the US Government, by a recognised standards organization or another trusted organization.
  • Be properly documented.
  • Have labels in English, and optionally in Spanish.
  • Contain a relatively small number of terms (e.g. 10-25) that are general enough to enable a wide range of resources to be classified.
  • Have terms that are identified by URIs with each URI resolving to documentation about the term.
  • Have associated persistence and versioning policies.

These criteria do not intend to define a set of requirements for controlled vocabularies in general; they are only intended to be used for the selection of the controlled vocabularies that are proposed for this Application Profile.

Controlled vocabularies to be used

In the table below, a number of properties are listed with controlled vocabularies that MUST be used for the listed properties. The declaration of the following controlled vocabularies as mandatory ensures a minimum level of interoperability.

Compared with [[?DCAT-AP-20200608]], DCAT-US makes use of additional controlled vocabularies mandated by [[?DATA-GOV-REG]] and operated by the Data.gov Registry, with the sole exception of the coordinate reference systems register maintained by OGC [[?OGC-EPSG]].

For two of these controlled vocabularies, namely the NGDA spatial data themes [[?NGDA-THEMES]] and the ISO topic categories [[?ISO-19115-1]], the DCAT-US Working Group has defined a set of harmonised mappings to the Data.gov Vocabularies Data Themes [[?DATA-GOV-THEME]] (TBD), in order to facilitate the identification of the relevant theme in [[?DATA-GOV-THEME]] for geospatial/statistical metadata.

Other controlled vocabularies

In addition to the proposed common vocabularies, which are mandatory to ensure minimal interoperability, implementers are encouraged to publish and to use further region or domain-specific vocabularies that are available online. While those may not be recognised by general implementations of the Application Profile, they may serve to increase interoperability across applications in the same region or domain. Examples are the full set of concepts in the Global Change Master Directory (GCMD) [[?GCMD]], and numerous other schemes.

For geospatial metadata, the working group has identified the following additional vocabularies:

JSON-LD context file

One common technical question is the format in which the data is being exchanged. For DCAT-US 3.0 conformance, it is not mandatory that this happens in an RDF serialisation, but the exchanged format SHOULD be unambiguously transformable into RDF. For JSON, a popular format for exchanging data between systems, the DCAT-US profile provides a JSON-LD context file. JSON-LD is a W3C Recommendation [[[json-ld11]]] that provides a standard approach to interpreting JSON structures as RDF. Implementers can base their data exchange on the provided JSON-LD context file and so create a DCAT-US conformant data exchange. This JSON-LD context is not normative, i.e. other JSON-LD contexts may be used to create a conformant DCAT-US data exchange. The JSON-LD context file is downloadable here.
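
To illustrate how a context lets plain JSON keys be read as RDF terms, the following minimal sketch defines a small context and expands the keys of a record to full IRIs. This is a toy subset of JSON-LD expansion written for illustration, not a JSON-LD processor, and the context shown is hypothetical rather than the DCAT-US context file.

```python
# Hypothetical context: two prefixes and two plain JSON keys mapped to compact IRIs.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dcterms": "http://purl.org/dc/terms/",
    "title": "dcterms:title",
    "keyword": "dcat:keyword",
}


def expand_term(term, context=CONTEXT):
    """Expand a JSON key or compact IRI to a full IRI using the context."""
    value = context.get(term, term)
    prefix, separator, local = value.partition(":")
    # Only expand when the part before ":" is a declared prefix (not a scheme like http).
    if separator and prefix in context and not local.startswith("//"):
        return context[prefix] + local
    return value


def expand_keys(record, context=CONTEXT):
    """Return the record with every non-keyword key replaced by its full IRI."""
    return {
        expand_term(key, context): value
        for key, value in record.items()
        if not key.startswith("@")
    }


record = {"title": "Bus stops", "keyword": ["transit"]}
expanded = expand_keys(record)
print(expanded)
```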

JSON Schemas

One common technical question is the format in which the data is being exchanged. For DCAT-US 3.0 conformance, it is not mandatory that this happens in an RDF serialisation, but the exchanged format SHOULD be unambiguously transformable into RDF.

For JSON, which is a widely adopted format for data exchange between systems, the DCAT-US profile offers an informative JSON Schema. This schema aids in understanding the structure expected for DCAT-US compliant data exchanges in JSON format.

JSON Schema offers a compact way to describe and validate the structure and content of JSON data, ensuring specific formatting and value constraints. However, it's more limited than JSON-LD context and RDF serialization due to its focus on structure over meaning.

JSON Schema's focus on structural validation forms a contrast with JSON-LD and RDF's capabilities. JSON-LD and RDF go beyond just validation, allowing the creation of a graph of interconnected entities that can be easily integrated and reused across various contexts. This interconnectedness is fundamental to the concept of the semantic web, where data is not only readable but also comprehensible to machines.

Specifically, JSON-LD facilitates the representation of data as a graph, making it suitable for more complex, interlinked data representations, which is a cornerstone of linked data systems. This graph-based approach stands in contrast to the tree-like structures that JSON Schema is confined to, limiting its utility in scenarios requiring extensive data interconnectivity and reusability.

Implementers can use the provided JSON Schema for their data exchanges, aligning with DCAT-US standards. However, it's non-normative, meaning alternatives creating compliant exchanges are also valid. Download the current JSON Schema here.

SHACL Validation

In order to verify whether a catalog adheres to the constraints stipulated in this Application Profile, the constraints are articulated using SHACL [[?SHACL]]. All constraints in this specification that could be expressed in SHACL have been incorporated. Consequently, this set of SHACL expressions can be employed to construct a validation check for data exchange between two systems, a common scenario being one catalog being harvested into another.

For example, it may be recognized that the data being exchanged doesn't include the organizations' details since they are uniquely identified by a dereferenceable URI. In this scenario, enforcing rules about the mandatory presence of a name for each organization may not be pertinent. Rigorously applying the DCAT-US SHACL expressions would trigger errors, even though the data is accessible via an alternative route. In this context, it's acceptable to omit this check during the validation phase.

This example underscores that to achieve an optimal user experience during a validation process, it's crucial to consider the actual data transferred between systems and apply only the constraints relevant to the data exchange. To facilitate this, the SHACL expressions are organized into separate files, aligning with common validation configurations.
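
The idea of applying only the relevant constraints can be sketched as follows. This is a plain Python stand-in, not SHACL and not the DCAT-US shapes: the constraint names, the check functions, and the sample record are all hypothetical, and in practice the same effect would be obtained by choosing which SHACL files to load.

```python
def has_title(record):
    """The dataset carries a non-empty title."""
    return bool(record.get("dcterms:title"))


def publisher_has_name(record):
    """The publisher description embedded in the record carries a name."""
    return bool(record.get("dcterms:publisher", {}).get("foaf:name"))


# Hypothetical constraint set keyed by name.
CONSTRAINTS = {
    "dataset-title-required": has_title,
    "publisher-name-required": publisher_has_name,
}


def validate(record, skip=()):
    """Return the names of violated constraints, ignoring those listed in skip."""
    return [
        name
        for name, check in CONSTRAINTS.items()
        if name not in skip and not check(record)
    ]


# The publisher is referenced only by a dereferenceable URI, without an embedded name.
record = {
    "dcterms:title": "Bus stops",
    "dcterms:publisher": {"@id": "https://example.gov/org/transit"},
}
print(validate(record))
print(validate(record, skip={"publisher-name-required"}))
```

With the full constraint set the record is reported as invalid, while omitting the organization-name check for this exchange yields no violations.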

The SHACL application profile for DCAT-US can be found here.

Namespaces

Namespaces and prefixes used in normative parts of this recommendation are shown in the following table:

Prefix Namespace IRI Source
adms http://www.w3.org/ns/adms# [[VOCAB-ADMS]]
cnt http://www.w3.org/2011/content# [[Content-in-RDF10]]
dcat http://www.w3.org/ns/dcat# [[VOCAB-DCAT]]
dcat-us http://resources.data.gov/ontology/dcat-us# [[DCAT-US]]
dct http://purl.org/dc/terms/ [[DCTERMS]]
dqv http://www.w3.org/ns/dqv# [[VOCAB-DQV]]
foaf http://xmlns.com/foaf/0.1/ [[FOAF]]
gsp http://www.opengis.net/ont/geosparql# [[GeoSPARQL]]
locn http://www.w3.org/ns/locn# [[LOCN]]
org http://www.w3.org/ns/org# [[VOCAB-ORG]]
prov http://www.w3.org/ns/prov# [[PROV]]
rdf http://www.w3.org/1999/02/22-rdf-syntax-ns# [[RDF-SYNTAX-GRAMMAR]]
rdfs http://www.w3.org/2000/01/rdf-schema# [[RDF-SCHEMA]]
schema http://schema.org/ [[schema-org]]
sdmx-attribute http://purl.org/linked-data/sdmx/2009/attribute# [[?SDMX-ATTRIBUTE]]
skos http://www.w3.org/2004/02/skos/core# [[SKOS-REFERENCE]]
spdx http://spdx.org/rdf/terms# [[SPDX]]
vcard http://www.w3.org/2006/vcard/ns# [[VCARD-RDF]]
xsd http://www.w3.org/2001/XMLSchema# [[XMLSCHEMA11-2]]