Department of Housing, Local Government and Heritage

Concept help - Data Set

A Data Set describes a record of data, including any location or time boundaries for the data, that has been captured and is available for use under a specific licence. A Data Set may be included in a Data Catalog, and can reference multiple Distributions that record different parts or formats of the data that are available to download.

In DCAT, a dataset is defined as a "collection of data, published or curated by a single agent, and available for access or download in one or more formats". A dataset does not have to be available as a downloadable file. For example, a dataset that is available via an API can be defined as an instance of dcat:Dataset, and the API can be defined as an instance of dcat:Distribution. DCAT itself does not define properties specific to API descriptions; these are considered out of scope for this version of the vocabulary. They can, however, be defined in a profile of the DCAT vocabulary.
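As an illustration of the API case described above, the following minimal sketch builds a JSON-LD style record in which a dcat:Dataset has a single dcat:Distribution pointing at an API. The function name `build_api_dataset`, the title and the example.org URL are hypothetical placeholders; the registry itself may store or export this differently.

```python
import json


def build_api_dataset(title: str, api_url: str) -> dict:
    """Describe a dataset whose only distribution is an API endpoint."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dcat:distribution": [
            {
                # The API is modelled as a distribution, as the text describes.
                "@type": "dcat:Distribution",
                "dct:title": f"{title} (API)",
                "dcat:accessURL": {"@id": api_url},
            }
        ],
    }


# Placeholder values for illustration only; not a real endpoint.
record = build_api_dataset(
    "Example housing completions",
    "https://example.org/api/completions",
)
print(json.dumps(record, indent=2))
```

API-specific details such as authentication or query parameters are not covered by these properties and would need a profile, as noted above.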

Fields available on this metadata type

Field ISO definition
Name The primary name used for human identification purposes.
Definition Representation of a concept by a descriptive statement which serves to differentiate it from related concepts. (3.2.39)
Is Federated
Is Not Federable
Version Unique version identifier of this metadata item.
References Significant documents that contributed to the development of the metadata item which were not the direct source for the metadata content.
Origin The source (e.g. document, project, discipline or model) for the item (8.1.2.2.3.5)
Comments Descriptive comments about the metadata item (8.1.2.2.3.4)
Deleted The date after which the item has been soft deleted and is no longer visible in the registry
License Information about the license document under which the dataset is made available.
Rights Information about rights held in and over the dataset.
Release Date Date of formal publication of the dataset.
Modification Date Most recent date on which the dataset was changed, updated or modified.
Frequency The frequency at which the dataset is published.
Spatial Coverage Spatial or geographic coverage of the dataset.
Temporal Coverage The temporal or time period that the dataset covers.
Catalog An entity responsible for making the dataset available.
Landing Page A Web page that can be navigated to in a Web browser to gain access to the dataset, its distributions and/or additional information.
Contact Point Relevant contact information for the Dataset.
Conforming Specification An established standard to which the described resource conforms.
Item Base
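
Several of the field definitions above follow the wording of DCAT and Dublin Core properties. Assuming that correspondence holds (this page does not state the mapping, so it is an assumption), a record keyed by registry field names could be translated to vocabulary terms along these lines. The names `FIELD_TO_TERM` and `to_dcat_terms` and the sample values are hypothetical.

```python
# Assumed correspondence between registry field names and DCAT / Dublin Core terms.
FIELD_TO_TERM = {
    "License": "dct:license",
    "Rights": "dct:rights",
    "Release Date": "dct:issued",
    "Modification Date": "dct:modified",
    "Frequency": "dct:accrualPeriodicity",
    "Spatial Coverage": "dct:spatial",
    "Temporal Coverage": "dct:temporal",
    "Landing Page": "dcat:landingPage",
    "Contact Point": "dcat:contactPoint",
    "Conforming Specification": "dct:conformsTo",
}


def to_dcat_terms(record: dict) -> dict:
    """Rename known registry fields to their assumed terms; keep other keys unchanged."""
    return {FIELD_TO_TERM.get(name, name): value for name, value in record.items()}


# Sample values for illustration only.
example = {
    "Release Date": "2024-01-15",
    "Frequency": "Annual",
    "Name": "Example data set",
}
print(to_dcat_terms(example))
```

Fields without an assumed mapping, such as Name here, pass through unchanged.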

Custom Fields

Field Short definition Long definition
Item ID The Data Set's own ID number or Identifier, where the Data Set already has one.

Where the Data Set has an existing ID, it should be documented in this field. This could be, for example, a database ID (e.g. DB001) or another assigned ID.

This is different to the UUID assigned by the Registry.

Point of Contact Contact details for the individual or team in the Owning Business Unit responsible for the asset.

Contact details for the individual or team in the Owning Business Unit responsible for the asset. 

A group email address or contact web form is preferred because it is generic and more enduring than an individual’s contact details. This minimises the need to regularly update this attribute.

Data Set Status Refers to the lifecycle stage of the data asset.

Select one of Active, Inactive, Deprecated.

Data Set Type Indicates the kind of data structure or storage where the asset resides.

Indicates the kind of data structure or storage where the asset resides.

Data Set Format Select the Data Set Format.

Select the Data Set Format from the list provided. If the format is not listed, or you are unsure, select 'Other'.

Update Frequency Select the Data Set Update Frequency.

Describes how often the data is updated or refreshed.

Data Origin Select the Data Set Origin status.

Indicates whether the data is originally generated (Primary) or derived/amalgamated (Secondary).

Select Primary if the Data Set is the original source of the data, or Secondary if it is derived from, or amalgamated with, data from other sources.

Data Source Document the source(s) of the data in the Data Set.

Identifies the systems or inputs from which the dataset is compiled. Highlight cross-departmental relevance if applicable.

Data Classification Select the appropriate classification.

Categorises the asset based on its function.

Legal & Governance Specify applicable legal or policy requirements.

Specify applicable legal or policy requirements. 

For example: GDPR, the Official Secrets Acts, the Freedom of Information Act, the Building Control Regulation.

Security Classification Select the appropriate security classification for the Data Set.

This should reflect the security classification policy of the organisation.

Access Rights This should reflect the access control policy within the organisation.

Select the appropriate access rights level.

Keywords List any keywords that will help users search for the Data Set.

List any keywords that will help users search for the Data Set.
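
As a rough sketch of how some of the custom fields above could be checked before submission, the following example validates the two fields whose allowed values are stated on this page: Data Set Status (Active, Inactive, Deprecated) and Data Origin (Primary, Secondary). The class name, attribute names and sample values are hypothetical and do not reflect the registry's internal model.

```python
from dataclasses import dataclass, field

# Allowed values as listed in the field descriptions above.
ALLOWED_STATUS = {"Active", "Inactive", "Deprecated"}
ALLOWED_ORIGIN = {"Primary", "Secondary"}


@dataclass
class DataSetCustomFields:
    """Hypothetical container for a subset of the custom fields."""

    status: str
    origin: str
    item_id: str = ""  # the Data Set's own ID, distinct from the registry UUID
    point_of_contact: str = ""
    keywords: list = field(default_factory=list)

    def problems(self) -> list:
        """Return a list of human-readable issues; empty if none were found."""
        issues = []
        if self.status not in ALLOWED_STATUS:
            issues.append(f"Data Set Status must be one of {sorted(ALLOWED_STATUS)}")
        if self.origin not in ALLOWED_ORIGIN:
            issues.append(f"Data Origin must be one of {sorted(ALLOWED_ORIGIN)}")
        return issues


# Sample values for illustration only.
good = DataSetCustomFields(status="Active", origin="Primary", item_id="DB001")
bad = DataSetCustomFields(status="Archived", origin="Primary")
print(good.problems())
print(bad.problems())
```

Other fields, such as Data Set Format or Security Classification, are selected from lists that this page does not enumerate, so they are left unchecked here.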

Official Definition

A representation of a dataset in a catalog. Data Catalog Vocabulary (DCAT): 5.3 Class: Dataset