Guidance on Assessing Readiness to Manage Data According to the Findable, Accessible, Interoperable and Reusable (FAIR) Principles

1. Purpose

This guidance provides foundational advice to departmental officials on how to assess their department’s readiness to manage data according to the Findable, Accessible, Interoperable and Reusable guiding principles (the FAIR principles). It also provides recommendations on how to adhere to the FAIR principles in practice. Applying the FAIR principles will help departments enable data discovery, interoperability, sharing and reuse in support of section 4.3.1.3 of the Directive on Service and Digital.

Advice on how to assess specific data for adherence to the FAIR principles is outside the scope of this guidance.

2. Context

2.1 What are the FAIR principles?

The FAIR principles are a set of guiding principles that were first published in 2016 in the journal Scientific Data for scientific and research data. Since then, they have become internationally recognized for their applicability to the management of data across multiple domains.

The FAIR principles help improve the extent to which data is findable, accessible, interoperable and reusable for both humans and machines. A complete list of the FAIR principles with additional details is available on the GO FAIR website.

2.2 Why are the FAIR principles important for the Government of Canada?

Managing data according to the FAIR principles supports the open and strategic management of data, enhancing its reusability and promoting effective data stewardship across the Government of Canada (GC). Taking measures to prepare data for reuse helps ensure that both the GC and the public can derive maximum value from it and use it to make evidence-based decisions. Enhancing data reusability allows for it to have an impact beyond its original purpose, such as enabling the use of automated decision systems and supporting the delivery of services.

Applying the FAIR principles to data can also support open data initiatives, but FAIR data is not the same as open data. Privacy, security, policy, legal and other constraints may limit the extent to which data can be accessed, shared or reused. FAIR data should be as open as possible but safeguarded as necessary.

3. Guidance

This guidance recommends an approach to self-assessment that involves answering a series of yes-no questions to help departmental officials document the current state of their department’s data management practices relative to the FAIR principles. The self-assessment is not mandatory, but it is recommended that departmental officials who complete one implement the good practices included in this document to improve the readiness of their department to manage data according to the FAIR principles.

3.1 Considerations

All GC employees have a responsibility to manage their information and data effectively and may benefit from this guidance. In completing a self-assessment, users are encouraged to answer each question, but they can decide which questions to ask, when to ask them and how broadly to frame each question to best derive insight from the responses based on the context. The list of questions is not exhaustive and may need to be more specific or expanded on to suit a specific type of data.

To respond to the questions, departmental officials may need to consult experts such as data stewards, data custodians or subject-matter experts. They may also want to revisit the questions throughout the data life cycle (for example, whenever projects, programs or services are planned or undergo revision) to inform data management practices, because the relevance of each question may vary with the stage of the life cycle at which an assessment is completed. Departmental officials may also need to consider other complementary principles, such as the CARE Principles for Indigenous Data Governance (collective benefit, authority to control, responsibility and ethics).

3.2 Completing a self-assessment

The questions in this section correspond to each FAIR principle. They are related and are intended to be used together to guide practices. Appendix A contains the complete list of questions.

Positive responses to the self-assessment questions suggest that the department is ready at a foundational level to manage data according to the FAIR principles. Negative responses to the questions suggest that there are gaps and a lack of preparedness to implement the FAIR principles in practice.
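
As a minimal sketch of how responses could be recorded and reviewed, the following example groups yes-no answers by principle and lists the questions answered "No" as gaps. The question numbers follow Appendix A, but the answers, the `summarize` function and the data layout are hypothetical and are not part of this guidance.

```python
from collections import defaultdict

# Hypothetical responses: (question number, principle, answer).
# Question numbers follow Appendix A; the answers are invented for illustration.
responses = [
    (1, "Findable", True),
    (3, "Findable", False),
    (8, "Accessible", True),
    (9, "Accessible", False),
    (14, "Interoperable", True),
    (21, "Reusable", False),
]


def summarize(responses):
    """Group answers by principle and list the questions answered 'No'."""
    summary = defaultdict(lambda: {"yes": 0, "no": 0, "gaps": []})
    for number, principle, answer in responses:
        entry = summary[principle]
        if answer:
            entry["yes"] += 1
        else:
            entry["no"] += 1
            entry["gaps"].append(number)
    return dict(summary)


summary = summarize(responses)
for principle, entry in summary.items():
    print(principle, entry)
```

A summary like this one only points to where gaps may exist. Deciding which gaps matter, and in what order to address them, remains a matter of judgment for the user.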

The guidance also includes good practices that represent approaches to making data more findable, accessible, interoperable and reusable across the department. The recommended good practices can be applied by individual users, user groups, teams or the entire department to increase application of the FAIR principles. Not every good practice will be relevant in all cases, and the good practices can be adjusted to fit a specific context. It is up to the user to decide whether, when and how to apply each good practice.

Once departments have the necessary practices in place and are ready to manage data according to the FAIR principles, departmental officials may want to use more advanced assessment tools that measure, for example, how well individual datasets adhere to the principles.

3.2.1 Findable

To use and reuse data, users (both humans and machines) must first be able to find it easily. Metadata can help users identify and locate data quickly and efficiently.

Self-assessment questions to assess if the data is findable
  1. Does the department assign a unique number, field name, alphanumeric code, or other value to the data it manages that is clear and can be used to reliably identify and locate that data in its systems over time?
  2. Does the department apply additional information (such as metadata) that is sufficient to accurately describe the data it stores?
  3. Does the department register, catalogue or index its data in a searchable resource?
  4. Does the department register, catalogue or index the metadata that describes its data in a freely available, searchable resource?
  5. Does the department maintain associations (such as links) between its data and corresponding metadata that can persist over time?
  6. Do the systems through which the department manages data make the corresponding metadata available in a format that is searchable by both machines and humans?

Good practices to enable findable data

It is recommended that departmental officials:

Assign globally unique and persistent identifiers to the department’s data.

A unique identifier is a reference to a resource that is not reused and that continues to direct users to the same object (that is, the data itself or its corresponding metadata). A unique identifier is persistent when it is long-lasting and continues to direct users to the same object consistently over time.

A persistent identifier is globally unique if it is not duplicated and if it is not assigned to different data in a given context. Examples of globally unique and persistent identifiers include digital object identifiers (DOIs), Open Researcher and Contributor IDs (ORCIDs) and Legal Entity Identifiers (LEIs).

Globally unique and persistent identifiers ensure that both humans and machines can consistently find and relocate the same departmental data and its corresponding metadata over the long term. It is recommended that a standardized or well-recognized unique identifier be used wherever possible.
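
The idea of persistence can be illustrated with a short sketch in which the identifier stays the same while the storage location of the data changes. The example assumes UUIDs as a stand-in for a standardized scheme such as a DOI. The in-memory `registry`, the function names and the file paths are hypothetical.

```python
import uuid

# Hypothetical in-memory registry mapping identifiers to dataset locations.
registry = {}


def register_dataset(location):
    """Assign a new identifier to a dataset and record where it is stored."""
    identifier = f"urn:uuid:{uuid.uuid4()}"
    registry[identifier] = location
    return identifier


def relocate_dataset(identifier, new_location):
    """Update the storage location; the identifier itself never changes."""
    if identifier not in registry:
        raise KeyError(f"Unknown identifier: {identifier}")
    registry[identifier] = new_location


def resolve(identifier):
    """Return the current location of the dataset behind an identifier."""
    return registry[identifier]


# The paths below are invented for illustration.
dataset_id = register_dataset("/archive/2023/survey.csv")
relocate_dataset(dataset_id, "/archive/2024/survey.csv")
print(dataset_id, "->", resolve(dataset_id))
```

Because users and systems refer to the identifier rather than to the location, references to the data keep working after it is moved.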

Ensure that the department’s data is sufficiently described through additional metadata.

Metadata is data that defines and describes the content, context, structure and meaning of information, and the systems in which it is managed over time. Like a globally unique and persistent identifier, additional metadata can also help ensure that other potential users – including those in other domains – can find the data.

In accordance with the Standard on Managing Metadata, departmental officials must assess their department’s metadata needs to ensure that the metadata applied supports the open and strategic management of the department’s data.

The Guidance on Assessing Metadata Needs provides guidance for determining the metadata that should be applied to departmental data. Departmental officials could consider including the following metadata elements with the department’s data:

  • descriptive information about the data (such as creator, title, publisher, creation and publication date, summary and keywords describing the data)
  • data content (such as resource type, variables measured or observed, data format and size) to accurately reflect the data and increase its reusability
  • access permissions and licensing information
  • meaningful and explicit links to other related outputs (such as previous versions of the data, other relevant data, relevant users such as data creators or collectors, the data source)
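
The elements listed above could be captured in a simple record, sketched below with a check that reports which elements are still empty. The field names and sample values are illustrative assumptions and do not represent a prescribed GC metadata schema.

```python
from dataclasses import dataclass, field, fields


@dataclass
class DatasetMetadata:
    # Field names are illustrative, not a prescribed GC schema.
    identifier: str
    title: str
    creator: str
    publisher: str
    date_created: str  # ISO 8601 date, for example "2024-01-15"
    summary: str
    keywords: list = field(default_factory=list)
    resource_type: str = ""
    data_format: str = ""
    licence: str = ""
    access_conditions: str = ""
    related_identifiers: list = field(default_factory=list)


def missing_elements(record):
    """Return the names of elements that are empty in the record."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Fictional record used only to exercise the check.
record = DatasetMetadata(
    identifier="urn:uuid:00000000-0000-0000-0000-000000000001",
    title="Example survey results",
    creator="Example Branch",
    publisher="Example Department",
    date_created="2024-01-15",
    summary="Fictional dataset used to illustrate a metadata record.",
    keywords=["survey", "example"],
    data_format="CSV",
)
print(missing_elements(record))
```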

The Guidance on Data Quality, which addresses accuracy and completeness as dimensions of data quality, can help departmental officials assess the quality of metadata descriptions and identify metadata that provides insight into the quality of the data it describes.

Ensure that the metadata that describes the department’s data is machine-readable.

Data is easier to find when it is presented in a structured format that machines can read and process.
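
For example, metadata expressed in a structured format such as JSON can be parsed and searched by a program without human interpretation. The sketch below assumes a fictional catalogue and illustrative field names. It is one possible approach, not a required format.

```python
import json

# Fictional catalogue entries; field names are illustrative only.
catalogue_json = """
[
  {"identifier": "ds-001", "title": "Bridge inspections", "keywords": ["infrastructure", "safety"]},
  {"identifier": "ds-002", "title": "Park visitor counts", "keywords": ["tourism", "parks"]},
  {"identifier": "ds-003", "title": "Trail conditions", "keywords": ["parks", "safety"]}
]
"""


def find_by_keyword(catalogue, keyword):
    """Return identifiers of entries tagged with the given keyword."""
    wanted = keyword.lower()
    return [
        entry["identifier"]
        for entry in catalogue
        if wanted in (k.lower() for k in entry.get("keywords", []))
    ]


catalogue = json.loads(catalogue_json)
print(find_by_keyword(catalogue, "parks"))
```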

Ensure that the repositories used to store the department’s data make it easy to search and retrieve the data.

In accordance with the Standard on Systems that Manage Information and Data, when a repository is used as part of a system that manages data, the repository must allow users to discover, access, create, capture and share that data.

Some digital data repositories may have a protocol for how to make the metadata machine-readable, and repositories that support machine readability should be prioritized where possible.

3.2.2 Accessible

Once data has been found, humans and machines need to be able to access it. In the GC context, there may be privacy, security, policy, legal or other constraints that limit the extent to which data may be accessed. However, when data cannot be openly available, its corresponding metadata should indicate whether, and describe how, it can be accessed and reused.

Self-assessment questions to assess if the data is accessible
  1. Does the department control access to its data by including information (such as metadata) about conditions that might limit access to it and by ensuring that access restrictions are removed where data should be open?
  2. Are the department’s data and corresponding metadata accessible through a standardized, open communication protocol?
  3. Does the department store data in open file formats?
  4. Does the department ensure that metadata remains accessible over the life cycle of the data it describes?
  5. Does the department maintain access to the metadata when it is appropriate to do so once the data itself no longer exists?

Good practices to enable accessible data

It is recommended that departmental officials:

Be explicit about data usage rights and access permissions.

In accordance with the Directive on Service and Digital, ensure that data includes corresponding metadata that provides information about the usage rights and access permissions for the data (for example, information on how to request access in case the data cannot be shared openly for privacy, security, ethical, legal, commercial or other reasons).

Ensure that appropriate access restrictions are applied to data. These restrictions must reflect the security categorization of the data and users’ need to access it. Ensure that data is managed securely in accordance with the Directive on Security Management, the Standard on Systems that Manage Information and Data and any applicable departmental security procedures.

Remove any access restrictions on data that can be openly available. Ensure that the data aligns with legal and policy requirements related to access to information, privacy, security and others.

Additional guidance on appropriately enabling or restricting access to data through the use of metadata can be found in the Guidance on Assessing Metadata Needs and in the Guidance on Data Quality, which addresses “access” as a dimension of data quality.

Where applicable, ensure that data and its corresponding metadata are retrievable through standardized, open communication protocols.

An open communication protocol is one whose specification is openly accessible to anyone and whose use is not restricted.

Some examples of open, standardized communication protocols are Hypertext Transfer Protocol (HTTP), File Transfer Protocol (FTP) and Simple Mail Transfer Protocol (SMTP).

Ensure that data is stored in open or other acceptable file formats.

File formats are methods for encoding digital information. Open file formats can be used by anyone and typically do not require specific software to use. Data stored in open file formats can be accessed and reused easily by users without restrictions and without a licence.

Additional guidance on selecting acceptable file formats for the GC can be found in the Guidance on Digital File Formats (accessible only on the GC network) and Library and Archives Canada’s (LAC’s) Guidelines on File Formats for Transferring Information Resources of Enduring Value.
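
A simple inventory check can help identify files that may need to be converted. The sketch below flags files whose extension does not appear on an allow-list. The list of extensions and the file names are illustrative assumptions only. The guidance documents named above remain the reference for which formats are acceptable.

```python
from pathlib import Path

# Illustrative allow-list only; consult the applicable guidance for the
# formats that are actually acceptable in a given context.
OPEN_FORMAT_EXTENSIONS = {".csv", ".json", ".xml", ".txt", ".ods", ".odt"}


def flag_non_open_files(file_names):
    """Return, in sorted order, the files whose extension is not on the list."""
    return sorted(
        name
        for name in file_names
        if Path(name).suffix.lower() not in OPEN_FORMAT_EXTENSIONS
    )


# Fictional file names used to exercise the check.
files = ["inventory.mdb", "stations.CSV", "readme.txt", "model.sav"]
print(flag_non_open_files(files))
```

A check based on file extensions is only a first pass, because an extension does not guarantee that the content of a file matches the format it claims.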

Where applicable (such as for scientific data), ensure that metadata remains available to describe data, even when the data no longer exists.

This includes metadata that describes why the data is no longer available. Where appropriate, this metadata informs users about the content of the data and saves them from spending time searching for data that is unavailable or no longer exists.

3.2.3 Interoperable

Unlocking value from data often means combining or integrating it with other data. Data also needs to be interoperable, meaning that it can work with applications or workflows for processing, storage and analysis. This interoperability can be accomplished by implementing standards, models, applications, tools or processes that enable multiple datasets to be used at the same time.

Self-assessment questions to assess if data is interoperable
  1. Does the department use standardized vocabularies or variables as data values (data values can be letters, numbers or symbols that can be read, moved and manipulated by machines)?
  2. Does the department use standardized vocabularies or standard metadata fields to describe its data (for example, data dictionaries)?
  3. Is the department’s data grouped, organized, structured and stored according to standard models or shared schemas?
  4. Does the department define relationships between different data and between data and metadata?

Good practices to enable interoperable data

It is recommended that departmental officials:

Wherever possible, implement standards.

Ensure that data and metadata are grouped, organized and stored according to applicable GC, departmental, domain- or discipline-specific standards.

The GC Enterprise Data Reference Standards provide standardized data values prescribed by the Chief Information Officer of Canada for departments to implement. Likewise, the GC Enterprise Metadata Reference Standards should be implemented in accordance with the Standard on Managing Metadata.
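
As an illustration of standardized data values, the sketch below checks one column of a dataset against a controlled vocabulary and reports values that do not conform. The two-letter province and territory abbreviations are used here as a stand-in vocabulary and are not presented as a GC reference standard. The rows and the `invalid_values` function are hypothetical.

```python
# Stand-in controlled vocabulary: two-letter abbreviations for Canadian
# provinces and territories. Used for illustration only.
PROVINCE_CODES = {
    "AB", "BC", "MB", "NB", "NL", "NS", "NT",
    "NU", "ON", "PE", "QC", "SK", "YT",
}


def invalid_values(rows, column, vocabulary):
    """Return (row index, value) pairs whose value is not in the vocabulary."""
    return [
        (index, row.get(column))
        for index, row in enumerate(rows)
        if row.get(column) not in vocabulary
    ]


# Fictional rows; the second one uses a non-standard value.
rows = [
    {"office": "Ottawa", "province": "ON"},
    {"office": "Gatineau", "province": "Que."},
    {"office": "Halifax", "province": "NS"},
]
print(invalid_values(rows, "province", PROVINCE_CODES))
```

When every dataset uses the same values for the same concept, datasets can be combined without a manual step to reconcile spellings or abbreviations.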

Also, consider the applicability and appropriateness of implementing other well-established community standards.

Departmental officials may also want to consider using LAC’s GC controlled vocabulary registry. The registry contains standardized language on Canadian topics and names, as well as vocabularies that departments have developed to describe GC resources.

Additional guidance on adopting and implementing data standards can be found in the Guidance on Data Quality under the data quality dimensions of “coherence” and “interpretability.” Good practices for applying metadata reference standards are set out in the Guidance on Prescribing Metadata Reference Standards.

Ensure that the metadata describing data provides references to any related data and its corresponding metadata to enrich contextual knowledge.

Ensure that references are included with the data itself, such as links to the globally unique and persistent identifiers of related data and metadata.

3.2.4 Reusable

Data reuse is the use of data for a purpose other than the purpose for which it was originally collected, while respecting privacy requirements. There may also be security, policy, legal or other constraints that may limit the extent to which data can be reused. Data should be accurately described by its corresponding metadata so that the data can be reused appropriately (for example, analyzed for a new purpose or combined in different settings).

Self-assessment questions to determine if data is reusable
  1. Does the department apply metadata that describes the origins or sources of the data it manages?
  2. Does the department apply metadata that describes the methodologies used to produce the data it manages (for example, collection)?
  3. Does the department apply metadata that describes the known uses of the data it manages?
  4. Does the department apply metadata to describe the actions that have been performed on the data it manages over its life cycle?
  5. Does the department apply metadata that includes information about how the data it manages can be reused (for example, licensing restrictions)?
  6. Does the department manage data in a way to support its preservation over its lifetime?

Good practices to enable reusable data

It is recommended that departmental officials:

Include metadata about the provenance of the data.

Data provenance is a record of the history of the data, including information about the origins of the data and how it has been used in the past. This information describes why, how, when, where and by whom the data was created. Users can use this metadata to determine whether to trust the data and reuse it.

Include metadata about the lineage of the data, including details about how the data has changed. Consider including metadata that provides provenance information for the department’s data, such as:

  • the source of the data and the method of data collection or creation (for example, model, instrument, methodology)
  • the date of creation or collection of the data
  • the contributors involved
  • the versioning information (how the data relates to other versions and descriptions of changes)
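
The provenance elements listed above, together with lineage, could be recorded in a structure like the one sketched below. The class names, field names and sample values are hypothetical and are meant only to show how changes to the data might be logged and read back in chronological order.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class LineageEvent:
    on: date      # when the action was performed
    action: str   # what was done to the data
    by: str       # who performed the action


@dataclass
class Provenance:
    # Field names are illustrative, not a prescribed schema.
    source: str
    method: str
    created: date
    contributors: list
    version: str
    previous_version: str = ""
    lineage: list = field(default_factory=list)

    def record(self, on, action, by):
        """Log an action performed on the data."""
        self.lineage.append(LineageEvent(on, action, by))

    def history(self):
        """Return lineage events as readable lines, oldest first."""
        ordered = sorted(self.lineage, key=lambda event: event.on)
        return [f"{e.on.isoformat()}: {e.action} ({e.by})" for e in ordered]


# Fictional provenance record.
prov = Provenance(
    source="Field sensors (fictional)",
    method="Automated hourly readings",
    created=date(2024, 3, 1),
    contributors=["Team A"],
    version="1.1",
    previous_version="1.0",
)
prov.record(date(2024, 4, 2), "Removed duplicate rows", "Team A")
prov.record(date(2024, 3, 15), "Converted units to metric", "Team B")
print(prov.history())
```
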

Ensure that digital preservation is considered as part of data management activities.

Digital preservation is a series of activities that ensure continued access to and reusability of data for as long as is necessary. These activities can help keep the data FAIR over the long term.

Perform routine quality assessments of the data, including confirming data integrity (such as fixity) over time. Also, consider retention and disposal requirements for the data (for example, those included in the Privacy Regulations). More information on how to use, share, keep and delete personal information can be found in The Digital Privacy Playbook.
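
A fixity check confirms that a file has not changed since a checksum was recorded for it. The sketch below assumes SHA-256 as the checksum algorithm and uses a temporary file with invented content. The function names are hypothetical, and the choice of algorithm and the place where checksums are stored would depend on the department's own procedures.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, expected_checksum):
    """Return True if the file still matches the recorded checksum."""
    return sha256_of(path) == expected_checksum


with tempfile.TemporaryDirectory() as folder:
    data_file = Path(folder) / "observations.csv"
    data_file.write_text("station,value\nA,1\n", encoding="utf-8")

    # Checksum recorded when the file is first stored.
    stored_checksum = sha256_of(data_file)
    unchanged = verify_fixity(data_file, stored_checksum)

    # Simulate an unintended modification to the file.
    data_file.write_text("station,value\nA,2\n", encoding="utf-8")
    change_detected = not verify_fixity(data_file, stored_checksum)

print(unchanged, change_detected)
```

Storing the checksum as part of the metadata for the data allows the check to be repeated at regular intervals over the life cycle of the data.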

Good practices for managing data quality can be found in the Guidance on Data Quality. Additional advice for managing metadata throughout its life cycle can be found in the Guidance on Metadata Life Cycle Management.

Where applicable, clearly indicate the type of licence that applies to the data.

A licence details how the data can be used or reused and by whom. If the type of licence that applies to the data is not clearly indicated, potential users will not know what they can do with the data, and reuse of the data will be limited. Using a standard type of licence is recommended (for example, Creative Commons).

Data with an open licence can be found, accessed and used by a wider audience. This may increase its reusability. Publish open departmental data to the Open Government Portal in accordance with the Directive on Open Government and as permitted by applicable federal privacy, security and intellectual property frameworks.

Finally, include information about the chosen licence as part of the metadata so that any users (humans or machines) that find the data will know what they are allowed to do with it.

Further information

Office of the Chief Information Officer
Treasury Board of Canada Secretariat
Email: ServiceDigital-ServicesNumerique@tbs-sct.gc.ca

Appendix A: Self-assessment tool

The following is a complete list of the self-assessment questions. Users can adopt it or adapt it to suit their specific context. Users can decide whether, when and how to apply each question.

| Number | Self-assessment questions | Response |
|---|---|---|
| 1 | Does the department assign a unique number, field name, alphanumeric code, or other value to the data it manages that is clear and can be used to reliably identify and locate that data in its systems over time? | Yes/No |
| 2 | Does the department apply additional information (such as metadata) that is sufficient to accurately describe the data it stores? | Yes/No |
| 3 | Does the department register, catalogue or index its data in a searchable resource? | Yes/No |
| 4 | Does the department register, catalogue or index the metadata describing its data in a freely available, searchable resource? | Yes/No |
| 5 | Does the department maintain associations (such as links) between its data and corresponding metadata that can persist over time? | Yes/No |
| 6 | Do the systems through which the department manages data make the corresponding metadata available in a format searchable by both machines and humans? | Yes/No |
| 7 | Does the department control access to its data by including information (such as metadata) about conditions that might limit access to it and by ensuring that access restrictions are removed where data should be open? | Yes/No |
| 8 | Are the department’s data and corresponding metadata accessible through a standardized, open communication protocol? | Yes/No |
| 9 | Does the department store data in open file formats? | Yes/No |
| 10 | Does the department ensure that metadata remains accessible over the life cycle of the data it describes? | Yes/No |
| 11 | Does the department maintain access to the metadata in instances where it is appropriate to do so once the data itself no longer exists? | Yes/No |
| 12 | Does the department use standardized vocabularies or variables as data values (data values can be letters, numbers or symbols that can be read, moved and manipulated by machines)? | Yes/No |
| 13 | Does the department use standardized vocabularies or standard metadata fields to describe its data (for example, data dictionaries)? | Yes/No |
| 14 | Is the department’s data grouped, organized, structured and stored according to standard models or shared schemas? | Yes/No |
| 15 | Does the department define relationships between different data and between data and metadata? | Yes/No |
| 16 | Does the department apply metadata that describes the origins or sources of the data it manages? | Yes/No |
| 17 | Does the department apply metadata that describes the methodologies used to produce the data it manages (for example, collection)? | Yes/No |
| 18 | Does the department apply metadata that describes the known uses of the data it manages? | Yes/No |
| 19 | Does the department apply metadata to describe the actions that have been performed on the data it manages over its life cycle? | Yes/No |
| 20 | Does the department apply metadata that includes information about how the data it manages can be reused (for example, licensing restrictions)? | Yes/No |
| 21 | Does the department manage data in a way to support its preservation over its lifetime? | Yes/No |
