FAIR for Beginners

You have perhaps heard that data should be FAIR - Findable, Accessible, Interoperable and Reusable? Learn about the FAIR principles and why they are important, gain insight into how they are used and find FAIR tools you can use in your research.

DM Forum collaborates across institutions on activities that promote research data management. You can read more about the "Fair Across Disciplines" activity in the overview of inter-institutional data management activities here.


 

Introduction to the FAIR principles

Findable, Accessible, Interoperable and Reusable are the main principles in FAIR. There is a little more to the story - in total there are 15 principles that define properties that data/metadata/infrastructure must have in order to be found and reused.

The fairy tale "A FAIRy tale - A fake story in a trustworthy guide to the FAIR principles for research data" explains the principles one by one - in an entertaining, factual and not least educational way.

Read/Download the book

Visit the website

License - CC-BY-SA 4.0 Attribute: ‘DK Fair på tværs’

DOI: 10.5281/zenodo.2248200

An overview of the principles listed with the accompanying brief explanation can also be found at go-fair.org/fair-principles/ which is the authoritative source.

 

Why FAIR - and how?

The poster "The path to a successful research project - How to get on the right track with FAIR data management" explains how to achieve a FAIR data management practice throughout all the phases of a research project.

Slides with the same messages as the poster:

View/download presentation in PDF format

Postcards with the same messages as the poster:

Poster, slides and postcards all have the same license - CC-BY-SA 4.0 Attribute: ‘DK Fair på tværs’

 

Myths about FAIR

There are many myths about FAIR data. Here, some of the most common ones are busted. Each myth is presented on a postcard where the back presents the real picture.


Myth 1 | Answer | Print version


Myth 2 | Answer | Print version


Myth 3 | Answer | Print version


Myth 4 | Answer | Print version


Myth 5 | Answer | Print version


Myth 6 | Answer | Print version

Myth 7 | Answer | Print version

Myth 8 | Answer | Print version

Myth 9 | Answer | Print version

All postcards have the same license - CC-BY-SA 4.0 Attribute: ‘DK Fair på tværs’

 

Overview of FAIR tools

Download an illustration of the research data life cycle here.

The table (below the figure) indicates which phases of the research data life cycle the tools are intended to be used in:

 
  • Create: Collect or generate new data from scratch, e.g. through measurements or surveys.
  • Process/Analyze: Prepare, process and analyze research data, including activities like digitization, conversion and interpretation of data.
  • Document: Add context to the data, including provenance and metadata.
  • Recycle: Use previous output as input for new analysis or interpretation, also in collaborations.
  • Publish/Disseminate: Make selected datasets available for other researchers or the general public.
  • Archive: Deposit selected datasets in systems suitable for long-term preservation.
  • Exploit: Perform research directly on previously published data.
  • Discover & Re-use: Use previously published data for new research, e.g. from public databases and repositories.
  • Release: Provide access to raw data for others to use.
  • Preserve: Retain raw data on long-term storage.
  • Discard: Destroy or delete any data – due to legal or contractual obligations, for example. This is not a FAIR process and is therefore not considered further.

The table also shows how the individual tool fits in with the FAIR dimensions. You can use the overview to find the right tools to make your research data (more) FAIR - depending on the data you have and what you want to do with it.

| Availability | Discipline | FAIR dimension | Phases in research life cycle | Service Name | Description |
|---|---|---|---|---|---|
| International | Generic (Tabular data) | __I_ | Process/Analyze | OpenRefine | OpenRefine is a standalone open source desktop application for data cleanup and transformation to other formats (i.e. data wrangling). |
| International | Generic | FA(I)(R) | Publish/Disseminate, Archive, Discover & Re-use, Release, Preserve | Zenodo | Zenodo is a general-purpose open access research data repository, hosted by CERN (Switzerland), that provides a place for researchers to deposit datasets. Researchers in any subject area are able to upload files up to 50 GB. It has an integration with GitHub to make code hosted in GitHub citable. Support: zenodo.org/support |
| International | Generic - Highly recognized in Social Sciences | FA(I)(R) | Publish/Disseminate, Archive, Discover & Re-use, Release, Preserve | Harvard Dataverse | Dataverse is a data repository that is widely used within the Social Sciences. Researchers can log in with their institutional credentials via WAYF. Data can be made findable by applying discipline-specific metadata schemes and digital object identifiers (DOIs). Data is made reusable by specifying relevant re-use licenses. |
| International | Generic (Software) | _A(I)(R) | Create, Process/Analyze, Document, Recycle, Publish/Disseminate | Jupyter Notebooks | The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more. |
| International | Generic | _A(I)(R) | Publish/Disseminate, Archive | B2SHARE | B2SHARE is a user-friendly, reliable and trustworthy way for researchers, scientific communities and citizen scientists to store and share small-scale research data from diverse contexts. |
| International | Generic | F___ | Discover & Re-use | B2FIND | B2FIND is a discovery service based on metadata steadily harvested from research data collections from EUDAT data centres and other repositories. |
| International | Generic (Tabular data) | __IR | Document | Data Package Creator | Data Package is a simple container format used to describe and package a collection of data. It can be used to package any kind of data. At the same time, for specific common data types such as tabular data, it has support for providing important additional descriptive metadata, for example describing the columns and data types in a CSV. The Data Package format provides a simple way to improve data interoperability that supports frictionless delivery, installation and management of data. |
| International | Generic (Tabular data) | ___R | Process/Analyze | Good Tables | Good Tables is a web service where data can be provided in CSV or Excel format, and the file will be validated for well-formedness (for example, no empty rows or columns, no duplicate rows, all rows have valid dimensions, and so on), and conformance to a schema (if a JSON Table Schema is supplied). |
| National | Generic | F___ | Publish/Disseminate, Discover & Re-use, Release | DataCite | DataCite is a global non-profit organisation that provides persistent identifiers (DOIs) for research data. It is possible to assign DOIs to datasets and other research objects. This way DataCite supports researchers in their efforts to find, identify, and cite research data and other research objects. Support: datacite@deic.dk |
| National | Humanities | FA(I)(R) | Publish/Disseminate, Archive, Discover & Re-use, Release, Preserve | CLARIN.dk Repository | Data repository for making language data shareable. Each data set gets a PID, and the metadata are searchable. CLARIN.dk is part of the European CLARIN ERIC community (see clarin.eu), which offers a number of services. Metadata information from the data sets will be harvested by the European service Virtual Language Observatory (vlo.clarin.eu) and will be made available to researchers outside Denmark. Support: info@clarin.dk |
| National | Social Sciences | FA(I)(R) | Exploit, Discover & Re-use | Danmarks Statistik Forskerservice | Statistics Denmark's open data are official statistics data, which are electronically available in a way that users can apply them directly in various applications. They can be accessed either via a user interface or via an API against the StatBank. Data can be freely used in the development of various services, including development of apps for smartphones. All statistics on Statistikbanken.dk and the website dst.dk can be used free of charge. Data are licensed under Creative Commons, CC-BY 4.0. Support: dst@dst.dk |
| National | Social Sciences and Health Sciences | FA__ | Publish/Disseminate, Preserve, Archive | Rigsarkivet | Rigsarkivet/The National Archives offer archival services for research data. Data will be preserved according to appraisal standards. All research data disseminated by Rigsarkivet will have a persistent identifier (DOI). Support: mailbox@sa.dk |
| National | Social Sciences and Health Sciences | FAIR | Exploit, Discover & Re-use | Rigsarkivets research support and survey data collections | Search engine containing all research data handed in to Danish Data Archive and Rigsarkivet since 1973. Covers survey data, cohorts, registries etc. All data and metadata are processed and enriched with metadata according to domain-specific and international standards and contain controlled vocabularies, classifications and keywords. Data are available in several formats. Codebook information is open and available online. Support: mailbox@sa.dk |
| National | Social Sciences and Health Sciences | _(A)IR | Exploit, Discover & Re-use | DDA Online | Online tool with survey data from Rigsarkivet, where it is possible to perform online statistical analysis and download results. Support: mailbox@sa.dk |
| National | Generic | _A(I)(R) | Publish/Disseminate, Archive, Discover & Re-use | LOAR | LOAR is a general-purpose open access research data repository for Danish researchers. The service creates a DataCite DOI for each dataset and stores it for at least five years. All data in LOAR is considered Open Access and will either be placed under a basic distribution license or under a Creative Commons license. LOAR also hosts open access data from the Danish Royal Library collections. Support: datarepo@kb.dk |
| Institutional AAU | Generic | FA(I)(R) | Publish/Disseminate, Discover & Re-use, Release, Preserve, Archive | VBN | VBN allows for registration (and possible upload) of datasets, including harvesting to national and international portals such as forskningsdatabasen.dk and openaire.eu. A dataset can utilize the binding features of VBN to make a connection between a dataset and a publication, a project or the like. This is not mandatory, and a dataset can exist in VBN on its own. Support: vbn@aub.aau.dk |
| Institutional AAU | Generic (Software) | _AI(R) | Process/Analyze, Document, Recycle | Subversion | Subversion is a version control system that keeps track of changes made to files and folders (directories), facilitating data recovery and providing a history of the changes that have been made over time. |
| Institutional DTU | Generic | FA(I)(R) | Publish/Disseminate, Archive, Discover & Re-use, Release, Preserve | DTU Data | DTU Data is the institutional data repository of DTU, hosted by Figshare. DTU Data allows DTU’s researchers to publish datasets, with almost no restrictions on formats and file sizes. All published datasets are provided with metadata, a DOI and a usage license. Data/metadata published in DTU Data will by default be openly accessible. However, creators can restrict the access to their data, e.g. for peer-reviewers or collaborators. It has an integration with GitHub to make code hosted in GitHub citable. Support: datamanagement@dtu.dk |
| Institutional DTU | Generic (Software) | _AI(R) | Process/Analyze, Document, Recycle, Publish/Disseminate, Discover & Re-use, Release | GitLab | GitLab is a tool for versioning, documenting and publishing code. It is available as a Hub for all students and researchers at DTU. It does not provide DOIs automatically. DTU departments can deploy GitLab on their own servers by contacting DTU's Research-IT (ait-fit-server-net-support@dtu.dk). Support: gbar.dtu.dk/faq/94-gitlab |
| Institutional KU | Generic | (F)A(I)(R) | Publish/Disseminate, Archive, Release, Preserve | Data DOI (ERDA) | Data DOI is a service using KU's internally developed storage system (ERDA) and web-interface with metadata form. The service creates a DataCite DOI for the dataset and enables medium long term archiving of research data for researchers at KU. The service is only accessible and described via KU's internal network. It is not suitable for Personal Data. Support: support@erda.dk |
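To make the Data Package and Good Tables entries in the table above more concrete, here is a minimal sketch in plain Python. It is not the code of either tool. The sample data, column names and helper functions are hypothetical, and the descriptor layout follows our reading of the Data Package format, so check it against the official specification before relying on it. The sketch checks a small CSV for empty rows, duplicate rows and rows with the wrong number of columns, and then builds a descriptor that lists each column with a data type.

```python
import csv
import io
import json

# Hypothetical sample data; the column names are illustrative only.
SAMPLE_CSV = "\n".join([
    "station,date,temperature_c",
    "A1,2020-01-01,3.5",
    "A1,2020-01-02,4.1",
    "",                    # empty row
    "A1,2020-01-02,4.1",   # duplicate of an earlier row
    "B2,2020-01-01",       # one column missing
]) + "\n"


def read_rows(text):
    """Parse CSV text into a header row and a list of data rows."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    return header, list(reader)


def check_rows(header, rows):
    """Return (line_number, problem) pairs for simple well-formedness issues."""
    problems = []
    seen = set()
    # Data starts on line 2 because line 1 holds the header.
    for line_no, row in enumerate(rows, start=2):
        if not any(cell.strip() for cell in row):
            problems.append((line_no, "empty row"))
            continue
        if len(row) != len(header):
            problems.append((line_no, "wrong number of columns"))
        key = tuple(row)
        if key in seen:
            problems.append((line_no, "duplicate row"))
        seen.add(key)
    return problems


def build_descriptor(name, path, header, types):
    """Build a minimal Data Package-style descriptor (assumed layout)."""
    fields = [{"name": col, "type": types.get(col, "string")} for col in header]
    return {
        "name": name,
        "resources": [{"name": name, "path": path, "schema": {"fields": fields}}],
    }


if __name__ == "__main__":
    header, rows = read_rows(SAMPLE_CSV)
    for line_no, problem in check_rows(header, rows):
        print(f"line {line_no}: {problem}")
    descriptor = build_descriptor(
        "temperatures",          # hypothetical package name
        "temperatures.csv",      # hypothetical file path
        header,
        {"date": "date", "temperature_c": "number"},
    )
    print(json.dumps(descriptor, indent=2))
```

Saving the printed descriptor next to the CSV file (conventionally as datapackage.json) is what lets other people and tools interpret the columns without guessing. The tools listed in the table perform far more thorough checks than this sketch.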

 

Overview of services for data sharing

| Availability | Service Name | Description |
|---|---|---|
| International | B2DROP | B2DROP is a secure and trusted data exchange service for researchers and scientists to keep their research data synchronized and up-to-date and to exchange with other researchers. |
| Institutional DTU/i2 | ScienceData | ScienceData is a sharing & collaboration tool provided by DTU/i2. It can be accessed via ORCID or institutional credentials. It has integrated features to: publish your data to Zenodo; add standard metadata schemas or supply your own metadata to your datasets; organize data using colored tags; sync individual folders (selected by the user); connect to compute nodes (ABACUS2.0) or virtual machines; make your own apps for processing (open API); add backup sites at other geographical locations. |
| Institutional AAU | HCP Anywhere | Institutionally provided sync’n’share tool for collaboration on files. |
| Institutional AAU | OneDrive | Sync’n’share tool for collaboration on files. Provided by Microsoft Corp. |
| Institutional DTU | Files DTU | File DTU is the ”Dropbox-like” sync’n’share service that supplements the existing network drives and meets DTU requirements for IT-Security / IT-Control. |

 

Overview of information resources/guides about FAIR

| Availability | Discipline | FAIR dimension | Service Name | Description |
|---|---|---|---|---|
| International | Social Science | FAIR | CESSDA Expert Tour Guide on Data Management | This tour guide aims to help social scientists to make their research data findable, understandable, sustainably accessible and reusable. |
| International | Generic | __IR | UK Data Service | The UK Data Service (National archive) website contains guidance on data use, best practices for data preservation and sharing standards. At this link, researchers can find guidance on the file formats recommended and accepted by the UK Data Service for data sharing, reuse and preservation, which are also applicable to any other data repository or archive. |
| International | Generic | __IR | Registry of Research Data Repositories | re3data.org is a global registry of research data repositories that covers research data repositories from different academic disciplines. |
| International | Generic | __IR | FAIRsharing | Researchers can find a catalogue of metadata standards, databases and policies. This information is particularly useful during the planning phase of a research project. |
| National | Generic | FAIR | Rigsarkivet Datamanagement | Researchers can find more information about the RDM policy of the Danish National Archives and also find links to several useful tools. |
| Institutional AAU | Generic | FAIR | CLAAUDIA | CLAAUDIA is a strategic initiative that joins forces between IT Services and the University Library. We provide a technical and a human infrastructure providing compute and storage solutions as well as training, support and advice. We are staffed with research IT professionals, data managers and data scientists to support research data handling from application to archive. |
| Institutional CBS | Generic | FAIR | CBS - Research Data Management | On this website you can find information about CBS' RDM policy, CBS RDM Support and best practices within data management. |
| Institutional DTU | Generic | FAIR | DTU Inside - Research Data | On DTU Inside researchers find a guide which covers the background and aspects of data management. The guide goes through the steps of the data life cycle and how to use the DMPonline tool. It also includes toolboxes for managing research data and data collection, and a section describing the support the data management group offers. |
| Institutional KU-HUM | Humanities | FAIR | KU-HUM's Datamanagement info site | Information for researchers about research data management and where to look for more detailed information, including links to webpages on GDPR. |
| Institutional KU | Generic | FAIR | KU's Datamanagement info site | Information about research data management and GDPR (only accessible for KU-employees). |
| Institutional SDU | Generic | FAIR | Research Data Management support | Researchers can find information about what kind of services SDU provides regarding research data management, a link to the SDU Open Science Policy, links to recommended resources (e.g. DMPonline, an internal webpage on GDPR and links to the participating units). An overview of the group of people providing RDM Support is also available together with a link to the DM Forum website, which governs the RDM Support unit. |

 

Total overview for download

 

Examples of studies of FAIR data practices in concrete research projects

The FAIR principles are not necessarily easy to live by. The following describes the challenges that a number of active research projects face in relation to FAIR, including suggestions on how each project can work towards more FAIR data:

Revised 02/10/20