Why Cite Data?

Research data are important outputs in their own right and should be addressable as an independent research resource. Data citations promote the view of data as independent scientific products and give recognition to their producers. Furthermore, data citations foster research transparency and open science, in line with good scholarly practice. Various scholars and institutions have discussed why data citations matter (see, e.g., Ball & Duke, 2015; Bornatici & Fedrigo, 2023; Dosso & Silvello, 2020; Finnish Social Science Data Archive, n.d.; FORCE11, 2014; ICPSR, 2018; Silvello, 2018; UK Data Service, 2023a). This section does not attempt an exhaustive review; instead, it highlights selected reasons of particular significance, grouped under the following aspects:

  • Accountability
    • Data citations ensure that researchers and all other actors in the data ecosystem receive proper recognition for their work, following standard scientific practice.
    • A data citation that includes a persistent identifier ensures that the data remain addressable in the long term.
  • Reproducibility
    • Reproducibility requires that the data and related contextual material are discoverable.
    • Data citations help to make data findable and, under certain conditions, accessible, which facilitates the reproduction and verification of research and its results.
  • Policy compliance
    • Data citations are often needed to fulfil the requirements of the data policies of publishers, funders, or research-producing organisations.
  • Reuse
    • In the social sciences and humanities, research data have a scientific, cultural, or educational value that extends beyond the purpose for which they were originally collected.
    • Providing independently addressable research data with associated metadata makes various reuse cases possible, including but not limited to:
      • Ensuring and demonstrating the reproducibility of a method used in a study
      • Exploring new research questions using existing data
      • Studying data citation and usage patterns
      • Accessing the data together with their associated metadata through automated processes, such as metadata harvesting or AI applications
    • Data citations are key elements in building knowledge graphs and in connecting data, publications, authors, projects, and funding.