Research data management (RDM) is defined as the active and ongoing management of data “from its entry to the research cycle through to the dissemination and archiving of valuable results” (Whyte & Tedds, 2011). RDM is an overarching term encompassing the organisation, storage and documentation of data, the curation of active research data and the long-term preservation of research data that are no longer active.
Data are a valuable resource that often requires a great deal of time, effort and money to create. Like journal articles, research data are a scholarly output; however, data are much more fragile and more vulnerable to loss.
There are many good reasons why research data should be managed:
This short video demonstrates some of the issues that arise in research data management: Hanson, K., Surkis, A. & Yacobucci, K. Data sharing and management snafu in 3 short acts. https://www.youtube.com/watch?v=66oNv_DJuPc
At RCSI, our Research Data Management Policy provides a framework for the management of research data. It ensures that research data are stored, retained, made available for use and reuse, and disposed of according to international best practice for data management, and in compliance with legal, statutory, ethical, contractual and intellectual property obligations and with the requirements of funding bodies and publishers.
Does Research Data Management apply to you? Yes. Our Research Data Management Policy applies to all College members engaged in research, including staff and research students, and to those conducting research on behalf of the College. It applies to all research, irrespective of funding. See the current RCSI Research Data Management Policy in full.
Where do you start with Research Data Management? It can be helpful to think of research data management in terms of a research data lifecycle and the data-related activities that take place at stages during this lifecycle. The diagram below from the University of Reading illustrates the research data lifecycle in seven stages.
Plan: Identify the data that will be collected or used to answer your research question. This is the stage at which the data management plan is created. Many funders ask for a data management plan to be submitted as part of a research application or within the first six months of starting a new project.
Collect: Data are collected via experiments, observations, surveys, secondary materials, etc., depending on your methodology. You should actively document your data collection, including information on instruments and methods, and anything else that is necessary to interpret and use the data.
Process: Once data have been collected they are processed in order to be usable. This might involve cleaning data to eliminate noise, combining data from multiple sources, transforming data from one state to another (e.g. by format conversion), and using procedures to validate or quality-control data. Any data processing will need to be documented, such that the end result can be replicated from the raw data.
Analyse: The raw materials of research are interrogated to produce the insights that constitute the research findings, which will be written up and published in research outputs. Instruments and methods used for analysis should be documented; code written for purposes of data analysis and visualisation may need to be preserved and made available in support of research results.
Preserve: Towards the completion of your research you will select the data that are needed to substantiate your research findings, or that have long-term value, and you will preserve these data for the long term. For data to remain accessible and safe in the long term, they must be prepared for preservation and deposited in a suitable location such as a data repository. Preservation activities may involve quality assurance of data, file format conversion, creation of metadata records with assignment of Digital Object Identifiers (DOIs) to datasets, licensing datasets for re-use, and putting in place any required access controls. If the data are confidential or non-digital, they may be held locally, in which case they should be managed by an accountable person or group who can ensure they are stored and preserved properly.
Share: Publications based on data should include a data citation or a statement indicating where and on what terms the data can be accessed. A data repository will enable discovery of the data in its care by exposing the metadata online, and will provide access to the data when this is permitted. Data may be made publicly available, or restrictions on access may be imposed where data are of a sensitive or confidential nature. Data held locally or in non-public locations should be managed in such a way that others can discover and apply for access to the data.
Re-use: Data that are available for discovery and access may be re-used by other researchers, either to substantiate the findings of the original research, or to generate new insights through further interrogation and analysis. At this stage the data may become raw materials collected within a new cycle of research. Research data may also have other valuable uses, e.g. in policy-making, development of commercial products and services, and teaching.
(Content adapted from The research data lifecycle by the University of Reading)
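The Process stage above asks that any processing be documented so that the end result can be replicated from the raw data. One way to do this is sketched below in Python, under stated assumptions: each processing step appends an entry to a plain-text log that records what was done and a checksum of the input and output files. The function names (`sha256_of`, `record_step`, `drop_blank_lines`) and the log layout are hypothetical illustrations, not an RCSI or University of Reading convention, and a real project would adapt them to its own data management plan.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_step(log_path, description, input_path, output_path):
    """Append one processing step to a JSON-lines log (hypothetical layout)."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "description": description,
        "input": {"file": Path(input_path).name, "sha256": sha256_of(input_path)},
        "output": {"file": Path(output_path).name, "sha256": sha256_of(output_path)},
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return entry


def drop_blank_lines(raw_path, clean_path):
    """Example cleaning step: copy a text file without its blank lines.

    The raw file is never modified; the cleaned copy is written separately.
    Returns the number of lines removed.
    """
    lines = Path(raw_path).read_text(encoding="utf-8").splitlines()
    kept = [line for line in lines if line.strip()]
    Path(clean_path).write_text("\n".join(kept) + "\n", encoding="utf-8")
    return len(lines) - len(kept)
```

A typical use would be to call `drop_blank_lines("raw.csv", "clean.csv")` and then `record_step("processing_log.jsonl", "drop blank lines", "raw.csv", "clean.csv")`. Keeping the raw file untouched and logging checksums means a later reader can confirm which input produced which output, which also supports the quality assurance activities mentioned under Preserve.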
A short video about sharing research data: Dr Kristin Briney, a Data Services Librarian at the University of Wisconsin-Milwaukee, describes the current research data landscape, how it can be improved to increase scientific reproducibility, and how shared data can be reused in new ways to generate innovations and technologies.
The European Commission is currently proposing to establish the European Health Data Space. But what will the proposed EHDS mean for health researchers?
The EHDS creates a strong legal framework for the use of health data for research, innovation, public health, policy-making and regulatory purposes. Under strict conditions, researchers, innovators, public institutions or industry will have access to large amounts of high-quality health data, which are crucial for developing life-saving treatments, vaccines or medical devices and for ensuring better access to healthcare and more resilient health systems.
Access to such data by researchers, companies or institutions will require a permit from a health data access body, to be set up in all Member States. Access will only be granted if the requested data are used for specific purposes, in closed, secure environments and without revealing the identity of the individual. It is also strictly prohibited to use the data for decisions that are detrimental to citizens, such as designing harmful products or services or increasing an insurance premium.
The health data access bodies will be connected to the new decentralised EU infrastructure for secondary use (HealthData@EU), which will be set up to support cross-border projects.
The EHDS complements GDPR and other data directives. More on the EHDS