Research data management (RDM) is defined as the active and ongoing management of data “from its entry to the research cycle through to the dissemination and archiving of valuable results” (Whyte & Tedds, 2011).
RDM is an overarching term encompassing the organisation, storage, documentation and curation of research data, culminating in the long-term preservation of these data after the research has been completed.
The best approach to RDM is to start with a data management plan, and you can read advice on writing a data management plan further on in this guide.
Data are a valuable resource, often requiring a great deal of time, effort and money to create. Like journal articles, research data are a scholarly output; however, data are much more fragile and vulnerable to being lost. There are many good reasons why research data should be managed.
Where do you start with Research Data Management? It can be helpful to think of research data management in terms of a research data lifecycle and the data-related activities that take place at stages during this lifecycle. The diagram below from the University of Reading illustrates the research data lifecycle in seven stages.
Plan: Identify the data that will be collected or used to answer your research question. This is the stage at which the data management plan is created. Many funders ask for a data management plan to be submitted as part of a research application or within the first six months of starting a new project.
Collect: Data are collected, via experiments, observations, surveys, secondary materials etc. depending on your methodology. You should be actively documenting your data collection, including information on instruments and methods - anything that's necessary to interpret and use the data.
Process: Once data have been collected they are processed in order to be usable. This might involve cleaning data to eliminate noise, combining data from multiple sources, transforming data from one state to another (e.g. by format conversion), and using procedures to validate or quality-control data. Any data processing will need to be documented, such that the end result can be replicated from the raw data.
Analyse: The raw materials of research are interrogated to produce the insights that constitute the research findings, which will be written up and published in research outputs. Instruments and methods used for analysis should be documented; code written for purposes of data analysis and visualisation may need to be preserved and made available in support of research results.
Preserve: Towards the completion of your research you will select the data that are needed to substantiate your research findings, or those with long-term value, and you will preserve these data for the long term. For data to remain accessible and safe in the long term, they must be prepared for preservation and deposited in a suitable location such as a data repository. Preservation activities may involve quality assurance of data, file format conversion, creation of metadata records with assignment of Digital Object Identifiers (DOIs) to datasets, licensing datasets for re-use, and putting in place any required access controls. If the data are confidential or non-digital, they may be held locally, in which case they should be managed by an accountable person or group who can ensure they are stored and preserved properly.
Share: Publications based on data should include a data citation or a statement indicating where and on what terms the data can be accessed. A data repository will enable discovery of the data in its care by exposing the metadata online, and will provide access to the data when this is permitted. Data may be made publicly available, or restrictions on access may be imposed where data are of a sensitive or confidential nature. Data held locally or in non-public locations should be managed in such a way that others can discover and apply for access to the data.
Re-use: Data that are available for discovery and access may be re-used by other researchers, either to substantiate the findings of the original research, or to generate new insights through further interrogation and analysis. At this stage the data may become raw materials collected within a new cycle of research. Research data may also have other valuable uses, e.g. in policy-making, development of commercial products and services, and teaching.
(Content adapted from The research data lifecycle by the University of Reading)
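The Process stage asks that any processing be documented so that the end result can be replicated from the raw data. One way to do this is to script the processing and have the script write its own log. The following is a minimal sketch only, using the Python standard library. The file layout, the `value` column and the cleaning rule are hypothetical; substitute whatever your own methodology requires.

```python
import csv
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, so copies can be verified later."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def clean_rows(rows):
    """Keep only rows whose 'value' field is numeric (a hypothetical rule)."""
    kept = []
    for row in rows:
        try:
            row["value"] = float(row["value"])
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric value: drop the row
        kept.append(row)
    return kept


def process(raw_path: Path, out_path: Path, log_path: Path) -> dict:
    """Clean a raw CSV file and write a JSON log describing what was done."""
    with raw_path.open(newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames or []
        rows = list(reader)

    cleaned = clean_rows(rows)

    # The raw file is never modified; cleaned data go to a separate file.
    with out_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(cleaned)

    log = {
        "processed_at": datetime.now(timezone.utc).isoformat(),
        "raw_file": raw_path.name,
        "raw_sha256": sha256_of(raw_path),
        "output_file": out_path.name,
        "output_sha256": sha256_of(out_path),
        "rows_in": len(rows),
        "rows_kept": len(cleaned),
        "steps": ["dropped rows with a missing or non-numeric 'value'"],
    }
    log_path.write_text(json.dumps(log, indent=2), encoding="utf-8")
    return log
```

Calling `process(Path("raw.csv"), Path("clean.csv"), Path("processing_log.json"))` leaves the raw file untouched and records the checksums, row counts and steps applied. Keeping the log alongside the data gives you ready-made documentation when you come to the Preserve stage.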
At RCSI, our Research Data Management Policy provides a framework for the management of research data. It aims to ensure that research data are stored, retained, made available for use and reuse, and disposed of according to best international practices for data management, and in compliance with legal, statutory, ethical, contractual and intellectual property obligations and the requirements of funding bodies and publishers.
Key points of our Research Data Management Policy
Read the RCSI Research Data Management Policy in full.
The FAIR Data Principles (Findable, Accessible, Interoperable, Reusable) are a set of guidelines for best practice in managing the outputs of research, with the ultimate goal of optimising the reuse of research data. The FAIR Data Principles have rapidly come to define best practice in research data management.
Visit our FAIR data library guide for practical steps you can take to make your research data FAIR.
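The Preserve stage mentions metadata records, DOIs, licences and access controls, and these are also what make a dataset easier to find and reuse. As a rough illustration, the sketch below checks a metadata record for a handful of fields before deposit. The field names and example values are hypothetical and do not follow any formal schema; your chosen repository will define the fields it actually requires.

```python
import json

# Illustrative field names only; not a formal metadata schema.
REQUIRED_FIELDS = (
    "identifier",        # persistent identifier such as a DOI
    "title",
    "creators",
    "publisher",
    "publication_year",
    "licence",           # terms on which the data may be reused
    "access",            # e.g. open, restricted
)


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or empty in a record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Placeholder record for illustration; all values are made up.
record = {
    "identifier": "",  # left empty: a DOI is typically assigned on deposit
    "title": "Example survey dataset",
    "creators": ["Researcher, A."],
    "publisher": "Example Data Repository",
    "publication_year": 2024,
    "licence": "CC BY 4.0",
    "access": "open",
}

print(json.dumps(record, indent=2))
print("Missing:", missing_fields(record))
```

A check like this is only a prompt to fill in gaps; it says nothing about the quality of the descriptions themselves.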
A short video about sharing research data: Dr Kristin Briney, a Data Services Librarian at the University of Wisconsin-Milwaukee, describes the current research data landscape, how it can be improved to increase scientific reproducibility, and how shared data can be reused in new ways to generate innovations and technologies.