Research Data Management

Storage, Backup and Security

According to Science Europe, when developing a data management plan, the third topic researchers are required to address is "Storage and backup during the research process", which broadly encompasses two main questions:


 How will data and metadata be stored and backed up during the research process?

  • Describe where the data will be stored and backed up during research activities and how often the backup will be performed. It is recommended to store data in at least two separate locations.
  • Give preference to the use of robust, managed storage with automatic backup, such as provided by IT support services of the home institution. Storing data on laptops, stand-alone hard drives, or external storage devices such as USB sticks is not recommended.

 How will data security and protection of sensitive data be taken care of during the research?

  • Explain how the data will be recovered in the event of an incident.
  • Explain who will have access to the data during the research and how access to data is controlled, especially in collaborative partnerships.
  • Consider data protection, particularly if your data is sensitive, for example if it contains personal data, politically sensitive information, or trade secrets. Describe the main risks and how these will be managed.
  • Explain which institutional data protection policies are in place.

Cloud storage at RCSI


The following advice is for projects not categorised as ‘big data’ and not requiring high performance computing capabilities.

OneDrive is Microsoft's cloud-based storage solution, similar to Google Drive and Dropbox, allowing you to store all your personal files securely in one place. Every staff member at RCSI gets 1TB of storage on MS OneDrive (provided via the institutional MS Office 365 plan). You can access your files from anywhere using a web browser (Chrome, Edge, etc.) or a mobile device. OneDrive is appropriate for personal work files, for example versions that are not ready to share with others, as only the owner has the access rights to view, edit, or share the content. For more information please refer to the RCSI IT Guide on OneDrive.

SharePoint is a cloud storage solution that allows you to share documents with your team and manage your research project. When a new research project commences, IT Support at RCSI provides the Primary Investigator (PI) of that study with a SharePoint Online (SPO) site. SharePoint has replaced the previous RCSI V: drive system. All research data and documentation associated with the project (including signed consent forms) should be stored within the project’s unique SPO site. The owner of the SPO site has full control privileges to their site: only the SPO site owner can add users to the SharePoint group and assign user permissions. The SPO site owner can also grant access to an external collaborator who does not have an RCSI email account, using the procedure described in the RCSI IT Guide on Inviting external guests to a SharePoint online site. Please note that SharePoint is used in RCSI primarily for internal collaboration, to avoid any potential data leakage. As such, when an SPO site is created, the external sharing feature may be disabled as a security measure. If you need external sharing, you should contact RCSI IT support to request an SPO site that has this feature enabled. The project PI can purchase additional storage space using grant funds by contacting IT with their request. For more information please refer to the RCSI IT Guide on SharePoint.

Data protection and the Microsoft 365 service at RCSI: Both OneDrive and SharePoint are hosted in the Microsoft cloud, with geographical location restricted to the European Economic Area, and so are capable of GDPR compliance. Non-RCSI storage systems, such as Dropbox and Google Drive, provide no restriction on the geographical location of the storage and are therefore not GDPR compliant. They must not be used for any data subject to GDPR or to any restrictions or conditions from the data provider.

 

Storage options for big data projects


The following options are suitable for processing large datasets that require compute resources that exceed what is possible with a desktop computer or a single server e.g. ‘omics research. If you require any of the below options, please contact the Systems Administrator (Research IT Service) (research-it@rcsi.com) to discuss options. Additional detail is available at the RCSI Research IT Service site.

ICHEC Kay is Ireland's national supercomputer for academic researchers and is based in the Irish Centre for High-End Computing (ICHEC) at NUI Galway. RCSI has contracted capacity on ICHEC Kay for computational work that uses large-scale anonymised data. 

At RCSI, clinical research depends upon the secure storage and processing of personal data, both to deliver precision medicine and in conjunction with clinical trials. This data cannot be processed on most external services. The following storage and processing options are more suited to personal data that is subject to GDPR. 

Isilon storage (PowerScale) is a substantial, centrally funded storage system that is highly scalable (i.e. storage nodes can be added to handle increased workload). It provides a single central location for active research data which is highly resilient compared to point solutions. Isilon storage is sufficiently secure and auditable to support the storage of Personal Data. It is integrated with Active Directory, so access to data is controlled through Active Directory and it is easy to audit who has access to any data. Data which the Data Management Plan requires to be retained beyond the lifetime of the funded project, and for which there is no suitable public repository, can be archived to storage within Microsoft Azure while remaining manageable from the Isilon system.

Local Compute Cluster: This is a small Slurm cluster onsite at RCSI (Dublin). It works in a similar way to ICHEC Kay but at a much smaller scale. The Local Compute Cluster is physically secure with encrypted disks. It is CIS (Center for Internet Security) hardened, which provides additional security to the Linux operating system. The Local Compute Cluster is also integrated with Active Directory, so identity is assured. Because of these and other security measures, it can be used to process personal data. The Systems Administrator (Research IT Service) (research-it@rcsi.com) can install required software on the cluster on request. 

External access: There is a controlled capability to make specific data on the storage system accessible to external collaborators; however, they have to be registered with RCSI in order to maintain audit trails and for technical reasons related to access control.

 

Portable drives and laptops


Confidential data should not be copied to or stored on a portable computing device or a non-RCSI owned computing device. However, in situations that require confidential data to be stored on such devices, data owners and device users must acknowledge how they will ensure that data is encrypted and how encrypted data will be accessible by the owner in the event that an encryption key becomes lost or forgotten. Methods to meet this requirement include:

  • Maintaining an accessible copy of the data on a server managed by RCSI IT
  • Escrowing the encryption key with a trusted party designated by the data owner
  • Use of whole-disk encryption technologies that provide an authorised systems administrator access to the data in the event of a forgotten key.

For more information please see the RCSI Data Encryption Policy.

 

Retention periods for research data


Research data must be retained and disposed of securely according to the relevant retention and disposal schedule, in accordance with legal, ethical and research funder requirements, and with particular concern for the confidentiality and security of the data. Research data that underpins published results or is considered to have long-term value should be retained, subject to informed consent to do so, where relevant. The current RCSI REC guideline is that research data should be retained for 5-7 years and then destroyed. However, this retention time could be significantly less or more depending on the nature of the study being conducted. The RCSI Research Data Management Policy states that, in the absence of other provisions, the default period for research data retention is 10 years from the date of last requested access. Retained data must also be deposited in an appropriate, reputable national or international data repository.

However, it is often advisable to retain research data/records for a longer period depending on the nature of the study and the data collected. For example, the Medical Research Council (UK) recommends the following retention schedule for various study designs:

  • For basic research: Research data and related material should be retained for a minimum of 10 years after the study has been completed.
  • For population health and clinical studies: Research data should be retained for 20 years after the study has been completed. 
  • For clinical studies: In some cases, such as for clinical studies involving pregnant participants and those who lack capacity to consent, it has been recommended that a minimum of 25 years may be more appropriate for data retention.

However, longer retention periods for both basic research and population health and clinical studies may be appropriate in some cases. For example: For basic research – Retention periods of 10 years+ may be more appropriate where there is the potential for Intellectual Property to arise (e.g. laboratory notebooks could be retained indefinitely). Similarly, research data relating to studies which directly inform national policymaking should be considered for permanent preservation in an appropriate archive or repository.
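As a rough illustration of how the minimum retention periods quoted above translate into dates, the Python sketch below computes the earliest date on which a minimum period has elapsed. The dictionary keys and the function name `earliest_disposal_date` are hypothetical labels chosen for this example, and the figures are the minimums listed above; your funder, ethics approval, or consent terms may require longer.

```python
from datetime import date

# Minimum retention periods in years, taken from the schedule quoted above.
# The keys are hypothetical labels, not official category names.
RETENTION_YEARS = {
    "basic_research": 10,
    "population_health_or_clinical": 20,
    "clinical_special_cases": 25,
}


def earliest_disposal_date(study_completed: date, study_type: str) -> date:
    """Return the first date on which the minimum retention period has elapsed."""
    target_year = study_completed.year + RETENTION_YEARS[study_type]
    try:
        return study_completed.replace(year=target_year)
    except ValueError:
        # 29 February has no counterpart in a non-leap year: use 1 March.
        return date(target_year, 3, 1)
```

For example, `earliest_disposal_date(date(2020, 6, 30), "basic_research")` gives 30 June 2030. Treat the result as a lower bound, not as an instruction to destroy data on that date.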

 

Backing up your data


The data that you collect, organise, prepare, and analyse underpin all of your research, and backups are an important instrument to ensure that data, and related files, can be restored in the event of loss or damage. Some of the most common causes of data loss, such as hardware failure and human error, can be prevented or minimised through an active data backup policy. 

Files stored on either OneDrive or SharePoint are hosted in the Microsoft cloud and are backed up to an alternative location within the EU (GDPR) region. Microsoft manage this process as part of their service agreement with RCSI.

It is recommended that you:

  • Make three copies of your data.
  • Store the copies on two different media (physical and in the cloud).
  • Keep one backup in a different physical location.

The ideal backup strategy will typically include both an online backup cloud service and an offline backup device (e.g. external hard drives, USB) to ensure your data is secure no matter what happens to your device. If you are working with sensitive research data, backups of the data must be protected against unauthorised access in the same manner as the original files, and you may need to think more carefully about your backup strategy and where it is appropriate to store your data. In general, confidential data should not be copied to or stored on a portable computing device or a non-RCSI owned computing device. You should clearly state in your backup strategy how often backups will be made and who will be responsible for doing so. In many instances, backups of your data can be automatically created using various software and data storage services. You should routinely test your backup solution to ensure you can recover your data if you ever need to restore from a backup. For more information please see the CESSDA Data Management Expert Guide: Backup.
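One simple way to test a backup is to restore it to a folder and confirm that every file is identical to the original. The Python sketch below compares SHA-256 checksums of the files in two folders. The function names `sha256_of` and `verify_backup` and the folder layout are assumptions made for this illustration; this is not an RCSI tool, and managed services such as OneDrive and SharePoint handle their own integrity checks.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(original_dir: Path, backup_dir: Path) -> list[str]:
    """List files that are missing from the backup or differ from the original."""
    problems = []
    for source in sorted(original_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(original_dir)
        copy = backup_dir / relative
        if not copy.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif sha256_of(source) != sha256_of(copy):
            problems.append(f"differs: {relative.as_posix()}")
    return problems
```

An empty list from `verify_backup(Path("project_data"), Path("restored_copy"))` (both paths are placeholders) suggests the restored copy matches the original. The check only looks for files that exist in the original, so extra files in the backup are not reported.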

Protecting your devices


Confidential data at rest on computer systems owned by RCSI and located within controlled spaces and networks are protected by strict access controls that authenticate the identity of those individuals who access the specific system or data. For more information see the RCSI Data Encryption Policy.

Below are some general recommendations to help you secure and protect your devices which may contain research data (these have been adapted from the UCD Device Security Recommendations).

 

  • Update all devices, software, and plug-ins on a regular basis: Check for operating system, software, and plug-in updates often or, if possible, set up automatic updates.
  • Install and regularly update protective software.
  • Control access to your machine: Don't leave your computer in an unsecured area, or unattended and logged on, especially in public places. The physical security of your device is just as important as its technical security.
  • Use secure connections: When connected to the Internet, your data can be vulnerable while in transit. While on the RCSI campus use a wired connection or eduroam for wireless connection. Use remote connectivity and secure file transfer options when off campus.
  • Protect sensitive data: Reduce the risk of identity theft by minimizing the storage of sensitive information on your device. Securely remove sensitive data files from your hard drive or use encryption tools to protect sensitive files you need to retain. 

 

Passwords


A strong password is a key part of ensuring data security, whether you are simply storing your own research files or sending files to collaborators. Access to all RCSI information systems and networks must be controlled via strong password authentication schemes. The RCSI’s password policy is as follows:

  • Passwords must contain a minimum of 8 characters.
  • Those characters must include a combination of any 3 of the 4 below items:
    • English uppercase characters (A through Z)
    • English lowercase characters (a through z)
    • Base-10 digits (0 through 9)
    • Non-alphanumeric (for example, !, $, #, %)
  • User passwords will expire every 90 days and must be reset at that time.
  • Password history is enabled and set to 24, so your previous 24 passwords cannot be reused.
  • Accounts will lock out after 10 failed logon attempts.
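
The length and character-class rules above can be expressed as a short check. The Python sketch below is an illustration only: the function name `meets_password_policy` is hypothetical, and expiry, password history, and lockout are enforced by RCSI systems and are not modelled here.

```python
import string


def meets_password_policy(password: str) -> bool:
    """Check for at least 8 characters and at least 3 of the 4 character classes."""
    if len(password) < 8:
        return False
    classes = [
        any(c in string.ascii_uppercase for c in password),  # A through Z
        any(c in string.ascii_lowercase for c in password),  # a through z
        any(c in string.digits for c in password),           # 0 through 9
        any(not c.isalnum() for c in password),              # e.g. !, $, #, %
    ]
    return sum(classes) >= 3
```

Under this sketch, `meets_password_policy("Abcdefg1")` is `True`, while `meets_password_policy("abcdefgh")` is `False` because only one character class is present. Passing the check shows only that a password fits the stated rules, not that it is hard to guess.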

 

Passphrases are also recommended as they are often easier to remember, but much more difficult to hack. A passphrase is a password made up of (at least) four randomly chosen words. It is as easy to remember as four randomly chosen letters, but it results in very strong passwords. For example, a passphrase could be simple (e.g. apple tower africa elephant) or more complicated to make it compatible with a service that insists on punctuation marks and capitals (e.g. Ap.ple.Tower@fricaElephant). Please see the RCSI System Access Control Policy and the University of Edinburgh's Guide to Choosing a Strong Password for more information and tips.
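
A passphrase is only strong if the words really are chosen at random and not picked by the user. The Python sketch below uses the standard `secrets` module to choose words. The eight-word `WORDS` list and the function names are hypothetical and for illustration only; a real word list should contain thousands of words.

```python
import math
import secrets

# Hypothetical miniature word list, for illustration only.
# A real list should contain thousands of words.
WORDS = ["apple", "tower", "africa", "elephant",
         "river", "candle", "marble", "window"]


def make_passphrase(words: list[str], count: int = 4, separator: str = " ") -> str:
    """Join `count` words, each chosen independently and uniformly at random."""
    if count < 4:
        raise ValueError("use at least four words")
    return separator.join(secrets.choice(words) for _ in range(count))


def entropy_bits(word_list_size: int, count: int = 4) -> float:
    """Estimate passphrase strength in bits: count * log2(list size)."""
    return count * math.log2(word_list_size)
```

`entropy_bits` shows why the size of the word list matters: each doubling of the list adds one bit per word, so four words from a list of several thousand are far harder to guess than four words from the tiny list above.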

 

Encryption


Encryption is simply the process of translating a file into meaningless code. To translate this code back into the original meaningful information, a key (often a password) is required. Recovering information from encrypted files without the key is nearly impossible. The key/password for an encrypted file should never be sent with the encrypted file; an alternative method (such as a phone call or a text) should be used to send the key/password separately. RCSI uses BitLocker Encryption and SMTP TLS (Transport Layer Security) for encrypting confidential and other college sensitive data (please see the RCSI Data Encryption Policy). All RCSI Windows laptops come with BitLocker Encryption. Both devices and individual files can be encrypted.

 Device Encryption:

Device encryption helps to protect information on your device should it go missing or get stolen. If your device is encrypted, the data on it can only be accessed by people who've been authorized (usually through a password). Again, a strong password is required to ensure your encrypted device is truly secure. Device encryption is already available on supported devices running any Windows 10 edition (see Microsoft for further information).

 File Encryption:

File encryption can be used to store sensitive data on portable devices (such as a USB drive), to securely email it, or just to add an additional layer of security onto your existing data management.

 

Transferring data


In order to protect your data and information, all files and documents should be encrypted before transferring them. Researchers should follow RCSI acceptable use policies when transmitting data and must take particular care when transmitting or re-transmitting confidential data received from non-RCSI employees. Transmission of data via RCSI email is automatically encrypted using TLS. SMTP TLS (Transport Layer Security) is the mechanism by which two email servers, when communicating, can automatically negotiate an encrypted channel between them. RCSI has configured mail flow to ensure that TLS is always used for email transmission. For more information please see the RCSI Data Encryption Policy.

 

Additionally, manual encryption of attachments helps to protect your data and information if either the recipient’s or your email account is compromised. The encrypted files cannot be viewed by anyone, including yourself, without the decryption password, which should be sent to the recipient using a different transfer method (e.g. over the phone or via text). 

How to email files securely

  • Save the confidential information to a Microsoft Office document, such as Microsoft Word or Excel.
  • Encrypt the Microsoft document by password protecting it using a strong password.
  • Attach the encrypted document to the email.
  • Send the password for the encrypted document separately, either in person, over the phone or by text.