Should you require any further information, please contact the IT Helpdesk.
When developing a data management plan, researchers must address storage and backup during the research process, which broadly encompasses two main questions:
How will data and metadata be stored and backed up during the research process?
How will data security and protection of sensitive data be taken care of during the research?
These questions relate to the 'active research' stage of the project, when data collection, analysis and write-up of results are underway. What happens to the data after the research concludes is covered in the later section "data sharing and long-term preservation".
The following advice is for projects not categorised as ‘big data’ and not requiring high performance computing capabilities.
Microsoft Teams, SharePoint Online and OneDrive are all provided by RCSI as 'out of the box' products. OneDrive provides every account holder with 1TB of storage. It is a good option if you are working solo and want to keep your files safe, or if you want to share documents you have created with other team members. SharePoint is a good option for teamwork, and is well suited to storing and organising research data files that need to be accessible to all members of the team. Some people use the SharePoint Online interface to view all of the files and folders belonging to their research project team, while others are more comfortable using the Teams interface to do so; both options work well.
OneDrive is Microsoft's cloud-based storage solution, similar to Google Drive and Dropbox, allowing you to store all your personal files securely in one place. Every staff member at RCSI gets 1TB of storage on MS OneDrive (provided via the institutional MS Office 365 plan). Because it is cloud-based, you can access your files from anywhere using a web browser (Chrome, Edge, etc.) or a mobile device. OneDrive is appropriate for personal work files, for example versions that are not ready to share with others, as only the owner has the access rights to view, edit, or share the content. For more information please refer to the RCSI IT Guide on OneDrive.
This article from Cloudwards.net clearly explains the differences between and shared features of Microsoft's OneDrive and SharePoint products, helping you to decide which option is most appropriate for your data storage needs.
SharePoint is a cloud storage solution that allows you to share documents with your team and manage your research project. All research data and documentation associated with the project, including research tools, signed consent forms, and other project documentation, should be stored within the project’s unique SharePoint Online (SPO) site. SharePoint replaces the previous RCSI V: drive system. (Please note: when applying for ethical approval for research with human subjects, you are required to upload templates of the consent form, participant information leaflet, study protocol, and questionnaire/survey if relevant, along with the Data Protection Impact Statement, to RIMs, but you are not asked to submit signed consent forms.)
If you are commencing a new research project, the Principal Investigator (PI) of that project should request a SharePoint Online (SPO) site for the project by completing an Azure storage drive request form. If you have team members who are external to RCSI (external guests) and will need access to the SPO site, you should indicate this in the application form. More information on SharePoint at RCSI is provided in the following IT Guide: Learn SharePoint Online (SPO) \ MS Teams collaboration
The owner of the SPO site has full control privileges to their site - only the SPO site owner can add users to the SharePoint group and assign user permissions. Please note, SharePoint is used in RCSI primarily for internal collaboration to avoid any potential data leakage. As such, when an SPO site is created, the external sharing feature may be disabled as a security measure. If you need to enable the external sharing feature, you should contact RCSI IT support to request an SPO site that has this feature enabled. Additional storage space on SharePoint can be purchased by the project PI using grant funds - please contact Research IT to arrange this.
Find more information on the Research IT Service at RCSI here.
Both OneDrive and SharePoint are hosted in the Microsoft cloud, and the geographical location of the stored data is restricted to the European Economic Area. They can therefore be used in a GDPR-compliant way. Alternative file storage systems that are not provided via RCSI, such as Dropbox and Google Drive, may not restrict the geographical location of file storage and therefore may not be GDPR compliant. These alternative options must not be used for any data that is subject to GDPR rules, or for data provided to you by a third party with similar restrictions or user conditions.
For more information please see the CESSDA Data Management Expert Guide: Backup.
The following RCSI Research IT resources are suitable for processing large datasets that require compute resources that exceed what is possible with a desktop computer or a single server e.g. ‘omics research.
If you require any of these resources, please contact the Systems Administrator (Research IT Service) (research-it@rcsi.com) to discuss options for your research study.
A visual overview of the Research IT infrastructure is available here.
ICHEC Kay is Ireland's national supercomputer for academic researchers and is based in the Irish Centre for High-End Computing (ICHEC) at NUI Galway. RCSI has contracted capacity on ICHEC Kay for computational work that uses large-scale anonymised data. The cluster can provide up to 1600 CPUs for a single job. RCSI has purchased a finite 'share' of resources (disk storage, CPU time and active user accounts).
Additional detail is available at the RCSI Research IT Service site.
At RCSI, clinical research depends upon the secure storage and processing of personal data, both to deliver precision medicine and in conjunction with clinical trials. Personal data is subject to GDPR rules and cannot be processed on most external services. Researchers working with personal data should use one of the following storage and processing options:
Isilon storage (PowerScale) is a substantial, centrally funded storage system that is highly scalable (i.e. more storage nodes can be added to handle an increased workload). It provides a single central location for active research data which is highly resilient compared to point solutions. Isilon storage is sufficiently secure and auditable to support the storage of Personal Data. It is integrated with Active Directory, so access to data is controlled through Active Directory and it is easy to audit who has access to any data. Data which the Data Management Plan mandates to be retained beyond the lifetime of the funded project, and for which there is no suitable public repository, can be archived to storage within Microsoft Azure while remaining manageable from the Isilon system.
Local Compute Cluster: This is a small Slurm cluster onsite at RCSI (Dublin) and it works in a similar way to ICHEC Kay but at a much smaller scale. The Local Compute Cluster is physically secure with encrypted disks. It is CIS (Center for Internet Security) hardened, which provides additional security to the Linux operating system. The Local Compute Cluster is also integrated with Active Directory, so identity is assured. Because of these and other security measures, it can be used to process personal data. The Systems Administrator (Research IT Service) (research-it@rcsi.com) can install required software on the cluster on request.
External access: There is a controlled capability to make specific data on the storage system accessible to external collaborators; however, they have to be registered with RCSI in order to maintain audit trails and for technical reasons related to access control.
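The Local Compute Cluster described above is a Slurm cluster, so work is normally submitted to it as a batch script. As a rough illustration only, the Python sketch below renders a minimal Slurm batch script. The function name, job name, command and resource values are hypothetical, and RCSI-specific settings (partitions, accounts, installed software) are not covered here; these should be confirmed with the Research IT Service.

```python
def render_sbatch_script(job_name, command, ntasks=1, cpus_per_task=1,
                         time_limit="01:00:00"):
    """Return the text of a minimal Slurm batch script.

    All values are illustrative assumptions, not RCSI defaults.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --ntasks={ntasks}",
        f"#SBATCH --cpus-per-task={cpus_per_task}",
        f"#SBATCH --time={time_limit}",        # wall-clock limit, HH:MM:SS
        "#SBATCH --output=%x-%j.out",          # %x = job name, %j = job ID
        "",
        command,                               # the work the job should run
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical job: run an analysis script using 4 CPU cores.
    print(render_sbatch_script("demo-analysis", "python analyse.py",
                               cpus_per_task=4))
```

The printed text could be saved to a file and submitted with `sbatch`. Keeping resource requests small and explicit is a reasonable starting point on a shared cluster with finite capacity.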
The RCSI Research IT Request Form can be used to request access to RCSI Research IT resources, or changes to resources as follows:
- Remote High-Performance Compute at ICHEC
- Local Compute
- Networked Storage
- Long-term archival in Azure Cloud
Confidential data at rest on computer systems owned by RCSI and located within controlled spaces and networks are protected by strict access controls that authenticate the identity of those individuals who access the specific system or data. For more information see the RCSI Data Encryption Policy. Below are some general recommendations to help you secure and protect your devices which may contain research data (these have been adapted from the UCD Device Security Recommendations).
Confidential data should not be copied to or stored on a portable computing device or a non-RCSI owned computing device. However, in situations that require confidential data to be stored on such devices, data owners and device users must acknowledge how they will ensure that data is encrypted and how encrypted data will be accessible by the owner in the event that an encryption key becomes lost or forgotten. Methods to meet this requirement include:
For more information please see the RCSI Data Encryption Policy.
Device Encryption:
Device encryption helps to protect information on your device should it go missing or get stolen. If your device is encrypted, the data on it can only be accessed by people who've been authorized (usually through a password). Again, a strong password is required to ensure your encrypted device is truly secure. Device encryption is already available on supported devices running any Windows 10 edition (see Microsoft for further information).
File Encryption:
File encryption can be used to store sensitive data on portable devices (such as a USB drive), to securely email it, or just to add an additional layer of security onto your existing data management.
A strong password is a key part of ensuring data security, whether you are simply storing your own research files or sending files to collaborators. Access to all RCSI information systems and networks must be controlled via strong password authentication schemes. The RCSI’s password policy is as follows:
Passphrases are also recommended as they are often easier to remember but much more difficult to hack. A passphrase is a password made up of (at least) four randomly chosen words. It is as easy to remember as four randomly chosen letters, but it results in very strong passwords. For example, a passphrase could be simple (e.g. apple tower africa elephant) or more complicated to make it compatible with a service that insists on punctuation marks and capitals (e.g. Ap.ple.Tower@fricaElephant). Please see the RCSI System Access Control Policy and the University of Edinburgh's Guide to Choosing a Strong Password for more information and tips.
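The passphrase approach described above can be sketched in a few lines of Python. This is a minimal illustration, not an RCSI tool: the word list is a tiny hypothetical sample (a real list should contain thousands of words), and the function names are made up for this example. The words are chosen with Python's `secrets` module, which is intended for security-sensitive randomness.

```python
import math
import secrets

# Hypothetical, tiny word list for illustration only; a real list should
# contain thousands of words so that each word adds meaningful strength.
WORDS = ["apple", "tower", "africa", "elephant",
         "river", "candle", "marble", "violin"]


def make_passphrase(words, count=4, separator=" "):
    """Pick `count` words independently and uniformly at random."""
    if count < 4:
        raise ValueError("use at least four words")
    return separator.join(secrets.choice(words) for _ in range(count))


def entropy_bits(list_size, count=4):
    """Bits of entropy when each word is drawn uniformly from the list."""
    return count * math.log2(list_size)


if __name__ == "__main__":
    print(make_passphrase(WORDS))
    # Strength grows with list size: four words from a 7,776-word list
    # give about 51.7 bits of entropy.
    print(round(entropy_bits(7776), 1))
```

The strength comes from the words being chosen randomly rather than by the user, which is why the sketch uses a random source instead of asking for memorable words.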
In order to protect your data and information, all files and documents should be encrypted before transferring them. Researchers should follow RCSI acceptable use policies when transmitting data and must take particular care when transmitting or re-transmitting confidential data received from non-RCSI employees. Transmission of data via RCSI email is automatically encrypted using TLS. SMTP TLS (Transport Layer Security) is the mechanism by which two email servers, when communicating, can automatically negotiate an encrypted channel between them. RCSI has configured mail flow to ensure that TLS is always used for email transmission. For more information please see the RCSI Data Encryption Policy.
Additionally, manual encryption of attachments helps to protect your data and information if either the recipient’s or your own email account is compromised. The encrypted files cannot be viewed by anyone, including yourself, without the decryption password, which should be sent to the recipient using a different transfer method (e.g. over the phone or via text).
How to email files securely
However, it is often advisable to retain research data/records for a longer period depending on the nature of the study and the data collected. For example, the Medical Research Council (UK) recommends the following retention schedule for various study designs:
However, longer retention periods for both basic research and population health and clinical studies may be appropriate in some cases. For example: For basic research – Retention periods of 10 years+ may be more appropriate where there is the potential for Intellectual Property to arise (e.g. laboratory notebooks could be retained indefinitely). Similarly, research data relating to studies which directly inform national policymaking should be considered for permanent preservation in an appropriate archive or repository.