Ownership is key: It is important for researchers to understand the relevant ownership rules for any data that they collect or use. From an ethical standpoint, researchers should consider the implications of data ownership agreements before they are made with other researchers, institutions, or funding agencies.
Typically, when research is funded by federal or nonprofit granting agencies, the data are owned by the institution receiving the grant. The primary researcher or scholar receiving the grant has the responsibility for storage and maintenance of the data, including the protection of confidential or sensitive information.
Data obtained through research supported by private or corporate funding, however, may have different guidelines for ownership and restrictions on sharing. This issue is further complicated when organizations such as universities patent data sets.
Confidential / Sensitive and Proprietary Data
Scholars and researchers have a moral and professional responsibility to ensure that confidential or sensitive data are stored and released in a way that protects research participants. For example, the “Privacy Rule” of the federal Health Insurance Portability and Accountability Act (HIPAA) addresses confidentiality for research data that come from health care records; HIPAA calls for specifications of data handling responsibilities and privileges.
Data that include confidential or sensitive information can still be shared, if properly cleaned, by using one or more of the following approaches:
- withhold part of the data
- statistically alter the data in ways that will not compromise secondary analyses
- require researchers who seek data to commit to protect privacy and confidentiality
- provide data access in a controlled site, sometimes referred to as a data enclave.
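
As an illustration, the first two items in the list above (withholding part of the data and statistically altering it) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a vetted disclosure-control method: the function name `prepare_for_sharing`, the field names, and the choice of zero-mean Gaussian noise are all hypothetical, and whether a given alteration leaves secondary analyses intact has to be judged for each data set.

```python
import random

def prepare_for_sharing(records, withhold, perturb, noise_sd=1.0, seed=None):
    """Return a shareable copy of `records` (a list of dicts).

    Fields named in `withhold` are dropped entirely; numeric fields named
    in `perturb` get zero-mean Gaussian noise added. The original records
    are left unchanged. (Hypothetical helper, for illustration only.)
    """
    rng = random.Random(seed)  # seeded so the alteration is reproducible
    shared = []
    for record in records:
        # Withhold part of the data: copy everything except the listed fields.
        row = {key: value for key, value in record.items() if key not in withhold}
        # Statistically alter the data: add noise to the listed numeric fields.
        for field in perturb:
            if field in row:
                row[field] = row[field] + rng.gauss(0.0, noise_sd)
        shared.append(row)
    return shared

# Made-up example records; the names and values are placeholders.
records = [
    {"name": "A. Example", "age": 34, "score": 71.5},
    {"name": "B. Example", "age": 58, "score": 64.0},
]
shared = prepare_for_sharing(
    records, withhold={"name"}, perturb={"age"}, noise_sd=2.0, seed=0
)
```

Here `shared` no longer contains the `name` field and carries slightly altered ages, while `score` is passed through untouched. Dropping a name column alone rarely makes data safe to release; the remaining items in the list (access agreements and data enclaves) exist for cases where altering the data is not enough.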
Sustainability and Data Formats
Data must be archived in a controlled, secure environment in a way that safeguards the primary data, observations, or recordings. The archive must be accessible to scholars analyzing the data, and available to collaborators or others who have rights of access. Primary research data should be stored securely for a sufficient time following publication, analysis, or termination of the project. The number of years that data should be retained varies from field to field and may depend on the nature of the data and the research.
Sustainable data management is crucial to the value of research and to ensuring continued scholarship. Typically, in data storage, there is an access copy, for use, and an archival copy, essentially for preservation and back-up purposes. The importance of backing up data cannot be overemphasized, because natural disasters and breakdowns in systems and software cannot be predicted. Back up your data early and often!
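
The access-copy and archival-copy arrangement described above can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions: the helper names `sha256_of` and `make_archival_copy` and the file name `survey.csv` are hypothetical, and the demonstration writes both copies into one temporary directory, whereas a real archival copy would normally live on separate storage.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def make_archival_copy(access_path, archive_dir):
    """Copy the access file into archive_dir and verify the copy.

    Raises OSError if the checksums of the two copies differ.
    (Hypothetical helper, for illustration only.)
    """
    access_path = Path(access_path)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    archive_path = archive_dir / access_path.name
    shutil.copy2(access_path, archive_path)  # copy2 also keeps timestamps
    if sha256_of(access_path) != sha256_of(archive_path):
        raise OSError(f"archival copy of {access_path.name} failed verification")
    return archive_path

# Demonstration with a throwaway file; real data would use real locations.
with tempfile.TemporaryDirectory() as tmp:
    access = Path(tmp) / "survey.csv"
    access.write_text("id,score\n1,71.5\n")
    archived = make_archival_copy(access, Path(tmp) / "archive")
    copies_match = sha256_of(access) == sha256_of(archived)
```

Recording the checksum alongside the archival copy also makes it possible to re-check the file later and detect silent corruption before the access copy is lost.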
The choice of data formats and software often comes down to the preference of the researcher, but it can also be dictated by discipline-specific standards and customs. Ensuring the long-term usability and sustainability of data requires attention to standard and interchangeable software; the UK Data Archive also publishes a list of Preferred Formats for data creation and preservation.
For more information about selecting data formats and software with respect to sustainability, see "Sustainable Data Formats" (University of Wisconsin-Madison).