
 

DATA COLLECTION AND REPOSITORIES: TOPIC SUMMARY


Anyone who has collected data for a research project knows how chaotic the process can be. One day it is questionnaires, the next it is interviews recorded on a laptop, and then it takes another two weeks to work out what a strangely named file actually contains. This is precisely why knowing how to collect data and work with repositories is so important in the MLIS field, particularly with regard to the longevity of data (Borgman, 2015).

I shall begin with the data collection process, which is the process of obtaining data for later analysis. Surveys, interviews, observation, and automated collection are among the methods most widely used in LIS. Each technique has its pros and cons, but whichever one is chosen, the data should be obtained in full observance of ethical principles. Issues such as informed consent, privacy, and anonymisation should always be kept in mind. Careful data collection saves a great deal of trouble at every later stage of a project (Digital Curation Centre, 2020).

But what happens after you collect the data? Where do you put it? If you keep the data on your own computer or on a USB flash drive, it is out of reach of everyone else, and there is a risk of losing it forever if the device fails. This is where data repositories play their part. Data repositories are digital collections that store and preserve research data. Cloud services such as Google Drive and Dropbox are popular choices for storing data, but they lack features that make data repositories more suitable for researchers, such as persistent identifiers, structured metadata, and a commitment to long-term preservation.

Which repositories should be considered? Good options include Zenodo, an online service hosted by CERN that accepts all types of research outputs; Figshare, with its intuitive web interface; ICPSR, which specialises in social science data; and Dryad, which is dedicated to curating life and environmental sciences data. When selecting a repository, it is essential to look at its governance, metadata quality, access policies, and certifications such as CoreTrustSeal (Digital Curation Centre, 2020).

Finally, it would be remiss not to address the ethical and legal side of data sharing. Not all data can be shared in the same way. Some datasets contain sensitive material, so anonymisation becomes very important. Proper consent is also required from participants before any data is deposited where the public can access it. The FAIR principles (Findable, Accessible, Interoperable, Reusable) provide guidance on how to manage your data responsibly (Wilkinson et al., 2016).
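To make the anonymisation step a little more concrete, here is a minimal Python sketch of one common preparation step before deposit: dropping direct identifiers and replacing a participant identifier with a keyed-hash code. Strictly speaking this is pseudonymisation rather than full anonymisation, since anyone holding the key could regenerate the codes, and indirect identifiers may still need review. The field names, the function names, and the key are all hypothetical and chosen only for illustration.

```python
import hashlib
import hmac


def pseudonymise(participant_id: str, secret_key: bytes) -> str:
    """Return a stable code for a participant using a keyed hash (HMAC-SHA256).

    The same ID and key always give the same code, so records can still be
    linked, but the original ID cannot be read from the code.
    """
    digest = hmac.new(secret_key, participant_id.encode("utf-8"), hashlib.sha256)
    return "P-" + digest.hexdigest()[:12]


def strip_direct_identifiers(record: dict, id_field: str,
                             drop_fields: set, secret_key: bytes) -> dict:
    """Copy a record without its direct identifiers and add a participant code."""
    # Keep every field except the identifier itself and the fields to drop.
    cleaned = {
        key: value
        for key, value in record.items()
        if key != id_field and key not in drop_fields
    }
    cleaned["participant_code"] = pseudonymise(str(record[id_field]), secret_key)
    return cleaned


if __name__ == "__main__":
    # Hypothetical survey response; the key would be stored separately and
    # never deposited alongside the data.
    key = b"keep-this-key-out-of-the-repository"
    response = {
        "name": "Jane Doe",
        "email": "jane@example.org",
        "age_group": "25-34",
        "answer": "Agree",
    }
    print(strip_direct_identifiers(response, "email", {"name"}, key))
```

The key is kept apart from the dataset on purpose: without it, the codes cannot be regenerated from a list of known e-mail addresses. Whether this level of protection is sufficient depends on the consent given and on the repository's access policy.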

In conclusion, sound data collection practices combined with the use of repositories help keep your data discoverable and useful for many years to come. It is critical that MLIS students learn these practices.

 

References

Borgman, C. L. (2015). Big data, little data, no data: Scholarship in the networked world. MIT Press.

Digital Curation Centre. (2020). How to select a data repository. DCC. https://www.dcc.ac.uk/guidance/how-guides/select-data-repository

Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J. W., da Silva Santos, L. B., Bourne, P. E., Bouwman, J., Brookes, A. J., Clark, T., Crosas, M., Dillo, I., Dumon, O., Edmunds, S., Evelo, C. T., Finkers, R., … Mons, B. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3(1), Article 160018. https://doi.org/10.1038/sdata.2016.18

Zenodo. (2025). Zenodo user guide. CERN. https://about.zenodo.org

 

Picture: An institutional repository, depicting the process of data collection and digital curation

