Registry of Research Data Repositories logo

Re3data launched at the tail end of 2012 with the goal of registering all research data repositories. These research data repositories are collections of datasets usually associated with a particular discipline or a particular geographic region. Because data repositories have cropped up on an as-needed basis over the past 50 years, they are myriad, and it takes specialized knowledge to navigate the options in any academic field.

Research data represents the lion’s share of effort for universities. The value of research data within universities is without peer; however, this data is often vulnerable to loss due to poor preservation practices. Data repositories provide long-term storage and potentially enable access to datasets, while also promoting reproducibility of research. Although this storage and access provide a clear benefit to the researcher, the funding agencies who support research can be the stimulus for researchers to use a data repository. For example, the National Science Foundation requires dissemination and sharing of research results:

Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Grantees are expected to encourage and facilitate such sharing.

Dissemination and Sharing of Research Results – National Science Foundation

Certain publishers also stipulate the use of data repositories, such as this example for Scientific Data, a Nature Publishing Group journal:

Scientific Data mandates the release of datasets accompanying our Data Descriptors, but we do not ourselves host data. Instead, we ask authors to submit datasets to an appropriate public data repository. Data should be submitted to discipline-specific, community-recognized repositories where possible, or to generalist repositories if no suitable community resource is available.

Recommended Data Repositories – Nature

For librarians, the benefits of data repositories are fairly clear. Repositories manage, organize, preserve, enable discovery of, and usually provide a persistent identifier for data. Re3data allows librarians to point researchers in the right direction regarding repositories. Re3data provides a basic search feature equipped with 27 facets to narrow or refine a search. Each repository record is tagged with icons to let users know if the repository provides additional information about its service, if it is open, restricted, or closed access, and what persistent identifier is used (e.g., DOI, URN, ARK, handle, PURL, or other).

View of Re3data’s search interface

Users can also browse by country, subject, or content type (e.g., raw data, audiovisual data, or source code, to name a few). The subject browse function is particularly attractive:

View of Re3data’s browse wheel

Users can select a discipline, and the wheel will rotate to narrow the search accordingly.

Also, librarians who play a role in their own institutional repository can suggest that it be included in Re3data. Data repositories considered for inclusion must be run by a legal entity, clarify access conditions, and have a focus on research data (see:

Samuel R. Putnam, Assistant University Librarian, University of Florida

We welcome your comments and suggestions. If you have a resource that you would like to see highlighted, please leave us a comment.


Look to the Stars: History of Astronomy Collections from Adler Planetarium

Photo Credit: Emily Gorman

It’s been a few months now since the ALA Annual Meeting in Chicago, and I am still thinking about the Adler Planetarium! A group of librarians from the Science & Technology Section were lucky enough to get a tour of the Webster Institute for the History of Astronomy, which manages the Adler’s collections. The collections include rare books, historic photographs and scientific instruments, and much more. It was amazing to see some of these beautiful and fascinating materials up close.

Continue reading

STEM Preprint Repositories: Where Are They Now?

In light of the one-year anniversary of engrXiv, and the recent creation of AgriXiv and PsyArXiv, we wanted to highlight the availability of preprint repositories for STEM disciplines. Preprint services provide free, open access to research articles. The goals of these sites include disseminating “knowledge quickly and efficiently” (1), “providing a free, open access outlet for new findings” (2), and making “research outputs … immediately available to all the stakeholders for understanding and finding suitable solutions” (3).

Researchers often add their preprints to these repositories, so they are called preprint servers, but postprints and published versions might be included as well. Awareness of the open access movement is spreading, and more researchers have a desire to make their research articles open.  Reasons for publishing work open access include funding agency mandates to make research results publicly available, and researcher desire to make their work accessible for increased visibility and public good.  We wanted to highlight STEM disciplinary repositories so science librarians can help patrons both find and share open access scientific research.

arXiv was founded in 1991 as an electronic archive for research articles from physics, math, computer science, nonlinear science, quantitative biology, quantitative finance, and statistics.  arXiv is operated by Cornell University Library, and there are over 1 million submissions to arXiv.  

Life science researchers can use bioRxiv, a free online archive and distribution service for unpublished preprints. It is operated by Cold Spring Harbor Laboratory, and was launched in 2013.  It has about 13,500 content items.  

AgriXiv, engrXiv, and PsyArXiv were all founded within the last year or so. All three use Open Science Framework’s preprint service. AgriXiv, a preprint service for agriculture and allied sciences, was founded earlier in 2017 and does not have a lot of content yet. AgriXiv stresses the “importance of agricultural research to meet the demands for food production and … livelihood promotion” and the “growing need for dedicated research sharing and dissemination” to “facilitate the sharing of interim research for public good” (3). Learn more at the AgriXiv blog. engrXiv was founded in 2016 and is dedicated to the “dissemination of engineering knowledge quickly and efficiently” (1). In addition to the preprint server, engrXiv has a blog to share news related to the site. Currently, there are about 130 posts on engrXiv. PsyArXiv, an open-access preprint service for psychological sciences, was founded late in 2016 but already has a large amount of content, about 700 posts. You can learn more at the PsyArXiv blog.

ChemRxiv is an open preprint server for chemistry, still under development by the American Chemical Society (ACS).  ChemRxiv is intended to be a collaborative undertaking to facilitate the discoverability of scientific research.  Interested users can sign up for alerts to get news and updates about ChemRxiv.  

Finally, a note about the Center for Open Science and the Open Science Framework, as these may be helpful open access resources for science librarians and their patrons.  The Center for Open Science is a nonprofit company that aims “to increase openness, integrity, and reproducibility of research” (4).  Open Science Framework is their free and open source tool for research project management across the entire research lifecycle. Researchers can collaborate with their groups, make their projects accessible, and store and archive research data, protocols, and materials.


  1. About engrXiv. (2016, July). Retrieved August 12, 2017, from
  2. Introducing PsyArXiv: Psychology’s dedicated open access digital archive. (2016, December). Retrieved August 12, 2017, from
  3. AgriXiv. (2017, February). Retrieved August 12, 2017, from
  4. Brian Nosek. (n.d.). A Brief History of COS. Retrieved August 12, 2017, from


Emily Gari, Science & Engineering Librarian, University of Colorado Boulder

The Encyclopedia of Life is 10 years old!

The Encyclopedia of Life is 10 years old! It is freely available on the web. From their statistics, as of May 11, 2017, they have 5.5 million pages. Responsibilities are shared by interested groups and individuals. “The founding partners of the project include the Field Museum of Natural History, Harvard University, the Marine Biological Laboratory, the Smithsonian Institution, and the Biodiversity Heritage Library. The Missouri Botanical Garden later joined, and negotiations are ongoing with the Atlas of Living Australia. Other partners are the American Museum of Natural History (New York), Natural History Museum (London), New York Botanical Garden, and the Royal Botanic Gardens (Kew).”

Continue reading


The Surgeon General’s Office in the United States Army started an index of all holdings in its library in 1880. The various volumes were printed until 1961. Because the Army Medical Library became the largest medical library in the world in the late 1890s, the Index-Catalogue of the Library of the Surgeon-General’s Office, 1880-1961, can be considered an almost complete compilation of the medical literature. Continue reading

NCBI Bioinformatics Tools: Protein, BLAST, COBALT, and Cn3D Structure Viewer

This tutorial is a step-by-step guide to searching for motifs in the SET domain, which I have taught to epigenetics students.

“For example, a protein called Clr4 from S. pombe contains the SET domain. How could you find mammalian homologous of Clr4? Let’s assume that you find 8 proteins in human database containing SET domain. How close are they? Can we draw a tree out of it? Can we align all these protein sequences together and compare their similarity, and find the most conserved motif (like GXGNA) shared with all these proteins? If I would like to know where this motif located in 3D structure, can we look at it on the published protein structure database?”
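The last part of that question, locating a conserved motif such as GXGNA across several protein sequences, can be illustrated with a few lines of Python. This is only a minimal sketch and not the NCBI workflow the tutorial walks through: it assumes X stands for any single residue, and the sequences, their names, and the helper functions `motif_to_regex` and `find_motif` are all invented for illustration.

```python
import re

# Assumption: in the motif notation, X is a wildcard for any single residue.
MOTIF = "GXGNA"

# Hypothetical toy sequences, NOT real Clr4 or human SET-domain proteins.
TOY_SEQUENCES = {
    "seq_a": "MKTLGWGNAYQR",
    "seq_b": "AAGFGNACCGLGNAT",
    "seq_c": "MSTNPKPQRK",
}


def motif_to_regex(motif: str) -> re.Pattern:
    """Compile a motif such as GXGNA into a regex, mapping X to '.'."""
    return re.compile("".join("." if ch == "X" else re.escape(ch) for ch in motif))


def find_motif(sequences: dict, motif: str = MOTIF) -> dict:
    """Return the 1-based start positions of non-overlapping motif matches."""
    pattern = motif_to_regex(motif)
    return {
        name: [match.start() + 1 for match in pattern.finditer(seq)]
        for name, seq in sequences.items()
    }


if __name__ == "__main__":
    for name, positions in find_motif(TOY_SEQUENCES).items():
        print(name, positions)
```

A plain pattern search like this only reports where the motif occurs in each sequence. Judging how similar the proteins are, or drawing a tree, still requires a proper alignment with tools such as BLAST and COBALT.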

Continue reading

Data in the Time of Cholerics: Where to Find Preserved Federal Data

During the recent change in federal government, researchers and librarians were concerned about loss of access to federal data, particularly in the area of environmental science where the new administration’s policies appeared to contradict scientific consensus. Early indications suggested that federal datasets and scientific information would be removed from the web entirely, or at least restricted in access.

In response to these concerns, a number of academic institutions and other organizations began to organize data preservation efforts to ensure continued public access to endangered datasets. While websites of specific federal agencies continue to serve as the primary repositories of public data, this post focuses on a few public websites that aggregate and preserve federal datasets and provides a brief description of each. Continue reading

Alleviating the high cost of science textbooks with Open Educational Resources

OER Global Logo by Jonathas Mello is licensed under a Creative Commons Attribution Unported 3.0 License

Academic institutions are searching for ways to alleviate the financial burden that the increasing cost of textbooks places on their students. The average student spends $1200 annually for books and supplies, according to the Open Textbook Network.  Science textbooks are especially expensive, but OERs, or Open Educational Resources, are gaining acceptance as alternatives to traditional textbooks (Open Textbook Network, 2017).

Continue reading

Apps: What are engineering students using?

A survey conducted by the Pew Research Center in the fall of 2016 reports that 80% of US adults with some college and 89% of US college graduates own a smartphone (Pew Research Center, January 11, 2017). This is not surprising, as the smartphone is the go-to device for recent news, connecting with friends and family, and learning new things. We know that smartphones are popular with the college population, but how are college students, and engineering students in particular, using them in support of their studies?

Continue reading



MathSciNet is the premier index and reviews database for mathematics and related literature. It contains and continues from Mathematical Reviews (print), published from 1940 to 2012.  Produced by the American Mathematical Society, MathSciNet is international in scope, covering over 1800 current journals with more than 3.3 million publications ranging from 1810 to the current day.  Over 20,000 expert reviewers produce more than 80,000 reviews each year for MathSciNet.  These add tremendous value to the more than 100,000 new citations added annually.

Continue reading