Free (and sometimes overlooked) Chemistry Resources

Often when I teach students about our chemistry databases, I will also mention some free chemistry resources, because once they graduate and begin working, they may no longer have access to commercial databases such as Reaxys or SciFinder. In addition to our subscribed chemistry databases, there are many quality chemistry resources that are freely available. Following is a sample of some of these, but the list is by no means exhaustive. If you have any favorites that I haven’t mentioned here, please be sure to recommend them in the comments.

Resources for Properties Data

NIST Chemistry WebBook The WebBook provides chemical and physical properties data for atoms, molecules, ions, and other chemical species. You can search directly for a particular substance, or indirectly through related properties data. According to its Welcome page, the WebBook provides “Thermochemical data for over 7000 organic and small inorganic compounds…reaction thermochemistry data for over 8000 reactions…free energy of reaction, IR spectra for over 16,000 compounds, mass spectra for over 33,000 compounds, UV/Vis spectra for over 1600 compounds, and gas chromatography data for over 27,000 compounds.”

PubChem This open database maintained by the National Institutes of Health contains information on “chemical structures, identifiers, chemical and physical properties, biological activities, patents, health, safety, toxicity data, and many others”. It was covered in a previous Inside Science Resources post; for more information, please see “Navigating PubChem”.

Tables of Physical and Chemical Constants Formerly known as Kaye & Laby and now provided by the UK National Physical Laboratory, this resource is available online for free. Thermodynamic, electrical, mechanical, and acoustical data are just some of the types of data covered. All charts, tables, formulas, and graphs are included (Currano & Roth, 2014, p. 162).

Thermodex This was one of the first resources I was introduced to as a new chemistry librarian. It was compiled by the University of Texas at Austin Library, and it contains links to books and handbooks in their collection. It is a finding aid for thermodynamic and physical properties data, mostly in print. It is a good starting place for suggestions for properties data resources, but you will need to have these resources in your own collection.

More Spectral Data

Know-it-all free software for spectral analysis KnowItAll from Bio-Rad offers academic users free software to “…draw structures, perform IR and Raman functional group analysis, and generate high-quality reports.” It is available for download from the above link.

Spectral Database for Organic Compounds (SDBS) A database compiled by the National Institute of Advanced Industrial Science and Technology (AIST), in Japan. Users are requested not to download more than 50 spectra or individual compound information in a single day. Spectra can be found by direct search for compound name, molecular formula or weight, or CAS registry number. IR, NMR, MS, Raman, and ESR spectra are available. Users may also input spectral data to determine an unknown substance.

Resources for Hazards, Materials Safety, and Toxicology

Material Safety Data Sheets

Where to Find Material Safety Data Sheets on the Internet is a finding resource that is currently maintained. It lists the number of MSDS available at each site.

Sigma Aldrich Safety Data Sheets is a resource familiar to many that allows searching by product number. There are also a web toolbox, structure searching, and “Ask a Chemist” (scroll down to the icons in the footer).

Toxicology Resources

TOXNET is the TOXicology Data NETwork. This resource by the National Library of Medicine may be familiar to you, but users may not realize that it is actually a collection of individually searchable databases “covering chemicals and drugs, diseases and the environment, environmental health, occupational safety and health, poisoning, risk assessment and regulations, and toxicology.” A sampling of these databases includes:

PAN Pesticide Database is maintained by the Pesticide Action Network (PAN) of North America. It is searchable by chemicals, products, poisoning diagnostic information, and chemicals responsible for aquatic ecotoxicity.

More Free Chemistry Databases

64 Free Chemistry Databases This blog post by Rich Apodaca, PhD, was published in 2011, and 58 of the 64 database links are still active (although one resource is no longer free). It links to specialized databases, some of them mentioned here, which contain physical and spectral data, biological activity, drugs, pesticides, and biochemistry, to name just a few of the kinds of information covered.


Apodaca, R.L. (2011, October 12). 64 Free Chemistry Databases. [Blog post]. Retrieved from

Biorad. (2018). KnowItAll Academic Edition – Free Chemistry Software. Retrieved from

Currano, J., & Roth, D. (Eds.). (2014). Chemical information for chemists: a primer. Royal Society of Chemistry.

Interactive Learning Paradigms Incorporated. (2018). Where to find MSDS and SDS on the Internet. Retrieved from

Kegley, S.E., Hill, B.R., Orme S., & Choi A.H. (2016). PAN Pesticide Database, Pesticide Action Network North America.

Kim S., Thiessen P.A., Bolton E.E., Chen J., Fu G., Gindulyte A., Han L., He J., He S., Shoemaker B.A., Wang J., Yu B., Zhang J., & Bryant S.H. (2016). PubChem Substance and Compound databases. Nucleic Acids Research, 44(D1), D1202–D1213.

Lindstrom, P.J. & Mallard, W.G. (Eds.). (2017). NIST Chemistry WebBook, NIST Standard Reference Database Number 69. Gaithersburg, MD: National Institute of Standards and Technology. doi:10.18434/T4D303

National Institute of Advanced Industrial Science and Technology. (2018). SDBSWeb. Retrieved from

Sigma Aldrich. (2018). SDS Search and Product Safety Center. Retrieved from

University of Texas Libraries. (2018). Thermodex. Retrieved from

U.S. National Library of Medicine. (2018). TOXNET. Retrieved from


Laura Palumbo, Chemistry & Physics Librarian/Science Data Specialist, Rutgers University-New Brunswick

We welcome your comments and suggestions. If you have a resource that you would like to see highlighted please leave us a comment.


The Open Access Directory


Need some good examples of addenda that authors can use to make their work open access? Is your institution considering providing an open access publishing fund and you want to get an overview of what others are offering? Need some suggestions for researchers looking for advice on where to make their documents or research data open?

The Open Access Directory launched in 2008 and is a collection of lists and information related to open access to science and scholarship. Content is overseen by an editorial board of prominent members of the open access community.

While not a science- or technology-specific resource, it contains useful information for members of the science and technology library community. The table of contents of the directory contains 46 categories spanning the open access landscape. Here I will highlight four of them.

Data Repositories: a list of repositories and databases containing open data. The list is organized by discipline, and many of the disciplines are specific to science and technology (for example, Astronomy, Biology, Chemistry, Computer Science, Energy, Environmental Science, Geology, Geosciences, Marine Science, Medicine, and Physics).

Disciplinary Repositories: a list of discipline-specific open repositories containing primarily texts instead of data. This list covers 45 disciplines, many of which are science and technology focused.

Open Access Publishing Funds: a list of funds given by various institutions in support of authors publishing in open access journals, books and other types of publications.

Author Addenda: a list of addenda to copyright transfer agreements from various institutions. These addenda allow authors to retain certain rights to their scholarship, specifically allowing them to make their work open access.

These are just a few examples of the categories of information found on the Open Access Directory. Keep this resource in mind as a possible place to turn to as open access related questions or concerns come up in your work, which seems increasingly likely as the open access movement continues to expand.


Eric Snajdr, Associate Librarian, University Library, Indiana University – Purdue University, Indianapolis


FDA Drug Information Resources

As is evident from their name, the U.S. Food & Drug Administration (FDA) is responsible for overseeing drugs that are researched, marketed, sold, etc. in the U.S. As a result, they maintain a host of databases for looking up various types of drug information, most of which can be found on their Drug Approvals and Databases page. I would like to highlight a few of the resources here.

Drugs@FDA: This database contains drugs (brand name, generic, and over-the-counter) approved by the FDA from 1939 onwards and can be searched or browsed. There is also a tool to generate a list of drug approvals for a specific month, which can be useful if you would like to keep up to date on new approvals each month. There are also free Android and Apple apps for this database!

FDA Online Label Repository: This repository contains drug label information submitted to the FDA. You can search by name, ingredient, company name, and more. Note: This database may not work well in Google Chrome.

FDA Drug Shortages: This database contains information about drug shortages and discontinuations, and is updated daily. You can search the database by name or active ingredient, or you can look at specific treatment categories such as Analgesia/Addiction, Anti-Infective, Oncology, etc. Clicking on the “New and Updated” tab allows you to see the newest shortages and their statuses. This database has apps as well!

These three databases are just a few of the many helpful drug resources provided by the FDA. Be sure to explore all of them when you get a chance!


Emily Gorman, School of Pharmacy Librarian, University of Maryland Baltimore


The Linda Hall Library

Portrait of Linda Southall Hall.

The non-profit, privately funded Linda Hall Library in Kansas City, Missouri, houses scientific, engineering, and technology resources from the United States and some international works from the 15th century to the present. The library presents technical programs, has digital displays, allows researchers to study there, and provides fellowships for scholars using the collections. It also provides regular library services, such as an online catalog, reference services, and interlibrary loan. It is also a United States Patent and Trademark Resource Center.

Portrait of Herbert Hall.

Grain merchant Herbert Hall and his philanthropist wife, Linda, amassed a fortune but did not have any direct heirs. In their wills, they stipulated that their home be used as a library open to the public and that it be named after Linda Hall. Since the wills did not specify what type of library should be formed, the Trustees, who were five businessmen, decided to hire a national library consulting firm. The consultants recommended a scientific library and named Joseph Shipman, a librarian and former chemist, as the first director in 1945.

Portrait of Joseph C. Shipman.

Mr. Shipman studied the holdings of other nearby libraries and determined that Linda Hall Library should not collect works on business, clinical medicine and dentistry. Three important acquisitions increased the collections of the Linda Hall Library:

  • the 1947 purchase of scientific resources of the American Academy of Arts and Sciences of Boston, which was founded in 1780
  • the 1985 transfer of serials and other works by the Franklin Institute from Philadelphia
  • the 1995 transfer from Manhattan of the Engineering Societies Library, which included serials, monographs, and conference proceedings from AIME, IEEE, ASME, ASCE, and AIChE

Because of the growing collections, more space was needed beyond the original mansion. The Main Library was opened in 1956, with two significant additions in 1995 and 2007. The History of Science Center and offices were moved to the original library building. In 1964, a new History of Science Center was built on the site of the original mansion; the new Center includes architectural pieces from the first library.

The Linda Hall Library offers lectures, exhibitions, and borrowing privileges for residents near its facility. Other researchers may travel to the library to do research on site; the History of Science Center requires advance notice.

Online resources include a catalog, exhibitions, a search engine for “difficult to find engineering papers”, and a digital collections site. Scholars who use the library’s resources for their research or who are interested in the history of science can apply for fellowships.

The Library continues to grow its collection. There are 48,000 journals and about 500,000 monographs. Besides engineering, physics, and chemistry, the library also has important works in aeronautics, astronomy, earth science, environmental science, infrastructure studies, life sciences, and mathematics.

Lisa Browar, President of the Linda Hall Library, writes in the 2015-2016 Annual Report “where the future of other libraries is in electronic information, the Linda Hall Library’s future remains secure as a print based library of contemporary and historic scientific literature. The Library continues to augment its print holdings with scientific serials and other research materials once held by the libraries that have had to remove them to make way for repurposed learning environments. The Linda Hall Library’s retention of historic printed information will assure its continued survival and use for generations of scholars to come.

The Linda Hall Library similarly finds opportunity in a world challenged to promote science literacy among adults.”

The references listed below provide more details about this unique library.


About the Library. Accessed 10 March 2018.

Browar, Lisa. “Letter from the President.” Biennial Report of the Linda Hall Library Trust and Affiliates 2015-2016. Accessed 12 March 2018.

Christiansen, Donald. “What Happened to the Engineering Societies Library?” IEEE*USA InSight. 20 April 2017. Accessed 12 March 2018.

History of the Linda Hall Library Research Guide. Accessed 10 March 2018.

Shipman, Joseph. “Linda Hall Library.” College and Research Libraries, vol. 16, no. 2, 1955. Accessed 10 March 2018.


AIChE – American Institute of Chemical Engineers

AIME – American Institute of Mining Engineers

ASCE – American Society of Civil Engineers

ASME – American Society of Mechanical Engineers

IEEE – Institute of Electrical and Electronics Engineers


Isabel Altamirano, Engineering and Chemistry Librarian, Georgia Institute of Technology


Understanding LaTeX

LaTeX has been around since the early 1980s and has quite a following among the scientific community, especially mathematicians and engineers. The following is a brief look at LaTeX, how it works, and its benefits to librarians. For in-depth instruction, visit ShareLaTeX or download The Not So Short Introduction to LaTeX 2e.

LaTeX is a markup language for creating professional-looking documents based on content and structure, not layout. To non-coders it looks complex, but it really isn’t. Let’s look at some sample code.

Figure 1: Sample Code

The first section is called the preamble, where one defines the document in terms of type, author, title, packages, and so on. In this sample the type is article, but other choices are available, such as book, slides, letter, proceedings, and report. Packages are additional predefined instructions, stored in separate files, that are called into use. For example, the babel package allows for writing in different languages. The document text follows the preamble, along with commands that create structure. Commands are preceded by backslashes. When ready, the file is compiled and a PDF document is rendered (Figure 2).

Figure 2: PDF output
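For readers without access to the figures, here is a minimal sketch of what such a file can look like. It is an illustrative stand-in, not necessarily the code shown in Figure 1; the title, author, and text are placeholders chosen for this example.

```latex
% Preamble: document type, packages, and title information
\documentclass{article}
\usepackage[english]{babel}  % language support via the babel package
\title{A Sample Document}    % placeholder title (assumption)
\author{A. Librarian}        % placeholder author (assumption)
\date{\today}

% The document text and structural commands follow the preamble
\begin{document}
\maketitle

\section{Introduction}
This paragraph is ordinary text. Commands such as \verb|\section|
begin with a backslash and describe structure rather than layout.

\begin{itemize}
  \item First point
  \item Second point
\end{itemize}

\end{document}
```

Compiling this file, for example with pdflatex, should produce a one-page PDF with a title block, a numbered section heading, and a bulleted list.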

A number of complex layouts have been predefined, thus giving the author freedom to focus on the content. As an author, you tell LaTeX, using simple commands, where you want to insert lists, equations, tables, chapters, etc. Each time you compile the file, the commands are executed, rendering a new PDF. This is useful when changes or updates affect the structure of a document. Let’s say you have several numbered equations in your document and you refer to them throughout. After much work, you realize that you are missing an equation. All that is needed is to add it at the correct location with the corresponding commands. LaTeX will update the numbering and all references.
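The equation scenario can be sketched as follows. The equations and the label names (eq:force, eq:energy) are assumptions made up for this illustration; the point is only that references are written against labels, never against hard-coded numbers.

```latex
\documentclass{article}
\usepackage{amsmath}  % provides \eqref
\begin{document}

% This equation was added later. Because it comes first, it becomes
% equation (1) and the one below is renumbered to (2) automatically.
\begin{equation}
  F = ma
  \label{eq:force}
\end{equation}

\begin{equation}
  E = mc^2
  \label{eq:energy}
\end{equation}

% References use labels, so they follow the renumbering.
Equation~\eqref{eq:force} is Newton's second law, and
Equation~\eqref{eq:energy} relates mass and energy.

\end{document}
```

Note that LaTeX typically needs to be compiled twice for new or changed references to resolve; most editors and online tools handle this automatically.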

The power of LaTeX is even more evident in its approach to mathematical typesetting. WYSIWYG applications like Microsoft Word do a poor job of displaying mathematical equations. This is where LaTeX excels. It uses an extension package from the American Mathematical Society to define structure and layout. Figure 3 is an example of how a LaTeX script translates into a clearly written formula.

Figure 3: Script & Output for Quadratic Formula using LaTeX
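As a text version of this kind of script, the quadratic formula can be written roughly as below. This is a sketch that assumes the amsmath package; the exact script in Figure 3 may differ.

```latex
\documentclass{article}
\usepackage{amsmath}  % American Mathematical Society extension package
\begin{document}

The solutions of $ax^2 + bx + c = 0$, with $a \neq 0$, are
\begin{equation*}
  x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
\end{equation*}

\end{document}
```

Here \frac builds the fraction, \sqrt draws the radical, and \pm produces the plus-or-minus sign, so the script describes the structure of the formula and LaTeX takes care of sizing and placement.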

LaTeX is a powerful tool for the scientist, and it follows that it is beneficial for the science librarian. Many science publishers now require authors to submit papers in LaTeX format. LaTeX’s predefined packages make writing papers, citing, referencing, graphing, and creating bibliographies simple tasks. As our patrons rely on this tool, we need to support its usage through marketing, instruction, and development. Librarians can also benefit from LaTeX for our own publication efforts: manuals, resumes, articles, grants, policy documentation, lesson plans, and other document types. Complex and extensive content is best served by LaTeX.

There is a steep learning curve, and for many this is a deterrent. It takes a bit of time, patience, and practice to fully master LaTeX, but the end result is worth it.

To run LaTeX on your own computer you need a TeX distribution (such as TeX Live or MiKTeX) and an editor such as Texmaker. A Google search will turn up many other options. Or you can try your hand at online tools such as ShareLaTeX or Overleaf.


Ana Torres, Assistant Head, Dibner Library, NYU Division of Libraries


Navigating PubChem

Launched in 2004 by the National Center for Biotechnology Information (NCBI), PubChem has evolved into an open repository of biological and pharmacological information on chemical substances and compounds. PubChem also covers chemical safety and toxicity information. You can search PubChem by compound name or chemical structure. NCBI’s staff of scientists have created their own YouTube channel, which features the “NCBI Minute” and “NCBI Webinars”. The following videos about PubChem were created by NCBI experts in the field:

NCBI Minute: PubChem, a Source of Laboratory Chemical Safety Information. (21:08)

NCBI Webinar: Maximizing PubChem – New and Upcoming Features. (40:57)

More information about PubChem can be found in the following PubMed Central articles:

Kim S, Thiessen PA, Bolton EE, et al. PubChem Substance and Compound databases. Nucleic Acids Research. 2016;44(D1):D1202-D1213. doi:10.1093/nar/gkv951.

Kim S. Getting the most out of PubChem for virtual screening. Expert Opinion on Drug Discovery. 2016;11(9):843-855. doi:10.1080/17460441.2016.1216967.


Sarah Jeong, Research & Instruction Librarian for Science, Wake Forest University


Lesser-known Public Health Resources

Public health is a broad, cross-disciplinary field of study. Drawing on information from primary medical research, disease surveillance, public policy, geography, and sociology (to name just a few), public health researchers and practitioners wade through copious amounts of data to answer their research questions. In addition to the major health information resources (think PubMed), many additional online tools exist to aid anyone interested in public health research. Below are examples of some of the many online tools freely available to expert researchers and the general public alike.




CDC WONDER (Wide-ranging Online Data for Epidemiologic Research) brings together all of the epidemiological data from the Centers for Disease Control and Prevention into one search interface. This resource is freely accessible to the public and is intended to be used by public health professionals and the general public alike. The WONDER interface includes databases covering birth and mortality statistics, disease morbidity, and environmental information relevant to health and disease. The information sources discoverable through WONDER include reference materials, reports, guidelines, and statistical research data published by the CDC. Users can browse by health topic or can construct detailed search queries to retrieve targeted results. Additionally, WONDER allows users to generate maps and charts from data extracted from search results to enhance data interpretation.


Center for Infectious Disease Research and Policy


The Center for Infectious Disease Research and Policy (CIDRAP) is an initiative of the Academic Health Center at the University of Minnesota. The purpose of this center is to provide current information about the public health response and preparedness for emerging infectious diseases. This site provides clear overviews about a variety of infectious disease topics as well as links to recent news articles and scholarly literature on these topics. This is a great background information source for anyone interested in learning more about a particular infectious disease. CIDRAP also supports the Antimicrobial Stewardship Project, which provides information on policy and best practices research for antimicrobial use. Particularly useful elements of this project include the series of podcasts and webinars on antibiotic stewardship as well as the policy updates and conference summaries. Other CIDRAP projects include influenza training videos for public health workers as well as BioWatch, an early-warning network to detect biological attacks.




HealthMap is an eye-opening tool for gathering real-time data on disease outbreak surveillance and monitoring. Created by software developers at Boston Children’s Hospital and Harvard Medical School, HealthMap gathers information about a broad variety of diseases and other public health concerns from multiple sources, including online media outlets, eyewitness accounts, and national and local health agencies. This tool has wide appeal for a variety of audiences: it could certainly be used by epidemiologists and other health officials, as well as policy makers and international travelers. Users can easily search for a specific disease of interest or can observe all of the current public health threats at a specific geographic location. HealthMap also has a mobile app called “Outbreaks Near Me” to make it easier to use for travelers and individuals who are generally interested in local disease occurrences. If you are a germophobe, be warned: HealthMap might keep you up at night!


National Guideline Clearinghouse


National Guideline Clearinghouse (NGC) is a one-stop repository of evidence-based clinical practice guidelines. Created by the Agency for Healthcare Research and Quality (part of the U.S. Department of Health and Human Services), NGC is a great resource for physicians, health care providers, researchers, and patients to stay current on a variety of treatments and other medical procedures. Users of this site can browse for guideline summaries by clinical specialty, organization, or NLM MeSH terms. These guideline summaries are succinct, easy to read, and comprehensive in synthesizing large amounts of evidence-based research. Recommendations are categorized by the strength or level of evidence from previous studies. In addition to the guideline summaries, NGC also provides a section on guideline syntheses, which are systematic comparisons of guidelines on similar topics. These syntheses provide areas of agreement between the selected guidelines as well as areas of conflict, allowing the reader to make informed assessments of individual guidelines.




TOXNET (TOXicology Data NETwork) is a network of databases related to toxicology, environmental health, hazardous chemicals, and similar disciplines. Created and maintained by the National Library of Medicine, TOXNET provides a wealth of information on chemicals of public health concern, including chemical structure and synonyms, known toxic effects, occurrence in consumer products, and chemical occupational hazards. Users can search all of the 15 databases included in TOXNET simultaneously or they can select only those databases of interest to their search queries. Some of the more popular databases within TOXNET include TOXLINE, Hazardous Substances Data Bank, Household Products Database, ChemIDplus, and TOXMAP. As an added benefit, TOXNET records are linked to the corresponding PubMed record, simplifying citation management and full-text retrieval.


Michael Goates, Life Sciences Librarian, Brigham Young University


Web-based Molecular Biology Tools

Molecular biology is the study of biological macromolecules at the structural and functional level, particularly DNA and proteins. There are many free resources on the Internet for studying various aspects of these primary constituents. The following is a list of some of these web-based tools, each with a brief description that draws some wording from the tool’s own site. This is not a comprehensive list, but it is meant to provide a good starting point for researchers. Some resources appear in more than one category.

General Sites

  • BYU DNA Sequencing Center Resources
    The DNA Sequencing Center (DNASC) at Brigham Young University has also created an online resource page with additional resources.
  • DBGET
    DBGET is a simple database retrieval system for finding and obtaining specific entries from diverse databases. Here a database is simply considered a sequential collection of entries, which may be stored in a single file or multiple files. Because each entry of a database is given a unique identifier, entries in molecular biology databases around the world can be retrieved uniformly by the combination of the database name and the identifier.
  • European Bioinformatics Institute
    European Bioinformatics Institute (EBI) is a center for research and services in bioinformatics. The Institute manages databases of biological data including nucleic acid, protein sequences and macromolecular structures.
  • Expasy
    Molecular server dedicated to the analysis of protein and nucleic acid sequences. Protein identification and characterization tools:

    • Identification and characterization with peptide mass fingerprinting data
    • Identification and characterization with MS/MS data
    • Identification with isoelectric point, molecular weight and/or amino acid composition
    • Other prediction or characterization tools, MS data (visualization, quantitation, analysis, etc.), and 2-DE data (image analysis, data publishing, etc.).
  • Java based Molecular Biologist’s Workbench
    This site contains a workbench of tools for DNA and protein analysis: data entry, data manipulation, data analysis, genetic and functional site mapping, and primer design.
  • National Center for Biotechnology Information
    NCBI’s mission is to develop new information technologies to aid in the understanding of fundamental molecular and genetic processes that control health and disease. It contains links to the GenBank database and to tools for data mining, including BLAST, COGs, MapViewer, LocusLink, UniGene, ORF finder, Electronic PCR, VAST search, CCAP, Human-Mouse Homology maps, VecScreen, and Cancer Genome Anatomy Project. It also provides access to Entrez, a retrieval system for searching several linked databases, including PubMed, Nucleotide sequence database, protein sequence database, structure, genome, population data sets, Online Mendelian Inheritance in Man, taxonomy, 3D domains, ProbeSet, and online books.
  • National Center for Genome Resources
    The National Center for Genome Resources (NCGR) contains information and links to various genome related projects.

Nucleic Acid Sequencing Tools

  • Biosyn Gizmo Tools
    Bundle of databases (siRNA, protein, peptide antigen) and tools, including a Bioinformatic Glossary, Genetic Code Table, Nucleic Acids and Protein Calculations, and an Oligo Properties Calculator.
  • BLAST
    Searches for sequence homology between your sequence and those in the databases. BLASTN performs a search in DNA sequences; BLASTX translates your sequence in all 6 frames and performs a search in protein sequences.
  • Codon Usage Database
    A query box is presented for searching an organism’s codon usage table. You can search by the organism’s Latin name or a substring of it. Useful for designing primers and probes.
  • Sequence Manipulation Suite (SMS)
    The Sequence Manipulation Suite in BioSyn’s Gizmo Tools is a collection of JavaScript programs for generating, formatting, and analyzing short DNA and protein sequences. It is commonly used by molecular biologists, for teaching, and for program and algorithm testing.
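To make the "all 6 frames" idea from the BLASTX description above concrete, here is a minimal Python sketch of six-frame translation using the standard genetic code. It is not how BLASTX or any of the listed tools is implemented, and it performs no database search; the function names and the "+1" to "-3" frame labels are choices made for this illustration only.

```python
# Standard genetic code (NCBI translation table 1), with codons enumerated
# in T, C, A, G order at each of the three positions.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate DNA codon by codon; '*' marks a stop, 'X' an unknown codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


def six_frame_translations(dna):
    """Translate three forward and three reverse-complement reading frames."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frame_translations("ATGGCCTAA").items():
        print(name, protein)
```

Each frame starts one base later than the previous one, and the three "-" frames read the opposite strand, which is why a single DNA query yields six candidate protein sequences to compare against a protein database.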

Genomic Resources

  • GenomeNet
    GenomeNet is a Japanese network of database and computational services for genome research and related research areas in molecular and cellular biology. GenomeNet was established in September 1991 under the Human Genome Program (HGP) of the Ministry of Education, Science, Sports and Culture (MESSC).
  • National Center for Genome Resources
    National Center for Genome Resources (NCGR) contains information and links to various genome related projects.
  • SoftBerry
    Softberry, Inc. is a leading developer of software tools for genomic research. Their primary areas of interest and expertise are genome annotation, functional site identification in DNA and proteins, sequence database management, genome comparison, expression data analysis, protein structure prediction, and protein compartment (destination) prediction.
  • UCSC Genome Browser
    The University of California, Santa Cruz (UCSC) Genome Browser website contains the reference sequence and working draft assemblies for a large collection of genomes.
  • dbGaP (NCBI)
    The database of Genotypes and Phenotypes (dbGaP) was developed to archive and distribute the results of studies that have investigated the interaction of genotype and phenotype. Such studies include genome-wide association studies, medical sequencing, molecular diagnostic assays, as well as association between genotype and non-clinical traits.
  • Ensembl
    The Ensembl project produces genome databases for vertebrates and other eukaryotic species, and makes this information freely available online.

Protein Sequence Analysis Tools

  • Expasy
    A molecular biology server dedicated to the analysis of protein and nucleic acid sequences. Protein identification and characterization tools:

    • Identification and characterization with peptide mass fingerprinting data
    • Identification and characterization with MS/MS data
    • Identification with isoelectric point, molecular weight and/or amino acid composition
    • Other prediction or characterization tools, MS data (visualization, quantitation, analysis, etc.), and 2-DE data (image analysis, data publishing, etc.).
  • FramePlot
    Protein coding region prediction in Bacterial DNA.
  • MPEx
    Membrane Protein Explorer (MPEx) is a tool for exploring the topology and other features of membrane proteins by means of hydropathy plots based upon thermodynamic principles.
  • PredictProtein
    PredictProtein is an Internet service for sequence analysis and the prediction of protein structure and function. Users submit protein sequences or alignments; PredictProtein returns multiple sequence alignments, PROSITE sequence motifs, low-complexity regions (SEG), nuclear localization signals, regions lacking regular structure (NORS) and predictions of secondary structure, solvent accessibility, globular regions, transmembrane helices, coiled-coil regions, structural switch regions, disulfide-bonds, sub-cellular localization, and functional annotations. Upon request fold recognition by prediction-based threading, CHOP domain assignments, predictions of transmembrane strands and inter-residue contacts are also available.
  • ProDom
    ProDom is a protein domain family database constructed automatically by clustering homologous segments. The ProDom building procedure MKDOM2 is based on recursive PSI-BLAST searches [ALTS2]. The source protein sequences are non-fragmentary sequences derived from SWISS-PROT and TrEMBL databases.
  • ProtScale
    ProtScale allows you to compute and represent the profile produced by any amino acid scale on a selected protein. An amino acid scale is defined by a numerical value assigned to each type of amino acid. The most frequently used scales are the hydrophobicity or hydrophilicity scales and the secondary structure conformational parameters scales, but many other scales exist which are based on different chemical and physical properties of the amino acids. This program provides 57 predefined scales entered from the literature.
  • Sequence Manipulation Suite (SMS)
    The Sequence Manipulation Suite in BioSyn’s Gizmo Tools is a collection of JavaScript programs for generating, formatting, and analyzing short DNA and protein sequences. It is commonly used by molecular biologists, for teaching, and for program and algorithm testing.
  • Worldwide Protein Data Bank (wwPDB)
    The wwPDB maintains a single Protein Data Bank Archive of macromolecular structural data that is freely and publicly available to the global community.
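
The ProtScale entry above describes computing a profile from an amino acid scale. As a rough illustration of that idea only (not ProtScale's own implementation), the sketch below averages Kyte-Doolittle hydropathy values over a sliding window. The function name `scale_profile`, the default window of 9 residues, and the equal weighting of every position in the window are assumptions made for this example.

```python
# Kyte-Doolittle hydropathy scale: one numerical value per amino acid
# (one-letter codes). Positive values are more hydrophobic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def scale_profile(sequence, scale=KYTE_DOOLITTLE, window=9):
    """Return (center_position, mean_value) pairs, one per full window.

    Hypothetical helper for illustration: positions are 1-based and refer
    to the residue at the center of each window; all residues in a window
    are weighted equally.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    # Look up the scale value for every residue; an unknown letter
    # raises KeyError rather than being silently skipped.
    values = [scale[residue] for residue in sequence.upper()]
    half = window // 2
    profile = []
    for start in range(len(values) - window + 1):
        mean = sum(values[start:start + window]) / window
        profile.append((start + half + 1, round(mean, 3)))
    return profile


if __name__ == "__main__":
    # Arbitrary example peptide, not taken from any database entry.
    for position, value in scale_profile("MKTLLILAVVAAALA", window=5):
        print(position, value)
```

Any other scale can be swapped in by passing a different dictionary as `scale`, which mirrors how the tool lets you choose among predefined scales.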

3D Macromolecular Structure Tools

  • Cn3D
    Cn3D is a helper application for web browsers that allows you to view 3-dimensional structures from NCBI’s Entrez retrieval service. Cn3D runs on Windows, Mac, and Unix. Cn3D simultaneously displays structure, sequence, and alignment, and now has powerful annotation and alignment editing features.
  • DeepView
    Swiss-PdbViewer (aka DeepView) is an application that provides a user-friendly interface for analyzing several proteins at the same time. The proteins can be superimposed in order to deduce structural alignments and compare their active sites or any other relevant parts. Amino acid mutations, H-bonds, angles, and distances between atoms are easy to obtain thanks to the intuitive graphic and menu interface.
  • Povray
    When used with Swiss-PDB viewer the rendered output image appears much sharper and the colors are more vivid.
  • RasMol
    Protein Explorer, a RasMol-derivative, is the easiest-to-use and most powerful software for looking at macromolecular structure and its relation to function. It runs on Windows or Mac computers. RasMol users will find its menus very familiar, and it understands RasMol commands. It is very fast: rotating a protein or DNA molecule shows its 3D structure.
  • RCSB Protein Data Bank
    The RCSB PDB provides a variety of tools and resources for studying the structures of biological macromolecules and their relationships to sequence, function, and disease. This site offers tools for browsing, searching, and reporting that utilize the data resulting from ongoing efforts to create a more consistent and comprehensive archive. The Research Collaboratory for Structural Bioinformatics (RCSB) is a non-profit consortium dedicated to improving our understanding of the function of biological systems through the study of the 3-D structure of biological macromolecules.

Phylogeny Tools

  • PHYLIP
    PHYLIP is a free package of programs for inferring phylogenies. It is distributed as source code, documentation files, and a number of different types of executables.
  • TreeView
    TreeView is a simple program for displaying phylogenies on Apple Macintosh and Windows PCs. It can be used to view PHYLIP generated phylogeny trees.


Greg Nelson, Chemical and Life Sciences Librarian, Brigham Young University

We welcome your comments and suggestions. If you have a resource that you would like to see highlighted, please leave us a comment.

Re3data (the Registry of Research Data Repositories) launched at the end of 2012 with the goal of registering all research data repositories. These repositories are collections of datasets usually associated with a particular discipline or geographic region. Because data repositories have cropped up on an as-needed basis over the past 50 years, they are myriad, and navigating the options in any academic field takes specialized knowledge.

Research data represents the lion’s share of effort for universities. The value of research data within universities is without peer; however, this data is often vulnerable to loss due to poor preservation practices. Data repositories provide long-term storage and potentially enable access to datasets, while also promoting reproducibility of research. Although this storage and access provide a clear benefit to the researcher, the funding agencies who support research can be the stimulus for researchers to use a data repository. For example, the National Science Foundation requires dissemination and sharing of research results:

Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Grantees are expected to encourage and facilitate such sharing.

Dissemination and Sharing of Research Results – National Science Foundation

Certain publishers also stipulate the use of data repositories, such as this example for Scientific Data, a Nature Publishing Group journal:

Scientific Data mandates the release of datasets accompanying our Data Descriptors, but we do not ourselves host data. Instead, we ask authors to submit datasets to an appropriate public data repository. Data should be submitted to discipline-specific, community-recognized repositories where possible, or to generalist repositories if no suitable community resource is available.

Recommended Data Repositories – Nature

For librarians, the benefits of data repositories are fairly clear. Repositories manage, organize, preserve, enable discovery of, and usually provide a persistent identifier for data. Re3data allows librarians to point researchers in the right direction regarding repositories. It provides a basic search feature equipped with 27 facets to narrow or refine a search. Each repository record is tagged with icons to let users know whether the repository provides additional information about its service; whether it is open, restricted, or closed access; and what persistent identifier it uses (e.g., DOI, URN, ARK, handle, PURL, or other).

View of Re3data’s search interface

Users can also browse by country, subject, or content type (e.g., raw data, audiovisual data, or source code). The subject browse function is particularly attractive:

View of Re3data’s browse wheel

Users can select a discipline, and the wheel rotates to narrow the search to that discipline.

Librarians who play a role in their own institution's repository can also suggest it for inclusion in Re3data. Data repositories considered for inclusion must be run by a legal entity, clarify access conditions, and have a focus on research data (see:

Samuel R. Putnam, Assistant University Librarian, University of Florida


Look to the Stars: History of Astronomy Collections from Adler Planetarium

Photo Credit: Emily Gorman

It’s been a few months now since the ALA Annual Meeting in Chicago, and I am still thinking about the Adler Planetarium! A group of librarians from the Science & Technology Section were lucky enough to get a tour of the Webster Institute for the History of Astronomy, which manages the Adler’s collections. The collections include rare books, historic photographs and scientific instruments, and much more. It was amazing to see some of these beautiful and fascinating materials up close.
