Results of data needs survey
Gary Wiggins wiggins at indiana.edu
Thu Jan 23 13:50:25 GMT 1997
This is being sent to CHMINF-L, CHEMWEB, and CHEMIND-L.
(It's long.)
------------>
Data Needs of Academic Research on the Internet
Gary Wiggins
Indiana University Chemistry Library
wiggins at indiana.edu
Data on the Web
"All in all, the chemical data now available on
the web is in a different class from the data
found in refereed journals, critical reviews and
books from reputable publishers."
- David Lide (CHMINF-L, 30 October 1996)
The above comment was one of several replies to questions
sent to three chemistry-oriented discussion lists in the fall
of 1996. The questions were asked in preparation for a
lecture and demonstration delivered at the National
Institute of Standards and Technology on December 4, 1996.
Most of the information in this paper was included in that
presentation.
Questions were sent to CHMINF-L, CHEMWEB, & CHEMIND-L in
late October 1996. They were designed to:
- Gauge the extent of inaccurate data in Web databases
- Define the characteristics of data on the Web
>> Sources of data
>> Need for standardization of data formats
- Determine the best guides to data.
Respondents to the survey noted these problems with the
accuracy of data on the Web:
- Units are frequently omitted
- Transcription errors are often encountered
- This leads to a need to find redundant data
- Very few sources have quality assurance statements
- Few of the Web data sites give the source of the data
- If they do, data are likely to be copied from outdated
sources.
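The "redundant data" point above can be illustrated with a short Python sketch: gather the same property from several sources, refuse to compare values that lack units, and flag any value that disagrees with the rest. This is only an illustration under stated assumptions. The function name, the source names, the numbers, and the 1% tolerance are all made up for the example and do not come from the survey.

```python
from statistics import median


def flag_outliers(readings, rel_tol=0.01):
    """Return the names of sources whose value differs from the median
    of all readings by more than rel_tol (relative difference).

    readings maps a source name to a (value, unit) pair. A missing unit,
    or a mix of different units, raises ValueError, because values
    cannot be compared without knowing their units.
    """
    units = {unit for _, unit in readings.values()}
    if None in units or "" in units:
        raise ValueError("at least one reading has no unit")
    if len(units) != 1:
        raise ValueError(f"mixed units: {sorted(units)}")
    mid = median(value for value, _ in readings.values())
    return sorted(
        name
        for name, (value, _) in readings.items()
        if abs(value - mid) > rel_tol * abs(mid)
    )


# Hypothetical values for one property, taken from three hypothetical sites.
readings = {
    "site_a": (79.6, "degC"),
    "site_b": (79.6, "degC"),
    "site_c": (76.9, "degC"),  # digits transposed during transcription
}
suspect = flag_outliers(readings)
```

With only two sources a disagreement can be detected but not attributed, so at least three independent sources are needed for the median to single out a suspect value.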
Other Survey Results
Several people commented on efforts or practices that will
likely improve the quality of data on the Internet,
including:
- Standardization efforts:
>> CLIC, Chemical MIME, CML
>> Roles for IUPAC, CODATA: certification?
(One person, however, questioned whether standardization
efforts were worthwhile.)
- Efforts to share data or to cooperatively compile data
sources
>> Open Molecule Foundation
>> Molecule of the Month
>> Reciprocal Net
>> Structure and Reactivity Across the Periodic Table
- Provision of a minimal level of auxiliary information
(metadata)
>> authorship
>> units
>> conditions of measurement
>> references to primary and secondary sources of data
- Use of standard symbols and terminology
- Guidelines on how to handle special characters.
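As a rough sketch of what the "minimal level of auxiliary information" listed above might look like when attached to a single data value, the following Python record carries authorship, units, conditions of measurement, and references, and reports which of them are missing. The class name, field names, and sample values are hypothetical; no existing standard (CML, Chemical MIME, or otherwise) is being reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class DataPoint:
    """One published value plus its minimal auxiliary information."""

    property_name: str
    value: float
    units: str
    author: str
    conditions: dict = field(default_factory=dict)   # conditions of measurement
    references: list = field(default_factory=list)   # primary/secondary sources

    def missing_metadata(self):
        """List the metadata fields that were left empty."""
        checks = {
            "units": self.units,
            "author": self.author,
            "conditions": self.conditions,
            "references": self.references,
        }
        return [name for name, content in checks.items() if not content]


# Hypothetical record: placeholder values, no real data source implied.
point = DataPoint(
    property_name="density",
    value=0.805,
    units="g/cm3",
    author="",
    conditions={"temperature": "20 degC"},
)
gaps = point.missing_metadata()
```

A site could run such a check before publishing a value, so that omitted units or missing source references are caught at entry time rather than by readers.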
General Comments on Data on the Web
"While some might argue that the Internet is designed to make
information in a single location accessible to users around the
world, the large number of mirrored sites already in existence
points out the Net's inadequacy."
- Byte, December 1996
There are a number of steps needed to improve the quality of
data found on the Web. Among them are:
- Mechanisms to synchronize changes made at multiple
sites
- Faster access to resources
- More secure transactions
- Progress on chemical metadata standards
- Interoperability of chemical plug-in programs.
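For the first item above, one simple way to notice that mirrored copies have drifted apart is to compare content digests. The Python sketch below assumes the file contents have already been retrieved as bytes; the mirror names and the sample data line are invented for illustration, and a real synchronization mechanism would also need to fetch, schedule, and repair copies.

```python
import hashlib


def digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of a file's content."""
    return hashlib.sha256(content).hexdigest()


def out_of_sync(master: bytes, mirrors: dict) -> list:
    """Return the names of mirrors whose content differs from master."""
    expected = digest(master)
    return sorted(
        name for name, content in mirrors.items()
        if digest(content) != expected
    )


# Hypothetical master file and two hypothetical mirrors of it.
master_copy = b"naphthalene\t128.17\n"
mirrors = {
    "mirror_uk": b"naphthalene\t128.17\n",
    "mirror_us": b"naphthalene\t128.71\n",  # stale or corrupted copy
}
stale = out_of_sync(master_copy, mirrors)
```

Comparing digests rather than full files means each mirror only has to publish a short string for the check to work.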
Some Goals for Improving Data on the Web
- Assemble the most reliable data available
- Arrange data for easy retrieval
- Provide a "SuperIndex" of available data sources
- Establish criteria for evaluation of data sources:
>> descriptions of physical theories on which data are
based
>> full references to literature
>> format of the database
>> search capabilities
How to Find Data Now
A second part of the NIST presentation was a look at how to
find data on the Web today. One person pointed me
toward Alexander Lebedev's "Best Search Engines for Finding
Scientific Information in the Web"
(http://www.chem.msu.su/eng/comparison.html). He searched
11 Web search engines and concluded:
- Excite retrieves about as many documents as Altavista
- Metacrawler is the most powerful search engine for SATI
- Two of the search engines are not being updated.
Lebedev also compared the Web searches to INSPEC
results for 1994 & 1995 on the same topics. He found:
- Only 5-10% of relevant information is on the net
- The Web is particularly good for supplemental
information:
>> on authors
>> on their work and research projects
>> on foundations supporting them.
Besides using search engines, these are some other ways to
find data using the Internet:
- Submit the question to a knowledgeable source
- Consult lists of sources (guides)
- Try known sources
- Try comprehensive chemistry guides.
Lists of Sources (Guides)
CIS-IU (Chemical Information Sources from Indiana
University)
http://www.indiana.edu/~cheminfo/ca_accc.html
http://www.indiana.edu/~cheminfo/ca_ppi.html
Databases for Atomic and Plasma Physics
http://plasma-gate.weizmann.ac.il/DBfAPP.html
IOP's Software and Data Page
http://www.iop.org/Physics/Resources/phsoft.html
Known Sources
NIST Physics Laboratory
http://physics.nist.gov/PhysRefData/contents.html
Sheffield ChemPuter
http://www.shef.ac.uk/~chem/chemputer/
Biocatalysis/Biodegradation Database
http://dragon.labmed.umn.edu/~lynda/index.html
Comprehensive Chemistry Guides
Chemfinder
http://chemfinder.camsoft.com/
WWW Chemical Structures Database
http://schiele.organik.uni-erlangen.de/services/webmol.html
SpaceCrunch
http://www.tripos.com/spacecrunch/
Other Examples
University of Texas's ThermoDex
http://www.lib.utexas.edu/Libs/Chem/info/thermodex/
Table of the Properties of 200 Linear Macromolecules
and Small Molecules
http://funnelweb.utcc.utk.edu/~athas/databank/intro.html
Chemical errors found on WWW sites; A discussion of
problems encountered while creating the ChemFinder WebServer
database
http://www.camsoft.com/chemfinder/errorsfound.html
Internet Demos at NIST
CIS-IU ca_accc.html
Go to Anal Chem page, then to MS Links at SIS, then Dave's
Math Tables
www.sisweb.com/math/tables.htm
NMR Information Server at U of Florida
micro.ifas.ufl.edu/
playing Happy Birthday to You on an NMR Spectrometer
Database of Core-Edge (Inner-Shell) Excitation Spectra of
Gas Phase Atoms and Molecules
xray.uu.se/hypertext/corexdb.html
SEARCH naphthalene
Spin trap Data Base
alfred.niehs.nih.gov/LMB/stdb
ENTER THE DATABASE doesn't work, but HIPPO does
Electron Paramagnetic Resonance at Bristol
emrs.chm.bris.ac.uk/
Beautiful background!
In "About the Database" in the Introduction,
Spectra examples,
Show the example Cu(II) (nothing else works!)
Look at IU Molecular Structure Center's Reciprocal Net
www.cica.indiana.edu/~recip/
www.indiana.edu/ReciprocalNet.html
Molecules R Us
molbio.info.nih.gov/cgi-bin/pdb
Search dehalogenase (E.C.3.8.1.5)
NIST Chemistry WebBook
webbook.nist.gov/chemistry
Look for 91-56-5
AIRSITE
ozone.sph.unc.edu
Has "Environmental Data," but it's "under construction"
THERMODEX
www.lib.utexas.edu/Libs/Chem/info/thermodex/
Search Gibbs Free Energy and organic
Chemfinder
chemfinder.camsoft.com
Search MEK
WWW Chemical Structures Database
schiele.organik.uni-erlangen.de/services/webmol.html
Search MEK, then 2-butanone
SpaceCrunch
www.tripos.com/spacecrunch/
Molecule of the Month
www.bris.ac.uk/MOTM/motm.html
-----
chemweb: A list for Chemical Applications of the Internet.
Archived as: http://www.ch.ic.ac.uk/hypermail/chemweb/
To unsubscribe, send to listserver at ic.ac.uk the following message;
unsubscribe chemweb
List coordinator, Henry Rzepa (rzepa at ic.ac.uk)