Whether it’s in industry, academic research or chemistry classes, many people encounter NIST’s standard reference data (SRD).
In 1968, Congress authorized NIST to produce and maintain standardized reference data in order to support U.S. industry and fundamental research. This data can be critically important.
SRD can, for instance, characterize the properties of structural steel. “If you don’t properly know its expansion coefficients, for example, then the bridge you build may fail in a catastrophic way,” says Robert Hanisch, the director of NIST’s SRD program.
Standard reference data has several key characteristics. It is quantitative information. It’s related to a measurable physical or chemical property of a substance or system of substances. Last but not least, the data is critically evaluated to verify its reliability.
“It's very, very carefully vetted for quality,” says Hanisch. “So if there's a data point in a standard reference database, somebody has looked at it. Does it fit with known physics? Does it have a properly characterized uncertainty? Some set of expert eyes has looked at all of the information in SRD.”
Like NIST’s standard reference materials (SRMs), SRD may be purchased, with the price set to cover the costs of producing, collecting and disseminating the data.
NIST provides 49 free SRD databases and 14 fee-based SRD databases.
The 1968 Standard Reference Data Act allowed SRD databases to be copyrighted both domestically and internationally. This doesn’t usually happen with government work, Hanisch points out.
One important reason for securing copyright protection, he says, is to maintain the integrity and reliability of the data in the SRD databases. Copyright protection gives NIST certain tools to prevent unauthorized use, modification or misrepresentation of the data.
“Some of these databases contain information where, if the information were tampered with in any way, either inadvertently or on purpose, it could have dire effects,” Hanisch says. “For example, some of our standard reference data characterizes the properties of refrigerants.” As refrigerants can have variable properties of flammability, it’s important to safeguard the data through intellectual property protection.
Hanisch says that the databases reflect NIST’s values and culture.
“It requires a kind of NIST persistence and emphasis on quality to put these databases together,” he says. “It's really a labor of love. NIST is the place where this work gets done.”
One of the most popular SRD databases is NIST’s Mass Spectral Library. The library contains masses for many chemical species that researchers may need to identify in their work, whether it’s basic chemistry or a criminal forensics investigation.
“We sell the library through distributors, who then sell it to instrument vendors of mass spectrometry devices. The vendors embed the NIST database into the software that accompanies those instruments,” says Hanisch. “There's a real value added there in that.”
“We charge money for the library,” he explains. “That money goes to NIST’s mass spectroscopy group, to continue to augment the contents of the database. But it also creates a product for these instrument vendors that has higher value than simply the hardware that they sell.”
One oft-mentioned body of standard reference data is NIST REFPROP, the Reference Fluid Thermodynamic and Transport Properties Database. Manufacturers of heating, ventilation and air conditioning (HVAC) systems refer to the database when studying the refrigerants they are using in their projects.
“REFPROP is really kind of unique in the market,” Hanisch says. “Any industry that deals with refrigerants of one sort or another relies on REFPROP.”
Another NIST database devoted to thermodynamics is used by Joshua Heyne, associate professor in the University of Dayton’s department of mechanical and aerospace engineering.
“We use the Web Thermo Table PRO nearly every day in the lab,” he says. “WTT Pro provides us with self-consistent and accurate data, which we use to facilitate sustainable aviation fuel development. Without these data, much of our work would not be possible at the speed we need.”
The most popular web-based SRD is the freely available Chemistry WebBook, which is visited by hundreds of thousands of users every month.
“It gets used by not only industry, but by educators and researchers, because it is a compendium of the best-known properties of elements and compounds,” Hanisch says.
The SRD program has focused on making its data available as conveniently as possible in digital form. As Hanisch points out, standard reference data was first published in books, then on floppy disks and CDs, and finally on the internet. For its databases, the team strives toward what the data community calls FAIR: findable, accessible, interoperable and reusable. It’s part of an international movement around open research data “and making that data maximally valuable, assuring reproducibility, reliability, integrity and transparency,” he says.
Hanisch envisions many critical data needs in the near future. “On the horizon is standard reference data in support of artificial intelligence (AI) and machine learning,” he says.
AI systems are already making important decisions, from matching faces in databases to evaluating applications for home loans. Testing the trustworthiness and reliability of these systems is essential to ensuring that their decisions are transparent and sound.
“Having standard training sets that companies can use to test their machine-learning algorithms and characterize the uncertainties of their outputs,” Hanisch says, “is bread and butter for studying these systems and measuring their performance.”