This database was formerly part of the NIST Special Databases collection, where it was known as Special Database 20. The images contain a very rich set of graphic elements such as graphs, tables, equations, two-column text, maps, pictures, footnotes, annotations, and arrays of such elements. No ground truthing or original typesetting information is available.
The images contain predominantly machine-printed English, although three French and German documents are included.
Major features of the database include:
A PDF version of the Users' Guide is available.
The database is available as a set of four 5.25-inch CD-ROMs.
System requirements: CD-ROM drive with software to read ISO-9660 format.
The contact for this database is:
Karen Marshall
National Institute of Standards and Technology
100 Bureau Drive,
Gaithersburg, MD 20899-8940
karen.marshall [at] nist.gov
Keywords: Automated character recognition; automated image recognition; full text databases; OCR; optical character recognition; software recognition.