Impact of Image Quality in Machine Print Optical Character Recognition
Published
Author(s)
Michael D. Garris, Stanley Janet, W Klein
Abstract
The National Institute of Standards and Technology (NIST) is in the process of setting up a new series of conferences named the Metadata Text Retrieval Conferences (METTREC). They will focus on evaluating two critical technologies: document conversion using optical character recognition (OCR) and information retrieval (IR). Large collections of document images labeled with correct recognition and retrieval responses are needed to measure performance. Currently, the production of these materials is extremely expensive. NIST is developing a semi-automated truthing tool that will help reduce the cost of data preparation and enable evaluations to scale up. To accomplish this, current OCR technology is needed to produce an initial text-to-image alignment. This paper describes a small experiment in which three different vendor products (two Windows NT/95-based and one UNIX-based) are evaluated across three sets of document images of progressively decreasing print and image quality. The evaluation images are subjectively selected pages from the 1994 Federal Register. Results demonstrate the impact of degrading print and image quality, with reported character recognition error rates ranging from 1% to as high as 74%.
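The abstract reports character recognition error rates without stating how they were scored. As a rough illustration only, one common way to compute such a rate is to divide the edit (Levenshtein) distance between the OCR output and the ground-truth text by the length of the ground truth. The sketch below assumes that definition; the function names `edit_distance` and `character_error_rate` are hypothetical and are not taken from the report or from any NIST tool.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between ref and hyp, with unit costs.

    Assumption: insertions, deletions, and substitutions each cost 1.
    The report's actual scoring rules may differ.
    """
    # prev[j] holds the distance between the first i-1 chars of ref
    # and the first j chars of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i chars of ref to reach an empty hyp
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from ref
                curr[j - 1] + 1,     # insert h into ref
                prev[j - 1] + cost,  # substitute r with h, or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """Edit distance normalized by the ground-truth length (hypothetical metric)."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    truth = "Federal Register"
    ocr_output = "Fedcral Rcgister"  # made-up OCR output for illustration
    print(f"{character_error_rate(truth, ocr_output):.1%}")
```

Under this definition the rate can exceed 100% when the OCR output contains many spurious insertions, which is one reason published evaluations often state their normalization explicitly.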
Garris, M., Janet, S. and Klein, W. (1997), Impact of Image Quality in Machine Print Optical Character Recognition, NIST Interagency/Internal Report (NISTIR), National Institute of Standards and Technology, Gaithersburg, MD, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=151348 (Accessed October 10, 2025)