October 1, 1998
Author(s)
Michael D. Garris, Stanley Janet, W Klein
A new, fully automated process has been developed at NIST to derive ground truth for document images. The method matches optical character recognition (OCR) results from a page against the typesetting files for an entire book. Public domain software
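
The matching idea described in the abstract can be illustrated with a minimal sketch. This is not the NIST implementation: it assumes the typesetting files have already been reduced to plain text, and it uses Python's standard `difflib.SequenceMatcher` as a stand-in for whatever matching technique the authors used. The function names `locate_page` and `derive_ground_truth` are hypothetical. The sketch slides a window over the book text, scores each window against the noisy OCR text, and takes the best-scoring window as the ground truth for the page.

```python
from difflib import SequenceMatcher


def locate_page(ocr_words, book_words):
    """Return (start, score) for the book window most similar to the OCR words.

    Hypothetical helper: an exhaustive sliding-window search, scored with a
    character-level similarity ratio so that OCR errors inside a word still
    earn partial credit.
    """
    n = len(ocr_words)
    ocr_text = " ".join(ocr_words)
    best_start, best_score = 0, -1.0
    for start in range(max(1, len(book_words) - n + 1)):
        window_text = " ".join(book_words[start:start + n])
        score = SequenceMatcher(None, window_text, ocr_text, autojunk=False).ratio()
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_score


def derive_ground_truth(ocr_text, book_text):
    """Return (truth_text, score): the book passage that best matches the OCR text."""
    ocr_words = ocr_text.split()
    book_words = book_text.split()
    start, score = locate_page(ocr_words, book_words)
    truth_words = book_words[start:start + len(ocr_words)]
    return " ".join(truth_words), score


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    book = "the quick brown fox jumps over the lazy dog and runs far away from the farm"
    ocr = "jurnps ovcr the lazy d0g"
    truth, score = derive_ground_truth(ocr, book)
    print(truth, round(score, 3))
```

The exhaustive search is quadratic and only practical for toy inputs; a book-length text would need an indexed or anchored search. It also assumes the OCR output and the book text split into the same number of words, which real OCR output does not guarantee.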