Multilingual Automatic Document Classification Analysis and Translation (MADCAT) Evaluation
The Multilingual Automatic Document Classification Analysis and Translation (MADCAT) program is a five-year DARPA research program whose purpose is to explore and develop technologies that convert non-English language document images into English transcripts so that the information can be readily used by monolingual English speakers.
The Multimodal Information Group in the Information Technology Laboratory's Information Access Division at NIST is responsible for measuring the performance of these developed technologies. The evaluation of the MADCAT program has two tracks: a Go/No-Go track and a challenge track. The Go/No-Go track focuses on monitoring progress on controlled data sets. The challenge track focuses on system performance using real-life data.
A public version of the MADCAT evaluation will be offered under the NIST Open HaRT Evaluation planned for Summer/Fall 2010.
If you would like more information regarding our involvement in the MADCAT program, please contact our staff.