Face Recognition Vendor Tests (FRVT) provide independent government evaluations of commercially available and mature prototype face recognition systems. These evaluations are designed to provide U.S. Government and law enforcement agencies with information to assist them in determining where and how facial recognition technology can best be deployed. In addition, FRVT results help identify future research directions for the face recognition community.
FRVT 2002 follows four previous face recognition technology evaluations – three FERET evaluations (1994, 1995 and 1996) and FRVT 2000. The FERET program introduced evaluations to the face recognition community and helped advance face recognition from its infancy to the prototype system stage. By 2000, face recognition technology had matured from prototype systems to commercial systems. The Face Recognition Vendor Test 2000 (FRVT 2000) measured the capabilities of these systems and their technical progress since the last FERET evaluation. Public interest in face recognition technology had risen significantly by 2002. FRVT 2002 was designed to measure technical progress since 2000, to evaluate performance on real-life large-scale databases, and to introduce new experiments to help understand face recognition performance better.
Each successive evaluation increased in size, difficulty and complexity, reflecting the maturing of both face recognition technology and evaluation theory. Table 1 shows the representative sizes of these evaluations and the minimum number of comparisons per second required to complete each evaluation in the time allowed. Table 2 shows the largest database sizes for any experiment in these evaluations for verification, identification and watch list tasks, as well as a breakdown of the types of experiments found in each evaluation.
Evaluation | # of Signatures | # of Comparisons | Time Given to Perform Comparisons | Minimum # of Comparisons per Second |
---|---|---|---|---|
FERET '96 | 3,813 | ~14.5 million | 72 hours | 56 |
FRVT 2000 | 13,872 | ~192 million | 72 hours | 742 |
FRVT 2002 - MCInt | 7,500 | ~56 million | 264 hours | 59 |
FRVT 2002 - HCInt | 121,589 | ~15 billion | 264 hours | 15,555 |
Note that the MCInt portion of FRVT 2002 is the only test in this table that included "video" signatures.
Signatures in all other tests were a single still image.
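The last column of Table 1 can be reproduced from the other columns. The sketch below is illustrative only and rests on two assumptions that the figures suggest but the text does not state outright: each evaluation compares every signature against every signature (N × N comparisons), and the required rate is rounded down to a whole number. The function and variable names are hypothetical.

```python
# Rows of Table 1: (evaluation, number of signatures, hours allowed).
EVALUATIONS = [
    ("FERET '96", 3813, 72),
    ("FRVT 2000", 13872, 72),
    ("FRVT 2002 - MCInt", 7500, 264),
    ("FRVT 2002 - HCInt", 121589, 264),
]


def total_comparisons(signatures: int) -> int:
    """Assumption: every signature is compared against every signature."""
    return signatures * signatures


def min_comparisons_per_second(signatures: int, hours: int) -> int:
    """Whole comparisons per second needed to finish within the time limit.

    Assumption: the rate is rounded down, which matches the table's figures.
    """
    seconds_allowed = hours * 3600
    return total_comparisons(signatures) // seconds_allowed


if __name__ == "__main__":
    for name, signatures, hours in EVALUATIONS:
        rate = min_comparisons_per_second(signatures, hours)
        print(f"{name}: {total_comparisons(signatures):,} comparisons, "
              f"at least {rate:,} per second")
```

Under these assumptions the computed rates are 56, 742, 59 and 15,555 comparisons per second, which agree with the table.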
Measurable | FERET Aug94 | FERET Mar95 | FERET Sep96 | FRVT 2000 | FRVT 2002 | |
---|---|---|---|---|---|---|
Largest number of individuals in: | ||||||
A verification experiment | 1,196 | 1,196 | 37,437 | |||
An identification experiment | 498 | 831 | 1,196 | 1,196 | 37,437 | |
A watch list experiment | 25 | 3,000 | ||||
Basic experiment categories | ||||||
Indoor same day –expression change | * | * | * | * | * | |
Indoor same day—illumination change | * | * | * | * | ||
Indoor different day | * | * | * | * | * | |
Indoor different day—greater than 18 months | * | * | * | |||
Outdoor same day | * | * | ||||
Outdoor different day | * | |||||
Pose—left or right | * | * | * | |||
Pose—up or down | * | |||||
Detailed analysis | ||||||
Resolution of face | * | * | ||||
Image compression | * | |||||
Media | * | |||||
Distance of face from camera | * | |||||
Standard error ellipses | * | |||||
Identification performance as a function of gallery size | * | |||||
Watch list performance as a function of gallery size | * | |||||
Watch list performance as a function of rank | * | |||||
Technologies evaluated | ||||||
3D morphable models | * | |||||
Normalization | * | |||||
Video | * | |||||
Demographic factors | ||||||
Sex | * | |||||
Age | * | |||||
Interaction between sex and age | * |
The USA Patriot Act mandates that the National Institute of Standards and Technology (NIST) measure the accuracy of biometric technologies. In accordance with this legislation, NIST, in cooperation with other Government agencies, conducted the Face Recognition Vendor Test 2002.
FRVT 2002 consisted of two tests: the High Computational Intensity (HCInt) Test and the Medium Computational Intensity (MCInt) Test. Both tests required the systems to be fully automatic; manual intervention was not allowed. Participants could sign up to take either or both tests.
The High Computational Intensity (HCInt) Test was designed to test state-of-the-art systems on extremely challenging real-world images. The images were full-face, frontal still images, and the test compared still database images against still images of an unknown person. The HCInt required participants to process a set of approximately 121,000 images and match all possible pairs of images from that set. This required performing roughly 15 billion matches in 264 hours, the time limit listed in Table 1. The results from the HCInt measure the performance of face recognition systems on large databases, examine the effect of database size on performance, and estimate variability in system performance.
The Medium Computational Intensity (MCInt) Test consisted of two separate parts: still and video. MCInt was designed to provide an understanding of a participant's capability to perform face recognition tasks with several different formats of imagery (still and video) under varying conditions. The still portion of the MCInt is similar to the FERET and FRVT 2000 evaluations: it compared a database of still images against still images of unknown people, and it was designed to measure performance on different categories of images. The effects measured included time between images, changes in illumination, and variation in pose. The video portion was designed to provide an initial assessment of whether video helps increase face recognition performance. This portion used video-style imagery extracted from digital video sequences.