October 24, 2004

TITLE: This text applies to Biometric Scores Set - Release 1, BSSR1.

Questions and comments to Patrick Grother

OVERVIEW:

The CD contains three directories, whose names are intended to suggest the kind of fusion one might consider with such data.

1. face_x_face
2. fing_x_fing
3. fing_x_face

Thus:

1. face_x_face contains face scores from two algorithms - whose performance is correlated.
2. fing_x_fing contains fingerprint scores from the left and right index fingers.
3. fing_x_face contains face and fingerprint scores from the same individuals.

ORIGIN:

All fingerprint scores come from a freely available fingerprint recognition system, now in a second release. This system was compared with commercial systems in the FpVTE test.

The face scores were generated by two commercial face recognition systems, labelled G and C here, in 2002.

DATA STRUCTURES:

A score results from the comparison of two images: an enrolled user's image and a subsequent image of either the same user or another user.

In this release the data is bundled into "similarity files" - a similarity file contains the scores from the comparison, by one recognition system, of a user's sample with N enrolled users. There is one similarity file for each entry in the users.xml file.

The order of the elements in a similarity file is fixed for all similarity files in the tree. The elements are not sorted on similarity value. The order corresponds to the entries in the enrollees.xml file.
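Because every similarity file follows the order of enrollees.xml, a score can be paired with its enrolled entry by position alone. The following is a minimal sketch, assuming the scores and the enrollee names have already been loaded into two Python lists; the on-disk format of a similarity file is not described in this text, so no file reader is shown. The function name rank_candidates, the sample values, and the convention that a larger similarity means a better match are all assumptions made for illustration.

```python
def rank_candidates(scores, enrollee_names):
    """Pair the k-th score with the k-th enrollee, then sort by similarity."""
    if len(scores) != len(enrollee_names):
        raise ValueError("expected exactly one score per enrollee entry")
    # Position k in the similarity file corresponds to entry k in enrollees.xml.
    paired = list(zip(enrollee_names, scores))
    # Assumption: larger similarity values indicate a closer match.
    return sorted(paired, key=lambda pair: pair[1], reverse=True)


# Hypothetical values, not taken from the data set.
example_names = ["enrollee_a", "enrollee_b", "enrollee_c"]
example_scores = [0.2, 0.9, 0.5]
ranked = rank_candidates(example_scores, example_names)
print(ranked[0])
```

Sorting is done on a copy, so the original file order, which is what ties a score to its enrollee, is left untouched.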

In the case of the fingerprint data the N scores in each file were computed by separate calls of a 1:1 comparison function. For the face data the N scores in each file were generated en masse, and it is not known whether they were produced by a 1:1 comparison function. This is a consequence of the structure of the protocol under which the scores were generated.

DIRECTORIES CONTAINING SIMILARITY SCORES

The files are contained in trees. For the three sets the tree roots look like this:

1. A 3000-person extract of a population referred to as "dos" is present for face-face.
face_x_face/sims/dos/face/C scores "sims" from pop. "dos" by commercial face system "C"
face_x_face/sims/dos/face/G scores "sims" from pop. "dos" by commercial face system "G"

2. A 6000-person extract of a population referred to as "dos" is present for fing-fing.
fing_x_fing/sims/dos/li/V scores "sims" from pop. "dos" from left index fingers "li" using NIST system "V"
fing_x_fing/sims/dos/ri/V scores "sims" from pop. "dos" from right index fingers "ri" using NIST system "V"

3. A 517-person extract of a population referred to as "dos" is present for fing-face. All members of this population are represented in the 6000-person fingerprint set. The intersection of this 517-person set with the face-face persons is non-empty.
fing_x_face/sims/dos/face/C scores "sims" from pop. "dos" by commercial face system "C"
fing_x_face/sims/dos/face/G scores "sims" from pop. "dos" by commercial face system "G"
fing_x_face/sims/dos/li/V scores "sims" from pop. "dos" left index "li" fingers by NIST fingerprint system "V"
fing_x_face/sims/dos/ri/V scores "sims" from pop. "dos" right index "ri" fingers by NIST fingerprint system "V"

There are additional levels of tree structure under these. For example:

fing_x_face/sims/dos/face/C/output/1999330/00409944 contains scores from a face user image captured on the 330-th day of 1999

fing_x_face/sims/dos/face/V/output/1999330/00409944 likewise contains the fingerprint scores for the same image capturing event

The 6000-person fingerprint-fingerprint data is undated.
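The dated pathnames above can be taken apart mechanically. The following is a minimal sketch, assuming the eight-level layout shown in the examples (set, "sims", population, modality, system, "output", a seven-digit year and day-of-year, and a file name). It applies only to dated paths, since the layout of the undated fing_x_fing tree is not spelled out here. The function name parse_similarity_path and the dictionary keys are hypothetical.

```python
from datetime import datetime
from pathlib import PurePosixPath


def parse_similarity_path(path):
    """Split a dated similarity-file path into named components."""
    parts = PurePosixPath(path).parts
    # Assumed layout: set/sims/population/modality/system/output/YYYYDDD/file
    if len(parts) != 8 or parts[1] != "sims" or parts[5] != "output":
        raise ValueError(f"unexpected path layout: {path}")
    fusion_set, _, population, modality, system, _, yyyyddd, file_id = parts
    # "1999330" is read as year 1999, day-of-year 330 (%j is day of the year).
    captured = datetime.strptime(yyyyddd, "%Y%j").date()
    return {
        "set": fusion_set,
        "population": population,
        "modality": modality,
        "system": system,
        "captured": captured,
        # The trailing part of the path, as it would be looked up by name.
        "name": "/".join(parts[5:]),
        "file_id": file_id,
    }


info = parse_similarity_path("fing_x_face/sims/dos/face/C/output/1999330/00409944")
print(info["system"], info["captured"].isoformat())  # C 1999-11-26
```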

GENUINE VS IMPOSTOR

Whether a score is a genuine score or an impostor score can be recovered by knowing the identities of the persons appearing in the enrolled and user samples. The status of the k-th score in a similarity file is available by comparing the "subject_id" listed in the k-th entry in the enrollees.xml file with the "subject_id" of the element whose "name" in the users.xml file is the file's name.

Example:

fing_x_face/sims/dos/face/V/output/1999330/00409944 is a similarity file in BSSR1 - the latter part of its pathname appears as the 389-th entry in the probe set. The subject_id appearing there is C495CB6FF97ECBDE9187F2D134BD52E5. This subject_id also appears in entry 389 of the enrollees file, corresponding to an image that had the name output/2001011/00747702.

Although this mechanism is cumbersome here, it affords flexibility and generality in more complicated problems.
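The lookup described in this section can be sketched in a few lines. This is only an illustration under stated assumptions: the schema of users.xml and enrollees.xml is not given in this text, so the sketch assumes that each entry is an XML element carrying "name" and "subject_id" as attributes. The element names, the miniature XML strings, and the function names entries and label_scores are hypothetical.

```python
import xml.etree.ElementTree as ET


def entries(xml_text):
    """Return (name, subject_id) pairs in document order."""
    root = ET.fromstring(xml_text)
    # Assumption: "name" and "subject_id" are attributes on the same element.
    return [
        (el.get("name"), el.get("subject_id"))
        for el in root.iter()
        if el.get("name") is not None and el.get("subject_id") is not None
    ]


def label_scores(similarity_file_name, users_xml, enrollees_xml):
    """Label position k as genuine or impostor for one similarity file."""
    user_subject = dict(entries(users_xml))[similarity_file_name]
    # The k-th score is genuine when the k-th enrollee is the same subject.
    return [
        "genuine" if subject == user_subject else "impostor"
        for _, subject in entries(enrollees_xml)
    ]


# Hypothetical miniature files, not the real BSSR1 contents.
USERS = """<users>
  <entry name="output/day1/0001" subject_id="S1"/>
  <entry name="output/day1/0002" subject_id="S2"/>
</users>"""
ENROLLEES = """<enrollees>
  <entry name="output/day2/0101" subject_id="S2"/>
  <entry name="output/day2/0102" subject_id="S1"/>
  <entry name="output/day2/0103" subject_id="S3"/>
</enrollees>"""

labels = label_scores("output/day1/0001", USERS, ENROLLEES)
print(labels)  # ['impostor', 'genuine', 'impostor']
```

The returned list has the same order as enrollees.xml, so it can be zipped directly with the scores of the corresponding similarity file.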