Head Pose Image Database



    1. Image Database


            The head pose database is a benchmark of 2790 monocular face images of 15 persons, with pan and tilt angles varying from -90 to +90 degrees. For every person, 2 series of 93 images (93 different poses) are available. Having 2 series per person makes it possible to train and test algorithms on both known and unknown faces (cf. sections 3 and 4). Some people in the database wear glasses, and skin color varies across subjects. The background is deliberately neutral and uncluttered so that processing can focus on the face.
            Face positions in each image are labeled in an individual text file. Here is a small sample of a series:

[Figure: sample images from one series]



            The complete database is approximately 30 MB. Files are organized into directories. Each directory contains the images of one person, that is, 2 series. All images are in JPEG format. A few example series in MPEG format are available in section 7. The Front directory contains 30 frontal images (pan and tilt angles both equal to 0) of persons from the database. This series can be used to train or test on frontal images.


            Filenames are constructed according to the following grammar:

person[Id][Serie][Number][Tilt][Pan].jpg


    Id = {01, ..., 15}
                Number of the person,

    Serie =  {1, 2}
                Number of the serie,

    Number = {00, 01, ..., 92}
                Number of the file in the directory,

    Tilt = {-90, -60, -30, -15, 0, +15, +30, +60, +90}
                Vertical angle,

    Pan = {-90, -75, -60, -45, -30, -15, 0, +15, +30, +45, +60, +75, +90}
                Horizontal angle.


Each filename is unique. For example, the file person08123-30+45.jpg decomposes as follows:

    Id = 08
    Serie = 1
    Number = 23
    Tilt = -30
    Pan = +45
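
The decomposition above can be sketched in code. This is a minimal illustration, not part of the database distribution: the names parse_name and PoseName are hypothetical, and the pattern assumes that a zero angle may be written either as "0" or with a sign (for example "+0"), which the grammar does not specify.

```python
import re
from typing import NamedTuple

# Angle sets taken from the filename grammar.
TILTS = {-90, -60, -30, -15, 0, 15, 30, 60, 90}
PANS = set(range(-90, 91, 15))  # -90, -75, ..., +90


class PoseName(NamedTuple):
    person: int
    serie: int
    number: int
    tilt: int
    pan: int


# Assumption: a zero angle appears as "0" or as a signed value such as "+0".
_PATTERN = re.compile(
    r"person(\d{2})([12])(\d{2})([+-]\d{1,2}|0)([+-]\d{1,2}|0)\.(?:jpg|txt)$"
)


def parse_name(filename: str) -> PoseName:
    """Split a database filename into its five fields."""
    match = _PATTERN.search(filename)
    if match is None:
        raise ValueError(f"not a database filename: {filename!r}")
    person, serie, number, tilt, pan = (int(g) for g in match.groups())
    if not 1 <= person <= 15 or not 0 <= number <= 92:
        raise ValueError(f"field out of range in {filename!r}")
    if tilt not in TILTS or pan not in PANS:
        raise ValueError(f"unexpected angle in {filename!r}")
    return PoseName(person, serie, number, tilt, pan)


print(parse_name("person08123-30+45.jpg"))
# PoseName(person=8, serie=1, number=23, tilt=-30, pan=45)
```

The same pattern also accepts the .txt extension, since label files share the image's base name.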
 


                Negative values     Positive values
Tilt Angle      Bottom              Top
Pan Angle       Left                Right


When the vertical angle is -90 or +90, the person is looking straight down or straight up, and the horizontal angle is then 0. Each series therefore contains 7 x 13 + 2 x 1 = 93 images.
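
The count can be checked by enumerating the poses. The sketch below is illustrative only. It also assumes that files are numbered in ascending order of tilt, then pan; this ordering is an inference, although it is consistent with the example file above (Number 23 at tilt -30, pan +45).

```python
# Angle lists taken from the filename grammar.
TILTS = [-90, -60, -30, -15, 0, 15, 30, 60, 90]
PANS = list(range(-90, 91, 15))  # 13 values from -90 to +90


def enumerate_poses():
    """List every (tilt, pan) pair of one series.

    At tilt -90 or +90 only pan 0 exists; every other tilt has all 13 pans.
    """
    poses = []
    for tilt in TILTS:
        pans = [0] if abs(tilt) == 90 else PANS
        poses.extend((tilt, pan) for pan in pans)
    return poses


POSES = enumerate_poses()
print(len(POSES))           # 93
print(15 * 2 * len(POSES))  # 2790 images: 15 persons x 2 series x 93 poses
print(POSES[23])            # (-30, 45), matching the example file above
```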

    2. Face labels


            Face labels are stored as rectangle coordinates in an individual text file with the same base name as the image:


person[Id][Serie][Number][Tilt][Pan].txt


            The text file contains the following elements:


                [Corresponding Image File]
               
               Face
                [Face Center X]
                [Face Center Y]
                [Face Width]
                [Face Height]
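
A label file in this layout could be read as sketched below. The names parse_label and FaceLabel are hypothetical, the sample values are invented for illustration, and the code assumes one value per line, numeric coordinates, and that blank lines carry no meaning.

```python
from typing import NamedTuple, Tuple


class FaceLabel(NamedTuple):
    image: str
    center_x: float
    center_y: float
    width: float
    height: float

    def corners(self) -> Tuple[float, float, float, float]:
        """Convert center/size to (x_min, y_min, x_max, y_max)."""
        x_min = self.center_x - self.width / 2
        y_min = self.center_y - self.height / 2
        return (x_min, y_min, x_min + self.width, y_min + self.height)


def parse_label(text: str) -> FaceLabel:
    """Parse the contents of one label file (blank lines are skipped)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) < 6 or lines[1] != "Face":
        raise ValueError("unexpected label file layout")
    cx, cy, w, h = (float(value) for value in lines[2:6])
    return FaceLabel(lines[0], cx, cy, w, h)


# Invented sample content, for illustration only.
sample = "person08123-30+45.jpg\n\nFace\n192\n144\n100\n120\n"
label = parse_label(sample)
print(label.corners())  # (142.0, 84.0, 242.0, 204.0)
```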
 

    3. Testing on Known Faces


            Head pose estimation on known faces is evaluated by splitting the database into two groups. Each group gathers the same series (1 or 2) of all persons. The test is a 2-fold cross-validation on these two groups: train on one group and test on the other, then swap the roles. The figure below describes the process:

Figure: 2-fold cross-validation. Each square represents a series. Squares in the same column are series of the same person.
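
A minimal sketch of this split, assuming each series is identified by a (person, series) pair. The function name two_fold_splits is hypothetical.

```python
def two_fold_splits(persons=range(1, 16), series=(1, 2)):
    """Yield (train, test) lists of (person, series) pairs.

    Fold 1 trains on series 1 of every person and tests on series 2;
    fold 2 swaps the roles. Every person appears on both sides, so the
    test faces are always known to the trained model.
    """
    first, second = series
    for train_serie, test_serie in ((first, second), (second, first)):
        train = [(person, train_serie) for person in persons]
        test = [(person, test_serie) for person in persons]
        yield train, test


for train, test in two_fold_splits():
    print(len(train), "training series,", len(test), "test series")
```

Each fold uses 15 series (1395 images) for training and the other 15 for testing.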



    4. Testing on Unknown Faces


            Head pose estimation on unknown faces is evaluated with a Jack-Knife (also called Leave-One-Out) algorithm over the persons of the database. All images are used for training except those of one subject, which are used for testing. The test subject changes at each step, so the procedure is exhaustive over the 15 persons. No image of the same person appears in both the training and the testing sets:

Figure: Jack-Knife algorithm. Each square represents a series. Squares in the same column are series of the same person.
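
A minimal sketch of the leave-one-person-out procedure, again assuming each series is identified by a (person, series) pair. The function name leave_one_person_out is hypothetical.

```python
def leave_one_person_out(persons=range(1, 16), series=(1, 2)):
    """Yield (held_out, train, test) for each person in turn.

    Both series of the held-out person form the test set; both series
    of every other person form the training set.
    """
    for held_out in persons:
        train = [(p, s) for p in persons if p != held_out for s in series]
        test = [(held_out, s) for s in series]
        yield held_out, train, test


for held_out, train, test in leave_one_person_out():
    print(f"person {held_out:02d}: {len(train)} training series, "
          f"{len(test)} test series")
```

With 15 persons this produces 15 steps, each training on 28 series and testing on 2.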




    5. Image Acquisition


            All images were taken using the FAME Platform of the PRIMA Team at INRIA Rhône-Alpes. To obtain the different poses, we placed markers throughout the room. Each marker corresponds to a 2D pose (pan, tilt). Post-it notes are used as markers. The whole set of post-its covers a half-sphere in front of the person.

            To center the face in the image, the person is asked to adjust the chair so that the device is directly in front of them. The person stands at a distance of 2 meters from the camera. After this initialization phase, the person stares successively at the 93 post-its, without moving their eyes. All images were obtained using this method.
           

Figure: Top view

Figure: Side view

Figure: The FAME Platform



    6. FAQ


            Q: Are nose positions labeled?
            A: No, since face labels are provided as rectangles.

            Q: Why are filenames so complex?
            A: This keeps files ordered (from bottom to top and right to left) and makes the angles readable from the filename, while keeping every filename unique.
           
            Q: Which camera model did you use?
            A: Sony CCD EVI-D31 Pan-Tilt-Zoom Camera.


    7. Example Videos



Video-Person01                         Video-Person03                         Video-Person08






This database can be used for any purpose, provided that the following article is cited:


N. Gourier, D. Hall, J. L. Crowley
Estimating Face Orientation from Robust Detection of Salient Facial Features
Proceedings of Pointing 2004, ICPR, International Workshop on Visual Observation of Deictic Gestures, Cambridge, UK