M2R - MSc in Informatics

Graphics Vision and Robotics
and Ubiquitous Interface Systems

Computer Vision

MoSIG M2 2020-2021 Academic Year

Zoom - Thursdays 9:45 to 12:45

Professors: James L. Crowley and Edmond Boyer

Teaching Assistants: Yantao Wang and Nachwa Aboubakr

These class notes can be found at http://crowley-coutaz.fr/jlc/Courses/2020/GVR.VO/GVR-VO.html
The originally planned class schedule. Here is a pointer to the ADE class reservation system
Programming Teams

Source Files for lecture Notes

Thursday 1 Oct 2020 Lesson 1: Theory: Performance Evaluation for Recognition and Detection

Course Introduction: James Crowley
• Course Organisation
Computer Vision Theory: (Nachwa Aboubakr)

Pattern Recognition and Machine Learning

Performance Evaluation Metrics

             Exercise Questions (exam questions from past years)
Practical Instruction: Jupyter Notebooks, OpenCV, and FDDB.
         • Using OpenCV and Keras in Python with Jupyter Notebooks
         • Opening and displaying a face with the FDDB Data set
Programming Exercise 1: Displaying Faces from the FDDB data set
          Examples of Best Notebooks: Team2-Domps-Zhu.ipynb, Team4-Handowsk-Granzio.ipynb, Team11-Brusca-ALsaka.ipynb
Background Reading: The FDDB Data Base (Jain and Learned-Miller 2010)
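
The evaluation metrics covered in this lesson can be illustrated with a short sketch. It assumes the detector output has already been matched against ground truth and reduced to four counts (true positives, false positives, false negatives, true negatives). The function name and the counts below are hypothetical, chosen only for illustration; they are not taken from the lecture notes.

```python
def detection_metrics(tp, fp, fn, tn):
    """Return precision, recall, accuracy and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


# Hypothetical counts, for illustration only.
metrics = detection_metrics(tp=80, fp=20, fn=10, tn=890)
print(metrics)
```

With many more negative windows than faces, accuracy stays high even when precision or recall is poor, which is one reason to report precision and recall separately.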

Thursday 8 October 2020 - ENSIMAG Partners Day - no classes

Thursday 15 Oct 2020 Lesson 2: Visual Perception in Man and Machine (Recorded Lecture)

Computer Vision Theory (Recorded Lecture for Part 1)

Albedo and Reflectance

The Human Visual System

Vergence, Version and Fixation

Color Perception and Color Spaces

          Exercise Questions (exam questions from earlier years)
Practical Instruction (Recorded Lecture for Part 2)
           • Sliding Window Face Detectors
        • Programming Neural Networks in Keras
          • Detecting Faces with a 3 Layer MLP in Keras
Programming Exercise 2: Face Detection with a Multi-Layer Perceptron
Evaluation Data for Exercise 2 (from folds 9 and 10 of FDDB)
    Examples of Best Notebooks: Team12-HMedan-Zhong-Lab2.ipynb, Team2-DOMPS-ZHU-Lab2-buildDataset.ipynb, Team2-DOMPS-ZHU-Lab2-MLP.ipynb
Background Reading: (Rowley and Kanade 87)
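
As a rough illustration of the sliding window idea, the sketch below enumerates the window positions that a detector would classify one by one. The window size (24 pixels), the stride (8 pixels) and the function name are assumptions made for this example, not values prescribed by the exercise. The classifier itself (for example an MLP) is left out.

```python
def sliding_windows(width, height, win=24, stride=8):
    """Yield (left, top, w, h) for every window that fits inside the image."""
    for top in range(0, height - win + 1, stride):
        for left in range(0, width - win + 1, stride):
            yield (left, top, win, win)


# Hypothetical 64 x 48 image, for illustration only.
windows = list(sliding_windows(64, 48))
print(len(windows), windows[0], windows[-1])
```

Each window would be cropped from the image and passed to the face/non-face classifier. A smaller stride gives finer localisation at the cost of more windows to evaluate.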

Thursday 22 Oct 2020 Lesson 3: Scale Space and Image Pyramids (Recorded Lecture)

Computer Vision Theory

Scale Space

Gaussian function as a low-pass digital filter

Scale Invariant Gaussian Pyramids

Equivariance Properties of Scale Space

Practical Instruction: (Recorded Lecture)
Yangtao Wang's Jupyter Notebook demo
           • Constructing an Image pyramid with OpenCV
           • Detecting Faces at multiple scales with a pyramid
Programming Exercise 3: Detecting Faces at multiple scales with a pyramid and a sliding window MLP (updated 25 Oct) - Due on 11 Nov.
Background Reading: Face Detection with Half octave Pyramid (Ruiz 2008) (Crowley-Riff 2003)
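
The pyramid construction can be sketched in plain Python as repeated smoothing followed by subsampling. This is a simplified full-octave pyramid, not the half-octave pyramid of the background reading. It assumes a 5-tap binomial kernel as an approximation to the Gaussian and replicates border pixels; all names are hypothetical. In practice, OpenCV's `cv2.pyrDown` performs a comparable blur-and-downsample step.

```python
# 5-tap binomial kernel, a common approximation to a Gaussian (sums to 1).
KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]


def blur_1d(values):
    """Convolve a 1-D list with KERNEL, replicating the border samples."""
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(KERNEL):
            j = min(max(i + k - 2, 0), n - 1)  # clamp index at the borders
            acc += w * values[j]
        out.append(acc)
    return out


def blur(image):
    """Separable smoothing: filter the rows, then the columns."""
    rows = [blur_1d(r) for r in image]
    cols = [blur_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def build_pyramid(image, levels):
    """Return a list of images, each half the size of the previous one."""
    pyramid = [image]
    for _ in range(levels - 1):
        smoothed = blur(pyramid[-1])
        pyramid.append([row[::2] for row in smoothed[::2]])
    return pyramid


# Hypothetical 16 x 16 checkerboard image, for illustration only.
image = [[float((x + y) % 2) for x in range(16)] for y in range(16)]
pyramid = build_pyramid(image, levels=3)
sizes = [(len(p), len(p[0])) for p in pyramid]
print(sizes)
```

Running a fixed-size sliding window over every level lets the same detector respond to faces of different sizes.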

Thursday 29 October 2020 - Fall Vacation (Toussaint) - no classes

Thursday 5 Nov 2020 Lesson 4: Local Image Description with Receptive Fields (Recorded Lecture - Theory Part)

Computer Vision Theory: (Recorded Lecture)

Gaussian Derivatives

Using the Gaussian to compute image derivatives

Steerability and Intrinsic Orientation

Intrinsic Scale

Histogram of Oriented Gradients (Recorded Lecture for Practical Part)

Scale Invariant Feature Transform (SIFT)

Practical Instruction: Revised Version of Exercise 3 - Due 12 Nov. - Presentation Slides of Revised Exercise 3 - Recorded Lecture for Practical Part
          • Yangtao's Jupyter notebook for Half-Octave Gaussian Pyramid
Background Reading:
            • SIFT Paper (Lowe 1999)
            • Fast Computation of Receptive Fields - (Crowley-Riff 2003)
            • Fast computation of Characteristic (Intrinsic) scale - (Crowley-Piater 2003)
          • Face Detection with Gaussian Derivatives (Ruiz-Crowley 2008)

Thursday 12 Nov 2020 Lesson 5: Attention and Cognition for Computer Vision (Recorded Lecture)

Practical Instruction: Detecting and Tracking Faces in Video Sequences - (Recorded Lecture)
        • Bayesian Tracking process: Predict, Detect, Update
        • Tracking with Adaptive Background Subtraction
          • Face Tracking with Skin Color Blobs
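
The predict, detect, update cycle can be sketched with a scalar Kalman filter that tracks one coordinate of a face. This is a minimal sketch under strong assumptions: a static motion model, Gaussian noise, and hypothetical values for the process noise `q` and measurement noise `r`. A missed detection is represented by `None`, in which case only the prediction step is applied.

```python
def predict(x, p, q):
    """Static motion model: the estimate is kept, its variance grows by q."""
    return x, p + q


def update(x, p, z, r):
    """Blend the prediction with detection z according to the Kalman gain."""
    k = p / (p + r)
    return x + k * (z - x), (1 - k) * p


def track(detections, x0, p0, q=1.0, r=4.0):
    """Run predict / detect / update over a sequence of detections."""
    x, p = x0, p0
    estimates = []
    for z in detections:
        x, p = predict(x, p, q)
        if z is not None:  # detection found: correct the prediction
            x, p = update(x, p, z, r)
        estimates.append(x)
    return estimates


# Hypothetical horizontal face positions in pixels; None is a missed detection.
detections = [102.0, 98.0, None, 101.0, 100.0]
estimates = track(detections, x0=90.0, p0=100.0)
print(estimates)
```

A large initial variance `p0` makes the first detection dominate the initial guess; as the variance shrinks, each new detection shifts the estimate less.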

Example of OpenCV code to display video sequences (Zip file for download)
Programming Exercise 4: Experimental Performance Evaluation for Face Tracking.

Computer Vision Theory: Attention and Cognition for Computer Vision - (Recorded Lecture)
        • Cognitive Vision
        • Visual Concepts and Visual Attention
          • Structured Knowledge Representations
          • Situation Models.

Background Reading:
        Robust Face Tracking using Color - Schwerdt-Crowley 2000
        Benchmarking Face Tracking - Fischer et al 2011
        Evaluating Multiple Object Tracking Performance - Bernardin 2008

Thursday 19 November 2020 Lesson 6: Homogeneous Coordinates and Projective Camera Models (Recorded Lecture)

Computer Vision Theory: (Recorded Lecture) Homogeneous Coordinates and Projective Camera Models (Recorded Lecture Part 1, Part 2)
       • Homogeneous Coordinates and Tensor Notation
        • Homogeneous Coordinate Transforms
        • Homographies
        • The Projective Camera Model
        • Camera Calibration
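
A homography acts on a 2-D point by lifting it to homogeneous coordinates (x, y, 1), multiplying by a 3 x 3 matrix, and dividing by the third component. The sketch below shows this with plain Python lists. The two matrices are hypothetical examples, chosen only so that the arithmetic is easy to follow.

```python
def apply_homography(H, point):
    """Map a 2-D point through the 3 x 3 homography H."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity (w = 0)")
    return (xh / w, yh / w)  # back from homogeneous to image coordinates


# Hypothetical matrices, for illustration only.
H_translate = [[1, 0, 3],
               [0, 1, -2],
               [0, 0, 1]]    # pure translation by (3, -2)
H_projective = [[1, 0, 0],
                [0, 1, 0],
                [0.5, 0, 1]]  # third row makes w depend on x

print(apply_homography(H_translate, (1, 1)))
print(apply_homography(H_projective, (2, 4)))
```

When the last row is (0, 0, 1) the division by w has no effect and the transform is affine. A general last row produces the perspective foreshortening of a projective camera.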

Practical Instruction: More on Bayesian Tracking: Tips and Techniques (Recorded Lecture for Practical Part)

Instructions and Advice for mid-term report

Midterm Project Report Definition and Grading Scale (50% of the final Grade)

Report on performance evaluation for different techniques for face detection and face tracking
Report due by E-mail on 4 January 2021

Lesson 7: Structure from Motion (Edmond Boyer)

Thursday 26 Nov 2020 - 9:45 to 12:45

Lesson 8: Structure from Motion (Edmond Boyer)

Thursday 3 December 2020 - 9:45 to 12:45

Lesson 9: Shape Models (Edmond Boyer)

Thursday 10 December 2020 - 9:45 to 12:45

Lesson 10: Articulated Motion (Edmond Boyer)

Thursday 17 December 2020 - 9:45 to 12:45

Thursday 24 December 2020 - Christmas holidays - no class

Thursday 31 December 2020 - New Year's holidays - no class

Lesson 11: Articulated Motion (Edmond Boyer)

Thursday 7 January 2021 - 9:45 to 12:45

Lesson 12: Articulated Motion (Edmond Boyer)

Thursday 14 January 2021 - 9:45 to 12:45

Face Detection Data Sets:

FDDB dataset: Contains annotations for 5,171 faces in a set of 2,845 images.
WIDER FACE: A face detection benchmark dataset with 32,203 images and labels for 393,703 faces with a high degree of variability in scale, pose and occlusion.
Head Pose Data Set: Created by Nicolas Gourier in 2004.
MALF dataset: Face Detection in the Wild. MALF consists of 5,250 images and 11,931 faces.
AFW dataset: Face Detection in the Wild. The AFW dataset is built from Flickr images. It has 205 images with 473 labeled faces.
For each face, annotations include a rectangular bounding box, 6 landmarks and the pose angles.

Face Tracking Data Sets:

This link gives direct access to the individual directories of AVDIAR: A Dataset for Audio-Visual Diarization.
The dataset includes 27 short stereo video recordings of people walking and talking, along with hand-labeled bounding boxes. The audio files have been removed from this version.
(See https://team.inria.fr/perception/avdiar/ for the complete multi-modal dataset and additional documentation.)

Exercise Problem Sets / sample exam questions (Do these at home or in a group)

Lesson 1: Performance Evaluation
Lesson 2: Human Vision: Attention and Fixation
Lesson 3: Pyramids, Scale Space
Lesson 4: Gaussian Receptive Fields
Lesson 5: Bayesian Detection and Tracking
Lesson 6: Projective Transformations

Past Exams

M2R GVR 2016:    Computer Vision Exam from November 2016
M2R GVR 2014:    Computer Vision Exam from November 2014
M2R GVR 2012:    Computer Vision Exam from November 2012
M2R GVR 2011:    Computer Vision Exam from November 2011
M2R GVR 2010:    Computer Vision Exam from November 2010
M2R GVR 2009:    Computer Vision Exam from September 2009