Introduction to Face Recognition using Python

Sample faces from the ORL Database of Faces (AT&T Laboratories Cambridge)

A few years ago (just a few, in 2003) I defended my degree thesis in Electronic Engineering: “Classification of Log-Gabor Features for Biometric Face Recognition”. It was a really engaging project and I achieved good results in terms of recognition rate using a wavelet transform and a neural network classifier. Afterward, I dropped the topic and did not develop it further. Maybe I will post some results someday.

Back to the present: a few days ago I decided to spend some spare time testing feature extraction and classification in Python, and I created this repository with a couple of tests. Here are the assumptions.

  • I am using “The ORL Database of Faces” to train and test a system for feature extraction and classification. The ORL database is among the simplest face databases: pictures of 40 individuals, 10 pictures each, for a total of 400 pictures, each a 92×112 grayscale bitmap. The faces are already aligned, normalized, and ready to pass through a feature extraction algorithm.
  • I am using 5 images per individual to train the system and the remaining 5 for testing through the classification algorithm. So, in total, there are 200 training faces and 200 testing faces across the 40 individuals. A minimal loading sketch is shown after this list.
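
Not the repository's exact code, but a minimal sketch of how the 5/5 split could be prepared, assuming the database is unpacked into the usual s1 … s40 folders of PGM files (the orl_faces directory name and the key naming are my own choices):

```python
from pathlib import Path
from PIL import Image

# Hypothetical layout: orl_faces/s1 ... orl_faces/s40, each containing 1.pgm ... 10.pgm
DATASET_DIR = Path("orl_faces")

train_set, test_set = {}, {}
subject_dirs = sorted(DATASET_DIR.glob("s*"), key=lambda p: int(p.name[1:]))
for subject_idx, subject_dir in enumerate(subject_dirs, start=1):
    images = sorted(subject_dir.glob("*.pgm"), key=lambda p: int(p.stem))
    for face_idx, path in enumerate(images, start=1):
        img = Image.open(path).convert("RGB")  # pretrained CNNs expect 3-channel input
        key = f"person{subject_idx}_face{face_idx}"
        # First 5 images per subject go to training, the remaining 5 to testing
        (train_set if face_idx <= 5 else test_set)[key] = img
```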

Feature extraction and iterative vector similarity classification

In the iter_test.py demo, feature extraction is performed using several pretrained models from the PyTorch ecosystem. One vector embedding is extracted for every face using Christian Safka’s img2vec library and stored in a dictionary:

{'person1_face1': vector1, 'person1_face2': vector2, ..., 'person2_face1': vector3, ...}
  • I tested the classification using different distance measures, with cosine similarity being the most effective (98% of the 200 test faces are recognized) at measuring how close the vector extracted from the face under test is to the vectors from the training set.
  • Classification is performed iteratively (brute force), which is quite slow, but it is sufficient for the sake of showing how a full system can be coded in a few lines. A sketch of this pipeline is shown below.
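
The following is a minimal sketch of that pipeline, not the author's exact script. It assumes the img2vec_pytorch package and the train_set/test_set dictionaries from the loading sketch above; the model name is one of those img2vec supports (e.g. 'resnet-18', 'alexnet', 'vgg'):

```python
import numpy as np
from img2vec_pytorch import Img2Vec

# Extract one embedding per face with a pretrained CNN
img2vec = Img2Vec(cuda=False, model='resnet-18')
train_vecs = {key: img2vec.get_vec(img) for key, img in train_set.items()}
test_vecs = {key: img2vec.get_vec(img) for key, img in test_set.items()}

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Brute-force classification: compare each test vector with every training vector
correct = 0
for test_key, test_vec in test_vecs.items():
    best_key = max(train_vecs, key=lambda k: cosine_similarity(test_vec, train_vecs[k]))
    # The prediction is correct when the matched training face belongs to the same person
    if best_key.split('_')[0] == test_key.split('_')[0]:
        correct += 1

print(f"Recognition rate: {correct / len(test_vecs):.1%}")
```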

HNSW classification

In the other test, hnswlib_test.py, I am using Hierarchical Navigable Small World (HNSW) similarity search for the classification, made available by the hnswlib Python library, and I achieve the same good results using cosine similarity (98% of the 200 test faces are recognized).
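
A minimal sketch of this approach, reusing the train_vecs and test_vecs dictionaries from the previous sketch (the index parameters below are illustrative choices, not the repository's settings):

```python
import hnswlib
import numpy as np

# Stack the training embeddings into a matrix, remembering which key each row belongs to
train_keys = list(train_vecs.keys())
train_matrix = np.vstack([train_vecs[k] for k in train_keys]).astype(np.float32)
dim = train_matrix.shape[1]  # e.g. 512 for resnet-18 embeddings

# Build an HNSW index over the training vectors using cosine distance
index = hnswlib.Index(space='cosine', dim=dim)
index.init_index(max_elements=len(train_keys), ef_construction=200, M=16)
index.add_items(train_matrix, np.arange(len(train_keys)))
index.set_ef(50)  # query-time accuracy/speed trade-off

# Classify each test face by its nearest training neighbour
correct = 0
for test_key, test_vec in test_vecs.items():
    labels, distances = index.knn_query(test_vec.astype(np.float32), k=1)
    best_key = train_keys[labels[0][0]]
    if best_key.split('_')[0] == test_key.split('_')[0]:
        correct += 1

print(f"Recognition rate: {correct / len(test_vecs):.1%}")
```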

Wrapping up

Using a Dense Convolutional Network and Hierarchical Navigable Small World similarity search, it is straightforward to create a face recognition system. In these simplified tests the provided faces are already normalized, centered, and ready to use. In the future I will test different models, such as multiple vectors per face (using a Log-Gabor wavelet transform to extract vector embeddings at different points), different kinds of indexing, or even a neural network to speed up the classification of test images. Using HNSW I observe a 10% to 20% performance improvement, and I expect a larger impact as the number of test images increases.

Model                       | Recognition rate | Brute force time | HNSW time
Dense Convolutional Network | 98%              | 17.76 s          | 15.15 s
ResNet-18                   | 96.5%            | 9.72 s           | 6.28 s
VGG-11                      | 97.5%            | 22.25 s          | 20.72 s
AlexNet                     | 97%              | 6.01 s           | 3.98 s

Comparison between different models and classification methods using the ORL Database of Faces

I am planning to set up an end-to-end system with face detection, preprocessing, modeling, and indexing for classification, possibly using a larger database of faces (e.g. the color FERET database). Check my tests in this repository.
