Face Recognition in Video


Biometric evaluation of face recognition on single images is well established in the literature. Applying the same classification algorithms to video, however, brings its own challenges: blurring due to motion, changes in lighting, and the sheer volume of faces. This project therefore investigates the problem of labeling faces in videos with a supervised name-face recognition approach based on deep learning. Applications include recognition of celebrities in videos, labeling of characters in feature films, and face recognition in CCTV camera footage.

Project status: Under Development

Artificial Intelligence

Intel Technologies
AI DevCloud / Xeon, Intel Opt ML/DL Framework


Overview / Usage

We have implemented a method for recognising faces in a video feed using a deep neural network (a combination of convolutional, max-pooling, and fully connected layers), with the goal of achieving satisfactory real-time performance.

Current Progress:

  1. We have implemented the face detection module, which is the preprocessing stage of our face recognition system.
  2. We have designed the skeleton of our classification pipeline.
  3. We have trained the model to 97.41% validation accuracy.

Future Steps:

  1. The model still needs to be fine-tuned for real-time performance.

Methodology / Approach

Methodology:

The first module of the face recognition system is face detection, which is built with the TensorFlow Object Detection API. We have modified it to detect and localise faces in each video frame. It outputs the locations of the faces in the frame, which are then cropped and fed to the classification module.
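The cropping step can be sketched as follows. This is a minimal illustration, not our detection code: it assumes the detector returns boxes in the normalised [ymin, xmin, ymax, xmax] format that the TensorFlow Object Detection API uses for its detection boxes, and the function name `crop_faces` and the `min_score` threshold of 0.5 are hypothetical choices made for the example.

```python
import numpy as np


def crop_faces(frame, boxes, scores, min_score=0.5):
    """Return one crop per detection whose score is at least min_score.

    frame  : H x W x 3 image array
    boxes  : N x 4 array of normalised [ymin, xmin, ymax, xmax] rows
    scores : N detection confidences
    min_score is an assumed threshold, not a value from the project.
    """
    height, width = frame.shape[:2]
    crops = []
    for (ymin, xmin, ymax, xmax), score in zip(boxes, scores):
        if score < min_score:
            continue
        # Convert normalised coordinates to pixel indices.
        top, bottom = int(ymin * height), int(ymax * height)
        left, right = int(xmin * width), int(xmax * width)
        if bottom > top and right > left:
            crops.append(frame[top:bottom, left:right].copy())
    return crops


# Dummy frame with one confident and one weak detection.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
boxes = np.array([[0.25, 0.25, 0.75, 0.5], [0.0, 0.0, 0.1, 0.1]])
scores = np.array([0.9, 0.2])
faces = crop_faces(frame, boxes, scores)
```

Only the first detection passes the threshold here, so `faces` holds a single crop that can be passed on to the classifier.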

For classification we use the VGGFace network architecture. The base network has 19 layers, made up of 5 blocks of convolution and max-pooling layers. On top of that, two fully connected layers followed by dropout give it the ability to generalise over multiple classes. The last layer is a softmax, which outputs a probability for each of the 534 classes; the class with the highest probability is the prediction. In total the network has 24 layers.
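The final softmax step can be illustrated with a few lines of NumPy. This is only a sketch of how a class is picked from 534 raw scores; the helper names and the placeholder class names (`person_0` and so on) are assumptions made for the example, not identifiers from our pipeline.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def predict_labels(logits, class_names):
    """Return (name, probability) of the most probable class for each row."""
    probs = softmax(logits)
    best = probs.argmax(axis=-1)
    return [(class_names[i], float(probs[row, i])) for row, i in enumerate(best)]


# Placeholder names standing in for the 534 identities.
class_names = [f"person_{i}" for i in range(534)]

# Two fake score vectors, each with one clearly preferred class.
logits = np.zeros((2, 534))
logits[0, 10] = 5.0
logits[1, 533] = 3.0
predictions = predict_labels(logits, class_names)
```

Each entry of `predictions` pairs the winning class name with its probability, which is also a convenient value to threshold on when a face should be reported as unknown.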

The success of a deep learning project depends heavily on collecting a large amount of good-quality data. We used the FaceScrub dataset, which contains images of 530 actors and actresses, with about 150 images per class on average. We also added 4 classes of our own, giving 534 classes in total. The dataset images were retrieved from the Internet and were taken in real-world situations (uncontrolled conditions).

Number of images:
Male (265 people): 55,306 images
Female (265 people): 51,557 images

Another important consideration is the use of techniques that make training faster and more reliable. We used transfer learning, in which knowledge learned on one task is reused to perform another. We loaded the pre-trained VGGFace weights into the model and then froze the first 7 layers. Frozen layers do not take part in the backward pass of the network; they only run the forward pass. Because their weights are never updated, their output for a given image is the same in every epoch, so it can be computed once and saved to disk. Pre-computing the output of these 7 layers reduced the time per training epoch by about 44.44%. The resulting recognition system recognises faces with 97.41% validation accuracy.
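The caching idea can be sketched independently of any deep learning framework. In the example below, a fixed linear map followed by ReLU stands in for the 7 frozen VGGFace layers; the function names, the feature size of 16, and the cache file name are all hypothetical. The point is only that the frozen part runs once and later calls read its output back from disk.

```python
import os
import tempfile

import numpy as np


def frozen_forward(images, frozen_weights):
    """Stand-in for the frozen layers: fixed linear map followed by ReLU."""
    flat = images.reshape(len(images), -1)
    return np.maximum(flat @ frozen_weights, 0.0)


def load_or_compute_features(images, frozen_weights, cache_path):
    """Run the frozen layers once; afterwards reuse the saved output."""
    if os.path.exists(cache_path):
        return np.load(cache_path)
    features = frozen_forward(images, frozen_weights)
    np.save(cache_path, features)
    return features


rng = np.random.default_rng(0)
images = rng.random((8, 96, 96, 3), dtype=np.float32)
frozen_weights = rng.standard_normal((96 * 96 * 3, 16)).astype(np.float32)

cache_path = os.path.join(tempfile.mkdtemp(), "frozen_features.npy")
first = load_or_compute_features(images, frozen_weights, cache_path)   # computed, then saved
second = load_or_compute_features(images, frozen_weights, cache_path)  # loaded from disk
```

During training, only the trainable layers would then be fitted on the cached features, so each epoch skips the forward pass through the frozen layers. Note that this only works when the inputs are the same in every epoch, for example when no random augmentation is applied before the frozen layers.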

Algorithm:

  1. Start.
  2. Take video as input.
  3. Localize the faces in the input video using the detection module.
  4. Crop the faces in each video frame.
  5. Resize each face to a resolution of 96×96.
  6. Convert the faces into a 4D tensor of shape (number of samples, 96, 96, 3).
  7. Feed the converted tensor to the trained pipeline.
  8. Draw a bounding box with a label on each face.
  9. End
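Steps 5 and 6 of the algorithm can be sketched as below. To keep the example dependency-free, a simple nearest-neighbour resize written in NumPy stands in for an image library call such as OpenCV's `cv2.resize`; the function names are hypothetical, and any pixel scaling or normalisation the trained model expects is deliberately left out because it is not specified here.

```python
import numpy as np


def resize_nearest(image, size=96):
    """Nearest-neighbour resize of an H x W x C image to size x size x C."""
    rows = np.arange(size) * image.shape[0] // size
    cols = np.arange(size) * image.shape[1] // size
    return image[rows][:, cols]


def faces_to_batch(faces, size=96):
    """Resize every face crop and stack them into one 4D float32 tensor."""
    resized = [resize_nearest(face, size) for face in faces]
    return np.stack(resized).astype(np.float32)


# Two fake face crops of different sizes.
faces = [
    np.full((240, 160, 3), 128, dtype=np.uint8),
    np.full((50, 70, 3), 64, dtype=np.uint8),
]
batch = faces_to_batch(faces)
```

The resulting `batch` has shape (number of samples, 96, 96, 3) and can be fed to the classifier in a single call, which is usually cheaper than classifying each face separately.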

Technologies Used

Hardware:

  1. Intel AI DevCloud hosted by Colfax

Software:

  1. Python 3.6.3 (Intel Distribution)
  2. Python packages:
    (a) TensorFlow
    (b) Keras
    (c) OpenCV
    (d) NumPy