HoloXR


HoloXR: Augmented Reality done the right way using OpenCV and Dlib.

Virtual Reality, HPC, Artificial Intelligence

Overview / Usage

The HoloXR project is inspired by Snapchat-style filters built with OpenCV and Dlib. It uses Haar features and the Viola–Jones object detection framework implemented in OpenCV to detect the positions of faces and, within each face, the positions of the eyes and mouth. This information is then used to add different accessories to the faces (hat, moustache, etc.).

The Dlib implementation of this app uses a Histogram of Oriented Gradients (HOG) feature combined with a linear classifier, an image pyramid, and a sliding-window detection scheme to detect faces. It then locates the 68 facial landmarks using an ensemble of regression trees, as described in the paper One Millisecond Face Alignment with an Ensemble of Regression Trees.
In Augmented Reality the user looks at a live video image of reality. The video image is real-time and interactive, i.e. the user can choose where (s)he looks and can move around. Head-up displays, tablet PCs with cameras, or smartphones are often employed for this. The live video is analyzed by a computer, which deduces from image features where the user is and which way (s)he is looking. This way the computer can insert virtual objects into the scene so that they are displayed at the correct size and angle, i.e. the virtual objects seem to be “glued” to reality.

How it works
HaarClassifiers
The algorithm gives us the rectangles where faces might be, as can be seen in the next picture. After that, specialized Haar classifiers are used to find mouths, noses, and eyes inside the region recognized as a face. With that information it is relatively easy to superimpose the different "gadgets" on the image: we simply use the alpha channel (transparency) of the accessories to merge the images.
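The two steps above, cascade detection and alpha-channel merging, can be sketched as follows. This is a minimal illustration, not the project's actual code: the cascade file names are the ones shipped with OpenCV under `cv2.data.haarcascades`, and `overlay_alpha`/`detect_face_parts` are hypothetical helper names.

```python
import numpy as np

def overlay_alpha(frame, accessory, x, y):
    """Blend a BGRA accessory onto a BGR frame at (x, y) using its alpha channel."""
    h, w = accessory.shape[:2]
    roi = frame[y:y + h, x:x + w].astype(np.float32)
    alpha = accessory[:, :, 3:].astype(np.float32) / 255.0  # 0.0 transparent .. 1.0 opaque
    frame[y:y + h, x:x + w] = (alpha * accessory[:, :, :3] + (1.0 - alpha) * roi).astype(np.uint8)
    return frame

def detect_face_parts(bgr_frame):
    """Haar-cascade detection: faces first, then eyes inside each face region."""
    import cv2  # the .xml cascades ship with OpenCV under cv2.data.haarcascades
    face_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    eye_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_eye.xml")
    gray = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2GRAY)
    results = []
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5):
        # Search for eyes only inside the face rectangle, as described above.
        eyes = eye_cascade.detectMultiScale(gray[y:y + h, x:x + w])
        results.append(((x, y, w, h), eyes))
    return results
```

With a webcam frame, each face rectangle would anchor the accessory position, and `overlay_alpha` would paste it in.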
Dlib face and landmarks detection
The Haar-cascade technique is quite old, so detection often degrades when a face is slightly tilted, for example, and it cannot estimate the inclination of the head. Detection of facial features (eyes, nose, mouth, eyebrows, etc.) is also not robust with Haar classifiers. This is where the more recent techniques implemented in the dlib library come into play. Comparing the two detection schemes, dlib is more accurate and stable. As described above, we also get the main facial landmarks, which are used to detect the facial features and to estimate the tilt angle of the face.
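The dlib pipeline can be sketched as below. The shape-predictor model file is an external download (`shape_predictor_68_face_landmarks.dat` from dlib.net), so this is a sketch under that assumption; the function name is hypothetical.

```python
def detect_faces_and_landmarks(image, model_path="shape_predictor_68_face_landmarks.dat"):
    """HOG + linear-classifier face detection, then 68-landmark prediction (dlib)."""
    import dlib  # imported lazily: the model file must be downloaded separately

    detector = dlib.get_frontal_face_detector()   # HOG feature + sliding window
    predictor = dlib.shape_predictor(model_path)  # ensemble-of-regression-trees model
    results = []
    for face in detector(image, 1):  # 1 = upsample the image once to find smaller faces
        shape = predictor(image, face)
        points = [(shape.part(i).x, shape.part(i).y) for i in range(68)]
        results.append((face, points))
    return results
```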
The 68 facial landmarks are numbered as shown in the following figure. With these points we calculate the inclination, so the "accessories" are tilted accordingly. From the mouth points we can detect when the mouth is open and then display some trick, like a rainbow coming out.
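Both tricks reduce to simple geometry on the landmark points. A sketch, using the standard 68-point index convention (eyes at indices 36–41 and 42–47, inner lips at 62 and 66); the function names and the pixel threshold are illustrative assumptions.

```python
import numpy as np

# Standard index ranges in dlib's 68-landmark numbering.
RIGHT_EYE = range(36, 42)
LEFT_EYE = range(42, 48)
TOP_INNER_LIP = 62
BOTTOM_INNER_LIP = 66

def head_tilt_degrees(pts):
    """Roll angle of the head, from the line joining the two eye centers."""
    left = np.mean([pts[i] for i in LEFT_EYE], axis=0)
    right = np.mean([pts[i] for i in RIGHT_EYE], axis=0)
    dx, dy = left - right
    return np.degrees(np.arctan2(dy, dx))

def mouth_is_open(pts, threshold=10):
    """Compare the inner-lip gap against a pixel threshold (tunable assumption)."""
    gap = pts[BOTTOM_INNER_LIP][1] - pts[TOP_INNER_LIP][1]
    return gap > threshold
```

The tilt angle is then used to rotate the accessory image before blending, and `mouth_is_open` triggers the rainbow effect.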
