OpenGesture Skill

We propose a deep neural network for sign language and gesture recognition in natural video sequences running on a CPU, further integrated with voice-to-sign-language gesture translation. To effectively handle the complex evolution of pixels in videos, we propose decomposing motion and content, the two key components that generate the dynamics in videos. Furthermore, we integrate Alexa to provide a voice-to-sign-language translation system using speech and image processing techniques.

Project status: Under Development

RealSense™, HPC, Artificial Intelligence, PC Skills

Groups
Early Innovation for PC Skills

Intel Technologies
MKL, Movidius NCS, Intel CPU, OpenVINO

Overview / Usage

The OpenGesture skill seeks to simplify the process of learning and understanding sign language and gesture recognition, which are often a communication barrier.

The OpenGesture Skill uses a model built upon an Encoder-Decoder Convolutional Neural Network and a Convolutional LSTM for pixel-level prediction, which independently capture the spatial layout of an image and the corresponding temporal dynamics. By modelling hand motion and content independently, predicting the next frame reduces to transforming the extracted content features into the next-frame content using the identified hand motion features, which simplifies the prediction task.
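
As an illustration only (not the project's actual code), the following minimal sketch shows how such a motion-content decomposition could be wired up in TensorFlow/Keras; the layer sizes, the use of frame differences as the motion input, and all names are assumptions.

```python
# Illustrative sketch of motion-content decomposition for next-frame
# prediction: a CNN content encoder, a ConvLSTM motion encoder over frame
# differences, and a deconvolutional decoder. Sizes/names are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, Model

T, H, W, C = 4, 64, 64, 1  # number of past frames, frame height/width, channels

# Content encoder: spatial layout of the most recent frame.
last_frame = layers.Input(shape=(H, W, C), name="last_frame")
x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(last_frame)
content = layers.Conv2D(64, 3, strides=2, padding="same", activation="relu")(x)

# Motion encoder: temporal dynamics from frame-to-frame differences.
diffs = layers.Input(shape=(T - 1, H, W, C), name="frame_differences")
m = layers.ConvLSTM2D(32, 3, strides=2, padding="same",
                      return_sequences=True)(diffs)
motion = layers.ConvLSTM2D(64, 3, strides=2, padding="same",
                           return_sequences=False)(m)

# Combine content and motion features, then decode to the next frame.
fused = layers.Concatenate()([content, motion])
d = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(fused)
next_frame = layers.Conv2DTranspose(C, 3, strides=2, padding="same",
                                    activation="sigmoid", name="next_frame")(d)

model = Model(inputs=[last_frame, diffs], outputs=next_frame)
model.compile(optimizer="adam", loss="mse")
model.summary()
```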

Alexa handles speech recognition through a custom-built Speech-to-Sign-Language translation skill that recognizes the words being spoken, regardless of who the speaker is. The OpenGesture skill for Alexa performs recognition by matching the parameter set of the input speech against stored templates, and then displays the corresponding sign language in video format.
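
For illustration, here is a minimal Python sketch of an Alexa custom-skill handler on AWS Lambda that maps a recognized phrase to a sign-language video (the production skill is described as built on the .NET Framework); the "phrase" slot name and the video URL pattern are placeholders, not the actual skill's interaction model.

```python
# Illustrative Alexa custom-skill Lambda handler: take the recognized phrase
# from a slot and respond with a sign-language video via VideoApp.Launch.
# Slot name and video URL are placeholders; the skill must have the
# VideoApp interface enabled for this directive to work.
def lambda_handler(event, context):
    request = event.get("request", {})
    if request.get("type") != "IntentRequest":
        return _speak("Say a phrase and I will show it in sign language.")

    intent = request.get("intent", {})
    phrase = intent.get("slots", {}).get("phrase", {}).get("value", "hello")

    # Look up (or generate) the pre-rendered sign-language clip for the phrase.
    video_url = "https://example.com/signs/{}.mp4".format(phrase.replace(" ", "_"))

    # Note: shouldEndSession is omitted because it must not be set
    # alongside a VideoApp.Launch directive.
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText",
                             "text": "Here is '{}' in sign language.".format(phrase)},
            "directives": [{
                "type": "VideoApp.Launch",
                "videoItem": {"source": video_url,
                              "metadata": {"title": phrase}},
            }],
        },
    }

def _speak(text):
    # Plain speech response used when no intent was recognized.
    return {"version": "1.0",
            "response": {"outputSpeech": {"type": "PlainText", "text": text},
                         "shouldEndSession": False}}
```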

Methodology / Approach

OpenGesture Skill is an inference-based application supercharged by power-efficient Intel® processors and Intel® Processor Graphics on a laptop. Since OpenCV 3.4.2 (OpenVINO 2018 R2), the OpenVINO™ Inference Engine backend is used by OpenCV's DNN module by default whenever OpenCV is built with Inference Engine support, so no explicit backend selection is required; it is also the only available backend (and is enabled by default) when the loaded model is in OpenVINO™ Model Optimizer (IR) format. OpenGesture Skill is built on the .NET Framework and AWS Lambda to effectively handle NLP workloads.
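
As a minimal sketch of the backend selection described above (the model file names are placeholders, not the project's), loading an OpenVINO IR model through OpenCV's DNN module looks roughly like this:

```python
# Illustrative: run an OpenVINO Model Optimizer (IR) model with OpenCV DNN.
# File names and input size are placeholders.
import cv2

net = cv2.dnn.readNet("gesture_model.xml", "gesture_model.bin")

# With OpenCV >= 3.4.2 built with Inference Engine support these calls are
# optional for IR models, but they make the backend/target choice explicit.
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_INFERENCE_ENGINE)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)  # or DNN_TARGET_MYRIAD for a Movidius NCS

frame = cv2.imread("hand.jpg")
blob = cv2.dnn.blobFromImage(frame, scalefactor=1.0 / 255, size=(224, 224))
net.setInput(blob)
out = net.forward()
print(out.shape)
```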

Technologies Used

ZenBook UX430
Intel Core i7 8th Gen
16 GB DDR4 RAM
512 GB SSD Storage
Intel RealSense D435
Ubuntu 16.04 LTS
Intel OpenVINO Toolkit
Intel RealSense SDK 2.0 (see the capture sketch after this list)
OpenCV
AWS Lambda
Visual Studio
.NET Framework
Alexa Skills Kit
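
The capture sketch referenced in the list above: an illustrative way to pull color frames from the RealSense D435 with pyrealsense2 and hand them to OpenCV. The stream resolution and frame rate are assumptions, not the project's settings.

```python
# Illustrative: stream color frames from a RealSense D435 via pyrealsense2
# and display them with OpenCV. Resolution/FPS are assumptions.
import numpy as np
import pyrealsense2 as rs
import cv2

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
pipeline.start(config)

try:
    while True:
        frames = pipeline.wait_for_frames()
        color = frames.get_color_frame()
        if not color:
            continue
        image = np.asanyarray(color.get_data())  # HxWx3 BGR frame for OpenCV
        cv2.imshow("OpenGesture input", image)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
finally:
    pipeline.stop()
    cv2.destroyAllWindows()
```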

Repository

https://github.com/TebogoNakampe/OpenGesture
