Speech Assistance using Artificial Neural Network

Ramy Mounir

This project is intended to help people with speech impediments use speech-to-text applications with better accuracy.

Robotics, Artificial Intelligence

Groups
Student Developers for AI

Overview / Usage

As the name suggests, we will use a Deep Bidirectional Recurrent Neural Network with LSTM cells (DBRNN), aiming to achieve the state-of-the-art performance described by Graves et al. on a normal speech dataset (no speech impediment). The model will use Mel Frequency Cepstral Coefficients (MFCCs) for filtering and feature extraction. Connectionist Temporal Classification (CTC) will serve as the cost function, which lets the network align and label unsegmented sequences. A word-to-ARPAbet phoneme dictionary from CMU is used here as well.
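To make the CTC labeling step more concrete, here is a minimal sketch of best-path decoding: take the most likely label in each frame, merge consecutive repeats, and drop the blank label. It uses plain Python rather than TensorFlow, and the blank symbol, the label set, and the frame probabilities are all made up for illustration; it is not the project's actual decoder.

```python
BLANK = "_"  # assumed symbol for the CTC blank label


def ctc_best_path(frame_probs, labels):
    """Pick the most likely label per frame, then merge repeats and drop blanks."""
    # frame_probs: one probability list per audio frame, aligned with `labels`
    best = [labels[max(range(len(p)), key=p.__getitem__)] for p in frame_probs]
    decoded = []
    prev = None
    for sym in best:
        # A label is emitted only when it differs from the previous frame
        # and is not the blank.
        if sym != prev and sym != BLANK:
            decoded.append(sym)
        prev = sym
    return decoded


# Hypothetical label set: blank plus three ARPAbet phonemes
labels = [BLANK, "HH", "AY", "IY"]
frame_probs = [
    [0.10, 0.70, 0.10, 0.10],  # HH
    [0.10, 0.60, 0.20, 0.10],  # HH again (merged with the previous frame)
    [0.80, 0.10, 0.05, 0.05],  # blank
    [0.10, 0.10, 0.70, 0.10],  # AY
    [0.20, 0.10, 0.60, 0.10],  # AY again (merged)
]
phonemes = ctc_best_path(frame_probs, labels)
print(phonemes)  # ['HH', 'AY']
```

Because repeats are merged and blanks removed, five frames map to two phonemes without any frame-level alignment in the training labels, which is the property that makes CTC suitable for unsegmented speech.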

Output phonemes are then post-processed by altering the phoneme sequence to generate potential words. Those words are then fed to another recurrent neural network that acts as a language model, assigning probabilities to words given the previous word(s). Beam search will be used to scan possible sentences efficiently.
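The sentence search described above can be sketched roughly as follows. This is only an illustration: a small hand-written bigram table stands in for the recurrent language model, and the probabilities, the `FLOOR` back-off value, and the candidate words are hypothetical.

```python
import math

# Hypothetical bigram probabilities P(word | previous word); "<s>" marks sentence start.
# In the project this role is played by a recurrent neural network language model.
BIGRAM = {
    ("<s>", "i"): 0.6, ("<s>", "eye"): 0.1,
    ("i", "see"): 0.5, ("i", "sea"): 0.05,
    ("eye", "see"): 0.1, ("eye", "sea"): 0.1,
}
FLOOR = 1e-4  # assumed probability for word pairs missing from the table


def beam_search(candidates, beam_width=2):
    """candidates: one list of candidate words per position in the sentence."""
    beams = [(0.0, ["<s>"])]  # (log-probability, words so far)
    for options in candidates:
        expanded = []
        for score, words in beams:
            for word in options:
                p = BIGRAM.get((words[-1], word), FLOOR)
                expanded.append((score + math.log(p), words + [word]))
        # Keep only the `beam_width` most probable partial sentences.
        expanded.sort(key=lambda b: b[0], reverse=True)
        beams = expanded[:beam_width]
    # Strip the start marker before returning.
    return [(score, words[1:]) for score, words in beams]


# Two positions, each with two candidate words produced from similar phoneme sequences
best = beam_search([["i", "eye"], ["see", "sea"]])
print(best[0][1])  # ['i', 'see']
```

Pruning to a fixed beam width keeps the number of partial sentences constant at each step, instead of letting it grow with every combination of candidate words.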

This project is likely to have a speaker-dependent version to increase the accuracy of automatic speech recognition. TensorFlow is the framework to be used in this project. A GUI will be designed using Qt Designer for easy use and demonstrations.
