Accent Classification of Nigerian English Speakers

Using deep learning on spectrogram images of audio data to detect the accent of native Nigerian speakers.

Project status: Under Development

Artificial Intelligence

Groups
Student Developers for AI

Overview / Usage

One of the major challenges in speech recognition is understanding speech from non-native English speakers. Accent classification can improve an automatic speech recognition system by identifying the ethnicity of a speaker and switching to a recognizer trained for that particular accent. Accent recognition, which identifies a speaker's ethnicity, is also important in applications such as crime investigation. In this project, I am attempting to address this problem by classifying about an hour of accented voice clips into one of the target Nigerian native languages.

Methodology / Approach

The approach to accent classification in this project consists of feature extraction followed by machine learning classifiers. The three target languages are Igbo, Yoruba, and Hausa. I plan to gather audio data for these languages and split the recordings into short audio files based on the audio sample rate to form a large dataset. I will then denoise the audio signal using techniques such as Gaussian smoothing and median filtering, and use acoustic features such as MFCCs and spectrograms to represent the audio signals as images. I will also apply PCA to these features to reduce the dimensionality of the data while capturing its important variations. I will explore several machine learning classifiers for assigning Nigerian English speakers to one of the three languages: k-Nearest Neighbors, Support Vector Classifier, Multi-Layer Perceptron, and Convolutional Neural Network. My target is an accuracy of 96% in correctly classifying the accent of previously unseen Nigerian English speakers.
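
The preprocessing steps described above can be illustrated with a minimal sketch. It is not the project's actual code: it uses only NumPy, runs on a synthetic signal, and the function names (split_into_clips, median_filter, pca_reduce), the 8 kHz sample rate, the one-second clip length, and the kernel size are all assumptions chosen for illustration.

```python
import numpy as np

def split_into_clips(signal, sample_rate, clip_seconds=1.0):
    """Cut a 1-D signal into equal-length clips; a trailing partial clip is dropped."""
    clip_len = int(sample_rate * clip_seconds)
    n_clips = len(signal) // clip_len
    return signal[: n_clips * clip_len].reshape(n_clips, clip_len)

def median_filter(signal, kernel_size=5):
    """Sliding-window median with edge padding; kernel_size is assumed to be odd."""
    pad = kernel_size // 2
    padded = np.pad(signal, pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, kernel_size)
    return np.median(windows, axis=1)

def pca_reduce(features, n_components):
    """Project rows of `features` onto their top principal components via SVD."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Synthetic stand-in for a recording: 3.5 seconds of noise at 8 kHz (assumed values).
rng = np.random.default_rng(0)
sample_rate = 8000
signal = rng.standard_normal(int(3.5 * sample_rate))

clips = split_into_clips(signal, sample_rate, clip_seconds=1.0)
denoised = np.stack([median_filter(clip) for clip in clips])
reduced = pca_reduce(denoised, n_components=2)
print(clips.shape, denoised.shape, reduced.shape)
```

In a real pipeline, PCA would more likely be applied to MFCC or spectrogram features than to raw samples; the raw-sample version here only keeps the sketch short.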

Technologies Used

I am using the Librosa Python library to convert the audio data to spectrogram images, and the PyTorch deep learning framework to train models on those images.
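
As a rough illustration of the PyTorch side, the sketch below defines a small convolutional network that maps single-channel spectrogram images to three accent classes and runs one forward and backward pass. The architecture, the class name AccentCNN, the 64x128 input size, and the random tensor standing in for real spectrograms are all assumptions, not the project's actual model. In practice the input would typically come from Librosa, for example via librosa.feature.melspectrogram followed by librosa.power_to_db.

```python
import torch
from torch import nn

class AccentCNN(nn.Module):
    """Hypothetical small CNN: spectrogram image in, one logit per accent class out."""

    def __init__(self, n_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global average pool, so input size can vary
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x):
        x = self.features(x).flatten(1)  # (batch, 32)
        return self.classifier(x)        # (batch, n_classes)

torch.manual_seed(0)
model = AccentCNN(n_classes=3)

# Random stand-in for a batch of 4 spectrograms: (batch, channel, mel bins, frames).
batch = torch.randn(4, 1, 64, 128)
labels = torch.tensor([0, 1, 2, 0])  # e.g. 0 = Igbo, 1 = Yoruba, 2 = Hausa (assumed order)

logits = model(batch)
loss = nn.functional.cross_entropy(logits, labels)
loss.backward()  # gradients are now available for an optimizer step
print(logits.shape, loss.item())
```

The global average pooling layer is a design choice that lets clips of different lengths share one model; a fixed-size flatten would work equally well if all spectrograms have the same shape.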
