Bengali Character Recognition using Deep Learning


The aim of this project is to develop a Bengali character recognizer. We collected publicly available datasets and pre-processed them for training, trained standard deep learning models on them, and then deployed the saved models in a web application.

Project status: Under Development

Artificial Intelligence

Groups
Student Developers for AI, DeepLearning

Intel Technologies
AI DevCloud / Xeon, Intel Opt ML/DL Framework, Intel Python

Overview / Usage

Bengali is the 4th most widely spoken language in the world and the 2nd in India. The Bengali language has a very rich character set: there are 10 numerals, 50 basic characters, and over 100 compound characters. This makes it challenging to build an efficient optical character recognizer (OCR) for it. Practical applications of OCR include reading aids for the blind and text search.
The aim of our project is to apply deep learning models to the recognition of Bengali characters and numerals. For training, we used publicly available datasets. We also explored how to develop a functional Bengali character and digit recognizer (BCR).

Methodology / Approach

The project consists of three main parts:

Pre-processing of datasets
We pre-processed every image in every dataset so that all datasets share a uniform format. The final output of pre-processing is a set of binarized images with white characters and numerals on a black background. For this step, we used the Intel-optimized Python OpenCV library.
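The binarization step can be illustrated with a small sketch. The project itself used OpenCV; the version below is dependency-free and works on plain Python lists so that the logic is visible. The choice of Otsu's method for the threshold, the rule "invert if white pixels are the majority", and the function names `otsu_threshold` and `binarize_white_on_black` are all assumptions made for illustration, not the project's actual code.

```python
def otsu_threshold(pixels):
    """Return an Otsu threshold (0-255) for a flat list of 8-bit gray values."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_low, sum_low = 0, 0
    for t in range(256):
        w_low += hist[t]            # pixels with value <= t
        if w_low == 0:
            continue
        w_high = total - w_low      # pixels with value > t
        if w_high == 0:
            break
        sum_low += t * hist[t]
        mean_low = sum_low / w_low
        mean_high = (sum_all - sum_low) / w_high
        # Between-class variance (up to a constant factor).
        var_between = w_low * w_high * (mean_low - mean_high) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize_white_on_black(image):
    """Binarize a grayscale image (list of rows of ints in 0..255).

    Returns rows of 0/255 values with white strokes on a black background.
    """
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat)
    binary = [[255 if p > t else 0 for p in row] for row in image]
    white = sum(v == 255 for row in binary for v in row)
    # Assumption: strokes cover less than half of the image, so a mostly
    # white result means the background is white and must be inverted.
    if white * 2 > len(flat):
        binary = [[255 - v for v in row] for row in binary]
    return binary


# Two toy 5x5 "scans" of the same vertical stroke with opposite polarity.
dark_on_light = [[20 if x == 2 else 230 for x in range(5)] for _ in range(5)]
light_on_dark = [[230 if x == 2 else 20 for x in range(5)] for _ in range(5)]
uniform_a = binarize_white_on_black(dark_on_light)
uniform_b = binarize_white_on_black(light_on_dark)
```

Both toy inputs end up in the same white-on-black form, which is the kind of uniformity across datasets described above. With OpenCV, the thresholding itself would typically be a single call to `cv2.threshold` with the Otsu flag.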

Training
We used a LeNet-5 model extended with inception and dropout layers. We saved the trained models so that they could be deployed in our application. For the training stage, we used the Intel-optimized TensorFlow library.
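To make the architecture more concrete, the sketch below traces tensor shapes through one possible LeNet-5 variant with an inception-style block and dropout. The layer sizes are not given in the project description: the 32x32 input, the classic LeNet-5 filter counts, the placement and branch widths of the inception block, and a single 60-class output (10 numerals plus 50 basic characters) are all assumptions for illustration. The code only does shape arithmetic and does not train anything.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def inception_out(size, branches):
    """Output (size, channels) of parallel 'same'-padded conv branches.

    branches maps an odd kernel size to its number of filters; the branch
    outputs are concatenated along the channel axis.
    """
    for kernel in branches:
        # 'same' padding for odd kernels keeps the spatial size unchanged.
        assert conv_out(size, kernel, padding=kernel // 2) == size
    return size, sum(branches.values())


def lenet5_inception_shapes(input_size=32, num_classes=60):
    """Trace layer output shapes for a hypothetical LeNet-5 variant."""
    shapes = []
    s = conv_out(input_size, 5)                  # 6 filters, 5x5, no padding
    shapes.append(("conv1", (s, s, 6)))
    s = conv_out(s, 2, stride=2)                 # 2x2 pooling
    shapes.append(("pool1", (s, s, 6)))
    s, c = inception_out(s, {1: 8, 3: 8, 5: 8})  # hypothetical branch widths
    shapes.append(("inception", (s, s, c)))
    s = conv_out(s, 2, stride=2)
    shapes.append(("pool2", (s, s, c)))
    shapes.append(("flatten", (s * s * c,)))
    shapes.append(("fc1", (120,)))
    shapes.append(("fc2", (84,)))
    shapes.append(("dropout", (84,)))            # dropout keeps the shape
    shapes.append(("output", (num_classes,)))
    return shapes


for name, shape in lenet5_inception_shapes():
    print(f"{name:10s} {shape}")
```

In a framework such as TensorFlow, each entry would correspond to one layer, and the dropout layer would be active only during training.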

Application development
We used the Python Django web framework to test our model in a real-world setting. Because all of the pre-processing and training steps were also written in Python, integrating them with the application was straightforward.
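One small piece of such an application is turning the model's output into a character that can be shown to the user. The sketch below does this for the numeral case only. It assumes the model returns ten class probabilities and that class index i corresponds to the Bengali digit i; the function name `decode_digit` and that label ordering are hypothetical and not taken from the project.

```python
# Bengali digits occupy the Unicode range U+09E6 (zero) to U+09EF (nine).
BENGALI_DIGITS = [chr(0x09E6 + i) for i in range(10)]


def decode_digit(probabilities):
    """Map a list of 10 class probabilities to (Bengali digit, confidence).

    Assumption: class index i stands for the digit i.
    """
    if len(probabilities) != len(BENGALI_DIGITS):
        raise ValueError("expected one probability per digit class")
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return BENGALI_DIGITS[best], probabilities[best]


# Example: a prediction that strongly favours class 3.
scores = [0.01, 0.01, 0.01, 0.91, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01]
digit, confidence = decode_digit(scores)
```

A Django view could call a helper like this after running the saved model on the pre-processed upload, then render the digit and its confidence in the response.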

Technologies Used

We leveraged various Intel resources for the project. All training and processing was performed on the Intel Nervana DevCloud. We used Intel-optimized Python 3 and the Intel distribution of Python libraries, including OpenCV and TensorFlow.
We used the Python Django web framework to develop the application.
