SCRU model

Taimur Muhammad Khan

Rawalpindi, Punjab

SCRU (Spelling Correction for Roman Urdu) is the first model of its kind for Roman Urdu: it corrects spelling mistakes. The model was developed using well-known Python libraries. We now plan to enhance it with the oneAPI toolkits.

Project status: Under Development

oneAPI

Groups
Student Developers for oneAPI

Intel Technologies
oneAPI, Intel Python, Intel CPU, Intel Opt ML/DL Framework

Overview / Usage

In this project, I've developed a spelling corrector for Roman Urdu in Python. The model corrects non-word errors (misspellings that are not themselves valid words) using the noisy channel model. This spell correction technique has been widely used by word processors and is also used by Google in its search engine: when you type a query containing a misspelled word such as "corection", Google returns the results for "correction" instead. I would further like to use the oneAPI ML/DL libraries to improve the model's performance on Intel CPUs.
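To make the idea concrete, here is a minimal sketch of a noisy-channel-style corrector for non-word errors. It is not the SCRU implementation: the toy corpus, the function names (`prior`, `edits1`, `correct`) and the simplification that every single-edit candidate is equally likely under the channel model are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy corpus of correctly spelled Roman Urdu words (illustration only).
CORPUS = "kya haal hai aap ka kya naam hai mera naam taimur hai aap kaise hain".split()
WORD_COUNTS = Counter(CORPUS)
TOTAL = sum(WORD_COUNTS.values())
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def prior(word):
    """Language model P(c): unigram relative frequency in the corpus."""
    return WORD_COUNTS[word] / TOTAL


def edits1(word):
    """All strings one edit (delete, transpose, replace, insert) away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + ch + right[1:] for left, right in splits if right for ch in ALPHABET]
    inserts = [left + ch + right for left, right in splits for ch in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def correct(word):
    """Return the most probable known word for a non-word error.

    Assumption: the channel model P(w | c) is uniform over all candidates
    one edit away, so ranking reduces to the language model P(c).
    """
    if word in WORD_COUNTS:  # already a known word, so not a non-word error
        return word
    candidates = [c for c in edits1(word) if c in WORD_COUNTS]
    if not candidates:  # nothing known within one edit: leave unchanged
        return word
    return max(candidates, key=prior)


print(correct("nam"))  # a missing letter
print(correct("hia"))  # two transposed letters
```

A fuller model would replace the uniform channel assumption with error probabilities estimated from observed misspellings, and would use a much larger corpus for the language model.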

Methodology / Approach

I used numpy and pandas, together with the standard-library modules itertools, collections, difflib and operator, to build and train my noisy channel model. The project has no frontend yet. The papers, concepts and sources below helped me build this project:

  1. Bayes' theorem
  2. Noisy channel paper: https://aclanthology.org/C90-2036.pdf
  3. Language model
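
The three items above fit together in the standard noisy channel decision rule, written out here as a sketch. The notation is an assumption of this note rather than taken from the project: w is the observed (misspelled) word, c is a candidate correction, and V is the vocabulary of known words.

```latex
\hat{c} = \arg\max_{c \in V} P(c \mid w)
        = \arg\max_{c \in V} \frac{P(w \mid c)\, P(c)}{P(w)}
        = \arg\max_{c \in V} P(w \mid c)\, P(c)
```

The second step is Bayes' theorem. The denominator P(w) is the same for every candidate, so it can be dropped without changing the argmax. P(c) is the language model (how likely the word is in general) and P(w | c) is the channel, or error, model (how likely c is to be mistyped as w).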

Technologies Used

The model was trained on an Intel CPU with an ordinary amount of RAM. I'm also exploring the use of Intel's oneAPI libraries and toolkits for this project. I hope the ML/DL libraries provided by oneAPI can improve performance by a significant margin.
