Character RNN to generate Nepalese text
Robin Ranabhat
Dhulikhel, Central Development Region
I trained a character RNN to generate new Nepali text and to see for myself how these tiny LSTM cells remember such outrageously complex word sequences.
Project status: Under Development
Intel Technologies
Intel Opt ML/DL Framework
Overview / Usage
I trained a 2-layer sequence-to-sequence character-generation LSTM model on Nepali articles. Nepali is harder for the network to learn than English because Devanagari is a complex script in which consonants, vowel signs, and joining marks combine into a single written syllable. The LSTM did a pretty good job of keeping those combinations intact.
An example of the generated text: झा प्रहरी गर्ने सम्ता गर्ने सम्ता गर्ने सम्ता गर्ने सम्ता सम्ति गर्ने सम्ता गर्ने सम्ता सम्ति गर्ने सम्ता गर्ने सम्ता सम .
It's quite amazing that the LSTM generated the text above, which is syntactically correct. For those not familiar with Devanagari, my vocabulary size was 87 characters.
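To make the idea of a character vocabulary concrete, here is a minimal sketch of how one might be built. It is not taken from the repository: the sample string and the names `char_to_idx` and `idx_to_char` are assumptions chosen for illustration. It also shows why Devanagari combinations are hard to learn. A single written cluster such as "प्र" is stored as three separate code points, so the network has to emit all three in the right order.

```python
# Illustrative only: a tiny stand-in corpus, not the scraped dataset.
# "\u092a\u094d\u0930\u0939\u0930\u0940" is the word "प्रहरी" from the sample output.
text = "\u092a\u094d\u0930\u0939\u0930\u0940 \u0917\u0930\u094d\u0928\u0947"

# The vocabulary is the set of distinct code points seen in the corpus.
chars = sorted(set(text))
char_to_idx = {c: i for i, c in enumerate(chars)}
idx_to_char = {i: c for c, i in char_to_idx.items()}

# Encode the text as integer ids and decode it back.
encoded = [char_to_idx[c] for c in text]
decoded = "".join(idx_to_char[i] for i in encoded)

# The conjunct "प्र" is consonant + virama (joining mark) + consonant.
conjunct = "\u092a\u094d\u0930"
conjunct_ids = [char_to_idx[c] for c in conjunct]

print(len(chars), "distinct characters in this toy corpus")
print(len(conjunct), "code points in one written cluster")
```

On the full corpus the same procedure would yield the complete vocabulary; the figure of 87 comes from the project description, not from this toy string.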
Methodology / Approach
The model is a 2-layer LSTM with 256 units in each layer, trained for 3 hours. A softmax over the vocabulary then gives a probability for each character, and the most probable character is generated.
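The softmax step can be sketched in plain Python as follows. This is a hedged illustration rather than the project's code: the function names, the three-character vocabulary, and the logit values are all made up for the example.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def most_probable_char(logits, idx_to_char):
    """Return the highest-probability character and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return idx_to_char[best], probs[best]


# Hypothetical 3-character vocabulary and hypothetical network scores.
toy_vocab = {0: "\u0915", 1: "\u0916", 2: "\u0917"}  # क, ख, ग
char, prob = most_probable_char([0.1, 2.0, -1.0], toy_vocab)
print(char, round(prob, 3))
```

Always taking the single most probable character is the simplest choice, but it tends to fall into loops, which may be one reason the sample output in the overview repeats the same phrase. Sampling from the probabilities instead is a common alternative.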
I used the deep learning framework Keras with the TensorFlow backend to train the model.
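A model of the shape described here could look roughly like the sketch below. Only the two LSTM layers, the 256 units, the softmax output, and the vocabulary size of 87 come from the description. The input window of 40 characters, the one-hot input encoding, the Adam optimizer, and the function name `build_model` are assumptions, and the actual code in the repository may differ.

```python
from tensorflow import keras
from tensorflow.keras import layers

VOCAB_SIZE = 87  # from the project description
SEQ_LEN = 40     # assumed context window; not stated in the description


def build_model(seq_len=SEQ_LEN, vocab_size=VOCAB_SIZE):
    """Two stacked LSTM layers followed by a softmax over characters."""
    model = keras.Sequential([
        keras.Input(shape=(seq_len, vocab_size)),  # one-hot characters (assumed)
        layers.LSTM(256, return_sequences=True),   # first layer passes the full sequence on
        layers.LSTM(256),                          # second layer returns the final state only
        layers.Dense(vocab_size, activation="softmax"),
    ])
    # Loss and optimizer are assumptions, chosen as common defaults.
    model.compile(loss="categorical_crossentropy", optimizer="adam")
    return model


model = build_model()
model.summary()
```

Given a window of `SEQ_LEN` one-hot encoded characters, this model outputs one probability per vocabulary entry for the next character.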
Technologies Used
Python: Keras with the TensorFlow backend, Scrapy to scrape the Nepali dataset, and NumPy.
Repository
https://github.com/whatsGr8t/Devenagari-character-RNN-Model-python-keras