Devanagari Article Classifier
Robin Ranabhat
Dhulikhel, Central Development Region
A research project applying several machine learning models to a Nepalese news text corpus.
Project status: Published/In Market
Intel Technologies
Other
Overview / Usage
What sets this project apart is that the work was done on a Devanagari corpus. We also experimented with generating word embeddings from the Nepali news corpus. For the project, we scraped 20,000 articles from several Nepali online news portals. Our classifier achieves significant accuracy in classifying news articles by genre.
Methodology / Approach
- Scraped news article data from news portals.
- Preprocessed the raw text dataset.
- Converted the articles into vectors using several approaches, such as word embeddings, tf-idf, and count-based representations.
- Applied different machine learning models to train the classifier.
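The preprocessing, vectorization, and classification steps can be illustrated with a minimal sketch. It is not the project's actual code: it uses only the Python standard library, covers tf-idf only (of the vectorization approaches listed above), and substitutes a simple nearest-centroid classifier for the unspecified models. The toy corpus, the genre labels, and all function names (`tokenize`, `fit_idf`, `tfidf`, `train`, `predict`) are hypothetical and chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus (NOT the project's data): (article text, genre) pairs.
TRAIN = [
    ("नेपाल क्रिकेट टोली ले खेल जित्यो ।", "sports"),
    ("फुटबल खेल मा टोली को जित ।", "sports"),
    ("सरकार ले नयाँ बजेट ल्यायो ।", "politics"),
    ("संसद मा बजेट माथि सरकार को छलफल ।", "politics"),
]

# The Devanagari block is U+0900..U+097F. The danda marks (U+0964, U+0965)
# are punctuation, so they are excluded from the token pattern.
TOKEN_RE = re.compile(r"[\u0900-\u0963\u0966-\u097F]+")


def tokenize(text):
    """Split text into runs of Devanagari characters, dropping punctuation."""
    return TOKEN_RE.findall(text)


def fit_idf(docs):
    """Compute smoothed inverse document frequencies from tokenized docs."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}


def tfidf(doc, idf):
    """Return an L2-normalized sparse tf-idf vector (dict) for one document."""
    counts = Counter(tok for tok in doc if tok in idf)
    vec = {tok: c * idf[tok] for tok, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {tok: v / norm for tok, v in vec.items()}


def train(samples):
    """Fit idf weights and sum the tf-idf vectors of each genre."""
    docs = [tokenize(text) for text, _ in samples]
    idf = fit_idf(docs)
    centroids = {}
    for doc, (_, label) in zip(docs, samples):
        centroid = centroids.setdefault(label, Counter())
        for tok, value in tfidf(doc, idf).items():
            centroid[tok] += value
    return idf, centroids


def predict(text, idf, centroids):
    """Pick the genre whose centroid has the highest cosine similarity."""
    vec = tfidf(tokenize(text), idf)

    def cosine(centroid):
        norm = math.sqrt(sum(v * v for v in centroid.values())) or 1.0
        return sum(v * centroid.get(tok, 0.0) for tok, v in vec.items()) / norm

    return max(centroids, key=lambda label: cosine(centroids[label]))


idf, centroids = train(TRAIN)
print(predict("क्रिकेट खेल मा नेपाल को जित", idf, centroids))
print(predict("सरकार को नयाँ बजेट", idf, centroids))
```

In a real pipeline, the hand-written tf-idf and centroid steps would normally be replaced by library implementations, and the same tokenized articles could be used to train word embeddings.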
Technologies Used
Python -> numpy, scrapy, gensim, nltk, scipy
Repository
https://github.com/whatsGr8t/Nepalese-News-Classifier-with-word2vec-model