Devanagari Article Classifier

Robin Ranabhat

Dhulikhel, Central Development Region

A research project that applies several machine learning models to a Nepali news text corpus.

Project status: Published/In Market

Artificial Intelligence

Intel Technologies
Other

Code Samples [1]

Overview / Usage

What sets this project apart is that the work was done on a Devanagari corpus. We also experimented with generating word embeddings from a Nepali news corpus. For the project, we scraped 20,000 articles from several Nepali online news portals. Our classifier achieves significant accuracy in classifying news articles by genre.

Methodology / Approach

  1. Scraped news articles from online news portals.
  2. Preprocessed the raw text dataset.
  3. Converted the articles into vectors using several approaches, including word embeddings, tf-idf, and count-based representations.
  4. Trained classifiers on those vectors using different machine learning models.
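
The steps above can be illustrated with a small, self-contained sketch. It is not the project's actual code: it uses only the Python standard library, covers just the tf-idf route from step 3, and swaps in a simple nearest-centroid classifier for step 4. The function names, the Devanagari tokenizing rule, and the four toy sentences with their `sports`/`economy` labels are all assumptions made up for illustration.

```python
import math
import re
from collections import Counter

# Assumption: a token is a run of characters from the Devanagari block
# (U+0900-U+097F), excluding the danda and double danda (U+0964, U+0965),
# which act as sentence punctuation. Devanagari digits are kept.
TOKEN_RE = re.compile(r"[\u0900-\u0963\u0966-\u097F]+")


def tokenize(text):
    """Split raw text into Devanagari tokens, dropping everything else."""
    return TOKEN_RE.findall(text)


def fit_idf(token_docs):
    """Compute a smoothed inverse document frequency for each term."""
    n_docs = len(token_docs)
    doc_freq = Counter(term for doc in token_docs for term in set(doc))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(tokens, idf):
    """Build an L2-normalized sparse tf-idf vector (dict of term -> weight)."""
    counts = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def train_centroids(vectors, labels):
    """Sum the vectors of each label and normalize the result to unit length."""
    sums = {}
    for vec, label in zip(vectors, labels):
        total = sums.setdefault(label, Counter())
        for term, weight in vec.items():
            total[term] += weight
    centroids = {}
    for label, total in sums.items():
        norm = math.sqrt(sum(w * w for w in total.values()))
        centroids[label] = {t: w / norm for t, w in total.items()} if norm else {}
    return centroids


def predict(text, idf, centroids):
    """Return the label whose centroid has the highest cosine similarity."""
    vec = tfidf_vector(tokenize(text), idf)

    def similarity(label):
        # Both vectors are unit length, so the dot product is the cosine.
        return sum(w * centroids[label].get(t, 0.0) for t, w in vec.items())

    return max(centroids, key=similarity)


# Invented toy corpus standing in for the scraped articles.
articles = [
    ("नेपाल क्रिकेट टोली ले खेल जित्यो।", "sports"),
    ("फुटबल खेल मा गोल भयो।", "sports"),
    ("बैंक ले ब्याज दर घटायो।", "economy"),
    ("बजार मा शेयर मूल्य बढ्यो।", "economy"),
]
token_docs = [tokenize(text) for text, _ in articles]
labels = [label for _, label in articles]

idf = fit_idf(token_docs)
vectors = [tfidf_vector(doc, idf) for doc in token_docs]
centroids = train_centroids(vectors, labels)

print(predict("क्रिकेट खेल मा टोली", idf, centroids))
```

With a real corpus, the same structure would apply, but a library vectorizer and a stronger model would likely replace the hand-written tf-idf and centroid steps, and the word-embedding route would need a trained embedding model such as one built with gensim.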

Technologies Used

Python -> numpy, scrapy, gensim, nltk, scipy

Repository

https://github.com/whatsGr8t/Nepalese-News-Classifier-with-word2vec-model
