Hidden Markov Model for Text Classification
Md. Fantacher Islam
Khulna, Khulna Division
- 0 Collaborators
This project uses a Hidden Markov Model to classify documents with Natural Language Processing. A spam/ham data set was used as the primary data set.
Project status: Under Development
Groups
Student Developers for AI
Overview / Usage
This model can be used for any kind of document classification, such as sentiment analysis.
Methodology / Approach
Once the hidden Markov models have been created and trained (one for each category), a new document d is classified by first formatting it into an ordered word list Ld, in the same way as during training. Because words are treated as observations in T-HMM, we then calculate the probability (likelihood) of the word sequence Ld being produced by each of the two HMMs. That is, P(Ld|λR) and P(Ld|λN) are computed, where λR is the model for relevant documents and λN is the model for non-relevant documents. The output class for document d is the class whose HMM gives the highest probability.
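The comparison step can be illustrated with a minimal sketch. This is not the code from the repository. It assumes discrete HMMs whose parameters are already trained, scores a word list with the scaled forward algorithm, and picks the label with the higher log-likelihood. The function names, the toy parameter values, and the `floor` smoothing value for unseen words are all hypothetical and chosen only for illustration.

```python
import math


def log_likelihood(obs, start, trans, emit, floor=1e-6):
    """Scaled forward algorithm: returns log P(obs | model).

    start[i]    : probability of starting in hidden state i
    trans[i][j] : probability of moving from state i to state j
    emit[i]     : dict mapping word -> probability of emitting it in state i
    floor       : hypothetical smoothing value for words unseen in training
    """
    if not obs:
        return 0.0  # the empty sequence has probability 1
    n = len(start)
    # Initialisation: alpha_1(i) = start_i * b_i(o_1)
    alpha = [start[i] * emit[i].get(obs[0], floor) for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Induction: alpha_t(j) = sum_i alpha_{t-1}(i) * a_ij * b_j(o_t)
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n))
                * emit[j].get(obs[t], floor)
                for j in range(n)
            ]
        # Rescale to avoid underflow; the log of each scale factor
        # accumulates into log P(obs | model).
        scale = sum(alpha)
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob


def classify(words, models):
    """Return (best_label, scores) for an ordered word list."""
    scores = {
        label: log_likelihood(words, *params)
        for label, params in models.items()
    }
    return max(scores, key=scores.get), scores


# Toy two-state models with made-up parameters (assumptions, not trained values).
MODEL_R = (
    [0.6, 0.4],
    [[0.7, 0.3], [0.4, 0.6]],
    [{"free": 0.5, "prize": 0.4, "meeting": 0.1},
     {"free": 0.3, "prize": 0.3, "meeting": 0.4}],
)
MODEL_N = (
    [0.5, 0.5],
    [[0.8, 0.2], [0.3, 0.7]],
    [{"meeting": 0.6, "agenda": 0.3, "free": 0.1},
     {"meeting": 0.4, "agenda": 0.5, "free": 0.1}],
)
MODELS = {"relevant": MODEL_R, "non_relevant": MODEL_N}

if __name__ == "__main__":
    label, scores = classify(["free", "prize"], MODELS)
    print(label, scores)
```

Working in log space keeps long documents from underflowing to zero, which would otherwise make the two likelihoods impossible to compare.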
Technologies Used
Python 3.6.5
IDE: Spyder
Library: NLP
Repository
https://github.com/FantacherJOY/Hidden-Markov-Model-for-NLP