Utilization of Oversampling for multiclass sentiment analysis on Amazon Review Dataset

SABYASACHI MUKHOPADHYAY

Kolkata, West Bengal

We compare variations of recurrent neural networks (simple RNN, GRU, LSTM and Bidirectional LSTM) to find out which model performs best at multi-class sentiment classification. We then use that model to study the effect of oversampling a dataset before using it to train a model.

Project status: Published/In Market

Artificial Intelligence

Groups
DeepLearning, Artificial Intelligence India

Intel Technologies
DevCloud

Overview / Usage

In this paper, we demonstrate how several RNN models perform multi-class sentiment analysis on different datasets. We carried out a comparative study of the models, applied oversampling to the dataset to reduce class imbalance, and visualized the models' behavior on some sample texts, with satisfactory results. We observed that oversampling plays a significant role in improving model performance. In future work, we plan to use other oversampling techniques, such as SMOTE and ADASYN, to further investigate the impact of oversampling on multi-class sentiment analysis.

Methodology / Approach

Sentiment analysis is a major area of Artificial Intelligence. Its applications include machine translation, text analysis, computational linguistics, etc. In most cases, sentiment is classified into two or three classes. In some situations, however, such as rating a product on Amazon, there are more classes. A major challenge in such tasks is class imbalance, which biases the model towards the majority classes and reduces accuracy. To deal with this problem, we oversample the dataset to reduce its class imbalance before training the model. In this research work, we first compare variations of recurrent neural networks (simple RNN, GRU, LSTM and Bidirectional LSTM) to find out which model performs best at multi-class sentiment classification. We then use that model to study the effect of oversampling a dataset before using it to train a model.
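
The project description does not state which oversampling variant was used, so the following is only a minimal sketch of one common option, random oversampling, in which minority-class samples are duplicated at random until every class is as large as the biggest one. The function name `random_oversample`, the toy reviews and the class counts are all illustrative assumptions, not taken from the project.

```python
import random
from collections import Counter, defaultdict


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class samples at random until every class
    is as large as the largest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group the samples by class label.
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)

    # Every class is grown to the size of the majority class.
    target = max(len(samples) for samples in by_class.values())

    out_texts, out_labels = [], []
    for label, samples in by_class.items():
        # Draw with replacement to fill the gap to the target size.
        extra = [rng.choice(samples) for _ in range(target - len(samples))]
        for text in samples + extra:
            out_texts.append(text)
            out_labels.append(label)

    # Shuffle so the classes are interleaved rather than grouped.
    order = list(range(len(out_texts)))
    rng.shuffle(order)
    return [out_texts[i] for i in order], [out_labels[i] for i in order]


# Made-up toy data: star ratings 1-5, heavily skewed towards 5 stars.
texts = [f"review {i}" for i in range(20)]
labels = [5] * 12 + [4] * 4 + [3] * 2 + [2] * 1 + [1] * 1

bal_texts, bal_labels = random_oversample(texts, labels)
print("before:", sorted(Counter(labels).items()))
print("after: ", sorted(Counter(bal_labels).items()))
```

In practice, oversampling of this kind is normally applied to the training split only, so that duplicated reviews do not leak into the validation or test data.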

Technologies Used

  1. TensorFlow

  2. Intel DevCloud
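
As an illustration of how the four recurrent variants named in the methodology could be set up with TensorFlow's Keras API, here is a hedged sketch. The vocabulary size, embedding width, number of units and the five-class output are assumed values chosen for the example; the project's actual architecture and hyperparameters are not given in this description.

```python
import tensorflow as tf

# All of these values are assumptions made for this sketch.
VOCAB_SIZE = 10000   # size of the token vocabulary
EMBED_DIM = 64       # embedding width
UNITS = 64           # recurrent units per layer
NUM_CLASSES = 5      # e.g. star ratings 1-5, encoded as labels 0-4


def build_model(kind):
    """Build one classifier; `kind` selects the recurrent layer."""
    layers = tf.keras.layers
    if kind == "rnn":
        recurrent = layers.SimpleRNN(UNITS)
    elif kind == "gru":
        recurrent = layers.GRU(UNITS)
    elif kind == "lstm":
        recurrent = layers.LSTM(UNITS)
    elif kind == "bilstm":
        recurrent = layers.Bidirectional(layers.LSTM(UNITS))
    else:
        raise ValueError(f"unknown model kind: {kind}")

    model = tf.keras.Sequential([
        layers.Embedding(VOCAB_SIZE, EMBED_DIM),  # token ids -> vectors
        recurrent,                                # sequence -> one vector
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    # Integer class labels (0..NUM_CLASSES-1) pair with the sparse loss.
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


# One model per variant, ready to be trained on the same data and compared.
models = {kind: build_model(kind) for kind in ("rnn", "gru", "lstm", "bilstm")}
```

Each model could then be trained with `model.fit` on the same padded token sequences, so that differences in validation metrics reflect the recurrent layer rather than the data pipeline.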
