Support Vector Machines Training With Stochastic Gradient Descent for Multiclass News Categorization
Segun Sodimu
Ogun State
SGD has been successfully applied to large-scale and sparse machine learning problems often encountered in text classification and natural language processing. Given that the data is sparse, the classifiers in this module easily scale to problems with more than 10^5 training examples and more than ...
Project status: Published/In Market
Intel Technologies
Intel Integrated Graphics
Overview / Usage
The aim of this project is to extend the support vector machine with stochastic gradient descent training in order to increase accuracy during training of a multiclass news categorization model.
Methodology / Approach
Data collection
We start by downloading the BBC news dataset.
Data preprocessing
We apply data preprocessing techniques such as stopword removal, removal of non-alphanumeric characters, lemmatization, and typecasting.
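The preprocessing steps listed above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `preprocess` function, the tiny `STOPWORDS` set, and the lowercasing step are assumptions, and lemmatization is left as a pluggable `lemmatize` callable (identity by default) because the original does not say which lemmatizer was used.

```python
import re

# Hypothetical, deliberately tiny stopword list; a real run would use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "are", "of", "in", "on", "and", "to", "for"}


def preprocess(text, lemmatize=lambda token: token):
    """Turn one raw article into a list of cleaned tokens.

    `lemmatize` is a placeholder hook: pass in a real lemmatizer if available.
    """
    # Typecasting: make sure we are working with a string (assumed meaning).
    text = str(text).lower()
    # Replace every character that is not a letter, digit, or whitespace.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # Split on whitespace, drop stopwords, then lemmatize what remains.
    return [lemmatize(token) for token in text.split() if token not in STOPWORDS]


tokens = preprocess("The markets are rising in 2024!")
print(tokens)
```

Keeping the lemmatizer as an argument means the rest of the pipeline does not depend on any particular NLP library.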
Feature representation
We implemented Word2Vec as the feature representation.
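The original does not say how per-word Word2Vec vectors are combined into one feature vector per article. One common choice, assumed here purely for illustration, is to average the vectors of the words in the document. In the sketch below, `document_vector` and `toy_embeddings` are hypothetical names, and the hand-written three-dimensional vectors stand in for embeddings that a trained Word2Vec model would supply.

```python
import numpy as np


def document_vector(tokens, embeddings, dim):
    """Average the embeddings of the known tokens (mean pooling, an assumption)."""
    vectors = [embeddings[token] for token in tokens if token in embeddings]
    if not vectors:
        # No known words: fall back to a zero vector of the right size.
        return np.zeros(dim)
    return np.mean(vectors, axis=0)


# Toy stand-in for trained Word2Vec embeddings (values are made up).
toy_embeddings = {
    "market": np.array([1.0, 0.0, 0.0]),
    "shares": np.array([0.0, 1.0, 0.0]),
    "goal": np.array([0.0, 0.0, 1.0]),
}

# "unknown" has no embedding, so it is skipped when averaging.
doc = document_vector(["market", "shares", "unknown"], toy_embeddings, dim=3)
print(doc)
```

Averaging yields a fixed-length dense vector regardless of article length, which is what a linear classifier expects as input.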
Training and prediction
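A linear SVM trained with stochastic gradient descent can be expressed in scikit-learn as `SGDClassifier` with the hinge loss, which handles multiple classes with a one-vs-rest scheme. The sketch below only illustrates that API under stated assumptions: the data are synthetic stand-ins for document vectors, the three category names are illustrative, the hyperparameters are scikit-learn defaults rather than the project's settings, and the model is scored on its own training data just to keep the example short.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
n_per_class, dim = 30, 3

# Synthetic stand-in for document vectors: class k is centred on 5 * e_k.
X = np.vstack([
    rng.normal(loc=5.0 * np.eye(dim)[k], scale=0.5, size=(n_per_class, dim))
    for k in range(3)
])
# Illustrative category names, one block of labels per class.
y = np.repeat(["business", "sport", "tech"], n_per_class)

# loss="hinge" gives a linear SVM objective; training is done by SGD.
clf = SGDClassifier(loss="hinge", penalty="l2", alpha=1e-4,
                    max_iter=1000, tol=1e-3, random_state=0)
clf.fit(X, y)

predictions = clf.predict(X)
train_accuracy = float(np.mean(predictions == y))
print(train_accuracy)
```

In a real experiment the articles would be split into training and held-out sets, and accuracy would be reported on the held-out portion.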
Technologies Used
Python3
SKlearn
Pandas
Numpy
Word2Vec