Fake News Detection Using Python and Machine Learning

This repository contains a Python-based solution for Fake News Detection using Machine Learning. This project was done by Team Infinity of Saintgits College of Engineering, consisting of Jumana Jouhar, Neharin Tijo, and Meenakshi Mony, as part of the Intel Unnati Industrial Training Programme.

Project status: Published/In Market

Intel® Unnati

Intel Technologies
DevCloud

Overview / Usage

The project focuses on addressing the critical issue of false information proliferation in the digital era. By employing machine learning techniques and the Python programming language, the project aims to detect fake news articles and enhance information credibility. The overarching goal is to combat the spread of misinformation, thereby bolstering trust in media, democratic processes, and content authenticity across various platforms.

Methodology / Approach

The first step is to acknowledge the severity of false news in the digital age and to review past research on fake news detection using Python and machine learning, examining the classifiers, metrics, and methodologies used in previous studies. The ISOT Fake News Dataset is selected because it covers a broad range of both reputable and unreliable articles. The data is then preprocessed and cleaned to improve its quality, and the cleaned text is converted into numerical features with TF-IDF vectorization.
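To make the cleaning and TF-IDF steps concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the project's code: the `clean_text` rules, the `tfidf` helper, and the two sample headlines are hypothetical, and the weight used is the textbook form tf × ln(N / df). Scikit-learn's `TfidfVectorizer` applies a smoothed IDF and L2 normalization by default, so its numbers will differ.

```python
import math
import re
from collections import Counter


def clean_text(text):
    """Lowercase, drop URLs and non-letter characters, and split into tokens."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # remove URLs first
    text = re.sub(r"[^a-z\s]", " ", text)       # keep letters and whitespace only
    return text.split()


def tfidf(docs):
    """Return one {term: weight} dict per document, weight = tf * ln(N / df)."""
    tokenized = [clean_text(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


# Hypothetical sample texts, not taken from the ISOT dataset.
docs = [
    "Senate passes budget bill after long debate",
    "SHOCKING: aliens endorse budget bill!!! http://example.com",
]
vectors = tfidf(docs)
```

Terms that occur in every document (here "budget" and "bill") receive a weight of zero, while terms unique to one document are weighted up. This is the property that lets a classifier focus on distinguishing vocabulary.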

Python serves as the foundation, backed by libraries such as Scikit-learn and XGBoost. Several machine learning algorithms are used: logistic regression, decision tree, random forest, gradient boosting, XGBoost, and the passive-aggressive classifier. These algorithms classify articles based on cues learned from the training data. Model performance is evaluated quantitatively with Accuracy, Precision, Recall, F1 Score, AUC-ROC Score, and the Confusion Matrix. The integration of Intel Extension for Scikit-learn improves code runtime efficiency.
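The evaluation metrics listed above can all be derived from the confusion matrix and the model's scores. The following is a small pure-Python sketch of those definitions, not the project's evaluation code. The function names, the sample labels, and the convention that label 1 means "fake" are assumptions made for illustration. In practice `sklearn.metrics` provides equivalent functions.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true, scores, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and scores; 1 = fake, 0 = real (assumed convention).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

The pairwise AUC shown here is quadratic in the number of samples, which is fine for a toy example but not for a full dataset.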

Technologies Used

The project is centered on machine learning, using a range of algorithms for fake news identification: Logistic Regression, Decision Tree, Random Forest, Gradient Boosting, XGBoost, and Passive Aggressive Classifier. The TF-IDF vectorization technique translates the text data into the numerical representations these models require. The project also uses Intel-optimized libraries, specifically for logistic regression, which significantly reduce code runtime.

Python is the foundational programming language, and machine learning libraries handle algorithm implementation and evaluation. Intel Extension for Scikit-learn improves the efficiency of logistic regression in particular. Matplotlib and Seaborn are used to generate visualizations. The core software components are Python, Scikit-learn, and supporting libraries. Intel DevCloud is used for coding and result presentation.
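As a rough sketch of how these pieces fit together, the example below enables Intel Extension for Scikit-learn (if it is installed) and trains a TF-IDF plus logistic regression pipeline. This is an assumed arrangement, not the repository's actual code: the sample texts, labels (1 = fake, 0 = real), and hyperparameters are hypothetical. `patch_sklearn()` is called before the Scikit-learn estimators are imported, as the extension's documentation requires.

```python
# Optional acceleration: fall back to stock scikit-learn if sklearnex is absent.
try:
    from sklearnex import patch_sklearn
    patch_sklearn()
except ImportError:
    pass

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data; the project itself uses the ISOT Fake News Dataset.
texts = [
    "Government announces new infrastructure spending plan",
    "Central bank holds interest rates steady this quarter",
    "Miracle pill cures every disease overnight, doctors stunned",
    "Secret lizard council controls world elections, insider claims",
]
labels = [0, 0, 1, 1]  # assumed convention: 1 = fake, 0 = real

# Vectorize the text with TF-IDF, then fit a logistic regression classifier.
model = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, labels)

predictions = model.predict(texts)
```

Any runtime gain from the extension depends on the hardware and dataset size; on a four-sentence toy set like this one no difference should be expected.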

Repository

https://github.com/jumanajouhar/intelunnati_infinity.git
