Water Quality Prediction

Aditya Purwar

Noida, Uttar Pradesh

This project predicts water quality both with and without oneAPI libraries such as oneDAL and oneDNN, using Random Forest Classification, Logistic Regression, Support Vector Classifier, Fully Connected Neural Networks, and XGBoost Classifier.

Project status: Under Development

Artificial Intelligence

Intel Technologies
oneAPI, Intel Opt ML/DL Framework, Intel Python, MKL, DevCloud

Overview / Usage

The primary objective of this project is to predict water quality from a set of input parameters using machine learning and deep learning techniques. The project explores Intel's oneAPI libraries, including oneDAL and oneDNN, alongside traditional machine learning and deep learning frameworks. The algorithms to be implemented and compared are Random Forest Classification, Logistic Regression, Support Vector Classifier, Fully Connected Neural Networks, and XGBoost Classifier.

Methodology / Approach

Data Collection: Collect a dataset that includes water quality-related features such as pH, iron concentration, nitrate levels, chloride concentration, lead content, zinc content, color, turbidity, fluoride levels, copper concentration, odor, sulfate levels, conductivity, chlorine concentration, manganese levels, total dissolved solids, water source, water temperature, and air temperature. The dataset should carry a class label so that it is suitable for classification tasks.

Data Preprocessing: Clean and preprocess the dataset to handle missing values, outliers, and categorical variables. Normalize or standardize the numerical features as needed. Convert categorical variables like color and source into numerical or one-hot encoded representations.
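The preprocessing steps above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual pipeline: the helper names (`impute_median`, `standardize`, `one_hot`) and the sample values are hypothetical, and median imputation is only one of several reasonable ways to handle missing values. In practice, scikit-learn's `SimpleImputer`, `StandardScaler`, and `OneHotEncoder` cover the same steps.

```python
from statistics import mean, median, pstdev


def impute_median(values):
    """Replace missing entries (None) with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def standardize(values):
    """Rescale to zero mean and unit variance (population standard deviation)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - mu) / sigma for v in values]


def one_hot(values):
    """Return the sorted category list and one 0/1 row per input value."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# Hypothetical sample values, for illustration only.
ph = [7.0, None, 6.0, 8.0]
ph_filled = impute_median(ph)        # the missing reading becomes 7.0
ph_scaled = standardize(ph_filled)   # mean 0, unit variance

source_categories, source_encoded = one_hot(["River", "Well", "River", "Lake"])
```

Fitting the imputation and scaling statistics on the training split only, and then reusing them on the test split, avoids leaking test information into the model.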

Model Development:
a. Random Forest Classification: Implement Random Forest Classification with a traditional machine learning library (e.g., scikit-learn), both with and without oneDAL, to evaluate oneDAL's impact on model performance.
b. Logistic Regression: Implement Logistic Regression with a traditional machine learning library and, optionally, with oneDAL for performance comparison.
c. Support Vector Classifier (SVC): Develop a Support Vector Classifier with a traditional library and evaluate its performance with and without oneDAL.
d. Fully Connected Neural Networks: Create a Fully Connected Neural Network with a deep learning framework (e.g., TensorFlow or PyTorch), optionally using oneDNN for optimization. Experiment with different architectures and hyperparameters.
e. XGBoost Classifier: Implement the XGBoost Classifier with the XGBoost library and, if available, explore the use of oneDAL for optimization.
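One way to set up the "with and without oneDAL" comparison for the scikit-learn models is sketched below, assuming the Intel Extension for Scikit-learn (`sklearnex`) is the route used to reach oneDAL; the source does not say which binding the project uses. The helper names `time_fit` and `run_comparison`, the synthetic data, and the estimator settings are all hypothetical. The third-party imports live inside `run_comparison` so that the timing helper works on its own.

```python
import time


def time_fit(make_model, X, y):
    """Build a model, fit it once, and return (model, elapsed_seconds)."""
    model = make_model()
    start = time.perf_counter()
    model.fit(X, y)
    return model, time.perf_counter() - start


def run_comparison(n_samples=5000, n_features=10, seed=0):
    """Time Random Forest training with stock scikit-learn, then with the
    oneDAL-backed patch if scikit-learn-intelex is installed."""
    from sklearn.datasets import make_classification

    # Synthetic stand-in data; the real project would load its water dataset.
    X, y = make_classification(
        n_samples=n_samples, n_features=n_features, random_state=seed
    )
    timings = {}

    from sklearn.ensemble import RandomForestClassifier
    _, timings["stock"] = time_fit(
        lambda: RandomForestClassifier(n_estimators=100, random_state=seed), X, y
    )

    try:
        from sklearnex import patch_sklearn, unpatch_sklearn
    except ImportError:
        return timings  # extension not installed: report the stock timing only

    patch_sklearn()  # swap supported estimators for oneDAL-backed versions
    try:
        # Estimators must be imported again after patching to pick up the swap.
        from sklearn.ensemble import RandomForestClassifier as PatchedForest
        _, timings["onedal"] = time_fit(
            lambda: PatchedForest(n_estimators=100, random_state=seed), X, y
        )
    finally:
        unpatch_sklearn()  # restore stock scikit-learn for later code
    return timings
```

Calling `run_comparison()` returns a dict of training times in seconds, keyed by `"stock"` and, when the extension is present, `"onedal"`. The same pattern applies to Logistic Regression and SVC by changing the estimator. For the neural network, stock TensorFlow reads the `TF_ENABLE_ONEDNN_OPTS` environment variable to switch its oneDNN optimizations on or off, which allows a similar on/off comparison.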

Model Evaluation: Evaluate the models' performance using appropriate metrics for classification tasks, such as accuracy, precision, and F1-score. Compare the performance of models with and without oneAPI libraries to assess their impact on prediction accuracy and efficiency.
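The metrics named above can be computed directly from confusion-matrix counts. The sketch below assumes a binary label (1 for the positive class); the function names and the example labels are hypothetical. Recall is included because the F1-score is the harmonic mean of precision and recall. scikit-learn's `accuracy_score`, `precision_score`, and `f1_score` give the same values and also handle the multi-class case.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 as a dict."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
# 3 true positives, 1 false positive, 1 false negative, 3 true negatives,
# so accuracy, precision, recall, and F1 are all 0.75 here.
```

Because the oneAPI comparison concerns efficiency as well as accuracy, training and inference times are worth recording alongside these scores for each model variant.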

Hyperparameter Tuning: Perform hyperparameter tuning for the machine learning models to optimize their performance. Use techniques like grid search or random search.
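Grid search amounts to scoring every combination of candidate values and keeping the best. The sketch below shows that loop in plain Python. The `grid_search` helper is hypothetical, and `toy_score` is a made-up stand-in for a real cross-validated score, chosen only so the example runs without a dataset. scikit-learn's `GridSearchCV` and `RandomizedSearchCV` implement grid and random search with cross-validation built in.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Score every combination in param_grid; return (best_params, best_score)."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for combo in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    """Made-up score that peaks at n_estimators=100, max_depth=8.
    A real run would train a model and return its validation score."""
    return (
        1.0
        - abs(params["n_estimators"] - 100) / 1000
        - abs(params["max_depth"] - 8) / 100
    )


# Hypothetical search space for a Random Forest.
param_grid = {"n_estimators": [50, 100, 200], "max_depth": [4, 8, 16]}
best_params, best_score = grid_search(param_grid, toy_score)
```

Random search follows the same structure but samples a fixed number of combinations instead of enumerating all of them, which is cheaper when the grid is large.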

Technologies Used

oneAPI

oneDNN

oneDAL

TensorFlow

Keras

Random Forest Classification

XGBoost Classifier

Support Vector Classifier

Logistic Regression

Documents and Presentations

Repository

https://github.com/Aditya3012Purwar/Intel-oneAPI/
