RespiScan

RespiScan: A machine learning-based web application that predicts lung cancer risk from user inputs like smoking, fatigue, coughing, and more.

Project status: Under Development

Artificial Intelligence, Cloud, oneAPI

Intel Technologies
oneAPI

Code Samples [1]

Overview / Usage

Project Overall:

RespiScan predicts lung cancer from patient data using nine classification models from the Scikit-learn library in Python: logistic regression, decision tree, k-nearest neighbor, Gaussian naive Bayes, multinomial naive Bayes, support vector classifier, random forest, multi-layer perceptron, and gradient boosting classifier. The dataset includes the columns gender, age, smoking, yellow fingers, anxiety, peer pressure, chronic disease, fatigue, allergy, wheezing, alcohol consuming, coughing, shortness of breath, swallowing difficulty, chest pain, and lung cancer. By identifying patterns and correlations among these variables, the models estimate the likelihood that a patient has lung cancer.

Problem being Solved:

Lung cancer is a serious and common disease that affects millions of people around the world. It is caused by genetic damage to the cells in the lungs, often due to smoking or exposure to harmful substances. Lung cancer can cause symptoms such as difficulty breathing, coughing up blood, chest pain, hoarseness, headache and weight loss. Lung cancer can be diagnosed by various tests such as X-rays, CT scans, MRI scans, PET scans, sputum cytology and biopsy. Lung cancer can be treated by surgery, chemotherapy and radiation therapy, depending on the type and stage of the cancer.

This project was inspired by the need to raise awareness of lung cancer and its prevention. Lung cancer is the leading cause of cancer deaths worldwide, but many people are unaware of its risk factors and symptoms.

Additionally, the RespiScan project could be used for public health education and awareness campaigns. By highlighting the risk factors and symptoms of lung cancer, the model could help raise awareness and encourage individuals to take steps to reduce their risk of developing the disease. This could include initiatives to promote smoking cessation, improve air quality in workplaces, and encourage early detection and treatment through regular screenings.

However, it is important to note that any diagnostic tool, including the RespiScan model, would need to be thoroughly validated and tested before being used in a clinical setting. This would involve testing the model on a large and diverse population of patients to ensure that it is accurate and reliable across different demographics and patient groups.

Overall, the RespiScan project has the potential to improve the early detection and treatment of lung cancer, which could have significant benefits for patient health and survival rates, provided that further research and testing confirm the model is accurate and reliable.

How the Work is experienced or used in production:

In production, RespiScan is intended as a user-friendly tool for individuals who want to assess their risk of lung cancer from the comfort of their own homes. Users enter their basic information and symptoms into the RespiScan model and receive a prediction of their likelihood of having lung cancer.

The model could be integrated into a web-based platform or mobile application, which could be easily accessed by users. The platform could also provide educational resources on lung cancer prevention and early detection, which could help users reduce their risk of developing the disease.

Overall, RespiScan has the potential to be a valuable tool for individuals who are concerned about their risk of developing lung cancer, as it can provide a quick and convenient way to assess their risk and take appropriate action to prevent or detect the disease.

Methodology / Approach

Methodology:

The RespiScan project applies machine learning techniques to develop a predictive model for detecting lung cancer in patients. The model takes input variables such as age, gender, smoking status, and various symptoms and medical conditions that may be associated with lung cancer. Its output is a prediction of the likelihood that the patient has lung cancer.

Approach and Use of technology to solve problem:

The RespiScan model is built with the Scikit-learn library in Python, which provides a range of machine learning algorithms for classification tasks such as predicting the presence or absence of lung cancer. Specifically, we used nine machine learning algorithms: logistic regression, decision tree, k-nearest neighbor, Gaussian naive Bayes, multinomial naive Bayes, support vector classifier, random forest, multi-layer perceptron, and gradient boosting classifier.
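As a rough illustration of the nine algorithm families listed above, the sketch below instantiates one Scikit-learn estimator per family and fits each on synthetic data. The hyperparameters, the `build_models` helper, and the synthetic data are assumptions made for this example and are not taken from the RespiScan repository.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB, MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def build_models(random_state=0):
    """Return one estimator per algorithm family (hypothetical helper)."""
    return {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "decision_tree": DecisionTreeClassifier(random_state=random_state),
        "k_nearest_neighbor": KNeighborsClassifier(n_neighbors=5),
        "gaussian_nb": GaussianNB(),
        "multinomial_nb": MultinomialNB(),  # needs non-negative features
        "support_vector": SVC(probability=True, random_state=random_state),
        "random_forest": RandomForestClassifier(random_state=random_state),
        "multi_layer_perceptron": MLPClassifier(max_iter=1000, random_state=random_state),
        "gradient_boosting": GradientBoostingClassifier(random_state=random_state),
    }


# Synthetic stand-in for the 15 input columns; NOT the project's dataset.
X, y = make_classification(n_samples=200, n_features=15, random_state=0)
X = np.abs(X)  # keep values non-negative so MultinomialNB accepts them

models = build_models()
for name, model in models.items():
    model.fit(X, y)
```

Keeping the estimators in a dictionary makes it straightforward to loop over them with the same training and evaluation code.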

The development process involved several steps, including data preprocessing, feature selection, model training, model validation, and performance evaluation. The data preprocessing step involved cleaning and transforming the raw data to ensure that it was suitable for analysis. Feature selection involved identifying the most relevant variables to include in the model, based on their correlation with the presence or absence of lung cancer.
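The preprocessing and correlation-based feature selection steps described above might look roughly like the following sketch using Pandas. The toy rows are made up, and the value encodings (M/F, YES/NO, 0/1), the column spellings, and the helper names `preprocess` and `rank_features` are assumptions for illustration only.

```python
import pandas as pd

# Made-up toy rows; NOT real patient data. Encodings are assumed.
raw = pd.DataFrame({
    "GENDER": ["M", "F", "M", "F", "M", "F", "M", "F"],
    "AGE": [69, 74, 59, 63, 63, 75, 52, 51],
    "SMOKING": [1, 1, 0, 1, 0, 1, 0, 0],
    "COUGHING": [1, 0, 0, 1, 0, 1, 1, 0],
    "FATIGUE": [1, 1, 0, 0, 1, 1, 0, 0],
    "LUNG_CANCER": ["YES", "YES", "NO", "YES", "NO", "YES", "YES", "NO"],
})


def preprocess(df):
    """Drop duplicate and incomplete rows, then encode text columns as 0/1."""
    clean = df.drop_duplicates().dropna().copy()
    clean["GENDER"] = clean["GENDER"].map({"M": 1, "F": 0})
    clean["LUNG_CANCER"] = clean["LUNG_CANCER"].map({"YES": 1, "NO": 0})
    return clean


def rank_features(df, target="LUNG_CANCER"):
    """Rank features by absolute Pearson correlation with the target."""
    corr = df.corr()[target].drop(target)
    return corr.abs().sort_values(ascending=False)


data = preprocess(raw)
ranking = rank_features(data)
selected = ranking.head(3).index.tolist()  # keep the three strongest features
```

The cut-off of three features is arbitrary here; in practice the threshold would be chosen by checking how model performance changes as features are added or removed.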

Model training involved splitting the data into training and testing sets and fitting each of the machine learning algorithms on the training set. The performance of each algorithm was then evaluated on the testing set using metrics such as accuracy, precision, recall, and F1 score. The best-performing algorithm was selected for use in the final RespiScan model.
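A minimal sketch of this split, evaluate, and select loop is shown below. It uses synthetic data and only three candidate models to stay short. The 80/20 split, the stratification, and the choice of F1 score as the selection criterion are assumptions, since the description does not state which metric decided the final model.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data; NOT the project's dataset.
X, y = make_classification(n_samples=300, n_features=15, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(random_state=0),
}


def evaluate(model):
    """Fit on the training set and score predictions on the held-out set."""
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "f1": f1_score(y_test, y_pred, zero_division=0),
    }


scores = {name: evaluate(model) for name, model in candidates.items()}

# Assumed selection rule: pick the candidate with the highest F1 score.
best_name = max(scores, key=lambda name: scores[name]["f1"])
best_model = candidates[best_name]
```

For a screening-style tool, recall may matter more than accuracy because a missed case is costlier than a false alarm, so the selection metric is worth choosing deliberately.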

The RespiScan model is then intended for deployment in a web-based platform or mobile application that users can access. The platform provides a user-friendly interface for entering patient information and receiving a prediction of the likelihood of having lung cancer. The platform could also provide educational resources on lung cancer prevention and early detection.
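One way such a web interface could expose the model is a small Flask endpoint that accepts patient inputs as JSON and returns a probability. This is only a sketch: the `/predict` route, the `FEATURES` subset, the response field name, and the placeholder model trained on synthetic data are all hypothetical and do not describe the actual RespiScan application.

```python
from flask import Flask, jsonify, request
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Hypothetical subset of input fields, in the order the model expects.
FEATURES = ["age", "smoking", "fatigue", "coughing"]

# Placeholder model fitted on synthetic data; a real service would load
# the trained RespiScan model instead.
X, y = make_classification(
    n_samples=200, n_features=len(FEATURES), n_informative=3,
    n_redundant=0, random_state=0,
)
model = LogisticRegression(max_iter=1000).fit(X, y)

app = Flask(__name__)


@app.route("/predict", methods=["POST"])
def predict():
    """Validate the JSON payload and return the predicted probability."""
    payload = request.get_json(silent=True) or {}
    missing = [name for name in FEATURES if name not in payload]
    if missing:
        return jsonify({"error": "missing fields", "fields": missing}), 400
    try:
        row = [[float(payload[name]) for name in FEATURES]]
    except (TypeError, ValueError):
        return jsonify({"error": "fields must be numeric"}), 400
    probability = float(model.predict_proba(row)[0][1])
    return jsonify({"lung_cancer_probability": probability})
```

The endpoint can be exercised without starting a server by calling `app.test_client().post("/predict", json={...})`, which is convenient for automated checks of the input validation.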

Frameworks, Standards, Techniques:

In terms of frameworks and standards, the RespiScan project followed best practices in data science and machine learning, including the use of standardized libraries such as Scikit-learn and Pandas, and adherence to principles of reproducibility and transparency in research.

Overall, the RespiScan project applied cutting-edge technology and methodologies in machine learning to solve the problem of lung cancer detection and prevention, and developed a user-friendly tool that can be easily accessed by individuals concerned about their risk of developing the disease.

Technologies Used

Technologies:

  • Python programming language
  • Machine learning algorithms (e.g. logistic regression, decision tree, k-nearest neighbor, Gaussian naive Bayes, multinomial naive Bayes, support vector classifier, random forest, multi-layer perceptron, and gradient boosting classifier)
  • Scikit-learn library for Python
  • Flask web framework for Python
  • HTML/CSS/JavaScript for web development

Libraries:

  • Pandas for data manipulation and analysis
  • NumPy for numerical computing with Python
  • Matplotlib and Seaborn for data visualization
  • Flask-CORS for cross-origin resource sharing

Tools:

  • Jupyter Notebook for interactive data analysis and prototyping
  • Visual Studio Code for Python development
  • Git and GitHub for version control and collaboration

Software:

  • Operating system (e.g. Windows, macOS, Linux)
  • Anaconda for package management and environment setup

Hardware:

  • Personal computer or server with sufficient processing power and memory

Intel Technologies:

  • Intel oneAPI for optimizing performance on Intel hardware
  • Intel CPU/GPU/FPGA for hardware acceleration
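The list above does not say which oneAPI component the project uses. One possibility, shown here purely as an assumption, is the Intel Extension for Scikit-learn (the `scikit-learn-intelex` package), which swaps in accelerated implementations of supported estimators when `patch_sklearn()` is called before they are imported. The sketch falls back to stock Scikit-learn when the extension is not installed.

```python
# Assumption: Intel Extension for Scikit-learn is used for acceleration.
# The project description does not confirm this.
try:
    from sklearnex import patch_sklearn

    patch_sklearn()  # must run before scikit-learn estimators are imported
    accelerated = True
except ImportError:
    accelerated = False  # extension not installed; use stock scikit-learn

from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic stand-in data; NOT the project's dataset.
X, y = make_classification(n_samples=200, n_features=15, random_state=0)

model = SVC().fit(X, y)
training_accuracy = model.score(X, y)
```

Because the patch keeps the Scikit-learn API unchanged, the rest of the training code would not need to change whether or not the extension is present.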

Repository

https://github.com/Abhinav00711/RespiScan

Collaborators

3 Results
