Breast Cancer Detection
Rohit Midha
A research project that focuses on Early Breast Cancer Detection by using various Machine Learning Techniques.
Project status: Under Development
Groups
Student Developers for AI
Overview / Usage
Breast cancer is the most common cancer in women, so detecting it at an early stage can provide a significant advantage in treating the disease. Early treatment not only helps to cure the cancer but also helps to prevent its recurrence. Machine Learning algorithms can greatly assist in predicting early-stage breast cancer, which has always been a challenging research problem.
The model can also be hosted on a server to help doctors make decisions based on data inputs.
Methodology / Approach
In this project I used several Machine Learning techniques to arrive at an accurate answer and a hybrid model. Here, however, I focus mainly on the Artificial Neural Network that I developed, as it gives 100% accuracy.
Before getting started with ANNs, we should get to know our data a little. I did that using Seaborn and Matplotlib, both packages you can install for Python. I started the visualisation process by counting the number of Benign and Malignant cases, the two outcomes we have to map our predictions to. I then plotted a correlation matrix along with some correlated and uncorrelated features, which I used for other models such as Random Forest Classifier, XGBoost, Naive Bayes, etc., all of which can be found in the GitHub repo below.
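The exploration steps described above could look roughly like the following. This is only a sketch under stated assumptions: it loads the copy of the Breast Cancer Wisconsin (Diagnostic) data bundled with scikit-learn rather than whatever file the project actually reads, and the column name "diagnosis" and the plot layout are illustrative choices, not taken from the project.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script also runs without a display
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_breast_cancer

# Assumption: use the copy of the dataset that ships with scikit-learn.
data = load_breast_cancer(as_frame=True)
df = data.frame.copy()  # 30 feature columns plus a numeric "target" column

# In scikit-learn's encoding, 0 means malignant and 1 means benign.
df["diagnosis"] = df["target"].map({0: "Malignant", 1: "Benign"})

# Step 1: count the Benign and Malignant cases.
counts = df["diagnosis"].value_counts()
print(counts)

# Step 2: correlation matrix over the 30 numeric features.
corr = data.data.corr()

# Draw both views side by side.
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
sns.countplot(x="diagnosis", data=df, ax=axes[0])
axes[0].set_title("Benign vs Malignant cases")
sns.heatmap(corr, cmap="coolwarm", ax=axes[1])
axes[1].set_title("Feature correlation matrix")
fig.tight_layout()
```

To look at the result, call fig.savefig with a file name of your choice, or drop the Agg line and call plt.show() in an interactive session.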
The Model:
An Artificial Neural Network (ANN) is a computational model based on the structure and functions of biological neural networks. Information that flows through the network affects the structure of the ANN because a neural network changes, or learns in a sense, based on that input and output. ANNs are considered nonlinear statistical data modeling tools that model complex relationships between inputs and outputs or find patterns. An ANN is also known simply as a neural network.
A single neuron takes a set of inputs (corresponding to the columns of a data frame). Each input has a weight that controls its magnitude. The sum of the products of these input values and weights is fed to the activation function. Activation functions are what allow an Artificial Neural Network to learn complicated, non-linear functional mappings between the inputs and the response variable.
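As a small illustration of the single neuron described above, here is a minimal NumPy sketch. The input values, weights and the bias term are made up for the example (the bias is an assumption; the description above only mentions inputs and weights), and the helper names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    """Pass positive values through and clip negative values to zero."""
    return np.maximum(0.0, z)

def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, fed through an activation."""
    z = np.dot(inputs, weights) + bias
    return activation(z)

# Illustrative values only: three inputs, one weight per input.
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.4, 0.3, -0.2])
b = 0.1  # assumed bias term

# Weighted sum: 0.5*0.4 + (-1.0)*0.3 + 2.0*(-0.2) + 0.1 = -0.4
relu_out = neuron(x, w, b, relu)        # negative sum is clipped to 0.0
sigmoid_out = neuron(x, w, b, sigmoid)  # roughly 0.401
print(relu_out, sigmoid_out)
```

The same weighted sum gives very different outputs depending on the activation, which is why the choice of activation matters for what the network can learn.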
More about Artificial Neural Networks can be found here: https://software.intel.com/en-us/ai-academy/students/kits/deep-learning-501/week2
My model uses two Dense layers, each with ReLU activation and followed by Dropout. The third Dense layer has a sigmoid activation, and the model is compiled with the RMSProp optimiser. (The SGD or Adam optimisers can also be used.)
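A model with the shape described above might be assembled in Keras roughly as follows. This is a sketch, not the project's actual code: the number of units per layer, the dropout rate, the loss function and the build_model helper are all assumptions made for illustration, and the input size of 30 assumes all 30 features of the dataset are used.

```python
import keras
from keras import layers

def build_model(n_features=30, units=16, dropout_rate=0.2):
    """Two ReLU Dense layers with Dropout, then a sigmoid output layer.

    units and dropout_rate are assumed values, not the project's settings.
    """
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        layers.Dense(units, activation="relu"),
        layers.Dropout(dropout_rate),
        layers.Dense(units, activation="relu"),
        layers.Dropout(dropout_rate),
        # A single sigmoid unit outputs a probability for the binary outcome.
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(
        optimizer="rmsprop",         # "sgd" or "adam" could be swapped in here
        loss="binary_crossentropy",  # assumed loss for a two-class problem
        metrics=["accuracy"],
    )
    return model

model = build_model()
model.summary()
```

Training would then be a call to model.fit on the scaled training features and the 0/1 labels, with the accuracy checked on a held-out test split.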
Technologies Used
Python Packages Used:
Pandas
Numpy
Keras
Matplotlib
Seaborn
Sklearn
XGBoost
Dataset used: Breast Cancer Wisconsin (Diagnostic) Data Set