Use of Metalearning to recommend algorithms for analysis of gene expression data
Cancer is one of the leading causes of death today, and understanding its internal mechanisms can bring important benefits to humanity. Recently, new sequencing technologies (RNA-Seq and microRNA sequencing) have made a large amount of data available, which can be used to improve cancer diagnosis. Because manual analysis of these data is impractical, machine learning algorithms have been employed successfully. However, each algorithm has an inductive bias, which makes it better suited to a particular subset of problems. This work investigates the potential of using metalearning to associate characteristics of a dataset with the most appropriate machine learning algorithms. As an initial result, a model was built that recommends whether to use support vector machines or random forests for gene expression data from tissues with and without cancer.
Project status: Under Development
Intel Technologies
Other
Overview / Usage
Choosing a machine learning (ML) algorithm capable of generating models with high predictive performance is not an easy task. Each ML algorithm has an inductive bias, which may or may not fit the data well, so choosing an appropriate algorithm usually requires the help of ML specialists and a large set of experiments. An alternative is to use metalearning. Metalearning investigates the induction of predictive meta-models that recommend suitable techniques for a given task by making use of knowledge accumulated on previous tasks. This study investigates the use of metalearning to recommend machine learning algorithms for gene expression data.
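To make the idea of describing a dataset concrete, here is a minimal sketch of how a few simple meta-features might be computed with numpy. The function name `simple_meta_features` and the particular measures chosen are illustrative assumptions, not the set the project uses (the project lists the R package mfe for this purpose), and the random matrix only stands in for a gene expression dataset.

```python
import numpy as np


def simple_meta_features(X, y):
    """Return a few simple meta-features of a labeled dataset (illustrative only)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_samples, n_features = X.shape

    # Class distribution and its Shannon entropy (in bits).
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    class_entropy = float(-(p * np.log2(p)).sum())

    return {
        "n_samples": n_samples,
        "n_features": n_features,
        # Gene expression data typically has far more features than samples.
        "dimensionality": n_features / n_samples,
        "n_classes": int(len(counts)),
        "class_entropy": class_entropy,
        "mean_feature_std": float(X.std(axis=0).mean()),
    }


# Hypothetical stand-in for an expression matrix: 40 samples, 200 genes.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 200))
y_demo = np.array([0] * 20 + [1] * 20)  # e.g. 0 = normal tissue, 1 = tumor
meta = simple_meta_features(X_demo, y_demo)
```

Each dataset is reduced to one such vector, and a meta-model can then learn which algorithm tends to work best for which kind of vector.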
Methodology / Approach
We are using metalearning to build a system that automatically recommends machine learning algorithms. Initially, we are using support vector machines and random forests. The work is being expanded to include kNN, MLP and other artificial neural networks, and other algorithms found in the literature. We have an initial framework in R, and we are working on a more sophisticated version in Python.
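The approach described above could be sketched in Python with scikit-learn roughly as follows. This is a minimal illustration under several assumptions, not the project's implementation: synthetic datasets from `make_classification` stand in for real gene expression datasets, the helper names `describe` and `best_algorithm` are hypothetical, the four meta-features are chosen only for brevity, and a decision tree is used as the meta-model purely as an example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def describe(X, y):
    """Meta-features: samples, features, their ratio, class entropy (bits)."""
    n, d = X.shape
    p = np.bincount(y) / len(y)
    p = p[p > 0]
    return [n, d, d / n, float(-(p * np.log2(p)).sum())]


def best_algorithm(X, y):
    """Label a dataset with the candidate that has the best 3-fold CV accuracy."""
    candidates = {
        "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        "rf": RandomForestClassifier(n_estimators=50, random_state=0),
    }
    scores = {
        name: cross_val_score(model, X, y, cv=3).mean()
        for name, model in candidates.items()
    }
    return max(scores, key=scores.get)


# Build the meta-dataset: one row of meta-features and one label per dataset.
meta_X, meta_y = [], []
for seed in range(10):
    X, y = make_classification(
        n_samples=60 + 10 * seed,
        n_features=30 + 20 * seed,
        n_informative=5,
        random_state=seed,
    )
    meta_X.append(describe(X, y))
    meta_y.append(best_algorithm(X, y))

# The meta-model maps meta-features to a recommended algorithm.
meta_model = DecisionTreeClassifier(max_depth=3, random_state=0)
meta_model.fit(meta_X, meta_y)

# Recommend an algorithm for a new, unseen dataset without running experiments on it.
X_new, y_new = make_classification(
    n_samples=80, n_features=100, n_informative=5, random_state=99
)
recommendation = meta_model.predict([describe(X_new, y_new)])[0]
```

Extending the sketch to kNN or MLP would only require adding entries to the `candidates` dictionary; the meta-model then becomes a multi-class recommender.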
Technologies Used
R. Packages: mlr, randomForest, e1071, mfe
Python. Packages: scikit-learn, numpy