SDGB-7847 assignment help, C++ programming writing service, Java and Python help - C/C++ programming writing service

The data we are working with is in longitudinal format. Each column represents a patient, and each row holds the expression readings for one gene (genes 1-5913). Each patient’s disease status is marked in the column header. The first 20 patients are marked with ‘meta’, meaning these patients have a form of metastatic cancer (disease=1). The last 20 patients do not have the disease (disease=0).

You will need to transform this data into a model-ready format, with one row per patient and one column per gene plus a disease label, in order to predict metastatic disease from each patient’s gene expression.
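The reshaping step can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the R code the assignment expects. The tiny table, the patient labels other than the ‘meta’ marker, and the helper name `to_model_ready` are all made up for the example.

```python
# Hypothetical miniature of the input: a header row of patient labels,
# then one row of expression readings per gene (columns = patients).
# Only the 'meta' marker comes from the assignment text; the other
# label names and all values are invented.
header = ["meta_1", "meta_2", "normal_1", "normal_2"]
gene_rows = [
    [5.1, 4.8, 2.0, 2.2],  # gene 1
    [0.3, 0.4, 0.9, 1.1],  # gene 2
    [7.7, 7.9, 7.5, 7.6],  # gene 3
]


def to_model_ready(header, gene_rows):
    """Transpose genes-by-patients into one feature row per patient."""
    # zip(*rows) turns each patient column into a tuple of gene readings.
    features = [list(column) for column in zip(*gene_rows)]
    # disease = 1 when the column header contains 'meta', otherwise 0.
    labels = [1 if "meta" in name.lower() else 0 for name in header]
    return features, labels


X, y = to_model_ready(header, gene_rows)
```

After this step `X` holds one list of gene readings per patient and `y` holds the matching disease labels, which is the shape most modelling functions expect.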

Set your R seed to 1234 with set.seed(1234).

Once your data is ready to model, separate it into training and test sets.
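One possible way to make the split reproducible is sketched below in standard-library Python. The 25% test fraction and the choice to sample each class separately are assumptions, since the assignment does not specify either, and `stratified_split` is a hypothetical helper name. A seed of 1234 in Python does not reproduce the random stream of R’s set.seed(1234); it only makes this sketch repeatable.

```python
import random


def stratified_split(labels, test_fraction=0.25, seed=1234):
    """Return (train_idx, test_idx), sampling each class separately."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        rng.shuffle(idx)
        # Keep at least one patient of each class in the test set.
        n_test = max(1, round(len(idx) * test_fraction))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


# 20 metastatic patients followed by 20 disease-free patients, as described.
labels = [1] * 20 + [0] * 20
train_idx, test_idx = stratified_split(labels)
```

Sampling within each class keeps both disease groups represented in the test set, which matters with only 40 patients because sensitivity and specificity are each computed from one class.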

Apply the following algorithms to predict disease based on gene expression, training on your training data and testing on your test data. From your test-set predictions, report accuracy, sensitivity and specificity.

RF (random forest; running it on the full dataset may take a long time because of the number of genes used as predictor variables)

RF + PCA

KNN + PCA (use iteration to find the optimal value of K)
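Whichever of the three algorithms produced the predictions, the requested test-set metrics come from the same four confusion-matrix counts. The sketch below, in standard-library Python, treats disease=1 as the positive class; the example labels are invented and `classification_metrics` is a hypothetical helper name.

```python
def classification_metrics(actual, predicted, positive=1):
    """Accuracy, sensitivity and specificity from paired 0/1 labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    nan = float("nan")  # returned when a denominator would be zero
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else nan,
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,  # true positive rate
        "specificity": tn / (tn + fp) if (tn + fp) else nan,  # true negative rate
    }


# Invented test-set labels: 4 diseased and 6 disease-free patients.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(actual, predicted)
```

With these invented labels there are 3 true positives, 1 false negative, 5 true negatives and 1 false positive, so accuracy is 8/10, sensitivity is 3/4 and specificity is 5/6.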

In an external document, write a discussion on which algorithm you would choose and why. Discuss what the variable importance plot showed for RF and RF + PCA, the number of principal components you chose and what you chose as your optimal value of K.
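The search for an optimal K can be sketched as a loop over candidate values. The version below is standard-library Python and makes several assumptions not stated in the assignment: each K is scored by leave-one-out cross-validation on the training data, only odd K values are tried so that two-class votes cannot tie, and the two-dimensional points stand in for principal component scores (the PCA step itself is not shown). All function names and data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, point, k):
    """Majority vote among the k nearest training points (Euclidean)."""
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(train_X[i], point))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def loocv_accuracy(X, y, k):
    """Leave-one-out accuracy: predict each point from all the others."""
    hits = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, X[i], k) == y[i]
    return hits / len(X)


def best_k(X, y, candidates):
    """Score every candidate K; ties go to the earliest candidate."""
    scores = {k: loocv_accuracy(X, y, k) for k in candidates}
    best = max(candidates, key=lambda k: scores[k])
    return best, scores


# Invented scores on two components: two well-separated groups of four.
X = [[-2.0, 0.1], [-1.8, -0.2], [-2.2, 0.3], [-1.9, 0.0],
     [2.0, 0.1], [1.8, -0.1], [2.1, 0.2], [1.9, -0.3]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
k_star, scores = best_k(X, y, candidates=[1, 3, 5, 7])
```

On this toy data K = 1, 3 and 5 all classify every held-out point correctly, while K = 7 always fails because the seven remaining neighbours include four from the other group. Real expression data will rarely separate this cleanly, so the scores should be inspected rather than only the winning K.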

Upload your code and your external explanation document by Thursday, April 30th at 8pm.

Thank you for a wonderful class and have a great summer! Stay in touch!

Date: 2020-05-01 06:09
