Date: 2019-11-06 09:01

Lab Assignment 2

Vasant Honavar

DS 310 - Machine Learning

Available: Nov 4, 2019

Nov 11, 2019

In all of the following exercises, if a random seed is needed, set it to 1234. You may use any sklearn modules, as long as their usage is well understood and explained in the code. If you need to interpret your results, do so in your IPython Notebook by changing the cell type and writing your interpretation immediately below the code and its result, so that the interpretation can be matched with the result and the code. Submit a single IPython Notebook in which all of the answers are organized so that it can be run and evaluated.

1. Random forest (RF). Import the Breast Cancer data set from sklearn. Using 5-fold cross-validation, train and evaluate a Random Forest classifier with 100 trees from sklearn's ensemble module. Report the following:

(a) The average accuracy, sensitivity, specificity, and AUC across all 5 folds of the cross-validation.

(b) The average feature importance score for each feature across all 5 folds of the cross-validation.
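One possible way to organize exercise 1 is sketched below. It is not a reference solution. It assumes stratified, shuffled folds (the assignment only says "5-fold"), and it treats label 1 (benign in sklearn's encoding of this data set) as the positive class when computing sensitivity and specificity. The names `SEED`, `scores`, `mean_scores`, and `mean_importances` are illustrative choices, not part of the assignment.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import StratifiedKFold

SEED = 1234  # seed required by the assignment

data = load_breast_cancer()
X, y = data.data, data.target  # sklearn encoding: 0 = malignant, 1 = benign

# Assumption: stratified, shuffled folds; the handout only asks for 5 folds.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=SEED)

scores = {"accuracy": [], "sensitivity": [], "specificity": [], "auc": []}
importances = []

for train_idx, test_idx in cv.split(X, y):
    clf = RandomForestClassifier(n_estimators=100, random_state=SEED)
    clf.fit(X[train_idx], y[train_idx])

    y_true = y[test_idx]
    y_pred = clf.predict(X[test_idx])
    y_prob = clf.predict_proba(X[test_idx])[:, 1]  # probability of class 1

    # Assumption: class 1 is the positive class.
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()

    scores["accuracy"].append(accuracy_score(y_true, y_pred))
    scores["sensitivity"].append(tp / (tp + fn))  # true positive rate
    scores["specificity"].append(tn / (tn + fp))  # true negative rate
    scores["auc"].append(roc_auc_score(y_true, y_prob))
    importances.append(clf.feature_importances_)

# (a) Average each metric over the 5 folds.
mean_scores = {name: float(np.mean(vals)) for name, vals in scores.items()}
# (b) Average the importance of each feature over the 5 folds.
mean_importances = np.mean(importances, axis=0)

for name, value in mean_scores.items():
    print(f"{name}: {value:.3f}")
for feat, imp in sorted(zip(data.feature_names, mean_importances),
                        key=lambda pair: -pair[1]):
    print(f"{feat}: {imp:.4f}")
```

If malignant tumors should count as the positive class instead, the roles of sensitivity and specificity swap, and the AUC should be computed from the probability of class 0.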

2. Multinomial Naive Bayes classifier. Import the 20 Newsgroups data set from sklearn. Preprocess the news articles to obtain a bag-of-words representation of the data using the sklearn library functions; see the sklearn tutorials on text analytics for documentation. Train and evaluate a Multinomial Naive Bayes classifier from sklearn using 5-fold cross-validation.

Report the average accuracy, sensitivity, specificity, and AUC across all 5 folds of the cross-validation.
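A minimal sketch of exercise 2 follows, again under stated assumptions rather than as a reference solution. The assignment does not say how to define sensitivity, specificity, and AUC for a 20-class problem. This sketch assumes macro-averaged one-vs-rest values, with raw word counts from `CountVectorizer` as the bag-of-words representation. The helper name `evaluate_multinomial_nb` and its parameters are hypothetical.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

SEED = 1234  # seed required by the assignment


def evaluate_multinomial_nb(texts, labels, n_splits=5, seed=SEED):
    """Cross-validate a bag-of-words Multinomial NB; return fold-averaged metrics."""
    texts = np.asarray(texts, dtype=object)
    labels = np.asarray(labels)
    classes = np.unique(labels)

    # Assumption: stratified, shuffled folds.
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = {"accuracy": [], "sensitivity": [], "specificity": [], "auc": []}

    for train_idx, test_idx in cv.split(texts, labels):
        # The vectorizer is fit inside each fold, so the vocabulary
        # never sees the held-out documents.
        model = make_pipeline(CountVectorizer(), MultinomialNB())
        model.fit(texts[train_idx], labels[train_idx])

        y_true = labels[test_idx]
        y_pred = model.predict(texts[test_idx])
        y_prob = model.predict_proba(texts[test_idx])

        # Per-class one-vs-rest counts from the confusion matrix.
        cm = confusion_matrix(y_true, y_pred, labels=classes)
        tp = np.diag(cm)
        fn = cm.sum(axis=1) - tp
        fp = cm.sum(axis=0) - tp
        tn = cm.sum() - tp - fn - fp

        scores["accuracy"].append(accuracy_score(y_true, y_pred))
        scores["sensitivity"].append(np.mean(tp / (tp + fn)))  # macro recall
        scores["specificity"].append(np.mean(tn / (tn + fp)))  # macro TNR
        if len(classes) == 2:
            auc = roc_auc_score(y_true, y_prob[:, 1])
        else:
            auc = roc_auc_score(y_true, y_prob, multi_class="ovr", labels=classes)
        scores["auc"].append(auc)

    return {name: float(np.mean(vals)) for name, vals in scores.items()}


# Usage on the assignment's data (downloads the corpus on first call):
# from sklearn.datasets import fetch_20newsgroups
# news = fetch_20newsgroups(subset="all")
# print(evaluate_multinomial_nb(news.data, news.target))
```

Wrapping the vectorizer and classifier in one pipeline is a design choice: it keeps the vocabulary from being built on test-fold documents, which would otherwise leak information into the evaluation.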
