Contact

  • QQ: 99515681
  • Email: 99515681@qq.com
  • Working hours: 8:00-21:00
  • WeChat: codinghelp

Python Programming

Date: 2022-03-26 08:34

Part 1:

1. Download the Titanic dataset from here: https://www.kaggle.com/brendan45774/test-file

2. Import the dataset into Jupyter Notebook. If there are NaNs in the “Age” column, replace them with the median of Age. (0.5 pts)

3. Use a Markdown cell to answer this question: Do you think gender/sex affected whether a passenger survived the Titanic incident? Why? Do you have evidence, or have you heard any stories? (0.5 pts)

4. Use a Markdown cell to answer this question: Which model will you use on this dataset, and why did you choose it? Please note that the output should be the “Survive” column. (0.5 pts)

5. Convert the “Sex” column from “object” to “numeric”. (0.5 pts)

6. Use a Markdown cell to answer this question: Which variables will you choose as inputs to build the model? Why? (0.5 pts)

7. Check all the variables you chose in step 6. If there are any NaNs, replace them with the average of the column. (0.5 pts)

8. Now build the model. Remember to split the data into train and valid sets first, then use the train set to build the model. (0.5 pts)

9. Use a Markdown cell to answer this question: Does this model have an “accuracy rate”? Explain why. (0.5 pts)

10. Use the .predict function to predict the results for the valid set. What is the accuracy rate of the model on the valid set? Is it good? (0.5 pts)

11. Make up an imaginary individual, and use Markdown cells to give a brief introduction of this individual (such as Sex, Age, Fare, etc.). Will this individual survive? (0.5 pts)
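The coding steps of Part 1 (steps 2, 5, 7, 8, 10, and 11) could be sketched roughly as follows. This is only one possible approach, not a model answer. It assumes the label column is named “Survived” and that “Pclass”, “Sex”, “Age”, and “Fare” are the chosen inputs. It uses logistic regression as an example classifier, and it runs on a small synthetic table (the hypothetical helper make_toy_titanic) instead of the real Kaggle file so that it works without a download.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def make_toy_titanic(n=200, seed=0):
    """Synthetic stand-in for the Kaggle CSV (hypothetical data, not the real file)."""
    rng = np.random.default_rng(seed)
    sex = rng.choice(["male", "female"], size=n)
    age = rng.uniform(1, 70, size=n)
    age[rng.random(n) < 0.2] = np.nan  # inject some missing ages
    # Toy rule for illustration only: one group survives more often.
    survive_prob = np.where(sex == "female", 0.8, 0.2)
    return pd.DataFrame({
        "Survived": (rng.random(n) < survive_prob).astype(int),
        "Pclass": rng.integers(1, 4, size=n),
        "Sex": sex,
        "Age": age,
        "Fare": rng.uniform(5, 100, size=n),
    })


# With the real data you would instead call pd.read_csv on the downloaded file.
df = make_toy_titanic()

# Step 2: replace missing ages with the median age.
df["Age"] = df["Age"].fillna(df["Age"].median())

# Step 5: convert "Sex" from text to numbers (the 0/1 coding is an arbitrary choice).
df["Sex"] = df["Sex"].map({"male": 0, "female": 1})

# Steps 6-7: assumed input variables; fill any remaining NaNs with the column mean.
features = ["Pclass", "Sex", "Age", "Fare"]
df[features] = df[features].fillna(df[features].mean())

# Step 8: split first, then fit on the train set only.
X_train, X_valid, y_train, y_valid = train_test_split(
    df[features], df["Survived"], test_size=0.2, random_state=0, stratify=df["Survived"]
)
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Step 10: accuracy on the valid set.
valid_accuracy = accuracy_score(y_valid, model.predict(X_valid))
print(f"Validation accuracy: {valid_accuracy:.3f}")

# Step 11: an imaginary passenger, described with the same columns as the inputs.
person = pd.DataFrame([{"Pclass": 2, "Sex": 1, "Age": 28.0, "Fare": 30.0}])[features]
person_prediction = model.predict(person)[0]
print("Predicted to survive" if person_prediction == 1 else "Predicted not to survive")
```

Because the toy data is random, the accuracy printed here says nothing about the real dataset; the point is only the order of operations: impute, encode, split, fit, then evaluate on held-out rows.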

Part 2:

1. Download the house dataset from here: https://www.kaggle.com/thomasnibb/amsterdam-house-price-prediction

2. This dataset is plain and simple: the output should be “Price”, and the inputs should be “Area”, “Room”, “Lon”, and “Lat”. Check whether there are any NaNs in these variables; if there are, replace them with the column mean. (0.5 pts)

3. Split the dataset into Train and Valid sets. Calculate the cross-validation (CV) score on the Train set for Linear Regression, Polynomial Regression (degrees 1 to 4), Lasso Regression, Ridge Regression, K-Nearest Neighbors (KNN) Regression, Decision Tree Regression, and Random Forest Regression. (3 pts)

4. Use a Markdown cell to answer this question: Which model will you choose and why? (0.5 pts)

5. Make up an imaginary house, and use the .predict function to predict its price with the model you chose. (1 pt)
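Steps 2, 3, and 5 of Part 2 might look something like the sketch below. Several things here are assumptions rather than requirements of the assignment: the data is a synthetic stand-in for the Kaggle file, the CV score is 5-fold R², polynomial regression is built as a PolynomialFeatures plus LinearRegression pipeline, scale-sensitive models get a StandardScaler, and the model with the highest mean CV score is picked automatically.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the housing CSV (hypothetical values, not real listings).
rng = np.random.default_rng(0)
n = 300
houses = pd.DataFrame({
    "Area": rng.uniform(30, 200, n),
    "Room": rng.integers(1, 7, n).astype(float),
    "Lon": rng.uniform(4.80, 5.00, n),
    "Lat": rng.uniform(52.30, 52.42, n),
})
houses["Price"] = 5000 * houses["Area"] + 20000 * houses["Room"] + rng.normal(0, 30000, n)
houses.loc[houses.index[::25], "Area"] = np.nan  # inject a few missing values

# Step 2: replace NaNs in the inputs with the column mean.
features = ["Area", "Room", "Lon", "Lat"]
houses[features] = houses[features].fillna(houses[features].mean())

# Step 3: split first, then cross-validate on the train set only.
X_train, X_valid, y_train, y_valid = train_test_split(
    houses[features], houses["Price"], test_size=0.2, random_state=0
)

models = {
    "Linear": LinearRegression(),
    "Lasso": make_pipeline(StandardScaler(), Lasso(max_iter=10000)),
    "Ridge": make_pipeline(StandardScaler(), Ridge()),
    "KNN": make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5)),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
}
for degree in range(1, 5):
    models[f"Polynomial (degree {degree})"] = make_pipeline(
        StandardScaler(), PolynomialFeatures(degree), LinearRegression()
    )

# Mean 5-fold R^2 per model, computed on the train set.
cv_scores = {
    name: cross_val_score(model, X_train, y_train, cv=5, scoring="r2").mean()
    for name, model in models.items()
}
for name, score in sorted(cv_scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:24s} mean CV R^2 = {score:.3f}")

# Steps 4-5: refit the best-scoring model and predict an imaginary house.
best_name = max(cv_scores, key=cv_scores.get)
best_model = models[best_name].fit(X_train, y_train)
imaginary_house = pd.DataFrame([{"Area": 85.0, "Room": 3.0, "Lon": 4.90, "Lat": 52.37}])[features]
predicted_price = best_model.predict(imaginary_house)[0]
print(f"{best_name} predicts a price of about {predicted_price:,.0f}")
```

Picking the top CV score is only one way to justify the choice in step 4; a simpler model with a nearly equal score may be easier to defend. The valid set stays untouched during model selection, so it can still be used for a final check.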


Copyright: 编程辅导网 2021 All Rights Reserved
Disclaimer: Some content on this site is compiled from the internet and is for reference only. If there is a copyright issue, contact the site to have it removed.
