COMP47670
Data Science in Python
2023 Autumn Practical Test
Overview
The notebook Aut2023Test_Core.ipynb contains code to load the file sheep.csv, which contains data on 300 sheep. There are three breeds from three locations, and the data were gathered over three years. In addition to the location, breed and year, the data record the fleece weight and overall weight of each sheep.
Tasks
1. Provide summary statistics of the numeric features and counts for the ‘breed’ feature. (10 marks)
2. The column fleece_w gives the fleece weight in kilograms. Add a new column fleece_g that gives the weight in grams. (10 marks)
3. Produce a bar-chart showing the mean fleece weight in kg of the three breeds. (10 marks)
4. Which breeds are found in just one location? (15 marks)
5. Produce a bar-chart (grouped bar-chart) showing counts of the different breeds across the three years. (15 marks)
6. Develop and test a regression model to estimate fleece weight from body weight.
Test using hold-out testing. For testing you can use the .score method on the model; this has the form <model>.score(X_test, y_test). (20 marks)
7. Repeat the exercise in Task 6 (above) but just for the Blackface sheep. Which model is more accurate? Why might this be the case? (20 marks)
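
As an illustration of the hold-out testing and .score call mentioned in Task 6, the following is a minimal sketch on synthetic data. It does not use sheep.csv: the column name weight, the made-up linear relationship, the 70/30 split and the choice of LinearRegression are all assumptions for illustration only; fleece_w is the only name taken from the task description. For scikit-learn regressors, .score returns the R^2 value on the data passed to it.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; these values are NOT the contents of sheep.csv.
rng = np.random.default_rng(0)
weight = rng.uniform(40, 90, size=300)                   # hypothetical body weight (kg)
fleece_w = 0.05 * weight + rng.normal(0, 0.3, size=300)  # made-up relationship plus noise
df = pd.DataFrame({"weight": weight, "fleece_w": fleece_w})

# Features must be 2-D (a DataFrame); the target can be a 1-D Series.
X = df[["weight"]]
y = df["fleece_w"]

# Hold-out testing: keep 30% of the rows aside and never fit on them.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Fit on the training portion only, then score on the held-out portion.
model = LinearRegression().fit(X_train, y_train)
r2 = model.score(X_test, y_test)
print(f"Hold-out R^2: {r2:.3f}")
```

The same pattern could be applied to a subset of rows (for example, one breed) by filtering the DataFrame before the split, so that the two hold-out scores can be compared.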