ISE529 Predictive Analytics 2024 Fall Homework 1 (Python programming)


Date: 2024-09-11 06:26

ISE529 Predictive Analytics

2024 Fall

Homework 1

Due by: Sept. 17, 2024, 11:59 PM

Instructions:

1. Print your First and Last name and NetID on your answer sheets

2. Submit all your answers, including Python scripts and report, in a single Jupyter Lab file (.ipynb), optionally accompanied by a single PDF, to Brightspace by the due date. No other file formats will be graded. No late submissions will be accepted.

3. Total 7 questions. Total points: 100

1. (16 points)

For each of parts (a) through (d), indicate whether we would generally expect the performance of a flexible statistical learning method to be better or worse than an inflexible method. Justify your answer.

(a) The sample size n is extremely large, and the number of predictors p is small.

(b) The number of predictors p is extremely large, and the number of observations n is small.

(d) The variance of the error terms, i.e. σ² = Var(ε), is extremely high.

2. (20 points)

Answer the following questions.

(a) Provide a sketch of typical (squared) bias, model variance, training error, test error, and irreducible error curves, on a single plot, as we go from less flexible statistical learning methods towards more flexible approaches. The x-axis should represent the amount of flexibility in the method, and the y-axis should represent the values for each curve. There should be five curves. Make sure to label each one.

(b) Explain why each of the five curves has the shape displayed in part (a).
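As background for both parts, the curves are linked by the standard decomposition of the expected test error at a point x0. The notation below (f for the true function, f̂ for the fitted model, y0 for the response at x0) is an assumption for illustration and may differ from the notation used in class.

```latex
\mathbb{E}\!\left[\bigl(y_0 - \hat{f}(x_0)\bigr)^2\right]
  = \operatorname{Var}\!\bigl(\hat{f}(x_0)\bigr)
  + \Bigl[\operatorname{Bias}\!\bigl(\hat{f}(x_0)\bigr)\Bigr]^2
  + \operatorname{Var}(\varepsilon),
\qquad
\operatorname{Bias}\!\bigl(\hat{f}(x_0)\bigr)
  = \mathbb{E}\!\bigl[\hat{f}(x_0)\bigr] - f(x_0).
```

The last term does not depend on the method, which is why the irreducible error is drawn as a horizontal line.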

3. (12 points)

If the random variable X follows a Poisson distribution with parameter λ, use the definitions of expectation and variance learned in class to show that E(X) = Var(X) = λ.
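A numerical check can confirm the result before or after the derivation is written out, though it does not replace the proof. The sketch below truncates the defining sums at a hypothetical cutoff `k_max`; the function names and the choice λ = 3.5 are illustrative assumptions, not part of the assignment.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam), with lam > 0, computed in log space."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def poisson_moments(lam: float, k_max: int = 200) -> tuple[float, float]:
    """Approximate E(X) and Var(X) by truncating the defining sums at k_max."""
    mean = sum(k * poisson_pmf(k, lam) for k in range(k_max + 1))
    second_moment = sum(k * k * poisson_pmf(k, lam) for k in range(k_max + 1))
    return mean, second_moment - mean ** 2


# Illustrative parameter value (an assumption, not from the assignment).
pois_mean, pois_var = poisson_moments(3.5)
print(pois_mean, pois_var)
```

Both printed values should agree with λ up to floating-point error, provided `k_max` is large relative to λ.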

4. (12 points)

Let X be the amount (in ounces) of soft drink in a randomly chosen bottle from company A, and Y be the amount of soft drink in a randomly chosen bottle from company B. A study has shown that the probability distributions of X and Y are as follows:

Find E(X), E(Y), Var(X), and Var(Y) and interpret them.
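The computation itself follows directly from the definitions E(X) = Σ x·p(x) and Var(X) = Σ (x − E(X))²·p(x). A minimal sketch is shown below; the probability values in `example_pmf` are hypothetical placeholders, not the distributions from the study, and the function name is an assumption.

```python
def discrete_mean_var(pmf: dict[float, float]) -> tuple[float, float]:
    """Return (E(X), Var(X)) for a discrete distribution given as {value: probability}."""
    total = sum(pmf.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"probabilities sum to {total}, expected 1")
    mean = sum(x * p for x, p in pmf.items())
    var = sum((x - mean) ** 2 * p for x, p in pmf.items())
    return mean, var


# Hypothetical distribution in ounces (NOT the table from the assignment).
example_pmf = {15.0: 0.25, 16.0: 0.5, 17.0: 0.25}
ex_mean, ex_var = discrete_mean_var(example_pmf)
print(ex_mean, ex_var)
```

Running the same function on the distributions of X and Y gives the four quantities to interpret: the means describe the average fill, and the variances describe how consistently each company fills its bottles.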

5. (15 points)

The table below provides a training data set containing six observations, three predictors, and one qualitative response variable. Use Jupyter Lab with Python to answer the following questions.

Suppose we wish to use this data set to make a prediction for Y when X1 = X2 = X3 = 0 using K-nearest neighbors.

(a) Compute the Euclidean distance between each observation and the test point, X1 = X2 = X3 = 0.

(b) What is our prediction with K = 1 and with K = 3? Why?

(c) If the Bayes decision boundary in this problem is highly nonlinear, then what would be the best choice for the value of K? Why?

6. (25 points)

Given the Auto data set (see attached Auto.csv), use Jupyter Lab with Python to answer the following questions. Make sure that the missing values in the data set have been removed before analysis is performed.

(a) Which of the predictors are quantitative, and which are qualitative?

(b) What is the range of each quantitative predictor?

(d) Now remove the 10th through 85th observations. What are the range, mean, and standard deviation of each predictor in the subset of the data that remains?

(e) Using the full data set, investigate the predictors graphically, using scatterplots or other tools of your choice. Create a matrix of scatter plots highlighting the relationships among the predictors.

(f) Suppose that we wish to predict gas mileage (mpg) on the basis of the other variables. Do your plots suggest that any of the other variables might be useful in predicting mpg?
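The data-preparation and summary steps above can be sketched as follows. This is a minimal standard-library version under stated assumptions: the inline `SAMPLE` text is a hypothetical stand-in for Auto.csv, the column names other than `mpg` are assumed, and the missing-value marker `?` is an assumption that should be checked against the actual file.

```python
import csv
import io
import statistics

# Hypothetical stand-in for Auto.csv (column names and values are assumptions).
SAMPLE = """mpg,horsepower,name
18.0,130,car a
15.0,?,car b
16.0,150,car c
24.0,95,car d
"""


def load_complete_rows(handle, missing=("?", "")):
    """Read CSV rows, dropping any row that contains a missing-value marker."""
    return [
        row
        for row in csv.DictReader(handle)
        if not any((value or "").strip() in missing for value in row.values())
    ]


def summarize(rows, column):
    """Range, mean, and sample standard deviation of one quantitative column."""
    values = [float(row[column]) for row in rows]
    return {
        "range": (min(values), max(values)),
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),
    }


def drop_observations(rows, first, last):
    """Remove the first-th through last-th observations (1-based, inclusive)."""
    return rows[: first - 1] + rows[last:]


rows = load_complete_rows(io.StringIO(SAMPLE))
mpg_summary = summarize(rows, "mpg")
remaining = drop_observations(rows, 2, 3)
print(mpg_summary, len(remaining))
```

For the real file, `open("Auto.csv", newline="")` would replace the `io.StringIO` handle, and `drop_observations(rows, 10, 85)` corresponds to the subset asked for in the question.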
