Date: 2019-03-01 09:08

STAT 440 - Spring 2019 - Midterm Project

Recall that you may use your notes, books, or even the internet to help answer these questions, but all of the
work should be your own and you should not ask anyone for help or about any details related to the class
and project during this 60-hour period (this includes face-to-face interactions, emails, internet forums, etc.).
You have half of Wednesday and all of Thursday and Friday to work on this project. You are to turn it in by
midnight on Friday. You will be graded on the accuracy of your answers, the efficiency of your coding, and
the organization/clarity of your write-up. You should have plenty of writing to convey what you are doing as
well as comments in your code to explain and make it easier to read. Show all of your work.

You should turn in an RMarkdown file as well as the output through Canvas. If you have any clarifying
questions or notice any issues, don’t hesitate to contact me and Nick. Note that in some of these questions, it
is up to you to select certain things; this is done on purpose so choose wisely and explain your decisions.

1. Consider the following density

f(x) = Cx3e4

for x ≥ 0.

a. Find the CDF and determine the normalizing constant C.

b. Use the inverse CDF method to simulate 100,000 draws from f. Plot the histogram of your sample.
Create a second plot where you zoom in on the x-axis and plot a kernel density estimate as well as
the true density. Comment on the results.

c. Suppose you want to use the normal density with mean 0 and variance σ², call it g(x|σ²), to
produce samples from f. Find a constant M such that

f(x)/g(x|σ²) ≤ M,

for all x. Note this constant can/should depend on σ². Feel free to either do this analytically or
do this numerically for a few different values of σ². Try to find a σ² which produces a small value
of M. Provide a few plots to justify your choice (or show the mathematics if you can).

d. Using your choice of σ² from above, produce a sample of size 10,000 from f using the accept/reject
method. Produce a histogram and use the sample to estimate the mean of f and produce a
standard error of your estimate.

e. Using the same σ² and g, use importance sampling (sample size 10,000) to again estimate the
mean of f and produce a standard error for your estimate. Compare with what you saw in (d).
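
As a rough illustration of the three sampling techniques named in question 1, here is a minimal sketch. It rests on several assumptions that are not in the assignment. The exponent of the density did not survive in this copy, so the code uses a stand-in target, f(x) = 4x³exp(−x⁴) on x ≥ 0, whose CDF F(x) = 1 − exp(−x⁴) inverts in closed form; the actual f may differ. The function names, the candidate σ values, the grid used to approximate M, and the seed are all hypothetical choices. The assignment asks for RMarkdown; Python is used here only to show the logic.

```python
import math
import random
import statistics


def f(x):
    # ASSUMED stand-in target density 4 x^3 exp(-x^4) on x >= 0 (not confirmed by the source).
    return 4.0 * x**3 * math.exp(-(x**4)) if x >= 0 else 0.0


def g(x, sigma):
    # Normal(0, sigma^2) proposal density.
    return math.exp(-x * x / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))


def inverse_cdf_draw(rng):
    # Solve u = F(x) = 1 - exp(-x^4) for x, with u ~ Uniform[0, 1).
    return (-math.log(1.0 - rng.random())) ** 0.25


def bound_M(sigma, upper=5.0, steps=50_000):
    # Grid approximation to sup over x > 0 of f(x) / g(x | sigma^2).
    # The range [0, upper] is an assumption that suits the sigmas tried below.
    xs = (i * upper / steps for i in range(1, steps + 1))
    return max(f(x) / g(x, sigma) for x in xs)


def accept_reject(n, sigma, rng):
    M = 1.001 * bound_M(sigma)  # small safety margin for grid error
    sample = []
    while len(sample) < n:
        x = rng.gauss(0.0, sigma)
        # Accept x with probability f(x) / (M g(x)); negative x has f = 0.
        if x > 0 and rng.random() * M * g(x, sigma) < f(x):
            sample.append(x)
    return sample


def importance_mean(n, sigma, rng):
    # E_f[X] = E_g[X f(X) / g(X)]; f is normalized, so a plain average works.
    terms = []
    for _ in range(n):
        x = rng.gauss(0.0, sigma)
        terms.append(x * f(x) / g(x, sigma))
    return statistics.fmean(terms), statistics.stdev(terms) / math.sqrt(n)


rng = random.Random(440)  # arbitrary seed for reproducibility

# Pick the candidate sigma with the smallest bound M (candidates are arbitrary).
sigma = min((0.5, 0.75, 1.0, 1.5), key=bound_M)

inv_sample = [inverse_cdf_draw(rng) for _ in range(100_000)]

ar_sample = accept_reject(10_000, sigma, rng)
ar_mean = statistics.fmean(ar_sample)
ar_se = statistics.stdev(ar_sample) / math.sqrt(len(ar_sample))

is_mean, is_se = importance_mean(10_000, sigma, rng)

print("chosen sigma:", sigma, "M:", bound_M(sigma))
print("inverse-CDF mean:", statistics.fmean(inv_sample))
print("accept/reject mean and SE:", ar_mean, ar_se)
print("importance sampling mean and SE:", is_mean, is_se)
```

Under the assumed density the exact mean is Γ(5/4), which gives a reference value for all three estimates. The accept/reject draws are exact samples from f, so their standard error is the usual sample standard deviation over √n, while the importance sampling standard error also reflects the variability of the weights f/g.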

2. The file Szeged_Weather_Summary.csv contains monthly averages for different weather metrics in
Szeged, Hungary.

a. The variable WindBearing denotes the direction from which the wind originates. The units are
degrees, with 0 denoting due north, 90 due east, 180 due south, and 270 due west. All of the winds
come from either the southeast or the southwest. Create a new variable, Direction, which indicates if
the direction is southeast (<=180) or southwest (>180). Construct boxplots of temperature vs
Direction.

b. Use a permutation test to determine if Direction is associated with Temperature (measured in
Celsius).

c. Use a bootstrap method to construct a 95% confidence interval for the effect of Direction on
Temperature (response variable here is Temperature). Do both a parametric and nonparametric
bootstrap. Compare the results.

d. Pick another variable of your choice to associate with Temperature while also including Direction as
another predictor. Explain why you think this variable is either important or interesting (to you).
Fit a linear regression model with the two predictors and use a bootstrap method to construct
95% confidence intervals for the coefficients of the two predictors. Interpret your results in the
context of the problem.

e. Suppose we wish to compare Temperature and ApparentTemp, as we suspect they may be quite
similar. Let μ1 be the true mean of Temperature and μ2 be the true mean of ApparentTemp. Use
the nonparametric bootstrap to test H0 : μ1 = μ2 versus H1 : μ1 ≠ μ2. Use a 5% significance level.
Perform similar two-tailed tests using nonparametric bootstrap for the median and IQR. For each
test, be sure to include the test statistic, p-value, and a proper conclusion.
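
The resampling ideas in question 2 can be sketched as follows, under stated assumptions. The CSV file is not available here, so every data value below is simulated placeholder data and not taken from Szeged_Weather_Summary.csv. The function names, group sizes, replication counts, and seed are hypothetical. Treating Temperature and ApparentTemp as paired by month, and testing the mean difference by recentring the differences at zero, is one common choice and not necessarily the intended one. Python is used only to show the logic; the assignment itself asks for RMarkdown.

```python
import random
import statistics

rng = random.Random(2019)  # arbitrary seed for reproducibility

# HYPOTHETICAL stand-in data: simulated, NOT values from Szeged_Weather_Summary.csv.
southeast = [rng.gauss(12.0, 8.0) for _ in range(40)]
southwest = [rng.gauss(12.0, 8.0) for _ in range(56)]
temperature = southeast + southwest
apparent = [t - 1.0 + rng.gauss(0.0, 0.5) for t in temperature]


def mean_diff(a, b):
    return statistics.fmean(a) - statistics.fmean(b)


def permutation_pvalue(a, b, reps, rng):
    # Two-sided permutation p-value: shuffle the group labels and recompute
    # the difference in means each time.
    observed = abs(mean_diff(a, b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(reps):
        rng.shuffle(pooled)
        if abs(mean_diff(pooled[:len(a)], pooled[len(a):])) >= observed:
            hits += 1
    return (hits + 1) / (reps + 1)


def bootstrap_ci(a, b, reps, rng, level=0.95):
    # Nonparametric percentile interval for mean(a) - mean(b),
    # resampling with replacement within each group.
    stats = sorted(
        mean_diff(rng.choices(a, k=len(a)), rng.choices(b, k=len(b)))
        for _ in range(reps)
    )
    lo = stats[int((1 - level) / 2 * reps)]
    hi = stats[int((1 + level) / 2 * reps) - 1]
    return lo, hi


def bootstrap_paired_test(x, y, reps, rng):
    # Test H0: mean(x) = mean(y) for paired data by resampling the
    # differences after shifting them to have mean zero (imposing H0).
    d = [xi - yi for xi, yi in zip(x, y)]
    observed = statistics.fmean(d)
    centered = [di - observed for di in d]
    hits = sum(
        abs(statistics.fmean(rng.choices(centered, k=len(d)))) >= abs(observed)
        for _ in range(reps)
    )
    return observed, (hits + 1) / (reps + 1)


p_perm = permutation_pvalue(southeast, southwest, 5000, rng)
ci_lo, ci_hi = bootstrap_ci(southeast, southwest, 5000, rng)
obs_paired, p_paired = bootstrap_paired_test(temperature, apparent, 5000, rng)

print("permutation p-value:", p_perm)
print("95% bootstrap CI for the mean difference:", ci_lo, ci_hi)
print("paired mean difference and bootstrap p-value:", obs_paired, p_paired)
```

A parametric bootstrap would replace the within-group resampling with draws from a fitted model (for example, normal distributions with the estimated group means and variances), and the same recentring idea carries over to the median and IQR by swapping the statistic.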

