Exercise - LR and Out of Sample Prediction
Generate 99 independent variables, each with 100 observations, uniformly distributed between -100 and 100.
Generate the dependent variable y = 3 + 10*V99, where V99 is the last covariate, and add some noise.
Construct 3 models: an intercept-only linear model with no variables, one with all the variables, and one with only the variable V99.
Compute the MSE of each model.
Hint: code for the first two points is provided below; a sketch of the remaining points follows it.
In [1]:
set.seed(123)
n <- 100
p <- 99
x <- matrix(runif(n*p, min=-100,max=100), n , p)
## Generate the output variable as a linear combination of x
## With jitter() you add random noise
y <- jitter(3 + 10*x[,99], factor=10000)
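A minimal sketch of the last two points, assuming the data generated above (when passed to data.frame(), the unnamed columns of x are renamed X1, ..., X99):
In [ ]:
## Combine covariates and response; the matrix columns become X1..X99 in the data frame
dat <- data.frame(x, y = y)
## The three candidate models
m0 <- lm(y ~ 1, data = dat)        # intercept only
m.all <- lm(y ~ ., data = dat)     # all 99 covariates
m99 <- lm(y ~ X99, data = dat)     # only the relevant covariate
## In-sample MSE of each model
mse <- function(obs, pred) mean((obs - pred)^2)
c(mse(y, fitted(m0)), mse(y, fitted(m.all)), mse(y, fitted(m99)))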
Pick from your data a random 1/5th of the observations.
Use the remaining 4/5ths of the observations to rebuild the three models.
Make predictions on the held-out 1/5th of the observations.
What do you observe now?
Hint: code for the first point is provided; a sketch of the remaining points follows it.
In [2]:
## Pick randomly 1/5th of the observations
ii <- sample(nrow(x), floor(nrow(x)/5))
## Build a test and a training set
data.te <- x[ii, ]
data.tr <- x[-ii, ]
y.te <- y[ii]
y.tr <- y[-ii]
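A minimal sketch of the remaining points, assuming the split above; note that the full model is now rank-deficient (99 covariates but only about 80 training rows), so predict() will warn accordingly:
In [ ]:
## Training and test data frames (columns are again named X1..X99 by data.frame())
dat.tr <- data.frame(data.tr, y = y.tr)
dat.te <- data.frame(data.te)
## Refit the three models on the training set only
m0 <- lm(y ~ 1, data = dat.tr)
m.all <- lm(y ~ ., data = dat.tr)
m99 <- lm(y ~ X99, data = dat.tr)
## Out-of-sample MSE on the held-out observations
mse <- function(obs, pred) mean((obs - pred)^2)
c(mse(y.te, predict(m0, newdata = dat.te)),
  mse(y.te, predict(m.all, newdata = dat.te)),
  mse(y.te, predict(m99, newdata = dat.te)))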
Exercise - Part 2
Now:
Build 99 different models, including from 1 to 99 input variables, on the training data (4/5ths of the observations)
For each model compute the out-of-sample MSE on the remaining 1/5th (test data)
Plot the out-of-sample MSE as a function of the number of variables
Hint: you may prefer to use a for-loop; a sketch is given below.
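A minimal sketch using a for-loop, assuming the training/test split from the previous exercise and adding the variables in column order (other orderings are possible; for large numbers of variables lm() becomes rank-deficient and predict() will warn):
In [ ]:
dat.tr <- data.frame(data.tr, y = y.tr)   # columns X1..X99 plus y
dat.te <- data.frame(data.te)
mse.oos <- numeric(99)
for (k in 1:99) {
  ## Model with the first k covariates
  fml <- as.formula(paste("y ~", paste(names(dat.tr)[1:k], collapse = " + ")))
  fit <- lm(fml, data = dat.tr)
  mse.oos[k] <- mean((y.te - predict(fit, newdata = dat.te))^2)
}
plot(1:99, mse.oos, type = "b",
     xlab = "Number of variables", ylab = "Out-of-sample MSE")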
Exercise - Cross Validation
We are interested in predicting the quality of wines using chemical indicators. To do so, we have at our disposal two data sets, for white and red wine, reporting the variable quality on a scale from 0 to 10.
white wine data
Find three models you think might be meaningful for the prediction, with different numbers of variables
Compute the in-sample mean squared error and the R squared
Compute the out-of-sample mean squared error using a test-training set approach (remember to set the seed)
Compute the out-of-sample mean squared error using 10-fold cross validation
Which wine would you buy now?
Hint: the skeleton for cross validation is provided below; a sketch of the other points follows it.
In [4]:
wine.white <- read.table("./data/wine-white.txt")
y <- wine.white$quality
## Skeleton for K-fold cross validation
#K <- 10
## Assign each observation to one of the K folds at random
#ii <- sample(rep(1:K, length = length(y)))
#predictions <- numeric(length(y))
#for (i in 1:K) {
#  hold <- (ii == i)
#  train <- (ii != i)
#  ##
#  ## Build the model on the training folds
#  ##
#  ## Store the predictions for the left-out segment
#  # predictions[hold] <-
#}
## Calculate estimated MSPE
#mean((y - predictions)^2)
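A minimal sketch of the first three points; quality is the only column name given in the exercise, so the predictors used in the smaller models (alcohol, volatile.acidity) are assumptions about the file and should be adapted after exploring the data:
In [ ]:
## Three candidate models with different numbers of variables
m1 <- lm(quality ~ ., data = wine.white)                           # all indicators
m2 <- lm(quality ~ alcohol + volatile.acidity, data = wine.white)  # assumed names
m3 <- lm(quality ~ alcohol, data = wine.white)                     # assumed name
## In-sample MSE and R squared
mse <- function(obs, pred) mean((obs - pred)^2)
sapply(list(m1, m2, m3),
       function(m) c(MSE = mse(y, fitted(m)), R2 = summary(m)$r.squared))
## Out-of-sample MSE with a test-training split (set the seed first)
set.seed(123)
ii <- sample(nrow(wine.white), floor(nrow(wine.white)/5))
sapply(list(quality ~ ., quality ~ alcohol + volatile.acidity, quality ~ alcohol),
       function(f) mse(wine.white$quality[ii],
                       predict(lm(f, data = wine.white[-ii, ]),
                               newdata = wine.white[ii, ])))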
Ridge Regression
We are interested in predicting the level of alcohol consumption during the weekend for students, controlling for many social and academic indicators. Some of them are the average grades over three years, the income of the family, the age, etc. In total we have 32 variables, but we want to find just the ones most correlated with alcohol consumption.
We will explore the linear model, ridge regression and the lasso.
Do the following:
Download the student txt file
Note: the dependent variable is Walc (weekend alcohol consumption)
In [ ]:
student <- read.table("./data/student.matG.txt")
Explore the variables and construct two different linear models. You can use any specification you think is most appropriate. Provide justifications.
Report the interpretation of the coefficients
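As a rough starting point, a sketch of two possible specifications; Walc is the only variable name given in the exercise, so the predictors used here (age, sex, studytime, goout, G3) are assumptions about the file and should be replaced after exploring the data:
In [ ]:
str(student)      # inspect the available variables and their types
## Two candidate specifications (predictor names are assumptions)
m.small <- lm(Walc ~ age + sex + studytime, data = student)
m.large <- lm(Walc ~ age + sex + studytime + goout + G3, data = student)
summary(m.small)
summary(m.large)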
Ridge Regression:
Construct a sequence of lambda values, e.g. from exp(-4) to exp(4) as in the hint code below
Use cross validation to find the best lambda to be used for estimating the ridge regression (use the skeleton provided in the hints of the previous exercises)
Fit the ridge regression with the lambda that has the minimum cross-validation error
Hint: code for the first two points is provided below; a sketch of the remaining steps follows it.
Model comparison:
Use cross validation to compare the linear models that you chose and the ridge regression.
Do you think it is the correct way to compare the models?
In [ ]:
## Hint code for the first part of the exercise
## Expand the design matrix (dummy-code the factors, drop the intercept column)
xm <- model.matrix(~ ., data = student[, -27])[, -1]
y <- student[, 27]
## Use these functions to standardize (dummy columns are treated separately)
standard_for_dummy <- function(k) {
  if (length(k[!duplicated(k)]) == 2) return(1)
  sd(k)
}
sd.tr <- apply(xm, 2, standard_for_dummy)
mu_for_dummy <- function(k) {
  if (length(k[!duplicated(k)]) == 2) return(0.5)
  mean(k)
}
mu.tr <- apply(xm, 2, mu_for_dummy)
## New (standardized) covariate matrix
xmn <- scale(xm, center = mu.tr, scale = sd.tr)
## Set your lambda sequence
lambdas.rr <- exp(seq(-4, 4, length = 50))
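A minimal sketch of the cross-validation and ridge steps, reusing the K-fold skeleton from the earlier exercise. For simplicity it standardizes once on the full sample (strictly, the centring and scaling should be recomputed inside each fold) and computes the ridge coefficients directly from the penalized normal equations rather than with a package:
In [ ]:
## Ridge coefficients for a standardized design matrix X and centred response y
ridge.coef <- function(X, y, lambda)
  solve(crossprod(X) + lambda * diag(ncol(X)), crossprod(X, y))

set.seed(123)
K <- 10
n <- nrow(xmn)
ii <- sample(rep(1:K, length = n))     # fold labels

## 10-fold cross-validation error for every lambda in the sequence
cv.mse <- sapply(lambdas.rr, function(lam) {
  pred <- numeric(n)
  for (i in 1:K) {
    hold <- (ii == i)
    train <- (ii != i)
    b0 <- mean(y[train])                              # intercept from the training folds
    beta <- ridge.coef(xmn[train, ], y[train] - b0, lam)
    pred[hold] <- b0 + xmn[hold, ] %*% beta
  }
  mean((y - pred)^2)
})

## Lambda with minimum cross-validation error and the corresponding ridge fit
best.lambda <- lambdas.rr[which.min(cv.mse)]
beta.rr <- ridge.coef(xmn, y - mean(y), best.lambda)
The same loop, with lm() in place of the ridge fit, gives cross-validated errors for the linear models, which is one way to make the comparison asked for in the last point.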