
Homework Week 3: Linear Regression and Neural Network

1.1 (5pts)

Provide the Python code to do the following:

Choose the seed 6996

Epsilon_vector = random_number_from_normal_distribution(500, 0, 1)
X_matrix <- random_number_from_normal_distribution(500*500, 0, 2)
ReshapeMatrix(X_matrix, 500, 500)  # you need to have a 500 x 500 matrix
slopesSet_vector <- random_number_from_uniform_distribution(500, 1, 5)
Y <- sapply(2:500, function(z) 1 + X_matrix[, 1:z] %*% slopesSet_vector[1:z] + Epsilon_vector)

If you check the dimensions of Y, you will find 500 x 499.

By construction all the predictors are expected to be significant and uncorrelated.

1.2 Analysis of accuracy of inference as a function of the number of predictors (5pts)

Plot the p-values for the 490 predictors. You should obtain a chart similar to the one below; do not try to replicate this chart exactly, as your p-values will be different.
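A minimal sketch of one way to produce such a plot with statsmodels, assuming X_mat and Y_mat from section 2.1 (column j of Y_mat was simulated with j + 2 predictors); fitting the 490-predictor model and the name fit_490 are illustrative assumptions, not the official solution:

import statsmodels.api as sm
import matplotlib.pyplot as plt

# fit the model with the first 490 predictors to the matching simulated response
fit_490 = sm.OLS(Y_mat[:, 488], sm.add_constant(X_mat[:, :490])).fit()

plt.plot(fit_490.pvalues[1:], 'o', markersize=3)  # skip the intercept's p-value
plt.xlabel('predictor index')
plt.ylabel('p-value')
plt.title('p-values of slope estimates (490-predictor model)')
plt.show()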

1.3 (5pts)

Plot the R-squared of all the models from 2 to 500 predictors, adding one new predictor at a time.
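A minimal sketch, assuming X_mat and Y_mat from section 2.1: refit an OLS model for each number of predictors and collect its R-squared (refitting 499 models takes a little while, and the largest models are nearly saturated since there are only 500 observations):

import statsmodels.api as sm
import matplotlib.pyplot as plt

r_squared = []
for j, k in enumerate(range(2, 501)):
    fit_k = sm.OLS(Y_mat[:, j], sm.add_constant(X_mat[:, :k])).fit()
    r_squared.append(fit_k.rsquared)

plt.plot(range(2, 501), r_squared)
plt.xlabel('number of predictors')
plt.ylabel('R-squared')
plt.show()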

1.4 (5pts)

Plot the confidence interval lower bound and upper bound for the coefficient beta_1 for all the models from 2 to 500.
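A minimal sketch, assuming X_mat and Y_mat from section 2.1 and reusing the same model loop as in 1.3; the 95% confidence level is the statsmodels default:

import statsmodels.api as sm
import matplotlib.pyplot as plt

lower, upper = [], []
for j, k in enumerate(range(2, 501)):
    fit_k = sm.OLS(Y_mat[:, j], sm.add_constant(X_mat[:, :k])).fit()
    ci_low, ci_high = fit_k.conf_int()[1]  # row 0 is the intercept, row 1 is beta_1
    lower.append(ci_low)
    upper.append(ci_high)

plt.plot(range(2, 501), lower, label='lower bound')
plt.plot(range(2, 501), upper, label='upper bound')
plt.xlabel('number of predictors')
plt.ylabel('95% confidence interval for beta_1')
plt.legend()
plt.show()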

Conclusions:

1. The more predictors, the higher the R-squared.

2. But inference for a fixed predictor becomes less and less accurate, which is shown by the widening confidence interval.

3. This means that if there is, for example, one significant predictor X_{i,1}, then by increasing the total number of predictors (even though all or many of them may be significant) we can damage the accuracy of the slope estimate for X_{i,1}.

4. This example shows one problem that DM has to face, which is not emphasized in traditional courses on statistical analysis, where only small numbers of predictors are considered.

2.1 Simulation of the data (2pts)

Generate the following data:

import numpy as np

import pandas as pd

import statsmodels.api as sm

import matplotlib.pyplot as plt

# from sklearn import linear_model

# from sklearn.metrics import mean_squared_error, r2_score

np.random.seed(6996)

# error term

epsilon_vec = np.random.normal(0,1,500).reshape(500,1)

# X_matrix, i.e. the regressors or predictors

X_mat = np.random.normal(0,2,size = (500,500))

# Slope

slope_vec = np.random.uniform(1,5,500)

# Simulate Ys

Y_mat = 1 + np.cumsum(X_mat * slope_vec,axis=1)[:,1:] + epsilon_vec

# each column of Y_mat is one simulated response: starting with 2 regressors, ending with 500

print(Y_mat.shape)

#You should have (500, 499)

2.2 Fitting linear models (5pts)

1) Fit a linear model with the first 10 predictors. Store the result in the variable m10.

2) Fit a linear model with 491 predictors. Store the result in the variable v490.
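A minimal sketch with statsmodels, assuming X_mat and Y_mat from section 2.1; the choice of response column is an assumption (column j of Y_mat was simulated with j + 2 predictors):

import statsmodels.api as sm

# 1) first 10 predictors
m10 = sm.OLS(Y_mat[:, 8], sm.add_constant(X_mat[:, :10])).fit()
print(m10.summary())

# 2) 491 predictors
v490 = sm.OLS(Y_mat[:, 489], sm.add_constant(X_mat[:, :491])).fit()
print(v490.rsquared)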

2.3 Ridge regression (5pts)

1) Apply ridge regression to the data with 10 predictors.

2) Separate the sample into train and test.

3) Select the best parameter λ using cross-validation on the train set.

4) Calculate the mean squared prediction error for the best selected λ.

5) Compare it with the mean squared prediction error of the linear model (see the sketch after the observations below).

What you should observe:

Ridge regression did not select predictors. This is expected because we simulated all predictors to be significant.

Ridge regression made a small improvement to the mean squared prediction error. This is consistent with expectations because it has one additional parameter.

Regularization is expected to reduce the number of predictors when there are collinear (highly correlated) predictors.

The predictors in this example are not collinear.

2.4 Lasso regression (5pts)

1) Fit lasso regression to the first 10 predictors.

2) Fit the model to the entire data (a sketch follows below).

Lasso regression marginally improved the mean squared error relative to the linear model, but did worse than ridge regression.

It kept all 10 predictors and produced similar estimates of the parameters.
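A minimal sketch using scikit-learn, assuming the train/test split and variable names from the ridge sketch above; the λ (alpha) grid is an assumption:

import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.metrics import mean_squared_error

# 1) lasso on the first 10 predictors, lambda chosen by cross-validation on the train set
lasso10 = LassoCV(alphas=np.logspace(-3, 1, 50), cv=5, max_iter=10000).fit(X_tr, y_tr)
print('best lambda:', lasso10.alpha_)
print('test MSE:', mean_squared_error(y_te, lasso10.predict(X_te)))

# 2) refit on the entire 10-predictor data set
lasso10_full = LassoCV(alphas=np.logspace(-3, 1, 50), cv=5, max_iter=10000).fit(X10, y10)
print('predictors kept:', np.sum(lasso10_full.coef_ != 0))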

2.5 Large number of significant predictors (5pts)

1) Apply lasso regression analysis to the data with 490 predictors.

2) Note that there are no actual slopes close to zero, but lasso regression still pushes some of them to zero when λ > 0.

3) Calculate the mean squared prediction error for the best λ.

4) Fit the lasso regression model to the entire data.

Plot the set of true slopes used in the simulation and mark the slopes removed by lasso (see the sketch below).

Lasso removed predictors seemingly at random, regardless of the value of the slope.

Given the way the sample was simulated (independent predictors with slopes between 1 and 5), it would be more reasonable to remove none, or to remove only the predictors with the smallest slopes.
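A minimal sketch, assuming X_mat, Y_mat and slope_vec from section 2.1; the choice of response column, the split and the variable names are assumptions:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LassoCV
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# data with 490 predictors (Y_mat column 488 was simulated with 490 predictors)
X490 = X_mat[:, :490]
y490 = Y_mat[:, 488]

X_tr, X_te, y_tr, y_te = train_test_split(X490, y490, test_size=0.3, random_state=6996)

# 1)-3) lasso with cross-validated lambda and its test mean squared prediction error
lasso490 = LassoCV(cv=5, max_iter=10000).fit(X_tr, y_tr)
print('best lambda:', lasso490.alpha_)
print('test MSE:', mean_squared_error(y_te, lasso490.predict(X_te)))

# 4) refit on the entire data and mark the slopes removed by lasso
lasso490_full = LassoCV(cv=5, max_iter=10000).fit(X490, y490)
removed = lasso490_full.coef_ == 0

plt.plot(slope_vec[:490], '.', label='true slope')
plt.plot(np.where(removed)[0], slope_vec[:490][removed], 'rx', label='removed by lasso')
plt.xlabel('predictor index')
plt.ylabel('slope')
plt.legend()
plt.show()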

3. Neural Network

You can download the data session3_homework.csv on piazza.

You will use the following libraries:

import numpy as np

import pandas as pd

from sklearn.model_selection import train_test_split

import matplotlib.pyplot as plt

from sklearn.neural_network import MLPClassifier as MLP

from sklearn.metrics import confusion_matrix

data = pd.read_csv('session3_homework.csv')

data.head()

Credit scoring is the practice of analysing a person's background and credit application in order to assess the creditworthiness of that person.

We are trying to find which parameters impact creditworthiness:

creditworthiness = f(income, age, gender, …)

The dataset contains information on different clients who received a loan at least 10 years ago.

The following variables are available: income (yearly), age, loan (size in euros), and LTI (the loan to yearly income ratio).

The goal is to predict, based on the input variables LTI and age, whether or not a default will occur within 10 years.

Step 1: Separate the data into train and test.

X = np.array(data[['LTI','age']])

Y = np.array(data['default10yr'])

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.3)

Step 2: Train a neural net with one hidden layer containing 4 neurons. Plot the configuration if possible.

XOR_MLP = MLP(activation='tanh',alpha=0.,batch_size='auto',beta_1=0.9,beta_2=0.999,\

early_stopping=False,epsilon=1e-08,hidden_layer_sizes= (4,),\

learning_rate='constant',learning_rate_init = 0.1,max_iter=5000,momentum=0.5,\

nesterovs_momentum=True,power_t=0.5,random_state=0,shuffle=True,solver='sgd',\

tol=0.0001, validation_fraction=0.1,verbose=False,warm_start=False)

XOR_MLP.fit(X_train,Y_train)

def draw_neural_net(ax, left, right, bottom, top, layer_sizes, coefs_, intercepts_,
                    input_list, out_put, np, plt):
    n_layers = len(layer_sizes)
    v_spacing = (top - bottom)/float(max(layer_sizes))
    h_spacing = (right - left)/float(len(layer_sizes) - 1)
    layer_top_0 = v_spacing*(layer_sizes[0] - 1)/2. + (top + bottom)/2.
    for m in range(layer_sizes[0]):
        plt.arrow(left-0.18, layer_top_0 - m*v_spacing, 0.12, 0, lw=1,
                  head_width=0.01, head_length=0.02)
    for n, layer_size in enumerate(layer_sizes):
        layer_top = v_spacing*(layer_size - 1)/2. + (top + bottom)/2.
        for m in range(layer_size):
            circle = plt.Circle((n*h_spacing + left, layer_top - m*v_spacing),
                                v_spacing/8., color='w', ec='k', zorder=4)
            if n == 0:
                plt.text(left-0.125, layer_top - m*v_spacing, input_list[m], fontsize=15)
            elif n == n_layers - 1:
                plt.text(n*h_spacing + left+0.05, layer_top - m*v_spacing, out_put, fontsize=15)
            ax.add_artist(circle)
    for n, layer_size in enumerate(layer_sizes):
        if n < n_layers - 1:
            x_bias = (n+0.5)*h_spacing + left
            y_bias = top + 0.005
            circle = plt.Circle((x_bias, y_bias), v_spacing/8., color='w', ec='b', zorder=4)
            plt.text(x_bias, y_bias, str(1), color='k', fontsize=15)
            ax.add_artist(circle)
    # Edges between nodes
    for n, (layer_size_a, layer_size_b) in enumerate(zip(layer_sizes[:-1], layer_sizes[1:])):
        layer_top_a = v_spacing*(layer_size_a - 1)/2. + (top + bottom)/2.
        layer_top_b = v_spacing*(layer_size_b - 1)/2. + (top + bottom)/2.
        for m in range(layer_size_a):
            for o in range(layer_size_b):
                line = plt.Line2D([n*h_spacing + left, (n + 1)*h_spacing + left],
                                  [layer_top_a - m*v_spacing, layer_top_b - o*v_spacing], c='k')
                ax.add_artist(line)
                xm = (n*h_spacing + left)
                xo = ((n + 1)*h_spacing + left)
                ym = (layer_top_a - m*v_spacing)
                yo = (layer_top_b - o*v_spacing)
                rot_mo_rad = np.arctan((yo-ym)/(xo-xm))
                rot_mo_deg = rot_mo_rad*180./np.pi
                xm1 = xm + (v_spacing/8.+0.05)*np.cos(rot_mo_rad)
                if n == 0:
                    if yo > ym:
                        ym1 = ym + (v_spacing/8.+0.12)*np.sin(rot_mo_rad)
                    else:
                        ym1 = ym + (v_spacing/8.+0.05)*np.sin(rot_mo_rad)
                else:
                    if yo > ym:
                        ym1 = ym + (v_spacing/8.+0.12)*np.sin(rot_mo_rad)
                    else:
                        ym1 = ym + (v_spacing/8.+0.04)*np.sin(rot_mo_rad)
                plt.text(xm1, ym1, str(round(coefs_[n][m, o], 4)),
                         rotation=rot_mo_deg, fontsize=10)
    # Edges between bias and nodes
    for n, (layer_size_a, layer_size_b) in enumerate(zip(layer_sizes[:-1], layer_sizes[1:])):
        if n < n_layers - 1:
            layer_top_a = v_spacing*(layer_size_a - 1)/2. + (top + bottom)/2.
            layer_top_b = v_spacing*(layer_size_b - 1)/2. + (top + bottom)/2.
            x_bias = (n+0.5)*h_spacing + left
            y_bias = top + 0.005
            for o in range(layer_size_b):
                line = plt.Line2D([x_bias, (n + 1)*h_spacing + left],
                                  [y_bias, layer_top_b - o*v_spacing], c='b')
                ax.add_artist(line)
                xo = ((n + 1)*h_spacing + left)
                yo = (layer_top_b - o*v_spacing)
                rot_bo_rad = np.arctan((yo-y_bias)/(xo-x_bias))
                rot_bo_deg = rot_bo_rad*180./np.pi
                xo2 = xo - (v_spacing/8.+0.01)*np.cos(rot_bo_rad)
                yo2 = yo - (v_spacing/8.+0.01)*np.sin(rot_bo_rad)
                xo1 = xo2 - 0.05*np.cos(rot_bo_rad)
                yo1 = yo2 - 0.05*np.sin(rot_bo_rad)
                plt.text(xo1, yo1, str(round(intercepts_[n][o], 4)),
                         rotation=rot_bo_deg, fontsize=10)
    layer_top_0 = v_spacing*(layer_sizes[-1] - 1)/2. + (top + bottom)/2.
    for m in range(layer_sizes[-1]):
        plt.arrow(right+0.015, layer_top_0 - m*v_spacing, 0.16*h_spacing, 0, lw=1,
                  head_width=0.01, head_length=0.02)

input_list = ['LTI','age']

out_put = 'default10yr'

fig = plt.figure(figsize=(12, 12))

ax = fig.gca()

ax.axis('off')
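The call that actually draws the network is not shown above; a minimal sketch of such a call, assuming a plotting region of (0.1, 0.9, 0.1, 0.9) and layer sizes [2, 4, 1] (2 inputs, 4 hidden neurons, 1 output):

# draw the fitted network on the axes prepared above (region and layer sizes are assumptions)
draw_neural_net(ax, 0.1, 0.9, 0.1, 0.9, [2, 4, 1],
                XOR_MLP.coefs_, XOR_MLP.intercepts_, input_list, out_put, np, plt)
plt.show()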

Step 3: Predict the output and measure its accuracy.
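A minimal sketch of one way to carry out step 3, assuming the fitted XOR_MLP and the train/test split from step 1:

from sklearn.metrics import confusion_matrix

# predict on the held-out test set and evaluate
Y_pred = XOR_MLP.predict(X_test)
print(confusion_matrix(Y_test, Y_pred))
print('accuracy:', XOR_MLP.score(X_test, Y_test))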

(You do not have to find the same results, this is just an example)

Accuracy = 0.996 (i.e. 99.6%)

