
MATH10098: Numerical Linear Algebra

Project: Regularisation for Least Squares Problems

November 14, 2018

In the course, we have defined the least squares problem (LSQ) as that of finding the minimiser x ∈ R^n of ‖Ax − b‖₂, for A ∈ R^(m×n) and b ∈ R^m. Theorem 4.1 tells us that this minimiser x solves the normal equations

AᵀAx = Aᵀb. (1)

For rank(A) = n and m > n, AᵀA is invertible, and the unique solution of the normal equations (1) is

x = A†b, where A† = (AᵀA)⁻¹Aᵀ. (2)

In Assignment A3, we saw how the method of least squares could be used to fit a linear model of the form b = x₀ + a x₁ to a set of observed data points {a^(i), b^(i)}, i = 1, …, m, by forming the Vandermonde matrix A ∈ R^(m×2) with entries A_ij = (a^(i))^(j−1), and solving the normal equations (1). This process is called simple linear regression, and finds the straight line which best fits the observations.

Suppose a straight line doesn't quite fit the observations {a^(i), b^(i)}, i = 1, …, m. We want to fit a polynomial of degree (n − 1) to the data, of the form

b = x₀ + a x₁ + a² x₂ + … + a^(n−1) x_{n−1} = ∑_{j=0}^{n−1} a^j x_j.

We can do this with polynomial regression, which is simply an extension of simple linear regression.

The least squares method for polynomial regression requires forming the Vandermonde matrix A ∈ R^(m×n), with elements A_ij = (a^(i))^(j−1) for j = 1, …, n, and solving the normal equations (1).

Although the model is a polynomial in terms of the input variable a, it is a linear function

of the unknown model parameters xj — this is why we can still perform linear regression.

(a) In this question, we want to implement the least squares method to fit a polynomial to a

set of data observations. The degree of the best fitting polynomial is unknown.

In a MATLAB script named LSQ.m:

(i) Load data_points_50.m, containing a set of 50 data points with values {a^(i), b^(i)}, i = 1, …, 50, stored in the arrays a and b.

(ii) We want to fit the data with a polynomial of degree (n − 1), for n = 2, …, 11. For each n, set up the Vandermonde matrix A of size m × n.

(iii) For each n, solve the normal equations for x using backslash. Compute the condition

number of ATA in the 2-norm with the cond function in MATLAB. Comment on the

results.
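The steps in (i)–(iii) could be sketched as follows (a sketch only: the exact format of the data file and the variable names are assumptions; the Vandermonde matrix is built column by column rather than with MATLAB's vander, which orders the powers the other way round):

```matlab
% LSQ.m -- sketch of parts (i)-(iii): polynomial fit via the normal equations.
load data_points_50.m            % per the handout; assumed to define arrays a, b
m = length(a);
for n = 2:11
    A = zeros(m, n);
    for j = 1:n
        A(:, j) = a.^(j-1);      % Vandermonde entries A_ij = (a^(i))^(j-1)
    end
    x = (A'*A) \ (A'*b);         % normal equations (1), solved with backslash
    fprintf('n = %2d: cond(A''A) = %.3e\n', n, cond(A'*A));
end
```

The printed condition numbers are what question (iii) asks you to comment on; they grow rapidly with n.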


(b) The matrix ATA is badly conditioned. As we’ve seen in the lectures, we can use the reduced

SVD of A to solve the least squares problem, allowing us to avoid forming ATA.

Write a new MATLAB script named LSQ_SVD.m, adapted from LSQ.m. Instead of solving the normal equations directly, for each n, compute the reduced SVD of A using the function svd. Compute x using Algorithm LSQ-SVD.
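Inside the loop over n, the SVD-based solve might look like this (a sketch; it assumes Algorithm LSQ-SVD from the lectures is the usual x = V Σ⁻¹ U₁ᵀ b):

```matlab
% LSQ_SVD.m -- sketch: solve the LSQ problem from the reduced SVD of A.
[U1, S1, V] = svd(A, 'econ');    % reduced SVD: A = U1*S1*V', with U1 of size m-by-n
x = V * (S1 \ (U1' * b));        % x = V * Sigma^{-1} * (U1' * b); A'A is never formed
```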

(c) A more complex model is not necessarily a better model. Overfitting occurs when a model

is complex enough to correspond very closely to a particular set of data points, but does

not generalise well, and will not be able to predict future observations. Instead of capturing

the relationship between input and output in a dataset, an overfitted model describes the

statistical noise in the data.

(i) Plot the discrete data points with plot(a,b,'.'). Plot all the resulting polynomials using the plot_poly function provided on Learn, on the same graph as the data points, over the interval [0, 6]. Comment on your plots; do the high-degree polynomials fit the data more closely? Would they be useful to extrapolate the behaviour of the data beyond the interval over which we have collected observations? Do you observe overfitting?

(ii) A method to avoid overfitting is regularisation. Instead of minimising ‖Ax − b‖₂², we seek to minimise the new cost function

‖Ax − b‖₂² + μ‖x‖₂²,

where μ > 0 is a regularisation parameter. The idea is to introduce a penalty term,

so that the polynomial coefficients x (particularly the high-order coefficients) remain

relatively small.

Show that if x minimises ‖Ax − b‖₂² + μ‖x‖₂², then it solves the regularised normal equations

(AᵀA + μI)x = Aᵀb. (3)
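One standard route to (3), sketched here for reference (this argument is not part of the handout): the cost function is smooth and convex in x, so its minimiser is the point where the gradient vanishes:

```latex
\nabla_x\left( \|Ax-b\|_2^2 + \mu \|x\|_2^2 \right)
  = 2A^T(Ax - b) + 2\mu x = 0
\quad\Longrightarrow\quad
(A^T A + \mu I)\,x = A^T b .
```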

(iii) Let A = U₁Σ₁Vᵀ be the reduced SVD of A. Show that x is given by

x = VSU₁ᵀb,

where the matrix S ∈ R^(n×n) is diagonal, with diagonal entries given by

s_ii = σ_i / (σ_i² + μ), i = 1, …, n, (4)

with σ_i the singular values of A.
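A sketch of the substitution that leads to (4) (again, not part of the handout): inserting the reduced SVD A = U₁Σ₁Vᵀ into the regularised normal equations (3), and using U₁ᵀU₁ = I and VVᵀ = VᵀV = I, gives

```latex
(V\Sigma_1^2 V^T + \mu I)\,x = V\Sigma_1 U_1^T b
\quad\Longrightarrow\quad
x = V(\Sigma_1^2 + \mu I)^{-1}\Sigma_1 U_1^T b = V S U_1^T b,
\qquad s_{ii} = \frac{\sigma_i}{\sigma_i^2 + \mu}.
```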

(iv) Write a script called LSQ_reg.m, adapted from LSQ_SVD.m, which computes x using

Equation (4) (again, for each n).
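As a sketch, the solve via Equation (4) could replace the SVD solve inside the loop over n (mu here is one of the test values from (v); the variable names are assumptions):

```matlab
% LSQ_reg.m -- sketch: regularised least squares via Equation (4).
mu = 0.5;                        % regularisation parameter (one of the test values)
[U1, S1, V] = svd(A, 'econ');    % reduced SVD of the Vandermonde matrix
sigma = diag(S1);                % singular values sigma_i of A
s = sigma ./ (sigma.^2 + mu);    % diagonal entries s_ii from Equation (4)
x = V * (s .* (U1' * b));        % x = V*S*U1'*b, applied without forming S explicitly
```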

(v) Test your script with μ = 5, 0.5, 0.01, and 10⁻⁵. For each value of μ, produce the same plots as in Question (i).

How does regularisation affect the fitted polynomials, and the coefficient values?

Which value of μ produces the best results? What happens when μ is too large,

or too small?

