
Student Number

Semester 2 Assessment, 2019

School of Mathematics and Statistics

MAST90083 Computational Statistics and Data Mining

Writing time: 3 hours

Reading time: 15 minutes

This is NOT an open book exam

This paper consists of 3 pages (including this page)

Authorised Materials

• Mobile phones, smart watches and internet or communication devices are forbidden.

• No handwritten or print materials may be brought into the exam venue.

• This is a closed book exam.

• No calculators of any kind may be brought into the examination.

Instructions to Students

• You must NOT remove this question paper at the conclusion of the examination.

Instructions to Invigilators

• Students must NOT remove this question paper at the conclusion of the examination.

This paper must NOT be held in the Baillieu Library


Question 1 Suppose we have a model p(x, z | θ), where x is the observed dataset and z are the latent variables.

(a) Suppose that q(z) is a distribution over z. Explain why the following

F(q, θ) = E_q[log p(x, z | θ) − log q(z)]

is a lower bound on log p(x | θ).

(b) Show that F(q, θ) can be decomposed as follows

F(q, θ) = −KL(q(z) || p(z|x, θ)) + log p(x | θ)

where, for any two distributions p and q, KL(q || p) = −E_q[log(p(z)/q(z))] is the Kullback–Leibler (KL) divergence.

(c) Describe the EM algorithm in terms of F(q, θ).

(d) Note that the KL divergence is always non-negative. Furthermore, it is zero if and only if p = q. Conclude that the optimal q that maximises F is p(z | x, θ).

[10 + 10 + 5 + 5 = 30 marks]
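A minimal numerical sketch of parts (a), (c) and (d), assuming a toy two-component Gaussian mixture (the mixture, the data, and all variable names below are illustrative assumptions, not taken from the question): it checks that F(q, θ) never exceeds log p(x | θ), that equality holds at q(z) = p(z | x, θ), and that the EM updates of part (c), viewed as coordinate ascent on F, never decrease it.

```python
import numpy as np

# Toy two-component Gaussian mixture (illustrative assumption): theta = (pi, mu, sigma),
# observed data x, latent component labels z.
rng = np.random.default_rng(0)
pi = np.array([0.3, 0.7])        # mixing weights
mu = np.array([-2.0, 1.5])       # component means
sigma = np.array([1.0, 0.5])     # component standard deviations
z_true = rng.choice(2, size=50, p=pi)
x = rng.normal(mu[z_true], sigma[z_true])

def log_norm(x, m, s):
    return -0.5 * np.log(2 * np.pi * s**2) - (x - m)**2 / (2 * s**2)

# log p(x_i, z_i = k | theta) and log p(x_i | theta) under the true theta.
log_joint = np.log(pi) + log_norm(x[:, None], mu, sigma)        # shape (n, 2)
log_evidence = np.logaddexp(log_joint[:, 0], log_joint[:, 1])   # shape (n,)

def F(q, log_joint):
    """F(q, theta) = E_q[log p(x, z | theta) - log q(z)]."""
    return np.sum(q * (log_joint - np.log(q)))

posterior = np.exp(log_joint - log_evidence[:, None])  # p(z | x, theta)
uniform = np.full_like(posterior, 0.5)                  # an arbitrary alternative q

print("log p(x | theta):", log_evidence.sum())
print("F at q = posterior:", F(posterior, log_joint))   # equals the evidence, part (d)
print("F at q = uniform:  ", F(uniform, log_joint))     # strictly smaller, part (a)

# Part (c): EM alternates an E-step, q <- p(z | x, theta), which maximises F over q,
# with an M-step that maximises F over theta for that q; F never decreases.
pi, mu, sigma = np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0])
for it in range(5):
    lj = np.log(pi) + log_norm(x[:, None], mu, sigma)
    q = np.exp(lj - np.logaddexp(lj[:, 0], lj[:, 1])[:, None])   # E-step
    nk = q.sum(axis=0)                                            # M-step (weighted MLE)
    pi, mu = nk / nk.sum(), (q * x[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((q * (x[:, None] - mu)**2).sum(axis=0) / nk)
    lj_new = np.log(pi) + log_norm(x[:, None], mu, sigma)
    print(f"F after EM iteration {it + 1}:", F(q, lj_new))
```

In the decomposition of part (b), the gap log p(x | θ) − F(q, θ) is exactly KL(q(z) || p(z | x, θ)), which is why the uniform q above falls short of the evidence while the posterior attains it.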

Question 2 Let {(x_i, y_i)}_{i=1}^n be our dataset, with x_i ∈ R^p and y_i ∈ R. Classic linear regression can be posed as empirical risk minimisation, where the model predicts y using the class of functions f(x) = w^T x, parametrised by a vector w ∈ R^p, using the squared loss, i.e. we minimise

∑_{i=1}^n (y_i − w^T x_i)^2.

(a) Show that the optimal parameter vector is

ŵ_n = (X^T X)^{−1} X^T Y,

where X is the n × p matrix whose i-th row is x_i^T, and Y is the n × 1 column vector whose i-th entry is y_i.

(b) Consider regularising the empirical risk by incorporating an l2 penalty. That is, find the w minimising

∑_{i=1}^n (y_i − w^T x_i)^2 + λ‖w‖^2.

Show that the optimal parameter is given by the ridge regression estimator

ŵ_n^ridge = (X^T X + λI)^{−1} X^T Y.
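A minimal numerical sketch of parts (a) and (b) on assumed simulated data (the data, dimensions, and variable names below are illustrative assumptions): it computes both closed forms with a linear solve and checks that each one makes the gradient of its objective vanish.

```python
import numpy as np

# Illustrative simulated regression data (assumption for this sketch).
rng = np.random.default_rng(1)
n, p, lam = 200, 5, 3.0
X = rng.normal(size=(n, p))
Y = X @ rng.normal(size=p) + 0.1 * rng.normal(size=n)

# Part (a): w_hat = (X^T X)^{-1} X^T Y, computed via a linear solve.
w_ols = np.linalg.solve(X.T @ X, X.T @ Y)

# Part (b): w_ridge = (X^T X + lam * I)^{-1} X^T Y.
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ Y)

# Each estimator should zero the gradient of its objective:
#   OLS:   -2 X^T (Y - X w)                for  ||Y - X w||^2
#   ridge: -2 X^T (Y - X w) + 2 lam w      for  ||Y - X w||^2 + lam ||w||^2
print(np.linalg.norm(-2 * X.T @ (Y - X @ w_ols)))                        # ~ 0
print(np.linalg.norm(-2 * X.T @ (Y - X @ w_ridge) + 2 * lam * w_ridge))  # ~ 0
```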

(c) Suppose we now wish to introduce nonlinearities into the model, by transforming x to φ(x). Let Φ be the matrix with i-th row given by φ(x_i)^T.

(i) Show that the optimal parameters would be given by

ŵ_n^kernel = (Φ^T Φ + λI)^{−1} Φ^T Y.

(ii) Express the predicted y values on the training set, Φŵ_n^kernel, only in terms of Y and the Gram matrix K = ΦΦ^T, with K_ij = φ(x_i)^T φ(x_j) = k(x_i, x_j), where k is some kernel function. (This is known as the kernel trick.) Hint: You will find the following matrix inversion formula useful:


(iii) Compute an expression for the value y* predicted by the model at an unseen test vector x*.

[5+5+5+10+5 = 30 marks]
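A minimal numerical sketch of part (c), assuming an explicit degree-3 polynomial feature map (the map, the data, and the variable names below are illustrative assumptions): the primal predictions Φ(Φ^T Φ + λI)^{−1} Φ^T Y agree with the dual predictions K(K + λI)^{−1} Y built from the Gram matrix and Y alone, and the prediction at a new point x* needs only the kernel evaluations k(x*, x_i).

```python
import numpy as np

# Illustrative 1-D data and an explicit polynomial feature map (assumptions for this sketch).
rng = np.random.default_rng(2)
n, lam = 30, 0.5
x = rng.uniform(-1, 1, size=n)
Y = np.sin(3 * x) + 0.1 * rng.normal(size=n)

def phi(x):
    # Degree-3 polynomial features, so that k(x, x') = phi(x)^T phi(x').
    return np.stack([np.ones_like(x), x, x**2, x**3], axis=-1)

Phi = phi(x)           # n x 4 feature matrix
K = Phi @ Phi.T        # n x n Gram matrix

# Primal (explicit-feature) prediction on the training set, part (i).
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(Phi.shape[1]), Phi.T @ Y)
yhat_primal = Phi @ w

# Dual (kernel-trick) prediction, part (ii): only K and Y are needed.
alpha = np.linalg.solve(K + lam * np.eye(n), Y)
yhat_dual = K @ alpha

print("max difference:", np.max(np.abs(yhat_primal - yhat_dual)))   # ~ 1e-12

# Part (iii): prediction at an unseen x* uses only k(x*, x_i) = phi(x*)^T phi(x_i).
x_star = np.array([0.3])
k_star = phi(x_star) @ Phi.T        # 1 x n vector of kernel evaluations
print("prediction at x*:", (k_star @ alpha)[0], "vs", (phi(x_star) @ w)[0])
```

The agreement of the two prediction paths is the push-through identity at work: Φ(Φ^T Φ + λI)^{−1} Φ^T = ΦΦ^T(ΦΦ^T + λI)^{−1}, so once K can be evaluated, φ never has to be formed explicitly.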

Total marks = 60

End of Exam


