代写COMP0083、c++，java程序代写-代写Java编程

联系方式

QQ：99515681
邮箱：99515681@qq.com
工作时间：8:00-21:00
微信：codinghelp

您当前位置：首页 >> Java编程Java编程

代写COMP0083、c++，java程序代写

日期：2023-01-09 08:55

COMP0083: Advanced Topics in Machine Learning 2022/2023

Introduction to Convex Optimization

Coursework

Due Date: 9 January 2022

Instructor: Massimiliano Pontil

TA: Isak Texas Falk

Coursework based on the notes by Saverio Salzo

Instructions There are 4 sections. Two assess the knowledge of the theory and two require im-

plementing algorithms. Provide sufficient explanations for all solutions. For the coding parts the

following libraries are required: numpy, sklearn, matplotlib.

Deadlines Submit your report by January 9 2022.

Policies Late submissions: past the online due date, late submissions will be penalized by 20% of

the original total score per late day (e.g., 40% for 2 days). Excused extensions will be given only for

significant issues and if requested well in advance the due date.

1 Questions with multiple answers [20%]

Question 1 [4%] Which one of these is a convex function?

a) max{ax+ b, x4 5, ex2}

b) min{ax+ b, x4 5, ex2}

c) ax+ b+ x4 5 ex2 .

Question 2 [4%] Consider the function

f(x) =

{x if x ∈] 1, 0]

x2 if x ≥ 0.

Indicate which one of the two graphs below represents the subdifferential of f .

! i

fixs

y 8

of I

d2 -2- -

i"04A 1:

)

r bJ

Figure 1: Question 2

Question 3 [4%] The gradient of f(x) = ?Ax, x? + ?x, b? + c, where A is a square matrix, not

necessarily symmetric is

a) A?x+Ax+ b

b) 2Ax+ b

c) A? + b, where A? denotes the transpose of A.

Question 4 [4%] Let g ∈ Γ0(R). The Fenchel conjugate of f(x) = g(2x) is

a) f?(u) = g?(u/2)

b) f?(u) = g?(2u)

c) f?(u) = g?(u).

Question 5 [4%] Referring to the ridge regression problem

min

w∈H

n∑

i=1

(yi ? ?w,Λ(xi)?)2 + λ

∥w∥2 ,

and denoting by K the Gram matrix of the data, indicate the solution of the dual problem

a) uˉ = (K+ λnId)y

b) uˉ = (K2 + λnId)?1y

c) uˉ = (K+ λnId)?1y.

2 Theory on convex analysis and optimization [30%]

Problem 1. [4%] Compute (showing a complete derivation) the Fenchel conjugates of the following

functions.

1. f : R→ ]?∞,+∞], with f(x) =

{

+∞ if x ≤ 0

log x if x > 0.

2. f(x) = x2.

3. ι[0,1] (the indicator function of the interval [0, 1]).

Problem 2. [7%] Let f : X → ]?∞,+∞] be a proper convex function.

1. Prove by induction the Jensen’s inequality, that is

n∑

i=1

( n∑

i=1

λixi

)

≤

n∑

i=1

λif(xi),

for all x1, . . . , xn ∈ X and for all λ1, . . . , λn ∈ R+ with

∑n

i=1 λi = 1.

2. Using the characterization for differentiable functions prove that the function log is convex

3. Applying Jensen’s inequality to log, prove the following arithmetic-geometric inequality

√

x1 · · ·xn ≤ 1

n∑

i=1

xi,

for all x1, . . . , xn ∈ R+.

Problem 3. [4%] Given a polytope C = co(a1, . . . , am) in X. Prove that the maximum of a convex

function f on C is attained at one of the vertices a1, . . . , am.

Problem 4. [2%] Prove that the function f(x, y) = ∥x? 2y∥2 is (jointly) convex

Problem 5. [4%] Provide minimal sufficient conditions for the existence and uniqueness of mini-

mizers for a convex function f : X → ]?∞,+∞].

Problem 6. [9%] Consider the optimization problem.

min

∥Ax?b∥∞≤ε

∥x∥2 ,

where ε > 0, A ∈ Rn×d and b ∈ Rn. Solve the following points.

1. Compute the dual problem (using the Fenchel-Rockafellar duality theory). [Hint: put the

problem in the form f(x) + g(Ax)]

2. Does strong duality hold? Justify the answer. [Hint: use the qualification condition]

3. Write the KKT conditions.

4. Derive a rate of convergence on the primal iterates from the application of FISTA on the dual

problem. [Hint: recall the it is possible to bound the square of the distance to the primal

solution by the dual objective values]

3 Solving the Lasso problem [25%]

Consider the following problem

where (xi, yi)1≤i≤n is the training set, xi ∈ Rd and yi ∈ {?1, 1} and Λ: Rd → H is the feature map

corresponding to the Gaussian kernel, that is,

K(x, x′) = ?Λ(x),Λ(x′)? = exp

(

? ∥x? x

′∥2

2σ2

)

. (4)

The dual problem is

min

u∈Rn

?Ku, u? ? ?y, u?+

n∑

i=1

ι[0, 1

λn

](yiui), (5)

where Ki,j = K(xi, xj) and ι[0, 1

λn

] is the indicator function of the interval [0,

λn ]. The problem can

be equivalently restated as

min

α∈Rn

?Kyα, α? ? ?1n, α?+

n∑

i=1

ι[0, 1

λn

](αi), (6)

where (Ky)i,j = yiKi,jyj . Note that the primal solution can be recovered via w =

∑n

i=1 yiαiΛ(xi) and

hence the decision function is

hw(x) = sign(?w,Λ(x)?) = sign

( n∑

i=1

yiαiK(xi, x)

)

. (7)

Figure 2: A realization of the two-moons dataset.

Generate the data (2D) according to the following python code:

from sklearn.datasets import make_moons

X, y = make_moons(n_samples=200, noise=0.05, random_state=0)

y = 2*y - 1

Choose appropriate values of λ and σ and address the following points.

1. Implement FISTA for solving the dual problem and plot the dual objective function.

2. Implement the randomized coordinate projected gradient algorithm on the dual problem (6)

and plot the dual objective function.

3. For each algorithm above, using a contour plot command, plot the decision boundary as well

as the two classes.

4. compare the performance of the two approaches in terms of speed and accuracy.

【返回顶部】【打印本稿】【关闭本页】

【上一篇】：代写BISY90009、java,c++代写

【下一篇】：代写BISY90009、java,c++代写

联系方式

最新辅导

热门辅导

您当前位置：首页 >> Java编程Java编程

代写COMP0083、c++，java程序代写

日期：2023-01-09 08:55

相关文章