
Homework 7 for Math 173A - Fall 2022

1. Indicate whether the following functions are strongly convex. Justify your answer.

(a) f(x) = x

(b) f(x) = x²

(c) f(x) = log(1 + eˣ)
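For reference, the standard one-dimensional second-order test is the natural tool here: a twice-differentiable f : R → R is strongly convex with constant c if and only if f''(x) ≥ c > 0 for all x. A worked instance on a function not in the list above:

```latex
% Second-order test applied to f(x) = e^x (not one of the functions above):
% f''(x) = e^x tends to 0 as x -> -infinity, so no constant c > 0
% satisfies f''(x) >= c everywhere.
\[
  f(x) = e^{x}, \qquad f''(x) = e^{x} \longrightarrow 0 \quad (x \to -\infty),
\]
\[
  \text{hence } \inf_{x} f''(x) = 0,\ \text{so } f \text{ is convex but not strongly convex.}
\]
```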

2. Let f : Rⁿ → R be a differentiable function. Prove that the following two conditions are equivalent (i.e. each one implies the other):

(a) f(y) ≥ f(x) + ∇f(x)ᵀ(y − x) + (c/2)‖x − y‖² for all x, y ∈ Rⁿ.

(b) The function g(x) := f(x) − (c/2)‖x‖² is convex.

Hint: Recall that g(x) is convex if and only if g(y) ≥ g(x) + ∇g(x)ᵀ(y − x) for all x, y ∈ Rⁿ.
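Two small computations reduce the hint to an algebra check (a sketch; both identities follow directly from the definitions above):

```latex
% Gradient of g(x) = f(x) - (c/2)||x||^2:
\[
  \nabla g(x) = \nabla f(x) - c\,x.
\]
% Expansion used to match (a) and (b) after substituting into the hint:
\[
  \|x - y\|^{2} = \|x\|^{2} - 2\,x^{T} y + \|y\|^{2}.
\]
```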

3. Here we will prove the inequality used in class to prove fast convergence for strongly convex functions.

Let F(x) be a strongly convex function with constant c. Our goal is to show

F(x) − F(x*) ≤ (1/(2c))‖∇F(x)‖² for all x ∈ Rᵈ. (1)

(a) Fix x ∈ Rᵈ and define the quadratic function

q(y) = F(x) + ∇F(x)ᵀ(y − x) + (c/2)‖x − y‖².

Find the y* that minimizes q(y).

(b) Show that q(y*) = F(x) − (1/(2c))‖∇F(x)‖².

(c) Use the above to deduce (1).

(d) Explain the proof technique in your own words to demonstrate understanding of what we did.
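For parts (a) and (b), one possible route is to set the gradient of q to zero and substitute back (a sketch, using only the definition of q above):

```latex
% (a) q is quadratic in y, so its minimizer solves grad q(y) = 0:
\[
  \nabla q(y) = \nabla F(x) + c\,(y - x) = 0
  \quad\Longrightarrow\quad
  y^{*} = x - \tfrac{1}{c}\,\nabla F(x).
\]
% (b) Substituting y* back into q(y):
\[
  q(y^{*}) = F(x) - \tfrac{1}{c}\,\|\nabla F(x)\|^{2}
           + \tfrac{c}{2}\cdot\tfrac{1}{c^{2}}\,\|\nabla F(x)\|^{2}
           = F(x) - \tfrac{1}{2c}\,\|\nabla F(x)\|^{2}.
\]
```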

4. Let A be a diagonal matrix with diagonal entries Aᵢᵢ = i (i.e. the entries run from 1 to n), and let f(x) = (1/2)xᵀAx + bᵀx + c. We plan to run gradient descent on this function, and want to understand in theory how fast this will converge.

(a) Show which assumptions are satisfied by f (i.e. bounded Hessian, strong convexity).

(b) What should we pick as a step size?

(c) What can we say about F(x⁽ᵗ⁾) − F(x*) and its rate of decay?
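As a quick numerical check of whatever theory answers you arrive at, here is a minimal gradient-descent sketch in numpy; the values n = 10, b as a vector of ones, the zero start, and the step size μ = 1/n (one over the largest diagonal entry of A) are illustrative assumptions, not part of the problem:

```python
import numpy as np

n = 10
A = np.diag(np.arange(1.0, n + 1.0))    # diagonal entries A_ii = i
b = np.ones(n)                          # assumed b, for illustration
# The constant term c does not affect the gradient, so it is omitted.

def grad_f(x):
    # gradient of f(x) = (1/2) x^T A x + b^T x + c
    return A @ x + b

x = np.zeros(n)
x_star = -np.linalg.solve(A, b)         # exact minimizer: A x* = -b
mu = 1.0 / n                            # assumed step: 1/L with L = n

for t in range(200):
    x = x - mu * grad_f(x)

print("distance to minimizer:", np.linalg.norm(x - x_star))
```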

5. We will implement the SVM algorithm with gradient descent to classify two Gaussians in 2D.

(a) Sample 100 points from one Gaussian with mean (−3, −3) and one Gaussian with mean (3, 3). Both Gaussians have identity covariance. You can do this by calling randn (Matlab) or np.random.randn (Python) to generate each point cloud, and then add or subtract 3 from each coordinate. Create a vector Y of the labels that is 1 on one class and −1 on the other class. Create and turn in a scatter plot of the data points colored by label to ensure you have correctly created your data.
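A minimal sketch of part (a) in Python (the random seed and the names X, Y are my own choices):

```python
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(0)                  # assumed seed, for reproducibility only

# 100 points per class; identity covariance, means (-3,-3) and (3,3)
X_neg = np.random.randn(100, 2) - 3
X_pos = np.random.randn(100, 2) + 3
X = np.vstack([X_neg, X_pos])      # shape (200, 2)
Y = np.concatenate([-np.ones(100), np.ones(100)])

plt.scatter(X[:, 0], X[:, 1], c=Y)
plt.title("Data colored by label")
plt.show()
```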

(b) Create a function for the gradient of the loss

L(w) = (1/2)‖w‖² + ∑ᵢ₌₁ⁿ max(0, 1 − yᵢ⟨xᵢ, w⟩),

∇L(w) = w + ∑ᵢ₌₁ⁿ (−yᵢxᵢ) · 𝟙[1 − yᵢ⟨xᵢ, w⟩ > 0],

where 𝟙[1 − yᵢ⟨xᵢ, w⟩ > 0] is 1 if 1 − yᵢ⟨xᵢ, w⟩ > 0 and 0 otherwise. Also, here n = 200. To compute the gradient, you'll have to compute an indicator of whether 1 − yᵢ⟨xᵢ, w⟩ is positive or negative at every point, and sum up the contribution of this term for all points where it's positive.
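In numpy the indicator-and-sum can be vectorized in a few lines; a sketch, assuming the X (200×2) and Y conventions from the sketch in part (a), with a function name of my choosing:

```python
def svm_loss_grad(w, X, Y):
    """Gradient of L(w) = (1/2)||w||^2 + sum_i max(0, 1 - y_i <x_i, w>)."""
    margins = 1 - Y * (X @ w)      # 1 - y_i <x_i, w> for every point
    active = margins > 0           # indicator: hinge term is "on"
    # sum of -y_i x_i over the active points, plus the w from the norm term
    return w - X.T @ (Y * active)
```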

(c) Setting the step size μ = 10⁻⁴ and starting at w⁽⁰⁾ = (1, 1), run 1000 iterations of gradient descent. You will create two plots.

i. Plot the classification error (averaged over all the points) as a function of the iteration number. The classification of xᵢ is determined by sign(⟨xᵢ, w⟩).

ii. Plot the margin 2/‖w‖ as a function of the iteration number. This shows how much of a gap you have between the classes you've learned.
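A sketch of the descent loop and the two plots, reusing svm_loss_grad from the previous sketch:

```python
mu = 1e-4
w = np.array([1.0, 1.0])                   # w^(0) = (1, 1)
errors, margins = [], []

for t in range(1000):
    w = w - mu * svm_loss_grad(w, X, Y)
    pred = np.sign(X @ w)                  # classification of each x_i
    errors.append(np.mean(pred != Y))      # average classification error
    margins.append(2 / np.linalg.norm(w))  # SVM margin 2/||w||

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(errors); ax1.set_xlabel("iteration"); ax1.set_ylabel("classification error")
ax2.plot(margins); ax2.set_xlabel("iteration"); ax2.set_ylabel("margin 2/||w||")
plt.show()
```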

(d) Create another scatter plot of your data, but this time color the points by the function f(xᵢ) = 1 − yᵢ⟨xᵢ, w⟩. The values closest to 0 (the smallest positive numbers or the largest negative numbers) will show you which points were "most important" in determining the classification.
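Continuing the same sketch, matplotlib's c= argument does the coloring; the colormap choice is mine:

```python
vals = 1 - Y * (X @ w)             # f(x_i) = 1 - y_i <x_i, w>
plt.scatter(X[:, 0], X[:, 1], c=vals, cmap="coolwarm")
plt.colorbar(label="1 - y_i <x_i, w>")
plt.title("Points colored by hinge margin value")
plt.show()
```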

Note: Here we’re only defining a subspace classifier (i.e. the classifier goes through

the origin). This is fine for our problem as the gaussians are on opposite sides of

the origin. If you want to create an intercept term, simply append a vector of all

1’s as a column of your data, and now your weight vector will be of dimension 3

instead of 2. This is done the same way as when running least squares regression.
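If you do want the intercept, the augmentation described in the note is one line in numpy (the starting value for the third coordinate is my assumption):

```python
# Append a column of 1's; the weight vector gains an intercept coordinate.
X_aug = np.hstack([X, np.ones((X.shape[0], 1))])
w0 = np.array([1.0, 1.0, 0.0])     # assumed start: w^(0) = (1, 1) plus zero intercept
```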

