联系方式

  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-21:00
  • 微信:codinghelp

您当前位置:首页 >> Python编程Python编程

日期:2019-10-22 11:15

Pennsylvania State University

College of Information Sciences and Technology

DS 310 Midterm Exam

Instructions

1. There are 4 problems each of which is worth 25 points.

2. Please consult the instructor if you have difficulty understanding any of the problems.

1. (25 pts.) Consider a M-class pattern classification problem in which each pattern

X to be classified belongs to exactly one of M mutually exclusive classes ω1 ··· ωM.

Suppose that X is represented using a vector of N binary features X = [x1, x2 ··· xN ].

Let pji = P(xi = 1|ωj ). Assume that the features are independent given the class

label. Let P(ω1)··· P(ωM) be the prior probabilities of each class.

(a) Describe in words, the meaning of pji.

(b) Show that a classifier that assigns X to class ωk gk(X) ≥ gj (X)∀j ̸= k. where

gi(X) can be written in the form !N

i=1 wjixi + wj0 is a minimum error (Bayes

optimal) classifier. Express wji, in terms of pji and P(ωj ) where (1 ≤ j ≤ M;

0 ≤ i ≤ N).

(c) Generalize the above solution for (b) above to obtain minimum loss classification

where λij is the loss incurred in assigning a pattern to class ωi when it in fact

belongs to class ωj .

2. (25 pts.) Suppose you have been hired by an AI consulting firm. Indicate which

algorithm you would choose in each of the following data-driven knowledge acquisition

scenarios. In each case, briefly justify your recommendation.

(a) Your client, has a database of patient records containing symptoms and expert

diagnosis. She would like to build a diagnosis system. The attributes can be

numeric (e.g., patient’s temperature), as well as categorical (e.g., whether the

patient is pregnant). In addition to performing accurate diagnosis of patients, your

client would like to use the system to obtain insight regarding the relationships

between attributes for different diseases.

(b) Your client is designing an email organizer to be integrated into an email program.

The SPAM detector is to be trained using information that is obtained

by observing the user’s actions on each email (read and respond, mark as junk,

ignore).

(c) Your client is a large company which has managed to gather a large database of

information about past applicants to jobs in various categories (e.g., software engineer,

data scientist, etc.) and the hiring decisions on each applicant. Experts in

the organization are convinced that they can automate the process of shortlisting

applicants for further consideration. You are told that it is important that the

1

Calculateprobability ofXi 18咒j

pcxniilwikRF.hu

Bayes

decision

he

decision-making process be transparent – that is, it should be easy to understand

why an applicant was shortlisted (or not).

(d) Your client is a large hospital that is interested in reducing the workload on

radiologists working on breast cancer diagnosis from radiological images. The

hospital has a large database of radiological images labeled with expert diagnosis.

The goal here is to automate diagnosis on the cases where the AI system can

produce high confidence results, and identify a subset of cases that the AI system

is not so confident about for further examination by expert radiologists.

3. (25 pts.) Recall that the perceptron learning algorithm that was described in class

is an additive weight update algorithm - that is, we add or subtract a fraction of

the misclassified sample to the weight vector at each iteration. Consider instead,

a multiplicative weight update algorithm for an n-input neuron defined as follows:

Consider a neuron defined by two weight vectors w+ and w−. Suppose both weight

vectors are initialized with a value 1 for each of their components. Consider a training

example (xp, dp) where xp ∈ {−1, 1}n is an input pattern and dp ∈ {−1, 1} is its

class label. Let yp, the output of the classifier be 1 if w+ · xp > w− · xp and yp =

−1 otherwise. Suppose the weights are updated as follows: w+

i ← w+

i β−(dp−yp)xip

and w−

i ← w−

i β(dp−yp)xip where 0 < β < 1 is a learning rate. Comment on the

potential advantages of such a multiplicative weight update algorithm over its additive

counterpart. Prove that this algorithm is guaranteed to converge to a pair of weight

vectors (w+⋆ , w−⋆ ) that correctly classify the training data whenever such weight vectors

exist. Hint: Show that the multiplicative weight update algorithm as an instance of

the standard perceptron algorithm operating on a suitably transformed weight space.

4. (a) (12.5 pts. )

Suppose you have designed (or trained) a n-input perceptron with the weight

vector W and threshold θ to correctly classify a set S of n-dimensional patterns.

Further assume that all the weight values of the perceptron so designed have

been hard-wired but the threshold θ can be set under user control. When the

perceptron is later used on a factory floor, suppose you find that the source of

the patterns (say the camera or the digitizer) adds a fixed amount of noise (given

by the n-dimensional noise vector V) to each pattern. Assuming that you can

somehow measure (or can calculate) V, how would you change θ so that the

perceptron with the same weight vector W continues to correctly classify all the

patterns in S?

(b) (12.5 pts) Briefly explain the significance of the following properties of the error

functions used to derive gradient-descent based learning algorithms of the type

considered in part (a).

i. Continuity and Differentiablilty (with respect to the weights or other parameters

of interest)

ii. Convexity with respect to weights or other parameters of interest



版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。 站长地图

python代写
微信客服:codinghelp