
Department of Informatics, King's College London
Pattern Recognition (6CCS3PRE/7CCSMPNN)

Assignment: Support Vector Machines (SVMs) and Ensemble Methods

This coursework is assessed. A type-written report must be submitted online through KEATS by the deadline specified on the module's KEATS webpage. The questions before Q8 consider a classification problem with 3 classes: a multi-class SVM-based classifier formed by multiple SVMs is designed to deal with this problem. The questions from Q8 onwards (Q8 included) use your own created dataset to investigate classification performance using the techniques of Bagging and Boosting: some simple "weak" classifiers are designed and combined to achieve an improved classification performance for a two-class classification problem.

Q1. Write down your 7-digit student ID denoted as s1s2s3s4s5s6s7. (5 Marks)

Q2. Find R1, which is the remainder of […]. Table 1 shows the multi-class method to be used corresponding to the value of R1 obtained. (5 Marks)

R1   Method
0    One against one
1    One against all
2    Binary decision tree
3    Binary coded

Table 1: R1 and its corresponding multi-class method.
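Since Table 1 lists R1 ∈ {0, 1, 2, 3}, the divisor is evidently 4. As a minimal Matlab sketch of the Q2 computation (the dividend k below is only a placeholder, because the exact expression is missing from this copy of the brief):

    % k is a placeholder for the integer specified in the original brief
    % (derived from the student ID digits); replace it with your own value.
    k  = 1234567;
    R1 = mod(k, 4)   % remainder after division by 4, so R1 is in {0, 1, 2, 3}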

Q3. Create a linearly separable two-dimensional dataset of your own, which consists of 3 classes. List the dataset in the format shown in Table 2. Each class should contain at least 10 samples, and all three classes must have the same number of samples. Note: this is your own created dataset, so the chance of the same dataset appearing in another submission is slim. Do not share your dataset with others, to avoid any plagiarism/collusion issues. (10 Marks)

Table 2: Samples of three classes.
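A minimal Matlab sketch of one way to generate such a dataset; the cluster centres, spread and seed below are illustrative assumptions, not required values:

    rng(1);                                % fix the seed for reproducibility
    n  = 10;                               % samples per class (the minimum allowed)
    c1 = [0 0] + 0.5*randn(n, 2);          % class 1 clustered around (0, 0)
    c2 = [6 0] + 0.5*randn(n, 2);          % class 2 clustered around (6, 0)
    c3 = [3 6] + 0.5*randn(n, 2);          % class 3 clustered around (3, 6)
    X  = [c1; c2; c3];                     % 30 x 2 sample matrix for Table 2
    y  = [1*ones(n,1); 2*ones(n,1); 3*ones(n,1)];   % class labels

Keeping the cluster centres far apart relative to the 0.5 standard deviation makes the linear separability easy to argue in Q4.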

Q4. Plot the dataset in Q3 to show that the samples are linearly separable, and explain why your dataset is linearly separable. Hint: the Matlab built-in function plot can be used; show some example hyperplanes that linearly separate the classes and identify which hyperplane separates which classes. (20 Marks)
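A sketch of the suggested plot, assuming the c1, c2, c3 matrices from the Q3 sketch above; the two dashed lines are example hyperplanes for that made-up data, not a unique answer:

    figure; hold on;
    plot(c1(:,1), c1(:,2), 'r.');          % class 1
    plot(c2(:,1), c2(:,2), 'b.');          % class 2
    plot(c3(:,1), c3(:,2), 'g.');          % class 3
    plot([3 3], [-2 4], 'k--');            % x1 = 3 separates class 1 from class 2
    plot([-2 8], [3 3], 'm--');            % x2 = 3 separates class 3 from classes 1 and 2
    xlabel('x_1'); ylabel('x_2');
    legend('class 1', 'class 2', 'class 3', 'x_1 = 3', 'x_2 = 3');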

Q5. According to the method obtained in Q2, draw a block diagram at SVM level to show the structure of the multi-class classifier constructed from linear SVMs. Explain the design (e.g., number of inputs, number of outputs, number of SVMs used, class label assignment, etc.) and describe how this multi-class classifier works.

Remark: a block diagram is a diagram used to show, say, a concept or a structure. Here, the diagram should show the structure of the multi-class SVM classifier, i.e., how the binary SVM classifiers are put together to work as a multi-class SVM classifier. For example, Q5 of tutorial 9 is an example of a block diagram at SVM level; a neural network diagram shows a network's structure at neuron level; and the block diagrams in lecture 9 show the architecture of ensemble classifiers. (20 Marks)

Q6. According to your dataset in Q3 and the design of your multi-class classifier in Q5, identify the support vectors of the linear SVMs by "inspection" and design their hyperplanes by hand. Show the calculations and explain the details of your design. (20 Marks)
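As an illustration of the hand calculation (with made-up support vectors, not a prescribed answer): suppose one binary SVM has support vector x⁺ = (1, 0)ᵀ of class +1 and support vector x⁻ = (3, 0)ᵀ of class −1. The margin conditions from the lectures are

\[
w^{\top}x^{+} + b = +1, \qquad w^{\top}x^{-} + b = -1.
\]

Taking w = (w₁, 0)ᵀ (the two support vectors differ only in x₁) gives w₁ + b = 1 and 3w₁ + b = −1, hence w₁ = −1 and b = 2. The hyperplane −x₁ + 2 = 0, i.e. the vertical line x₁ = 2, lies midway between the two support vectors, as expected for a maximum-margin separator.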

Q7. Produce a test dataset by averaging the samples in each row of Table 2, i.e., (sample of class 1 + sample of class 2 + sample of class 3)/3. Summarise the results in the form of Table 3, where N is the number of SVMs in your design and "Classification" is the class determined by your multi-class classifier. Explain how to get the "Classification" column using one test sample, and show the calculations for one or two samples to demonstrate how the contents of the table are obtained. (20 Marks)

Table 3: Summary of classification accuracy.
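A minimal sketch of the mechanics, assuming the c1, c2, c3 matrices from the Q3 sketch and a hypothetical parameter matrix W holding one hand-designed [w1 w2 b] row per SVM; the final decision step depends on the method assigned in Q2 (the argmax shown is only the one-against-all rule):

    Xtest   = (c1 + c2 + c3) / 3;             % row-wise average of the three classes
    W = [-1  0  2;                            % hypothetical [w1 w2 b] for SVM 1
          1  0 -4;                            % hypothetical [w1 w2 b] for SVM 2
          0  1 -3];                           % hypothetical [w1 w2 b] for SVM 3
    scores  = Xtest * W(:,1:2).' + W(:,3).';  % w'*x + b for each sample and SVM
    outputs = sign(scores);                   % the +1/-1 column entries for Table 3
    [~, labels] = max(scores, [], 2);         % one-against-all: largest score wins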

Marking: The learning outcomes of this assignment are that the student understands the fundamental principle and theory of the support vector machine (SVM) classifier; is able to design a multi-class SVM classifier for a linearly separable dataset; and knows how to determine the classification of test samples with the designed classifier. The assessment will look into the knowledge and understanding of the topic. When answering the questions, show/explain/describe clearly the steps/design/concepts with reference to the equations/theory/algorithms (stated in the lecture slides). When making comments (if necessary), provide statements supported by the results obtained.

Purposes of Assignment: This assignment covers the overall classification workflow, from samples to design to classification. It helps you clarify the concept, working principle, theory, classification of samples, design procedure and multi-class classification techniques for SVMs.

Q8. Create a non-linearly separable dataset consisting of at least 20 two-dimensional data points. Each data point is characterised by two coordinates x1 ∈ [−10, 10] and x2 ∈ [−10, 10] and is associated with a class y ∈ {−1, +1}. List the data in a table in the format shown in Table 1, where the first column is for the data points of class "−1" and the second column is for the data points of class "+1". (20 Marks)
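A sketch of one possible construction, assuming a circular class boundary (class −1 inside a disc, class +1 on an outer ring), which no single straight line can separate:

    rng(2);
    n  = 10;                                      % points per class, 20 in total
    t1 = 2*pi*rand(n,1);  r1 = 2*rand(n,1);       % class -1: radius below 2
    t2 = 2*pi*rand(n,1);  r2 = 5 + 3*rand(n,1);   % class +1: radius in [5, 8]
    Xneg = [r1.*cos(t1), r1.*sin(t1)];            % inner disc, label y = -1
    Xpos = [r2.*cos(t2), r2.*sin(t2)];            % outer ring, label y = +1
    % all coordinates stay inside [-10, 10] by construction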

Q9. Plot the dataset (x axis is x1 and y axis is x2) and show that the dataset is non-linearly separable. Represent class "−1" and class "+1" using "×" and "○", respectively. Explain why your dataset is non-linearly separable. Hint: the Matlab built-in function plot can be used. (20 Marks)
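A sketch of the required plot, assuming the Xneg and Xpos matrices from the Q8 sketch above:

    figure; hold on;
    plot(Xneg(:,1), Xneg(:,2), 'kx');      % class -1 drawn with crosses
    plot(Xpos(:,1), Xpos(:,2), 'ko');      % class +1 drawn with circles
    xlabel('x_1'); ylabel('x_2');
    axis([-10 10 -10 10]);
    legend('class -1', 'class +1');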

Q10. Design Bagging classifiers consisting of 3, 4 and 5 weak classifiers using the steps shown in Appendix 1. A linear classifier should be used as the weak classifier. Explain and show the design of the hyperplanes of the weak classifiers, and list the parameters of the designed hyperplanes.

After designing the weak classifiers, apply the designed weak classifiers and the bagging classifier to all the samples in Table 1. Present the classification results in a table as shown in Table 2. The columns "Weak classifier 1" to "Weak classifier n" list the output class ({−1, +1}) of the corresponding weak classifiers. The column "Overall classifier" lists the output class ({−1, +1}) of the bagging classifier. The last row lists the classification accuracy in percentage for all classifiers, i.e., accuracy = (number of correctly classified samples / total number of samples) × 100%.

Explain how to determine the class (for each weak classifier and the overall classifier) using one test sample. You will have 3 tables (for 3, 4 and 5 weak classifiers) for this question. Comment on the results in terms of classification performance when different numbers of weak classifiers are used. (30 Marks)

Table 2: Classification results using the Bagging technique combining n weak classifiers. The first row "Data" lists the samples (both classes 1 and 2) from Table 1.
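A sketch of the combination and accuracy steps only, assuming `votes` is an n-by-M matrix holding the ±1 outputs of the M hand-designed weak classifiers on the n samples of Table 1 and `y` the true labels; designing the weak classifiers themselves is the by-hand part of the question:

    overall  = sign(sum(votes, 2));        % majority vote across the M weak classifiers
    % with an even M the sum can be 0; a tie-breaking rule must then be stated
    accuracy = 100 * mean(overall == y);   % last row of Table 2, in percent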

Q11. Design a Boosting classifier consisting of 3 weak classifiers using the steps shown in Appendix 2. A linear classifier should be used as the weak classifier. Explain and show the design of the hyperplanes of the weak classifiers, and list the parameters of the designed hyperplanes. After designing the weak classifiers, apply the designed weak classifiers and the boosting classifier to all the samples in Table 1. Present the classification results in a table as shown in Table 2. Explain how to determine the class (for each weak classifier and the boosting classifier) using one test sample. Comment on the results of the overall classifier in terms of classification performance when comparing with the 1st, 2nd and 3rd weak classifiers, and with the bagging classifier with 3 weak classifiers in Q10. (30 Marks)

Appendix 1: Bagging (details can be found in Section "Bagging" in the lecture notes)

Step 1: Start with dataset D.

Step 2: Generate M datasets D1, D2, ..., DM. Each dataset is created by drawing n′ < n samples from D with replacement, so some samples can appear more than once while others do not appear at all.

Step 3: Learn a weak classifier fi(x) for each dataset Di, i = 1, 2, ..., M.

Step 4: Combine all weak classifiers using a majority voting scheme.
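A sketch of step 2, assuming the n samples sit in the rows of a matrix D and taking n′ = 0.7n as an illustrative choice; randi draws indices with replacement, so repeats and omissions both occur:

    nPrime = round(0.7 * n);               % n' < n (the 0.7 fraction is an assumption)
    Dsets  = cell(M, 1);
    for m = 1:M
        idx      = randi(n, nPrime, 1);    % n' indices drawn from 1..n with replacement
        Dsets{m} = D(idx, :);              % bootstrap dataset D_m
        % ... train weak classifier f_m on Dsets{m} ...
    end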

Appendix 2: Boosting (details can be found in Section "Boosting" in the lecture notes)

Dataset D with n patterns.

Training procedure:

Step 1: Randomly select a set of n1 ≤ n patterns (without replacement) from D to create dataset D1. Train a weak classifier C1 using D1 (C1 should have at least 50% classification accuracy).

Step 2: Create an "informative" dataset D2 (n2 ≤ n) from D, of which roughly half of the patterns are correctly classified by C1 and the rest are wrongly classified. Train a weak classifier C2 using D2.

Step 3: Create an "informative" dataset D3 from D of which the patterns are not well classified by C1 and C2 (C1 and C2 disagree). Train a weak classifier C3 using D3.

The final decision of classification is based on the votes of the weak classifiers, e.g., by the first two weak classifiers if they agree, and by the third weak classifier if the first two disagree.
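A sketch of this voting rule, where h1, h2, h3 stand for the ±1 outputs of C1, C2 and C3 on one test sample:

    if h1 == h2
        decision = h1;     % the first two weak classifiers agree: take their vote
    else
        decision = h3;     % they disagree: the third weak classifier decides
    end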

Marking: The learning outcomes of this assignment are that the student understands the fundamental principles and concepts of ensemble methods (Bagging and Boosting); is able to design weak classifiers; knows how to form a Bagging/Boosting classifier; and knows how to determine the classification of test samples with the designed Bagging/Boosting classifiers. The assessment will look into the knowledge and understanding of the topic. When answering the questions, show/explain/describe clearly the steps/design/concepts with reference to the equations/theory/algorithms (stated in the lecture slides). When making comments, provide statements supported by the results obtained.

