
COMP61021: Modelling and Visualisation of High Dimensional Data

Lab 3: Self-Organizing Map Implementation and Application (Assessed)

This coursework (a zipped file) must be submitted via Blackboard. The deadline for this lab exercise is 23:30 on 26th November 2019. The late submission policy applies (see the teaching website and FAQs for details).

Kohonen’s Self-Organizing Map (SOM) is a biologically inspired unsupervised learning method that can be used to learn and visualise the topological nature of high-dimensional data. In this exercise, you are asked to use Matlab to implement and make use of SOMs to visualise 2D and 3D data sets. You will then build a SOM which is capable of finding similar images in a collection of images.

You can download the relevant Matlab code and the image data set from

http://syllabus.cs.manchester.ac.uk/pgt/COMP61021/Lab/lab2.zip

After unzipping it, you will see three zipped packages, lab2_Part1.zip, lab2_Part2.zip and Image.zip, which contain the Matlab code for Parts 1 and 2 as well as an image data set used for Parts 2 and 3, respectively. To use the functions contained in these packages, you need to unzip them to a location such as C:\COMP61021\som and then use the Matlab command ...

Now, on to the lab exercise.

PART 1 – SOM Implementation

The purpose of this exercise is to demonstrate that you understand the details of the SOM algorithm. For Part 1, you need to complete the unfinished one-dimensional SOM implementation in the file lab_som.m. [3 Marks] Comments in the file indicate what you need to add to complete the function. To test your implementation, you can use the following commands:

1. data=nicering;
2. som=lab_som(data, numNeurons, steps, learningRate, radius);
3. lab_vis(som, data);

You should design and document an experiment to find suitable hyper-parameters for the function lab_som. Your results should show that the inter-connected SOM units, represented by yellow points and red lines, accurately approximate the shape of the ring using a chain. [2 Marks]
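The core loop the stub asks for can be sketched language-neutrally. The following NumPy sketch is not a substitute for lab_som.m: the function name, the random initialisation and the exponential decay schedules for the learning rate and radius are all illustrative assumptions, not the ones prescribed by the lab code.

```python
import numpy as np

def som_1d(data, num_neurons, steps, lr0, radius0, seed=0):
    """Illustrative 1-D SOM (chain topology) on data of shape (n_samples, dim).

    Assumption: exponential schedules shrinking the radius from radius0 to 1
    and the learning rate from lr0; lab_som.m may specify different ones.
    """
    rng = np.random.default_rng(seed)
    weights = rng.standard_normal((num_neurons, data.shape[1])) * 0.1
    positions = np.arange(num_neurons)        # unit indices along the chain
    tau = steps / np.log(radius0)             # radius decays to 1 by the end
    for t in range(steps):
        lr = lr0 * np.exp(-t / steps)         # decaying learning rate
        radius = radius0 * np.exp(-t / tau)   # shrinking neighbourhood
        x = data[rng.integers(len(data))]     # pick a random training sample
        # best-matching unit: closest weight vector in data space
        bmu = int(np.argmin(np.linalg.norm(weights - x, axis=1)))
        # Gaussian neighbourhood measured along the chain, not in data space
        h = np.exp(-((positions - bmu) ** 2) / (2 * radius ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights
```

With ring-shaped data like the nicering set, the trained chain of weight vectors should end up lying on the ring, which is exactly what lab_vis lets you check visually.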

You should now complete the SOM implementation in lab_som2d.m, which has similar functionality to lab_som.m but makes use of a two-dimensional grid instead of a chain. [3 Marks] You should again investigate the best hyper-parameters for the SOM and visualise your results by using the following commands:

1. data=nicering;
2. [som,grid]=lab_som2d(data, width, height, steps, learningRate, radius);
3. lab_vis2d(som, grid, data);

This time your visualisation should show a mesh approximating the shape of the ring. [2 Marks]

All the hyper-parameters used in your experiments and the main results must appear explicitly in your report.
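Structurally, the only change from the chain version is that the neighbourhood distance is measured between unit coordinates on a two-dimensional grid. A sketch of that part (function names and the row-major unit numbering are illustrative assumptions, not taken from lab_som2d.m):

```python
import numpy as np

def grid_coords(width, height):
    """Map unit index -> (row, col) for a width x height grid, row-major."""
    idx = np.arange(width * height)
    return np.stack([idx // width, idx % width], axis=1)

def grid_neighbourhood(bmu, coords, radius):
    """Gaussian neighbourhood based on Euclidean distance on the grid."""
    d2 = np.sum((coords - coords[bmu]) ** 2, axis=1)
    return np.exp(-d2 / (2 * radius ** 2))
```

Inside the training loop, this grid-based neighbourhood replaces the chain-based one; the weight update itself is unchanged.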

PART 2 – Image clustering with SOM

The purpose of this part is to show you a practical application of SOM. In Part 2, you will use a sophisticated SOM Toolbox (see http://www.cis.hut.fi/projects/somtoolbox/documentation/ for details), which provides various methods for initialising, training and visualising SOMs. The toolbox is included in lab2_Part2.zip, so there is no need to download it. The aim is to train a SOM that can be used to find images that are similar to each other. You are provided with 172 images in Image.zip, which your SOM will be trained with. Images are very large pieces of data with many redundant components and are usually unsuitable to be fed directly into a machine learning algorithm. Therefore, a feature extraction/selection process is often necessary before applying a machine learning algorithm. To extract features that can be used for training, run the following command:

[imgs,training]=lab_featuresets('x:\My\Path\To\Images\', -1);

This will return imgs, the array of images used for training, and training, the training data itself. You should then run the command som_gui to launch the training interface. The GUI allows you to import the training data training from the workspace, train a SOM and visualise its structure. You can experiment with the various training parameters.

After you have trained your SOM, you should save it to the workspace with a name such as som using Load/Save->Save Map. After saving, you can visualise the U-matrix for the SOM using the command som_show(som, 'umat', 'all'). You should explain what the U-matrix represents in general and what it says about the SOM you have just trained. [1 Mark]
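As background for that explanation: a U-matrix records, for each unit, the average distance between its codebook vector and those of its grid neighbours, so large values mark boundaries between clusters. A minimal sketch of the idea (assuming a rectangular grid and a 4-neighbourhood; the toolbox's own computation is more refined and also interpolates between units):

```python
import numpy as np

def u_matrix(codebook, width, height):
    """Average codebook distance to 4-connected grid neighbours.

    Assumes codebook has shape (width * height, dim), row-major over the grid.
    """
    cb = codebook.reshape(height, width, -1)
    u = np.zeros((height, width))
    for r in range(height):
        for c in range(width):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < height and 0 <= cc < width:
                    dists.append(np.linalg.norm(cb[r, c] - cb[rr, cc]))
            u[r, c] = np.mean(dists)  # high value = unit sits on a boundary
    return u
```

Light regions of the plotted U-matrix therefore correspond to clusters of similar codebook vectors, and dark ridges to the gaps between them (or vice versa, depending on the colour map).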

You can then run the command lab_showsimilar(imgs,training,som.codebook,1), which will allow you to click through the various sets of images that the SOM considers similar. It is expected that your results will not be perfect and there will be quite a lot of invalid matches. However, your results should produce better matches than running lab_showsimilar on an untrained SOM. You should experiment with the parameters in som_gui and report on what you found to work best in your own experiments. [1 Mark]
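Conceptually, what such a similarity browser does is assign each image's feature vector to its nearest codebook vector and group images that land on the same unit. A sketch of that lookup (illustrative only; lab_showsimilar's actual internals may differ):

```python
import numpy as np

def similar_groups(features, codebook):
    """Group images by their best-matching codebook vector.

    features: (n_images, dim) feature vectors; codebook: (n_units, dim).
    Returns {unit index: list of image indices mapped to that unit}.
    """
    # distance of every feature vector to every codebook vector
    d = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    bmu = np.argmin(d, axis=1)  # best-matching unit per image
    return {u: np.flatnonzero(bmu == u).tolist() for u in range(len(codebook))}
```

Images falling in the same group, or on neighbouring units of a well-trained map, are the ones the SOM deems similar; on an untrained map the grouping is essentially random, which is why the comparison above is meaningful.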

All the hyper-parameters used in these experiments and the main results, e.g., the U-matrix and the clusters you observed in it, must be described in your report. You are advised to highlight the clustering results by annotating the U-matrix directly if possible.

PART 3 – Bonus marks

Additional marks are available for students who look further into feature extraction issues. In lab2_Part2, the file lab_features.m contains a function that extracts various types of features from an image. This function is called automatically for each image in the image directory when the function lab_featuresets is run. You should read the comments in the code and attempt to implement some of the missing features. You should then test whether or not your new features achieve better results. Alternatively, you may find a different method that is more effective than those proposed in lab_features.m. Full marks can be awarded only if you can justify your success with sufficient evidence and your result is significantly better than that obtained with the default code given in Part 2.
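As an example of the kind of hand-crafted feature one might consider (purely illustrative; it is not claimed to be among the features proposed in lab_features.m), a normalised grey-level histogram is compact, position-invariant, and comparable across images of different sizes:

```python
import numpy as np

def grey_histogram(image, bins=16):
    """Normalised grey-level histogram of a 2-D image with values in [0, 255].

    Illustrative feature only: bin count and value range are assumptions.
    """
    hist, _ = np.histogram(image, bins=bins, range=(0, 255))
    # normalise so that images of different sizes are directly comparable
    return hist / hist.sum()
```

Whatever feature you add, the test that matters here is empirical: retrain the SOM with and without it and compare the quality of the similar-image groupings.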

DELIVERABLES

A zipped file, named “yourname-lab2.zip”, including a report in PDF format (two single-sided A4 pages in an 11pt font, with one additional page allowed only for Part 3 if you attempt it) and all relevant source code for Part 1, along with a readme.txt file in plain-text format. The zipped file must be submitted via Blackboard.

In your report, you should give the key points of your implementation and results for Part 1, a full description of your observations for Part 2, your feature extraction/selection method and its justification for Part 3 (if you attempt it), and any graphs that you think represent your achievements in the different parts. For Part 2, you should describe the hyper-parameters used in your experiments, the U-matrix and the observed clusters in the report, without including the images or the system in the zip file.

Your readme.txt file must contain a step-by-step procedure for Parts 1 and 2 so that a marker can follow your instructions to run your submitted code and the SOM system used in Part 2, and so straightforwardly replicate the results described in your report.

Note that we are not interested in the details of your code: which Matlab functions are called, what they return, etc. This course unit is about machine learning algorithms, and is indifferent to how you program them in Matlab.

There is no specific format; marks will be allocated roughly on the basis of:

• rigorous experimentation
• how informatively and clearly your results are presented in your report
• imagination/research/understanding/performance in Part 3 (if you attempt it)

The lab is marked out of 15:

Part 1 – Implementation: 10 marks
Part 2 – Image clustering: 2 marks
Part 3 – Bonus: 3 marks

Marks and feedback will be available on Blackboard. Once the marking is completed, you will be notified via email.