联系方式

  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-21:00
  • 微信:codinghelp

您当前位置:首页 >> C/C++编程C/C++编程

日期:2024-01-02 09:19

Coursework 2 (Group) – Scene Recognition

Brief

This is a group coursework: please work in teams of four people.

Due date: Wednesday 10th January, 16:00.

Development data download: training.zip in the coursework (CW) folder

Testing data download: testing.zip in the CW folder

Required files: report.pdf; code.zip; run1.txt; run2.txt; run3.txt

Credit: 25% of overall module mark

Overview

The goal of this project is to introduce you to image recognition. Specifically, we will examine the

task of scene recognition starting with very simple methods -- tiny images and nearest neighbour

classification -- and then move on to techniques that resemble the state-of-the-art.

This coursework will run following the methodology used in many current scientific benchmarking

competitions/evaluations. You will be provided with a set of labelled development images from

which you are allowed to develop and tune your classifiers. You will also be provided with a set of

unlabelled images for which you will be asked to produce predictions of the correct class.

Details

You will need to write software that classifies scenes into one of 15 categories. We want you to

implement three different classifiers as described below. You will then need to run each classifier

against all the test images and provide a prediction of the class for each image.

Data

The training data consists of 100 images for each of the 15 scene classes. These are arranged in

directories named according to the class name. The test data consists of 2985 images. All the

images are provided in JPEG format. All the images are grey-scale, so you don't need to consider

colour.

Objective measure

The key classification performance indicator for this task is average precision; this is literally the

proportion of number of correct classifications to the total number of predictions (i.e. 2985).

Run conditions

As mentioned above, you need to develop and run three different classifiers. We'll refer to the

application of a classifier to the test data as a "run".

Run #1: You should develop a simple k-nearest-neighbour classifier using the "tiny image" feature.

The "tiny image" feature is one of the simplest possible image representations. One simply crops

each image to a square about the centre, and then resizes it to a small, fixed resolution (we

recommend 16x16). The pixel values can be packed into a vector by concatenating each image

row. It tends to work slightly better if the tiny image is made to have zero mean and unit length.

You can choose the optimal k-value for the classifier.

Run #2: You should develop a set of linear classifiers (an ensemble of 15 one-vs-all classifiers)

using a bag-of-visual-words feature based on fixed size densely-sampled pixel patches. We

recommend that you start with 8x8 patches, sampled every 4 pixels in the x and y directions. A

sample of these should be clustered using K-Means to learn a vocabulary (try ~500 clusters to

start). You might want to consider mean-centring and normalising each patch before

clustering/quantisation. Note: we're not asking you to use SIFT features here - just take the pixels

from the patches and flatten them into a vector & then use vector quantisation to map each patch

to a visual word.

Run #3: You should try to develop the best classifiers you can! You can choose whatever feature,

encoding and classifier you like. Potential features: the GIST feature; Dense SIFT; Dense SIFT in a

Gaussian Pyramid; Dense SIFT with spatial pooling (commonly known as PHOW - Pyramid

Histogram of Words), etc. Potential classifiers: Naive bayes; non-linear SVM (perhaps using a linear

classifier with a Homogeneous Kernel Map), ...

Run prediction format

The predictions for each run must be written to a text file named runX.txt (where X is the run

number) with the following format:

For example:

<image_name> <predicted_class>

<image_name> <predicted_class>

<image_name> <predicted_class>

...

0.jpg tallbuilding

1.jpg forest

2.jpg mountain

3.jpg store

4.jpg store

5.jpg bedroom

...

Restrictions

• You are not allowed to use the testing images for anything other than producing the final

predictions They must not be used for either training or learning feature encoding.

The report

The report must be no longer than 4 sides of A4 with the given Latex format for CW2, and must be

submitted electronically as a PDF. The report must include:

• The names and ECS user IDs of the team members

• A description of the implementation of the classifiers for the three runs, including information on

how they were trained and tuned, and the specific parameters used for configuring the feature

extractors and classifiers. We expect that your "run 3" section will be considerably longer than the

descriptions of runs 1 & 2.

• A short statement detailing the individual contributions of the team members to the coursework.

What to hand in

You need to submit to ECS Handin the following items:

• The group report (as a PDF document in the CVPR format same as CW2; max 4 A4 sides, no

appendix)

• Your code enclosed in a zip file (including everything required to build/run your software and to

train and use your classifiers; please don't include binaries or any of the images!)

• The run prediction files for your three runs (named "run1.txt", "run2.txt" and "run3.txt").

• A plain text file listing the user ids (e.g. xx1g20) of the members of your team; one per line.

Marking and feedback

Marks will be awarded for:

• Successful completion of the task.

• Well structured and commented code.

• Evidence of professionalism in implementation and reporting.

• Quality and contents of the report.

• The quality/soundness/complexity of approach used for run 3.

Marks will not be based on the actual performance of your approach (although you can expect to

lose marks if runs 1 and 2 are way off our expectations or you fail to follow the submission

instructions). We will open the performance rankings for run 3. !"#$

Standard ECS late submission penalties apply.

Individual feedback will be given to each team covering the above points. We will also give overall

feedback on the approaches taken in class when we announce the winner!

Useful links

• Matlab

o Image processing toolbox tutorials

o Recommended: VLFeat

§ Example of using VLFeat to perform classification

o Linear and non-linear SVMs

• Python

o numpy, PIL, sklearn (Scikit-learn), OpenCV, etc.

• C and C++

o OpenCV

o Recommended: VLFeat

o Example of using VLFeat to perform classification (Note this code is Matlab, but most of the

functionality is available in the C/C++ API)

• Java

o Recommended: OpenIMAJ

§ Chapter 12 of the tutorial deals with image classification

o BoofCV

Questions

If you have any problems/questions, use the Q&A channel on Teams


版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。 站长地图

python代写
微信客服:codinghelp