INTRODUCTION TO AI
2020 Fall Semester
PROJECT
Due Date: 12 December (Saturday), before midnight. Submit to YSCEC.
INSTRUCTIONS:
1. This paper consists of 5 pages with 2 questions.
2. You may use a deep learning library or toolbox to solve the problems. However, you must state
its name, version, and other relevant details in the introduction of your report.
3. Submit a report containing the following items:
Example:
Introduction: A brief introduction of the problem.
Problems and Solution
Discussion
Conclusion
Remark: Write the introduction and the conclusion only once in the report.
4. Use English in the report.
5. Attach your code as an appendix. You may show code snippets (not the whole code) to aid your
explanation.
Question 1: AlexNet with CIFAR-10
In this question, we adopt AlexNet as discussed in the lecture and train it on the CIFAR-10 dataset
(https://www.cs.toronto.edu/~kriz/cifar.html). Tutorials on training a CNN with CIFAR-10 are
available for specific deep learning libraries [1]-[4]. Most of the major deep learning libraries
(TensorFlow, PyTorch, MATLAB) come with a CIFAR-10 tutorial example.
You should know the AlexNet architecture well enough to modify the hyperparameters of the
network to suit the characteristics of CIFAR-10.
Train a standard AlexNet on CIFAR-10 with cross-entropy (CE) loss. Note the following:
- Modify the output layer (softmax layer) to have 10 nodes, corresponding to the 10 classes in CIFAR-10.
- Use the original CIFAR-10 images (32 x 32 x 3) as input to AlexNet. With this input size, you will
  find that the feature map vanishes at the last conv layer, so modify the stride, padding, etc.
- If you run out of memory, adjust the filter sizes, the number of channels, and the number of
  neurons in the FC layers.
- What you cannot change is the architecture (5 conv layers, 3 max pooling layers and 2 FC
  layers).
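As a sanity check on the vanishing feature map, the sketch below traces the spatial size through the five conv and three pooling layers using the standard formula out = floor((in + 2 * padding - kernel) / stride) + 1. The ORIGINAL settings are an assumption (one commonly used AlexNet configuration, not necessarily the one from the lecture), and the ADAPTED settings are only a hypothetical example, not a recommended answer.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (floor convention)."""
    padded = size + 2 * padding
    if padded < kernel:
        return 0  # the kernel no longer fits: the feature map has vanished
    return (padded - kernel) // stride + 1


def trace(size, layers):
    """Return [(layer_name, output_size), ...] for a square input."""
    sizes = []
    for name, kernel, stride, padding in layers:
        size = out_size(size, kernel, stride, padding)
        sizes.append((name, size))
    return sizes


# Assumed "standard" AlexNet settings: (name, kernel, stride, padding).
ORIGINAL = [
    ("conv1", 11, 4, 2), ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2), ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1), ("conv4", 3, 1, 1), ("conv5", 3, 1, 1),
    ("pool3", 3, 2, 0),
]

# Hypothetical adaptation for 32 x 32 inputs: small kernels, stride 1,
# and 2 x 2 pooling. The layer count (5 conv, 3 pool) is unchanged.
ADAPTED = [
    ("conv1", 3, 1, 1), ("pool1", 2, 2, 0),
    ("conv2", 3, 1, 1), ("pool2", 2, 2, 0),
    ("conv3", 3, 1, 1), ("conv4", 3, 1, 1), ("conv5", 3, 1, 1),
    ("pool3", 2, 2, 0),
]

if __name__ == "__main__":
    for label, layers in (("original", ORIGINAL), ("adapted", ADAPTED)):
        print(label, trace(32, layers))
```

Under these assumed settings, a 32 x 32 input shrinks to 1 x 1 after pool2 and no longer fits the final 3 x 3 pooling window, whereas the hypothetical adapted settings leave a 4 x 4 map after pool3.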
(a) Describe the training setup, e.g. data pre-processing/augmentation, initialization, and
hyperparameters such as learning rate schedule, momentum, dropout rate, batch size, batch
normalization (BN), etc. Present them systematically in table form.
(b) Fill in the table below to show the number of weights and neurons in each layer of your
modified AlexNet network.
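The per-layer counts asked for in (b) can be worked out by hand or with a small helper such as the sketch below. It assumes the usual conventions: a conv layer has kernel x kernel x input channels x output channels weights plus one bias per output channel, and its neuron count is the number of output activations. The channel counts and sizes in the example calls are hypothetical, not prescribed values.

```python
def conv_layer_stats(kernel, c_in, c_out, out_hw):
    """Return (weights, biases, neurons) for a conv layer with square kernels.

    out_hw is the spatial height/width of the layer's output feature map.
    """
    weights = kernel * kernel * c_in * c_out
    biases = c_out
    neurons = c_out * out_hw * out_hw
    return weights, biases, neurons


def fc_layer_stats(n_in, n_out):
    """Return (weights, biases, neurons) for a fully connected layer."""
    return n_in * n_out, n_out, n_out


if __name__ == "__main__":
    # Hypothetical first conv layer: 3 x 3 kernels, RGB input,
    # 64 output channels, 32 x 32 output map.
    print("conv1:", conv_layer_stats(3, 3, 64, 32))
    # Hypothetical first FC layer fed by a 256-channel 4 x 4 feature map.
    print("fc1:  ", fc_layer_stats(256 * 4 * 4, 1024))
```

Pooling layers have no weights, so only their output size matters for the neuron count.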
(c) Plot the following:
Training and test loss (not classification accuracy) vs epoch.
Classification accuracy on the training and test set vs epoch.
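The two quantities plotted in (c) are different measurements of the same predictions. The minimal sketch below, written in plain Python and independent of any deep learning library, computes the mean softmax cross-entropy loss and the accuracy for one pass over a dataset. The function names are illustrative; in practice your library's built-in loss and metric functions would normally be used instead.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy of one sample, using a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def epoch_metrics(all_logits, labels):
    """Return (mean CE loss, accuracy in %) over one pass of a dataset."""
    losses = [cross_entropy(z, y) for z, y in zip(all_logits, labels)]
    correct = sum(1 for z, y in zip(all_logits, labels)
                  if z.index(max(z)) == y)
    return sum(losses) / len(losses), 100.0 * correct / len(labels)


if __name__ == "__main__":
    # Two toy samples with two classes each; the second is misclassified.
    loss, acc = epoch_metrics([[2.0, 0.0], [0.0, 2.0]], [0, 0])
    print(f"loss={loss:.4f} accuracy={acc:.1f}%")
```

Calling epoch_metrics once per epoch on the training set and once on the test set yields the four curves requested.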
(d) Fill in the table with the final accuracies that you obtained.
Training Accuracy (%) | Testing Accuracy (%)
Remark:
1. The testing accuracy should be at least 75%. With proper tuning of the network
hyperparameters, data augmentation, BN, etc., you can achieve more than 85%.
2. The difference between the training accuracy and the testing accuracy should not exceed 10%.
3. The generalization techniques discussed in the lecture will be helpful. The score for this part
is proportional to the accuracy that you obtain. Note that the accuracy values should be consistent
with the graphs shown in (c). If no graph is shown, the answer in (d) will be discounted.
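The two numeric targets in the remarks can be checked with a small helper like the sketch below. It assumes that "10%" means 10 percentage points between the two accuracies. That reading is an interpretation, and check_targets is a hypothetical name.

```python
def check_targets(train_acc, test_acc, min_test=75.0, max_gap=10.0):
    """Compare final accuracies (in %) against the targets in the remarks.

    Assumption: the train/test gap is measured in percentage points.
    """
    gap = train_acc - test_acc
    return {
        "test_ok": test_acc >= min_test,
        "gap_ok": abs(gap) <= max_gap,
        "gap": gap,
    }


if __name__ == "__main__":
    print(check_targets(92.0, 84.5))
    print(check_targets(99.0, 80.0))
```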
(e) Discuss the issues that you experienced and the steps that you took to overcome them.
These could include training that fails to converge and/or poor testing accuracy.
Remark: If your machine is slow at solving this question because it has only a CPU or a weak GPU,
consider using Google Colab if you are working in Python. See https://towardsdatascience.com/getting-started-with-google-colab-f2fff97f594c
Question 2: Transfer Learning
Now we want to see whether the features of the AlexNet developed in Question 1 can transfer to
another domain. Here we use a subset of the imdb_wiki face dataset that consists of 100 subjects
with 30 samples per subject as the target dataset. Note that this is a case where the target data
are dissimilar to the source data and small in size relative to CIFAR-10.
(a) First, starting from the model that has been trained on CIFAR-10, re-train the last (softmax)
layer of the network built in Question 1 with the face training samples (20 samples per subject).
This is equivalent to freezing the rest of the layers in the network and replacing the output
layer with a new (linear) classifier for faces. Note that the number of output nodes should change
from 10 to 100. This serves as the baseline model. Evaluate the baseline model on the face test
set, which is composed of 10 samples per subject.
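If the face subset does not already come with a fixed split, the 20/10 division per subject described in (a) could be produced as in the sketch below. The (subject_id, item) input format and the function name are assumptions made for illustration. The seed only makes the split reproducible.

```python
import random
from collections import defaultdict


def split_per_subject(samples, n_train=20, n_test=10, seed=0):
    """Split (subject_id, item) pairs into per-subject train and test lists."""
    by_subject = defaultdict(list)
    for subject, item in samples:
        by_subject[subject].append(item)

    rng = random.Random(seed)
    train, test = [], []
    for subject in sorted(by_subject):
        items = list(by_subject[subject])
        if len(items) < n_train + n_test:
            raise ValueError(f"subject {subject!r} has only {len(items)} samples")
        rng.shuffle(items)
        train += [(subject, x) for x in items[:n_train]]
        test += [(subject, x) for x in items[n_train:n_train + n_test]]
    return train, test


if __name__ == "__main__":
    # Placeholder file names standing in for 100 subjects x 30 images.
    data = [(s, f"subject{s:03d}_img{i:02d}") for s in range(100) for i in range(30)]
    train, test = split_per_subject(data)
    print(len(train), len(test))
```

Splitting per subject keeps every one of the 100 classes represented in both sets, and no image appears in both.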
(b) Then, try to fine-tune, remove, or retrain chosen convolution and/or FC layers. Create three
models, one per trial. Fill in the table below and evaluate each model on the face test set.
Trial | Finetune Learning Rate (if applicable; specify in this form: x% of the original lr) | Model #
Finetune Conv 4 and 5, freeze the rest (Example) | 10% of original lr (Example) | Model 1 (Example)
Remove Conv 5, FC 1 and FC 2 (Example) | Not applicable (Example) | Model 2 (Example)
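One way to keep a trial's settings explicit is to write down a per-layer learning rate plan before implementing it in your library. The sketch below is a hypothetical helper: the layer names are illustrative, a rate of 0.0 marks a frozen layer, and training the new 100-way output layer at the full original rate is an assumption, not a requirement of the assignment.

```python
LAYERS = ["conv1", "conv2", "conv3", "conv4", "conv5", "fc1", "fc2", "out"]


def finetune_plan(base_lr, percent, finetune, new_layers=("out",)):
    """Map each layer name to a learning rate; 0.0 means the layer is frozen.

    percent is the fine-tuning rate as x% of the original learning rate.
    Assumption: newly created layers are trained at the full base_lr.
    """
    unknown = set(finetune) - set(LAYERS)
    if unknown:
        raise ValueError(f"unknown layers: {sorted(unknown)}")
    plan = {}
    for name in LAYERS:
        if name in new_layers:
            plan[name] = base_lr
        elif name in finetune:
            plan[name] = base_lr * percent / 100.0
        else:
            plan[name] = 0.0
    return plan


if __name__ == "__main__":
    # The example row from the table: fine-tune Conv 4 and 5 at 10% of the lr.
    for layer, lr in finetune_plan(0.01, 10, ["conv4", "conv5"]).items():
        print(f"{layer}: {lr}")
```

Most libraries can express such a plan directly, for example through per-layer optimizer settings or by marking frozen layers as non-trainable.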
For each model,
(i) Show both training and test loss vs. epoch graphs.
(ii) Show both classification accuracy on the training and test set vs. epoch graphs.
Fill in the table:
Model | Training Accuracy (%) | Testing Accuracy (%)
Baseline Model | |
Model 1 | |
Model 2 | |
Model 3 | |
(c) Discuss your observations on the generalization performance of transfer learning.
Remark: Almost every deep learning library comes with a tutorial on CNN fine-tuning, such as
TensorFlow [5], MATLAB [6], Keras [7], etc.
References:
[1] https://www.mathworks.com/help/nnet/examples/train-residual-network-on-cifar-10.html
[2] https://www.tensorflow.org/tutorials/deep_cnn
[3] https://blog.plon.io/tutorials/cifar-10-classification-using-keras-tutorial/
[4] http://caffe.berkeleyvision.org/gathered/examples/cifar10.html
[5] https://kratzert.github.io/2017/02/24/finetuning-alexnet-with-tensorflow.html
[6] https://www.mathworks.com/help/nnet/examples/transfer-learning-using-alexnet.html
[7] https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html