Date: 2020-12-11 11:24

In this project, I want us to bring together several of our discussions to appreciate the advancements and challenges of machine learning, to practice scientific investigation, and to write a scientific report. We will do this in a setting where we combine pretrained recognition networks with a reinforcement learning model implemented as a function approximator.

The task is to implement and evaluate a reinforcement learner that takes handwritten digits as input and counts either up or down, depending on which action is best. The reward in this environment is r=1 for state 0 and r=2 for state 9. No reward is given in the states in between, and the rewards are not provided to the learner in advance; hence this is a model-free RL task.
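To make the setting concrete, here is a minimal sketch of the counting environment with a tabular Q-learner that works on the integer state rather than on images. Several details are assumptions, since the description above does not fix them: the reward is paid on arrival in state 0 or 9, both of those states end the episode, moves are clipped to the range 0..9, and the values of alpha, gamma, and the number of episodes are placeholders. The names `step` and `q_learning` are hypothetical.

```python
import random

N_STATES = 10                 # states are the digits 0..9
ACTIONS = (-1, +1)            # action 0 counts down, action 1 counts up
REWARDS = {0: 1.0, 9: 2.0}    # r=1 for state 0, r=2 for state 9

def step(state, action):
    """Assumed dynamics: move one digit, clip to 0..9, pay the reward on
    arrival, and end the episode in either rewarded state."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    return next_state, REWARDS.get(next_state, 0.0), next_state in REWARDS

def q_learning(episodes=2000, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with a uniformly random behaviour policy.

    Q-learning is off-policy, so random exploration is sufficient here.
    The learner never reads REWARDS directly; it only sees the rewards
    returned by step(), which is what makes the task model-free.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = rng.randint(1, N_STATES - 2), False
        while not done:
            action = rng.randrange(2)
            next_state, reward, done = step(state, action)
            # Bootstrapped target; no bootstrapping from a terminal state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

def greedy_policy(q):
    """Best action (0 = down, 1 = up) for the non-terminal states 1..8."""
    return [max((0, 1), key=lambda a: q[s][a]) for s in range(1, N_STATES - 1)]
```

Under these assumptions the preferred direction in each state depends on the discount factor, so gamma is one of the parameters that has to be reported for the experiments to be reproducible.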

We already started in Assignment 8 to use a recognition network to first map an image of a number to the corresponding integer or one-hot representation. Such a solution should be included as a baseline in your investigations. The task of the final project is now to implement a single network that takes an MNIST image as input and outputs the value function Q for counting up or down.

There should be two stages to this investigation beyond the baselines of Assignment 8. The first is to pretrain an MNIST recognition network that then becomes part of the RL network. You should then test the performance with frozen and unfrozen layers of the pretrained network. You should be able to use all test and training data provided in the MNIST dataset. However, please justify your choice in the paper if you must reduce this set.
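The difference between frozen and unfrozen layers can be illustrated with a toy example in plain Python. This is only a sketch under stated assumptions, not the network you are expected to build: the matrix `W` stands in for the pretrained recognition layers, `head` holds one weight row per action, and Q(x, a) = head[a] . relu(W x). The function names and the learning rate are hypothetical. In a deep learning framework you would instead mark the pretrained parameters as non-trainable.

```python
def features(W, x):
    """Hidden features h = relu(W x); W stands in for the pretrained layers."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W]

def q_values(W, head, x):
    """Q(x, a) = head[a] . h for the two actions (0 = down, 1 = up)."""
    h = features(W, x)
    return [sum(c * hj for c, hj in zip(row, h)) for row in head]

def td_update(W, head, x, action, target, lr=0.1, freeze_features=True):
    """One semi-gradient step on 0.5 * (target - Q(x, action))**2.

    The head row of the chosen action is always updated. W is updated
    only when freeze_features is False. Both are modified in place.
    """
    h = features(W, x)
    error = target - sum(c * hj for c, hj in zip(head[action], h))
    old_row = list(head[action])  # backpropagate through the pre-update head
    for j, hj in enumerate(h):
        head[action][j] += lr * error * hj
    if not freeze_features:
        for j, row in enumerate(W):
            if h[j] > 0.0:  # the relu gradient is zero for inactive units
                for i, xi in enumerate(x):
                    row[i] += lr * error * old_row[j] * xi
    return error
```

With `freeze_features=True` only the Q head adapts to the RL signal and the recognition features stay exactly as pretrained. With `freeze_features=False` the TD error also reshapes the features, which can help or can destroy what pretraining learned; comparing the two is the point of the first stage.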

The second stage is to try to train the network without pretraining. I want to caution you that this is not an easy task and that you might not be able to get this to perform well, if at all. Some discussion of this most challenging part of the project must be included in your paper.

The report for this final project will be in the form of a scientific paper, such as a typical conference proceedings paper, with a strict page limit of 4 pages. The paper is to be submitted as a PDF file that is not zipped. The font size must be 11pt or larger, including all fonts used in illustrations. A margin of 1 inch is expected around all edges. I am fully aware that reporting all your findings in this space will require some careful writing. You can assume that the readership has some machine learning background, though a brief introduction of your method is still required. Also note that readers must be able to reimplement your experiments from the paper alone. Hence, all parameters of the model and the experiments must be provided. Please also submit your programs.

This final project is meant to be an interactive research project, so I expect you to reach out to your study group, to the TAs, or to the instructor if you have questions or if you run into problems in your implementation. Although this is an individual project, I fully expect that you will be discussing it with your peers. However, you need to write your own paper, and you must be prepared to defend your paper at the end.

