Python Practical
Examine the accuracy of a TensorFlow recommendation engine.
TensorFlow is a popular framework for machine learning. Part of its popularity comes from its ability to use a computer's GPU, which can make training much faster.
TF-Recomm is a movie recommendation engine built on TensorFlow that uses a factorization model and a data set from MovieLens. The original code, https://github.com/songgc/TF-recomm, does not have an interface to ask for recommendations.
Andy has created a version that listens on a port for a user number and returns recommendations for that user. The code is by no means perfect and there may be errors! The code is bundled into a Docker container for ease of use. To use this code:
Standard Small Version
Clone the project from https://github.com/acobley/TF-recomm.git
git clone https://github.com/acobley/TF-recomm.git
Run the docker container:
docker run --name runtfrecomm -p 81:81 -i -t acobley/tfrecomm bash
This will start a Docker container and open it with a command line. Change to the TF-recomm directory and run the engine:
python svd_train_val.py
This will train the model and, once done, wait for a socket connection. Open another window and start a telnet session (on Windows you will need to install or enable a telnet client):
telnet localhost 81
Type a number between 1 and about 5000 and you should get recommendations back.
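The telnet step can also be done from Python, which is what the practical eventually needs. The sketch below is only an illustration: the function name query_recommendations is hypothetical, and it assumes the server reads a newline-terminated number and writes its reply as text. The exact reply format is not described here, so the function returns the raw text unparsed.

```python
import socket


def query_recommendations(user_id, host="localhost", port=81, timeout=10.0):
    """Send one user id to the engine and return the raw text reply.

    Assumption: the server accepts a newline-terminated number and
    answers with text. Check svd_train_val.py for the real protocol.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(f"{user_id}\n".encode("ascii"))
        try:
            while True:
                data = conn.recv(4096)
                if not data:  # the server closed the connection
                    break
                chunks.append(data)
        except socket.timeout:
            # The server kept the connection open; keep what arrived so far.
            pass
    return b"".join(chunks).decode("utf-8", errors="replace")
```

With the container running, query_recommendations(1) should return the same text that telnet shows for user 1. If the server never closes the connection, each call waits for the timeout, so a shorter timeout may be more practical.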
GPU large version
This version uses a much bigger data set and should use your GPU. It has not been tested, so make sure the small version works first.
Clone the project as before but switch to the GPU branch
git checkout GPU
Run the Docker container as:
docker run --name runtfrecommgpu -p 81:81 -i -t acobley/tfrecommgpu bash
You should be able to run it in the same way as the small model.
The Problem
This model is trained on data from MovieLens. Only the ratings.dat file is used for training. However, there are two other files, movies.dat and users.dat. The challenge is to write a Python program that will interrogate the model and give you an idea of how accurate the model is. To do this you will need to use the movies and users files.
The Movies file:
See http://files.grouplens.org/datasets/movielens/ml-1m-README.txt
The file is in the format:
MovieID::Title::Genres
Genres are a list separated by | delimiters.
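As a starting point, the movies file format above could be read as follows. This is a minimal sketch: the function names are hypothetical, and the Latin-1 encoding is an assumption about the ml-1m download that you should check against your copy of the file.

```python
def parse_movie_line(line):
    """Split one 'MovieID::Title::Genres' record into its fields."""
    movie_id, rest = line.rstrip("\r\n").split("::", 1)
    title, genres = rest.rsplit("::", 1)
    return int(movie_id), title, genres.split("|")


def load_movies(path="movies.dat"):
    """Return {movie_id: (title, [genres])} for every record in the file."""
    movies = {}
    # Assumption: the file is Latin-1 encoded; change this if yours differs.
    with open(path, encoding="latin-1") as handle:
        for line in handle:
            if line.strip():  # skip blank lines
                movie_id, title, genres = parse_movie_line(line)
                movies[movie_id] = (title, genres)
    return movies
```

Keeping the genres as a list makes it easy to count them later when comparing recommendations.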
The Users file
See http://files.grouplens.org/datasets/movielens/ml-1m-README.txt
The file is in the format:
UserID::Gender::Age::Occupation::Zip-code
See the readme file for details of the Age and Occupation field.
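One way to find users who share a demographic is to group user ids by their gender, age and occupation fields. The sketch below is an illustration under assumptions: the function names are hypothetical, and it treats Age and Occupation as the integer codes described in the readme. Zip code is parsed but left out of the grouping key, which is a design choice you may want to change.

```python
from collections import defaultdict


def parse_user_line(line):
    """Split one 'UserID::Gender::Age::Occupation::Zip-code' record."""
    user_id, gender, age, occupation, zip_code = line.rstrip("\r\n").split("::")
    # Age and Occupation are assumed to be integer codes (see the readme).
    return int(user_id), (gender, int(age), int(occupation)), zip_code


def group_users(lines):
    """Map each (gender, age, occupation) key to the user ids sharing it."""
    groups = defaultdict(list)
    for line in lines:
        if line.strip():  # skip blank lines
            user_id, key, _zip_code = parse_user_line(line)
            groups[key].append(user_id)
    return groups
```

group_users accepts any iterable of lines, so an open file handle for users.dat can be passed in directly. Groups with only one member give nothing to compare, so you would normally keep only the keys with two or more user ids.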
What to do
If you look carefully, you will see that there are users with similar gender, age and occupation, and possibly zip code. You would imagine that users in the same demographic would receive similar recommendations. Write a Python program that finds users of a similar demographic, connects to the TensorFlow container, sends the user ids, stores the recommendations and looks for similarities. You could look at the genres returned and see if they are similar for similar users. You could extend the movies file using Python to include age ratings and use that as a comparison of the returned movies. You might want to use R to analyse the results you obtain to look for statistical significance.
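One possible way to "look for similarities" is to measure the overlap between the sets of movies recommended to users in the same group, and to count the genres in each list. The sketch below uses Jaccard similarity, which is a suggestion rather than a required method. It assumes you have already parsed the engine's replies into lists of movie ids; how to do that depends on the reply format, which is not specified here. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def jaccard(a, b):
    """Overlap of two collections: |A and B| / |A or B|, from 0.0 to 1.0."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty lists are treated as identical
    return len(a & b) / len(a | b)


def mean_pairwise_jaccard(recs_by_user):
    """Average Jaccard similarity over every pair of users in one group.

    recs_by_user maps a user id to that user's recommended movie ids.
    Returns None when the group has fewer than two users.
    """
    pairs = list(combinations(recs_by_user.values(), 2))
    if not pairs:
        return None
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def genre_counts(movie_ids, genres_by_movie):
    """Count how often each genre appears in a list of recommended movies."""
    counts = Counter()
    for movie_id in movie_ids:
        counts.update(genres_by_movie.get(movie_id, []))
    return counts
```

Comparing the mean similarity within demographic groups against the mean for randomly chosen users would give a baseline: if the two values are close, the demographic grouping explains little about the recommendations.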
Write a report on the code you have produced and any results you have generated. Include your code in the appendix (and a zip file of the code). Your report should be less than 10 pages.