Date: 2018-07-11 08:28

Lab Course: Distributed Data Analytics

Exercise Sheet 10

Mohsan Jameel

Information Systems and Machine Learning Lab

University of Hildesheim

Submission deadline: Friday, July 13, 23:59 (on LearnWeb, course code: 3114)

Instructions

Please follow these instructions for solving and submitting the exercise sheet.

1. You should submit a zip or a tar file containing two things: (a) Python scripts and (b) a PDF document.

2. In the PDF document you will explain your approach (i.e. how you solved a given problem) and present your results in the form of graphs and tables.

3. The submission should be made before the deadline, only through LearnWeb.

Distributed Computing with Apache Spark: Recommender Systems

In this lab you will build a recommender system. For this exercise you can work with the movielens100k, movielens1m and/or movielens10m datasets available at https://grouplens.org/datasets/movielens/. The MovieLens rating dataset consists of (user, movie, rating) tuples, where ratings are on a five-point scale.

Exercise 1: Recommender System from scratch (10 points)

You will build a basic recommender system using Apache Spark. To do so, you will implement matrix factorization (MF) without a bias term (you may add a bias term if you want to, but it is not required for this task). Matrix factorization approximates a rating matrix $R \in \mathbb{R}^{M \times N}$ by the product of two low-rank matrices, $R \approx U V^\top$, with $U \in \mathbb{R}^{M \times K}$ and $V \in \mathbb{R}^{N \times K}$, where U is the user latent matrix, V is the movie latent matrix, M is the number of users, N is the number of movies and K is the number of latent dimensions. For more on matrix factorization, see references [3] and [4].
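As a point of reference, the single-machine version of this factorization can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the required solution: the function names `sgd_mf` and `rmse`, the L2 regularization weight `reg`, the learning rate `lr` and the default values are assumptions that do not come from the exercise sheet, and user and movie ids are assumed to be re-indexed to start at 0.

```python
import math
import random


def sgd_mf(ratings, num_users, num_items, k=2, lr=0.05, reg=0.02, epochs=300, seed=0):
    """Factorize (user, item, rating) tuples into U (users x k) and V (items x k).

    Ids are assumed to be 0-based. No bias term is used.
    """
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(num_users)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(num_items)]
    data = list(ratings)
    for _ in range(epochs):
        rng.shuffle(data)  # visit the ratings in a new random order each epoch
        for u, i, r in data:
            # Prediction error for this single rating.
            err = r - sum(a * b for a, b in zip(U[u], V[i]))
            for f in range(k):
                uf, vf = U[u][f], V[i][f]
                # Gradient step on the squared error with L2 regularization.
                U[u][f] += lr * (err * vf - reg * uf)
                V[i][f] += lr * (err * uf - reg * vf)
    return U, V


def rmse(ratings, U, V):
    """Root mean squared error of the predictions U[u] . V[i] on the given ratings."""
    sq_errors = [(r - sum(a * b for a, b in zip(U[u], V[i]))) ** 2 for u, i, r in ratings]
    return math.sqrt(sum(sq_errors) / len(sq_errors))
```

Calling `sgd_mf(data, M, N)` on a list of training tuples and then `rmse(test, U, V)` on held-out tuples gives the two scores the exercise asks for; the step size and regularization would need tuning on the real MovieLens data.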

Your tasks are:

1. Use Apache Spark transformations and actions to implement MF using Stochastic Gradient Descent (remember PSGD from Exercise Sheet 04).

(a) One strategy to implement MF with Spark is to use the map or mapPartitions function to create multiple splits of the data and learn an independent MF model for each split. Once the individual learning is done, you can aggregate the model parameters, i.e. U and V. See http://apachesparkbook.blogspot.com/2015/11/mappartition-example.html

(b) Report the train and test RMSE scores. You can follow standard 3-fold cross-validation.
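The strategy in (a) could be sketched roughly as follows. This is one possible reading of the task, not a reference solution: the helper names (`train_partition`, `add_models`, `scale_model`, `parallel_mf`), the hyperparameters and the choice of plain averaging as the aggregation step are all assumptions. Every partition starts from the same seeded initialization so that the averaged factors stay comparable. An empty partition would contribute its untrained initial model, so the data should be repartitioned sensibly before training.

```python
import random


def train_partition(rows, num_users, num_items, k, lr, reg, epochs, seed):
    """Run local SGD on the ratings of one partition and yield one (U, V) model."""
    data = list(rows)
    rng = random.Random(seed)  # same seed everywhere -> identical starting point
    U = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(num_users)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(num_items)]
    for _ in range(epochs):
        rng.shuffle(data)
        for u, i, r in data:
            err = r - sum(a * b for a, b in zip(U[u], V[i]))
            for f in range(k):
                uf, vf = U[u][f], V[i][f]
                U[u][f] += lr * (err * vf - reg * uf)
                V[i][f] += lr * (err * uf - reg * vf)
    yield U, V


def add_models(a, b):
    """Element-wise sum of two (U, V) pairs; used as the reduce function."""
    def add(m1, m2):
        return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(m1, m2)]
    return add(a[0], b[0]), add(a[1], b[1])


def scale_model(model, factor):
    """Multiply every entry of U and V by a constant."""
    def scale(m):
        return [[x * factor for x in row] for row in m]
    return scale(model[0]), scale(model[1])


def parallel_mf(ratings_rdd, num_users, num_items, k=2, lr=0.05, reg=0.02, epochs=100, seed=0):
    """Train one model per partition of an RDD of (user, item, rating) and average them."""
    n_parts = ratings_rdd.getNumPartitions()
    models = ratings_rdd.mapPartitions(
        lambda rows: train_partition(rows, num_users, num_items, k, lr, reg, epochs, seed)
    )
    # reduce is the Spark action that triggers the per-partition training.
    return scale_model(models.reduce(add_models), 1.0 / n_parts)
```

With an existing SparkContext, `parallel_mf` would be called on an RDD of 0-based `(user, item, rating)` tuples. Averaging once after independent training is the simplest aggregation; repeating the train-and-average step for several rounds, as in PSGD, is a natural variation to compare against.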

Exercise 2: Recommender System using Apache Spark MLlib (10 points)

In this exercise you will perform matrix factorization using the built-in functions of the Apache Spark MLlib library. You will experiment with the same dataset as in Exercise 1.

1. Implement a recommender system using Apache Spark MLlib functions.

2. Report the train and test RMSE scores. You can follow standard 3-fold cross-validation.

3. Compare your scores from Exercise 1 and Exercise 2.

4. Look at the results for the MovieLens datasets at http://www.mymedialite.net/examples/datasets.html and compare them with your own results.
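Tasks 1 and 2 might look roughly like the sketch below. It makes several assumptions that are not in the exercise sheet: it uses the DataFrame-based `pyspark.ml.recommendation.ALS` estimator (the RDD-based `pyspark.mllib` ALS would be an equally valid choice), it reads the tab-separated `u.data` file of movielens100k, and the function names, column names and hyperparameters are placeholders. `coldStartStrategy="drop"` is set so that users or movies missing from a training fold do not turn the test RMSE into NaN.

```python
def parse_rating(line):
    """Parse one tab-separated line of the movielens100k `u.data` file."""
    user, item, rating, _timestamp = line.split("\t")
    return int(user), int(item), float(rating)


def three_fold_rmse(spark, path, rank=10, reg=0.1, iters=10, seed=42):
    """Return a list of (train RMSE, test RMSE) pairs, one per fold."""
    # Imported lazily so that parse_rating can be used without Spark installed.
    from pyspark.ml.evaluation import RegressionEvaluator
    from pyspark.ml.recommendation import ALS

    rows = spark.sparkContext.textFile(path).map(parse_rating)
    df = spark.createDataFrame(rows, ["user", "item", "rating"]).cache()
    folds = df.randomSplit([1.0, 1.0, 1.0], seed=seed)  # three roughly equal folds
    evaluator = RegressionEvaluator(
        metricName="rmse", labelCol="rating", predictionCol="prediction"
    )
    scores = []
    for i, test in enumerate(folds):
        # The training set is the union of the two folds not held out.
        train = None
        for j, fold in enumerate(folds):
            if j != i:
                train = fold if train is None else train.union(fold)
        als = ALS(
            rank=rank, maxIter=iters, regParam=reg, seed=seed,
            userCol="user", itemCol="item", ratingCol="rating",
            coldStartStrategy="drop",
        )
        model = als.fit(train)
        scores.append(
            (evaluator.evaluate(model.transform(train)),
             evaluator.evaluate(model.transform(test)))
        )
    return scores
```

Given a `SparkSession` named `spark`, `three_fold_rmse(spark, "ml-100k/u.data")` would return three score pairs whose averages can be set against the Exercise 1 numbers; the path shown is only an example.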
