联系方式

  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-21:00
  • 微信:codinghelp

您当前位置:首页 >> Database作业Database作业

日期:2018-11-07 10:13

COMP9313 2018s2 Assignment

Question 1. MapReduce (5 pts)

Problem Background: Given an undirected graph G, its “line graph” is another graph L(G)

that represents the adjacencies between edges of G, such that:

each vertex of L(G) represents an edge of G; and

two vertices of L(G) are adjacent if and only if their corresponding edges share a

common endpoint ("are incident") in G.

The following figures show a graph (left) and its line graph (right). Each vertex of the line

graph is shown labelled with the pair of endpoints of the corresponding edge in the original

graph. For instance, the vertex on the right labelled (1,3) corresponds to the edge on the left

between the vertices 1 and 3. Vertex (1,3) is adjacent to three other vertices: (1,2) and (1,4)

(corresponding to edges sharing the endpoint 1 in G) and (3,4) (corresponding to an edge

sharing the endpoint 3 in G).

Problem: Given you the adjacency list of an undirected graph G, use MapReduce to generate

the adjacency list of its line graph L(G). Note that each edge connecting two nodes i and j is

represented by (i, j) in L(G) (if i<j). In the output, the edges in each list should be ranked in

ascending order by comparing the first node and then the second node. Write the

pseudocode for this problem, and consider the efficiency of your solution.

Take the above figure as an example, sample input and output are as below:

Input: Output:

1: 2, 3, 4

2: 1, 5

3: 1, 4

4: 1, 3, 5

5: 2, 4

(1, 2): (1, 3), (1, 4), (2, 5)

(1, 3): (1, 2), (1, 4), (3, 4)

(1, 4): (1, 2), (1, 3), (3, 4), (4, 5)

(2, 5): (1, 2), (4, 5)

(3, 4): (1, 3), (1, 4), (4, 5)

(4, 5): (1, 4), (2, 5), (3, 4)

Question 2. LSH (5 pts)

(i) Given two documents A (“the sky is blue the sun is bright”) and B (“the sun in the sky is

bright”), using the words as tokens, compute the 2-shingles for A and B, and then compute

their Jaccard similarity based on their 2-shingles.

(ii) We want to compute min-hash signature for the two documents A and B given in

Question (i), using two pseudo-random permutations of columns using the following

function:

h1(n) = 3n - 1 mod M

h2(n) = 2n + 1 mod M

Here, n is the row number in original ordering of the 2-shingles (according to their

occurrence in A then B), and M is the number of all 2-shingles you have computed from A

and B. Instead of explicitly reordering the columns for each hash function, we use the

implementation discussed in class, in which we read each data in a column once in a

sequential order and update the min hash signatures as we pass through them.

Complete the steps of the algorithm and give the resulting signatures for A and B.

Submission:

Deadline: Sunday 4th November 11:59:59 PM

Please provide your solutions to these questions in a pdf file named as

“answers.pdf”. Log in any CSE server (williams or wagner), and use the

give command below to submit your solutions:

$ give cs9313 assignment answers.pdf

Or you can submit through:

https://cgi.cse.unsw.edu.au/~give/Student/give.php

If you submit your assignment more than once, the last submission will

replace the previous one. To prove successful submission, please take a

screenshot as assignment submission instructions show and keep it by

yourself.

Late submission penalty

You will receive zero marks for this assignment.

Plagiarism:

The work you submit must be your own work. Submission of work partially

or completely derived from any other person or jointly written with any

other person is not permitted. The penalties for such an offence may include

negative marks, automatic failure of the course and possibly other academic

discipline. Assignment submissions will be examined manually.

Relevant scholarship authorities will be informed if students holding

scholarships are involved in an incident of plagiarism or other misconduct.

Do not provide or show your assignment work to any other person - apart

from the teaching staff of this subject. If you knowingly provide or show

your assignment work to another person for any reason, and work derived

from it is submitted you may be penalized, even if the work was submitted

without your knowledge or consent.


版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。 站长地图

python代写
微信客服:codinghelp