Date: 2019-09-10 11:25

MAST90085 – Semester 2, 2019. Assignment #1

Instructions:

• The assignment contains 2 problems worth a total of 20 marks, which will count towards 10% of the final mark for the course.

• A PAPER copy of your assignment must be turned in by 5pm Wednesday 11 September, 2019. You must complete the online plagiarism form on LMS by 5pm Wednesday 11 September, 2019. The assignment must be returned to the MAST90085 pigeonhole assigned in the School of Mathematics and Statistics. Make sure you place your assignment in the right pigeonhole (there are three pigeonholes, one per tutorial, and you should use only the pigeonhole corresponding to your tutorial).

• [1 mark] Your assignment should clearly show your name and student ID number, your tutor’s name and the time and day of your tutorial class. Your answers must be clearly numbered and in the same order as the assignment questions. Your answers must be easy to read (marks may be deducted for illegible handwriting). Include all of your working out in your answers. All R outputs, including graphs and tables, must be accompanied by the concise and clearly written R code used to produce them. Any graph, table or R code must be accompanied by clear and concise comments.

• Use tables, graphs and concise text explanations to support your answers. All tables and graphs must be clearly commented and identified.

• All R code should be clearly written and commented. Uncommented R code is not acceptable.

• Comments should be brief and concise: marks will be awarded for clarity.

• Late assignments will only be accepted under exceptional circumstances with a written application for submitting late and/or a medical certificate. A late penalty may be imposed.

• Your lecturer may not help you directly with assignment questions, but may provide some appropriate guidance.

Data: In the assignment you will analyse some wheat data. The dataset is available in .txt format on the LMS web page within the Assignments menu. The data come from three different varieties of wheat, denoted by 1 to 3 in the dataset. Each row of the dataset corresponds to a different wheat kernel. Seven numerical characteristics were measured on each kernel: X1: area, X2: perimeter, X3: compactness, X4: length of kernel, X5: width of kernel, X6: asymmetry coefficient, X7: length of kernel groove. The eighth variable, X8, contains the value 1, 2 or 3 depending on the variety of wheat the kernel comes from.

Problem 1 [8 marks]

(a) [3 marks] Give all possible values that a and b can take in order for the following matrix to be a covariance matrix. Give arguments that justify your answer:
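
The matrix for part (a) is not reproduced in this copy, so the following is only a generic sketch of the two conditions any covariance matrix has to satisfy: symmetry and positive semi-definiteness. The helper name `is_covariance` and both example matrices are hypothetical and are not taken from the assignment.

```r
# Hypothetical helper: TRUE if M is square, symmetric and positive
# semi-definite, the properties every covariance matrix must have.
is_covariance <- function(M, tol = 1e-10) {
  if (!is.matrix(M) || nrow(M) != ncol(M)) return(FALSE)
  if (!isSymmetric(M, tol = tol)) return(FALSE)
  # A symmetric matrix has real eigenvalues; none of them may be negative.
  ev <- eigen(M, symmetric = TRUE, only.values = TRUE)$values
  all(ev >= -tol)
}

# Illustrative matrices (assumptions, not the matrix from the assignment).
ok  <- matrix(c(2, 1, 1, 2), nrow = 2)  # eigenvalues 3 and 1
bad <- matrix(c(1, 2, 2, 1), nrow = 2)  # eigenvalues 3 and -1

is_covariance(ok)
is_covariance(bad)
```

A numerical check like this can only confirm a hand-derived answer for particular values of a and b; the justification itself still has to come from the symmetry and eigenvalue (or determinant) conditions.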

(b) [3 marks] Compute explicitly and without using R, all the eigenvectors and the eigenvalues of the

Deduce from there two orthogonal eigenvectors of norm 1 of that matrix. Give explicitly an orthogonal matrix Γ and a diagonal matrix Λ such that we can write

Σ = ΓΛΓ^T
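
As an illustration of the kind of hand computation part (b) asks for, here is a worked decomposition of a hypothetical 2 × 2 matrix A. The matrix A is an assumption chosen for easy arithmetic; it is not the matrix from the assignment, which is missing from this copy.

```latex
A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix},
\qquad
\det(A - \lambda I) = (2-\lambda)^2 - 1 = (\lambda - 3)(\lambda - 1) = 0
\;\Longrightarrow\; \lambda_1 = 3,\ \lambda_2 = 1.

(A - 3I)v = 0 \;\Longrightarrow\; v \propto \begin{pmatrix} 1 \\ 1 \end{pmatrix},
\qquad
(A - I)v = 0 \;\Longrightarrow\; v \propto \begin{pmatrix} 1 \\ -1 \end{pmatrix}.

\Gamma = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix},
\qquad
\Lambda = \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix},
\qquad
\Gamma \Lambda \Gamma^T
= \frac{1}{2} \begin{pmatrix} 3 & 1 \\ 3 & -1 \end{pmatrix}
              \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
= \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} = A.
```

Dividing each eigenvector by its norm √2 gives the unit-norm columns of Γ, and the two columns are orthogonal because they belong to distinct eigenvalues of a symmetric matrix.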


(c) [1 mark] Read the wheat data in R and create a data matrix X of size n × p, where n = 210 and p = 7, which contains the seven attributes X1 to X7 described above for all n kernels. Then create a vector of length n which contains, for each kernel, the wheat variety it comes from, coded 1 to 3 as described above. If you use the menus in RStudio to read your data, please print out the corresponding instructions (they are given by RStudio).
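
A minimal sketch of one way to do this with `read.table`, assuming the file is whitespace-separated with no header row and the variety code in the eighth column. The real file name and layout on the LMS are not given here, so the path `wheat_file` is hypothetical and the three rows written to it are made-up values that only serve to make the example run on its own.

```r
# Stand-in for the LMS file: three rows of made-up numbers (not real wheat
# measurements) written to a temporary file.
wheat_file <- tempfile(fileext = ".txt")  # hypothetical path
writeLines(c(
  "10.0 12.0 0.85 5.0 3.0 2.0 5.0 1",
  "11.0 13.0 0.86 5.5 3.2 2.5 5.2 2",
  "12.0 14.0 0.87 6.0 3.4 3.0 5.4 3"
), wheat_file)

# Assumption: no header line, columns separated by white space.
wheat <- read.table(wheat_file, header = FALSE)

# Data matrix with the seven numerical attributes.
X <- as.matrix(wheat[, 1:7])
colnames(X) <- paste0("X", 1:7)

# Vector of variety codes (eighth column).
variety <- wheat[, 8]

dim(X)          # n rows, 7 columns
table(variety)  # number of kernels per variety
```

With the real file, `dim(X)` should report 210 rows and 7 columns as stated in the question.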

(d) [1 mark] Using R, for the covariance matrix S of X from (c), give explicitly an orthogonal matrix Γ and a diagonal matrix Λ such that we can write

S = ΓΛΓ^T
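
The decomposition in (d) can be sketched with `eigen`. Because the wheat file is not bundled with this text, the example uses the four numeric columns of R's built-in `iris` data as a stand-in (an assumption made purely for illustration); the same calls would apply to the 210 × 7 wheat matrix.

```r
# Stand-in data: numeric columns of the built-in iris data set.
X <- as.matrix(iris[, 1:4])
S <- cov(X)                      # sample covariance matrix

e <- eigen(S, symmetric = TRUE)  # spectral decomposition of S
Gamma  <- e$vectors              # orthonormal eigenvectors in the columns
Lambda <- diag(e$values)         # eigenvalues on the diagonal, decreasing

# Both quantities should be zero up to rounding error.
max(abs(Gamma %*% Lambda %*% t(Gamma) - S))  # S = Gamma Lambda Gamma^T
max(abs(crossprod(Gamma) - diag(ncol(X))))   # Gamma^T Gamma = I
```

Passing `symmetric = TRUE` tells `eigen` to use the symmetric routine, which returns real eigenvalues and orthonormal eigenvectors.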

Problem 2 [11 marks]

(a) [2 marks] Perform a principal component analysis of the wheat data. Store the eigenvalues of the covariance matrix in a vector called lambda and the eigenvectors in a matrix called gamma. What percentage of the variability of the data does each principal component explain? Also compute the cumulative percentages of variance ψ1, ..., ψ7 defined in class and draw a screeplot for these data. How many principal components does this suggest we should keep?
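
A rough sketch of these computations, again on the built-in `iris` data as a stand-in for the wheat matrix. It assumes that ψ_q is the usual cumulative proportion (λ1 + ... + λq) / (λ1 + ... + λp); the definition given in class is not reproduced here and may differ in scaling.

```r
# Stand-in for the wheat data matrix (assumption for illustration only).
X <- as.matrix(iris[, 1:4])

e <- eigen(cov(X), symmetric = TRUE)
lambda <- e$values    # eigenvalues = variances of the principal components
gamma  <- e$vectors   # eigenvectors, one principal component per column

# Percentage of total variance explained by each component.
pct <- 100 * lambda / sum(lambda)

# Assumed definition of psi: cumulative proportion of variance.
psi <- cumsum(lambda) / sum(lambda)

round(pct, 2)
round(psi, 4)

# Scree plot: eigenvalue against component number.
plot(seq_along(lambda), lambda, type = "b",
     xlab = "Principal component", ylab = "Eigenvalue",
     main = "Scree plot")
```

The usual reading of the scree plot is to keep the components before the curve flattens out, cross-checked against the cumulative proportions in `psi`.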

(b) [3 marks] Give explicitly the linear combinations of the original data used in this example to create the first and second principal components and give an interpretation of these linear combinations, describing which variables play the biggest roles in the construction of those two PCs.
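
The coefficients of each linear combination are the entries of the corresponding eigenvector. The sketch below, on the `iris` stand-in, assumes the principal components are built from the centred data, Y = (X − x̄)Γ, which is the common convention but may not match the one used in class.

```r
# Stand-in for the wheat data matrix (assumption for illustration only).
X <- as.matrix(iris[, 1:4])

gamma <- eigen(cov(X), symmetric = TRUE)$vectors
rownames(gamma) <- colnames(X)
colnames(gamma) <- paste0("PC", seq_len(ncol(gamma)))

# Coefficients of the first two linear combinations.
round(gamma[, 1:2], 3)

# Principal component scores: centred data times the eigenvectors.
Xc <- scale(X, center = TRUE, scale = FALSE)
Y  <- Xc %*% gamma

# Variable with the largest absolute coefficient in PC1 and in PC2.
rownames(gamma)[apply(abs(gamma[, 1:2]), 2, which.max)]
```

The sign of an eigenvector is arbitrary, so an interpretation should rest on the relative sizes and relative signs of the coefficients. When the variables are on very different scales, large coefficients can simply reflect large variances.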

(c) [3 marks] Draw scatterplots of the principal components, using colours to identify different groups of data. Describe what you can extract from those graphs. Which groups are visible on the graph? What do they correspond to? How do the original variables contribute to those groups?
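
One possible way to draw such plots is sketched below. It uses the `iris` data as a stand-in, with the species playing the role of the wheat variety codes 1 to 3; both choices are assumptions made for illustration.

```r
# Stand-in data and group codes (assumption for illustration only).
X <- as.matrix(iris[, 1:4])
group <- as.integer(iris$Species)  # integer codes 1, 2, 3

# Principal component scores from the centred data.
gamma <- eigen(cov(X), symmetric = TRUE)$vectors
Y <- scale(X, center = TRUE, scale = FALSE) %*% gamma

# First two principal components, one colour per group.
plot(Y[, 1], Y[, 2], col = group, pch = 19, asp = 1,
     xlab = "PC1", ylab = "PC2")
legend("topright", legend = levels(iris$Species), col = 1:3, pch = 19)

# All pairs of the first three components at once.
pairs(Y[, 1:3], col = group, labels = paste0("PC", 1:3))
```

Setting `asp = 1` keeps the two axes on the same scale, so distances along PC1 and PC2 are directly comparable on the plot.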

(d) [3 marks] Using the formula given in class, but replacing each population quantity by its empirical estimator, compute the correlation matrix that contains the correlations between each principal component and each original variable. Draw the correlation graph showing the correlations between the original variables X1 to X6 and the first two PCs. For each of the six original variables, use an arrow to represent the correlations with the first two principal components as in the correlation picture shown in class in week 5, and indicate the names of the variables near each arrow as done in the example shown in class. Add to your graph a circle of radius 1 centered at the origin. Use this and the other results of your PC analysis to describe further the results of the principal component analysis, explicitly discussing the original variables, the groups of individuals, and the connection between these two.

Hints: 1) To draw an arrow in R, use the command arrows. 2) To add some text to a graph in R, use the command text(x,y,yourtext), where x and y are the x and y coordinates of where to write your text and yourtext is the text you want to write there. 3) To add a circle to a graph, use

radius <- 1
theta <- seq(0, 2 * pi, length = 200)
lines(x = radius * cos(theta), y = radius * sin(theta))
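
Putting the three hints together, a correlation graph might be sketched as follows. The formula given in class is not reproduced here; this example assumes the standard empirical version, r(Xi, Yj) = γij √λj / √sii, and uses the `iris` data as a stand-in for the wheat data.

```r
# Stand-in for the wheat data matrix (assumption for illustration only).
X <- as.matrix(iris[, 1:4])
S <- cov(X)
e <- eigen(S, symmetric = TRUE)
lambda <- e$values
gamma  <- e$vectors

# Assumed formula: R[i, j] = gamma[i, j] * sqrt(lambda[j]) / sqrt(S[i, i]).
R <- diag(1 / sqrt(diag(S))) %*% gamma %*% diag(sqrt(lambda))
dimnames(R) <- list(colnames(X), paste0("PC", seq_along(lambda)))
round(R[, 1:2], 3)

# Empty square plot, then the unit circle, the arrows and the labels.
plot(0, 0, type = "n", xlim = c(-1.1, 1.1), ylim = c(-1.1, 1.1), asp = 1,
     xlab = "Correlation with PC1", ylab = "Correlation with PC2")
radius <- 1
theta <- seq(0, 2 * pi, length.out = 200)
lines(x = radius * cos(theta), y = radius * sin(theta))
arrows(0, 0, R[, 1], R[, 2], length = 0.1)
text(1.1 * R[, 1], 1.1 * R[, 2], labels = rownames(R))
```

Each row of `R` has squared entries summing to 1 across all components, so an arrow whose tip lies close to the unit circle belongs to a variable that is almost entirely explained by the first two PCs.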
