Date: 2022-10-15 02:46

MATH5855: Multivariate Analysis

Assignment 2

Due date: 5 pm on Tuesday, October 25, 2022

Instructions:

Assignment 2 contains 3 questions and is worth a total of 100 points, which count
towards 15% of the final mark for the subject.

Use tables, graphs and concise text explanations to support your answers. Unclear answers
may not be marked, at your own cost. All tables and graphs must be clearly labelled and
explained.

You may choose to submit two files, the pdf file of the answers and the R markdown file
containing the R code, OR answer all the questions in a single R markdown file.

Questions

Question 1. (Test and confidence region for mean) [20 Marks] Municipal wastewater treatment
plants release their discharges into rivers and streams, and they are required to test the biochemical
oxygen demand (BOD) and suspended solids (SS) of their discharges on a regular basis. There are
some concerns about the reliability of the results provided. To confirm the results, a study was
conducted in which n = 11 samples of effluent were divided and sent to two laboratories for testing.
One half of each sample was sent to the Wisconsin State Laboratory of Hygiene, and the other half was
sent to a private commercial laboratory routinely used in the monitoring program. The data are
displayed in Table 1.

Assume the data follow a multivariate normal distribution. We want to determine whether
there is enough statistical evidence to indicate that the two labs' analysis procedures differ, in
the sense that they produce systematically different results.

(a) Use R to find the p-value for testing the hypothesis H0 : μ1 = μ2 versus H1 : μ1 ≠ μ2,
where μ1 and μ2 are the mean vectors for measurements from the commercial and state
labs, respectively. You can write your own function or use a predefined function in R.
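Because each effluent sample is split between the two labs, one common way to approach such a test is a paired Hotelling T² test on the per-sample differences. The following is only a minimal sketch under that assumption: the function name `paired_hotelling` is hypothetical, and the data are simulated placeholders, not the values from Table 1.

```r
# Paired Hotelling T^2 test on differences d_j = commercial_j - state_j.
# `d` is an n x p matrix of paired differences (one row per sample).
paired_hotelling <- function(d) {
  d <- as.matrix(d)
  n <- nrow(d)
  p <- ncol(d)
  dbar <- colMeans(d)                       # mean difference vector
  S <- cov(d)                               # sample covariance of the differences
  T2 <- n * drop(t(dbar) %*% solve(S, dbar))
  Fstat <- (n - p) / (p * (n - 1)) * T2     # ~ F(p, n - p) under H0
  pval <- pf(Fstat, p, n - p, lower.tail = FALSE)
  list(T2 = T2, F = Fstat, df = c(p, n - p), p.value = pval)
}

# Simulated placeholder measurements (NOT the effluent data):
# 11 samples, columns standing in for BOD and SS.
set.seed(1)
commercial <- matrix(rnorm(22, mean = 30, sd = 10), ncol = 2)
state      <- matrix(rnorm(22, mean = 30, sd = 10), ncol = 2)

res <- paired_hotelling(commercial - state)
print(res)
```

With the real data, `commercial` and `state` would be replaced by the two 11 x 2 blocks of Table 1. A small p-value would suggest a systematic difference between the labs.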

Table 1: Effluent Data.

         Commercial lab    State lab
Sample   BOD    SS         BOD    SS

(b) Use R to find and draw the T² confidence region for μ1 − μ2 at confidence level 95%. Does
this confidence region confirm the result obtained in part (a)?
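For a bivariate difference, a T² confidence region is an ellipse centred at the mean difference. The sketch below traces its boundary in base R, assuming the paired-difference setup and using simulated placeholder differences rather than the Table 1 data. The helper name `t2_ellipse` is hypothetical.

```r
# Boundary of the T^2 confidence ellipse for a bivariate mean difference:
#   n (dbar - delta)' S^{-1} (dbar - delta) <= p (n - 1) / (n - p) * F_{p, n-p}(level)
t2_ellipse <- function(d, level = 0.95, npoints = 200) {
  d <- as.matrix(d)
  n <- nrow(d)
  p <- ncol(d)
  stopifnot(p == 2)
  dbar <- colMeans(d)
  S <- cov(d)
  c2 <- p * (n - 1) / (n - p) * qf(level, p, n - p)
  e <- eigen(S / n, symmetric = TRUE)        # axes of the ellipse
  theta <- seq(0, 2 * pi, length.out = npoints)
  circle <- rbind(cos(theta), sin(theta))    # unit circle, 2 x npoints
  pts <- e$vectors %*% (sqrt(c2 * e$values) * circle)
  contains_zero <- n * drop(t(dbar) %*% solve(S, dbar)) <= c2
  list(center = dbar, boundary = t(pts + dbar), contains_zero = contains_zero)
}

# Simulated placeholder differences (NOT the effluent data)
set.seed(1)
d <- matrix(rnorm(22, mean = 0, sd = 10), ncol = 2)
ell <- t2_ellipse(d)

plot(ell$boundary, type = "l", xlab = "BOD difference", ylab = "SS difference",
     main = "95% T^2 confidence region")
points(ell$center[1], ell$center[2], pch = 19)  # mean difference
points(0, 0, pch = 4)                           # the null value (0, 0)
print(ell$contains_zero)
```

The region excludes the origin exactly when the T² test rejects H0 at the 5% level, which is the link between parts (a) and (b).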

Question 2. (Principal component analysis) [45 Marks] The dataset "consum2007.dat" contains
some information about per capita consumption expenditures of urban households in 31 regions
in China in 2007, Lang and Jin (2021). The variables are the consumption expenditures on food
(Food), clothing (Cloth), residence (Resid), household facilities, articles and services (HousF),
health care and medical services (Health), transport and communication (TranC), education,
culture and recreation (Educ) and miscellaneous goods (Miscel).

(a) Use R to calculate the correlations between the variables. Which pairs of
variables have a correlation different from zero?
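A correlation matrix and pairwise tests of zero correlation can be obtained with base R's `cor` and `cor.test`. This sketch uses a simulated stand-in for the dataset, because the layout of consum2007.dat is not shown here. The object name `consum` and the column order are assumptions.

```r
# Simulated stand-in for consum2007.dat: 31 regions x 8 expenditure variables.
# (Placeholder values only; the real file would be loaded instead.)
set.seed(2)
vars <- c("Food", "Cloth", "Resid", "HousF", "Health", "TranC", "Educ", "Miscel")
consum <- as.data.frame(matrix(rnorm(31 * 8), ncol = 8,
                               dimnames = list(NULL, vars)))

R <- cor(consum)                 # Pearson correlation matrix
print(round(R, 2))

# p-values for the pairwise tests H0: rho = 0
pmat <- matrix(NA_real_, 8, 8, dimnames = list(vars, vars))
for (i in 1:7) {
  for (j in (i + 1):8) {
    pmat[i, j] <- pmat[j, i] <- cor.test(consum[[i]], consum[[j]])$p.value
  }
}
print(round(pmat, 3))
```

With 28 pairs tested at once, some adjustment for multiple comparisons (for example `p.adjust`) may be worth considering when reading the p-values.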

(b) For principal component analysis, do you suggest using the covariance matrix or the
correlation matrix? Why?

(d) What percentage of the variability of the data does each principal component explain? Also
compute the cumulative percentages of variance and draw a scree plot for these data.
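The percentage of variance per component follows from the squared standard deviations returned by `prcomp`. The sketch below runs on simulated placeholder data. Using `scale. = TRUE` (a correlation-matrix PCA) is only an assumption made for illustration, since that choice is what part (b) asks about.

```r
# Simulated stand-in for consum2007.dat (placeholder values only)
set.seed(2)
vars <- c("Food", "Cloth", "Resid", "HousF", "Health", "TranC", "Educ", "Miscel")
consum <- as.data.frame(matrix(rnorm(31 * 8), ncol = 8,
                               dimnames = list(NULL, vars)))

# scale. = TRUE standardises the variables (PCA on the correlation matrix);
# scale. = FALSE would give PCA on the covariance matrix instead.
pca <- prcomp(consum, scale. = TRUE)

var_explained <- pca$sdev^2 / sum(pca$sdev^2)
pc_table <- data.frame(
  PC         = seq_along(var_explained),
  Percent    = 100 * var_explained,
  Cumulative = 100 * cumsum(var_explained)
)
print(pc_table)

screeplot(pca, type = "lines", main = "Scree plot")
```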

(e) Give explicitly the linear combinations of the original data to create the first and second
principal components and give an interpretation of these linear combinations, describing
which variables play the biggest roles in the construction of those two PCs.

(f) Draw the biplot for the first 2 principal components. Describe what you can extract from the
plot.
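The coefficients of the linear combinations are the columns of the `rotation` matrix returned by `prcomp`, and `biplot` draws scores and loadings together. As before, this is a sketch on simulated placeholder data, with the standardised (correlation-matrix) PCA taken as an assumption.

```r
# Simulated stand-in for consum2007.dat (placeholder values only)
set.seed(2)
vars <- c("Food", "Cloth", "Resid", "HousF", "Health", "TranC", "Educ", "Miscel")
consum <- as.data.frame(matrix(rnorm(31 * 8), ncol = 8,
                               dimnames = list(NULL, vars)))

pca <- prcomp(consum, scale. = TRUE)

# Loadings: PC1 = sum_k rotation[k, 1] * (standardised variable k), same for PC2
loadings12 <- pca$rotation[, 1:2]
print(round(loadings12, 3))

# Variables ordered by the size of their contribution to PC1
print(sort(abs(loadings12[, "PC1"]), decreasing = TRUE))

# Biplot of the first two components: points are regions, arrows are variables
biplot(pca, choices = 1:2, cex = 0.7)
```

Since the variables are standardised here, the linear combinations apply to the centred and scaled variables, not the raw expenditures.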

Question 3. (Canonical Analysis) [35 Marks]

(a) Let X and Y be p-variate and q-variate random vectors, respectively. Assume that

Let X* = AᵀX + u and Y* = BᵀY + v, where A and B are non-singular matrices with
properly defined dimensions. Show that the first canonical correlation between X* and Y*
is the same as the first canonical correlation between X and Y and the canonical correlation
vectors are given by a* = A⁻¹a and b* = B⁻¹b, where a and b are the
first canonical correlation vectors of X and Y, respectively.
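One possible line of argument is outlined here. The covariance notation $\Sigma_{11}$, $\Sigma_{22}$, $\Sigma_{12}$ is an assumption introduced for this outline and may differ from the notation in the assignment's own setup.

```latex
% Assumed notation: Cov(X) = \Sigma_{11}, Cov(Y) = \Sigma_{22}, Cov(X, Y) = \Sigma_{12}.
\begin{align*}
\operatorname{Cov}(X^*) &= A^{\top}\Sigma_{11}A, \qquad
\operatorname{Cov}(Y^*) = B^{\top}\Sigma_{22}B, \qquad
\operatorname{Cov}(X^*, Y^*) = A^{\top}\Sigma_{12}B. \\[4pt]
\operatorname{Corr}\bigl(a^{*\top}X^*,\, b^{*\top}Y^*\bigr)
&= \frac{a^{*\top}A^{\top}\Sigma_{12}B\,b^{*}}
        {\sqrt{a^{*\top}A^{\top}\Sigma_{11}A\,a^{*}}\;
         \sqrt{b^{*\top}B^{\top}\Sigma_{22}B\,b^{*}}} \\[4pt]
&= \frac{a^{\top}\Sigma_{12}\,b}
        {\sqrt{a^{\top}\Sigma_{11}\,a}\;\sqrt{b^{\top}\Sigma_{22}\,b}}
 = \operatorname{Corr}\bigl(a^{\top}X,\, b^{\top}Y\bigr),
\qquad a = A a^{*},\; b = B b^{*}.
\end{align*}
% The shifts u and v drop out because adding constants does not change covariances.
% Since A and B are non-singular, (a^*, b^*) \mapsto (A a^*, B b^*) is a bijection,
% so both maximisations range over the same set of correlations: the maxima agree,
% and the maximisers satisfy a^* = A^{-1} a, \; b^* = B^{-1} b.
```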

(b) Consider the provided data in Question 2. Let X and Y denote the sets of variables {Food,
Cloth, Resid, HousF, Miscel} and {Health, TranC, Educ}, respectively. Calculate the canonical
correlations between X and Y and write the first canonical variables in explicit form.
Do they have a clear interpretation?

Which canonical correlations are significant?
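Canonical correlations and coefficient vectors are available from base R's `cancor`. The sketch below uses simulated placeholder data with the variable grouping from part (b). The significance check uses Bartlett's chi-square approximation, which is one common choice and an assumption here, not necessarily the method the course intends.

```r
# Simulated stand-in for consum2007.dat (placeholder values only)
set.seed(2)
vars <- c("Food", "Cloth", "Resid", "HousF", "Health", "TranC", "Educ", "Miscel")
consum <- as.data.frame(matrix(rnorm(31 * 8), ncol = 8,
                               dimnames = list(NULL, vars)))

X <- as.matrix(consum[, c("Food", "Cloth", "Resid", "HousF", "Miscel")])
Y <- as.matrix(consum[, c("Health", "TranC", "Educ")])

cc <- cancor(X, Y)
print(cc$cor)            # canonical correlations, largest first
print(cc$xcoef[, 1])     # coefficients of the first canonical variable for X
print(cc$ycoef[, 1])     # coefficients of the first canonical variable for Y

# Bartlett's approximate test that correlations k+1, ..., s are all zero:
#   -(n - 1 - (p + q + 1) / 2) * sum(log(1 - r_i^2)) ~ chi^2 with (p - k)(q - k) df
n <- nrow(X); p <- ncol(X); q <- ncol(Y)
r <- cc$cor
bartlett <- t(sapply(seq_along(r) - 1, function(k) {
  stat <- -(n - 1 - (p + q + 1) / 2) * sum(log(1 - r[(k + 1):length(r)]^2))
  df <- (p - k) * (q - k)
  c(k = k, statistic = stat, df = df,
    p.value = pchisq(stat, df, lower.tail = FALSE))
}))
print(bartlett)
```

Note that `cancor` centres the variables, so the coefficient vectors apply to the centred X and Y.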
