ENGSCI 211 2022 S1 – Data Analysis Assignment

DUE: Friday 20 May at 11:59pm on Canvas

This assignment requires you to conduct statistical analyses on three data sets.

Preparation and Submission Instructions

Each task should be prepared as a separate document and converted to an HTML or PDF file, which should be submitted to the appropriate Canvas dropbox before the due date. For each task, include your R code and output, followed by your reports.

Clear and succinct communication is an important part of Engineering, regardless of specialisation. We expect that you will write clear and concise English detailing your understanding of the analysis you conducted. In Executive Summaries, this means describing the analysis in context, not using variable names, using units when known, rounding sensibly and not using technical language (e.g. p-value).

Most of the marks in each task are allocated to the Methods and Assumption Checks and Executive Summary. These must be consistent with your R output for credit.

For R code and output, please use a fixed-width font such as Courier New or Consolas.

You may wish to hand-write your Methods and Assumption Checks and Executive Summaries. This is permitted as long as you merge your files so that only one file is submitted per task.

There will be penalties for not following instructions!

Late submissions will be penalised per the policy on Canvas.

Rmarkdown / R Notebooks

This is NOT compulsory.

You may use the method demonstrated in class / in recordings to publish your R Notebooks. Note that Knit PDF only works if you have a LaTeX distribution installed, so knitting to HTML or knitting to Word (and then converting this to PDF) will generally be the easiest methods.

It is completely acceptable to produce your assignment by copying and pasting R code and output directly into a word processor of your choice.

Academic Integrity

By submitting this assignment, you confirm that:

• you understand the University’s policies on cheating, plagiarism and group work.
• you declare that your submission is entirely your own work and reflects your own learning.
• you have not allowed access to any part of the assignment to any other person.

We will be monitoring for academic misconduct and will not hesitate to investigate any suspected cases. Substantial penalties will apply, and will likely result in a delay in the release of your final grade by up to six months. This alone may negatively impact your internship prospects. If misconduct is confirmed, your name will be recorded in the University’s Register of Academic Misconduct for 10 years.

In particular, do not send your files to ANYONE, not even to ‘compare answers’. Once a file leaves your control it may be submitted by your ‘friend’ and leave you liable for misconduct. University procedures consider both giving and receiving files to be academic misconduct, and both will be penalised, regardless of intent. There is no flexibility on this. YOU HAVE BEEN WARNED!

Assistance available

Piazza is the best place to receive assistance from your peers and your lecturer.

Kevin will run office hours. Keep an eye out for Canvas announcements!

However, course staff will NOT answer questions in the 12 hours before the assignment is due.

Therefore, DO NOT LEAVE QUESTIONS TO THE LAST MINUTE.

Tasks

For each task, we expect to see the following, as done in the case studies and discussed in lectures:

• exploratory analysis, including brief comments below the relevant plot(s) and / or summaries
  – this is not printed in your coursebook case studies, but is expected in your assignment!
• checking modelling assumptions via appropriate plots
• appropriate inference, including predictions where required
• reports: Methods and Assumption Checks and an Executive Summary

In your submission, you should include all your R code and output, including all plots produced by R.

Task 1: Tyre Wear (9 marks)

A tyre manufacturer has created a new material formulation that reduces tyre wear (and hence allows the tyres to be used for longer). An experiment was conducted to measure the difference in tyre wear between the new material formulation and the old one: one old and one new tyre were installed on the rear of each of twenty cars, and the distance until each tyre wore out was recorded. We are interested in finding whether there is a difference in the wear-out distance, and in quantifying that difference if there is one.

The file tyredistance.txt contains the following variables:

Car          identifier of car, 1, ..., 20
DistanceNew  wear-out distance for the new design tyre in a particular car, in thousands of km
DistanceOld  wear-out distance for the old design tyre in a particular car, in thousands of km

Hint: consider carefully whether this is a paired-sample analysis or a two-sample analysis.
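As a rough illustration of what the hint is pointing at, the sketch below treats the two tyres on each car as a matched pair and analyses the within-car differences. It is only a sketch under stated assumptions: the data are simulated (the real tyredistance.txt is not read, and its layout is not assumed), the object names `tyres`, `Diff`, `fit_diff` and `fit_paired` are made up for this example, and only base R functions are used rather than anything specific to the course.

```r
# Simulated stand-in for tyredistance.txt; all values are invented for illustration.
set.seed(211)
n_cars <- 20
car_effect <- rnorm(n_cars, mean = 0, sd = 5)  # variation shared by both tyres on a car
tyres <- data.frame(
  Car = 1:n_cars,
  DistanceOld = 50 + car_effect + rnorm(n_cars, sd = 2),
  DistanceNew = 53 + car_effect + rnorm(n_cars, sd = 2)
)

# A paired analysis works on the within-car differences.
tyres$Diff <- tyres$DistanceNew - tyres$DistanceOld

# Exploratory look at the differences.
summary(tyres$Diff)
boxplot(tyres$Diff, ylab = "New - Old wear-out distance (thousands of km)")

# The paired t-test assumes the differences are roughly normal.
qqnorm(tyres$Diff)
qqline(tyres$Diff)

# A one-sample t-test on the differences ...
fit_diff <- t.test(tyres$Diff)
# ... gives the same result as a paired two-sample t-test.
fit_paired <- t.test(tyres$DistanceNew, tyres$DistanceOld, paired = TRUE)

fit_diff$conf.int  # 95% confidence interval for the mean difference
```

Whether pairing is actually appropriate depends on the experimental design described above; a two-sample analysis would ignore the car-to-car variation that both tyres on the same car share.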

Task 2: Pavement Conditions (12 marks)

The quality of a road surface (pavement) deteriorates over time due to wear-and-tear and environmental conditions. It is of interest to quantify how much a pavement deteriorates in a year, on average, in order to inform plans on pavement resurfacing. It is also of interest to estimate the pavement condition index for an individual pavement section that is 15 years old.

The file PavementConds.txt contains the following variables:

Age  age of the pavement section, in years
PCI  pavement condition index, a composite measure of surface deterioration; a higher value means the pavement is in better condition
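The two questions in this task (average deterioration per year, and the condition of one individual 15-year-old section) map naturally onto a slope estimate and a prediction interval. The sketch below shows one way this might look with a simple linear regression. It rests on assumptions: the data are simulated rather than read from PavementConds.txt, a straight-line relationship is assumed without checking it against the real data, and the names `pavement`, `fit`, `new_section` and `pred` are made up for this example.

```r
# Simulated stand-in for PavementConds.txt; all values are invented for illustration.
set.seed(211)
pavement <- data.frame(Age = runif(30, min = 0, max = 25))
pavement$PCI <- 100 - 2 * pavement$Age + rnorm(30, sd = 4)

# Exploratory plot of condition index against age.
plot(PCI ~ Age, data = pavement)

# Simple linear regression of PCI on Age.
fit <- lm(PCI ~ Age, data = pavement)

# Assumption checks: residuals vs fitted values, and a normal Q-Q plot.
plot(fit, which = 1)
plot(fit, which = 2)

summary(fit)
confint(fit)  # the Age row estimates the average change in PCI per year

# Prediction interval for one individual section aged 15 years.
new_section <- data.frame(Age = 15)
pred <- predict(fit, newdata = new_section, interval = "prediction")
pred
```

Note that `interval = "prediction"` is used because the question concerns an individual section; `interval = "confidence"` would instead describe the mean PCI of all sections of that age.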

Task 3: Netflix Movies (16 marks)

A business analyst at Netflix is interested in optimising the assignment of advertising to various TV shows and movies. In a particular project, the analyst wants to determine if there are any differences in the lengths of movies with different age ratings as determined by the Motion Picture Association of America (MPAA), and to quantify any differences detected. The lengths of 20 randomly selected movies with each rating were collected for this analysis.

The file movies.txt contains the following variables:

length  length of the movie, in minutes
rating  rating of the movie, either G, PG, PG-13 or R

More information on MPAA ratings (for interest only, no discussion on this is expected):

https://en.wikipedia.org/wiki/Motion_Picture_Association_film_rating_system

Hints:

• Don’t forget to convert the explanatory variable to a factor.
• A transformation is probably required. You should check for this.
• Only quantify effects when they are statistically significant.
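The hints above can be sketched as a one-way analysis of variance on simulated data. This is an illustration under stated assumptions, not the expected solution: the data are invented rather than read from movies.txt, a log transformation is used as one common candidate (the real data may call for something else, or for none), Tukey's HSD is one of several ways to compare groups, and the names `movies`, `fit_raw`, `fit_log`, `tukey` and `ratios` are made up for this example.

```r
# Simulated stand-in for movies.txt; all values are invented for illustration.
set.seed(211)
ratings <- c("G", "PG", "PG-13", "R")
group_means <- rep(log(c(90, 100, 110, 110)), each = 20)
movies <- data.frame(
  rating = rep(ratings, each = 20),
  length = exp(rnorm(80, mean = group_means, sd = 0.15))
)

# Convert the explanatory variable to a factor.
movies$rating <- factor(movies$rating)

# Exploratory plot: spread that grows with the centre suggests a transformation.
boxplot(length ~ rating, data = movies)

# Compare residual plots before and after a log transformation.
fit_raw <- lm(length ~ rating, data = movies)
plot(fit_raw, which = 1)
fit_log <- lm(log(length) ~ rating, data = movies)
plot(fit_log, which = 1)

# Overall test for any difference between ratings.
anova(fit_log)

# Pairwise comparisons on the log scale, back-transformed to ratios.
tukey <- TukeyHSD(aov(log(length) ~ rating, data = movies))
ratios <- exp(tukey$rating[, c("diff", "lwr", "upr")])
ratios
```

After a log transformation, the back-transformed differences are multiplicative: a ratio of 1.10 would correspond to a typical (median) length about 10% longer in one rating group than in the other. In line with the last hint, only intervals for pairs that differ significantly would be reported.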
