Final Data Analysis Report
Due: December 20, 2022 by 11:59PM ET on Quercus
No late submissions will be accepted
Goal of the Assessment:
Part 3 of the Final Project is your opportunity to demonstrate all that you have learned
throughout the course. This will be done by showing the teaching team that you can use the
methods and techniques learned in the course appropriately. You can use the feedback that
you have received in Parts 1 and 2, as well as in the video project, to write a report in a
common research paper format (IMRD: Introduction, Methods, Results, Discussion). Writing
these kinds of reports is likely something that, as a graduate student or a statistician working in
industry, you will find yourself doing occasionally.
Since this assignment is used to assess how familiar you are with the use of the tools and
methods from this course only, you should NOT use materials that were not covered in this
course. Instead, focus on showing us how much you know about everything we have discussed
throughout the term.
Your final report can also be used as part of a dossier when applying for jobs to showcase your
abilities as a statistician and data analyst.
General Instructions:
Using only methods and techniques presented in the lecture slides throughout the term, you
are tasked with answering your proposed research question by creating the ‘best’ linear
regression model that meets the requirements of your research question. You will then need to
write a report (details below) that (i) introduces your research question and presents some
background, (ii) outlines the steps in your analysis that you followed to reach the ‘best’ model,
(iii) presents the results of your analysis and describes and justifies the decisions you made, and
finally (iv) discusses the final model, its interpretation and its limitations in terms of its ability to
meet your research goals. It should be made clear whether you are aiming for a model that
makes good predictions, or a model that is more descriptive and easier to interpret, or some
combination of both.
The feedback and work you have put into Part 1 of the final project should help you structure
your report in a professional and easy-to-read fashion, as well as provide you with a good
beginning to your introduction section. You may want to consider adding some additional
background research or more discussion about how your research question is important and
different from the background you present. The EDA portion of Part 1 should be helpful in
writing the beginning of the results section, where you display the characteristics of the data
you will use to answer your question.
The feedback and work you have put into Part 2 of the final project should help you structure
the methods section of your report, where you will outline the process you followed and the tools and
methods you used to answer your research question. The feedback should also help you with
how you approach your data analysis itself.
How to present your final report:
Once you have decided upon the ‘best’ model to fulfill the goal of the project, you must write
up a short scientific report. There should be 4 main sections of your report:
Introduction section: where you introduce the purpose and relevance/importance of
the project and provide some relevant background information on the topic (no results
or data should be presented here).
Methods section: where you describe and explain the methods, tools and techniques
used to arrive at your final model (no results or data should be presented here, but you
can tell us where you found your data and what variables it contains).
Results section: where you present a numerical/graphical description of your study
sample and important results that led you to make crucial decisions in building your
model (following the methods you outline in the earlier section), followed by the final
model and any other important results.
Discussion section: where you interpret your final model and describe why it answers
the research question and why it is important, as well as discuss any limitations that still
exist based on your results.
You may use tables and plots to help present your results, but they must be relevant and well-
thought-out to convey as much information as possible without being too overwhelming or
confusing. When explaining your methods, do not simply state that you used a specific
method; also explain how it is used to achieve a specific task. When presenting
your results, avoid repeating exactly what you wrote in your methods section. Instead, focus on
the results of the process you described earlier, and use numerical values/graphical results to
support the decisions you made in arriving at your final model. See the rubric for more
information regarding the various report components.
If you want more information about how to structure your report and what should be
contained in each section, see this cheat sheet and this outline for reports (you may ignore the
abstract portion since you do not need one). Note that not all the elements in these resources
need to be included in your report, but you can use them to better understand how to
structure your submission.
Finally, if you use any resources outside of the lecture slides, e.g. to provide
background on your topic, you should include a reference section at the end of your report. You
may follow APA citation style to format your references. For some resources on how to
cite, see the library page on citations.
What to do if you want to change your dataset or research question:
You may change your dataset or research question from what was originally proposed in
Part 1. To do so, you will need to submit a 1-page written statement proposing the change
(to be submitted by December 4 at 11:59PM ET on Quercus) that answers the following two
questions:
1. Why are you changing your topic or dataset? Elaborate on what made your original
dataset or topic not appropriate for the final project.
2. What makes your new topic and/or dataset more appropriate than the previous one?
Be sure to clearly state your new research question and provide a short, written
description of where you located your dataset and what information it contains.
The instructor will then approve or provide suggestions to improve your new dataset/research
question.
Technical Requirements of the Final Report:
Your report should be typed using whatever software you prefer but must be saved and
submitted as a PDF or .docx file on Quercus. Your report must meet the following
requirements:
Font: 12-point font in a style similar to Times New Roman (this is the default in R
Markdown)
Spacing: single-spaced
Word count: up to a maximum of 1500 words in total. This does not include captions on
figures and tables; however, captions should not be excessively long or contain
information that isn’t mentioned in the main text. We will still accept a report
that exceeds the word limit by no more than 150 words.
Number of tables/figures in the main report: 5 in total, but you may use any
combination of tables and figures
Figure and table captions: every figure and table should include a caption that
describes what is being presented (captions are not included in the word count).
o Captions should not contain information that is not also discussed in the main
report
Figure properties:
o All plots should have an appropriate title and axis labels, avoiding the use of
variable names as they appear in the dataset
o A figure may include multiple individual plots but they should be related to each
other and make sense as to why they are being presented together
§ Avoid having too many plots in the same figure to ensure that they are
legible and clear.
Reference list or bibliography at the end of the report (will not count towards word
count), using appropriate citation style
Appendix: you may add an appendix at the end of your report to include some
additional tables or figures that were not important enough to be part of the main
report, but still relevant to your analysis:
o up to 3 additional tables/figures but they should only be included if they are
relevant to the analysis and are referred to in the main text.
R code: In a separate file (i.e. an RMD file), you should upload the cleaned and complete
version of the R code that was used to conduct your analysis. The R code should be
well-organized and commented appropriately to indicate what each line/section of code is
doing.
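As a rough illustration of the figure properties above, the following sketch draws one figure containing two related plots, each with a descriptive title and axis labels in place of raw variable names. It assumes R's built-in mtcars data purely for illustration (it is not a course dataset), and the label text, the object names axis_labels and panel_titles, and the choice of variables are all hypothetical.

```r
# Hypothetical example using R's built-in mtcars data (not a course dataset).
# Map raw variable names to reader-friendly axis labels.
axis_labels <- c(
  mpg = "Fuel efficiency (miles per gallon)",
  wt  = "Weight (1000 lbs)",
  hp  = "Gross horsepower"
)

# One descriptive title per panel, keyed by the predictor shown in that panel.
panel_titles <- c(
  wt = "Fuel efficiency vs. weight",
  hp = "Fuel efficiency vs. horsepower"
)

old_par <- par(mfrow = c(1, 2))  # two related plots side by side in one figure
for (v in names(panel_titles)) {
  plot(mtcars[[v]], mtcars$mpg,
       main = panel_titles[[v]],
       xlab = axis_labels[[v]],
       ylab = axis_labels[["mpg"]])
}
par(old_par)  # restore the previous plotting layout
```

Keeping the labels in one named vector makes it easier to reuse the same wording across every plot in the report.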
Checklist for submitting final project part 3:
1. Your final written report which follows the requirements above.
2. Your R code that shows your complete analysis (this will be used to verify the results
displayed in your written report and will not be assessed for content).
Things to keep in mind while writing your final report:
o You do not need to write out the results of every step you took in your analysis as this
will make your report too long.
o Instead, focus on summarizing the most important results, especially where a big
decision was made. You need to justify any big decisions.
o For the rest of your results, very short mentions of the process with a brief piece
of evidence provided are enough to allow your reader to follow your analysis and
understand how you arrived at the final model.
o Rather than presenting the results of each step separately (e.g. creating separate tables
for each), consider putting together one larger table that you can refer to in your
discussion of many steps in your analysis so that you don’t use too much space.
o For example, if you are selecting between a few different models, you could
consider presenting a table that includes many different summaries of the fit of
each model and refer to each part as needed in the text, instead of making
individual tables for each component.
o Avoid using R output taken directly from R/RStudio. Instead, create your own tables
where you select only the relevant pieces of the output to display.
o Generally, the methods and results sections tend to be the longest sections, while the
introduction and discussion tend to be shorter.
o Keep this in mind when deciding how much background to provide in your
introduction. Often just a paragraph or two is plenty, given the word limits in this
project.
o However, make sure you leave yourself enough space for a solid discussion
where you can address the impact of the limitations that may exist in your model.
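The suggestions above about combining results into one larger table, and about building your own tables rather than pasting raw R output, could look something like the following minimal sketch. It assumes R's built-in mtcars data and three hypothetical candidate models. The summaries shown (adjusted R-squared, AIC, BIC) are assumptions chosen for illustration, so substitute whichever measures were covered in lecture.

```r
# Hypothetical candidate models fit to R's built-in mtcars data
# (illustration only; not a course dataset or a recommended model set).
candidates <- list(
  "Weight only"                        = lm(mpg ~ wt, data = mtcars),
  "Weight + horsepower"                = lm(mpg ~ wt + hp, data = mtcars),
  "Weight + horsepower + displacement" = lm(mpg ~ wt + hp + disp, data = mtcars)
)

# Pull only the pieces of the output worth reporting, one row per model.
comparison <- data.frame(
  Model      = names(candidates),
  Predictors = sapply(candidates, function(m) length(coef(m)) - 1),
  Adj_R2     = sapply(candidates, function(m) summary(m)$adj.r.squared),
  AIC        = sapply(candidates, AIC),
  BIC        = sapply(candidates, BIC),
  row.names  = NULL
)

# Round the numeric summaries for presentation.
num_cols <- c("Adj_R2", "AIC", "BIC")
comparison[, num_cols] <- round(comparison[, num_cols], 2)

print(comparison)
```

In an R Markdown report, a table like this could be passed to knitr::kable() with a caption, and the text can then refer to individual columns as each decision is discussed.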