BIO773P Analysis Assignment 2022
Introduction
You have been allocated a unique dataset to analyse. To find yours, look at the table
at the end of this document. Your datasets are supplied as .csv files in a zip archive
called “Data.zip”. Having read the description of the way the data were collected, you
need to import each dataset into R. Remember to check the datasets when you import
them for things like proper allocation of each variable as a factor or a numeric
variable, and to carry out proper exploratory data analysis. Once you're happy that
you've got the data into a condition where it can be analysed, fit a model and use that
to try to answer the questions given below. Your writeup should consist of two
documents: a properly annotated R script that will allow me to exactly replicate your
analysis, and the results section of a paper describing your findings. The latter should
be no more than 1000 words (and can be a lot less...) and should not contain any
unnecessary figures or tables: I would recommend no more than two figures. If in
doubt have a look at some papers in (for example) Proceedings B or Journal of
Animal Ecology to see how it's done. Please don’t paste output from R directly into
your report, by which I mean don’t just paste a summary table or an ANOVA table in,
not that you shouldn’t use graphics from R.
The deadline for this work is 5PM on Friday 4th November, submission via QM+
please.
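As a rough illustration of the import-and-check step described above, here is a minimal sketch in R. The column names (nutrient, hyphae, leaf_area) and the values are hypothetical stand-ins written to a temporary file so that the example runs on its own; your own .csv file will have its own name, columns and values.

```r
# Hypothetical stand-in for one of the supplied .csv files.
# The real file name, column names and values will differ.
tmp <- tempfile(fileext = ".csv")
writeLines(c("nutrient,hyphae,leaf_area",
             "high,12,35.2",
             "medium,8,28.1",
             "low,15,19.4",
             "high,9,33.0"), tmp)

fescue <- read.csv(tmp, stringsAsFactors = FALSE)

# Check how each column was read in: numeric, integer or character
str(fescue)

# Look for missing values or values outside a plausible range
summary(fescue)

# Treatments should be factors; setting the level order explicitly
# keeps tables and plots in a sensible order
fescue$nutrient <- factor(fescue$nutrient,
                          levels = c("low", "medium", "high"))
levels(fescue$nutrient)
```

With your own data you would replace the temporary file with the path to your .csv file and then go on to plot the variables before fitting anything.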
Endophytic fungi and plant growth
Many plants harbour endophytic fungi, and these often seem to exist in mutualistic
relationships with their hosts: plants harbouring endophytes have increased growth
and vigour and the fungus benefits from a ready supply of sugar and a protected place
to live. Some recent work, however, has suggested that this mutualism only holds
under some environmental conditions and that when nutrients are limiting it can break
down. Here, you have the data from two experiments investigating the nature of this
mutualism in red fescue grass, Festuca rubra. In the experiment, seeds of the grass
from a cultivar that is known to house endophytic fungi were grown in three nutrient
treatments: a high nutrient condition with 10 ml of 3 g L⁻¹ 20-20-20 NPK fertiliser
applied biweekly, a medium nutrient condition with 3 ml of fertiliser and a low
nutrient condition which received tap water.
After 11 weeks all plants were harvested by cutting at soil level. A sample was taken
of the leaf sheath, mounted on a microscope slide, stained with aniline blue and
examined at ×400 on a compound microscope. The number of hyphae present in three
800 µm diameter fields of view was counted as an indication of the degree of
infection of each plant. Leaf area was measured by arranging the plant on a clean flat
surface, photographing the plant next to a scale and then using the image analysis
programme ImageJ to measure the area.
Report
For the report, please use an appropriate analysis to produce a results section which
addresses the following questions:
1) How is leaf area related to the degree of infection in F. rubra?
2) Is this relationship dependent on nutrient supply, and if so how?
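One way of asking whether a relationship between two continuous variables depends on a factor is to compare a model with an interaction term against one without it. The sketch below does this on simulated data: the variable names, sample sizes and slopes are all invented for illustration, and a linear model is an assumption here rather than a statement about what is appropriate for your dataset. That is for you to decide from your exploratory analysis and diagnostics.

```r
# Simulated data only: names, sample sizes and effect sizes are invented
set.seed(42)
n_per_group <- 20
nutrient_levels <- c("low", "medium", "high")

sim <- data.frame(
  nutrient = factor(rep(nutrient_levels, each = n_per_group),
                    levels = nutrient_levels),
  hyphae = rpois(3 * n_per_group, lambda = 10)
)

# Give each treatment its own (made-up) slope of leaf area on hyphal count
true_slope <- c(low = -0.5, medium = 0.2, high = 0.8)
sim$leaf_area <- 30 +
  unname(true_slope[as.character(sim$nutrient)]) * sim$hyphae +
  rnorm(nrow(sim), sd = 2)

# Model without and with the hyphae-by-nutrient interaction
additive_fit <- lm(leaf_area ~ hyphae + nutrient, data = sim)
interaction_fit <- lm(leaf_area ~ hyphae * nutrient, data = sim)

# A single partial F-test of the interaction term
comparison <- anova(additive_fit, interaction_fit)
print(comparison)

# Estimated intercepts and slopes, with "low" as the reference level
coef(interaction_fit)
```

Diagnostic plots for a fitted model of this kind can be drawn with plot(interaction_fit); these belong in your script rather than in the results section.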
General feedback from previous years
Overall, I was impressed with the standard of most of these reports. Most of you
managed to get the analyses more or less right, and most of you produced good-
quality scripts. None of them had too much annotation - remember that when you’re
annotating a script you’re really writing a guide to yourself describing what it does.
Think about the information that might be useful if you come back to the analysis a
year or two in the future.
Regarding the “results section” of the reports, these were a lot more variable. Many of
you put too much analysis into these, and in particular included material that would
not normally go in a journal results section. Preliminary and exploratory analysis,
diagnostic plots and the like are not generally included in journal results sections.
Many of you had redundant graphs – you shouldn’t show the reader the same set of
data twice. A common problem was too much focus on the statistics and not enough
on what the statistical results mean in terms of the biology of the system, or in terms
of effect sizes: don’t just tell me that there’s an effect, tell me how big the effect is,
and put it in meaningful biological terms.
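To illustrate what reporting an effect size might look like, the sketch below uses the cars dataset that is built into R (stopping distance in feet against speed in miles per hour) rather than anything to do with this assignment. It extracts a slope and its 95% confidence interval and turns them into a sentence expressed in the units of the measurements; the wording of the sentence is only a suggestion.

```r
# Built-in example data: stopping distance (ft) against speed (mph)
fit <- lm(dist ~ speed, data = cars)

# Estimated slope and its 95% confidence interval
slope <- coef(fit)["speed"]
ci <- confint(fit, "speed", level = 0.95)

# Express the effect in the units of the variables, not just as a p-value
sentence <- sprintf(
  "Stopping distance increased by %.1f ft for each additional mph (95%% CI %.1f to %.1f).",
  slope, ci[1, 1], ci[1, 2]
)
cat(sentence, "\n")
```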
Figure captions are something that most of you need to work on. When you’re writing
a figure caption, it’s a good idea to try to write it so that a casual reader who is
skimming through the paper can look at the figure, have a look at the caption and
have at least a rough idea about what the figure is showing. That doesn’t mean that
each figure should have the whole methods section reproduced, but you can include a
sentence or two that gives the casual reader a basic idea of what’s going on.
Other common problems included:
• Including p-values without test statistics or degrees of freedom
• Including multiple tests of the same thing (e.g. using a post-hoc test on a fitted
  model, collapsing two factor levels and then comparing models with a partial
  F-test)
• Carrying out tests for normality on data prior to analysis (I explained in one of
  the lectures why this is a bad idea)
• Mixing p-values and AIC as criteria for model selection - the philosophy
  behind these is very different and you should use one or the other but not both.
• Giving significance levels for main effects when they are also included in
  higher-order interaction terms
• Including graphs showing no effect. There are circumstances when you might
  want to do this, if you’re reporting a negative result and you think it’s
  necessary to make a point about how little relationship there is, but in general
  we wouldn’t put this sort of thing in.
• Including code, function names etc. from R. Results sections from journals
  wouldn’t usually include this kind of material unless you’re describing an
  esoteric analysis that the readers will not be familiar with, in which case you
  might say “We fitted a generalised additive model to the data using the gam()
  function as implemented in the mgcv package (Wood 2014)” but usually you
  would just tell the reader the type of analysis used.
• Referring to “non-significant” results as “insignificant”. Don’t do this - have a
  think about why.
• No references! Not a single person put a reference in their results section. You
  don’t necessarily need references in a results section but you need them for
  any software you use, including R packages that aren’t part of the base R
  installation and for any methods which are not sufficiently common as to be
  considered standard. As an example, you wouldn’t use a reference for a
  standard linear model but you might for something like an MCMC GLMM.
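Two of the points above, reporting the test statistic and degrees of freedom alongside a p-value and citing software, can be sketched in R as follows. The example again uses the built-in cars dataset purely for illustration, and the layout of the reported string is one common convention rather than a required format; check the style of the journal you are imitating.

```r
# Built-in example data, used only to have a fitted model to report
fit <- lm(dist ~ speed, data = cars)
tab <- anova(fit)

# Pull out the pieces a reader needs alongside the p-value
f_val <- tab["speed", "F value"]
df_num <- tab["speed", "Df"]
df_den <- tab["Residuals", "Df"]
p_val <- tab["speed", "Pr(>F)"]

# Very small p-values are reported as "<0.001" rather than in full
report <- sprintf("F(%d, %d) = %.2f, p = %s",
                  df_num, df_den, f_val,
                  format.pval(p_val, digits = 2, eps = 0.001))
cat(report, "\n")

# The reference to give for R itself
r_ref <- citation()
print(r_ref)

# For an add-on package, pass its name, e.g. citation("mgcv"),
# provided that package is installed
```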