R: Assignment 2
Background
• This assignment is worth 50% of your course grade. You can complete
it yourself, or in groups of two or three people (each group member will
receive the same grade).
• The due date will be advised via Blackboard.
• In answering the questions properly, you will need to consult the Part 2
notes for the course, use the R help file and think carefully. Do not trawl the
internet looking for inspiration – that is a waste of time.
• There are three main sections: 1, 2 and 3. Answer all questions.
• Section 4 is a “substitute question”. If you choose this option you can do
only two questions from 1, 2 or 3, rather than three.
• Each question will be marked out of 10. The score will be determined
holistically and subjectively (à la marking an essay), so there are no pre-set
levels of marks for any particular parts of a question.
• Do all calculations and computer code with R.
Instructions
• Be creative. Don’t spend too long doing this assignment.
• No inter-group copying or extensive answer sharing.
• Ask me for help if you get muddled, but not before
you have carefully read the assignment and discussed problems with your
group members.
• Your answers should include a number of Word (or PDF) files and R script
(.R) files. Submit all files in one .zip file.
• Email me your answers directly; in the subject line
put “LND” followed by your student IDs (e.g. LND 2011213, 2011215,
2011214). Repeat your IDs and full names in the email text.
• I do not want to see spreadsheets or CSV files. Write your data outputs
to .csv files from R, format them as you wish in Excel, and then
integrate the formatted outputs into a Word (or PDF) file with the
related text.
• Provide one Word (or PDF) file that explains who your team members are
(student IDs and names) and outlines which questions you have selected.
Call this file “overview.doc” or something else obvious.
• Provide a separate Word (or PDF) file for each question you answer. Call
them something logical and obvious e.g. “answer1.doc”. In each file, briefly
outline how you went about answering the question, including references to
attached R scripts and functions. This is also where you should display the
following: any tables of numbers (fully annotated and nicely formatted); any
general text relating to your analysis; and your specific answers to different
parts of the questions.
• Don’t forget to include all of your R scripts and functions in the “.zip” file.
• I have deliberately left the specific requirements of each question vague.
Decide for yourself how to express your answers to me. Obviously, if the
question simply asks you to do something in R, then all you need to do is give
me your scripts and/or functions and in your answer document say that the
code does what the question asks. Where you need to make substantive
conclusions from results or data summaries, show me a table of numbers
and/or a graph, describe in words what is being displayed and say why you
conclude what you do (and also give me the R code).
Assignment Files
• I have included a set of auxiliary computer files to be used in conjunction
with this assignment. I will refer to these as the “Assignment Files”.
Assignment Files
Functions
    fsumstats       As in the notes (adjusts for NA values)
    fmoment         Called by fsumstats
    freg            Unfinished regression function
    fregquick       Same as freg but with $R^2$
Data Spreadsheets
    assetsq2.csv    Asset descriptions in CSV format.
    pricesq2.csv    Prices in CSV format.
• You can install all the files in your R working directory if you so wish (but my
strong suggestion is to use a simple structure in which you have “functions”
and “data” as separate folders, with the folder paths appropriately added
in your R code).
1 Returns, Portfolios and Summary Statistics
The questions in this section relate to the type of analysis introduced in chapters
1 and 2 of the Part 2 notes.
1.1 Statistical Summaries
Extend the statistical summary function “fsumstats” found in the Assignment
Files. Make at least the following additions.
a. Summaries for the minimum, maximum and the quantiles at 0.025, 0.5
and 0.975 (make sure you give each row an appropriate name).
b. Autocorrelation estimates: use several lags that you think are practical. For
the autocorrelation function, follow the details for $\hat{\rho}(k)$ from section 2.12 in
the Part 2 notes.
c. P-values for the autocorrelation estimates. Assume that if the true autocorrelation
is zero then $\sqrt{T}\,\hat{\rho}(k) \sim N(0, 1)$, where $T$ is the number of observations.
Required: Make the p-values simply the value of the CDF at the
test statistic under the null (like the ones for kurtosis and skewness) and
assume that the underlying null hypothesis is that the true autocorrelation
is zero.
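As a minimal sketch of items a to c (not the required extension of “fsumstats” itself), the snippet below computes the quantiles, a few autocorrelation estimates and their p-values on simulated data. It assumes the usual sample autocorrelation formula, which may differ in detail from $\hat{\rho}(k)$ in section 2.12 of the Part 2 notes; the helper name facf_k and the simulated series are hypothetical.

```r
# Hypothetical helper, not part of the Assignment Files.
# Assumed formula: sum of lagged cross-products of the demeaned data
# divided by the total sum of squares (check against section 2.12).
facf_k <- function(x, k) {
  x <- x[!is.na(x)]                 # drop NA values first
  n <- length(x)
  d <- x - mean(x)
  sum(d[(k + 1):n] * d[1:(n - k)]) / sum(d^2)
}

set.seed(1)
r <- rnorm(500)                     # stand-in for a return series
lags <- 1:3                         # illustrative choice of lags
rho <- sapply(lags, function(k) facf_k(r, k))
z <- sqrt(length(r)) * rho          # test statistic, N(0, 1) under the null
pval <- pnorm(z)                    # value of the CDF at the test statistic

# Quantile rows for item a, with names attached by quantile()
q <- quantile(r, c(0.025, 0.5, 0.975), na.rm = TRUE)

# Rows that could be appended to a summary table
out <- rbind(rho = rho, pvalue = pval)
colnames(out) <- paste0("lag", lags)
```

Inside “fsumstats” the same calculations would be applied column by column, with one named row per statistic.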
1.2 Practical Regression
For this question use the function “fregquick” from the Assignment Files. For
the base data use the “pricesq2.csv” and “assetsq2.csv” files, also in the Assignment
Files.
Part A
a. First convert all the equity prices (total return indices that include capital
changes and dividends) and the FRCAC40 index (including dividends) to log
returns. Then calculate excess returns relative to TRBD3MT (a proxy
for the risk free rate). Required: Dynamically select the columns for the
various series you require, using the details in “assetsq2.csv”, i.e. your code
should work even if the column structure were different (assets and indices
in a different order, for example) or more or fewer time periods were included.
Hint: Don’t forget to transform TRBD3MT to the correct number of decimal
places.
b. Run regressions for the excess returns of every French stock (y-variable)
versus the excess return for the CAC40 (x-variable), over the entire period
contained in the data. Required: Store the estimated parameters (intercept
and slope) and $R^2$ values for each asset.
c. Do the same regressions using what you think are interesting subsets of the
data through time based on when the VSTOXXI index is high (you decide
what “high” is). Required: Store the estimated parameters (intercept and
slope) and $R^2$ values for each asset.
d. Do the same regressions using what you think are interesting subsets of
the data through time based on when the “US SMOOTHED US RECESSION
PROBABILITIES NADJ” index is high (you decide what “high” is). Required:
Store the estimated parameters (intercept and slope) and $R^2$ values
for each asset.
e. Calculate, report and comment upon the summary statistics of the estimated
parameters and $R^2$ in each case. Required: Use your version of
“fsumstats”.
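The workflow in Part A can be sketched on toy data as follows. Everything here is an assumption made for illustration: the two data frames stand in for “pricesq2.csv” and “assetsq2.csv” (whose real layout is not shown in this document), the column names code and type are hypothetical, the risk free rate is assumed to be quoted in percent per year on monthly data, and lm() is used in place of “fregquick”.

```r
# Toy stand-ins for the two CSV files; all names below are hypothetical
set.seed(42)
nper <- 60
prices <- data.frame(
  STOCKA  = 100 * exp(cumsum(rnorm(nper, 0.005, 0.05))),
  TRBD3MT = runif(nper, 0.5, 2),     # assumed: percent per year
  FRCAC40 = 100 * exp(cumsum(rnorm(nper, 0.004, 0.04))),
  STOCKB  = 100 * exp(cumsum(rnorm(nper, 0.006, 0.06)))
)
assets <- data.frame(
  code = c("STOCKA", "TRBD3MT", "FRCAC40", "STOCKB"),
  type = c("equity", "riskfree", "index", "equity"),
  stringsAsFactors = FALSE
)

# Select columns by looking them up in the asset table, never by position
eq_cols <- assets$code[assets$type == "equity"]
ix_col  <- assets$code[assets$type == "index"]
rf_col  <- assets$code[assets$type == "riskfree"]

logret <- function(p) diff(log(p))

# Assumption: monthly data, rate known at the start of each period
periods_per_year <- 12
rf <- head(prices[[rf_col]], -1) / 100 / periods_per_year

ex_eq <- sapply(prices[eq_cols], logret) - rf   # one column per stock
ex_ix <- logret(prices[[ix_col]]) - rf

# Regress each stock's excess return on the index excess return
fit_one <- function(y, x) {
  fit <- lm(y ~ x)
  c(intercept = unname(coef(fit)[1]),
    slope     = unname(coef(fit)[2]),
    rsq       = summary(fit)$r.squared)
}
results <- t(apply(ex_eq, 2, fit_one, x = ex_ix))
```

For the sub-period regressions, a logical vector marking the “high” periods can be used to subset the rows of ex_eq and ex_ix before calling fit_one in the same way.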
2 Regression and Optimisation
The questions in this section relate to chapters 3 and 4 of the Part 2 notes.
2.1 Technical Regression
# create same model as before
n<-100 ; set.seed(999) ; x<-cbind(rep(1,n),rnorm(n,.1,.4))
e<-rnorm(n,0,.2) ; b<-rbind(0,1) ; y<-x%*%b+e
The code above repeats the fake data that we used in the Part 2 notes for
checking regression estimates. Use this in your answers to the questions below.
Part A
Let the following formula define an objective function to minimize the sum of the
absolute value of errors from a linear regression.
$\min_{\hat{b}} \left[ \left( |y - X\hat{b}| \right)' \iota \right]$ (1)
where ι is a vector of ones and the absolute value operation is for each element
of the vector within it.
For the avoidance of doubt, a simple regression model based on equation (1)
can be re-written as
$\min_{\hat{\beta}_0, \hat{\beta}_1} \sum_{i=1}^{T} |y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i|$ (2)
a. Find the “optimal” betas for a regression model that has the objective function
in formula (2), using the fake data defined above.
Hint: Follow the example for minimising the sum of squared errors in the Part
2 notes.
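One possible way to set this up, sketched under my own assumptions rather than copied from the Part 2 notes: write the objective in formula (2) as an R function and hand it to optim(). The function name fsad, the least-squares starting values and the default Nelder-Mead search are illustrative choices.

```r
# Fake data from the question
n <- 100; set.seed(999); x <- cbind(rep(1, n), rnorm(n, .1, .4))
e <- rnorm(n, 0, .2); b <- rbind(0, 1); y <- x %*% b + e

# Objective (2): sum of absolute errors for a candidate coefficient vector
# (fsad is a hypothetical name)
fsad <- function(bhat, y, x) sum(abs(y - x %*% bhat))

# Start from the least-squares solution, then search numerically.
# Nelder-Mead (the optim default) needs no derivatives, which suits
# an objective with kinks such as the absolute value.
b_ols <- solve(t(x) %*% x, t(x) %*% y)
fit <- optim(par = as.vector(b_ols), fn = fsad, y = y, x = x)
b_lad <- fit$par        # estimated intercept and slope
```

Comparing fsad at b_lad with fsad at b_ols is a quick check that the search has actually reduced the objective.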
Part B
Add the following components to the output structure variable in the regression
function “freg” from the Assignment Files. The section references refer to the
formulas given in the Part 2 notes.
a. yhat from the $\hat{y}$ in section 3.3.
b. e from estimated errors in section 3.3.
c. rsq from $R^2$ in section 3.9.
d. rsqadj from $\bar{R}^2$ in section 3.9.
e. covbeta from $\mathrm{cov}(\hat{b})$ in section 3.9.
f. tbeta from $t_i$, for every $\hat{b}_i$, from section 3.9.
Hint: Start with the list of variables contained in “freg” and then add out$yhat
and so on.
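The quantities in this list can be sketched outside “freg” as follows. This assumes the textbook OLS formulas, with the error variance estimated using n - k degrees of freedom; check each line against sections 3.3 and 3.9 of the Part 2 notes, which may define things slightly differently. The field name bhat and the helper variables are hypothetical, since the contents of “freg” are not shown here.

```r
# Fake data from the question
n <- 100; set.seed(999); x <- cbind(rep(1, n), rnorm(n, .1, .4))
e <- rnorm(n, 0, .2); b <- rbind(0, 1); y <- x %*% b + e

out <- list()
k <- ncol(x)                              # number of regressors
xtxi <- solve(t(x) %*% x)                 # (X'X)^{-1}
out$bhat <- xtxi %*% t(x) %*% y           # OLS estimates (hypothetical name)

out$yhat <- x %*% out$bhat                # fitted values
out$e <- y - out$yhat                     # estimated errors
sse <- sum(out$e^2)                       # sum of squared errors
sst <- sum((y - mean(y))^2)               # total sum of squares
out$rsq <- 1 - sse / sst
out$rsqadj <- 1 - (1 - out$rsq) * (n - 1) / (n - k)
out$covbeta <- (sse / (n - k)) * xtxi     # assumed: s^2 uses n - k
out$tbeta <- out$bhat / sqrt(diag(out$covbeta))
```

A comparison with summary(lm(y ~ x[, 2])) is a convenient way to confirm the numbers.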
Part C
The following assumptions describe a maximum likelihood set-up for a simple
linear regression that assumes the errors are IID Normal.
Table 1: Assumptions for a Linear Regression MLE
1. The data are y and x from the fake data defined above.
2. The linear regression equation is $y_i = \beta_0 + \beta_1 x_i + \epsilon_i$.
3. The errors, $\epsilon_i$, are independently and identically Normally distributed.
4. We have $\epsilon_i = y_i - \beta_0 - \beta_1 x_i$.
5. The applicable Normal distribution is $f(\epsilon_i) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{\epsilon_i^2}{2\sigma^2}\right)$.
6. The likelihood function for IID Normals is $L = \prod_{i=1}^{n} f(\epsilon_i)$.
7. We do not know $\sigma$, $\beta_0$ or $\beta_1$: we seek to estimate them such
that $L$ is maximised.
a. Solve the maximum likelihood equation (“MLE”) for the regression model
defined by the assumptions in Table 1 with a numerical optimisation.
Hint: Follow the MLE example in the Part 2 notes, changing only the pieces
that you need to.
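A rough sketch of such a numerical optimisation, under assumptions of my own: the log of the likelihood is minimised with a sign change (equivalent to maximising L), σ is searched on the log scale so that it stays positive, and optim() with BFGS is used. The function name fnll and the starting values are hypothetical and are not taken from the Part 2 notes.

```r
# Fake data from the question
n <- 100; set.seed(999); x <- cbind(rep(1, n), rnorm(n, .1, .4))
e <- rnorm(n, 0, .2); b <- rbind(0, 1); y <- x %*% b + e

xv <- x[, 2]            # regressor without the column of ones
yv <- as.vector(y)

# Negative log-likelihood; theta = (beta0, beta1, log(sigma))
fnll <- function(theta, y, x) {
  eps <- y - theta[1] - theta[2] * x
  -sum(dnorm(eps, mean = 0, sd = exp(theta[3]), log = TRUE))
}

start <- c(mean(yv), 0, log(sd(yv)))   # crude starting values
fit <- optim(start, fnll, y = yv, x = xv, method = "BFGS")
est <- c(beta0 = fit$par[1], beta1 = fit$par[2], sigma = exp(fit$par[3]))
```

With Normal errors the estimated betas should agree with least squares, and the estimated σ should be close to the square root of the mean squared residual.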
3 Simulation
The background material for simulation was covered in Chapter 5.
3.1 Option Valuation via Simulation
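Consider the option to sell an asset at time T to someone at the maximum
price of the asset between 0 and T. This is a so-called Asian put option
(elsewhere usually called a lookback put, because the payoff depends on the
maximum price rather than an average price), which depends on the price
path of the asset (as discussed in class).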
Assume that the underlying asset follows geometric Brownian motion, as per
chapter 5 of the Part 2 notes.
Let the risk-neutral valuation formula (Campbell, Lo and MacKinlay (1997), section
9.4) of an Asian put option be
$H(0) = e^{-\mu_{rf} T}\, E^{*}\left[\max_{0 \le t \le T} P(t)\right] - P(0)$, (3)
where $H$ is the option, $P$ is the price of the asset, $\mu_{rf}$ is the risk free rate, the
“max” function applies to any price between the start ($t = 0$) and end ($t = T$) of
the option and the expectation is with respect to “risk neutral probabilities”.
The analytic solution (Goldman, Sosin and Gatto, 1979) to equation (3) is
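$H(0) = P(0)\, e^{-\mu_{rf} T}\, \phi\left(\frac{-\alpha T}{\sigma\sqrt{T}}\right) \left[1 - \frac{\sigma^2}{2\mu_{rf}}\right] - P(0) + P(0) \left(1 + \frac{\sigma^2}{2\mu_{rf}}\right) \left[1 - \phi\left(\frac{-(\alpha + \sigma^2) T}{\sigma\sqrt{T}}\right)\right]$ (4)
where $\phi(.)$ is a function that represents the cumulative N(0, 1) distribution, $\sigma$ is
the asset’s standard deviation and $\alpha = \mu_{rf} - \sigma^2/2$.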
Let the option for this question be defined by the assumptions in Table 2.
Part A
a. Value an Asian put option with equation (4). Use the assumptions in Table 2.
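Equation (4) translates into R almost term by term. The sketch below is one way to write it, with pnorm() playing the role of ϕ(.); the function name fanalytic and its argument names are hypothetical, and the inputs are the Table 2 assumptions.

```r
# Analytic value from equation (4); fanalytic is a hypothetical name
fanalytic <- function(P0, Tlen, rf, sigma) {
  alpha <- rf - sigma^2 / 2
  sT <- sigma * sqrt(Tlen)
  P0 * exp(-rf * Tlen) * pnorm(-alpha * Tlen / sT) *
    (1 - sigma^2 / (2 * rf)) -
    P0 +
    P0 * (1 + sigma^2 / (2 * rf)) *
    (1 - pnorm(-(alpha + sigma^2) * Tlen / sT))
}

# Table 2: price 15, three years, 0.5% risk free rate, 25% volatility
H0 <- fanalytic(P0 = 15, Tlen = 3, rf = 0.005, sigma = 0.25)
```

Because the formula divides by the risk free rate, it cannot be evaluated at a rate of exactly zero.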
Part B
Table 2: Assumptions for an Asian Put Option
1. Starting price of €15.
2. Period length of 3 (three years).
3. Risk free rate of 0.5% per year.
4. Asset standard deviation of 25% per year.
5. Strike price set by the maximum of the price over the 3 years.
a. Value the Asian put option, as above, by using price path simulations. Use
10,000 simulations. Required: Use equation (3) to define your valuation.
Use n = 100 for the number of steps. Hint: Follow the code from chapter
5 of the Part 2 notes, changing only what you need to.
b. Increase the number of steps (n) and the number of simulations until you
are sure that your answer is converging to the analytic results given by
equation (4). Required: Report your simulated option values for different
levels of n and simulations.
c. Explain why increasing the number of steps brings the simulated values closer
to the analytic solution.
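A minimal sketch of the simulation in Part B, written from scratch rather than taken from the chapter 5 code, so the structure and the names (fsimvalue and its arguments) are assumptions. It simulates log prices under risk-neutral geometric Brownian motion on a grid of n steps, takes the maximum along each path including the starting price, and applies equation (3).

```r
# Simulated value from equation (3); fsimvalue is a hypothetical name
fsimvalue <- function(P0, Tlen, rf, sigma, n, nsim) {
  dt <- Tlen / n
  # One column per simulated path, one row per time step
  z <- matrix(rnorm(n * nsim), nrow = n, ncol = nsim)
  dlogp <- (rf - sigma^2 / 2) * dt + sigma * sqrt(dt) * z
  logp <- apply(dlogp, 2, cumsum)           # log(P(t) / P(0)) on the grid
  maxlog <- pmax(0, apply(logp, 2, max))    # 0 keeps P(0) in the maximum
  exp(-rf * Tlen) * mean(P0 * exp(maxlog)) - P0
}

set.seed(123)
v100 <- fsimvalue(P0 = 15, Tlen = 3, rf = 0.005, sigma = 0.25,
                  n = 100, nsim = 10000)
```

The maximum is only observed at the n grid points, so peaks between grid points are missed and a coarse grid tends to give a value below the analytic one; rerunning with larger n and nsim shows how quickly the gap closes.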
4 Substitute Question
You are welcome to replace one of the questions from Sections 1, 2 or 3 with
one of your own creation, subject to my prior approval!
If you want to take this option then please talk to me first. The idea is that you
can base your substitute question on a part of the notes that you find interesting
but isn’t covered in the other questions. Good topics include: searching for
momentum and reversal effects in asset returns, non-parametric analysis, nonlinear
regression and bootstrapping. Or maybe you have a different dataset that
you wish to analyse.