联系方式

  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-21:00
  • 微信:codinghelp

您当前位置:首页 >> Python编程Python编程

日期:2022-10-28 09:52

T2187 Business analytics, applied modelling and prediction

Important note

This commentary reflects the examination and assessment arrangements for this course in the

academic year 2019–20. The format and structure of the examination may change in future years,

and any such changes will be publicised on the virtual learning environment (VLE).

Information about the subject guide and the Essential reading

references

Unless otherwise stated, all cross-references will be to the latest version of the course (2019). You

should always attempt to use the most recent edition of any Essential reading textbook, even if the

commentary and/or online reading list and/or subject guide refer to an earlier edition. If different

editions of Essential reading are listed, please check the VLE for reading supplements – if none are

available, please use the contents list and index of the new edition to find the relevant section.

General remarks

Learning outcomes

At the end of the course and having completed the essential reading and activities you should be

able to:

apply modelling at varying levels to aid decision-making

understand basic principles of how to analyse complex multivariate datasets with the aim of

extracting the important message contained within the large amount of data which is often

available

demonstrate the wide applicability of mathematical models while, at the same time,

identifying their limitations and possible misuse.

Format of the examination

The examination is two hours long and you must answer four questions out of five. The examination

is worth 70% of the final grade. The other 30% is determined by the coursework component. (The

coursework comprised a freestyle data visualisation project using Tableau, requiring the design of a

five-dashboard story and accompanying report – see the ‘Assessment’ section in the VLE for details.)

Overall performance

The performance of candidates in the examination was very pleasing, with many excellent answers.

Use of Excel is not directly examined, rather some questions required the interpretation of Excel

1

ST2187 Business analytics, applied modelling and prediction

output and/or formulae. Some answers lacked sufficient depth of explanation – remember to

comment in detail on each argument of a function.

Although this is an applied statistics course, candidates are reminded that commercial insight is also

important. Always think about which business decisions could be taken as a consequence of the

analysis, justifying the decision(s) based on the results – the course is about business analytics after

all!

Examination revision strategy

Many candidates are disappointed to find that their examination performance is poorer than they

expected. This may be due to a number of reasons, but one particular failing is ‘question

spotting’, that is, confining your examination preparation to a few questions and/or topics which

have come up in past papers for the course. This can have serious consequences.

We recognise that candidates might not cover all topics in the syllabus in the same depth, but you

need to be aware that examiners are free to set questions on any aspect of the syllabus. This

means that you need to study enough of the syllabus to enable you to answer the required number of

examination questions.

The syllabus can be found in the Course information sheet available on the VLE. You should read

the syllabus carefully and ensure that you cover sufficient material in preparation for the

examination. Examiners will vary the topics and questions from year to year and may well set

questions that have not appeared in past papers. Examination papers may legitimately include

questions on any topic in the syllabus. So, although past papers can be helpful during your revision,

you cannot assume that topics or specific questions that have come up in past examinations will

occur again.

If you rely on a question-spotting strategy, it is likely you will find yourself in difficulties

when you sit the examination. We strongly advise you not to adopt this strategy.

2

Examiners’ commentaries 2020

Examiners’ commentaries 2020

ST2187 Business analytics, applied modelling and prediction

Important note

This commentary reflects the examination and assessment arrangements for this course in the

academic year 2019–20. The format and structure of the examination may change in future years,

and any such changes will be publicised on the virtual learning environment (VLE).

Information about the subject guide and the Essential reading

references

Unless otherwise stated, all cross-references will be to the latest version of the course (2019). You

should always attempt to use the most recent edition of any Essential reading textbook, even if the

commentary and/or online reading list and/or subject guide refer to an earlier edition. If different

editions of Essential reading are listed, please check the VLE for reading supplements – if none are

available, please use the contents list and index of the new edition to find the relevant section.

Comments on specific questions

Candidates should answer FOUR of the following FIVE questions. All questions carry equal marks.

Question 1

(a) Consider a modified version of the Monty Hall problem. In this version, there

are 8 boxes, of which 1 box contains the prize and the other 7 boxes are empty.

You select one box at first. Monty, who knows where the prize is, then opens 6

of the remaining 7 boxes, all of which are shown to be empty. If Monty has a

choice of which boxes to open (i.e. if the prize is in the box you chose at first),

he will choose at random which one of the boxes to leave unopened.

i. Suppose that you have chosen Box 1, and then Monty opens Boxes 3 to 8,

leaving Box 2 unopened. After we have observed this, what is the probability

that the prize is in Box 1, and what is the probability that it is in Box 2?

(10 marks)

ii. How should a risk-neutral decision-maker use the probabilities computed in i.

to inform their strategy?

(3 marks)

Reading for this question

Block 1 on the VLE covers the Monty Hall problem as part of decision-making under

uncertainty.

3

ST2187 Business analytics, applied modelling and prediction

Approaching the question

i. Let Bj denote the event that the prize is in Box j, for j = 1, 2, . . . , 8, and let N2 denote

the event that Monty does not open Box 2. Bayes’ theorem tells us that the conditional

probability that the prize is in Box 1 is:

P (B1 |N2) = P (N2 |B1)P (B1)8∑

j=1

P (N2 |Bj)P (Bj)

=

P (N2 |B1)P (B1)

P (N2 |B1)P (B1) + P (N2 |B2)P (B2) +

8∑

j=3

P (N2 |Bj)P (Bj)

.

According to the rules of the game, the probabilities to use here are:

? P (B1) = P (B2) = · · · = P (B8) = 1/8, since all the boxes are initially equally likely to

contain the prize.

? P (N2 |B1) = 1/7. If the prize is in Box 1, Monty has a choice, so he chooses at

random which of the remaining boxes to leave unopened.

? P (N2 |B2) = 1. If the prize is in Box 2, Monty must leave it unopened.

? P (N2 |Bj) = 0 for j = 3, 4, . . . , 8. If the prize is in either Box 3 to 8, Monty cannot

open that box. This means that he must open all the other boxes not chosen by you,

including Box 2.

Therefore, the conditional probabilities are:

P (B1 |N2) = 1/7× 1/8

1/7× 1/8 + 1× 1/8 + 6× (0× 1/8) =

1

8

and P (B2 |N2) = 7/8. Hence you have an 87.5% chance of winning the prize if you

switch to Box 2.

ii. While neither conditional probability computed in i. offers certainty, a risk-neutral

decision-maker would ‘play to the probabililties’ and choose the strategy to always switch

to the unopened box, since a long-run expected win rate of 87.5% by switching is far

better than a long-run expected win rate of 12.5% by not switching.

(b) ‘If there is uncertainty about some monetary outcome and you are concerned

about return and risk, then all you need to see are the mean and standard

deviation. The entire distribution provides no extra useful information.’ Do you

agree or disagree? Provide an example to back up your argument.

(5 marks)

Reading for this question

Block 5 on the VLE covers probability and probability distributions.

Approaching the question

The mean and standard deviation are very important measures of a probability distribution,

but the entire shape of the distribution, especially the tails, can be relevant in

decision-making. This is especially true when you are risk-averse and large losses (or gains)

are possible!

(c) It is assumed that inter-person arrival times at a bank during the peak period

covering lunchtime follow an exponential distribution with a mean of 20

seconds. An Excel analysis was conducted as follows:

4

Examiners’ commentaries 2020


Explain what the value of 0.0498 in cell B5 represents and why this function is

used.

(7 marks)

Reading for this question

Block 6 on the VLE covers common probability distributions in business applications. This

question combines the Poisson and exponential distributions.

Approaching the question

This returns the probability of having to wait at least 1 minute (equivalently 60 seconds) for

the next arrival. The Excel function returns P (X = 0), where X ~ Pois(3), which is:

P (X = 0) =

e?λλx

x!

=

e?3 × 30

0!

= e?3.

Since the inter-person arrival times, Y , follow an exponential distribution, then Y ~ Exp(λ)

with E(Y ) = 1/λ = 20, hence λ = 1/20, which represents the per-second rate parameter for

the Poisson distribution. Since this is multiplied by 60 in cell B5, we have the Poisson

parameter for 1 minute.

Question 2

A landlord is deciding whether to upgrade a building’s heating system and is willing

to do so if the cost of the upgraded system can be recovered over the next five

consecutive years of winter weather, which can be classified as one of ‘mild’,

‘normal’, ‘cold’ and ‘severe cold’, occurring with probabilities of 0.2(1 ? x),

0.5(1 ? x), 0.3(1 ? x) and x, respectively.

The predicted annual heating costs (assume no inflation), with and without the

upgrade, are:

Type of winter weather Mild Normal Cold Severe cold

Annual costs without upgrade £420 £590 £720 £900

Annual costs with upgrade £358 £503 £612 £765

Probability 0.2(1 ? x) 0.5(1 ? x) 0.3(1 ? x) x

(a) Given x = 0.2, what is the maximum amount a risk-neutral landlord would be

willing to pay for an upgraded heating system?

(6 marks)

(b) If the upgraded heating system costs £600, what is the minimum value of x

such that installing the upgraded heating system becomes economically viable?

(7 marks)

5

ST2187 Business analytics, applied modelling and prediction

(c) Given x = 0.2, calculate and interpret the expected value of perfect

information, EVPI.

Hint: You may find it helpful to draw the decision tree for this problem.

(12 marks)

Reading for this question

Block 7 on the VLE covers decision-making under uncertainty using decision trees.

Approaching the question

(a) Given x = 0.2, the expected monetary value, EMV, of annual costs without the upgrade is:∑

x p(x) = 420× 0.16 + 590× 0.40 + 720× 0.24 + 900× 0.2 = £656.00

and so for five consecutive winters is:

£656.00× 5 = £3,280.00.

Given x = 0.2, the expected monetary value, EMV, of annual costs with the upgrade is:∑

x p(x) = 358× 0.16 + 503× 0.40 + 612× 0.24 + 765× 0.2 = £558.36

and so for five consecutive winters is:

£558.36× 5 = £2,791.80.

Hence the maximum amount a risk-neutral landlord would be willing to pay for an upgraded

heating system is:

£3,280.00? £2,791.80 = £488.20.

(b) In terms of x, the expected monetary value, EMV, of annual costs without the upgrade is:∑

x p(x) = 5× (420× 0.2(1? x) + 590× 0.5(1? x) + 720× 0.3(1? x) + 900× x)

= 1,525x+ 2,975

while with the upgrade, the EMV is:∑

x p(x) = 5× (358× 0.2(1? x) + 503× 0.5(1? x) + 612× 0.3(1? x) + 765× x)

= 1,291.5x+ 2,533.50

So we require:

(1,525x+ 2,975)? (1,291.5x+ 2,533.50) > £600

which solves for:

x > 0.6788.

Hence the value of x must be at least 67.88% before the landlord decides it is economically

worthwhile to install upgraded heating system.

(c) From (a), the expected monetary value with no information is £3,280.00.

6

Examiners’ commentaries 2020

The decision tree is:


The expected monetary value with perfect information is £3,265.00.

Hence the expected value of perfect information is:

EVPI = £3,280.00? £3,265.00 = £15.00.

Question 3

A company is interested in explaining the variation in annual salaries of its

employees, in dollars, using the following variables:

Gender, coded as 1 = female, and 0 = male

Age, in years

Work experience with the company, in years

Amount of post-16 education, in years.

(a) The correlation matrix of the above variables is:

7

ST2187 Business analytics, applied modelling and prediction


What conclusions can be drawn from this correlation matrix?

(6 marks)

The results of a multiple regression, including an interaction between gender and

age are:


(b) State the estimated regression equation and comment on R2.

(5 marks)

(c) Assess the overall significance of the model, as well as the significance of the

individual regression coefficients.

(5 marks)

(d) According to this multiple regression model, how does age impact annual

salaries for males and females?

(4 marks)

(e) Calculate an approximate 99% prediction interval for the annual salary of a

26-year-old female, who has worked at the company for 8 years, with a total of 2

years of post-16 education. Use a t coefficient of 2.601.

(5 marks)

Reading for this question

Blocks 11 and 12 on the VLE cover regression analysis (estimating relationships and statistical

inference, respectively).

8

Examiners’ commentaries 2020

Approaching the question

(a) The dependent variable (annual salary) shows moderate correlation with each of the

candidate independent variables, albeit gender is categorical, suggesting performing a

multiple regression would be appropriate.

Among the independent variables the strongest correlation is only 0.269 and so we would

not expect multicollinearity to feature as a problem in a multiple regression.

(b) We have:

?Annual salary = ?6,583.78 + 9,465.53×Gender + 592.82×Age

+ 2,831.16× Experience + 8,182.46× Education

? 474.85×Gender×Age.

R2 = 0.7110 indicating that 71.10% of the variation in annual salary is explained by the

model, which is a fairly high percentage.

(c) Testing H0 : β1 = β2 = · · · = β5 = 0, the F -ratio is 97.42 with a p-value < 0.0001, hence the

test is highly significant and so we have very strong evidence that the model has significant

explanatory power. Testing H0 : βi = 0 for i = 1, 2, . . . , 5, all the variables are statistically

significant at the 5% significance level, except Gender with a p-value of 0.28 > 0.05. Note

that the interaction variable involving Gender (with Age) is significant.

(d) For females, the effect of an extra year on annual salary is:

592.82? 474.85 = 117.97

while for males the effect of an extra year on annual salary is 592.82.

Other things equal, females earn $474.85 less than males in a year.

(e) The prediction is:

?6,583.78 + 9,465.53× 1 + 592.82× 26 + 2,831.16× 8 + 8,182.46× 2? 474.85× 1× 26

which is $44,966.66. Hence an approximate 99% prediction interval is:

Y? ± t0.005, 198 × se ? 44,966.66± 2.601× 16,467.77

which gives ($2,133.99, $87,799.33).

Question 4

(a) Holt’s method, as an exponential smoothing model, assumes an additive trend.

Suppose instead that the model is changed to a multiplicative trend so that, for

example, if the current level estimate is 10, and the current trend estimate is

1.3 (i.e. the forecast increases by 30% per period), then the forecast next period

is 10 × 1.3, and for the following period is 10 × (1.3)2. Holt’s method equations

then become:

Lt = αYt + (1 ? α)(?) and Tt = β(#) + (1 ? β)Tt?1.

i. State the expressions for (?) and (#) in the above equations, briefly

explaining your answer.

(6 marks)

9

ST2187 Business analytics, applied modelling and prediction

ii. Assume you are working with monthly data, with December being month 12,

January being month 13 etc. Suppose L12 = 80 and T12 = 1.2 and you

observe Y13 = 90. Determine the forecast for Y15, assuming α = 0.3 and

β = 0.7 using a multiplicative trend.

(7 marks)

Reading for this question

Block 13 on the VLE covers time series analysis and forecasting.

Approaching the question

i. The first term is the product of the previous level and the previous trend:

Lt?1 × Tt?1.

The second term is the ratio of the current level to the previous level:

Lt

Lt?1

.

ii. We have:

L13 = α× Y13 + (1? α)× L12 × T12 = 0.3× 80 + 0.7× 90× 1.2 = 94

and:

T13 = β × L13

L12

+ (1? β)× T12 = 0.7× 94

80

+ 0.3× 1.2 = 1.18.

Therefore:

F15 = L13 × T 213 = 94× (1.18)2 = 132.11.

(b) Is it more appropriate to use an additive or a multiplicative model to forecast

seasonal data? Summarise the difference(s) between these two types of seasonal

models.

(5 marks)

Reading for this question

Block 13 on the VLE covers time series analysis and forecasting.

Approaching the question

Both can be useful, and it is hard to say that one is better than the other. The additive

model is relevant if you believe each season’s typical value is a certain amount above or

below the overall average. The multiplicative model is relevant if you believe each season’s

typical value is a certain percentage above or below the overall average.

(c) Let Yt be the sales in dollars during month t, and let Pt be the price in dollars

charged during month t. A regression of the form:

Yt = β0 + β1Yt?1 + β2Pt + εt

was run and was estimated to be:

Y?t = 972,545.98 + 0.6714 × Yt?1 ? 23,604.19 × Pt.

i. If the price during month 21 is $14, and sales in month 20 were $781,000,

what would you predict for sales in month 21?

(3 marks)

10

Examiners’ commentaries 2020

ii. The correlation coefficient between Yt and Yt?1, i.e. rYt,Yt?1 , is 0.864.

Comment on this value.

(2 marks)

iii. The correlation coefficient between Yt and Pt, i.e. rYt,Pt , is ?0.325. Comment

on this value.

(2 marks)

Reading for this question

Block 13 on the VLE covers time series analysis and forecasting.

Approaching the question

i. We have:

972,545.98 + 0.6714× 781,000? 23,604.19× 14 = $1,166,451

ii. There is fairly strong positive autocorrelation for sales – for example, high sales one

month tend to be followed by high sales the next month, which seems reasonable.

iii. There is weak-to-moderate negative correlation between sales and price, consistent with

the law of demand.

Question 5

(a) Suppose you are a financial analyst and your company runs many simulation

models to estimate the profitability of its projects. If you had to choose just two

measures of the distribution of any important output, such as net profit, to

report, which two would you choose? Why? What information would be missing

if you reported only these two measures? How could they be misleading?

(6 marks)

Reading for this question

Block 15 on the VLE covers Monte Carlo simulation models.

Approaching the question

The point here is that there is an entire distribution of any such output, and we are limited

to reporting only two of its measures. (Granted, this is unrealistic.) Anyway, you probably

want a measure of central tendency and a measure of variability. Two useful measures for a

financial analyst might be the median (half the time, net profit is above it; half the time, it

is below it) and the 5th percentile (this is nearly as bad as net profit will be).

(b) A country, with no free health service, offers a tax-free savings account for

medical expenses. At the start of the year you decide how much to pay into the

account. When medical expenses are incurred you may withdraw up to this

amount (with no tax paid) exclusively to pay for medical expenses (if the

amount invested exceeds your medical expenses, you lose this money). Any

medical expenses which exceed the amount you invested must be paid by you

using income taxed at 35%. Your annual salary is $85,000, and you assume

annual medical expenses are normally distributed with a mean of $1,700 and a

standard deviation of $350.

The following is the spreadsheet model which shows the results of one

simulation in the cell range A16:D16.

11

ST2187 Business analytics, applied modelling and prediction


i. For the simulation in the cell range A16:D16, explain how the value of

$53,861 in cell D16 would have been calculated (with reference to relevant

cells in the Excel spreadsheet model above).

(6 marks)

ii. Write down an appropriate Excel function for cell C16, i.e. to determine the

medical expenses above the amount invested. Explain your choice of function.

(5 marks)

iii. One thousand simulations were run and the distribution of outcomes

produced the following:


12

Examiners’ commentaries 2020

Explain why the maximum amount of money left to you is $54,340.

(4 marks)

iv. Based on the simulation results in iii., do you think investing $1,400 in the

tax-free savings account is appropriate? Or should you invest more? Or less?

Explain your answer.

(4 marks)

Reading for this question

Block 15 on the VLE covers Monte Carlo simulation models.

Approaching the question

i. Cell A16 returns a simulated value from N(1,700, (350)2), which is $1,879 denoting the

medical expenses of the individual.

The tax liability of 35% is levied on the difference between annual income and amount

invested:

($85,000? $1,400)× 0.35 = $29,260.

The medical expenses, $1,879, exceed the amount invested by:

$1,879? $1,400 = $479.

The amount of money left is:

($85,000? $1,400)× 0.65? $479 = $53,861.

ii. One possibility is =MAX(A16-B12,0), which returns the greater of ‘excess medical

expenses over amount invested’ and zero (when the amount invested is sufficient to cover

medical expenses).

iii. This represents the best case scenario when medical expenses do not exceed the amount

invested, i.e. do not exceed $1,400. Net income is then:

($85,000? $1,400)× 0.65? $0 = $54,340.

iv. Any reasonable comments accepted, taking into account the risk appetite of the

individual.


版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。 站长地图

python代写
微信客服:codinghelp