UNSW Sydney Business School

School of Risk and Actuarial Studies

ACTL1101

Introduction to Actuarial Studies

Main Assignment

(Due date: 28 July, Friday 4pm)

T2 2023

June 2023

Contents

1 Part One: Taxation Data by Postcode

1.1 Context

1.2 Your Tasks

2 Part Two: Optimal Investments

2.1 Context

2.2 Your Tasks

3 Format Requirements

4 Marking Criteria

4.1 Part One

4.2 Part Two

4.3 Plagiarism Awareness

4.4 Late Penalties

5 Answering Students’ Questions

A Variable Description for Part One

1. Part One: Taxation Data by Postcode

1.1 Context

In this part of the assignment, you will perform an analysis (and create visualisations) of

a dataset which contains variables mostly related to income and taxation within different

Australian postcodes for tax year 2018-2019. Some information about the schools (elementary

and secondary) within those postcodes is also present in the dataset (last 4 columns).

Warning: because some postcodes do not have any schools in them, those last 4 variables

contain many ‘NA’ values. Other variables may also contain some ‘NA’ values.

Each of your datasets consists of 800 randomly generated records and can be downloaded

as a csv file. At the end of this document, you can find "Appendix A: Variable Description",

which provides a brief description of each variable and its meaning. It is important to

understand the representation and meaning of the variables used in order to interpret the

results accurately.

For your information, those datasets are ‘real’ (and publicly available). The taxation and

income data is available here¹ (with its license found here). The school data is available

here.

¹ We must disclose that, compared to the original income/tax data found here, we have modified many

variables to obtain averages by individual (as opposed to the total amounts by postcode found in the original

data). We have also removed many variables from the original dataset, and deleted one postcode for which

the average tax rate was strongly negative (we consider this postcode to be an outlier which we do not want to

include in our analysis).

ACTL1101 Introduction to Actuarial Studies 2023 Main Assignment

1.2 Your Tasks

1. (1pt) Produce a visualisation of the distribution of variable Private.Health.Proportion

across all postcodes. Briefly describe this distribution.

2. (1pt) Produce a table containing, for each State:

• the number of postcodes within that state

• the mean of variable Private.Health.Proportion (across postcodes within that

state)

• the standard deviation of variable Private.Health.Proportion (across postcodes

within that state)

3. (1pt) Create a new variable called Avg.Gross.Rent, which is simply:

Avg.Gross.Rent = Gross.Rent.Amt / Total.Nb.

Then, compute the sample correlation between Avg.Gross.Rent and Avg.Tax.Rate.

Report and briefly interpret your result.

4. (2pts) Add a variable called Tax.Bracket to this dataset. This new variable should be

based on variable Avg.Tax.Rate, and be equal to:

• ‘Low’ if Avg.Tax.Rate is below its 25% quantile.

• ‘Medium’ if Avg.Tax.Rate is equal to or above its 25% quantile and below its 75% quantile.

• ‘High’ if Avg.Tax.Rate is equal to or above its 75% quantile and below its 99% quantile.

• ‘Very High’ if Avg.Tax.Rate is equal to or above its 99% quantile.

Then, report the average Avg.Income within each Tax.Bracket.

Hint: consider using the R function quantile().

5. (2pts) Produce a visualisation which illustrates the relationship between variable

Private.Health.Proportion and the new variable Tax.Bracket. Briefly discuss what

you observe.

6. (3pts) Open Question: use any variable(s) you want in this dataset to tell a brief story

about the data. This can be anything you find relevant, but you must include at

least one visualisation to support your ‘story’. Example: an interesting/surprising link

between variables, an insight that could help set a new public policy (or improve an

existing one), a finding that is the starting point for new research, etc.
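The data-manipulation steps in tasks 3 and 4 can be sketched in Python as follows. This is only an illustration under stated assumptions: the records are synthetic stand-ins (not the assignment dataset), the helper names quantile_type7, pearson and tax_bracket are hypothetical, the quantile rule is the linear-interpolation rule that R's quantile() uses by default, and ‘NA’ values (which the real data contains) are not handled here.

```python
import math
import random
from statistics import fmean


def quantile_type7(values, p):
    """Linear-interpolation quantile (the default rule of R's quantile())."""
    xs = sorted(values)
    h = (len(xs) - 1) * p
    lo = math.floor(h)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])


def pearson(x, y):
    """Sample (Pearson) correlation between two equal-length lists."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def tax_bracket(rate, q25, q75, q99):
    """Map a tax rate to the bracket labels defined in task 4."""
    if rate < q25:
        return "Low"
    if rate < q75:
        return "Medium"
    if rate < q99:
        return "High"
    return "Very High"


# Synthetic stand-in records: these values are made up for illustration only.
rng = random.Random(1101)
records = []
for _ in range(800):
    total_nb = rng.randint(500, 20000)
    records.append({
        "Total.Nb": total_nb,
        "Gross.Rent.Amt": total_nb * rng.uniform(500, 5000),
        "Avg.Tax.Rate": rng.uniform(0.05, 0.35),
        "Avg.Income": rng.uniform(30000, 150000),
    })

# Task 3: new ratio variable and its correlation with Avg.Tax.Rate.
for rec in records:
    rec["Avg.Gross.Rent"] = rec["Gross.Rent.Amt"] / rec["Total.Nb"]
rates = [rec["Avg.Tax.Rate"] for rec in records]
rho = pearson([rec["Avg.Gross.Rent"] for rec in records], rates)

# Task 4: quantile-based brackets, then the mean Avg.Income per bracket.
q25, q75, q99 = (quantile_type7(rates, p) for p in (0.25, 0.75, 0.99))
for rec in records:
    rec["Tax.Bracket"] = tax_bracket(rec["Avg.Tax.Rate"], q25, q75, q99)

labels = ("Low", "Medium", "High", "Very High")
counts = {name: sum(r["Tax.Bracket"] == name for r in records) for name in labels}
bracket_means = {
    name: fmean(r["Avg.Income"] for r in records if r["Tax.Bracket"] == name)
    for name in labels
}
```

Computing the three cut-offs once and reusing them for every record keeps the classification to a single pass over the data. With the real csv file, rows containing ‘NA’ in the relevant columns would need to be dropped or otherwise treated before these calculations.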

2. Part Two: Optimal Investments

2.1 Context

In this second part of the assignment (which is totally unrelated to the first part), you will

work on an investment problem that would be difficult to tackle without programming. You

need to use the R Shiny app to look for the values that correspond to your zID. The app

will provide the numerical values of r, μ0, and w0 specific to your zID. Please use the app to

retrieve these values and incorporate them into your analysis. The context is as follows.

You want to invest your money and you have two investment Options. Both of them will

yield a random rate of return. However, Option B is substantially riskier than Option A. Their

dynamic is as follows:

An amount of 1 invested in Option A will yield a random amount A, with

A = 1 + U · r, (2.1)

and where U is a Uniform(0,1) random variable and r is a constant.

An amount of 1 invested in Option B will yield a random amount B, with

B = exp( μ0 + √(0.1² − c²) · Z + c · Φ⁻¹(U) ), (2.2)

and where U is the same Uniform(0,1) as in Equation (2.1), Z is a N(0,1) random

variable (independent of U), Φ⁻¹() is the quantile function of the standard Normal

distribution and μ0, c are constants (with 0 ≤ c ≤ 0.1).
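A short derivation may help to see what the constant c does. It assumes that the coefficient of Z in Equation (2.2) is the square root of 0.1² − c² (a reading suggested by the constraint 0 ≤ c ≤ 0.1), and it uses the fact that the quantile function of the standard Normal applied to a Uniform(0,1) variable is itself N(0,1):

```latex
\begin{align*}
\ln B &= \mu_0 + \sqrt{0.1^2 - c^2}\, Z + c\, \Phi^{-1}(U),
  \qquad Z \sim N(0,1), \quad \Phi^{-1}(U) \sim N(0,1), \\
\operatorname{Var}(\ln B) &= (0.1^2 - c^2)\operatorname{Var}(Z)
  + c^2 \operatorname{Var}\!\left(\Phi^{-1}(U)\right)
  = (0.1^2 - c^2) + c^2 = 0.1^2, \\
\operatorname{Cov}\!\left(\ln B,\ \Phi^{-1}(U)\right) &= c .
\end{align*}
```

Under that assumption, ln B is N(μ0, 0.1²) for every admissible c. The constant c therefore leaves the marginal distribution of B unchanged and only controls how strongly B moves with U, and hence with A.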

You make financial decisions using a utility function given by:

v(w) = 1 − exp(−w), for w ∈ ℝ,

and you invest all your wealth w0 in some proportion to Option A, and in some proportion to

Option B. Call γ the proportion of your wealth you invest in Option B (where 0 ≤ γ ≤ 1). Your

final (and random) wealth W is then equal to:

W = w0 [(1 − γ)A + γB].

Note 1: Don’t worry about function Φ⁻¹(), simply know that you can get this function in R as

qnorm(). So, to get Φ⁻¹(U) you would write qnorm(U).

Note 2: We stress that the same U is used in the calculation of Option A and Option B. This

induces a correlation between A and B.
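The model described in this section can be sketched in Python as follows. The values of r, μ0, w0 and γ used here are hypothetical placeholders (the real ones come from the R Shiny app), the function names are illustrative, and the coefficient of Z is assumed to be the square root of 0.1² − c². The sketch relies on the standard library only, so it loops over draws instead of vectorising.

```python
import math
import random
from statistics import NormalDist


def generate_ab(n, c=0.0, r=0.05, mu0=0.03, seed=None):
    """Simulate n pairs (A, B) that share the same uniform draw U.

    r and mu0 are hypothetical placeholder values; c must satisfy 0 <= c <= 0.1.
    """
    rng = random.Random(seed)
    inv_cdf = NormalDist().inv_cdf  # plays the role of R's qnorm()
    z_scale = math.sqrt(max(0.0, 0.1 ** 2 - c ** 2))  # assumed coefficient of Z
    pairs = []
    for _ in range(n):
        u = rng.random()
        while u == 0.0:  # inv_cdf needs 0 < u < 1
            u = rng.random()
        z = rng.gauss(0.0, 1.0)  # Z is drawn independently of U
        a = 1.0 + u * r  # Equation (2.1)
        b = math.exp(mu0 + z_scale * z + c * inv_cdf(u))  # Equation (2.2)
        pairs.append((a, b))
    return pairs


def final_wealth(a, b, gamma, w0):
    """W = w0 * [(1 - gamma) * A + gamma * B]."""
    return w0 * ((1.0 - gamma) * a + gamma * b)


def utility(w):
    """v(w) = 1 - exp(-w)."""
    return 1.0 - math.exp(-w)


# Example: 2000 pairs with c = 0.08, half of the wealth placed in Option B.
sample = generate_ab(2000, c=0.08, seed=2023)
wealths = [final_wealth(a, b, gamma=0.5, w0=1.0) for a, b in sample]
mean_utility = sum(utility(w) for w in wealths) / len(wealths)
```

Reusing the same u for both a and b is what creates the dependence mentioned in Note 2. Drawing two separate uniforms would make A and B independent.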

2.2 Your Tasks

1. (3pts) Create a function called generate.AB. This function has two arguments: n (no

default value) and c (default value of 0). This function does the following:

• It generates n random pairs of (A,B) under the dynamic given by Equations

(2.1)-(2.2), with constant c specified via the second argument ‘c’ of the function.

• It returns this sample as a matrix of size n × 2 (n rows, 2 columns). The first

column is the sample (A1, . . . , An); the second column is the sample (B1, . . . , Bn).

Then, use your function to create a scatter plot of a sample (A1, B1), . . . , (An, Bn) for

n = 2000 and c = 0.08 (the A values should be on the x axis, while the B values should

be on the y axis) and briefly comment on the relationship between A and B.

Hint: In this question, vectorization is your friend. Remember that a single command

like rnorm(n) can generate a vector of n random variables in one go.

2. (2pts) Use a visualisation of your choice to illustrate the relationship between:

• the correlation ρ(A,B).

• the constant c.

Then, briefly analyse and interpret this relationship.

Hint: We do not know the theoretical correlation between A and B, but we can use

function generate.AB() to obtain a sample from (A,B) and then use the R function

cor() to estimate the correlation, from that sample.

3. (3pts) Following this investment strategy, your Expected Utility is:

E[v(W)] = E[1 − exp(−W)] = E[1 − exp(−w0(1 − γ)A − w0γB)].

Assume that c = 0.05. Find the γ which maximises E[v(W)]. [Recall that 0 ≤ γ ≤ 1.]

Hint 1: It would be hard to compute E[v(W)] with pen and paper, but here again you

can use generate.AB() to obtain a sample v(W1), . . . , v(Wn). You can then compute

the mean of that sample, which is a good estimate of E[v(W)]. You can repeat that for

many values of γ, so as to find the (approximate) γ that maximises Expected Utility.

Hint 2: The ‘computational burden’ here is quite high, as you need a fairly large sample

(we suggest n > 200,000) to approximate E[v(W)] by the sample mean of v(W). That

said, in this process you only need to generate ONE sample (A1, B1), . . . , (An, Bn).

Indeed, using this ONE sample you can then obtain samples v(W1), . . . , v(Wn) for

different values of γ. Said otherwise, you do not need to generate a new random

sample for every different value of γ.

4. (1pt) Assume that c = 0.05. If you follow this investment strategy with the γ value

as derived in the previous question, what is: Pr[W < w0] (i.e. the probability you

experience a loss)? [Here again, we expect an approximate answer based on R

simulations, not a pen-and-paper calculation.]

5. (1pt) Include in an Appendix the pseudocode of all the tasks you performed in R/Python

for questions 1 to 4. [You are encouraged to write your pseudocode before the actual

implementation in R/Python, as this helps structure the coding process and organise

your thoughts.]
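The single-sample idea from Hint 2 of task 3, and the loss probability of task 4, might be sketched in Python as follows. This is a rough illustration rather than a reference solution: R, MU0 and W0 are hypothetical placeholders for the zID-specific values, the function names are illustrative, the coefficient of Z is assumed to be the square root of 0.1² − c², and the sample size and γ grid are kept small so that the pure standard-library loop runs quickly (the assignment suggests n > 200,000).

```python
import math
import random
from statistics import NormalDist, fmean

# Hypothetical placeholder parameters (the real ones are specific to each zID).
R, MU0, W0, C = 0.05, 0.03, 2.0, 0.05


def simulate_pairs(n, c, r, mu0, seed=0):
    """Draw n pairs (A, B) once; the same sample is reused for every gamma."""
    rng = random.Random(seed)
    inv_cdf = NormalDist().inv_cdf
    z_scale = math.sqrt(max(0.0, 0.1 ** 2 - c ** 2))  # assumed coefficient of Z
    pairs = []
    for _ in range(n):
        u = rng.random()
        while u == 0.0:  # inv_cdf needs 0 < u < 1
            u = rng.random()
        a = 1.0 + u * r
        b = math.exp(mu0 + z_scale * rng.gauss(0.0, 1.0) + c * inv_cdf(u))
        pairs.append((a, b))
    return pairs


def expected_utility(pairs, gamma, w0):
    """Sample mean of v(W) = 1 - exp(-W) for a given proportion gamma."""
    return fmean(
        1.0 - math.exp(-w0 * ((1.0 - gamma) * a + gamma * b)) for a, b in pairs
    )


def best_gamma(pairs, w0, grid_size=21):
    """Grid search over gamma in [0, 1]; returns the grid point with highest mean utility."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(grid, key=lambda g: expected_utility(pairs, g, w0))


def loss_probability(pairs, gamma):
    """Estimate Pr[W < w0]; for w0 > 0 this is Pr[(1 - gamma) * A + gamma * B < 1]."""
    losses = sum(1 for a, b in pairs if (1.0 - gamma) * a + gamma * b < 1.0)
    return losses / len(pairs)


pairs = simulate_pairs(20000, C, R, MU0, seed=2023)
gamma_star = best_gamma(pairs, W0)
p_loss = loss_probability(pairs, gamma_star)
```

The result is only as fine as the grid: with 21 points, gamma_star is accurate to within 0.05 at best, so a finer grid around the first estimate (together with a larger sample) would be a natural refinement.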

3. Format Requirements

You must submit your assignment on Turnitin (under the "Main Assignment" section in

Moodle).

You must submit two files:

– A .pdf file: contains your answers to all questions.

– A .R/py file: contains all the R or Python code you used to produce your answers.

About the .pdf file:

– It includes a title page with your student name and student zID.

– The page format is A4 (the standard Australian format).

– The minimum font size used is equivalent to "Times New Roman" size 11.

– The minimum line spacing used is 1.15.

– The margins should not be narrower than the "narrow" option in Word (0.5 inches

on every side).

– The answers (including sub-parts) are numbered in the same way they are numbered

in the statement of the questions.

– Your answers to Part One (including plots) must fit on 2 pages.

– Your answers to Part Two (including plots) must fit on 1 page.

– The main body of the pdf (i.e., the 3 pages) need not contain R or Python code

but must include everything that is asked in the questions (e.g., visualizations,

tables, numbers, explanations, etc.).

– All R or Python code necessary to produce your results must be placed in an

Appendix to your .pdf file. There is no page limit on this Appendix, but the

efficiency of your code will be graded (see Marking Criteria). To be clear: we

want the entirety of your R or Python code to be present in your Appendix (as well

as in your .R/py file).

– This R or Python code Appendix must be made of text (not images).

Specifically about your .R/py file:

– Your R or Python code must run as it is (and produce exactly the results in your

assignment). If we cannot run your code, you will lose ALL marks associated with

the R or Python code (C1 and C2 in the Marking Criteria).

– Your R or Python code must contain ALL steps necessary to answer the questions

in this assignment. To be specific: you are NOT allowed to do any data manipulation

in Excel or any software other than R or Python.

4. Marking Criteria

Each individual Question is allocated a fixed number of marks. To assess your answers,

we will use a series of criteria. Those criteria are stated below, with a brief description that

corresponds to a ‘HD mark’. Not all criteria are relevant to every sub-question: find a detailed

mapping below.

C1: Code Correctness: Your code, functions and algorithms produce exactly the

desired results, and do not produce any irrelevant/superfluous results.

C2: Code Efficiency: Your code is extremely efficient, without sacrificing readability. Your R code is extremely well organised and easy to follow.

C3: Analysis: Your analysis is insightful and accurate. Your interpretation of your

results is correct, clear, precise and shows a great depth of understanding and critical

thinking. Your writing is concise, fluent and devoid of typos, grammatical and syntactical

mistakes.

C4: Choice of Visualisation: Your choice of which visualisation to use is excellent: it

conveys all (and only) the appropriate information.

C5: Presentation: The formatting and presentation of your results and/or visualisations

is impeccable: clear, readable and aesthetic.

C6: Pseudocode: Your pseudocode is clearly written in a neutral syntax and contains

all steps necessary to reproduce your results.

4.1 Part One

For each sub question in Part One, the relevant marking criteria are:

Q1: C1 (20%), C3 (30%), C4 (30%), C5 (20%)

Q2: C1 (50%), C2 (30%), C5 (20%)

Q3: C1 (60%), C3 (40%)

Q4: C1 (60%), C2 (40%)

Q5: C1 (20%), C3 (30%), C4 (30%), C5 (20%)

Q6: C3 (50%), C4 (25%), C5 (25%)

4.2 Part Two

For each sub question in Part Two, the relevant marking criteria are:

Q1: C1 (40%), C2 (20%), C3 (20%), C5 (20%)

Q2: C1 (20%), C2 (20%), C3 (20%), C4 (20%), C5 (20%)

Q3: C1 (60%), C2 (40%)

Q4: C1 (60%), C2 (40%)

Q5: C6 (100%)

4.3 Plagiarism Awareness

This is an individual assignment. While we have no problem with students discussing

assignment problems if they wish, the material each student submits must be their own

individual work. Students should make sure they understand what plagiarism is.

In particular, any R or Python code you present must be from your own computer, and

developed by you alone. With ≈360 students performing the same task, some small elements

of code are likely to be similar. However, big patches of identical code (even with different

variable names, layout, or comments) will be considered suspicious and investigated for

plagiarism. Turnitin picks this up easily, so cases of plagiarism have a very high probability

of being discovered. The best strategy to avoid any problem is to never share bits and pieces

of code with other students.

4.4 Late Penalties

Penalties for late assignments are as indicated in the course outline:

Late submission will incur a penalty of 5% per day or part thereof (including weekends)

from the due date and time. An assessment will not be accepted after 5 days (120 hours)

of the original deadline unless special consideration has been approved.

Hence, be careful: 4.99 days of lateness gives you a penalty of 25%, but 5.01 days of

lateness gives you a penalty of 100%.
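The penalty rule above can be written as a small function. This is a sketch of the arithmetic only: the function name is hypothetical, lateness is measured in days as a decimal number, and a submission exactly 5 days late is assumed to still be accepted.

```python
import math


def late_penalty_percent(days_late):
    """Penalty in percent: 5% per day or part thereof, 100% after 5 days."""
    if days_late <= 0:
        return 0
    if days_late > 5:
        return 100  # submission no longer accepted
    return 5 * math.ceil(days_late)


penalty_just_under = late_penalty_percent(4.99)  # 5 started days
penalty_just_over = late_penalty_percent(5.01)   # past the 120-hour cut-off
```

The rounding up with math.ceil is what implements "or part thereof": any fraction of a day counts as a full day.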

5. Answering Students’ Questions

Questions or requests for clarification about the assignment must be posted on the Ed Forum. We do not

plan to give out many additional hints, but if we were to do so, we want everyone to benefit

from them.

Important Note: The deadline for submission of this assignment is 28 July at 16:00. However,

we will stop answering any questions about the assignment on Wednesday 26 July at 16:00.

The rationale for this is twofold:

• we want to incentivise students to start the assignment early

• we want to be fair to assiduous students who decide to submit their assignment ahead

of time. Were we to give hints right before the deadline, those students would be

penalised for their earliness.

A. Variable Description for Part One

Postcode: Identifier of an Australian postcode (for which all other variables are recorded)

State: State in Australia where the Postcode is located

Total.Nb: Total number of individuals (with tax returns) in that postcode

Total.Income: Total taxable income (all sources) for that postcode

Net.Tax.Amt: Total net tax paid

Avg.HELP: Average student HECS-HELP Debt repayment

Avg.Salary: Average salary or wages

Total.Income.Amt: Total income or loss (including non taxable income)

Avg.Income: Average total income or loss

Avg.Tax: Average tax paid

Avg.Tax.Rate: Average tax rate paid (Net.Tax.Amt divided by Total.Income.Amt)

Avg.Work.Expenses: Average work related expenses (all expenses)

Employer.Super.Contributions.Amt: Total reportable employer superannuation contributions

Net.Capital.Gain.Amt: Total net capital gain

Tax.Net.Capital.Gain.Amt: Total estimated tax on net capital gains

Avg.Foreign.Income: Average assessable foreign source income

Gross.Rent.Amt: Total gross rent (income)

Personal.Super.Contributions.Amt: Total personal superannuation contributions made

Total.Business.Income.Amt: Total business income

Total.Business.Exp.Amt: Total business expenses

Net.Business.Income.Amt: Total net income or loss from business

Business.Net.Tax.Amt: Total estimated business net tax

Private.Health.Proportion: Number of individuals with private health insurance divided

by Total.Nb

ICSEA: Average Index of Community Socio-Educational Advantage (see link for details)

LBOTE: Average proportion of students with a ‘Language Background other than English’

Indigenous: Average proportion of indigenous students

Teaching.Ratio: Average number of teaching staff per student

