Date: 2021-02-27 10:54

MIDDLESEX UNIVERSITY COURSEWORK 1

CST2330


Data Analysis for Enterprise Modelling

This assignment is worth 50% of the overall grade. The submission deadline is Friday of Week 12, at 19:00 on January 8, 2021.


Contents

1 The net present value (NPV) problem (10%)

2 Optimisation and linear programming (20%)

2.1 Solution using analysis and graphs (10%)

2.2 Solution using solver (10%)

3 Data import, plotting and transformation (20%)

3.1 Plot prices (5%)

3.2 Create prices table (10%)

3.3 Convert prices to log-returns (5%)

Software and data required

You are recommended to use RStudio, an integrated development environment for the R language. It is available on the University computers from Apps Anywhere. It can also be installed on personal computers; a free copy is available at:

https://rstudio.com/products/rstudio/

You may need to load the following libraries in RStudio:

●lpSolve: to solve linear programming problems in Task 2.

●DBI: if you need to connect to and read databases in Task 3.

●xts: for extensible timeseries objects in Task 3.

Note, however, that all tasks in this coursework can be implemented in other languages (e.g. Python, Common Lisp, Julia or even more visual projects, such as KNIME or RapidMiner), and you may use them if you are more comfortable with them.

A specific dataset is required for Task 3 as well as the tasks in Coursework 2.

It is available on My Learning in the Data folder of the CST2330 page.


1 The net present value (NPV) problem (10%)

A car manufacturer is looking to bring a new range of electric vehicles (EVs) to market in the next five years. This will require an initial investment of £500M and has running costs of £100M per year after that. The predicted incomes from this range are:


Year   2020  2021  2022  2023  2024  2025
£M        0    10    50   200   300   500


Notice that the initial expenditure takes place in 2020.


1. Assume a discount rate of 1% per year, and perform an analysis of the discounted cashflow. You should create a data table in R with the variables Year, Outflow and Inflow in the first three columns:


   year  outflow  inflow  net  n  pvf
1  2020      500       0
2  2021      100      10
3  2022      100      50
4  2023      100     200
5  2024      100     300
6  2025      100     500


Write a function to analyse the cashflow table:

This function should modify the table by adding columns showing the net returns, the number of years to discount, the present value factors and the discounted returns. Your function must be able to take different cashflow tables and discount rates as input, print the modified cashflow table and return the (total) net present value (NPV) of the investment. Include your code in the report. (5 marks)

2. Decide whether or not this is a worthwhile investment, justifying your answer with your results. (1 mark)

3. Calculate the NPV for discount rates of 2%, 3%, 4% and 5%, giving your answers to two decimal places. (1 mark)

4. Plot NPV against discount rates ranging from 0% to 5%. (1 mark)

5. Use your graph to estimate the discount rate that would give an NPV of zero. What is the significance of this discount rate? (Hint: What happens at discount rates lower and higher than this figure?) (2 marks)
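The brief recommends R but allows other languages, so the following is only a rough sketch in plain Python of what such a cashflow function might look like. The names make_cashflow, analyse_cashflow and the column name dcf are hypothetical, and the sketch assumes that the first row (2020) is discounted over zero years and that the present value factor is 1 / (1 + rate)^n.

```python
def make_cashflow():
    """Build the cashflow table from the text as a list of row dicts (amounts in millions)."""
    years = [2020, 2021, 2022, 2023, 2024, 2025]
    outflows = [500, 100, 100, 100, 100, 100]
    inflows = [0, 10, 50, 200, 300, 500]
    return [{"year": y, "outflow": o, "inflow": i}
            for y, o, i in zip(years, outflows, inflows)]


def analyse_cashflow(table, rate):
    """Add net, n, pvf and dcf to every row, print the table, return the total NPV.

    Assumption: the first row is the base year, so it is discounted over n = 0 years.
    """
    first_year = table[0]["year"]
    npv = 0.0
    for row in table:
        row["net"] = row["inflow"] - row["outflow"]      # net return
        row["n"] = row["year"] - first_year              # years to discount
        row["pvf"] = 1.0 / (1.0 + rate) ** row["n"]      # present value factor
        row["dcf"] = row["net"] * row["pvf"]             # discounted return
        npv += row["dcf"]

    # Print the modified table in fixed-width columns.
    print(f"{'year':>6}{'outflow':>9}{'inflow':>8}{'net':>6}{'n':>4}{'pvf':>9}{'dcf':>11}")
    for row in table:
        print(f"{row['year']:>6}{row['outflow']:>9}{row['inflow']:>8}"
              f"{row['net']:>6}{row['n']:>4}{row['pvf']:>9.4f}{row['dcf']:>11.4f}")
    return npv


npv_at_1_percent = analyse_cashflow(make_cashflow(), rate=0.01)
print("NPV:", round(npv_at_1_percent, 2))
```

Because the function modifies the table it is given, the sketch builds a fresh table for every call; repeating the call over a range of rates gives the points needed for an NPV-versus-rate plot.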


2 Optimisation and linear programming (20%)

2.1 Solution using analysis and graphs (10%)

A car manufacturer produces conventional cars, which generate £4,000 profit per car, and newly developed EVs, which generate £5,000 per car. The objective is to maximise profit by selecting the right combination of conventional cars and EVs, subject to the following constraints:

●The factory can produce up to 1,000 conventional cars and up to 500 EVs per day.

●The logistics and warehouse facilities do not allow for production of more than 750 vehicles per day in total (i.e. both conventional and EVs).

●The manufacturer has a contract to produce at least 100 conventional and 50 EVs per day.

Your task is to:

1. Write the objective function. (1 mark)

2. Write the constraints for production and logistics. (2 marks)

3. Plot or draw the feasible region and the isoprofit line. (3 marks)

4. Find the optimal numbers of conventional cars and EVs to be produced (the solution) in order to achieve the maximum profit (the optimal value). (2 marks)

5. Explain the solution analytically or using the graph. (2 marks)
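One way to cross-check a graphical answer is the corner-point method: a linear objective over a bounded feasible region attains its maximum at a vertex, so it is enough to evaluate the profit at every feasible intersection of two constraint lines. The sketch below does this in plain Python. It assumes x is the number of conventional cars and y the number of EVs per day, and writes every constraint as a*x + b*y <= c; the names CONSTRAINTS, profit, feasible and corner_points are illustrative only.

```python
from itertools import combinations

# Each constraint is (a, b, c), meaning a*x + b*y <= c,
# with x = conventional cars per day and y = EVs per day (assumed naming).
CONSTRAINTS = [
    (1, 0, 1000),    # x <= 1000   (factory capacity, conventional)
    (0, 1, 500),     # y <= 500    (factory capacity, EVs)
    (1, 1, 750),     # x + y <= 750 (logistics and warehouse)
    (-1, 0, -100),   # x >= 100    (contract, conventional)
    (0, -1, -50),    # y >= 50     (contract, EVs)
]


def profit(x, y):
    """Objective: profit per day for x conventional cars and y EVs."""
    return 4000 * x + 5000 * y


def feasible(x, y, tol=1e-9):
    """True if the point satisfies every constraint (within a small tolerance)."""
    return all(a * x + b * y <= c + tol for a, b, c in CONSTRAINTS)


def corner_points():
    """Intersect every pair of constraint lines and keep the feasible points."""
    points = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(CONSTRAINTS, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:          # parallel lines never intersect
            continue
        x = (c1 * b2 - c2 * b1) / det   # Cramer's rule
        y = (a1 * c2 - a2 * c1) / det
        if feasible(x, y):
            points.add((x, y))
    return sorted(points)


best = max(corner_points(), key=lambda p: profit(*p))
for point in corner_points():
    print(point, profit(*point))
print("best corner:", best)
```

This enumeration only scales to small problems; for more variables a solver such as lpSolve is the more practical route.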


2.2 Solution using solver (10%)

Assume that the manufacturer adds hybrid vehicles to their range, which generate £4,500 profit per car. The factory can produce up to 1,000 hybrid cars per day, and there is no obligation to produce a minimum number of them. The warehouse constraints remain the same.

Your task is to:


1. Write the updated objective function. (1 mark)

2. Write the updated constraints for production and logistics. (2 marks)

3. Use the lpSolve library in R to find the solution (an optimal combination of three types of cars) and the optimal value (the maximum profit) to this problem. Include your code in the report. (5 marks)

4. What happens if the contract to supply at least 100 conventional cars is replaced by the same number of hybrid cars? (2 marks)


3 Data import, plotting and transformation (20%)

In this task, you should use the crypto-candles dataset, which contains daily exchange rates between major crypto-currencies from January 2019 to September 2020. This dataset can be downloaded from the data folder on the course's webpage (My Learning). There are two versions of this dataset:

crypto-candles.csv
crypto-candles.db

The first is a comma separated values (csv) file, and the second is an SQLite database. You can read the csv file with the read.csv command. Alternatively, you can connect to the database and read table candles from the db file using the dbReadTable command. You should assign the result to a variable, which you may call candles. Note that if you read data from the database, then the timestamps have to be converted into dates by the command:

Regardless of which method you use, you should now have the same dataset, the first 6 rows of which are:

   TIMESTAMP            OPEN   CLOSE  HIGH   LOW    VOLUME      SYMBOL
1  2020-09-16 01:00:00  10789  10803  10803  10789    12.05823  tAAABBB
2  2020-09-14 01:00:00  10316    500  10377    500  4658.04400  tAAABBB
3  2020-09-13 01:00:00  10459  10324  10586  10238   822.69780  tAAABBB
4  2020-09-12 01:00:00  10407  10443  10489  10297   499.03366  tAAABBB
5  2020-09-11 01:00:00  10355  10406  10415  10218   810.13170  tAAABBB
6  2020-09-10 01:00:00  10302  10355  10484  10271   901.84625  tAAABBB

The column SYMBOL contains names of the trading pairs (e.g. tBTCUSD is the exchange rate between Bitcoin and the US Dollar). Thus, each row of the dataset contains a record of the prices (open, close, high, low) and volume data on a specified date (given by the TIMESTAMP) and for each trading pair (given by SYMBOL).

The goal of this task is to make several transformations of this dataset into other formats, so that it is ready for further analysis in Coursework 2.


3.1 Plot prices (5%)

Plot closing prices against time for several (2–5) trading pairs, as in the graph below for the tBTCUSD pair:


[Figure: closing prices of BTCUSD, 2019-01-02 to 2020-09-19 01:00:00; vertical axis from 4000 to 12000, horizontal axis from Jan 02 2019 to Sep 19 2020.]


To do this, you will need to select subsets of the data corresponding to the trading pairs of your choice (e.g. tETHUSD, tETHBTC, tIOTBTC, tEOSJPY), and then select the columns TIMESTAMP and CLOSE. Note that before plotting a subset, you can convert it into the extensible timeseries format (xts) using the command:

where <data> is your subset for a specific trading pair. (5 marks)



3.2 Create prices table (10%)

Convert the dataset into a new format, which contains the closing prices for each trading pair in different columns side by side, ordered according to their TIMESTAMP. Thus, each row should correspond to a specific date and contain closing prices for all trading pairs, as shown below:


            tAAABBB  tABSUSD  ...   tBTCEUR  tBTCGBP   tBTCJPY  tBTCUSD
2019-01-02       NA       NA  ...  3577.195   3233.9  435370.0   4048.8
2019-01-03       NA   0.0075  ...  3441.500   3101.6  423820.0   3924.3
2019-01-04       NA   0.0078  ...  3476.100   3116.6  429750.0   3954.9
2019-01-05       NA       NA  ...  3432.478   3075.2  424538.3   3911.9
2019-01-06       NA       NA  ...  3653.900   3285.9  452870.0   4168.4
2019-01-07       NA   0.0077  ...  3583.700   3215.3  446690.0   4113.9

Note that the table above shows only some of the columns for just a few pairs (i.e. "tAAABBB", "tBTCUSD", etc). The dataset contains more than 270 trading pairs. One possible way of converting the candles is as follows:

●Create a list (or vector) all_pairs of all trading pair names. This can be done by selecting the SYMBOL column from the data, and then removing duplicates using the function unique.

●Create an empty variable prices that will be used to assemble closing prices for all pairs.

●Run a loop for (pair in all_pairs), in which

1. Select the TIMESTAMP and CLOSE columns from a subset for the pair and assign the result to a temporary table (e.g. call it temp).

2. Convert the result into class xts using CLOSE as data and order by TIMESTAMP (see the commands in Task 3.1).

3. Add the result to the prices table using the cbind function:


●When the loop finishes, name the columns: colnames(prices) <- all_pairs
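For readers working outside R, the same long-to-wide reshaping can be sketched in plain Python. This is only an illustration under stated assumptions: the function name prices_table and the tuple layout (timestamp, symbol, close) are hypothetical, the sample records reuse a few values from the example table above, missing prices are represented by None instead of NA, and pair names are sorted alphabetically, whereas R's unique keeps the order of first appearance.

```python
# Long-format records: (TIMESTAMP, SYMBOL, CLOSE).
# The values are taken from the example prices table in the text.
candles = [
    ("2019-01-03", "tABSUSD", 0.0075),
    ("2019-01-04", "tABSUSD", 0.0078),
    ("2019-01-07", "tABSUSD", 0.0077),
    ("2019-01-02", "tBTCUSD", 4048.8),
    ("2019-01-03", "tBTCUSD", 3924.3),
    ("2019-01-04", "tBTCUSD", 3954.9),
    ("2019-01-05", "tBTCUSD", 3911.9),
    ("2019-01-06", "tBTCUSD", 4168.4),
    ("2019-01-07", "tBTCUSD", 4113.9),
]


def prices_table(records):
    """Pivot long records into one row per date and one column per trading pair."""
    all_pairs = sorted({symbol for _, symbol, _ in records})   # unique pair names
    dates = sorted({ts for ts, _, _ in records})               # ISO dates sort chronologically
    close = {(ts, symbol): price for ts, symbol, price in records}
    # A missing (date, pair) combination becomes None, the analogue of NA.
    rows = [[close.get((ts, pair)) for pair in all_pairs] for ts in dates]
    return dates, all_pairs, rows


dates, all_pairs, rows = prices_table(candles)
print("dimensions:", (len(rows), len(all_pairs)))
print("columns:", all_pairs)
for date, row in zip(dates, rows):
    print(date, row)
```

Building a dictionary keyed by (date, pair) plays the role that the repeated cbind on time-indexed xts objects plays in the R recipe: both align prices on their dates and leave gaps where a pair has no record.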

In your report, you should print the dimensions of the resulting table using the command dim(prices) and show the first 10 rows and 5 randomly chosen columns using the command:

3.3 Convert prices to log-returns (5%)

If s(t) and s(t + 1) are the prices on two consecutive days, then the difference s(t + 1) − s(t) is called a return, while the difference of their logarithms is called a log-return:

ln s(t + 1) − ln s(t) = ln( s(t + 1) / s(t) )
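As a small worked illustration of this definition, the sketch below computes log-returns for a short price series in plain Python. The function name log_returns is hypothetical, and the prices are the six tBTCUSD closing prices shown in the example table of Task 3.2.

```python
import math


def log_returns(prices):
    """Return ln s(t+1) - ln s(t) for each pair of consecutive prices."""
    return [math.log(later) - math.log(earlier)
            for earlier, later in zip(prices, prices[1:])]


# tBTCUSD closing prices for 2019-01-02 to 2019-01-07, from the example table.
btcusd = [4048.8, 3924.3, 3954.9, 3911.9, 4168.4, 4113.9]

returns = log_returns(btcusd)
for r in returns:
    print(round(r, 6))

# Log-returns add up over time: their sum equals the log-return over the whole period.
print(math.isclose(sum(returns), math.log(btcusd[-1] / btcusd[0])))
```

A series of n prices yields n − 1 log-returns, and an unchanged price gives a log-return of exactly zero.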

Returns and log-returns represent price changes and are more interesting for analysis and forecasting than the prices themselves. Thus, in this task you have to convert the prices table into a table of their log-returns, called log_returns. This can be done by the following command:

Notice the use of as.matrix to preserve the precision of numerical operations as well as dates as the rownames. This allows us to convert the result into the timeseries format using as.xts, which uses dates in the rownames as time index. The problem is that the prices table has some data missing: notice the NA ('not available') entries in the prices table. This is because prices for some pairs were not available on certain dates in the original dataset (e.g. some pairs were not traded during some periods and no prices were recorded). Thus, before the log-returns can be computed, the NA entries must be filled in by some values. This can be done in the following way:

1. Fill in the NA values with the last observations (i.e. the most recent available prices). This is equivalent to assuming that the price remained the same rather than being missing, so the log-return will be zero rather than NA.

2. In cases where no earlier prices were observed (e.g. a pair was not traded before a certain date), fill in the NA values with the next observations. This means that the log-returns will also be zero in this case rather than NA.

If the prices table is in the xts (timeseries) format, then both operations above can be performed by the function na.locf and using its argument fromLast to control the direction of the operations.
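The two filling steps can be illustrated in plain Python as a forward fill followed by a backward fill. This is a sketch of the idea only, not of na.locf itself: the function names fill_forward and fill_na are hypothetical, None stands in for NA, and the sample column is the tABSUSD column from the example prices table.

```python
def fill_forward(values):
    """Replace each None with the most recent earlier value (leading Nones stay None)."""
    filled, last = [], None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def fill_na(values):
    """Step 1: carry the last observation forward. Step 2: fill any leading gap backward."""
    forward = fill_forward(values)
    # Running the same forward fill on the reversed list fills from the next observation.
    return fill_forward(forward[::-1])[::-1]


# tABSUSD closing prices for 2019-01-02 to 2019-01-07, with None where the table shows NA.
absusd = [None, 0.0075, 0.0078, None, None, 0.0077]

print(fill_forward(absusd))
print(fill_na(absusd))
```

After both steps no gaps remain as long as the column holds at least one observed price, and every filled entry repeats a neighbouring price, which is why the corresponding log-returns come out as zero.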

After filling in the NA values, convert the prices table into log_returns, as described above. In your report, you should print the dimensions of the resulting table using the command dim(log_returns) and show the first 10 rows and 5 randomly chosen columns using the command:

In addition, include a couple of plots of log-returns for some of the pairs (e.g. those used in Task 3.1). For example, below is the plot of log-returns of tBTCUSD:

Presentation

Your report should be well presented. A good guide is the Publication Manual of the American Psychological Association (e.g. see http://www.apastyle.org/). At the very least, your report should be a clear, typed or neatly hand-written document with good spelling and grammar, written in easy-to-understand English. There is no word limit, but a useful report should be just long enough to describe the work. A sensible limit is about 10 pages of typed text. Beyond this, you are probably being a bit too verbose. Tables, graphs, careful labelling and numbering are all well established and effective presentation tools.

Things to avoid are:

●Including images or diagrams that you did not create yourself and do not have the author's permission to use (even if the image is from the Internet).

●Including graphs or diagrams that you do not describe in the text.

●Forgetting to label the axes on the charts.

●Using 3D charts to display 2D information.

●Including material irrelevant to the work.

