联系方式

  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-21:00
  • 微信:codinghelp

您当前位置:首页 >> Python编程Python编程

日期:2019-03-16 11:09

MSCI 517: Introduction to Python

Coursework III (60% of module)

Deadline: 22/3/19, 10AM Lent Term Maximum Marks: 100

1 Coursework Description

This coursework is composed of two parts. You have been given the flexibility to structure the

code in the way that you think is best following the principles of re-usability, maintainability,

information hiding and clarity.

Part 1 (30 Marks)

You should submit a single .ipynb file containing a working version of all the tasks. The name

of this file should be your library number (e.g. 123456789.ipynb, where 123456789 is an example

library number). In case you want to import any packages apart from pandas, matplotlib, numpy,

math and seaborn, you should contact the module leaders first.

A newly appointed health minister has heard reports that there are some hospitals in England

where the average Length of Stay (LoS) of patients is over 1 week, whereas in many others the

average LoS is under 1 week. You will need to download the file patients.csv from Moodle. The

file contains the following data for patients admitted to one of five NHS hospitals in England:

Patient ID String

Admission type String

Age Integer

Height Integer

Weight Integer

Hospital String

LoS in hours Integer

Task 1: (2 marks) Give the code that creates a DataFrame from the information in patients.csv.

The DataFrame should be called df.

Task 2: (10 marks) LoS in weeks can be used to provide the following “Status”:

LoS in weeks Status

< 1 “Short LoS”

= 1 “Normal LoS”

> 1 and < 3 “Long LoS”

≥ 3 “Too Long LoS”

Please Turn Over

MSCI 517 Introduction to Python Page 2 of 4

Give the commands that display the following columns: Patient ID, LoS in weeks, Status, and

Flag. If Status is “Long LoS” or “Too Long LoS”, then Flag should be set to “True”, else it should

be set to “False”.

Task 3: (6 marks) Give the code that calculates the correlation between the Age and Height

(using Pearson’s correlation coefficient). Also give the code for producing a scatter-plot of this

data.

Task 4: (6 marks) Give the code that creates a histogram of the Height and fit a normal curve

over the histogram. Also give the code that calculates the mean, median, standard deviation, range

and inter-quartile range of the Height.

Task 5: (6 marks) Give the code that displays only those patients with LoS more than one week.

Now give the code that uses this data to calculate the mean Age of patients from each of the five

NHS hospitals.

Part 2 (70 Marks)

You should submit a single .py file containing a working version of your program. The name of

this file should be your library number (e.g. 123456789.py, where 123456789 is an example library

number). In case you want to import any packages apart from pandas, matplotlib, numpy, math

and seaborn, you should contact the module leaders first.

In this part, you are to write a program that attempts to solve a machine delivery problem using

stochastic optimisation techniques. Specifically your program will read in a problem instance from

a file and will attempt to produce a solution using heuristic methods.

It is important to break your code into small, meaningful functions and appropriately comment

your code.

Problem description

There are machine requests from n customers that all have to be satisfied. There is one depot

location where all the machines are located. Trucks are hired to transport the machines from

the depot to the customers. The number of available trucks is b

nc, where n is the number of

customers. All trucks must be allocated exactly bn/√

nc deliveries (locations). The route of a truck

must start and end at the depot. We assume that it does not take any time to load a machine at

the depot or to unload a machine at a customer.

The objective is to minimise the total distance travelled by the trucks. In order to determine the

travelled distances, integer coordinates are provided for the customer locations. The location of

p

the depot is always (0, 0). The distance between coordinates (x1, y1) and (x2, y2) is defined as

(x1 ? x2)

2 + (y1 ? y2)

2, i.e., the Euclidean distance.

The problem instances, which are available in .csv format, contain the customer IDs and the x and

y coordinates of the customer locations (all in integers). An example of a problem instance file

I1.csv is shown below.

Please Turn Over

MSCI 517 Introduction to Python Page 3 of 4

ID x y

0 2 2

1 -2 1

2 0 -1

3 3 1

In this example, we have n = 4 customers. The number of available trucks is b

nc = 2. All trucks

must be allocated bn/√

nc = 2 deliveries (locations). The locations of the first, second, third and

fourth customers are (2, 2), (-2, 1), (0, -1) and (3, 1), respectively.

Using the problem instance given above, a solution which specifies the routes for the trucks can be

stored as follows: [ [2, 1], [3, 0] ]. Here the route for the first truck is given as [2, 1], and the route

for the second truck is given as [3, 0].

Note that all trucks depart from and must return to the depot, so this is not mentioned explicitly

in this solution format. The distance travelled by the first truck is 6.064. This can be computed

by calculating the straight line distance between the coordinates of the locations (i.e. distance

between depot (0, 0) and customer 2 (0, -1) + distance between customer 2 (0, -1) and customer

1 (-2, 1) + distance between customer 1 (-2, 1) and depot (0, 0)). The distance travelled by

the second truck in this example is 7.405. Therefore, the total distance travelled by all trucks is

6.064 + 7.405 = 13.469.

Task 1: (5 marks) On Moodle you will find three problem instances: I1.csv, I2.csv and I3.csv.

The first task is to write a Python program that reads the coordinates of the customer locations

from a given file.

Task 2: (10 marks) Generate an initial solution to this problem at random. To get the full mark

you will need to add some ‘intelligence’ to the procedure to produce better quality initial solution.

Task 3: (10 marks) Implement a TotalDistance function that calculates the total distance of the

solution in the way described above.

Task 4: (30 marks) Implement a function that improves the initially generated solution using

e.g. hill climbing method which accepts only non-worsening moves. To do this, you will need to

implement an appropriate operator, e.g. swap operator. To get the full mark, you should make

your method more efficient and powerful. Here are some ideas: When performing a move, you may

notice that only few trucks are altered. Thus, you may be able to speed up your program by only

re-evaluating the effected trucks. You may also consider advanced techniques such as simulated

annealing, tabu search, or great deluge algorithms (you will need to research these yourselves).

You might also consider more than one operators (e.g. swap and insert operators). In any case,

you program should be run for a fixed number of iterations between 100,000 to 1,000,000.

Task 5: (10 marks) At the end of the run, your best obtained solution should be saved in a .csv

file. The file specifies the routes for the trucks, and the last row displays the total distance of

the solution. This format is extremely important, as I will be using my own checker software tool

to automatically examine your solution and your calculated total distance. The example solution

described above would be displayed:

Please Turn Over

MSCI 517 Introduction to Python Page 4 of 4

2 1

3 0

13.469

Task 6: (5 marks) Update your code to provide a visual representation of the best solution. Below

an example of a visual representation to a solution produced by solving the problem instance I2.csv.

-40 -30 -20 -10 0 10 20 30 40

Truck 1

Truck 2

Truck 3

Truck 4

Truck 5

Truck 6

2 Coursework Submission

The coursework deadline is 10:00AM, Friday 22nd March, 2019. In accordance with University

regulations, marks are deducted from any coursework which is not submitted by the deadline.

However, if an extension is given then the rule applies from the date of the extension.

3 Plagiarism

According to university rules, instances of plagiarism will be treated very seriously. Penalties are

in line with the institutional framework of the University. It is natural that students will sometimes

exchange ideas and help each other out in completing coursework of this nature. However, please

be mindful that flagrant instances of copying will result in both parties receiving zero.

Table 1: Version Control System

Version Date Author Changes

1.0 23/02/19 Ahmed Kheiri -

2.0 27/02/19 Ahmed Kheiri math package is allowed.

2.1 28/02/19 Ahmed Kheiri clarify how distance is calculated


版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。 站长地图

python代写
微信客服:codinghelp