ECON215 Coursework
The ECON215 Coursework Exercise is 100% of the module mark. The deadline for submitting the assignment (online only, see below) is 5pm, Thursday 12th December. All exercises in the ECON215 Coursework Exercise need to be completed using Python 3 code in this Jupyter Notebook file, which needs to be converted to an HTML file as follows before submitting:
Within Jupyter, click on File then Download as → HTML (.html). The file will now be located in the Downloads directory of your MWS computer.
Save this to your M drive. One way to do this on MWS computers is by right-clicking on the file, selecting copy, then right-clicking in your M drive and selecting paste.
Once this is done, the final step is as follows:
Submit the .html version of your assignment via the link that will appear in the Assessment folder on Vital shortly.
The assignment comprises three sections, with the following weights: Section A (40%, 5 marks per question), Section B (40%, 10 marks per question) and Section C (20%, 10 marks per question). Where a question has parts (a) and (b), the marks are split evenly.
Section A
Q1 (Python lists) Can you complete the print() statement below, selecting the correct item in the drinks list in order to print 'tea':
In [ ]:
drinks = ['orange juice', 'tea', 'water']
print( )
Q2 (Python lists) Add an additional drink item, 'pineapple juice', to the list, then print the updated version of drinks.
In [ ]:
Q3 (Python lists) Create a list containing any two strings and any two integers.
In [ ]:
Q4 (Python lists) Remove the last integer in the list created in Q3.
In [ ]:
Q5 (Python dicts) Create a Python dict variable called passengers_2017, containing the following information (key : value pairs) on the total number of aircraft departures and arrivals in thousands. That is, set the keys to be the airport names and the values to be the numbers.
Heathrow    474
Gatwick     282
Manchester  194
In [ ]:
passengers_2017 = { }
Q6 (Python functions) Define a function my_function that returns the value of f(x) = 2x when supplied with a value of x.
In [ ]:
# the function definition:
def my_function( :
Q7 (Python functions) Define a function my_function2 that returns the value of f(x, y) = x + y when supplied with values of x and y.
In [ ]:
Q8 (Python for loop) Complete the following code in order to print the items in alist.
In [ ]:
alist = ['a', 'bb', 'ccc']
for item in alist:
Section B
In your solutions to the Section B questions, you must also include an explanation of each step in your code. For example, if you use functions belonging to a particular module, or methods belonging to a particular object, or core Python functions, you should say so. If you set particular options inside the functions or methods, you should explain what these do. If you create an object of a particular type, e.g. DataFrame or list, you should say so. As many steps as possible should be explained, though it is not necessary to re-explain any identical steps within a given solution (e.g. repeated use of the same function from the same module).
The explanation should be included as comments inside the code cell. Recall that there are two ways of writing comments in Python:
# This is a single line comment
"""
This is
a multi-line
comment
"""
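As a rough illustration of the level of explanation described above, the following sketch comments each step of a short calculation. The data and variable names are hypothetical and are not part of any question. Note that the triple-quoted block is strictly an unassigned string literal that Python evaluates and discards, which is why it can serve as a multi-line comment.

```python
import statistics

# Create a list (a core Python object type) of three hypothetical float values.
values = [2.0, 4.0, 9.0]

"""
statistics is a module in the Python standard library.
Its mean() function returns the arithmetic mean of the values passed to it.
"""
average = statistics.mean(values)

# print() is a core (built-in) Python function; the f-string inserts the
# value of the variable average into the text.
print(f"Mean: {average}")
```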
Q1 (matplotlib line plotting) Using matplotlib and the lists of data xdata and ydata provided in the code cell below, plot the ydata against xdata.
In [ ]:
import matplotlib.pyplot as plt
xdata = [0,1,2,3,4,5,6,7,8,9,10]
ydata = [0,2.2,4.3,2.1,9.3,3.2, 13.4,14.1,15.6,7.7,10.2]
Q2 (matplotlib line plotting) Using matplotlib and the lists of data xdata, ydata and zdata provided in the code cell below, can you create graphs of ydata against xdata, and zdata against xdata, both on the same plotting area?
Better answers will specify different line styles, marker styles and colours for each line, and will include a legend describing the lines (y and z). Moreover, better answers will include labels for each axis (x and f(x)), a title for the plot, and will use the annotate() function from matplotlib.pyplot to point at the intersection of the lines.
In [ ]:
import matplotlib
import matplotlib.pyplot as plt
xdata = [1, 2, 3, 4, 5]
ydata = [5, 4, 3, 2, 1]
zdata = [2.5, 3, 3, 4, 4.5]
Q3 (Loading data, changing the DataFrame index, Figure objects and subplots) For this question, please download the files arable_land_2000.txt and arable_land_2001.txt from Vital and put them in the same directory as this Jupyter Notebook. This question has two parts, (a) and (b). Please put your solutions to parts (a) and (b) in the separate code cells below.
(a) Suppose you would like to read the contents of the files arable_land_2000.txt and arable_land_2001.txt into pandas as DataFrame objects, but you don't know the contents of the files. Show how you can look at the contents of these text files using Python, then read both of them into pandas as arable_land_2000 and arable_land_2001, setting the 'Country Name' column to be the row index in each case, and with the 'Country Name' column dropped in both cases.
Note: Part of question (a) is to read the data directly from the text files into the DataFrame objects. However, if you are unable to do this step, you can produce the required DataFrame objects in any other way (e.g. by copying and pasting values into code to create DataFrames from dicts) and continue with the rest of the question for partial marks.
(b) Show how, using a matplotlib Figure object and the plot() method for pandas DataFrame and Series objects, or otherwise, you can create a figure containing two bar chart plots side-by-side, one for each of the two datasets arable_land_2000.txt and arable_land_2001.txt. Each plot should have a title and the spacing between the plots should be increased.
Note: If you choose not to use the pandas plot() method directly on the pandas objects arable_land_2000 and arable_land_2001, and instead use a matplotlib method, recall that the matplotlib methods only work for Series, not for DataFrame. The objects arable_land_2000 and arable_land_2001 will be DataFrames, so you would need to select the data column from each (a Series) in order to do this. Two cosmetic points (optional): as part of your solution, however you do it, you should have two AxesSubplot objects within your code. If these are called ax1 and ax2, note that the unnecessary x-axis label 'Country Name' in the plots can be removed by running ax1.set_xlabel(' ') and ax2.set_xlabel(' '). Moreover, if you are using the pandas plot() method, note that the unnecessary legend can be removed by setting the option legend = False.
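The two cosmetic points in the note can be sketched as follows. This is only an illustration under stated assumptions: the DataFrames df_a and df_b, the country names, the 'Value' column and the plot titles are all hypothetical and do not come from the coursework files, and the wspace value is an arbitrary choice.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data (NOT taken from the coursework files): two small
# DataFrames whose row index is a column named 'Country Name'.
df_a = pd.DataFrame(
    {"Country Name": ["Aland", "Borduria"], "Value": [10.0, 20.0]}
).set_index("Country Name")
df_b = pd.DataFrame(
    {"Country Name": ["Aland", "Borduria"], "Value": [12.0, 18.0]}
).set_index("Country Name")

# Create an empty matplotlib Figure, then add two Axes in a 1x2 grid.
fig = plt.figure(figsize=(8, 3))
ax1 = fig.add_subplot(1, 2, 1)
ax2 = fig.add_subplot(1, 2, 2)

# pandas plot() method: kind='bar' draws a bar chart, ax= selects the Axes
# to draw on, title= sets the plot title, legend=False omits the legend.
df_a.plot(kind="bar", ax=ax1, title="Hypothetical year A", legend=False)
df_b.plot(kind="bar", ax=ax2, title="Hypothetical year B", legend=False)

# Replace the automatic x-axis label (the index name) with a blank one.
ax1.set_xlabel(" ")
ax2.set_xlabel(" ")

# Widen the horizontal gap between the two plots (0.5 is an arbitrary value).
fig.subplots_adjust(wspace=0.5)
```

In a Jupyter Notebook the figure is normally rendered when the cell finishes running; in a plain script a call to plt.show() would be needed.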
Answer to (a):
In [ ]:
Answer to (b):
In [ ]:
Q4 (Python for loop) This question has two parts, (a) and (b). Please put your solutions to parts (a) and (b) in the separate code cells below.
(a) Consider the following code, which contains a list comprehension.
alist = ['a', 'bb', 'ccc']
newlist=[len(item) for item in alist]
Write a for loop that does the equivalent of the above, starting with the empty list newlist=[] and using the list method append().
(b) Using a for loop, the list letters and the variable count that have been defined below, and either format strings or the string format() method, can you create the following output?
Item 1 in letters is A
Item 2 in letters is B
Item 3 in letters is C
Answer to (a):
In [ ]:
alist = ['a', 'bb', 'ccc']
Answer to (b):
In [ ]:
letters = ['A', 'B', 'C']
count = 0
In [ ]:
Section C
As in Section B, your solutions to the Section C questions must also include an explanation of each step in your code. For example, if you use functions belonging to a particular module, or methods belonging to a particular object, or core Python functions, you should say so. If you set particular options inside the functions or methods, you should explain what these do. If you create an object of a particular type, e.g. DataFrame or list, you should say so. As many steps as possible should be explained, though it is not necessary to re-explain any identical steps within a given solution (e.g. repeated use of the same function from the same module).
Q1 (merging DataFrames, and operating on DataFrames) For this question, please download the files poverty_2000.csv, poverty_2001.csv, poverty_2002.csv and poverty_2003.csv from Vital and put them in the same directory as this Jupyter Notebook.
The code below produces four pandas Series objects, pov_2000, pov_2001, pov_2002 and pov_2003, each containing poverty headcount ratio data (percentage of population) at $1.90 a day (2011 PPP) for countries in a given year (2000, 2001, 2002 and 2003, respectively).
Use the merge() function to join the four Series into a DataFrame with four columns, where only data for the countries common to all four Series is included (there should be no missing data in the merged DataFrame). Then use pandas functions or methods to create a new Series containing the average values over 2000-2003.
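To illustrate what "only the countries common to all Series" means, here is a minimal sketch of an inner merge on the row index, followed by a row-wise average. It assumes two small hypothetical named Series (s_a and s_b, with made-up country names and values) rather than the coursework data; pandas merge() accepts Series only if they have a name.

```python
import pandas as pd

# Hypothetical named Series (NOT the coursework data), indexed by country.
s_a = pd.Series({"Aland": 1.0, "Borduria": 2.0, "Cascadia": 3.0}, name="2000")
s_b = pd.Series({"Borduria": 4.0, "Cascadia": 5.0, "Dorne": 6.0}, name="2001")

# pd.merge() is a pandas function. left_index/right_index=True match rows on
# the index labels, and how='inner' keeps only labels present in both inputs.
merged = pd.merge(s_a, s_b, left_index=True, right_index=True, how="inner")

# DataFrame.mean(axis=1) averages across the columns of each row,
# returning a Series with one value per remaining country.
row_means = merged.mean(axis=1)

print(merged)
print(row_means)
```

Because the result of merge() is itself a DataFrame, it can be passed as the left argument of a further merge() call, which is one way of combining more than two inputs.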
Q2 (Indexing, hierarchical indexing, groupby, aggregation, data analysis) For this question, please download the files poverty_full.csv and region_and_incomegroup.csv from Vital and put them in the same directory as this Jupyter Notebook.
Read both files into DataFrame objects, then use loc or iloc to create a new DataFrame containing only the following columns from the file poverty_full.csv: Country Name, Country Code, and the years 2010 to 2015.
Then obtain a new DataFrame that inner merges the two DataFrames on 'Country Code'.
Using this data, perform some exploratory data analysis. The better answers will, as part of this, set a hierarchical index for the rows using 'IncomeGroup' and 'Country Name', use groupby() to find statistics by Region and IncomeGroup, and will include some visualisation of the data.
Two code cells are provided below for your solution: the first is for the data analysis, where you should print out key results, while the second is for any visualisation.
In [ ]:
# pandas code here
In [ ]:
# visualisation code here