Date: 2019-12-06 10:50

CS602 - Data-Driven Development with Python Fall 2019 Programming Assignment 6

Programming Assignment 6

Getting started

Review the class handouts and examples, and work through the reading and practice assignments posted on the course schedule. This assignment is designed to practice data manipulation with Pandas and plotting with Matplotlib.

Programming Project: Plotting (worth 25 points)

Create a plot and barcharts visualizing hotel ratings.

Data and program overview

In this assignment you will work with data on hotel reviews. The task is to create a plot showing the mean rating and number of reviews for a selection of hotels in a chosen state, and barcharts showing the percentage of reviews with each rating.

The following data will be provided using csv files:

• A table with information on hotel locations (hotels.csv); we will call this the hotel data.

• A table with records of customer reviews of their stays in the hotels (hotelreviews.csv), referred to henceforth as the reviews data.

Each review references the hotel name and city; together, these two values uniquely identify the corresponding hotel in the hotel data.

The files are supplied in a zip file, which creates a data subfolder when unpacked. Review the data before you read the rest of this handout.
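Because a review is tied to its hotel by name and city, the two tables can be combined on those two columns. The following is only a sketch under stated assumptions: the column names 'name', 'city' and 'province' are taken from the sample output later in this handout, the 'rating' column name is hypothetical, and the tiny tables with made-up ratings stand in for the real csv files.

```python
import pandas as pd

# Column names are assumptions; check them against the real csv files.
NAME = "name"
CITY = "city"
STATE = "province"
RATING = "rating"  # hypothetical name of the rating column in the reviews data

# Tiny stand-ins for hotels.csv and hotelreviews.csv (ratings are made up).
hotels = pd.DataFrame({
    NAME: ["Park Terrace Suites", "40 Berkeley Hostel"],
    CITY: ["Phoenix", "Boston"],
    STATE: ["AZ", "MA"],
})
reviews = pd.DataFrame({
    NAME: ["40 Berkeley Hostel", "40 Berkeley Hostel", "Park Terrace Suites"],
    CITY: ["Boston", "Boston", "Phoenix"],
    RATING: [4, 5, 3],
})

# Hotel name plus city identify a hotel, so join on both columns.
merged = reviews.merge(hotels, on=[NAME, CITY], how="inner")
print(merged)
```

After the merge, every review row also carries the state of its hotel, which makes it possible to filter reviews by state or city.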

Overview of the program

The program should work as follows.

1. Ask the user for the subfolder and the names of the two data files (see the sample interaction).

2. Ask the user to enter a state, and verify that it is one of the states for which hotel information is available in the hotel data. If the input is not found in the appropriate column of the hotel data, repeat the prompt until a valid state is entered.

3. Identify all cities with a hotel in the state, based on the hotel data. Display a numbered list of these cities and ask the user to enter up to four numbers from the list. Repeat the prompt until the user enters one to four numbers from the numbered list. You may assume the user will enter numbers only.

4. Identify all hotels located in the selected city or cities (you can assume city names are unique across all states). Display the names of these hotels.

5. Display a hotel reviews plot (described below), and save the plot as plot1.jpg using the plt.savefig() function.

6. For the three highest-rated hotels among those selected, display a rating percentage barchart showing the percentage of reviews with each specific rating (described below). Save these plots as barchart1.jpg, barchart2.jpg, and barchart3.jpg.
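Step 3 asks for a numbered sequence of cities. One possible way to build it is a pandas Series indexed from 1, sketched below. The helper name cities_in_state, the column names, and the alphabetical ordering are assumptions made for illustration, not requirements of the handout.

```python
import pandas as pd

CITY = "city"        # assumed column name
STATE = "province"   # assumed column name


def cities_in_state(hotels, state):
    """Return the unique city names of one state as a Series numbered from 1."""
    in_state = hotels.loc[hotels[STATE] == state, CITY]
    names = sorted(in_state.unique())
    return pd.Series(names, index=range(1, len(names) + 1))


# Toy hotel data standing in for hotels.csv.
hotels = pd.DataFrame({
    CITY: ["Boston", "Cambridge", "Boston", "Tucson"],
    STATE: ["MA", "MA", "MA", "AZ"],
})
cities = cities_in_state(hotels, "MA")
print(cities)
```

Numbering from 1 means the numbers typed by the user can be used directly as index labels, for example cities.loc[[1, 2]].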

Sample interactions: user input appears in boldface with the generated plots shown after the text of the

interaction.

Please enter names of the subfolder and files: data hotels.csv hotelreviews.csv

Please enter state, e.g. MA: MA

1 Auburn

2 Boston

3 Brockton

4 Cambridge

5 Fitchburg

6 West Springfield

dtype: object

Select cities from above list by entering up to four indices on the same line: 2 4

You have selected the following cities:

city

2 Boston

4 Cambridge

Displaying rating information for the following hotels:

name city province

0 The Inn @ St. Botolph Boston MA

1 40 Berkeley Hostel Boston MA

2 A Bed & Breakfast In Cambridge Cambridge MA

3 Holiday Inn Express Hotel and Suites Cambridge Cambridge MA

Exiting...

The following interaction demonstrates how invalid input should be handled and the messages to be

shown for invalid input (highlighted). The graphs are omitted.

Please enter names of the subfolder and files: data hotels.csv hotelreviews.csv

Please enter state, e.g. MA: Provence

We have no data on hotels in Provence

Please enter state, e.g. MA: az

We have no data on hotels in az

Please enter state, e.g. MA: AZ

1 Eloy

2 Glendale

3 Mesa

4 Payson

5 Phoenix

6 Prescott Valley

7 Tucson

8 Wellton

dtype: object

Select cities from above list by entering up to four indices on the same line: 5 6 8 10

Selection must range from 1 to 8

Select cities from above list by entering up to four indices on the same line: 1 2 3 4 5 6 7

You selected 7 items, must select up to four

Select cities from above list by entering up to four indices on the same line: 5 7

You have selected the following cities:

city

5 Phoenix

7 Tucson

Displaying rating information for the following hotels:

name city province

0 La Quinta Inn and Suites Tucson - Reid Park Tucson AZ

1 La Posada Lodge & Casitas, An Ascend Hotel Col... Tucson AZ

2 Residence Inn By Marriott Tucson Williams Centre Tucson AZ

3 Holiday Inn Express & Suites Phoenix Downtown ... Phoenix AZ

4 Park Terrace Suites Phoenix AZ
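The validation shown in the interaction above could be separated from the input loop, which makes it easier to check. The sketch below is one possible arrangement: the function names check_state and check_selection are hypothetical, the messages are copied from the sample interaction, and the handling of an empty line is an assumption, since the handout does not show that case.

```python
MAX_PICKS = 4  # the handout allows one to four cities


def check_state(text, known_states):
    """Return None if the state is known, otherwise the error message.

    The comparison is case-sensitive, matching the sample where "az" is rejected.
    """
    if text in known_states:
        return None
    return f"We have no data on hotels in {text}"


def check_selection(text, n_cities):
    """Return (picks, None) for valid input, otherwise (None, error message)."""
    picks = [int(token) for token in text.split()]  # input is assumed numeric
    if not 1 <= len(picks) <= MAX_PICKS:
        return None, f"You selected {len(picks)} items, must select up to four"
    if any(not 1 <= p <= n_cities for p in picks):
        return None, f"Selection must range from 1 to {n_cities}"
    return picks, None


print(check_state("az", {"AZ", "MA"}))
print(check_selection("5 6 8 10", 8))
print(check_selection("5 7", 8))
```

A prompt loop would then call input(), pass the text to these functions, print the message if there is one, and repeat until the result is valid.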

Overview of plots

1. Hotel reviews plot is the first plot shown in the interaction.

For each hotel in the selected cities, this plot shows the number of reviews as the x coordinate and the average rating as the y coordinate. Hotels must be displayed as colored points, annotated with the hotel name, and (for full credit) colored according to their city, as shown in the plot legend. Axes must be clearly labeled as shown, and the title should be as shown.

Do not worry about the hotel names overlapping due to placement of annotations, or extending

beyond the plot boundaries.

2. Rating percentage barchart is generated for each of the three top-rated hotels. Each barchart displays a bar graph produced using the Matplotlib function plt.bar(), showing what percentage of all reviews have each specific rating (1 through 5). This percentage is computed by dividing the number of reviews with that rating by the total number of reviews for the hotel.

The percentage value must be displayed clearly on top of each bar. Axes must be clearly marked and labeled as shown, and the title of the chart should include the name of the hotel, its city and state, and the total number of reviews of the hotel.
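The two plots described above might be produced along the following lines. This is a sketch under stated assumptions rather than the expected solution: the 'rating' column name, the parameters of both functions, the axis label text, and the title format are placeholders (the handout's reference figures are not reproduced here), the ratings are made up and assumed to be whole numbers, and the images are saved to a temporary folder only to keep the example self-contained.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

NAME, CITY, RATING = "name", "city", "rating"  # RATING is a hypothetical name


def reviewsRatingsPlot(reviews, filename):
    """Scatter mean rating against review count, one color per city."""
    stats = (reviews.groupby([CITY, NAME])[RATING]
             .agg(["count", "mean"])
             .reset_index())
    fig, ax = plt.subplots()
    for city, group in stats.groupby(CITY):
        # Each scatter call takes the next color of the default color cycle.
        ax.scatter(group["count"], group["mean"], label=city)
        for _, row in group.iterrows():
            ax.annotate(row[NAME], (row["count"], row["mean"]))
    ax.set_xlabel("Number of reviews")  # placeholder label text
    ax.set_ylabel("Average rating")     # placeholder label text
    ax.legend(title="City")
    fig.savefig(filename)
    plt.close(fig)
    return stats


def ratingPercentageBarchart(ratings, title, filename):
    """Bar chart of the percentage of reviews with each rating 1 through 5."""
    counts = ratings.value_counts().reindex(range(1, 6), fill_value=0)
    pct = counts / len(ratings) * 100
    plt.figure()
    bars = plt.bar(pct.index, pct.values)
    for bar, value in zip(bars, pct.values):
        # Write the percentage just above the top of each bar.
        plt.text(bar.get_x() + bar.get_width() / 2, bar.get_height(),
                 f"{value:.1f}%", ha="center", va="bottom")
    plt.xlabel("Rating")                 # placeholder label text
    plt.ylabel("Percentage of reviews")  # placeholder label text
    plt.title(title)
    plt.savefig(filename)
    plt.close()
    return pct


# Toy reviews with made-up ratings.
reviews = pd.DataFrame({
    NAME: ["40 Berkeley Hostel"] * 4 + ["A Bed & Breakfast In Cambridge"],
    CITY: ["Boston"] * 4 + ["Cambridge"],
    RATING: [5, 5, 4, 1, 3],
})
out_dir = tempfile.mkdtemp()
plot_path = os.path.join(out_dir, "plot1.jpg")
bar_path = os.path.join(out_dir, "barchart1.jpg")

stats = reviewsRatingsPlot(reviews, plot_path)
hostel = reviews.loc[reviews[NAME] == "40 Berkeley Hostel", RATING]
title = f"40 Berkeley Hostel, Boston, MA ({len(hostel)} reviews)"
pct = ratingPercentageBarchart(hostel, title, bar_path)
```

Reindexing the counts over 1 through 5 keeps a zero-height bar for ratings that no review used, so every chart has the same five positions on the x axis.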

Required Functions

Include

• main() to read the location of the input files and call other functions to run the whole program;

• function pickStateAndCities() that will run the state and city selection procedure and return the user-chosen state and all cities in it;

• function selectHotelReviews() to select and return reviews for the hotels in the selected cities, so that the Hotel reviews plot can be generated;

• functions reviewsRatingsPlot() and ratingPercentageBarchart() to generate and save the appropriate plots.

Pick function parameters and return values as you see fit, and define other functions as needed.
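As an illustration of one possible parameter choice, selectHotelReviews() could take the reviews table and the list of selected cities. The sketch below also includes a hypothetical helper, top_rated, for finding the highest-rated hotels. Column names are assumptions and the ratings are made up.

```python
import pandas as pd

NAME, CITY, RATING = "name", "city", "rating"  # RATING is a hypothetical name


def selectHotelReviews(reviews, cities):
    """Return only the reviews of hotels located in the given cities."""
    return reviews[reviews[CITY].isin(cities)]


def top_rated(reviews, n=3):
    """Return the n (name, city) pairs with the highest mean rating."""
    means = reviews.groupby([NAME, CITY])[RATING].mean()
    return means.nlargest(n)


# Toy reviews with made-up ratings.
reviews = pd.DataFrame({
    NAME: ["40 Berkeley Hostel", "40 Berkeley Hostel", "The Inn @ St. Botolph",
           "A Bed & Breakfast In Cambridge", "Park Terrace Suites"],
    CITY: ["Boston", "Boston", "Boston", "Cambridge", "Phoenix"],
    RATING: [5, 4, 3, 5, 1],
})
selected = selectHotelReviews(reviews, ["Boston", "Cambridge"])
best = top_rated(selected, n=2)
print(best)
```

Grouping by both name and city follows the handout's statement that the two together identify a hotel.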

General Requirements

• You can assume that the provided files will have all of the columns involved in the required computations, but the number and content of records and the order of columns may be different.

• Your program should have no code outside of function definitions, except for a single call to main() and the global variables described in the next bullet.

• In order to make the code easier to modify for a different set of column names, define global variables that store the names of the columns that your program uses (e.g. CITY = 'city') and use the global variables throughout your code.

• All file-related operations must use device-independent handling of paths (use the os.getcwd() and os.path.join() functions to create paths, instead of hardcoding them).
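The last two requirements can be illustrated with a short sketch. The helper name data_path is hypothetical, and the input line is the one from the sample interaction.

```python
import os

# Global column-name variable, as the handout suggests.
CITY = "city"


def data_path(subfolder, filename):
    """Build a path under the current working directory without hardcoding separators."""
    return os.path.join(os.getcwd(), subfolder, filename)


# The three names arrive on one line, as in the sample interaction.
line = "data hotels.csv hotelreviews.csv"
subfolder, hotel_file, review_file = line.split()

hotel_path = data_path(subfolder, hotel_file)
review_path = data_path(subfolder, review_file)
print(hotel_path)
print(review_path)
```

The resulting paths can be passed directly to pd.read_csv(), and the same code works whether the operating system separates folders with a slash or a backslash.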

Submission and Grading

Submit your code along with the image files that your program will generate for the input data contained in

the first sample interaction. Grading will be based on the accuracy (conforming to all the requirements and

format of the interaction), generality of code and the appropriate use of pandas/numpy/matplotlib resources

(data structures and functions). Two points will be awarded for programming style.

Created by Tamara Babaian on November 23, 2019

