联系方式

  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-21:00
  • 微信:codinghelp

您当前位置:首页 >> Java编程Java编程

日期:2023-12-11 09:34

Programming for EOR

Individual Assignment

Maximilian Osterhaus

2023-2024

The assignment includes 1 exercise worth 30 points. The assignment is designed in such a way that the

majority of questions should be relatively easy to answer with the content covered in the lecture. However,

some questions deliberately are more difficult and require functions that we did not cover in the lecture. The

idea is to train your skills finding solutions using different resources (e.g., Stack Overflow) and solving more

complex real-life coding challenges.

You have to submit the assignment on Brightspace before Sunday 10.12.2023 at 18.00. Do not forget to

hand in both the .pdf file and the .Rmd file in the respective hand-in boxes (other file types cannot be

uploaded). Be careful when submitting your assignment: assignments that are handed in too late, in the

wrong box, or by email will not be checked and will receive grade 0.

Your grade is equal to the number of points that you achieved divided by the total number of possible

points (30). If you fail the assignment (< 17 points), you are eligible for a repair assignment due Sunday

17.01.2024 at 18.00. For the repair assignment, your grade is equal to min(5.5, points/maxpoints × 10).

Make sure that all your code is in accordance with the style guide (available on Brightspace). Points will be

subtracted from your grade for every infraction (up to 2 grade points from your assignment grade).

1

Exercise 1 (30 points)

Research links the consumption of sugar-sweetened beverages to increased body-weight, diabetes, cardiovascular risk factors, and dental caries among other adverse health effects. Some countries, states and

municipalities have taken this as an opportunity to introduce a tax on sugar-sweetened beverages. One of

the first to introduce such a tax was the city of Berkeley in California, USA, which introduced a tax of one

cent USD per liquid ounce (1¢/oz) on such beverages.

In this assignment, you will be working with data from this research article that studies the impact of the

tax on the prices of beverages in Berkeley and surrounding areas. The data includes information on prices

of a large number of products from varying categories sold in stores located in Berkeley and surroundings at

different points in time.

In the following questions, we will first look at price data from stores in Berkeley to see what happened to

the price of sugary and non-sugary beverages in Berkely after the tax was introduced. For that purpose, we

will use the two data sets data_stores.csv and data_prices.csv. The data set data_stores.csv contains a store

identifier (store_id), information on the type of the store (store_type), a product identifier (product_id),

the general product category (category1), information on the subcategory (category2), and information

on the time (year and month).

The data set data_prices.csv contains a store identifier (store_id), a product identifier (product_id),

information on the time (year and month), the price in dollars (price), the product size in liquid ounces

(size), and a variable named taxed that is equal to 1 if the product is addressed by the tax in general, and

0 otherwise (i.e., the variable taxed is equal to 1 for all sugar-sweetened beverages, also for periods before

the introduction of the tax).

a) The stores data covers large chain supermarkets (store_type = 1), independent corner stores

(store_type = 2), drugstores (store_type = 3), and gas stations (store_type = 4). Convert/replace the variable store_type to a factor variable with labels corresponding to the store types

(i) supermarkets, (ii) corner stores, (iii) drugstores, and (iv) gas stations.

Use the new variable to report the unique number of stores per store type, and the unique number of

products offered per store type (i.e., all products across all stores of the same type).

b) The product categories (category1) are reported in different formats: some are reported in lower case

letters, some are reported in upper cases letters, and some are reported with upper and lower case

letters. In addition, some categories use different separators within their names. Unify all product

categories in the variable category1 to be written exclusively in upper case letters and with hyphens

as separators.

For every product category (category1), report the unique number of products.

c) The variable category2 includes a more detailed description of beverages in comparison to the variable

category1, though only for few product categories. Products without additional information are

marked by a hyphen. Replace the hyphens by NAs.

For every product category (category1), report the unique subcategories (category2) without NAs in

your reported results.

d) Load the data set data_prices.csv and merge the data to your store data set that you prepared in

questions a) to c). Merge the two data sets in such a way that you keep only observations appearing

in both data sets. Add a new variable to your merged data set that corresponds to the price per liquid

ounce in dollar cents. Illustrate the distribution of prices in a box plot with a separate box for every

unique product category in category1.

2

e) Convert the variable taxed in your merged data from part d) into a factor variable with label Taxed

for all observations for which taxed is equal to 1 and with label Non-taxed for all observations for

which the variable taxed is equal to zero.

Use the factor variable to replicate the table below presenting summary statistics of the price per

ounce in cents for taxed and no-taxed beverages whereby the first column represents the number of

observations per group and the last column the standard deviations. Hint: You can use either the

package kableExtra or stargazer to create a table in markdown files. You do not need to recreate

the table format exactly. If you did not succeed to answer question d), you can use the data set

data_cleaned.csv to answer the question (and following questions).

Table 1: Summary Statistics of Price per Ounce (in cents) for Taxed and Non-taxed Beverages.

taxed N Min Median Mean Max Std

Non-taxed 1072 0.98 7.64 11.11 49.83 10.46

Taxed 1096 1.21 8.25 13.14 71.50 13.77

f) Use the merged data to replicate the figure below showing the percentage change in average prices

per ounce in cents for taxed and non-taxed drinks between December 2014 and June 2015 ((¯p2015 −

p¯2014)/p2014 ∗ 100) for the different store types in the data.

Supermarkets Corner Stores Drugstores Gas Stations

Store Type

Price Change in %

Non−taxed Taxed

Price Change Between December 2014 and June 2015 for Different Store Types

3

It is possible that the observed price patterns in Berkeley observed in the figure above are not solely due

to the tax introduced on sugar-sweetened beverages, but instead were also influenced by other events that

happened in Berkeley and surrounding areas. To investigate whether this is the case, you will use the data

set data_berkeley.csv to compare the prices in Berkeley before and after the tax to prices from stores located

outside of Berkeley.

The data set contains monthly average prices per liquid ounce in cents for different product categories

(category) collected from stores in and around Berkeley (location) at different points in time (year and

month). The reported prices (prices) are average prices across different stores and products of the same

category, whereby products are separated into products that are addressed by the tax (taxed = Taxed), and

products not addressed by the tax (taxed = Non-taxed). If prices changed in a similar way in surrounding

areas (where no tax on sugar-sweetened beverages was introduced), then the observed price changes in

Berkeley may not be primarily a result of the tax.

More precisely, the impact of the tax on the price of sugar-sweetened beverages in Berkeley, known as the

pass-through of a tax, can be calculated as the difference in the mean prices of sugar-sweetened beverages

(i.e., of beverages generally addressed by the tax) sold after the introduction of the tax in Berkeley and

in surrounding areas, minus the difference in the mean prices of sugar-sweetened beverages sold before the

introduction of the tax in stores in Berkeley, and in stores outside of Berkeley:

Pass-Through = (¯p

taxed

Berkeley,after − p¯

taxed

Non-Berkeley,after) − (¯p

taxed

Berkeley,before − p¯

taxed

Non-Berkeley,before)

However, one important assumption that needs to be satisfied for this calculation to return the true effect of

the tax on prices in Berkeley is that the price-trends in Berkeley and its surrounding areas must be parallel

before the introduction of the tax.

g) In order to study whether this assumption is satisfied, create a new variable of type date (with year,

month, and day) by combining the variables year and month and where the day of the month is always

set to the first day of the month. Use your new variable to calculate average prices at the date-locationtaxed level to replicate the figure below whereby the date is your newly created variable and where the

dashed vertical line illustrates the date where the tax was implemented (i.e., 01.01.2015).

6

8

10

2013 2014 2015 2016

Date

Price per Ounce in Cents

taxed Non−taxed Taxed location Berkeley Non−Berkeley

Price Trends of Taxed and Non−Taxed Beverages in Berkeley and Surrounding Areas

4

h) According to the figure above, average prices of taxed beverages follow a parallel trend in Berkeley and

its surrounding areas before the introduction of the tax. Using only data from before first of November

2014 and from first of March 2015 onwards, calculate the pass-though of the tax using the formula

from above where average prices are calculated across all product categories.

i) Use a for loop to loop over all product categories that are addressed by the tax to calculate the passthrough of the tax for every product category separately. Use only data from before first of November

2014 and from first of March 2015 onwards for your calculations. Collect your results in a data frame

that includes a variable with the category and a variable with the pass-through rate that you calculated.

5


版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。 站长地图

python代写
微信客服:codinghelp