ECON30130 Econometrics

R Project – Deadline April 10

Dr Benjamin Elsner

benjamin.elsner@ucd.ie

Rules & Guidelines

Ground rules

This assignment counts for 30% of your final grade. You have to work through a set of tasks using R and write up your answers using Word, LaTeX, or R Markdown. The rules are as follows:

• Below you will find a set of tasks. Please answer all questions and work through all tasks. There is no word or page limit, but please be concise.
• Deadline is April 10, 2022, at 11:59:59pm
• For late submissions, UCD’s Late Submission of Coursework policy applies.
• Papers are to be submitted on Brightspace → Assessment → Assignments
• Submissions should be in one pdf, and should include: 1) the write-up of the assignment, 2) the R code.
• Students are allowed to work in groups of up to five. If students work in a group, only one group member should submit the paper on Brightspace. On the first page of the paper it should be clearly stated that this was a group project and the names and student numbers of the group members should be given.
• UCD’s Student Plagiarism Policy will apply. I reserve the right to run plagiarism checks on Brightspace.
• Questions should be posted on Brightspace.
• A solution will not be provided after the deadline.

Grading

Students will receive a letter grade for this assignment. Grading is based on the following criteria:

• Correctness of the analysis and interpretations
• Writing (clear and concise)
• Exposition: are graphs and tables done well? They don’t need to look fancy, but it has to be clear what is shown. For regression tables, please use stargazer or alternative packages that give you nicely formatted regression tables.

Bonus: a higher grade (1 notch, e.g. from B+ to A-) is given if all of the following are done:

1. project written with R Markdown (can be done via RStudio); please indicate on the first page if you do so; for an introduction, see here
2. all graphs and tables have been programmed with R, i.e. no copy & paste anywhere
3. all graphs done with ggplot (but not with the default grey background);
4. tidyverse functions (especially the pipe operator) are frequently used.

Some tips

The aim of this assignment is to get students to “figure things out.” In the tutorials, clear instructions and coding examples were given along with a clean data set. However, this is far removed from the work data analysts actually do. Their projects typically have a clear goal, but the data are often messy and it is unclear how to reach the goal of the analysis. Simply put, the analyst has to “figure things out”: how best to clean the data set, how best to visualise the data, how to bring the data into a format that is suitable for visualisation and regression analysis, etc. If you’re working in a company, you can neither refuse to do a project because “we haven’t learned about a certain procedure in class”, nor run to your manager with every little error message you encounter. Ultimately, data analysts are paid for solving problems themselves or collaboratively with team members. The sooner you get into that mindset, the better. This assignment is similar to a task one would encounter in a data analytics project.

How to figure things out?

• Google is your friend. Get a strange error message? Type it into Google; chances are someone else has had the problem before. You can also search StackOverflow, the forum for all things programming (R, Python, C++, etc.).
• If one solution doesn’t work, try another one. Solving problems is often frustrating; it takes time and a decent bit of grit. So if you encounter a problem, solve it or find a way around it. There is always a solution!

Preparation

For some of the tasks below, you will need to know how to incorporate binary variables into a regression. Once you know how regression works, this is pretty straightforward. Here are some sources you may want to consult:

• When a regressor is a dummy: Chapter 5.3 in Stock & Watson; here is a good video
• When the dependent variable is a dummy (also called a linear probability model): Chapter 11.1 in Stock & Watson. See also this video, this video and this video. The last video is based on Stock & Watson’s materials.
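As a rough illustration of the two cases listed above, the following sketch uses simulated data rather than the assignment dataset. The names `d`, `y` and `sold` and all numbers are made up for the example. It is meant to show that, with a single dummy regressor, the OLS intercept is the mean of the group with dummy = 0 and the slope is the difference in group means; with a dummy outcome the same logic applies to shares.

```r
# Simulated data only: variable names and parameter values are
# illustrative assumptions, not taken from the assignment dataset.
set.seed(1)
n <- 500
d    <- rbinom(n, 1, 0.4)                 # dummy regressor (0/1)
y    <- 100 - 5 * d + rnorm(n, sd = 10)   # continuous outcome
sold <- rbinom(n, 1, 0.6 - 0.1 * d)       # dummy outcome (0/1)

# Case 1: the regressor is a dummy.
# Intercept = mean of y when d == 0; slope = difference in means.
fit_dummy_x <- lm(y ~ d)
diff_means  <- mean(y[d == 1]) - mean(y[d == 0])

# Case 2: the dependent variable is a dummy (linear probability model).
# Intercept = share with sold == 1 when d == 0; slope = difference in shares.
fit_lpm     <- lm(sold ~ d)
diff_shares <- mean(sold[d == 1]) - mean(sold[d == 0])

print(coef(fit_dummy_x))
print(diff_means)
print(coef(fit_lpm))
print(diff_shares)
```

Comparing the printed coefficients with the hand-computed differences is a quick sanity check on the interpretation of a dummy coefficient.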

A. Theory Tasks

Suppose you want to quantify the extent of discrimination in an online market. You have data on all the sales of a given product (say a smartphone) that took place on an online auction site in the U.S. in 2015. You observe whether a product was sold, at what price, and whether the seller is a member of an ethnic minority. On the auction site, consumers don’t directly observe minority status, but they can infer it from the first names of the sellers.

1. Suppose you want to estimate the effect of minority status (i.e. a dummy that equals one if a person belongs to a minority and zero if the person is white) on the sale price. Write down a regression equation that would allow you to estimate this effect.
2. Explain what parameter you are interested in estimating and provide an interpretation of this parameter.
3. Discuss the random sampling assumption and the conditional independence assumption (in the lecture it was E(u|X) = 0). Are these assumptions fulfilled in this case (explain why or why not)? Explain intuitively the likely consequences of these assumptions (not) being fulfilled for estimating the effect of interest.
4. If you could run an experiment (regardless of ethical considerations) to estimate the effect of interest, what would this experiment look like and why? (N.B.: the ideal experiment asked for here is different from an experiment described further below.)

B. Empirical Analysis

Introduction

Jennifer Doleac and Luke Stein ran an experiment on eBay small ads, a platform that lists classified ads in local markets in the U.S. (similar to adverts.ie in Ireland). In their experiment, they put up ads for new iPod nanos. Their goal was to study whether buyers discriminate between black and white sellers, i.e. whether they are less likely to contact a black seller, make lower offers and are less friendly in correspondence. To experimentally vary the race of the seller, they showed the ad listings with a photo in which the same iPod is held by a white hand, a black hand, or a white hand with a wrist tattoo (which buyers may see as a sign of lower social status). In addition, they experimentally varied the quality of the ad text, whether the iPod was held in the right or left hand (so that not all ads look the same), and the asking price (among three price points). Each ad was online for 12 hours. The authors collected information on the number of responses, the number of offers, the friendliness of the responses, and the amount offered, among other outcomes.

Paper and data

You can find the paper here and on Brightspace:

• Doleac, J.L. and Stein, L.C. (2013), The Visible Hand: Race and Online Market Outcomes. Econ J, 123: F469-F492. https://doi.org/10.1111/ecoj.12082

Along with the assignment on Brightspace, you will find the dataset data_doleacstein.dta, which is in Stata .dta format. We will use this dataset for the analysis to follow. Each observation is one email that was sent. The main variables for our analysis are shown in Table 1.

Tasks

1. Load the dataset into R and produce a table of summary statistics (number of observations, mean, sd, median, min, max, number of missing observations) for all variables listed in Table 1 except ad and texttype. Interpret the mean of responses, offers, white, black, tattoo, and polite.
2. Generate a new dummy variable anyresponse that equals 1 if an ad received at least one response.

Table 1: Main Variables

Variable name   Content
-------------   -------
ad              ad ID (for authors’ use only)
responses       number of responses received
price           asking price
offers          number of offers received
bestoffer       best offer received for ipod
meanoffer       mean offer if there were multiple offers
name            dummy: 1 if buyer signed response with name
polite          dummy: 1 if response was polite
texttype        indicator for quality of text; 0=high quality, 1=medium quality, 2=low quality
black           dummy: 1 if seller is black
tattoo          dummy: 1 if seller is white and has a wrist tattoo
white           dummy: 1 if seller is white (without a wrist tattoo)
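To make the variable definitions concrete, here is a minimal sketch of how a dummy can be derived from a count variable and how per-variable summary statistics can be collected into one table. It runs on a tiny made-up data frame that only mimics three of the columns in Table 1; the helper `summarise_var` is a hypothetical name introduced for this example. Reading the real file with `haven::read_dta` is shown only as a comment, on the assumption that the haven package is installed and the file sits in the working directory.

```r
# To load the real data one would typically use the haven package, e.g.
#   df <- haven::read_dta("data_doleacstein.dta")
# The toy data frame below is made up and merely mimics a few columns
# from Table 1 so that the sketch runs on its own.
df <- data.frame(
  responses = c(0, 2, 1, 0, 3, 0),
  offers    = c(0, 1, 1, 0, 2, 0),
  bestoffer = c(NA, 80, 95, NA, 100, NA)
)

# Dummy: 1 if the ad received at least one response, 0 otherwise.
df$anyresponse <- as.integer(df$responses > 0)

# Hypothetical helper: summary statistics for one variable.
# Missing values are excluded from the statistics and counted separately.
summarise_var <- function(x) {
  c(n       = sum(!is.na(x)),
    mean    = mean(x, na.rm = TRUE),
    sd      = sd(x, na.rm = TRUE),
    median  = median(x, na.rm = TRUE),
    min     = min(x, na.rm = TRUE),
    max     = max(x, na.rm = TRUE),
    missing = sum(is.na(x)))
}

# One row per variable, one column per statistic.
summary_table <- t(sapply(df, summarise_var))
print(round(summary_table, 2))
```

The mean of a 0/1 dummy such as `anyresponse` is simply the share of observations for which the dummy equals 1, which is useful to keep in mind when interpreting the summary table.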

3. Produce a frequency table for the number of ads that were put up for each seller type (black, white, tattoo). The table should include the number of ads per seller type (absolute numbers and shares, i.e. the share of ads that were assigned to a particular seller type).
4. Produce a frequency table with seller types on the horizontal and asking prices (90, 110 and 130 USD) on the vertical axis. Each cell should show the share of all ads that were put up by a given seller type for a given asking price (hint: search for cross tabulation). Do not show the absolute numbers, only the shares. What does the result tell you about the quality of the randomisation in the experiment?
5. Run t-tests comparing the difference in means between white and black sellers for the following variables: anyresponse, bestoffer, meanoffer, polite. The results of the t-tests should be presented in a table that shows the following: each row is a variable; columns: mean of the variable for Whites, mean of the variable for Blacks, difference in means between Whites and Blacks, p-value of t-test. Interpret your findings regarding magnitude and statistical significance. (Hint: you can use t.test and store the results of each t-test in an object that you can see under “Environment”. You can then combine these objects into a table.)
6. Regress the dummy anyresponse on the dummies black and tattoo. Interpret the coefficients of the slopes and intercept, comment on statistical significance, and compare your results to those in the table produced in 4.
7. Another way of analysing the results of an experiment like this is through bar charts with error bars. You plot the means for the treatment and control group and attach to each bar a so-called error bar (y ± sd(y)). The error bars give an indication of the variation in each seller group. Produce such a chart (separate bars for black sellers, white sellers, and sellers with a tattoo) for the following outcomes: bestoffer, meanoffer.
8. Not only did the researchers randomise whether the seller is black, but they also randomised the quality of the ad text. Create dummies highquality (1 if text of high quality), and mediumquality (1 if text of medium quality). Run a regression of black on highquality and mediumquality and interpret your result. Comment on the meaning of this result for the experimental design.
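Several of the hints in the tasks above (cross tabulation, combining t.test objects into a table, and mean ± sd error bars) can be sketched roughly as follows. This is not a solution: the data are simulated, the `seller` column and the helper `ttest_row` are hypothetical names introduced for the example, and the real dataset codes seller type as three separate dummies. The plotting step assumes ggplot2 is installed and is skipped otherwise.

```r
# Simulated stand-in for the experiment; all values are made up.
set.seed(42)
n <- 300
seller <- sample(c("black", "white", "tattoo"), n, replace = TRUE)
df <- data.frame(
  seller    = seller,
  price     = sample(c(90, 110, 130), n, replace = TRUE),
  bestoffer = 70 + 5 * (seller == "white") + rnorm(n, sd = 15),
  polite    = rbinom(n, 1, 0.3)
)

# Cross tabulation as shares of all ads (the cells sum to 1):
# asking prices in rows, seller types in columns.
shares <- prop.table(table(price = df$price, seller = df$seller))
print(round(shares, 3))

# t-tests of white vs black sellers, collected into one table.
bw <- df[df$seller %in% c("white", "black"), ]
ttest_row <- function(v) {
  tt <- t.test(bw[[v]][bw$seller == "white"],
               bw[[v]][bw$seller == "black"])
  data.frame(variable   = v,
             mean_white = unname(tt$estimate[1]),
             mean_black = unname(tt$estimate[2]),
             difference = unname(tt$estimate[1] - tt$estimate[2]),
             p_value    = tt$p.value)
}
ttest_table <- do.call(rbind, lapply(c("bestoffer", "polite"), ttest_row))
print(ttest_table)

# Group means and standard deviations for a bar chart with error bars.
bars <- aggregate(bestoffer ~ seller, data = df,
                  FUN = function(x) c(mean = mean(x), sd = sd(x)))
bars <- data.frame(seller = bars$seller, bars$bestoffer)
bars$lower <- bars$mean - bars$sd
bars$upper <- bars$mean + bars$sd
print(bars)

# Build the chart only if ggplot2 is available; theme_minimal() replaces
# the default grey background. Call print(p) to draw it.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(bars, aes(x = seller, y = mean)) +
    geom_col(fill = "steelblue") +
    geom_errorbar(aes(ymin = lower, ymax = upper), width = 0.2) +
    labs(x = "Seller type", y = "Mean best offer") +
    theme_minimal()
}
```

Note that `t.test` runs Welch's unequal-variance test by default; whether that or the equal-variance version (`var.equal = TRUE`) is appropriate is a choice to justify in the write-up.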

