Assignment 3
How many clusters is "enough"?
Background:
In many applied settings, observations are not independent. Often they are "clustered": unobserved factors affect all observations within the same group or cluster. Depending on your analysis, clusters can be villages, schools, teams, classrooms, etc. It is easy to imagine that some factors affect the outcome variable in a similar way for all individuals in a group. When you run regressions at the individual level on clustered observations, the standard errors should account for this within-cluster correlation.
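To make the idea concrete, here is a minimal sketch in Python with NumPy of clustered data and a cluster-robust (sandwich) standard error. The data-generating process, the function names, the parameter values, and the G/(G-1) * (N-1)/(N-K) small-sample adjustment are all illustrative assumptions, not something the assignment prescribes.

```python
import numpy as np

def simulate_clustered(n, g, rng, beta=(1.0, 2.0)):
    """Hypothetical DGP: regressor and error both share a cluster-level component."""
    cluster = np.arange(n) % g                             # assign observations to g clusters
    x = rng.normal(size=g)[cluster] + rng.normal(size=n)   # cluster part + individual part
    u = rng.normal(size=g)[cluster] + rng.normal(size=n)   # errors correlated within cluster
    y = beta[0] + beta[1] * x + u
    return y, x, cluster

def ols_cluster_se(y, x, cluster):
    """OLS coefficients and cluster-robust standard errors (sandwich formula)."""
    X = np.column_stack([np.ones_like(x), x])
    n, k = X.shape
    bread = np.linalg.inv(X.T @ X)
    beta_hat = bread @ X.T @ y
    resid = y - X @ beta_hat
    groups = np.unique(cluster)
    G = len(groups)
    meat = np.zeros((k, k))
    for c in groups:
        idx = cluster == c
        score = X[idx].T @ resid[idx]      # sum of x_i * residual_i within the cluster
        meat += np.outer(score, score)
    adj = G / (G - 1) * (n - 1) / (n - k)  # assumed finite-sample adjustment
    V = adj * bread @ meat @ bread
    return beta_hat, np.sqrt(np.diag(V))
```

For example, `ols_cluster_se(*simulate_clustered(500, 20, np.random.default_rng(0)))` returns the intercept and slope estimates together with their clustered standard errors.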
Some suggest you need more than 42 clusters for the clustered standard error formula to be valid. But where did that number come from?
Instructions:
Your goal is to perform a Monte Carlo (MC) simulation exploring the role of the number of clusters (g = 2, 4, 5, 10, 20, 25, 50 and 100) for varying sample sizes (n = 100, 200, 500, 1000 and 10000). The code I provided in the lecture on MC simulations should give you a great head start on this homework.
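One possible skeleton for such a simulation, sketched in Python with NumPy. Only the g and n grids come from the assignment; the data-generating process, the number of replications, the seed, and the names `one_draw`, `se_ratio`, and `run_grid` are hypothetical choices made for illustration. This is not the lecture code.

```python
import numpy as np

# Parameter section (REPS and SEED are assumed values; the grids are from the assignment).
CLUSTER_COUNTS = [2, 4, 5, 10, 20, 25, 50, 100]
SAMPLE_SIZES = [100, 200, 500, 1000, 10000]
REPS = 500
SEED = 12345

def one_draw(n, g, rng):
    """Simulate one clustered sample; return the OLS slope and its clustered SE."""
    cluster = np.arange(n) % g
    x = rng.normal(size=g)[cluster] + rng.normal(size=n)
    y = 1.0 + 2.0 * x + rng.normal(size=g)[cluster] + rng.normal(size=n)
    X = np.column_stack([np.ones(n), x])
    bread = np.linalg.inv(X.T @ X)
    beta_hat = bread @ X.T @ y
    resid = y - X @ beta_hat
    # Per-cluster sums of x_i * residual_i, one column per regressor.
    scores = np.column_stack([
        np.bincount(cluster, weights=X[:, j] * resid, minlength=g) for j in range(2)
    ])
    adj = g / (g - 1) * (n - 1) / (n - 2)   # assumed finite-sample adjustment
    var = adj * bread @ (scores.T @ scores) @ bread
    return beta_hat[1], np.sqrt(var[1, 1])

def se_ratio(n, g, reps, rng):
    """Mean clustered SE divided by the SD of the simulated slope estimates."""
    draws = np.array([one_draw(n, g, rng) for _ in range(reps)])
    return draws[:, 1].mean() / draws[:, 0].std(ddof=1)

def run_grid(sample_sizes, cluster_counts, reps, seed):
    """Return {n: [ratio for each g]}; a fixed seed keeps the run replicable."""
    rng = np.random.default_rng(seed)
    return {n: [se_ratio(n, g, reps, rng) for g in cluster_counts]
            for n in sample_sizes}
```

Calling `run_grid(SAMPLE_SIZES, CLUSTER_COUNTS, REPS, SEED)` would produce one ratio per combination of n and g. A ratio below 1 means the clustered formula understates the sampling variability of the estimate.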
Your code should:
Be replicable on any computer.
Generate one figure written to the hard drive in the ../figures folder.
The x-axis should be the number of clusters.
The y-axis should be the ratio of the estimated clustered standard errors relative to the standard deviation of the simulated estimates (reflecting the "true" standard error).
Show lines with symbols in different colors for varying sample sizes.
Make sure to add a legend indicating the sample size corresponding to each of the lines specified above.
That is all! Once you look at the figure, you will understand why some suggest 42 or 50 as the lowest number of clusters you should have in order to rely on the standard formula for computing clustered standard errors.
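The figure requirements listed above could be met along these lines. This is a hedged sketch assuming Python with Matplotlib; the function name `plot_se_ratios`, the file name in the usage note, and the layout choices are hypothetical, and the ratios must come from your own simulation.

```python
import os
import matplotlib
matplotlib.use("Agg")  # render to file without needing a display
import matplotlib.pyplot as plt

def plot_se_ratios(cluster_counts, ratios_by_n, out_path):
    """Plot one line per sample size and save the figure to out_path.

    ratios_by_n maps each sample size n to a list of ratios
    (clustered SE / SD of simulated estimates), one per entry in cluster_counts.
    """
    os.makedirs(os.path.dirname(out_path) or ".", exist_ok=True)
    fig, ax = plt.subplots(figsize=(7, 4.5))
    markers = ["o", "s", "^", "D", "v", "x", "*"]
    for i, (n, ratios) in enumerate(sorted(ratios_by_n.items())):
        # The default color cycle gives each line its own color.
        ax.plot(cluster_counts, ratios,
                marker=markers[i % len(markers)], label=f"n = {n}")
    ax.axhline(1.0, color="gray", linestyle="--", linewidth=1)  # ratio of 1: SE matches SD
    ax.set_xlabel("Number of clusters")
    ax.set_ylabel("Clustered SE / SD of simulated estimates")
    ax.legend(title="Sample size")
    fig.savefig(out_path, dpi=200, bbox_inches="tight")
    plt.close(fig)
    return out_path
```

A call such as `plot_se_ratios(cluster_counts, ratios_by_n, "../figures/se_ratio_by_clusters.png")` writes the file relative to the current working directory, so for replicability the script should be run from its own folder or build the path from the script's location.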
Grading criteria:
Replicability (3 points):
3: The code runs and generates the desired output perfectly without any edits from the user. The files include a clear readme file.
2: The code requires a minor edit in order to run and generate the desired output. The readme file is missing.
1: The code requires several edits in order to run; there may be errors and/or the desired output is not obtained.
0: The code does not run.
Completeness (3 points):
3: The code generates 1- the correct figure, that is 2- correctly annotated (axes, labels, legend), and 3- is located in the correct folder.
2: Something is missing: e.g. the figure is incorrect, some annotations are missing, the figure is not generated in the correct folder.
1: More than one element is missing.
0: Figure is not generated.
Coding (3 points):
3: The code 1- has clear and logical sections, 2- has a clearly identified parameter section that can be modified to produce slightly different versions of the output, 3- is well annotated, 4- is concise but spacious.
2: The code has at most one of the following: 1- has no clear structure, 2- hard-codes parameters within the core of the script file, 3- lacks some key annotations, 4- is too dense or verbose.
1: The code has more than one of the following: 1- has no clear structure, 2- hard-codes parameters within the core of the script file, 3- lacks some key annotations, 4- is too dense or verbose.
0: There is no code or the code cannot perform basic steps necessary to generate the output.