SMPA 6242
Analytics and Data Analysis for Strategic Communication
Final Project Guidelines
Fall 2024
1 Project Description
The Final Project is a chance to apply the theories and topics learned in class to a real-world statistical question of the student's choosing. Students will produce an analysis that is unique to them, of interest to them, and which could appear in a future professional setting. The project can take many forms, just as communication work does, but must meet a few basic criteria:
• The deliverable must thoroughly answer a well-defined research/statistical question of the student’s choosing
• The deliverable must be based on original data analysis that the student performs, sourced from a public dataset(s) of the student’s choosing, with the dataset containing at least 1,000 observations
• The deliverable must include at least 2 regressions as part of the analysis
• The deliverable should discuss what potential variables are omitted/missing, and how they could affect the results
• The final deliverable must include at least 3 production-quality visualizations (tables, graphs, etc.), two of which must be in Tableau. Production quality means something that would appear in a report or paper, not copied-and-pasted Stata output
• The deliverable must identify its intended audience and the language must be appropriate thereto (“Will the audience know what an R² is or not?”)
• The deliverable must clearly present the uncertainty around the results it finds
• The deliverable should be well-written and free of typos and spelling or grammatical errors
• The final Blackboard submission must include a replication packet, which includes:
1. all raw data used for the project (as a .dta file)
2. a well-commented .do file that produces the results in the final product
3. the final deliverable
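As an illustration of the regression and uncertainty criteria above, the following is a minimal sketch of what "presenting uncertainty" can look like: a regression coefficient reported together with its standard error and 95% confidence interval rather than as a bare point estimate. It is written in Python purely for illustration; the project itself requires a Stata .do file. The function name `simple_ols`, the synthetic data, and the use of the 1.96 normal approximation are all assumptions made for this example.

```python
import math
import random

def simple_ols(x, y):
    """Fit y = a + b*x by ordinary least squares.

    Returns the slope, its standard error, and an approximate 95% CI.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, with n - 2 degrees of freedom for the variance
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    # 1.96 is the normal approximation, reasonable for samples of 1,000 or more
    return slope, se, (slope - 1.96 * se, slope + 1.96 * se)

# Synthetic data, made up for this example only (1,000 observations)
random.seed(6242)
x = [random.uniform(0, 10) for _ in range(1000)]
y = [2.0 + 0.5 * xi + random.gauss(0, 1) for xi in x]

slope, se, (lo, hi) = simple_ols(x, y)
print(f"slope = {slope:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

For a non-technical audience, the same result might be phrased in words, for example as a likely range for the effect, instead of quoting the interval directly.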
2 Important Dates
The following important dates should be noted and saved:
To-Do | Date | Time
Final Project Proposal Due | Oct 28 | 11:59pm
Final Project Workshop | Dec 10 | (In-Class)
Final Project Due | Dec 15 | 11:59pm
3 Deliverables
The Final Project has three components, due in sequence:
1. Final Project Proposal (5 pts): Due Oct 28 by 11:59pm to Blackboard. This will be a one- or two-paragraph write-up on the proposed topic. The write-up should include:
(a) The proposed statistical research question
(b) The proposed audience (who will be reading this?)
(c) The medium in which the student will submit their Final Project deliverable (e.g. Twitter/X thread, Instagram reel, TikTok video, podcast, official memorandum, white paper, or something else)
(d) The data set/source the student is planning to use
(e) Any questions students have for the professor
2. Final Project Workshop Presentation (5 pts): Taking place in class on Dec 10. Students will create a small presentation for the class on their findings thus far, the direction they intend to head, interesting tidbits they’ve found in their data set, etc. More details in Section 4, below.
3. Final Project Deliverable (20 pts): Due Dec 15 by 11:59pm. All pieces of the Final Project are due, including:
(a) the .dta file for the dataset used
(b) the .do file for the code replication
(c) the communication piece, in whichever form the student has chosen
4 Final Project Workshop – Dec 10 In-Class
The Final Project Workshop is an in-class roundtable presentation of each individual student’s project. Students will come prepared with a slide deck to give a 6-minute presentation on their project. The presentations will cover:
1. the topic and statistical research question
2. a Positionality Statement on how the student relates to the research at hand
3. the data and the source of the data
4. any limitations within the data
5. regression results, and an interpretation of at least one regression
6. graphics and data visualizations
7. final slide for Q&A
At the end of each student’s presentation, the presenter will field questions from the class.
5 Possible Data Sources
Students will find a publicly available data source to work with. Students should note that a “proper” dataset for this project will be one with over 1,000 observations (rows) and about 6 or more columns of numerical data. Students should shy away from datasets made up of categorical data (words, groupings, etc.) because they are harder to run regressions with. Students are welcome to use the sources listed below:
• Data Archive from the Data is Plural newsletter
• Data from the Kaggle data archive
• Data from the Amazon archive
• Politics and Sports stuff from FiveThirtyEight
• Data from ProPublica
• Data from BuzzFeed
• Data from cities:
– Chicago
– Austin
– San Francisco
– Seattle
– New York City
• Federal government data:
– Centers for Medicare & Medicaid Services
– Federal Aviation Administration
– National Oceanic and Atmospheric Administration
– Data.gov
• World Bank Microdata – good for international purposes
• Sports data from Sports Reference
• Bikeshare data from several cities
• ...and more! Feel free to find your own data set and consult with the professor.
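The size guideline at the start of this section (roughly 1,000 or more rows and about 6 or more numeric columns) can be checked before committing to a dataset. The sketch below is one possible way to screen a candidate CSV file before importing it into Stata; it is not part of the required replication packet. The function names `is_number` and `screen_dataset`, the default thresholds, and the tiny demo data are assumptions made for this example, and the row check uses "at least 1,000" to match the criterion in Section 1.

```python
import csv
import io

def is_number(text):
    """Return True if the text can be read as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False

def screen_dataset(csv_file, min_rows=1000, min_numeric_cols=6):
    """Count rows and numeric columns in an open CSV file.

    A column counts as numeric when every non-empty cell parses as a number.
    """
    reader = csv.DictReader(csv_file)
    rows = list(reader)
    numeric_cols = []
    for name in reader.fieldnames or []:
        # Skip empty cells so missing values do not disqualify a column
        values = [r[name] for r in rows if r[name] not in ("", None)]
        if values and all(is_number(v) for v in values):
            numeric_cols.append(name)
    return {
        "rows": len(rows),
        "numeric_columns": numeric_cols,
        "meets_guideline": len(rows) >= min_rows
        and len(numeric_cols) >= min_numeric_cols,
    }

# Tiny made-up example: two numeric columns, one categorical column
demo = io.StringIO("a,b,city\n1,2.5,Austin\n3,,Seattle\n")
result = screen_dataset(demo)
print(result)
```

For a real file, the same function could be called as `screen_dataset(open("candidate.csv", newline=""))`, where `candidate.csv` is a placeholder file name.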