Homework 3
Use the Indiegogo dataset and download five years of data.
1. For each of these categories* in the category element of the JSON data, check whether the
keyword's monthly counts follow a Gaussian distribution. Count the appearances of each keyword
per month and record them as keyword, month, year, count, e.g. “Education”, “Jan”, “2020”, “32”.
Then plot the distributions by year (use a density plot in R). That is, download the data for
five years and compare the frequencies of each keyword separately.
2. Compare the following two categories on a yearly basis (2018, 2019, 2020): “Health & Fitness”,
“Fashion & Wearables”.
a. Use three statistical tests (one parametric and two non-parametric) and report the results.
b. Use an effect size measure to quantify the magnitude of the differences.
3. Use three correlation coefficients (Pearson, Spearman, Kendall's tau) and report whether the
following two keywords are correlated: “Fashion & Wearables”, “Health & Fitness”.
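Tasks 2 and 3 can be sketched in Python roughly as follows. This is only one possible
approach, not a required solution: the choice of Welch's t-test, Mann-Whitney U and
Kolmogorov-Smirnov as the three tests, and Cohen's d as the effect size, is an assumption,
as are the function names. The numbers at the bottom are made-up placeholders, not
Indiegogo data.

```python
import numpy as np
from scipy import stats


def compare_groups(a, b):
    """One parametric and two non-parametric two-sample tests, plus Cohen's d."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    # Parametric: Welch's t-test (does not assume equal variances).
    t_stat, t_p = stats.ttest_ind(a, b, equal_var=False)
    # Non-parametric: Mann-Whitney U and two-sample Kolmogorov-Smirnov.
    u_stat, u_p = stats.mannwhitneyu(a, b, alternative="two-sided")
    ks_stat, ks_p = stats.ks_2samp(a, b)

    # Effect size: Cohen's d using the pooled sample standard deviation.
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * a.var(ddof=1) + (n2 - 1) * b.var(ddof=1)) / (n1 + n2 - 2)
    cohens_d = (a.mean() - b.mean()) / np.sqrt(pooled_var)

    return {
        "welch_t_p": float(t_p),
        "mann_whitney_p": float(u_p),
        "ks_p": float(ks_p),
        "cohens_d": float(cohens_d),
    }


def correlations(a, b):
    """Pearson, Spearman and Kendall's tau coefficients for two paired series."""
    r, _ = stats.pearsonr(a, b)
    rho, _ = stats.spearmanr(a, b)
    tau, _ = stats.kendalltau(a, b)
    return {"pearson": float(r), "spearman": float(rho), "kendall": float(tau)}


# Placeholder monthly counts for one year (NOT real data, for illustration only).
health_fitness = [12, 15, 11, 18, 20, 17, 14, 16, 19, 13, 21, 22]
fashion_wearables = [9, 14, 10, 15, 18, 16, 11, 13, 17, 12, 19, 20]

print(compare_groups(health_fitness, fashion_wearables))
print(correlations(health_fitness, fashion_wearables))
```

To work on a yearly basis, the same two functions would be called once per year (2018, 2019,
2020) with that year's twelve monthly counts for each category.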
You need to prepare a report on your tasks and findings, along with a video file describing what
you have done. You can copy and paste your code, its results, and your description into a Word
document, a Python notebook, or an R notebook.
The deadline for this homework is posted on the online blackboard. Please feel free to ask
questions, and prepare your work for presentation in the next session.
* “Education”, “Energy & Green Tech”, “Health & Fitness”, “Fashion & Wearables”, “Wellness”
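For task 1, the per-month counting for the categories listed above could be sketched in
Python as below. The record layout is an assumption: the field names "category" and
"open_date" and the ISO date format are hypothetical and must be adapted to the actual
Indiegogo JSON. The sample records are made up for illustration.

```python
from collections import Counter
from datetime import datetime

CATEGORIES = {
    "Education",
    "Energy & Green Tech",
    "Health & Fitness",
    "Fashion & Wearables",
    "Wellness",
}
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def monthly_counts(records, category_key="category", date_key="open_date"):
    """Count records per (category, month, year) for the categories of interest.

    The key names are hypothetical placeholders for the real JSON fields.
    """
    counts = Counter()
    for rec in records:
        category = rec.get(category_key)
        if category not in CATEGORIES:
            continue
        when = datetime.fromisoformat(rec[date_key])
        counts[(category, MONTHS[when.month - 1], when.year)] += 1
    return counts


# Made-up sample records (NOT real Indiegogo data).
sample = [
    {"category": "Education", "open_date": "2020-01-05"},
    {"category": "Education", "open_date": "2020-01-20"},
    {"category": "Wellness", "open_date": "2019-03-11"},
    {"category": "Film", "open_date": "2020-01-07"},  # ignored: not in CATEGORIES
]

for (category, month, year), n in sorted(monthly_counts(sample).items()):
    print(category, month, year, n)
```

Each printed row has the keyword, month, year, count shape used in task 1. The counts for one
keyword could then be passed to a normality check such as `scipy.stats.shapiro`, or exported
for a density plot in R.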