CS544程序代写、代写R编程语言-代写Algorithm 算法作业

联系方式

QQ：99515681
邮箱：99515681@qq.com
工作时间：8:00-21:00
微信：codinghelp

您当前位置：首页 >> Algorithm 算法作业Algorithm 算法作业

CS544程序代写、代写R编程语言

日期：2021-12-13 09:45

CS544 Module 6 Assignment

The following document should not be disseminated outside the purview of its intended purpose.

Part1) Strings (60 points)

Use the stringr functions for the following:

Initialize the vector of words from Lincoln’s speech with the following code:

file <- "http://people.bu.edu/kalathur/datasets/lincoln.txt"

words <- scan(file, what=character())

a) Detect and show all the words that have a punctuation symbol.

b) Replace all the punctuations in the words dataset with an empty string.

Remove any empty words from this dataset. Convert all the remaining

words to lower case.

Make this the new_words data.

c) What are the top 5 frequent words in the new_words data?

d) Show the frequencies of the word lengths in the new_words data. Plot

the distribution of these frequencies.

e) What are the words in the new_words data with the longest length?

f) Show all the words in the new_words data that start with the letter p.

g) Show all the words in the new_words data that end with the letter r.

h) Show all the words in the new_words data that start with the letter p

and end with the letter r.

In c), you realize that the most spoken words are what are known as

stopwords. In text mining, the stopwords are removed before analysis.

Initialize the common English stopwords as follows:

stopfile <- "http://people.bu.edu/kalathur/datasets/stopwords.txt"

stopwords <- scan(stopfile, what=character())

Remove the stopwords from the new_words data. Use the %in% operator.

Repeat c) and d) for this dataset without the stop words.

Part2) Data Wrangling (40 points)

Use the tidyverse library for the following:

Download the following csv file,

http://people.bu.edu/kalathur/usa_daily_avg_temps.csv

locally first and use read.csv to load the data into a data frame.

a) Convert the data frame into a tibble and assign it to the variable

usaDailyTemps.

b) What are the maximum temperatures recorded for each year? Show the

values and also the appropriate plot for the results.

c) What are the maximum temperatures recorded for each state? Show the

values and also the appropriate plot for the results.

d) Filter the Boston data from usaDailyTemps and assign it to the variable

bostonDailyTemps.

e) What are the average monthly temperatures for Boston? Show the

values and also the appropriate plot for the results. Use the

bostonDailyTemps.

Submission:

• You must work on your assignments individually. You are not allowed to copy the

answers from the others.

• Each assignment has a strict deadline. However, you are still allowed to submit your

assignment within 2 days after the deadline with a penalty. 15% of the credit will be

deducted unless you made previous arrangements with your facilitator and professor.

Assignments submitted 2 days after the deadline will not be

Create a folder, CS544_HW6_lastName and place the following files in this folder.

Provide the R code, HW6_lastName.R, with each portion of the code clearly identified

by the corresponding question. Prepare a corresponding Word/PDF document by

pasting the output for each question including the plots and explanations.

Archive the folder (CS544_HW6_lastName.zip). Upload the zip file to the Assignments

section of Blackboard.

【返回顶部】【打印本稿】【关闭本页】

【上一篇】：ELEC6217代写、代写R编程设计

【下一篇】：ELEC6217代写、代写R编程设计

联系方式

最新辅导

热门辅导

您当前位置：首页 >> Algorithm 算法作业Algorithm 算法作业

CS544程序代写、代写R编程语言

日期：2021-12-13 09:45

相关文章