Date: 2022-06-19 01:23

COMP 202 - Foundations of Programming

Assignment 3

McGill University, Summer 2022

Due: Saturday, June 18th, 11:59 pm on MyCourses

Late Penalty: 10% per day and up to 2 late days

Important notice

Write all the following functions in one file named

sentiment_analysis_[your student ID].py

For example, if your student ID is 260700000, your file name should be

sentiment_analysis_260700000.py

Make sure that all file names and function names are spelled exactly as described in this document. Otherwise, a 50% penalty per question will be applied. You may make as many submissions as you like prior to the deadline, but we will only grade your final submission (all prior ones are automatically deleted).

Sentiment Analysis

Sentiment analysis is one of the challenges related to natural language processing. It is the task of identifying whether the sentiment behind a text, a social media message or a voice message is positive, negative or neutral. It is widely used in areas such as marketing, entertainment and healthcare to evaluate the subjective information given by users, customers or patients.

The basic idea is to analyse a given text (for example, a social media post) and identify the sentiment behind it. For example:

• I am so happy the weather is amazing! ⇒ Positive sentiment

• This movie was the worst movie ever. ⇒ Negative sentiment

• This is food. ⇒ Neutral sentiment

There are many natural language processing libraries, and the literature describes a variety of algorithms for sentiment analysis, using either machine learning or a rule-based approach.

The objective of this assignment is to build a rule-based approach to identify the sentiment of a given text. For the function descriptions that follow, note that:

• Please read the entire A3 guidelines and this PDF before starting.

• You must do this assignment individually.

• This assignment includes two files: a pickle file "sentiment_dict.pkl" and a text file named posts.txt.

• For both fruitful and void functions, you should provide 3 examples in the docstrings (make sure to also include an example for functions that raise an exception).

Questions

1. is_list_char(char_list) [8 points]:

• Input parameters:

– char_list (a list of characters)

• Output parameter: The function returns True if the input list contains only single-character strings, and False otherwise.

• Description: The function traverses the list and checks whether any element is either not a string or not a string of length 1. Do not use the find function.

• Examples:

– if char_list = ['A','beb','f'] the function will return False

– if char_list = ['A','1','f'] the function will return True

– if char_list = ['A',1,'f'] the function will return False
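As an illustration, the check described above could be sketched as follows. This is one possible approach rather than a reference solution; treating an empty list as valid (returning True) is an assumption, since the handout does not cover that case.

```python
def is_list_char(char_list):
    """Return True if every element of char_list is a string of length 1."""
    for element in char_list:
        # Reject anything that is not a string, or a string whose length is not 1.
        if not isinstance(element, str) or len(element) != 1:
            return False
    # Assumption: an empty list has no invalid element, so it counts as valid.
    return True


print(is_list_char(['A', 'beb', 'f']))  # False
print(is_list_char(['A', '1', 'f']))    # True
print(is_list_char(['A', 1, 'f']))      # False
```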

2. character_position(text, char_list) [8 points]:

• Input parameters:

– text: string

– char_list (a list of characters)

• Output parameter: position, a list of the positions where the characters were found in the text

• Description: The function starts by checking whether the input char_list is a valid list of single characters by calling the is_list_char function. If the list is not valid, the function raises a TypeError exception with the following message: "The input list should contain only characters". Otherwise, the function traverses the text to find all the characters indicated in the list and returns their positions (positive indices). If none of the characters were found, the function returns an empty list.

• Note: Do not use the find or index functions.

• Examples:

– if text = "Hello. Is it me you're looking for?" and char_list = ['?','.'], then the function will return the list of positions [5,34]

– if text = "Hello. Is it me you're looking for?" and char_list = ['H','i'], then the function will return the list of positions [0,10,27]

– if text = "Hello. Is it me you're looking for?" and char_list = ['Hello','i'], then the function will raise a TypeError exception

• NOTE: Since the quotation mark is a special character in a string, if you want to test these examples with a single-quoted string, you should write the text as follows: text = 'Hello. Is it me you\'re looking for?' (the backslash character "\" indicates that the quotation mark is part of the text and does not delimit the end of the string).
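A minimal sketch of the traversal described above, offered as one possible approach. It repeats a compact is_list_char so that it runs on its own, and it relies on the in operator rather than find or index. Returning the positions in text order is inferred from the examples.

```python
def is_list_char(char_list):
    """Return True if every element of char_list is a string of length 1."""
    for element in char_list:
        if not isinstance(element, str) or len(element) != 1:
            return False
    return True


def character_position(text, char_list):
    """Return the indices of text whose character appears in char_list."""
    if not is_list_char(char_list):
        raise TypeError("The input list should contain only characters")
    positions = []
    # Walk the text by index so that each match can be recorded as a position.
    for i in range(len(text)):
        if text[i] in char_list:
            positions.append(i)
    return positions


text = 'Hello. Is it me you\'re looking for?'
print(character_position(text, ['?', '.']))  # [5, 34]
print(character_position(text, ['H', 'i']))  # [0, 10, 27]
```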

3. remove_characters(text, char_list) [10 points]:

• Input parameters:

– text: string

– char_list (a list of characters)

• Output parameter: filtered_text, a string from which the characters in char_list have been removed

• Description: The function first checks whether the input char_list is empty; if so, it returns the original text. Otherwise, the function calls character_position to find the positions of all the characters of char_list in the input text. It then constructs a new string from which the char_list characters have been removed.

• Examples:

– if text = "Hello. Is it me you're looking for?" and char_list = ['.','?'], then the function will return the new string "Hello Is it me you're looking for"

– if text = "Hello. Is it me you're looking for?" and char_list = ['H','i'], then the function will return the new string "ello. Is t me you're lookng for?"

• NOTE: Do not convert the text into a list and do not use the replace or translate functions.
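The construction described above could be sketched as follows. This is a hedged illustration, not a reference solution: it includes compact versions of the two earlier helpers so that it runs on its own, and it builds the result by string concatenation to respect the restriction on lists, replace and translate.

```python
def is_list_char(char_list):
    """Return True if every element of char_list is a string of length 1."""
    for element in char_list:
        if not isinstance(element, str) or len(element) != 1:
            return False
    return True


def character_position(text, char_list):
    """Return the indices of text whose character appears in char_list."""
    if not is_list_char(char_list):
        raise TypeError("The input list should contain only characters")
    positions = []
    for i in range(len(text)):
        if text[i] in char_list:
            positions.append(i)
    return positions


def remove_characters(text, char_list):
    """Return text without the characters listed in char_list."""
    if len(char_list) == 0:
        return text
    positions = character_position(text, char_list)
    filtered_text = ''
    # Copy every character whose index was not reported as a match.
    for i in range(len(text)):
        if i not in positions:
            filtered_text += text[i]
    return filtered_text


text = "Hello. Is it me you're looking for?"
print(remove_characters(text, ['.', '?']))  # Hello Is it me you're looking for
print(remove_characters(text, ['H', 'i']))  # ello. Is t me you're lookng for?
```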

4. count_words_category(text, word_list) [10 points]:

• Input parameters:

– text: a string

– word_list: a list of words

• Output parameter: nb_occurrences, the total number of times the words in word_list appear in the input text

• Description: The function computes the total number of times the words in the list of words appear in the input text. You may use the count method here.

• Examples:

– if text = 'I love this movie! It is great and the adventure scenes are fun. I highly recommend it! It is really great!' and word_list = ['great','love','recommend','laugh','happy','brilliant'] then the function will return the value 4, because the words 'love' and 'recommend' each appear once and the word 'great' appears twice.

– if text = 'My pizza was awful and cold. This is really a bad place!' and word_list = ['terrible','awful','hideous','sad','cry','bad'] then the function will return the value 2, because the words 'awful' and 'bad' each appear once in the text.
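A short sketch of this count using the string count method, which the description allows. Note that str.count matches substrings, so a word such as 'bad' would also be counted inside a longer word like 'badge'; whether that is acceptable is not stated in the handout.

```python
def count_words_category(text, word_list):
    """Return how many times the words of word_list occur in text."""
    nb_occurrences = 0
    for word in word_list:
        # str.count returns the number of non-overlapping occurrences of word.
        nb_occurrences += text.count(word)
    return nb_occurrences


review = ('I love this movie! It is great and the adventure scenes are fun. '
          'I highly recommend it! It is really great!')
positive = ['great', 'love', 'recommend', 'laugh', 'happy', 'brilliant']
print(count_words_category(review, positive))  # 4
```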

5. count_number_words(text) [5 points]:

• Input parameters:

– text (string)

• Output parameter: number_words, the number of words in the text

• Description: The function counts the number of words in the input string. It calls character_position to find the positions of the space character. Do not convert the input text into a list.

• Examples:

– if text = 'I love this movie! It is great and the adventure scenes are fun. I highly recommend it! But the theatre was terrible and there was an awful smell'. Then the function will return the value 28 (there are 28 words in the sentence)

– if text = 'Hello! How are you?'. Then the function will return 4.
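One way to sketch this count is to add 1 to the number of spaces found by character_position. The sketch assumes that words are separated by exactly one space and that the text has no leading or trailing spaces. Returning 0 for an empty string is also an assumption that the handout does not specify.

```python
def is_list_char(char_list):
    """Return True if every element of char_list is a string of length 1."""
    for element in char_list:
        if not isinstance(element, str) or len(element) != 1:
            return False
    return True


def character_position(text, char_list):
    """Return the indices of text whose character appears in char_list."""
    if not is_list_char(char_list):
        raise TypeError("The input list should contain only characters")
    positions = []
    for i in range(len(text)):
        if text[i] in char_list:
            positions.append(i)
    return positions


def count_number_words(text):
    """Return the number of space-separated words in text."""
    # Assumption: an empty text contains no words.
    if len(text) == 0:
        return 0
    space_positions = character_position(text, [' '])
    # n single spaces separate n + 1 words.
    return len(space_positions) + 1


print(count_number_words('Hello! How are you?'))  # 4
```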

6. term_frequencies(text, dictionary_word) [15 points]:

• Input parameters:

– text: string

– dictionary_word: a dictionary that has a list of words for each sentiment category

• Output parameter: dict_frequency, a dictionary that contains the frequency of the words of each category appearing in the text.

• Description: The function starts by computing the number of words in the text using count_number_words. Then, for each key in dictionary_word, it calls count_words_category. In the output dictionary, the value for each category is the number of words of that category found in the text divided by the total number of words in the text, rounded to 2 decimals.

• Examples: If we have the following text = 'I love this movie! It is great and the adventure scenes are fun. I highly recommend it! But the theatre was terrible and there was an awful smell' and the dictionary is:

{'POSITIVE':['great','love','recommend','laugh','happy','brilliant'],

'NEGATIVE':['terrible','awful','hideous','sad','cry','bad'],

'NEUTRAL':['meh','indifferent','ignore']}

The function will count 28 words in total, with 3 positive words, 2 negative words and 0 neutral words. After dividing by the total number of words, the function will return the following dictionary:

{'POSITIVE': 0.11, 'NEGATIVE': 0.07, 'NEUTRAL': 0.0}
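The computation above can be sketched as follows. To keep the example short, count_number_words is replaced here by a simplified stand-in based on str.count; this is an assumption for illustration only, since the assignment asks for a version built on character_position.

```python
def count_number_words(text):
    # Simplified stand-in (assumption): single spaces separate the words.
    return text.count(' ') + 1


def count_words_category(text, word_list):
    """Return how many times the words of word_list occur in text."""
    nb_occurrences = 0
    for word in word_list:
        nb_occurrences += text.count(word)
    return nb_occurrences


def term_frequencies(text, dictionary_word):
    """Return, per category, the share of category words in text."""
    number_words = count_number_words(text)
    dict_frequency = {}
    for category in dictionary_word:
        nb_occurrences = count_words_category(text, dictionary_word[category])
        # Frequency = category matches / total words, rounded to 2 decimals.
        dict_frequency[category] = round(nb_occurrences / number_words, 2)
    return dict_frequency


text = ('I love this movie! It is great and the adventure scenes are fun. I '
        'highly recommend it! But the theatre was terrible and there was an awful smell')
words = {'POSITIVE': ['great', 'love', 'recommend', 'laugh', 'happy', 'brilliant'],
         'NEGATIVE': ['terrible', 'awful', 'hideous', 'sad', 'cry', 'bad'],
         'NEUTRAL': ['meh', 'indifferent', 'ignore']}
print(term_frequencies(text, words))
# {'POSITIVE': 0.11, 'NEGATIVE': 0.07, 'NEUTRAL': 0.0}
```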

7. compute_polarity(dict_frequency) [15 points]:

• Input parameters:

– dict_frequency: the dictionary of frequencies as computed by the term_frequencies function.

• Output parameter: polarity (string), the most prominent sentiment observed with the given frequencies

• Description: The function traverses all the dictionary keys and returns the one that has the highest frequency. You should not use the built-in max function.

• Examples:

– If we have the following dictionary {'POSITIVE': 0.11, 'NEGATIVE': 0.07, 'NEUTRAL': 0.0}, then the function will return the string 'POSITIVE' since it has the maximum corresponding frequency

– If we have the following dictionary {'POSITIVE': 0.11, 'NEGATIVE': 0.07, 'NEUTRAL': 0.5}, then the function will return the string 'NEUTRAL'

– If we have the following dictionary {'POSITIVE': 0.07, 'NEGATIVE': 0.07, 'NEUTRAL': 0.01}, then the function will return the string 'POSITIVE', the first maximum value found.
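A possible sketch of this search without the built-in max. The strict comparison keeps the first key when several share the highest frequency, which matches the third example. Returning None for an empty dictionary is an assumption.

```python
def compute_polarity(dict_frequency):
    """Return the key of dict_frequency with the highest value."""
    polarity = None
    highest = None
    for category in dict_frequency:
        frequency = dict_frequency[category]
        # Strict '>' means a later tie never replaces the first maximum.
        if highest is None or frequency > highest:
            highest = frequency
            polarity = category
    return polarity


print(compute_polarity({'POSITIVE': 0.11, 'NEGATIVE': 0.07, 'NEUTRAL': 0.0}))   # POSITIVE
print(compute_polarity({'POSITIVE': 0.11, 'NEGATIVE': 0.07, 'NEUTRAL': 0.5}))   # NEUTRAL
print(compute_polarity({'POSITIVE': 0.07, 'NEGATIVE': 0.07, 'NEUTRAL': 0.01}))  # POSITIVE
```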

8. read_text(text_path) [10 points]:

• Input parameters:

– text_path: a string containing the file name.

• Output parameter: text_list, a list of strings

• Description: The function opens and reads the text file located at the given path. It reads the text line by line. Each line in the text follows the structure: user pseudonym, comment (separated by ','). The function ignores the user pseudonym and adds only the comment to text_list. Each element in the list is one comment from one line of the text. You should not use the readline or readlines functions.

• Examples: If we have a text file named 'text.txt' with the following content:

user1,Hello

user2,How are you today?

user3,Good I hope!

user4,Ok take care!

Then the function will return the list of strings:

['Hello \n', 'How are you today? \n', 'Good I hope! \n', 'Ok take care! \n']
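A hedged sketch of this reader, iterating over the file object directly instead of calling readline or readlines. Two choices here are assumptions: the line is split at the first comma only, so commas inside a comment are kept, and lines without any comma are skipped.

```python
def read_text(text_path):
    """Return the comment part of each 'pseudonym,comment' line of the file."""
    text_list = []
    with open(text_path, 'r') as text_file:
        # Iterating over the file object yields one line at a time.
        for line in text_file:
            parts = line.split(',', 1)  # split at the first comma only
            if len(parts) == 2:
                # parts[0] is the pseudonym; parts[1] keeps its trailing newline.
                text_list.append(parts[1])
    return text_list
```

Each returned comment still ends with the newline character of its line, as in the example output above.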

9. read_pickle [5 points]:

• Input parameters:

– a pickle file

• Output parameter: word_dict, the content of the pickle file

• Description: The function loads the object stored in any pickle file and returns it.

• Examples:

– If the function reads the provided pickle file "sentiment_dict.pkl", it will return the following dictionary:

{'POSITIVE': ['great', 'love', 'recommend', 'laugh', 'happy', 'brilliant'],

'NEGATIVE': ['terrible', 'awful', 'hideous', 'sad', 'cry', 'bad'],

'NEUTRAL': ['meh', 'indifferent', 'ignore']}
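Loading the object could look like the sketch below. The handout does not name the parameter, so pickle_path is a hypothetical name, and the sketch assumes the function receives the path to the file as a string. Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.

```python
import pickle


def read_pickle(pickle_path):
    """Return the object stored in the pickle file at pickle_path."""
    # pickle_path is a hypothetical parameter name; pickle files are binary,
    # so the file is opened in 'rb' mode.
    with open(pickle_path, 'rb') as pickle_file:
        word_dict = pickle.load(pickle_file)
    return word_dict
```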

10. analyse_text(text_path, dict_path) [14 points]:

• Input parameters:

– text_path: the path to a given text file

– dict_path: the path to the given dictionary saved in a pickle file

• Output parameter: list_polarity, a list of the computed polarity for each line in the text file

• Description: The function reads the text file and the pickle file. For each line in the text, the function will: put the text in lower case, remove leading and trailing whitespace, remove the stop words from the text using this list of stop words ['!','.','?',';','\n'], compute the term frequencies and the polarity, and add the computed polarity value to list_polarity.

• Examples:

– If we test the function with the provided text and pickle file, it will return the following list:

['POSITIVE', 'NEGATIVE', 'NEUTRAL', 'POSITIVE']
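The pipeline described above can be sketched end to end as follows. The helpers are compact stand-ins for the functions of the earlier questions (input validation is omitted, and the word counts are merged into term_frequencies) so that the block runs on its own. It also assumes that the words in the pickled dictionary are lower case.

```python
import pickle


def character_position(text, char_list):
    # Compact stand-in: indices of the characters of text found in char_list.
    return [i for i in range(len(text)) if text[i] in char_list]


def remove_characters(text, char_list):
    positions = character_position(text, char_list)
    filtered_text = ''
    for i in range(len(text)):
        if i not in positions:
            filtered_text += text[i]
    return filtered_text


def term_frequencies(text, dictionary_word):
    # Number of words = number of spaces + 1 (single spaces assumed).
    number_words = len(character_position(text, [' '])) + 1
    dict_frequency = {}
    for category in dictionary_word:
        nb_occurrences = 0
        for word in dictionary_word[category]:
            nb_occurrences += text.count(word)
        dict_frequency[category] = round(nb_occurrences / number_words, 2)
    return dict_frequency


def compute_polarity(dict_frequency):
    polarity, highest = None, None
    for category in dict_frequency:
        if highest is None or dict_frequency[category] > highest:
            highest = dict_frequency[category]
            polarity = category
    return polarity


def read_text(text_path):
    text_list = []
    with open(text_path, 'r') as text_file:
        for line in text_file:
            parts = line.split(',', 1)
            if len(parts) == 2:
                text_list.append(parts[1])
    return text_list


def read_pickle(pickle_path):
    with open(pickle_path, 'rb') as pickle_file:
        return pickle.load(pickle_file)


def analyse_text(text_path, dict_path):
    """Return the polarity computed for each line of the text file."""
    text_list = read_text(text_path)
    word_dict = read_pickle(dict_path)
    stop_words = ['!', '.', '?', ';', '\n']
    list_polarity = []
    for comment in text_list:
        # Lower case, trim, then drop the stop characters before scoring.
        cleaned = remove_characters(comment.lower().strip(), stop_words)
        frequencies = term_frequencies(cleaned, word_dict)
        list_polarity.append(compute_polarity(frequencies))
    return list_polarity
```

With the tie rule of compute_polarity, a line that contains no word from any category gets the first key of the dictionary as its polarity.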
