
Date: 2024-10-25 09:09


Assignment 1: Web Scraping & Data Analysis

Sep 31, 2024

In this assignment, you should work with data from https://www.themoviedb.org/movie (Online Popular Movies Platform)

The Movie Database (TMDb) is a popular platform for movie enthusiasts, offering a vast collection of movies from all genres and regions. TMDb provides users with detailed information such as movie titles, release dates, cast, crew, genres, ratings, and more. It's a go-to source for information about classic and upcoming films, as well as the latest TV shows.

Everyone is interested in great movies, but with so many films released each year, how can we find the best ones? Scraping high-quality data from movie websites is crucial. In this project, we will utilize the skills we've learned with requests and regular expressions to scrape essential movie details from The Movie Database (TMDb) website, allowing us to build a comprehensive dataset for further analysis.
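As a minimal sketch of the requests-plus-regular-expressions approach mentioned above: the HTML fragment, the class names, and the patterns below are hypothetical stand-ins, not TMDb's actual markup, which has to be inspected in the browser before any pattern is written. The helper names (fetch_html, first_match, parse_movie) are illustrative only.

```python
import re

# Hypothetical fragment standing in for part of a movie page; the real
# TMDb markup must be inspected first and will differ from this.
SAMPLE_HTML = """
<div class="title">
  <h2><a href="/movie/0000">Example Movie</a> <span class="release_date">(2024)</span></h2>
</div>
<div class="user_score" data-percent="78"></div>
<div class="overview"><p>An example description.</p></div>
"""

# Assumed patterns, written against SAMPLE_HTML only.
TITLE_RE = re.compile(r'<h2><a href="[^"]*">(.*?)</a>', re.S)
YEAR_RE = re.compile(r'class="release_date">\((\d{4})\)')
SCORE_RE = re.compile(r'data-percent="(\d+)"')
OVERVIEW_RE = re.compile(r'class="overview"><p>(.*?)</p>', re.S)


def fetch_html(url):
    """Download a page; requests is imported here so parsing works offline."""
    import requests

    # The browser-like User-Agent is an assumption; some sites reject the default.
    response = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=10)
    response.raise_for_status()
    return response.text


def first_match(pattern, html):
    """Return the first captured group, stripped, or None if nothing matches."""
    match = pattern.search(html)
    return match.group(1).strip() if match else None


def parse_movie(html):
    """Extract a few fields from one page's HTML into a dict."""
    return {
        "Title of Movie": first_match(TITLE_RE, html),
        "Year": first_match(YEAR_RE, html),
        "User Score": first_match(SCORE_RE, html),
        "Description": first_match(OVERVIEW_RE, html),
    }


movie = parse_movie(SAMPLE_HTML)
print(movie)
```

Returning None for a field that does not match keeps one malformed page from stopping a 200-movie run; such rows can be checked or dropped later.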

Task 1. You are required to scrape 200 movies from the website and save the results to ‘your_name+id.csv’. This file should contain data with the following columns: (40 marks)

Column            Marks
Title of Movie    5 marks
Year              5 marks
User Score        5 marks
Description       5 marks
Director          5 marks
Screenplay        5 marks
Type              5 marks
Revenue           5 marks

          You are free to explore data with more properties if needed.
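One possible way to write the scraped rows to CSV with the columns listed in Task 1 is sketched below, using the standard csv module. The write_movies helper and the sample row are made up for illustration; an in-memory buffer stands in for the real output file.

```python
import csv
import io

# Column names taken from the Task 1 list.
COLUMNS = ["Title of Movie", "Year", "User Score", "Description",
           "Director", "Screenplay", "Type", "Revenue"]


def write_movies(rows, fileobj):
    """Write movie dicts as CSV; missing fields become empty cells."""
    writer = csv.DictWriter(fileobj, fieldnames=COLUMNS, restval="",
                            extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)


# Made-up row for illustration only, not real scraped data.
rows = [{
    "Title of Movie": "Example Movie", "Year": "2024", "User Score": "78",
    "Description": "An example, with a comma.", "Director": "A. Director",
    "Screenplay": "B. Writer", "Type": "Drama, Comedy", "Revenue": "$1,000,000.00",
}]

buffer = io.StringIO()  # stands in for an open file
write_movies(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To produce the actual submission file, the same function can be given a file opened with open("your_name+id.csv", "w", newline="", encoding="utf-8"). The csv module quotes fields that contain commas, so descriptions and multi-genre values survive a round trip.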

Task 2. You are required to analyse the data. What do you think is interesting about this data? Tell a story about something interesting you have discovered by looking at the data. (60 marks)

For example, which one is the best movie you might watch? Does the type of movie affect movie sales? Which category of movies sells the best?
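A question such as "does the type of movie affect sales?" could start from a per-genre average, sketched below with the standard library only. The sketch assumes that Revenue was saved as a string like "$1,000,000.00" (with "-" when unknown) and that Type holds comma-separated genres; both are assumptions about how the data was scraped, and the rows are invented.

```python
from collections import defaultdict
from statistics import mean


def parse_revenue(text):
    """Turn a string such as '$1,000,000.00' into a float; None if not numeric."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    try:
        return float(cleaned)
    except ValueError:
        return None


def mean_revenue_by_type(rows):
    """Average revenue per genre; a movie with several genres counts in each."""
    totals = defaultdict(list)
    for row in rows:
        revenue = parse_revenue(row["Revenue"])
        if revenue is None:
            continue  # skip rows with missing revenue such as '-'
        for genre in row["Type"].split(","):
            totals[genre.strip()].append(revenue)
    return {genre: mean(values) for genre, values in totals.items()}


# Made-up rows for illustration only, not real scraped data.
rows = [
    {"Type": "Action, Comedy", "Revenue": "$300.00"},
    {"Type": "Action", "Revenue": "$100.00"},
    {"Type": "Drama", "Revenue": "-"},
]
averages = mean_revenue_by_type(rows)
print(averages)
```

Counting a multi-genre movie once per genre is a design choice that should be stated in the report, since it affects the averages. The resulting dictionary can feed a bar chart directly.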

Note: This is an open-topic project. You are required to propose a novel topic and support your hypotheses (viewpoints) with data analysis and figures.

The report and the runnable code (web scraping + data analysis) should be submitted as a Jupyter Notebook file.

Submission Checklist:

Yes/No    Items
          Jupyter Notebook code
          your_name+id.csv

Marking Guidelines

Marking Criteria

Idea (5 marks)

  • Presents a novel idea.
  • Clearly demonstrates your viewpoints.
  • Demonstrates good understanding of the topic.

Discussion (30 marks)

  • Provides convincing arguments for your viewpoints.
  • Backs up arguments with appropriate data analysis results.
  • Visualizes data analysis results using more than 5 figures.

Organization (20 marks)

  • Use of figures to support ideas discussed in the report.
  • The quality of the figures.
  • These figures should be informative.
  • Use of sub-titles and/or clear topic sentences.
  • Use multiple visualization methods (line, bar, pie chart, etc.).

Writing Style (5 marks)

  • Concise writing style.
  • Strong scientific writing without grammatical errors.

