TAT 7008 assignment writing service, Python programming assignment writing - Python programming

Date: 2022-11-28 11:05

TAT 7008 - Assignment 3

Note: A3 is worth 20% of the overall assessment. The 100 points in A3 will be rescaled to 20% of the final score.

Web Scraping

1. (25 points) Crawl information from https://www.sciencedirect.com

(1) (13 points) Crawl key information about all articles published in 2022 from https://www.sciencedirect.com/journal/journal-of-econometrics/issues, including year, volume, article content, title, authors and pages. Crawl volumes 226 to 230 only.

(2) (6 points) Remove “\xa0” from volume_name and store the crawled data in a pandas DataFrame.

(3) (6 points) Filter out the rows whose author is null, then find the top 10 authors who published the most articles.
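Steps (2) and (3) can be sketched on a small hand-made table. This is a minimal sketch, not the assignment's reference solution: the rows below are made-up placeholders rather than crawled data, the column names are assumptions, and it assumes multiple authors are stored in one comma-separated string per article. The non-breaking space is replaced by a regular space so that volume labels stay readable; pass "" instead if it should be deleted outright.

```python
import pandas as pd

# Placeholder rows (NOT real crawled data); column names are assumptions.
records = [
    {"year": 2022, "volume_name": "Volume\xa0226, Issue\xa01",
     "title": "Placeholder title 1", "authors": "Author A, Author B"},
    {"year": 2022, "volume_name": "Volume\xa0227, Issue\xa02",
     "title": "Placeholder title 2", "authors": "Author A"},
    {"year": 2022, "volume_name": "Volume\xa0228, Issue\xa01",
     "title": "Placeholder title 3", "authors": None},
    {"year": 2022, "volume_name": "Volume\xa0230, Issue\xa02",
     "title": "Placeholder title 4", "authors": "Author C"},
]

df = pd.DataFrame(records)

# (2) Replace the non-breaking space "\xa0" with a regular space.
df["volume_name"] = df["volume_name"].str.replace("\xa0", " ", regex=False)

# (3) Drop rows with a null author field.
df = df[df["authors"].notna()]

# Split the author string into one row per author, then count articles.
top_authors = (
    df["authors"]
    .str.split(",")
    .explode()
    .str.strip()
    .value_counts()
    .head(10)
)
print(top_authors)
```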

Hint:

i. Click the button of the targeted item

ii. Pass the html to BeautifulSoup and get all links

iii. Use requests to get article content, title, authors and pages for each block

For this example, the extracted fields are:

article content: Research article

title: Identification in nonparametric models for dynamic treatment effects

authors: Sukjin Han

pages: Pages 132-147
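The parsing part of the hint can be sketched offline as follows. This is a sketch under loud assumptions: the HTML snippet and every tag and class name in it (article-item, article-type, article-title, article-authors, article-pages) are hypothetical and do not reflect the real ScienceDirect markup, which has to be inspected in the browser first. In a real crawl the html string would come from the page source or from requests; here it is inlined so the example runs without network access.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for one article block; the real page differs.
SAMPLE_HTML = """
<ol>
  <li class="article-item">
    <span class="article-type">Research article</span>
    <h3 class="article-title">Identification in nonparametric models for dynamic treatment effects</h3>
    <div class="article-authors">Sukjin Han</div>
    <div class="article-pages">Pages 132-147</div>
  </li>
</ol>
"""


def text_or_none(block, selector):
    """Return the stripped text of the first match, or None if absent."""
    node = block.select_one(selector)
    return node.get_text(strip=True) if node else None


def parse_articles(html):
    """Extract one dict per article block from an issue page."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for block in soup.select("li.article-item"):  # hypothetical selector
        rows.append({
            "article_content": text_or_none(block, ".article-type"),
            "title": text_or_none(block, ".article-title"),
            "authors": text_or_none(block, ".article-authors"),
            "pages": text_or_none(block, ".article-pages"),
        })
    return rows


rows = parse_articles(SAMPLE_HTML)
print(rows)
```

Returning None for a missing field keeps articles without authors in the table, so they can be filtered out later in the DataFrame step.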

Scikit-learn

2. (10 points) Handwritten digits dataset loading and preprocessing

(1) (2 points) Load the digits data by load_digits.

(2) (4 points) Use MinMaxScaler to normalize the covariates X.

(3) (4 points) Split the data into training and test sets with test_size=0.2 and random_state=2020.
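The three steps above can be sketched as follows. This is one possible reading of the question, not a reference solution: it scales the full matrix X before splitting, because that is the order in which the steps are listed. Fitting the scaler on the training split only would be the stricter choice if information leakage from the test set is a concern.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# (1) Load the handwritten digits data: 8x8 images flattened to 64 features.
digits = load_digits()
X, y = digits.data, digits.target

# (2) Rescale every feature to the range [0, 1].
X_scaled = MinMaxScaler().fit_transform(X)

# (3) Hold out 20% of the samples as a test set.
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.2, random_state=2020
)

print(X_train.shape, X_test.shape)
```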

3. (15 points) Following question 2, fit the model specified below with different hyperparameters and report the performance.

(1) (7 points) Fit the naive Bayes model MultinomialNB on the digits training set with different values of the smoothing parameter alpha, α ∈ {1, 2, …, 20}.

(2) (4 points) Record the accuracy scores on the test set for each α.

(3) (4 points) Draw the line plot of the accuracy scores versus different α.
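A minimal sketch of the loop over α, assuming the same preprocessing as in question 2 (repeated here so the block runs on its own). The output file name nb_accuracy_vs_alpha.png is an arbitrary choice for illustration; in a notebook, plt.show() can be used instead of saving.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import MinMaxScaler

# Same preprocessing as in question 2.
X, y = load_digits(return_X_y=True)
X_scaled = MinMaxScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.2, random_state=2020
)

alphas = list(range(1, 21))
scores = []
for alpha in alphas:
    # (1) Fit MultinomialNB with the current smoothing parameter.
    model = MultinomialNB(alpha=alpha).fit(X_train, y_train)
    # (2) Record the test-set accuracy for this alpha.
    scores.append(accuracy_score(y_test, model.predict(X_test)))

# (3) Line plot of accuracy versus alpha.
fig, ax = plt.subplots()
ax.plot(alphas, scores, marker="o")
ax.set_xlabel("alpha")
ax.set_ylabel("test accuracy")
ax.set_xticks(alphas)
fig.savefig("nb_accuracy_vs_alpha.png")  # arbitrary file name
```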

4. (15 points) Following question 2, apply dimensionality reduction methods to the digits dataset.

(1) (3 points) Fit a Principal Component Analysis (PCA, n_components=2) model to the digits training set for dimensionality reduction.

(2) (3 points) Apply the model from (1) to the training and test sets to compute their 2-dimensional embeddings.

(3) (3 points) Fit a nearest neighbor classifier (KNN, n_neighbors=3) on the embedded training set. Compute the nearest neighbor accuracy on the embedded test set, plot the projected test set points and show the evaluation score.

(4) (6 points) Use Neighborhood Components Analysis (NCA, n_components=2) for dimensionality reduction, and repeat (1), (2) and (3).

Note: Output the results in the format of the following image; no output is needed for (1) and (2).
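The PCA and NCA pipelines differ only in the reducer, so both can be sketched with one helper. This is a sketch under assumptions: the helper name embed_and_score, the random_state values of the reducers, the colour map and the output file name are illustrative choices, not part of the assignment, and the plot layout is a guess since the reference image is not reproduced here. NCA is supervised, so it uses y_train during fitting, whereas PCA ignores the labels.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier, NeighborhoodComponentsAnalysis
from sklearn.preprocessing import MinMaxScaler

# Same preprocessing as in question 2.
X, y = load_digits(return_X_y=True)
X_scaled = MinMaxScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.2, random_state=2020
)


def embed_and_score(reducer, name, ax):
    """Fit a 2-D reducer, classify with 3-NN, and plot the test embedding."""
    reducer.fit(X_train, y_train)          # PCA ignores y; NCA requires it
    Z_train = reducer.transform(X_train)
    Z_test = reducer.transform(X_test)
    knn = KNeighborsClassifier(n_neighbors=3).fit(Z_train, y_train)
    acc = knn.score(Z_test, y_test)
    ax.scatter(Z_test[:, 0], Z_test[:, 1], c=y_test, cmap="tab10", s=15)
    ax.set_title(f"{name}, KNN (k=3)\ntest accuracy = {acc:.2f}")
    return Z_test, acc


fig, axes = plt.subplots(1, 2, figsize=(10, 4))
Z_pca, acc_pca = embed_and_score(
    PCA(n_components=2, random_state=2020), "PCA", axes[0]
)
Z_nca, acc_nca = embed_and_score(
    NeighborhoodComponentsAnalysis(n_components=2, random_state=2020),
    "NCA", axes[1],
)
fig.savefig("digits_embeddings.png")  # arbitrary file name
print(acc_pca, acc_nca)
```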

Computer vision

5. (18 points) Face and Eye Detection

(1) (12 points) Write code to detect the faces and the eyes in face.jpg. Draw a red rectangle around each face and a green rectangle around each eye.

(2) (6 points) Suppose we want to open the front camera for video capture and perform face and eye detection on the video. How can we modify the code above?

Hint: you may use the auxiliary .xml files and the detection algorithm based on Haar-like features provided by OpenCV.

Natural language processing

6. (17 points) Word embedding (Skip-gram)

See the attached Jupyter notebook with partially finished code: wb_partial_code.ipynb
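The notebook itself is not reproduced here, so the following is only a generic illustration of the data-preparation idea behind Skip-gram, not the notebook's code: each word is paired with the words that fall inside a fixed-size window around it, and the model is then trained to predict the context word from the center word. The function name skipgram_pairs and the window size are illustrative assumptions.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token and its neighbours."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


sentence = "the quick brown fox".split()
pairs = skipgram_pairs(sentence, window=1)
print(pairs)
```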
