联系方式

  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-21:00
  • 微信:codinghelp

您当前位置:首页 >> Java编程Java编程

日期:2024-12-24 11:04

Project.md 2024-11-10

1 / 3

[CS209A-24Fall] Assignment2 (100 points)

Background

In the process of software development, many questions will arise. Developers may resort to Q&A website to

post questions and seek answers. Stack Overflow is such a Q&A website for programmers, and it belongs to the Stack Exchange Network. Stack

Overflow serves as a platform for users to ask and answer questions, and, through membership and active

participation, to vote questions and answers up or down and edit questions and answers in a fashion similar

to a wiki. Users of Stack Overflow can earn reputation points and "badges"; for example, a person is awarded

10 reputation points for receiving an "up" vote on a question or an answer to a question, and can receive

badges for their valued contributions. Users unlock new privileges with an increase in reputation, like the

ability to vote, comment, and even edit other people's posts. In this final project, we'll use Spring Boot to develop a web application that stores, analyzes, and visualizes

Stack Overflow Q&A data w.r.t. java programming, with the purpose of understanding the common

questions, answers, and resolution activities associated with Java programming. Data Collection (10 points)

On Stack Overflow, questions related to Java programming are typically tagged java. You could use this java

tag to identify java-related questions. A question and all of its answers and comments are together referred to

as a thread. For java-related threads on Stack Overflow, we are interested in answering a list of questions as described

below. You should first collect proper data from Stack Overflow to answer these questions. Please check the

official Stack Overflow REST API documentation to learn the REST APIs for collecting different types of data. . You may need to create a Stack Overflow account in order to use its full REST API service. · API requests are subject to rate limits. Please carefully design and execute your requests, otherwise

you may reach your daily quota quickly. . Connections to Stack Overflow REST service maybe unstable sometimes. So, please start the data

collection ASAP!

There are over 1 million threads tagged with java on Stack Overflow. You DON'T have to collect them all. Yet, you should collect data for at least 1000 threads in order to get meaningful insights from the data analysis. Important:

Data collection is offline, meaning that you need to collect and persist the data first. It is recommended that

you use a database (e.g., PostgreSQL, MySQL, etc.) to store the data. However, it is also fine if you store the

data in plain files. In other words, when users interact with your application, the server should get the data

from your local database (or local files), instead of sending REST requests to Stack Overflow on the fly. Hence, the data analysis for the below questions should be performed on the dataset you collected. That is, we first collect a subset of Stack Overflow data (e.g., 1000 threads tagged java) and then answer the

following questions using this subset.

Project.md 2024-11-10

2 / 3

Part I: Data Analysis (70 points)

For each question from this part, you should: . Figure out which data is needed to answer the question

. Design and implement the data analysis on the backend

. Visualize the results on the frontend using proper charts. In other words, when interacting with your web application from the browser, users could select interested

analysis, which sends requests to the server; the server performs corresponding data analysis and returns the

results back to the frontend, which visualizes the results on the webpages. Your work will be evaluated by: . whether the data analysis is meaningful and relevant, i.e., it can indeed answer the question with proper

they want

instantly by looking at the visualization. Take a look at the data visualization catalogue for inspirations. 1. Java Topics (10 points)

We have covered various topics in this course, e.g., generics, collections, I/O, lambda, multithreading, socket, etc. It's interesting to know, what are the top N (N>1, you may choose a proper N depending on your data

and your UI design, same below) topics that are most frequently asked on Stack Overflow?

2. User Engagement (15 points)

What are the top N topics that have the most engagement from users with higher reputation scores? User

engagement means any user activity (e.g., edit, answer, comment, upvote, downvote, etc.) on the thread. 3. Common Mistakes (15 points)

Developers make mistakes, which result in bugs in the code. Bugs manifest themselves as errors or

exceptions, which can be roughly classified as: . Fatal errors: errors like OutOfMemoryError that cannot be recovered at runtime. . Exceptions: checked exceptions and runtime exceptions that can be handled programmatically by

developers. What are the top N errors and exceptions that are frequently discussed by Java developers?

Note that, tags are high-level information and may not include low-level errors or exceptions. Hence, for this

question, you cannot only use tag information. You need to further analyze thread content (e.g., question text

and answer text) to identify error or exception related information, probably using advanced techniques such

as regular expression matching. 4. Answer Quality (30 points)

We consider an answer to be "high-quality" if it is accepted or has many upvotes. It's useful to know, what

factors contribute to high-quality answers?

3 / 3

Project.md 2024-11-10

Please investigate the following factors: . The elapsed time between question creation and answer creation (e.g., whether the first posted answer

tends to be accepted?). . The reputation of the user that creates the answer (e.g., whether answers created by high-reputation

users tend to be accepted or have more upvotes?). In addition to these 2 factors, you should also propose another 1 factor that may contribute to the quality of

answers. For each of the 3 factors, use proper data analysis and visualizations to demonstrate whether the factor

contributes to high-quality answers or not. Part II: RESTful Service (20 points)

Your application should also provide a REST service that answers the following two questions, so that users

may use RESTful APIs to GET the answers they want. The required REST services include: . Topic frequency: users could query for the frequency of a specific topic. Users could also query for the

top N topics sorted by frequency. . Bug frequency: users could query for the frequency of a specific error or exception. Users could also

query for the top N errors or exceptions sorted by frequency. Here, you could reuse the data analysis from Part I. Responses of the REST requests should be in json format. Requirements

Data Analysis

You should implement the data analysis by yourself, using Java features such as Collections, Lambda, and

Stream. You CANNOT feed the data to AI, ask AI to do the analysis, and use AI responses as your data analysis results. You will get 0 point for the question if you do so. Data analysis results should be dynamically generated by the server everytime clients send a request. You

SHOULD NOT precompute the results and stored it as a static content then simply display the precomputed

static content on the frontend. 20 points will be deducted if you do so. Web Framework

You should only use Spring Boot as the web framework. Frontend

Frontend functionalities, such as data visualization and interactive controls, could be implemented in any

programming language (e.g., JavaScript, HTML, CSS, etc.) with any 3rd-party libraries or framework.


版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。 站长地图

python代写
微信客服:codinghelp