AI3013 Machine Learning Course Project
Description:
This is a GROUP project (each group should have 4-6 students), which aims at applying
machine learning models as well as machine learning techniques (including but not limited
to those covered in our lectures) to solve complex real-world tasks using Python.
Notice: This project should differ from the one you are undertaking in the Machine Learning
Workshop Course.
Notice on Deep Learning Models:
You may choose to work on deep learning models. However, since our course mainly focuses on
machine learning models and techniques, a deep learning model will not be considered
superior to other machine learning models if you merely reproduce a model designed by
others. Also, training deep learning models can be very time-consuming, so make sure you have
the necessary computing resources.
Project Requirement:
Problem Selection:
• Choose a real-world problem from a domain of interest (e.g., healthcare, finance,
image recognition, natural language processing, etc.).
• Describe the problem, including data sources and the type of machine learning model
that will be applied (e.g., regression, classification, clustering, etc.).
Dataset Selection:
• Choose a dataset from public repositories (e.g., UCI Machine Learning Repository,
Kaggle) suitable for this topic.
• Ensure the dataset has a sufficient number of samples and features to allow for
meaningful analysis and model comparison.
• Apply appropriate data preprocessing steps (e.g., handling missing values, encoding
categorical features, scaling).
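The preprocessing steps listed above could be sketched as follows, using only Pandas and NumPy. This is a minimal illustration, not a required pipeline: the function name `preprocess`, the toy columns (`age`, `income`, `city`), and the choice of median/mode imputation and z-score scaling are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

def preprocess(df, numeric_cols, categorical_cols):
    """Impute, standardize, and one-hot encode; returns a float NumPy array.

    Hypothetical helper: median/mode imputation and z-score scaling are
    example choices, not requirements of the project.
    """
    df = df.copy()
    # Handle missing values: median for numeric, most frequent value for categorical.
    for col in numeric_cols:
        df[col] = df[col].fillna(df[col].median())
    for col in categorical_cols:
        df[col] = df[col].fillna(df[col].mode().iloc[0])
    # Scale numeric features to zero mean and unit variance.
    num = df[numeric_cols].to_numpy(dtype=float)
    std = num.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    num = (num - num.mean(axis=0)) / std
    # Encode categorical features as one-hot indicator columns.
    cat = pd.get_dummies(df[categorical_cols], columns=categorical_cols)
    return np.hstack([num, cat.to_numpy(dtype=float)])

# Toy data with missing entries (made up for illustration).
toy = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0],
    "income": [30.0, 50.0, np.nan, 70.0],
    "city": ["A", "B", None, "A"],
})
X = preprocess(toy, ["age", "income"], ["city"])
print(X.shape)
```

In an actual experiment, the imputation and scaling statistics would normally be computed on the training split only and then reused on the validation and test splits, so that no information leaks from held-out data.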
Model Theory and Implementation:
• Select and implement at least 2 machine learning models for comparison.
• Provide a comprehensive explanation of the theoretical background of the chosen
models (e.g., loss functions, optimization techniques, and assumptions).
• Discuss the strengths and weaknesses of the chosen models.
• Include mathematical derivations where relevant (e.g., gradient descent for linear
regression).
• Implement the selected models From Scratch without using any existing machine
learning libraries (e.g., scikit-learn, TensorFlow, Keras, etc.). The implementation
should be done in Python using only basic libraries such as NumPy, Pandas, and
Matplotlib.
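As an illustration of a from-scratch implementation, the sketch below fits linear regression by batch gradient descent using only NumPy. For the mean squared error loss J(w, b) = (1/n) * sum((Xw + b - y)^2), the gradients are (2/n) * X^T (Xw + b - y) with respect to w and (2/n) * sum(Xw + b - y) with respect to b. The function name `fit_linear_regression`, the learning rate, the iteration count, and the synthetic data are assumptions chosen for the example.

```python
import numpy as np

def fit_linear_regression(X, y, lr=0.1, n_iters=2000):
    """Batch gradient descent on the mean squared error (illustrative sketch)."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(n_iters):
        residual = X @ w + b - y                 # prediction error, shape (n,)
        grad_w = (2.0 / n) * (X.T @ residual)    # dJ/dw
        grad_b = (2.0 / n) * residual.sum()      # dJ/db
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Synthetic, noiseless data generated from known parameters (made up for the demo).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 2))
y_demo = X_demo @ np.array([3.0, -2.0]) + 1.0

w_hat, b_hat = fit_linear_regression(X_demo, y_demo)
print(w_hat, b_hat)
```

A fixed learning rate works here because the features are already on a comparable scale; on real data, feature scaling or a smaller step size may be needed for the updates to converge.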
Model Evaluation:
• Evaluate each model using suitable metrics (e.g., accuracy, precision, recall, F1 score,
RMSE) for the problem.
• Use cross-validation to ensure model robustness and avoid overfitting.
• Analyze the behavior of the models based on the dataset, including bias-variance
trade-offs, overfitting, and underfitting.
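The evaluation steps above can also be written without any machine learning library. The sketch below shows one possible way to compute precision, recall, and F1 for a binary task and to run k-fold cross-validation with NumPy. All names (`precision_recall_f1`, `k_fold_indices`, `cross_validate`) and the `fit`/`predict`/`score` callback convention are hypothetical and chosen only for this example.

```python
import numpy as np

def precision_recall_f1(y_true, y_pred, positive=1):
    """Binary precision, recall, and F1 computed from raw counts."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == positive) & (y_true == positive))
    fp = np.sum((y_pred == positive) & (y_true != positive))
    fn = np.sum((y_pred != positive) & (y_true == positive))
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return float(precision), float(recall), float(f1)

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for shuffled k-fold splitting."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, folds[i]

def cross_validate(fit, predict, score, X, y, k=5, seed=0):
    """Train on k-1 folds, score on the held-out fold, return all k scores."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(X), k, seed):
        model = fit(X[train_idx], y[train_idx])
        scores.append(score(y[val_idx], predict(model, X[val_idx])))
    return np.array(scores)

# Usage with a trivial mean-predictor baseline and RMSE as the score.
rmse = lambda y_true, y_pred: float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
X_cv = np.arange(10, dtype=float).reshape(-1, 1)
y_cv = np.ones(10)
cv_scores = cross_validate(
    fit=lambda X, y: y.mean(),
    predict=lambda model, X: np.full(len(X), model),
    score=rmse,
    X=X_cv, y=y_cv, k=5,
)
print(cv_scores.mean(), cv_scores.std())
```

Reporting both the mean and the spread of the fold scores gives a rough indication of variance: a large gap between training and validation scores suggests overfitting, while poor scores on both suggest underfitting.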
Analysis and Comparison:
• Compare the models in terms of:
o Performance (accuracy, precision, etc.).
o Computational complexity (training time, memory usage).
o Suitability for the dataset (e.g., which model performs best, why).
• Provide a comparison of the models' performances with appropriate visualizations
(e.g., bar plots or tables comparing metrics).
• Discuss how the assumptions of each model affect its suitability for the problem.
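For the computational side of the comparison, training time and peak memory can be measured with the Python standard library. The sketch below is one possible approach, assuming each model exposes a `fit(X, y)` function; `measure_training`, the two toy models, and the synthetic data are hypothetical and serve only as placeholders for your own implementations. Note that `tracemalloc` reports memory allocated through Python's allocator, so the figures are an approximation.

```python
import time
import tracemalloc
import numpy as np

def measure_training(fit, X, y):
    """Return (model, seconds, peak KiB) for a single call to fit(X, y)."""
    tracemalloc.start()
    start = time.perf_counter()
    model = fit(X, y)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return model, elapsed, peak / 1024.0

# Two placeholder "models" to compare (illustrative only).
def fit_mean(X, y):
    return float(y.mean())

def fit_least_squares(X, y):
    X1 = np.hstack([X, np.ones((len(X), 1))])   # append a bias column
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 3))
y_demo = X_demo @ np.array([1.0, 2.0, 3.0]) + 0.5

results = {}
for name, fit in [("mean baseline", fit_mean), ("least squares", fit_least_squares)]:
    model, seconds, peak_kib = measure_training(fit, X_demo, y_demo)
    results[name] = {"model": model, "seconds": seconds, "peak_kib": peak_kib}

# Plain-text comparison table.
print(f"{'model':<15} {'time (s)':>10} {'peak (KiB)':>12}")
for name, row in results.items():
    print(f"{name:<15} {row['seconds']:>10.6f} {row['peak_kib']:>12.1f}")
```

The same `results` dictionary could be used as the input to a Matplotlib bar plot. Single timings are noisy, so averaging over several runs would give a fairer comparison.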
Submission Requirement:
Upon completion, each group must submit the following materials:
1. Progress report
a) Abstract
b) Introduction: problem statement, motivation and background of the topic
c) Related works and existing techniques of the topic
d) Methodology
e) Progress/Current Status
f) Next Steps and Plan for Completion
2. Project report. Your report should contain, but is not limited to, the following content:
a) Abstract
b) Introduction: problem statement, motivation and background of the topic
c) Related works and existing techniques of the topic
d) Methodology
e) Experimental study and result analysis
f) Future work and conclusion
g) References
h) Contribution of each team member
3. A link to, and a description of, the dataset and the implementation code.
4. Your final report should be a minimum of 9 pages and a maximum of 12 pages.
5. For the final report, the similarity check Must Not exceed 20%, and the AI-generated
content check Must Not exceed 25%.
6. Put all files (including: source code, presentation ppt and project report) into a ZIP file,
then submit it on iSpace.
Deadlines:
Team Information should be submitted by the end of Week 3.
The Progress Report should be submitted by the end of Week 10.
The Presentation will be arranged in Weeks 13 and 14 of this semester.
The Final Project Report should be submitted by Friday of Week 15 (May 23, 2025).
Assessment:
In general, projects will be evaluated based on:
Significance. (Did the authors choose an interesting or a "real" problem to work on, or
only a small "toy" problem? Is this work likely to be useful and/or have impact?)
The technical quality of the work. (i.e., Does the technical material make sense? Are
the things tried reasonable? Are the proposed algorithms or applications clever and
interesting? Do the students convey novel insight about the problem and/or algorithms?)
The novelty of the work. (Do you have any novel contributions, e.g., new model, new
technique, new method, etc.? Is this project applying a common technique to a
well-studied problem, or is the problem or method relatively unexplored?)
The workload of the project. (The workload of your project may depend on, but is not
limited to, the following aspects: the complexity of the problem; the complexity of your
method; the complexity of the dataset; whether you test your model on one or multiple
datasets; whether you conduct a thorough experimental analysis of your model.)
Evaluation Percentage:
Progress Report: 5%
Final Report: 40%
Presentation: 40% (Each group will have 15-20 minutes for the presentation, and
each student must present for at least 3 minutes)
Code: 15%
It is YOUR responsibility to make sure:
Your submitted files can be correctly opened.
Your code can be compiled and run.
Late submission = 0; Plagiarism (cheating) = F