THE UNIVERSITY OF HONG KONG
DEPARTMENT OF COMPUTER SCIENCE
FITE7410 Financial Fraud Analytics
First Semester, 2023-2024
Mini Case Study: Real-life Fraud Detection Scenario
(Due Date: 4 Dec, 2023 (Mon) 23:59)
(1) Learning Objectives
a. Analyze a real-world dataset to promote fraud analytics thinking.
b. Identify which explanatory variables may be good predictors or red flags associated with
fraud.
c. Work through the stages in model building and validation.
d. Apply the built model to classify a case based on the predicted risk of fraud.
e. Make a scenario-based decision informed by data analyses.
(2) Instructions
You are provided with a real-world dataset (the Enron case) containing information on
fraudulent transactions. Your task is to analyze the dataset and develop a fraud detection
model using machine learning techniques. Follow the steps below to complete the assignment:
a) Define the scope and objective of the case study.
b) Exploratory Data Analysis:
• Explore the dataset to understand its structure, features, and statistical properties.
• Perform exploratory data analysis techniques, such as data visualization and
statistical analysis, to gain insights into the relationships between variables and
fraud.
• Conduct a thorough analysis of the dataset to identify which explanatory variables
are good predictors or red flags associated with fraud.
• Perform data cleaning and preprocessing as necessary.
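
As a rough illustration of the exploratory step above, the following minimal sketch compares one explanatory variable across fraud and non-fraud cases while skipping missing values. It is written in Python purely for illustration (the submitted program must still be in R), and the records, the field names `bonus` and `is_fraud`, and the helper names are all hypothetical; they are not taken from the provided dataset.

```python
from statistics import mean

# Hypothetical toy records; field names and values are illustrative only.
records = [
    {"bonus": 100.0, "is_fraud": 0},
    {"bonus": 200.0, "is_fraud": 0},
    {"bonus": None, "is_fraud": 0},   # missing value, skipped in the summary
    {"bonus": 300.0, "is_fraud": 0},
    {"bonus": 400.0, "is_fraud": 1},
    {"bonus": 600.0, "is_fraud": 1},
]


def fraud_rate(rows, label="is_fraud"):
    """Share of rows flagged as fraud; a quick check for class imbalance."""
    return sum(row[label] for row in rows) / len(rows)


def summarize_by_class(rows, feature, label="is_fraud"):
    """Return {class: (count, mean)} for one feature, ignoring missing values."""
    groups = {}
    for row in rows:
        value = row.get(feature)
        if value is None:
            continue
        groups.setdefault(row[label], []).append(value)
    return {cls: (len(vals), mean(vals)) for cls, vals in groups.items()}


rate = fraud_rate(records)
summary = summarize_by_class(records, "bonus")
print(rate, summary)
```

A large gap between the class means would suggest the variable is worth examining further as a possible red flag; it is not evidence of predictive power on its own.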
c) Model Building and Validation:
• Select at least TWO appropriate machine learning algorithms or other appropriate
data analytics techniques (e.g. social network analysis, statistical analysis) for fraud
detection.
• Split the dataset into training and testing sets.
• Develop a fraud detection model using the chosen algorithm(s) and train it on the
training set.
• Evaluate the performance of the model using appropriate evaluation metrics.
• Iterate on the model building process, adjusting hyperparameters or trying different
algorithms, to improve the model's performance.
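
The splitting and evaluation steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not a prescribed method: it assumes binary labels (1 = fraud), uses a stratified split so that the rare fraud class appears in both sets, and reports precision, recall and F1, which are commonly preferred to plain accuracy on imbalanced fraud data. The function names and the toy labels are hypothetical, and Python is used only for illustration; the submitted program must still be in R.

```python
import random


def stratified_split(labels, test_fraction=0.3, seed=42):
    """Split row indices into train/test, keeping class proportions roughly equal."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Keep at least one case of every class in the test set.
        n_test = max(1, round(len(idx) * test_fraction))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for the positive (fraud) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels: 8 legitimate cases and 2 fraud cases.
labels = [0] * 8 + [1] * 2
train_idx, test_idx = stratified_split(labels, test_fraction=0.5)

# Hypothetical predictions, used only to show the metric calculation.
metrics = precision_recall_f1([1, 1, 0, 0], [1, 0, 1, 0])
print(train_idx, test_idx, metrics)
```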
d) Fraud Scenario Identification:
• Develop a scenario related to financial fraud detection, such as a suspicious
transaction or a potential fraudulent activity.
• Use the trained model and the available data to make a data-informed decision
regarding the given scenario.
• Justify your decision based on the insights gained from the data analysis and the
model's predictions.
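
One possible way to turn a model's predicted fraud probability into a scenario decision is a simple threshold rule, sketched below. The thresholds (0.3 and 0.8), the action labels and the function name are hypothetical assumptions chosen for illustration; in practice they would need to be justified from the evaluation results and the relative cost of missed fraud versus false alarms. Python is used only for illustration; the submitted program must still be in R.

```python
def decide(fraud_probability, review_threshold=0.3, escalate_threshold=0.8):
    """Map a predicted fraud probability to an illustrative action."""
    if not 0.0 <= fraud_probability <= 1.0:
        raise ValueError("fraud_probability must lie in [0, 1]")
    if fraud_probability >= escalate_threshold:
        return "escalate"
    if fraud_probability >= review_threshold:
        return "manual review"
    return "no action"


# Hypothetical predicted probabilities for three cases.
decisions = [decide(p) for p in (0.05, 0.45, 0.92)]
print(decisions)
```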
e) Non-data-analytic elements:
• What are the risks and red flags of the case that should be recognized in order to
prevent similar financial frauds in future?
• What are the other non-data analytic elements that should be considered (e.g.
corporate governance and controls)?
• Do you have any suggestions on how to prevent similar financial fraud in future?
(3) Submission Guidelines
1. Report
Prepare a comprehensive report, documenting each step of your analysis, including
explanations, visualizations, and any insights gained. Include the results of model
evaluation and performance metrics. Present your scenario-based decision and provide
a clear rationale for your choice.
The report must be at most 8 pages long (excluding the Appendix) and should contain:
• Your name and student ID
• Title of the project
• Background and objectives of the case study
• Description of the dataset and the fraud data analytics method
• Description and interpretation of the results of the new fraud detection model
• Summary and recommendation
• References (such as websites, book chapters, articles, etc.) that you have
used
2. Program
Submit your R program on Moodle.