In this project, we are working for the financial fraud detection arm of a large auditing firm. We are carrying out an investigation into potential fraud in a series of transactions among banks. Given some bank we are suspicious of (the query bank), we want to be able to find out (1) which banks have traded with it (the partners of the query bank), and (2) which banks have traded with those found in (1) (the partners of the partners of the query bank). This idea is analogous to the “friend-of-a-friend” idea familiar to us in social networks.
1 Data
Our data for this investigation is a CSV file extracted from a global financial database. Such a file can be opened in Excel or Atom to inspect the data directly, but our requirement is to read it in Python. The first line gives the names of the banks in question, separated by commas. Each subsequent line describes one transaction between two parties, giving the transaction ID, the date, the amount, and the two parties. For example:
CIFER,AIB,BOI,NAT
0,2018-07-18,15500000,CIFER,AIB
1,2018-07-19,20000000,CIFER,BOI
2,2018-07-19,30000000,BOI,NAT
In this example, if the query bank is CIFER, then the partners of the query bank are AIB and BOI, and the only partner of the partners is NAT (we exclude CIFER itself, because the query bank is always a partner of its own partners).
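The lookup described here can be sketched on the sample data above. This is only an illustration under stated assumptions, not a prescribed design: the names `SAMPLE_LINES`, `parse_pairs`, `partners_of` and `partners_of_partners` are hypothetical, and the sketch assumes that only the query bank itself is excluded from the second list.

```python
# Sample data from the example above, held in memory for illustration.
SAMPLE_LINES = [
    "CIFER,AIB,BOI,NAT",
    "0,2018-07-18,15500000,CIFER,AIB",
    "1,2018-07-19,20000000,CIFER,BOI",
    "2,2018-07-19,30000000,BOI,NAT",
]


def parse_pairs(lines):
    """Return the (party, party) pair from each transaction line."""
    pairs = []
    for line in lines[1:]:  # skip the header line of bank names
        fields = line.strip().split(",")
        pairs.append((fields[3], fields[4]))
    return pairs


def partners_of(bank, pairs):
    """Return the sorted list of banks that have traded with `bank`."""
    found = set()
    for first, second in pairs:
        if first == bank:
            found.add(second)
        elif second == bank:
            found.add(first)
    found.discard(bank)  # a bank is not its own partner
    return sorted(found)


def partners_of_partners(bank, pairs):
    """Return the sorted partners of the partners, excluding `bank` itself."""
    found = set()
    for partner in partners_of(bank, pairs):
        found.update(partners_of(partner, pairs))
    found.discard(bank)  # assumption: only the query bank is excluded
    return sorted(found)
```

Calling `partners_of("CIFER", parse_pairs(SAMPLE_LINES))` gives AIB and BOI, and `partners_of_partners` with the same arguments gives NAT, matching the example.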
Three example files are provided in Blackboard together with this pdf. Note that your program may be run against other files during evaluation.
2 Requirements
Our requirement is to write a program in Python 3 that works as follows. It accepts a filename as an argument. This filename will indicate a csv file in the above format. It should read this file in order to find the bank names and store the financial data.
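One possible way to split this reading step into small functions is sketched below. The function names, the tuple layout of a transaction, and the decision to skip malformed lines silently are all assumptions made for illustration; only the `sys` package is used.

```python
import sys


def get_filename(argv):
    """Return the filename argument, or None if it was not supplied."""
    if len(argv) != 2:
        return None
    return argv[1]


def parse_lines(lines, separator=","):
    """Return (bank_names, transactions) parsed from the lines of a file.

    Each transaction is a tuple (id, date, amount, party_a, party_b).
    Lines that do not have five fields or a numeric amount are skipped.
    """
    cleaned = [line.strip() for line in lines if line.strip()]
    if not cleaned:
        return [], []
    bank_names = [name.strip() for name in cleaned[0].split(separator)]
    transactions = []
    for line in cleaned[1:]:
        fields = [field.strip() for field in line.split(separator)]
        if len(fields) != 5:
            continue  # malformed line: wrong number of fields
        trans_id, date, amount_text, party_a, party_b = fields
        try:
            amount = float(amount_text)
        except ValueError:
            continue  # malformed line: amount is not a number
        transactions.append((trans_id, date, amount, party_a, party_b))
    return bank_names, transactions


def read_bank_file(filename, separator=","):
    """Read and parse the file; return None if it cannot be read."""
    try:
        with open(filename) as data_file:
            lines = data_file.readlines()
    except (OSError, UnicodeError) as error:
        print("Could not read", filename, "-", error, file=sys.stderr)
        return None
    return parse_lines(lines, separator)
```

A main program could call `get_filename(sys.argv)`, print a usage message if the result is `None`, and otherwise pass the name to `read_bank_file`.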
It should then prompt the user for the name of the query bank. Next, it should query the stored data to find a list of the partners of the query bank, and then a list of the partners of the partners of the query bank. It should print out these two lists, one item per line, with a blank line separating the two lists.
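The prompt and output format can be sketched as follows. The helper names `choose_bank` and `format_two_lists` are hypothetical, and treating an unknown bank name as invalid input is an assumption rather than a stated requirement.

```python
def choose_bank(reply, bank_names):
    """Return the bank named in the user's reply, or None if it is unknown."""
    name = reply.strip()
    if name in bank_names:
        return name
    return None


def format_two_lists(partners, partners_of_partners):
    """Return the report text: one bank per line, a blank line between lists."""
    return "\n".join(partners) + "\n\n" + "\n".join(partners_of_partners)
```

In a full program the reply would typically come from `input("Query bank: ")`, and the returned text would be passed to `print`.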
Hint: one way to store the data is to use a matrix as studied in class where the (i,j)th entry represents the total amount traded between bank i and bank j.
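A minimal sketch of this matrix representation follows. It assumes each transaction has already been reduced to a hypothetical `(amount, party_a, party_b)` tuple, that the matrix is symmetric because "traded between" has no direction, and that transactions naming a bank missing from the header are ignored.

```python
def build_matrix(bank_names, transactions):
    """Return a list of lists where entry [i][j] is the total traded
    between bank i and bank j (indices follow the order of bank_names)."""
    size = len(bank_names)
    matrix = [[0] * size for _ in range(size)]
    for amount, party_a, party_b in transactions:
        if party_a not in bank_names or party_b not in bank_names:
            continue  # ignore transactions naming an unknown bank
        i = bank_names.index(party_a)
        j = bank_names.index(party_b)
        matrix[i][j] += amount
        if i != j:
            matrix[j][i] += amount  # keep the matrix symmetric
    return matrix


def partners_from_matrix(bank, bank_names, matrix):
    """Return the banks with a non-zero total in the row for `bank`."""
    i = bank_names.index(bank)
    return [bank_names[j] for j in range(len(bank_names))
            if j != i and matrix[i][j] > 0]
```

With the sample data, the row for CIFER holds 15500000 in the AIB column and 20000000 in the BOI column, so reading off its non-zero entries yields the partners directly.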
In addition:
It should use good algorithm design, with functions for clear sub-tasks.
It should use good naming conventions, code formatting, and use of comments.
It may use only the math, sys and os standard packages (i.e. no csv, Pandas, Numpy, NetworkX, etc).
It should read only the input file specified on the command line. Two test files are supplied, and different ones may be used when testing your program.
It should be robust (should not crash with unexpected input data), e.g. malformed data file or wrong filename.
Allow csv separation characters other than the comma (e.g. space, tab, etc.) to be used:
– These can be manually set by the user;
– Or potentially automatically detected.
Print nicely-aligned outputs.
On each line in the output report, include the bank name and the amount traded with that bank.
Order the reports by amount traded.
Suppress output of banks who have only traded a very small amount with the suspicious bank.
Print some interesting statistics arising from the query.
Use a dictionary (a data structure we have not studied in this module) in an appropriate way to store the data, instead of the list of lists suggested above.
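Several of the items above (separator detection, dictionary storage, ordered and aligned output, and suppression of small amounts) can be sketched together. Everything here is an illustrative assumption: the function names, the `(amount, party_a, party_b)` transaction tuples, the candidate separators, the counting heuristic, and the `minimum` threshold, since the brief does not define what counts as a very small amount.

```python
def detect_separator(header_line, candidates=(",", "\t", ";", " ")):
    """Guess the separator as the candidate occurring most often in the
    header line (a simple heuristic; ties go to the earlier candidate)."""
    best = candidates[0]
    for candidate in candidates:
        if header_line.count(candidate) > header_line.count(best):
            best = candidate
    return best


def build_totals(transactions):
    """Map each bank to a dictionary {partner: total amount traded}."""
    totals = {}
    for amount, party_a, party_b in transactions:
        if party_a == party_b:
            continue  # ignore a bank trading with itself
        for bank, other in ((party_a, party_b), (party_b, party_a)):
            partner_totals = totals.setdefault(bank, {})
            partner_totals[other] = partner_totals.get(other, 0) + amount
    return totals


def format_report(bank, totals, minimum=0):
    """Return aligned 'NAME  AMOUNT' lines for the partners of `bank`,
    largest amount first, omitting totals below `minimum`."""
    rows = [(amount, name)
            for name, amount in totals.get(bank, {}).items()
            if amount >= minimum]
    rows.sort(key=lambda row: (-row[0], row[1]))
    if not rows:
        return []
    name_width = max(len(name) for _, name in rows)
    return ["{:<{w}}  {:>15,.2f}".format(name, amount, w=name_width)
            for amount, name in rows]
```

For the sample data, `format_report("CIFER", build_totals(...))` lists BOI before AIB because BOI has the larger total, with the amounts right-aligned in a fixed-width column.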