
Faculty of Engineering and Information Technology

The University of Melbourne

COMP90073 Security Analytics, Semester 2, 2023

Assignment 2: Blue Team & Red Team Cybersecurity

Release: Fri 1 Sep 2023

Due: Tue 17 Oct 2023

Marks: The Project will contribute 25% of your overall mark for the subject. You will be assigned a mark out of 25, according to the criteria below.

Overview

You’ve recently been brought in to head up the cybersecurity team at FashionMarketData.com, a major player in providing platforms for online fashion retailers to leverage their data through machine learning. However, after recently witnessing high-profile data breaches, including those at Medibank and AHM, the company’s leadership is concerned that the business might face existential financial, legal, and reputational risks stemming from hackers potentially manipulating its data or exploiting its machine learning models.

The CEO has tasked you with heading up the newly formed Blue and Red Team cybersecurity groups inside the company, and with developing a report for the board that outlines both risks and opportunities for the company. The Blue Team are concerned about users uploading images that do not match their labels, whether through mistaken use of the platform or through deliberate attempts to manipulate the company’s systems. As such, the Blue Team are working on designing and implementing systems that ensure only genuine, fashion-related images are processed and ingested into the company’s crowd-sourced datasets. In practice, this will involve reliably detecting and distinguishing both anomalous and out-of-distribution samples.

The Red Team are taking a different tack. Rather than actively defending the company’s systems, they are more concerned with understanding the scope of vulnerabilities in the machine learning models that have rapidly become a core part of the company’s business practices. As such, the team plans to construct evasion and data poisoning attacks against exemplar, non-production models, and to use these results to build a picture of the vulnerabilities present within the company’s systems and processes.

Finally, you will need to present a report for the non-technical leadership of FashionMarketData.com, based upon your insights from working with both the Blue and Red Teams. Given how critical it is to understand the risks the company may face from its data management and machine learning practices, you must deliver this report at the next meeting of the company's board, which will be on Tuesday the 17th of October, 2023.

Datasets

To understand these vulnerabilities, you have been provided with images encompassing 10 distinct fashion categories, primarily drawn from the Fashion-MNIST dataset, which consists of 28×28 grayscale images across 10 distinct fashion categories. This compilation serves as your in-distribution (normal) dataset, representative of the core and expected content within the fashion domain. Examples of the first 3 fashion categories in the given dataset are shown below.

Your primary task is to devise and refine multiple algorithms with the specific aim of identifying anomalies and out-of-distribution (OOD) images. OOD samples refer to images that do not belong to the fashion categories defined w.r.t. the in-distribution data, such as airplanes, animals, and hand-written letters/characters. Anomaly samples, meanwhile, are fashion images that diverge from typical fashion items: while they remain categorically fashion images, they differ from the familiar images in the dataset due to distortions, rotations, cropping, and similar alterations.

To facilitate this objective, five separate datasets will be made available to you. Each dataset will play a crucial role in training, validating, and testing the efficiency and accuracy of the algorithms you develop.

Dataset Descriptions

Training set (train_data.npy, train_labels.npy)
This dataset features images from 10 unique fashion categories, labelled from 0 to 9. It acts as the principal guide to discern the standard content in the fashion domain. You will employ this dataset for both the anomaly and OOD detection tasks.

Anomaly detection, validation set (anomaly_validation_data.npy, anomaly_validation_labels.npy)
This set comprises both original and distorted fashion items. Importantly, items labelled '1' indicate anomaly status, while those labelled '0' represent normal data. This validation set is primarily intended for tuning your model's hyperparameters and reporting its performance using the relevant metrics in your analysis.

Anomaly detection, test set (anomaly_test_data.npy)
The test set comprises original and distorted fashion items, in a similar proportion to that found in the validation set. However, unlike the validation set, this dataset contains no labels. As such, you are required to use your trained model to predict their anomaly statuses.

OOD detection, validation set (ood_validation_data.npy, ood_validation_labels.npy)
This set contains a blend of both fashion and non-fashion items. Notably, items labelled '1' signify out-of-distribution (OOD) status, indicating that they do not align with the standard fashion categories. On the other hand, samples labelled '0' represent in-distribution data. Primarily, this validation set is intended for tuning your model's hyperparameters and for reporting performance using the relevant metrics in your analysis.

OOD detection, test set (ood_test_data.npy)
The test set includes both fashion-centric items and potential non-fashion items, mirroring the proportion observed in the validation set. Unlike the validation set, this dataset lacks pre-labelled OOD statuses. As such, you will be required to use your trained model to predict these OOD statuses.

Note that NumPy array files (.npy) can be loaded via data = np.load('input.npy').
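For instance, assuming the five .npy files sit in the working directory, they can be loaded as follows (the variable names are illustrative, and the expected shape is an assumption based on the 28×28 images described above):

    import numpy as np

    # Load the five provided datasets (filenames as given in the dataset
    # descriptions above).
    train_data = np.load('train_data.npy')
    train_labels = np.load('train_labels.npy')
    anomaly_val_data = np.load('anomaly_validation_data.npy')
    anomaly_val_labels = np.load('anomaly_validation_labels.npy')
    anomaly_test_data = np.load('anomaly_test_data.npy')
    ood_val_data = np.load('ood_validation_data.npy')
    ood_val_labels = np.load('ood_validation_labels.npy')
    ood_test_data = np.load('ood_test_data.npy')

    print(train_data.shape)  # expected (N, 28, 28) for N grayscale images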

Blue Team Tasks

You will create anomaly detection and OOD detection algorithms using the provided training and validation sets. Following the development phase, these algorithms will be run on the separate test sets provided, and you will need to annotate those test sets with the anomaly and OOD statuses derived from each of the detectors.

For anomaly detection, you will develop two distinct detection algorithms:

1) Shallow model

Use a shallow (non-neural-network) model (e.g., OCSVM, LOF) to develop a detector for identifying anomalous items. It might be beneficial to utilize dimensionality reduction techniques before inputting the data into the detection model. (A minimal sketch of one such pipeline follows this list.)

2) Deep learning model

Develop a deep learning model, such as an autoencoder, to detect whether or not an item is anomalous. (See the autoencoder sketch below.)
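For the shallow model, the following is a minimal sketch of one possible pipeline, assuming scikit-learn is available and the arrays from the loading example above are in scope; the PCA dimension and OCSVM parameters are illustrative and should be tuned on the validation set:

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import OneClassSVM

    # Flatten 28x28 images to 784-dimensional vectors and rescale to [0, 1]
    # (assuming raw pixel values in 0-255).
    X_train = train_data.reshape(len(train_data), -1) / 255.0
    X_val = anomaly_val_data.reshape(len(anomaly_val_data), -1) / 255.0

    # Dimensionality reduction via PCA, then a one-class SVM fit on normal
    # (training) data only.
    detector = make_pipeline(PCA(n_components=50), OneClassSVM(nu=0.1, gamma='scale'))
    detector.fit(X_train)

    # OneClassSVM returns +1 for inliers and -1 for outliers; map this to the
    # validation set's labelling convention (1 = anomaly, 0 = normal).
    val_pred = (detector.predict(X_val) == -1).astype(int)

The predictions can then be scored against anomaly_val_labels to tune the hyperparameters.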

For OOD detection, you are required to develop a single algorithm.
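One approach that can serve both the deep anomaly detector above and the OOD detector is reconstruction error from an autoencoder trained only on normal data: samples the model reconstructs poorly are flagged. A minimal PyTorch sketch follows; the architecture, epoch count, and threshold are illustrative assumptions:

    import torch
    import torch.nn as nn

    class AE(nn.Module):
        """Small fully connected autoencoder over flattened 28x28 images."""
        def __init__(self):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(784, 128), nn.ReLU(),
                                     nn.Linear(128, 32))
            self.dec = nn.Sequential(nn.Linear(32, 128), nn.ReLU(),
                                     nn.Linear(128, 784), nn.Sigmoid())
        def forward(self, x):
            return self.dec(self.enc(x))

    model = AE()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    X = torch.tensor(train_data.reshape(len(train_data), -1) / 255.0,
                     dtype=torch.float32)
    loader = torch.utils.data.DataLoader(torch.utils.data.TensorDataset(X),
                                         batch_size=256, shuffle=True)

    for epoch in range(10):  # learn to reconstruct normal data only
        for (xb,) in loader:
            opt.zero_grad()
            nn.functional.mse_loss(model(xb), xb).backward()
            opt.step()

    # Score validation samples by per-sample mean squared reconstruction error.
    X_val = torch.tensor(ood_val_data.reshape(len(ood_val_data), -1) / 255.0,
                         dtype=torch.float32)
    with torch.no_grad():
        err = ((model(X_val) - X_val) ** 2).mean(dim=1)

    # The 90th-percentile threshold below is a placeholder; choose the actual
    # threshold using the labelled validation set.
    val_pred = (err > err.quantile(0.90)).int()

The same scoring loop applies to the anomaly validation and test arrays; only the input tensors change.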

Deliverables

1) The predicted labels for the test sets (submit in a zip archive of .npy files)

• After running each of the three detection algorithms on the relevant test set, the annotated results (the anomaly/OOD statuses determined by each detector) should be prepared in the same structured format as the validation set labels.

• For your Blue Team results you will need to generate 3 result files corresponding to each of the Blue Team approaches. The filenames will be 1.npy (anomaly detection: shallow model), 2.npy (anomaly detection: deep learning model), and 3.npy (OOD detection).

2) Python code or Jupyter Notebook (submit as zip archive)

• This should contain the complete code for all three detection algorithms, starting from data import and preprocessing, progressing to algorithm implementation, and ending with an appropriate analysis of your results. This may include visualisations to help emphasise your points.

• If your code is a Jupyter Notebook, the notebook must contain the evaluated results of all cells; if you are submitting Python scripts, they must be able to be run completely. In both cases, you must include a supplementary README file that lists the versions of all libraries used.

• When utilizing any pre-existing code sourced from online resources, you need to clearly annotate that within your codebase using comments. Furthermore, please ensure that you provide a comprehensive reference to these sources in your report, detailing their origins and contributions to your work.

• Ensure the code is well-structured with clear function definitions, variable names, and comments that explain the purpose of each section or step. Complex or nuanced sections of the code should have accompanying comments for clarity.

− If submitting Python code (.py), please provide comments in the code for major procedures and functions, and include a README file (.txt) with instructions on how to run each script. You need to submit a zip archive containing all scripts (.py) and README.txt.

− If submitting a Jupyter Notebook (.ipynb), incorporate markdown cells to segment the code and provide explanatory notes or observations. Before submission, restart the kernel and run the notebook from the beginning to ensure all cells execute in order and produce the expected outputs. Ensure that all outputs, especially visualizations or essential printed results, are visible and saved in the notebook.

• Please include all data preprocessing and visualisation steps you may have undertaken, even those not included in the report. Use comments to specify the intent behind each result/graph or what insights were derived from them.

3) Report (submit as PDF)

• Your report should be targeted towards your intended audience and should use qualitative and quantitative methods to describe your techniques, results, insights, and understandings. This should include an appropriate description of your choice of detection algorithms and evaluation methods, and a discussion regarding the ramifications of these choices. As a part of this, it is important to include any challenges that you faced, and the decisions or assumptions you made throughout the process. Your results should be presented in a fashion that is readily comprehensible to a non-technical audience.

• Your report should include both an introductory executive summary, offering a snapshot for readers to understand the context and objective of the report, and a conclusion following the body that encapsulates the primary findings of your investigation or study. The conclusion should also present recommendations for potential enhancements or alternative strategies that might be considered in the future.

• The word limit for the Blue Team (Task I) report is 1500 words. Your main report should not exceed 7 pages in length. However, any supplementary diagrams, plots, and references that you wish to include can be added after the main report; these additional materials will not be counted towards the word or page limits.

• You should evaluate your model with at least three appropriate metrics for each detection algorithm. Some commonly used metrics include AUROC and the false positive (FP) rate. However, depending on the context of your algorithm, other metrics might be equally or even more relevant. Ensure that your chosen metrics provide a well-rounded view of the algorithm's performance in its intended application. (A sketch of computing such metrics appears after this list.)

• You could also examine samples that the model misclassified. For instance, if certain types of anomalies are consistently missed, what valuable insights into patterns or consistencies could be gained from these failures? Additionally, if there are extreme cases where your model fails to predict correctly, what measures could be taken in future training? You could also discuss how these inaccuracies might manifest in real-world scenarios.

• To make your findings more accessible to readers, tables are recommended for the structured presentation of numerical results and comparisons. Meanwhile, visualizations like bar graphs, scatter plots, or heat maps can offer intuitive insights into the data, helping to convey relationships, distributions, or anomalies that might be less apparent in raw numbers alone.

• The creativity marks will be allocated based upon both how you extend and present your work, and the insights that you provide. When we refer to extensions, this may be in terms of the techniques taken, the insights gained from your experiments, comparisons of model parameters, or your comprehensive analysis, even if your tested novel ideas are not successful. The amount of time you spend pursuing the creativity marks should be commensurate with the marks allocated.
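As flagged in the metrics point above, here is a short sketch of computing three common metrics with scikit-learn; y_true, scores, and preds are assumed to hold the validation labels, a detector's continuous scores, and its binary predictions respectively:

    from sklearn.metrics import confusion_matrix, f1_score, roc_auc_score

    auroc = roc_auc_score(y_true, scores)  # threshold-free ranking quality
    tn, fp, fn, tp = confusion_matrix(y_true, preds).ravel()
    fpr = fp / (fp + tn)                   # false positive rate
    f1 = f1_score(y_true, preds)
    print(f'AUROC={auroc:.3f}  FPR={fpr:.3f}  F1={f1:.3f}')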

Red Team Tasks

Because the company’s leadership is cautious about the Red Team attempting to attack a production model, you will instead need to train a similar model that you will use as a proxy for attack. To ensure that the model closely matches what is used in production, the trained architecture that you produce should incorporate at least 3 linear layers, with a dropout layer (with probability 0.25) preceding the last two linear layers. You should train on the training set to an accuracy of at least 85%, using a cross-entropy loss, an Adam optimizer, and a learning rate of 10^-4. All your input samples should be normalised to within [0,1] before being passed into the model.
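A minimal PyTorch sketch of one proxy model satisfying these constraints is given below. The hidden sizes and epoch count are illustrative, and the dropout placement reflects one reading of the specification (a dropout layer before each of the last two linear layers):

    import torch
    import torch.nn as nn

    model = nn.Sequential(
        nn.Flatten(),
        nn.Linear(784, 256), nn.ReLU(),
        nn.Dropout(0.25),
        nn.Linear(256, 128), nn.ReLU(),
        nn.Dropout(0.25),
        nn.Linear(128, 10),
    )
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)  # learning rate 10^-4
    loss_fn = nn.CrossEntropyLoss()

    # Normalise inputs to [0, 1] (assuming raw pixel values in 0-255).
    X = torch.tensor(train_data, dtype=torch.float32) / 255.0
    y = torch.tensor(train_labels, dtype=torch.long)
    loader = torch.utils.data.DataLoader(torch.utils.data.TensorDataset(X, y),
                                         batch_size=128, shuffle=True)

    for epoch in range(20):  # stop once training accuracy exceeds 85%
        for xb, yb in loader:
            opt.zero_grad()
            loss_fn(model(xb), yb).backward()
            opt.step()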

After training this model, you will need to design an iterative gradient-based attack. The code for this should be flexible enough that you are able to take any model and input sample and attack it for up to a specified number of iterations with a fixed step size, while ensuring that the attack image always remains within [0,1]. While the maximum number of iterations should be fixed, consider whether your attack can be modified to stop early if necessary. Your attack should be able to produce both targeted and untargeted attacks. Because the Red Team is trying to build up its own internal capabilities, your attack should avoid unnecessary use of off-the-shelf libraries that implement adversarial attacks themselves. Your focus should be on employing basic machine learning and mathematical libraries, as well as autograd.
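A minimal sketch of such an attack in the basic-iterative-method style, built only on autograd, is shown below. The sign-of-gradient update is one common choice of fixed-size step, the early-stop check is optional, and model.eval() should be called before attacking so that dropout is disabled:

    import torch

    def iterative_attack(model, x, label, step_size, max_iters=100, target=None):
        """Iterative gradient attack. Untargeted when target is None (push the
        prediction away from label); targeted otherwise (push it towards
        target). The image is clamped to [0, 1] after every step."""
        x_adv = x.clone().detach()
        for _ in range(max_iters):
            x_adv.requires_grad_(True)
            loss = torch.nn.functional.cross_entropy(
                model(x_adv), label if target is None else target)
            grad, = torch.autograd.grad(loss, x_adv)
            with torch.no_grad():
                # Ascend the loss when untargeted; descend towards the target.
                direction = grad.sign() if target is None else -grad.sign()
                x_adv = (x_adv + step_size * direction).clamp(0.0, 1.0)
                pred = model(x_adv).argmax(dim=1)
            done = (pred != label).all() if target is None else (pred == target).all()
            if done:  # stop early once the attack has succeeded
                break
        return x_adv.detach()

For example, a targeted attack pushing a single sample x (shape (1, 28, 28)) with true label y towards class 0 would be iterative_attack(model, x, y, step_size=1e-2, target=torch.tensor([0])).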

To test the performance of your attack on the model, you should attack every 20th sample of the test set. As you do so, vary the step size of your iterative attack between 10^-5 and 10^1 for at most 100 steps, and perform an analysis of the untargeted performance of the attack, as well as the performance when targeted towards forcing the model to predict the 0th class. You will need to perform an appropriate analysis of the attack performance, which should include an analysis of the success rates and of the L2 norm distance between your tested images and your successful attacks (this L2 norm should be calculated using sum, square-root, and power operations).
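A sketch of this evaluation loop is given below, assuming iterative_attack from the previous sketch and the trained proxy model are in scope; test_data and test_labels are assumed names for the relevant test split, since its filenames are not specified above:

    import numpy as np
    import torch

    model.eval()
    X = torch.tensor(test_data, dtype=torch.float32) / 255.0
    y = torch.tensor(test_labels, dtype=torch.long)
    subset_x, subset_y = X[::20], y[::20]  # every 20th test sample

    for step in [1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1e0, 1e1]:
        n_success, l2_dists = 0, []
        for i in range(len(subset_x)):
            xi, yi = subset_x[i:i + 1], subset_y[i:i + 1]
            x_adv = iterative_attack(model, xi, yi, step_size=step, max_iters=100)
            with torch.no_grad():
                success = model(x_adv).argmax(dim=1).item() != yi.item()
            if success:
                n_success += 1
                # L2 norm via power, sum, and square-root, as required.
                l2_dists.append(torch.sqrt(((x_adv - xi) ** 2).sum()).item())
        mean_l2 = np.mean(l2_dists) if l2_dists else float('nan')
        print(f'step={step:g}: success={n_success / len(subset_x):.2%}, '
              f'mean L2={mean_l2:.3f}')

The targeted analysis is the same loop with target=torch.tensor([0]) passed to the attack and success redefined as the model predicting class 0.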

Given that you know the Blue Team are working on techniques that could be used to detect attacks, your report to the company's leadership should consider how the techniques implemented by the Blue Team could be used to defend your model from adversarial attack. You may also wish to consider how changes to the model architecture, training process, or data handling procedures may influence the level of adversarial risk faced by your model, or how you might attack a model that incorporates defensive stratagems from the Blue Team.

Deliverables

1. Python code or Jupyter Notebook (submit as zip archive)

• This should contain the complete code for (1) training the underlying network, (2) performing adversarial attacks, and (3) evaluating their performance.

• The requirements/guidelines are identical to those outlined for Blue Team (Task I).

2. Report (submit as PDF)

• Please prepare a separate report for Red Team (Task II). The word limit for this task is 1000 words, and your main report should not exceed 4 pages in length. However, any supplementary diagrams, plots, and references that you wish to include can be added after the main report; these additional materials will not be counted towards the word or page limits.

• You should evaluate your model with appropriate metrics for your implemented attack. Some commonly used metrics include accuracy drops and perturbation size/distribution. However, other metrics might be equally or even more relevant.

• You could include visualizations of the adversarial noise and the perturbed images in the report, such as side-by-side comparisons that illuminate the slight alterations that result in significant prediction deviations, helping readers discern the vulnerabilities exposed by your implemented attack. Meanwhile, a plot of loss/performance changes versus iterations could provide a visual representation of the model's training dynamics, making it easier to diagnose issues, compare solutions, and communicate the model's behaviour to both technical and non-technical stakeholders.

• The creativity marks will be allocated based on both how you extend and present your work and the insights that you provide. This may be in terms of the network structure, training techniques, adversarial attack techniques, evaluation and analysis, or the presentation of interesting findings, even if your tested ideas are not successful. The amount of time you spend pursuing the creativity marks should be commensurate with the marks allocated.

Assessment Criteria – Blue Team (Task I)

Code quality and README (2 marks)

Technical report (13 marks)

1. Methodology (4 marks)
2. Critical Analysis (5 marks)
3. Report Quality (4 marks)

Creativity (2 marks, as bonus)

Assessment Criteria – Red Team (Task II)

Code quality and README (2 marks)

Technical report (8 marks)

1. Methodology (2 marks)
2. Critical Analysis (3 marks)
3. Report Quality (3 marks)

Creativity (1 mark, as bonus)

Changes/Updates to the Project Specifications

If we require any changes or clarifications to the project specifications, they will be posted on Canvas. Any addenda will supersede information included in this document. If you have assignment-related questions, you are welcome to post them in the discussion board.

Academic Misconduct

For most people, collaboration will form a natural part of the undertaking of this project. However, it is still an individual task, and so reuse of ideas or excessive influence in algorithm choice and development will be considered cheating. We will be checking submissions for originality and will invoke the University's Academic Misconduct policy (http://academichonesty.unimelb.edu.au/policy.html) where inappropriate levels of collusion or plagiarism are deemed to have taken place.

Late Submission Policy

You are strongly encouraged to submit by the time and date specified above, but if circumstances do not permit this, the marks will be adjusted as follows: for each day (or part thereof) that this project is submitted after the due date and time specified above, 10% will be deducted from the marks available, up until 5 days have passed, after which submissions will no longer be accepted.

Extensions

If you require an extension, please email Mark Jiang <yujing.jiang@unimelb.edu.au> using the subject “COMP90073 Extension Request” at the earliest possible opportunity. We will then assess whether an extension is appropriate. If you have a medical reason for your request, you will be asked to provide a medical certificate. Requests for extensions on medical grounds received after the deadline may be declined. Note that computer systems are often heavily loaded near project deadlines, and unexpected network or system downtime can occur. System downtime or failure will not be considered grounds for an extension. You should plan ahead to avoid leaving things to the last minute, when unexpected problems may occur.

