INFS 2042 DataStructures Advanced
Assignment2 – ContactTracing
UniSASTEM
TheUniversity ofSouthAustralia
2023
Originally writtenbyBrandonMatthews
Modified byDanielAblett,andGunLee
Warning: This material has been reproduced and communicated to you by or on behalf of the University of South
Australia in accordance with section 113P of the Copyright Act 1968 (Act). The material in this communication may be
subject to copyright under the Act. Any further reproduction or communication of this material by you may be the
subject of copyright protection under the Act.
2
1. Introduction
To track and reduce the spread of a disease during an epidemic or pandemic situation it is critical that
authorities and healthexpertscantrace who hasbeen incontactwithwhom,whenthecontactoccurredand
where.Thisisknownascontacttracing. Efficiently searching potentiallymillionsofpeopleandwherethey
have been will requireanefficientwaytostore and navigatethroughthedata.
Inthisassignment, youaretaskedwithbuilding a basic contacttracingsystem.Youmustuse yourknowledge
ofdatastructures andsearch algorithmstoefficiently store and processlargequantitiesofcontacttracingdata.
Youarenot restrictedtothedatastructures andalgorithmsexploredinthiscourse.Youmayalsomakeuseof
structuresandalgorithmsfrom theData StructuresEssentialscourse.
2. Requirements
Your clienthasprovidedyouwithastrictsetofsystemrequirementsthattheprogrammustmeet.Howyou
achieve those requirements and which algorithms or data structures you use are up to you. You must
implement the program in Java usingOpenJDK 11 or newer. You shouldalsoaim tomake the program as
efficientaspossible.Forexample,exhaustivelysearchinglistsinnestedloopswouldnotbethemost efficient
implementation inmanycases.
Generally,it iseasiertodesignwithoptimisationin mind.Whenusingthefollowingdatastructures:Binary
Search Tree, Self-Balancing Search Tree, Graph, Skip List, Blockchain, Hash Map, Hash Set etc. you must
implementthedatastructureyourself.Itisexpectedthata selection ofthesestructureswillberequiredto
meettheclientrequirementsasefficientlyaspossible.
Youmayuseprovideddatastructures inJava libraries(suchasLinkedList,Queue,Stacketc.)onlyiftheyare
not apartofthecontentcoveredinthiscoursetosupporttheimplementationofotherstructuresandstore
datawherenecessary. Bewaryof functions that arebuilt into provided data structures,if you do use them
ensureyouconsidertheirperformanceimpact.
Youarealsorequiredtoprovidesupportingdocumentation,inthis,youmustexplaineachdatastructureyou
used,whattheywereusedforandwhy. This includes caseswhereyouhaveusedJava’sbuilt-indata structures.
Consideryour implementationinthecontextofarealcontacttracingapplication.Thedataprovidedforthis
assignment,asdescribed below,isfor40people,with80visitsto6locations.Inarealapplicationwe likely
have millionsofpeople,withtensorhundredsofmillionsofvisits tohundredsofthousandsoflocations. Your
implementation shouldbeefficientforstorageandprocessing oflarge amounts ofdata.
Remember,it isnotenoughthatyoursystemimplementstherequirements,itmustimplementthem
efficiently.
3
2.1SystemRequirements
Belowareasetofrequirementsfortheoperation oftheprogramasprovidedbyyourclient.
• Thesystemadministratorwouldliketheabilitytoloadexistingdatafromtheprovided.csvfiles.The
codetoreadthefilesis already provided bytheclient however they have not implemented a method
tostorethedata.
• Inaddition,publichealth officials needtheabilitytoaddanewPerson,LocationorVisittothedata.
The client has provided the input command parsing code to support this however they have not
implementedthefunctionality.
• Publichealthofficialsneed theabilitytosearchfora Personbyname.Thisshouldshowthemalldetails
abouttheperson.ThisincludeslistingallvisitsthePersonhasmade.
o Hint:ThiswouldrequireanefficientmeansofsearchingforthePersonandallVisitsinwhich
thePersonhasvisitedanyLocation.
o IfastartDateandendDateareprovided, thisshouldalsofilter thelistofVisits toonlyinclude
thosebetweenthesetimes.
• Publichealthofficialsneed theability tosearch fora Location byname.Thisshould show them all
detailsaboutthelocation.Thisincludeslistingallpeoplethathavevisitedthelocation.
o IfastartDateandendDateareprovided, thisshouldalsofilter thelistofVisitstoonlyinclude
those between these times.
• The publichealthofficialswould like the abilitytoproducealistofpotentialcontactsupto(n)levels
away fromagivenperson(including knowncontacts).
o Ifn=1,thelistwillcontainonlydirectcontactsofthegivenperson.
o Ifn=2,thelistwillcontainalldirectcontacts(n=1)ofthegivenpersonandallcontactsof
thosecontacts(n=2).
o Ifn=r,thelistwillcontainalln=1ton=r-1contactsofthegivenpersonandallcontactsofthose
contacts(n=r).
o Hint:Thiswouldrequireanefficientmethodofidentifyingcontactsofagivenpersonbased
ontheirvisits.
• PublichealthofficialsalsoneedtheabilitytospecifyifthepersonisanewActiveCase(i.e., theyhave
becomeinfectedwiththevirus).
o WhenanActiveCase isadded,theyalsoneed toseeanestimation ofwhere,when andfrom
whom the personlikely contracted the virus. Your program should output themost likely
contactsourceincludingthelocationandtimeofcontact.Note:Themostlikelycontactsource
isthepairofpeoplewiththehighestChanceofSpread(C)asdefinedlaterinthisdocument.
o Ifa newActive Case has noimmediate contacts thatarealsoanActive Case, the program
shouldinsteadfindthenearestormost likely ActiveCase.That is,theexistingActiveCasefor
whicheachcontactbetweenthem andthenewActiveCase have thehighesttotalChanceof
Spread(C).
o Hint: This would require amethod for identifying the person from which the visit during
whichthepersonmostlikelycontractedthevirus.
• The publichealthofficialswouldliketooutputatraceofthetransmissionofthevirus fromits original
sourcetoatargetperson.Inthisprocessthistraceshouldensurethedateeachpersonalongthepath
wasinfectediscorrectbyverifyingthestartdateoftheirinfectionisthedayafterthecontactwiththe
highestChanceofSpread(C). Ina‘realworld’datasetthiswouldbeusefulforidentifyingdifferent
branchesofthevirus asitspreadsandtracingthevirusbacktoitsoriginalsource.
o Hint:thiswould requireamethodfortracingthepaththrougheachpersonbackwardsfrom
thegivenpersonuntil noprevioussourcecasecanbefound(intheprovideddata).
• Thepublichealth officials wouldliketobeabletoproducealistofallactivecases.
• Theprogrammust berobustanduserfriendly,soit doesnotcrashbutprint propermessages.
4
2.2SupportingDocumentation
Youmust provide a document to support your program design and demonstrate your program meets the
requirements.Thismustinclude:
• One-pagesummary ofyour programdesignandthereasoningbehindyourdesigndecision.
Explainalldatastructures andalgorithmsyouused,whattheywereusedfor,andyourreasoningfor
selectingthem.(e.g.,estimateof overallperformance,spaceandtime-efficiency)
• Sampleoutputsfromyourprogram. (nopagelimit)
Thisistodemonstratethatyourprogrammeetstherequirements.Provideheadingstoclarifywhat
requirementdoestheprovidedsampleoutputdemonstrates.
3. DataandCode
Forsimplicity,a limiteddatasetisprovided. Apersonisonlyconsideredinfectiousiftheyarecurrentlyan
activecaseandonlythedatesbetweenwhich theyareinfectionsisrecorded.Allofthedataisartificialdata
thathasbeenprocedurallygenerated.ApersonisonlyconsideredanactivecaseiftheyhaveanactiveStartDate,
andtheyeitherdon’thaveanactiveEndDateortheactiveEndDateisafterthe“currentdate”.
3.1ProvidedDataFormat
ThedataintheprovidedCSVfilesarestructuredasfollows:
• Person.csv– Alistofpeoplewhereeachpersonhas:
o name
§ The person’s full name, or the purposes of this assignment you can assume the
person’sfullnameisuniquewithinthedataset
o activeStartDate
§ Thedateaftertheperson isestimatedtohave contractedthevirusthevirus(empty
iftheyhavenotcontractedthevirus)
o activeEndDate
§ The date the person stopped being contagious (orisestimated to stopifafter the
“currentdate”).
• Location– Alistoflocationswhereeachlocation has:
o name
§ Thelocation’sname,orthepurposesofthisassignmentyoucanassumetheperson’s
fullnameisuniquewithinthedataset
• Visit– Alistofvisitsbyapersontoalocationwhereeachvisithas:
o personName
§ Nameofthepersonthatvisitedthelocation
o locationName
§ Nameofthelocationthepersonvisited
o date
§ Dateofthevisit
o entryTime
§ Timethepersonenteredthelocation
o exitTime
§ Timethepersonexitedthelocation
5
3.2ProvidedCode
Theclient has provided thebasic interface commandstheywish tousetohandlethedata.Youarefreetoadd
commandsforyourtestingpurposesifyouwish, however you must keepthecommandslistedherethesame.
The provided base code handles the parsing of these commands and provides some supporting types and
functions. Itisrecommended thatyouretainthecommandfunctionalityandbuilduponit,howeveryouare
freetomodifythebasecodehoweveryouwant/need tomeet therequirements. Seetestfiles/test.txtin
theprovidedcodeforasetofexamplecommands.
Theprogramisconfiguredwithanartificial“CURRENT_DATE”variablethatrelatestotheprovideddatafiles.
Youshouldusewheneverreferringtothecurrentdate. This isconfiguredbyaninitializationcommandinthe
testfiles.
Forsimplicity,alimiteddatasetisprovided. Apersonisonlyconsideredinfectiousiftheyarecurrentlyan
activecaseandonlythedatesbetweenwhichtheyareinfectionsisrecorded.Allofthedataisartificialdata
thathasbeen procedurallygenerated. ApersonisonlyconsideredanactivecaseiftheyhaveanactiveStartDate,
andtheyeitherdon’thave anactiveEndDate ortheactiveEndDate isafterthe“currentdate”.
Command Purpose Parameters
INIT Initializestheprogramandsetsthe
artificialCURRENT_DATE
currentDate– theartificialcurrentdate for
theprogram
LOAD_DATA Loads data from the provided
People, Locations and Visits CSV
files
peoplePath– pathtopeoplecsv
locationPath– pathtolocationcsv
visitPath– pathtovisitcsv
ADD_PERSON Addsanewperson personName– nameofthepersontoadd
ADD_LOCATION Addsanewlocation locationName– nameofthelocationtoadd
ADD_VISIT Adds anewvisit personName– nameoftheperson
locationName – nameofthelocation
date– dateofthevisit
entryTime– timevisitstarted
exitTime– timevisitended
GET_PERSON FindsthePersonbynameandlists
all visits (ora filteredlistof visits
betweenstartDateandendDate)
personName– nameofthepersontoget
startDate (optional)– filtervisitlistbythis
starttime
endDate (optional) – filter visitlist by this
endtime
GET_LOCATION Finds the Location by name and
lists all visits (or a filtered list of
visits between startDate and
endDate)
locationName– nameofthelocationtoget
startDate(optional)– filtervisitlistbythis
starttime
endDate (optional) – filter visitlist by this
endtime
LIST_CONTACTS FindsthePersonbynameandlists
all contacts within (n) contacts of
thegivenperson.i.e.n=1isdirect
contact,n=2iscontactwithann=1
contact … n=N is contact with an
n=N-1contact.
personName– persontogetcontactsof
n– numberoflevelsofcontact
CURRENTLY_ACTIVE Listsallcurrentlyactivepeople.
6
NEW_CASE SetsthegivenPersontonowbean
active case and the date and time
they tested positive. Also outputs
themostlikelyinfectionsourcefor
the target and updates the
activeStartDate for the person (1
dayafterthecontacttookplace),if
no viable contact is detected, sets
the activeStartDate to the
CURRENT_DATE variable in
DateHelpers.
personName– nameofthepersontomakea
newcase
TRACE_PATH Traces the path that the virus
travelled from person to person
untilitreachesthetarget.
personName– nameof theperson to trace
theviustransmissionfor
3.3CalculatingtheChanceofSpread
Forthisassignment,wehave animaginaryvirusthathasahighchanceofspreadingandbecomesdetectable
andcontagiousthefollowing day.That is.if John isdetectedasanactivecaseon5/1/2021,theymusthave
caughtthevirussomedaybefore 5/1/2021
Forthisvirus thechanceofcontactbetweenanactivecaseandanotherpersonresultinginaspreadtothat
personisbasedontheoverlapintimespent bytwopeopleata givenlocation,thetimesincetheactivecase
contracted the virus and theincubation time.Thechanceisthepercentage ofonehour spent incontact(inthe
same location).
LetDbethetimespentbytwopeopleinthesamelocation(inminutes)
TheChanceofSpread(C)is:
= # !
"# × 100'
NotethatC cannotbelessthan0%orgreaterthan100%.
3.4RunningtheProvidedCode
Toruntheprovidedcode youwillneed topassitthepathtothetestfileasaprogramargumentthroughthe
“Run Configuration” in eclipse. The default included test file is at the relative location “testfiles/test.txt”.
Throughoutdevelopmentitmayhelptocreateyourowntestfilesanddatasetsthatyoucanusetohelpwith
implementationofspecificfunctions.Ifyouareusing your own testfile,makesureyouupdatethe“Arguments”
underthe“Run Configuration”in eclipse. Notethatadifferenttestfilemaybeusedformarking.
7
4. SubmissionDetails
Allassignmentsmustbesubmittedviathelearnonlinesubmissionsystemthroughthecoursewebpage.
Youmust submittwofiles, areportandazipfilecontaining your programmingsolution.
Report
Submita singlepdffilewiththenameemailID.pdf (replaceemailID withyourown!)asoutlinedinsection2.2
Code
Make sure you add your nameandemailIDintothecommentsonallofthe Javafileyouhave modified. Create
a singlezipfilewiththenameemailID.zip. Whenunzippeditshouldcontainafunctionaleclipseproject.You
maywork ontheprojectin another IDE however you must ensureitworksasaneclipseproject asitwill be
markedbasedoneclipse.
8
LateSubmissions andExtensions
Late submissions will be penalised by scaling the marks by 70% unless with pre-approved extension.
Application forextension shouldbelodged through thecoursewebsiteanditrequiresadocumentproving
acceptable reasonsforextension (e.g.,medicalcertificate,ora letter fromyourworksupervisor). Pleasecheck
thecourseoutlineavailableonthecoursewebsiteforfurtherdetailsonlatesubmissionandextensions.
AcademicMisconduct
Studentsmust beaware oftheacademicmisconductguidelinesavailablefromtheUniversityofSouthAustralia
website. Deliberate academic misconduct such as plagiarism is subject to penalties. Information about
AcademicintegritycanbefoundinSection9oftheAssessmentpoliciesandproceduresmanualat:
https://i.unisa.edu.au/policies-and-procedures/codes/assessment-policies/
All of the assignments are compared using special tools designed to look for similarities between Java
programs. TheplagiarismcheckingprogramsdonotjustcomparetheactualJavacode,butinsteadperform
comparisonsonthecodeafterithasbeencompiled.Performingcosmeticchangessuchasreformattingcode,
renaming variables,or reorderingcodemayfoolahumanundercasualinspectionbutwillstillbedetectedby
theplagiarism checker asbeingsimilar. BewarethatifyouusegenerativeAItools,thismayresultinanother
personusingthesametoolsubmittingsolutionswithhighsimilarities.
Anyassignmentsfound tobeinviolationoftheuniversity’srulesonacademicmisconductwillbecomesubject
ofAcademicIntegrityinvestigationwhichwilldecidethepenalty.Furthermore,youmayalsofailthecourse
and/orreceiveanofficialnoteinyouracademictranscript.
Thebestwaytoavoid beingpenalised forplagiarism istonotcheatonyourassignment.Donotshareyour
code with anyone, do not let anyone else do your assignment for you, and do not leave your computer
unattended or share your password with anyone. If you are working with friends it is ok to discuss the
assignment and possible ways of solvingit,butyoushouldnotshare your code.Sharingcodewithothersisstill
consideredacademicmisconduct.Thegoldenrule forworkingon yourassignmentsisnever showanother
studentthematerialyouintendonhandingup.
9
5.AssessmentCriteria
• ProgramCode(70%)
o Runswithoutanyerrorsorcrashing(5%)
o MeetsClientRequirements(40%)
o QualityofImplementationandCode(25%)
§ Isthecode well-structuredandlogical?
§ Isthecode readable?
§ Haveefficient control-flowelementsbeen used(whileloops,forloops)?
§ Sufficientcomments. Also,nameandemail IDmustbeadded incomments.
§ CodereusabilityandDRY(don’trepeatyourself)
• Report (30%)
o Logical reasoning for Data StructureandAlgorithm selectionis documentedin supporting
documentation.(25%)
§ Choices ofdatastructureandalgorithm
§ Efficient usageofspace
§ Efficient computationtime
o Provided sample outputsarerelevanttotherequirements.(5%)
版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。