
Date: 2025-04-08 08:23

Assignment 1 2025

General

You must read fully and carefully the assignment specification and instructions.

Course: COMP20007 Design of Algorithms @ Semester 1, 2025

Deadline Submission: Monday, 7th April @ 11:59 pm

Course Weight: 10%

Assignment type: individual

ILOs covered: 2, 3, 4

Submission method: via ED

Purpose

The purpose of this assignment is for you to:

Design efficient algorithms in pseudocode.

Improve your proficiency in C programming and your dexterity with dynamic memory

allocation.

Demonstrate understanding of representing problems as graphs and implementing a set of

algorithms.

Birds of a feather

Why should we group species?

In the field of genetics we are often interested in grouping similar species together to understand how

animals have evolved over time. Understanding evolution allows us to make more informed

decisions in conservation efforts, but may also lead to the discovery of new medicines, as similar

species produce similar responses to external changes. One possible method of grouping species

together is by how similar they are, which historically has been based on how similar their physical

features are. Some examples of Australian birds are shown below:

Left: Wompoo Fruit-Dove, Middle: Azure Kingfisher, Right: Pale-Yellow Robin

Now, one possible grouping of these three birds is to have them grouped by size - we would expect

small birds like the Pale-Yellow Robin or Azure Kingfisher to have more in common than the larger

Wompoo Fruit-Dove. Of course, we can also group them based on more features, using an

algorithmic approach. Examples of such algorithms are Blake's averages and Cynthia's midpoints.

Other Applications

Grouping methods have numerous real-world applications, which include:

Disease Management: Tracking information about the spread of sickness and grouping

together individuals with similar health problems can help us to identify high-risk population

areas and take preventative steps to save lives.

Resource Allocation: Identifying groups of animals with high/low resource usage can help

better distribute resources and help conservation efforts.

Content Personalisation: Grouping together users with similar tastes can help individuals

access the content and information they want more often.

Group Computation Algorithms

Blake's Averages

Initialization:

Start by selecting the first c birds as the centres of the c groupings desired.

Assign all other birds to their nearest group centre.

Calculate Averages:

For each group, find the average of all the features.

For each group, find the bird that is closest to this average.

Re-assign all birds to the new closest groups.

Termination:

Repeat the main loop until the group centres stop changing, or we exceed a maximum

number of iterations.

Once the centres stop changing, the algorithm terminates.

Output:

The output of the algorithm is the list of birds with their features and group

memberships, ordered alphabetically.

The time complexity of the Blake's Averages algorithm is O(n × d × c × i), where n is the number

of data points, d is the number of features per point, c is the desired number of clusters, and i is the

number of iterations. As i should not vary that much and is effectively constant, we will have a

complexity of roughly O(n × c × d).
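The assignment step that dominates each iteration can be sketched as follows. This is a minimal illustration rather than the skeleton's code: the bird_t layout, the function signature, and the use of squared Euclidean distance over the two numeric features are all assumptions, since the specification does not state how "nearest" is measured.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record; the skeleton's real struct may differ. */
typedef struct {
    double weight;      /* grams */
    double bodyLength;  /* cm */
    int group;          /* index of the assigned group */
} bird_t;

/* Assumed metric: squared Euclidean distance over the numeric features. */
static double squaredDistance(const bird_t *a, const bird_t *b) {
    double dw = a->weight - b->weight;
    double dl = a->bodyLength - b->bodyLength;
    return dw * dw + dl * dl;
}

/* Assign every bird to its nearest centre. centres[g] is the index (into
 * birds) of the bird acting as the centre of group g. The nested loops make
 * n * c distance evaluations, each costing O(d) for d features. */
void assignGroups(bird_t *birds, size_t numBirds,
                  const size_t *centres, size_t numGroups) {
    assert(birds != NULL && centres != NULL && numGroups > 0);
    for (size_t i = 0; i < numBirds; i++) {
        size_t best = 0;
        double bestDist = squaredDistance(&birds[i], &birds[centres[0]]);
        for (size_t g = 1; g < numGroups; g++) {
            double dist = squaredDistance(&birds[i], &birds[centres[g]]);
            if (dist < bestDist) {  /* strict '<' keeps the lowest group on ties */
                bestDist = dist;
                best = g;
            }
        }
        birds[i].group = (int)best;
    }
}
```

Counting the distance evaluation as the basic operation, one call performs n × c of them, which matches the per-iteration bound above.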

Pseudocode

BlakeAverages(birds, numBirds, numGroups):
    // inputs:
    //   birds, a list of birds.
    //   numBirds, the number of birds.
    //   numGroups, the number of groups.

    // Initialise the groups by, for the first c birds, assigning bird i to group i.
    birds <- initialiseBirds()

    // Assign each bird to the nearest group centre
    birds <- assignGroups()

    repeat:
        // record the current group centres
        previousCentres <- currentCentres

        // calculate the mean of each group and find the birds which act as the new group centres
        currentCentres <- calculateMeans()

        // Assign each bird to the nearest group centre
        birds <- assignGroups()
    until previousCentres == currentCentres or we exceed the maximum iteration count

    // all done, so return grouping
    return birds
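The calculateMeans() step for a single group might look like the sketch below. It assumes a hypothetical bird_t with only the two numeric features and a squared Euclidean distance; how the colour feature enters the average is not stated in the specification, so it is left out here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record; the skeleton's real struct may differ. */
typedef struct {
    double weight;
    double bodyLength;
    int group;
} bird_t;

/* Return the index of the bird in `group` that lies closest to the mean of
 * that group's numeric features, or -1 if the group has no members. */
int meanCentre(const bird_t *birds, size_t numBirds, int group) {
    assert(birds != NULL);
    double sumWeight = 0.0, sumLength = 0.0;
    size_t count = 0;

    /* First pass: accumulate the feature sums for this group. */
    for (size_t i = 0; i < numBirds; i++) {
        if (birds[i].group == group) {
            sumWeight += birds[i].weight;
            sumLength += birds[i].bodyLength;
            count++;
        }
    }
    if (count == 0) {
        return -1;
    }
    double meanWeight = sumWeight / (double)count;
    double meanLength = sumLength / (double)count;

    /* Second pass: pick the member nearest to the mean (first one on ties). */
    int best = -1;
    double bestDist = 0.0;
    for (size_t i = 0; i < numBirds; i++) {
        if (birds[i].group != group) {
            continue;
        }
        double dw = birds[i].weight - meanWeight;
        double dl = birds[i].bodyLength - meanLength;
        double dist = dw * dw + dl * dl;
        if (best < 0 || dist < bestDist) {
            best = (int)i;
            bestDist = dist;
        }
    }
    return best;
}
```

Calling meanCentre() once per group yields the new currentCentres array that the loop compares against previousCentres.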

Cynthia's Midpoints

Initialization:

Start by selecting the first c birds as the centres of the c groupings desired.

Assign all other birds to their nearest group centre.

Calculate Midpoints:

For each group, find the median of the numeric features.

For each group, find the mode of the categorical features.

For each group, find the bird that is closest to this combination of midpoints.

Re-assign all birds to the new closest groups.

Termination:

Repeat the main loop until the group centres stop changing, or we exceed a maximum

number of iterations.

Once the centres stop changing, the algorithm terminates.

Output:

The output of the algorithm is the list of birds with their features and group

memberships, ordered alphabetically.

As we will sort the elements every time we need to find a median, the time complexity of the

Cynthia's Midpoints algorithm is ,

where n is the number of data points, d is the number of features per point, c is the desired number

of clusters, and i is the number of iterations. As i should not vary that much and is effectively

constant, we will have a complexity of roughly .

Pseudocode

CynthiaMidpoints(birds, numBirds, numGroups):
    // inputs:
    //   birds, a list of birds.
    //   numBirds, the number of birds.
    //   numGroups, the number of groups.

    // Initialise the groups by, for the first c birds, assigning bird i to group i.
    birds <- initialiseBirds()

    // Assign each bird to the nearest group centre
    birds <- assignGroups()

    repeat:
        // record the current group centres
        previousCentres <- currentCentres

        // calculate the midpoints of each group and find the birds which act as the new group centres
        // For each group
        //     Find the most common colour
        //     Find the median weight
        //     Find the median bodyLength
        currentCentres <- calculateMeds()

        // Assign each bird to the nearest group centre
        birds <- assignGroups()
    until previousCentres == currentCentres or we exceed the maximum iteration count

    // all done, so return grouping
    return birds
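The two midpoint computations inside calculateMeds() can be sketched as below. The function names and signatures are hypothetical, and two behaviours are assumptions the specification does not settle: an even-sized group uses the average of its two middle values, and a tie for the most common colour goes to the colour seen first.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for doubles in ascending order. */
static int compareDoubles(const void *a, const void *b) {
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of n values, found by sorting the array in place. */
double median(double *values, size_t n) {
    assert(values != NULL && n > 0);
    qsort(values, n, sizeof *values, compareDoubles);
    if (n % 2 == 1) {
        return values[n / 2];
    }
    /* Assumed convention for even n: mean of the two middle values. */
    return (values[n / 2 - 1] + values[n / 2]) / 2.0;
}

/* Mode of n labels: the most frequent string, first seen wins on ties.
 * This simple version makes n * n string comparisons. */
const char *mode(const char *const *labels, size_t n) {
    assert(labels != NULL && n > 0);
    size_t bestIndex = 0;
    size_t bestCount = 0;
    for (size_t i = 0; i < n; i++) {
        size_t count = 0;
        for (size_t j = 0; j < n; j++) {
            if (strcmp(labels[i], labels[j]) == 0) {
                count++;
            }
        }
        if (count > bestCount) {
            bestCount = count;
            bestIndex = i;
        }
    }
    return labels[bestIndex];
}
```

Because median() sorts on every call, the sort comparisons are a natural candidate for the basic operation when analysing this algorithm.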

Notes

Your main function should look like the following pseudocode:

main():
    // Initialise an empty list to store birds, and group centres
    birds <- empty list

    // Read in the list of birds and number of groups
    birds, numBirds, numGroups <- readBirds()

    if no birds input:
        print "No birds input."
        return 1

    if number of birds < number of groups:
        print "Invalid Data."
        return 1

    // using the method read from argv:
    if method == "B":
        birds <- BlakeAverages()
    else if method == "C":
        birds <- CynthiaMidpoints()
    else:
        print "Invalid method."

    // Print out the list of birds and their group assignments
    print(birds)

    // free all of the data
    free(birds)

The process of grouping can be visualized in two dimensions, and may look like the following.

In the image, different colours represent the different groups.

Example

An example of input and output is provided on the Part 1 Skeleton Slide.

Task 1: Group Computation

Part A

Implement the Blake's Averages and Cynthia's Midpoints algorithms to compute a grouping of

birds, as described in the previous slide.

Requirements

Code: Your program should implement the required pseudocode mentioned in previous slides,

as well as a couple of helper functions listed in the skeleton code.

Input Format: The input will be a text file where the first line indicates the total number of

Australian birds, and the total number of groups. Subsequent lines represent birds with their

name, colour, weight (grams) and body length (cm) separated by a space (e.g., Red-backed_Fairywren Black 5 10). All the birds are sorted in alphabetical order by name using

Unix sorting order.

Output Format: Your program should output all birds, their features and associated groupings

in alphabetical order. For Blake's Averages, this means keeping it in the original ordering (Unix

sorting rules), and in Cynthia's midpoints it will be alphabetically sorted using strcmp sorting

rules. This should be done by traversing your array of birds and printing out each bird. This

should be output to the console (stdout).
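The input and output formats above can be illustrated with a small parse-and-format sketch. The bird_t layout, the function names and the fixed MAX_FIELD buffers are assumptions made to keep the example short; a real submission would copy each string into an exactly sized malloc() buffer to meet the space-efficiency requirement.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_FIELD 128  /* assumed bound on name/colour length, for this sketch only */

/* Hypothetical record; the skeleton's real struct may differ. */
typedef struct {
    char name[MAX_FIELD + 1];
    char colour[MAX_FIELD + 1];
    double weight;      /* grams */
    double bodyLength;  /* cm */
    int group;          /* -1 until a group is assigned */
} bird_t;

/* Parse one "name colour weight bodyLength" line.
 * Returns 1 if all four fields were read, 0 otherwise. */
int parseBird(const char *line, bird_t *bird) {
    assert(line != NULL && bird != NULL);
    bird->group = -1;
    /* The %128s widths must match MAX_FIELD to prevent buffer overflow. */
    return sscanf(line, "%128s %128s %lf %lf", bird->name, bird->colour,
                  &bird->weight, &bird->bodyLength) == 4;
}

/* Write one output record in the same layout as the sample output.
 * Returns the value of snprintf (the length the full record needs). */
int formatBird(const bird_t *bird, char *out, size_t outSize) {
    assert(bird != NULL && out != NULL);
    return snprintf(out, outSize, "%s %s %f %f Group: %d", bird->name,
                    bird->colour, bird->weight, bird->bodyLength, bird->group);
}
```

For example, parsing `Australian_Bushturkey Black 2100 64.3` and setting group to 2 formats back as `Australian_Bushturkey Black 2100.000000 64.300000 Group: 2`, since `%f` prints six decimal places.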

Part B

Evaluate both algorithms through experimental analysis by quantifying the average total basic

operations per iteration of the main loop (calculated by dividing the number of operations in the

main loop by the number of iterations taken) of the two algorithms (Blake's Averages and Cynthia's

Midpoints) across various input scales and configurations. Use all of the valid input sets provided.

You may wish to generate more data, and you can do so with the provided files in the analysis

folder.

You must decide what to define as the basic operation for each algorithm, but remember to only

count the basic operations in the main while loop.

Reporting: Write a report including a discussion on the choice of algorithm, the experimental

evaluation (including tables or graphs showing how the average number of basic operations varies

with the input parameters (n and d)), and conclusions drawn from the comparisons. Include any

assumptions or simplifications made in your implementations. In addition, discuss:

Possible improvements that can be made to the main loop of the algorithms, if any, to reduce

complexity.

Why you selected this basic operation.

Tip: The global variable numOps is provided to track the operation count. You may need to modify some of

the functions (such as the comparison functions) to increment the operation counter. By default, the

information relating to operation counts is printed to stderr.
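One way to instrument a comparison function is sketched here. The specification names the global numOps, but its type, the comparator name and the averaging helper are assumptions; in the real skeleton numOps is already declared, so it would not be redefined as it is in this standalone example.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the provided global operation counter (type assumed). */
long numOps = 0;

/* qsort comparator for doubles that counts each key comparison as one
 * basic operation. */
int compareWeights(const void *a, const void *b) {
    double x = *(const double *)a;
    double y = *(const double *)b;
    numOps++;
    return (x > y) - (x < y);
}

/* Average number of basic operations per iteration of the main loop. */
double averageOpsPerIteration(long totalOps, int iterations) {
    assert(iterations > 0);
    return (double)totalOps / iterations;
}
```

Resetting numOps to zero just before the main loop starts keeps the initial assignment step out of the count, as the task requires.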

Include your operation counting code as an appendix to the report.

Submission Guidelines

Submit your C source code files with appropriate comments explaining the algorithms and data

structures used.

Your report should be in PDF format, including your findings from the experimental evaluation

and any observations and theoretical improvements regarding the performance of the two

algorithms.

The report for Part 1B should be submitted as a PDF file named written_task_1B.pdf . The file

should be uploaded to the home directory of Part 1A, that is, the directory that contains your

program file birds.c .

Grading Criteria

Correctness of the implemented algorithms and adherence to the requirements.

Efficiency (time and space) and proper storage of birds.

Clarity of the report, including the depth of the experimental evaluation and the analysis of the

results.

Code readability, structure, and documentation.

Task 1 Skeleton

Example of Input:

10 3

Australasian_Swamphen Blue 1310 51

Australian_Bushturkey Black 2100 64.3

Australian_Darter Black 2600 86.5

Australian_King-Parrot Red 195 42

Australian_Logrunner Black 56 19

Australian_Magpie Black 350 85

Australian_Pelican White 6800 188

Australian_Rufous_Fantail Grey 10 18.5

Australian_White_Ibis White 1475 66.4

Australian_Wood_Duck Brown 955 46

Example of Output (for Blake's Averages):

Australasian_Swamphen Blue 1310.000000 51.000000 Group: 1

Australian_Bushturkey Black 2100.000000 64.300000 Group: 2

Australian_Darter Black 2600.000000 86.500000 Group: 2

Australian_King-Parrot Red 195.000000 42.000000 Group: 0

Australian_Logrunner Black 56.000000 19.000000 Group: 0

Australian_Magpie Black 350.000000 85.000000 Group: 0

Australian_Pelican White 6800.000000 188.000000 Group: 2

Australian_Rufous_Fantail Grey 10.000000 18.500000 Group: 0

Australian_White_Ibis White 1475.000000 66.400000 Group: 1

Australian_Wood_Duck Brown 955.000000 46.000000 Group: 1

Example of Output (for Cynthia's Midpoints):

Australasian_Swamphen Blue 1310.000000 51.000000 Group: 1

Australian_Bushturkey Black 2100.000000 64.300000 Group: 1

Australian_Darter Black 2600.000000 86.500000 Group: 1

Australian_King-Parrot Red 195.000000 42.000000 Group: 0

Australian_Logrunner Black 56.000000 19.000000 Group: 0

Australian_Magpie Black 350.000000 85.000000 Group: 0

Australian_Pelican White 6800.000000 188.000000 Group: 2

Australian_Rufous_Fantail Grey 10.000000 18.500000 Group: 0

Australian_White_Ibis White 1475.000000 66.400000 Group: 1

Australian_Wood_Duck Brown 955.000000 46.000000 Group: 0

Eels in the Kulin Nation

Short-finned eels are a fish which live in the freshwater systems around south-eastern Australia. To

the First Peoples of the Kulin Nation - the traditional custodians of the lands and waters surrounding

what is now Melbourne, or Naarm (which is the Boonwurrung/Woiwurrung name for Port Phillip) -

these eels were very important food sources. The Wurundjeri people of the Kulin Nation had seven

seasons rather than the Western four, and one of the seasons was dedicated to the short-finned eel

migration; this season is known as Iuk (https://inspiringvictoria.org.au/2020/08/13/seasons-in-the-sky/). During Iuk, the short-finned eels which live in the freshwater systems of the Kulin Nation

migrate out to the ocean to begin their long journey to the warm waters of the Coral Sea, some

3,000km away, to breed. However, before setting out for this extensive journey the eels must eat and

get fat to survive the long swim. Hence, the people of the Kulin Nation would make extensive fish

traps in the river systems to catch and eat the fattened eels during Iuk, which have been described as

having a buttery taste by a Yorta Yorta person.

Feeding and breeding eels

You are a fresh water eel in the river systems of the Kulin Nation. There are many rivers which

connect different lakes together, and eventually to the ocean.

Part A

You find that these river systems are difficult to navigate, and you wonder if you are running in

circles. Write an algorithm to tell if there are any paths in this river system which could form a cycle,

i.e. leaving from one lake you could take a certain path of distinct rivers that reaches back to the

starting lake.

This algorithm should work for any set of lakes and rivers. There will be a certain number of lakes,

each with a unique identifier lakeID ∈ [0, numLakes). Each river will run from one lake to

another, but you may assume that rivers can be travelled in both directions.

The first line of input is the number of lakes, and number of rivers in the system respectively. The

subsequent lines will each represent a river, with the first value being the lakeID it flows from and the

second being the lakeID it flows to. The input will look like:

[num_lakes] [num_rivers]

[from_lakeID] [to_lakeID]

...

The output of the program should print "We're running in circles!" if there is a cycle found, and

"Smooth sailing" if not.
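A minimal sketch of one possible approach, union-find over the river list, is given below. It is not the skeleton's cycleCheck(graph_t *graph): the river_t type and the hasCycle name are hypothetical, and a depth-first search over the provided graph structure would serve equally well. Two distinct rivers joining the same pair of lakes are treated as a cycle, following the "path of distinct rivers" wording.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical edge record: one river between two lakes. */
typedef struct {
    int from;
    int to;
} river_t;

/* Find the representative of v's component, halving the path as we go. */
static int findRoot(int *parent, int v) {
    while (parent[v] != v) {
        parent[v] = parent[parent[v]];
        v = parent[v];
    }
    return v;
}

/* Returns 1 if the undirected river system contains a cycle, 0 if it does
 * not, and -1 if memory could not be allocated. */
int hasCycle(int numLakes, const river_t *rivers, int numRivers) {
    assert(numLakes > 0 && numRivers >= 0);
    int *parent = malloc((size_t)numLakes * sizeof *parent);
    if (parent == NULL) {
        return -1;
    }
    for (int v = 0; v < numLakes; v++) {
        parent[v] = v;  /* every lake starts in its own component */
    }
    int found = 0;
    for (int e = 0; e < numRivers && !found; e++) {
        int a = findRoot(parent, rivers[e].from);
        int b = findRoot(parent, rivers[e].to);
        if (a == b) {
            found = 1;      /* endpoints already connected: this river closes a cycle */
        } else {
            parent[a] = b;  /* merge the two components */
        }
    }
    free(parent);
    return found;
}
```

A caller would print "We're running in circles!" when the result is 1 and "Smooth sailing" when it is 0.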

Part B

You are very hungry and have heard of some good feeding grounds further inland, but it is a long

way and you can't figure out what's the best way to go given all the different rivers and lakes. You

want to get there as fast as possible before the food is all eaten up. The amount of time taken to

traverse a river is equal to the river's length. Unfortunately, because of a strong current, it takes

twice as long to swim upstream as it does to swim downstream.

Write an algorithm that will find the shortest way to reach the feeding grounds from the ocean.

This algorithm should work for any set of lakes and rivers. As with Part A, there will be a certain

number of lakes, each with a unique identifier lakeID ∈ [0, numLakes). Each river will run from

one lake to another and have an associated length with it as well.

The first line of the input will contain the lakeID which you are starting from, followed by the lakeID

of the destination lake where the feeding grounds are. The second line contains the number of lakes

and number of rivers in the system. The subsequent lines each contain a river, with the lakeID of the

lake it flows from followed by the lakeID of the lake it flows into, and then the final value is the river's

length.

The input will look like:

[origin_lake] [destination_lake]

[num_lakes] [num_rivers]

[from_lakeID] [to_lakeID] [length_in_km]

...

The output of the program should print the total length of the shortest path to the feeding ground,

followed by the lakeIDs of the lakes traversed to reach it (i.e. by going from lake to lake through the

rivers).

The output should look like:

Total cost: [total_length]

Path: [origin_lakeID], [lakeID_1], [lakeID_2], ..., [destination_lakeID]
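The upstream penalty can be folded into the edge cost, as in the following sketch. It is a deliberately simple array-based Dijkstra that rescans every river each round, not the priority-queue version the scaffold supports. The river_t type, the function name, integer lengths and NO_PATH are assumptions, as is the reading that travelling from the "from" lake to the "to" lake is downstream and costs the river's length, while the reverse direction costs twice that.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define NO_PATH (-1)

/* Hypothetical edge record: the river flows from `from` to `to`. */
typedef struct {
    int from;
    int to;
    int length;
} river_t;

/* Returns the cheapest travel cost from origin to dest, or NO_PATH.
 * path must have room for numLakes IDs; *pathLen receives the count. */
int shortestRoute(int numLakes, const river_t *rivers, int numRivers,
                  int origin, int dest, int *path, int *pathLen) {
    assert(numLakes > 0 && path != NULL && pathLen != NULL);
    int *dist = malloc((size_t)numLakes * sizeof *dist);
    int *prev = malloc((size_t)numLakes * sizeof *prev);
    int *done = calloc((size_t)numLakes, sizeof *done);
    int result = NO_PATH;
    *pathLen = 0;
    if (dist == NULL || prev == NULL || done == NULL) {
        goto cleanup;
    }
    for (int v = 0; v < numLakes; v++) {
        dist[v] = INT_MAX;  /* INT_MAX marks "not reached yet" */
        prev[v] = -1;
    }
    dist[origin] = 0;
    for (int round = 0; round < numLakes; round++) {
        int u = -1;  /* closest reached lake that is not yet finalised */
        for (int v = 0; v < numLakes; v++) {
            if (!done[v] && dist[v] != INT_MAX && (u < 0 || dist[v] < dist[u])) {
                u = v;
            }
        }
        if (u < 0 || u == dest) {
            break;
        }
        done[u] = 1;
        for (int e = 0; e < numRivers; e++) {
            int v, cost;
            if (rivers[e].from == u) {        /* downstream: cost = length */
                v = rivers[e].to;
                cost = rivers[e].length;
            } else if (rivers[e].to == u) {   /* upstream: cost = 2 * length */
                v = rivers[e].from;
                cost = 2 * rivers[e].length;
            } else {
                continue;
            }
            if (dist[u] + cost < dist[v]) {
                dist[v] = dist[u] + cost;
                prev[v] = u;
            }
        }
    }
    if (dist[dest] != INT_MAX) {
        result = dist[dest];
        int count = 0;
        for (int v = dest; v != -1; v = prev[v]) {
            count++;
        }
        *pathLen = count;
        for (int v = dest; v != -1; v = prev[v]) {
            path[--count] = v;  /* fill from the back to get origin first */
        }
    }
cleanup:
    free(dist);
    free(prev);
    free(done);
    return result;
}
```

With rivers 0→1 (length 4), 2→1 (length 3) and 0→2 (length 20), travelling from lake 0 to lake 2 costs 10 via lake 1 (4 downstream, then 2 × 3 upstream), which beats the direct river at 20.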

Part C

The breeding season (Iuk) is fast approaching, so you need to make your way back to the ocean and

onto the Coral Sea. However, you want to maximise the amount of fat you have by the time you

reach the ocean. Swimming down rivers costs energy (and burns fat), but in many cases, you need to

swim through some rivers and lakes anyway to get to the sea. Moreover, the lakes tend to have some

food in them, which can increase your fat supplies again. Assume that you only travel the rivers

downstream because you want to reach the sea as quickly as possible.

Propose an algorithm to find the best way to get back to the ocean, ensuring that, in total, you have

the maximum fat stores upon reaching the ocean. This algorithm should run in O((V + E) log V).

This algorithm may only work with certain sets of rivers and lakes, and it is up to you to

check whether the input will be solvable in this time complexity as part of your algorithm.

Write the pseudocode for this algorithm. You may assume the following:

The input graph is a directed weighted graph. Each edge (u, v, w) is represented as a single

element in the adjacency list for u.

There is a function Dijkstra(graph, origin, destination) → (cost, path) that returns the

cost and the path of the lowest-cost path from origin to destination.

As part of the input data, you have an array fatGain[0..numLakes - 1] where fatGain[i]

stores the amount of fat (in some units) you always gain when you reach the lake with ID i.

When you swim downstream along a river of length w, you lose exactly w units of fat.

At the start of your journey back to the sea, you have K units of fat in your body. You will die

when the number of fat units in your body reaches 0.
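The fat bookkeeping described by these assumptions can be checked with a small simulation along a fixed route. This is only an illustration of the accounting, not the algorithm the task asks for. The function name and parameters are hypothetical, and two details are assumptions: fat is checked on arrival at each lake before feeding, and the starting lake's fatGain is not collected.

```c
#include <assert.h>
#include <stddef.h>

/* Simulate swimming along a route. lakes holds numRivers + 1 lake IDs and
 * riverLengths[k] is the length of the river from lakes[k] to lakes[k + 1].
 * Returns the fat remaining at the last lake, or -1 if the eel dies. */
long fatAfterPath(long startFat, const int *lakes, const long *riverLengths,
                  int numRivers, const long *fatGain) {
    assert(lakes != NULL && riverLengths != NULL && fatGain != NULL);
    assert(numRivers >= 0);
    long fat = startFat;
    for (int k = 0; k < numRivers; k++) {
        fat -= riverLengths[k];         /* swimming a river of length w burns w */
        if (fat <= 0) {
            return -1;                  /* reaching 0 units of fat is fatal */
        }
        fat += fatGain[lakes[k + 1]];   /* feed on arrival at the next lake */
    }
    return fat;
}
```

Starting with K = 10, fatGain = {0, 5, 2} and the route 0, 1, 2 over rivers of length 3 and 4, the eel has 10 - 3 + 5 = 12 units at lake 1 and 12 - 4 + 2 = 10 units at lake 2. The net change over an edge (u, v, w) is therefore fatGain[v] - w.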

You must justify your algorithm design choices in 300 words, including why certain sets of rivers and

lakes won't work with the algorithm.

Notes:

No marks will be awarded if your algorithm's time complexity is not O((V + E) log V), or if

your algorithm is incorrect for the problem.

The report for Part 2C should be submitted as a PDF file named written_task_2C.pdf . The file

should be uploaded to the home directory of Part 2B, the directory that contains your program

files dijkstra.c , graph.c , etc.

Task 2, Part A Skeleton

Implement a function, cycleCheck(graph_t *graph), which returns 1 if a cycle is found, and 0

otherwise.

The main driver functions have already been implemented.

The first line of input is the number of lakes, and number of rivers in the system respectively. The

subsequent lines will each represent a river, with the first value being the lakeID it flows from and the

second being the lakeID it flows to. The input will look like:

[num_lakes] [num_rivers]

[from_lakeID] [to_lakeID]

...

The output of the program should print "We're running in circles!" if there is a cycle found, and

"Smooth sailing" if not.

Input Data Sets

A number of input data files are provided in the subdirectory test_cases . Each file name starts with

the prefix t2a- . The data file t2a-0.txt represents a simplified version of some lakes and water

flows (like rivers, creeks, canals) between them for the area around Lakes Entrance in Victoria. It

should be noted that this area does not belong to the Kulin Nation (it actually belongs to the

Gunaikurnai people). The area was chosen here for illustrative purposes.

Also note that t2b-0.txt is the same as t2a-0.txt, but with additional data for use in Part B.

Task 2, Part B Skeleton

Write a function dijkstra(graph_t *graph, int origin, int dest, int *path) which computes

the shortest path from origin to dest and returns the cost of this path, and the path should be

written into the path argument. This function should return the SENTINEL value (-1) if there is no

path, and print out "No Path".

The input data will look like:

[origin_lake] [destination_lake]

[num_lakes] [num_rivers]

[from_lakeID] [to_lakeID] [length_in_km]

...

The output should look like:

Total cost: [total_length]

Path: [origin_lakeID], [lakeID_1], [lakeID_2], ..., [destination_lakeID]
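The two-line output layout can be produced as in the sketch below, which writes into a caller-supplied buffer so the result is easy to check. The function name is hypothetical, the cost is assumed to be an integer, and details such as a trailing newline should be confirmed against the provided test cases. The scaffold's own I/O functions remain the recommended route.

```c
#include <assert.h>
#include <stdio.h>

/* Format a route as "Total cost: N" followed by "Path: a, b, c".
 * Returns the number of characters written, or -1 if out is too small. */
int formatRoute(int totalCost, const int *path, int pathLen,
                char *out, size_t outSize) {
    assert(path != NULL && pathLen > 0 && out != NULL);
    int used = snprintf(out, outSize, "Total cost: %d\nPath: ", totalCost);
    if (used < 0 || (size_t)used >= outSize) {
        return -1;
    }
    for (int k = 0; k < pathLen; k++) {
        size_t remaining = outSize - (size_t)used;
        /* Every lake ID after the first is preceded by a comma and a space. */
        int n = snprintf(out + used, remaining, "%s%d",
                         k == 0 ? "" : ", ", path[k]);
        if (n < 0 || (size_t)n >= remaining) {
            return -1;
        }
        used += n;
    }
    return used;
}
```

For a cost of 10 and the path 0, 1, 2 the buffer holds `Total cost: 10` on the first line and `Path: 0, 1, 2` on the second.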

Most scaffolding functions have been written for you, including I/O and a basic graph and priority

queue implementation. We recommend you use these, but you are welcome to write your own if you

wish; either way, you must ensure that your output is in the same form as the output the test cases give.

Task 2B Test Cases

Academic Honesty

This is an individual assignment. The work must be your own work.

While you may discuss your program development, coding problems and experimentation with your

classmates, you must not share files, as doing this without proper attribution is considered

plagiarism.

If you have borrowed ideas or taken inspiration from code and you are in doubt about whether it is

plagiarism, provide a comment highlighting where you got that inspiration.

If you refer to published work in the discussion of your experiments, be sure to include a citation to

the publication or the web link.

"Borrowing" of someone else's code without acknowledgment is plagiarism. Plagiarism is considered

a serious offense at the University of Melbourne. You should read the University code on Academic

integrity and details on plagiarism. Make sure you are not plagiarizing, intentionally or

unintentionally.

You are also advised that there will be a C programming component (on paper, not on a computer) in

the final examination. Students who do not program their own assignments will be at a disadvantage

for this part of the examination.

Late Policy

The late penalty is 20% of the available marks for that project for each working day (or part thereof)

overdue.

If you wish to apply for an extension, please review the FEIT Extensions and Special consideration

page on the subject LMS. Requests for extensions on medical grounds will need to be supported by a

medical certificate. Any request received less than 48 hours before the assessment date (or after the

date!) will generally not be accepted except in the most extreme circumstances. In general, extensions

will not be granted if the interruption covers less than 10% of the project duration. Remember that

departmental servers are often heavily loaded near project deadlines, and unexpected outages can

occur; these will not be considered as grounds for an extension.

Students who experience difficulties due to personal circumstances are encouraged to make use of

the appropriate University student support services, and to contact the lecturer, at the earliest

opportunity.

Finally, we are here to help! Frequently asked questions about the project will be answered on Ed.

Requirements: C Programming

The following implementation requirements must be adhered to:

You must write your implementation in the C programming language.

Your code should be easily extensible to multiple data structure instances. This means that the

functions for interacting with your data structures should take as arguments not only the values

required to perform the operation required, but also a pointer to a particular data structure, e.g.

search(dictionary, value) .

Your implementation must read the input file once only.

Your program should store strings in a space-efficient manner. If you are using malloc() to

create the space for a string, remember to allow space for the final end of string character, '\0'

(NUL).

Your approach should be reasonably time efficient.

Your solution should begin from the provided scaffold.
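The string storage requirement can be illustrated with a small helper that allocates exactly the space a string needs. The name copyString is hypothetical; the pattern of strlen() plus one byte for the terminator, with the malloc() result checked, is the point of the sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy source into a heap buffer of exactly the required size.
 * Returns NULL if allocation fails; the caller must free() the result. */
char *copyString(const char *source) {
    assert(source != NULL);
    size_t length = strlen(source);
    char *copy = malloc(length + 1);   /* +1 for the terminating '\0' */
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, source, length + 1);  /* copies the terminator as well */
    return copy;
}
```

A bird name read into a temporary buffer can be stored with `bird->name = copyString(buffer);`, where `bird->name` is an assumed `char *` field, followed by a NULL check and a matching free() when the data is released.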

Hints:

• If you haven't used make before, try it on simple programs first. If it doesn't work, read the error messages

carefully. A common problem in compiling multifile executables is in the included header files. Note also that

the whitespace before the command is a tab, and not multiple spaces.

• It is not a good idea to code your program as a single file and then try to break it down into multiple files.

Start by using multiple files, with minimal content, and make sure they are communicating with each other

before starting more serious coding.

Programming Style

Below is a style guide which assignments are evaluated against. For this subject, the 80 character

limit is a guideline rather than a rule - if your code exceeds this limit, you should consider whether

your code would be more readable if you instead rearranged it.

/** ***********************

* C Programming Style for Engineering Computation

* Definitions and includes

* Definitions are in UPPER_CASE

* Includes go before definitions

* Space between includes, definitions and the main function.

* Use definitions for any constants in your program, do not just write them

* in.

*

* Tabs may be set to 4-spaces or 8-spaces, depending on your editor. The code

* Below is ``gnu'' style. If your editor has ``bsd'' it will follow the 8-space

* style. Both are very standard.

* We should not comment obvious things - write code that documents itself

Some automatic evaluations of your code style may be performed where they are reliable. As

determining whether these style-related issues are occurring sometimes involves non-trivial (and

sometimes even undecidable) calculations, a simpler and more error-prone (but highly successful)

solution is used. You may need to add a comment to identify these cases, so check any failing test

outputs for instructions on how to resolve incorrectly flagged issues.

Mark Breakdown

There are a total of 10 marks given for this assignment.

Your C programs for Task 1 and 2 should be accurate, readable, and observe good C programming

structure, safety and style, including documentation. Safety refers to checking whether opening a file

returns something, whether mallocs do their job, etc. The documentation should explain all major

design decisions, and should be formatted so that it does not interfere with reading the code. As

much as possible, try to make your code self-documenting, by choosing descriptive variable names.

The remainder of the marks will be based on the correct functioning of your submission.

Note that marks related to the correctness of your code will be based on passing various tests. If your

program passes these tests without addressing the learning outcomes (e.g. if you fully hard-code

solutions or otherwise deliberately exploit the test cases), you may receive fewer marks than

suggested, but your code marks will otherwise be determined by test cases. For questions with both a

written component and a C code component, part of the mark will be given for the passing of test

cases, with the remainder from the correctness of the written answer.

Task 1 will be marked out of 4 marks, Task 2 will be marked out of 5 marks and C code quality will

comprise the final mark.

Additional Support

Your tutors will be available to help with your assignment during the scheduled workshop times.

Questions related to the assignment may be posted on the Ed discussion forum, using the folder tag

Assignments for new posts. You should feel free to answer other students' questions if you are

confident of your skills.

A tutor will check the discussion forum regularly, and answer some questions, but be aware that for

some questions you will just need to use your judgment and document your thinking.

If you have questions about your code specifically which you feel would reveal too much of the

assignment, feel free to post a private question on the discussion forum.

Most students find Academic Skills' Research Report Guide extremely valuable in constructing a well

formed and sensible analysis that makes good use of relevant material taught so far in the subject.

Acknowledgements

ChatGPT was used to help generate some graphs for Task 2A.

