COMP26020 - Lab exercise for Part III (Compilers)
Register Allocation using Graph Colouring
Background
Computer programs, regardless of the programming language, often use many more variables
than the number of variables that can fit in all CPU registers. When a program is compiled for
execution on a given processor, the compiler needs to consider what variables will stay in
registers and for how long. If we think that moving data from the memory takes several cycles,
there is a performance benefit if the compiler can minimise such transfers. How to do this? By
doing some ‘clever’ register allocation, for example, by making sure that the most frequently
used variables are placed in registers.
To understand the problem, consider the following piece of code:
In this piece of code, the programmer has used 9 variables. However, does this mean that 9
registers are needed? To find the answer, let us define the notion of a live range. For any given
variable, there is a live range that starts from the point where a value is assigned to this variable
and lasts until the last time this particular value is used. Note that if a new value is assigned to
the same variable, a new live range starts. For example, a value for r2 is defined in instruction 2.
The last time it is used is in instruction 5, hence, the live range is between 2 and 5. However, if
instruction 4 was r2=z, the live range would be from 2 to 3 and another live range would start at
instruction 4 and end at instruction 5.
To practice, you may want to find all live ranges of the code above. The answer is given:
r1:[1,9], r2:[2,5], r3:[3,8], r4:[4,5], r5:[5,7], r6:[6,7], r7:[7,8], r8:[8,9], r9:[9,9].
Live ranges are important because they indicate how many values need to be live at any given
instruction. For example, the live ranges above tell us that at instruction 6 four values need to be
live. Clearly, the maximum number of values that need to be live at any instruction indicates
how many registers we need to have so that all values (live ranges) can be placed in registers.
However, most importantly, live ranges can guide register allocation: two live ranges that do not
overlap or interfere can use the same register. For example, with the live ranges above, r2 and
r6 can share the same register as the corresponding values are needed (or are ‘live’) at different
parts of the code. Different algorithms have been developed to find how to allocate different live
ranges to registers. This problem is known as register allocation. It is an NP-complete problem,
which means that most of the different solutions proposed over the years are based on
heuristics. For additional information you can find more in Chapter 13 of the ‘Engineering a
Compiler’ recommended textbook:
Among the different approaches, register allocation using graph colouring is a common
approach. In register allocation using graph colouring, live ranges are used to create an
interference graph. In this graph, every live range corresponds to a node. There is an edge
between two nodes if the live ranges overlap. Then, register allocation becomes equivalent to
the problem of graph colouring. This is a well-known graph theory problem where the aim is to
colour all nodes of the graph so that two adjacent nodes do not share the same colour. Typically
the goal is to find the smallest number of colours. Every colour corresponds to a register and the
colour of a node corresponds to the register that should be used for a particular live range.
There are various algorithms to colour a graph. Here, we are going to focus on a simple
algorithm, which is known as top-down colouring. The algorithm works as follows:
1. Assume an ordered list of colours (eg, red, black, blue, etc)
2. Assume an interference graph, where nodes are numbered: 1, 2, 3, …
3. Rank nodes (that is, live ranges) of the interference graph according to the number of
neighbours in descending order. In case of a tie (that is, nodes with the same number of
neighbours) the node with the lowest id takes priority.
4. Follow the ranking to assign colours from the list of colours. For each node, select the
first colour from the list that is not used by the node’s neighbours.
5. Keep following the ranking and repeating step 4 until all nodes are coloured.
Your task
Use a programming language of your choice to implement a program that:
reads a file that lists an interference graph (input).
writes a file that lists colours for every node of the graph (output).
The list of colours is given by the (upper-case) letters of the alphabet: A, B, C, …, Z. There is a
total of 26 colours (or registers).
Input file specification:
A number of lines equal to the number of nodes of the interference graph. Every line contains
the number of the node (in ascending order) and the numbers of all nodes with which there is
an interference (not necessarily in ascending order). Example:
1 2 3 4
2 1 4
3 1
4 1 2
This means that node 1 interferes with nodes 2, 3, 4. Node 2 interferes with nodes 1 (we knew
this already) and 4. Node 3 interferes with node 1 and node 4 interferes with nodes 1,2.
You can assume that there are no more than 50 nodes, nodes are ordered and every node has
at least one neighbour.
Output file specification:
For every node (in ascending order), write node number and colour. For the example above:
(other examples may be available on BB – you can also create and post your own if you wish)
You need to make sure that you follow the specifications as most of the marking will take place
automatically. Your program should take two command line arguments, which indicate the name
of the input file and the name of the output file. E.g.:
% <myprogram> input.txt output.txt
You should be able to complete this task in two lab sessions. The deadline for submission is set
after the 2nd lab session and before the 3rd lab session. You should submit through gitlab
(26020-lab4-S-Compilers). Your submission should include source file(s) and a readme file
with instructions on how to compile and run your code.
Marking (out of 10) will take place according to the following scheme:
2 marks for producing the output of the example above correctly.
3 marks for code readability and sensible comments/explanation.
5 marks for finding the output of additional, automated tests correctly.
Note that you are expected to mark this exercise during scheduled lab sessions.
版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。