Date: 2023-09-09 09:43

Project 1 EECS 370 (Fall 2023)

Worth: 100 points

Assigned: Thursday, August 31st, 2023

Part 1a Due: 11:55 PM ET, Thursday, September 14th, 2023

Part 1s & 1m Due: 11:55 PM ET, Thursday, September 21st, 2023

0. Starter Code

starter_1a.tar.gz files Description

Makefile Makefile to compile the project

spec.as Spec test case assembly file

spec.mc.correct Correct machine code output for spec test case

starter_assembler.c Starter code for the LC-2K assembler




starter_1s.tar.gz files Description

Makefile Makefile to compile the project

spec.mc Spec test case machine code file; this is the same as spec.mc.correct from P1A

spec.out.correct Correct output for spec test case (note that your simulator should write to standard out)

starter_assembler.c Starter code for the LC-2K simulator

There is no starter code for project 1M, the assembly multiplication program.

Feel free to use wget and tar as follows:

$ wget https://eecs370.github.io/project_1_spec/starter_1a.tar.gz

Saving to: ‘starter_1a.tar.gz’


1. Purpose

This is a three-part project in which you will write the following:

Project: 1A - The LC2K Assembler

Description: For project 1A, you will write a C program which takes as input an LC2K assembly file (denoted with *.as ) and outputs its correct machine code representation into a machine code file (denoted with *.mc ).

Required File(s) for Submission: assembler.c, and a suite of test assembly files ending in *.as to be run against your assembler and buggy instructor assemblers.

Project: 1S - The LC2K Simulator

Description: For project 1S, you will write a C program which simulates the LC2K ISA, with a given machine code file as input. It will output the simulation to stdout.

Required File(s) for Submission: simulator.c, and a suite of test assembly files ending in *.as . These test files will first be assembled by the instructor assembler, and then run against your simulator and buggy instructor simulators.

$ wget https://eecs370.github.io/project_1_spec/starter_1s.tar.gz

Saving to: ‘starter_1s.tar.gz’


Project: 1M - LC2K Multiplication

Description: For project 1M, you will write an LC2K assembly program which multiplies two positive 15-bit numbers.

Required File(s) for Submission: mult.as


2. LC-2K Instruction Set Architecture

Before we dive into project specifics, it is important to understand the LC2K (Little Computer 2000) Instruction Set Architecture. In this and several future projects, you will gradually “build” out the LC-2K toolchain and LC-2K simulators. The LC-2K instruction set is very simple, but it is general enough to solve complex problems. To complete project 1’s three parts, you only need to know the LC-2K Instruction Set Architecture.

In general, an instruction set architecture defines how a programmer can use the processor, and what operations the processor supports.

The LC-2K ISA is a RISC architecture (Reduced Instruction Set Computer): this means that it supports simpler operations. Note that the ISA defines both the assembly language and the machine code. An assembly language is a low-level programming language that closely relates to the underlying machine code. Each line of assembly code can be assembled into 1 line of machine code, which looks like a bunch of numbers. The machine code is a representation of assembly code that is usable by the computer.

Pro tip: LC2K assembly files ( *.as ) and LC2K machine code files ( *.mc ) are plain-text files, meaning you should be able to edit and view them in a text editor.

LC2K assembly files can also use the ( *.s ) and ( *.lc2k ) file extensions. This is helpful for students who use XCode and cannot open ( *.as ) files.

Important facts about the LC-2K ISA:

There are 8 registers (registers 0 through 7)

Each address is 32 bits

Each address stores a word (a word is 4 bytes, which is also 32 bits)

LC-2K has 65536 words of memory

By assembly-language convention, register 0 always has a value of 0 (this is technically not enforced, but no assembly-language program should change register 0 from its initial value of 0)

The machine code file contains the actual values stored in memory (that is, the assembled assembly code). Specifically, we assume that the first line of the machine code file represents the 0th address. Our assembly language also supports the use of symbolic addresses (labels) and assembler directives. These higher-level operations specify how the assembler should handle the input assembly language and are not visible in the machine code translation after assembly.

2.1. Description of LC-2K Instructions



Each entry below gives the assembly-language name for the instruction, its opcode in binary, and the action it performs.

add (opcode 0b000): Add contents of regA with contents of regB , store results in destReg .

nor (opcode 0b001): Nor contents of regA with contents of regB , store results in destReg . This is a bitwise nor; each bit is treated independently.

lw (opcode 0b010): “Load Word”; Load regB from memory. Memory address is formed by adding offsetField with the contents of regA . Behavior is defined only for memory addresses in the range [0, 65535].

sw (opcode 0b011): “Store Word”; Store regB into memory. Memory address is formed by adding offsetField with the contents of regA . Behavior is defined only for memory addresses in the range [0, 65535].

beq (opcode 0b100): “Branch if equal”; If the contents of regA and regB are the same, then branch to the address PC+1+offsetField , where PC is the address of this beq instruction.

jalr (opcode 0b101): “Jump and Link Register”; First store the value PC+1 into regB , where PC is the address where this jalr instruction is defined. Then branch (set PC) to the address contained in regA . Note that this implies if regA and regB refer to the same register, the net effect will be jumping to PC+1 .

halt (opcode 0b110): Increment the PC (as with all instructions), then halt the machine (let the simulator notice that the machine halted).

noop (opcode 0b111): “No Operation (pronounced no op)”; Do nothing.
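The bitwise nor described above can be illustrated in C. This is only a sketch: the function names lc2kNor , notViaNor , and andViaNor are hypothetical and do not come from the starter code. It also shows how NOT and AND could be derived from nor alone, which is the only bitwise operation LC-2K provides.

```c
#include <stdint.h>

/* Bitwise NOR of two 32-bit words: each result bit is 1 only when the
 * corresponding bits of a and b are both 0. (Hypothetical helper name.) */
int32_t lc2kNor(int32_t a, int32_t b) {
    return ~(a | b);
}

/* NOT built from NOR: nor(a, a) flips every bit of a. */
int32_t notViaNor(int32_t a) {
    return lc2kNor(a, a);
}

/* AND built from NOR (De Morgan): a & b == nor(~a, ~b). */
int32_t andViaNor(int32_t a, int32_t b) {
    return lc2kNor(notViaNor(a), notViaNor(b));
}
```

For instance, andViaNor(25, 11) gives 9, matching the (25 & 11) example in Appendix A.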

2.2. Description of LC-2K assembly language

An LC-2K assembly file ( *.as ) is made up of multiple lines of assembly. Each line represents the assembly intended to be stored at that address. For example, the first line of the assembly file represents what is going to go in address 0. The second line of the assembly file is what goes in address 1, and so on.

An assembly file needs to be assembled into a machine code file before it is executed by an LC-2K simulator.

An LC-2K machine code file ( *.mc ) is made up of multiple lines of integers. Each integer in the machine code file represents the value stored at that address in memory; the first line of the machine code file represents the value of address 0 when the program begins.
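To make the "line N holds the value of address N" rule concrete, here is a minimal sketch that parses machine-code text into a memory array. The function name loadMachineCode and its parameters are assumptions made for illustration; the starter code may read the file differently (for example, directly from a FILE pointer).

```c
#include <stdint.h>
#include <stdlib.h>

/* Parse whitespace/newline-separated decimal integers from `text` into
 * `memory`. The first number goes to address 0, the second to address 1,
 * and so on. Returns the number of words stored (at most maxWords). */
int loadMachineCode(const char *text, int32_t *memory, int maxWords) {
    int count = 0;
    const char *cursor = text;
    while (count < maxWords) {
        char *end;
        long value = strtol(cursor, &end, 10);
        if (end == cursor) {
            break;                       /* no more numbers to read */
        }
        memory[count] = (int32_t)value;  /* line N goes to address N */
        count++;
        cursor = end;
    }
    return count;
}
```

The returned count corresponds to the number of lines in the machine-code file, assuming one integer per line.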

2.2.1. LC-2K assembly language syntax

Each line of LC-2K assembly is formatted in the following way:

label whitespace opcode whitespace field0 whitespace field1 whitespace field2 whitespace comment

Each line of assembly may have the following fields:

label (required: N): The leftmost field on a line is the label field. Valid labels contain a maximum of 6 characters and can consist of letters and numbers (but must start with a letter). The label is optional (but a line without a label must have whitespace before the opcode). Labels make it much easier to write assembly-language programs. Without labels you would need to modify all numeric address fields each time you added a line to your assembly-language program! Labels that appear in the label field are considered ‘defined’.

opcode (required: Y): The opcode field has one of eight LC-2K opcodes (Ex: add or nor ); it can also have directives for the assembler (Ex: .fill ), see the section on LC-2K assembler directives.

field0 (required: depends on the instruction type): Depending on the instruction type, field0 is ignored, or is a register.

field1 (required: depends on the instruction type): Depending on the instruction type, field1 is ignored, or is a register.

field2 (required: depends on the instruction type): Depending on the instruction type, field2 is ignored, is a register, a numeric address, or a symbolic address (represented by a label).

comment (required: N): The comment field is ignored.

2.2.2. LC-2K assembly language instruction types

Here are the instruction types, with a description of the associated fields for each instruction type.

Fields that are not required are ignored by the assembler:



R-type instructions ( add , nor ): opcode , field0 , field1 , and field2 are required fields:

field0 is a register (regA)

field1 is a register (regB)

field2 is a register (destReg)

I-type instructions ( lw , sw , beq ): opcode , field0 , field1 , and field2 are required fields:

field0 is a register (regA)

field1 is a register (regB)

field2 is either a numeric address, or a symbolic address (represented by a label)

J-type instructions ( jalr ): opcode , field0 , and field1 are required fields:

field0 is a register (regA)

field1 is a register (regB)

O-type instructions ( noop , halt ): Only the opcode field is required.

2.2.3. LC-2K assembler directives

In addition to LC-2K instructions, an assembly-language program may contain directions for the assembler.

The only assembler directive we will use is .fill (note the leading period).

The .fill assembler directive tells the assembler to put an integer into the place where the instruction would normally be stored.

.fill instructions use one field, which can be either a numeric value or a symbolic address.

For example, .fill 32 puts the value 32 where the instruction would normally be stored. .fill with a symbolic address will store the address of the label.

spec.as (the last line uses .fill with a symbolic address)

        lw      0       1       five    load reg1 with 5 (symbolic address) - note that this instruction is at address 0
        lw      1       2       3       load reg2 with -1 (numeric address)
start   add     1       2       1       decrement reg1
        beq     0       1       2       goto end of program when reg1==0
        beq     0       0       start   go back to the beginning of the loop
        noop
done    halt                            end of program
five    .fill   5
neg1    .fill   -1
stAddr  .fill   start                   will contain the address of start (2)

In the spec example, .fill start will store the value 2, because the label start is at address 2.

The bounds of the numeric value for .fill instructions are -2^31 to 2^31 - 1 (-2147483648 to 2147483647).
2.2.4. LC-2K symbolic addresses and labels

I-type instructions and the .fill directive can use defined labels as arguments. Remember that labels are used to “book-mark” lines of assembly. They provide a way of symbolically indicating a line of assembly (which in turn represents an address) without using its numeric value. They are incredibly useful when doing assembly programming. Remember, in our assembly files we assume that the first instruction is at address 0.

When a label is used in place of a numeric address in field2 for I-type instructions, we say that the instruction is using a symbolic address. (The address it refers to is not static, but is instead wherever that label is defined.)

When used with a lw or sw instruction, a label indicates you want to load or store from that label’s address.

When used with a beq instruction, a label indicates you want to branch to that label’s address.

When a label is used in place of a number in field0 for a .fill assembler directive, you are to resolve the label’s value, and use that value for the fill.

Take a look at the spec example for project 1A:

spec.as

        lw      0       1       five    load reg1 with 5 (symbolic address)
        lw      1       2       3       load reg2 with -1 (numeric address)
start   add     1       2       1       decrement reg1
        beq     0       1       2       goto end of program when reg1==0
        beq     0       0       start   go back to the beginning of the loop
        noop
done    halt                            end of program
five    .fill   5
neg1    .fill   -1
stAddr  .fill   start                   will contain the address of start (2)

Notice how we define the labels start , done , five , neg1 , and stAddr . Remember from section 2.2 that each line of assembly represents an address. Thus we say the following:

The label start resolves to a value of 2, since it is defined on the 3rd line, which relates to

address 2 (We count starting by 0 for addresses, but by 1 for line numbers).

The label done resolves to a value of 6

The label five resolves to a value of 7

The label neg1 resolves to a value of 8

The label stAddr resolves to a value of 9

Furthermore, in the spec example for project 1A, there are a few usages of labels as arguments: the lw at address 0 uses five , the beq at address 4 uses start , and the .fill at address 9 uses start .

See the handling labels section (3.1) to see how your assembler should assemble lines of assembly that use symbolic labels into machine code.

2.3. LC-2K Machine Code Instruction Formats


An LC-2K machine code file ( *.mc ) is made up of multiple lines of decimal numbers. Each line of the

machine code file represents the number stored at that address in the memory. For example, the

first line of the machine code file represents the value of address 0 when the program begins.

Bits 31-25 are unused for all instructions, and should always be 0. Bit 0 is the least-significant bit.

R-type instructions ( add , nor )

bits 24-22: opcode

bits 21-19: reg A

bits 18-16: reg B

bits 15-3: unused (should all be 0)

bits 2-0: destReg

I-type instructions ( lw , sw , beq )

bits 24-22: opcode

bits 21-19: reg A

bits 18-16: reg B

bits 15-0: offsetField (a 16-bit, 2’s complement number with a range of -32768 to 32767)

J-type instructions ( jalr )

bits 24-22: opcode

bits 21-19: reg A

bits 18-16: reg B

bits 15-0: unused (should all be 0)

O-type instructions ( halt , noop )

bits 24-22: opcode

bits 21-0: unused (should all be 0)
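The bit layouts above translate directly into shifts and ORs. The following is a minimal sketch, not the starter code's approach; the function names encodeRType and encodeIType are hypothetical. The opcode numbers passed in are the binary opcodes from section 2.1 (for example, 0 for add, 2 for lw, 4 for beq).

```c
#include <stdint.h>

/* R-type: opcode in bits 24-22, regA in bits 21-19, regB in bits 18-16,
 * destReg in bits 2-0. All other bits stay 0. */
int32_t encodeRType(int opcode, int regA, int regB, int destReg) {
    return (opcode << 22) | (regA << 19) | (regB << 16) | destReg;
}

/* I-type: same upper fields, offsetField in bits 15-0.
 * Masking with 0xFFFF keeps only the low 16 bits, so a negative offset
 * does not spill into the register and opcode fields. */
int32_t encodeIType(int opcode, int regA, int regB, int offsetField) {
    return (opcode << 22) | (regA << 19) | (regB << 16)
         | (offsetField & 0xFFFF);
}
```

With these helpers, encodeRType(0, 1, 2, 1) yields 655361 and encodeIType(2, 0, 1, 7) yields 8454151, which are the values at addresses 2 and 0 of the spec example.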

3. LC-2K Assembly Language and Assembler


The first part of this project is to write a program to take an assembly-language program and translate it into machine language. You will translate assembly-language names for instructions, such as beq, into their numeric equivalent (e.g. 0b100), and you will translate symbolic names for addresses into numeric values. The final output will be a series of 32-bit instructions (instruction bits 31-25 are always 0).

The assembler should make two passes over the assembly-language program. In the first pass, it will calculate the address for every symbolic label. Assume that the first instruction is at address 0. In the second pass, it will generate a machine-language instruction (in decimal) for each line of assembly language. For example, here is an assembly-language program (that counts down from 5, stopping when it hits 0).

spec.as

        lw      0       1       five    load reg1 with 5 (symbolic address)
        lw      1       2       3       load reg2 with -1 (numeric address)
start   add     1       2       1       decrement reg1
        beq     0       1       2       goto end of program when reg1==0
        beq     0       0       start   go back to the beginning of the loop
        noop
done    halt                            end of program
five    .fill   5
neg1    .fill   -1
stAddr  .fill   start                   will contain the address of start (2)

And here is the corresponding machine language:

spec.as's corresponding machine language, with addresses and hex | Your output should NOT look like this

!! Your output should only include the machine code in decimal !!
!! The address and hex information is for your understanding !!
!! See spec.mc.correct for what your output should look like !!

(address 0): 8454151 (hex 0x810007)
(address 1): 9043971 (hex 0x8a0003)
(address 2): 655361 (hex 0xa0001)
(address 3): 16842754 (hex 0x1010002)
(address 4): 16842749 (hex 0x100fffd)
(address 5): 29360128 (hex 0x1c00000)
(address 6): 25165824 (hex 0x1800000)
(address 7): 5 (hex 0x5)
(address 8): -1 (hex 0xffffffff)
(address 9): 2 (hex 0x2)

Be sure you understand how the above assembly-language program got translated to machine language.

Since your programs will always start at address 0, your program should only output the memory contents in decimal and not output the addresses.

spec.mc.correct for P1A / spec.mc for P1S

8454151
9043971
655361
16842754
16842749
29360128
25165824
5
-1
2
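The first pass described in this section amounts to building a table that maps each defined label to its address. Below is a minimal sketch of such a table. Everything here is hypothetical: the names LabelTable , findLabel , and defineLabel , and the MAX_LABELS capacity, are assumptions chosen for illustration and are not part of the starter code.

```c
#include <string.h>

#define MAX_LABELS 100      /* hypothetical capacity for this sketch */
#define MAX_LABEL_LEN 6     /* labels contain at most 6 characters */

struct LabelTable {
    char names[MAX_LABELS][MAX_LABEL_LEN + 1];
    int addresses[MAX_LABELS];
    int count;
};

/* Returns the address of `name`, or -1 if the label is not defined. */
int findLabel(const struct LabelTable *table, const char *name) {
    for (int i = 0; i < table->count; i++) {
        if (strcmp(table->names[i], name) == 0) {
            return table->addresses[i];
        }
    }
    return -1;
}

/* First pass: record that `name` is defined at `address`.
 * Returns 0 on success, or -1 if the label is a duplicate, is longer
 * than 6 characters, or the table is full. */
int defineLabel(struct LabelTable *table, const char *name, int address) {
    if (strlen(name) > MAX_LABEL_LEN || table->count >= MAX_LABELS) {
        return -1;
    }
    if (findLabel(table, name) != -1) {
        return -1;              /* duplicate definition */
    }
    strcpy(table->names[table->count], name);
    table->addresses[table->count] = address;
    table->count++;
    return 0;
}
```

In the second pass, a lookup that returns -1 would correspond to the "use of undefined labels" error, and a failed defineLabel call in the first pass to a duplicate definition.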

3.1. Handling labels

For lw or sw instructions, the assembler should compute offsetField to be equal to the address of the label. This could be used with a zero base register to refer to the label, or could be used with a non-zero base register to index into an array starting at the label. For beq instructions, the assembler should translate the label into the numeric offsetField needed to branch to that label.
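The two label rules can be written as one-line computations. This is a sketch only; the function names offsetForLoadStore and offsetForBranch are hypothetical. The branch formula follows from the beq definition in section 2.1: the target is PC+1+offsetField , so offsetField is the target minus (PC+1).

```c
/* lw / sw: offsetField is the label's address itself. */
int offsetForLoadStore(int labelAddress) {
    return labelAddress;
}

/* beq: target = PC + 1 + offsetField, so
 * offsetField = target - (PC + 1), where pc is the address of the beq. */
int offsetForBranch(int pc, int labelAddress) {
    return labelAddress - (pc + 1);
}
```

In the spec example, the beq at address 4 branches to start (address 2), so offsetForBranch(4, 2) is -3, which is why the machine code at address 4 ends in 0xfffd.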


3.2. Your Assembler’s Input and Outputs

Write your program to take two command-line arguments. The first argument is the file name where

the assembly-language program is stored, and the second argument is the file name where the

output (the machine-code) is written. For example, with a program name of assemble , an

assembly-language program in program.as , the following would generate a machine-code file

program.mc :

./assemble program.as program.mc

Note that the format for running the command must use command-line arguments for the file names

(rather than standard input and standard output). Your program should store only the list of decimal

numbers in the machine-code file, one instruction per line. The decimal numbers will range from

-2^31 to 2^31 - 1 (-2147483648 to 2147483647). Any deviation from this format (e.g. extra

spaces or empty lines) will render your machine-code file ungradeable. Any other output that you

want the program to generate (e.g. debugging output) can be printed to standard output.

Note: to compile your assembler, see Appendix B: Makefile Tips.

3.3. Error Checking

Your assembler should catch the following errors in the assembly-language program:

Use of undefined labels

Duplicate definition of labels

offsetFields that don’t fit in 16 bits

Unrecognized opcodes

Non-integer register arguments

Registers outside the range [0, 7]

Your assembler should exit(1) if it detects an error and exit(0) if it finishes without detecting

any errors. Your assembler should NOT catch simulation-time errors, i.e. errors that would occur at

the time the assembly-language program executes (e.g. branching to address -1, infinite loops,

etc.). You are not required to output any specific output when an error is encountered.
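Two of the checks listed above (register arguments and offsetField range) can be sketched as small predicates. The names isValidRegister and fitsInOffsetField are hypothetical, and this sketch uses strtol rather than the isNumber helper from the starter code.

```c
#include <stdlib.h>

/* Returns 1 if `text` is a decimal integer in the range [0, 7], else 0.
 * Rejects empty strings and strings with trailing non-digit characters. */
int isValidRegister(const char *text) {
    char *end;
    long value = strtol(text, &end, 10);
    if (end == text || *end != '\0') {
        return 0;               /* empty or non-integer argument */
    }
    return value >= 0 && value <= 7;
}

/* Returns 1 if `value` fits in a 16-bit 2's complement offsetField. */
int fitsInOffsetField(long value) {
    return value >= -32768 && value <= 32767;
}
```

An assembler built this way would call exit(1) whenever either predicate returns 0 for an argument that must satisfy it.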

3.4. Test Cases

An integral (and graded) part of writing your assembler will be to write a suite of test cases to

validate any LC-2K assembler. Writing thorough and robust test suites is a common practice at

real-world software companies. Writing a comprehensive suite of test cases will deepen your

understanding of the project specification and your program, and it will help you a lot as you debug

your program. Moreover, staff will have a much easier time helping you identify issues in your

project if you have test cases that are producing incorrect output on your implementation.

The test cases for the assembler part of this project will be short assembly-language programs that

serve as input to an assembler. You will submit your suite of test cases together with your

assembler, and we will grade your test suite according to how thoroughly it exercises an assembler.

Each test case may be at most 50 lines long, and your test suite may contain up to 20 test cases.

These limits are much larger than needed for full credit (the solution test suite is composed of 5 test

cases, each < 10 lines long). See Section 6 for how your test suite will be graded.

Note: All instructions should appear before any .fill ’s. Instructions and .fill ’s should not be interleaved: every .fill line should come after the last instruction in the file. The spec example follows this layout, with instructions at addresses 0-6 and .fill ’s at addresses 7-9.

This won’t be enforced for project 1, but assembly programs with interleaved instructions and .fill ’s will not work properly in project 2. Note that many students like to reuse their project 1 assembly tests for project 2.

Hints: The spec assembly-language program is a good case to include in your test suite, though you’ll need to write more test cases to get full credit. Remember to create some test cases that test the ability of an assembler to check for the errors in Section 3.3.

3.5. Assembler Hints

Since offsetField is a 2’s complement number, it can only store numbers ranging from -32768 to

32767. For symbolic addresses, your assembler will compute offsetField so that the instruction

refers to the correct label.

Remember that offsetField is only a 16-bit 2’s complement number. Since Linux integers are 32 bits, you’ll have to chop off all but the lowest 16 bits for negative values of offsetField . Consider where a value being negative is significant. See the provided static inline int isNumber(char *string) function.

4. Behavioral Simulator (40%)

The second part of this assignment is to write a program that can simulate any legal LC-2K machine-code program. The input for this part will be the machine-code file that you created with your assembler. With a program name of simulate and a machine-code file of program.mc , your program should be run as follows:

./simulate program.mc > output

This directs all printfs to the file output .

The simulator should begin by initializing all registers and the program counter to 0. The simulator will then simulate the program until the program executes a halt.

The simulator should call the printState function before executing each instruction and once just before exiting the program. This function prints the current state of the machine (program counter, registers, memory). printState will print the memory contents for memory locations defined in the machine-code file (addresses 0-9 in the spec example).

IMPORTANT: Test case names must NOT contain spaces. Any test case with a space in its name will not be graded. For example, “tes t.as” is incorrectly formatted.

4.1 Simulator Behavior

The purpose of the simulator is to keep a record of the current state of our registers and memory.

Before each instruction is executed, a call to printState will be made, showing the values of your

program’s memory and registers. The input for the simulator will be a machine code file, meaning

you will need to parse the input and determine what actions to take.

Consider the following machine code: 655363

The same number, but in binary: 0b 0000 0000 0000 1010 0000 0000 0000 0011

From here, we can determine the opcode and all other arguments. Recall that all numbers are binary under the hood, so we can implicitly think about the machine code in binary (even though it is given to us in decimal notation).

Looking at positions 24-22, the opcode bits are 000, implying it is an ADD instruction. Per Section 2, ADD has 3 arguments:

RegA: which is bits 21-19, is 0b001, or 1

RegB: which is bits 18-16, is 0b010, or 2

DestReg: which is bits 2-0, is 0b011, or 3

Therefore, we know this line of machine code is trying to do:

Register 3 = Register 1 + Register 2

Note: we are adding the register’s values, not their names.
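The field extraction walked through above can be sketched with shifts and masks. The struct and function names below ( DecodedInstruction , decodeInstruction ) are hypothetical and are not part of the starter code. The sketch also sign-extends the low 16 bits, which matters for lw , sw , and beq ; it does this inline rather than calling the starter code's convertNum .

```c
#include <stdint.h>

struct DecodedInstruction {
    int opcode;       /* bits 24-22 */
    int regA;         /* bits 21-19 */
    int regB;         /* bits 18-16 */
    int destReg;      /* bits 2-0, meaningful for add / nor */
    int offsetField;  /* bits 15-0 sign-extended, for lw / sw / beq */
};

struct DecodedInstruction decodeInstruction(int32_t word) {
    struct DecodedInstruction d;
    d.opcode = (word >> 22) & 0x7;
    d.regA = (word >> 19) & 0x7;
    d.regB = (word >> 16) & 0x7;
    d.destReg = word & 0x7;
    int low16 = word & 0xFFFF;
    /* If bit 15 is set, the 16-bit value is negative: subtract 2^16. */
    d.offsetField = (low16 & 0x8000) ? low16 - 0x10000 : low16;
    return d;
}
```

Decoding 655363 this way gives opcode 0, regA 1, regB 2, and destReg 3, matching the walkthrough. Decoding 16842749 (the beq at address 4 of the spec example) gives an offsetField of -3.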

4.2 Test Cases

As with the assembler, you will write a suite of test cases to validate any LC-2K simulator.

The test cases for the simulator part of this project will be short, valid assembly-language programs

that, after being assembled into machine code, serve as input to a simulator. You will submit your

suite of test cases together with your simulator, and we will grade your test suite according to how

thoroughly it exercises an LC-2K simulator. Each test case may be at most 50 lines and may

execute at most 200 instructions on a correct simulator, and your test suite may contain up to 20 test

cases. These limits are much larger than needed for full credit (the solution test suite is composed of

a couple test cases, each executing less than 40 instructions). See Section 6 for how your test suite

will be graded.

4.3 Simulator Hints

Be careful how you handle offsetField for lw , sw , and beq . Remember that it’s a 2’s complement 16-bit number, so you need to convert a negative offsetField to a negative 32-bit integer on the Linux workstations (by sign extending it). One way to do this is to use the following function, also given in the starter code:

convertNum function from starter code

static inline int convertNum(int32_t);

// convert a 16-bit number into a 32-bit Linux integer
static inline int convertNum(int32_t num) {
    return num - ( (num & (1<<15)) ? 1<<16 : 0 );
}

An example run of the simulator (not for the specified task of multiplication) is included in the starter code for Project 1S in the file spec.out.correct .

Warning: Behavior is defined only for accesses to memory addresses in the range [0, 65535]. In your test cases, do not access memory addresses outside of this range with LW or SW instructions. This is NOT one of the errors you are required to check for, but assembly programs with undefined behavior may execute differently on your simulator than on our reference simulator.

5. Assembly-Language Multiplication (20%)

The third part of this assignment is to write an assembly-language program to multiply two numbers. Input the numbers by reading memory locations called mcand and mplier . The result should be stored in register 1 when the program halts. You may assume that the two input numbers are at most 15 bits and are positive; this ensures that the (positive) result fits in an LC-2K word.

Remember that shifting left by one bit is the same as adding the number to itself. Given the LC-2K instruction set, it’s easiest to modify the algorithm so that you avoid the right shift. Submit a version of the program that computes 6203 × 1429.

Your multiplication program must be reasonably efficient: it must be at most 50 lines long and execute at most 1000 instructions for any valid input (this is several times longer and slower than the solution). To achieve this, you are strongly encouraged to consider using a loop and shift algorithm to perform the multiplication; algorithms such as successive addition (e.g. multiplying 5 × 6 by adding 5 six times) will take too long.
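The loop-and-shift idea can be sketched in C before translating it to LC-2K assembly. This is an illustration under stated assumptions, not the reference solution: the function name multiplyShiftAdd is hypothetical, and the sketch uses C's & operator for the bit test, whereas an LC-2K program would have to build that test from nor . It avoids the right shift by doubling a mask (and the multiplicand) with addition.

```c
#include <stdint.h>

/* Multiply two positive numbers of at most 15 bits using only addition
 * and a bit test. Each iteration examines one bit of mplier, starting
 * from bit 0, and doubles both mcand and the mask by adding them to
 * themselves (a left shift by one). */
int32_t multiplyShiftAdd(int32_t mcand, int32_t mplier) {
    int32_t result = 0;
    int32_t mask = 1;
    for (int bit = 0; bit < 15; bit++) {
        if ((mplier & mask) != 0) {
            result += mcand;    /* this bit of mplier is set */
        }
        mcand += mcand;         /* shift left by one */
        mask += mask;           /* move to the next bit */
    }
    return result;
}
```

The loop always runs 15 times regardless of the inputs, so the instruction count stays bounded, unlike successive addition, whose running time grows with the value of the multiplier.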

6. Grading, Auto-Grading, and Formatting

We will grade primarily on functionality, including error handling, correctly assembling and simulating

all instructions, input and output format, method of executing your program, correctly multiplying,

and comprehensiveness of the test suites.

To help you validate your project, your submission will be graded automatically after submission.

You may then continue to work on the project and re-submit. To deter you from using the autograder

as a debugger, you will receive feedback from the autograder only for the first THREE

SUBMISSIONS on any given day. That is, you will receive feedback with your score only three times

on any given day. All subsequent submissions will be silently graded. Your final score will be derived

from your overall best submission to the autograder.

The feedback from the autograder will not be very illuminating; it won’t tell you where your problem

is or give you the test programs. The purpose of the autograder is to let you know that you should

keep working on your project (rather than thinking it’s perfect and ending up with a 0). The best way

to debug your program is to generate your own test cases, figure out the correct answers, and

compare your program’s output to the correct answer. This is also one of the best ways to learn the

concepts in the project.

The student suite of test cases for the assembler and simulator parts of this project will be graded

according to how thoroughly they test an LC-2K assembler or simulator. We will judge thoroughness

of the test suite by how well it exposes potential bugs in an assembler or simulator.

For the assembler test suite, the auto-grader will use each test case as input to a set of buggy

assemblers. A test case exposes a buggy assembler by causing it to generate a different answer

from a correct assembler. The test suite is graded based on how many of the buggy assemblers

were exposed by at least one test case. This is known as mutation testing in the research literature

on automated testing. Your test suite is run on 19 buggy assemblers. To receive all Mutation Testing

points, your test suite must expose at least 15 of the 19 buggy assemblers.

For the simulator test suite, the auto-grader will correctly assemble each test case, then use it as

input to a set of buggy simulators. A test case exposes a buggy simulator by causing it to generate a

different answer from a correct simulator. The test suite is graded based on how many of the buggy

simulators were exposed by at least one test case. Your test suite is run on 10 buggy simulators.

To receive all Mutation Testing points, your test suite must expose at least 7 of the 10 buggy

simulators. Note that the test cases for the simulator should all be valid, correct assembly-language programs.


Because all programs will be auto-graded, you must be careful to follow the exact formatting rules in

the project description:

1. (assembler) Follow exactly the format for inputting the assembly-language program and

outputting the machine-code file.

2. (assembler) Call exit(1) if you detect errors in the assembly-language program. Call

exit(0) if you finish without detecting errors.

3. (simulator) Don’t modify printState or stateStruct at all. Download this code into your

program electronically (don’t re-type it) to avoid typos.

4. (simulator) Call printState exactly once before each instruction executes and once just

before the simulator exits. Do not call printState at any other time.

5. (simulator) Don’t print the sequence “@@@” anywhere (except where the provided

printState function prints it).

6. (simulator) state.numMemory must be equal to the number of lines in the machine-code file.

7. (simulator) Initialize all registers to 0.

8. (multiplication) Store the result in register 1.

9. (multiplication) The two input numbers must be in locations labeled mcand and mplier


7. Turning in the Project

Use autograder.io to submit your files.

Here are the files you should submit for each project part:

1) assembler (part 1a)

a. Your assembler, a C program named "assembler.c"

b. Suite of test cases (each test case is an assembly-language program in a separate file, ending in ".as")

2) simulator (part 1s)

a. Your simulator, a C program named "simulator.c"

b. Suite of test cases (each test case is an assembly-language program in a separate file, ending in ".as")

3) multiplication (part 1m)

a. Your assembly program for multiplication, named "mult.as"

Your code will be compiled with the GCC compiler using the C99 standard. Use the provided makefile to compile your programs. See Appendix B: Makefile Tips.

The official time of submission for your project will be the time the last file is sent. If you send in anything after the due date, your project will be considered late (and will use up your late days). If you have already used up all of your late days, additional late submissions will not be scored for your project grade.

Appendix A: C Programming Tips

Here are a few programming tips for writing C programs to manipulate bits:

1) To indicate a hexadecimal constant in C, precede the number by 0x . For example, 27 decimal is 0x1b in hexadecimal.

2) The value of the expression (a >> b) is the number “a” shifted right by “b” bits. Neither a nor b are changed. E.g. (25 >> 2) is 6. Note that 25 is 11001 in binary, and 6 is 110 in binary.

3) The value of the expression (a << b) is the number “a” shifted left by “b” bits. Neither a nor b are changed. E.g. (25 << 2) is 100. Note that 25 is 11001 in binary, and 100 is 1100100 in binary.

4) To find the value of the expression (a & b) , perform a logical AND on each bit of a and b (i.e. bit 31 of a ANDed with bit 31 of b, bit 30 of a ANDed with bit 30 of b, etc.). E.g. (25 & 11) is 9, since:

  11001 (binary)
& 01011 (binary)
= 01001 (binary), which is 9 decimal.

5) To find the value of the expression (a | b) , perform a logical OR on each bit of a and b (i.e. bit 31 of a ORed with bit 31 of b, bit 30 of a ORed with bit 30 of b, etc.). E.g. (25 | 11) is 27, since:

  11001 (binary)
| 01011 (binary)
= 11011 (binary), which is 27 decimal.

6) ~a is the bit-wise complement of a (a is not changed).

Use these operations to create and manipulate machine-code. For example:

To look at bit 3 of the variable a, you might do: (a>>3) & 0x1

To look at bits 15-12 of a 16-bit word, you could do: (a>>12) & 0xF

To put a 6 into bits 5-3 and a 3 into bits 2-1, you could do: (6<<3) | (3<<1)

If you’re not sure what an operation is doing, print some intermediate results to help you debug.

7) To print in C, use the printf() function (this is included in stdio.h). We can print a message like such: printf("Hello world\n"); If we want to print values, we need to include their types in the output string:

use "%d" for an int value

use "%c" for a char value

use "%s" for a string value

use "%f" for a floating-point value (float or double)

When we want to print a value, we need to include these identifiers, alongside the value of the variable as an argument of printf().

int num = 370;
char name[5] = "EECS";
printf("Welcome to %s %d\n", name, num); //This prints "Welcome to EECS 370".

8) Please review discussion 1 for more C information.

Appendix B: Makefile Tips

You can use the provided Makefile in the starter code by doing

$ make <rule>

Where <rule> is a rule that is defined in the makefile.

Open up the Makefile and see what <rule> ’s we have already written for you; they will have the following format:

targets : prerequisites
	recipe

(From https://www.gnu.org/software/make/manual/html_node/Rule-Syntax.html#Rule-Syntax)

Example 1 : Compiling the assembler executable

In our provided makefile we see that we define the rule assembler below. This rule relies on assembler.c. The recipe compiles the assembler.c file and creates the assembler executable. Feel free to explore the Makefile and see what other rules we provide!

# Compile Assembler
assembler: assembler.c
	$(CXX) $(CXXFLAGS) $< -o assembler

Calling `make assembler` in your terminal to compile the assembler executable

$ make assembler
gcc -std=c99 -Wall -Werror -g3 assembler.c -o assembler -lm

Example 2 : Assembling an LC2K assembly file

We also define some rules that use pattern rules; an example is the rule %.mc , which we can use to assemble an LC2K assembly file into its machine code representation.

# Assemble an LC2K file into Machine Code
%.mc: %.as assembler
	./assembler $< $@

Example, calling `make spec.mc` in your terminal to assemble `spec.as` to create `spec.mc`

$ make spec.mc
gcc -std=c99 -lm -Wall -Werror -g3 assembler.c -o assembler
./assembler spec.as spec.mc

