联系方式

  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-21:00
  • 微信:codinghelp

您当前位置:首页 >> C/C++编程C/C++编程

日期:2018-09-26 10:11

Assignment 2 Description

Assignment 2 - Writing an Assembler

Weighting and Due Dates

? Marks for this assignment contribute 5% of the overall course

mark.

? Marks for functionality will be awarded automatically by the

web submission system.

? Due dates: Milestone - 11:55pm Tuesday of week 9, Final -

11:55pm Friday of week 9.

? Late penalties: For each part, the maximum mark awarded will

be reduced by 25% per day / part day late. If your mark is

greater than the maximum, it will be reduced to the maximum.

? Core Body of Knowledge (CBOK) Areas: abstraction, design,

hardware and software, data and information, and programming.

Project Description

In this assignment you will complete a variation of project 6 in the

nand2tetris course. A detailed description of Nand2Tetris Project 6

tailored to this course is shown below. In this assignment the

assembler will be written as two separate programs. The executable

program, parser, will read a Hack Assembly Language program from

standard input and produce an abstract syntax tree on standard

output. The executable program, translator, will read the abstract

syntax tree and assemble a machine code representation of the

original Hack Assembly Language program. The assembled code will be

formatted as sixteen zeros or ones per line and it will be written to

standard output.

SVN Repository

You must create a directory in your svn repository named:

<year>/<semester>/cs/assignment2. This directory must only contain

the following files and directories - the web submission system will

check this:

? Makefile - this file is used by make to compile your

programs - do not modify this file.

? parser.cpp - C++ source file

? translator.cpp - C++ source file

? my*.cpp C++ source files with names that start with my

? my*.h C++ include files with names that start with my

? bin - this directory contains precompiled programs and scripts

- do not modify this directory.

? lib - this directory contains precompiled components - do not

modify this directory.

? includes - this directory contains .h files for precompiled

classes - do not modify this directory.

? tests - this directory contains test data, you can add your

own tests here

Note: if the lib/lib.a file does not get added to your svn

repository you will need to explicitly add it using:

% svn add lib/lib.a

Submission and Marking Scheme

This assignment has two assignments in the web submission system

named: Assignment 2 - Milestone Submissions and Assignment 2 -

Final Submissions. The assessment is based on "Assessment of

Programming Assignments".

Assignment 2 - Milestone Submissions: due 11:55pm Tuesday of

week 9

The marks awarded by the web submission system for the milestone

submission contribute up to 20% of your marks for assignment

2. Your milestone submission mark, after the application of late

penalties, will be posted to the myuni gradebook when the assignment

marking is complete.

Your programs must be written in C++ and will be tested using Hack

Assembly Language programs that that may or may not be syntactically

correct. Although a wide range of tests may be run, including a

number of secret tests, marks will only be recorded for those tests

that are syntactically correct and do not use symbols. Your programs

will be compiled using the Makefile included in the zip file

attached below. The parser program will be compiled using the file

parser.cpp and the translator program will be compiled using the

file translator.cpp file. In both cases any .cpp files with names

starting with my will be also included together with the precompiled

library functions.

Assignment 2 - Final Submissions: due 11:55pm Friday of week

9

The marks awarded for the final submission contribute up to 80% of

your marks for assignment 2.

Your final submission mark will be the geometric mean of the marks

awarded by the web submission system, a mark for your logbook and a

mark for your code. It will be limited to 20% more than the marks

awarded by the web submission system. See "Assessment - Mark

Calculations" for examples of how the marks are combined.

Your final submission mark, after the application of late penalties,

will be posted to the myuni gradebook when the assignment marking is

complete.

Automatic Marking

The automatic marking will compile and test both of your programs in

exactly the same way as for the milestone submission. The difference

is that marks will be recorded for all of the tests including

the secret tests. Note: if your programs fail any of these

secret tests you will not receive any feedback about these

secret tests, even if you ask!

Logbook Marking

Important: the logbook must have entries for all work in this

assignment, including your milestone submissions. See "Assessment -

Logbook Review" for details of how your logbook will be assessed.

Code Review Marking

For each of your programming assignments you are expected to submit

well written code. See "Assessment - Code Review" for details of

how your code will be assessed.

Nand2Tetris Project 6: The Assembler

Background

Low-level machine programs are rarely written by humans. Typically,

they are generated by compilers. Yet humans can inspect the

translated code and learn important lessons about how to write their

high-level programs better, in a way that avoids low-level pitfalls

and exploits the underlying hardware better. One of the key players

in this translation process is the assembler -- a program designed to

translate code written in a symbolic machine language into code

written in binary machine language.

This project marks an exciting landmark in our Nand to Tetris

odyssey: it deals with building the first rung up the software

hierarchy, which will eventually end up in the construction of a

compiler for a Java/C++ like high-level language. The relevant

reading for this project is Chapter 6. Some of the useful tools

available include, the Hack Assembler, the CPU Emulator and working

versions of the two programs, bin/working-parser and bin/workingtranslator.

Objective

The Hack assembler is a relatively simple program however, so that

you can gain experience with the tools used in other workshops and

assignments, you will build your assembler from three parts. This

will involve using a precompiled tokeniser for the Hack assembly

language to implement a parser that recognises labels, A-instructions

and C-instructions using tokens returned by the tokeniser. The

parser will construct a tree representation of the program that the

translator will walk over in order to assemble the final machine

code.

If you wish to create additional source files that can be used by

your programs the .cpp and .h files must have names that start

with my. All my*.cpp files will be compiled as part of both

you parser and translator programs.

Compiling and Running Your Programs

You programs can be compiled with one of the following three

commands. To compile both:

% make

To compile just parser:

% make parser

To compile just translator:

% make translator

To test your programs you can use the following command:

% make

Testing student assembler-m against hack files

Checking "./assembler-m < tests/AddL.asm | diff - tests/AddL.hack" -

test failed

...

The test scripts do not show the program outputs, just passed or

failed, but they do show you the commands being used to run each

test. You can copy-paste these commands if you want to run a

particular test yourself and see all of the output.

If you wish to add new tests, create appropriately named files in the

tests directory and then use the command:

% make test-add

Note: Do not modify the Makefile or the subdirectories

includes, bin and lib. They will be replaced during

testing by the web submission system.

The Tokeniser

You must use the precompiled tokeniser functions provided by the

library in the zip file attached below. The tokeniser functions are

described in the files includes/a-tok.h and include/tokeniser.h.

This tokeniser returns the followings tokens:

Token

Class

Definition Example Token

ak_address ::= '@' ( name | number ) @hello ak_name

ak_label ::= '(' name ')' (END) ak_label

ak_dest ::= 'MD' | 'AM' | 'AD' | 'AMD' Am ak_dest_AM

ak_register ::= 'A' | 'M' | 'D' a ak_register_A

ak_alu_op ::= '0' | '1' | '-1' | '!D' | '!A' |

'-D' | '-A' | 'D+1' | 'A+1' | 'D-

1' | 'A-1' | 'D+A' | 'D-A' |'A-D'

| 'D&A' | 'D|A' | '!M' | '-M' |

'M+1' | 'M-1' | 'D+M' | 'D-M' |

'M-D' | 'D&M' | 'D|M'

1 ak_alu_1

ak_jump ::=

'JMP' | 'JLT' | 'JLE' | 'JGT' |

'JGE' | 'JEQ' | 'JNE' JGT ak_jgt

ak_assign ::= '=' = ak_assign

ak_semi ::= ';' ; ak_semi

ak_null ::= 'NULL' nuLl ak_null

Additional

Rules

Definition Example Text

name ::= letter ( letter | digit )* "_he:82m.Uch$"

number ::= digit digit* "99"

letter ::=

'a'-'z' | 'A'-'Z' | '$' | '_' |

':' | '.' "$"

digit ::= '0'-'9' "1"

Notes:

? if an error occurs or the end of input is reached, the token

ak_eoi is returned

? it is an error to find a character that cannot be part of

token or is not a space " ", tab "\t", carriage return "\r" or

newline "\n"

? single line comments that start with "//" and finish at the

next newline character "\n"

? when searching for the start of the next token all spaces,

tabs, carriage returns, newlines and comments are ignored

? newlines are not significant to the tokeniser so more than one

instruction can be on the same line or an instruction could be

split over multiple lines

? name, number, letter and digit are never returned as token

classes

? letters in a name are case sensitive, all other occurrences of

letters are converted to uppercase

? in a definition the or operator | separates

alternative components

? in a definition the round brackets ( ) which are not inside

single quotes are for grouping components of token

? in a definition the star character * indicates that the

preceding component of a token may appear 0 or more times

The Assembler Parser

The following table describes the structure of an assembly language

program that must be translated into machine code. Note: there are no

references to lines in these rules. All whitespace, including newline

characters, and comments will have been removed by the

precompiled tokeniser. Therefore these rules are defined purely in

terms of the token classes in the previous table.

Term Definition

program ::=

( ak_label | ak_address | Cinstr

)* ak_eoi

C-instr ::= [destination] aluop [ajump]

destination ::=

( ak_null | ak_dest | ak_register )

ak_assign

aluop ::= ak_register | ak_alu_op

ajump ::= ak_semi ( ak_null | ak_jump )

Notes:

? in a definition eof indicates the end of the input

? in a definition the or operator | separates

alternative components

? in a definition the round brackets ( ) are for grouping

components

? in a definition the square brackets [ ] which indicates that

the enclosed components may appear 0 or once

? in a definition the star character * indicates that the

preceding component may appear 0 or more times

Abstract Syntax Tree

An abstract syntax tree is just a tree like data structure that holds

a representation of a program. In this case the parser program must

construct an object of class an_program_node and append all labels

or instructions to it. Each tree class can only be manipulated by

pointers and has a typedef for a pointer to the class so the *

operator does not need to be used very often. Each class for which

you are allowed to create objects also has a create function, for

example to create an an_program_node you would call the function

an_program_node_create(). To accommodate situations where pointers

of different classes can be used, each class has a function to

convert the pointer type after checking that it is appropriate. For

example, a pointer to an an_c_instruction can be used anywhere a

pointer to an an_instruction is required and the conversion function

is named to_an_instruction(). If the checks fails your program will

exit immediately.

Parser

The skeleton parser.cpp file has most of the parsing structure

already written, you just need to complete the object creation and

parsing of labels and individual instructions. The main() function of

the parser program prints out an XML representation of the abstract

syntax tree using the precompiled library function an_print_as_xml().

You do not need to know how to write out XML. XML is produced by

the parser because it is easy to read by humans.

If your parser encounters any errors, it must exit immediately and

have not produced any output. Examples of errors include, multiple

definitions of a label, an A-instruction with a number that is too

large, or a C-instruction with multiple destination components,

multiple jump components or no ALU component. In this two program

structure, the parser will not be able to directly detect the first

two kinds of errors. Multiple definitions of a label will not be

detectable until the translator walks the abstract syntax tree and a

number that is too large will cause the tokeniser to report that it

is at the end of the input. The other errors can be detected by the

parser and should result in an immediate exit with no output.

The iobuffer.h include file gives details of functions you can use

to manage when output is produced by your programs.

Translator

The skeleton translator.cpp file includes a loop that walks over the

abstract syntax tree produced by the parser. The translator program

uses the precompiled function an_parse_xml(), to read the output of

the parser program. You do not need to know how to read in and

interpret XML. The later workshops and assignments will use this

technique to allow their compilers to be written as separate

programs.

One advantage of this two program approach is that walking over the

tree more than once is really simple. This is important in the

translator when it comes to resolving symbols. When a program

includes an A-instruction such as @somename you cannot tell if

somename is a label or a variable until the entire program has been

inspected because the label definition may come later. This will be

true for jumps to a later part of a program. We solve this problem by

walking over the abstract syntax tree and record all the labels that

we can find. This is Pass 1. When we walk over the abstract syntax

tree a second time, Pass 2, every label name will already have been

defined. Therefore, all undefined names found in Pass 2 must be

variable names.

If your translator encounters any errors, it must exit immediately

and have not produced any output. It may be interesting to reflect on

why we did not include missing label definitions amongst the list of

errors to be detected.

Lookup Tables

Your translator program will need a way of generating the correct set

of bits for each part of a C-instruction. For example, "JMP" needs to

be mapped to "111". To do this you will need some sort of lookup

table to record these mappings. The translator program also needs a

lookup table so that it can implement a symbol table to record label

addresses and the memory locations for variables. You should use the

precompiled symbol table classes described in the includes/symbols.h

file to create any lookup tables that you require.

The includes/symbols.h file includes examples of how to use these

lookup tables.

You will need several lookup tables, one for the symbol table and one

for each component of a C-instruction. Although most parts of a Cinstruction

are unique, the M, A and D registers need to be mapped to

different sets of bits depending on whether they are used as an alu

operation or a destination. For example, "A" as a destination would

be "100" but as an alu operation it would be "0110000".

Test Programs

In addition to the tests in tests directory, we will use a number of

secret tests that may contain illegal tokens and / or multiple labels

and instructions on the same line and / or syntactic errors. Note:

these tests are secret, if your programs fail any of these

secret tests you will not receive any feedback about these

secret tests, even if you ask!

The Pong program supplied in the tests directory was written in the

Java-like high-level Jack language and translated into the Hack

assembly language by the Jack compiler and a Hack Virtual Machine

translator. (The Virtual Machine, Jack and the Jack compiler are

described in Chapters 7-8, 9 and 10-11, respectively). Although the

original Jack program is only about 300 lines of Jack code, the

executable Pong code is naturally much longer. Running this

interactive program in the supplied CPU Emulator is a slow affair, so

don't expect a high-powered Pong game. This slowness is actually a

virtue, since it enables your eye to track the graphical behavior of

the program. And don't worry! as we continue to build the software

platform in the next few projects, Pong and and other games will run

much faster.

Startup Files


版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。 站长地图

python代写
微信客服:codinghelp