Assignment 2: Interpretation
This project requires you to build an interpreter for a simple, but non-trivial language: the extended calculator language.
Submit your solution and README.txt plus any additional files as a single ZIP file.
Your task in this project is to implement a complete interpreter for our extended calculator language (call it ECL), again with if statements, do/check statements, and both int and real types. We are providing you with a parser generator and driver in the assignment ZIP file that builds an explicit parse tree. The provided code also includes the skeleton of a possible solution that converts the parse tree to a syntax tree and then recursively “walks” that tree to effect the interpretation. You are of course welcome to adopt a different skeleton if you prefer. Since this one has been excised from a complete working solution, however, you may find it a good place to start.
The provided code has two main entry points:
get_parse_table : grammar -> parse_table = ...
parse : parse_table -> string -> parse_tree = ...
The first of these routines returns a parse table, in the form expected as the first argument of the second routine. The second returns a parse tree. You’ll want to print some of these trees to see what they look like. If the program has syntax errors (according to the grammar), parse will print an error message (as a side effect) and return a PT_error value (it does not do error recovery). If the input grammar you provide is malformed, you may get unhelpful run-time errors from the parser generator—it isn’t very robust.
The grammar takes the form of a list of production sets, each of which is a pair containing the LHS symbol and k right-hand sides, each of which is itself a list of symbols. The extended calculator language looks like this:
let ecg : grammar =
[ ("P", [["SL"; "$$"]])
; ("SL", [["S"; "SL"]; []])
; ("S", [ ["int"; "id"; ":="; "E"]; ["real"; "id"; ":="; "E"]
; ["id"; ":="; "E"]; ["read"; "TP"; "id"]; ["write"; "E"]
; ["if"; "C"; "SL"; "fi"]; ["do"; "SL"; "od"]; ["check"; "C"]
])
; ("TP", [["int"]; ["real"]; []])
; ("C", [["E"; "RO"; "E"]])
; ("RO", [["=="]; ["!="]; ["<"]; [">"]; ["<="]; [">="]])
; ("E", [["T"; "TT"]])
; ("TT", [["AO"; "T"; "TT"]; []])
; ("T", [["F"; "FT"]])
; ("FT", [["MO"; "F"; "FT"]; []])
; ("F", [["id"]; ["i_num"]; ["r_num"]; ["("; "E"; ")"]
; ["trunc"; "("; "E"; ")"]; ["float"; "("; "E"; ")"]])
; ("AO", [["+"]; ["-"]])
; ("MO", [["*"]; ["/"]])
];;
A program is just a string:
let sum_ave_prog =
" read int a read int b int sum := a + b
write sum write float(sum) / 2.0 ";;
Your work will proceed in two steps:
1. Translate the parse tree into a syntax tree:
let rec ast_ize_prog (p:parse_tree) : ast_sl = ...
where the single argument is a parse tree generated by function parse. We have provided a complete description of the ast_sl (abstract syntax tree for a statement list) type, though you are free to modify this if you prefer a different format.
2. Walk the syntax tree to determine its behavior on a given input:
let rec interpret (ast:ast_sl) (full_input:string) : string = ...
where the first argument is a syntax tree, as generated by function ast_ize_prog, the second argument is the input string to be read by the interpreted program, and the return value is the program output (the concatenation of values from write statements), possibly ending with an error message.
You are required to catch the following semantic errors:
o Use of an undeclared variable.
o Redeclaration of a variable in the same scope.
o Non-integer provided to float.
o Non-real provided to trunc.
o Type clash in binary expression, comparison, or assignment.
o check statement not inside a loop.
o Non-numeric input, unexpected end of input, int read when real is expected, or real read when int is expected. (Unlike the scanner, the input processing mechanism assumes that numbers are separated by white space, so 12.34 on the input, when an int is expected, should be an immediate error.)
o Division by zero.
You should treat each statement list as a separate scope. So a variable declared inside an if or do statement can have the same name as a variable declared outside (and will hide the outer variable in the remainder of the nested statement list). Note that while a declaration inside a do loop enters a name into the symbol table only once, the read or assignment that initializes that variable occurs on every iteration.
Putting the pieces together, the provided code includes the following:
let ecg_parse_table = get_parse_table ecg;;
let ecg_ast prog =
ast_ize_prog (parse ecg_parse_tableprog);;
let ecg_run prog inp =
interpret (ecg_ast prog) inp;;
If you are working in the ocaml interpreter, that last function is the one you’ll want to call to interpret small programs. For larger programs, you’ll want to read the source from a file. Toward that end, the provided code also includes a main function that runs when your interpreter is compiled with ocamlc, rather than being run inside the ocaml interpreter. This function
1. reads an ECL program from the file specified on the command line;
2. reads input from stdin;
3. runs the ECL program on the input.
See the included README.html in the ZIP file for an example.
In addition to (incomplete) starter code, we are also providing an already-compiled solution to the assignment, which you can run to get a sense of the output we’re expecting—and in particular to see how you might respond to semantic errors. You can find this solution as ecl in the ZIP file.
Because the solution was written quickly (and your instructors don't produce bug-free code), it’s possible that the provided solution and/or the starter code has bugs. If you find one, please report it ASAP and we’ll try to release a fix. Note that your interpreter does not have to produce exactly the same error messages as the provided solution (feel free to do better!), so long as it meets the requirements described on this page.
Warning: your program must not take advantage of any imperative features in OCaml. You may perform I/O and run the while loops in main as described above, and you can of course do whatever you want while debugging, but the main logic of your final syntax tree construction and interpretation routines must be purely functional.
Hints
The initial source code in interpreter.ml is about 1060 lines of OCaml. You should read most of it carefully to understand how it works (you can skip the details of parse table construction if you like, though I think it’s kind of cool :-).
For most of the assignment, it will probably be easiest to use the ocaml interpreter or its utop wrapper. You’ll want to keep reloading your source code (#use "interpreter.ml") as you go along, so you catch syntax and type errors early. On occasion, you may also want to try compiling your program with ocamlc, to create a stand-alone executable. Note that the code we have given you uses functions (regexp and split) from the Str library. This library is not visible to either the interpreter or the compiler by default. In ocaml, you will need to say
#load "str.cma";;
before you #use your source code. (Once is enough; you don’t have to re-#load in order to re-#use.) With ocamlc, type the following at a shell prompt:
ocamlc -o interp str.cma interpreter.ml
The provided source includes code for several ECL programs (sum-and-ave, primes, gcd, sqrt). You may want to put these into separate files, so they can be executed by the compiled version of your interpreter. You will undoubtedly want to write additional tests.
We will be grading your assignment using /usr/bin/ocamlc on the csug machines. You can download your own copy for Windows, MacOS, or Linux, but please be sure to check that your code works correctly on the csug installation.
For reference: Prof. Scott's implementation of the full set of ast_ize_ functions is about 70 lines of code. Likewise, his version of the full set of interpret_ functions is about 150 lines, not counting the provided code for memory and values.
You may find the following helpful.
• OCaml system documentation—includes the language definition; manuals for the interpreter, compilers, and debugger; standard library references; and other resources.
• Tutorials on various aspects of the language.
• OCaml for the Skeptical—an alternative, arguably more accessible introduction to the language from the University of Chicago.
• Developing Applications with Objective Caml—a book-length online introduction to functional programming and everything OCaml.
Division of labor and writeup
As in most assignments this semester, you may work alone or in teams of two. If you choose to work in pairs, I strongly encourage you to read each others’ code, to make sure you have a full understanding of semantic analysis. The most obvious division of labor is for one team member to write ast_ize_prog and the other to write interpret, but the former is probably easier than the latter, so you may want to consider other options.
Submit your completed interpreter.ml file with an accompanying README (in text, markdown or PDF format) as a ZIP file. The README should contain the names and NetIDs of all the students involved in the project.
Extra Credit Suggestions
1. Extend the calculator language in other interesting ways. You might, for example, implement arrays, for loops, nested scopes, or functions. Several of these are likely to introduce new semantic rules that you will want to check.
2. Implement a separate pre-pass over the AST to enforce static semantics. In particular, you should be able to find the first six kinds of errors above before your ECL program executes.
3. Implement simple forms of code improvement via transformations of the AST. A good place to start would be constant folding, which collapses arithmetic and logical operations on constants in a pre-execution pass over the tree. More ambitiously, yo
版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。