Project 2: Syntax Analysis


In the previous project, you each wrote a lexical analyzer that took an input file and divided it into tokens. Now, you will write a parser that will test those tokens for syntactic correctness. Using BNF (Backus-Naur Form), you will establish a set of grammar rules that define the syntax of our language. A simple example of a pair of BNF rules is:

declaration :   TYPE ID ';'
            |   TYPE ID '=' value ';'

value       :   VAL_LITERAL
            |   ID

You must design and implement a grammar that can determine if a "Good" program is syntactically correct. For this program, you need to write a function called "parse_string", that can parse its string argument. If the program is correct, the program should only return True>. If your program cannot be parsed (for example if there is a problem on line 3), you should raise a SyntaxError. If your program has illegal name declarations/usages, you should raise a NameError.

Symbol Table

You must also implement a symbol table, a simple data structure that will track all of the identifiers that have been declared in the source program that you are compiling. Every time a variable is declared, it should be entered into the symbol table, and every time a variable is used, it should be verified with the symbol table. There are two types of errors that your program needs to identify related to symbol tables:

  1. If a variable is being used in an expression, but has never been declared. For example, if the variable my_var is used without being declared, your compiler raise a NameError.
  2. If a variable id is being declared for a second time. For example, if var1 was declared on line 2, but it's being redeclared on line 5, your compiler raise a NameError.

As with Project 1, your compiler should end after the first error is encountered.

Legal Expressions

Legal types of expressions that your compiler should accept include:

  • Variable declarations, for example: val var1;
  • Mathematical expressions (including comparisons, assignments and the random command): var1 = 7 + var2 * random(6);
  • Variable declarations with assignment to expressions: val var3 = (var1 + 76) * 4;
  • Output command: print(var1, var2 + 42);

While val, char, and string are all legal types, this project will only be testing the use of val. In Project 4 (when we learn type checking) other types will be used as well.

The mathematical operations that expressions in your compiler should be able to handle are:

  • = (assignments)
  • + (addition)
  • - (both subtraction and unary negation)
  • * (multiplication)
  • / (division)
  • +=, -=, *=, /= (compound assignments)
  • ==, !=, <, <=, >, >= (comparison operators)
  • && and || (Boolean operators)
  • ( and ) (parentheses)

Note that when assignments and compound assignments can be included inside of mathematical expressions they will (when fully implemented in Project 3) return the value of the variable being assigned to. For example, a=(b=2); is a legal statement that assigns both a and b to the value 2.

The random(expr) command can be used as a value in a mathematical expression and (starting Project 3) will produce a random int x where 0 <= x < expr.

The print command should be able to take one or more of arguments (including expressions) separated by commas.

We have put a reference parser executable named "project2_parser" in your project directory for testing purposes. It takes one argument a "good" file to compile.

Turning In

When you hand in your program, you must include:

  • A README.txt file that includes a summary of what does and does not work in your code. Your README may also include any notes that you want us to read before grading your program, such as any external resources you used in its development.
  • A python3 file that is invoked by the test cases. There should be no other python files, all your code should be in this one file.
    • The file must have a function called "parse_string" that takes a single argument, a string to be parsed. The function should return True.
    • In the case of a parse error, the function should raise a SyntaxError, and in the case of an illegal name usage/declaration, the function should raise a NameError.
    • Note: we don't check the messages included in the Exceptions, just that they were raised appropriately.