Aug 19 2003

Top-down Parser with Backtracking in C++

Category: Programming

Another assignment in my Foundations of Computing class in college was to create a top-down parser that also supports backtracking. My implementation works for any arbitrary context-free grammar. Here's the program with source code.

The program accepts any arbitrary context-free grammar as an input file, as well as a file that contains the input string to be parsed. When the program starts, it looks for a file called grammar.txt in the program's directory. The grammar should have the following form:

<L> -> <L>;<A>|<A>
<A> -> <I>=<E>

<E> -> <E>-<T> | <E>+<T> | <T>
<T> -> <T>/<F> | <T>*<F> | <F>
<F> -> <P>^<F> | <P>
<P> -> (<E>) | <I>
<I> -> w|x|y|z

The example grammar above defines a language where any mathematical expression can be formed with the variables w, x, y, z, and the operators +, -, *, /, ^, and =. Multiple expressions can also be separated by a semicolon.

Symbols enclosed in brackets "<...>" are non-terminals. The arrow symbol "->" separates the two sides of the given rewrite rule (production). The "or" symbol "|" is used to separate different choices for a single rewrite rule. Any other symbol is a terminal. The special symbol "<epsilon>" represents a null or empty terminal symbol (sometimes referred to as lambda). Comments can be written in the grammar file by placing a # symbol at the beginning of the line.

Notice that the parser supports left-recursion in the grammar. The program will automatically eliminate all left-recursion from the grammar before parsing any data. After eliminating left-recursion, the program will output a file called grammar_after_left_recursion.txt (for debugging purposes), then another file called grammar_after_unit.txt (after eliminating unit productions), and a file called grammar_after_epsilon.txt (after eliminating any empty productions).

After loading and massaging the grammar, the program will ask you to type in the string to be parsed. Alternatively, you can create a file that contains the string, and pass it as a command line parameter to the program. The program will then state whether or not the data was parsed successfully. If the parsing was successful, the program will display a graphical window with a tree that depicts the parsing performed by the program.

4 Responses to “Top-down Parser with Backtracking in C++”

  1. Luong Xuan Bach says:

    Dear Brant,
    Could you give more detail about the algorithm was used in your backtracking-parser implementation?
    Thanks in advance!

  2. ashish says:

    hi........i want to design a parser in linux using lex and yacc.
    it will parse C programs.......can i get the source code for it

  3. john carlo says:

    I like it...but it would be better if you translate it to C..tnx

  4. john carlo says:

    can you help me how to make a syntactical analyzer in C? tnx in advance

Leave a Reply