Top-down Parser with Backtracking in C++

Another assignment in my Foundations of Computing class in college was to create a top-down parser that also supports backtracking. My implementation works for any arbitrary context-free grammar. Here’s the program with source code.

The program accepts any arbitrary context-free grammar as an input file, as well as a file that contains the input string to be parsed. When the program starts, it looks for a file called grammar.txt in the program’s directory. The grammar should have the following form:
<L> -> <L>;<A>|<A>
<A> -> <I>=<E>

<E> -> <E>-<T> | <E>+<T> | <T>
<T> -> <T>/<F> | <T>*<F> | <F>
<F> -> <P>^<F> | <P>
<P> -> (<E>) | <I>
<I> -> w|x|y|z

The example grammar above defines a language where any mathematical expression can be formed with the variables w, x, y, z, and the operators +, -, *, /, ^, and =. Multiple expressions can also be separated by a semicolon.

Symbols enclosed in brackets “<…>” are non-terminals. The arrow symbol “->” separates the two sides of the given rewrite rule (production). The “or” symbol “|” is used to separate different choices for a single rewrite rule. Any other symbol is a terminal. The special symbol “<epsilon>” represents a null or empty terminal symbol (sometimes referred to as lambda). Comments can be written in the grammar file by placing a # symbol at the beginning of the line.

Notice that the parser supports left-recursion in the grammar. The program will automatically eliminate all left-recursion from the grammar before parsing any data. After eliminating left-recursion, the program will output a file called grammar_after_left_recursion.txt (for debugging purposes), then another file called grammar_after_unit.txt (after eliminating unit productions), and a file called grammar_after_epsilon.txt (after eliminating any empty productions).

After loading and massaging the grammar, the program will ask you to type in the string to be parsed. Alternatively, you can create a file that contains the string, and pass it as a command line parameter to the program. The program will then state whether or not the data was parsed successfully. If the parsing was successful, the program will display a graphical window with a tree that depicts the parsing performed by the program.

Recursive Descent Parser in C++

One of the later assignments in my Foundations of Computing class in college was to implement a recursive descent parser that outputs assembly code (defined within the assignment). I implemented mine in C++. Here’s the program and the source code.

This program accepts any arbitrary grammar as an input file, as well as a file that contains the input string to be parsed. When the program starts, it looks for a file called grammar.txt in the program’s directory. The grammar should have the following form:
<L> -> <A><L'>
<L'> -> ;<A><L'> | <epsilon><A> -> <I>=<E><#gen>

<E> -> <T><E'>
<E'> -> -<T><#gen><E'> | +<T><#gen><E'> | <epsilon>

<T> -> <F><T'>
<T'> -> /<F><#gen><T'> | *<F><#gen><T'> | <epsilon>

<F> -> <P><F'>
<F'> -> ^<F><#gen> | <epsilon>

<P> -> (<E>) | <I>
<I> -> w|x|y|z

The example grammar above defines a language where any mathematical expression can be formed with the variables w, x, y, z, and the operators +, -, *, /, ^, and =. Multiple expressions can also be separated by a semicolon.

Symbols enclosed in brackets “<…>” are non-terminals. The arrow symbol “->” separates the two sides of the given rewrite rule (production). The “or” symbol “|” is used to separate different choices for a single rewrite rule. Any other symbol is a terminal. The special symbol <epsilon> represents a null or empty terminal symbol (sometimes referred to as lambda). The other special symbol <#gen> is an instruction for the parser to generate code at the point where this symbol appears. Comments can be written in the grammar file by placing a # symbol at the beginning of the line.

The parser supports exactly one non-terminal and no terminals on the LHS (left-hand side) of any rewrite rule. The RHS of any rewrite rule can be arbitrary.

After loading the grammar, the program will ask you to type in the string to be parsed. Alternatively, you can create a file that contains the string, and pass it as a command line parameter to the program. If the data is parsed successfully, the program will generate a file called out.txt that will contain the assembly code generated by the grammar.

Markov Chains in Maple

The following is a procedure for Maple that simulates a Markov chain by executing a specified number of trials:

MarkovSim := proc(Mat, startstage, ntrials)
  local count, nvisit, a, i, j, firstAbsorbingStage, run, stage, sumQuant;
  count:=Vector(ColumnDimension(Mat));
  nvisit:=Vector(ColumnDimension(Mat));
  firstAbsorbingStage:=0;
  for i from 1 to ColumnDimension(Mat) do
    for j from 1 to RowDimension(Mat) do
      if Mat[j,i]>=1.0 then firstAbsorbingStage:=i; break; fi;
    od;
    if firstAbsorbingStage>0 then break; fi;
  od;
  for i from 1 to ColumnDimension(Mat) do count[i]:=0; od;
  for run from 1 to ntrials do
    for i from 1 to ColumnDimension(Mat) do nvisit[i]:=0; od;
    nvisit[startstage]:=1;
    stage:=startstage;
    while stage
      a:=rand()/10.^12;
      sumQuant:=0;
        for i from 1 to RowDimension(Mat) do
        if a<(sumQuant+Mat[i,stage]) and a>sumQuant then
          stage:=i; nvisit[i]:=nvisit[i]+1; break;
        fi;
        sumQuant:=sumQuant+Mat[i,stage];
      od;
    od;
    count:=count+nvisit;
  od;
  printf("Average number of steps in different stages before absorption:\n");
  for i from 1 to firstAbsorbingStage-1 do printf("%g ", count[i]/ntrials); od;
  j:=0; printf("\n");
  for i from 1 to ColumnDimension(Mat) do j:=j+count[i]; od;
  printf("Average number of steps until absorption:\n %g", j/ntrials);
end proc:

DFA (Deterministic Finite Automaton) Builder

Here is a program I wrote for my Foundations of Computing class in college (here’s the source code). The program can be used to visually design and test deterministic finite automata (DFAs) of arbitrary size and complexity.

To create a new DFA from scratch, select “File » New DFA” from the main menu. A new, blank DFA window will be displayed.

The first step in creating the DFA is to define the alphabet over which the DFA will operate. The alphabet can be constructed from any combination of subsets of ASCII characters. From the right-hand portion of the window, select the subset of ASCII characters to include in the alphabet, and click “Add.” For example, you can add “Letter,” which will make the alphabet contain all letters (a-z, uppercase and lowercase), or you can add “Lowercase,” which will make the alphabet contain only lowercase letters. You may also add individual characters from the list, for example if you want the alphabet to consist only of the letters “a” and “b.”

The second step in creating the DFA is to define the possible states that the machine can be in. To begin defining states, go to the “State Transitions” tab. The states are organized in a state transition table, as shown below. To add a state, click “Add State.” For each state, the program will generate fields for all items in the alphabet, as well as the end-of-string character. By double-clicking in a field, you can change the transition rule for that state and character. The first state you add will be the start state of the machine. The states are automatically numbered in ascending order.

At this point the DFA is complete, and ready to accept strings. To start testing strings, go to the “Test String” tab. In the top field, type in the string you want to test, and click “Test.” The program will then list the configuration sequence that results from testing the string, and will finally say whether the string was accepted or rejected.

The program also allows the user to save the DFA to a file. To save the DFA you are currently working on, select “File » Save DFA” from the main menu. You may then select the directory and the file name for the DFA. To open a saved DFA file, select “File » Open DFA” from the main menu, and browse for the DFA file that you want to open.

Clean Up Your Damn System Tray!

In the Windows operating system, the “System Tray” is the collection of small icons that appears on the far right side of the taskbar. These icons generally represent programs that are currently running in the background. An icon in the system tray could be used to inform the user that a certain program is running, or offer the user a quick way to change certain settings. I have found that the number of icons visible in a certain computer’s system tray gives a surprisingly accurate measurement of the well-being of the computer, as well as the degree of negligence and/or inexperience of the computer’s owner.

Theorem 1: The well-being of the computer is inversely proportional to the number of visible system tray icons.

Let’s now have a look at some of the programs that actually put their icon in the system tray. Some of these programs may very well prove to be useful. However, the vast majority of them are utterly useless, and sometimes even malicious.
The following is a screen shot of an actual computer (running Windows 98) from a user who has generously contributed to this study. Let’s examine some of the icons seen here, and comment on the worst offenders in this category:

  • McAfee Virus Scan — Okay, this one may be useful to a certain extent. However, my feelings toward antivirus programs deserve an article of their own, so I’ll leave this alone for now.
  • ATI Display Settings — honestly, how many times a day do you find yourself changing your display resolution or color depth? I suppose this may have been useful in the old days when certain programs (especially games) required a certain color depth, but we’re well beyond those constraints now.
  • SETI-at-home — While a noble and conceptually fascinating project, the SETI-at-home effort has produced absolutely no results and, so far, has been a total waste of resources. SETI-at-home was one of the first large-scale distributed computing projects, where CPU time of any available computer in the world could be contributed toward solving a single problem. However, now there are many other distributed computing projects that serve much more important purposes, like modeling protein folding in order to find possible cures for diseases.
  • MSN Messenger — An instant messaging application that is quite unpopular (compared to AIM and Yahoo Messenger). And yet Windows launches it by default even if you don’t have an MSN account, and it takes considerable effort to disable it completely. Notice that this user also has both Yahoo Messenger and AIM running alongside. It should not be necessary to run more than one instant messaging client. If it is absolutely necessary to use both (if you have friends on the different networks who simply refuse to switch), then download a program that supports both protocols simultaneously, like Gaim or Trillian.
  • Adaptec DirectCD — This allows you to drag and drop files onto a rewritable disc and burn a disc with the click of a button. However, Windows XP now allows you to do this natively.
  • WeatherBug — This program runs in the background and shows you the local temperature, absolutely free! Oh, and, every few minutes or so, it will pop up an advertisement that you didn’t ask for, also absolutely free! And good luck uninstalling it!
  • Bonzi Buddy — The mildly amusing, but increasingly annoying virtual monkey-companion who can read your e-mail aloud, among a multitude of other activities. Of course, every once in a while, he’ll kindly present you with an unsolicited online advertisement or a very special offer. He also tracks your internet activity, and then customizes the advertisements he displays based on your browsing habits! Sign me up!
  • WinAmp Agent — WinAmp is an extremely useful and simple tool for playing music files. But is it really necessary to have a portion of it running in the background at all times? When you want to play a song, launch Winamp. When you’re done, close it. Who cares if Winamp takes a few milliseconds longer to load if the Agent isn’t running? It’s better than having the Agent continually consume system resources.
  • Mouse Properties — for God’s sake, how often do you need to alter your mouse properties? Sure, it’s cute to change around your mouse cursor so that it looks like a little dinosaur or a piece of cheese being eaten by a mouse. But it gets old rather quickly.