# Are there any recommended tools for teaching syntax and grammar of programming languages?

Computer Science Educators Asked by Lila on August 21, 2021

I would like to know if there are some tools to teach the following topics in a programming paradigms course:

1. Demonstration of lexical rules
2. Demonstration of grammar rules, such as tree parsers of Backus-Naur forms

I found that one can teach this topics by using Flex and Bison, but it requires some previous knowledge in C and C++ languages. In any case, are there some alternatives to these tools?

I agree with @Buffy's comment. If your goal is to teach the concepts of lexical rules and context-free grammar, I would focus on that rather than on using tools to build parsers from those grammar rules.

I've found that if you have first taught regular expressions and finite automata, that you can teach these concepts quite nicely within that scope using toy languages that you invent and derive in class.

For example, you could create a language defined as:

L = {w | w is a pair of positive integers which are added or subtracted}


For example, these would all be valid strings in such a language: { "3 + 2", "5 - 1", "0 + 2", ... }

Then you could define your grammar by deciding what set of variables (V), set of terminals (Σ), set of substitution rules (R), and start variable (s) you need to describe that language.

G = (V, Σ, R, s)


This can all be done in a class or group discussion on paper (or whiteboard) without any tools. I find it easiest for students to come up with Σ first, then use that to deduce the members of V, then R, then s.

A guided discussion could go something like:

# Σ

"Let's decide what kinds of things we could see in our set of terminals, what symbols (some people refer to these as tokens, though parsers think of these differently) would we see in this language?"

...

"According to our definition of L, this could be any positive integer, a plus, or a minus. So the set Σ could be defined as:"

{ℤ+, +, -}


# V

"Now, how could we group these things together?"

...

"The things we're seeing are either positive-integer things, or Operator things, a valid Expression in our language always appears in this form:

Expression -> positive-integer Operator positive-integer


Since the set of things that can be a positive-integer is already represented by ℤ+ in our set of terminals (Σ), that leaves only two types of things in our set of variables:

{ Expression, Operator }


# R

"Now, let's come up with the substation rules for this language. We already said that:

Expression -> positive-integer Operator positive-integer


"Since positive-integer is a terminal, we just need a set of rules for Operator. What kinds of things can an Operator be?"

...

Operator -> +
Operator -> -


"So that means the complete set of rules is:"

Expression -> positive-integer Operator positive-integer
Operator -> +
Operator -> -


# s

"Now, we just need to decide what the starting variable should be. Which element of V should be our top-level item?" ... "Expression"

# Next Steps

From this point, you can then walk through derivations for the grammar. Depending on where you want to go next, you could then illustrate the grammar as a finite automata or as a parse tree, or express it in Backus-Naur form.

If you still want to discuss parsers, I've also given students assignments where I provide them with a simple grammar definition like the above, and ask them to implement a DFA that can determine if a provided string is a valid expression in the language. This is a lot easier if you've already discussed regular expressions.

Answered by lfalin on August 21, 2021