Stack Overflow Asked by Adi Mehrotra on December 16, 2021
I’m writing a small bison / flex calculator but I’m not able to understand and debug the verbose output
This is the code which is causing the errors:
%token NUM
%token NAME
%%
input: 'n'
| line 'n'
;
line: expr {printf("%.10gn", $1);}
| NAME '=' expr {Assign(table, yyname, $3);}
;
expr: NUM {$$ = $1;}
| NAME {$$ = Lookup(table, yyname);}
| expr '+' expr {$$ = $1 + $2;}
| expr '-' expr {$$ = $1 - $2;}
| expr '*' expr {$$ = $1 * $2;}
| expr '/' expr {$$ = $1 / $2;}
| expr '^' expr {$$ = pow($1, $2);}
| '(' expr ')' {$$ = $2;}
The verbose error output is:
state 20
7 expr: expr . '+' expr
7 | expr '+' expr .
8 | expr . '-' expr
9 | expr . '*' expr
10 | expr . '/' expr
11 | expr . '^' expr
'+' shift, and go to state 13
'-' shift, and go to state 14
'*' shift, and go to state 15
'/' shift, and go to state 16
'^' shift, and go to state 17
'+' [reduce using rule 7 (expr)]
'-' [reduce using rule 7 (expr)]
'*' [reduce using rule 7 (expr)]
'/' [reduce using rule 7 (expr)]
'^' [reduce using rule 7 (expr)]
$default reduce using rule 7 (expr)
...
state 24
7 expr: expr . '+' expr
8 | expr . '-' expr
9 | expr . '*' expr
10 | expr . '/' expr
11 | expr . '^' expr
11 | expr '^' expr .
'+' shift, and go to state 13
'-' shift, and go to state 14
'*' shift, and go to state 15
'/' shift, and go to state 16
'^' shift, and go to state 17
'+' [reduce using rule 11 (expr)]
'-' [reduce using rule 11 (expr)]
'*' [reduce using rule 11 (expr)]
'/' [reduce using rule 11 (expr)]
'^' [reduce using rule 11 (expr)]
$default reduce using rule 11 (expr)
I’m pretty new to using bison so any help on this would be appreciated, thanks in advance
Bison operates in part by maintaining a stack of symbols, and repeatedly choosing between
It uses the current contents of the stack and the type of the next input symbol (if any), to make that choice.
It is very easy to write grammar rule sets that are ambiguous, in the sense that they afford states where either shifting or reducing could be viable. This is a "shift/reduce" conflict, and there are many such states in your case. Consider this input:
1 + 2 * 3
Supposing that tokens 1
, 2
, and 3
are parsed as NUM
s, we have this:
stack input action
(empty) `NUM` `+` `NUM` `*` `NUM` shift
`NUM` `+` `NUM` `*` `NUM` reduce (1)
`expr` `+` `NUM` `*` `NUM` shift (2)
`expr` `+` `NUM` `*` `NUM` shift (3)
`expr` `+` `NUM` `*` `NUM` reduce (1)
`expr` `+` `expr` `*` `NUM` !!
And there we have two viable alternatives: Bison could reduce the three symbols at the top of the stack to an expr
, or it could shift the *
onto the stack. In fact, although it is not always the case with a shift / reduce conflict, either alternative could lead to a successful parse, but the choice matters because it can -- and in your case does -- affect the interpretation of the input. Performing the reduction at this point leads to interpreting the input as equivalent to (1 + 2) * 3
, whereas performing the shift leads, in a couple more steps, to interpreting the input as equivalent to 1 + (2 * 3)
.
Without other guidance, Bison defaults to shifting in these cases. That more often leads to a successful parse, but not always, and sometimes it leads to a successful but wrong parse. Bison's default will get it right in the above example, but would get it wrong (relative to our usual expectations for operator precedence) with 1 * 2 + 3
.
Writing unambiguous grammar rules is a distinctly non-trivial exercise. In this case, it would require breaking the symmetry of your rules for expressions, so that the left- and right-hand operands are represented by different symbols. To take operator precedence into account, you will also need to differentiate between different types of expressions based on the top-level operator, if any, with which they are formed.
Notes:
(1) There is no rule for handling a NUM
except to reduce it to an expr
.
(2) an expr
could also be reduced to a line
, but Bison can take the next token type into account to see that performing such a reduction would leave it in a state from which it could not proceed.
(3) There is no rule to reduce a group of symbols ending with a +
.
Answered by John Bollinger on December 16, 2021
Chris Dodd's answer tells you what is wrong. This answer is about how to fix it.
You can fix this by introducing a bunch of sub-productions of the expr
production, and this is what you will often see in formal specifications, e.g.
atom: NUM | NAME | '(' expr ')'
pow: atom '^' atom
muldiv: pow '*' pow | pow '/' pow
/* etc */
And with more complicated grammars, sometimes you have to do it this way. But in this case, it's easier to use Bison's operator precedence feature. Add these lines to the first section of your .y
file:
%left '+' '-'
%left '*' '/'
%left '^'
This tells the parser generator that all of the arithmetic operators are left-associative (that is, treat a + b + c
as (a + b) + c
), that *
and /
have higher precedence than +
and -
(that is, treat a + b * c
as a + (b * c)
), and that ^
has higher precedence still. This is enough extra information to eliminate all the conflicts.
Answered by zwol on December 16, 2021
A "conflict" in bison tells you the grammar in question is not LALR(1) -- there is something about it that can't find a unique parse for an input with only 1 token lookahead.
The verbose grammar file tells you exactly where the conflict is, but to understanbd it, you need to understand how a shift-reduce parser recognizes inputs. A shift-reduce parser is basically a state machine coupled to a stack (a push-down automata or PDA), where the state machine tracks which possible (partial) rule productions have been recognized so far, and the stack holds states that have been deferred.
In your specific example, you have the state rules:
7 expr: expr . '+' expr
7 | expr '+' expr .
8 | expr . '-' expr
9 | expr . '*' expr
10 | expr . '/' expr
11 | expr . '^' expr
which tells you the state represents having partially recognized any of those rules up to the point where a .
appears. So it might have recognized the entire rule expr + expr
or it might be just the initial expr
of one of the other rules. This is indicative of an ambiguity in the grammar. Whenever you have an input of the form
A + B * C
or
3 + 5 - 7
it can recognize it as two binary operations, but it doesn't know which one binds tighter -- should it be (A + B) * C
or A + (B * C)
Answered by Chris Dodd on December 16, 2021
Get help from others!
Recent Answers
Recent Questions
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP