Debugging Conflicts in ocamlyacc


I am trying to write a parser for a simple language that recognizes integer and float expressions using ocamlyacc. However, I want to introduce the possibility of having variables, so I defined the token VAR in my lexer.mll file, which matches any alphanumeric string starting with a capital letter.

 expr:
 | INT                      { $1 }
 | VAR                      { /*Some action */}
 | expr PLUS expr           { $1 + $3 }
 | expr MINUS expr          { $1 - $3 }

 /* similar rules follow below for real expressions */

Now I have a similar definition for real numbers. However, when I run this file, I get 2 reduce/reduce conflicts: if I enter just an arbitrary string (identified as the token VAR), the parser cannot know whether it is a real or an integer variable, because the token VAR appears in the rules for both int and real expressions in my grammar.

Var + 12  /*means that Var has to be an integer variable*/
Var  /*Is a valid expression according to my grammar but can be of any type*/

How do I eliminate this reduce/reduce conflict without losing the generality of variable declarations, while maintaining the 2 data types available to me?


#1

exp:
    | sub {$1}
    | exp PLUS sub {$1+$3}

sub:
    | basic_unit {$1}
    | sub MINUS basic_unit {$1-$3}

basic_unit:
    | INT {$1}
    | VAR { /*some action*/ }

#2

You cannot keep track of type information in a context-free grammar. You must do it at runtime.
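A minimal sketch of what "doing it at runtime" could look like, assuming a single tagged value type that the semantic actions operate on. The names `value`, `Type_error`, `env`, `lookup`, `add` and `sub` are hypothetical and do not come from the original grammar.

```ocaml
(* Hypothetical runtime value: every expression evaluates to one of these. *)
type value =
  | Int of int
  | Real of float

exception Type_error of string

(* Arithmetic that checks operand types when it is evaluated,
   instead of encoding the types in the grammar. *)
let add a b =
  match a, b with
  | Int x, Int y -> Int (x + y)
  | Real x, Real y -> Real (x +. y)
  | _ -> raise (Type_error "PLUS: operands must have the same type")

let sub a b =
  match a, b with
  | Int x, Int y -> Int (x - y)
  | Real x, Real y -> Real (x -. y)
  | _ -> raise (Type_error "MINUS: operands must have the same type")

(* Assumed symbol table mapping variable names to their current values. *)
let env : (string, value) Hashtbl.t = Hashtbl.create 16

let lookup name =
  match Hashtbl.find_opt env name with
  | Some v -> v
  | None -> raise (Type_error ("unbound variable: " ^ name))
```

With helpers like these, one `expr` nonterminal could cover both types, with actions such as `INT { Int $1 }`, `VAR { lookup $1 }` and `expr PLUS expr { add $1 $3 }`. Since VAR would then be reduced by only one nonterminal, the int/real reduce/reduce conflict should not arise.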

#3

Well, if you only have two types, then it's kinda possible, but you have to replicate your entire grammar to have int-expr and real-expr etc; also, your tokenizer must return either int-var or real-var by looking the symbol up in the symbol table.

#4

Thanks @n.m. for your reply. I have already written a similar expression for reals. But the problem is that I have bound any alphanumeric regular expression to VAR in my lexer, and I use this token in both real_expr and int_expr. How do I define the tokens int_var and real_var? Are you suggesting that I need to store the variables in some kind of data structure?

#5

You define INT_VAR and REAL_VAR just like VAR. The lexer must do a bit more than just a regular expression match: it has to look up the type (execute a lexer action). Yes, you need to store variables somewhere; how else would you find their values?
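A rough sketch of such a lookup, written as plain OCaml that a lexer action could call. It assumes every variable is entered into a table together with its type before it is used; how that declaration happens is not specified in the question. The names `var_type`, `types`, `declare` and `classify` are hypothetical, and the `token` type is declared here only to keep the example self-contained (in a real project ocamlyacc generates it from the `%token` declarations).

```ocaml
(* Hypothetical type tag stored for each declared variable. *)
type var_type = TInt | TReal

(* Stand-in for the token type that ocamlyacc would generate. *)
type token =
  | INT_VAR of string
  | REAL_VAR of string

(* Assumed symbol table: variable name -> declared type. *)
let types : (string, var_type) Hashtbl.t = Hashtbl.create 16

(* Record the type of a variable; assumed to run before the variable is used. *)
let declare name ty = Hashtbl.replace types name ty

(* Turn a matched identifier into a type-specific token. *)
let classify name =
  match Hashtbl.find_opt types name with
  | Some TInt -> INT_VAR name
  | Some TReal -> REAL_VAR name
  | None -> failwith ("undeclared variable: " ^ name)
```

In lexer.mll, the action of the identifier rule would then be something like `classify (Lexing.lexeme lexbuf)` rather than returning VAR directly, so the int grammar only ever sees INT_VAR and the real grammar only ever sees REAL_VAR.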

#6

How can we look up the type of a variable during lexing? I mean, both int and real vars can be any alphanumeric string, and I call that a VAR. It would help if you could give some code that does what you are suggesting.

#7

Can you edit your answer to offer an explanation of your approach and why it addresses the original problem?