
CHAPTER 4

SYNTAX ANALYZER
The parser obtains a string of tokens from the lexical analyzer, as shown in Fig 4.1, and verifies that the string can be generated by the grammar for the source language. We expect the parser to report any syntax errors in an intelligible fashion. It should then also recover from commonly occurring errors so that it can continue parsing the remainder of the input program.
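[Figure: Source Program (C File), Lexical Analyzer, Parser, Parse Tree, Rest of Front End, Intermediate Representation, Symbol Table. The parser requests each token from the lexical analyzer (Get Next Token) and receives a token in return.]
Figure 4.1 Position of Parser in Compiler Model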

Yacc and Occs are some of the tools that generate code for parsing. Here, we have used YACC (Yet Another Compiler-Compiler) to generate the parser.

Features of YACC
(1) It uses the Look-Ahead LR (LALR) parsing technique to create a parser.
(2) It stores the action and goto tables in the form of compressed arrays.

YACC Specifications
A Yacc program consists of three parts:

declarations
%%
production (translation) rules
%%
supporting C-routines


[Figure: Yacc translates yacc.y into y.tab.c, and the C Compiler translates y.tab.c into a.out. Other labels: Source Program (C File), lex.yy.exe (Lexical Analyzer), Token, Get Next Token, yyout.exe (Parser), Symbol Table, Error Handler.]
Figure 4.2 Creating the Parser with Yacc

The declarations part: There are two optional sections in the declarations part of a Yacc program. In the first section, we put ordinary C declarations, delimited by %{ and %}. Here we place declarations of any temporaries used by the translation rules or procedures of the second and third sections. Also in the declarations part are declarations of the grammar tokens. Tokens declared in this section can be used in the second and third sections of the Yacc specification.

The translation rules part: In the part of the Yacc specification after the first %% pair, we put the translation rules. Each rule consists of a grammar production and the associated semantic action. A set of productions that we have been writing

<left side> → <alt 1> | <alt 2> | ... | <alt n>

would be written in Yacc as

<left side> : <alt 1>   { semantic action 1 }
            | <alt 2>   { semantic action 2 }
            ...
            | <alt n>   { semantic action n }
            ;
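The pairing of a production with its semantic action can be pictured as a stack operation carried out when the parser reduces by that production. The following C sketch is only an illustration under stated assumptions, not the code Yacc generates: the rule expr : expr '+' term, the value stack and the names push_value and reduce_expr_plus_term are all hypothetical. In Yacc notation the action would be written { $$ = $1 + $3; }, where $$ names the value of the left side and $1, $3 name values of right-side symbols.

```c
#include <assert.h>

#define STACK_MAX 64

/* Hypothetical value stack: one attribute value per grammar symbol
 * currently on the parse stack. */
static int value_stack[STACK_MAX];
static int value_top = -1;          /* index of the topmost value, -1 when empty */

/* Called when a symbol is shifted: remember its attribute value. */
void push_value(int value)
{
    assert(value_top + 1 < STACK_MAX);
    value_stack[++value_top] = value;
}

/* Reduction by the hypothetical rule
 *     expr : expr '+' term   { $$ = $1 + $3; }
 * The three right-side values are replaced by one left-side value. */
void reduce_expr_plus_term(void)
{
    const int rhs_len = 3;
    assert(value_top + 1 >= rhs_len);

    int *rhs = &value_stack[value_top - rhs_len + 1]; /* rhs[0] is $1, rhs[2] is $3 */
    int result = rhs[0] + rhs[2];                     /* the action: $$ = $1 + $3 */

    value_top -= rhs_len;                             /* pop the right side */
    value_stack[++value_top] = result;                /* push $$ for the left side */
}
```

Pushing the values 2, a placeholder for '+', and 5, then calling reduce_expr_plus_term(), leaves the single value 7 on the stack.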


In a Yacc production, a quoted single character 'c' is taken to be the terminal symbol c, and unquoted strings of letters and digits not declared to be tokens are taken to be nonterminals. Alternative right sides can be separated by a vertical bar, and a semicolon follows each left side with its alternatives and their semantic actions. The first left side is taken to be the start symbol.

A Yacc semantic action is a sequence of C statements. In a semantic action, the symbol $$ refers to the attribute value associated with the nonterminal on the left, while $i refers to the value associated with the ith grammar symbol (terminal or nonterminal) on the right. The semantic action is performed whenever we reduce by the associated production, so normally the semantic action computes a value for $$ in terms of the $i's.

The supporting C-routines part: The third part of a Yacc specification consists of supporting C-routines. A lexical analyzer by the name yylex() must be provided. Other procedures such as error recovery routines may be added as necessary.

Using Yacc with Ambiguous Grammars
When an ambiguous grammar is used in the Yacc specification, the LALR algorithm will generate parsing action conflicts. Yacc will report the number of parsing action conflicts that are generated. Unless otherwise instructed, Yacc will resolve all parsing action conflicts using the following two rules:
1) A reduce/reduce conflict is resolved by choosing the conflicting production listed first in the Yacc specification.
2) A shift/reduce conflict is resolved in favour of shift. This rule resolves the shift/reduce conflict arising from the dangling-else ambiguity correctly.
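The dangling-else case can be observed in C itself, where an else is attached to the nearest unmatched if; this is the parse obtained when the parser shifts the else instead of reducing the inner if statement. The function below is a small hypothetical helper (its name and return codes are assumptions made for this illustration) that reports which branch ran. Some compilers may warn about the missing braces, which are omitted here on purpose.

```c
/* Hypothetical helper: reports which branch of a nested if statement ran.
 * Returns 0 if neither assignment executed. */
int dangling_else_branch(int outer, int inner)
{
    int branch = 0;
    if (outer)
        if (inner)
            branch = 1;
        else                /* attaches to "if (inner)", the nearest if */
            branch = 2;
    return branch;
}
```

If the else belonged to the outer if instead, a call with outer equal to 0 would return 2; with the shift-preferring parse it returns 0, and 2 is returned only when outer is nonzero and inner is 0.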

Since these default rules may not always be what the compiler designer wants, Yacc provides a general mechanism for resolving shift/reduce conflicts. In the declarations portion, we can assign precedences and associativities to terminals. The declaration

%left '+' '-'

makes + and - be of the same precedence and be left associative. Similarly, to make an operator right associative, %right is used instead of %left in the declaration above.
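The effect of such declarations on a shift/reduce conflict can be sketched as a comparison between the precedence of the production (taken from its rightmost terminal) and the precedence of the lookahead token. The C code below is a minimal sketch under assumptions, not Yacc's internal code: the types, the function resolve_shift_reduce and the convention that level 0 means "no precedence declared" are all hypothetical. It also covers nonassociative operators, which make the conflicting input an error.

```c
/* Hypothetical encoding of a precedence declaration. */
enum assoc  { ASSOC_LEFT, ASSOC_RIGHT, ASSOC_NONASSOC };
enum action { ACT_SHIFT, ACT_REDUCE, ACT_ERROR };

struct prec {
    int level;          /* higher binds tighter; 0 means nothing was declared */
    enum assoc assoc;   /* meaningful only when level is nonzero */
};

/* Decide a shift/reduce conflict between reducing by a production
 * (precedence `rule`) and shifting the lookahead token. */
enum action resolve_shift_reduce(struct prec rule, struct prec lookahead)
{
    /* Without precedence information, fall back to the default: shift. */
    if (rule.level == 0 || lookahead.level == 0)
        return ACT_SHIFT;

    if (rule.level > lookahead.level)
        return ACT_REDUCE;
    if (rule.level < lookahead.level)
        return ACT_SHIFT;

    /* Same level, so both come from the same declaration. */
    switch (lookahead.assoc) {
    case ASSOC_LEFT:
        return ACT_REDUCE;   /* a - b - c groups as (a - b) - c */
    case ASSOC_RIGHT:
        return ACT_SHIFT;    /* groups to the right instead */
    default:
        return ACT_ERROR;    /* nonassociative: a < b < c is rejected */
    }
}
```

For example, with + at level 1 and * at level 2, both left associative, seeing * after expr + expr yields a shift, while seeing + after expr * expr yields a reduce.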

We can force an operator to be a nonassociative binary operator (i.e., two occurrences of the operator cannot be combined at all) by stating

%nonassoc '<'

Moreover, the tokens are given precedences in the order in which they appear in the declarations part, lowest first. Tokens in the same declaration have the same precedence. In some situations, where the rightmost terminal does not supply the proper precedence to a production, we can force a precedence by appending to a production the tag

%prec <terminal>

The precedence and associativity of the production will then be the same as that of the terminal, which presumably is defined in the declarations section. Yacc does not report shift/reduce conflicts that are resolved using the precedence and associativity mechanism.

Error Recovery in Yacc
In Yacc, error recovery can be performed using a form of error productions. First, the designer decides what major nonterminals will have error recovery associated with them. Typical choices are some subset of the nonterminals generating expressions, statements, blocks and procedures. The designer then adds to the grammar error productions of the form A → error α, where A is a major nonterminal and α is a string of grammar symbols, perhaps the empty string; error is a Yacc reserved word. Yacc will generate a parser from such a specification, treating the error productions as ordinary productions.

However, when the parser generated by Yacc encounters an error, it treats the states whose sets of items contain error productions in a special way. On encountering an error, Yacc pops symbols from its stack until it finds the topmost state on its stack whose underlying set of items includes an item of the form A → . error α. The parser then shifts a fictitious token error onto the stack, as though it saw the token error on its input.

When α is ε, a reduction to A occurs immediately and the semantic action associated with the production A → error (which might be a user-specified error-recovery routine) is invoked. The parser then discards input symbols until it finds an input symbol on which normal parsing can proceed.

If α is not empty, Yacc skips ahead on the input looking for a substring that can be reduced to α. If α consists entirely of terminals, then it looks for this string of terminals on the input, and "reduces" them by shifting them onto the stack. At this point, the parser will have error α on top of its stack. The parser will then reduce error α to A, and resume normal parsing.
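The two recovery steps described above, popping the stack until a state can shift error and then skipping input, can be sketched in C as follows. This is a simplified illustration, not the code of a Yacc-generated parser: the struct parser layout, the error_goto table (which maps a state to the state reached by shifting error, or NO_STATE) and the single synchronizing token are all assumptions made for the example.

```c
#include <stddef.h>

#define NO_STATE  (-1)
#define STACK_MAX 64

/* Hypothetical parser state: a stack of LR state numbers. */
struct parser {
    int states[STACK_MAX];
    int top;                /* index of the current state, -1 when empty */
};

/* Pop states until one has an item of the form A -> . error alpha, i.e. a
 * transition on `error` in the hypothetical table error_goto. Then shift the
 * fictitious error token by pushing the target state.
 * Returns the state entered, or NO_STATE if no state can recover. */
int shift_error(struct parser *p, const int *error_goto)
{
    while (p->top >= 0) {
        int target = error_goto[p->states[p->top]];
        if (target != NO_STATE) {
            if (p->top + 1 >= STACK_MAX)
                return NO_STATE;            /* no room to push */
            p->states[++p->top] = target;
            return target;
        }
        p->top--;                           /* discard this state */
    }
    return NO_STATE;
}

/* Discard input tokens starting at pos until sync_token is found.
 * Returns the index of that token, or count if the input runs out. */
size_t skip_to_sync(const int *tokens, size_t count, size_t pos, int sync_token)
{
    while (pos < count && tokens[pos] != sync_token)
        pos++;
    return pos;
}
```

With a stack holding states 0, 2, 3 and only state 0 able to shift error into state 4, shift_error pops states 3 and 2 and leaves 0, 4 on the stack; skip_to_sync then advances the input to the next synchronizing token, such as ';'.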
