Jump to content

Tail recursive parser

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by 207.210.205.48 (talk) at 03:40, 28 February 2006. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

A Tail Recursive Parser is a type of parser derived from more common Recursive descent parsing. As an alternative to more complex bottom-up parsers, tail recursive parsers are quite useful for parsing left recursive grammars using a top down method. Tail Recursive Parsing is a method of parsing a CFG, it is usually used in a Compiler for semantic analysis of a source program. Tail Recursive parsers are typically used because they use little stack space, are easy to write, and can parse left recursive grammars very elegantly. While typical recursive descent techniques make it impossible to parse a left recursive grammar because of an infinite recursive loop, tail recursive parsers use a node reparenting technique that makes this allowable. Given a EBNF Grammar such as the following:

E: T
T: T ('+' F)* | F
F: F ('*' I)* | I
I: <identifier>

A simple tail recursive parser can be written much like a recursive descent parser. The typical algorithm for parsing a grammar like this using an Abstract syntax tree is:

  1. Parse the next level of the grammar and get its output tree, designate it the first tree, F
  2. While there is terminating token, T, that can be put as the parent of this node:
    1. Allocate a new node, N
    2. Set N's current operator as the current input token
    3. Advance the input one token
    4. Set N's left subtree as F
    5. Parse another level down again and store this as the next tree, X
    6. Set N's right subtree as X
    7. Set F to N
  3. Return N

A C programming language implementation of this parser is shown here:

exptree *parse_e(void) {
    return parse_t();
}

exptree *parse_t(void) {
    exptree *first_f = parse_f();
    
    while(cur_token() == '+') {
        exptree *replace_tree = alloc_tree();
        replace_tree->token = cur_token();
        replace_token->left = first_f;
        next_token();
        replace_token->right = parse_f();
        first_f = replace_token;
    }

    return first_f;
}

exptree *parse_f(void) {
    exptree *first_i = parse_i();
    
    while(cur_token() == '*') {
        exptree *replace_tree = alloc_tree();
        replace_tree->token = cur_token();
        replace_tree->left = first_i;
        next_token();
        replace_tree->right = parse_i();
        first_i = replace_tree;
    }
    
    return first_i;
}

exptree *parse_i(void) {
    exptree *i = alloc_tree();
    exptree->left = exptree->right = NULL;
    exptree->token = cur_token();
    next_token();
}

Further reading