Jump to content

Talk:Recursive descent parser

Page contents not supported in other languages.
From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by 166.137.214.242 (talk) at 15:42, 18 July 2014 (Another bug in example code). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
WikiProject iconComputer science Start‑class High‑importance
WikiProject iconThis article is within the scope of WikiProject Computer science, a collaborative effort to improve the coverage of Computer science related articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
StartThis article has been rated as Start-class on Wikipedia's content assessment scale.
HighThis article has been rated as High-importance on the project's importance scale.
Things you can help WikiProject Computer science with:

Untitled

This page really needs a discussion of the problem of left-recursive grammar rules. Such rules are extremely common, as for example

EXPR = INT | EXPR + EXPR | EXPR * EXPR;

A recursive descent parser, presented with this grammar, will loop forever on a malformed input.

Should probably mention that hand-written recursive descent parsers are usually based on EBNF grammars? These support left-recursion by turning

    A = A b C | C

into

    A = C (b C)*

Paolo Bonzini, 09:34, 8 Mar 2006 (UTC)

Recursive descent parser example

Dominus 04:44, 28 Aug 2003 (UTC)

The code on the page is wrong. Parse_E will parse the string NOTTRUETRUE which is not part of the language.

I don't think it does. Care to demonstrate? -- Jan Hidders 11:22, 28 Aug 2003 (UTC)
maybe the problem is hidden behind the if's, which should be else-if's??
Ah, yes, of course. Hope I fixed that now. -- Jan Hidders 00:51, 4 Nov 2004 (UTC)

Does it accept 1*-1? I'm not sure it does

Does not look like it accepts this:
// ident := number * - number .

Correct.

Symbol symbols[7] = {ident,becomes,number,times,minus,number,period};
Should it?

The parser is based on the [PL/0] grammar (a compiler used in many compiler construction courses), which is in turn based on Pascal. For reasons unknown to me, standard Pascal does not allow imbedded unary expressions. For instance:
ident := -ident * number;
is allowed, however, expressions such as yours are not.
To allow this, you could add "-|+" to the factor production, as in:
factor = 
         ident 
       | ("+"|"-") factor
       | number 
       | "(" expression ")"
And change the parser appropriately.
Since the [PL/0] grammar is so well known, I'm of the opinion that the grammar should be left as is. 68.219.72.181 13:47, 17 April 2006 (UTC)[reply]

Question about See Also entry

There is a new See Also entry that I have a question about.

It points to a 'source code' site (nothing wrong with that), but it doesn't seem to add very much. It is essentially the example, copied from this wikipedia site (without any notice of the copying, by the way), plus another simple example.

Should this link be deleted? 67.34.42.125 12:27, 22 May 2006 (UTC)[reply]

PEGs

I have removed references to parsing expression grammars in this discussion on recursive descent parsing, as the content referring to PEGs, at least as formulated, tended to give the impression that PEGs are a widely adopted formalism. Parsing expression grammars, while analytic, are a formal grammar, and RD parsing is a parsing strategy (algorithm). QTJ 04:36, 11 October 2006 (UTC)[reply]

Bug in the parser code

The code inside block(), after "while (accept(procsym))" seems to be nonsense, right? There are 2 nested while-loops and as far as i understand the grammar only the outer one should be there. Catskineater (talk) 17:28, 30 March 2008 (UTC)[reply]

SML example

I just wrote one of these in SML for an assignment. Should I post it here under Implementation in functional languages? bkhl (talk) 07:47, 13 May 2008 (UTC)[reply]

Recursive descent with backup

The "recursive descent with backup" discussion, while correct, troubles me. "X with backup", for any parsing algorithm X, will parse any context-free language. It will do so, except in a very few cases, at the expense of every virtue which the original algorithm X had.

Perhaps the "backup" discussion might be moved later in the article. I'd almost suggest deleting it, but the topic does come up. Almost automatically, whenever algorithm X proves inadequate, backup will be suggested. Quite often, it will be tried. My wishing it were not so, does not help. So some kind of discussion of "backup" as an alternative probably needs to be kept.

--Jeffreykegler (talk) 03:23, 26 May 2010 (UTC)[reply]

Expansion necessary

The example uses no token look ahead at all and thus is not really helpful to understand the strength of recursive descent parsers. There should at least be a bit of ambiguity in the grammar, otherwise the example is too trivial and - even worse - points the reader into wrong directions. — Preceding unsigned comment added by 139.30.5.126 (talk) 18:33, 27 March 2012 (UTC)[reply]

The article doesn't explain what a Recursive descent parser is

Sorry to nag, but this article doesn't provide an answer to the simple question of what a Recursive descent parser is. I'm sure that the information is technically correct in a formal way, but an encyclopedia article really should provide some understanding of the subject to an interested reader. I suspect the authors are very talented people that are communicating among themselves, but to the average person this is gobbledygook. I've been programming since 1972, know FORTRAN, Pascal, C, assembly languages for the PDP-8, CDC 6400 and 6600, 8080 and 8086 Intel chips, Python, APL, PHP, and probably a few others, and I don't know what a recursive descent parser is after reading this article. If I'm lost, many others are also going to be left bewildered by the article. Kd4ttc (talk) 04:54, 22 November 2012 (UTC)[reply]

Another bug in example code

The grammar rules suggest that the following string should be accepted by this grammar:

begin foo := 5 ; end

i.e. The grammar rules accepts (in fact, requires) a semicolon between the "5" number token and the "end" endsym token whereas the implementation of the relative portion of the "statement" function looks something like:

if (accept(beginsym))
{
.   do
.   {
.       statement();
.   }
.   while (accept(semicolon));
.   expect(endsym);
}

This only accepts strings of the form:

"begin" statement { ";" statement } "end"

Or, am I missing something? This example code is as old as the hills so it seems unlikely that such a bug could've been around for so long, but I guess it's possible. — Preceding unsigned comment added by 166.137.214.242 (talk) 15:39, 18 July 2014 (UTC)[reply]