Jump to content

Parikh's theorem

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by SPat (talk | contribs) at 05:57, 11 March 2010 (yet to complete proof). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
(diff) ← Previous revision | Latest revision (diff) | Newer revision → (diff)

Parikh's theorem in theoretical computer science says that if we look only at the relative number of occurrences of terminal symbols in a context-free language, without regard to their order, then the language is indistinguishable from a regular language.[1] It is useful for deciding whether or not a a string with given number of some terminals is accepted by a context-free grammar.[2]. It was first proved by Rohit Parikh in 1966[3]

Definitions

Let be an alphabet. The Parikh vector of a word is defined as the function , given by[1]

, where gives the number of occurrences of the letter in the word .

Further, for a language ,


A subset of is said to be linear if it is of the form

for some integers .


A subset of is said to be semi-linear if it is a union of finitely many linear subsets.

Statement

Parikh's theorem says that for any context-free language , the set is semi-linear.

Proof

Let be a context free grammar in the Chomsky normal form. Let denote parse trees of with non-terminal at the root, non-terminals labeling internal nodes and terminals or non-terminals labeling leaves. A root is defined as the non-terminal at the root of the tree , yield is the string of terminals and non-terminals at the leaves of reading left to right.

References

  1. ^ a b Kozen, Dexter (1997). Automata and Computability. New York: Springer-Verlag. ISBN 3-540-78105-6.
  2. ^ Håkan Lindqvist. "Parikh's theorem" (PDF). Umeå Universitet.
  3. ^ Parikh, Rohit (1966). "On Context-Free Languages". Journal of the Association for Computing Machinery. 13 (4).