Jump to content

Parikh's theorem

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Nxavar (talk | contribs) at 17:11, 8 November 2012 (Significance: Remove false connections of ambiguity to determinism.). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Parikh's theorem in theoretical computer science says that if we look only at the relative number of occurrences of terminal symbols in a context-free language, without regard to their order, then the language is indistinguishable from a regular language.[1] It is useful for deciding whether or not a string with given number of some terminals is accepted by a context-free grammar.[2] It was first proved by Rohit Parikh in 1961[3] and republished in 1966.[4]

Definitions

Let be an alphabet. The Parikh vector of a word is defined as the function , given by[1]

, where gives the number of occurrences of the letter in the word .

Further, for a language ,

A subset of is said to be linear if it is of the form

for some vectors .

A subset of is said to be semi-linear if it is a union of finitely many linear subsets.

Significance

Parikh's theorem proves that some context-free languages can only have ambiguous grammars. Such languages are called inherently ambiguous. From a formal grammar perspective this means that some ambiguous context-free grammars cannot be converted to an eqivalent unambiguous context-free grammar.

References

  1. ^ a b Kozen, Dexter (1997). Automata and Computability. New York: Springer-Verlag. ISBN 3-540-78105-6.
  2. ^ Håkan Lindqvist. "Parikh's theorem" (PDF). Umeå Universitet.
  3. ^ Parikh, Rohit (1961). "Language Generating Devices". Quartly Progress Report, Research Laboratory of Electronics, MIT.
  4. ^ Parikh, Rohit (1966). "On Context-Free Languages". Journal of the Association for Computing Machinery. 13 (4).