
Lesk algorithm

From Wikipedia, the free encyclopedia

The Lesk algorithm is a classical algorithm for word sense disambiguation, introduced by Mike Lesk in 1986.[1]

The Lesk algorithm is based on the assumption that words in a given neighbourhood will tend to share a common topic. A naive implementation of the Lesk algorithm would be to (a sketch in Python follows the list):

  1. choose pairs of ambiguous words within a neighbourhood,
  2. check their definitions in a dictionary,
  3. choose the senses so as to maximise the number of common terms in the definitions of the chosen words.
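
The following is a minimal sketch of that pairwise procedure. The toy glosses, the sense labels, and the word pair ("pine", "cone") are illustrative assumptions loosely echoing Lesk's well-known example, not data from the original paper.

# Minimal sketch of the naive pairwise Lesk idea described above.
# The toy dictionary and sense labels are illustrative assumptions.

def overlap(definition_a, definition_b):
    """Count distinct terms shared by two definitions."""
    return len(set(definition_a.lower().split()) & set(definition_b.lower().split()))

def naive_lesk(word_a, word_b, dictionary):
    """Pick the sense pair whose definitions share the most terms."""
    best_pair, best_score = None, -1
    for sense_a, gloss_a in dictionary[word_a].items():
        for sense_b, gloss_b in dictionary[word_b].items():
            score = overlap(gloss_a, gloss_b)
            if score > best_score:
                best_pair, best_score = (sense_a, sense_b), score
    return best_pair, best_score

dictionary = {
    "pine": {
        "pine#1": "kinds of evergreen tree with needle-shaped leaves",
        "pine#2": "waste away through sorrow or illness",
    },
    "cone": {
        "cone#1": "solid body which narrows to a point",
        "cone#2": "fruit of certain evergreen trees",
    },
}

print(naive_lesk("pine", "cone", dictionary))
# (('pine#1', 'cone#2'), 2)  -- the glosses share "of" and "evergreen"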

Accuracy on Pride and Prejudice and selected Associated Press articles was found to be in the 50% to 70% range.

A simplified version of the Lesk algorithm compares the dictionary definition of an ambiguous word with the terms contained in its neighbourhood.
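
A minimal sketch of this simplified variant is given below: each sense of the ambiguous word is scored by how many of its definition terms also appear in the surrounding context. The stop-word list and toy glosses are illustrative assumptions.

# Sketch of the simplified Lesk variant: score each sense by the
# overlap between its gloss and the context words.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "to", "and", "or"}

def simplified_lesk(senses, context):
    """Return the sense whose gloss shares the most terms with the context."""
    context_terms = {w.lower() for w in context} - STOP_WORDS
    best_sense, best_score = None, -1
    for sense, gloss in senses.items():
        gloss_terms = set(gloss.lower().split()) - STOP_WORDS
        score = len(gloss_terms & context_terms)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

senses = {
    "bank#1": "a financial institution that accepts deposits",
    "bank#2": "sloping land beside a body of water",
}
context = "he sat on the bank of the river and watched the water".split()
print(simplified_lesk(senses, context))  # bank#2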

Versions have been adapted to WordNet.[2]
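
As one illustration of a WordNet-based variant, the NLTK toolkit ships a Lesk implementation; the usage sketch below assumes NLTK and its WordNet corpus are installed, and it is not the adapted algorithm of the cited paper.

# Usage sketch of NLTK's WordNet-based Lesk implementation
# (illustration only; assumes nltk and its WordNet data are installed).
from nltk.wsd import lesk

context = "I went to the bank to deposit my money".split()
sense = lesk(context, "bank", pos="n")   # returns a WordNet Synset
print(sense, "-", sense.definition())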

References

  1. ^ Mike Lesk, "Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone", Proceedings of the 5th Annual International Conference on Systems Documentation (SIGDOC '86), pp. 24–26, 1986. ISBN 0897912241.
  2. ^ Satanjeev Banerjee and Ted Pedersen, "An Adapted Lesk Algorithm for Word Sense Disambiguation Using WordNet", Lecture Notes in Computer Science, Vol. 2276, pp. 136–145, 2002. ISBN 3540432191.