
Lesk algorithm

From Wikipedia, the free encyclopedia

The Lesk algorithm is a classical algorithm for word sense disambiguation, introduced by Mike Lesk in 1986.[1]

The Lesk algorithm is based on the assumption that words in a given neighbourhood will tend to share a common topic. A naive implementation of the Lesk algorithm would be to (a sketch in Python follows the list):

  1. choose pairs of ambiguous words within a neighbourhood,
  2. check their definitions in a dictionary,
  3. choose the senses so as to maximise the number of common terms in the definitions of the chosen words.
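
The following is a minimal sketch of that pairwise procedure. The toy glosses, the sense labels, and the word pair ("pine", "cone") are illustrative assumptions loosely echoing Lesk's well-known example, not data from the original paper.

# Minimal sketch of the naive pairwise Lesk idea described above.
# The toy dictionary and sense labels are illustrative assumptions.

def overlap(definition_a, definition_b):
    """Count distinct terms shared by two definitions."""
    return len(set(definition_a.lower().split()) & set(definition_b.lower().split()))

def naive_lesk(word_a, word_b, dictionary):
    """Pick the sense pair whose definitions share the most terms."""
    best_pair, best_score = None, -1
    for sense_a, gloss_a in dictionary[word_a].items():
        for sense_b, gloss_b in dictionary[word_b].items():
            score = overlap(gloss_a, gloss_b)
            if score > best_score:
                best_pair, best_score = (sense_a, sense_b), score
    return best_pair, best_score

dictionary = {
    "pine": {
        "pine#1": "kinds of evergreen tree with needle-shaped leaves",
        "pine#2": "waste away through sorrow or illness",
    },
    "cone": {
        "cone#1": "solid body which narrows to a point",
        "cone#2": "fruit of certain evergreen trees",
    },
}

print(naive_lesk("pine", "cone", dictionary))
# (('pine#1', 'cone#2'), 2)  -- the glosses share "of" and "evergreen"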

Accuracy on Pride and Prejudice and selected Associated Press articles was found to be in the 50% to 70% range.

A simplified version of the Lesk algorithm compares the dictionary definition of an ambiguous word with the terms contained in its neighbourhood.
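
A minimal sketch of this simplified variant is given below: each sense of the ambiguous word is scored by how many of its definition terms also appear in the surrounding context. The stop-word list and toy glosses are illustrative assumptions.

# Sketch of the simplified Lesk variant: score each sense by the
# overlap between its gloss and the context words.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "to", "and", "or"}

def simplified_lesk(senses, context):
    """Return the sense whose gloss shares the most terms with the context."""
    context_terms = {w.lower() for w in context} - STOP_WORDS
    best_sense, best_score = None, -1
    for sense, gloss in senses.items():
        gloss_terms = set(gloss.lower().split()) - STOP_WORDS
        score = len(gloss_terms & context_terms)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

senses = {
    "bank#1": "a financial institution that accepts deposits",
    "bank#2": "sloping land beside a body of water",
}
context = "he sat on the bank of the river and watched the water".split()
print(simplified_lesk(senses, context))  # bank#2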

Versions have been adapted to WordNet.[2]
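
As one illustration of a WordNet-based variant, the NLTK toolkit ships a Lesk implementation; the usage sketch below assumes NLTK and its WordNet corpus are installed, and it is not the adapted algorithm of the cited paper.

# Usage sketch of NLTK's WordNet-based Lesk implementation
# (illustration only; assumes nltk and its WordNet data are installed).
from nltk.wsd import lesk

context = "I went to the bank to deposit my money".split()
sense = lesk(context, "bank", pos="n")   # returns a WordNet Synset
print(sense, "-", sense.definition())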

References

  1. ^ Mike Lesk, "Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone", Proceedings of the 5th Annual International Conference on Systems Documentation (SIGDOC '86), pp. 24–26, 1986. ISBN 0897912241.
  2. ^ Satanjeev Banerjee and Ted Pedersen, "An Adapted Lesk Algorithm for Word Sense Disambiguation Using WordNet", Lecture Notes in Computer Science, Vol. 2276, pp. 136–145, 2002. ISBN 3540432191.