Jump to content

Co-occurrence

From Wikipedia, the free encyclopedia

In linguistics, co-occurrence or cooccurrence (in older texts often shown with diacritic as coöccurrence) is an above-chance frequency of ordered occurrence of two adjacent terms in a text corpus. Co-occurrence in this linguistic sense can be interpreted as an indicator of semantic proximity or an idiomatic expression. Corpus linguistics and its statistical analyses can reveal (regularity of) patterns of co-occurrences within a language and enable the working out of typical collocations for its lexical items.

A co-occurrence restriction is identified when linguistic elements never occur together. Analysis of these restrictions can lead to discoveries about the structure and development of a language.[1]

Co-occurrence can be seen an extension of word counting in higher dimensions. Co-occurrence can be quantitatively described using measures like a massive correlation or mutual information.

Co-occurrence information and knowledge of co-occurring words may be relevant in analysis of language for the purposes of large language models, part of the emerging field of artificial intelligence, and helpful in word games such as scrabble.

See also

[edit]

References

[edit]
  1. ^ Kroeger, Paul (2005). Analyzing Grammar: An Introduction. Cambridge: Cambridge University Press. p. 20. ISBN 978-0-521-01653-7.
[edit]