Keyword extraction

Keyword extraction is the automatic identification of the terms that best describe the subject of a document.^[1] ^[2]

Key phrases, key terms, key segments, and keywords are all names for the terms that represent the most relevant information contained in a document. Although the terminology differs, the function is the same: characterizing the topic discussed in a document. Keyword extraction is an important problem in text mining, information retrieval, and natural language processing.^[3]

Keyword assignment vs. extraction

Methods for automatically identifying keywords can be roughly divided into:

keyword assignment (keywords are chosen from a controlled vocabulary or taxonomy) and
keyword extraction (keywords are chosen from words that are explicitly mentioned in the original text).
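
The difference between the two approaches can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and does not come from the cited sources: the function names, the tiny controlled vocabulary with its trigger words, the stopword set, and the sample sentence.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def assign_keywords(text, vocabulary):
    """Keyword assignment: return labels from a controlled vocabulary.

    `vocabulary` maps each label to a list of trigger words (a hypothetical
    structure). A label is assigned when any of its triggers occurs in the text.
    """
    tokens = set(tokenize(text))
    return sorted(label for label, triggers in vocabulary.items()
                  if tokens & set(triggers))


def extract_keywords(text, stopwords):
    """Keyword extraction: return non-stopword tokens that occur in the text,
    in order of first appearance and without duplicates."""
    seen = []
    for token in tokenize(text):
        if token not in stopwords and token not in seen:
            seen.append(token)
    return seen


# Hypothetical controlled vocabulary and stopword list.
VOCABULARY = {
    "Finance": ["bank", "loan", "interest"],
    "Sports": ["match", "goal", "team"],
}
STOPWORDS = {"the", "a", "its", "on"}

text = "The bank raised its interest rate on loans."
assigned = assign_keywords(text, VOCABULARY)   # labels taken from the vocabulary
extracted = extract_keywords(text, STOPWORDS)  # words taken from the text itself
```

In this sketch the assigned label "Finance" never appears in the sentence, whereas every extracted keyword does, which is the distinction drawn in the list above.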

Methods for automatic keyword extraction can be supervised, semi-supervised, or unsupervised.^[4] Unsupervised methods can be further divided into simple statistical, linguistic, graph-based, and other approaches.
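
As an illustration of the simple statistical family of unsupervised methods, the following Python sketch ranks words by raw frequency after removing stopwords. It is one possible baseline written under stated assumptions, not the method of any cited paper: the function name `top_keywords`, the stopword set, and the sample text are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a much larger one.
STOPWORDS = frozenset({"a", "an", "the", "in", "of", "that", "and", "to"})


def top_keywords(text, k=3, stopwords=STOPWORDS):
    """Return the k most frequent non-stopword tokens in the text.

    No training data is needed, which is what makes the approach unsupervised.
    Ties are broken by order of first appearance (Counter.most_common
    preserves insertion order for equal counts in Python 3.7+).
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(token for token in tokens if token not in stopwords)
    return [word for word, _ in counts.most_common(k)]


sample = (
    "Graph methods rank words in a graph. "
    "Statistical methods count words. "
    "Words that score highest become keywords."
)
keywords = top_keywords(sample, k=3)
```

Frequency alone tends to favor common words, so practical statistical methods usually add some form of weighting. Graph-based methods instead rank words by their position in a graph built from the text.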

References

[1] "An Overview of Graph-Based Keyword Extraction Methods and Approaches". Journal of Information and Organizational Sciences. 39 (1): 1–20. 2015.
[2] "TextRank: Bringing Order into Texts". Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2004). Barcelona, Spain. July 2004.
[3] "Toward Selectivity-Based Keyword Extraction for Croatian News". Surfacing the Deep and the Social Web (SDSW 2014). Vol. 1310. Italy: CEUR Proc. 2014. pp. 1–14.
[4] "SemCluster: Unsupervised Automatic Keyphrase Extraction Using Affinity Propagation". 17th UK Workshop on Computational Intelligence. 2017.
