Suffix tree clustering

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by NetworkOP (talk | contribs) at 19:08, 2 January 2015. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Suffix tree clustering, often abbreviated as STC, is an approach to clustering that uses suffix trees.[1] The suffix tree keeps track of every n-gram, of any length, that occurs in the word strings inserted into it, while allowing differing strings to be inserted incrementally, one after another. This has the advantage that a large number of clusters can be handled sequentially. A potential disadvantage is that it also increases the number of candidate documents that must be examined when handling large data sets. Suffix tree clustering can be either decompositional or agglomerative in nature, depending on the type of data being handled.[2]
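The idea of tracking every word n-gram while inserting strings one at a time can be illustrated with a minimal sketch. It is not taken from the cited sources. It assumes whitespace tokenization and, for brevity, uses an uncompressed word-level suffix trie (quadratic in document length) in place of a true suffix tree. The names `build_suffix_trie` and `base_clusters` and the threshold `min_docs` are hypothetical. The sketch stops at listing phrases shared by several documents and does not attempt any later merging or scoring of those groups.

```python
def build_suffix_trie(documents):
    """Insert every word-level suffix of every document into a nested-dict trie.

    Each node stores the set of document indices whose text contains the
    phrase spelled out by the path from the root to that node.
    Assumption: a plain trie stands in for a compressed suffix tree.
    """
    def new_node():
        return {"children": {}, "docs": set()}

    root = new_node()
    # Documents are inserted incrementally, one after another.
    for doc_id, text in enumerate(documents):
        words = text.lower().split()
        for start in range(len(words)):
            node = root
            # Walk (and extend) the path for the suffix beginning at `start`.
            for word in words[start:]:
                node = node["children"].setdefault(word, new_node())
                node["docs"].add(doc_id)
    return root


def base_clusters(root, min_docs=2):
    """Return {phrase: sorted document ids} for phrases in at least `min_docs` documents."""
    clusters = {}
    stack = [((), root)]
    while stack:
        phrase, node = stack.pop()
        if phrase and len(node["docs"]) >= min_docs:
            clusters[" ".join(phrase)] = sorted(node["docs"])
        for word, child in node["children"].items():
            stack.append((phrase + (word,), child))
    return clusters


if __name__ == "__main__":
    docs = ["cat ate cheese", "mouse ate cheese too", "cat ate mouse too"]
    for phrase, members in sorted(base_clusters(build_suffix_trie(docs)).items()):
        print(phrase, members)
```

In this toy input the phrase "ate cheese" groups the first two documents, while "ate" alone groups all three. Each shared phrase therefore acts as a candidate cluster label.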

References

  1. ^ "CS276A Final Project" (PDF). Retrieved 2 January 2015.
  2. ^ "Lecture 4: Clustering". Retrieved 2 January 2015.