Jump to content

Random indexing

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Wavelength (talk | contribs) at 15:49, 11 April 2014 (inserting 1 hyphen: —> "very-high-dimensional"—User talk:Wavelength#Hyphenation [to Archive 6]). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Random indexing is a dimension reduction method and computational framework for Distributional semantics, based on the insight that very-high-dimensional Vector Space Model implementations are impractical, that models need not grow in dimensionality when new items (e.g. new terminology) is encountered, and that a high-dimensional model can be projected into a space of lower dimensionality without compromising distance metrics if the resulting dimensions are chosen appropriately, which is the original point of the random projection approach to dimension reduction first formulated as the Johnson–Lindenstrauss lemma. Locality-sensitive hashing has some of the same starting points. Random indexing, as used in representation of language, originates from the work of Pentti Kanerva on Sparse distributed memory, and can be described as an incremental formulation of a random projection.

References

  • Kanerva, P., Kristoferson, J. & Holst, A. (2000): Random Indexing of Text Samples for Latent Semantic Analysis, Proceedings of the 22nd Annual Conference of the Cognitive Science Society, p. 1036. Mahwah, New Jersey: Erlbaum, 2000.
  • Sahlgren, M. (2005) An Introduction to Random Indexing, Proceedings of the Methods and Applications of Semantic Indexing Workshop at the 7th International Conference on Terminology and Knowledge Engineering, TKE 2005, August 16, Copenhagen, Denmark.
  • Sahlgren, M., Holst, A. & P. Kanerva (2008) Permutations as a Means to Encode Order in Word Space, In Proceedings of the 30th Annual Conference of the Cognitive Science Society: 1300-1305.
  • Kanerva, P. (2009) Hyperdimensional Computing: An Introduction to Computing in Distributed Representation with High-Dimensional Random Vectors, Cognitive Computation, Volume 1, Issue 2, pp. 139–159.
  • Cohen T., Schvaneveldt R. & Widdows D. (2009) Reflective Random Indexing and indirect inference: a scalable method for discovery of implicit connections, Journal of Biomedical Information, 43(2):240-56.