
Latent semantic structure indexing

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Richataxon (talk | contribs) at 12:52, 7 October 2007. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Latent Semantic Structure Indexing (LaSSI) is a technique for calculating chemical similarity derived from Latent semantic analysis (LSA).

LaSSI was developed at Merck & Co. and patented in 2007 [1] by Richard Hull, Eugene Fluder, Suresh Singh, Robert Sheridan, Robert Nachbar and Simon Kearsley.


Overview

LaSSI is similar to LSA in that it constructs an occurrence matrix from a corpus of items and applies singular value decomposition to that matrix to derive latent concepts. The difference is that the occurrence matrix records the frequency of two- and three-dimensional chemical descriptors (rather than natural language terms) found in a database of chemical structures. The resulting latent chemical structure concepts can be used to calculate chemical similarity for drug discovery.
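
The general idea can be illustrated with a minimal Python sketch. This is not the published LaSSI implementation: the descriptor names, molecule labels and counts are invented for illustration, the number of retained dimensions (k = 2) is arbitrary, cosine similarity is assumed as the comparison measure, and the helper functions latent_coordinates and cosine_similarity are hypothetical. NumPy is assumed to be installed.

```python
import numpy as np

# Hypothetical toy data: rows are chemical descriptors, columns are molecules.
# The counts are invented for illustration and come from no real database.
descriptors = ["aromatic ring", "carboxylic acid", "hydroxyl", "amine"]
molecules = ["mol_A", "mol_B", "mol_C", "mol_D"]
occurrence = np.array([
    [2, 2, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 2, 1],
    [0, 0, 1, 2],
], dtype=float)


def latent_coordinates(matrix, k):
    """Map each column (molecule) to k latent dimensions via truncated SVD."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Keep the k largest singular values; scale the right singular vectors by them.
    return (s[:k, None] * vt[:k, :]).T  # shape: (n_molecules, k)


def cosine_similarity(a, b):
    """Cosine of the angle between two latent-space vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


coords = latent_coordinates(occurrence, k=2)
sim_ab = cosine_similarity(coords[0], coords[1])  # mol_A vs mol_B
sim_ad = cosine_similarity(coords[0], coords[3])  # mol_A vs mol_D

print("mol_A vs mol_B:", round(sim_ab, 3))
print("mol_A vs mol_D:", round(sim_ad, 3))
```

In this toy matrix mol_A and mol_B share most of their descriptors while mol_A and mol_D share none, so the first pair should score higher in the reduced space. In practice the matrix would be far larger and sparser, and the choice of k would need to be tuned.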

References

Hull, R.D., Fluder, E.M., Singh, S.B., Nachbar, R.B., Sheridan, R.P. and Kearsley, S.K. (2001) "Latent semantic structure indexing (LaSSI) for defining chemical similarity." J Med Chem, 2001 Apr 12;44(8):1177-84.

Hull, R.D., Singh, S.B., Nachbar, R.B., Sheridan, R.P., Kearsley, S.K. and Fluder, E.M. (2001) "Chemical similarity searches using latent semantic structure indexing (LaSSI) and comparison to TOPOSIM." J Med Chem, 2001 Apr 12;44(8):1185-91.

Singh, S.B., Sheridan, R.P., Fluder, E.M. and Hull, R.D. (2001) "Mining the chemical quarry with joint chemical probes: an application of latent semantic structure indexing (LaSSI) and TOPOSIM (Dice) to chemical database mining." J Med Chem, 2001 May 10;44(10):1564-75.