User:Van Parunak/sandbox

This is the user sandbox of Van Parunak. A user sandbox is a subpage of the user's user page. It serves as a testing spot and page development space for the user and is not an encyclopedia article. Create or edit your own sandbox here.

Other sandboxes: Main sandbox | Template sandbox

Finished writing a draft article? Are you ready to request review of it by an experienced editor for possible inclusion in Wikipedia? Submit your draft for review!

Grammar-based distances refer to a family of String metrics that take into account the grammars that generate the strings being compared. Most string measures simply count the changes in characters needed to transform one string into another (Edit distance), or even more generally, the number of shared and distinct characters (e.g., Overlap coefficient). The fundamental insight of grammar-based distance is that if two strings differ at a character that is generated by the same grammatical production, the strings should be reckoned closer to one another than if the distinguishing character arises from higher-level productions.

Most work on grammar-based distances is based on inducing the grammar underlying a set of strings by applying Lempel-Ziv compression, which effectively reconstructs a production sequence yielding the strings. This approach is increasingly popular in biological settings ^[1] ^[2]. An alternative approach is based on knowing the underlying grammar of the strings in advance.

Distance Measures from Induced Grammars

Distance Measures from Explicit Grammars

References

^ Otu, Hasan; Sayood, Khalid (2003). "A new sequence distance measure for phylogenetic tree construction" (PDF). Bioinformatics. 19 (16): 2122-2130. doi:10.1093/bioinformatics/btg295. Retrieved 19 July 2017.
^ Russell, David J.; Otu, Hasan H.; Sayood, Khalid (2008). "Grammar-based distance in progressive multiple sequence alignment". BMC Bioinformatics. 9. doi:10.1186/1471-2105-9-306. Retrieved 19 July 2017.{{cite journal}}: CS1 maint: unflagged free DOI (link)

External links

Category:Information retrieval evaluation Category:String similarity measures Category:Measure theory Category:Similarity and distance measures

This computer science article is a stub. You can help Wikipedia by expanding it.

[1] Otu, Hasan; Sayood, Khalid (2003). "A new sequence distance measure for phylogenetic tree construction" (PDF). Bioinformatics. 19 (16): 2122-2130. doi:10.1093/bioinformatics/btg295. Retrieved 19 July 2017.

[2] Russell, David J.; Otu, Hasan H.; Sayood, Khalid (2008). "Grammar-based distance in progressive multiple sequence alignment". BMC Bioinformatics. 9. doi:10.1186/1471-2105-9-306. Retrieved 19 July 2017.{{cite journal}}: CS1 maint: unflagged free DOI (link)

[1]

[2]

Distance Measures from Induced Grammars

Distance Measures from Explicit Grammars

See also

References

External links