CodonCode Aligner
CodonCode Aligner | |
---|---|
Developer(s) | CodonCode Corporation |
Stable release | 3.0.1 / 30 March 2009 |
Operating system | Mac OS X, Windows |
Type | Bioinformatics |
License | commercial; free for limited use (trace viewing & editing) |
Website | http://www.codoncode.com/aligner |
CodonCode Aligner is a commercial application for DNA sequence assembly, sequence alignment, and editing on Mac OS X and Windows.
Features
- Chromatogram editing, end clipping, and vector trimming.
- Sequence assembly and contig editing.
- Aligning cDNA against genomic templates.
- Sequence alignment and editing.
- Alignment of contigs to each other with ClustalW, MUSCLE, or built-in algorithms.
- Mutation detection, including detection of heterozygous single-nucleotide polymorphisms.
- Analysis of heterozygous insertions and deletions.
- Launching online BLAST searches.
- Restriction analysis: finding and viewing restriction cut sites.
- Trace sharpening.
- Support for Phred, Phrap, ClustalW, and MUSCLE.
CodonCode Aligner has several features not found in other sequence assembly programs. These include the ability to align contigs to each other using ClustalW, MUSCLE, or built-in alignment algorithms, while maintaining the links to the sequences from which the contigs were generated. This enables rapid verification of differences by going back to the underlying sequence traces directly from the aligned contigs. This feature was requested by scientists on the EvolDir mailing list[1] in 2005[2], and added to CodonCode Aligner in 2006[3].
Sequence Assembly and Alignment Methods
CodonCode Aligner provides several methods for sequence assembly and sequence alignment. The built-in algorithms are based on banded versions of the Needleman-Wunsch algorithm and the Smith-Waterman algorithm. Early versions used a local alignment method by default, imitating the approach used by Phrap (although, according to the software documentation, the assembly algorithm is simplified). However, assemblies based on local alignments can leave sequence ends unaligned, which can confuse users accustomed to the global (end-to-end) alignments more commonly used in other assembly programs. Later versions of CodonCode Aligner use a global alignment algorithm by default. A third alignment algorithm, called "large gap alignments", was introduced in version 1.6.2 to enable alignment of cDNA to genomic DNA. Algorithms can be selected and configured in the CodonCode Aligner preferences.
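The difference between global and local alignment can be illustrated with a minimal sketch. It is not CodonCode Aligner's code: the function names, the scoring values (match +1, mismatch -1, gap -2), and the way the band is applied are all assumptions chosen for illustration. `global_score` computes a Needleman-Wunsch score and, when `band` is given, only fills cells within that distance of the main diagonal; `local_score` computes a Smith-Waterman score.

```python
NEG_INF = float("-inf")


def global_score(a, b, match=1, mismatch=-1, gap=-2, band=None):
    """Needleman-Wunsch (end-to-end) score, limited to cells with |i - j| <= band."""
    n, m = len(a), len(b)
    if band is None:
        band = max(n, m)  # no restriction: the full matrix is filled
    # Row 0: aligning a prefix of b against nothing costs one gap per base.
    prev = [j * gap if j <= band else NEG_INF for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [NEG_INF] * (m + 1)
        if i <= band:
            cur[0] = i * gap
        for j in range(max(1, i - band), min(m, i + band) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap)
        prev = cur
    return prev[m]  # -inf if the band is too narrow to reach the corner


def local_score(a, b, match=1, mismatch=-1, gap=-2):
    """Smith-Waterman score: best-scoring pair of substrings, ends left free."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(0, prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


# Two reads that overlap in the middle but have different ends.
read_1 = "GGACGTACGT"
read_2 = "ACGTACGTCC"
print("local :", local_score(read_1, read_2))
print("global:", global_score(read_1, read_2, band=2))
```

With these example reads, the local score covers only the shared `ACGTACGT` core, while the global score also pays for the two unmatched bases at each end. This is the "unaligned ends" effect described above. A narrower band saves work but cannot represent alignments that drift further from the diagonal than the band allows.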
For resequencing projects, CodonCode Aligner supports the use of reference sequences. Multiple sequences can be designated as reference sequences; the program will choose the best reference sequence for each alignment, based on matching word counts. CodonCode Aligner uses the term "Alignment" for assemblies to a reference sequence.
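Choosing a reference by matching word counts can be sketched as follows. This is an illustrative assumption about what such a selection might look like, not the program's documented procedure: the word length `k`, the use of distinct shared words, and the names `kmers` and `best_reference` are all hypothetical.

```python
def kmers(seq, k):
    """Return the set of distinct words (k-mers) of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_reference(read, references, k=8):
    """Pick the reference that shares the most distinct words with the read.

    references maps a reference name to its sequence. Returns the winning
    name together with the per-reference counts.
    """
    read_words = kmers(read, k)
    counts = {name: len(read_words & kmers(ref, k))
              for name, ref in references.items()}
    return max(counts, key=counts.get), counts


# Made-up example sequences, for illustration only.
references = {
    "geneA": "ACGTACGGTTCAGGCATTAGC",
    "geneB": "TTGACCAGTAGGCTTAACCGA",
}
name, counts = best_reference("GGTTCAGGCATT", references, k=4)
print(name, counts)
```

Counting shared words is much cheaper than running a full alignment against every candidate, which is why a word-based screen is a common first step before the alignment itself.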
In addition to the built-in algorithms for sequence assembly, CodonCode Aligner also supports sequence assemblies and alignments with Phrap. The alignment programs ClustalW and MUSCLE can be used for the "Compare contigs" option, which generates "contigs of contigs".
In resequencing projects, scientists often compare sequences from a number of different sources, for example different patients in medical genetics, different species in phylogenetics, or strains from different locations in biogeography. CodonCode Aligner provides an option called "Assemble in Groups" to automatically build separate contigs for each of the different sources, based on the sample names. Sample name interpretation can be configured to use separator characters, fixed-length name parts, and regular expressions. Sequences can also be automatically base called, end clipped, and vector trimmed before assembly.
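The three ways of interpreting sample names can be sketched with a small grouping function. The function `group_by_sample`, its parameters, and the file names are hypothetical and are not taken from CodonCode Aligner. The sketch only shows how a separator, a fixed-length prefix, or a regular expression can each yield a group key.

```python
import re
from collections import defaultdict


def group_by_sample(names, separator=None, part=0, prefix_len=None, pattern=None):
    """Group sequence names by a key derived from each name.

    Exactly one strategy is used, checked in this order:
    pattern    - regular expression whose first capture group is the key
    prefix_len - the first prefix_len characters are the key
    separator  - the name is split on separator and piece number `part` is the key
    Names that cannot be interpreted form their own single-member group.
    """
    groups = defaultdict(list)
    for name in names:
        key = name
        if pattern is not None:
            found = re.search(pattern, name)
            if found:
                key = found.group(1)
        elif prefix_len is not None:
            key = name[:prefix_len]
        elif separator is not None:
            pieces = name.split(separator)
            if part < len(pieces):
                key = pieces[part]
        groups[key].append(name)
    return dict(groups)


# Made-up trace file names: sample, amplicon, read direction.
traces = [
    "patient01_exon2_F.ab1",
    "patient01_exon2_R.ab1",
    "patient02_exon2_F.ab1",
]
print(group_by_sample(traces, separator="_"))
print(group_by_sample(traces, prefix_len=9))
print(group_by_sample(traces, pattern=r"^(patient\d+)"))
```

Each resulting group would then be assembled separately, giving one contig per sample rather than one contig that mixes all samples.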
For the generation of "consensus" sequences for contigs, CodonCode Aligner uses a sequence-quality based method by default, similar to what was originally proposed by Bonfield and Staden[4] and later used by Phrap. This is particularly convenient in resequencing projects, where consensus sequences are often formed from two sequences read in opposite directions. Near the ends of the sequences, one sequence is often high quality, while the other sequence is low quality and contains errors. With older sequence assemblers that use majority-based consensus sequences, manual editing is required at every single discrepancy. In contrast, a quality-based consensus sequence will typically be correct, substantially reducing the amount of manual editing that is required.
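The contrast between majority-based and quality-based consensus calling can be shown with a deliberately simplified sketch. It is not the Bonfield and Staden method, Phrap's algorithm, or CodonCode Aligner's implementation. It assumes that each alignment column is a non-empty list of (base, Phred-style quality) pairs, and that summing the qualities per base is an adequate stand-in for a real quality model.

```python
from collections import Counter, defaultdict


def majority_call(column):
    """Most frequent base in a non-empty column; 'N' if the top two are tied."""
    ranked = Counter(base for base, _ in column).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "N"
    return ranked[0][0]


def quality_call(column):
    """Base with the highest summed quality in a non-empty column."""
    totals = defaultdict(int)
    for base, quality in column:
        totals[base] += quality
    return max(totals, key=totals.get)


def consensus(columns, call):
    """Apply a per-column caller to every column of the alignment."""
    return "".join(call(column) for column in columns)


# Forward read (high quality) and reverse read (low quality near its end).
columns = [
    [("A", 45), ("A", 12)],
    [("C", 40), ("T", 8)],   # the two reads disagree here
    [("G", 38), ("G", 10)],
]
print("majority:", consensus(columns, majority_call))
print("quality :", consensus(columns, quality_call))
```

With only two reads per column, every disagreement is a tie for the majority caller and has to be resolved by hand, whereas the quality-weighted caller follows the high-quality read.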
History
The first beta version of CodonCode Aligner was released in April 2003, followed by the first full version in June 2003. Major upgrades were released in 2003, 2004, 2005, 2006, 2007, and 2008.
As of April 2009, CodonCode Aligner had been cited in more than 400 scientific publications. Citations cover a wide variety of biomedical research areas, including HIV research[5][6][7], biogeography and environmental biology[8][9], DNA methylation studies[10], genetic diseases[11][12][13], clinical microbiology[14][15], and evolution research and phylogenetics[16][17][18].
References
- http://evol.mcmaster.ca/evoldir.html
- http://evol.mcmaster.ca/~brian/netevoldir/Archive/Mnth_Review_Jul_05.pdf
- http://evol.mcmaster.ca/~brian/evoldir/Archive/Mnth_Review_Apr_06.pdf
- Bonfield JK, Staden R (1995). "The application of numerical estimates of base calling accuracy to DNA sequencing projects". Nucleic Acids Res. 23 (8): 1406-10. PMID 7753633.
- Bailey JR, Sedaghat AR, Kieffer T, Brennan T, Lee PK, Wind-Rotolo M, Haggerty CM, Kamireddi AR, Liu Y, Lee J, Persaud D, Gallant JE, Cofrancesco J, Quinn TC, Wilke CO, Ray SC, Siliciano JD, Nettles RE, Siliciano RF (2006). "Residual Human Immunodeficiency Virus Type 1 Viremia in Some Patients on Antiretroviral Therapy Is Dominated by a Small Number of Invariant Clones Rarely Found in Circulating CD4+ T Cells†". J Virol. 80: 6441-6457.
- Calis JC, Rotteveel HP, van der Kuyl AC, Zorgdrager F, Kachala D, van Hensbroek MB, Cornelissen M (2008). "Severe anaemia is not associated with HIV-1 env gene characteristics in Malawian children". BMC Infect Dis. 8: 26.
- Mild M, Esbjörnsson J, Fenyö EM, Medstrand P (2007). "Frequent Intrapatient Recombination between Human Immunodeficiency Virus Type 1 R5 and X4 Envelopes: Implications for Coreceptor Switch▿". J Virol. 81: 3369-3376.
- Pendley CJ, Becker EA, Karl JA, Blasky AJ, Wiseman RW, Hughes AL, O'Connor SL, O'Connor DH (2008). "MHC class I characterization of Indonesian cynomolgus macaques". Immunogenetics. 60: 339-351.
- Behnke A, Bunge J, Barger K, Breiner H, Alla V, Stoeck T (2006). "Microeukaryote Community Patterns along an O2/H2S Gradient in a Supersulfidic Anoxic Fjord (Framvaren, Norway)†". Appl Environ Microbiol. 72: 3626-3636.
- Bart A, van Passel MWJ, van Amsterdam K, van der Ende A (2005). "Direct detection of methylation in genomic DNA". Nucleic Acids Res. 33: e124.
- Andersson LS, Juras R, Ramsey DT, Eason-Butler J, Ewart S, Cothran G, Lindgren G (2008). "Equine Multiple Congenital Ocular Anomalies maps to a 4.9 megabase interval on horse chromosome 6". BMC Genet. 9: 88.
- Tremblay K, Lemire M, Potvin C, Tremblay A, Hunninghake GM, Raby BA, Hudson TJ, Perez-Iratxeta C, Andrade-Navarro MA, Laprise C (2008). "Genes to Diseases (G2D) Computational Method to Identify Asthma Candidate Genes". PLoS ONE. 3.
- McCullough BJ, Adams JC, Shilling DJ, Feeney MP, Sie KCY, Tempel BL (2007). "3p-Syndrome Defines a Hearing Loss Locus in 3p25.3". Hear Res. 224: 51-60.
- Pignone M, Greth KM, Cooper J, Emerson D, Tang J (2006). "Identification of Mycobacteria by Matrix-Assisted Laser Desorption Ionization-Time-of-Flight Mass Spectrometry". J Clin Microbiol. 44: 1963-1970.
- van Amsterdam K, Bart A, van der Ende A (2005). "A Helicobacter pylori TolC Efflux Pump Confers Resistance to Metronidazole". Antimicrob Agents Chemother. 49: 1477-1482.
- Baxter SW, Papa R, Chamberlain N, Humphray SJ, Joron M, Morrison C, ffrench-Constant RH, McMillan WO, Jiggins CD (2008). "Convergent Evolution in the Genetic Basis of Müllerian Mimicry in Heliconius Butterflies". Genetics. 180: 1567-1577.
- Siddall ME, Trontelj P, Utevsky SY, Nkamany M, Macdonald KS (2007). "Diverse molecular data demonstrate that commercially available medicinal leeches are not Hirudo medicinalis". Proc Biol Sci. 274: 1481-1487. PMID 17426015.
- Stoeck T, Kasper J, Bunge J, Leslin C, Ilyin V, Epstein S (2007). "Protistan Diversity in the Arctic: A Case of Paleoclimate Shaping Modern Biodiversity?". PLoS ONE. 2. PMID 17710128.