LSCGAT: LONG SEQUENCES CUSTOMIZABLE GLOBAL ALIGNMENT TOOL

Anton Nikolaevich Pankratov
Ruslan Kurmanbievich Tetuev
Maxim Ivanovich Pyatkov

Abstract

The most general model of pairwise global alignment, capable of handling long sequences, is considered. Its features are: alignment of sequences over different alphabets, including nucleotides and amino acids; alternative alignments with the same score; a predefined or fully customizable scoring matrix; and a separate gap penalty system for each sequence, including end gap penalties.
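The model features listed above can be illustrated by scoring an already aligned pair of rows. The following is a minimal sketch in Python, not the tool's own code: the names GapPenalty and score_alignment are hypothetical, the scoring matrix is assumed to be a dictionary keyed by symbol pairs (so the two sequences may use different alphabets), and a gap run of length k is assumed to cost open + extend * k, which is only one of several common conventions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GapPenalty:
    """Hypothetical affine penalty set for gaps inserted into one sequence."""
    open_cost: float        # charged once per internal gap run
    extend_cost: float      # charged per gap symbol in an internal run
    end_open_cost: float    # same as open_cost, for runs touching an alignment end
    end_extend_cost: float  # same as extend_cost, for runs touching an alignment end


def score_alignment(row_a, row_b, matrix, gaps_a, gaps_b, gap="-"):
    """Score two aligned rows; gaps_a applies to gaps in row_a, gaps_b to row_b."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    if any(x == gap and y == gap for x, y in zip(row_a, row_b)):
        raise ValueError("a column cannot contain two gaps")
    n = len(row_a)
    total = 0.0
    i = 0
    while i < n:
        x, y = row_a[i], row_b[i]
        if x != gap and y != gap:
            total += matrix[(x, y)]  # substitution score from the custom matrix
            i += 1
            continue
        # Find the whole gap run in the row that holds the gap symbol.
        row, pen = (row_a, gaps_a) if x == gap else (row_b, gaps_b)
        j = i
        while j < n and row[j] == gap:
            j += 1
        at_end = i == 0 or j == n
        open_cost = pen.end_open_cost if at_end else pen.open_cost
        extend_cost = pen.end_extend_cost if at_end else pen.extend_cost
        total -= open_cost + extend_cost * (j - i)
        i = j
    return total


# Illustrative values only: +1/-1 nucleotide matrix, free end gaps.
dna = {(x, y): (1 if x == y else -1) for x in "ACGT" for y in "ACGT"}
strict = GapPenalty(open_cost=5, extend_cost=1, end_open_cost=0, end_extend_cost=0)
lenient = GapPenalty(open_cost=2, extend_cost=0.5, end_open_cost=0, end_extend_cost=0)
```

Passing different GapPenalty objects for the two rows mirrors the per-sequence gap systems of the model, and setting the end costs to zero makes terminal gaps free.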


The versatile parallel algorithm developed for global alignment is based on the Needleman-Wunsch algorithm with an arbitrary scoring matrix and on the Gotoh algorithm for affine gap penalties. The main features of the algorithm are optimal memory consumption and parallel computation. It is proved that the algorithm can align two sequences of length L using O(L^(4/3)) memory.
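For orientation, the Gotoh recurrences for affine gap penalties can be sketched as follows. This is a textbook-style illustration in Python, not the parallel algorithm of the article: the function name gotoh_score and its parameters are hypothetical, a single symmetric penalty (gap_open + gap_extend * k for a run of length k) replaces the per-sequence and end gap systems, and only the optimal score is returned. Keeping one row of each table needs memory proportional to the shorter input, but it does not recover the alignment itself, which is the task addressed by the O(L^(4/3)) bound.

```python
import math


def gotoh_score(a, b, score, gap_open, gap_extend):
    """Optimal global alignment score with affine gaps, using O(len(b)) memory."""
    neg_inf = -math.inf
    m = len(b)
    # h[j]: best score for the prefixes a[:i], b[:j]; row i = 0 is all gaps.
    h = [0.0] + [-(gap_open + gap_extend * j) for j in range(1, m + 1)]
    # f[j]: best score for a[:i], b[:j] ending with a[i-1] aligned to a gap.
    f = [neg_inf] * (m + 1)
    for i in range(1, len(a) + 1):
        diag = h[0]                          # h[i-1][0]
        h[0] = -(gap_open + gap_extend * i)  # column 0 is all gaps
        e = neg_inf                          # best score ending with b[j-1] against a gap
        for j in range(1, m + 1):
            # h[j-1] already belongs to row i; h[j] still belongs to row i-1.
            e = max(e - gap_extend, h[j - 1] - gap_open - gap_extend)
            f[j] = max(f[j] - gap_extend, h[j] - gap_open - gap_extend)
            best = max(diag + score(a[i - 1], b[j - 1]), e, f[j])
            diag = h[j]                      # becomes h[i-1][j-1] for the next column
            h[j] = best
    return h[m]


def match_mismatch(x, y):
    """Illustrative substitution score: +1 for a match, -1 for a mismatch."""
    return 1 if x == y else -1
```

With gap_open = 5 and gap_extend = 1, aligning "ACGT" against "AGT" must pay for one gap of length one (cost 6) and can keep at most three matches, so the optimal score under these assumed parameters is -3.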


The algorithm is implemented as a high-performance web application in JavaScript using Web Workers and is freely available at http://sbars.impb.ru/aligner.html.

Article Details

How to Cite
PANKRATOV, Anton Nikolaevich; TETUEV, Ruslan Kurmanbievich; PYATKOV, Maxim Ivanovich. LSCGAT: LONG SEQUENCES CUSTOMIZABLE GLOBAL ALIGNMENT TOOL. Journal of Bioinformatics and Genomics, [S.l.], n. 1 (10), july 2019. ISSN 2530-1381. Available at: <http://journal-biogen.org/article/view/145>. Date accessed: 24 aug. 2019. doi: http://dx.doi.org/10.18454/jbg.2019.1.10.1.
Section
Novel computational tools and databases
References
Needleman S., Wunsch C. A general method applicable to the search for similarities in the amino acid sequence of two proteins. // Journal of Molecular Biology. – 1970. – №3 (48) – P. 443–453.
Pyatkov M.I., Pankratov A.N. SBARS: fast creation of dotplots for DNA sequences on different scales using GA-, GC- content. // Bioinformatics. – 2014. – №12 (30) – P. 1765–1766.
Gotoh O. An improved algorithm for matching biological sequences. // Journal of Molecular Biology. – 1982. – №3 (162) – P. 705–708.
Myers E.W., Miller W. Optimal alignments in linear space. // Computer Applications in the Biosciences. – 1988. – №1 (4) – P. 11–17.
Driga A., Lu P., Schaeffer J., Szafron D., Charter K., Parsons I. FastLSA: A Fast, Linear-Space, Parallel and Sequential Algorithm for Sequence Alignment. // Algorithmica. – 2006. – №3 (45) – P. 337–375.
Galvez S., Diaz D., Hernandez P., Esteban F.J., Caballero J.A., Dorado G. Next-generation bioinformatics: using many-core processor architecture to develop a web service for sequence alignment. // Bioinformatics. – 2010. – №5 (26) – P. 683–686.
Tetuev R.K., Pyatkov M.I., Pankratov A.N. Parallel algorithm for global alignment of long aminoacid and nucleotide sequences. // Mathematical Biology and Bioinformatics. – 2017. – №1 (12) – P. 137–150.