
Tools and Techniques for Whole-Genome Alignments

Khushboo Upadhyay

Abstract


Whole-genome alignment (WGA) is the prediction of evolutionary relationships between two or more genomes at the nucleotide level. It combines aspects of both collinear sequence alignment and gene orthology prediction. WGAs are useful for genome-wide analyses such as phylogenetic inference, genome annotation, and function prediction. Although the problem is difficult, many methods have been developed to address it. This article gives an overview of those approaches and discusses the meaning and significance of WGA. We also examine the problem of evaluating whole-genome aligners and list the methodological challenges that must be resolved to make the best use of our rapidly expanding whole-genome databases. The availability of the assembled mouse genome made it possible, for the first time, to align and compare two large vertebrate genomes. We investigated alignment techniques that work well for assemblies of varying quality, with a view to analysing genome conservation later on. These methods were used to align the Mouse Genome Sequencing Consortium assembly and other preliminary mouse assemblies against the working draft of the human genome. Our techniques are fast, and the resulting alignments are highly sensitive, covering more than 90% of the known coding exons in the human genome, and this coverage was achieved while maintaining specificity.


Keywords


Comparative genomics, genome evolution, homology map, sequence alignment, toporthology, whole-genome alignment



