| Other Abstract | The main purpose of comparative genomics is to find the functional regions in the genomes of specific lineages by comparing sequences between species. The rapid expansion of DNA sequence data, including genomes, transcriptomes, and exomes, provides unprecedented opportunities for comparative genomic studies. It also provides abundant material for addressing a key problem in evolutionary biology: how organisms adapt to their environment during evolution. Adaptive evolution is not only an important theoretical basis of the theory of natural selection but also its major research object. With the development of molecular evolutionary studies, it has become increasingly common to study the genetic basis of adaptive evolution by identifying targets of natural selection at the molecular level. With current sequencing technology, genomes and transcriptomes are typically assembled from large numbers of short reads. Although the algorithms for this process continue to improve, some errors are unavoidably introduced during sequence assembly. After assembly, subsequent analyses introduce further mistakes, such as gene annotation errors, which produce erroneous segments in the final gene sequences and in turn cause alignment errors when those sequences are aligned with homologous genes. It is widely recognized that alignment errors cause false detections of positive selection, so controlling alignment quality is an urgent problem for studies of positive selection. In this thesis, we propose an efficient alignment trimming method, “AlignTrim”, which uses the concepts of “sliding window” and “random sampling” to identify potentially unreliable regions in an alignment. In simulation benchmarks, our program eliminated more than 95% of the errors while incorrectly removing less than 0.5% of the correct columns. 
We also found that trimming alignments with AlignTrim lowered the false-positive rate to well below 5%, significantly outperforming other programs such as Guidance and Gblocks. AlignTrim is particularly suitable for large-scale detection of positively selected genes, especially in low-coverage genomes, which are much more likely to contain unreliable gene sequences (Chapter 2). Using alignments trimmed by AlignTrim, we then searched for mammalian candidate genes for aquatic adaptation. Cetaceans diverged from an extant group of artiodactyls approximately 50 million years ago. Their transition to a fully aquatic habitat represents one of the most dramatic transformations in mammalian evolutionary history. To adapt to marine habitats, they have changed greatly in phenotype, for example by acquiring a streamlined body form and thickened subcutaneous fat. The release of the dolphin genome provides a good opportunity to identify candidate genes for aquatic adaptation. We compared the dolphin genome with those of its close terrestrial relatives and identified positively selected genes (PSGs) in the dolphin. Because the coverage of the dolphin genome is quite low, only 2.59×, the quality of all alignments was controlled with the newly developed program AlignTrim. We found that these PSGs are significantly enriched in several interesting GO terms, such as muscle contraction, lipid transport and localization, ATPase, and perception of sound. We discuss the contribution of these genes to the dolphin's specific phenotypes in this thesis and conclude that these adaptively evolved genes in the dolphin genome might play an important role in mammalian aquatic adaptation (Chapter 3). Animal aquatic adaptation has two major aspects: locomotion and respiration. 
Respiratory behavior may not differ significantly among fishes, as they all breathe through gills in water, but locomotive patterns do differ: some fishes migrate for reproduction or foraging, whereas others do not migrate. Hence, the energy demands of these two kinds of fishes are clearly different. Mitochondria are the power plants of cells and play critical roles in energy metabolism and thermoregulation. Given these two functional constraints on mitochondria, we expected that mitochondrial DNA may not evolve neutrally, as traditionally thought, but may instead be under varying degrees of selection pressure, with the level of pressure depending mainly on the energy budget of the organism. To study whether the evolutionary pattern of mitochondrial proteins correlates with the different life histories of fishes, we conducted a comparative genomic analysis of 401 complete teleost mitochondrial genomes and tested the roles of these dual functional constraints on mitochondria, to provide a more complete view of mtDNA evolution. We found that the mitochondrial protein-coding genes of migratory fishes have significantly smaller Ka/Ks than those of non-migratory fishes. The same dataset showed that the genes of fishes living in cold climates have significantly smaller Ka/Ks than those of tropical fishes. In contrast, these trends were not observed for two nuclear genes that are not involved in energy metabolism. The differences in selection patterns observed between mitochondrial and nuclear genes suggest that the functional constraints acting on mitochondria, due to energy metabolism and/or thermoregulation, influence the evolution of mitochondrial-encoded proteins in teleosts (Chapter 4). In conclusion, the specific modes of evolution of mitochondrial proteins have also played important roles in animals' aquatic adaptation. 
In the context of rapidly increasing genomic data, this thesis proposes an efficient alignment trimming method that eliminates potential alignment errors and improves the identification of authentic positive selection signals. Using this method, we comprehensively studied the relationship between natural selection and adaptive evolution in animals using both mitochondrial and nuclear genomes (covering mammals and fishes), laying the foundation for understanding the molecular mechanisms underlying aquatic adaptation in animals. |
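
The abstract names “sliding window” and “random sampling” as the concepts behind AlignTrim but does not say how they are combined. The following is a minimal illustrative sketch of one way such a filter could work, and should not be read as AlignTrim's actual algorithm. All function names, the majority-residue mismatch rule, the window size, and the quantile cutoff are assumptions made for this example.

```python
import random
from collections import Counter


def mismatch_flags(alignment, row):
    """Return 1 for each column where alignment[row] differs from the
    column's most common residue, else 0 (hypothetical scoring rule)."""
    flags = []
    for col in range(len(alignment[0])):
        column = [seq[col] for seq in alignment]
        majority = Counter(column).most_common(1)[0][0]
        flags.append(int(alignment[row][col] != majority))
    return flags


def flag_unreliable_windows(alignment, row, window=5, samples=1000,
                            quantile=0.99, seed=0):
    """Return sorted 0-based columns of `row` covered by suspicious windows.

    Random sampling: draw `window` columns at random many times to estimate
    how many mismatches a window would hold if mismatches were scattered.
    Sliding window: flag contiguous windows whose mismatch count exceeds
    the chosen quantile of that sampled distribution.
    """
    flags = mismatch_flags(alignment, row)
    rng = random.Random(seed)
    null = sorted(sum(rng.sample(flags, window)) for _ in range(samples))
    cutoff = null[min(samples - 1, int(quantile * samples))]
    suspect = set()
    for start in range(len(flags) - window + 1):
        if sum(flags[start:start + window]) > cutoff:
            suspect.update(range(start, start + window))
    return sorted(suspect)
```

Calling `flag_unreliable_windows(alignment, row=0)` on a list of equal-length aligned strings returns the column indices that a trimming step might mask for that sequence. Because whole windows are flagged, a few correct columns next to a bad segment can be included, which is the kind of trade-off the reported benchmark figures quantify.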