The origination of new exons is a potentially important source of proteomic diversity. Despite significant recent progress, many questions concerning the functional roles and evolutionary importance of new exons and their host genes remain open. We report a study of the evolutionary and functional features of new exons and their host genes in human and mouse. We find that new exons are preferentially located in untranslated regions (UTRs), especially 5’ UTRs, implying that many new exons are involved in regulation. We also find that genes containing new exons show higher tissue specificity of expression and are more likely to be involved in cellular regulation and interaction with the environment. By comparison with orthologs in outgroup lineages, we show that genes that gained new exons were inherently fast-evolving, rather than the gain of new exons having accelerated the evolution of the host genes.
We also report the recurrent origination of new exons in mammalian chromodomain Y-like (CDYL) genes and the functional evolution associated with these exons. The CDYL gene in the common ancestor of mammals acquired three new exons together with a new upstream promoter. Subsequently, one more new exon evolved independently in mammalian lineages. In human, additional changes in the CDYL gene, including a start codon shift and alternative splicing, led to the creation of a longer peptide. The evolution of these new exons in mammals appears to have been driven by positive selection, as a significant excess of non-synonymous mutations was observed in these exons. Functionally, the newly evolved longer peptide exhibits weaker transcriptional repression activity and could attenuate the repression activity of the shorter form, suggesting that the evolution of the new exons is functionally relevant and may contribute to the complexity of the proteome.
Chimeric RNAs derived from two or more distinct transcripts are conventionally thought to be produced by trans-splicing and have been reported in a variety of organisms. We conducted a large-scale search for chimeric RNAs in budding yeast, fly, mouse and human. Surprisingly, we identified thousands of chimeric transcripts in these organisms (except for yeast, in which only five chimeric RNAs were observed), suggesting that the formation of chimeric RNAs is a widespread process that can contribute greatly to the complexity of the transcriptome and proteome. However, only a small fraction (<20%) of these chimeric RNAs can be explained by the conventional trans-splicing model. In contrast, we observed short homologous sequences (SHSs) at the junction sites of the source sequences for about half of the chimeric RNAs, suggesting a transcriptional slippage model. Our in vivo experiments in yeast showed that disruption of the SHSs resulted in the disappearance of the corresponding chimeric RNAs, supporting our new model of chimeric RNA generation.