Background: Transposable elements (TEs) represent more than 45% of the human and mouse genomes. Both parasitic and mutualistic features have been shown to apply to the host-TE relationship, but a comprehensive scenario of the forces driving TE fixation within mammalian genes is still missing.

Results: We show that intronic multispecies conserved sequences (MCSs) have been affecting TE integration frequency over time. We verify that a selective economizing pressure has been acting on TEs to decrease their frequency in highly expressed genes. After correcting for GC content, MCS density, and intron size, we identified TE-enriched and TE-depleted gene categories. In addition to developmental regulators and transcription factors, TE-depleted regions encompass loci that might require subtle regulation of transcript levels or precise activation timing, such as growth factors, cytokines, hormones, and genes involved in the immune response. The latter, despite having reduced frequencies of most TE types, are significantly enriched in mammalian-wide interspersed repeats (MIRs). Analysis of orthologous genes indicated that MIR over-representation also occurs in dog and opossum immune response genes. Given the partially independent origin of MIR sequences in eutheria and metatheria, this suggests the evolutionary conservation of a specific function for MIRs located in these loci. Consistently, the core MIR sequence is over-represented in defense response genes compared to the background intronic frequency.

Conclusion: Our data indicate that gene function, expression level, and sequence conservation influence TE insertion/fixation in mammalian introns. Moreover, we provide the first report showing that a specific TE family is evolutionarily associated with a gene function category.
ASJC Scopus subject areas
- Cell Biology
- Ecology, Evolution, Behavior and Systematics