Bioinformatics Analysis of <i>MSH1</i> Genes of Green Plants: Multiple Parallel Length Expansions, Intron Gains and Losses, Partial Gene Duplications, and Alternative Splicing
by: Ming-Zhu Bai, Yan-Yan Guo
| Format: | Article |
|---|---|
| Published: | MDPI AG 2023-09-01 |
Description
<i>MutS homolog 1</i> (<i>MSH1</i>) is involved in the recombination and repair of organelle genomes and is essential for maintaining their stability. Previous studies indicated that the length of the gene varies greatly among species and detected species-specific partial gene duplications in <i>Physcomitrella patens</i>. However, there are critical gaps in the understanding of the gene size expansion, and the extent of the partial gene duplication of <i>MSH1</i> remains unclear. Here, we screened <i>MSH1</i> genes in 85 selected species with genome sequences representing the main clades of green plants (Viridiplantae). We identified the <i>MSH1</i> gene in all lineages of green plants and used it for bioinformatics analysis, excluding nine species with incomplete data. The gene is a singleton in most of the selected species, with conserved amino acids and protein domains. Gene length varies greatly among the species, ranging from 3234 bp in <i>Ostreococcus tauri</i> to 805,861 bp in <i>Cycas panzhihuaensis</i>. The expansion of <i>MSH1</i> occurred repeatedly in multiple clades, especially in gymnosperms, Orchidaceae, and <i>Chloranthus spicatus</i>. Owing to the gene length expansion, <i>MSH1</i> has exceptionally long introns in certain species, the longest reaching 101,025 bp. Gene length is positively correlated with the proportion of transposable elements (TEs) in the introns. In addition, gene structure analysis indicated that the <i>MSH1</i> of green plants has undergone parallel intron gains and losses in all major lineages. However, the intron number in seed plants (gymnosperms and angiosperms) is relatively stable. All the selected gymnosperms contain 22 introns except for <i>Gnetum montanum</i> and <i>Welwitschia mirabilis</i>, while all the selected angiosperm species preserve 21 introns except for the ANA grade. Notably, the coding region of <i>MSH1</i> in algae has an exceptionally high GC content (47.7% to 75.5%).
Moreover, apart from the conserved moss-specific partial gene duplication, over one-third of the selected species contain species-specific partial gene duplications of <i>MSH1</i>. Additionally, we found conserved alternatively spliced <i>MSH1</i> transcripts in five species. The study of <i>MSH1</i> sheds light on the evolution of the long genes of green plants.