- Open Access
Characterization of a novel Helitron family in insect genomes: insights into classification, evolution and horizontal transfer
© The Author(s). 2019
- Received: 11 February 2019
- Accepted: 30 April 2019
- Published: 31 May 2019
Helitrons play an important role in shaping eukaryotic genomes due to their ability to transfer horizontally between distantly related species and to capture gene fragments during transposition. However, the mechanisms of horizontal transfer (HT) and of gene fragment capture by Helitrons remain to be clarified.
Here, we characterized a novel Helitron family discontinuously distributed in 27 out of 256 insect genomes. The most prominent characteristic of the Hel1 family is its high sequence similarity among species of different insect orders. Related elements were also identified in two spiders, representing the first report of spider Helitrons. All these elements were classified into 2 families, 9 subfamilies and 35 exemplars based on our new classification criteria. Autonomous Helitron partners were reconstructed in the genomes of three insects and one spider. Integration pattern analysis showed that the majority of Hel1A elements in Papilio xuthus and Pieris rapae inserted into introns. Consistent with the filler DNA model, stepwise sequence acquisition was observed in Sfru_Hel1Aa, Sfru_Hel1Ab and Sfru_Hel1Ac in Spodoptera frugiperda. Remarkably, the evidence that Prap_Hel1Aa in a Lepidopteran insect, Pieris rapae, was derived from Cves_Hel1Aa in a parasitoid wasp, Cotesia vestalis, suggests a role for nonregular host-parasite interactions in the HT of Helitrons.
We proposed modified classification criteria for Helitrons based on the important role of the 5′-end of Helitrons in transposition, and provided evidence for stepwise sequence acquisition and recurrent HT of a novel Helitron family. Our findings suggest that nonregular host-parasite interactions may be conducive to the HT of transposons.
- Transposable elements
- Horizontal transfer
- Genome evolution
As the single largest component of the genetic material of most eukaryotic and prokaryotic species, transposable elements (TEs) play key roles in the epigenetic regulation of the genome and the generation of genomic novelty [1, 2]. Depending on the mode of transposition, TEs are traditionally categorized as class-I elements or retrotransposons and class-II elements or DNA transposons [1, 3]. Copy-and-paste retrotransposons replicate via reverse transcription of an RNA intermediate of a source element, and can be further divided into long terminal repeat (LTR) and non-LTR retrotransposons. DNA transposons move through a single- or double-stranded DNA intermediate, and are classified into three major subclasses: the classic “cut-and-paste” transposons, rolling-circle (RC) transposons called Helitrons, and self-synthesizing transposons called Mavericks/Polintons. Both retrotransposons and DNA transposons exist as self-mobilizing autonomous elements or as non-autonomous elements that rely on trans-mobilization by the enzymatic machinery of their autonomous counterparts.
Helitrons, a novel superfamily of transposons, were originally discovered by in silico genome-sequence analysis, and were later identified in a wide range of organisms, from protists to mammals [6, 7]. Helitrons are fundamentally different from classical transposons in terms of enzymatic activity and structure. Helitrons encode a RepHel protein homologous to RCR prokaryotic transposases, which comprises the replication initiator (Rep) and helicase (Hel) domains and is predicted to have both HUH (His-hydrophobe-His) endonuclease activity and 5′ to 3′ helicase activity. Helitrons do not create target site duplications or contain terminal inverted repeats, and recent studies show that they transpose via a copy-and-paste rather than a cut-and-paste mechanism. The characteristic features of Helitrons include a ‘TC’ motif at the 5′-end, a ‘CTRR’ motif at the 3′-end, and a palindromic sequence of 16–20 bp near the 3′-end, which can form a hairpin structure. Because of the minimal sequence features and high sequence heterogeneity among Helitron copies, a classification system for family and subfamily definition has been proposed based on genome-wide analysis of Helitrons in maize, Zea mays.
Helitrons have attracted widespread attention because their remarkable ability to capture gene fragments at the DNA level gives them an important role in host genome evolution. This process appears to have been particularly prominent in the maize genome, where it is estimated that at least 20,000 gene fragments have been picked up and shuffled by Helitrons [10–12]. A high frequency of Helitron-mediated gene capture has also been reported in bats. A recent study revealed that Helitrons have captured 3724 fragments from 268 genes in the silkworm, Bombyx mori. Several models have been proposed to explain the mechanism of gene capture at the DNA level, including the end bypass and filler DNA models [8, 15].
Horizontal transfer (HT) is the non-vertical exchange of genetic material between reproductively isolated species. The inherent mobility and replication abilities of TEs allow them to undergo vector-mediated HT between organisms and thereby avoid the co-evolved host suppression mechanisms that lead to vertical inactivation [1, 16–18]. The first evidence for the repeated HT of four different families of Helitrons, Heligloria, Helisimi, Heliminu and Helianu, was described in an unprecedented array of organisms, including mammals, reptiles, fish, invertebrates and polydnaviruses. The subsequent identification of horizontally transferred Helitrons, such as Hel-2 and Lep1, suggested that Helitrons rely heavily on HT for their propagation and maintenance throughout evolution. However, the physiological or ecological factors favoring the high frequency of HT remain elusive.
Here, we conducted a thorough search for the distribution of a novel Helitron family by analyzing the sequenced genomes of 256 insects and 22 spiders. We found that Hel1 elements are distributed in 27 of the investigated insect genomes as well as in the genome of a distantly related spider, Nephila clavipes; these elements were classified into 9 subfamilies and 34 exemplars. A related Hel2 family was identified in the genome of another spider, Parasteatoda tepidariorum. Furthermore, we provide evidence for stepwise sequence acquisition and recurrent HT of this novel Helitron family. Our results provide new insights into the classification and evolution of Helitrons, and suggest that Helitrons can undergo horizontal transfer by diverse means.
Identification and distribution of a novel Helitron transposon
Table 1 Characteristics of 35 Helitron exemplars from 29 species
Characterization of reconstructed potential autonomous DNA Helitrons
Contribution of Hel1 to gene and genome evolution
We further analyzed the integration pattern relative to the annotated genes in two representative genomes, P. xuthus and Pieris rapae. Of the 796, 722 and 1520 copies of Pxut_Hel1Aa, Pxut_Hel1Ab and Prap_Hel1Aa, 463 (58%), 413 (57%) and 740 (52%), respectively, were found in introns. Only 4, 7 and 3 copies of Pxut_Hel1Aa, Pxut_Hel1Ab and Prap_Hel1Aa, respectively, had inserted into exons (Fig. 5a). Further analysis revealed the insertion of multiple copies of Hel1 into introns of the same gene. For example, as many as 4 copies of Prap_Hel1Aa had inserted into introns of the LOC110995424 gene, and a fifth copy had inserted into the 3′-end of its coding sequence (CDS) (Additional file 1: Figure S8). However, in most cases, only one copy was detected in the intron regions of a specific gene. Notably, a 127 bp copy of Pxut_Hel1A (NW_013531711.1: 4069476–4069349) had inserted into the CDS of a gene encoding an unclassified protein (Fig. 5b). Thus, the P. xuthus and P. rapae Hel1 Helitrons mainly contribute to structural variation in introns, which might influence the regulation of gene expression.
Sequence acquisition and new Hel1 creation
Among all Helitrons identified in this study, Hel1 in Spodoptera frugiperda attracted our attention. In addition to Sfru_Hel1Aa, Sfru_Hel1Ab and Sfru_Hel1Ac, a 161 bp copy (FJUZ01003913.1: 464817–464657) was detected (Fig. 6). Sequence analysis showed that the consensus sequences of these three exemplars shared almost 99% identity with this short sequence when insertions were excluded; this short copy was therefore designated the core sequence (Sfcore) (Fig. 6). Further analysis showed that, compared with the core sequence, a 35 bp fragment named the “A” region was inserted into the core sequence 130 bp downstream of the 5′-end in Sfru_Hel1Aa, a 66 bp fragment named the “B” region was inserted into the “A” region in Sfru_Hel1Ab, and a 108 bp fragment named the “C” region was inserted into the “B” region in Sfru_Hel1Ac (Fig. 6a). Alignment of the consensus sequences showed that the insertion sites were consistent with the overlapping region (Fig. 6b). A search for putative source loci of these insertions revealed that the “A” region consisted of “A1” and “A2” regions, of which the “A1” region was derived from sequence NJHR01000652 (137291–137357) and the “A2” region from NJHR01000244 (961324–961247) (Additional file 1: Figure S9). While sequence NJHR01000585 (153969–153870) showed high identity with the “B” region, we did not find a source locus for the “C” region, putatively because of incomplete genome sequencing. Additionally, 2–13 bp end junctions were identified in each source locus, supporting the filler DNA model (Additional file 1: Figure S9). Furthermore, the average percentage divergence of Sfru_Hel1Aa, Sfru_Hel1Ab and Sfru_Hel1Ac was 0.852, 0.164 and 0.016, respectively, indicating a clear evolutionary order (Table 1).
The sequence acquisition of P. xuthus Hel1 differs from that of S. frugiperda. Two exemplars of the Hel1A subfamily, Pxut_Hel1Aa and Pxut_Hel1Ab, were found in P. xuthus, with lengths of 296 bp and 204 bp, respectively, and only a 172 bp region was shared by these two exemplars (Additional file 1: Figure S10a). The average percentage divergence of Pxut_Hel1Aa and Pxut_Hel1Ab was 0.069 and 0.066, respectively (Table 1). It therefore seems unlikely that Pxut_Hel1Ab was formed by sequence acquisition from Pxut_Hel1Aa. Furthermore, we also found a core sequence (BBJE01004687.1: 58267–58430) highly similar to that of S. frugiperda (Additional file 1: Figure S10b). We speculate that Pxut_Hel1Aa and Pxut_Hel1Ab were independently derived from the core sequence by sequence acquisition during transposition.
In the case of C. suppressalis, the 162 bp consensus sequence of Csup_Hel1Aa was over 96% identical to the above core sequences of P. xuthus and S. frugiperda (Additional file 1: Figure S10b). Compared with Csup_Hel1Ab, a 7 bp fragment (AGACGTG) was unique to Csup_Hel1Aa (Additional file 1: Figure S10b). Given the similar average percentage divergence of these two exemplars, it seems that Csup_Hel1Ab was not derived from Csup_Hel1Aa. On the other hand, 5 core sequences with high similarity to Csup_Hel1Ea were also found in the genome of C. suppressalis (Additional file 1: Figure S10c). Considering that the average percentage divergence of Csup_Hel1Ea was 0.034, we inferred that Csup_Hel1Ea was evolutionarily earlier than Csup_Hel1A and had a different origin from Csup_Hel1Aa.
Evolution and horizontal transfer of Hel1
Using HeligloriaAi_DW1 and HeligloriaAi_Rp1 as the outgroup, the phylogenetic tree of the 35 Helitron consensus sequences showed that Ptep_Hel2Ca was evolutionarily distinct from the Hel1 elements, and that insects of the same order did not cluster together. The incongruence between the Hel1 and host phylogenies, together with the patchy distribution and high sequence similarity of Hel1 elements among distantly related lineages, suggests recurrent HT and that multiple mechanisms may underlie the horizontal spread of Hel1. Notably, Lepidopteran Prap_Hel1Aa and Hymenopteran Cves_Hel1Aa, Dipteran Cvic_Hel1Ca and Lepidopteran Bmor_Hel1Ca, and Hemipteran Hvit_Hel1Ga and Lepidopteran Pgla_Hel1Ga each clustered into a distinct clade, although the respective host lineages diverged 325, 272 and 358 million years ago (http://www.timetree.org/) (Fig. 7). Furthermore, several paralogous and orthologous empty sites were also detected in these insect genomes (Additional file 1: Figure S11). It is also noteworthy that the genetic distance between species of the same cluster was less than 0.1, indicating that these elements have spread horizontally among these species within a relatively narrow timeframe.
The clustering of Prap_Hel1Aa from P. rapae (Lepidoptera: Pieridae) and Cves_Hel1Aa from C. vestalis (Hymenoptera: Braconidae) into the same clade is of particular interest. While the calculated genetic distances of the orthologous genes calreticulin, Hsc70 and opsin between P. rapae and C. vestalis were 0.325, 0.229 and 0.312, respectively (Additional file 1: Figure S12a, b, c), sequence comparison showed that the consensus sequences of Prap_Hel1Aa and Cves_Hel1Aa shared over 98% identity, excluding a 169 bp insertion in Prap_Hel1Aa (Additional file 1: Figure S12d). Considering that the average percentage divergence of Prap_Hel1Aa and Cves_Hel1Aa was 0.054 and 0.372, respectively, we speculated that Prap_Hel1Aa was derived from C. vestalis through HT, followed by the capture of the 169 bp fragment and a rapid burst of transposition. This hypothesis was partly supported by the reconstructed phylogeny, in which the Cves_Hel1Aa copies are generally nested within clades made up of Prap_Hel1Aa copies, and the closely related Csup_Hel1Aa copies from C. suppressalis are phylogenetically separated from both Prap_Hel1Aa and Cves_Hel1Aa (Additional file 1: Figure S13). Additionally, the 169 bp insertion was almost entirely absent from a short copy of Prap_Hel1Aa (Prap0202, LWME01000202.1: 138955–138682), which was over 94% identical to eight copies of Cves_Hel1Aa; in particular, over 97% identity was observed between the 30 bp 3′-ends of Prap0202 and these Cves_Hel1Aa copies (Additional file 1: Figure S12e). Interestingly, PCR amplification and sequencing revealed an orthologous empty site of Prap0202 in a local population of P. rapae, suggesting that Prap_Hel1Aa elements mobilized recently (Additional file 1: Figure S14). Furthermore, as many as 11 elements in the P. rapae genome were identical to the consensus sequence of Prap_Hel1Aa, indicating a recent invasion of the P. rapae genome by Prap_Hel1Aa elements (Additional file 1: Figure S15).
There are few reports on the occurrence of HT between Lepidoptera and Diptera. In this study, we found that Cvic_Hel1Ca and Bmor_Hel1Ca, and Pgla_Hel1Ga and Hvit_Hel1Ga, each clustered into the same clade, and that the corresponding consensus sequences were highly similar (Additional file 1: Figure S16a and Figure S17), suggesting the occurrence of HT between these insects. Furthermore, orthologous empty sites were detected in both Lepidopteran and Dipteran insects (Additional file 1: Figure S11). Notably, a putative Hel1 sequence (GEND01024785.1: 446–129) was found in the transcriptome shotgun assembly (TSA) database of Entomophthora muscae, which shared 90% identity with two copies of Cvic_Hel1Ca and 74% identity with Bmor_Hel1Ca (Additional file 1: Figure S16b), suggesting a possible role of E. muscae in HT between Lepidopteran and Dipteran insects.
The classification of Helitrons has always been ambiguous. The classical classification system was proposed based on genome-wide analysis of maize Helitrons, in which sequences with the most similar 3′-ends (30 bp with at least 80% identity) were classified as members of the same family and sequences with the most similar 5′-ends (30 bp with at least 80% identity) were classified as members of the same subfamily. These criteria have been followed by several other studies. In addition, a unique internal sequence that was > 20% different at the nucleotide level from any other Helitron internal region was defined as an “exemplar”. However, based on genome-wide analysis of silkworm Helitrons, Han et al. (2013) suggested that sequences with identities > 80% in the 30 bp of both their 5′- and 3′-ends be classified as members of the same family, and full-length sequences with identity > 80% be classified in the same subfamily. Because little is known about the cis- or trans-activation of Helitrons, these classification criteria are exploratory. According to the end bypass model, which was proposed to explain the mechanism of gene capture by Helitrons, transposition initiates at the 5′-end and gene capture occurs if the 3′-end signal is missed. A random cryptic sequence located downstream would then act as the termination signal, and all intervening sequences would be captured [8, 26]. This model was supported by the fact that a xanA gene fragment was captured by a Helitron in the Aspergillus nidulans genome. A recent study showed that modification or deletion of the hairpin loop or palindromic sequence had little effect on the transposon colony-forming activity of the reconstructed active bat Helitron, Helraiser. However, deletion of the 5′-end of Helraiser resulted in complete loss of activity. Given the important role of the 5′-end of Helitrons in transposition, we consider it more reasonable to define families by the 5′-end of Helitrons.
Thus, we proposed a new classification standard, as described in Methods. These new criteria were supported by our phylogenetic analysis of P. xuthus and C. suppressalis Helitrons, in which the copies of different subfamilies or exemplars are well separated phylogenetically (Fig. 3).
The distinct copy-and-paste transposition process of Helitrons enables them to reach high genomic copy numbers. For example, maize and silkworm Helitrons constitute 6.6% and 4.23% of the genome, respectively [11, 14]. In this study, as many as 5578 copies of Cvir_Hel1Ea were found in C. virginiensis, accounting for 0.479% of the genome. Besides their direct effect on genome size, evidence has accumulated in recent years that Helitrons can also affect gene structure and expression as well as genome organization [28, 29]. For example, the insertion of two non-autonomous Helitron elements, AtREP3 and AtREP1, upstream of the ETT and ARF4 genes in the tebichi (teb) mutant of Arabidopsis thaliana resulted in the upregulation of these two genes. In the tetraploid sour cherry, Prunus cerasus, the insertion of a small non-autonomous Helitron element 38 bp downstream of the stop codon of the SFB gene is proposed to interfere with polyadenylation, resulting in loss of function of the SFB gene involved in gametophytic self-incompatibility. In this study, we found that, similar to silkworm Helitrons, the majority of copies of Pxut_Hel1A and Prap_Hel1Aa inserted into introns of the host genome, suggesting that Hel1 duplication and transposition led to structural variation in introns, which might influence the regulation of gene expression. Notably, a copy of Pxut_Hel1A with a 3′-end deletion inserted into the coding region of an unclassified gene (Gene accession: LOC110995424) in the P. xuthus genome, although the impact of this insertion on gene function is currently unknown.
A predominant characteristic of Helitrons is their ability to capture and amplify host genome sequences. Among 1649 Helitron-like transposons identified in the genome of the maize inbred line B73, over 90% have captured gene fragments. While the end bypass and filler DNA models [8, 15] have been proposed to explain Helitron gene capture and transposition, the exact mechanisms are far from clear. It has been proposed that gene capture during Helitron transposition occurs in a stepwise or sequential way. In this study, three Helitron exemplars, Sfru_Hel1Aa, Sfru_Hel1Ab and Sfru_Hel1Ac, were identified in S. frugiperda together with a shorter core sequence sharing high identity with all three. Multiple sequence alignment showed that these three exemplars have high sequence identity in their shared sequences, but differ by additional captured regions internal to the elements. That the captured fragments are internal to the Helitrons excludes the end bypass model. Alternatively, the filler DNA model suggests that Helitrons acquire DNA from the host during the repair of double-strand breaks (DSBs) internal to the element, and predicts that short regions flanking the DSB in the acceptor transposon should be homologous to DNA sequences flanking the original host sequence captured by the transposon. The identification of end junctions in the putative source loci suggests that Hel1 Helitrons acquired DNA from the host, putatively by filler DNA insertion during the repair of DSBs. Notably, the average percentage divergence of these three exemplars was 0.852, 0.164 and 0.016, respectively, strongly supporting stepwise transposition and amplification, putatively with the core sequence as the source element. However, although shorter core sequences were also identified in the respective host genomes, the exemplars of Pxut_Hel1A and Csup_Hel1 seem to have captured host gene fragments during independent transposition events.
No fewer than 2836 horizontal transposon transfer (HTT) events have been recorded so far in multicellular eukaryotes; however, the mechanisms underlying HTT remain largely mysterious. The host-parasite relationship has recently been proposed as a major route of horizontal DNA transfer [21, 35, 36]. In this study, we provide evidence that Prap_Hel1Aa might derive from Cves_Hel1Aa. Although C. vestalis is a larval parasitoid of the diamondback moth, Plutella xylostella (Lepidoptera: Plutellidae), we did not find any Cves_Hel1-like sequences in the genome database of P. xylostella (http://iae.fafu.edu.cn/DBM/), putatively because parasitized caterpillars are an evolutionary dead end. On the other hand, parasitoids are likely to oviposit within marginal (or even completely unsuitable) hosts in the laboratory or field, even if suitable hosts are present, and C. vestalis has been reared from several species belonging to different Lepidopteran families. We therefore propose that C. vestalis might be a nonregular parasite of P. rapae, and that such nonregular host-parasite interactions contributed to the HT of Hel1 between these two species. The origin of Cves_Hel1Aa in C. vestalis remains a mystery. A number of core sequences were found in Lepidopteran genomes, including those of S. frugiperda, P. xuthus and C. suppressalis; thus, as a vector for HT among Lepidopteran insects, C. vestalis more likely acquired Cves_Hel1Aa from other Lepidopteran insects and transferred it to P. rapae.
While our results indicate a role for nonregular host-parasite interactions in the HT of Prap_Hel1Aa and Cves_Hel1Aa, the evidence for 2 additional cases of HTT (Cvic_Hel1Ca and Bmor_Hel1Ca, Pgla_Hel1Ga and Hvit_Hel1Ga), based on their patchy distribution and the incongruence of the Hel1 and host phylogenies, is intriguing because no host-parasite relationship is known among these species. It has been proposed that mechanisms of HT include insect-associated facultative symbionts [39–45]. In addition, the Lep1-like elements identified in the genome of Nosema bombycis suggest that this intracellular microsporidian parasite is also a potential vector for HT. Recent studies have also suggested that both baculoviruses and polydnaviruses might be important vectors of HTT [46–48]. While Lep1-like and Hel-2 Helitrons have been identified in C. vestalis and Cotesia sesamiae bracovirus and AcNPV, respectively [20, 21], we did not find Hel1 in the genomes of bracovirus and NPV. However, the discovery of a Hel1-like sequence in the TSA database of E. muscae suggests that pathogens may also serve as vectors mediating HT of insect TEs. More widespread sequencing will be required to find the exact vectors that facilitate the HT of Hel1 Helitrons in these species.
In the current report, we conducted a thorough search for a novel Helitron family by analyzing the sequenced genomes of 256 insects and 22 spiders. We modified the classical classification system for family and subfamily definition of Helitrons, and classified the Hel1 family into 9 subfamilies and 34 exemplars, among which three exemplars in S. frugiperda exhibited stepwise sequence acquisition, supporting the filler DNA model. We propose that nonregular host-parasite interactions play an important role in the HT of Helitrons. Our data may have implications for understanding the evolution and HT mechanisms of Helitrons.
The 256 Insecta and 22 Arachnida WGS assemblies publicly available from the National Center for Biotechnology Information (NCBI) (last accessed September 30, 2017) were used in this study. The P. rapae and P. xuthus WGS were downloaded from NCBI. A list of the analyzed species and the corresponding amount of sequence data is provided in Additional file 4: Data S3 online. As the corresponding gene annotation files, we used the GFF files of GCF_001856805.1 for P. rapae and GCF_000836235.1 for P. xuthus.
Database searches and copy number estimation of Helitrons
Database searches comprised three steps. First, the novel Helitron sequence located downstream of a SINE in C. suppressalis was used as a query in BLASTN searches against the NCBI C. suppressalis WGS database. Sequences of high homology, together with their 200 bp upstream and downstream flanking regions, were extracted and analyzed for hallmarks of Helitrons such as the characteristic 5′-TC and 3′-CTRY nucleotide termini, and the consensus sequences of three Helitron exemplars of the Hel1 family in C. suppressalis, Csup_Hel1Aa, Csup_Hel1Ab and Csup_Hel1Ea, were determined. Second, a total of 255 insect WGS collections were searched using the 161 bp common sequence of Csup_Hel1Aa and Csup_Hel1Ab (Additional file 1: Figure S18) as a query to detect sequences with high identity to Csup_Hel1Aa and Csup_Hel1Ab in other insect species, and 9 subfamilies, Hel1A-Hel1I, were identified. Finally, WGS collections of other invertebrates were searched using the same 161 bp common sequence as a query to detect Hel1-like sequences in other species, and the second family, Hel2, was identified. In total, 2 families, 9 subfamilies and 35 Helitron exemplars were identified, and the consensus sequence of each Helitron exemplar was reconstructed from a multiple alignment of at least 10 individual copies. Specifically, because Aros_Hel1Aa and Bmor_Hel1Aa each had fewer than 10 copies, all copies were used in the multiple alignment to determine their consensus sequences. All consensus sequences are provided in Additional file 5: Data S4.
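The terminal-hallmark screen in the first step can be illustrated with a minimal Python sketch. It checks only the 5′-TC and 3′-CTRY termini named above, expanding the IUPAC codes R (A or G) and Y (C or T). The function names and the example sequence are hypothetical and are not taken from the pipeline used in this study; the 3′ motif is a parameter so that other motifs, such as CTRR, can be tested.

```python
import re

# IUPAC codes needed for the termini discussed in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}


def motif_to_regex(motif):
    """Translate an IUPAC motif such as 'CTRY' into a regular expression."""
    return "".join(IUPAC[base] for base in motif.upper())


def has_helitron_termini(seq, five_prime="TC", three_prime="CTRY"):
    """Return True if seq starts with the 5' motif and ends with the 3' motif.

    This is only a first-pass screen; it does not look for the 3' hairpin
    or compare copies against each other.
    """
    seq = seq.upper()
    starts = re.match(motif_to_regex(five_prime), seq) is not None
    ends = re.search(motif_to_regex(three_prime) + "$", seq) is not None
    return starts and ends


if __name__ == "__main__":
    candidate = "TCAAGGCTAC"  # hypothetical toy sequence, not a real Helitron
    print(has_helitron_termini(candidate))
```

In practice such a check would be applied to the candidate element after trimming the 200 bp flanks, since the termini must sit exactly at the element boundaries.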
To estimate the copy number and average percentage divergence of Helitrons, we used the respective consensus sequences as BLASTN queries against the genomes in which these Helitron elements were found. All contiguous fragments with at least 80% nucleotide identity to the consensus over 100 bp were used to estimate copy number in all species [36, 49]. Because 3′-end deletions occurred in several copies of different subfamilies/exemplars in the same species, copies that could not be assigned to a subfamily or exemplar were counted as members of the family. For example, two Helitron exemplars in P. xuthus, Pxut_Hel1Aa and Pxut_Hel1Ab, shared a highly similar 128 bp sequence at their 5′-ends; thus, all copies that aligned only with part or all of this 128 bp region of the consensus sequence were counted as members of the family (Additional file 1: Figure S19). Furthermore, all fragments sharing at least 80% identity over at least 80% of the length of the consensus sequence were aligned and used to calculate average percentage divergence with the Kimura 2-parameter model in all species except A. rosae, Blattella germanica, Locusta migratoria, P. tepidariorum, Pseudomyrmex gracilis, T. cristinae and Phortica variegata, in which a high level of fragmentation was observed in multiple Helitron copies.
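For readers unfamiliar with the Kimura 2-parameter model, the following minimal Python sketch shows how the distance between one aligned copy and its consensus could be computed and then averaged over copies. It assumes pre-aligned sequences of equal length and skips gapped or ambiguous columns; the function names are hypothetical, and this is not the software used for the values reported in Table 1.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(a, b):
    """Kimura 2-parameter distance between two aligned sequences.

    d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)), where P and Q are the
    proportions of transitions and transversions among compared sites.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def average_divergence(consensus, aligned_copies):
    """Mean K2P distance of each aligned copy to the consensus."""
    distances = [k2p_distance(consensus, copy) for copy in aligned_copies]
    return sum(distances) / len(distances)
```

The logarithm is undefined for very divergent pairs (when 1 - 2P - Q or 1 - 2Q is not positive), which is one reason to restrict the calculation to copies that already pass an identity filter.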
Reconstruction of potential autonomous Helitron
The reconstruction of autonomous Helitrons comprised three steps. First, large DNA fragments ranging from 1000 bp to 10 kb that shared similar terminal sequences with the above families were retrieved from the WGS databases, and their potential transposases were predicted using getorf in the EMBOSS-6.3.1 package. Second, these candidates with degenerated remnants of Helitron coding sequences were used as queries in BLAST searches against the WGS, TSA and non-redundant protein databases. Finally, the query and hit sequences were aligned to reconstruct uninterrupted coding sequences with a complete Rep/helicase ORF by removing frameshifts and insertions.
To distinguish Helitron elements from the 29 species, we adopted a naming convention consisting of an abbreviation of the species' Latin name followed by the TE type and the family, subfamily and exemplar designations, for example Csup_Hel1Aa. Given that the 3′-end sequences of Helitrons are more variable than the 5′-end sequences, and that the 5′-end sequence is strictly necessary for Helitron transposition, we modified Yang and Bennetzen’s method to reclassify Helitron TEs. In general, sequences with the most similar 5′-ends (30 bp with at least 80% identity) were classified as members of the same family, and sequences with the most similar 3′-ends (30 bp with at least 80% identity) were classified as members of the same subfamily. Because the internal sequences of copies within a Helitron subfamily diverge, copies whose unique internal sequences shared more than 80% identity were classified as members of the same exemplar.
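The family and subfamily rules can be sketched in a few lines of Python. This is an illustration under simplifying assumptions: end identity is computed by ungapped, position-by-position comparison of the terminal 30 bp (a real analysis would more likely use pairwise alignment), a subfamily match is required to be nested within a family match, and the exemplar-level comparison of internal sequences is omitted. All function names are hypothetical.

```python
def end_identity(a, b, end, length=30):
    """Ungapped identity of the terminal `length` bases at the 5' or 3' end."""
    if len(a) < length or len(b) < length:
        raise ValueError("both sequences must be at least `length` bases long")
    if end == "5":
        x, y = a[:length], b[:length]
    elif end == "3":
        x, y = a[-length:], b[-length:]
    else:
        raise ValueError("end must be '5' or '3'")
    matches = sum(1 for p, q in zip(x.upper(), y.upper()) if p == q)
    return matches / length


def same_family(a, b, threshold=0.8):
    """Same family: 5'-end 30 bp share at least 80% identity."""
    return end_identity(a, b, "5") >= threshold


def same_subfamily(a, b, threshold=0.8):
    """Same subfamily: same family and 3'-end 30 bp share at least 80% identity."""
    return same_family(a, b, threshold) and end_identity(a, b, "3") >= threshold


def classify_pair(a, b):
    """Summarize the relationship between two consensus sequences."""
    if same_subfamily(a, b):
        return "same subfamily"
    if same_family(a, b):
        return "same family"
    return "different family"
```

Swapping the roles of the two ends in `same_family` and `same_subfamily` would give the classical maize-based scheme, which makes the difference between the two criteria easy to compare on the same set of consensus sequences.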
Gene association and genomic show cases
The site of Helitron integration relative to annotated genes was analyzed with a custom Perl script. The genomic positions of all copies of Prap_Hel1Aa, Pxut_Hel1Aa and Pxut_Hel1Ab were determined by BLAST analysis against the respective genome database and compared with the GFF annotation files. The Helitrons in coding and untranslated gene regions were counted, and the distances of intergenic copies to the closest neighboring gene were determined. All genic and genomic loci harboring Helitrons were refined and visualized with their respective annotations using a Perl script. All figures were polished and fine-tuned in CorelDRAW.
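The custom Perl script is not reproduced here. As a rough illustration of the underlying logic, the Python sketch below classifies one insertion against gene and exon intervals on the same scaffold, using 1-based inclusive coordinates as in GFF. The `Gene` class, the function names and the example coordinates are hypothetical, and the sketch does not distinguish coding from untranslated exons.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    start: int
    end: int
    exons: list  # list of (start, end) tuples, 1-based inclusive


def overlaps(a_start, a_end, b_start, b_end):
    """True if two closed intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end


def classify_insertion(start, end, genes):
    """Return (category, gene name, distance to that gene).

    category is 'exon', 'intron' or 'intergenic'. Minus-strand BLAST hits
    (start > end) are normalized first. For intergenic copies the nearest
    gene and its distance in bp are reported.
    """
    start, end = min(start, end), max(start, end)
    nearest = None
    for gene in genes:
        if overlaps(start, end, gene.start, gene.end):
            if any(overlaps(start, end, s, e) for s, e in gene.exons):
                return ("exon", gene.name, 0)
            return ("intron", gene.name, 0)
        distance = gene.start - end if end < gene.start else start - gene.end
        if nearest is None or distance < nearest[1]:
            nearest = (gene.name, distance)
    if nearest is None:
        return ("intergenic", None, None)
    return ("intergenic", nearest[0], nearest[1])


if __name__ == "__main__":
    # Hypothetical annotation for a single scaffold.
    genes = [Gene("g1", 100, 500, [(100, 200), (400, 500)]),
             Gene("g2", 1000, 1200, [(1000, 1200)])]
    print(classify_insertion(250, 300, genes))
```

Counting the returned categories over all copies of an exemplar gives intron, exon and intergenic totals of the kind reported for Pxut_Hel1Aa, Pxut_Hel1Ab and Prap_Hel1Aa.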
Sequence analysis and phylogeny
RNAstructure (http://rna.urmc.rochester.edu/RNAstructureWeb) was used to predict and analyze DNA secondary structure. Multiple alignments of Helitrons were created with MUSCLE, and subsequently visualized with GENEDOC (www.psc.edu/biomed/genedoc) and TeXshade.
The phylogeny of Helitron elements was built using MrBayes 3.2 after removing ambiguously aligned regions using BMGE (Additional file 5: Data S5). Nucleotide substitution models were chosen using the AIC criterion in Modeltest (HKY + G). The robustness of the nodes was evaluated for all phylogenies by performing a bootstrap analysis involving 1000 pseudo replicates of the original matrix.
Specifically, for the evolutionary analysis of subfamilies from P. xuthus and C. suppressalis, we conducted a local BLAST analysis and generated a CSV file of location information for all sequences with more than 80% coverage of, and more than 80% identity to, the consensus sequences. We then extracted these sequences from each genome using TBtools. A Neighbor-Joining (NJ) phylogenetic tree of these sequences from C. suppressalis and P. xuthus was constructed using MEGA 7.0.
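The coverage and identity filter can be illustrated with a short Python sketch that reads BLAST tabular output (the standard 12-column `-outfmt 6` layout of BLAST+) and keeps the subject coordinates of qualifying hits. The function name, the thresholds expressed as parameters and the example hit lines are hypothetical; the authors' exact filtering procedure is not described beyond the thresholds given above.

```python
def filter_hits(lines, query_lengths, min_identity=80.0, min_coverage=0.8):
    """Keep hits that cover enough of the consensus at sufficient identity.

    lines: iterable of BLAST -outfmt 6 rows (qseqid sseqid pident length
           mismatch gapopen qstart qend sstart send evalue bitscore).
    query_lengths: dict mapping each consensus name to its length in bp.
    Returns a list of (sseqid, start, end, strand) with start <= end.
    """
    kept = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        qseqid, sseqid = fields[0], fields[1]
        pident = float(fields[2])
        qstart, qend, sstart, send = (int(v) for v in fields[6:10])
        coverage = (abs(qend - qstart) + 1) / query_lengths[qseqid]
        if pident >= min_identity and coverage >= min_coverage:
            strand = "+" if sstart <= send else "-"
            kept.append((sseqid, min(sstart, send), max(sstart, send), strand))
    return kept


if __name__ == "__main__":
    # Hypothetical hits against a 162 bp consensus.
    hits = [
        "Csup_Hel1Aa\tscaf1\t95.00\t160\t8\t0\t1\t160\t500\t659\t1e-50\t250",
        "Csup_Hel1Aa\tscaf2\t90.00\t60\t6\t0\t1\t60\t900\t841\t1e-10\t80",
    ]
    for hit in filter_hits(hits, {"Csup_Hel1Aa": 162}):
        print(*hit, sep=",")
```

The resulting coordinate table plays the role of the CSV of location information, from which the sequences can then be extracted for alignment and tree building.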
Detection of insertion polymorphism of Prap_Hel1Aa
In P. rapae, the insertion polymorphism of one short copy of Prap_Hel1Aa (Prap0202, LWME01000202.1: 138955–138682) was assessed by a PCR survey using one pair of primers flanking the insertion site (forward primer: 5′-ACGAGAGATGGCTACAACAG-3′; reverse primer: 5′-AACACACCCACACCCTAAAC-3′). The PCR products were cloned into the pMD18-T vector (TaKaRa, Dalian, China) and sequenced.
The authors would like to thank anonymous referees for their helpful comments on the manuscript.
Funding
This work was supported by the National Natural Science Foundation of China (Grant No. 30871642 and 31701792), National Rice Industry Technology System Project (Grant No. Cars-001-25), Jiangsu Agricultural Science and Technology Innovation Fund (Grant No. ZX (17)2002) and Jiangsu Science Project of China (Grant No. BK20181215).
Availability of data and materials
All the data supporting the findings are included in this published article and its supplementary information files.
Authors' contributions
GH did most of the experimental work and wrote the manuscript; NZ analysed the genome database; JX designed the experiments; HJ reconstructed the autonomous Helitrons; CJ performed genomic DNA extraction and PCR; ZZ analysed the data and revised the manuscript; QS and DS revised the manuscript; JF designed the experiments and wrote the manuscript; JW designed the experiments, supervised all of the experimental work and wrote the manuscript. All authors read and approved the final manuscript.
Ethics approval and consent to participate
Consent for publication
Competing interests
The authors declare that they have no competing interests.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Barron MG, Fiston-Lavier AS, Petrov DA, Gonzalez J. Population genomics of transposable elements in Drosophila. Annu Rev Genet. 2014;48:561–81.
- Jangam D, Feschotte C, Betrán E. Transposable element domestication as an adaptation to evolutionary conflicts. Trends Genet. 2017;33(11):817–31.
- Wicker T, Sabot F, Hua-Van A, Bennetzen JL, Capy P, Chalhoub B, Flavell A, Leroy P, Morgante M, Panaud O. A unified classification system for eukaryotic transposable elements. Nat Rev Genet. 2007;8(12):973–82.
- Slotkin RK, Martienssen R. Transposable elements and the epigenetic regulation of the genome. Nat Rev Genet. 2007;8(4):272–85.
- Kapitonov VV, Jurka J. Rolling-circle transposons in eukaryotes. Proc Natl Acad Sci U S A. 2001;98(15):8714–9.
- Kapitonov VV, Jurka J. Helitrons on a roll: eukaryotic rolling-circle transposons. Trends Genet. 2007;23(10):521–9.
- Rossato DO, Ludwig A, Depra M, Loreto EL, Ruiz A, Valente VL. BuT2 is a member of the third major group of hAT transposons and is involved in horizontal transfer events in the genus Drosophila. Genome Biol Evol. 2014;6(2):352–65.
- Thomas J, Pritham EJ. Helitrons, the eukaryotic rolling-circle transposable elements. Microbiology Spectrum. 2015;3(4):MDNA3–0049-2014.
- Grabundzija I, Messing SA, Thomas J, Cosby RL, Bilic I, Miskey C, et al. A Helitron transposon reconstructed from bats reveals a novel mechanism of genome shuffling in eukaryotes. Nat Commun. 2016;7:10716.
- Yang L, Bennetzen JL. Distribution, diversity, evolution, and survival of Helitrons in the maize genome. Proc Natl Acad Sci U S A. 2009;106(47):19922–7.
- Xiong W, He L, Lai J, Dooner HK, Du C. HelitronScanner uncovers a large overlooked cache of Helitron transposons in many plant genomes. Proc Natl Acad Sci U S A. 2014;111(28):10263–8.
- Du C, Fefelova N, Caronna J, He L, Dooner HK. The polychromatic Helitron landscape of the maize genome. Proc Natl Acad Sci U S A. 2009;106(47):19916.
- Thomas J, Phillips CD, Baker RJ, Pritham EJ. Rolling-circle transposons catalyze genomic innovation in a mammalian lineage. Genome Biol Evol. 2014;6(10):2595–610.
- Han MJ, Shen YH, Xu MS, Liang HY, Zhang HH, Zhang Z. Identification and evolution of the silkworm Helitrons and their contribution to transcripts. DNA Res. 2013;20(5):471–84.
- Feschotte C, Pritham EJ. DNA transposons and the evolution of eukaryotic genomes. Annu Rev Genet. 2007;41(41):331.
- Hartl DL, Lohe AR, Lozovskaya ER. Modern thoughts on an ancyent marinere: function, evolution, regulation. Annu Rev Genet. 1997;31(31):337–58.
- Peccoud J, Loiseau V, Cordaux R, Gilbert C. Massive horizontal transfer of transposable elements in insects. Proc Natl Acad Sci U S A. 2017;114(18):4721–6.
- Sormacheva I, Smyshlyaev G, Mayorov V, Blinov A, Novikov A, Novikova O. Vertical evolution and horizontal transfer of CR1 non-LTR retrotransposons and Tc1/mariner DNA transposons in Lepidoptera species. Mol Biol Evol. 2012;29(12):3685–702.
- Thomas J, Schaack S, Pritham EJ. Pervasive horizontal transfer of rolling-circle transposons among animals. Genome Biol Evol. 2010;2:656–64.
- Coates BS. Horizontal transfer of a non-autonomous Helitron among insect and viral genomes. BMC Genomics. 2015;16:137.
- Guo X, Gao J, Li F, Wang J. Evidence of horizontal transfer of non-autonomous Lep1 Helitrons facilitated by host-parasite interactions. Sci Rep. 2014;4:5119.
- Pritham EJ, Feschotte C. Massive amplification of rolling-circle transposons in the lineage of the bat Myotis lucifugus. Proc Natl Acad Sci U S A. 2007;104(6):1895–900.
- Gorbunova V, Levy AA. Non-homologous DNA end joining in plant cells is associated with deletions and filler DNA insertions. Nucleic Acids Res. 1997;25(22):4650.
- Hedges SB, Dudley J, Kumar S. TimeTree: a public knowledge-base of divergence times among organisms. Bioinformatics. 2006;22(23):2971–2.
- Yang L, Bennetzen JL. Structure-based discovery and description of plant and animal Helitrons. Proc Natl Acad Sci U S A. 2009;106(31):12832–7.
- Mendiola MV, Bernales I, Cruz FDL. Differential roles of the transposon termini in IS91 transposition. Proc Natl Acad Sci U S A. 1994;91(5):1922–6.
- Cultrone A, Domínguez YR, Drevet C, Scazzocchio C, Fernández-Martín R. The tightly regulated promoter of the xanA gene of Aspergillus nidulans is included in a helitron. Mol Microbiol. 2007;63(6):1577–87.
- Stuart T, Eichten SR, Cahn J, Karpievitch YV, Borevitz JO, Lister R. Population scale mapping of transposable element diversity reveals links to gene regulation and epigenomic variation. Elife. 2016;5:e20777.
- Seibt KM, Wenke T, Muders K, Truberg B, Schmidt T. Short interspersed nuclear elements (SINEs) are abundant in Solanaceae and have a family-specific impact on gene structure and genome organization. Plant J. 2016;86(3):268–85.
- Soichi I, Kenzo N, Atsushi M. A link among DNA replication, recombination, and gene expression revealed by genetic and genomic analysis of TEBICHI gene of Arabidopsis thaliana. PLoS Genet. 2009;5(8):e1000613.
- Tsukamoto T, Hauck NR, Tao R, Ning J, Iezzoni AF. Molecular and genetic analyses of four nonfunctional S haplotype variants derived from a common ancestral S haplotype identified in sour cherry (Prunus cerasus L.). Genetics. 2010;184(2):411.
- Dong Y, Lu X, Song W, Shi L, Zhang M, Zhao H, et al. Structural characterization of helitrons and their stepwise capturing of gene fragments in the maize genome. BMC Genomics. 2011;12:609.
- Lal SK, Hannah LC. Plant genomes: massive changes of the maize genome are caused by Helitrons. Heredity. 2005;95(6):421–2.
- Dotto BR, Carvalho EL, da Silva AF, Dezordi FZ, Pinto PM, TdL C, et al. HTT-DB: new features and updates. Database. 2018;2018.
- Schaack S, Gilbert C, Feschotte C. Promiscuous DNA: horizontal transfer of transposable elements and why it matters for eukaryotic evolution. Trends Ecol Evol. 2010;25(9):537–46.
- Gilbert C, Schaack S, Pace JK II, Brindley PJ, Feschotte C. A role for host-parasite interactions in the horizontal transfer of transposons across phyla. Nature. 2010;464(7293):1347–50.
- Heimpel GE, Neuhauser C, Hoogendoorn M. Effects of parasitoid fecundity and host resistance on indirect interactions among hosts sharing a parasitoid. Ecol Lett. 2003;6(6):556–66.
- Cameron PJ, Walker GP. Host specificity of Cotesia rubecula and Cotesia plutellae, parasitoids of white butterfly and diamondback moth. New Zealand Plant Protection. 1997;50(50):236–41.
- Oliver K, Degnan P, Burke G, Moran N. Facultative symbionts in aphids and the horizontal transfer of ecologically important traits. Annu Rev Entomol. 2010;55(55):247–66.
- Husnik F, Nikoh N, Koga R, Ross L, Duncan RP, Fujie M, et al. Horizontal gene transfer from diverse bacteria to an insect genome enables a tripartite nested mealybug symbiosis. Cell. 2013;153(7):1567–78.
- Sloan DB, Nakabachi A, Richards S, Qu J, Murali SC, Gibbs RA, et al. Parallel histories of horizontal gene transfer facilitated extreme reduction of endosymbiont genomes in sap-feeding insects. Mol Biol Evol. 2014;31(4):857–71.
- Dunning Hotopp JC, Clark ME, Oliveira DC, Foster JM, Fischer P, Muñoz Torres MC, et al. Widespread lateral gene transfer from intracellular bacteria to multicellular eukaryotes. Science. 2007;317(5845):1753–6.
- Venner S, Miele V, Terzian C, Biemont C, Daubin V, Feschotte C, et al. Ecological networks to unravel the routes to horizontal transposon transfers. PLoS Biol. 2017;15(2):e2001536.
- Werren JH. Biology of Wolbachia. Annu Rev Entomol. 1997;42(1):587.
- Nikoh N, Tanaka K, Shibata F, Kondo N, Hizume M, Shimada M, et al. Wolbachia genome integrated in an insect chromosome: evolution and fate of laterally transferred endosymbiont genes. Genome Res. 2008;18(2):272–80.
- Gilbert C, Feschotte C. Horizontal acquisition of transposable elements and viral sequences: patterns and consequences. Curr Opin Genet Dev. 2018;49:15–24.
- Gilbert C, Chateigner A, Ernenwein L, Barbe V, Bezier A, Herniou EA, et al. Population genomics supports baculoviruses as vectors of horizontal transfer of insect transposons. Nat Commun. 2014;5:3348.
- Herniou EA, Olszewski JA, O'Reilly DR, Cory JS. Ancient coevolution of baculoviruses and their insect hosts. J Virol. 2004;78(7):3244–51.
- Zhang HH, Xu HE, Shen YH, Han MJ, Zhang Z. The origin and evolution of six miniature inverted-repeat transposable elements in Bombyx mori and Rhodnius prolixus. Genome Biol Evol. 2013;5(11):2020.
- Lerat E, Rizzon C, Biémont C. Sequence divergence within transposable element families in the Drosophila melanogaster genome. Genome Res. 2003;13(8):1889–96.
- Rice P, Longden I, Bleasby A. EMBOSS: the European molecular biology open software suite. Trends Genet. 2000;16(6):276–7.
- Reuter JS, Mathews DH. RNAstructure: software for RNA secondary structure prediction and analysis. BMC Bioinformatics. 2010;11(1):129.
- Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32(5):1792–7.
- Beitz E. TEXshade: shading and labeling of multiple sequence alignments using LATEX2 epsilon. Bioinformatics. 2000;16(2):135–9.
- Ronquist F, Teslenko M, DMP V, Ayres DL, Darling A, Höhna S, et al. MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst Biol. 2012;61(3):539–42.
- Criscuolo A, Gribaldo S. BMGE (block mapping and gathering with entropy): a new software for selection of phylogenetic informative regions from multiple sequence alignments. BMC Evol Biol. 2010;10(1):210.
- Posada D, Crandall KA. MODELTEST: testing the model of DNA substitution. Bioinformatics. 1998;14(9):817–8.
- Chen C, Xia R, Chen H, He Y. TBtools, a toolkit for biologists integrating various biological data handling tools with a user-friendly interface. BioRxiv. 2018:289660.