Phylogenomic analysis reveals genome-wide purifying selection on TBE transposons in the ciliate Oxytricha
© Chen and Landweber. 2016
Received: 24 September 2015
Accepted: 5 January 2016
Published: 25 January 2016
Transposable elements are a major player contributing to genetic variation and shaping genome evolution. Multiple independent transposon domestication events have occurred in ciliates, recruiting transposases to key roles in cellular processes. In the ciliate Oxytricha trifallax, the telomere-bearing elements (TBE), a Tc1/mariner transposon, occupy a significant portion of the germline genome and are involved in programmed genome rearrangements that produce a transcriptionally active somatic nucleus from a copy of the germline nucleus during development.
Here we provide a thorough characterization of the distribution and sequences of TBE transposons in the Oxytricha germline genome. We annotate more than 10,000 complete and 24,000 partial TBE sequences. TBEs cluster into four major families and display a preference for either insertion into DNA segments that are retained in the somatic genome or their maintenance at such sites. The three TBE-encoded genes in all four families display dN/dS ratios much lower than 1, suggesting genome-wide purifying selection. We also identify TBE homologs in other ciliate species for phylogenomic analysis.
This paper provides genome-wide characterization of a major class of ciliate transposons. Phylogenomic analysis reveals selective constraints on transposon-encoded genes, shedding light on the evolution and domesticated functions of these transposons.
KeywordsTransposable element Transposon domestication Ciliates Genome rearrangement Purifying selection
Transposable elements (TEs) are genomic parasites present in all eukaryotic genomes. There exist multiple different classes of TEs, which occupy distinct fractions of the genome and show a wide variety of genomic activity. Despite the drastic differences, TEs play important roles in shaping the genome and facilitating genome evolution by processes that can promote genome rearrangements, contribute to the origin of new genes and alter gene expression [1–4].
Ciliates are unicellular eukaryotes that possess two types of nuclei, a transcriptionally active somatic nucleus and an archival germline nucleus . The somatic nucleus develops from a copy of the germline through extensive genome rearrangements. In Oxytricha, the somatic macronucleus (MAC) is extremely gene dense, with ~16,000 short “nanochromosomes” that average 3.2 kb, and most encode a single gene . The germline micronucleus (MIC), on the other hand, exhibits a highly fragmented and complex genome architecture, with short gene segments (Macronuclear Destined Sequences, MDSs) interrupted by brief noncoding sequences (Internal Eliminated Sequences, IESs). These DNA segments are the information that is retained in the soma after development; intriguingly, the DNA segments are often present in a permuted order or inverse orientation in the germline. Therefore, correct assembly of functional genes in the soma requires precise deletion of noncoding sequences and extensive reordering and inversion of gene segments that are “scrambled” in the germline. The somatic genome is free of transposons, although it contains some transposase-like genes . Nearly 20 % of the germline genome is occupied by TEs , which are all eliminated during somatic development.
Ciliates provide novel model systems to study transposable elements because multiple TEs, especially the transposases they encode, have been recruited to provide important cellular functions for somatic development [8, 9]. The macronuclear genomes of Tetrahymena and Paramecium encode a homolog of the PiggyBac transposase that is expressed during development. Knockdown of the PiggyBac transposase results in a developmental defect, implicating its role in nuclear development [10, 11]. Tc1/mariner transposons are the most prevalent transposons in ciliate germline genomes, including the Tec elements in Euplotes  and Tennessee, Sardine and Thon elements in Paramecium [13, 14]. The terminal sequences of Paramecium IESs resemble the terminal inverted repeats of Tec elements in Euplotes [12, 15] and the ends of Tc1/mariner transposons , leading to the hypothesis that many IESs are remnants of TE insertions .
A few studies have suggested that purifying selection is acting on the 42kD transposase encoded by TBEs [21, 23, 24]. However, these studies were limited by the small number (up to 100) of TBE sequences that were previously available. The levels of selection acting on the 22kD and 57kD ORFs have not been reported before and here we investigate their properties genome-wide. With the recent sequencing and assembly of the Oxytricha micronuclear genome , we are able to provide a thorough characterization of TBE sequences in the germline, including their genomic distribution and sequence features. We also infer the levels of selective constraints acting on the three transposon-encoded ORFs, and we discovered homologs of TBE transposons in other ciliate genomes. Together, these results provide insights into the origin and evolution of TBE transposons in Oxytricha.
TBE sequences in the micronuclear genome cluster into four major families
Genomic distribution of TBEs in the Oxytricha germline genome
% of complete TBEs by length
% of partial TBEs by length
% of all TBEs by length
% total copies
9898 (TBE2.1 + TBE2.2 + unclassified partial TBE2)
Pairwise percent protein sequence similarity of TBE genes
89.3 ± 4.4
74.2 ± 3.6
74.9 ± 3.3
67.1 ± 3.1
90.4 ± 4.2
67.1 ± 3.6
68.5 ± 3.6
65.3 ± 3.4
86.4 ± 5.2
46.3 ± 2.7
49.5 ± 2.8
36.4 ± 1.7
89.3 ± 4.3
89.3 ± 4.2
67.9 ± 3.2
89.5 ± 5.1
87.6 ± 5.2
64.4 ± 3.4
87.2 ± 6.0
53.5 ± 3.1
34.5 ± 1.6
90.8 ± 4.5
68.7 ± 2.9
89.8 ± 4.8
65.5 ± 3.1
86.1 ± 5.9
38.2 ± 1.8
90.9 ± 4.8
89.9 ± 5.4
86.1 ± 7.5
Features of Oxytricha TBEs and complete TBEs identified in other stichotrich genomes
Terminal inverted repeat (TIR) (Underline for telomeric repeat)
Target site duplication
Distance (bp) between (mode, % of mode)
GGGGTTGGGGTTATTAAT (78 bp)
AGTAATTTAAGTGAGTAT (138 bp)
GTTTATGTAGGGGTTTAATGTTTAAAT (117 bp)
GTTTATGTAGGGGTTTAGGGTT (112 bp)
GGTCTTATTGGGGTTAGTAGAGAAA (85 bp)
TTAGTACACAAA (102 bp)
TCTTTATTAATACTTATAATCATGC (85 bp)
ACTGCGATGAGTTATATAAA (110 bp)
Distribution of TBEs in the micronuclear genome
Annotated TBEs occupy ~13.3 % of the micronuclear genome. This is slightly smaller than the previously reported estimate of 15 %  because the previous annotation is based on RepeatMasker (http://www.repeatmasker.org/), which uses sequence similarity at the nucleotide level, often including short nucleotide matches. Here, our annotation approach is based on sequence similarity at the protein level, and therefore sequences other than the three ORFs, such as terminal inverted repeats and spacer regions between the three ORFs, may have been missed, especially for partial TBEs. Among all annotated TBEs, 24.8 % are TBE1, 37.5 % are TBE2, with the ratio between TBE2.1 and TBE2.2 approximately 2.3:1, and 37.7 % are TBE3 (Table 1).
Annotated partial TBEs are more likely to be located within 500 bp of contig ends (57.7 %) than complete TBEs (19.4 %), suggesting that the original PacBio and Illumina-based genome assembly algorithm  had difficulty spanning repetitive sequences. Therefore, improvements in the genome assembly would be expected to lead to completion of these terminal, partial TBEs. On the other hand, partial TBEs located internal to a contig have lower sequence similarity to the protein sequence consensus of each family than those at contig termini (for example, % protein sequence similarity for internal partial TBE1 42kD genes: 71.6 ± 18.5; vs. terminal partial TBE1 42kD genes: 90.1 ± 9.3), suggesting that a significant portion of internal, partial TBEs are degenerate copies that are truly partial TBEs due to loss of one or two ORFs.
TBE sequences display a preference for insertion into MDSs (precursor DNA segments that are incorporated into the somatic genome), with more frequent distribution near MDSs (18.3 % within 500 bp) than the 11.1 % estimate of the genome space occupied by MDSs (Table 1, Chi-squared test, p-value = 6.304e-05). The short noncoding elements (IESs) that interrupt MDSs have long been proposed to be remnants of ancient transposon insertions . Since the TBE transposase has been implicated in IES removal and genome rearrangement , this enrichment near MDSs may facilitate the removal of both the transposons, themselves, and IESs. Among the TBEs that are near MDSs, there is a slight enrichment for members of TBE1 and TBE3, accompanied by a slight depletion of both TBE2 representatives (Table 1). Satellite repeats are another major class of repetitive sequences in the germline genome. There is no significant preference for TBE insertions near satellite repeats. Only 80 (0.79 %) complete TBEs reside within 500 bp of 380 bp repeats (which occupy 1.4 % of the genome) and 97 (0.96 %) complete TBEs reside near 170 bp repeats (which represent 1.2 % of the genome). Therefore, TBEs are more often associated with MDSs. Either their preferential insertion or maintenance near MDS-rich regions is consistent with the inferred participation of TBEs in genome rearrangement events that reassemble MDSs .
Sequence analysis of TBE sequences
Complete TBE sequences are highly similar to each other within each family (Table 2), with ~90 % pairwise similarity for the 42kD and 22kD ORFs and a slightly lower similarity, ~86 %, for the 57kD ORF. This high sequence similarity suggests that either their expansion and insertion occurred relatively recently or that each family is subject to strong selective constraints. TBE1 and TBE2 members are more similar to each other than to TBE3 (Table 2). Among different families, the 42kD transposase gene is more conserved than the 22kD ORF. The 57kD ORF is the least conserved compared to the other two ORFs (Table 2), with just 36.4 % similarity between TBE1 and TBE3, for example.
The terminal inverted repeat sequences are highly similar between both ends of a TBE (% sequence similarity: TBE1: 95.4 ± 3.6; TBE2.1: 93 ± 4.4; TBE2.2: 93.2 ± 4.4; TBE3: 95.9 ± 3.7), also consistent with either recent insertion or selective constraint.
Prevalence of premature stop codons and frameshifts
TBEs in newly sequenced stichotrich genomes
We searched six newly sequenced stichotrich macronuclear genomes  for orthologous sequences of TBE transposons. Since TBEs are repetitive sequences that occupy a large portion of the micronuclear genome, their copy number in whole cell DNA is often comparable to nanochromosomes at high copy number in the macronuclear genome. DNA prepared from whole cell extracts therefore often contains some TBE sequences. We took advantage of this to extract TBE orthologs from macronuclear genome assemblies prepared from whole cell DNA.
We were able to identify complete or partial TBE sequences in the macronuclear genome assemblies of Urostyla sp., Paraurostyla sp., Laurentiella sp., Stylonychia lemnae, Tetmemena sp. and Sterkiella histriomuscorum (the phylogeny of these species is discussed in ). Complete TBEs were found in Laurentiella, Tetmemena and Sterkiella (Table 3), with conserved orientation of the three ORFs but distinct terminal inverted repeats. Complete but degenerate TBEs were found in Paraurostyla, with no terminal inverted repeat and an inverted 57kD ORF. In Stylonychia, we could only identify incompletely assembled contigs containing TBE sequences. In Urostyla, we found only one sequence that exhibits weak protein sequence similarity to the Oxytricha 42kD transposase and we identified no homolog for the 22kD and 57kD ORFs. Similar to the DDE transposases in Euplotes Tec elements and the Tetrahymena and Paramecium genomes , the Urostyla DDE transposase homolog is very divergent from the Oxytricha 42kD transposase, exhibiting ~26 % sequence similarity in only a ~100 amino acid region containing the DDE motif towards the C-terminus.
Phylogenetic analysis supports the grouping of the assembled TBE orthologs in Sterkiella, Tetmemena and Laurentiella with Oxytricha TBE3 (Fig. 1b, c and d) (the incompletely assembled TBE sequences in the Urostyla, Paraurostyla and Stylonychia genomes also group with TBE3, data not shown). We found no premature stop codon in Sterkiella TBE orthologs of the 42kD and 22kD ORFs, the Tetmemena ortholog of the 22kD TBE ORF and the Laurentiella ortholog of the 22kD ORF. The Tetmemena and Laurentiella orthologs of the 42kD transposase both contain a premature stop codon, neither of which is present at a homologous position with each other nor with common sites of premature stop codons in the 42kD ORFs of Oxytricha TBE3. This suggests that the most common premature stop codons in the Oxytricha TBE3 genes may be specific to the Oxytricha lineage.
Since TBE orthologs group with TBE3, and TBE1 and TBE2 are more similar to each other than either is to TBE3 (Table 2), we infer that the TBE1 and TBE2 divergence and expansion most likely occurred recently in the Oxytricha lineage. Alternatively, the divergence may have occurred earlier but orthologous TBE1 and TBE2 sequences could be rare, or otherwise absent from the whole cell genome data for all other stichotrich genomes surveyed, or TBE 1 and 2 could have been lost from those lineages during evolution; however, these are all less parsimonious explanations than the conclusion that TBE 1 and 2 arose after TBE3 and underwent an expansion in the Oxytricha lineage. Furthermore, preliminary micronuclear genome sequence data from one of the outgroup species confirm the absence of TBE1/TBE2 orthologs in its micronuclear genome (Beh, Lindblad, Chen, Sebra, and Landweber, unpublished). Since the micronuclear genome sequences of most stichotrichs are not available and the DDE transposases in Euplotes, Tetrahymena and Paramecium are too divergent to provide outgroups, it is difficult to infer features of the ancestral TBE transposon that first invaded stichotrich germline genomes.
We report a genome-wide characterization of the distribution and sequence features of TBE transposons in Oxytricha and provide phylogenomic evidence that the root among them may be in the TBE3 clade. The four major TBE families each have distinct terminal inverted repeats and spacer regions between TIRs and ORFs.
Of the three TBE-encoded genes, the 57kD ORF is much less conserved among different families than the 42kD and 22kD ORFs. It is possible that the structure and function of the 57kD protein allows it to be tolerant to more substitutions. Notably, the two subfamilies of TBE2 have similar 42kD and 22kD genes but very different 57kD genes, consistent with both increased variation in and the possible expanding roles of the 57kD protein. One type of the TBE2.2 terminal inverted repeat is precisely 21 bp shorter than the TBE2.1 TIR. It is possible that TBE2.1 is the ancestral form of TBE2 and that TBE2.2 diverged later from TBE2.1 with the acquisition of substitutions in the 57kD ORF, and that this was accompanied by shortening or altering the TIRs. While all three ORFs currently appear to be under purifying selection, the branch lengths in Fig. 1d suggest that the 57kD gene appears to have evolved rapidly under relaxed selective constraints after the divergence of the TBE families. This may have been a period when the diversification of the 57kD genes contributed to the functional differences among TBE families. Functional studies of the 57kD protein would provide insight into its biological roles in transposon elimination or genome rearrangement.
No TBE1 or TBE2 orthologs are found in related stichotrich ciliates, but future sequencing of their germline genomes would provide a better view of their germline transposons and help delineate the origin and evolution of TBE1 and TBE2, as well as TBE2.1 and TBE2.2 elements. Comparative germline genome sequences will also shed light on the evolutionary relationship between TBE3 and TBE1/2, and possibly permit inference of the ancestral TBE type that first invaded ciliate genomes.
Our analysis of transposons relies on the accuracy of genome assembly. The Oxytricha micronuclear genome was assembled using a hybrid approach, taking advantage of long PacBio reads that average ~7 kb . A complete TBE sequence is ~4 kb and can be easily spanned by a PacBio read. Therefore, the accuracy of the assembly should be high for characterization of the genomic location and distribution of TBEs. However, PacBio reads were first error-corrected with high confidence unitigs assembled from Illumina reads before genome assembly , and Illumina reads, limited by their short length, can be ambiguous in repetitive regions. While Illumina unitigs are longer and more informative than Illumina reads, it is still possible that unitigs were ambiguous in resolving individual repeats, and hence that some PacBio reads deriving from repetitive regions may have not been corrected 100 % accurately. Therefore, TBE sequences will assemble less accurately than non-repetitive regions. The observed prevalence of premature stop codons and frameshifts may partially derive from this assembly artifact. However, such assembly artifacts could not explain the enrichment of stop codons at specific sites in the 42kD and 22kD ORFs of TBE3, since they would result in stop codons that are randomly distributed across the coding sequence. Assembly artifacts may have also contributed to the slightly higher dN/dS ratios that we identified, compared to ref. . Another factor contributing to the higher dN/dS ratios could be that we included all annotated TBEs (both active and inactive copies) in the analysis, whereas the previous study was based on a small set of known TBE sequences that are more likely to contain active copies.
This study provides the first genome-wide evolutionary analysis of ciliate transposons, suggesting the importance of all three TBE-encoded gene products, either in genome arrangement or other aspects of late nuclear differentiation, when the transposon genes are expressed. Sequencing and comparative analysis of more ciliate germline genomes will provide insights into the evolution and recruitment of domesticated transposons in genomes with complex genetic architecture.
Annotation and extraction of TBE sequences from the micronuclear genome
The protein sequences for the three ORFs (GenBank accession: AAB42034.1, AAB42016.1 and AAB42018.1) were used to query the Oxytricha micronuclear genome (GenBank accession: ARYC00000000) as well as the ciliate macronuclear genome assemblies (Urostyla sp.: LASQ02000000, Paraurostyla sp.: LASR02000000, Laurentiella sp.: LASS02000000, Sterkiella histriomuscorum: LAST02000000, Tetmemena sp.: LASU02000000 and Stylonychia lemnae: ADNZ03000000) with TBLASTN (BLAST+ , parameters: -db_gencode 6 -evalue 1e-7). TBE regions were annotated according to the TBLASTN output. Regions containing three ORFs in proximity (within 1 kb from each other) and in the correct orientation were annotated as complete TBEs, while those that do not contain all three ORFs were annotated as partial TBEs.
Clustering and alignment of TBE sequences
Complete TBE sequences were aligned to each other using an all-by-all BLASTN (BLAST+ , parameters: -word_size 50). Pairwise sequence similarity values were converted into input for MCL (parameter: -I 1.2) , which clustered TBE sequences into large clusters. Coding sequences were extracted using Exonerate  (parameters: --model protein2dna --geneticcode 6 --ryo “ > %ti_%tab_%tae\n%tcs” --verbose 0 --showalignment no --showvulgar no). All-by-all BLASTP searches were performed on translated protein sequences (BLAST+ , E-value cutoff 10−7) and pairwise protein sequence similarities were extracted from the BLASTP output. Terminal inverted repeats were determined by aligning the two ends of a TBE sequence using BLASTN, and clustering and consensus sequence generation were performed using UCLUST . For Oxytricha TBEs, target site duplications were determined by comparing the MIC genome sequences immediately flanking TBEs. For TBEs identified in other ciliate macronuclear genomes, target site duplications were determined by mapping genomic reads to TBEs using BWA  (default parameters) and comparing the sequences flanking the terminal inverted repeats.
Construction of phylogenetic trees
Randomly sampled protein sequences of the three TBE ORFs were aligned with MAFFT  and excess gaps and poorly aligned regions were removed with trimAl (version 1.2, with the “-automated1” parameter) . We used ProtTest  to determine the most suitable protein model (JTT + I + G). Phylogenetic trees were generated from the alignments using MrBayes v3.2.2  (parameters: prset aamodelpr = fixed(jones); lset rates = invgamma). Trees were drawn using FigTree 1.4.2 (http://tree.bio.ed.ac.uk/software/figtree/).
Estimation of substitution rates
Pairwise protein alignments (MAFFT version 6.956b, ) were performed for each of the three genes (stop codons masked and frameshifts corrected) encoded by TBEs. Protein alignments were converted to coding sequence alignments using PAL2NAL . The lengths of trimmed alignments are 344 codons (42kD protein), 187 codons (22kD) and 471 codons (57kD). Nonsynonymous to synonymous rate (dN/dS) ratios were calculated using the codeml program in PAML  (version 4.5) with parameters “icode = 5, runmode = −2, CodonFreq = 2”. Synonymous substitution rates below 0.01 or above 5 were excluded from the analysis. In addition to pairwise dN/dS estimation, we randomly sampled 50 to 80 42kD, 22kD and 57kD ORFs and used codeml to compare likelihoods of models with estimated dN/dS (runmode = 0, fixed_omega = 0) and that with dN/dS = 1 (runmode = 0, fixed_omega = 1, omega = 1) (Additional file 3).
Telomere bearing elements
Macronuclear destined sequence
Internal eliminated sequence
Open reading frame
Target site duplication
Terminal inverted repeat
We thank Clayton Schwarz for preliminary analysis, Tom Doak, and all members of the Landweber lab, and Seolkyoung Jung and Sean Eddy for discussion. This study was supported by NIH grants GM59708, GM109459 and the Human Frontier Science Program RGP004/2014 to L.F.L.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Kidwell MG, Lisch DR. Transposable elements and host genome evolution. Trends Ecol Evol. 2000;15:95–9.PubMedView ArticleGoogle Scholar
- Feschotte C, Pritham EJ. DNA transposons and the evolution of eukaryotic genomes. Annu Rev Genet. 2007;41:331–68.PubMedPubMed CentralView ArticleGoogle Scholar
- Slotkin RK, Martienssen R. Transposable elements and the epigenetic regulation of the genome. Nat Rev Genet. 2007;8:272–85.PubMedView ArticleGoogle Scholar
- Fedoroff NV. Transposable elements, epigenetics, and genome evolution. Science. 2012;338:758–67.PubMedView ArticleGoogle Scholar
- Prescott DM. The DNA of ciliated protozoa. Microbiol Rev. 1994;58:233–67.PubMedPubMed CentralGoogle Scholar
- Swart EC, Bracht JR, Magrini V, Minx P, Chen X, Zhou Y, et al. The oxytricha trifallax macronuclear genome: a complex eukaryotic genome with 16,000 tiny chromosomes. PLoS Biol. 2013;11:e1001473.PubMedPubMed CentralView ArticleGoogle Scholar
- Chen X, Bracht JR, Goldman AD, Dolzhenko E, Clay DM, Swart EC, et al. The architecture of a scrambled genome reveals massive levels of genomic rearrangement during development. Cell. 2014;158:1187–98.PubMedPubMed CentralView ArticleGoogle Scholar
- Dubois E, Bischerour J, Marmignon A, Mathy N, Regnier V, Betermier M. Transposon invasion of the paramecium germline genome countered by a domesticated PiggyBac transposase and the NHEJ pathway. Int J Evol Biol. 2012;2012:e436196.View ArticleGoogle Scholar
- Vogt A, Goldman AD, Mochizuki K, Landweber LF. Transposon domestication versus mutualism in ciliate genome rearrangements. PLoS Genet. 2013;9:e1003659.PubMedPubMed CentralView ArticleGoogle Scholar
- Baudry C, Malinsky S, Restituito M, Kapusta A, Rosa S, Meyer E, et al. PiggyMac, a domesticated piggyBac transposase involved in programmed genome rearrangements in the ciliate Paramecium tetraurelia. Genes Dev. 2009;23:2478–83.PubMedPubMed CentralView ArticleGoogle Scholar
- Cheng C-Y, Vogt A, Mochizuki K, Yao M-C. A domesticated piggyBac transposase plays key roles in heterochromatin dynamics and DNA cleavage during programmed DNA deletion in Tetrahymena thermophila. Mol Biol Cell. 2010;21:1753–62.PubMedPubMed CentralView ArticleGoogle Scholar
- Jahn CL, Doktor SZ, Frels JS, Jaraczewski JW, Krikau MF. Structures of the Euplotes crassus Tec1 and Tec2 elements: identification of putative transposase coding regions. Gene. 1993;133:71–8.PubMedView ArticleGoogle Scholar
- Mouël AL, Butler A, Caron F, Meyer E. Developmentally regulated chromosome fragmentation linked to imprecise elimination of repeated sequences in paramecia. Eukaryot Cell. 2003;2:1076–90.PubMedPubMed CentralView ArticleGoogle Scholar
- Arnaiz O, Mathy N, Baudry C, Malinsky S, Aury J-M, Denby Wilkes C, et al. The paramecium germline genome provides a niche for intragenic parasitic DNA: evolutionary dynamics of internal eliminated sequences. PLoS Genet. 2012;8:e1002984.PubMedPubMed CentralView ArticleGoogle Scholar
- Doak TG, Witherspoon DJ, Jahn CL, Herrick G. Selection on the genes of Euplotes crassus Tec1 and Tec2 transposons: evolutionary appearance of a programmed frameshift in a Tec2 gene encoding a tyrosine family site-specific recombinase. Eukaryot Cell. 2003;2:95–102.PubMedPubMed CentralView ArticleGoogle Scholar
- Klobutcher LA, Herrick G. Consensus inverted terminal repeat sequence of Paramecium IESs: resemblance to termini of Tc1-related and Euplotes Tec transposons. Nucleic Acids Res. 1995;23:2006–13.PubMedPubMed CentralView ArticleGoogle Scholar
- Klobutcher LA, Herrick G. Developmental genome reorganization in ciliated protozoa: the transposon link. Prog Nucleic Acid Res Mol Biol. 1997;56:1–62.PubMedView ArticleGoogle Scholar
- Herrick G, Cartinhour S, Dawson D, Ang D, Sheets R, Lee A, et al. Mobile elements bounded by C4A4 telomeric repeats in Oxytricha fallax. Cell. 1985;43(3 Pt 2):759–68.PubMedView ArticleGoogle Scholar
- Seegmiller A, Williams KR, Hammersmith RL, Doak TG, Witherspoon D, Messick T, et al. Internal eliminated sequences interrupting the Oxytricha 81 locus: allelic divergence, conservation, conversions, and possible transposon origins. Mol Biol Evol. 1996;13:1351–62.PubMedView ArticleGoogle Scholar
- Doak TG, Doerder FP, Jahn CL, Herrick G. A proposed superfamily of transposase genes: transposon-like elements in ciliated protozoa and a common “D35E” motif. Proc Natl Acad Sci U S A. 1994;91:942–6.PubMedPubMed CentralView ArticleGoogle Scholar
- Nowacki M, Higgins BP, Maquilan GM, Swart EC, Doak TG, Landweber LF. A functional role for transposases in a large eukaryotic genome. Science. 2009;324:935–8.PubMedPubMed CentralView ArticleGoogle Scholar
- Williams K, Doak TG, Herrick G. Developmental precise excision of Oxytricha trifallax telomere-bearing elements and formation of circles closed by a copy of the flanking target duplication. EMBO J. 1993;12:4593–601.PubMedPubMed CentralGoogle Scholar
- Doak TG, Witherspoon DJ, Doerder FP, Williams K, Herrick G. Conserved features of TBE1 transposons in ciliated protozoa. Genetica. 1997;101:75–86.PubMedView ArticleGoogle Scholar
- Witherspoon DJ, Doak TG, Williams KR, Seegmiller A, Seger J, Herrick G. Selection on the protein-coding genes of the TBE1 family of transposable elements in the ciliates Oxytricha fallax and O. trifallax. Mol Biol Evol. 1997;14:696–706.PubMedView ArticleGoogle Scholar
- Schwarz C. The roles of transposon-encoded genes in a genome rearrangement process. Princeton University Senior Thesis. 2011.
- Chen X, Jung S, Beh LY, Eddy SR, Landweber LF. Combinatorial DNA rearrangement facilitates the origin of new genes in ciliates. Genome Biol Evol. 2015;7(10):2859–70. doi:10.1093/gbe/evv172.PubMedPubMed CentralGoogle Scholar
- Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10:421.PubMedPubMed CentralView ArticleGoogle Scholar
- van Dongen S: A cluster algorithm for graphs. Tech. Report, National Research Institute for Mathematics and Computer Science in the Netherlands, Amsterdam 2000.
- Slater GS, Birney E. Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics. 2005;6:31.PubMedPubMed CentralView ArticleGoogle Scholar
- Edgar RC. Search and clustering orders of magnitude faster than BLAST. Bioinformatics. 2010;26:2460–1.PubMedView ArticleGoogle Scholar
- Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics Oxf Engl. 2009;25:1754–60.View ArticleGoogle Scholar
- Katoh K, Misawa K, Kuma K, Miyata T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002;30:3059–66.PubMedPubMed CentralView ArticleGoogle Scholar
- Capella-Gutiérrez S, Silla-Martínez JM, Gabaldón T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics. 2009;25(15):1972–3. doi:10.1093/bioinformatics/btp348.PubMedPubMed CentralView ArticleGoogle Scholar
- Darriba D, Taboada GL, Doallo R, Posada D. ProtTest 3: fast selection of best-fit models of protein evolution. Bioinformatics. 2011;27(8):1164–5. doi:10.1093/bioinformatics/btr088.PubMedView ArticleGoogle Scholar
- Huelsenbeck JP, Ronquist F. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics Oxf Engl. 2001;17:754–5.View ArticleGoogle Scholar
- Suyama M, Torrents D, Bork P. PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic Acids Res. 2006;34(Web Server issue):W609–12.PubMedPubMed CentralView ArticleGoogle Scholar
- Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007;24:1586–91.PubMedView ArticleGoogle Scholar