Insertion site preference of Mu, Tn5, and Tn7 transposons
© Green et al; licensee BioMed Central Ltd. 2012
Received: 5 October 2011
Accepted: 7 February 2012
Published: 7 February 2012
Transposons, segments of DNA that can mobilize to other locations in a genome, are often used for insertion mutagenesis or to generate priming sites for sequencing of large DNA molecules. For both of these uses, a transposon with minimal insertion bias is desired to allow complete coverage with minimal oversampling.
Three transposons, Mu, Tn5, and Tn7, were used to generate insertions in the same set of fosmids containing Candida glabrata genomic DNA. Tn7 demonstrates markedly less insertion bias than either Mu or Tn5, with both Mu and Tn5 biased toward sequences containing guanosine (G) and cytidine (C). This preference of Mu and Tn5 yields less uniform spacing of insertions than for Tn7, in the adenosine (A) and thymidine (T) rich genome of C. glabrata (39% GC).
In light of its more uniform distribution of insertions, Tn7 should be considered for applications in which insertion bias is deleterious.
Keywords: Tn7, Mu, Tn5, Mutagenesis, Insertion site, DNA transposon, Mobile element
Transposons, mobile DNA elements that can integrate into target DNA molecules, are useful for insertional mutagenesis, gene tagging, gene transfer, and sequencing applications. A major class of transposable elements used for genome engineering is DNA 'cut and paste' transposons. The transposases for DNA transposons cut the transposon away from the donor DNA by a variety of mechanisms, and the excised transposon integrates into a new target site by joining of its 3'OH termini to staggered positions on the top and bottom DNA strands of the target. This staggered joining results in a target site duplication of a defined number of base pairs, which can be used to map precisely the site of integration for the transposon.
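The consequence of staggered joining described above can be illustrated with a minimal Python sketch. The function names, the toy sequences, and the duplication length passed in are all illustrative assumptions, not part of the published methods; the sketch only shows why the same short run of target bases appears on both sides of an integrated element.

```python
from typing import Optional


def insert_transposon(target: str, transposon: str, pos: int, tsd_len: int) -> str:
    """Insert `transposon` into `target` with a target site duplication.

    The `tsd_len` bases of the target starting at `pos` end up on both
    sides of the element, mimicking repair of the staggered joints.
    (Hypothetical helper, for illustration only.)
    """
    if tsd_len <= 0:
        raise ValueError("tsd_len must be positive")
    if not 0 <= pos <= len(target) - tsd_len:
        raise ValueError("insertion site does not fit in target")
    tsd = target[pos:pos + tsd_len]
    return target[:pos] + tsd + transposon + tsd + target[pos + tsd_len:]


def find_tsd(left_flank: str, right_flank: str, tsd_len: int) -> Optional[str]:
    """Return the duplicated bases if the two flanks share them, else None."""
    if tsd_len <= 0:
        raise ValueError("tsd_len must be positive")
    left = left_flank[-tsd_len:]
    right = right_flank[:tsd_len]
    if len(left) == tsd_len and left == right:
        return left
    return None


if __name__ == "__main__":
    # Toy target and a lowercase placeholder standing in for the element.
    product = insert_transposon("AAACCGGTTT", "tnp", pos=3, tsd_len=5)
    print(product)
    left, right = product.split("tnp")
    print(find_tsd(left, right, 5))
```

In practice the two flanks would come from sequencing reads primed outward from each transposon end, and the shared bases pin down the exact integration site.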
In most of the applications of transposons to molecular biology, it is important that the transposon insert into target DNA with little to no sequence bias. Limited sequence bias will lead to more complete coverage of a region for a given number of insertion events. However, most transposons have been shown to exhibit some preference for certain sequences or sequence features. Clearly, insertion site bias may be a confounding factor for large scale transposon mutagenesis projects.
A number of manuscripts reporting insertion motifs for various transposons have been published, but the target DNA, transposition protocol and environment (in vitro versus in vivo) vary widely, making direct comparisons difficult. For example, individual genes, Escherichia coli genomic DNA, and Saccharomyces cerevisiae genomic DNA have been used. In this publication, three transposon systems were evaluated using the same target DNA in vitro: Mu, Tn5, and a modified Tn7. Previous work had identified a CPy(G/C)PuG or similar motif for Mu [6–8], a GPyPyPy(A/T)PuPuPuC motif for Tn5 [9, 10] and negligible bias for the modified Tn7. Since previous publications all used different target DNA, and because our DNA of interest (C. glabrata genomic DNA) has a moderately high A/T content (61%), specificity and distribution of insertion sites for all three transposons were assessed on the same target DNAs.
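As a rough illustration of what the published motifs mean in practice, the sketch below translates the Py/Pu notation into regular expressions (Py = C or T, Pu = A or G) and lists every position in a sequence where a motif occurs. The dictionary and function names are assumptions made for this example. Both degenerate patterns are their own reverse complement, so scanning one strand is sufficient here.

```python
import re

# Motifs from the text, written as regular expressions.
# Py (pyrimidine) = C or T; Pu (purine) = A or G.
MOTIFS = {
    "Mu": "C[CT][GC][AG]G",                    # CPy(G/C)PuG
    "Tn5": "G[CT][CT][CT][AT][AG][AG][AG]C",   # GPyPyPy(A/T)PuPuPuC
}


def find_motif(seq: str, pattern: str) -> list:
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # A lookahead lets overlapping occurrences be reported.
    return [m.start() for m in re.finditer("(?=(?:" + pattern + "))", seq.upper())]


if __name__ == "__main__":
    toy = "AACCGGGTTGCTCAGAACTT"  # toy sequence, not real data
    for name, pattern in MOTIFS.items():
        print(name, find_motif(toy, pattern))
```

Counting motif occurrences per kilobase in a candidate target gives a quick, if crude, sense of how many preferred sites a G/C-seeking transposon would find in an A/T-rich region.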
BACs containing C. glabrata genomic DNA were prepared as follows. First, the vector plasmid pBAC-NAT was constructed in two steps. pCR2.1-NAT was constructed by amplifying the NAT cassette using primers ON-5'NAT (CCGCTGCTAGGCGCGCCGTGGAAGTTCCTATACTTTCTAGAGAATAGGAACTTCGATCCCCCCCATAAAGCACGTGATAGCTTC) and ON-3'NAT (GCAGGGATGCGGCCGCTGACGAAGTTCCTATTCTCTAGAAAGTATAGGAACTTCAGCTTGATATCGAATTCCGCAAATTAAAGCC) from pCaNAT1 (a gift of Julia Koehler) and cloning that into pCR2.1 using the TA/TOPO kit (Life Technologies, Carlsbad, CA, USA 92008). The NAT cassette was then amplified using primers ON3601 (AGTCGCGGCCGCGTTTAAACGGCGCCCCGCTGCTAGGCGCGCCGTG) and ON3602 (AGTCGGCCCGGGCGGCCACGCGTTGACCCGCGGGCAGGGATACGGCCGCTGAC), cloned into pCR2.1, and sequence verified, and a NotI/SfiI fragment of that clone was inserted into pBAC cut with NotI/SfiI to yield pBAC-NAT (pB1895).
Next, genomic DNA was inserted into pBAC-NAT. The four plasmids into which transposons were mobilized contain genomic DNA from C. glabrata from the indicated ORF to the telomere. The genomic DNA began at CAGL0A00187g from the strain BG2 for pB1907 (24,252 bp and 34% GC insert), from CAGL0C00297g and strain BG2 for pB1908 (31,757 bp and 34% GC insert), from CAGL0C05599g and strain BG2 for pB1909 (25,125 bp and 34% GC insert), and from CAGL0C05599g and strain CBS138 for pB1910 (19,423 bp and 31% GC insert). Although pB1909 and pB1910 contain the region from the same gene to the telomere from different strains, they are only homologous for the centromeric (rightmost in figures) approximately 8 kb, after which they diverge completely (data not shown).
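The GC percentages quoted for each insert are simple base counts, and the same calculation applied in windows shows how composition varies along a clone. The following is a minimal sketch; the function names, window size, and step are assumptions for illustration, and ambiguous bases such as N are ignored.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return (seq.count("G") + seq.count("C")) / acgt


def windowed_gc(seq: str, window: int, step: int) -> list:
    """Return (start, GC fraction) for each full window along the sequence."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    toy = "GGGGCCCCAAAATTTTACGT"  # toy sequence, not real data
    print(round(gc_fraction(toy), 2))
    print(windowed_gc(toy, window=8, step=4))
```

For a real clone, the sequence would be read from a FASTA file and a window of a few hundred base pairs would be a more sensible choice.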
Mu transposition reactions were carried out per the manufacturer's recommendations using the Finnzyme Template Generation System Kit (Thermo Fisher Scientific, Waltham, MA, USA 02454), with pB1909 and pB1910 as target DNA sequences. Tn5 transposition reactions were carried out per the manufacturer's recommendations using the Ez-Tn5 kit (Epicentre, Madison, WI, USA 53713) with pB1907, pB1908, pB1909, and pB1910 as target DNA sequences. Tn7 reactions were carried out as published, with pB1907, pB1908, pB1909, and pB1910 as target DNA sequences.
All sequencing was done using the ABI BigDye Terminator Kit v1.1 (Life Technologies, Carlsbad, CA, USA 92008). The primers used for sequencing the Mu transposon containing clones were SeqE (CGACACACTCCAATCTTTCC) and SeqW (GGTGGCTGGAGTTAGACATC). The primers used for sequencing the Tn5 transposon containing clones were KAN-2 FP-1 (ACCTACAACAAAGCTCTCATCAACC) and KAN-2 RP-1 (GCAATGTAACATCAGAGATTTTGAG). The primers used for sequencing the Tn7 transposon containing clones were ON661 (ATAATCCTTAAAAACTCCATTTCCACCCCTCCCAG) and ON662 (GACTTTATTGTCATAGTTTAGATCTATTTTGTTCAG).
BLogo sequence logos were generated using the web form at http://www.bioinformatics.org/blogo/cgi-bin/Blogo/Blogoform.pl as type 2 logos with coloring for symbols with P < 0.001 (Fisher's exact test) and base representation calculated from the fosmid sequences into which the various transposons were integrated. The background frequencies of A, C, G, and T used for the BLogo sequence logos are given in the figure legends.
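The inputs to a logo of this kind are a set of aligned insertion-site windows and a set of background base frequencies. The sketch below shows one simple way to compute both, plus a log2 enrichment of each base at each position. It is an illustrative approximation only: it does not reproduce BLogo's own calculation or its Fisher's exact test, and the function names and the pseudocount are assumptions.

```python
import math
from collections import Counter


def background_frequencies(seqs: list) -> dict:
    """Overall A/C/G/T frequencies across the target sequences."""
    counts = Counter()
    for seq in seqs:
        counts.update(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no A, C, G, or T found")
    return {base: counts[base] / total for base in "ACGT"}


def position_counts(sites: list) -> list:
    """Per-position base counts for equal-length aligned site windows."""
    if not sites:
        raise ValueError("no sites given")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all site windows must have the same length")
    return [Counter(site[i].upper() for site in sites) for i in range(length)]


def log2_enrichment(sites: list, background: dict, pseudocount: float = 0.5) -> list:
    """log2(observed frequency / background frequency) per position and base.

    The pseudocount (an assumption of this sketch) avoids log of zero.
    """
    n = len(sites)
    rows = []
    for counts in position_counts(sites):
        rows.append({
            base: math.log2(
                ((counts[base] + pseudocount) / (n + 4 * pseudocount))
                / background[base]
            )
            for base in "ACGT"
        })
    return rows


if __name__ == "__main__":
    bg = background_frequencies(["ATATATGCATATAT"])   # toy A/T-rich background
    for row in log2_enrichment(["ACGGT", "TCGGA", "ACGGA"], bg):
        print({base: round(value, 2) for base, value in row.items()})
```

Positive values mark bases that are over-represented at a position relative to the target as a whole; a significance test would still be needed before calling any position biased.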
Number of insertions mapped
The G/C content of the fosmids mutagenized is not uniform across their length (see below), so some of the apparent bias in the flanking base pairs might simply reflect sampling bias introduced by a strongly biased central core. In fact, if the base percentages used in BLogo are calculated from all 25mers in the fosmids containing a central CGG core, no other positions show significant bias for Mu transposition (Figure 1D). The results from a similar analysis for Tn5 were ambiguous, but many of the Tn5 insertions were in the C. glabrata telomeric repeats (which are very G/C rich), which could explain the observed flanking bias.
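The conditioned background described here can be sketched as follows: collect every 25-mer in the target that has the core at its center, then compute base frequencies position by position across that set. The function names are assumptions, and for brevity the sketch scans only the strand it is given; a fuller analysis would also consider the reverse complement.

```python
def kmers_with_central_core(seq: str, core: str = "CGG", k: int = 25) -> list:
    """Return every k-mer of `seq` that has `core` exactly at its center."""
    if (k - len(core)) % 2 != 0:
        raise ValueError("k and core length must leave equal flanks")
    seq = seq.upper()
    flank = (k - len(core)) // 2
    hits = []
    start = seq.find(core)
    while start != -1:
        left = start - flank
        # Keep only windows that fit entirely inside the sequence.
        if left >= 0 and left + k <= len(seq):
            hits.append(seq[left:left + k])
        start = seq.find(core, start + 1)
    return hits


def position_frequencies(kmers: list) -> list:
    """Per-position A/C/G/T frequencies across equal-length k-mers."""
    if not kmers:
        raise ValueError("no k-mers given")
    n = len(kmers)
    return [
        {base: column.count(base) / n for base in "ACGT"}
        for column in zip(*kmers)
    ]


if __name__ == "__main__":
    toy = "AACGGTTCGGATTACGGAA"  # toy sequence, not real data
    windows = kmers_with_central_core(toy, core="CGG", k=7)
    print(windows)
    print(position_frequencies(windows)[0])
```

Comparing observed insertion sites against these per-position frequencies, rather than against the overall base composition, separates bias in the flanks from bias that merely travels along with the core.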
Uniform distribution of transposon insertions is important for the use of transposons in many molecular biological applications. In the analysis here, Tn7 demonstrates a far more random insertion profile than either Mu or Tn5. For Tn5 and Mu, the consensus sequences derived are consistent with extensive published analyses [5–10]. Transposon insertion site preference is complex. For example, previous studies of Mu insertion have shown that particular dinucleotides, or base steps, contribute to site preference; predicted structural features of DNA also may play a role [6–8]. However, even these comprehensive analyses of insertion site preference derived consensus sites that are both G/C rich and consistent with those we derived here. We suggest that the preference for G/C rich sequences exhibited by Mu and Tn5 is of particular importance for investigators working with A/T rich genomes such as S. cerevisiae (approximately 62% A/T) and many other fungi, Caenorhabditis elegans (approximately 65% A/T), and human (approximately 60% A/T). Tn7 has less target bias, with a weak preference for A/T rich sequences. For the large genomic inserts analyzed here, this resulted in a broad distribution of insertion sites, with a bias away from the plasmid backbone (which is more G/C rich relative to the cloned genomic DNA). The minimal sequence bias exhibited by Tn7 suggests that Tn7 should be considered for generating random insertions in cloned A/T-rich genomes.
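One simple way to put a number on how evenly a set of insertions covers a clone is to look at the gaps between neighboring insertion sites. The sketch below is an illustrative calculation, not the analysis used in this study. It treats the region as linear and counts the distance from each end to the nearest insertion as a gap; the function name and the returned fields are assumptions.

```python
import statistics


def gap_summary(positions: list, length: int) -> dict:
    """Summarize spacing between distinct insertion sites on a linear region.

    Returns the number of distinct sites, the largest and mean gap, and the
    coefficient of variation (CV) of the gaps. Lower CV means more even spacing.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    sites = sorted(set(positions))
    if any(p < 0 or p > length for p in sites):
        raise ValueError("positions must lie within the region")
    bounds = [0] + sites + [length]
    gaps = [b - a for a, b in zip(bounds, bounds[1:])]
    mean_gap = statistics.mean(gaps)
    return {
        "n_sites": len(sites),
        "max_gap": max(gaps),
        "mean_gap": mean_gap,
        "cv": statistics.pstdev(gaps) / mean_gap,
    }


if __name__ == "__main__":
    # Toy positions, not real data: evenly spread versus clustered insertions.
    print(gap_summary([250, 500, 750], length=1000))
    print(gap_summary([10, 20, 30], length=1000))
```

For sequencing applications the largest gap is often the most practical figure, since it determines whether reads primed from neighboring insertions will overlap.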
We thank Julia Koehler for the gift of pCaNAT. This work was supported by grants 1F32AI080094 (BMG), 2RO1AI046223 (BPC), and RO1GM076425 (NLC). NLC is an Investigator of the Howard Hughes Medical Institute.
1. Craig NL: Mobile DNA II. 2002, Washington, DC: ASM Press
2. Fransen M, Vastiau I, Brees C, Brys V, Mannaerts GP, Van Veldhoven PP: Analysis of human Pex19p's domain structure by pentapeptide scanning mutagenesis. J Mol Biol. 2005, 346: 1275-1286. 10.1016/j.jmb.2005.01.013.
3. Puttamreddy S, Cornick NA, Minion FC: Genome-wide transposon mutagenesis reveals a role for pO157 genes in biofilm development in Escherichia coli O157:H7 EDL933. Infect Immun. 2010, 78: 2377-2384. 10.1128/IAI.00156-10.
4. Xu T, Bharucha N, Kumar A: Genome-wide transposon mutagenesis in Saccharomyces cerevisiae and Candida albicans. Methods Mol Biol. 2011, 765: 207-224. 10.1007/978-1-61779-197-0_13.
5. Biery MC, Stewart FJ, Stellwagen AE, Raleigh EA, Craig NL: A simple in vitro Tn7-based transposition system with low target site selectivity for genome and gene analysis. Nucleic Acids Res. 2000, 28: 1067-1077. 10.1093/nar/28.5.1067.
6. Butterfield YS, Marra MA, Asano JK, Chan SY, Guin R, Krzywinski MI, Lee SS, MacDonald KW, Mathewson CA, Olson TE, Pandoh PK, Prabhu AL, Schnerch A, Skalska U, Smailus DE, Stott JM, Tsai MI, Yang GS, Zuyderduyn SD, Schein JE, Jones SJ: An efficient strategy for large-scale high-throughput transposon-mediated sequencing of cDNA clones. Nucleic Acids Res. 2002, 30: 2460-2468. 10.1093/nar/30.11.2460.
7. Haapa-Paananen S, Rita H, Savilahti H: DNA transposition of bacteriophage Mu. A quantitative analysis of target site selection in vitro. J Biol Chem. 2002, 277: 2843-2851. 10.1074/jbc.M108044200.
8. Manna D, Deng S, Breier AM, Higgins NP: Bacteriophage Mu targets the trinucleotide sequence CGG. J Bacteriol. 2005, 187: 3586-3588. 10.1128/JB.187.10.3586-3588.2005.
9. Goryshin IY, Miller JA, Kil YV, Lanzov VA, Reznikoff WS: Tn5/IS50 target recognition. Proc Natl Acad Sci USA. 1998, 95: 10716-10721. 10.1073/pnas.95.18.10716.
10. Shevchenko Y, Bouffard GG, Butterfield YS, Blakesley RW, Hartley JL, Young AC, Marra MA, Jones SJ, Touchman JW, Green ED: Systematic sequencing of cDNA clones using the transposon Tn5. Nucleic Acids Res. 2002, 30: 2469-2477. 10.1093/nar/30.11.2469.
11. Seringhaus M, Kumar A, Hartigan J, Snyder M, Gerstein M: Genomic analysis of insertion behavior and target specificity of mini-Tn7 and Tn3 transposons in Saccharomyces cerevisiae. Nucleic Acids Res. 2006, 34: e57. 10.1093/nar/gkl184.
12. Dujon B, Sherman D, Fischer G, Durrens P, Casaregola S, Lafontaine I, De Montigny J, Marck C, Neuveglise C, Talla E, Goffard N, Frangeul L, Aigle M, Anthouard V, Babour A, Barbe V, Barnay S, Blanchin S, Beckerich JM, Beyne E, Bleykasten C, Boisramé A, Boyer J, Cattolico L, Confanioleri F, De Daruvar A, Despons L, Fabre E, Fairhead C, Ferry-Dumazet H: Genome evolution in yeasts. Nature. 2004, 430: 35-44. 10.1038/nature02579.
13. Castano I, Kaur R, Pan S, Cregg R, Penas Ade L, Guo N, Biery MC, Craig NL, Cormack BP: Tn7-based genome-wide random insertional mutagenesis of Candida glabrata. Genome Res. 2003, 13: 905-915. 10.1101/gr.848203.
14. Cormack BP, Falkow S: Efficient homologous and illegitimate recombination in the opportunistic yeast pathogen Candida glabrata. Genetics. 1999, 151: 979-987.
15. Li W, Yang B, Liang S, Wang Y, Whiteley C, Cao Y, Wang X: BLogo: a tool for visualization of bias in biological sequences. Bioinformatics. 2008, 24: 2254-2255. 10.1093/bioinformatics/btn407.
16. Herold J, Kurtz S, Giegerich R: Efficient computation of absent words in genomic sequences. BMC Bioinformatics. 2008, 9: 167. 10.1186/1471-2105-9-167.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.