Carbapenemases on the move: it’s good to be on ICEs

Background The evolution and spread of antibiotic resistance is often mediated by mobile genetic elements. Integrative and conjugative elements (ICEs) are the most abundant conjugative elements among prokaryotes. However, the contribution of ICEs to horizontal gene transfer of antibiotic resistance has been largely unexplored.
Results Here we report that ICEs belonging to mating-pair formation (MPF) classes G and T are highly prevalent in the opportunistic pathogen Pseudomonas aeruginosa, contributing to the spread of carbapenemase-encoding genes (CEGs). Most CEGs of the MPF G class were encoded within class I integrons, which co-harbour genes conferring resistance to other antibiotics. The majority of the integrons were located within Tn3-like and composite transposons. A conserved attachment site could be predicted for the MPF G class ICEs. MPF T class ICEs carried the CEGs within composite transposons which were not associated with integrons.
Conclusions The data presented here provide a global snapshot of the different CEG-harbouring ICEs and shed light on the underappreciated contribution of these elements to the evolution and dissemination of antibiotic resistance in P. aeruginosa.
Electronic supplementary material The online version of this article (10.1186/s13100-018-0141-4) contains supplementary material, which is available to authorized users.


Background
Among the non-fermenting Gram-negative bacteria, the Pseudomonas genus is the one with the highest number of species [1,2]. Pseudomonas aeruginosa, an opportunistic human pathogen associated with an ever-widening array of life-threatening acute and chronic infections, is the most clinically relevant species within this genus [3][4][5]. P. aeruginosa is one of the CDC "ESKAPE" pathogens (Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, P. aeruginosa and Enterobacter species), a designation that emphasizes its impact on hospital infections and the ability of this microorganism to "escape" the activity of antibacterial drugs [6]. P. aeruginosa can develop resistance to a wide range of antibiotics due to a combination of intrinsic, adaptive, and acquired resistance mechanisms, such as the reduction of its outer membrane permeability, over-expression of constitutive or inducible efflux pumps, overproduction of AmpC cephalosporinase, and the acquisition of antibiotic resistance genes (ARGs) through horizontal gene transfer (HGT) [4,7,8]. P. aeruginosa has a non-clonal population structure, punctuated by specific sequence types (STs) that are globally disseminated and frequently linked to the dissemination of ARGs [4,9]. These STs have been designated as high-risk clones, of which major examples are ST111, ST175, ST235 and ST244.
Due to their high importance for human medicine, carbapenems are considered by the World Health Organization as Critically-Important Antimicrobials that should be reserved for the treatment of human infections caused by MDR Gram-negative bacteria [10], such as P. aeruginosa. Carbapenem-resistant P. aeruginosa is in the "critical" category of the World Health Organization's priority list of bacterial pathogens for which research and development of new antibiotics is urgently required [11]. Besides P. aeruginosa, carbapenem resistance has been reported in other Pseudomonas spp. and is often mediated by the acquisition of carbapenemase-encoding genes (CEGs) [12][13][14]. Carbapenemases are able to hydrolyse carbapenems and confer resistance to virtually all β-lactam antibiotics [15]. In the Pseudomonas genus, CEGs are mostly present on class I integrons within the chromosome [4]. Class I integrons are genetic elements that carry ARGs and an integrase gene, which controls integration and excision of genes [16][17][18]. Mobile genetic elements (MGEs), such as transposons, plasmids and integrative and conjugative elements (ICEs), are responsible for the spread of ARGs [19][20][21][22][23].
Usually, the genes acquired by HGT are integrated in common hotspots in the host's chromosome, comprising a cluster of genes designated as genomic islands (GIs) [19,24,25]. This broad definition also encompasses other MGEs, such as ICEs and prophages. ICEs are self-transmissible mosaic and modular MGEs that combine features of transposons and phages (ICEs can integrate into and excise from the chromosome), and plasmids (ICEs can also exist as circular extrachromosomal elements, replicate autonomously and be transferred by conjugation) [21,24,26–29]. Integrative and mobilizable elements encode their own integration and excision systems, but take advantage of the conjugation machinery of co-resident conjugative elements to be successfully transferred [30]. ICEs usually replicate as part of the host genome and are vertically inherited, remaining quiescent, and with most mobility genes repressed [31,32]. These elements also encode recombinases related to those in phages and other transposable elements. Conjugation involves three mandatory components: a relaxase, a type-IV secretion system (T4SS) and a type-IV coupling protein (T4CP) [33,34]. Four mating-pair formation (MPF) classes cover the T4SS among Proteobacteria: MPF T, MPF G, MPF F and MPF I [35]. The first is widely disseminated among conjugative plasmids and ICEs, while MPF F is more prevalent in plasmids of γ-Proteobacteria and MPF G is found essentially on ICEs. MPF I is rarely identified. Guglielmini et al. constructed a phylogenetic tree of VirB4, a highly conserved ATPase from the T4SS apparatus of different conjugative plasmids and ICEs, and formulated the hypothesis of interchangeable conjugation modules along their evolutionary history [36].
A close interplay between these elements in the ancient clades of the phylogenetic tree was observed, suggesting that plasmids may behave like ICEs and vice-versa, reinforcing the common assumption that the line separating ICEs and conjugative plasmids is blurring [27,37]. These authors also searched more than 1000 genomes and found that ICEs are present in most bacterial clades and are more prevalent than conjugative plasmids [36]. It was also observed that the larger the genome, the higher the likelihood to harbour a conjugative element at a given moment, which supports the common assumption that bacteria with large genomes are more prone to acquire genes by HGT [38][39][40].
Delimiting ICEs in genomic data remains particularly challenging [25]. Some signature features are frequently observed, such as a sporadic distribution, sequence composition bias, insertion next to or within a tRNA gene, bordering attachment (att) sites and over-representation of mobility genes of the T4SS. However, some ICEs present atypical features and may not be detected by these approaches [25,38]. In P. aeruginosa, most ICEs fall into three large families: ICEclc, pKLC102 and Tn4371. PAGI2(C), PAGI3(SG), PAGI-13, PAGI-15 and PAGI-16 were previously described as members of the ICEclc family, while PAPI-1, PAPI-2, PAGI-4 and PAGI-5 were linked to the pKLC102 family [19]. The ICE Tn4371 family also represents a large group of widely distributed ICEs with a common backbone, found for example in the P. aeruginosa UCBPP-PA14, PA7 and PACS171b strains [21]. These ICEs have been frequently implicated in virulence [41,42].
Previous reports characterized the complete nucleotide sequence of extra-chromosomal genetic elements housing different CEGs in pseudomonads [20,43–46]; however, the association of CEGs with chromosome-located MGEs has rarely been investigated [47][48][49]. Taking into consideration that i) in pseudomonads, CEGs are frequently located within the chromosome, ii) ICEs are the most abundant conjugative elements in prokaryotes and iii) ICEs are more frequently identified in large bacterial genomes, such as those of pseudomonads, we hypothesize that ICEs may play a key role in the horizontal spread of CEGs. To investigate this hypothesis, we developed an in silico approach to explore the association between ICEs and CEGs in pseudomonads.

Results
A plethora of carbapenemase-encoding genes was identified in a subset of Pseudomonas species
From the total Pseudomonas genomes analysed (n = 4565), 313 CEGs were identified in 297 genomes (Fig. 1 and Additional file 1: Table S1). As expected, bla VIM-2 represents the majority of the CEGs found among Pseudomonas spp., being detected mainly in P. aeruginosa, followed by P. plecoglossicida, P. guariconensis, P. putida, P. stutzeri and 16 genomes corresponding to unidentified species (Additional file 1: Table S1). Curiously, some strains presented two CEGs, either as a duplication of the same gene, such as bla IMP-34 in the NCGM 1900 and NCGM 1984 Japanese isolates, or as different CEGs, such as bla IMP-1 and bla DIM-1 in isolates 97, 130 and 142 recovered in Ghana (Additional file 1: Table S1, highlighted in red). A wide variety of STs was also observed, including the high-risk clones ST111, ST175 and ST244.

Detection of ICEs encoding carbapenemases in Pseudomonas spp.
Of the CEG hits, 65.5% (205/313, Additional file 1: Table S1) are located within small contigs, with a sequence smaller than 20 kb in length. The presence of repeated regions, such as those encoding transposases, tends to split the genome assembly when second-generation sequencing approaches are used. Based on information retrieved from NCBI (accessed on the 24th of May, 2018), the total number of bacterial genomes sequenced at the chromosome/complete genome level is 12,077, while the number of genomes sequenced at the scaffold/contig level is much larger (127,231). Despite this sequencing limitation, we were still able to identify 49 ICEs associated with CEGs (n = 20 with complete sequence) among all pseudomonad genomes (Table 1, Additional file 1: Table S1 and Fig. 1). When we attributed an ICE location to a CEG located on a small contig, this assumption was based on previously published data, as indicated in Table 1. Besides the aforementioned ICEs, we also identified a putative MGE within the Pseudomonas sp. NBRC 111143 strain (Additional file 1: Table S1). The T4CP-encoding gene was absent from this bla IMP-10-carrying element, which could be due to contig fragmentation or gene absence. If the gene is actually missing, this element could still be mobilized by the conjugation machinery of an ICE or conjugative plasmid(s) present in the host, and should be classified as an integrative and mobilizable element.
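The distinction drawn above between an ICE and an integrative and mobilizable element can be expressed as a simple presence/absence rule. The following Python sketch is illustrative only and is not the pipeline used in this study: the record type `ElementAnnotation`, its field names and the returned labels are hypothetical, and the rule for calling a putative integrative and mobilizable element (integrase and relaxase present, conjugation machinery incomplete) is an assumption made for the example. The conjugative criteria themselves (a relaxase, a VirB4/TraU ATPase, a T4CP and MPF type-specific genes) follow Cury et al. [38].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementAnnotation:
    """Presence/absence calls for one integrated element (hypothetical record)."""
    has_integrase: bool
    has_relaxase: bool
    has_virb4: bool
    has_t4cp: bool
    has_mpf_genes: bool
    contig_fragmented: bool = False  # element sits on a truncated contig


def classify(element: ElementAnnotation) -> str:
    """Label an element as ICE, putative IME, undetermined or unclassified."""
    # Conjugative criteria after Cury et al.: relaxase, VirB4/TraU, T4CP, MPF genes.
    conjugative = (element.has_relaxase and element.has_virb4
                   and element.has_t4cp and element.has_mpf_genes)
    if element.has_integrase and conjugative:
        return "ICE"
    # A missing gene on a fragmented contig may simply be unassembled.
    if element.contig_fragmented:
        return "undetermined"
    # Assumption for this sketch: integrase plus relaxase without a full
    # conjugation machinery is treated as a putative mobilizable element.
    if element.has_integrase and element.has_relaxase:
        return "putative IME"
    return "unclassified"


complete = ElementAnnotation(True, True, True, True, True)
no_t4cp = ElementAnnotation(True, True, True, False, True)
no_t4cp_fragmented = ElementAnnotation(True, True, True, False, True,
                                       contig_fragmented=True)
```

Under these assumptions, an element lacking only the T4CP gene is reported as "undetermined" when its contig is fragmented and as "putative IME" otherwise, which mirrors the two explanations considered for the bla IMP-10-carrying element.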
Fig. 1 Whole-genome phylogeny of the CEG-carrying P. aeruginosa isolates. The maximum-likelihood phylogenetic tree was constructed using 146,106 single nucleotide polymorphisms (SNPs) spanning the whole genome and using the P. aeruginosa PAO1 genome (highlighted by a green triangle) as a reference. Multilocus sequence typing (MLST), continent and host data are reported on the outer-most, middle and innermost circles, respectively. The strains belonging to a double ST profile (ST235/ST2613) are shaded yellow. Blue stars point out P. aeruginosa strains for which a CEG-harbouring ICE was predicted. The P. aeruginosa AR_0356 genome (accession number CP027169.1) was removed from the tree since it corresponds to a strain of which host and origin are unknown. The phylogenetic distance from the tree root to this genome is 1 (calculated with the tree scale). The Newick format file for the original tree is included in the Additional file 3
Table 1 footnotes: 1 NA is shown when the country information was not provided by sequence authors; 2 NA is shown when the isolation source was not provided by sequence authors; 3 NA is shown when no output was obtained by the platform or the conjugative module system was incomplete due to contig fragmentation; 4,5 NA is shown when the ICE sequence was incomplete due to contig fragmentation or delimitation of the entire element was not successful; 6 Representation of total ARG associated with the same CEG; a given strain harbouring the referred CEG may not present all ARG here reported; Δ represents incomplete genes
The ICEs identified here were all integrated within P. aeruginosa genomes (with the exception of the one element identified in Pseudomonas sp. PONIH3 genome) and AT-rich when compared to their host's chromosome; the mean GC value for this species is 66.2% according to EZBioCloud (https://www.ezbiocloud.net/taxon?tn=Pseudomonas%20aeruginosa) (Table 1). All ICEs identified here possessed only one tyrosine integrase (Fig. 2).
ICEs belonging to the ICEclc family (MPF G class) carried an integrase belonging to the bacteriophage P4-like family, while ICEs belonging to the ICE Tn4371 family (MPF T class) carried an integrase belonging to the shufflon-specific DNA recombinase Rci and bacteriophage Hp1-like family (Table 1). Rci and Hp1-like integrases were only distantly related (13% amino acid identity) to P4-like integrases. Orthologous assignment of these integrases revealed that the former and the latter integrases identified were present in more than 100 and 400 proteobacteria species, respectively. While P4-like integrases were more prevalent in γ-proteobacteria, half of the strains carrying Rci and Hp1-like integrases belong to the α-proteobacteria.
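The compositional comparison reported above (ICEs being AT-rich relative to the 66.2% mean GC value of P. aeruginosa) can be reproduced with a few lines of code. The Python sketch below is a minimal illustration rather than the procedure used in this study; the function names and the toy sequence are hypothetical, and ambiguous bases are simply excluded from the denominator, which is an assumption of the example.

```python
HOST_MEAN_GC = 66.2  # mean GC (%) reported for P. aeruginosa in the text


def gc_content(sequence: str) -> float:
    """Return the GC percentage over unambiguous bases (A, C, G, T) only."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


def is_at_rich(element_seq: str, host_gc: float = HOST_MEAN_GC) -> bool:
    """True when the element's GC percentage is below the host mean."""
    return gc_content(element_seq) < host_gc


# Hypothetical toy element: 4 G/C out of every 10 bases, i.e. 40% GC.
toy_element = "ATGCGCATTA" * 10
toy_gc = gc_content(toy_element)
```

In practice the element sequence would be the delimited ICE and the host value would preferably be computed from the same genome rather than taken from a species-level mean.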
We observed that MPF G class ICEs tend to integrate into a single copy of tRNA Gly or a cluster of two tRNA Glu and one tRNA Gly genes, which is in agreement with previous findings [25,38]. A conserved 8-bp att site (5′-CCGCTCCA) flanked all complete ICEs of the MPF G class identified here (Table 1). Notably, most ICEs of this class were adjacent to phages (either at the 5′- or the 3′-end) targeting the same att site as the neighbouring ICE. No att site could be identified for the integration of MPF T class ICEs. A gene encoding a catechol 1,2-dioxygenase and a gene encoding a protein with no described conserved domain were found flanking the bla SPM-1-harbouring ICEs. Regarding the elements carrying bla NDM-1, a gene encoding a different protein, also with no conserved domain identified, and a gene encoding the type III secretion system adenylate cyclase effector ExoY were separated upon insertion of these ICEs. Integration next to genes encoding hypothetical proteins or tRNA genes was commonly observed.
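Whether a delimited element is bordered by the conserved 8-bp att site can be checked by scanning for the motif near both predicted boundaries. The Python sketch below is a minimal illustration under stated assumptions, not the method used in this study: the function names, the coordinate convention (0-based, end-exclusive), the search window and the toy sequence are all hypothetical, and only exact forward-strand matches are considered.

```python
ATT_SITE = "CCGCTCCA"  # conserved 8-bp att site reported for MPF G class ICEs


def find_motif(sequence: str, motif: str) -> list:
    """Return the 0-based start of every exact (possibly overlapping) match."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions


def flanking_repeats(sequence, motif, ice_start, ice_end, window=50):
    """List motif starts within `window` bp of each element boundary.

    ice_start and ice_end are 0-based, end-exclusive coordinates of the
    predicted element (an assumption of this sketch).
    """
    hits = find_motif(sequence, motif)
    left = [p for p in hits if abs(p - ice_start) <= window]
    right = [p for p in hits if abs(p - ice_end) <= window]
    return left, right


# Hypothetical toy chromosome: flank, att, 100-bp "element", att, flank.
toy = "A" * 20 + ATT_SITE + "G" * 100 + ATT_SITE + "T" * 20
left, right = flanking_repeats(toy, ATT_SITE, ice_start=28, ice_end=128, window=10)
```

A copy of the motif near each boundary is consistent with site-specific integration; a real analysis would also need to search the reverse strand and tolerate mismatches in the repeat.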

Carbapenemases are frequently encoded within transposons
CEGs were associated with class I integrons frequently co-harbouring aminoglycoside resistance genes when associated with MPF class G ICEs (Table 1). Class I integrons were often associated with a wide array of transposons, such as the Tn3 superfamily transposons and the IS6100 composite elements (Table 1). MPF T class ICEs were targeted by more complex elements, such as the composite transposons carrying bla SPM-1 and bla NDM-1 (Table 1).
The bla NDM-1 gene was identified in an isolate from Singapore in ICE Tn4371 6385 and associated with ST308, as recently reported [50]. The bla NDM-1 was flanked by two ISCR24-like transposases. bla SPM-1 was linked to ICE Tn4371 6061, a recently described ICE [51]. Again, the CEG was located within an ISCR4-like composite transposon. ISCR elements are atypical elements of the IS91 family which represent a well-recognized system of gene capture and mobilization by a rolling-circle transposition process [21,52].
Besides the previously described bla NDM-1- and bla SPM-1-harbouring ICEs, we characterize here new ICEs of the MPF G and MPF T classes (Table 1 and Fig. 3). The bla DIM-1-harbouring ICE from the IOMTU 133 strain was integrated into the 3′-end of a tRNA Gly gene (IOMTU133_RS11660) and next to a gene encoding the R body protein RebB (IOMTU133_RS12085). bla DIM-1 was first described as part of a two-gene cassette array (found together with aadB, encoding resistance to aminoglycosides) located within a class I integron associated with a 70-kb Pseudomonas stutzeri plasmid recovered in the Netherlands [13]. However, the integron carrying bla DIM-1 in strain IOMTU 133 was unrelated to the one from the P. stutzeri plasmid, harbouring bla DIM-1 as a single gene cassette plus genes encoding aminoglycoside (aacA4-C329 and rmtf), trimethoprim (dfrB5) and chloramphenicol (catB12) resistance (Fig. 3a). Direct repeats (DRs) were found flanking the entire IS6100 composite transposon (5′-TTCGAGTC), indicating the transposition of this element into the ICE. Besides being identified as a composite transposon, IS6100 was frequently observed as a single copy at the 3′-end of the class I integron (Fig. 3b and c), suggesting that these elements were derived from the In4 lineage [53]. The bla IMP-1-carrying NCGM257 strain identified in Japan belonged to a different ST (ST357) than the frequently identified ST235 associated with the spread of this CEG in this country [54]. The CEG was also shown to be associated with a novel class I integron, co-harbouring aadB, cmlA9 and tet(G) genes encoding resistance to aminoglycosides, chloramphenicol and tetracyclines, respectively (Fig. 3b). This integron was inserted (DRs 5′-GAGTC) within a mercury resistance transposon. This genetic organization was frequently recovered among other ICE-harbouring strains, such as the ones associated with bla GES-5, bla IMP-13 and bla IMP-14 (Table 1).
The entire ICE was integrated in the chromosome of the NCGM257 strain into the 3′-end of a tRNA Gly gene (PA257_RS24790) and next to a Pseudomonas phage Pf1-like element. The new ICE identified in the P1_London_28_IMP_1_04_05 strain presented bla IMP-1 in a different In4-like integron than that observed for the NCGM257 strain, even though both elements were associated with a Tn3-like transposon (Fig. 3c). Unlike most ICEs of the MPF G class, its integration occurred between a gene encoding a LysR family transcriptional regulator (AFJ02_RS19410) and a gene encoding a hypothetical protein (AFJ02_RS19770). Regarding the bla KPC-2-harbouring Pseudomonas sp. PONIH3 strain, an average nucleotide identity based on BLAST (ANIb) analysis revealed that this strain belongs to the Pseudomonas soli species, since the ANIb value was above the 95% cut-off for species delineation [55]. The PONIH3 strain carried a double copy of bla KPC-2 within an ICE from the MPF T class (Fig. 3d). This ICE was integrated between a gene encoding a biopolymer transport protein ExbD/TolR (C3F42_RS18665) and a gene encoding an alpha/beta hydrolase (C3F42_RS18995).
Fig. 2 Blastn comparison among multiple ICEs described in this study. A gradient of blue and red colours is observed for normal and inverted BLAST matches, respectively. Model elements (ICEclc for the MPF G and Tn4371 for the MPF T classes, respectively) were also included for comparison. The arrows and arrowheads point the orientation of the translated coding sequences. In purple are highlighted the integrases, in yellow the mandatory features of a conjugative system according to Cury et al. [38] and in green the transposons harbouring the CEG

An atypical GI encoding carbapenemases
Besides ICEs, we also identified an atypical 19.8-kb long GI harbouring bla VIM-2 in P. aeruginosa AZPAE13853 and AZPAE13858 strains from India (Additional file 2: Fig. S1). A similar element was also observed in P. aeruginosa BTP038 strain from the USA, with the exception that the Tn402-like transposon harbouring bla VIM-2 was oriented in an inverted position. Five base-pair DRs (5'-CTCTG in AZPAE13853 and AZPAE13858 and 5'-CTGAG in BTP038 strains) were found flanking this transposon structure. Importantly, in these strains the GIs were flanked by identical signal recognition particle RNAs (srpRNAs), indicating a strong site preference for these elements.

Discussion
Our results show that bla VIM and bla IMP are widely disseminated, both geographically and phylogenetically (across Pseudomonas spp.). Moreover, and as previously described, bla VIM-2 was the most frequently reported CEG (Fig. 1 and Additional file 1: Table S1) [4]. On the other hand, bla SPM-1 is still restricted to P. aeruginosa and Brazil (or patients who had been previously hospitalized in Brazil) [56]. Curiously, some strains (highlighted in Fig. 1) belong to a double ST profile (ST235/ST2613), since they carry two copies, with different allele sequences, of the house-keeping gene acsA, which encodes an acetyl-coenzyme A synthetase. These copies only display 80.3% nucleotide identity. We plan to conduct comparative genomic studies to explore the idiosyncrasies of these double ST profile strains.
Not all CEGs are likely to be geographically and phylogenetically disseminated, but those that are more promiscuous present a serious threat. The geographical distribution of the high-risk clones and the diversity of CEGs suggest that the spread of these STs is global and the acquisition of the resistance genes is mainly local [4,57]. Previous studies suggest that environmental species may have a role as an important reservoir for the dissemination of clinically relevant carbapenemases, which are vertically amplified upon transfer to P. aeruginosa high-risk clones [12,14]. The prevalence of these elements among high-risk clones may be partially explained by the genetic capitalism theory, given that a widely disseminated ST should have a greater probability of acquiring new CEGs and of being further selected and amplified due to the high antibiotic pressure in the hospital environment [58]. Other theories propose that the high-risk clones have a naturally increased ability to acquire foreign DNA, since these STs appear to have lost the CRISPR (clustered regularly interspaced short palindromic repeats)-Cas (CRISPR associated proteins) system, which acts as an adaptive immune system in prokaryotic cells and protects them from invasion by bacteriophages and plasmids [59][60][61].
This study underestimates the extent of host range because only ICEs in sequenced genomes were detected. Also, identification of new ICEs could only be achieved in complete genomes or contigs with a sequence length large enough to include the full (or near complete) sequence of the ICE. Therefore, it is important to highlight the need to perform third-generation sequencing on CEG-harbouring genomes to avoid fragmentation of the genetic environment surrounding the gene and to provide a wider view of complete ICEs and other MGEs. All ICEs identified here fulfilled the criteria to be considered conjugative as proposed by Cury et al.: a relaxase, a VirB4/TraU, a T4CP and a minimum set of MPF type-specific genes [38]. ICEs tend to integrate within the host's chromosome by the action of a tyrosine recombinase, even though some elements may use serine or DDE recombinases instead [27]. Though rare, some elements encode more than one integrase, most likely resulting from independent integration of different MGEs [38]. Conserved sites are hotspots for ICE integration because their high conservation among closely related bacteria allows these elements to expand their host range and be stably maintained after conjugative transfer [62,63]. ICEs were often integrated next to phages highly similar to the Pseudomonas phage Pf1 (NC_001331.1), a class II filamentous bacteriophage belonging to the Inoviridae family [61]. Pf1-like phages are widely disseminated among P. aeruginosa strains and may have a role in bacterial evolution and virulence [64][65][66]. Interestingly, no representative of the pKLC102 family was linked to the dissemination of CEGs. This may be due to a higher affinity of the transposons carrying the CEGs for hotspots located within representatives of the other two families.
MGEs specifically targeting conserved regions of the genome such as tRNAs are common, and this specificity represents an evolutionary strategy whereby the target site of an element is almost guaranteed to be present, due to its essentiality, and very unlikely to change due to biochemical constraints of the gene product. We think a similar situation exists for the atypical GI elements identified here, which were found between the small srpRNAs, and this is in contrast to the more permissive nature of target site selection shown, for example, by elements of the Tn916/Tn1545 family [67].

Conclusions
Here, we revealed that different Tn3-like and composite transposons harbouring a wide array of CEGs were transposed into MPF G and T ICE classes, which were most likely responsible for the dissemination of these genes through HGT and/or clonal expansion of successful Pseudomonas clones. This study sheds light on the underappreciated contribution of ICEs for the spread of CEGs among pseudomonads (and potentially further afield). With the ever-growing number of third-generation sequenced genomes and the development of more sophisticated bioinformatics, the real contribution of these ICEs will likely rapidly emerge.
Recently, it was shown that interfering with the transposase-DNA complex architecture of the conjugative transposon (also known as ICE) Tn1549 inhibits its transposition to a new host [68]. In the future, it would be interesting to determine if the same mechanism is observed for tyrosine recombinases present in ICEclc and Tn4371 derivatives, as well as in other MPF ICE classes, as a potential approach to interfere with the spread of antimicrobial resistance.

Genome collection and blast search
A total of 4565 Pseudomonas genomes was downloaded from NCBI (accessed on the 24th of April, 2018). These genomes were blasted against the local carbapenemase database using the following command: 'diamond blastx -d DB.dmnd -o hits.txt --id 100 --subject-cover 100 -f 6 --sensitive'.
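The tabular output requested with `-f 6` can be post-processed to locate each carbapenemase hit on its contig. The Python sketch below is a minimal illustration and not the script used in this study: it assumes the default 12-column BLAST tabular layout (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore), assumes that reverse-frame blastx hits are reported with qstart greater than qend, and uses hypothetical contig names and hit lines.

```python
import csv
import io
from collections import defaultdict

# Assumed default column order of the BLAST tabular format (-f 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def hits_per_contig(handle):
    """Group hits by query contig as (subject, start, end, strand) tuples."""
    hits = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        rec = dict(zip(COLUMNS, row))
        qstart, qend = int(rec["qstart"]), int(rec["qend"])
        # Assumption: a reverse-frame hit has qstart > qend.
        strand = "+" if qstart <= qend else "-"
        hits[rec["qseqid"]].append(
            (rec["sseqid"], min(qstart, qend), max(qstart, qend), strand))
    return dict(hits)


# Hypothetical hit lines standing in for the contents of hits.txt.
example = (
    "contig_1\tVIM-2\t100.0\t266\t0\t0\t1200\t1997\t1\t266\t1e-180\t540\n"
    "contig_7\tIMP-1\t100.0\t246\t0\t0\t5738\t5001\t1\t246\t1e-170\t500\n"
)
result = hits_per_contig(io.StringIO(example))
```

With a real run, `hits_per_contig(open("hits.txt"))` would return one entry per contig carrying a hit, and the coordinates could then be compared with contig lengths or with the boundaries of predicted ICEs.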

Phylogenomics
All CEG-harbouring P. aeruginosa genomes were mapped against the P. aeruginosa PAO1 reference strain (accession number NC_002516.2), to infer a phylogeny based on the concatenated alignment of high quality SNPs using CSI Phylogeny and standard settings [78]. The phylogenetic tree was plotted using the iTOL platform (https://itol.embl.de/).

MLST and taxonomic assignment of unidentified species
To predict the ST of the strains harbouring ICEs, the P. aeruginosa MLST website (https://pubmlst.org/paeruginosa/) developed by Keith Jolley and hosted at the University of Oxford was used [79]. Taxonomic assignment of unidentified species carrying ICEs was achieved by JSpeciesWS v. 3.0.17 (http://jspecies.ribohost.com/jspeciesws/) [80].

Additional files
Additional file 1: Table S1. General features of the hits. Hits associated with ICEs are highlighted in blue. Strains for which more than one CEG was identified are represented in red. (DOCX 52 kb) Additional file 2: Figure S1. Genetic environment of a novel genomic island (GI) harboring bla VIM-2 in P. aeruginosa strain AZPAE13853. Gene cassettes are shown by pale blue boxes, the conserved sequence (