Examples of Onco-exaptation. Gene models of known TE-derived promoters that express downstream oncogenes and are listed in Table 2. The legend is shown at the top. a 6 kb upstream of CSF1R, a THE1B LTR initiates transcription and contains a splice donor site that joins to an exon within a LINE L1MB5 element and then to the first exon of CSF1R. The TE-initiated transcript has a different, longer 5’ UTR than the canonical transcript but the same full-length protein-coding sequence. b An LOR1a LTR initiates transcription and splices into the canonical second exon of IRF5, which contains the standard translational initiation site (TIS), to produce a full-length protein. A novel, non-TE-derived second exon is also incorporated into a minor isoform of LOR1a-IRF5. c Within the canonical intron 2 of the proto-oncogene MET, a full-length LINE L1PA2 element initiates transcription (anti-sense to itself), splicing through a short exon in a SINE MIR element and into the third exon of MET. The first TIS of the canonical MET transcript is 14 bp into exon 2, although an alternative TIS exists in exon 3, which is believed to also be used by the L1-promoted isoform. d An LTR16B2 element in intron 19 of the ALK gene initiates transcription and transcribes into the canonical exon 20 of ALK. An in-frame TIS within exon 20 results in translation of a shortened oncogenic protein that contains only the intracellular tyrosine kinase domain and lacks the transmembrane and extracellular receptor domains of ALK.
e There are two TE-promoted isoforms of ERBB4: the minor variant initiates in an MLT1C LTR in the 12th intron, and the major variant initiates in an MLT1H LTR in the 20th intron. Both isoforms produce a truncated protein, although the exact translation start sites are not defined. f In the third exon of SLCO1B3, two adjacent partly full-length HERV elements combine to create a novel first exon. Transcription initiates in the anti-sense orientation from an LTR7 and proceeds to a sense-oriented splice donor in an adjacent MER4C LTR, which then splices into the fourth exon of SLCO1B3, creating a smaller protein. g An LTR2 element initiates anti-sense transcription (relative to its own orientation) and splices into the native second exon of FABP7. The LTR-derived isoform has a non-TE TIS and splice donor, which results in a different N-terminal protein sequence for FABP7.