Mammalian-wide interspersed repeat (MIR)-derived enhancers and the regulation of human gene expression
© Jjingo et al.; licensee BioMed Central Ltd. 2014
Received: 30 August 2013
Accepted: 10 April 2014
Published: 5 May 2014
Mammalian-wide interspersed repeats (MIRs) are the most ancient family of transposable elements (TEs) in the human genome. The deep conservation of MIRs initially suggested the possibility that they had been exapted to play functional roles for their host genomes. MIRs also happen to be the only TEs whose presence in-and-around human genes is positively correlated to tissue-specific gene expression. Similar associations of enhancer prevalence within genes and tissue-specific expression, along with MIRs’ previous implication as providing regulatory sequences, suggested a possible link between MIRs and enhancers.
To test the possibility that MIRs contribute functional enhancers to the human genome, we evaluated the relationship between MIRs and human tissue-specific enhancers in terms of genomic location, chromatin environment, regulatory function, and mechanistic attributes. This analysis revealed MIRs to be highly concentrated in enhancers of the K562 and HeLa human cell-types. Significantly more enhancers were found to be linked to MIRs than would be expected by chance, and putative MIR-derived enhancers are characterized by a chromatin environment highly similar to that of canonical enhancers. MIR-derived enhancers show strong associations with gene expression levels, tissue-specific gene expression and tissue-specific cellular functions, including a number of biological processes related to erythropoiesis. MIR-derived enhancers were found to be a rich source of transcription factor binding sites, underscoring one possible mechanistic route for the element sequences co-option as enhancers. There is also tentative evidence to suggest that MIR-enhancer function is related to the transcriptional activity of non-coding RNAs.
Taken together, these data reveal enhancers to be an important cis- regulatory platform from which MIRs can exercise a regulatory function in the human genome and help to resolve a long-standing conundrum as to the reason for MIRs’ deep evolutionary conservation.
Transposable elements (TEs) are abundant in eukaryotic genomes, particularly mammalian genomes. Indeed, at least 45% of the human genome is made up of TE-derived sequences[1, 2], which are non-randomly distributed across the genome. For example, human Alu short interspersed elements (SINEs) are predominantly found in GC- and gene-rich regions, whereas L1 long interspersed elements (LINEs) are most prevalent in low-GC and gene-poor regions[1, 3]. Transposable elements have also been shown to affect the expression of host genes via the provisioning of a variety of regulatory sequences. The non-random genomic distribution of human TEs, considered together with their regulatory potential, initially suggested the possibility that the TEenvironment of human genes might affect the way that they are expressed.
In fact, a number of associations between the TE environment in-and-around human genes and their expression levels and functional patterns have subsequently been observed. Weakly expressed genes generally contain low SINE and high LINE densities, while the most highly expressed human genes are enriched for SINEs (Alu) and depleted in L1 elements. Additionally, Alu elements are significantly associated with the breadth of gene expression across tissues[7, 8]. Thus, highly and broadly expressed housekeeping genes are identifiable by their TE-content, which is rich in Alus and poor in L1s. Functionally, TEs have recently been demonstrated to have been exapted during the evolution of novel phenotypic characteristics, such as mammalian pregnancy[10, 11]. Mammalian-wide interspersed repeats (MIRs) are the only TEs that show a positive association between their prevalence in-and-around genes and tissue-specific gene expression[8, 12].
MIR elements are an ancient family of tRNA-derived SINEs[13, 14], whose anomalous sequence-conservation levels among mammalian genomes were initially taken as evidence that they encode some unknown regulatory function. Succeeding studies demonstrated that, in a number of individual cases, MIRs do in fact donate transcription-factor binding sites[16–20], enhancers[18, 21, 22], microRNAs[23, 24] and cis natural antisense transcripts to the human genome. The association of MIRs with tissue-specific expression, along with their propensity to be exapted as regulatory sequences, suggests to us the possibility that they might provide numerous tissue-specific regulatory sequences across the human genome.
Enhancers are regulatory elements that are most highly associated with tissue-specific expression[26, 27]. They are also characterized by a unique chromatin environment made up of a specific combination of histone modifications[26–29]. Consistent with their role as tissue-specific regulatory elements, the enhancer chromatin environment is highly variable across cell-types, compared to other classes of regulatory sequences[26, 27, 29]. We hypothesized that the global coincident association of both MIRs and enhancers to tissue-specific gene expression is at least in part a consequence of MIR sequences frequently acting either as enhancers and/or constituting fragments of enhancer sequences. This would be consistent with previously reported individual cases of TE-derived enhancers[21, 30–32]. We also reasoned that the enhancer-characteristic chromatin environment could serve as a useful proxy to identify putative MIR-derived enhancers.
To test our hypothesis, we performed a genome-wide assessment of the relative prevalence of MIRs within enhancer sequences and explored the potential mechanistic bases and functional consequences of this relationship. We found that not only are MIRs highly concentrated in predicted enhancers, but they also constitute a significant part of the core of genic enhancers; this analysis identified many more putative MIR-derived enhancers than previously reported[22, 33]. These MIR-derived enhancers have cell-type specific chromatin profiles that are highly similar to those seen for canonical enhancers. Furthermore, we report MIRs to be major donors of transcription-factor binding sites (TFBSs) within enhancers, and show that MIR-derived enhancers affect both the level and tissuespecificity of gene expression. Using the erythroid K562 cell-line as an example, we show that MIR-enhancers are involved in the modulation of several developmentally-specific biological processes related to erythropoiesis.
Results and discussion
MIRs are highly concentrated in enhancers
As noted in the introduction, MIRs are the only TEs that show a positive association with tissue-specific gene expression. Similarly, unlike other cis- regulatory elements, enhancers are marked with highly cell-type specific histone modification patterns and are accordingly also highly related to tissue-specific gene expression[26, 27]. We thus sought to test our working hypothesis that these apparent similarities between MIRs and enhancers are largely a consequence of MIR sequences either frequently acting as enhancers and/or constituting fragments of enhancer sequences.
Thus, while MIRs have been previously reported to be enriched within introns, our data clearly reveal this genic enrichment to be strongly biased towards enhancers. And while MIRs are known to donate enhancers in a number of individual cases[21, 22, 33], these data show an even deeper relationship, namely that MIRs are substantially concentrated in enhancers genome-wide.
Numerous MIRs provide enhancers or are linked to enhancers
Finding MIRs to be highly concentrated within enhancers, we sought to establish the numbers and locations of MIRs that provide core enhancer sequences themselves (MIR-enhancers) as well as those that lie within enhancer regions (enhancer-MIRs). We found 934 and 1,429 MIRs to be MIR-enhancers in K562 and HeLa cell-lines (genomic locations in Additional file2). This is in contrast to the 669 and 996 MIRs that would be expected to be enhancers in the two cell-lines if MIRs were randomly distributed among enhancers (χ2 = 105, P = 1.2 × 10-24, K562; χ2 = 188, P = 1.0 × 10-42, HeLa). Furthermore, the numbers of MIR-derived enhancers identified here is far greater than has been previously reported[22, 33], owing to the availability of more enhancer annotations and a more accurate estimate of the size of enhancers, which we deduced from the span of characteristic enhancer histone-modifications at enhancer sites. When this analysis was expanded to include all enhancer-linked MIRs (that is, enhancer-MIRs), the extent to which enhancers are connected to MIRs became even more apparent. We found 16,144 and 26,520 enhancers to be linked to MIRs in K562 and HeLa celllines respectively, compared with the 6,559 and 9,320 enhancer-MIRs that would be expected by chance alone (χ2 = 14,007, P < 1.0 × 10-308, K562; χ2 = 31,742, P < 1.0 × 10-308, HeLa).
MIRs are enriched for TFBSs
Enhancer-associated MIRs influence gene expression levels and tissue-specificity
Functional significance of enhancer-associated MIRs
To further understand the impact that enhancer-associated MIRs might have on K562 cell-type specific biological functions, we focused on the erythropoiesis biological function, whose gene set has the most significant overlap with enhancer-MIR associated genes (Figure 6A). This gene set contains genes that have been implicated in various aspects of erythrocyte function and development. We compared the expression levels of enhancer-MIR associated genes in this gene set with developmental stages of erythropoiesis and found them to have highly divergent expression levels across development, an indicator that enhancer-associated MIRs might be involved in their regulation during erythropoiesis (Figure 6B and Additional file1: Table S3).
Interestingly, this erythropoiesis gene set and its related enhancer-associated MIRs include four distinct MIR sequences that were previously implicated as being involved in the regulation of the α-globin gene cluster by virtue of their co-location with an experimentally characterized ‘locus control region’. This cluster contains a number of globin genes, including HBZ and HBA1, both of which are enhancer-MIR associated and differentially expressed across the various stages of erythropoiesis (Figure 6B,C). Furthermore, the enhancer-associated MIRs in the locus control region can be seen to be marked by the enhancer-related histone modifications H3K4me1 and H3K27ac, to recruit the TFs C-JUN, NF-E2 and ZNF274, and to reside in a relatively open chromatin environment, as characterized by DNase I hypersensitive sites (DHSs) (Figure 6C). In addition, one of the four locus control region co-located MIRs appears to recruit RNA polymerase II (Pol2) transcriptional machinery. This suggests the possibility that enhancer-associated MIRs might also exert their regulatory activity by virtue of the transcriptional activity of non-coding RNAs, as has been observed for a number of TE- and tRNA-derived regulatory sequences[52–54]. Consistent with this possibility, non-coding RNA transcriptional activity has been suggested as a widespread mechanism underlying enhancer function in mammalian genomes[55–58].
Considered together, these results suggest that K562-predicted enhancer-associated MIRs are active in the cell-type specific and developmental regulation of genes involved in a number of biological processes related to K562 functions in general, and erythropoiesis in particular. Furthermore, the exaptation of MIRs as enhancers might be predicated upon the recruitment of specific TFs, as previously discussed, as well as the transcriptional activity of non-coding RNAs.
A number of previous studies have found different classes and families of TEs and TE-derived sequences to have distinct and substantial effects on genome regulation. In one such study, our group identified MIRs to be the only TE-derived sequences whose presence shows a positive correlation to tissue-specific gene expression. Here, we provide evidence that this correlation is likely to be due to the propensity of MIRs to be exapted as enhancers that regulate cell-type and developmental-stage specific gene expression. We show that MIR-related enhancer activity is functionally relevant and may be related to TF binding, as well as the transcriptional activity of non-coding RNAs. The widespread exaptation of previously selfish MIR sequences as enhancers resolves a long-standing conundrum arising from the observation that MIR-derived sequences are far too evolutionarily conserved to be simply non-functional or ‘junk’ DNA.
Identification of MIR-related enhancers
We used two sets of 24,538 and 36,550 putative transcriptional enhancers, predicted in the K562 and HeLa cell-lines, respectively. These enhancers were predicted as ENCODE regions that bear specific chromatin histone modification (H3K4me1 and H3K27ac) profiles that are similar to those seen for genomic regions bound by the coactivator protein p300, which is known to colocalize at active enhancers. The p300 binding sites were themselves located using a chromatin immunoprecipitation-based microarray method (ChIP-chip) and the histone modification data were taken from the ENCODE project[36, 60]. We considered the span of enhancers to be the ±4 kb region around the predicted enhancer midpoints, which corresponds roughly to the empirically determined range of the characteristic chromatin pattern found at predicted enhancers (Figure 2A). We intersected the coordinates of these enhancers with the RepeatMasker annotations of MIR elements as identified by the Repbase classification system[61, 62] to predict MIR-related enhancers. These MIR annotations on the human genome assembly (NCBI build 36.1; UCSC hg18) were downloaded from the UCSC Genome Browser[34, 35]. Each putative enhancer analyzed here was originally predicted to be anchored around a single basepair position. If this core basepair was located in a MIR, then such a MIR was classified as a core enhancer. Such cases are considered MIR-enhancers for the purposes of analysis. On the other hand, some MIRs do not donate the core enhancer locus but are instead located within the typical ±4 kb span for enhancers. These MIRs were considered enhancer-linked MIRs and are referred to as enhancer-MIRs. There are thus two categories of MIR-related enhancers analyzed here: MIR-enhancers and enhancer-MIRs. However, at both the locational and functional levels, both categories of MIRs are part of the enhancer body, which we determined using the span of its chromatin profile.
A set of 19,536 non-overlapping transcriptional units derived from Refseq gene annotations, as defined in reference, was used to assess MIR densities within genes. For both the K562 and HeLa cell-lines, regions of overlap between MIR genomic coordinates and four different types of genomic elements or regions were determined: (i) genic enhancers, (ii) genic non-enhancer regions, (iii) non-genic enhancers, and (iv) the core 200 bp region around predicted enhancer midpoints. For each region, the density of MIRs was computed either as the fraction of the length of each region in basepairs that was occupied by MIRs or their fold enrichment within the regions relative to the local genomic background. The local genomic backgrounds were compiled as regions randomly sampled 100 kb downstream of each locus of interest. Their enrichment in terms of MIRs or other ChIP-seq datasets was then divided by the average densities of those datasets over the entire genome to obtain their normalized signal values.
Expected numbers of MIR-enhancers were computed as the average genome-wide density of enhancers (enhancers/bp) multiplied by the total length in basepairs of all genomic MIRs. Expected numbers of enhancers with MIRs were simulated by mapping random genomic sites equivalent to MIRs (in number and size) to enhancer regions and then counting the number of enhancers that were overlapped.
Histone modification profile analysis
Genome-wide ChIP-seq data for eight histone modifications (H3K4me1, H3K27ac, H3K36me3, H3K9ac, H3K4me2, H3K4me3, H4K20me1, and H3K27me3) in the K562 and HeLa-S3 cell-lines was taken from the ‘ENCODE histone modification tracks’ of the UCSC Genome Browser. Genomic loci of 20 kb centered on canonical enhancers (all predicted enhancers), MIR-enhancers, and enhancer-MIRs were evaluated for enrichment of the histone modifications. Counts of each histone modification within 500 bp windows across the 20 kb region were then computed and their profiles represented as fold enrichments relative to average counts per 500 bp in the genomic background. The congruence of histone modification profiles between canonical enhancers and MIR-related enhancers was assessed using rank correlations between modification enrichments, which were weighted by the slope of their line-of-best-fit to establish the relative order of histone mark enrichment congruence.
Transcription-factor binding site analysis
TFBS sequence motifs within MIR-related enhancers were identified using both PWMs and regular expression representations of sequence motifs corresponding to K562-related TFs as gleaned from the experimental literature and TF databases[45, 64–68]. Counts of TFBS sequence motifs in MIR-related enhancer sequences were compared with counts of the same motifs in random genomic sequences of the same number and size. For each transcription factor, we obtained the expected number by counting the number of motifs in 1,000 random samples, each being of the same size as the enhancer-MIR set (50,599 sequences). The medians of those distributions were then considered the expected number for each motif. P values were then empirically determined using a Z test on the distributions with μ the median and χ the observed number of motifs in enhancer-MIRs, which is the same number as our set of enhancer-associated MIRs.
Experimentally characterized TF binding sites within MIR-related enhancer sequences were identified using ChIP-seq data from the ‘ENCODE transcription-factor binding tracks’ of the UCSC Genome Browser. Enrichment values of TF occupancy levels for MIR-related enhancer sequences were computed using ChIP-seq tag counts within MIR-related sequences normalized by the genome average ChIP-seq tag counts for non-enhancer-MIRs. The presence of TFBS sequence motifs within ChIP-seq characterized TF bound regions was evaluated using regular expressions, as described, along with the motif finding program MEME.
Gene expression analysis
where N is the number of tissues and x i represents a gene’s signal intensity value in each tissue i divided by the maximum signal intensity value of the gene across all tissues. For both datasets, tissue-specificity was also calculated based on the entropy among gene expression levels between tissues using Shannon entropy in R’s Bioconductor package.
For each gene, the density of enhancer-associated MIRs in and around the gene (from 10 kb upstream to 10 kb downstream) was computed by dividing the number of enhancer-associated MIRs in that genomic range by the number of base pairs in the range. The density values of the enhancer-associated MIRs were then divided into 100 equal bins whose average densities were regressed against the respective average expression levels or TS of the associated genes.
Functional association analysis
The functional effects of enhancer-associated MIRs were evaluated using erythroid (K562)-specific enhancer-associated MIRs (defined as enhancer-associated MIRs present in K562 and absent in HeLa). Genes were associated with K562-specific enhancer-associated MIRs within 100 kb. The hypergeometric test was used to check for enrichment of enhancer-MIR associated genes within (i) a set of 350 genes previously shown to be developmentally regulated in erythroids across four stages of erythropoiesis, and (ii) gene sets for nine erythroid-related cellular functions (Additional file1: Table S1) taken from the Broad Institute’s molecular signatures database (MSigDB). Developmental regulation of enhancer-MIR associated erythropoiesis-related genes was assessed using gene expression data for five stages of erythrocyte development.
chromatin immunoprecipitation followed by high-throughput sequencing
DNase I hypersensitive site
Encyclopedia of DNA elements
long interspersed element
mammalian-wide interspersed repeat
molecular signatures database
position weight matrix
short interspersed element
transcription-factor binding site
University of California Santa Cruz.
This work was supported in part by the Fulbright foundation through a PhD scholarship to DJ, the School of Biology, Georgia Institute of Technology (to IKJ, DJ, ABC, and JW), an Alfred P Sloan Research Fellowship in Computational and Evolutionary Molecular Biology (BR-4839) to IKJ, and NIH R21 AG043921 to VVL. This research was supported in part by the Intramural Research Program of the NIH, NLM, NCBI. No funding bodies played any role in the design, collection, analysis, and interpretation of the data or the writing and submission of the manuscript. The authors would like to acknowledge members of the Jordan laboratory for helpful discussions. The authors are also grateful to the ENCODE consortium for making their data freely accessible and acknowledge the use of data generated by the ENCODE institutions and groups below: Broad Institute and Massachusetts General Hospital-Harvard Medical School/Bernstein Laboratory (histone modifications data). Stanford University/Michael Snyder laboratory, Yale University/Mark Gerstein and Sherman Weissman laboratories, University of Southern California/Peggy Farnham Laboratory and Harvard University/Kevin Struhl Laboratory (transcription-factor binding data) and University of Washington/Sandstrom laboratory (expression data).
- Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C, et al.: Initial sequencing and analysis of the human genome. Nature 2001,409(6822):860-921. 10.1038/35057062View ArticlePubMed
- Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans CA, Holt RA, Gocayne JD, Amanatides P, Ballew RM, Huson DH, Wortman JR, Zhang Q, Kodira CD, Zheng XH, Chen L, Skupski M, Subramanian G, Thomas PD, Zhang J, Gabor Miklos GL, Nelson C, Broder S, Clark AG, Nadeau J, McKusick VA, Zinder N, et al.: The sequence of the human genome. Science 2001,291(5507):1304-1351. 10.1126/science.1058040View ArticlePubMed
- RepeatMasker http://www.repeatmasker.org
- Feschotte C: Opinion - transposable elements and the evolution of regulatory networks. Nat Rev Genet 2008,9(5):397-405. 10.1038/nrg2337PubMed CentralView ArticlePubMed
- Versteeg R, van Schaik BDC, van Batenburg MF, Roos M, Monajemi R, Caron H, Bussemaker HJ, van Kampen AHC: The human transcriptome map reveals extremes in gene density, intron length, GC content, and repeat pattern for domains of highly and weakly expressed genes. Genome Res 2003,13(9):1998-2004. 10.1101/gr.1649303PubMed CentralView ArticlePubMed
- Han JS, Szak ST, Boeke JD: Transcriptional disruption by the L1 retrotransposon and implications for mammalian transcriptomes. Nature 2004,429(6989):268-274. 10.1038/nature02536View ArticlePubMed
- Kim TM, Jung YC, Rhyu MG: Alu and L1 retroelements are correlated with the tissue extent and peak rate of gene expression, respectively. J Korean Med Sci 2004,19(6):783-792. 10.3346/jkms.2004.19.6.783PubMed CentralView ArticlePubMed
- Jjingo D, Huda A, Gundapuneni M, Marino-Ramirez L, Jordan IK: Effect of the transposable element environment of human genes on gene length and expression. Genome Biol Evol 2011, 3: 259-271. 10.1093/gbe/evr015PubMed CentralView ArticlePubMed
- Eller CD, Regelson M, Merriman B, Nelson S, Horvath S, Marahrens Y: Repetitive sequence environment distinguishes housekeeping genes. Gene 2007,390(1–2):153-165.PubMed CentralView ArticlePubMed
- Emera D, Wagner GP: Transposable element recruitments in the mammalian placenta: impacts and mechanisms. Brief Funct Genomics 2012,11(4):267-276. 10.1093/bfgp/els013View ArticlePubMed
- Lynch VJ, Leclerc RD, May G, Wagner GP: Transposon-mediated rewiring of gene regulatory networks contributed to the evolution of pregnancy in mammals. Nat Genet 2011,43(11):1154-U1158. 10.1038/ng.917View ArticlePubMed
- Sironi M, Menozzi G, Comi GP, Cereda M, Cagliani R, Bresolin N, Pozzoli U: Gene function and expression level influence the insertion/fixation dynamics of distinct transposon families in mammalian introns. Genome Biol 2006,7(12):R120. 10.1186/gb-2006-7-12-r120PubMed CentralView ArticlePubMed
- Jurka J, Zietkiewicz E, Labuda D: Ubiquitous mammalian-wide interspersed repeats (MIRs) are molecular fossils from the mesozoic era. Nucleic Acids Res 1995,23(1):170-175. 10.1093/nar/23.1.170PubMed CentralView ArticlePubMed
- Smit AF, Riggs AD: MIRs are classic, tRNA-derived SINEs that amplified before the mammalian radiation. Nucleic Acids Res 1995,23(1):98-102. 10.1093/nar/23.1.98PubMed CentralView ArticlePubMed
- Silva JC, Shabalina SA, Harris DG, Spouge JL, Kondrashovi AS: Conserved fragments of transposable elements in intergenic regions: evidence for widespread recruitment of MIR- and L2-derived sequences within the mouse and human genomes. Genet Res 2003,82(1):1-18. 10.1017/S0016672303006268View ArticlePubMed
- Polavarapu N, Marino-Ramirez L, Landsman D, McDonald JF, Jordan IK: Evolutionary rates and patterns for human transcription factor binding sites derived from repetitive DNA. BMC Genomics 2008, 9: 226-236. 10.1186/1471-2164-9-226PubMed CentralView ArticlePubMed
- Wang J, Bowen NJ, Marino-Ramirez L, Jordan IK: A c-Myc regulatory subnetwork from human transposable element sequences. Molecular Biosyst 2009,5(12):1831-1839. 10.1039/b908494kView Article
- Thornburg BG, Gotea V, Makalowski W: Transposable elements as a significant source of transcription regulating signals. Gene 2006, 365: 104-110.View ArticlePubMed
- Polak P, Domany E: Alu elements contain many binding sites for transcription factors and may play a role in regulation of developmental processes. BMC Genomics 2006, 7: 133. 10.1186/1471-2164-7-133PubMed CentralView ArticlePubMed
- Cui F, Sirotin MV, Zhurkin VB: Impact of Alu repeats on the evolution of human p53 binding sites. Biol Direct 2011, 6: 2. 10.1186/1745-6150-6-2PubMed CentralView ArticlePubMed
- Marino-Ramirez L, Jordan IK: Transposable element derived DnaseI-hypersensitive sites in the human genome. Biol Direct 2006, 1: 20. 10.1186/1745-6150-1-20PubMed CentralView ArticlePubMed
- Teng L, Firpi HA, Tan K: Enhancers in embryonic stem cells are enriched for transposable elements and genetic variations associated with cancers. Nucleic Acids Res 2011,39(17):7371-7379. 10.1093/nar/gkr476PubMed CentralView ArticlePubMed
- Piriyapongsa J, Marino-Ramirez L, Jordan IK: Origin and evolution of human microRNAs from transposable elements. Genetics 2007,176(2):1323-1337.PubMed CentralView ArticlePubMed
- Kuang HH, Padmanabhan C, Li F, Kamei A, Bhaskar PB, Shu OY, Jiang JM, Buell CR, Baker B: Identification of miniature inverted-repeat transposable elements (MITEs) and biogenesis of their siRNAs in the Solanaceae: new functional implications for MITEs. Genome Res 2009,19(1):42-56.PubMed CentralView ArticlePubMed
- Conley AB, Miller WJ, Jordan IK: Human cis natural antisense transcripts initiated by transposable elements. Trends Genet 2008,24(2):53-56. 10.1016/j.tig.2007.11.008View ArticlePubMed
- Heintzman ND, Hon GC, Hawkins RD, Kheradpour P, Stark A, Harp LF, Ye Z, Lee LK, Stuart RK, Ching CW, Ching KA, Antosiewicz-Bourget JE, Liu H, Zhang X, Green RD, Lobanenkov VV, Stewart R, Thomson JA, Crawford GE, Kellis M, Ren B: Histone modifications at human enhancers reflect global cell-type-specific gene expression. Nature 2009,459(7243):108-112. 10.1038/nature07829PubMed CentralView ArticlePubMed
- Bulger M, Groudine M: Enhancers: the abundance and function of regulatory sequences beyond promoters. Dev Biol 2010,339(2):250-257. 10.1016/j.ydbio.2009.11.035PubMed CentralView ArticlePubMed
- Calo E, Wysocka J: Modification of enhancer chromatin: what, how, and why? Mol Cell 2013,49(5):825-837. 10.1016/j.molcel.2013.01.038View ArticlePubMed
- Creyghton MP, Cheng AW, Welstead GG, Kooistra T, Carey BW, Steine EJ, Hanna J, Lodato MA, Frampton GM, Sharp PA, Boyer LA, Young RA, Jaenisch R: Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proc Natl Acad Sci USA 2010,107(50):21931-21936. 10.1073/pnas.1016071107PubMed CentralView ArticlePubMed
- Ferretti V, Poitras C, Bergeron D, Coulombe B, Robert F, Blanchette M: PreMod: a database of genome-wide mammalian cis-regulatory module predictions. Nucleic Acids Res 2007, 35: D122-D126. 10.1093/nar/gkl879PubMed CentralView ArticlePubMed
- Lister R, Pelizzola M, Dowen RH, Hawkins RD, Hon G, Tonti-Filippini J, Nery JR, Lee L, Ye Z, Ngo QM, Edsall L, Antosiewicz-Bourget J, Stewart R, Ruotti V, Millar AH, Thomson JA, Ren B, Ecker JR: Human DNA methylomes at base resolution show widespread epigenomic differences. Nature 2009,462(7271):315-322. 10.1038/nature08514PubMed CentralView ArticlePubMed
- Mayshar Y, Ben-David U, Lavon N, Biancotti JC, Yakir B, Clark AT, Plath K, Lowry WE, Benvenisty N: Identification and classification of chromosomal aberrations in human induced pluripotent stem cells. Cell Stem Cell 2010,7(4):521-531. 10.1016/j.stem.2010.07.017View ArticlePubMed
- Huda A, Tyagi E, Marino-Ramirez L, Bowen NJ, Jjingo D, Jordan IK: Prediction of transposable element derived enhancers using chromatin modification profiles. PLoS One 2011,6(11):e27513. 10.1371/journal.pone.0027513PubMed CentralView ArticlePubMed
- Karolchik D, Hinrichs AS, Furey TS, Roskin KM, Sugnet CW, Haussler D, Kent WJ: The UCSC table browser data retrieval tool. Nucleic Acids Res 2004, 32: D493-D496. 10.1093/nar/gkh103PubMed CentralView ArticlePubMed
- Rhead B, Karolchik D, Kuhn RM, Hinrichs AS, Zweig AS, Fujita PA, Diekhans M, Smith KE, Rosenbloom KR, Raney BJ, Pohl A, Pheasant M, Meyer LR, Learned K, Hsu F, Hillman-Jackson J, Harte RA, Giardine B, Dreszer TR, Clawson H, Barber GP, Haussler D, Kent WJ: The UCSC Genome Browser database: update 2010. Nucleic Acids Res 2010, 38: D613-D619. 10.1093/nar/gkp939PubMed CentralView ArticlePubMed
- Heintzman ND, Stuart RK, Hon G, Fu YT, Ching CW, Hawkins RD, Barrera LO, Van Calcar S, Qu CX, Ching KA, Wang W, Weng Z, Green RD, Crawford GE, Ren B: Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat Genet 2007,39(3):311-318. 10.1038/ng1966View ArticlePubMed
- Rada-Iglesias A, Bajpai R, Swigut T, Brugmann SA, Flynn RA, Wysocka J: A unique chromatin signature uncovers early developmental enhancers in humans. Nature 2011,470(7333):279-283. 10.1038/nature09692PubMed CentralView ArticlePubMed
- Maston GA, Evans SK, Green MR: Transcriptional regulatory elements in the human genome. Annu Rev Genom Hum G 2006, 7: 29-59. 10.1146/annurev.genom.7.080505.115623View Article
- Adunyah SE, Chander R, Barner VK, Cooper RS: Regulation of c-jun mRNA expression by hydroxyurea in human K562 cells during erythroid differentiation. Biochim Biophys Acta 1995, 1263: 123-132. A published erratum appears in Biochim Biophys Acta 1995, 1264(3):409 10.1016/0167-4781(95)00079-VView ArticlePubMed
- Andrews NC: The NF-E2 transcription factor. Int J Biochem Cell B 1998,30(4):429-432. 10.1016/S1357-2725(97)00135-0View Article
- Chen RL, Chou YC, Lan YJ, Huang TS, Shen CKJ: Developmental silencing of human ζ-globin gene expression is mediated by the transcriptional repressor RREB1. J Biol Chem 2010,285(14):10189-10197. 10.1074/jbc.M109.049130PubMed CentralView ArticlePubMed
- Hilberg F, Aguzzi A, Howells N, Wagner EF: c-jun is essential for normal mouse development and hepatogenesis. Nature 1993, 365: 179-181. A published erratum appears in Nature 1993, 366(6453):368 10.1038/365179a0View ArticlePubMed
- Pages PA H, Gentleman R, DebRoy S: Biostrings: string objects representing biological sequences, and matching algorithms. http://bioc.ism.ac.jp/2.6/bioc/html/Biostrings.html
- Stormo GD, Zhao Y: Determining the specificity of protein-DNA interactions. Nat Rev Genet 2010,11(11):751-760.View ArticlePubMed
- Wang J, Zhuang JL, Iyer S, Lin XY, Greven MC, Kim BH, Moore J, Pierce BG, Dong XJ, Virgil D, Birney E, Hung JH, Weng Z: Factorbook.org: a Wiki-based database for transcription factor-binding data generated by the ENCODE consortium. Nucleic Acids Res 2013,41(D1):D171-D176. 10.1093/nar/gks1221PubMed CentralView ArticlePubMed
- Yanai I, Benjamin H, Shmoish M, Chalifa-Caspi V, Shklar M, Ophir R, Bar-Even A, Horn-Saban S, Safran M, Domany E, Lancet D, Shmueli O: Genome-wide midrange transcription profiles reveal expression level relationships in human tissue specification. Bioinformatics 2005,21(5):650-659. 10.1093/bioinformatics/bti042View ArticlePubMed
- Schug J, Schuller WP, Kappen C, Salbaum JM, Bucan M, Stoeckert CJ: Promoter features related to tissue specificity as measured by Shannon entropy. Genome Biol 2005,6(4):R33. 10.1186/gb-2005-6-4-r33PubMed CentralView ArticlePubMed
- Merryweather-Clarke AT, Atzberger A, Soneji S: Global gene expression analysis of human erythroid progenitors. Blood 2011, 117: e96-e108. A published erratum appears in Blood 2011, 118(26): 6993 10.1182/blood-2010-07-290825View ArticlePubMed
- McLean CY, Bristor D, Hiller M, Clarke SL, Schaar BT, Lowe CB, Wenger AM, Bejerano G: GREAT improves functional interpretation of cis-regulatory regions. Nat Biotechnol 2010,28(5):495-U155. 10.1038/nbt.1630View ArticlePubMed
- Addya S, Keller MA, Delgrosso K, Ponte CM, Vadigepalli R, Gonye GE, Surrey S: Erythroid-induced commitment of K562 cells results in clusters of differentially expressed genes enriched for specific transcription regulatory elements. Physiol Genomics 2004,19(1):117-130. 10.1152/physiolgenomics.00028.2004View ArticlePubMed
- Jordan IK, Rogozin IB, Glazko GV, Koonin EV: Origin of a substantial fraction of human regulatory sequences from transposable elements. Trends Genet 2003,19(2):68-72. 10.1016/S0168-9525(02)00006-9View ArticlePubMed
- Lunyak VV, Prefontaine GG, Nunez E, Cramer T, Ju BG, Ohgi KA, Hutt K, Roy R, Garcia-Diaz A, Zhu X, Yung Y, Montoliu L, Glass CK, Rosenfeld MG: Developmentally regulated activation of a SINE B2 repeat as a domain boundary in organogenesis. Science 2007,317(5835):248-251. 10.1126/science.1140871View ArticlePubMed
- Wang JR, Lunyak VV, Jordan IK: Genome-wide prediction and analysis of human chromatin boundary elements. Nucleic Acids Res 2012,40(2):511-529. 10.1093/nar/gkr750PubMed CentralView ArticlePubMed
- Lee YS, Shibata Y, Malhotra A, Dutta A: A novel class of small RNAs: tRNA-derived RNA fragments (tRFs). Gene Dev 2009,23(22):2639-2649. 10.1101/gad.1837609PubMed CentralView ArticlePubMed
- Ren B: Transcription enhancers make non-coding RNA. Nature 2010,465(7295):173-174. 10.1038/465173aView ArticlePubMed
- Kim TK, Hemberg M, Gray JM, Costa AM, Bear DM, Wu J, Harmin DA, Laptewicz M, Barbara-Haley K, Kuersten S, Markenscoff-Papadimitriou E, Kuhl D, Bito H, Worley PF, Kreiman G, Greenberg ME: Widespread transcription at neuronal activity-regulated enhancers. Nature 2010,465(7295):182-187. 10.1038/nature09033PubMed CentralView ArticlePubMed
- Li WB, Notani D, Ma Q, Tanasa B, Nunez E, Chen AY, Merkurjev D, Zhang J, Ohgi K, Song XY, Oh S, Kim HS, Glass CK, Rosenfeld MG: Functional roles of enhancer RNAs for oestrogen-dependent transcriptional activation. Nature 2013,498(7455):516-520. 10.1038/nature12210PubMed CentralView ArticlePubMed
- Lam MTY, Cho H, Lesch HP, Gosselin D, Heinz S, Tanaka-Oishi Y, Benner C, Kaikkonen MU, Kim AS, Kosaka M, Lee CY, Watt A, Grossman TR, Rosenfeld MG, Evans RM, Glass CK: Rev-Erbs repress macrophage gene expression by inhibiting enhancer-directed transcription. Nature 2013,498(7455):511-515. 10.1038/nature12209View ArticlePubMed
- Wang QB, Carroll JS, Brown M: Spatial and temporal recruitment of androgen receptor and its coactivators involves chromosomal looping and polymerase tracking. Mol Cell 2005,19(5):631-642. 10.1016/j.molcel.2005.07.018View ArticlePubMed
- Kim TH, Barrera LO, Zheng M, Qu CX, Singer MA, Richmond TA, Wu YN, Green RD, Ren B: A high-resolution map of active promoters in the human genome. Nature 2005,436(7052):876-880. 10.1038/nature03877PubMed CentralView ArticlePubMed
- Jurka J, Kapitonov VV, Pavlicek A, Klonowski P, Kohany O, Walichiewicz J: Repbase update, a database of eukaryotic repetitive elements. Cytogenet Genome Res 2005,110(1–4):462-467.View ArticlePubMed
- Kohany O, Gentles AJ, Hankus L, Jurka J: Annotation, submission and screening of repetitive elements in Repbase: RepbaseSubmitter and Censor. BMC Bioinforma 2006, 7: 474. 10.1186/1471-2105-7-474View Article
- Johnson DS, Mortazavi A, Myers RM, Wold B: Genome-wide mapping of in vivo protein-DNA interactions. Science 2007,316(5830):1497-1502. 10.1126/science.1141319View ArticlePubMed
- Frietze S, O’Geen H, Blahnik KR, Jin VX, Farnham PJ: ZNF274 recruits the histone methyltransferase SETDB1 to the 3′ ends of ZNF genes. PLoS One 2010,5(12):e15082. 10.1371/journal.pone.0015082PubMed CentralView ArticlePubMed
- Hartman SE, Bertone P, Nath AK, Royce TE, Gerstein M, Weissman S, Snyder M: Global changes in STAT target selection and transcription regulation upon interferon treatments. Gene Dev 2005,19(24):2953-2968. 10.1101/gad.1371305PubMed CentralView ArticlePubMed
- Letoile ND, Fahnestock ML, Shen YH, Aebersold R, Berk AJ: Human transcription factor-IIIc box B binding subunit. Proc Natl Acad Sci USA 1994,91(5):1652-1656. 10.1073/pnas.91.5.1652View Article
- Li ML, Ge QY, Wang W, Wang JK, Lu ZH: c-jun binding site identification in K562 cells. J Genet Genomics 2011,38(6):235-242. 10.1016/j.jgg.2011.05.004View ArticlePubMed
- Bryne JC, Valen E, Tang MHE, Marstrand T, Winther O, da Piedade I, Krogh A, Lenhard B, Sandelin A: JASPAR, the open access database of transcription factor-binding profiles: new content and tools in the 2008 update. Nucleic Acids Res 2008, 36: D102-D106. 10.1093/nar/gkn449PubMed CentralView ArticlePubMed
- Bailey TL, Williams N, Misleh C, Li WW: MEME: discovering and analyzing DNA and protein sequence motifs. Nucleic Acids Res 2006, 34: W369-W373. 10.1093/nar/gkl198PubMed CentralView ArticlePubMed
- Jjingo D, Conley AB, Yi SV, Lunyak VV, Jordan IK: On the presence and role of human gene-body DNA methylation. Oncotarget 2012,3(4):462-474.PubMed CentralView ArticlePubMed
- Su AI, Wiltshire T, Batalov S, Lapp H, Ching KA, Block D, Zhang J, Soden R, Hayakawa M, Kreiman G, Cooke MP, Walker JR, Hogenesch JB: A gene atlas of the mouse and human protein-encoding transcriptomes. Proc Natl Acad Sci USA 2004,101(16):6062-6067. 10.1073/pnas.0400782101PubMed CentralView ArticlePubMed
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.