Fig. 5 | Mobile DNA

Fig. 5

From: RepEnTools: an automated repeat enrichment analysis package for ChIP-seq data reveals hUHRF1 Tandem-Tudor domain enrichment in young repeats

Simulated datasets show that RepEnTools analysis is reliable for RepeatMasker (RMSK) regions in the human (chm13v2) and mouse (mm39) genomes, excluding some simple repeats.

A We used the read simulator ART with chm13v2 to generate simulated 150 bp paired-end reads at various sequencing depths [38]. At the same time, ART created a SAM file containing the true coordinates of the simulated reads. The ground-truth data were assigned to REs, while the FASTQ reads were processed by RepEnTools. The normalised counts from the “reference” and the RepEnTools-analysed data were then compared. This benchmarked the trimming, mapping and RMSK assignment strategies employed by RepEnTools.

B Using simulated data from chm13v2, RepEnTools analysis of reads on RMSK-annotated elements, in particular young repeats, accurately reproduces the reference data at all sequencing depths tested. At low coverage, 9,062 of the 15,745 REs in RMSK have no reads in the reference file, demonstrating the non-linear relationship between RE coverage and genome-wide coverage. See also Additional file 1: Fig. S5A-C. RepEnTools’ analysis of reads on full-length young repeats (SVA, L1PA) is exceptionally faithful. See also Additional file 1: Fig. S5D-E. r: Pearson correlation.

C RepEnTools’ analysis using the latest mouse assembly (mm39) accurately reproduces the reference data on RMSK-annotated elements at all sequencing depths tested. See also Additional file 1: Fig. S7A-B. RepEnTools’ analysis of reads on species-specific repeats (IAP) is exceptionally faithful. See also Additional file 1: Fig. S7C.
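The comparison of normalised counts described in this caption can be illustrated with a minimal Python sketch. It is not the RepEnTools implementation: the function names, the counts-per-million normalisation and the toy read counts are all assumptions made for illustration, since the caption states only that normalised counts were compared and that r is the Pearson correlation.

```python
import math


def normalise(counts):
    """Scale raw per-repeat read counts to counts per million.

    Counts per million is an assumed scheme; the caption does not
    specify how the counts were normalised.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads assigned to any repeat element")
    return {name: 1e6 * n / total for name, n in counts.items()}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def compare_to_reference(reference_counts, observed_counts):
    """Correlate normalised counts over the union of repeat names.

    Repeats missing from one dataset are treated as having zero reads.
    """
    names = sorted(set(reference_counts) | set(observed_counts))
    ref = normalise({n: reference_counts.get(n, 0) for n in names})
    obs = normalise({n: observed_counts.get(n, 0) for n in names})
    return pearson_r([ref[n] for n in names], [obs[n] for n in names])


# Toy read counts, invented for illustration only (not from the paper).
# "reference" stands in for reads assigned from the simulator's true
# coordinates; "observed" stands in for reads assigned after mapping.
reference = {"L1PA2": 500, "SVA_F": 120, "AluY": 900, "(GAATG)n": 40}
observed = {"L1PA2": 495, "SVA_F": 118, "AluY": 905, "(GAATG)n": 5}

r = compare_to_reference(reference, observed)
print(f"Pearson r = {r:.4f}")
```

In this toy input the simple repeat loses most of its reads after mapping while the other elements are nearly unchanged, so the correlation stays close to 1. A single summary coefficient can therefore hide poor recovery of individual elements, which is one reason to inspect specific repeat classes separately.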
