Fig. 4 | Mobile DNA

Fig. 4

From: Reproducible evaluation of transposable element detectors with McClintock 2 guides accurate inference of Ty insertion patterns in yeast


Numbers of Ty elements predicted by McClintock 2 components in a world-wide sample of yeast strains. (A) Numbers of non-reference TE predictions per strain (summed over all Ty families) and (B) numbers of non-reference TE predictions across Ty families (summed over all strains) in 1,011 S. cerevisiae WGS samples [64, 66], down-sampled to 50\(\times\) coverage. In panel (A), lines inside boxes indicate median values, colored boxes show interquartile ranges (IQR), whiskers extend to values within \(1.5{\times }\)IQR of the upper or lower quartiles, and dots indicate outliers beyond \(1.5{\times }\)IQR. Components with bold outlines in panel (A) have median values of \(\sim\)50 non-reference Ty insertions per strain, as well as recall and precision both >75% in tRNA promoter insertion simulations when allowing non-exact predictions in WGS datasets with >50\(\times\) coverage (see Fig. 3). Note that the y-axis is on a \(\log _{10}\) scale, and that 16 zero-count data points and one extreme TE-locate data point (count=749) were removed to aid visualization. In panel (B), total numbers of non-reference TE predictions are partitioned as “tRNA” (dark red) if they are located between 1000 bp upstream and 500 bp downstream of tRNA genes, or “non-tRNA” (orange) if outside these windows. Note that the y-scale varies for each component method. The percentage of predictions near tRNA genes is annotated at the top of each bar. “N.A.” means no insertions of that Ty family were predicted by that component. Components with bold outlines in panel (B) predict consistent relative TE family abundance and also have the properties of components with bold outlines in panel (A), and thus we designate them as “best-in-class” methods for predicting non-reference TE insertions in S. cerevisiae. Dashed lines in panel (A) represent the average of the median number of non-reference TE insertions across the four best-in-class methods (n=54).
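
The tRNA/non-tRNA partition used in panel (B) can be illustrated with a minimal Python sketch. This is not the authors' code: the names `TrnaGene`, `trna_window`, `classify_insertion`, and `percent_near_trna` are hypothetical, and the sketch assumes 1-based inclusive coordinates, a single-base insertion position, strand-aware "upstream"/"downstream", and that the gene body itself counts as inside the window. Only the 1000 bp and 500 bp distances come from the caption.

```python
from dataclasses import dataclass

UPSTREAM_BP = 1000    # window size upstream of a tRNA gene (from the caption)
DOWNSTREAM_BP = 500   # window size downstream of a tRNA gene (from the caption)


@dataclass(frozen=True)
class TrnaGene:
    """Hypothetical tRNA gene record; coordinates assumed 1-based, inclusive."""
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


def trna_window(gene):
    """Return (lo, hi) of the window around a tRNA gene.

    Assumption: upstream/downstream are relative to the gene's strand,
    and the gene body is part of the window.
    """
    if gene.strand == "+":
        return max(1, gene.start - UPSTREAM_BP), gene.end + DOWNSTREAM_BP
    # On the minus strand, upstream lies at higher coordinates.
    return max(1, gene.start - DOWNSTREAM_BP), gene.end + UPSTREAM_BP


def classify_insertion(chrom, pos, genes):
    """Label an insertion "tRNA" if it falls in any tRNA window, else "non-tRNA"."""
    for gene in genes:
        if gene.chrom != chrom:
            continue
        lo, hi = trna_window(gene)
        if lo <= pos <= hi:
            return "tRNA"
    return "non-tRNA"


def percent_near_trna(insertions, genes):
    """Percentage of (chrom, pos) insertions labelled "tRNA"; None if there are none."""
    if not insertions:
        return None  # analogous to an "N.A." bar: nothing predicted
    near = sum(classify_insertion(c, p, genes) == "tRNA" for c, p in insertions)
    return 100.0 * near / len(insertions)
```

With made-up coordinates, a plus-strand gene at `chrI:5000-5072` would give a window of 4000 to 5572, so an insertion at position 4000 is labelled "tRNA" and one at 3999 is labelled "non-tRNA". A linear scan is enough for a genome with a few hundred tRNA genes; an interval index would be the natural replacement for larger annotations.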
