Upregulation of selected HERVW loci in multiple sclerosis

Introduction
Human endogenous retroviruses (HERVs) are the present-day versions of retroviral germline infections that occurred millions of years ago, and they occupy about 8 % of the genome [1]. While they are mostly replication-deficient, they are known to express RNA and protein [2] during particular developmental stages, or in response to aging [3], inflammation and a wide range of pathologies [4]. A human retrovirus discovered in multiple sclerosis (MS) patients [5] turned out to be the prototype of a novel HERV family referred to as HERVW [6]. The HERVW family consists of 213 elements, 12 of which are complete proviral copies with intact LTRs [7]. Increased expression of HERVW in peripheral blood mononuclear cells (PBMCs) has been repeatedly associated with MS, and the presence of HERVW protein or elevated RNA transcription has been correlated with disease activity [8-10]. While a contribution of HERVW-encoded proteins to brain disease is suggested by their presence in MS-associated brain lesions, expression in peripheral organs may be involved in the disease process through cytokine-induced damage to the blood-brain barrier and subsequent infiltration of monocytes. Alterations in peripheral expression may also serve as a useful and practical marker for the diagnosis of this CNS disease. Therefore, we quantified overall HERVW levels and identified the individual HERVW loci actually transcribed in PBMCs. The analysis was carried out in patients diagnosed with clinically isolated syndrome (CIS), a precursor to MS defined by a single episode of neurologic symptoms lasting at least 24 h. CIS is an indicator of future development of MS, as 60 % of the people diagnosed with CIS develop MS [11]. These patients potentially represent the earliest stage of MS routinely available for clinical analysis. We undertook a next-generation sequencing (NGS)-based analysis of transcripts amplified from cDNA obtained from patients with CIS and from healthy controls.
Data from this pilot experiment indicate that the relative frequency of specific HERVW copies is altered in PBMCs of CIS patients, even in the absence of overall HERVW overexpression. This altered frequency appears to derive from less abundantly transcribed but potentially MS-related HERVW loci.

Expression analysis and PCR
RNA isolation and random-primed cDNA synthesis were carried out as described previously [14]. HERVW ENV levels were determined by triplicate qPCR assays as described [14,15]. For the identification and localization of transcribed HERVW loci, cDNA was amplified using the external primers of an established PCR assay for HERVW ENV [15]. Products were purified and subjected to NGS analysis.

NGS analysis
Library preparation and sequencing were carried out using the Ion Torrent technology workflow on an Ion Torrent S5XL platform with an Ion 530 chip. The resulting reads were mapped to the human reference genome (version hg19) using strict criteria to maximize discrimination between individual HERVW copies. Relative frequencies were calculated as the number of reads mapping to an individual HERVW ENV element relative to the total number of reads. Details are provided in the Suppl. Methods.
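The relative-frequency calculation described above can be illustrated with a minimal Python sketch. It assumes that each mapped read has already been assigned to exactly one locus label; the function name `relative_frequencies` and the example reads are hypothetical and are not taken from the study's pipeline or data.

```python
from collections import Counter


def relative_frequencies(read_loci):
    """Return the percentage of reads assigned to each locus.

    read_loci: iterable of locus labels, one label per mapped read
    (assumed input format, not the study's actual file format).
    """
    counts = Counter(read_loci)
    total = sum(counts.values())
    if total == 0:
        return {}
    # Percentage of all reads in the sample that map to each locus.
    return {locus: 100.0 * n / total for locus, n in counts.items()}


# Made-up read assignments for illustration only (not study data).
example_reads = ["19q13.2"] * 6 + ["Xq22.3"] * 3 + ["3q11.2"]
freqs = relative_frequencies(example_reads)
print(freqs)
```

Because the frequencies within a sample sum to 100 %, a rise in the share of one locus necessarily lowers the share of the others, which is worth keeping in mind when comparing low-abundance loci between groups.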

Statistical analysis
SPSS software (version 15.0) was used for all analyses and graphs. Normality was assessed, and the statistical significance of differences was evaluated with the test named alongside each result (Mann-Whitney U test, Welch's t-test or Student's t-test). Data were further analyzed using the DESeq2 package [16] to correct p values for multiple testing (false discovery rate < 0.05).
More detailed information is available in Suppl. M&M.
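
As an illustration of the multiple-testing correction, the sketch below implements the Benjamini-Hochberg procedure, which is the default adjustment method in DESeq2. It is a stand-alone Python re-implementation for explanation only, not the DESeq2 code used in the study, and the example p values are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that adjusted values
    # never increase as the raw p value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented p values for five hypothetical loci (not study data).
example_p = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(example_p)
significant = [p < 0.05 for p in adj]
print(adj, significant)
```

In this invented example, two raw p values just below 0.05 no longer pass the threshold after adjustment, which is the behavior the correction is meant to provide when many loci are tested at once.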

Results
We carried out HERVW ENV expression analyses using an optimized assay described by Mameli et al. [15]. No significantly increased expression of HERVW was detected in a small cohort of CIS patients (n = 6) compared to age-matched controls (n = 15) (Mann-Whitney U test; p = 0.267) (Fig. 1). Results were not skewed by the use of GAPDH as a reference gene (Fig. 1): comparison with the RPL19 and HSDA reference genes (Table S1 and Suppl Figure 1) showed no statistically significant difference between the use of GAPDH alone and the mean of the three genes (Welch's t-test; significance threshold p < 0.05).
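For readers unfamiliar with the Mann-Whitney U test used for this group comparison, the following Python sketch computes the U statistic and a two-sided p value from the normal approximation. It is a simplified illustration: it applies no tie or continuity correction, an exact test is generally preferable for groups this small, and it is not the SPSS procedure used in the study. The expression values are invented.

```python
import math


def mann_whitney_u(x, y):
    """Return (U, two-sided p) using the normal approximation.

    No tie or continuity correction is applied (simplifying assumption).
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Find the run of tied values starting at position i.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, p


# Invented relative expression values (not study data).
patients = [1.2, 0.8, 1.5, 2.1, 0.9, 1.7]
controls = [1.0, 1.1, 0.7, 1.3, 0.6, 1.4]
u_stat, p_value = mann_whitney_u(patients, controls)
print(u_stat, p_value)
```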
In the absence of increased overall expression levels of HERVW in CIS samples, we wondered whether specific copies of HERVW (Table S2) might be differentially expressed. We performed NGS analysis to identify individual HERVW copies with altered expression in PBMCs from CIS patients (n = 5) and controls (n = 5). The reads obtained (70,694 ± 24,812 per sample; range 25,286-136,704) were mapped to the human genome. Once assigned to unique genomic locations, reads corresponding to 39 HERVW ENV loci were extracted (Table S3 and Table 2). As expected, > 99.85 % of mapped reads corresponded to the 39 loci analyzed (data not shown). The resulting data showed that reads obtained from CIS patients mapped to a significantly higher number of different HERVW ENV loci (31 ± 13) than those obtained from controls (16 ± 5.5) (Student's t-test; p = 0.018) (Fig. 2A). Over 70 % of the reads mapped to one of two loci: 19q13.2 or Xq22.3. Extending the range, reads mapped with high frequency (> 3.6 % of total reads/locus) to a limited number of loci, in particular to HERVW ENV copies located at 19q13.2, Xq22.3, 8q21.11, 15q21.3, 12q23.3 and 4q21.22 (Fig. 2B). We found no significant differences between CIS patients and controls in the relative frequency of reads mapping to these loci (Fig. 2B; Table 2).
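
The comparison of the number of transcribed loci per sample can be sketched as follows. The snippet counts loci with at least `min_reads` reads (the threshold of one read is an assumption, since the detection criterion is described in the supplementary methods) and computes the pooled-variance Student's t statistic with its degrees of freedom. Converting the statistic into a p value requires the t distribution, which is left to a statistics package. All numbers are invented.

```python
import math
from statistics import mean, variance


def loci_detected(counts, min_reads=1):
    """Count loci with at least min_reads reads (threshold is an assumption)."""
    return sum(1 for n in counts.values() if n >= min_reads)


def student_t(a, b):
    """Return (t, degrees of freedom) for an equal-variance two-sample t-test."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


# Invented per-locus read counts for one hypothetical sample.
sample_counts = {"19q13.2": 120, "Xq22.3": 40, "3q11.2": 0}
n_loci = loci_detected(sample_counts)

# Invented numbers of detected loci per sample (not study data).
cis_loci = [30, 35, 28, 33, 29]
control_loci = [15, 18, 14, 20, 13]
t_stat, dof = student_t(cis_loci, control_loci)
print(n_loci, t_stat, dof)
```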

Table 1 Clinical features of MS patients included in this study
Clinical data of patients whose PBMCs were analyzed for HERVW expression. Median ages for both the patient and control groups were 44 years (mean ± SEM: 42.0 ± 4.25 and 40.4 ± 1.94 years for the patient and control groups, respectively). A/NA status refers to active and non-active patients, respectively. Subsequent progression to an MS diagnosis (RRMS) is indicated for all CIS cases. Samples analyzed by NGS are marked in blue; median ages in these groups are 44 years for patients and 47 years for controls. * indicates samples only analyzed by NGS.

Table 2 Percentages of reads mapped to individual HERVW loci
Mapped reads (Table S3) were recalculated as the number of reads mapping to an individual HERVW ENV element relative to the total number of reads, and represented as a percentage. The HERVW loci to which an increased number of reads mapped in CIS patients are indicated in blue.

Discussion
In contrast to our findings in the small group of CIS patients analyzed in this study, increased HERVW levels have frequently been associated with MS. Our inability to demonstrate a statistically significant increase of overall HERVW levels in PBMCs of CIS patients may be explained by the selection of this particular group or, more likely, simply by the small sample size. However, a lack of increased expression is not unprecedented: it was previously reported in a cohort of South African MS patients, although different primers were used for that analysis [17].
We performed NGS analysis to identify individual HERVW copies that show altered expression in PBMCs, comparing CIS patients (n = 5) to controls (n = 5). Although more definite answers require future analysis of more subjects, more HERVW loci were expressed in the CIS patients analyzed than in control subjects. A similar increase has been reported previously in MS brain [18]. While previous studies failed to identify MS-specific loci or expression [18,19], in the CIS patients we found statistically significant overrepresentation of reads corresponding to specific loci (e.g. 3q11.2 and 19p12; see Table 3 for the complete list).
Locus-specific qPCR assays may first help confirm this finding in a larger patient cohort, and subsequently be evaluated as a potential prognostic assay. Together, these overrepresented loci produce only 1-3 % of total transcripts (Fig. 2C-E). The combined findings of low levels of overexpression, activation of more loci, and activation of low-expressing HERVW elements in CIS patients suggest that their potential contribution to the pathology may be unrelated to overall high expression levels. None of the copies identified encodes a full-length ENV protein, as the sequences corresponding to the ENV gene are truncated, lack ATG codons, and/or carry frameshifts and stop codons (Suppl Figure 2). CIS-associated copies may produce proteins (whether ENV-related or not) that are especially active in the activation of TLR4 [20], or RNAs that trigger the innate immune system through TLR3 [21,22]. Although our analysis shows that upregulation of specific HERVW loci in PBMCs is associated with CIS, the presence of these transcripts in MS brain is unknown at present. A potential role of these transcripts in proviral protein production and in activation of either the peripheral immune system or CNS disease remains to be established.

Fig. 2 NGS analysis of transcribed HERVW ENV copies in PBMCs. A specific MSRV ENV PCR assay [15] was applied to random-primed cDNA from CIS patients (n = 5) and controls (n = 5), and products were sequenced by NGS. For each sample, the relative frequency (%) of reads mapping to individual HERVW ENV copies was calculated relative to the total number of mapped reads. A) The total number of transcribed HERVW ENV copies identified (p = 0.018; Student's t-test). B) Individual HERVW ENV copies and the mean relative frequency (%) of reads mapping to each copy, represented in pie charts. The most abundantly transcribed copies are indicated. C-E) The median relative frequency (%) of reads mapped to the individual HERVW ENV copies in CIS patients or controls.