Proteomic markers with prognostic impact on outcome of chronic lymphocytic leukemia patients under chemo-immunotherapy: results from the HOVON 109 study

Despite recent identification of several prognostic markers, there is still a need for new prognostic parameters able to predict clinical outcome in chronic lymphocytic leukemia (CLL) patients. Here, we aimed to validate the prognostic ability of known (proteomic) markers measured pretreatment and to search for new proteomic markers that might be related to treatment response in CLL. To this end, baseline serum samples of 51 CLL patients treated with chemo-immunotherapy were analyzed for 360 proteomic markers, using Olink technology. Median event-free survival (EFS) was 23 months (range: 1.25−60.9). Patients with high levels of sCD23 (>11.27, p = 0.026), sCD27 (>11.03, p = 0.04), SPINT1 (>1.6, p = 0.001), and LY9 (>8.22, p = 0.0003) had a shorter EFS than those with marker levels below the median. The effect of sCD23 on EFS differed between immunoglobulin heavy chain variable gene-mutated and unmutated patients, with the shortest EFS for unmutated CLL patients with sCD23 levels above the median. Taken together, our results validate the prognostic impact of sCD23 and highlight SPINT1 and LY9 as possible promising markers for treatment response in CLL patients. © 2020 ISEH -Society for Hematology and Stem Cells. Published by Elsevier Inc. All rights reserved.
The natural history of chronic lymphocytic leukemia (CLL) is highly heterogeneous. Some patients do not require treatment for decades, while others require direct treatment after diagnosis and experience diminished life expectancy because of CLL [1]. Since the introduction of multiple novel treatments, there is a growing need for informative prognostic markers with clinical significance that can influence the choice of standard of care [1].
To date, several prognostic markers have been identified, including immunoglobulin heavy chain variable (IGHV) gene mutation status, ZAP70, chromosomal alterations, CD38, CD40L, and biochemical parameters (e.g., lactate dehydrogenase [LDH] and β2-microglobulin [β2M]) [1]. Yet, the identification of new prognostic parameters that are able to predict clinical outcome after treatment is important for patient management and may be useful in guiding therapeutic decisions.
Proteomic data are increasingly being used for biomarker discovery and for gaining mechanistic insight into lymphoid malignancies. Among the general population, an immune system environment that is characterized by elevated levels of B-cell stimulatory cytokines has been suggested to contribute to the development of B-cell lymphoma, including CLL [2−5]. In a study including 105 newly diagnosed and untreated CLL patients, high levels of soluble CD23 (sCD23) at the time of initial diagnosis were a strong predictor of progressive disease within the first year of disease presentation [6]. Other studies obtained similar results implicating sCD23 as a suitable marker with prognostic potential in CLL at diagnosis [7−9].
The aims of this pilot study were (1) to validate the prognostic ability of known proteomic markers (in particular B-cell activation markers such as sCD23 and sCD27) measured pretreatment for treatment response in CLL patients, and (2) to search for new proteomic markers and their biological pathways that might be related to treatment response.

Methods
Study subjects were selected from the HOVON 109 clinical study, a phase I/II trial designed to assess the efficacy and safety of first-line therapy with chlorambucil, rituximab, and lenalidomide in elderly patients and young frail patients with advanced CLL [10]. Total treatment duration was 12 months, and all patients were followed until 5 years after registration. Of 63 patients enrolled in the HOVON 109 study, 51 with an available sample at baseline were included in our current pilot study. Clinical data were collected from the HOVON database.
Four commercially available proteomic panels (Oncology, Inflammation, Immune response, and Development; Olink Bioscience, Uppsala, Sweden) including 360 low-abundance serum proteins were selected for this study, covering multiple proteomic markers previously associated with incidence or progression of CLL in published clinical and population-based studies [2−9]. Serum samples were analyzed using a multiplex proximity extension assay. Data were expressed in the arbitrary unit NPX (Normalized Protein eXpression) on a log2 scale and linearized using the formula 2^NPX, where a high NPX value corresponds to a high protein concentration (Supplementary Table E1, online only, available at www.exphem.org).
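As a minimal sketch of the linearization described above, the following Python snippet converts log2-scale NPX values to the linear scale and expresses an NPX difference as a fold change. The function names and the example values are hypothetical and are not taken from the study data.

```python
def npx_to_linear(npx: float) -> float:
    """Convert a log2-scale NPX value to the linear scale (2 ** NPX)."""
    return 2.0 ** npx


def fold_change(npx_a: float, npx_b: float) -> float:
    """Linear-scale ratio between two NPX values.

    Because NPX is on a log2 scale, a difference of 1 NPX
    corresponds to a doubling on the linear scale.
    """
    return 2.0 ** (npx_a - npx_b)


# Hypothetical NPX values for one marker in two samples
sample_a, sample_b = 5.0, 4.0
linear_a = npx_to_linear(sample_a)
ratio = fold_change(sample_a, sample_b)
```

Since NPX is an arbitrary relative unit, the linearized values remain relative as well; they allow comparison of the same marker across samples, not absolute quantification.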
Statistical procedures are provided in detail in the Supplementary Data (online only, available at www.exphem.org). As the analyses of B-cell activation markers that were found to be predictive in the general population (sCD23 and sCD27) rely on a prior hypothesis, these were not corrected for multiple testing. All other p values were corrected for multiple testing.
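The text does not name the correction procedure that was applied. Purely as an illustration of what a multiple testing correction does to a list of p values, the sketch below implements the Benjamini-Hochberg false discovery rate adjustment; the choice of this method and the example p values are assumptions, not the authors' stated approach.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values, in the input order.

    Assumption: BH is used here only as an example; the source does not
    state which correction method was applied.
    """
    m = len(pvalues)
    # Indices of the p values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = pvalues[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p values for four markers
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
```

With several hundred markers tested, an adjustment of this kind raises most p values considerably, which is consistent with markers showing only a trend toward significance after correction.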

Results
Clinical characteristics of the CLL patients (30 males, 21 females; median age: 71 years) are summarized in Supplementary Table E2 (online only, available at www.exphem.org). Until 5 years after registration, 26 events were recorded. We first evaluated measured levels of the proteomic markers in the context of other known prognostic factors. Mutated IGHV status was significantly associated with lower levels of sCD23 and higher levels of nuclear factor of activated T cells 3 (NFATC3), while a positive association between β2M levels and 58 markers was established (Supplementary Table E3, online only, available at www.exphem.org). No marker remained significantly associated with Rai stage or cytogenetic aberrations after multiple testing correction.
In log-rank testing of known prognostic factors, male patients had shorter EFS as compared with female patients, and IGHV-mutated CLL patients had longer EFS than unmutated CLL patients (Supplementary Figure E1, online only, available at www.exphem.org). No significant EFS effects were seen for β2M levels, Rai stage, or well-defined chromosome aberrations. The latter may be due to the small number of cases with these aberrations in our pilot study.
From the previously studied B-cell activation proteomic markers (Supplementary Table E4, online only, available at www.exphem.org), only the sCD23 level was significantly associated with EFS (hazard ratio [HR] = 1.56, 95% confidence interval [CI] = 1.02−2.4, p = 0.04) in the univariate model, while none was significantly associated with EFS in the models adjusted for gender and IGHV status. As sCD23 level was related to IGHV status, an interaction term was added to the adjusted model, which resulted in a borderline significant association for sCD23 (HR = 0.23, 95% CI = 0.04−1.14, p = 0.07).
When median values were used as thresholds, patients with sCD23 and sCD27 levels above the median had a shorter EFS than those with marker levels below the median (Figure 1A,B). Notably, when combining these markers with significant CLL prognostic factors in this cohort (i.e., IGHV mutation status and gender), sCD23 or sCD27 levels above the median were associated with the lowest EFS in unmutated IGHV patients (Figure 1C,D). In contrast, survival distributions for male and female patients with sCD23 or sCD27 levels above the median did not differ significantly (Figure 1E,F).
Several newly studied proteomic markers were significantly associated with EFS in Cox regression models, although after multiple testing correction even the top four proteomic markers from the models exhibited only a trend toward significance (Supplementary Table E5, online only, available at www.exphem.org). Nevertheless, the two markers with p values closest to significance, that is, serine peptidase inhibitor SPINT1 and surface antigen LY9, were associated with a significantly longer EFS in patients with marker levels equal to or lower than the median (Figure 2). Moreover, CLL patients with higher levels of IFNLR1 had a shorter EFS than those patients with marker levels equal to or lower than the median.
Findings were independently validated via Lasso regression (details are in the Supplementary Data).

Discussion
In this pilot study, we were able to validate sCD23 levels above the median as significantly associated with a shorter EFS, which is consistent with previous studies [7]. Soluble CD23 is released from activated B cells and can itself induce further B-cell stimulation as well as function as a potent mitogenic growth factor. Notably, sCD23 seems to be associated with IGHV status, since mutated CLL patients had significantly lower levels of sCD23 than unmutated patients. When both markers are used, patients with unmutated IGHV genes and sCD23 levels above the median can be regarded as a very poor prognostic group. Similarly, patients with unmutated IGHV genes and high levels of sCD27 were found to have an even poorer prognosis. High sCD27 levels have been described as associated with higher Rai stage, β2M, and LDH among CLL patients [11,12]. Here we found that high levels of sCD27 were associated with β2M levels and inferior prognosis.
Our study further indicated that higher levels of SPINT1 are associated with shorter EFS and higher β2M levels. SPINT1, a Kunitz-type serine protease inhibitor, modulates matriptase proteolytic activity. Matriptase was identified in Burkitt lymphoma cells [13] and later in CLL [14]. In fact, it is highly upregulated in CLL and promotes cancer cell invasion either directly by degrading matrix proteins or indirectly by activating growth factors or through yet unknown mechanisms [14].
In our cohort, CLL patients with high LY9 levels had an inferior EFS as compared with patients with low LY9 levels. LY9 is known to interact with SLAM-associated protein, which has been implicated in autoimmunity. It has been reported that LY9 is a naturally processed antigen in CLL and can serve as a tumor-associated antigen in this disease [15]. LY9-specific cytotoxic T cells from CLL patients were reported to efficiently recognize native and CD40L-activated autologous malignant CLL cells via MHC-I molecules. These findings provide strong evidence that LY9 can be employed for the design of T cell-based immunotherapeutic strategies for LY9-expressing malignancies including CLL [15] and, thus, underline the impact of the results from our current study.
A major strength of this pilot study is the large set of novel proteomic markers measured, most of which had not previously been described extensively in CLL patients. Despite the relatively small number of available cases, which limited statistical power, our pilot study identified SPINT1 and LY9 as promising independent prognostic proteomic markers next to sCD23 and sCD27 in patients treated for CLL. Further studies with larger sample sizes are required to validate these results. Also, as proteomic markers were measured only before treatment, changes in marker levels during treatment should be evaluated in new CLL patient cohorts.

Supplementary Methods
Written informed consent was obtained before enrollment in the trial. The study was approved by an accredited Ethical Committee and Institutional Review Board and was performed according to the Declaration of Helsinki, the International Conference on Harmonization Good Clinical Practice Guidelines, and the European Union Clinical Trial Directive (2001/20/EC). The study was registered with EudraCT number 2010-022294-34 [1].
Protein measurements. Serum samples were analyzed using a multiplex proximity extension assay. In brief, 1 µL of sample was incubated in the presence of proximity antibody pairs tagged with DNA-reporter molecules. Once a pair of antibodies is bound to its corresponding antigen, the respective DNA tails form an amplicon by proximity extension, which was quantified by high-throughput real-time PCR (BioMark™ HD System, Fluidigm Corporation). Protein abundance directly correlates with the generated fluorescent signal, which is expressed in quantitation cycles (Ct) produced by the BioMark's Real-Time PCR Software following the Proseek Multiplex protocol. To minimize variation within and between runs, Ct values were normalized using both an internal control (extension control) and an interplate control, and then transformed using a pre-determined correction factor.
The limit of detection was determined for each biomarker based on the mean value of 4 controls analyzed in each run. Thirty-five markers were excluded from statistical analyses because of a high non-detection rate in the cohort (i.e. >30% of the cases) (Supplementary Table E1, footnote). For 35 samples, the non-detection rate ranged from 2% to 30% (median 7%) and those were set to the value of the lower limit of quantification divided by the square root of two. Around 83% of markers were detected in all samples. The list of biomarkers (n=322) included in the statistical analyses and their median value is shown in Supplementary Table E1.
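The exclusion and substitution rules above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the limit of detection and the lower limit of quantification are passed as separate arguments because the text mentions both, and the scale on which the substitution was performed is not stated in the source.

```python
import math


def nondetect_rate(values, lod):
    """Fraction of samples with a value below the limit of detection."""
    return sum(v < lod for v in values) / len(values)


def handle_marker(values, lod, lloq, max_rate=0.30):
    """Apply the non-detect rules to one marker.

    Returns None when more than `max_rate` of the samples are below the
    limit of detection (marker excluded). Otherwise returns a copy of the
    values in which each non-detect is replaced by lloq / sqrt(2).
    """
    if nondetect_rate(values, lod) > max_rate:
        return None
    fill = lloq / math.sqrt(2)
    return [fill if v < lod else v for v in values]


# Hypothetical marker: 1 of 10 samples is below a detection limit of 2.0
kept = handle_marker([1.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0],
                     lod=2.0, lloq=2.0)
# Hypothetical marker: 4 of 10 samples are below the limit, so it is excluded
dropped = handle_marker([1.0, 1.0, 1.0, 1.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
                        lod=2.0, lloq=2.0)
```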
Statistical analyses. Differences in marker distributions across different levels of prognostic factors were evaluated by Wilcoxon tests. Kaplan−Meier plots and log-rank tests were used to examine the survival distributions for protein markers (≤ median or > median) and other known prognostic factors. The medians of the protein markers are shown in Supplementary Table E1.
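To illustrate the median-split comparison, the sketch below dichotomizes a marker at its median and computes a two-group log-rank statistic using only the Python standard library. It is a simplified illustration with hypothetical data and function names; the software used in the study is not stated here.

```python
import math
import statistics


def split_at_median(levels):
    """Return 1 for values above the median and 0 for values at or below it."""
    med = statistics.median(levels)
    return [1 if v > med else 0 for v in levels]


def logrank_test(times, events, groups):
    """Two-group log-rank test.

    times  : follow-up time per patient
    events : 1 if the event occurred, 0 if censored
    groups : 0 or 1 per patient
    Returns (chi-square statistic with 1 degree of freedom, p value).
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for ti in times if ti >= t)  # at risk, both groups
        n1 = sum(1 for ti, g in zip(times, groups) if ti >= t and g == 1)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        d1 = sum(1 for ti, e, g in zip(times, events, groups)
                 if ti == t and e == 1 and g == 1)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            # Hypergeometric variance of the number of events in group 1
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0.0:
        raise ValueError("log-rank statistic is undefined for these data")
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical data: six patients, all with an event, marker-high patients first
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 1, 1, 1, 1]
marker = [9.1, 8.7, 8.9, 7.2, 7.5, 7.0]
groups = split_at_median(marker)
chi2, p_value = logrank_test(times, events, groups)
```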
Cox proportional hazard models were used for testing the effects of the markers for event-free survival (EFS; time from registration to induction failure, progression, or death from any cause, whichever comes first). Induction failure was defined as not having achieved at least a PR during/after a maximum of 12 cycles. Unadjusted analyses were carried out for each proteomic marker. Due to the limited sample size of the study, the Cox model of proteomic markers was further adjusted only for significant known prognostic factors (gender and IGHV status).
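For readers unfamiliar with how a hazard ratio is obtained, the following sketch fits an unadjusted, single-covariate Cox model by Newton-Raphson maximization of the partial likelihood (Breslow handling of ties). It is a didactic simplification under stated assumptions: the adjusted models in this study also included gender and IGHV status, the data below are hypothetical, and the function names are not from any particular library.

```python
import math


def fit_cox_univariate(times, events, x, max_iter=50, tol=1e-10):
    """Fit a one-covariate Cox model by Newton-Raphson.

    times  : follow-up time per patient
    events : 1 if the EFS event occurred, 0 if censored
    x      : covariate value per patient (e.g., a marker level)
    Returns (beta, standard_error), where exp(beta) is the hazard ratio.
    """
    beta = 0.0
    info = 0.0
    for _ in range(max_iter):
        grad, info = 0.0, 0.0
        for t_i, e_i, x_i in zip(times, events, x):
            if e_i != 1:
                continue  # censored patients contribute only to risk sets
            risk_x = [x_j for t_j, x_j in zip(times, x) if t_j >= t_i]
            weights = [math.exp(beta * x_j) for x_j in risk_x]
            s0 = sum(weights)
            s1 = sum(w * x_j for w, x_j in zip(weights, risk_x))
            s2 = sum(w * x_j * x_j for w, x_j in zip(weights, risk_x))
            mean_x = s1 / s0
            grad += x_i - mean_x               # score contribution
            info += s2 / s0 - mean_x ** 2      # information contribution
        if info <= 0.0:
            raise ValueError("no information about beta in these data")
        step = grad / info
        beta += step
        if abs(step) < tol:
            break
    return beta, 1.0 / math.sqrt(info)


def hazard_ratio_ci(beta, se, z=1.96):
    """Hazard ratio with an approximate 95% Wald confidence interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical cohort: x = 1 marks marker-high patients, who tend to fail earlier
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1, 1, 1, 1, 1, 1, 1, 1]
x = [1, 1, 0, 1, 0, 1, 0, 0]
beta, se = fit_cox_univariate(times, events, x)
hr, ci_low, ci_high = hazard_ratio_ci(beta, se)
```

With so few patients the confidence interval is very wide, which mirrors the power limitation noted for small cohorts.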
Of interest was the added prognostic value of proteomic markers that might help to assess prognosis in clinical practice. Two models, one containing only significant known prognostic markers and the other containing the first model plus significant proteomic markers, were compared by estimating the area under the receiver operating characteristic (ROC) curve (AUC) for EFS. R-squared and AUC values, corrected for overfitting using 100 bootstrap resamples, were reported.
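The exact bootstrap scheme is not specified in the text. One common way to correct an apparent AUC for overfitting, shown here only as an assumed illustration, is the optimism bootstrap: the model is refitted on each of B bootstrap resamples, and the average difference between its AUC on the resample and its AUC on the original data is subtracted from the apparent AUC. The symbols below are introduced for this sketch.

```latex
\mathrm{AUC}_{\text{corrected}}
  = \mathrm{AUC}_{\text{apparent}}
  - \frac{1}{B}\sum_{b=1}^{B}
    \left( \mathrm{AUC}^{(b)}_{\text{boot}} - \mathrm{AUC}^{(b)}_{\text{orig}} \right),
  \qquad B = 100
```

Here AUC_boot^(b) is the AUC of the model fitted on resample b and evaluated on that same resample, and AUC_orig^(b) is the AUC of that same model evaluated on the original cohort.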
Validation analysis. As standard regression models perform poorly when the number of variables exceeds the number of samples, we additionally applied the least absolute shrinkage and selection operator (Lasso) technique for variable selection [2]. Lasso performs two main tasks: regularization and feature selection. First, a shrinking (regularization) process penalizes the coefficients of the regression variables, shrinking some of them to zero. Then, in the feature selection step, the variables that still have a non-zero coefficient after shrinking are selected to be part of the model. The optimal tuning parameter λ, which controls the strength of the penalty, was obtained by 5-fold cross-validation.
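The shrinkage mechanism can be illustrated with a small coordinate-descent Lasso for ordinary linear regression. This is a deliberate simplification: the outcome in this study is time to event, so the penalized model actually fitted would differ, and the cross-validation step for choosing λ is omitted. All names and data below are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1 / 2n) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    Assumes y and the columns of X are centered (no intercept is fitted).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that ignores column j
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
    return beta


# Hypothetical data: the outcome depends strongly on column 0, weakly on column 1
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.1, -1.9, 1.9, -2.1]
selected = lasso_coordinate_descent(X, y, lam=0.5)
unpenalized = lasso_coordinate_descent(X, y, lam=0.0)
```

In this toy example the penalty removes the weak second variable entirely (its coefficient becomes exactly zero) while the strong first variable is retained with a shrunken coefficient, which is the selection behavior described above.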

Supplementary Results
Validation analysis by means of Lasso regression. In total 8 proteomic markers were selected in association with EFS in Lasso analysis (Supplementary Table E6). Interestingly, five proteomic markers suggested in our Cox regression models (LY9, SPINT1, ITM2A, IFNLR1, and CLEC7A) were among the selected variables, thus supporting the validity of these markers as possible prognostic markers.
Supplementary Figure E1. Kaplan-Meier curves for EFS related to well-known CLL prognostic factors