Comparison of assessment of programmed death-ligand 1 (PD-L1) status in triple-negative breast cancer biopsies and surgical specimens
  1. Aurelia Noske1,
  2. Katja Steiger1,
  3. Simone Ballke1,
  4. Marion Kiechle2,
  5. Dirk Oettler3,
  6. Wilfried Roth4,
  7. Wilko Weichert1
  1. 1Institute of Pathology, School of Medicine, Technical University of Munich, Munich, Germany
  2. 2Department of Gynaecology and Obstetrics, Technical University of Munich, Munich, Germany
  3. 3Medical affairs, MSD Sharp & Dohme GmbH, Haar, Germany
  4. 4Institute of Pathology, Johannes Gutenberg University, Mainz, Germany
  1. Correspondence to Aurelia Noske, Technical University of Munich, Munich, Germany; aurelia.noske{at}


Aims Programmed death-ligand 1 (PD-L1) status in triple-negative breast cancer (TNBC) is important for immune checkpoint inhibitor therapies but may vary between different immunohistochemical assays, scorings and the type of specimen used for analysis.

Methods We compared the analytical concordance of three clinically relevant PD-L1 assays (VENTANA SP142, VENTANA SP263 and DAKO 22C3 pharmDx) assessing immune cell score (IC), tumour proportion score and combined positive score (CPS) in preoperative biopsies and resection specimens of primary TNBC. PD-L1 expression was scored on virtual whole slide images and compared with expression data from corresponding surgical specimens.

Results The mean PD-L1 positivity in TNBC biopsies defined as IC ≥1% and CPS ≥1 ranged between 11% and 61% with the lowest positivity for SP142 and highest for SP263. The corresponding surgical specimens showed overall higher positivity rates (53%–75%). When comparing biopsies with surgical specimens, the agreement for PD-L1 positivity with SP263 and 22C3 at IC score ≥1% and CPS ≥1 was fair (kappa 0.47–0.52) and poor for SP142 (kappa 0.15–0.19). Using CPS ≥10 cut-off, the agreement for SP263 was excellent (kappa 0.751) but poor for 22C3 (kappa 0.261). Spearman correlation coefficients ranged between 0.489 and 0.75 indicating a generally moderate to strong correlation between biopsies and surgical specimens for all assays and scores.

Conclusions We demonstrate high accordance between biopsies and surgical specimens for SP263 and 22C3 scoring but less for SP142. Generally, biopsies are suitable for PD-L1 testing in TNBC but the appropriate assay, scoring and cut-off must be considered.

  • Biomarkers, Tumor

Data availability statement

The data that support the findings of this study are available from the corresponding author (AN) on request.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


  • Programmed death-ligand 1 (PD-L1) status in triple-negative breast cancer (TNBC) varies between different immunohistochemical assays, scorings and the type of specimen used for analysis.


  • We demonstrate high concordance between TNBC biopsies and corresponding surgical specimens for PD-L1 assay SP263 and 22C3 scoring but less for SP142.


  • Biopsies are suitable for PD-L1 testing in TNBC but the appropriate assay, scoring and cut-off must be considered.


Triple-negative breast cancer (TNBC), lacking oestrogen and progesterone receptor expression and human epidermal growth factor receptor 2 (HER2) overexpression/amplification, is a heterogeneous disease with a diversity of histological subtypes and biological behaviour. A subset of TNBC with high tumour grade and proliferation is characterised by an aggressive course with high risk of recurrence. Some of these tumours have increased levels of immune cells (ICs) as well as programmed death-ligand 1 (PD-L1)-positive immune and tumour cells which may influence response to immune checkpoint inhibitors (ICIs).1 2

Pembrolizumab, a PD-1 inhibitor, in combination with chemotherapy was approved for patients with unresectable or metastatic PD-L1 positive (defined as Combined Positive Score, CPS ≥10) TNBC based on the KN355 trial.3 The approval has been extended to high-risk early-stage TNBC in combination with chemotherapy as neoadjuvant treatment, and then continued as a single agent as adjuvant therapy after surgery based on the KN522 trial without biomarker restrictions.4

Atezolizumab, a PD-L1 inhibitor, in combination with chemotherapy showed a benefit for patients with advanced and early TNBC in clinical trials.5–7 However, the indication of this drug in TNBC was withdrawn by the company.8 9 In contrast to advanced TNBC, the efficacy of ICI is formally independent of the PD-L1 status in early TNBC based on data from pivotal clinical trials.4 7 Since immunotherapy is now moving into the curative setting of early-stage TNBC, predictive biomarkers for response are urgently required in these treatment settings as well.

The assessment of PD-L1 for the selection of patients with metastatic TNBC eligible for ICI therapies is recommended by international and national guidelines.10 11 It is well known that the analysis of PD-L1 is still challenging due to different immunohistochemical assays, platforms, scoring criteria as well as the type and origin of the specimen.12 13

The last of these factors, the type and origin of the specimen, has not been addressed sufficiently in TNBC until now. PD-L1 positivity rates may differ between small tissue samples (biopsies) and surgical specimens as well as between primary breast cancer (BC) samples and metastases.13 14 We therefore designed this study to investigate the prevalence of PD-L1 positivity with three clinically relevant immunohistochemical assays in primary TNBC biopsies. We compared the data with the PD-L1 status of the corresponding surgical specimens to test for the concordance between different types of tissue acquisition.12

Materials and methods

Study population

This study was designed to assess the PD-L1 expression with three immunohistochemical assays in primary TNBC biopsies and to probe for the comparability of PD-L1 expression between biopsies and corresponding resection specimens. Archival, formalin-fixed, paraffin-embedded (FFPE) resection specimens (n=104) from the Institute of Pathology, Technical University of Munich (TUM), Germany were enrolled as described previously.15 The corresponding preoperative core needle biopsies were identified in the laboratory information system of the same institute. The search revealed 56 matching cases. Of these, 37 FFPE biopsies were collected from the archive. All samples were negative for hormone receptors and HER2 according to American Society of Clinical Oncology/College of American Pathologists (ASCO/CAP) guidelines.16 17 Clinical data were reported previously; patient-specific datasets were not necessary in the context of this study. Outcome data of patients were not available. Tissue processing and use was coordinated within the framework of the Klinikum rechts der Isar/TUM tissue biobank (subject to strict legal and ethical regulations). The investigation complied with the current laws of Germany where the investigation was performed.

PD-L1 IHC assays

Immunohistochemistry was conducted with three PD-L1 antibodies on two different staining platforms. The VENTANA SP142 (Roche Diagnostics, Mannheim, Germany) and the VENTANA SP263 assay (Roche Diagnostics) were used on the VENTANA Benchmark Ultra platform at TUM. PD-L1 IHC 22C3 pharmDx assay (Agilent Technologies, Waldbronn, Germany) was run on a DAKO Autostainer Link 48 at the Institute of Pathology, University Medical Centre Mainz (Germany). All assays are referred to hereafter by the clone of the antibody used.

Evaluation of PD-L1 staining and scoring

All tissue samples were available as whole tissue sections. PD-L1 stained slides and corresponding HE stains were digitised (Leica Aperio AT2, TUM) and stored in a database (Aperio eSlide Manager). Virtual evaluation was done by a board-certified pathologist (AN) with experience in PD-L1 assessment.12 15 18 Access to the slides was randomised and blinded for patient and assay information on the digital platform. PD-L1 expression was scored for IC positivity as the percentage of invasive tumour area covered by stained ICs (defined as staining in granulocytes, lymphocytes, macrophages and dendritic cells of any intensity).19 The tumour proportion score (TPS) was evaluated as the percentage of viable tumour cells (TC) in the tumour area showing partial or complete membranous PD-L1 staining of any intensity. The CPS was calculated by summing the number of PD-L1 stained cells (TC, IC), dividing the sum by the total number of viable tumour cells and multiplying by 100.20
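The TPS and CPS definitions above reduce to simple arithmetic on cell counts. The following is a minimal Python sketch, not the scoring workflow used in the study: the function names and the example counts are hypothetical, and capping the CPS at 100 is an assumption based on common practice that the text does not state. The IC score is an area percentage estimated by the pathologist and is therefore not derived from counts here.

```python
def tumour_proportion_score(stained_tc: int, viable_tc: int) -> float:
    """TPS: percentage of viable tumour cells with membranous PD-L1 staining."""
    if viable_tc <= 0:
        raise ValueError("at least one viable tumour cell is required")
    return 100.0 * stained_tc / viable_tc


def combined_positive_score(stained_tc: int, stained_ic: int, viable_tc: int,
                            cap: float = 100.0) -> float:
    """CPS: stained tumour plus immune cells per 100 viable tumour cells.

    The cap at 100 is an assumption (not stated in the text).
    """
    if viable_tc <= 0:
        raise ValueError("at least one viable tumour cell is required")
    raw = 100.0 * (stained_tc + stained_ic) / viable_tc
    return min(raw, cap)


def is_positive(score: float, cutoff: float) -> bool:
    """A case is called positive when the score reaches the cut-off."""
    return score >= cutoff


# Hypothetical counts for one case (illustrative only, not study data).
tps = tumour_proportion_score(stained_tc=12, viable_tc=400)
cps = combined_positive_score(stained_tc=12, stained_ic=20, viable_tc=400)
print(f"TPS {tps:.1f}%, positive at >=1%: {is_positive(tps, 1)}")
print(f"CPS {cps:.1f}, positive at >=1: {is_positive(cps, 1)}, "
      f"at >=10: {is_positive(cps, 10)}")
```

With these illustrative counts the same case is positive at CPS ≥1 but negative at CPS ≥10, which is why the cut-off has to be fixed before two specimen types are compared.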

Evaluation of tumour infiltrating lymphocytes and tumour stroma

Tumour infiltrating lymphocytes (TILs) and tumour stroma were evaluated in biopsies and surgical specimens of HE stained, digitised slides.21 The composition of tumour stroma was investigated by assessing the stroma amount and cellularity. Stroma amount was classified as high (predominant stroma with low cellularity of the epithelial tumour part) and low (high cellularity of the epithelial compartment with less stroma). The cellularity of the stroma was categorised into low (predominant fibrotic stroma with low cellularity) and high (highly cellular stroma, fibroblastic enriched).


Cohen’s kappa was used to quantify agreement of categorical measurements. Kappa values were interpreted according to the guideline of Cicchetti.22 Spearman correlation coefficients were calculated for the comparison of continuous measurements. The relation between biopsies and resection specimens for each assay and score was also determined by cross tables and Pearson χ2 tests. Statistical hypothesis testing was performed on exploratory, two-sided 5% significance levels. Exact 95% CIs were computed for relative frequencies. All analyses were performed by IBM SPSS statistics V. (142).
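For readers who want to reproduce the agreement statistics outside SPSS, Cohen's kappa and the overall concordance can be computed directly from a 2×2 cross table of biopsy versus resection status. The sketch below is a minimal Python illustration under stated assumptions: the cell counts are hypothetical (not taken from the study tables), the function names are invented for this example, and the verbal labels follow the commonly cited Cicchetti bands (below 0.40 poor, 0.40–0.59 fair, 0.60–0.74 good, 0.75 and above excellent).

```python
def cohens_kappa(both_pos: int, biopsy_only: int,
                 resection_only: int, both_neg: int) -> float:
    """Cohen's kappa for paired binary calls (biopsy vs resection)."""
    n = both_pos + biopsy_only + resection_only + both_neg
    if n == 0:
        raise ValueError("empty table")
    observed = (both_pos + both_neg) / n            # overall concordance
    p_biopsy = (both_pos + biopsy_only) / n         # biopsy positivity rate
    p_resection = (both_pos + resection_only) / n   # resection positivity rate
    expected = p_biopsy * p_resection + (1 - p_biopsy) * (1 - p_resection)
    if expected == 1.0:
        return float("nan")  # both specimen types uniformly one category
    return (observed - expected) / (1 - expected)


def cicchetti_label(kappa: float) -> str:
    """Verbal interpretation using the Cicchetti bands."""
    if kappa < 0.40:
        return "poor"
    if kappa < 0.60:
        return "fair"
    if kappa < 0.75:
        return "good"
    return "excellent"


# Hypothetical 2x2 table of 30 biopsy/resection pairs (illustrative only).
table = dict(both_pos=10, biopsy_only=2, resection_only=4, both_neg=14)
kappa = cohens_kappa(**table)
concordance = (table["both_pos"] + table["both_neg"]) / sum(table.values())
print(f"overall concordance {concordance:.0%}, "
      f"kappa {kappa:.3f} ({cicchetti_label(kappa)})")
```

Kappa corrects the overall concordance for agreement expected by chance, so a pair of specimen types can show high raw concordance and still have a modest kappa when one category dominates.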


Prevalence of PD-L1 expression in TNBC biopsies

PD-L1 staining was available in 36 biopsies for SP142 and SP263, and 34 biopsies for 22C3. In the remaining cases, PD-L1 expression was not assessable with the necessary precision due to a lack of tissue or invasive cancer. The PD-L1 positivity at IC score ≥1% was 11.1% for SP142, 38.2% for 22C3 and 61.1% for SP263. TPS ≥1% positivity rates were 14.7% for 22C3 and 36.1% for SP263 while the TPS was negative in all biopsies for SP142. PD-L1-positivity rate according to CPS ≥1 was 11.1% for SP142, 32.4% for 22C3 and 61.1% for SP263. The PD-L1 positivity at CPS ≥10 was 23.5% for 22C3, and 50% for SP263, but 0% for SP142.
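With 34 to 36 biopsies per assay, these positivity rates carry wide uncertainty, which an exact (Clopper-Pearson) 95% CI makes visible. The following Python sketch is one way to compute such an interval with the standard library only, by bisection on the binomial tail probabilities. It is an illustration rather than the SPSS procedure used in the study, the function names are invented, and the counts 4/36, 13/34 and 22/36 are back-calculated from the reported IC ≥1% percentages and should be read as an assumption.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def exact_ci(k: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Clopper-Pearson interval for k positives out of n cases."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n and n > 0")

    def solve(g) -> float:
        # g is increasing in p on (0, 1); find its root by bisection.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if g(mid) < 0:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit: P(X >= k | p) = alpha/2; upper limit: P(X <= k | p) = alpha/2.
    lower = 0.0 if k == 0 else solve(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    upper = 1.0 if k == n else solve(
        lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper


# Counts back-calculated from the reported IC >=1% rates (an assumption).
for assay, k, n in [("SP142", 4, 36), ("22C3", 13, 34), ("SP263", 22, 36)]:
    lo, hi = exact_ci(k, n)
    print(f"{assay}: {k}/{n} = {k / n:.1%} (95% CI {lo:.1%} to {hi:.1%})")
```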

Comparison of biopsies and surgical resections

To investigate the relation of the PD-L1 positivity at predefined cut-offs between biopsies and corresponding surgical specimens, we used contingency tables and calculated Pearson’s χ2 tests for each assay and scoring method. For this comparison, we used matching PD-L1 expression data of recently evaluated surgical specimens.12 In some cases, no matching PD-L1 status was available due to technical reasons. In total, there were 29 sample pairs for 22C3 staining, 33 for SP142 and 32 for SP263. We observed a significant relationship of PD-L1 positivity for the 22C3 and SP263 assay for each scoring method (IC ≥1%, TPS ≥1%, CPS ≥1). PD-L1 positivity defined by the CPS ≥10 cut-off, which is predictive in advanced TNBC according to the KN355 trial, showed a significant association between biopsies and surgical specimens only for the SP263 assay. No significant association was found for SP142. Data are given in table 1. The overall concordance between biopsies and surgical specimens at IC score ≥1% was 72% for 22C3, 54% for SP142 and 78% for SP263. The overall concordance at CPS ≥1 and CPS ≥10 for 22C3 was 75% at both cut-offs, for SP142 60% at the low cut-off and not available at the higher level, and for SP263 78% at the low and 87% at the high cut-off level.
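The association test used here can be written out for a single 2×2 cross table. The sketch below is a minimal Python illustration of Pearson's χ2 test with one degree of freedom and no continuity correction. The cell counts are hypothetical, the function name is invented for this example, and the p value uses the identity P(χ2 > x) = erfc(√(x/2)), which holds for one degree of freedom.

```python
from math import erfc, sqrt


def pearson_chi2_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square statistic and two-sided p value for the table

                      resection +   resection -
        biopsy +          a             b
        biopsy -          c             d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        # For example, no positive biopsies at all: the test is undefined.
        raise ValueError("a margin is zero; chi-square is undefined")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = erfc(sqrt(chi2 / 2))  # survival function for 1 degree of freedom
    return chi2, p_value


# Hypothetical table of 30 biopsy/resection pairs (illustrative only).
chi2, p = pearson_chi2_2x2(a=10, b=2, c=4, d=14)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}, significant at 5%: {p < 0.05}")
```

With around 30 pairs per assay, expected cell counts can fall below 5, in which case an exact test is often preferred. A significant χ2 result also only indicates association, not agreement, which is why the kappa statistics are reported separately.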

Table 1

Association of PD-L1 positivity between biopsies and surgical specimens for each assay and scoring method using χ2 tests

Next, we determined Cohen’s kappa coefficients to assess the variability between biopsies and surgical resections when clinically defined PD-L1 cut-offs were applied (table 2). The agreement for SP263 and 22C3 at low cut-off levels and all scoring methods (IC score ≥1%, TPS ≥1% and CPS ≥1) was fair (kappa 0.47–0.52). At CPS ≥10, the agreement for SP263 was excellent (kappa 0.751) but poor for 22C3 (kappa 0.261). Finally, the agreement for SP142 at IC score ≥1% and CPS ≥1 was poor (kappa 0.15–0.19).

Table 2

Agreement of PD-L1 positivity between biopsies and surgical specimens for each assay and score using Cohen’s kappa statistics

Next, we tested the strength of the relationship by correlating the raw scores with Spearman’s correlation (table 3). The coefficients ranged between 0.489 and 0.75, indicating a moderate to strong correlation of PD-L1 expression between biopsies and surgical specimens for all assays and scores. The association of biopsies and surgical resections is illustrated in stacked bar charts for each assay and score (at cut-offs as mentioned above) as well as for each assay and case in parallel plots (figures 1–3).
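Spearman's correlation on raw scores is the Pearson correlation of their ranks, with tied values given their average rank. Ties matter here because many cases score 0. The sketch below is a minimal standard-library Python illustration: the paired CPS values are hypothetical (not study data) and the function names are invented for this example.

```python
from math import sqrt


def average_ranks(values: list[float]) -> list[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def spearman_rho(x: list[float], y: list[float]) -> float:
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical paired CPS values for seven cases (illustrative only).
biopsy_cps = [0, 0, 1, 5, 10, 20, 40]
resection_cps = [0, 2, 1, 10, 5, 30, 60]
print(f"Spearman rho = {spearman_rho(biopsy_cps, resection_cps):.3f}")
```

A high rank correlation does not guarantee agreement at a cut-off: scores can rise and fall together while sitting on opposite sides of CPS 1 or CPS 10, which is consistent with correlation being reported alongside kappa rather than instead of it.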

Table 3

Correlation of PD-L1 expression between biopsies and surgical specimens for each assay and score

Figure 1

Comparison of biopsies and corresponding surgical resections using IC score cut-off ≥1% for each PD-L1 assay ((A) 22C3, (B) SP142, (C) SP263) and for each case and assay ((D) 22C3, (E) SP142, (F) SP263). IC, immune cell.

Figure 2

Comparison of biopsies and surgical specimens using TPS cut-off ≥1% (A, B) and for each case (C, D) by 22C3 (A, C) and SP263 (B, D). TPS, Tumour Proportion Score.

Figure 3

Comparison of biopsies and surgical resections using CPS cut-off ≥1 for each PD-L1 assay ((A) 22C3, (B) SP142, (C) SP263) and for each case and assay ((D) 22C3, (E) SP142, (F) SP263). CPS, Combined Positive Score.

Impact of the number of cores on the accuracy of PD-L1 status

We hypothesised that a higher number of cores may enhance the concordance of the PD-L1 positivity/negativity between biopsies and resection specimens. We therefore documented the number of biopsies per case, which ranged from one to six (median 3). The most common number was three biopsies (37.1% of cases). In 60% of the cases, up to three biopsies were evaluable and 40% had four or more evaluable biopsies. We tested our hypothesis by χ2 tests. We observed significant relations between biopsies and resection specimens for SP263 at each score when one to three biopsies were available (IC 1% and CPS 1, respectively, p=0.001, TPS 1% p=0.003) but not in those cases with four or more biopsies (IC 1% and CPS 1, respectively, p=0.48, TPS 1% p=0.33). For 22C3, a significant association between biopsies and resection specimens was found in cases with three or fewer biopsies for TPS 1% (p=0.009) and CPS 1 (p=0.013) but not in cases with four or more biopsies (p>0.05). Only for IC score 1% was PD-L1 positivity significantly associated between biopsies and resection specimens in cases with four or more biopsies (p=0.003, in contrast to p=0.383 in cases with fewer biopsies). No significant differences were seen for SP142. Spearman correlation was also higher in cases with three or fewer biopsies as compared with cases with four or more biopsies (table 4).

Table 4

Correlation of PD-L1 expression between biopsies (0–3 vs ≥4) and surgical specimens for each assay and score

Association of TILs with PD-L1 positivity

TILs were measured as percentage of ICs in stromal tissue within the tumour and assessed as a continuous parameter. In biopsies (n=35), the number of TILs ranged from 0% to 60% (median 10%). In resection specimens (n=35), TILs ranged from 0% to 50% (median 20%). We compared the number of TILs in biopsies with surgical specimens and found a moderate correlation (Spearman’s correlation coefficient: 0.524; p=0.002). We determined Cohen’s kappa coefficient and observed poor agreement at the cut-off of 20% (kappa 0.264). At the cut-off <20% vs >20%, TILs were significantly associated with PD-L1 expression (IC score ≥1% and CPS ≥1) for each PD-L1 assay (χ2 tests, p<0.008).

Impact of tumour stroma on PD-L1 positivity

The amount of tumour stroma had no impact on PD-L1 status. Low stroma cellularity in resection specimens was significantly associated with increased TILs and with PD-L1 positivity for SP142 and SP263 at IC ≥1% and for SP263 at CPS ≥1. This association was not seen in biopsies (data not shown).


Our comparison of PD-L1 expression in primary TNBC biopsies with corresponding surgical specimens revealed a lower prevalence of PD-L1 positivity in biopsies than in surgical excisions. For SP142 at IC score 1%, we found a positivity rate of 11% in biopsies as compared with 53% in resection specimens.12 Recently, a positivity rate of 30% in TNBC core biopsies and 52% in matching resection tissues for SP142 (IC 1%) was reported.23 Higher positivity rates of 61% (IC score 1%, CPS 1) were observed with SP263 in biopsies, which is more in line with the prevalence of 75% in surgical specimens. For 22C3, the positivity rate in biopsies ranged between 32% and 38% at IC score 1% and CPS 1, which is higher than the 25.7% (IC 1%) reported for TNBC biopsies in another study but lower than the 53% in the matching resection specimens.24 Applying CPS 10 to biopsies resulted in a further decrease of PD-L1 positive cases. At this cut-off, only the SP263 assay achieved agreement for positivity between biopsy and surgical specimen. The lower prevalence in biopsies might be explained by the smaller amount of tumour available for evaluation. Specifically in biomarker scenarios like PD-L1, where categorisation of positivity in some instances depends on only a few positive cells in one tumour area, biopsies might underestimate the true positivity rate of cases. Interestingly, the percentage of cases evaluated as positive in the biopsy setting was higher with antibodies that are known to stain more strongly than average (eg, SP263). This is potentially explained by the fact that, with these antibodies, even single positive ICs, which in the biopsy setting might already be enough to categorise a case as positive, are well detectable.

Comparing both sample types (biopsies and resection specimens), we observed concordance for 22C3 and SP263 but barely any for SP142. The SP142 assay generally shows a lower PD-L1 detection rate in TNBC as compared with other PD-L1 assays.12 13 15 25 Our findings indicate that the lower positivity is even more pronounced in biopsies, which implies that this assay should potentially only be applied to surgical specimens. Since the SP263 assay identifies a higher PD-L1 positivity due to the enhanced staining of both immune and tumour cells,26 the good overlap between biopsies and surgical excisions is not entirely surprising. However, in our previous comparison studies, the SP263 assay was not comparable with other clinically relevant PD-L1 assays in TNBC surgical specimens,12 15 so this antibody should be used with caution. The 22C3 assay showed comparable PD-L1 positivity in biopsies and surgical specimens at low cut-offs for all scoring methods. However, at CPS ≥10, a clinically relevant cut-off for advanced TNBC, there was less agreement between both sample types. Overall, most antibodies show a good correlation for PD-L1 positivity between biopsy and resection specimen in almost all scoring scenarios, implying that both tissue types can in principle be used for PD-L1 status evaluation, with some caveats for the biopsy setting discussed above.

Our study demonstrates that differences in PD-L1 prevalence can be explained not only by immunohistochemical assays and scoring methods but might also be induced by the type of biomaterial used. Here, we investigated the whole biopsy tissue and compared it with whole tissue sections of primary tumour resection specimens of untreated patients. Biopsies were of good quality with sufficient tumour content. On average, there were at least three core needle biopsies available. Interestingly, cases with three or fewer biopsies showed a good comparability of PD-L1 positivity between both tissue types. Four or more biopsies did not enhance the comparability. It is likely that the tissue quality and tumour content are important rather than the sheer number of biopsy fragments. We are aware that our analysis is limited by the small number of cases and paired samples. However, studies comparing different types of tissue samples for PD-L1 testing in TNBC are rare and our data suggest that tumour tissue of biopsies might be appropriate for PD-L1 assessment. In a recent study, a good correlation between PD-L1 positive (SP142 IC 1%) TNBC core biopsies and excision samples was demonstrated, but there was discrepancy between PD-L1 negative core biopsies and matched resections, with a third of cases converting to PD-L1 positivity.23 In another comparative study of lung biopsies and corresponding resected tumours, the authors reported a poor association of the PD-L1 expression using the SP142 test and pointed out that biopsies can be misleading.27

Clinical trials use a mixture of tissue samples, usually featuring both biopsies and resection specimens of different locations, primary or metastatic sites and pretreated patients.14 28 Interestingly, PD-L1 (SP142 IC>1%) predicted benefit of atezolizumab plus nab-paclitaxel regardless of the tissue source in the IMpassion130 trial.5 14 Although PD-L1 expression is less common in metastases as compared with primary BC,28 29 in this trial, PD-L1 status was obtained from biopsies and resection specimens of primary BC, biopsies of recurrent BC and metastases of different organs. Since immunotherapy is moving forward into the curative setting, biopsies might become more important for biomarker testing. Until now, a predictive value of PD-L1 has not been clinically established in early-stage neoadjuvant and adjuvant treatment scenarios, but further clinical trials with ICI in BC are ongoing.

Besides PD-L1, the evaluation of the tumour microenvironment (TME) in biopsies can give more information about the immunogenic status of BC. The TME is variable and surrounds epithelial tumour cells. It consists of extracellular matrix and different types of stromal cells (mesenchymal cells and inflammatory cells/ICs). Increased tumour infiltrating lymphocytes (TILs) in TNBC are prognostic and believed to be predictive of the benefit of ICI therapies.30 We demonstrate that increased TILs are significantly related to PD-L1 positivity, which is in line with other studies.2 18 28 TILs can be easily evaluated on HE stained slides without any additional procedures and integrated in pathological reports. Therefore, TILs may serve as a first guide in decision making for TNBC treatment.30 During tumour progression and probably due to anticancer therapies, TNBC becomes less immunogenic, which is also demonstrated by the lower PD-L1 positivity in metastatic TNBC samples.14 28 29

In conclusion, we show that the PD-L1 positivity rates and concordances between biopsies and resection specimens are dependent on antibody clones and scoring algorithms. However, correlation for most assays and scoring methods was high, indicating that assessment on biopsies quite reliably reflects the overall tumour scores. Our data suggest, when all factors are considered, that the 22C3 clone might be the first choice for PD-L1 scoring in BC.

Ethics statements

Patient consent for publication


We would like to thank Olga Seelbach and Marion Mielke from the Comparative Experimental Pathology team at the Technical University of Munich and Anja Menges from the University Medical Centre of the Johannes Gutenberg University Mainz, for their technical support.



  • Handling editor Vikram Deshpande.

  • Contributors WW, DO and AN were involved in the conception of the study. WW, AN, DO and WR contributed to the study design. The first draft of the manuscript was written by AN and WW. AN acts as guarantor. All authors were involved in material preparation, data collection and analysis, and in reviewing and editing the manuscript content, as well as approving the final manuscript for submission.

  • Funding The work was supported by a grant from Deutsche Krebshilfe (#70113450 to WW, Integrate-TN).

  • Competing interests KS has received funding from Roche Pharma AG. MK has received remuneration from Springer Press, Biermann Press, Celgene, AstraZeneca, Myriad Genetics and Teva, received consultancy or advisory fees from Myriad Genetics, KVB, DKMS LIFE, BLÄK and TEVA, holds stock in Therawis Diagnostics GmbH and AIM GmbH and received funding from Sphingotec, Deutsche Krebshilfe, DFG, BMBF, the Senator Roesner Foundation and the Dr Pommer-Jung Foundation. DO is an employee of MSD Sharp & Dohme GmbH. WR has received funding from Roche Pharma AG, has attended Advisory Boards and served as a speaker for Roche, MSD, Novartis. WW has attended Advisory Boards, served as speaker for Roche, MSD, BMS, AstraZeneca, Pfizer, Merck, Lilly, Boehringer, Novartis, Takeda, Bayer, Amgen, Astellas, Illumina, Siemens, Agilent and Molecular Health and receives research funding from Roche, MSD, BMS and AstraZeneca. The remaining authors (AN and SB) have no conflict of interest.

  • Provenance and peer review Not commissioned; externally peer reviewed.