Validity of whole slide images for scoring HER2 chromogenic in situ hybridisation in breast cancer
Shaimaa Al-Janabi (1), Anja Horstman (1), Henk-Jan van Slooten (1), Chantal Kuijpers (1), Clifton Lai-A-Fat (1), Paul J van Diest (2), Mehdi Jiwa (1,2)

(1) Symbiant Pathology Expert Center, Alkmaar, The Netherlands
(2) Department of Pathology, University Medical Center Utrecht, Utrecht, The Netherlands

Correspondence to Dr Mehdi Jiwa, Department of Pathology, Alkmaar Medical Center, Symbiant Pathology Expert Centre, PO Box 501, Alkmaar 1815 JD, The Netherlands; m.jiwa{at}symbiant.nl

Abstract

Aim Whole slide images (WSIs) have stimulated a paradigm shift from conventional to digital pathology in several applications within pathology. Because WSIs have not yet been approved for primary diagnostics, validating their use for different diagnostic purposes is still mandatory. The aim of this study was to test the validity of WSIs in assessing human epidermal growth factor receptor 2 (HER2) status in breast cancer specimens using chromogenic in situ hybridisation (CISH).

Materials and methods Ninety-six HER2 CISH slides were scored by two observers on a light microscope (400× viewing magnification) and on WSI (40× scanning magnification, one focus plane) with a washout period of at least 6 weeks. The concordance between digital and microscopic HER2 scores was assessed.

Results Digitally, 93/96 cases could be assessed (96.8%). Microscopic and digital evaluation of HER2 amplification status were concordant in 68/93 cases (73.1%, 95% CI 0.639 to 0.823; κ 0.588). CISH underscoring was most noticeable in the amplified and equivocal categories, while the highest level of concordance was seen in cases with a normal copy number. Additionally, there was a noticeable tendency to underestimate the average HER2 scores on WSI: scores were lower in 59 cases and higher in 11. There was no major difference in time spent on microscopic scoring (86.9 s) and digital scoring (81.7 s).

Conclusions There was a reasonable concordance between microscopic scoring and WSI-based scoring of HER2 copy number of CISH slides. Nevertheless, WSIs scanned on a single focal plane are insufficient to assess HER2 gene amplification status by scoring CISH due to the noticeable tendency towards digitally underestimating the number of HER2 spots. Scanning at multiple focus planes may offer better resolution for improved digital CISH spot counting.

  • DIGITAL PATHOLOGY
  • BREAST CANCER
  • BREAST PATHOLOGY
  • CANCER

Introduction

Whole slide imaging is the process of scanning glass slides and converting them into a digital form commonly known as digital slides or whole slide images (WSIs).1 WSIs are usually explored with image viewers, which allow the entire tissue section to be navigated in any direction and at any magnification. Image viewers offer added benefits such as ease of accessing, sharing, annotating, storing and retrieving images, making WSIs in some respects more convenient than a glass slide and a conventional microscope. As a result, WSIs have stimulated a paradigm shift from a conventional to a digital mode in several applications within pathology, particularly in teaching, teleconsultation, clinicopathological conferences, frozen section diagnosis and research. However, the use of WSIs for upfront diagnostics is still uncommon, possibly because in the USA WSIs have not yet been approved for this purpose by the Food and Drug Administration (FDA). Several validation studies evaluating the efficiency of WSIs for primary diagnostics of different pathology specimens have shown good concordance between digital and conventional diagnoses.2–9 Most of the available studies have assessed the validity of WSIs for H&E stained tissue sections,10 whereas current pathology practice relies not only on H&E stained tissue sections but also on immunostains and additional molecular techniques.

In breast cancer it is of great importance to assess the status of specific genes or receptors, as they may influence the patient's prognosis and response to therapy.11 Human epidermal growth factor receptor 2 (HER2) is a transmembrane glycoprotein receptor with tyrosine kinase activity which has been shown to be overexpressed in 10–20% of breast cancer cases.12,13 This protein is encoded by a gene located on chromosome 17 commonly called ERBB2 or HER2/neu.14–16 Assessing HER2 gene amplification and/or protein overexpression at diagnosis is recommended for all patients with breast cancer, especially because a positive HER2 status is commonly associated with a poor prognosis, resistance to conventional chemotherapy17–19 and response to treatment with the recombinant humanised monoclonal anti-HER2 antibody trastuzumab.

HER2 protein overexpression is usually determined by immunohistochemistry, whereas HER2 gene amplification at the DNA level is usually assessed with one of the following tests: fluorescent in situ hybridisation (FISH), chromogenic in situ hybridisation (CISH) or multiplex ligation-dependent probe amplification (MLPA).20,21

CISH is a morphological test that allows the evaluation of the HER2 gene by assessing small nuclear signals within tumour cells using a glass slide and bright field microscopy. The ease of scoring, image sharing and documentation of annotations on WSI encouraged us to start a study aimed at evaluating the feasibility of using high-resolution WSI in routine assessment of HER2 status using CISH.

Materials and methods

Ethics statement

Since we used archival pathology material, which does not interfere with patient care, no ethical approval is required according to Dutch legislation (the Medical Research Involving Human Subjects Act (Wet medisch-wetenschappelijk onderzoek met mensen, WMO)).22 Using anonymous left-over material for scientific purposes is part of the standard treatment contract with patients, and therefore an informed consent procedure was not required according to our institutional medical ethical review board. This has also been documented by van Diest et al.23

Additionally, we assume that our study is exempt from the Federal Regulations under the following provision: Exemption 4 includes research involving the collection or study of existing data, documents, records, pathological specimens or diagnostic specimens, if these sources are publicly available or if the information is recorded by the investigator in such a manner that subjects cannot be identified, directly or through identifiers linked to the subjects (Federal Guidelines).

Patients and specimens

Ninety-six randomly selected breast cancer cases (25 biopsies and 71 resections) on which CISH had previously been assessed at the Symbiant Pathology Expert Center in The Netherlands were included in this study. These comprised 89 cases originating from primary tumours, 6 from metastatic breast cancer and 1 from a recurrence of previously excised breast cancer. Table 1 shows the tumour and specimen types of the cases selected for HER2 evaluation.

Table 1

Overview of tumour and specimen type for 96 breast cancer cases subjected to CISH scoring by light microscopy and on WSI

The CISH assay was performed using the ZytoDot SPEC HER2 Probe Kit (ZytoVision, Bremerhaven, Germany) according to the manufacturer's instructions. The enzymatic reaction in this test yields prominent brown nuclear signals which can be easily visualised under a microscope at 40× magnification. No correction for chromosome 17 copy number was made, as true polysomy 17 is now believed to be a very uncommon event in breast cancer.24

HER2 amplification was assessed in at least 30 cells in the invasive part of the tumour. Only nuclei with distinct nuclear borders were evaluated; areas with necrosis or overlapping nuclei were excluded. Samples were categorised according to the 2013 update of the American Society of Clinical Oncology/College of American Pathologists (ASCO/CAP) guidelines. Samples with an average of <4 spots/nucleus were considered non-amplified (normal), samples with an average of ≥4 and <6 spots/nucleus were considered equivocal, and samples with ≥6 spots/nucleus were considered amplified.

To gain better insight into the feasibility of scoring CISH per category on WSI, we randomly selected a comparable number of cases from each of these three categories. The observers who performed the initial scoring were asked to rescore their own cases on WSI to avoid interobserver variation as much as possible. The first observer scored 67 cases and the second observer scored the rest. Glass slides were scanned at 40× using Leica SCN400 or Philips slide scanners and presented to the observers on high resolution 30″ Barco Pathology Displays (Barco, Brussels, Belgium) with a resolution of 6 megapixels. The time needed to score CISH microscopically and on WSI was recorded for 30 cases.
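The categorisation rule described above can be written down compactly. The following is a minimal sketch, assuming the equivocal band runs from 4 up to, but not including, 6 spots/nucleus (so that exactly 6 falls in the amplified category, in line with the "≥6" wording). The function names `mean_spots` and `classify_her2` are illustrative and do not come from the study.

```python
from __future__ import annotations


def mean_spots(spots_per_nucleus: list[int], min_cells: int = 30) -> float:
    """Average HER2 spot count over the scored nuclei.

    min_cells mirrors the "at least 30 cells" rule in the text.
    """
    if len(spots_per_nucleus) < min_cells:
        raise ValueError(
            f"need at least {min_cells} nuclei, got {len(spots_per_nucleus)}"
        )
    return sum(spots_per_nucleus) / len(spots_per_nucleus)


def classify_her2(avg_spots: float) -> str:
    """Map an average spot count per nucleus to a HER2 CISH category.

    Assumed reading of the thresholds: <4 non-amplified, 4 to <6 equivocal,
    >=6 amplified.
    """
    if avg_spots < 0:
        raise ValueError("average spot count cannot be negative")
    if avg_spots < 4:
        return "non-amplified"
    if avg_spots < 6:
        return "equivocal"
    return "amplified"
```

For example, `classify_her2(mean_spots([5] * 30))` returns `"equivocal"`. Because the category boundaries are sharp, a small systematic shift in the average count is enough to move a case across a threshold.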

Most of the microscopic scores were retrieved from the pathology reporting system (Universeel Decentraal PALGA Systeem; Pathologisch Anatomisch Landelijk Geautomatiseerd Archief). If a microscopic score was not available, or the case could not be scored on WSI by the same observer, the case was rescored under the microscope after a washout period of at least 6 weeks.

Statistics

Statistical analysis was performed using statistical software SPSS V.20. Microscopic and digital HER2 classes (non-amplified (normal)/equivocal/amplified) were compared using the concordance coefficient κ (K) as suggested by Landis and Koch.25 A K value of 0.00–0.20 suggests a poor agreement, 0.21–0.40 a fair agreement, 0.41–0.60 a moderate agreement, 0.61–0.80 a substantial agreement and 0.81–1 a perfect agreement. The percentage agreement with its 95% CI was also calculated. Systematic differences between the two scoring modalities were evaluated using linear regression analysis and Wilcoxon's signed rank test. p Values <0.05 were considered statistically significant. Microscopic spot counts were regarded as the gold standard.
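The agreement bands quoted above can be expressed as a small lookup. This is a sketch that uses the band labels exactly as given in this section. Two details are assumptions, since the text does not specify them: values falling between the printed bands (for example 0.205) are assigned to the lower band, and negative values are grouped with the lowest band.

```python
def agreement_label(kappa: float) -> str:
    """Return the agreement band for a kappa value, using the cut-offs quoted above."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    # Upper bounds of each band; gaps between bands go to the lower band (assumption).
    if kappa <= 0.20:
        return "poor"  # also covers negative kappa (assumption)
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "perfect"
```

Under these bands, a κ of 0.588 is labelled `"moderate"`.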

Results

Digitally, the observers could establish HER2 status in 93 cases (96.8%). In the remaining three cases the observer could not establish HER2 status. These cases were deferred either because of poorly prepared specimens, in which partial detachment of the tissue from the glass slide made optimal scanning difficult, or because distinct cell borders and/or clear nuclear signals could not be perceived.

The average time needed to evaluate HER2 status in the last 30 cases was 86.9 s microscopically and 81.7 s on WSI.

Microscopic and digital evaluation of HER2 status were concordant in 68/93 cases (73.1%, 95% CI 0.639 to 0.823; κ 0.588, p=0.0001). The sensitivity, specificity, positive predictive value and negative predictive value for scoring the normal and amplified categories were 95.45%, 100%, 100% and 97.14%, respectively.

Sixty-eight per cent of the discrepancies were seen in the equivocal category, where more than half of the cases (53.3%) were underscored as normal copy number. Overscoring was seen in only one case in this category. Twenty-five per cent of cases (7/28) in the amplified category were underscored to equivocal or normal copy number on WSI. The lowest rate of discordance was seen in cases with a normal copy number. Table 2 shows the concordance of HER2 status assessed by conventional microscopy and on WSI. Figure 1 presents snapshots of the same regions from glass slides and WSIs from cases downscored on WSI.

Table 2

Concordance between HER2 CISH scoring of 93 breast cancer cases using conventional microscopy and WSI
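The body of Table 2 is not reproduced in this text. As a check on the reported statistics, the sketch below uses a 3×3 cross-tabulation reconstructed from the counts quoted in the Results (68/93 concordant, 16/30 equivocal cases underscored and one overscored, 7/28 amplified cases underscored, and the reported sensitivity, specificity and predictive values). The matrix is therefore an assumption that happens to be consistent with those figures, not a copy of the published table.

```python
# Rows: microscopy (reference); columns: WSI.
# Category order: normal, equivocal, amplified.
# ASSUMPTION: reconstructed from counts quoted in the text, not taken from Table 2.
MATRIX = [
    [34, 1, 0],    # microscopic normal
    [16, 13, 1],   # microscopic equivocal
    [1, 6, 21],    # microscopic amplified
]

total = sum(sum(row) for row in MATRIX)
agree = sum(MATRIX[i][i] for i in range(3))
observed = agree / total

# Cohen's kappa: agreement corrected for the agreement expected by chance.
row_totals = [sum(row) for row in MATRIX]
col_totals = [sum(col) for col in zip(*MATRIX)]
expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
kappa = (observed - expected) / (1 - expected)

# Two-class metrics restricted to cases scored normal or amplified
# by both modalities (equivocal calls are left out).
tp = MATRIX[2][2]  # amplified on both
fn = MATRIX[2][0]  # amplified on microscopy, normal on WSI
tn = MATRIX[0][0]  # normal on both
fp = MATRIX[0][2]  # normal on microscopy, amplified on WSI

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)
npv = tn / (tn + fn)

print(f"agreement {observed:.1%}, kappa {kappa:.3f}")
print(f"sens {sensitivity:.2%}, spec {specificity:.2%}, "
      f"PPV {ppv:.2%}, NPV {npv:.2%}")
```

With this reconstruction the percentage agreement, κ and the four two-class metrics agree with the values reported in the Results, which suggests that the sensitivity and specificity were computed after setting aside cases scored as equivocal by either modality.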

Figure 1

Snapshots of the same regions from glass slides (left side) and whole slide images (WSIs) (right side) from cases downscored on WSI. (A) Amplified case underscored into normal copy number on WSI, (B) amplified case underscored into equivocal category on WSI, (C) equivocal case underscored into normal copy number on WSI.

Overall, HER2 CISH average scores were digitally lower in 59 and higher in 11 cases. This tendency to underestimate the average spot counts on WSI despite very good correlation (R=0.931) is illustrated by the scatter plot of the linear regression analysis in figure 2 revealing an intercept of −0.579.

Figure 2

Scatter plot of microscopic versus digital human epidermal growth factor receptor 2 (HER2) chromogenic in situ hybridisation spot counts indicating the tendency to underestimate the average scores on whole slide images (WSIs) (intercept −0.579) despite very good correlation (R=0.931).
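The imbalance between lower and higher digital scores (59 versus 11 cases) can be examined with the published counts alone. The sketch below applies an exact two-sided sign test. This is a deliberately simpler substitute for the Wilcoxon signed rank test used in the study, which needs the paired spot counts that are not available in this text. Cases with identical scores are dropped, as is usual for a sign test.

```python
from math import comb


def sign_test_two_sided(n_lower: int, n_higher: int) -> float:
    """Exact two-sided sign test p value for paired differences.

    Under the null hypothesis, a non-tied pair is equally likely to be
    lower or higher, so the smaller count follows Binomial(n, 0.5).
    """
    n = n_lower + n_higher          # tied pairs are excluded
    k = min(n_lower, n_higher)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)


p_value = sign_test_two_sided(59, 11)
print(f"sign test p = {p_value:.2e}")
```

A very small p value here only says that the direction of the differences is unlikely to be due to chance. The size of the shift is what the regression intercept describes.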

Discussion

Altered HER2 status is usually associated with poor prognosis, shortened disease-free periods and decreased overall survival time in patients diagnosed with breast cancer. More importantly, it predicts eligibility for trastuzumab therapy.26–29 Thus correct estimation of HER2 gene amplification and/or protein overexpression at the time of diagnosis is a pivotal prerequisite to support treatment decisions.

CISH allows the evaluation of HER2 gene amplification by manual scoring of small nuclear signals using glass slides and a conventional microscope.30,31 High resolution WSIs are considered a novel alternative to glass slides that allows pathology specimens to be explored on a computer screen in a way comparable to a microscope. Additionally, the flexibility derived from the ease of accessing and sharing WSIs has led to a gradual conversion towards the digital era within pathology, where WSIs have been broadly incorporated into several applications, mainly in education and teleconsultation.32–35

This study therefore aimed to investigate the validity of WSI in assessing HER2 status in patients with breast cancer using CISH. Ninety-six breast cancer biopsies and resections were evaluated microscopically and on WSI by two observers with a washout period of at least 6 weeks. The time needed to score CISH on WSIs of the last 30 cases was comparable to the time needed to score these cases microscopically. This might be because the observers had become more accustomed to scoring CISH on WSI, which reduced the time difference between the two methods.

In three of these cases the observer was not confident making an assessment on WSI. In the remaining 93 cases, microscopic and WSI-based evaluations of HER2 status were concordant in only 68 cases (73.1%). Despite the high sensitivity and specificity in scoring the normal and amplified categories, most of the discrepancies were seen in the equivocal category, where 16/30 cases were underscored on WSI as normal copy number and one case was overscored as amplified. Moreover, 7/28 cases were underscored from the amplified category to equivocal or normal levels on WSI. The highest concordance between digital and microscopic scoring was seen in cases with a normal HER2 copy number. The concordance rate in this study is within the range of generally observed (stochastic) intraobserver and interobserver variability in pathology and is also comparable to the results of other studies evaluating intraobserver variability using WSI and conventional microscopy.36–40

Although there was an excellent correlation (R=0.931) between microscopic and digital spot counts, there was an obvious tendency towards underestimating HER2 spot counts on WSI: 59 cases were underscored digitally versus only 11 cases with overscoring (intercept on linear regression analysis −0.579). This tendency raises serious concerns about the validity of WSIs scanned on one focal plane for assessing HER2 gene amplification by CISH in clinical practice, considering the consequences for the proper indication for HER2 therapy. Underscoring HER2 nuclear signals on WSI may indicate an inability to visualise fine nuclear signals that could be perceived using a conventional microscope. Scanning tissue sections at one focal plane, as in the present study, may compromise the visualisation of fine nuclear spots that do not lie completely in the chosen focus plane.

In general, we expect that scanning glass slides at multiple focal planes (Z-stacking) could offer better resolution of fine cellular and nuclear details, provided that the technique offers a focusing mode that gives continuity when toggling between levels. Although Z-stacking is still not affordable for routine diagnostics, as it demands a long scanning time and significantly more storage, it may be required for optimal assessment of HER2 CISH.

Although CISH scoring microscopically and on WSI may not have been performed on precisely the same region, we do not believe that sample heterogeneity was an important confounder in the present study. Heterogeneity for HER2 gene amplification is fairly rare and would not explain the systematic tendency to underestimate HER2 spot counts. It is worth mentioning that the high number of equivocal and amplified cases in this study was due to our research strategy: we randomly selected a comparable number of cases for each category to gain better insight into the feasibility of scoring CISH on WSI per category.

As WSIs are highly amenable to automated image analysis, there is growing interest in creating algorithms that perform cumbersome diagnostic tasks, saving the pathologist's time and improving objectivity. Algorithms assessing HER2 immunohistochemistry have already been created and some of them, such as the Automated Cellular Imaging System III and PATHIAM IVD from Bioimagene, have been approved by the FDA.41 Similar algorithms for quantitative assessment of FISH42 and CISH are also available from Bioimagene and Visiopharm. Given that scanning and analysing WSIs obtained by Z-stacking consumes a lot of time and creates a huge amount of data, most automated image analysis software is designed for WSIs scanned on a single focal plane. Algorithms evaluating CISH status mostly start by processing the WSI to identify the cell boundaries and the nuclei. Thereafter HER2 nuclear signals are detected, sized, segmented into small and large clusters, and counted according to their size. The analysis is finally translated into understandable details in terms of cell count, signal count and ratio of HER2 to chromosome 17 signals, in addition to graphs and/or representative images from areas of interest. Although image analysis can add a lot to pathology practice in terms of objectivity and productivity,43,44 such software may perform suboptimally in certain conditions, for example when nuclei are not identified or when nuclear signals are fused. Thus pathologists are still needed to supervise and finalise the reports created by such algorithms. Further research evaluating the validity of automated image analysis for primary diagnostics and its cost-effectiveness is still necessary.
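The counting step described above, in which detected signals are sized, split into small dots and larger clusters, and counted according to size, can be illustrated with a toy example. This is a sketch only: it is not the method of any commercial product named in the text, it assumes nuclei and signals have already been segmented, and the area thresholds (`SINGLE_DOT_AREA_PX`, `CLUSTER_FACTOR`) are hypothetical values chosen for illustration.

```python
from __future__ import annotations

# All numeric values below are hypothetical, chosen only for illustration.
SINGLE_DOT_AREA_PX = 20.0   # assumed area of one isolated signal, in pixels
CLUSTER_FACTOR = 1.5        # signals larger than 1.5 dots are treated as clusters


def copies_from_signal(area_px: float) -> int:
    """Estimate the gene copies represented by one detected signal."""
    if area_px <= CLUSTER_FACTOR * SINGLE_DOT_AREA_PX:
        return 1  # small, isolated dot counts as one copy
    # Cluster: approximate the number of fused dots from its area,
    # rounding to the nearest whole dot.
    return int(area_px / SINGLE_DOT_AREA_PX + 0.5)


def mean_copies_per_nucleus(nuclei: list[list[float]]) -> float:
    """Average estimated copies per nucleus.

    nuclei holds, for each segmented nucleus, the areas of its signals.
    """
    if not nuclei:
        raise ValueError("no nuclei were segmented")
    totals = [sum(copies_from_signal(a) for a in areas) for areas in nuclei]
    return sum(totals) / len(totals)
```

For instance, `mean_copies_per_nucleus([[18.0, 22.0], [100.0, 19.0]])` gives 4.0: two single dots in the first nucleus, and one cluster of about five dots plus one single dot in the second. The area-based estimate for clusters is exactly where fused signals can make such counts unreliable.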

In conclusion, WSIs scanned on a single focal plane are not suitable for assessing HER2 gene amplification using CISH because of the noticeable tendency towards digitally underestimating the number of HER2 spots, which can lead to clinically relevant HER2 amplification being missed. Scanning at multiple focus planes may offer better resolution for improved CISH spot counting and deserves further study.

Take home messages

  • Despite the reasonable concordance between microscopic and WSI assessment of HER2 status using CISH slides, WSIs scanned on a single focal plane are insufficient to assess HER2 gene amplification status by scoring CISH due to the noticeable tendency towards digitally underestimating the number of HER2 spots.

  • Underscoring HER2 nuclear signals on WSI may indicate the inability to visualise fine nuclear signals not completely lying in the chosen focus plane.

  • Scanning at multiple focus planes may offer a better resolution for improved CISH spot counting.

  • Further progress in image analysis and computer-aided diagnosis will open new ways for more objective diagnostics in pathology.

Footnotes

  • Handling editor Cheok Soon Lee

  • Contributors All authors have substantially contributed to writing, reading and approving the final manuscript.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.