Introduction

Breast cancer is the most common cancer and leading cause of cancer-related death amongst women worldwide [13, 35]. The disease is heterogeneous in its histopathology, therapeutic response, metastatic patterns and outcome. Current treatment guidelines are based on histopathological grading, tumour size, lymph node-, hormone receptor-, human epidermal growth factor receptor 2 (HER2)- and proliferation (Ki67) status. More recently, gene expression analyses using cDNA microarray technology have provided a deeper understanding of the complexity of breast cancer. Perou et al. [30] described four molecular subtypes: Luminal-like, HER2 enriched, Basal-like and Normal-like. More recent publications have confirmed these subtypes with some modifications and it has been shown that molecular subtypes also differ in their response to treatment and outcome [4, 8]. Molecular subtyping with immunohistochemistry (IHC) and in situ hybridisation (ISH) as surrogates for gene expression analyses makes it possible to study large numbers of archival breast cancer cases with long-term follow-up.

Histopathological grade is a well-established prognostic factor [3, 12, 32]. Recent studies confirm the importance of grading in breast cancer prognostication, although grading systems based on gene expression, such as the Gene expression grade index (GGI), have recently emerged [7, 32, 37]. Molecular subtyping may provide additional information on patient outcome, but consensus has yet to be reached regarding IHC or ISH markers that could be used as surrogates for gene expression analyses [17]. Most surrogate markers used for subtyping are available in clinical practice today, but the benefits of a new classification remain to be documented prior to implementation.

The aims of this study were to determine whether reclassification of breast tumours into molecular subtypes provides more information regarding outcome than conventional histopathological grading and to study breast cancer-specific survival (BCSS) for molecular subtypes over time. To achieve this, a cohort of breast cancer cases with long-term follow-up was reclassified into molecular subtypes. Most of the markers examined are widely used, such as oestrogen receptor (ER), progesterone receptor (PR), HER2 and Ki67. In addition, cytokeratin 5 (CK5) and epidermal growth factor receptor (EGFR) were included [2, 6]. The patients in this population experienced breast cancer in a time period or at an age where adjuvant treatment after surgery was rarely an option and the disease thus had a near-natural course.

Materials and methods

Study population

Between 1956 and 1959, 25,897 women in the Norwegian county of Nord-Trøndelag, born between 1886 and 1928, were invited to participate in a screening programme for early diagnosis of breast cancer [22, 29]. The screening comprised a clinical examination and a questionnaire focussed on reproductive history. Data were linked with the Norwegian Cancer Registry and the Cause of Death Registry of Norway. In all, 1,393 new cases of breast cancer occurred between 1961 and 2008. Most of these were analysed at the Department of Pathology, St. Olav’s Hospital, Trondheim University Hospital, Norway. A total of 448 cases were excluded from the study. For the remaining 945 cases, formalin-fixed, paraffin-embedded (FFPE) tissue was available and 909 were of sufficient quality for reclassification into molecular subtypes (see Fig. 1).

Fig. 1 Study population

Specimen characteristics

Pathology reports and FFPE tissue from all cases were retrieved from the archives of the department of pathology. In cases with recurrent disease or second or multiple primary breast cancer, only the first primary tumour was included. New 4-μm-thick full-face sections were cut from representative paraffin blocks from tumours and lymph node metastases and stained with haematoxylin–erythrosine–saffron (HES). Forty cases comprised only core biopsies or small tissue fragments unsuitable for tissue microarray (TMA). From these, serial sections were made. The HES-stained sections were reviewed under a microscope independently by two experienced pathologists (OAH, AMB) and classified by histopathological type and grade according to the World Health Organization Classification of Tumours [23] and the Nottingham grading system [12, 33]. Any discrepancies in grade or type were discussed and consensus was reached. In cases where tumour size was missing in the pathology report, size was measured in millimetres on the glass slide. Only cases with a measurement of the whole tumour in the pathology report and/or measurement of the full diameter on the glass slide were registered. All other cases were classified as size uncertain [n = 268 (29.5 %)].

TMA construction

TMA blocks were made using the Tissue Arrayer MiniCore® 3 with TMA Designer2 software (Alphelys). Areas of interest in the HES sections were marked by a pathologist. Three 1-mm-diameter tissue cores were extracted from peripheral regions of the tumour in the FFPE blocks and inserted into TMA recipient blocks. From the TMA blocks, 4-μm sections were cut and stained. IHC was done with antibodies for ER, PR, HER2 (CB11), CK5, Ki67 and EGFR in addition to HES staining. HER2 status was also examined by chromogenic in situ hybridisation (CISH).

Assay methods

Sections were mounted on Superfrost+ glass slides, dried at 37 °C overnight and stored at −20 °C. All sections were stained within 12 weeks of sectioning. The slides were heated to 60 °C for 2 h. Pre-treatment was performed in a PT Link, Pre-Treatment Module for Tissue Specimens (Dako) with buffer (High pH Target Retrieval Solution K8004) at 97 °C for 20 min. All sections were immunostained for ER, PR, HER2 (CB11), CK5 and Ki67 in a DakoCytomation Autostainer Plus (Dako). For visualization, the Dako REAL™ EnVision™ Detection System was used with Peroxidase/DAB+, Rabbit/Mouse, code K5007. EGFR was immunostained using EGFR pharmDx™ for autostainer, code K1494. See Table 1 for sources and dilutions of primary antibodies. Negative controls were included in each staining run. CISH was used to visualize the HER2 gene (red chromogen) and chromosome 17 (blue chromogen) using the dual colour probe kit HER2 CISH pharmDx™ Kit, code 109 (Dako). Two of the steps in the CISH procedure were modified slightly. The incubation time for red chromogen solution was increased from 10 to 15 min, and the dilution of haematoxylin was increased from 1:5 to 1:7.

Table 1 Sources and dilutions of primary antibodies

Scoring and reporting

All HES- and IHC-stained slides were digitalized using the tissue scanner Ariol™ SL-50 3.3 Scan system and analysis station (Genetix) at 5× and 20× magnification. Expression of ER, PR, HER2 (CB11), CK5, Ki67 and EGFR was evaluated using the Ariol review station. The images were viewed and subjectively scored by two persons independently. HER2 gene amplification status was annotated under a bright field microscope. All cases were evaluated by at least one pathologist. Any discrepancies were discussed and consensus reached.

Classification of each marker

ER and PR were considered positive when ≥1 % of the tumour cells showed positive nuclear staining [19]. For Ki67, a total of 500 tumour nuclei were examined. Cases with ≥15 % positive nuclei were classified as Ki67 high and those with <15 % as Ki67 low [16].
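The cutoffs above can be written as a minimal sketch. The function names are hypothetical and only illustrate the thresholds stated in the text; in the study, scoring was done by human observers, not by code.

```python
def hormone_receptor_positive(percent_positive_nuclei: float) -> bool:
    """ER or PR is called positive when >=1 % of tumour nuclei stain."""
    if not 0.0 <= percent_positive_nuclei <= 100.0:
        raise ValueError("percentage must lie between 0 and 100")
    return percent_positive_nuclei >= 1.0


def ki67_category(positive_nuclei: int, counted_nuclei: int = 500) -> str:
    """Return 'high' when >=15 % of the counted nuclei are Ki67 positive.

    The default of 500 counted nuclei follows the text; the function
    signature itself is an illustrative assumption.
    """
    if counted_nuclei <= 0 or not 0 <= positive_nuclei <= counted_nuclei:
        raise ValueError("invalid nuclei counts")
    percent = 100.0 * positive_nuclei / counted_nuclei
    return "high" if percent >= 15.0 else "low"
```

With 500 nuclei counted, the Ki67 cutoff corresponds to 75 positive nuclei.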

HER2 was assessed using both IHC and CISH [11]. For HER2 IHC, the CB11 clone [31, 43] was used and the Herceptest (Dako) guidelines for interpretation were followed, with a membrane-staining score ranging from 0 to +3. HER2 IHC was considered negative when the score was 0 or +1, positive when +3 and borderline when +2. Since the preanalytical treatment of the samples was unknown, the results of HER2 IHC were only used in cases where CISH was unsuccessful. In cases with IHC +2 and unsuccessful CISH (18 cases), the corresponding IHC was revised by two authors (AMB and MJE) and reclassified as either +1 (14 cases) or +3 (4 cases).

The HER2 gene was considered amplified if the gene to chromosome ratio was ≥2.0 [1, 34]. A minimum of 20 non-overlapping nuclei with signals for both chromosome and gene were assessed.
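The way the two HER2 assays were combined can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, the ratio is assumed to be computed from signal totals over the assessed nuclei, and a borderline IHC score is returned as undetermined because in the study such cases were revised manually.

```python
from typing import Optional


def cish_ratio(gene_signals: int, chr17_signals: int, nuclei: int) -> float:
    """HER2 gene to chromosome 17 signal ratio (totals over assessed nuclei)."""
    if nuclei < 20:
        raise ValueError("at least 20 non-overlapping nuclei are required")
    if chr17_signals <= 0:
        raise ValueError("chromosome 17 signals must be present")
    return gene_signals / chr17_signals


def her2_positive(ratio: Optional[float], ihc_score: Optional[int]) -> Optional[bool]:
    """CISH decides when available; IHC is used only when CISH failed."""
    if ratio is not None:
        return ratio >= 2.0          # amplified when the ratio is >=2.0
    if ihc_score == 3:
        return True                  # IHC +3 counts as positive
    if ihc_score in (0, 1):
        return False                 # IHC 0 or +1 counts as negative
    return None                      # +2 (borderline) or missing: manual review
```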

For CK5, a staining index (SI) was estimated. Staining intensity was graded as 0 (no staining), 1 (weak), 2 (moderate) and 3 (strong). The proportion of positive staining cells was scored as 1 (<10 %), 2 (10–50 %) and 3 (>50 %). The score for intensity multiplied by proportion is the SI [14, 26]. In this study, the results were considered to be negative when SI was 0–1 and positive when the SI was 2–9. For EGFR, membranous staining intensity was scored according to the guidelines in the Dako PharmDx kit and combined with the proportion of cells showing positive staining to give an SI as described above.
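As a worked illustration of the staining index, the following sketch multiplies the intensity grade by the proportion score and applies the cutoff used in this study. The function names are hypothetical.

```python
def proportion_score(percent_positive: float) -> int:
    """Score the proportion of positive cells: 1 (<10 %), 2 (10-50 %), 3 (>50 %)."""
    if not 0.0 <= percent_positive <= 100.0:
        raise ValueError("percentage must lie between 0 and 100")
    if percent_positive < 10.0:
        return 1
    if percent_positive <= 50.0:
        return 2
    return 3


def staining_index(intensity: int, percent_positive: float) -> int:
    """SI = intensity grade (0-3) multiplied by the proportion score (1-3)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2 or 3")
    return intensity * proportion_score(percent_positive)


def si_positive(si: int) -> bool:
    """SI 0-1 is negative; SI 2-9 is positive."""
    return si >= 2
```

For example, weak staining (1) in fewer than 10 % of cells gives SI 1 (negative), whereas moderate staining (2) in the same proportion gives SI 2 (positive).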

Classification of molecular subtypes

Using the six biomarkers, the tumours were then classified into molecular subtypes: Luminal A, Luminal B (HER2−), Luminal B (HER2+), HER2 subtype, five negative phenotype (5NP) and Basal-like phenotype (BP) (Fig. 2).

Fig. 2 Classification algorithm for molecular subtyping
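Because the algorithm in Fig. 2 is not reproduced in the text, the sketch below should be read as an assumption rather than as the published decision tree. It follows the commonly used surrogate definitions, which are consistent with how the subtypes are described elsewhere in this paper: luminal tumours are ER and/or PR positive, Ki67 is assumed to separate Luminal A from Luminal B (HER2−), and CK5 and EGFR divide the triple-negative cases into BP and 5NP. The function name is hypothetical.

```python
def molecular_subtype(er: bool, pr: bool, her2: bool,
                      ki67_high: bool, ck5: bool, egfr: bool) -> str:
    """Assign a surrogate molecular subtype from six binary marker results.

    Assumed rules (Fig. 2 itself is not available in the text):
      - ER and/or PR positive -> luminal; HER2 and Ki67 split the group
      - hormone receptor negative, HER2 positive -> HER2 subtype
      - triple negative with CK5 and/or EGFR positive -> basal-like (BP)
      - all five of ER, PR, HER2, CK5 and EGFR negative -> 5NP
    """
    if er or pr:
        if her2:
            return "Luminal B (HER2+)"
        return "Luminal B (HER2-)" if ki67_high else "Luminal A"
    if her2:
        return "HER2 subtype"
    if ck5 or egfr:
        return "BP"
    return "5NP"
```

The order of the checks matters: hormone receptor status is evaluated first, so a tumour that is ER positive and CK5 positive is still classified as luminal under these assumed rules.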

Statistical analyses

All women were followed from the date of breast cancer diagnosis to the date of death from breast cancer, death from any other cause or to the end of follow-up (December 31, 2010), whichever came first. BCSS according to molecular subtypes and histopathological grade was estimated using Kaplan–Meier methods and compared by log-rank tests. Cox proportional hazards models were used to estimate risk of death from breast cancer adjusted for age (5-year intervals), stage (in five categories: stage I–IV and unknown) at diagnosis according to the data from the Cancer Registry [21] and time period of diagnosis (10-year intervals). Hazard ratios (HR) were calculated with 95 % confidence intervals (CI) for two time periods: first 5 years after diagnosis and from 5 years after diagnosis and onwards (conditional on surviving the first 5 years). Cox analyses of the first 5 years were stratified by histopathological grade. Statistical analyses were carried out using Stata version 12.1 IC for Windows (Stata Corp.). This study complies with the REMARK reporting recommendations for tumour marker studies [25].
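The two-period design and the Kaplan–Meier estimate can be illustrated with a minimal sketch in plain Python. It is not the Stata code used in the study. The record layout and function names are hypothetical, deaths from other causes are assumed to be treated as censored observations, and the handling of a death exactly at 5 years (assigned to the first period) is an assumption.

```python
from typing import List, Optional, Tuple

# (years of follow-up, True if the woman died of breast cancer)
Record = Tuple[float, bool]


def first_period(time: float, event: bool, cut: float = 5.0) -> Record:
    """Follow-up restricted to the first `cut` years; later time is censored."""
    if time > cut:
        return (cut, False)
    return (time, event)


def later_period(time: float, event: bool, cut: float = 5.0) -> Optional[Record]:
    """Follow-up after `cut` years, conditional on surviving beyond `cut`."""
    if time <= cut:
        return None
    return (time - cut, event)


def kaplan_meier(records: List[Record]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each time with at least one death."""
    survival = 1.0
    at_risk = len(records)
    curve = []
    for t in sorted({time for time, _ in records}):
        deaths = sum(1 for time, event in records if time == t and event)
        leaving = sum(1 for time, _ in records if time == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored cases leave the risk set
    return curve
```

A woman followed for 7.2 years contributes a censored observation at 5 years to the first analysis and 2.2 years of follow-up to the second, whereas a woman who died after 3 years contributes only to the first.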

Ethics

The study was approved by the Regional Committee for Medical and Health Sciences Research Ethics (REK, Midt-Norge, ref. nr: 836/2009) and dispensation from the requirement of patient consent was granted.

Results

Description of the population

In all, 909 cases were included. Mean age at diagnosis was 72.5 years (SD 10.7; range 41–102). Only 12.5 % were <60 years and 58.9 % were 60–79 years. Most tumours were 2–5 cm in diameter (43.2 %), but for 29.5 %, tumour size was unknown or uncertain. At the end of the observation period, 359 (39.5 %) had died of breast cancer, 390 (42.9 %) of other causes and 160 (17.6 %) were still alive. Median follow-up was 6.4 years [interquartile range (IQR) 10.0 years]. See Table 2 for patient and tumour data.

Table 2 Descriptive statistics for the 909 breast cancer cases

Histopathological characteristics

Of the 909 tumours, 12.9 % were grade 1, 53.7 % grade 2 and 33.4 % grade 3. The histopathological types were as follows: ductal: 70.0 %; lobular: 13.6 %; and other special types: 16.4 %. All cases were reclassified into molecular subtypes based on assessment of ER, PR, HER2, Ki67, CK5 and EGFR. Table 2 shows distribution of histopathological types and grades for each molecular subtype. Table 3 shows the number of positive cases of each marker.

Table 3 The number of positive cases for each marker

Distribution of molecular subtypes

The distribution of subtypes was as follows: Luminal A: 47.6 %; Luminal B (HER2−): 27.4 %; Luminal B (HER2+): 7.7 %; HER2 subtype: 6.6 %; 5NP: 3.6 %; and BP: 7.0 %. See Table 2. Mean age at diagnosis was 72.8 (SD 10.5) for women with luminal tumours and 70.9 (SD 11.8) for non-luminal tumours. Luminal A had the highest proportion of grades 1 and 2 (Fig. 3). Only HER2 subtype and BP comprised a higher proportion of grade 3 than grade 2. Grade 1 was not found in HER2 and 5NP subtypes. The Luminal B subtypes had very similar distribution of grades despite differences in other characteristics.

Fig. 3 Distribution of grade in percent according to subtype

Breast cancer-specific survival, molecular subtypes and histopathological grade

Luminal A subtype had the best survival, closely followed by Luminal B (HER2−) with 5-year BCSS higher than 75 %. The HER2 and 5NP subtypes had the poorest prognosis, with 5-year survival around 50 %. Of the triple-negative cases, BP had a better prognosis than 5NP. BP and Luminal B (HER2+) were similar in terms of 5-year survival (Fig. 4).

Fig. 4 Kaplan–Meier plot. Breast cancer-specific survival according to molecular subtypes. P-value from log-rank test of differences in BCSS was <0.0001

Figure 5 shows BCSS according to histopathological grade for up to 20 years of follow-up. Adjustment for age did not substantially influence the curves, but after adjustment for stage, survival for grade 1 tumours was improved (data not shown).

Fig. 5 Kaplan–Meier plot. Breast cancer-specific survival according to grade. P-value from log-rank test of differences in BCSS was 0.0001

Risk of death from breast cancer

Table 4 shows risk of death from breast cancer according to molecular subtype and histopathological grade. During the first 5 years, grades 2 and 3 had a poorer prognosis compared to grade 1 with HR 3.8 (95 % CI 2.14–6.75) for grade 3 and HR 1.97 (95 % CI 1.11–3.51) for grade 2. In the same time period, the hormone receptor-negative and/or HER2-positive subtypes had the poorest prognoses compared to Luminal A. Particularly poor prognoses were shown for the HER2 subtype [HR 4.24 (95 % CI 2.79–6.42)] and 5NP [HR 3.34 (95 % CI 1.91–5.82)]. After 5 years, neither grade nor molecular subtype showed any clear association with survival. Adjustment for age had no impact on the results, and adjustment for stage only slightly attenuated risk estimates.

Table 4 Risk of death from breast cancer according to molecular subtype and histopathological grade

Table 5 shows risk of death from breast cancer during the first 5 years after diagnosis according to molecular subtype for grades 2 and 3. For grade 2, the HR for HER2 subtype compared to Luminal A was 6.62 (95 % CI 2.82–15.57), and adjustment for age and stage did not substantially influence the estimate. In grade 3, there was no clear difference in risk of death from breast cancer according to molecular subtype. Since 12 of the 13 patients who died of grade 1 tumours had Luminal A tumours, HRs were not calculated for grade 1. Adjustment for time period of diagnosis did not change the results (not shown).

Table 5 Risk of death from breast cancer according to molecular subtype for each histopathological grade

Amongst HER2-positive cases, the hazard ratio for the HER2 subtype compared to Luminal B (HER2+) was 1.8 (95 % CI 1.07–3.05) (not shown in table).

Discussion

In this long-term follow-up of breast cancer patients, the HER2 and 5NP subtypes showed the poorest prognosis during the first 5 years after diagnosis. After 5 years, BCSS did not significantly differ amongst the six molecular subtypes. However, the numbers of 5-year survivors in these two groups are low. The patients came from a cohort of women with breast cancer who lived through a time period with limited access to adjuvant treatment. However, 192 women would have qualified for antihormonal treatment according to the treatment guidelines operative at the time of diagnosis. None were qualified for treatment with trastuzumab. Kaplan–Meier BCSS estimates for patients with ER-positive tumours who may have received treatment and those who did not qualify for treatment do not differ significantly (data not shown).

During the first 5 years of follow-up, differences in survival according to subtype occurred almost exclusively amongst patients with grade 2 tumours. Grade 2 was significantly associated with poorer survival for all subtypes except Luminal B (HER2−).

These results support the findings of others that hormone receptor status defines two groups within HER2-positive breast cancer with differing BCSS [42]. The HER2 subtype had the poorest 5-year survival of all subtypes, whereas the Luminal B (HER2+) subgroup had a substantially better 5-year survival, supporting the significance of ER status in determining survival. It has been shown that, despite problems associated with crosstalk between ER and HER2, Luminal B (HER2+) benefits from antihormonal treatment [20]. The hazard ratio for the HER2 subtype compared to Luminal B (HER2+) would appear to confirm this.

To predict response to endocrine therapy, the cutoff for ER was previously set at 10 % positive staining nuclei [28]. In accordance with current guidelines, the cutoff is now set at ≥1 % [19]. In this study, 24 cases showed ER-positive staining in ≥1 % but <10 % of tumour cell nuclei and were classified as Luminal. A majority (16 cases) were Luminal B, and in the Luminal B (HER2+) subtype, they accounted for 9 % of cases. Deyarmin et al. [10] have suggested that the classification of ER-low tumours as Luminal may be inappropriate. These cases exert little or no influence on the results of the Kaplan–Meier and Cox analyses in the present study.

Classification of breast cancer into molecular subtypes with surrogate markers for gene expression is widely used. In 2010, Blows et al. [4] published a large collaborative analysis that showed survival for different subtypes, where the subtyping in all the 12 included studies was done by IHC. These methods are more accessible and affordable than gene profile studies and can be applied to archival FFPE tissue. The St. Gallen Consensus Discussion in 2011 opened the way for molecular subtyping of breast cancer using ER, PR, HER2 and Ki67/grade, all factors already in clinical use, though the cutoff for Ki67 is still controversial [18]. The panel did not support the incorporation of EGFR or CK 5/6; thus, the basal phenotype and the five negative phenotype were classified as ‘triple negative’ [15, 17]. Discussion is ongoing regarding which markers are best suited for the classification of molecular subtypes.

In the present study, 5-year survival was better for BP compared to 5NP. This is in contrast with the findings of others [4, 6, 40]. The 5NP subtype had poorer prognosis despite the fact that it comprised a higher proportion of histological grade 2 tumours. Validation studies will reveal whether or not this finding is consistent. This may be a group that would have benefited from adjuvant treatment as offered today.

Histopathological grade, tumour size and lymph node status are strong prognostic factors and are well established in clinical practice. Reduced long-term survival is associated with higher grade [4, 36, 44]. In the present study, high grade was associated with non-luminal subtypes. However, the prognostic value of the different factors may vary with time after diagnosis [24]. Since the risk of relapse and death is the highest during the first 5 years, particularly for ER-negative disease [27, 41], two periods of time were analysed separately in this study: the first 5 years after diagnosis and the subsequent years. Even after many years, there is some risk of breast cancer recurrence. Interestingly, in this cohort, there are no differences in survival according to subtypes for those who have survived the first 5 years. Further research may reveal whether adjuvant treatment modifies this tendency.

Histopathological grade 1 tumours are associated with the best prognosis, whereas grade 3 tumours are associated with the poorest prognosis. Grade 2 tumours comprise a more heterogeneous group where the majority has an intermediate prognosis, but some cases may exhibit similarity with grades 1 and 3 [7, 32]. The same applies in this study. It is possible to classify grade 2 tumours into low risk and high risk of recurrence using the GGI which is based on analysis of 97 genes [37]. A 3-gene proliferation score using PCR assay to identify TOP2A, FOXM1 and MKI67 has similar prognostic value as GGI and might be easier to implement [39]. However, the present study shows that it is possible to obtain significant additional information of prognostic value by using already implemented or readily accessible tests, and this may be of value in prognostication of grade 2 tumours.

This study contributes to the understanding of breast cancer heterogeneity partly because of the unique nature of the study population. These women lived in a time before birth control pills and hormone replacement therapy at menopause were available, and they had not undergone organized mammography screening. Furthermore, due to age and time period, they had limited postoperative treatment and thus we come as close to the natural course of the disease as possible. One drawback in this study is the relatively high age of the cohort and the results must be considered in light of this fact. This may explain the relatively high proportion of grade 2 tumours and the slightly lower proportion of HER2-positive tumours [38]. Another weakness may be the IHC estimation of HER2 where standardized preanalytical conditions were unattainable, thus precluding a semiquantitative estimation of protein expression. Despite this, there was full correlation between IHC and CISH in 587 cases. Thirteen cases were IHC +3 but showed chromosome 17 polysomy with ratios <2.0. Two cases scored +3 but showed no changes in chromosome or gene copy number. For the same reason, false-positive and -negative results may have occurred for the other biomarkers. However, the distribution of subtypes is comparable to that of other studies [4, 5, 9]. All laboratory tests were carried out under standardized conditions, and their interpretation, together with complete revision of the histopathological diagnoses, type and grade, was done within the context of this study according to present-day guidelines. By adding two markers to identify the basal phenotype to the set of markers in clinical use, it was possible to subdivide triple-negative cases into BP and the 5NP. In this study, these two subtypes had significantly differing BCSS. Molecular tests such as GGI are promising in terms of clinical benefit, but so far the documented benefit is complementary to histopathological methods [32]. Similarly, molecular subtyping using surrogate markers may provide important additional information for selected subgroups of breast cancer patients.