This article has a correction.

Predicting clinical behaviour of breast phyllodes tumours: a nomogram based on histological criteria and surgical margins
  1. Puay Hoon Tan1,
  2. Aye Aye Thike1,
  3. Wai Jin Tan1,
  4. Minn Minn Myint Thu1,
  5. Inny Busmanis1,
  6. HuiHua Li2,
  7. Wen Yee Chay3,
  8. Min‐Han Tan4,
  9. The Phyllodes Tumour Network Singapore*
  1. 1Department of Pathology, Singapore General Hospital, Singapore
  2. 2Department of Clinical Research, Singapore General Hospital, Singapore
  3. 3Department of Medical Oncology, National Cancer Centre Singapore, Singapore
  4. 4Institute of Bioengineering and Nanotechnology, Singapore
  1. Correspondence to Dr Puay Hoon Tan, Department of Pathology, Singapore General Hospital, Outram Road, Singapore 169608, Singapore; tan.puay.hoon{at}sgh.com.sg

Abstract

Aim To define a predictive model for clinical behaviour of breast phyllodes tumours (PT) using histological parameters and surgical margin status.

Methods Cases of breast PT diagnosed in the Department of Pathology Singapore General Hospital between January 1992 and December 2010 were stratified into benign, borderline and malignant grades based on a combination of histological parameters (stromal atypia, hypercellularity, mitoses, overgrowth and nature of tumour borders). Surgical margin status was assessed. Clinical follow-up and biostatistical modelling were accomplished.

Results Of 605 PT, 440 (72.7%) were benign, 111 (18.4%) borderline and 54 (8.9%) malignant. Recurrences, which were predominantly local, were documented in 80 (13.2%) women. Deaths from PT occurred in 12 (2%) women. Multivariate analysis revealed stromal atypia, overgrowth and surgical margins to be independently predictive of clinical behaviour, with mitoses achieving near significance. Stromal hypercellularity and tumour borders were not independently useful. A nomogram developed based on atypia, mitoses, overgrowth and surgical margins (AMOS criteria) could predict recurrence-free survival at 1, 3, 5 and 10 years. This nomogram was superior to a total histological score derived from adding values assigned to each of five histological parameters.

Conclusion A predictive nomogram based on three histological criteria and surgical margin status can be used to calculate recurrence-free survival of an individual woman diagnosed with PT. This can be applied for patient counselling and clinical management.

  • Nomogram
  • prediction
  • recurrence
  • phyllodes
  • surgical margin
  • breast pathology
  • breast cancer

Introduction

Phyllodes tumours (PT) are uncommon, biphasic, fibroepithelial lesions of the breast, characterised by leafy stromal fronds capped by benign bilayered epithelium.1 2 While they resemble the innocuous intracanalicular fibroadenoma, they need to be distinguished and separately recognised because of their recurrent and possible metastatic potential.3–6 PT account for <1% of all primary breast neoplasms, although in Asian countries, they appear to be more frequently encountered,1 with a reported younger age of occurrence.7 Classification into benign, borderline and malignant categories is based on a constellation of histological parameters that primarily focuses on stromal features of atypia, mitoses, cellularity, overgrowth and nature of tumour margins.1 2 8 9 The attention on stromal features is premised on the view that the stroma of PT represents the neoplastic component that progresses and which is responsible for clinical behaviour.10–16 It is increasingly acknowledged, however, that there are epithelial–stromal interactions that play an important role in the pathogenesis of these tumours.11 12 17 18

Despite many reports of biological and genetic markers that can assist in categorising or grading PT for prognostic purposes,8 9 19–30 classification of this group of tumours remains problematic in terms of standardised universal application. How the assessments of the individual histological parameters combine to assign a specific grade, and whether some histological features are individually more important in predicting recurrent behaviour, have not been resolved.1 31–36 As such, the management of these tumours remains challenging.

In this study, we evaluate the clinicopathological features, review the follow-up of a large series of breast PT from a single institution and propose a model that can assist in individual prediction of clinical behaviour.

Methods

Patients and tumours

Cases of breast PT diagnosed between January 1992 and December 2010 were derived from the files of the Department of Pathology, Singapore General Hospital. Patient details and tumour laterality were determined from accession forms, while tumour size was obtained from surgical pathology reports. In general, at least one block per centimetre of tumour was sampled for microscopic examination. Histological classification into benign, borderline and malignant categories relied on the degree of stromal hypercellularity, cytologic atypia, mitotic activity, stromal overgrowth and nature of the borders (circumscribed vs permeative). A circumscribed border was characterised by a pushing margin that protruded or bulged against the surrounding breast tissue (figure 1); conversely, a permeative border was one in which extensions of tumour crept into adjacent tissue without a well-defined interface between tumour and non-lesional tissue (figure 2). Stromal hypercellularity and cytologic atypia were categorised as mild, moderate and marked.4 Stromal mitotic activity was quantified per 10 high-power fields (hpf) (×40 objective and ×10 eyepiece, 0.196 mm²), in the most mitotically active areas of the stroma (figure 3). Stromal overgrowth, defined as a low-power field (×4 microscope objective and ×10 eyepiece, 22.902 mm²) that comprised only stroma without epithelial elements, was deemed absent or present (figure 4).

A benign PT was diagnosed when the lesion showed pushing margins, mild or moderate stromal hypercellularity, mild or moderate stromal cytologic atypia, occasional mitoses that numbered up to 4 per 10 hpf and no stromal overgrowth. A malignant tumour was defined by marked stromal hypercellularity and cytologic atypia, presence of stromal overgrowth, brisk mitotic activity (10 or more per 10 hpf) and permeative margins; the finding of a malignant heterologous element placed the tumour into a malignant category. Borderline PT showed some but not all the characteristics observed in malignant lesions. The presence of pseudoangiomatous stromal hyperplasia within the tumour and of heterologous stromal elements was documented. Multifocality was defined by the presence of discrete PT separated by non-lesional breast tissue.
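The grading rules described above can be sketched as a small decision function. This is an illustrative reading of the stated criteria only, not the authors' software: the `Histology` record and `grade` function are hypothetical names, and treating every tumour that is neither fully benign nor fully malignant as borderline is a simplifying assumption ("some but not all" malignant features is a judgement in practice).

```python
from dataclasses import dataclass


@dataclass
class Histology:
    """Hypothetical record of the five stromal parameters used for grading."""
    hypercellularity: str          # "mild", "moderate" or "marked"
    atypia: str                    # "mild", "moderate" or "marked"
    mitoses_per_10hpf: int         # counted in the most active stromal areas
    overgrowth: bool               # stroma-only low-power field present
    permeative_border: bool        # False means a pushing (circumscribed) border
    malignant_heterologous: bool = False


def grade(h: Histology) -> str:
    """Return 'benign', 'borderline' or 'malignant' per the criteria in the text."""
    # A malignant heterologous element alone places the tumour in the malignant category.
    if h.malignant_heterologous:
        return "malignant"
    benign = (
        not h.permeative_border
        and h.hypercellularity in ("mild", "moderate")
        and h.atypia in ("mild", "moderate")
        and h.mitoses_per_10hpf <= 4
        and not h.overgrowth
    )
    if benign:
        return "benign"
    malignant = (
        h.permeative_border
        and h.hypercellularity == "marked"
        and h.atypia == "marked"
        and h.mitoses_per_10hpf >= 10
        and h.overgrowth
    )
    if malignant:
        return "malignant"
    # Assumption: everything in between is labelled borderline.
    return "borderline"


print(grade(Histology("mild", "mild", 2, False, False)))        # benign
print(grade(Histology("marked", "marked", 12, True, True)))     # malignant
print(grade(Histology("moderate", "moderate", 6, False, True))) # borderline
```

Encoding the rule this way makes explicit that the intermediate category is defined by exclusion, which is where the text notes grading is least reproducible.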

Figure 1

Benign phyllodes tumour shows a pushing margin with a circumscribed rounded border at the interface with adjacent breast tissue.

Figure 2

A phyllodes tumour, classified as borderline, reveals stromal extensions into the surrounding adipose tissue.

Figure 3

A malignant phyllodes tumour with brisk mitotic activity amid stromal cells with generally moderate nuclear atypia.

Figure 4

A low-power microscope field of stroma only, devoid of epithelial elements, indicating stromal overgrowth.

The surgical procedure was defined from accession forms. Surgical margins, based on available histological material, were evaluated as complete, when sections sampled showed a surrounding rim of non-lesional breast tissue; focally involved, when tumour extended in only one focus to the inked or cauterised surgical margin; or diffusely involved, when the surgical margin was breached by tumour in more than one focus. Focal and diffuse surgical margin involvement were combined for statistical analysis. In women who underwent further follow-up surgery, surgical margin status was assessed in the latter pathologic specimen. Adjuvant radiation or chemotherapy was not routine treatment.

Patient follow-up was derived from medical records and any deaths determined from the National Registries of Births and Deaths. Recurrences were either local (at the original site of diagnosis of PT) or distant (metastases).

Total histological score

Scores were assigned to the five histological parameters of stromal hypercellularity, stromal atypia, mitotic rate, stromal overgrowth and nature of borders. Mild, moderate and marked stromal hypercellularity and stromal atypia were each scored from 1 to 3, respectively. Mitotic rates of 0–4, 5–9 and ≥10 per 10 hpf were given scores of 1 to 3, respectively. Absence or presence of stromal overgrowth and pushing versus permeative margins were regarded as 1 and 2 in their respective categories. The total histological score was defined as a sum of the score for each of the five histological parameters and ranged from a minimum of five to a maximum of 13, with a higher histological score expected to portend a worse prognosis, based on an assumption that the various histological parameters had an additive effect.
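The scoring scheme above is simple enough to express directly. The following is a minimal sketch of the additive score as described; the function and constant names are illustrative and not taken from the study.

```python
# Mild/moderate/marked map to 1/2/3 for hypercellularity and atypia.
THREE_TIER = {"mild": 1, "moderate": 2, "marked": 3}


def mitotic_score(mitoses_per_10hpf: int) -> int:
    """0-4 -> 1, 5-9 -> 2, >=10 -> 3 mitoses per 10 hpf."""
    if mitoses_per_10hpf <= 4:
        return 1
    if mitoses_per_10hpf <= 9:
        return 2
    return 3


def total_histological_score(hypercellularity: str, atypia: str,
                             mitoses_per_10hpf: int,
                             overgrowth: bool, permeative_border: bool) -> int:
    """Sum of the five parameter scores; ranges from 5 to 13."""
    return (
        THREE_TIER[hypercellularity]
        + THREE_TIER[atypia]
        + mitotic_score(mitoses_per_10hpf)
        + (2 if overgrowth else 1)          # absent = 1, present = 2
        + (2 if permeative_border else 1)   # pushing = 1, permeative = 2
    )


print(total_histological_score("mild", "mild", 0, False, False))      # 5
print(total_histological_score("marked", "marked", 10, True, True))   # 13
```

The two printed values reproduce the stated minimum and maximum of the score.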

Statistical analysis

Kaplan–Meier survival curves were used to estimate recurrence-free survival (RFS), which was defined as the time from date of surgery to date of first relapse or death from PT or to the last follow-up date for censored cases. Univariable and multivariable Cox regression analyses were performed to evaluate the effects of potential predictors on RFS.
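For readers unfamiliar with the estimator, the Kaplan–Meier product-limit calculation can be sketched in a few lines. This is a generic illustration using made-up follow-up times, not study data or the authors' code; `kaplan_meier` is a hypothetical helper name, and subjects censored at an event time are counted as still at risk at that time (the usual convention).

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up time per subject (e.g. months from surgery)
    events : 1 if relapse or death from disease was observed, 0 if censored
    """
    at_risk = len(times)
    observed = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # events plus censorings at each time
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = observed.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Toy data for illustration only.
for t, s in kaplan_meier([6, 12, 12, 20, 30], [1, 0, 1, 0, 1]):
    print(t, round(s, 3))
```

With the toy input, the estimate steps down at months 6, 12 and 30, and the censored subjects at months 12 and 20 only reduce the number at risk.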

We used a previously described general approach to model evaluation.37 Briefly, reduced model selection was carried out using a backward stepdown by applying the stopping rule of Akaike's information criterion (AIC). Proportional hazards assumptions were verified systematically for all proposed models. Multivariate Cox regression coefficients were used to generate the nomogram. The ability of the nomogram to predict RFS of individual patients was evaluated by bootstrapping with 200 resamples. Calibration plots were generated to explore the performance characteristics of the nomogram at 1, 3, 5 and 10 years after surgery.
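As a rough illustration of backward stepdown with an AIC stopping rule, the sketch below repeatedly drops the predictor whose removal lowers AIC the most and stops when no removal helps. It is a generic outline under stated assumptions, not the procedure as implemented in the study: `fit_loglik` is a hypothetical callable standing in for fitting a Cox model and returning its log-likelihood, each predictor is counted as one parameter, and the toy log-likelihood contributions are invented purely to make the example run.

```python
from typing import Callable, FrozenSet, Iterable


def aic(loglik: float, n_params: int) -> float:
    """Akaike's information criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * loglik


def backward_stepdown(candidates: Iterable[str],
                      fit_loglik: Callable[[FrozenSet[str]], float]) -> FrozenSet[str]:
    """Drop predictors one at a time while doing so lowers AIC."""
    current = frozenset(candidates)
    current_aic = aic(fit_loglik(current), len(current))
    while current:
        # Evaluate every model with exactly one predictor removed.
        trials = [(aic(fit_loglik(current - {v}), len(current) - 1), v)
                  for v in sorted(current)]
        best_aic, drop = min(trials)
        if best_aic >= current_aic:
            break  # stopping rule: no removal improves AIC
        current, current_aic = current - {drop}, best_aic
    return current


# Toy stand-in for model fitting (invented numbers, not study results).
TOY_GAIN = {"a": 8.0, "b": 0.4, "c": 3.0}


def toy_loglik(variables: FrozenSet[str]) -> float:
    return -100.0 + sum(TOY_GAIN[v] for v in variables)


print(sorted(backward_stepdown(TOY_GAIN, toy_loglik)))  # ['a', 'c']
```

In the toy run, predictor `b` adds less than one log-likelihood unit, so removing it lowers AIC and it is dropped; `a` and `c` are retained.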

We compared the performance of this model with total histological score using c-indices and likelihood ratio (LR) analysis. We used LR χ2 of nested models to perform pairwise comparisons between our nomogram and the total score. An adequacy index using LR methods was used to quantify the percentage of the predictive information explained by a subset of the individual predictors (nomogram or total histological score) compared with the information contained in the full set of predictors (nomogram and total histological score) by means of log-likelihood. Harrell's c-index was calculated to evaluate the concordance between predicted and observed responses of individual subjects separately. All analyses were done using R V.2.13.0 (http://www.R-project.org) and STATA V.11.
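Harrell's c-index can be sketched as the fraction of usable patient pairs in which the patient with the higher predicted risk relapses first. The version below is a simplified illustration (pairs with tied follow-up times are skipped, ties in predicted risk count as half), with hypothetical names and toy inputs; it is not the implementation used for the analysis.

```python
def harrell_c_index(times, events, risk_scores):
    """Concordance between predicted risk and observed time to event.

    A pair (i, j) is usable when subject i had an observed event and a
    shorter follow-up time than subject j. It is concordant when subject i
    also has the higher predicted risk.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a usable pair
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Toy inputs: higher risk score should mean earlier relapse.
print(harrell_c_index([5, 10, 15], [1, 1, 0], [3.0, 2.0, 1.0]))  # 1.0
print(harrell_c_index([5, 10, 15], [1, 1, 0], [1.0, 2.0, 3.0]))  # 0.0
```

A value of 0.5 corresponds to predictions no better than chance, and 1.0 to perfect ranking of patients by risk.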

Results

There were a total of 605 cases of PT during the study period, all occurring in women, with a median age of 43 years. There were 440 (72.7%) benign, 111 (18.4%) borderline and 54 (8.9%) malignant tumours. The right and left breasts were affected in 310 (51.2%) and 293 (48.5%) of cases, respectively, with bilateral disease observed in two (0.3%) women. The microscopic features of the bilateral tumours were almost identical in each case: in one woman, the right and left benign tumours differed only in size (4 and 6 cm), mitotic rate (one and four mitoses per 10 hpf) and surgical margin status (complete excision versus focal involvement), while in the other woman, the bilateral benign tumours differed only in size (right 2.5 cm, left 2.2 cm) and epithelial hyperplasia (mild vs none).1 For statistical analysis, the right tumours were used. Clinicopathological characteristics stratified according to PT grade are presented in table 1. Older patient age and larger tumour size were observed more frequently in borderline and malignant PT (p<0.001). Heterologous elements were found in 12 malignant PT, comprising five with liposarcoma, two with osteosarcoma, two with chondrosarcoma, one with rhabdomyosarcoma and the remainder with a combination of these components. Multifocality with two discrete PT was observed in six (1.6%) women. This was distinguished from a multilobulated appearance, which may be visualised at the outer tumour contours.

Table 1

Clinicopathological features of 605 cases of phyllodes tumour stratified according to histological grade

Recurrences were noted in 80 (13.2%) women (table 2). Mean and median times to recurrence were 29.8 and 24.6 months, respectively. Additional details of recurrent tumours are provided in tables 3–7. Deaths were documented in 19 women, of which 12 were due to PT and the remaining seven to non-PT-related causes.

Table 2

Relationship of grade of phyllodes tumour with recurrence (p<0.001)

Table 3

Pattern of recurrence, both local and distant (metastases) of PT

Table 4

Frequency of recurrence of PT

Table 5

Recurrent pattern stratified according to the grade of PT

Table 6

Sites of metastases of PT

Table 7

Grade of original and initial local recurrences of PT

For statistical analysis, 552 patients were included after excluding 19 non-local residents without follow-up information, whose follow-up status could not be accurately determined, and those in whom follow-up was <3 months. The final median follow-up for this group of women was 56.9 months with a range of 3.3–229.2 months. Univariate analysis showed that mitotic rate, grade, surgical margin, atypia, hypercellularity, overgrowth and borders affected RFS significantly (table 8). While PT grade was also significantly associated with RFS, we did not incorporate it in the model, as grade is derived from multiple histological parameters, and deconstructing it into individual factors avoids combinational variability. Following reduced model selection by the AIC stopping rule, multivariate analysis showed that atypia, overgrowth and surgical margin significantly affected RFS, but hypercellularity and tumour borders did not (table 9). Mitotic rate was close to statistical significance (p=0.058). We thus excluded stromal hypercellularity and tumour borders from the final model, with the additional consideration that the log-likelihood ratio test showed that adding hypercellularity or tumour borders would not significantly improve on the model incorporating stromal atypia, mitotic rate, stromal overgrowth and surgical margin only (p=0.0645 and p=0.3289, respectively).

The nomogram derived from the multivariate Cox regression coefficients is shown in figure 5. Figure 6A–D demonstrates the bootstrap estimates of calibration accuracy for 1-, 3-, 5- and 10-year RFS estimates from the final Cox model. The comparison of our nomogram and total histological score showed that the nomogram, with a higher c-index, predicted RFS better than total histological score (tables 10 and 11). LR testing also showed that our nomogram provided superior estimates relative to total histological score: adding the nomogram to the model with total histological score only resulted in a highly statistically significant improvement (p<0.0001), whereas adding total histological score to the model with nomogram only did not (p=0.6381), as depicted in figure 7. Using the log LR test, we confirmed that the addition of surgical margin status significantly improved our model (p<0.0001). For cases with negative surgical margins, the nature of tumour borders (circumscribed or permeative) was not predictive of RFS (p=0.1110).
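The nested-model comparisons reported here rest on two quantities: the likelihood ratio χ² statistic for adding a predictor, and the adequacy index, that is, the share of the full model's LR χ² captured by a reduced model. A minimal sketch follows, assuming a single added linear predictor (one degree of freedom) so that the χ² tail probability can be computed with the standard library; the function names are illustrative and the log-likelihood values are invented, not taken from tables 10 and 11.

```python
import math


def lr_chi2(loglik_nested: float, loglik_full: float) -> float:
    """Likelihood ratio statistic: 2 * (log L_full - log L_nested)."""
    return 2.0 * (loglik_full - loglik_nested)


def chi2_p_value_1df(statistic: float) -> float:
    """Upper tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


def adequacy_index(loglik_null: float, loglik_subset: float, loglik_full: float) -> float:
    """Fraction of the full model's LR chi-square explained by the subset model."""
    return lr_chi2(loglik_null, loglik_subset) / lr_chi2(loglik_null, loglik_full)


# Invented log-likelihoods for illustration only.
null_ll, subset_ll, full_ll = -500.0, -490.0, -480.0
stat = lr_chi2(subset_ll, full_ll)
print(stat)                                            # 20.0
print(chi2_p_value_1df(stat) < 0.0001)                 # True
print(adequacy_index(null_ll, subset_ll, full_ll))     # 0.5
```

An adequacy index near 1 means the reduced model carries almost all of the predictive information in the full model, which is the sense in which one nested model can be said to perform as well as the combined one.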

Table 8

Univariate analysis of recurrence-free survival of women with PT

Table 9

Multivariate analysis of recurrence-free survival without interaction

Figure 5

Nomogram for predicting recurrence-free survival (RFS) of patients with phyllodes tumours. To use the nomogram, locate the first variable. Draw a line straight upwards to the Points axis to determine the number of points received for the variable. Repeat this process for the other three variables and sum the points achieved for each variable. The sum of these numbers is located on the Total Points axis, and a line is drawn downwards to the survival axes to determine the likelihood of 1-, 3-, 5- or 10-year RFS. For instance, a woman with a phyllodes tumour that demonstrates moderate stromal atypia (11 points), mitoses of 10 per 10 high-power fields (3 points), no stromal overgrowth (0 points) and positive surgical margin (40 points) will have an RFS (based on a total of 54 points) of just above 0.9 at 1 year, 0.7 at 3 years, 0.58 at 5 years and 0.5 at 10 years. Calculation of RFS can be automated through computerised programming.
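The worked example in the caption amounts to a lookup and a sum, which can be sketched as follows. The table below contains only the four point values quoted in the caption; all other levels would have to be read off the published nomogram and are deliberately not encoded. The names `EXAMPLE_POINTS` and `total_points` are hypothetical.

```python
# Point values quoted in the figure 5 worked example only; every other
# level must be read off the published nomogram and is not encoded here.
EXAMPLE_POINTS = {
    ("stromal atypia", "moderate"): 11,
    ("mitoses per 10 hpf", 10): 3,
    ("stromal overgrowth", "absent"): 0,
    ("surgical margin", "positive"): 40,
}


def total_points(findings: dict) -> int:
    """Sum the nomogram points for each (variable, level) finding."""
    return sum(EXAMPLE_POINTS[(name, level)] for name, level in findings.items())


example_patient = {
    "stromal atypia": "moderate",
    "mitoses per 10 hpf": 10,
    "stromal overgrowth": "absent",
    "surgical margin": "positive",
}
print(total_points(example_patient))  # 54
```

The total of 54 points is then located on the Total Points axis to read off the RFS estimates given in the caption.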

Figure 6

Bootstrapped estimates of calibration accuracy at (A) 1 year, (B) 3 years, (C) 5 years and (D) 10 years. The ideal outcome (grey line), the observed outcome (black line) and the optimism-corrected outcome (blue line) are depicted. This figure demonstrates how accurately predictions at different risk levels conform to observed outcomes for the nomogram. Mean and maximum deviations between observed and corrected outcomes are 0.03 and 0.02 (1 year), 0.04 and 0.05 (3 years), 0.04 and 0.07 (5 years), 0.04 and 0.08 (10 years).

Table 10

Nomogram and total histological score as predictors of recurrence-free survival

Table 11

Comparison of nomogram and total histological score as predictors of recurrence-free survival

Figure 7

Pictorial representation of the comparison of the performance of the full model comprising nomogram and total histological score, relative to individual nested models with either nomogram or total histological score. An individual nested model comprising the nomogram performs essentially as well as the full model in predicting recurrence (p=0.7026); little additional variation is explained by adding total histological score to the nested model with the nomogram only. Conversely, a model comprising total histological score alone has only 22.7% of the explanatory power of the full model.

Discussion

Grading of PT has been widely used to categorise these tumours into prognostic groups. Benign PT are known to potentially recur locally, while borderline and malignant ones have both recurrent and metastatic ability, with deaths from PT often preceded by distant metastases.1 2 While grade has predictive utility across cohorts, with quoted recurrence rates of 17%, 25% and 27% in benign, borderline and malignant PT,2 information on likely recurrent behaviour in an individual patient is uncertain. In our current study, we found recurrences in 80 (13.2%) cases, the vast majority being local in nature (73 out of 80 cases, 91.2%), with metastatic disease occurring consequent to local recurrence in 5 (6.3%) cases and identified after a diagnosis of malignant PT in 7 (8.7%) cases. Overall, recurrences occurred in 10.9% of benign, 14.4% of borderline and 29.6% of malignant PT, lending credence to the ability of grade to predict clinical behaviour.

The issue with grade, however, lies in inherent challenges to reproducibility. It is generally agreed that stromal characteristics of atypia, mitoses, cellularity, overgrowth and nature of tumour margins are key parameters in determining grade. Assessment of atypia and cellularity may be hampered by interobserver variability. Evaluation of mitoses, too, raises questions about the number of fields to be analysed and whether a maximal or average mitotic count should be used. Stromal overgrowth, while appearing the least contentious, may vary due to differences in the size of the low-power microscopic field. Many of these perceived problems are related to the lack of clear definition. While these can be rectified through enunciation of more clearly defined histological criteria, there is the additional difficulty of how various parameters are integrated into a final microscopic grade. At the extreme ends of the spectrum, the diagnosis of grade is mostly straightforward. For example, a tumour with modest stromal atypia, low cellularity and mitoses, with pushing margins and without stromal overgrowth, is unequivocally benign. In contrast, for a PT with marked stromal atypia, high cellularity and mitotic activity, with permeative margins and stromal overgrowth, a malignant diagnosis can be readily rendered. In the intermediate range, however, it is difficult to accurately assign a grade, and it may become a judgement call based on the parameter(s) believed to have stronger biological weightage. Regardless of its value, grade is unable to predict clinical behaviour in an individual patient diagnosed with PT.

Surgical margins have been regarded as a critical element in impacting the likelihood of PT recurrence.1 32 36 38–42 This is affirmed in our current study which shows independent significance of surgical margins on RFS on multivariate analysis.

It is with the knowledge of existing deficiencies in PT classification and behaviour prediction that we embarked on this effort to simplify the evaluation of histological parameters in determining grade, omitting those that may already be biologically reflected by another and incorporating surgical margin status, which represents complete or incomplete tumour removal. The nomogram model that has been built in this manner allows a risk assessment for an individual patient. It helps clarify factors that are independently contributing to recurrence likelihood and provides a powerful tool for estimating outcomes.

The final elements used for the nomogram are the degree of stromal atypia (mild, moderate, marked), stromal mitoses per 10 high-power fields (assessed in the most mitotically active areas), stromal overgrowth (absent or present) and surgical margins (negative or positive), simply referred to as the AMOS criteria. Both stromal hypercellularity and tumour borders have been omitted from the model because of their lack of independent importance in predicting RFS and their exclusion using the AIC rule. Although mitotic rate did not achieve statistical significance, we included it as the p value was close to 0.05, and mitosis counting can be a fairly objective exercise. Counting mitoses is a familiar activity of the pathologist, who routinely accomplishes this in the grading of invasive breast cancer. It should therefore be an achievable goal for mitotic rates to be consistently and reproducibly assessed. Similarly, assessment of degree of atypia is another common component of histological evaluation of many tumours. Stromal overgrowth, while potentially variable due to different microscope types, is not a commonly observed feature, with only 65 (10%) cases demonstrating this change in our series. The slight variation in absolute size of a low-magnification field (×4 objective and ×10 eyepiece) between microscopes is unlikely to cause significant inconsistency because, when stromal overgrowth is present, it is often unequivocal.

Assessment of these individual parameters (AMOS criteria), which are added to derive a numerical point value in our nomogram to predict RFS for the individual patient, eliminates the generalisation afforded by grade. The nomogram can be translated into risk counselling, can allow informed discussions of management strategies and can also be used in future clinical trials. A key advantage of our study is the relatively homogeneous patient cohort, managed in a mostly uniform manner within a single institution. The availability of institutional medical records, together with high-quality follow-up data from the National Registry of Births and Deaths, allows robust recurrence-free follow-up and death information. Our exclusion of non-local residents without follow-up data from the statistical model reduces the possibility of inaccurate follow-up.

We do not advocate that PT grading be replaced by our proposed predictive nomogram. Grade has established utility in assigning a biological category to the tumour, as evidenced by its significant correlation with RFS in our study. It is also important to continue to categorise PT into discrete pathological groups for comparison across studies and to allow investigation of biological markers. Progression of PT into higher histological grades with recurrent episodes, as seen in 23 of 73 (31.5%) initial recurrences in our series, is another area that requires additional scrutiny and will benefit from continued pathological segregation of PT.

Our nomogram has a superior performance compared with an arbitrary but simple total histological score obtained from adding up scores of the five histological parameters. This confirms that the biological impact of each microscopic parameter is more complex, with varying influence on RFS, and is therefore not so easily equated with a linear additive scale.

The importance of surgical margin status on outcome is reaffirmed in our current study.1 Apart from its statistically significant relationship with RFS in both univariate and multivariate analysis, inclusion of surgical margin status into the nomogram improved the predictive value. As this is a retrospective study, more details on surgical margin status such as clearance distance could not be assessed, and it was therefore not within the scope of this investigation to comment on an ideal or minimum margin clearance for PT. It has been suggested that a 1-cm margin is the recommended clearance for surgical excision of PT.43

Limitations of our study relate to the relatively few events that have occurred within the series, due largely to the innate biology of the tumour, which has a reported overall recurrence rate of 10% to 21%.1 2 Amalgamation of local with distant recurrences as well as deaths may raise the question of which end point is truly being interrogated. The overwhelming number of local recurrences among all recurrent episodes in our study, however, attests to the nomogram's ability to predict local recurrences. While death from disease is the ultimate event we wish to eliminate, our previous and current experience with PT is that local recurrence precedes distant metastases that cause demise, with the exception of initially malignant tumours that can metastasise without an intervening locally recurrent episode.1 Another limitation of the nomogram is that it has yet to be validated in an external cohort.

In conclusion, we describe the clinicopathological features and follow-up details of a large series of breast PT. Using the data, we undertook statistical modelling and developed a nomogram that integrates pathological features of AMOS into a predictive risk model that can be used for patient counselling and clinical management. Continued work on PT will help refine the nomogram which we believe can become a useful adjunct in the management of this group of women.

Take-home messages

  • Breast phyllodes tumours (PT) can be classified into benign, borderline and malignant grades based on a constellation of histological findings.

  • Stromal atypia, mitoses, overgrowth and surgical margins (AMOS) are important predictive factors for PT recurrence.

  • A nomogram based on the AMOS criteria can predict the likely recurrence-free survival of individual patients diagnosed with PT.

Footnotes

  • * Phyllodes Tumour Network Singapore:

  • Benita Tan, FRCS

  • Chow Yin Wong, FRCS

  • Department of General Surgery, Singapore General Hospital

  • Kong Wee Ong, FRCS

  • Wei Sean Yong, FRCS

  • Gay Hui Ho, FRCS

  • Department of Surgical Oncology, National Cancer Centre Singapore

  • Wei Seong Ooi, MRCP

  • Department of Medical Oncology, National Cancer Centre Singapore

  • Bin Tean Teh, PhD

  • National Cancer Centre—Van Andel Research Institute Translational Research Laboratory, Singapore.

  • Competing interests None.

  • Ethics approval Ethics approval was provided by Centralised Institutional Review Board, SingHealth.

  • Provenance and peer review Not commissioned; externally peer reviewed.

Linked Articles

  • Correction
    BMJ Publishing Group Ltd and Association of Clinical Pathologists