Is it time to move the goalposts?
In an ideal world, once malignant disease is diagnosed, treatment is administered to effect a cure. Where a cure is not expected, the questions now often raised are how soon the disease might recur or, more importantly, cause death. In most cases, estimates are derived from information provided in pathology reports, conveniently translated into a numerical index. For breast cancer, the Nottingham prognostic index (NPI), calculated from invasive cancer size, histological grade, and lymph node status for metastasis, is widely accepted as fulfilling that need, and steroid receptor immunohistochemistry is the universal predictive test. In this current era of clinical governance and health equality, it is reasonable to ask questions of these indices, such as: how consistent are pathologists in assessing features, what is the best way to discriminate between “degrees” of a feature, and should the cut off points be constant across the spectrum of all cancers?
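For readers unfamiliar with how the NPI combines its three components, the commonly published form of the index is 0.2 × invasive tumour size (in cm) + lymph node stage (1 to 3) + histological grade (1 to 3). The short Python sketch below illustrates that arithmetic only. The function name and argument names are hypothetical, chosen for illustration, and the 1 to 3 coding of node stage is the conventional one rather than something specified in this article.

```python
def nottingham_prognostic_index(size_cm, node_stage, grade):
    """Illustrative NPI calculation (hypothetical helper, not a clinical tool).

    size_cm    -- invasive tumour size in centimetres
    node_stage -- lymph node stage coded 1, 2, or 3 (conventional coding, assumed here)
    grade      -- histological grade coded 1, 2, or 3
    """
    if size_cm < 0:
        raise ValueError("size_cm must be non-negative")
    if node_stage not in (1, 2, 3):
        raise ValueError("node_stage must be 1, 2, or 3")
    if grade not in (1, 2, 3):
        raise ValueError("grade must be 1, 2, or 3")
    # Size enters as a continuous variable; node stage and grade enter as scores.
    return 0.2 * size_cm + node_stage + grade


# Example: a 2 cm, node negative (stage 1), grade 2 cancer.
example_npi = nottingham_prognostic_index(2.0, 1, 2)
print(round(example_npi, 2))
```

Because size is weighted by only 0.2 per centimetre, node stage and grade dominate the score, which is one reason the choice of cut off points for those features matters.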
STANDARD PREDICTION AND PROGNOSIS
In the UK, the standard of steroid receptor assay is centrally monitored, and inequalities have been reported: detection at the lower level was achieved by only 37% of more than 250 participating laboratories, admittedly drawn from 26 countries, with more than 50% from the UK.1 Although recommendations have been made on how to provide quantitative receptor assays, with values matched to likely response levels,2 assessments of consistency are uncommon and regional audits go unreported. In contrast, assessments of pathologist diagnostic consistency were initiated (and continue) under the auspices of the UK National Breast Screening Programme quality assurance initiative, through biannual breast slide circulation. This has revealed both strengths and weaknesses3 but, within the accepted limits of the exercise, consistency has demonstrably improved over time for most feature evaluations, according to the κ scores circulated to all participants (currently around 300). Laudable and reassuring though these measures of pathologist performance may be, it is equally relevant to consider the other issue: the cut off points used for feature discrimination. The spectrum of disease identified has been changed by the implementation of screening mammography (smaller cancers with fewer node metastases, and reduced advanced disease4), and this may affect the relative statistical weighting of features in the existing index, which was formulated 20 years ago on the basis of symptomatic practice.
The usefulness of histological grading features combined in the NPI is acknowledged and endorsed in the UK Pathology Guidelines.5 However, an index that brings significant improvement to prognostication, for a population with major mammography screening exposure, has recently been reported.4 It differs in two major respects: first, size is divided into categories, in increments that accentuate the survival value of very small sizes, rather than being treated as a continuous variable; and, second, it incorporates the recognition of histological special type cancers. This observation needs to be independently validated before it can be promoted. Another breast cancer prognostic feature is considered in this issue of the Journal of Clinical Pathology,6 which demonstrates the benefit, for prognostication in pT1N0M0 cancers, of adopting a lower cut off point for mitotic counts than that used in the standard Nottingham grading criteria.5 The prognostic strengths of mitotic assessment alone have been amply shown in different settings,7,8 with debate over the “best” routine method being resolved essentially by local practicalities of procedure and time. Thus, evidence is mounting that we must reconsider the goalpost positions on the playing fields of prognostication.
It is also important to accept that pathologists have embarked upon a new phase of multidisciplinary breast cancer management, in which “fine tuning” of pathology detail is necessary to give the patient the greatest benefit of treatment options. This applies as much to cases with the best outlook as to those with the poorest. Because much of screen detected disease, being low grade or non-invasive, does not represent a major survival threat, in appropriate cases these options might include less rather than more adjuvant treatment. The longer the patient survives, the greater the likelihood that treatment morbidity of any kind will emerge. Yet, concerning the assessment of lymph node status, the longer term morbidity of axillary surgery is likely to lessen with the advent of sentinel node procedures,9,10 for which support has been provided to pilot procedures of node removal in the UK. It is therefore unfortunate that no support is yet forthcoming for UK pathologists to collaborate in investigating the implications of more detailed lymph node histological assessment, in an attempt to resolve the question of the clinical relevance of detecting micrometastasis. Several refinements to the reporting and recording of lymph node metastases have been recommended in a recent North American Consensus publication,11 although there is no move to reciprocate in Europe.
In answer to the consistency question, the evidence tells us (in the parlance of κ statistics) that pathologists are “near perfect” in consistency of diagnosis for invasive cancer, and in “substantial agreement” for non-invasive cancer.3,12 When it comes to histological grade, only “good agreement” is achieved, and it appears that this may be a plateau that most groups will not go beyond, at least with the current protocols.13 Nevertheless, it is still probably true that the “best manner to discriminate” remains the microscope and the paraffin wax embedded section, provided the specimen is appropriate (some are suboptimal) and expertise is available.
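As a reminder of what a κ score measures, the sketch below computes Cohen's κ for two observers: the observed proportion of agreement, corrected for the agreement expected by chance from each observer's own category frequencies. This is an illustrative assumption on our part. The slide circulation schemes described above involve many participants and would need a multi-rater statistic, and the grades listed are invented for the example, not data from any study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categories to the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: fraction of cases given the same category by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the two raters' marginal proportions, summed over categories.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (observed - expected) / (1.0 - expected)


# Invented histological grades (1 to 3) from two hypothetical pathologists on ten cases.
pathologist_1 = [1, 1, 2, 2, 3, 3, 1, 2, 3, 2]
pathologist_2 = [1, 1, 2, 3, 3, 3, 1, 2, 2, 2]
print(round(cohen_kappa(pathologist_1, pathologist_2), 2))
```

In this invented example the two observers agree on eight of ten cases, yet κ is noticeably lower than 0.8 because a share of that agreement would be expected by chance alone.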
The question of discrimination points is more difficult to deal with. It is conceivable that the spectrum of breast cancers could be subdivided for more effective prognostication into groups determined by detection mode, disease extent, and biological expression, rather than by historical clinical staging. This is a new prospect that needs further consideration and debate between pathologists, radiologists, surgeons, and oncologists. Unfortunately, most of the histological features that pathologists use as their yardsticks are continuous variables, with consequent problems, not only in reaching high κ agreement, but also in the biostatistical determination of optimal cut off points.14 Is it heresy to suggest that deciding where to site the new goalposts may need to be based on pragmatic terms, rather than on experimental trials? Bearing in mind the difficulties that have been associated with reaching conclusions in the UK trials of screening frequency or of non-invasive cancer treatment, adoption of consensus agreement after the debate is as likely to hasten benefit for the patients.