Inter- and intraobserver variation in the histopathological evaluation of early oesophageal adenocarcinoma
- Brechtje A Grotenhuis1,
- Mark van Heijl2,
- Fiebo J W ten Kate3,4,
- Katharina Biermann5,
- G Johan A Offerhaus4,
- Bas P L Wijnhoven1,
- Mark I van Berge Henegouwen2,
- Hugo W Tilanus1,
- J Jan B van Lanschot1
- 1Department of Surgery, Erasmus Medical Centre, Rotterdam, The Netherlands
- 2Department of Surgery, Academic Medical Centre, Amsterdam, The Netherlands
- 3Department of Pathology, Academic Medical Centre, Amsterdam, The Netherlands
- 4Department of Pathology, University Medical Centre, Utrecht, The Netherlands
- 5Department of Pathology, Erasmus Medical Centre, Rotterdam, The Netherlands
- Correspondence to Dr Brechtje A Grotenhuis, Erasmus Medical Centre, Department of Surgery, PO Box 2040, 3000 CA Rotterdam, The Netherlands
- Accepted 23 July 2010
- Published Online First 23 September 2010
Aims According to the classification established by the Japanese Society for Oesophageal Disease, early oesophageal cancer can be subdivided into six successive layers of the mucosa or submucosa, which influences the treatment strategy and prognosis of the individual patient. However, the reproducibility of this classification in terms of inter- and intraobserver variability is unclear.
Methods Histological slides from 105 surgical resection specimens of patients who had undergone oesophagectomy for early oesophageal adenocarcinoma were reviewed independently by three gastrointestinal pathologists, and were classified according to the Japanese criteria (m1/m2/m3/sm1/sm2/sm3 tumours). Inter- and intraobserver variation was determined by κ-statistics.
Results The interobserver reproducibility was good between pathologists 1 and 2 (κ=0.61, 95% CI 0.55 to 0.67), and moderate between pathologists 1 and 3 (κ=0.51, 95% CI 0.45 to 0.57) and between pathologists 2 and 3 (κ=0.50, 95% CI 0.38 to 0.61). The intraobserver agreement as assessed by the expert pathologist was good (κ=0.76), with a 95% CI that was interpreted as good to very good (0.67 to 0.85). Most agreement was achieved at the lower (m1) and upper (sm2, sm3) ends of the spectrum, whereas m2 was the most discrepant stage. Most of the observed discrepancies involved a difference of one substage only.
Conclusions The reproducibility of the Japanese classification is good in terms of inter- and intraobserver variability when grading early oesophageal adenocarcinoma on surgical resection specimens. The present data confirm that dedicated gastrointestinal pathologists with broad experience are preferred when grading the resection specimens of patients with early oesophageal adenocarcinoma.
Oesophageal adenocarcinoma is diagnosed with increasing frequency at an early stage, due to the increased awareness of the clinical importance of Barrett's oesophagus, more intensive surveillance programmes, and improved imaging techniques.1 2 According to the UICC (Union Internationale Contre le Cancer) TNM classification, early carcinoma is defined as an invasive tumour limited to the mucosa (T1a) or submucosa (T1b), irrespective of the lymph-nodal status.3 This dichotomy is also incorporated in the revised Vienna classification that has been established to resolve the discrepancies in nomenclature between Western and Japanese pathologists with regard to the grading of gastrointestinal epithelial neoplasia (table 1).4 5 In this classification, intramucosal carcinoma (together with high-grade dysplasia and carcinoma in situ set in category 4) is separated from submucosal invasion (category 5).4 Moreover, in 2001, the Japanese Society for Oesophageal Disease introduced its classification in which early cancers are subdivided into six successive layers of the mucosa (m1, m2, m3) or submucosa (sm1, sm2, sm3).6 However, the usefulness and reproducibility of this more recently introduced, extended classification have not yet been studied.
In general, a classification should be simple, reflect the available therapeutic options and be reproducible. Simplicity means that a classification should have as few categories as possible, but as many as required clinically. Reflection of available therapeutic options should include the possibilities of careful observation, local endoscopic treatment or surgery. Subdivision into six rather than two categories appears helpful in directing patients to the optimal treatment strategy. Current treatment options for early oesophageal cancer vary from endoscopic mucosal resection (EMR) to surgical resection with extended or regional lymphadenectomy.7–10 The decision of whether to perform an endoscopic or a surgical resection is mainly dependent on patients' predicted lymph-nodal status,11 which is related to the depth of infiltration of the primary tumour. Lymph-node metastases have not been reported in m1 and m2 mucosal adenocarcinomas (thereby allowing endoscopic treatment), whereas positive lymph nodes can be found in 0–12% of m3 patients.12–15 In submucosal adenocarcinomas, it appears that sm2–sm3 tumours are associated on average with a higher rate of lymph-node metastases than sm1 tumours12–15; hence, an oesophagectomy is considered the standard therapy.9 16 17 The optimal management of carcinomas with invasion into the muscularis mucosae (m3) or the most superficial layer of the submucosa (sm1) is still under debate. Thus, subdivision of early oesophageal cancer into six categories seems justified, as the treatment algorithm depends largely on the chance of lymphatic dissemination, which is mainly related to the tumour infiltration depth.
Reproducibility of a classification includes the aspects of inter- and intraobserver variability. Thus far, these observer agreements have not been studied in the light of the Japanese classification of early oesophageal cancer. Therefore, the aim of the present study was to evaluate inter- and intraobserver variation among gastrointestinal pathologists in grading early oesophageal adenocarcinoma according to the criteria proposed by the Japanese Society for Oesophageal Disease, when analysing surgical resection specimens.
Material and methods
The outcome of 120 patients who underwent transhiatal oesophagectomy with regional lymphadenectomy for early adenocarcinoma of the distal oesophagus or gastro-oesophageal junction (pT1-tumours) between 1980 and 2002 in two university hospitals in The Netherlands (Erasmus Medical Centre, Rotterdam, and Academic Medical Centre, Amsterdam) has been described previously.13 Patients did not receive neoadjuvant chemo(radio)therapy. Archival material of the resection specimens was used for the present study. Tissue blocks and slides of the resection specimens of 13 patients were no longer available, and in the resection specimens of another two patients the tumour was classified as a pT2 tumour (infiltration into the muscularis propria) by all three pathologists. Thus, resection specimens of the remaining 105 patients were used in the present study.
Tumours were assigned pathological tumour-node-metastasis stages according to the UICC 2002 system.3 The depth of tumour invasion was measured and subclassified based on the criteria proposed by the Japanese Society for Oesophageal Disease (figure 1).6 High-grade dysplasia or carcinoma in situ without invasion through the basement membrane was classified as m1. Intramucosal carcinoma was defined as a tumour extending beyond the basement membrane into the lamina propria (m2). All carcinomas with invasion into the muscularis mucosae were classified as m3 tumours, irrespective of the presence of a double muscularis mucosae (ie, a superficial and a deep layer observed in Barrett's mucosa can be classified as an m3 and m4 lesion, respectively18). Carcinomas infiltrating beyond the (deeper) muscularis mucosae but limited to the upper third of the submucosa were classified as sm1. Likewise, carcinomas infiltrating the submucosa but limited to the middle or lower third of the submucosa were defined as sm2 or sm3, respectively.
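The substaging rules described above can be summarised as a simple lookup. The following Python sketch is illustrative only and is not part of the study's methods: the function name `japanese_substage`, the layer labels and the `submucosal_fraction` input (depth of invasion as a fraction of the total submucosal thickness) are hypothetical conventions chosen for this example.

```python
# Illustrative sketch only: names, layer labels and the fractional-depth input
# are assumptions made for this example, not definitions from the study.

MUCOSAL_SUBSTAGE = {
    "epithelium": "m1",          # no invasion through the basement membrane
    "lamina_propria": "m2",      # beyond the basement membrane
    "muscularis_mucosae": "m3",  # superficial or deep layer alike
}


def japanese_substage(deepest_layer, submucosal_fraction=None):
    """Return the substage (m1..sm3) for the deepest layer reached by the tumour.

    submucosal_fraction is only used when deepest_layer == "submucosa" and is
    assumed to be the invasion depth divided by the full submucosal thickness.
    """
    if deepest_layer in MUCOSAL_SUBSTAGE:
        return MUCOSAL_SUBSTAGE[deepest_layer]
    if deepest_layer == "submucosa":
        if submucosal_fraction is None or not 0 < submucosal_fraction <= 1:
            raise ValueError("submucosal_fraction must lie in (0, 1]")
        if submucosal_fraction <= 1 / 3:
            return "sm1"  # limited to the upper third
        if submucosal_fraction <= 2 / 3:
            return "sm2"  # limited to the middle third
        return "sm3"      # reaches the lower third
    # Deeper invasion (eg, muscularis propria, pT2) is outside this classification.
    raise ValueError(f"layer outside the T1 classification: {deepest_layer!r}")


if __name__ == "__main__":
    print(japanese_substage("lamina_propria"))
    print(japanese_substage("submucosa", 0.5))
```

Assigning a depth that falls exactly on a one-third boundary to the shallower substage is a choice made for this sketch; the text does not specify how such borderline cases were handled.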
To assess interobserver agreement in analysing early oesophageal cancer, histological slides of the primary tumour of 105 resection specimens were independently reviewed by three gastrointestinal pathologists. Pathologist 1 and pathologist 2 were highly experienced (more than 20 years' experience) and had been closely working together in the same department. The third pathologist had less experience as a dedicated gastrointestinal pathologist (2 years), but in daily practice evaluated all oesophageal cancer specimens in a high-volume centre. The pathologists were blinded to the identity of the patient and the initial diagnosis. The resection specimen of each individual patient was categorised based on the deepest infiltration of the primary tumour according to the six classes of early oesophageal cancer (m1, m2, m3, sm1, sm2, sm3). To assess intraobserver agreement, the most experienced pathologist, who is considered an expert on oesophageal cancer, rereviewed all histological slides 2 weeks after the first assessment.
Inter- and intraobserver agreement were determined by κ statistics. The κ statistic is a widely used and accepted coefficient that measures observer agreement beyond that expected by chance alone.19 Coefficients <0.21, 0.21 to 0.40, 0.41 to 0.60, 0.61 to 0.80 and 0.81 to 1.00 represent poor, fair, moderate, good and very good agreement, respectively.20 κ statistics with corresponding 95% CIs were calculated for interobserver agreement (pathologist 1 versus pathologists 2 and 3, respectively, and pathologist 2 versus pathologist 3) and intraobserver agreement (as assessed by pathologist 1). Data analysis was carried out with SPSS version 15.0 (SPSS, Chicago, Illinois).
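For readers unfamiliar with the κ coefficient, a minimal Python sketch of the calculation is given here. It assumes an unweighted Cohen's κ for two raters (the text does not state whether a weighted variant was used), it is not the SPSS procedure used in the study, and the ratings in the example are invented for illustration rather than taken from the study data. The function names `cohen_kappa` and `agreement_band` are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length sequences of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of cases on which both raters agree exactly.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (observed - expected) / (1.0 - expected)


def agreement_band(kappa):
    """Map a kappa value to the verbal categories quoted in the text."""
    if kappa < 0.21:
        return "poor"
    if kappa < 0.41:
        return "fair"
    if kappa < 0.61:
        return "moderate"
    if kappa < 0.81:
        return "good"
    return "very good"


if __name__ == "__main__":
    # Invented ratings for eight hypothetical specimens (not study data).
    first = ["m1", "m1", "m2", "m3", "sm1", "sm2", "sm3", "sm3"]
    second = ["m1", "m2", "m2", "m3", "sm2", "sm2", "sm3", "sm3"]
    kappa = cohen_kappa(first, second)
    print(round(kappa, 3), agreement_band(kappa))
```

The 95% CI is deliberately omitted from this sketch, because it depends on the standard-error formula applied by the statistical package.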
Results
Clinicopathological characteristics of the present study population are shown in table 2. The infiltration depth of the primary tumour was initially classified as carcinoma in situ in 10 patients (9.5%), and as invasive carcinoma in 95 patients (90.5%). One m3 patient, five sm2 patients and 11 sm3 patients were diagnosed as having an N1 status.
The interobserver variation was tested in three ways: pathologist 1 (expert on oesophageal cancer) versus pathologist 2, pathologist 1 versus pathologist 3 and pathologist 2 versus pathologist 3. The outcome with regard to this interobserver variation is shown in table 3. The interobserver reproducibility between pathologists 1 and 2 was good (κ=0.61), with a 95% CI that can be interpreted as moderate to good (0.55 to 0.67). The interobserver agreement was graded as moderate between pathologists 1 and 3 (κ=0.51, 95% CI 0.45 to 0.57) and between pathologists 2 and 3 (κ=0.50, 95% CI 0.38 to 0.61). Furthermore, the most experienced pathologist rereviewed all histological slides to assess the intraobserver variation of this classification. The intraobserver agreement was good with a κ of 0.76, and a concomitant 95% CI that was interpreted as good to very good (0.67 to 0.85).
When analysing the discrepancies between the observers with regard to the six different substages as shown in table 4, it appeared that most agreement was achieved at the lower (m1) and upper (sm2, sm3) ends of the spectrum. The m2 stage was the most discrepant stage (interobserver agreement 17% and 25% only), whereas in the m3 and sm1 stages, pathologist 3 showed less agreement. However, when allowing variation of one substage (ie, when the most experienced pathologist scored m3, both m2 and sm1 substages were not considered discrepant), it appeared that there was only 4–5% discordance with regard to the interobserver variation and 3% disagreement for the intraobserver variation of the most experienced observer (pathologist 1).
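The "variation of one substage" analysis can be illustrated with a short Python sketch. It assumes that the six substages are treated as an ordered scale and that a case counts as discordant only when two gradings differ by more than a chosen number of substages. The function name `discordance_rate` and the example gradings are hypothetical and are not taken from the study data.

```python
# Ordered scale of substages, from most superficial to deepest.
STAGES = ["m1", "m2", "m3", "sm1", "sm2", "sm3"]
RANK = {stage: index for index, stage in enumerate(STAGES)}


def discordance_rate(reference, other, tolerance=0):
    """Fraction of cases whose two gradings differ by more than `tolerance` substages."""
    if len(reference) != len(other) or not reference:
        raise ValueError("gradings must be non-empty and of equal length")
    discordant = sum(
        abs(RANK[r] - RANK[o]) > tolerance for r, o in zip(reference, other)
    )
    return discordant / len(reference)


if __name__ == "__main__":
    # Invented gradings for eight hypothetical specimens (not study data).
    reference = ["m1", "m2", "m3", "sm1", "sm2", "sm3", "m2", "sm1"]
    other = ["m1", "m1", "sm1", "sm1", "sm2", "sm3", "sm1", "sm3"]
    print(discordance_rate(reference, other))               # exact agreement required
    print(discordance_rate(reference, other, tolerance=1))  # one substage allowed
```

With `tolerance=1`, a reference grading of m3 counts as concordant with m2 or sm1, which mirrors the example given in the text.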
Discussion
The reproducibility of classifications that aim to discriminate between Barrett metaplasia, dysplasia and adenocarcinoma has been studied previously. Various groups have demonstrated that the use of such classifications (eg, Vienna classification) is still accompanied by considerable interobserver variability.21–23 Furthermore, studies have been undertaken to assess the observer variation in the diagnosis of high-grade dysplasia (HGD), intramucosal adenocarcinoma (T1a) or submucosal adenocarcinoma (T1b).24 25 In general, this discrimination in tumour infiltration depth will reflect the treatment options of endoscopic treatment (HGD, T1a) and surgery (T1b). When evaluating surgical resection specimens, the interobserver agreement was moderate when comparing HGD and intramucosal carcinoma, and good when separating intramucosal carcinoma from submucosal carcinoma.24 When preoperative biopsies were analysed, these agreements appeared to be only fair and poor, respectively.25 However, it has been suggested that interobserver agreement on EMR specimens is significantly higher than on biopsy specimens.26 Nevertheless, it may be a drawback of the present study that interobserver variability has been evaluated from surgical resection specimens rather than from EMR specimens, as the latter will largely influence clinical decision-making.
To our knowledge, this is the first study that reports on the reproducibility of the Japanese classification in grading early oesophageal adenocarcinoma into six categories. In the present series, the m2 stage was the most discrepant stage (interobserver agreement 17% and 25% only), although it can be questioned whether the interpretation of an m2 tumour as an m1 or m3 lesion will greatly affect clinical decision-making. In the m3 and sm1 stages, for which optimal treatment strategies are still under debate, pathologist 3 showed less agreement. Furthermore, there was no underdiagnosis of submucosal carcinoma, which undoubtedly requires surgical resection, as only little discordance in the sm2 and sm3 substages was noted, and in 16 out of 17 N1 patients an sm2 or sm3 tumour was observed. It can be questioned whether the difference between the agreement of pathologists 1 and 2 and the agreement of pathologist 3 with pathologists 1 and 2, respectively, reflects a difference in years of working experience, or whether the results are influenced by the fact that pathologists 1 and 2 had worked closely together in the same department for many years. Nevertheless, the present data confirm that dedicated gastrointestinal pathologists are preferred when grading the resection specimens of this subpopulation of patients with oesophageal cancer.
The Japanese classification for grading early oesophageal cancer plays an important role both before the onset of treatment and after the patient has undergone treatment. In the post-treatment phase, the final tumour stage as determined by histology of the resection specimen will define the patient's prognosis. In the pretreatment phase, its role is even more important, as the decision whether to perform an endoscopic or a surgical resection in patients with early oesophageal cancer is mainly dependent on patients' predicted lymph-nodal status.11 At present, no diagnostic modality, including endoscopic ultrasonography, can stage the N-category with very high accuracy (71–86% in the literature).27 28 Nevertheless, on the basis of the T-stage as staged by the pathologist on an EMR specimen, one can predict the chance of lymphatic dissemination because of its relationship to the tumour infiltration depth as assessed by the Japanese classification.12–15 EMR is advocated as a staging modality to determine the T-stage in patients with early oesophageal cancer,29 30 although the assessment of the exact depth of submucosal invasion remains difficult, as full-thickness submucosa is virtually absent in these specimens.11 Therefore, if the submucosal invasion exceeds 0.5 mm, the tumour may be considered to infiltrate beyond the sm1 layer. Other factors that may predict lymphatic involvement in early oesophageal cancer include a diameter greater than 3 cm, a poor differentiation grade and lymphangio-invasion.27 31 32 Although indications for endoscopic (m1, m2 tumours) and surgical treatment (sm2, sm3 tumours) are clear, for each individual with an m3 or sm1 adenocarcinoma it should be carefully considered whether a radical oesophagectomy may be beneficial. Therefore, individual treatment plans and a close interdisciplinary cooperation between surgeon, gastroenterologist and pathologist are mandatory in this patient population.
In conclusion, the reproducibility of the Japanese classification system is good in terms of inter- and intraobserver variability when grading early oesophageal adenocarcinoma on surgical resection specimens. The present data confirm that dedicated gastrointestinal pathologists with broad experience are preferred when grading the resection specimens of patients with early oesophageal adenocarcinoma.
The Japanese Society for Oesophageal Disease introduced a histopathological classification, in which early oesophageal cancer is subdivided into six successive layers of the mucosa (m1, m2, m3) or submucosa (sm1, sm2, sm3).
The present study shows good reproducibility of this classification for early oesophageal cancer in terms of inter- and intraobserver variability.
Dedicated gastrointestinal pathologists with broad experience are preferred when grading the resection specimens of patients with early oesophageal adenocarcinoma.
We would like to thank SE Hoeks (Erasmus MC, Department of Anaesthesiology) for her statistical advice.
Competing interests None.
Provenance and peer review Not commissioned; externally peer reviewed.