Interobserver variation in the histopathological assessment of Helicobacter pylori gastritis

Hum Pathol. 1996 Jan;27(1):35-41. doi: 10.1016/s0046-8177(96)90135-5.

Abstract

The histopathologic detection of Helicobacter pylori in gastric biopsy specimens is considered the gold standard for the diagnosis of H pylori infection. However, few studies have addressed how reliably pathologists detect the organism and assess the degree of the related inflammatory changes. The objective of this study was to determine the degree of agreement among the findings of four gastrointestinal pathologists in the semiquantitative evaluation of H pylori infection and gastritis. Three slides from specified areas of the stomach of 99 patients with and without H pylori infection were stained with the triple stain, coded, and examined independently by four pathologists. For each specimen, a visual analogue scale graded from 0 (absent/normal) to 5 (maximal intensity) was used to score (1) H pylori, (2) neutrophils, and (3) atrophy. Data were analyzed using kappa statistics. The kappa coefficient for the detection of H pylori (present vs absent) was approximately .9 (excellent); for the intensity of infection, it was considerably lower on the 6-point scale (approximately .61) and improved slightly on an amalgamated 4-point scale (approximately .71). The agreement on presence or absence of neutrophils was excellent (kappa = .8) in antral biopsies and good (kappa = .67) in corpus biopsies. The kappa for the semiquantitative scoring of neutrophils was poor on the 6-point scale (approximately .43) and fair on the amalgamated scale (approximately .54). The interobserver agreement was poorest in the evaluation of atrophy (presence, absence, categories, or group categories), with kappa coefficients ranging from .08 to .29. This group of pathologists had a high level of concordance on the diagnosis of H pylori infection in any particular patient and a high index of agreement in the assessment of the intensity of infection. Agreement was lower in the semiquantitative evaluation of active inflammation. When the evaluation concerned a loosely defined feature, such as atrophy, there was essentially no agreement among the pathologists. This study suggests the need for further assessments of pathologists' ability to provide reproducible diagnoses. These results also indicate that more stringent criteria for the diagnosis of "soft" histopathologic features (such as atrophy) are urgently needed.
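
To make the kappa comparisons above concrete, here is a minimal sketch of a multi-rater agreement calculation. It rests on several assumptions: the abstract does not say which kappa variant was used, so Fleiss' kappa is shown as one common choice for four raters; the scores are invented for illustration and are not study data; and the 6-to-4-point mapping in `COLLAPSE` is hypothetical, since the abstract does not describe how the scale was amalgamated.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of subjects, each rated once by every rater.

    ratings: list of equal-length lists, one score per rater per subject.
    categories: every category that could have been assigned.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    # counts[i][c] = number of raters who gave category c to subject i
    counts = [Counter(row) for row in ratings]

    # Observed agreement: mean proportion of agreeing rater pairs per subject
    p_bar = sum(
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ) / n_subjects

    # Chance agreement from the overall share of each category
    total = n_subjects * n_raters
    p_e = sum((sum(c[cat] for c in counts) / total) ** 2 for cat in categories)

    if p_e == 1.0:
        # Every rating falls in one category; kappa is undefined
        return float("nan")
    return (p_bar - p_e) / (1 - p_e)


# Invented scores (0-5) from four raters on six specimens; NOT study data.
scores = [
    [0, 0, 0, 0],
    [3, 3, 4, 3],
    [5, 5, 5, 5],
    [1, 2, 1, 1],
    [0, 0, 0, 0],
    [2, 2, 3, 2],
]

# Hypothetical amalgamation of the 6-point scale into 4 grades (assumption).
COLLAPSE = {0: 0, 1: 1, 2: 1, 3: 2, 4: 2, 5: 3}

kappa_six = fleiss_kappa(scores, range(6))
kappa_four = fleiss_kappa([[COLLAPSE[s] for s in row] for row in scores], range(4))
kappa_binary = fleiss_kappa([[int(s > 0) for s in row] for row in scores], range(2))

print(f"6-point scale:     {kappa_six:.3f}")
print(f"4-point scale:     {kappa_four:.3f}")
print(f"present vs absent: {kappa_binary:.3f}")
```

In this toy data the raters disagree only between adjacent grades, so kappa rises as the scale is coarsened and is highest for the present/absent call, which mirrors the direction of the pattern reported for H pylori scoring.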

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Atrophy / pathology
  • Biopsy
  • Gastritis / microbiology
  • Gastritis / pathology*
  • Helicobacter Infections / pathology*
  • Helicobacter pylori / isolation & purification*
  • Humans
  • Neutrophils / pathology
  • Observer Variation
  • Reproducibility of Results