A practical application of analysing weighted kappa for panels of experts and EQA schemes in pathology

Karen C Wright; Patricia Harnden; Sue Moss; Dan M Berney; Jane Melia

doi:10.1136/jcp.2010.086330

Original article

Karen C Wright¹, Patricia Harnden², Sue Moss¹, Dan M Berney³, Jane Melia¹

¹Cancer Screening Evaluation Unit, Institute of Cancer Research, Sutton, UK
²Department of Histopathology, Leeds Teaching Hospitals NHS Trust, St James's University Hospital, Leeds, UK
³Centre for Molecular Oncology & Imaging, Barts and the London School of Medicine and Dentistry, London, UK

Correspondence to K C Wright, Cancer Screening Evaluation Unit, Institute of Cancer Research, Sir Richard Doll Building, 15 Cotswold Road, Sutton SM2 5NG, UK; Karen.Wright{at}icr.ac.uk

Abstract

Background Kappa statistics are frequently used to analyse observer agreement for panels of experts and External Quality Assurance (EQA) schemes and generally treat all disagreements as total disagreement. However, the differences between ordered categories may not be of equal importance (eg, the difference between grades 1 vs 2 compared with 1 vs 3). Weighted kappa can be used to adjust for this when comparing a small number of readers, but this has not as yet been applied to the large number of readers typical of a national EQA scheme.

Aim To develop and validate a method for applying weighted kappa to a large number of readers within the context of a real dataset: the UK National Urological Pathology EQA Scheme for prostatic biopsies.

Methods Data on Gleason grade recorded by 19 expert readers were extracted from the fixed text responses of 20 cancer cases from four circulations of the EQA scheme. Composite kappa, currently used to compute an unweighted kappa for large numbers of readers, was compared with the mean kappa for all pairwise combinations of readers. Weighted kappa generalised for multiple readers was compared with the newly developed ‘pairwise-weighted’ kappa.

Results For unweighted analyses, the median increase from composite to pairwise kappa was 0.006 (range −0.005 to +0.052). The difference between the pairwise-weighted kappa and generalised weighted kappa for multiple readers never exceeded ±0.01.

Conclusion Pairwise-weighted kappa is a suitable and highly accurate approximation to weighted kappa for multiple readers.
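The idea of a pairwise-weighted kappa can be illustrated with a short sketch: compute Cohen's weighted kappa for every pair of readers, then take the mean over all pairs. The abstract does not state the weighting scheme or the exact averaging used, so the linear disagreement weights, the simple unweighted mean, and the names `weighted_kappa` and `pairwise_weighted_kappa` below are assumptions made for illustration only, not the authors' implementation.

```python
from itertools import combinations


def weighted_kappa(a, b, categories, weight="linear"):
    """Cohen's kappa for two readers who rated the same cases.

    `categories` lists the ordered grades. `weight` selects the disagreement
    weights: "unweighted" (all disagreements equal), "linear" or "quadratic".
    The weighting schemes are assumptions; the abstract does not name one.
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(a)

    # Observed joint proportions of (reader a grade, reader b grade).
    obs = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        obs[index[x]][index[y]] += 1.0 / n

    # Marginal proportions for each reader.
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        if weight == "unweighted":
            return 0.0 if i == j else 1.0
        d = abs(i - j) / (k - 1)
        return d if weight == "linear" else d * d

    observed = sum(disagreement(i, j) * obs[i][j]
                   for i in range(k) for j in range(k))
    expected = sum(disagreement(i, j) * row[i] * col[j]
                   for i in range(k) for j in range(k))
    if expected == 0.0:
        # Both readers used one and the same category; kappa is undefined.
        return float("nan")
    return 1.0 - observed / expected


def pairwise_weighted_kappa(ratings, categories, weight="linear"):
    """Mean of the two-reader kappa over all pairs of readers.

    `ratings` holds one list of grades per reader, all in the same case order.
    """
    kappas = [weighted_kappa(r1, r2, categories, weight)
              for r1, r2 in combinations(ratings, 2)]
    return sum(kappas) / len(kappas)


# Toy data (invented for illustration): three readers, six cases, grades 1 to 3.
reader_a = [1, 2, 3, 1, 2, 3]
reader_b = [1, 2, 3, 1, 2, 3]
reader_c = [1, 2, 3, 2, 3, 3]
grades = [1, 2, 3]
```

With this toy data, readers A and C disagree only by one grade on two cases, so the linearly weighted kappa for that pair is higher than the unweighted kappa, which treats those near misses as total disagreement. Calling `pairwise_weighted_kappa([reader_a, reader_b, reader_c], grades)` averages the three pairwise values.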

Interobserver agreement
observer variation
weighted kappa statistics
prostate cancer
Gleason sum score
epidemiology
prostate

Footnotes

Funding KCW, SM and JM are funded by the Policy Research Programme of the Department of Health. DMB is funded by Orchid (registered with the Charity Commission No 1080540 and registered in England 3963360). The Prostate External Quality Assurance is funded by the NHS Cancer Screening Programme.
Competing interests None.
Provenance and peer review Not commissioned; externally peer reviewed.
