Aims: To assess the interobserver reproducibility of certain histological features proposed for the diagnosis of melanoma.
Methods: In a series of melanomas, 13 histological parameters were analysed: dimension > 6 mm, asymmetry, poor circumscription, irregular confluent nests, single melanocytes predominating, absence of maturation, suprabasal melanocytes, asymmetrical melanin, melanin in deep cells, cytological atypia, mitoses, dermal lymphocytic infiltrate, and necrosis.
Results: The agreement (reproducibility) between the nine observers was excellent (κ > 0.75) for 10 of the 13 examined features (dimension > 6 mm, poor circumscription, irregular confluent nests, single melanocytes predominating, absence of maturation, suprabasal melanocytes, asymmetrical melanin, melanin in deep cells, mitoses, and necrosis). The agreement for asymmetry was very close to excellence (κ = 0.74), and that for cytological atypia (κ = 0.65) and dermal lymphocytic infiltrate (κ = 0.47) was slightly lower, but in the fair to good agreement range. The κ values obtained by comparison with the majority diagnosis were generally high (⩾ 0.85); the mean value of κ was lower (0.70) for only one parameter (dermal lymphocytic infiltrate).
Conclusions: The parameters investigated showed an overall good reproducibility.
Several morphological features are frequently used in the histopathological diagnosis of cutaneous melanoma, including the following: pagetoid infiltration, melanocytic atypia, mitotic figures, a band-like dermal inflammatory infiltrate containing melanophages, lack of maturation of abnormal melanocytes with the progressive descent into the dermis, pronounced variation in size and in shape of the melanocytic nests, poor circumscription of the intraepidermal melanocytes, confluence of the melanocytic nests, necrosis of melanocytes, involvement of epithelial adnexal structures by atypical melanocytes, asymmetry of the lesion, dimension greater than 6 mm, abundant melanin, mitoses near the base of the neoplasm, uneven distribution of melanin, melanocytes as solitary units predominating over nests, and uneven base of the lesion.1,2,3,4,5,6,7,8,9,10,11 However, despite the relatively high number of diagnostic criteria proposed, melanoma diagnosis remains problematic in a large proportion of cases, with possible diagnostic discord, even among experts.12–14 This could be the result of several factors, one of which might be the interobserver reproducibility of the histological criteria used.
The aim of our present study was to analyse a series of conventional cutaneous malignant melanomas (melanomas not referable to a specific subtype, such as spitzoid melanoma, desmoplastic melanoma, neurotropic melanoma, myxoid melanoma, etc.) to evaluate the interobserver reproducibility of certain histological features used for the diagnosis of melanoma.
MATERIALS AND METHODS
Our study was undertaken by nine dermatopathologists, affiliated to the melanocytic lesion group of the Italian Association of Dermatopathology (AIDEPAT), from eight Italian institutions, namely: dermatopathology section, S M Annunziata Hospital, Health Unit 10 of Florence (CU); dermatopathology centre – Di. S E M, University of Genoa (FR); institute of dermatology, University La Sapienza, Rome (DI); department of human pathology, University of Messina (DB, ML); institute of dermatology, University Tor Vergata, Rome (SC); institute of dermatology, University of Bari (RF); institute of dermatology, University of Milan (RG); and institute of dermatology, University of Turin (CT).
Eighty melanocytic lesions, originally diagnosed as melanomas, were retrieved from the files of eight Italian institutions (10 consecutive cases for each institution, one slide for each case). The tissue fragments, containing the entire lesion, had been fixed in buffered formalin, processed routinely, and stained with haematoxylin and eosin. All identifying marks were removed from the slides, and they were relabelled with a standard label for the study, randomised, and renumbered from 1 to 80 by a person who did not take part in the histological evaluation.
Definition of histological features
Thirteen histological features commonly used in the diagnosis of malignant melanoma were considered. (1) Dimension > 6 mm,8,11 lesions, cut along their major axes, were measured on the histological slides. (2) Asymmetry, defined as present when a central line divided the lesion into two parts that looked different in shape, in thickness, or in number and position of dermal cells,8,10,11 and as absent only when the lesion appeared perfectly symmetrical. (3) Poor circumscription of the lesion, defined as present when the epidermal melanocytic proliferation, extending beyond the dermal component of the lesion, ended with single cells, rather than with a well defined nest.9–11,15 (4) Irregular and confluent nests, defined as epidermal melanocytic nests, variable in size and in shape and tending to confluence8,9,10,11; the position of melanocytic nests in the epidermal layers was not considered. (5) Single melanocytes predominating, defined as epidermal melanocytes disposed as solitary units predominating over melanocytic nests in some high power fields10,11; the position of melanocytes in the epidermal layers was not considered. (6) Absence of maturation, defined as failure of melanocyte nuclei to become smaller with progressive descent into the dermis.8,9,10,11,15 (7) Suprabasal melanocytes, defined as melanocytes above the dermo–epidermal junction8,10,11; the number of suprabasal cells, their location in the centre or in the periphery of the lesion, and whether cells appeared single or in nests were not considered. (8) Asymmetrical melanin, defined as present when a central line divided the lesion into two parts that showed different amounts of dermal pigment.10,11 (9) Melanin in deep cells, defined as the presence of melanin in melanocytes near the base of the neoplasm10,11; the feature was not graded: it was considered as present even if only small numbers of melanin granules were seen in the cytoplasm of deep melanocytes. 
(10) Cytological atypia, defined as enlarged melanocytic nuclei (larger than keratinocytic ones), variable in size and in shape, hyperchromatic, with eosinophilic or amphophilic nucleoli8,9,10,11,15; atypia was not graded to avoid subjective evaluations: it was considered as present whether it was slight/moderate or severe. (11) Mitoses,8,9,10,11 defined as present when at least one was seen in the dermal component of the lesion. (12) Necrosis, defined as present when dermal necrotic melanocytes were seen.8,9,10,11 (13) Dermal lymphocytic infiltrate, defined as present when a dermal lymphocytic infiltrate was evident underlying and/or in the context of the melanocytic proliferation15; the feature was not graded: it was considered as present whether it was sparse or pronounced and forming a continuous band in the dermis. The parameters, proposed by the coordinator of the study (CU), were individually discussed in detail and finally approved by all the members of the panel.
Cases were examined at a multiheaded microscope. Each case was analysed for all the given features, which were evaluated on the basis of a yes/no decision, as present or absent. The dimension of the lesion was measured by one of us (CU) and the results were checked by all the other members of the panel. When a feature could not be evaluated in a given case (for example, absence of maturation in an epidermal melanocytic lesion or melanin in deep cells in a superficially compound one), it was considered as non-applicable. For each lesion, participants individually recorded their own evaluations and their final diagnosis on a special form. Each observer evaluated the cases independently in a blinded fashion; that is, without knowledge of which institution had provided the slide and of the other investigators’ evaluations. No clinical data were provided. Subsequently, data were collected and processed. In each lesion, each feature was considered as present if at least five of the nine observers had agreed on its presence.
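The majority rule described above can be illustrated with a minimal sketch. The function name and the example votes are hypothetical and are not taken from the study data; cases scored as non-applicable are assumed to have been set aside beforehand, because the text does not specify how they entered the count.

```python
def majority_present(votes, threshold=5):
    """Return True when at least `threshold` observers scored the feature as present.

    `votes` holds one yes/no (True/False) decision per observer. The default
    threshold of five corresponds to a simple majority of nine observers.
    """
    return sum(1 for v in votes if v) >= threshold


# Hypothetical yes/no decisions of nine observers for one feature in one lesion.
example_votes = [True, True, False, True, True, False, True, False, False]
consensus = majority_present(example_votes)
```

With five "present" votes out of nine, the feature in this invented example would be recorded as present.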
For the statistical study, data were computerised and analysed by the statistical package SAS (SAS Institute Inc, Cary, North Carolina, USA; The logistic procedure, SAS/STAT user’s guide, Release 6.03. Cary, NC: SAS Institute, 1988). For the measurement of concordance between the nine pathologists, the evaluations of each pathologist were compared with those of each of the other members of the panel. Agreement among the nine pathologists was assessed using the κ statistic, which is a widely used index for measuring chance corrected agreement on a nominal or ordinal scale.16 Landis and Koch have characterised different ranges of values for κ with respect to the degree of agreement they suggest.17 For most purposes, values greater than 0.75 represent excellent agreement beyond chance, values below 0.40 represent poor agreement beyond chance, and values between 0.40 and 0.75 represent fair to good agreement beyond chance. A κ value close to 1 means near perfect agreement, whereas a κ value close to 0 does not necessarily mean that agreement is poor, but only that it is not greater than that expected by chance alone. Negative values denote less than chance agreement. For each characteristic considered, a 2 × 2 diagnostic table was composed, using dichotomous categories, and specific κ values were calculated. An overall value of κ, as an average of the individual κ values, was also calculated. Finally, the performance of each pathologist for each parameter considered was assessed by comparing his/her ratings with the majority diagnosis. The κ values obtained by comparison with the majority diagnosis are an estimate of the agreement between each pathologist and the “true” diagnosis.
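As an illustration of the calculation outlined above, the following sketch computes Cohen's κ from the 2 × 2 table of two raters, averages it over all pairs of raters, and compares each rater with the majority rating. It is a sketch under stated assumptions, not the SAS procedure used in the study: the function names and the example ratings are hypothetical, and the exact κ variant computed by the authors is not specified in the text.

```python
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two raters' yes/no ratings of the same cases."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("ratings must be non-empty and of equal length")
    # Diagonal cells of the 2 x 2 table: both say yes, both say no.
    both_yes = sum(1 for x, y in zip(a, b) if x and y)
    both_no = sum(1 for x, y in zip(a, b) if not x and not y)
    p_observed = (both_yes + both_no) / n
    # Chance agreement from each rater's marginal proportion of "yes".
    a_yes = sum(1 for x in a if x) / n
    b_yes = sum(1 for y in b if y) / n
    p_chance = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    if p_chance == 1:
        return float("nan")  # kappa is undefined when neither rater varies
    return (p_observed - p_chance) / (1 - p_chance)


def mean_pairwise_kappa(ratings_by_rater):
    """Average of Cohen's kappa over every pair of raters."""
    kappas = [cohen_kappa(a, b) for a, b in combinations(ratings_by_rater, 2)]
    return sum(kappas) / len(kappas)


def kappa_vs_majority(ratings_by_rater, threshold):
    """Kappa of each rater against the majority rating of each case."""
    majority = [sum(1 for v in case if v) >= threshold
                for case in zip(*ratings_by_rater)]
    return [cohen_kappa(r, majority) for r in ratings_by_rater]


# Hypothetical ratings of ten cases by three raters (1 = present, 0 = absent).
rater_1 = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
rater_2 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
rater_3 = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
overall = mean_pairwise_kappa([rater_1, rater_2, rater_3])
per_rater = kappa_vs_majority([rater_1, rater_2, rater_3], threshold=2)
```

In the study itself the panel had nine raters and the majority threshold was five; the three-rater example only keeps the arithmetic easy to follow.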
RESULTS
Sixteen of the 80 lesions studied were excluded. Two cases showing melanoma cells associated with naevus remnants were excluded because naevus cells could bias the evaluation of one of the parameters studied (absence of maturation); one case was excluded because it displayed spitzoid features (large spindle and epithelioid cells); two cases were excluded because of the poor quality of the histological technique. Eleven cases were excluded because the panel members could not come to a unanimous diagnostic agreement. Sixty four cases, in which the diagnosis of melanoma was unanimously confirmed by the panel, were studied. Table 1 shows the histological characteristics (histotype, Clark level, and thickness). Two of the nine melanomas in situ were of the lentigo maligna type and seven of the superficial spreading type. Fifty three of the 55 invasive melanomas were superficial spreading melanomas, one a nodular melanoma, and one a lentigo maligna melanoma. Tables 2 and 3 show data on the prevalence of the studied parameters. Cytological atypia (fig 1A), dermal lymphocytic infiltrate (fig 1B), and irregular and confluent nests (fig 1D) were present in all nine cases of melanoma in situ. Single melanocytes predominating (fig 1C) were found in eight, and suprabasal melanocytes (fig 1C) and asymmetry in seven of the nine cases. Poor circumscription of the lesion (fig 1E) was seen in six, dimension > 6 mm in four, and asymmetrical melanin in three of the nine cases (table 2). Cytological atypia was present in all 55 cases of invasive melanoma, dermal lymphocytic infiltrate in 54 cases, suprabasal melanocytes in 53 cases, and asymmetry in 52 cases. Irregular and confluent nests, single melanocytes predominating, dimension > 6 mm, poor circumscription of the lesion, asymmetrical melanin, melanin in deep cells (fig 1G), and absence of maturation (fig 1F) were seen in most lesions. Mitoses (fig 1A) and necrosis (fig 1H) were found less frequently (table 3).
Table 4 summarises the results of the agreement for each histological parameter considered (κ values). Agreement (reproducibility) between the nine observers was excellent (κ > 0.75) for 10 of the 13 examined features. The results for asymmetry (κ = 0.74), cytological atypia (κ = 0.65), and dermal lymphocytic infiltrate (κ = 0.47) were in the fair to good agreement range (table 4). The κ values obtained by comparison with the majority diagnosis were > 0.75; the mean value of κ was less than 0.75 for one parameter only (dermal lymphocytic infiltrate; table 5). The κ values of absence of maturation, melanin in deep cells, mitoses, and necrosis were calculated in the 55 cases of invasive melanoma with a dermal melanocytic component (tables 4, 5).
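The placement of the three less reproducible features can be checked against the κ ranges quoted in the methods with a short sketch. The function name is hypothetical, and the treatment of the exact boundary values (0.40 and 0.75 counted as fair to good) is an assumption, because the text does not state it.

```python
def agreement_band(kappa):
    """Map a kappa value to the agreement ranges quoted in the methods."""
    if kappa > 0.75:
        return "excellent"
    if kappa < 0.40:
        return "poor"
    # Assumption: the boundary values 0.40 and 0.75 fall in this band.
    return "fair to good"


# Kappa values reported in the text for the three features below the 0.75 cut off.
reported = {
    "asymmetry": 0.74,
    "cytological atypia": 0.65,
    "dermal lymphocytic infiltrate": 0.47,
}
bands = {feature: agreement_band(k) for feature, k in reported.items()}
```

All three reported values fall between 0.40 and 0.75, which matches their description as being in the fair to good agreement range.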
DISCUSSION
The histopathological diagnosis of cutaneous melanoma is a multifactorial process, based on the recognition of several architectural, cytological, and dermal histological criteria, identified over a long period of time.1,2,3,4,5,6,7,8,9,10,11 A melanocytic lesion is diagnosed as melanoma in the presence of a combination of histological criteria, none of which by itself is diagnostic of melanoma.12 In a subset of cases, however, the diagnosis may be particularly difficult for several reasons, which have not yet been adequately clarified. Among these is the inconsistent presence of diagnostic histological criteria in a given lesion. Although possible in theory, in practice a single melanoma will not show all the diagnostic parameters commonly used, but will generally exhibit only a certain number of them, and can sometimes show conflicting characteristics.14 Another factor is that almost all the criteria used are frequently found in benign melanocytic lesions, including special naevi, such as Spitz or Reed naevi, and common naevi.18–20 Finally, a further factor may be the poor reproducibility of the parameters currently used.
Several studies have investigated the interobserver reproducibility of diagnostic and prognostic criteria in melanocytic lesions.12,21,22,23,24 In our present study, we analysed 13 histological features used in the diagnosis of melanoma in a series of malignant melanomas. The statistical study of the agreement between the nine observers, using κ statistics, showed that 10 of the investigated features (dimension >6 mm, poor circumscription, irregular confluent nests, single melanocytes predominating, absence of maturation, suprabasal melanocytes, asymmetrical melanin, melanin in deep cells, mitoses, and necrosis) showed high reproducibility (κ > 0.75); asymmetry (κ = 0.74), cytological atypia (κ = 0.65), and dermal lymphocytic infiltrate (κ = 0.47) were less reproducible (table 4). The relatively low κ value for cytological atypia (κ = 0.65) resulted from a strict interpretation of the definition by some pathologists, who considered such a parameter absent in cases where the melanocytes did not show all the alterations required (enlargement of the nuclei, nuclear variation of size and shape, nuclear hyperchromatism, eosinophilic or amphophilic nucleoli). The relatively lower reproducibility of the dermal lymphocytic infiltrate was confirmed by comparison with the majority diagnosis (κ = 0.70; table 5). In a previous study, the interobserver reproducibility of some features (symmetry, sharp circumscription, pagetoid infiltration, and nuclear atypia) was lower than that obtained in our present study, but results are not comparable, because the parameter definitions and methods used were different.23 The good degree of reproducibility obtained in our present study resulted from an accurate definition and a preliminary discussion about each parameter; this excellent reproducibility may also reflect the fact that the pathologists in our study were all dermatopathologists with a special interest in melanoma. 
In conclusion, the relatively high degree of agreement between observers suggests that diagnostic discord in melanoma diagnosis is not the result of poor reproducibility of the current parameters.
Take home messages
We assessed the degree of interobserver agreement for 13 histological diagnostic criteria in 64 malignant melanomas and found that agreement between the nine observers was excellent (κ > 0.75) for 10 of the 13 features, one feature had a κ value close to excellence, and the other two were slightly lower, but in the fair to good agreement range
Thus, the relatively high degree of agreement between observers suggests that diagnostic discord in melanoma diagnosis is not the result of the poor reproducibility of current parameters