Beyond the routine CBC: machine learning and statistical analyses identify research CBC parameter associations with myelodysplastic syndromes and specific underlying pathogenic variants


Aims Given the time, expense and clinical expertise required for a myelodysplastic syndrome (MDS) diagnosis, there is a clear need for a cost-effective screening laboratory test that can rapidly and accurately distinguish patients with cytopenias related to MDS from those with cytopenias due to other causes.

Methods We measured conventional and research-use-only complete blood count (CBC) parameters using the Sysmex XN-series haematology analyser in 102 MDS patients (70 patients with active MDS and 32 patients in remission), 43 patients with cytopenia without morphological evidence of MDS and 484 age-adjusted controls. A variety of algorithms, including random forest machine learning, were used to construct parameter-based models to predict the presence of MDS using either CBC and molecular data together or CBC data alone. We also correlated individual pathogenic variants and genetic pathways with changes in CBC parameters.

Results Using the CBC parameters alone, our predictive model for active MDS achieved an area under the receiver operating characteristic curve (ROC AUC) of 0.86, with a sensitivity of 0.87 and a specificity of 0.72; with the addition of molecular and demographic status, the ROC AUC improved to 0.93, sensitivity to 0.89 and specificity to 0.84. The most discriminatory parameters for MDS reflected dysplastic neutrophil morphology, red cell count fragmentation and degree of platelet immaturity. Specific patterns of parameters were associated with individual gene pathogenic variants or affected pathways.
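For readers less familiar with the metrics reported above, the following minimal sketch shows how ROC AUC, sensitivity and specificity can be computed from a classifier's predicted scores. It is an illustration only, not the study's code: the labels, scores and the 0.5 threshold are hypothetical, and the function names `roc_auc` and `sensitivity_specificity` are ours. The AUC is computed as the proportion of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as half.

```python
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= threshold is called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data: 1 = MDS, 0 = non-MDS; scores are model probabilities.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

auc = roc_auc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
print(f"ROC AUC={auc:.4f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarises performance across all thresholds, which is why the three figures are reported together.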

Conclusions CBC research parameters can be used as an adjunct to the haematological workup of cytopenia(s) to help screen for patients with a high likelihood of MDS.

  • hematology
  • myelodysplastic syndromes
  • information technology
  • cell count
  • morphological and microscopic findings

Data availability statement

All data relevant to the study are included in the article or uploaded as online supplemental information.
