Beyond the routine CBC: machine learning and statistical analyses identify research CBC parameter associations with myelodysplastic syndromes and specific underlying pathogenic variants
  1. Olga Pozdnyakova1,
  2. Radu Stefan Niculescu2,
  3. Tracey Kroll3,
  4. Lisa Golemme1,
  5. Nolan Raymond1,
  6. Debra Briggs4,
  7. Annette Kim1
  1. 1 Department of Pathology, Brigham and Women's Hospital, Boston, Massachusetts, USA
  2. 2 PTC, Wayne, Pennsylvania, USA
  3. 3 PTC, Wayne, South Carolina, USA
  4. 4 Dana Farber Cancer Institute, Boston, Massachusetts, USA
  1. Correspondence to Dr Olga Pozdnyakova, Pathology, Brigham and Women's Hospital Department of Pathology, Boston, Massachusetts, USA; opozdnyakova{at}


Aims Given the time, expense and clinical expertise required for a myelodysplastic syndrome (MDS) diagnosis, there is a clear need for a cost-effective screening laboratory test that can rapidly and accurately distinguish patients with cytopenias related to MDS from those with cytopenias due to other causes.

Methods We measured conventional and research-use-only complete blood count (CBC) parameters using the Sysmex XN-series haematology analyser in 102 MDS patients (70 patients with active MDS and 32 patients in remission), 43 patients with cytopenia without morphological evidence of MDS and 484 age-adjusted controls. A variety of algorithms, including random forest machine learning, were used to construct parameter-based models to predict the presence of MDS using either CBC and molecular data together or CBC data alone. We also correlated individual pathogenic variants/genetic pathways with changes in CBC parameters.

Results Using the CBC parameters alone, our predictive model for active MDS achieved an area under the receiver operating characteristic curve (ROC AUC) of 0.86, with 0.87 sensitivity and 0.72 specificity; with the addition of molecular and demographic status, the ROC AUC improved to 0.93, sensitivity to 0.89 and specificity to 0.84. The most discriminatory MDS parameters reflected dysplastic neutrophil morphology, red cell count fragmentation and degree of platelet immaturity. Specific patterns of parameters were associated with individual gene pathogenic variants or affected pathways.
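For readers less familiar with the performance measures quoted above, the following is a minimal sketch of how sensitivity, specificity and ROC AUC can be computed from a model's predicted scores. It is not the authors' analysis pipeline: the function names, the 0.5 decision threshold and the toy labels and scores are all hypothetical and chosen only to make the arithmetic easy to follow. The AUC is computed here as the fraction of (positive, negative) pairs in which the positive case receives the higher score, which is one standard equivalent definition.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when scores >= threshold are called positive."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1:
            if predicted_positive:
                tp += 1  # disease present, correctly flagged
            else:
                fn += 1  # disease present, missed
        else:
            if predicted_positive:
                fp += 1  # disease absent, wrongly flagged
            else:
                tn += 1  # disease absent, correctly cleared
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT study data): 1 = disease present, 0 = absent.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

sens, spec = sensitivity_specificity(toy_labels, toy_scores, threshold=0.5)
auc = roc_auc(toy_labels, toy_scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.4f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarises ranking quality across all thresholds; this is why a model can be reported with a single AUC but different sensitivity/specificity pairs.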

Conclusions CBC research parameters can be used as an adjunct to the haematological workup of cytopenia(s) to help screen for patients with high likelihood of MDS.

  • hematology
  • myelodysplastic syndromes
  • information technology
  • cell count
  • morphological and microscopic findings

Data availability statement

All data relevant to the study are included in the article or uploaded as online supplemental information.



  • Handling editor Tahir S Pillay.

  • Contributors OP and AK: designed the study, collected data, performed analyses and wrote the paper; RSN and TK: designed statistical platform and performed analyses; LG, NR and DB: collected data. OP is acting as guarantor.

  • Funding This work was supported through a research grant from Sysmex America.

  • Disclaimer This financial support had no influence on the interpretation of the results or findings stated by the authors.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.