Development of a severity of disease score and classification model by machine learning for hospitalized COVID-19 patients

PLoS One. 2021 Apr 21;16(4):e0240200. doi: 10.1371/journal.pone.0240200. eCollection 2021.

Abstract

Background: Efficient and early triage of hospitalized Covid-19 patients to detect those with higher risk of severe disease is essential for appropriate case management.

Methods: We trained, validated, and externally tested a machine-learning model for early identification of patients who will die or require mechanical ventilation during hospitalization, using clinical and laboratory features obtained at admission. A development cohort of 918 Covid-19 patients was used for training and internal validation, and 352 patients from another hospital were used for external testing. Model performance was evaluated by calculating the area under the receiver-operating-characteristic curve (AUC), sensitivity, and specificity.
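For readers unfamiliar with the three performance measures named above, the following is a minimal sketch of how they can be computed from predicted risk scores. It is not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration. The AUC is computed here in its rank-based form, as the probability that a randomly chosen patient with the outcome receives a higher score than a randomly chosen patient without it.

```python
from typing import Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= threshold is called high risk."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (hypothetical, not from the study): 1 = died or ventilated.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.5, 0.2, 0.1, 0.7]

toy_auc = auc(toy_labels, toy_scores)
toy_sens, toy_spec = sensitivity_specificity(toy_labels, toy_scores, 0.5)
```

On the toy data, 15 of the 16 positive-negative pairs are ranked correctly, so the AUC is 0.9375, and the 0.5 threshold gives a sensitivity and specificity of 0.75 each. The pairwise loop is quadratic in cohort size; it is chosen for readability rather than efficiency.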

Results: A total of 363 of 918 (39.5%) and 128 of 352 (36.4%) Covid-19 patients from the development and external testing cohorts, respectively, required mechanical ventilation or died during hospitalization. In the development cohort, the model achieved an AUC of 0.85 (95% confidence interval [CI], 0.82 to 0.87) for predicting severe disease progression. Variables, ranked by their contribution to the model, were the peripheral blood oxygen saturation (SpO2)/fraction of inspired oxygen (FiO2) ratio, age, estimated glomerular filtration rate, procalcitonin, C-reactive protein, updated Charlson comorbidity index, and lymphocytes. In the external testing cohort, the model achieved an AUC of 0.83 (95% CI, 0.81 to 0.85). The model is deployed in an open-source calculator, in which Covid-19 patients are individually stratified at admission as being at high or non-high risk for severe disease progression.
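The arithmetic behind two of the quantities above can be sketched briefly. The event rates follow directly from the reported counts. The SpO2/FiO2 ratio is shown under the common convention of dividing SpO2 in percent by FiO2 as a fraction; the abstract does not state the units the authors used, so that convention, the function names, and the example patient values are assumptions.

```python
def event_rate_percent(events: int, total: int) -> float:
    """Percentage of patients who experienced the outcome."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * events / total


def spo2_fio2_ratio(spo2_percent: float, fio2_fraction: float) -> float:
    """SpO2/FiO2 ratio, assuming SpO2 in percent and FiO2 as a fraction (0.21 to 1.0)."""
    if not 0.21 <= fio2_fraction <= 1.0:
        raise ValueError("FiO2 is expected as a fraction between 0.21 and 1.0")
    return spo2_percent / fio2_fraction


# Counts reported in the abstract.
development_rate = round(event_rate_percent(363, 918), 1)  # 39.5
external_rate = round(event_rate_percent(128, 352), 1)     # 36.4

# Hypothetical patients, for illustration only.
room_air = spo2_fio2_ratio(95, 0.21)   # breathing room air
on_oxygen = spo2_fio2_ratio(92, 0.40)  # receiving supplemental oxygen
```

A lower ratio indicates worse oxygenation: the hypothetical patient on room air has a ratio of about 452, whereas the patient on supplemental oxygen has a ratio of 230.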

Conclusions: This machine-learning model, applied at hospital admission, predicts risk of severe disease progression in Covid-19 patients.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Aged
  • Area Under Curve
  • COVID-19 / classification*
  • COVID-19 / diagnosis
  • COVID-19 / epidemiology
  • COVID-19 / therapy
  • Cohort Studies
  • Female
  • Forecasting
  • Hospitalization / statistics & numerical data
  • Humans
  • Machine Learning*
  • Male
  • Middle Aged
  • Models, Statistical
  • ROC Curve
  • Respiration, Artificial
  • Retrospective Studies
  • Risk Assessment
  • SARS-CoV-2 / isolation & purification
  • Severity of Illness Index
  • Spain / epidemiology
  • Triage / methods

Grants and funding

This work was partially supported by Instituto de Salud Carlos III, Ministerio de Ciencia e Innovación (Madrid, Spain) and FEDER Funds “Una manera de hacer Europa”, by grants CIBERCV CB16/11/00374 to Pedro-Luis Sánchez and RD16/0017/0023 to Miguel Marcos, and by Institute of Biomedical Research of Salamanca (IBSAL) through a special grant for Covid-19 research.