COVID-19 Evidence Alerts
from McMaster PLUS™

Current best evidence for clinical care

Clinical Prediction Guide

Hu C, Liu Z, Jiang Y, et al. Early prediction of mortality risk among patients with severe COVID-19, using machine learning. Int J Epidemiol. 2021 Jan 23;49(6):1918-1929. doi: 10.1093/ije/dyaa171.

BACKGROUND: Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 infection, has been spreading globally. We aimed to develop a clinical model to predict the outcome of patients with severe COVID-19 infection early.

METHODS: Demographic, clinical and first laboratory findings after admission of 183 patients with severe COVID-19 infection (115 survivors and 68 non-survivors from the Sino-French New City Branch of Tongji Hospital, Wuhan) were used to develop the predictive models. Machine learning approaches were used to select the features and predict the patients' outcomes. The area under the receiver operating characteristic curve (AUROC) was applied to compare the models' performance. A total of 64 patients with severe COVID-19 infection from the Optical Valley Branch of Tongji Hospital, Wuhan, were used to externally validate the final predictive model.
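
As a reminder of what the AUROC measures, the following is a minimal sketch, not the authors' code: it computes the AUROC as the probability that a randomly chosen non-survivor receives a higher predicted risk than a randomly chosen survivor, with ties counted as one half. The function name `auroc` and the toy labels and scores are invented for illustration and are not data from the study.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison of positives (1 = died) and negatives (0 = survived).

    Equals the fraction of positive/negative pairs in which the positive
    case has the higher score; tied pairs contribute 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (invented values, NOT study data).
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
toy_auroc = auroc(toy_labels, toy_scores)
```

In the toy data, 11 of the 12 non-survivor/survivor pairs are ranked correctly, so the function returns 11/12. A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking.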

RESULTS: The baseline characteristics and laboratory tests were significantly different between the survivors and non-survivors. Four variables (age, high-sensitivity C-reactive protein level, lymphocyte count and d-dimer level) were selected by all five models. Given the similar performance among the models, the logistic regression model was selected as the final predictive model because of its simplicity and interpretability. The AUROC of the external validation set was 0.881. The sensitivity and specificity were 0.839 and 0.794 for the validation set, when using a probability of death of 50% as the cutoff. A risk score based on the selected variables can be used to assess the mortality risk. The predictive model is available at [].
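
To make the 50% cutoff concrete, here is a hedged sketch of how sensitivity and specificity follow from predicted probabilities of death once a threshold is fixed. It is illustrative only: the function name, the choice to classify a probability exactly at the cutoff as a predicted death, and the toy values are all assumptions, not details reported in the abstract.

```python
def sensitivity_specificity(labels, probs, cutoff=0.5):
    """Return (sensitivity, specificity) for a given probability cutoff.

    labels: 1 = died, 0 = survived.
    probs:  predicted probability of death for each patient.
    Assumption: a probability >= cutoff counts as a predicted death.
    """
    tp = fn = tn = fp = 0
    for y, p in zip(labels, probs):
        predicted_death = p >= cutoff
        if y == 1 and predicted_death:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_death:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one death and one survivor")
    return tp / (tp + fn), tn / (tn + fp)


# Toy example (invented values, NOT study data).
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0]
toy_probs = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
sens, spec = sensitivity_specificity(toy_labels, toy_probs, cutoff=0.5)
```

In the toy data, 3 of 4 deaths are flagged (sensitivity 0.75) and 4 of 5 survivors are correctly cleared (specificity 0.80). Lowering the cutoff would generally trade specificity for sensitivity.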

CONCLUSIONS: Age, high-sensitivity C-reactive protein level, lymphocyte count and d-dimer level of COVID-19 patients at admission are informative for the patients' outcomes.

Discipline / Specialty Area Score
Infectious Disease
Intensivist/Critical Care
Hospital Doctor/Hospitalists
Internal Medicine

Comments from MORE raters

Hospital Doctor/Hospitalists rater

I don't think the AI aspect of the study adds anything new to the prognostication in the study.

Infectious Disease rater

There are many papers looking at predictive factors of outcome in COVID-19 infected patients. The authors have validated their data with a validation cohort. The prognostic factors are not surprising.

Internal Medicine rater

This is a fast-moving target, and I question how relevant these results are today. To not have obesity enter into the model raises a face validity problem. We no longer check serial D-dimers because we didn't find the results to be that helpful. So, I don't think this information is that relevant to what we see in our institution.

Internal Medicine rater

Need to disseminate these data as they arise.

Respirology/Pulmonology rater

The idea of the study is good, but the sample size is very small. The results need to be validated in other cohorts and with a larger sample size.

Respirology/Pulmonology rater

This machine learning (ML) study found that in patients hospitalized with COVID-19, 4 variables (age, hsCRP, lymphocyte count, and D-dimer) predicted with reasonable accuracy those who would die. Some of the associations (e.g., age) are well known already but others are not. If true, finding that these 4 variables are sufficient for prognosis would be novel. There are, however, important methodologic concerns here, starting with the small sample size. This is likely why they did not use tuning and testing data sets to refine the model before external validation, as is usually done in ML studies. The optimal model chosen ended up being logistic regression, which might make the ML issues moot; however, I have trouble believing that such a model could be stable and still provide a proper test of so many candidate variables with so few events. Also, they excluded patients based on future information (hospital stay <7 days) or >30% missing data, which is concerning for bias.