Abstract
Purpose: Sepsis is a life-threatening condition with high mortality and high treatment costs. To improve short- and long-term outcomes, it is critical to identify patients at risk of sepsis at an early stage.
Methods: A dataset of high-frequency physiological data from 1161 critically ill patients was analyzed. Of these, 377 patients developed sepsis and had data available for at least 3 h prior to sepsis onset. A random forest classifier was trained to discriminate between sepsis and non-sepsis patients in real time using a total of 132 features extracted from a moving time window. The model was trained on 80% of the patients and tested on the remaining 20%, for two observational periods of 3 h and 6 h prior to onset.
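The moving-window feature extraction and the patient-level 80/20 split described above can be sketched as follows. This is a minimal illustration only: the abstract does not list the 132 features, the window length, or the step size, so the four summary statistics, the `window` and `step` values, and the heart-rate samples are all hypothetical. The classifier itself is omitted; a random forest implementation such as scikit-learn's `RandomForestClassifier` could be fitted on the resulting feature rows.

```python
import random
import statistics


def window_features(signal, window, step):
    """Compute summary statistics over a moving window of one signal.

    The choice of statistics is an assumption for illustration; the
    abstract does not specify which features were extracted.
    """
    rows = []
    for start in range(0, len(signal) - window + 1, step):
        chunk = signal[start:start + window]
        rows.append({
            "mean": statistics.fmean(chunk),
            "std": statistics.pstdev(chunk),
            "min": min(chunk),
            "max": max(chunk),
        })
    return rows


def split_patients(patient_ids, train_fraction=0.8, seed=0):
    """Split at the patient level so no patient is in both sets."""
    ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(ids)
    cut = int(round(train_fraction * len(ids)))
    return ids[:cut], ids[cut:]


# Hypothetical heart-rate samples for one patient (not study data).
heart_rate = [80, 82, 85, 90, 96, 103]
features = window_features(heart_rate, window=4, step=1)

# 1161 patients, as in the study; the integer ids are placeholders.
train_ids, test_ids = split_patients(range(1161))
```

Splitting by patient rather than by window keeps all windows from one patient on the same side of the split, which avoids leaking a test patient's data into training.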
Results: The model that used continuous physiological data alone achieved a sensitivity of up to 80% and an F1 score of up to 67% one hour before sepsis onset. On average, these models were able to predict sepsis 294.19 ± 6.50 min (approximately 5 h) before onset.
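For readers relating the two reported metrics, the F1 score is the harmonic mean of precision and sensitivity (recall), so a precision value can be back-calculated from them. The sketch below assumes, purely for illustration, that the 80% sensitivity and 67% F1 score refer to the same operating point; the abstract reports both as "up to" values and does not state this, and the function names are hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (sensitivity)."""
    return 2 * precision * recall / (precision + recall)


def implied_precision(f1, recall):
    """Solve F1 = 2PR / (P + R) for the precision P."""
    return f1 * recall / (2 * recall - f1)


# Assumes both reported values come from the same operating point.
precision = implied_precision(f1=0.67, recall=0.80)
```

Under that assumption the implied precision is roughly 0.58, meaning that a little over half of the alerts would correspond to patients who went on to develop sepsis.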
Conclusions: Applying machine learning algorithms to continuous streams of physiological data can enable early, real-time identification of at-risk patients with high accuracy.