ICU Patient State Characterization Using
Machine Learning in a Time Series Framework
Daniel Calvelo¹, Marie-C. Chambrin², Denis Pomorski¹, and Pierre Ravaux²
¹ Laboratoire d'Automatique et Informatique Industrielle de Lille, CNRS ESA 8021
² Université de Lille 2
Abstract. We present a methodology for the study of real-world time-
series data using supervised machine learning techniques. It is based
on the windowed construction of dynamic explanatory models, whose
evolution over time points to state changes. It has been developed to
suit the needs of data monitoring in the adult Intensive Care Unit, where
data are highly heterogeneous. Changes in the built model are considered
to reflect the underlying system state transitions, whether of intrinsic or
exogenous origin. We apply this methodology after making choices based
on field knowledge and ex-post corroborated assumptions. The results
appear promising, although an extensive validation should be performed.
1 Introduction
We seek to identify stable ICU patient states within the recordings of monitored
data. We propose the following framework: local, window-based models
will be built using a carefully chosen modeling system. The built models will be
characterized by a set of indicators. The time variation of these indicators will
show stationarity violations with respect to the model class considered.
2 Machine Learning from Raw Data
The windowed approach is a classic means of dealing with non-stationarity. By
carefully choosing a modeling system that focuses on relevant information, we
introduce an abstraction layer on which state changes are detected.
Machine learning modeling systems [1] seem adapted to this framework:
• they span a large class of models, including hybrid numeric/symbolic models;
• they are mostly non-parametric, avoiding arbitrary parameter settings;
• they offer in most cases explicit models that can be integrated into a knowledge framework.
Furthermore, the data we have at our disposal — issued from common ICU
monitoring equipment and acquired by the AidDiag system [2] — contain
variables that are useful indicators of the overall patient's state.¹ These can be
used as observation variables, as opposed to measured state variables. We thus
chose to work with systems able to exploit this variable specialization; namely,
we concentrated on supervised learning techniques.

¹ Available variables include respiratory and hæmodynamic parameters, ventilator
settings, and blood gas measurements — these constitute our observation variables.
(As of October 1998, in the roughly 200 patient-days in the AidDiag database, the
median number of parameters recorded per session was 22.)

W. Horn et al. (Eds.): AIMDM'99, LNAI 1620, pp. 356–360, 1999.
© Springer-Verlag Berlin Heidelberg 1999
Ex-post validation of our methodology would ideally rely on an explicit
representation of the inferred information, which led us to use tree-induction
systems (see [3] for an up-to-date review): they express models explicitly in the
form of trees, and further transformation into rule sets enables the introduction
of field knowledge [4]. We chose to work with the C4.5 system [5], a well-studied
system for the induction of decision trees.
2.1 Adapting Machine Learning to Time-Series
To properly exploit our data, it is necessary to adapt this system, designed for
static classification, to a time-series dynamic modeling framework (see e.g. [6]
and [7] for other approaches).
The introduction of lagged variables [8] has been experimentally ruled out in
favor of derivative-like variables carrying the trend information. Indeed, supplementary
lagged variables result in heavily enlarged datasets and more complex,
harder-to-interpret models. Furthermore, trend has been proposed [9,10] as
the preferred means of describing physio-pathological processes.
2.2 Trend at Characteristic Scales
For the calculation of the trend, we hypothesize the existence of a characteristic
time-scale for each variable that separates short- from long-term behavior. A linear
filter, equivalent to a regression of univariate data with respect to time, is then
applied at this scale. It yields both the trend and an error variance, interpreted
as a stability indicator around the trend. This constitutes a projection of each
variable into the two-dimensional space of trend vs. stability. This space provides a
rich visual representation of the current and past dynamic states of each variable
(see Fig. 1-b).
From a learning dataset, we calculate the trends and errors for each scale, at
every point. A classic test yields the count of significant regressions as a function
of scale. We define the characteristic time-scale τ_r as the one beyond which no
better local linear approximations can be found. Other criteria have been tested
(e.g. taking the characteristic time-scale to be the one at which the best
piecewise-linear model is first found) and exhibit other properties (e.g. improved
robustness with respect to the size of the learning dataset), yet they do not fit as
well as the τ_r criterion in producing a derivative-like variable.
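The windowed regression and the significance count described above can be sketched as follows. This is an illustrative reading, not the original AidDiag implementation: the function names and the use of the regression p-value from scipy as the "classic test" are our assumptions.

```python
# Sketch of the trend/stability filter: one linear regression per window,
# plus the fraction of significant regressions at a given scale (Fig. 1-a).
import numpy as np
from scipy.stats import linregress

def trend_stability(window):
    """Regress one window against time: the slope is the trend, the
    residual standard deviation the stability indicator; the p-value
    measures the significance of the regression."""
    t = np.arange(len(window))
    fit = linregress(t, window)
    resid = window - (fit.slope * t + fit.intercept)
    return fit.slope, resid.std(), fit.pvalue

def significant_fraction(x, scale, alpha=0.01):
    """Fraction of length-`scale` windows whose regression is significant
    at risk `alpha` (the quantity plotted against scale in Fig. 1-a)."""
    pvals = [trend_stability(x[i:i + scale])[2]
             for i in range(len(x) - scale + 1)]
    return float(np.mean(np.array(pvals) < alpha))
```

The characteristic scale τ_r would then be read off the curve of `significant_fraction` over increasing scales, as the scale beyond which no better local linear approximations are found.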
2.3 Characterization of the Built Models
After filtering, we apply windowed decision tree construction and characterize
the resulting models using their complexity, their intrinsic error, and a presence
index that summarizes the decision tree morphology as the presence of each
variable. (As a validation argument for the introduction of trend variables,
the trees generated on the augmented dataset are much smaller and more
accurate than in the static case.)
Window size was determined by ex-post validation: having chosen a window
size deemed large enough to proceed with induction (which requires the estimation of
joint probabilities), variations of this size, from halving up to doubling, produced
the same qualitative results.
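The per-window characterization can be sketched as below. This is a hypothetical illustration: the paper uses C4.5, while we substitute scikit-learn's CART trees, and our presence index (the fraction of internal nodes testing each variable) is one plausible reading of the summarized presence, not the paper's exact definition.

```python
# Fit one tree per window and summarize it by complexity, intrinsic
# (training) error, and a per-variable presence index.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def characterize_window(X, y, feature_names):
    """X, y: observations and class labels for one window.
    Returns (# nodes, # training errors, presence index per variable)."""
    tree = DecisionTreeClassifier(random_state=0).fit(X, y)
    n_nodes = tree.tree_.node_count
    n_errors = int((tree.predict(X) != y).sum())
    feats = tree.tree_.feature          # -2 marks leaf nodes
    internal = feats[feats >= 0]
    presence = {name: float((internal == i).mean()) if internal.size else 0.0
                for i, name in enumerate(feature_names)}
    return n_nodes, n_errors, presence
```

Sliding this over consecutive windows yields the complexity, error, and presence-index traces of Fig. 1-c.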
Evaluation of the methodology can be performed, in part, by exploiting the data:
the zones we determine must correspond to behavioral stability of the dataset.
On the other hand, external actions that a priori change the patient's state
should be correlated with the transitions we identify.
3 Results
We illustrate the approach applied to a hand-documented dataset (Fig. 1).
Data are sampled at a 5 s period for eight hours, directly from the routine
monitoring of an adult ICU patient. Five seconds is the minimal period technically
available for a synchronized simultaneous acquisition from the available devices.
We have used a conservative approach with respect to missing data when
calculating the characteristic time-scale: missing data points, which lead to missing
values for any trend depending on them, are left out of the counts. Yet, for the
calculation of the trend variables, we did fill the one-sample gaps with the mean
of the surrounding points.² For the decision tree building itself, missing is a special
category.
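The one-sample gap filling can be sketched minimally; the behavior below (only isolated missing samples are interpolated, longer gaps stay missing) is our reading of the conservative approach just described.

```python
# Replace isolated NaN samples by the mean of their two neighbours;
# runs of two or more missing samples are left untouched.
import numpy as np

def fill_single_gaps(x):
    y = np.asarray(x, dtype=float).copy()
    nan = np.isnan(y)
    for i in range(1, len(y) - 1):
        if nan[i] and not nan[i - 1] and not nan[i + 1]:
            y[i] = 0.5 * (y[i - 1] + y[i + 1])
    return y
```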
The τ_r calculated from the whole dataset show groupings by physiological
subsystem: extremely fast behavior for VTe; the arterial pressures show close
τ_r, smaller than those of the airway pressures (Fig. 1-a). The orders of magnitude
of τ_r are always comparable for the same variable between different patients
(not shown here), and the relative ordering of the variables with respect to τ_r is
(loosely) preserved as well.
The windowed induction of decision trees is then applied, with a window
size of 1 h 24 min (1000 data points).
Expert and naïve visual inspection of the presence index map, alongside the error
and complexity evolution (Fig. 1-c), shows temporal zones that can be correlated
with observed external actions, such as changes in oxygenation levels and suctions.
Namely, the zones that can be visually detected (in Fig. 1 units, these are the
intervals [0; 40], [40; 260], [260; 500], [500; 700], [700; 880], [880; 1020], and
[1020; 1070]) correlate with suctions at the beginning of the second and third;
changes in incoming oxygenation occur at around point 500 (where bedside care
is also being done) and point 700; at 880, stabilization of the cardiac frequency
(hitherto decreasing) takes place; finally, bedside care happens from around 1020
onwards.
² This was done in order to minimize the number of large gaps in the augmented
dataset — each missing datum forbids the calculation of trend variables for τ_r
successive samples.
Moreover, the cross-entropy maps between all variables within each of the
aforementioned zones are clearly distinct, and stable within each zone.
The prediction errors of the locally built trees remain low within the identified
zones, and grow beyond them, showing their specificity to the considered time
zone.
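The paper does not detail how the cross-entropy maps are computed; as a loose illustration only, one could compare binned marginal distributions pairwise. The binning and smoothing choices below are entirely ours.

```python
# Pairwise cross-entropy m[i, j] = H(p_i, p_j) = -sum_b p_i(b) log p_j(b)
# between the binned marginal distributions of each pair of variables,
# estimated on a common bin grid within one temporal zone.
import numpy as np

def cross_entropy_map(data, bins=8):
    """data: array of shape (samples, variables); returns an
    (n_vars, n_vars) cross-entropy matrix."""
    lo, hi = float(data.min()), float(data.max())
    probs = []
    for k in range(data.shape[1]):
        h, _ = np.histogram(data[:, k], bins=bins, range=(lo, hi))
        h = h + 1e-9                     # smooth away empty bins
        probs.append(h / h.sum())
    p = np.array(probs)
    return -(p[:, None, :] * np.log(p[None, :, :])).sum(axis=2)
```

Comparing these maps across zones (distinct between zones, stable within one) is then a matter of matrix distance.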
[Figure 1: (a) fraction of significant regressions vs. time scale (mins), defining τ_r for each variable; (b) SpO2 projection in the trend (%.min⁻¹) vs. stability plane; (c) presence index map over all variables and their 'v'-prefixed trend counterparts, number of nodes, number of errors, and the SpO2 (%) time-series vs. time (in 25 s units).]
Fig. 1. Characteristic Scales and Windowed Processing of Dynamic Decision Trees.
a) Count of significant trend calculations as a function of scale. Risk is p < 10⁻².
Circled crosses show the characteristic scales τ_r. They are estimated at 95% below the
first maximum.
b) Projection of SpO2 into the trend vs. stability plane, at its characteristic scale τ_r.
Axes are labelled in percent units per minute. Lighter points are past; darker points
correspond to the end of the recording.
c) From top to bottom: presence index map (darker means stronger presence), complexity,
error, and (for reference) the time-series of the classification variable (pulse oximetry
SpO2) as a function of time.
Variables prefixed with 'v' denote supplementary trend variables. Abscissa values are
the median time points of each window. Window size is 200 graphical units (1 u = 25 s).
4 Conclusion and Perspectives
The database now at our disposal is too irregular (in patient, pathological, and
therapeutic characteristics) for any rigorous testing. Forthcoming investigation
protocols will provide a well-controlled individual and pathological framework in
which to validate this methodology.
In a well-known experimental environment, we should be able to separate
exogenous (well detected in the preceding illustration) from endogenous state
shifts. This would enable the study of the particular pathology as a succession
of states separated by transitions, defining each state as the configuration of
relationships between measured variables. In-depth analysis within each state
will hopefully give insights into the phases of evolution of the disease.
References
1. Y. Kodratoff, R. Michalski, Machine Learning: An Artificial Intelligence Approach,
Vol. III, Morgan Kaufmann, 1990.
2. P. Ravaux, M.C. Chambrin, A. Jaborska, C. Vilhelm, M. Boniface, AIDDIAG: Un
Système d'Aide au Diagnostic Utilisant l'Acquisition de la Connaissance, Biometric
Bulletin 11(3):10, 1994.
3. S.K. Murthy, Automatic construction of decision trees from data: A multi-disciplinary
survey, to appear in Data Mining and Knowledge Discovery 2(4), 1999.
4. G. Holmes, A. Donkin, I.H. Witten, WEKA: A Machine Learning Workbench, Proc.
Second Australia and New Zealand Conference on Intelligent Information Systems,
Brisbane, Australia, 1994.
5. J.R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, 1992.
6. R.S. Mitchell, Application of Machine Learning Techniques to Time-Series Data,
Working Paper 95/15, Computer Science Department, University of Waikato, New
Zealand, 1995.
7. L. Torgo, Applying Propositional Learning to Time Series Prediction, in Y. Kodratoff
et al., Workshop on Statistics, Machine Learning and Knowledge Discovery in
Databases, ECML-95, 1995.
8. D. Pomorski, M. Staroswiecki, Analysis of Dynamical Systems Based on Information
Theory, World Automation Congress (WAC'96), Montpellier, May 27–30, 1996.
9. I.J. Haimovitz, I. Kohane, Managing temporal worlds for medical trend diagnosis,
Artificial Intelligence in Medicine 8(3), 1996.
10. F. Steimann, The interpretation of time-varying data with DiaMon-1, Artificial
Intelligence in Medicine 8(4), Aug. 1996.