
Graph neural networks for clinical risk prediction based on electronic health records: A survey

Journal of Biomedical Informatics 151 (2024) 104616
Available online 27 February 2024
1532-0464/© 2024 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Journal of Biomedical Informatics
journal homepage: www.elsevier.com/locate/yjbin
Heloísa Oss Boll a,b,*, Ali Amirahmadi b, Mirfarid Musavian Ghazani b,
Wagner Ourique de Morais b, Edison Pignaton de Freitas a, Amira Soliman b, Farzaneh Etminani b,
Stefan Byttner b, Mariana Recamonde-Mendoza a,c
a Institute of Informatics, Universidade Federal do Rio Grande do Sul, Avenida Bento Gonçalves, 9500, Porto Alegre, 91501-970, RS, Brazil
b School of Information Technology, Halmstad University, Kristian IV:s väg 3, Halmstad, 301 18, Sweden
c Bioinformatics Core, Hospital de Clínicas de Porto Alegre (HCPA), Av. Protásio Alves, 211, Bloco C, Porto Alegre, 90035-903, RS, Brazil
ARTICLE INFO
Keywords:
Graph neural networks
Electronic health records
Deep learning
Artificial intelligence
Graph representation learning
ABSTRACT
Objective: This study aims to comprehensively review the use of graph neural networks (GNNs) for clinical
risk prediction based on electronic health records (EHRs). The primary goal is to provide an overview of
the state-of-the-art of this subject, highlighting ongoing research efforts and identifying existing challenges in
developing effective GNNs for improved prediction of clinical risks.
Methods: A search was conducted in the Scopus, PubMed, ACM Digital Library, and Embase databases to
identify relevant English-language papers that used GNNs for clinical risk prediction based on EHR data. The
study includes original research papers published between January 2009 and May 2023.
Results: Following the initial screening process, 50 articles were included in the data collection. A significant
increase in publications from 2020 was observed, with most selected papers focusing on diagnosis prediction (n
= 36). The study revealed that the graph attention network (GAT) (n = 19) was the most prevalent architecture,
and MIMIC-III (n = 23) was the most common data resource.
Conclusion: GNNs are relevant tools for predicting clinical risk by accounting for the relational aspects among
medical events and entities and managing large volumes of EHR data. Future studies in this area may address
challenges such as EHR data heterogeneity, multimodality, and model interpretability, aiming to develop more
holistic GNN models that can produce more accurate predictions, be effectively implemented in clinical settings,
and ultimately improve patient care.
1. Introduction
Electronic health records (EHRs) are extensive, heterogeneous, and
longitudinal repositories that document patients’ health, including
symptoms, prescriptions, clinical notes, and medical images. With the
increase in EHR data collection, there is growing interest in leveraging
this information to improve patient care, especially in the context
of clinical risk prediction [1]. Recent machine learning approaches
focused on predicting events such as disease diagnoses, mortality, and
hospital readmissions have been relevant to this endeavor [2,3].
Despite the rich information present in EHRs, translating it into
actionable insights presents challenges due to data-related problems
such as heterogeneity (multiple types of medical attributes describing
a patient), high dimensionality (a large number of attributes associated
with a patient), quality (missing values and inconsistencies), and
temporal dynamics (numerous patient encounters and timestamped clinical
events) [1,4–7]. Considering that the success of machine learning models
depends largely on an adequate representation of the input data,
studies in representation learning (the process of learning expressive
representations of the input data for improved performance of predictors
[8]) are paramount for effectively transforming patient data from
the raw EHR format into adequate representations that fully capture
the patients' health status [1].
Abbreviations: EHR, Electronic health record; GNN, Graph neural network; GCN, Graph convolutional network; GAT, Graph attention network; GAE, Graph
autoencoder; CNN, Convolutional neural network; RNN, Recurrent neural network; GRU, Gated recurrent unit; LSTM, Long short-term memory
* Corresponding author at: Institute of Informatics, Universidade Federal do Rio Grande do Sul, Avenida Bento Gonçalves, 9500, Porto Alegre, 91501-970, RS,
Brazil.
E-mail address: hoboll@inf.ufrgs.br (H. Oss Boll).
https://doi.org/10.1016/j.jbi.2024.104616
Received 22 September 2023; Received in revised form 21 February 2024; Accepted 23 February 2024
Fig. 1. Electronic health records contain a range of multimodal patient data. This information can be used for patient graph representations, focusing on a patient’s visit or medical
record. In the case of a visit, a hierarchical and homogeneous graph is shown, while medical records are made up of sequences of visit graphs. Alternatively, the entire EHR data
can be modeled as a heterogeneous graph, with different types of nodes and edges represented by different colors. In all examples, nodes and edges can have feature vectors
processed using GNNs for further clinical risk prediction tasks. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this
article.)
Recent deep learning techniques have effectively addressed these
challenges. Unlike traditional machine learning approaches, which rely
heavily on expert-driven feature engineering, deep learning models
can automatically extract meaningful latent feature representations
from complex raw data [9–11]. Among these, graph neural networks
(GNNs) stand out. The goal of graph representation learning is to
encode graphs into a low-dimensional vector space while preserving
topology and node properties [12]. In this sense, GNNs are particularly
adept at representing EHRs because they can capture the intricate
relationships and dependencies between medical entities to generate
rich, context-aware embeddings for further downstream tasks [13,14].
This is a promising feature in contrast to other machine learning and
deep learning algorithms, which often treat medical concepts as a
flat "bag of features", disregarding structural information and variable
interdependencies during model development [15,16]. Furthermore,
GNNs are powerful in handling the high sparsity and frequent missing
values found in EHR data, as they can respectively propagate infor-
mation through the graph structure to densify representation and infer
features based on the attributes of neighboring nodes in the medical
graph [17–19].
The strength of GNNs lies in their capability to navigate the intrica-
cies of non-Euclidean spaces [13,20]. Unlike grid-based data structures
such as images, which have inherent locality and consistent neigh-
boring relationships, graphs often lack a natural node ordering, and
the spatial proximity of nodes does not determine their relationships,
making it more challenging to apply key operations such as convolu-
tions [21,22]. For example, EHR graphs can represent a dense web of
patient histories, diagnoses, treatments, and other clinical outcomes,
with heterogeneity in node types, nodes with varied degrees, and
edges indicating co-occurrence, causality, hierarchical relations, and
other relevant interactions, resulting in complex topological structures
(Fig. 1). In this sense, GNNs offer the necessary flexibility to capture
and exploit these relationships, yielding thorough representations and,
consequently, improved interpretability and efficiency compared to
other deep learning models [23].
Early studies on GNNs for clinical risk prediction aimed to take
advantage of hierarchical medical information, structured as ontologies
and knowledge graphs, as distant supervision [24]. They introduced
label information through structured knowledge graph propagation,
learning correlations between medical codes and paralleling them with
codes observed in patients to obtain better predictions [24]. This
approach enabled more accurate predictions than other deep learning
baseline models [25]. Subsequent approaches started to prioritize the
learning of novel graph representations based on EHR rather than the
integration of knowledge graphs. Some of these representations include
patient similarity, patient-medication interactions, and temporal rela-
tions between medical events [26–28]. Today, given the plethora of
existing multimodal EHR information, heterogeneous graphs have also
been used to incorporate clinical notes, disease codes, medical images, and
lab results into the learned embeddings, enriching the representations
of the data used for critical health predictions [29,30].
The manifold use of GNNs in EHRs represents a transformative
paradigm in the landscape of clinical task predictions. GNNs, as pow-
erful tools for modeling complex relationships within graph-structured
data, have demonstrated remarkable efficacy in capturing intricate
dependencies inherent in healthcare systems and will likely support
future disruptive advances in this domain. Notably, recent studies have
discussed the applications of deep learning in electronic health records.
However, none have explicitly focused on using GNNs for clinical
risk prediction based on EHR. For example, [7] have concentrated on
temporal patient representation, while [1,11] have focused on general
deep learning techniques for EHR. The most recent studies concerning
graph representation have limited their scope to diagnosis prediction
only [31] or did not focus on GNNs [32].
Thus, while preceding review papers have explored the broader
landscape of deep learning applications in EHRs, a dedicated review
addressing the intricacies of utilizing GNNs for clinical risk prediction
based on EHRs remains an unexplored niche in the literature. The
narrative review presented in this paper aims to bridge this gap by of-
fering a targeted exploration of the advancements, open challenges, and
potential future research directions in this specific application domain.
Inspired by systematic protocols, the intention is to summarize the
current scope and depth of the available literature, while also setting
the stage for future systematic reviews as the field grows, providing a
fundamental overview essential for advancing research in the area (see
Table 1).
Table 1
Significance of the presented survey.
Problem: The abundance of heterogeneous EHR data and the relations among medical events and entities are two key factors that pose challenges for developing clinical risk prediction models.
What is already known: Leveraging the relational nature of medical data as graphs with deep learning approaches can improve clinical risk prediction.
What this paper adds: A narrative review of the state-of-the-art GNN techniques for EHR-based clinical risk prediction. The study highlights the main challenges related to EHR graph data representation, temporality, and prediction tasks and describes various GNN architectures used in this endeavor.
2. Background
2.1. EHR representation learning
The primary aim of representation learning for EHRs is to transform
input data into a suitable representation that can enhance downstream
clinical prediction tasks [15]. Early studies on EHR representation
learning, exemplified by Doctor AI [33], RETAIN [34], and Dipole [35],
utilized recurrent neural network (RNN)-based models to account for
the sequential nature of EHR data in clinical risk prediction. Convolu-
tional neural networks (CNNs), word embedding methods, and stacked
denoising autoencoders were also utilized [36–42]. These methodologies
aimed to model dependencies between medical codes of the
International Classification of Diseases (ICD), an international standard
for diagnosing and classifying health conditions [43].
Subsequent studies recognized the need for integrating structured
information into models, emphasizing the increasing relevance of graph
representation learning approaches. In the Heterogeneous Convolu-
tional Neural Network (HCNN), a CNN was adapted to capture the tem-
poral relationships between medical events in EHR data, which were
represented as attributed graphs [44]. Attention-based models such as
GRAM [45], KAME [46], MMORE [47], and CAMP [48] also incorpo-
rated hierarchical diagnosis information contained in the ICD ontology
for clinical risk prediction. The MiME model employed multilevel
relationships to learn representations of medical concepts [49]. Other
similar studies that have aimed to model EHR as graphs include [50–53].
2.2. Graph neural networks
Graph neural networks (GNNs) are deep learning models designed
for processing and analyzing data organized as graphs [54]. A graph
can represent any relationships (edges) between a collection of entities
(nodes) [55]. Graphs are a fundamental way of representing data from
the natural world, and many of the patterns observed in nature can be
expressed and understood using graph structures [56]. In this way,
GNNs are a powerful resource for representation
learning, as they can distill structural information and learn powerful
high-level representations [57]. GNNs have demonstrated applicability
across many network-related areas, including traffic forecasting, social
network analysis, and improved recommendation systems [56]. In
healthcare and biology, they have been used to discover new drugs,
predict protein–protein interactions, forecast disease outbreaks, and
assess patient similarity [15,58–63].
A graph is conventionally defined as $G = (V, E)$, where $V$ is the
set of nodes and $E$ is the set of edges that connect pairs of nodes
in $V$ [56]. In EHRs, depending on the approach, nodes can represent
various healthcare entities such as patients, diagnoses, and medications
(Fig. 1). Each node $u \in V$ has an associated feature vector
$x_u \in \mathbb{R}^{k}$. These features are organized into a node feature
matrix $X \in \mathbb{R}^{|V| \times k}$, where each row holds one node’s
feature vector, making them suitable
for machine learning models [56]. For instance, a feature vector for
a patient node can include encoded attributes such as age, gender,
medical history information, and vital signs. Furthermore, an edge
set $E$ is typically represented by an adjacency matrix
$A \in \mathbb{R}^{|V| \times |V|}$, indicating whether node $u$ is related
to another node $v$ through binary indicators ($a_{uv} = 1$ or $0$).
The presence of an edge between a patient
node and a medication node could indicate, for example, whether a
patient received specific medication during a visit. Nonetheless, this
is a simplified representation where edges are not attributed, meaning
they do not encapsulate additional information, such as the strength
or type of relationship. Even in such cases, the core properties of the
GNNs remain the same.
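The definitions above can be made concrete with a small sketch. The node names, the three feature columns, and the helper `build_adjacency` are illustrative assumptions, not taken from any of the reviewed studies; the snippet only shows how a feature matrix $X$ and a binary adjacency matrix $A$ might be laid out for a toy patient-medication graph.

```python
# Toy EHR graph: two patient nodes and two medication nodes.
# All names and feature values are illustrative, not taken from any dataset.
nodes = ["patient_1", "patient_2", "med_A", "med_B"]
index = {name: i for i, name in enumerate(nodes)}

# Node feature matrix X with |V| = 4 rows and k = 3 columns.
# Hypothetical columns: [scaled age, is_patient flag, is_medication flag].
X = [
    [0.63, 1.0, 0.0],
    [0.41, 1.0, 0.0],
    [0.00, 0.0, 1.0],
    [0.00, 0.0, 1.0],
]

# Undirected, unattributed edges: "patient received medication during a visit".
edges = [("patient_1", "med_A"), ("patient_1", "med_B"), ("patient_2", "med_B")]


def build_adjacency(num_nodes, edge_list, node_index):
    """Return the binary |V| x |V| adjacency matrix A with a_uv in {0, 1}."""
    A = [[0] * num_nodes for _ in range(num_nodes)]
    for u, v in edge_list:
        i, j = node_index[u], node_index[v]
        A[i][j] = 1
        A[j][i] = 1  # symmetric because the edge is undirected
    return A


A = build_adjacency(len(nodes), edges, index)
degrees = [sum(row) for row in A]  # number of neighbors per node
```

Attributed edges (for example, a dosage or a relation type) would need a richer structure than this binary matrix, which is the simplification the paragraph above points out.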
GNNs operate by aggregating information from a node’s immediate
neighborhood, which is then combined with the node’s features to
create a richer latent representation (Fig. 2) [64,65]. This is achieved
with matrix multiplication of the adjacency ($A$) and feature ($X$) matrices
[66]. After $k$ aggregation steps, the structural information within a
node’s $k$-hop neighborhood is captured, and a final graph representation
can be obtained by pooling the transformed feature vectors of all
nodes [67]. This propagation mechanism, often referred to as message
passing, enables GNNs to learn expressive representations and solve
tasks such as node classification (predicting the label of a node), link
prediction (predicting whether there is an edge between two nodes),
and graph classification (predicting a label for the entire graph) [68].
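As a rough illustration of this propagation mechanism, the sketch below applies repeated sum aggregation over each node and its neighbors on a four-node path graph and then mean-pools the result. It deliberately omits the learnable weights and nonlinearities of a real GNN layer; the helper names (`matmul`, `propagate`, `mean_pool`) and the toy graph are assumptions made for the example.

```python
def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def propagate(A, X, steps):
    """Apply `steps` rounds of sum aggregation over self plus neighbors: H <- (A + I) H."""
    n = len(A)
    A_self = [[A[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    H = X
    for _ in range(steps):
        H = matmul(A_self, H)
    return H


def mean_pool(H):
    """Average all node vectors into a single graph-level vector."""
    n = len(H)
    return [sum(col) / n for col in zip(*H)]


# Path graph 0 - 1 - 2 - 3 with a single feature that marks node 0.
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
X = [[1.0], [0.0], [0.0], [0.0]]

H1 = propagate(A, X, steps=1)  # node 0's signal reaches its 1-hop neighborhood
H2 = propagate(A, X, steps=2)  # after two steps it reaches the 2-hop neighborhood
graph_vector = mean_pool(H2)   # graph-level representation
```

Node 3 sits three hops away from node 0, so its value is still zero after two steps, which is the $k$-hop behavior described above.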
GNNs must follow the principles of permutation invariance and
equivariance to handle graph structures. This ensures that their out-
put remains consistent, regardless of how nodes are presented [56].
Furthermore, GNNs must handle varying numbers of node neighbors.
For instance, one patient node might be linked to several medication
nodes during a visit, while another patient might be connected to only
a few, and the same model must address both cases. This is achieved by
employing local functions ($f$) that compute node-level outputs ($h_u$) by
considering not only the features of the target node ($u$) but also those of
its specific number of neighboring nodes [56]. The choice of particular
local functions depends on the GNN architecture; these variations are
further discussed in the following sections.
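A minimal sketch of such a local function is given below, assuming a mean over neighbor features as the aggregator; `local_update` and the toy vectors are hypothetical. Because the mean does not depend on the order or the number of its inputs, the same function serves a patient with three medication neighbors and one with a single neighbor.

```python
def local_update(h_u, neighbor_features):
    """Hypothetical local function f: add the mean of the neighbor features to the node's own."""
    if not neighbor_features:
        return list(h_u)
    k = len(neighbor_features)
    mean = [sum(col) / k for col in zip(*neighbor_features)]
    return [a + b for a, b in zip(h_u, mean)]


patient = [1.0, 0.0]
many_meds = [[0.0, 1.0], [0.0, 3.0], [0.0, 2.0]]  # three medication neighbors
few_meds = [[0.0, 4.0]]                           # a single medication neighbor

out_many = local_update(patient, many_meds)
out_many_shuffled = local_update(patient, list(reversed(many_meds)))
out_few = local_update(patient, few_meds)
```

Replacing the mean with a sum or a maximum keeps the same order-independence; an aggregator that depended on neighbor order would not.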
Note that this study focuses on GNNs utilized for EHR data analysis.
For a comprehensive review encompassing various types of GNNs,
interested readers are referred to [54–56,67–69].
2.2.1. Graph convolutional networks
Graph convolutional networks (GCNs), introduced by Kipf and
Welling [70], propagate information across the graph and aggregate
it to update node representations, extending convolutions from the
Euclidean domain to the graph domain which is characterized by
data structured as nodes and edges [20]. In the spatial approach, each
GCN layer computes new node representations based on their current
features and those of their neighbors, similar to how convolutional
layers in CNNs aggregate local information in images but in a
configuration where operation objects are non-fixed in size [54,71].
As nodes progress through the GCN layers, the learned representations
encapsulate a broader neighborhood, similar to the increasing receptive
fields in CNNs. The local function that provides the update rule for node
$v$ in a GCN involves a weighted sum of the features of the node and its
neighbors and is given by
$$h_v^{(l+1)} = \sigma\left( \sum_{u \in N(v) \cup \{v\}} \frac{1}{c_{vu}} W^{(l)} h_u^{(l)} \right)$$
where $h_v^{(l)}$ is the feature vector of node $v$ in layer $l$, $N(v) \cup \{v\}$ is the
set of neighbors of $v$ and $v$ itself, $c_{vu}$ is a normalization constant, $W^{(l)}$
is a weight matrix, and $\sigma$ is a nonlinear activation function.
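The update rule can be sketched in a few lines of plain Python. Two choices here are assumptions rather than something the text fixes: the normalization constant is taken as $c_{vu} = \sqrt{d_v d_u}$ with degrees counted after adding self-loops (a common symmetric normalization), and $\sigma$ is a ReLU. The function name `gcn_layer` and the toy star graph are illustrative.

```python
import math


def relu(x):
    return max(0.0, x)


def gcn_layer(A, H, W):
    """One GCN update: h_v' = relu( sum over u in N(v) and v itself of (1 / c_vu) * W h_u ).

    A: binary adjacency without self-loops (list of rows).
    H: node features, |V| x d_in.  W: weight matrix, d_out x d_in.
    Assumption: c_vu = sqrt(deg(v) * deg(u)), degrees counted with self-loops.
    """
    n = len(A)
    deg = [sum(A[v]) + 1 for v in range(n)]  # +1 accounts for the self-loop
    # Transform every node once: W h_u
    WH = [[sum(w * h for w, h in zip(w_row, H[u])) for w_row in W] for u in range(n)]
    out = []
    for v in range(n):
        neighborhood = [u for u in range(n) if A[v][u] == 1 or u == v]
        agg = [0.0] * len(W)
        for u in neighborhood:
            c_vu = math.sqrt(deg[v] * deg[u])
            for d in range(len(W)):
                agg[d] += WH[u][d] / c_vu
        out.append([relu(x) for x in agg])
    return out


# Star graph: node 0 is connected to nodes 1 and 2.
A = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
H = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity weights keep the arithmetic easy to follow
H_next = gcn_layer(A, H, W)
```

Nodes 1 and 2 have identical features and identical neighborhoods, so they receive identical updated representations.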
GCNs can also be developed using spectral approaches [72]. Spec-
tral models treat graphs as signals processed through graph convolution
in the spectral domain. In particular, graph signals are first transformed
into the spectral domain using a Graph Fourier Transform (GFT), which
leverages the eigenvectors of the graph Laplacian as its basis functions.
Specifically, given a graph with the Laplacian matrix 𝐿, its eigenvalue
decomposition leads to eigenvectors that serve as harmonic modes for
the GFT. Once in the spectral domain, the graph signals are filtered
and subsequently transformed back to the spatial domain with the
inverse GFT [54].
Fig. 2. Simple representation of how a GNN layer operates for clinical risk prediction. The double-edge arrows represent the message-passing process between neighboring nodes.
This interaction allows nodes to aggregate information and generate context-aware embeddings. The model’s design, including its outputs and loss functions, is adapted based on
the specific requirements of the tasks.
Despite their different starting points, both spectral and spatial
models iteratively collect neighborhood information (organized as low-
dimensional vectors) to capture high-order correlations among the
analyzed graphs [57].
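The spectral pipeline described above can be written compactly. The notation is an assumption introduced for this illustration: $U$ holds the eigenvectors of the Laplacian $L$, $\Lambda$ is the diagonal matrix of its eigenvalues, $x$ is a graph signal, and $g_{\theta}$ is a filter with learnable parameters $\theta$.

```latex
L = U \Lambda U^{\top}, \qquad
\hat{x} = U^{\top} x \;\; \text{(GFT)}, \qquad
x = U \hat{x} \;\; \text{(inverse GFT)}, \qquad
y = U \, g_{\theta}(\Lambda) \, U^{\top} x .
```

Read from right to left, the last expression transforms the signal to the spectral domain, scales each spectral component by the filter, and transforms the result back to the node domain.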
2.2.2. Graph attention networks
Graph attention networks (GATs), proposed by Veličković et al. in
2018, extend GCNs by introducing an attention mechanism to weigh
the importance of neighboring nodes [59]. This mechanism is inspired
by the attention of the Transformer model, which allows the network
to simultaneously focus on different parts of the input [73].
The local function that provides the update rule for GATs involves
two main steps: calculating attention scores for each pair of nodes in
the graph, which indicate the relevance of neighbor nodes’ features
to a given central node; and a weighted aggregation of the features
of the central node and its neighbors, where the previously calculated
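attention scores determine the weights [59]. The rule is then defined
by
$$h_i^{(l+1)} = \sigma\left( \sum_{j \in N(i)} \alpha_{ij} W h_j^{(l)} \right)$$
where $\alpha_{ij}$ is the attention coefficient computed as
$$\alpha_{ij} = \frac{\exp\left( \mathrm{LeakyReLU}\left( a^{T} [W h_i \,\|\, W h_j] \right) \right)}{\sum_{k \in N(i)} \exp\left( \mathrm{LeakyReLU}\left( a^{T} [W h_i \,\|\, W h_k] \right) \right)}$$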
GATs offer particular value in interpretability, as learned attention
weights enable a deeper understanding of the relevance of specific
nodes for a given end task [74].
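To make the two steps concrete, the sketch below computes the attention coefficients with a softmax over the neighbors and then forms the weighted sum. It assumes that $a$ is a learnable vector applied to the concatenation of the two transformed feature vectors, that the LeakyReLU slope is 0.2, and that $\sigma$ is a ReLU; the function names and toy values are hypothetical.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def matvec(W, h):
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def attention_coefficients(i, neighbors, H, W, a):
    """Return {j: alpha_ij} for every j in `neighbors` (softmax over raw scores)."""
    Wh_i = matvec(W, H[i])
    scores = {}
    for j in neighbors:
        concat = Wh_i + matvec(W, H[j])  # concatenation of W h_i and W h_j
        scores[j] = leaky_relu(sum(a_k * c_k for a_k, c_k in zip(a, concat)))
    max_score = max(scores.values())  # subtract the max for numerical stability
    exp_scores = {j: math.exp(s - max_score) for j, s in scores.items()}
    total = sum(exp_scores.values())
    return {j: e / total for j, e in exp_scores.items()}


def gat_update(i, neighbors, H, W, a):
    """h_i' = sigma( sum_j alpha_ij W h_j ), with sigma taken as ReLU in this sketch."""
    alpha = attention_coefficients(i, neighbors, H, W, a)
    out = [0.0] * len(W)
    for j in neighbors:
        Wh_j = matvec(W, H[j])
        for d in range(len(W)):
            out[d] += alpha[j] * Wh_j[d]
    return [max(0.0, x) for x in out], alpha


H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]
a = [0.0, 0.0, 1.0, 1.0]  # hypothetical attention vector that scores only the neighbor half
h_new, alpha = gat_update(0, neighbors=[1, 2], H=H, W=W, a=a)
```

The returned `alpha` dictionary is the quantity usually inspected when attention weights are used for interpretation: here neighbor 2 receives the larger share of node 0's attention.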
2.2.3. Graph autoencoders
Graph autoencoders (GAEs) are unsupervised learning models that
aim to learn compact representations of graph nodes, acting as a type
of dimensionality reduction technique [75]. The local function in GAEs
is employed in encoding and decoding graph information. During en-
coding, node features are transformed into a lower-dimensional latent
representation, capturing information both about its neighboring nodes
and the general graph structure. During decoding, the latent represen-
tations are used to reconstruct the original features while minimizing
the reconstruction error [76].
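The encode-decode loop can be sketched as follows, under simplifying assumptions: the encoder averages each node's features with those of its neighbors and projects them with a fixed matrix `W_enc`, the decoder is a single linear map `W_dec`, and the reconstruction error is a mean squared error. In practice both matrices would be learned, and some GAE formulations reconstruct the adjacency matrix rather than the features; all names below are illustrative.

```python
def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def normalize_rows(M):
    """Divide each row by its sum so that aggregation becomes a neighborhood mean."""
    return [[x / sum(row) for x in row] for row in M]


def encode(A, X, W_enc):
    """Z = (mean over self and neighbors of X) projected to a lower dimension."""
    n = len(A)
    A_self = [[A[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    return matmul(matmul(normalize_rows(A_self), X), W_enc)


def decode(Z, W_dec):
    """Map the latent vectors back to the original feature space."""
    return matmul(Z, W_dec)


def reconstruction_error(X, X_hat):
    """Mean squared error between original and reconstructed features."""
    diffs = [(x - y) ** 2 for rx, ry in zip(X, X_hat) for x, y in zip(rx, ry)]
    return sum(diffs) / len(diffs)


A = [[0, 1], [1, 0]]
X = [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0]]  # three features per node
W_enc = [[0.5], [0.5], [0.0]]           # 3 -> 1 dimensions (fixed here, learned in practice)
W_dec = [[1.0, 1.0, 0.0]]               # 1 -> 3 dimensions

Z = encode(A, X, W_enc)
X_hat = decode(Z, W_dec)
loss = reconstruction_error(X, X_hat)
```

Training would adjust `W_enc` and `W_dec` to drive `loss` down across many graphs; in this toy case the hand-picked weights already reconstruct the input exactly.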
In 2016, Kipf and Welling extended the GAE framework by in-
troducing a variational inference approach called variational graph
autoencoder (VGAE) [77]. In VGAE, the encoder not only generates a
low-dimensional vector representation for each node but also models
the underlying probability distribution of these representations, usually
with a Gaussian distribution. The decoder, then, aims to minimize the
difference between the distributions of the reconstructed data and the
actual data, allowing the model to account for uncertainty and leading
to more robust graph reconstructions.
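For reference, the variational objective can be written as below. This follows the commonly cited VGAE formulation, in which the reconstructed object is the adjacency matrix $A$; that choice, and the symbols $\mu_i$, $\sigma_i$, and $z_i$ for the per-node mean, standard deviation, and latent vector, are assumptions going beyond what the text above states.

```latex
q(Z \mid X, A) = \prod_{i=1}^{|V|} \mathcal{N}\!\left( z_i \mid \mu_i, \operatorname{diag}(\sigma_i^{2}) \right),
\qquad
\mathcal{L} = \mathbb{E}_{q(Z \mid X, A)}\!\left[ \log p(A \mid Z) \right]
 - \mathrm{KL}\!\left[ q(Z \mid X, A) \,\|\, p(Z) \right],
\qquad
p(Z) = \prod_{i=1}^{|V|} \mathcal{N}(z_i \mid 0, I).
```

The first term of $\mathcal{L}$ rewards accurate reconstruction, while the KL term keeps the learned distribution close to the Gaussian prior; the model is trained by maximizing $\mathcal{L}$.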
3. Methods
3.1. Search strategy and information sources
A comprehensive, narrative review of the use of GNNs in EHR-based
clinical risk prediction was conducted. The PRISMA protocol (Preferred
Reporting Items for Systematic Reviews and Meta-Analyses) [78] was
used as inspiration for the adopted review process, as detailed in Fig. 3.
However, we opted for a more adaptable strategy to provide
a comprehensive summary and perspective on GNNs, which led to
qualifying the review as "narrative" rather than "systematic". This
decision reflects the intention to offer a broad, interpretive overview of
the literature, emphasizing context and insight rather than a narrower
quantitative analysis, which we believe is particularly valuable in the
developing field of GNNs for clinical risk prediction. Four databases
were used: Scopus, PubMed, ACM Digital Library, and Embase. The
search term used was as follows:
("Graph" OR "Graph neural network" OR "Graph neural" OR "GNN"
OR "Graph convolutional" OR "GCN" OR "Graph autoencoder" OR "GAE"
OR "Graph attention" OR "Graph attention network" OR "GAT" OR
"Graph self-attention" OR "GSA" OR "Graph transformer" OR "Graph
transformer network" OR "GTN" OR "Graph transformer" OR "Graph-
based" OR "Graph embedding") AND ("Electronic health record" OR
"EHR" OR "Electronic medical record" OR "EMR" OR "electronic health
data") AND ("Deep learning" OR "Neural network" OR "Representation
learning" OR "Artificial intelligence")
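The three-block structure of this search string (graph terms AND EHR terms AND learning terms) can be assembled programmatically, which helps keep the query consistent across databases. The sketch below uses shortened term lists and hypothetical helper names (`or_group`, `build_query`); each database has its own field tags and syntax, which are not modeled here.

```python
# Term groups abbreviated from the search string above; extend each list to reproduce it in full.
graph_terms = ["Graph neural network", "GNN", "Graph convolutional", "Graph attention"]
ehr_terms = ["Electronic health record", "EHR", "Electronic medical record", "EMR"]
learning_terms = ["Deep learning", "Neural network", "Representation learning"]


def or_group(terms):
    """Quote each term and join the group with OR inside parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query(*groups):
    """Join the OR-groups with AND, mirroring the three-block structure of the search string."""
    return " AND ".join(or_group(g) for g in groups)


query = build_query(graph_terms, ehr_terms, learning_terms)
print(query)
```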
An initial assessment of the literature using these terms revealed
a large number of potentially relevant articles. Therefore, to maintain
specificity, we chose not to include the more general terms ‘‘machine
learning’’ and ‘‘clinical’’, which could dilute the focus of the analysis.
The gray literature, specifically the arXiv repository, was also searched
using similar terms. The search was limited to articles published between
January 1, 2009, and May 14, 2023, as it was anticipated that
there would be no relevant articles published before 2009 since GNNs
started to become more popular after Scarselli et al. (2009) [65].
3.2. Eligibility criteria
This review aimed to include all related articles published in English
that used GNNs for clinical risk prediction based on EHRs within the
search period. The exclusion criteria employed during the screening
and full-text review stages encompassed the following attributes: utilization
of unstructured EHR data as the main focus; absence of clinical
risk prediction focus; non-research or conference content; and paper not
in the English language. Review articles, duplicate records, and studies
that did not include EHRs or GNNs were also excluded. Three authors
screened the abstracts, while the first author reviewed the full texts and
extracted the data. Data extraction results were reviewed by the other
authors. The search strategy and selection process are summarized in
Fig. 3.
Fig. 3. Flowchart for literature search and selection of articles for the review.
3.3. Data extraction, synthesis, and analysis
First, the selected papers were categorized based on the specific
clinical risk prediction task(s) they intended to solve. Then, details were
gathered about the task prediction levels (patient or visit), graph rep-
resentation, and node definitions, including whether they incorporated
temporal information and the type of resource employed to help in
the learning process, if any. Concerning the model’s implementation,
data were extracted regarding the EHR datasets used, patient counts,
evaluation metrics implemented in each study, available repository
links, and techniques utilized for interpretability, if any. Finally, all
the collected information was consolidated for subsequent qualitative
and quantitative analyses, centered on three axes: data representation,
incorporation of temporality, and prediction tasks.
3.4. Summary of study selection
After deduplication of the initial 449 articles, 351 papers were
obtained. Out of these, 50 articles were selected for full-text review
and data collection following the steps detailed in Fig. 3. Among the
69 preprints retrieved from arXiv, only one study was eligible for the
review and included in the selected articles. Table 2 summarizes the
statistics of the 50 papers included in this review. Detailed information
for all papers is provided in the Supplementary Table, and an overview of
the results is provided in the next sections.
4. Results
The Results section is structured as follows: first, a brief summary
of the main results is presented. Next, a more detailed description of
the main graph data representation approaches is provided, including
the incorporation of temporality into clinical predictions, and the most
common clinical risk prediction tasks. Finally, a discussion addresses
the unique challenges of modeling EHR data with GNNs, presenting
perspectives in the field, and describing study limitations.
4.1. Overview of study characteristics
Using GNNs for clinical risk prediction is a rising trend, as shown in
Fig. 4. There were numerous approaches to graph representation in the
evaluated articles, which often overlap. One of the main ones involves
using a hierarchical graph based on codes from medical ontologies,
such as ICD (diseases, procedures) and ATC (medications), which de-
scribe well-known parent–child relations (n = 15, Table 3). In this
case, the hierarchical medical code graph is considered homogeneous.
Another technique involves learning heterogeneous EHR graphs, which
include nodes other than medical codes, such as patients, lab
values, and doctors, and whose edge relations may vary (n = 9, Table 3).
These two approaches are often combined to create a heterogeneous
graph in which medical ontology knowledge is also integrated
into the learned embeddings (n = 5, Table 3). Other techniques include
similarity networks, where nodes are linked based on feature
similarities (n = 6); bipartite graphs, with two types of nodes (n =
5); hypergraphs, where edges can connect any number of nodes (n =
2); dynamic graphs, where features change over time; and multi-view
graphs, which combine different types of networks of
medical entity interactions (n = 3). For a summary, please refer to
Table 3; more information can be found in the Supplementary Material.
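The graph types listed above differ mainly in how nodes and edges are stored. The sketch below shows one possible in-memory layout for each; every identifier and code label is a placeholder invented for the illustration (the codes are not real ICD or ATC entries), and the reviewed papers may use different structures.

```python
# Hierarchical (homogeneous) code graph: child code -> parent code, as in an ontology.
# The codes below are placeholders, not real ICD or ATC entries.
parent_of = {"D1.1": "D1", "D1.2": "D1", "D1": "ROOT", "D2": "ROOT"}


def ancestors(code, parents):
    """Walk the parent links from a code up to the root."""
    path = []
    while code in parents:
        code = parents[code]
        path.append(code)
    return path


# Heterogeneous graph: typed nodes and typed edges stored as (source, relation, target).
typed_edges = [
    ("patient:1", "diagnosed_with", "code:D1.1"),
    ("patient:1", "has_lab", "lab:glucose"),
    ("patient:2", "diagnosed_with", "code:D1.2"),
]
relation_types = sorted({rel for _, rel, _ in typed_edges})

# Bipartite graph: edges only run between the two node sets (patients and codes).
bipartite = {"patient:1": {"D1.1", "D2"}, "patient:2": {"D1.2"}}

# Hypergraph: each hyperedge (here, a visit) connects any number of code nodes.
hyperedges = {"visit:a": {"D1.1", "D2", "D1.2"}, "visit:b": {"D2"}}


# Similarity graph: link patients whose code sets overlap enough (Jaccard index).
def jaccard(s, t):
    return len(s & t) / len(s | t)


similarity = jaccard(bipartite["patient:1"], bipartite["patient:2"])
```

A similarity network would add an edge between two patients only when a score such as `similarity` exceeds a chosen threshold; here the two toy patients share no codes, so no edge would be created.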
Out of all the clinical risk prediction tasks, diagnosis prediction has
received the most attention (Fig. 5, n = 36), followed by mortality and
readmission. Diagnosis prediction was also the task on which the most
diverse set of GNN architectures was tested (Fig. 6). An overview of GNNs
used for these three main tasks is provided in Section 4.4.
When considering model architectures, it was possible to observe
that graph attention networks (GAT) were the most employed (n =
19), followed by graph convolutional networks (GCN) (n = 18) and
GraphSAGE (n = 4) (Fig. 7). Different architectures were also employed
to handle unique medical data requirements, such as GRU, LSTM, and
BERT (Supplementary Table). In these cases, GNNs were responsible for
the graph representation learning step, while the others were used for
processing temporal aspects and multimodal data.
Table 2
A summary of the selected articles.
Year Ref. Clinical risk prediction task Data source (a) Interp. (b) Metrics Reprod.
Diag. Mort. Read. Otr. ACC AUPRC AUROC F1
2020 [16] x x x EI, SY AW 0.77 0.53; 0.59 x
2020 [25] x M3 0.65 x
2020 [28] x M3, PV 0.94
2020 [79] x M3, UT 0.48 0.72
2020 [80] x M3, PV 0.92
2020 [81] x PC 0.91 0.09 0.81 0.15 x
2020 [82] x M3 0.57
2020 [83] x M3 0.34 0.9
2020 [29] x M3 TS x
2020 [84] x M3 TS 0.63 0.67 0.68
2021 [18] x x x M3, EI, NY SVA 0.71; 0.39 x
2021 [26] x x EI AW 0.43 0.85 x
2021 [85] x x IQ, EI PCA 0.82; 0.2 0.86; 0.67 x
2021 [30] x PV AW 0.74 0.66 x
2021 [17] x PV 0.45 x
2021 [86] x M3 TS, H 0.60
2021 [87] x M3, WH, SP, MB AW 0.82 0.93 x
2021 [88] x PV
2021 [89] x M3 0.84; 0.67 0.85; 0.67 x
2021 [90] x IB U, C 0.79 x
2021 [91] x PV 0.53; 0.62
2021 [92] x M3 0.37 0.78 0.42
2021 [93] x M3 0.85 0.72; 0.48 x
2021 [94] x x SY 0.9
2022 [95] x M4, EI AW 0.96; 0.15 0.60; 0.49 0.89; 0.8 x
2022 [96] x M3 TS 0.21
2022 [97] x PV AW 0.44
2022 [98] x EM 0.75
2022 [99] x CN 0.72 0.15
2022 [100] x M3 TS 0.87 0.84
2022 [101] x x M3, XM 0.53; 0.85 0.87; 0.90 0.5; 0.84
2022 [102] x M3, M4 0.9 0.73; 0.26 x
2022 [103] x PV TS, GE 0.91
2022 [104] x x M3 TS 0.62; 0.65 0.90; 0.75 x
2022 [105] x NC CIC 0.15 0.75 x
2022 [106] x CD
2022 [107] x PH C 0.15 x
2022 [108] x PV 0.81
2022 [109] x x M3, EI 0.62; 0.39 0.91; 0.72 0.53; 0.37
2022 [110] x x M3, CK, TJ, HM 0.76; 0.8; 0.54 0.83; 0.95; 0.86 x
2022 [111] x M4 0.68 0.53
2022 [112] x M3, M4 TS 0.93 x
2023 [19] x CF C 0.89 0.8
2023 [113] x CU, UV AW 0.88 0.95
2023 [27] x PV 0.83 0.27
2023 [114] x M4, PE 0.93 0.87; 0.89
2023 [115] x M4, PV GE 0.78 x
2023 [31] x M3, EI TS, AW 0.86 0.64; 0.74
2023 [116] x MC S 0.99 x
2023 [12] x MC TS
(a) EI: eICU, M3: MIMIC-III, M4: MIMIC-IV, PV: Private, UT: UTP, PC: Physionet Challenge, NY: NYU Langone Health, IQ: IQVIA US, WH: WHAS, SP: SUPPORT, MB: METABRIC,
IB: IBM Explorys, SY: Synthetic, EM: EMRNet, CN: Cardionet, XM: Xiangya Medical, NC: N3C, CD: COPD, PH: PopHR, CK: CKD, CG: Cardiology, TJ: TJH, HM: HMH, CF: Cerner’s
Health Facts, CU: CNUH, UV: UV, PE: P-EHRs, MC: Mayo Clinic.
(b) TS: t-SNE, A: Attention, SV: SVA, P: PCA, U: UMAP, GE: GNN Explainer, CIC: Clinically intuitive concepts, S: SHAP, EGE: Extended GNNExplainer, C: Clustering, AW: Attention
weights.
Table 3
Types of EHR graph representation employed in the analyzed papers.
Graph representation    Papers
Hierarchical    [25,27,28,30,31,80,91–93,96,103,107,108,112]
Heterogeneous    [29,88,95,97,100,110,111,114,116]
Hierarchical and heterogeneous    [16,19,82,105,106]
Similarity    [26,89,94,101,104,115]
Bipartite    [17,85,90,93,99]
Dynamic    [79,83,102]
Hypergraph    [96,109]
Other    [12,18,81,84,86,87,98,113]
Regarding incorporating temporal patient information, most eval-
uated approaches consider some sort of clinical event sequentiality,
such as the order of patient visits (Supplementary Table, n =32).
However, most of these works disregard irregular time intervals. More
information about the temporal aspects of reviewed models is described
in Section 5.5.
GNNs were evaluated against and consistently outperformed other
machine learning and deep learning techniques in all evaluated studies,
indicating that graph representations are valuable for clinical risk
prediction. A summary of metrics is shown in Table 2, and a full report
can be found in the Supplementary Table. Most authors compared the
proposed GNN models with traditional and baseline approaches using
Fig. 4. Yearly distribution of selected articles. The cutoff date for including articles was May 14, 2023.
Fig. 5. Distribution of articles addressing different clinical risk prediction tasks.
Table 4
A summary of EHR data resources used in the evaluated articles.
Data resources Papers
M3 (MIMIC-III) [18,25,28,29,31,79,80,82–84,86,87,89,92,93,96,
100–102,104,109,110,112]
M4 (MIMIC-IV) [95,102,111,112,114,115]
EI (eICU) [16,18,26,31,85,95,109]
SY (Synthetic) [16,94]
Other [12,17–19,27,28,30,79–81,85,87,88,90,91,97–99,
101,103,105–108,110,113–116]
classic ML metrics such as accuracy, precision, recall, and F1-score, as
well as AUPRC (Area Under the Precision–Recall Curve) and AUROC
(Area Under the Receiver Operating Characteristic Curve).
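As a minimal illustration of the metrics named above, the following sketch computes precision, recall, F1-score, and AUROC for a toy binary risk task. The labels, scores, 0.5 cut-off, and function names are hypothetical and are not taken from any reviewed study. AUROC is computed through its rank interpretation, i.e., the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def auroc(y_true, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes and model scores for six patients.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
area = auroc(y_true, scores)
```

Precision, recall, and F1 depend on the chosen threshold, whereas AUROC summarizes ranking quality over all thresholds.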
The availability of resources and documentation bears directly on the
reproducibility of the evaluated models. Of the 50 selected
articles, 22 included a link to a repository containing the project’s
source code, as shown in Table 2 (links in the Supplementary Mate-
rial). Most papers provided comprehensive details about their model
architecture and parameters, including the specifics of GNN layers,
frameworks, learning rates, optimizers, and data split percentages.
Furthermore, the use of benchmark datasets also helped with repro-
ducibility efforts, as it allows for validations against a known standard.
For example, the Medical Information Mart for Intensive Care III,
or MIMIC-III [117], was the most frequently used dataset (n =23,
Table 2). MIMIC is a freely accessible database, one of the most widely
employed EHR datasets, encompassing data from over 40,000 patients
admitted to the Beth Israel Deaconess Medical Center in the USA,
spanning the years 2001 to 2012. Other datasets include the eICU
dataset [118], which provides data of over 200,000 admissions to
various intensive care units (ICUs), MIMIC-IV [119], an updated and
more extensive version of MIMIC-III, as well as synthetic EHR datasets,
which are created based on the real structure of EHR datasets. Aside
from these, a number of studies make use of proprietary and private
datasets obtained directly from hospitals for their GNN analyses. A
summary of utilized datasets and the corresponding articles is presented
in Table 4.
Fig. 6. Heatmap of articles using different GNN architectures for solving different clinical risk prediction tasks. (For interpretation of the references to color in this figure legend,
the reader is referred to the web version of this article.).
Fig. 7. Distribution of articles using different GNN architectures.
In the evaluated articles, half employed interpretability techniques
(n =25, Fig. 8). These methods aim to identify critical nodes, edges,
subgraphs, and their features responsible for GNN outputs [120]. The
most prevalent approach involved analyzing the t-SNE plot of the
embedding space (n =10). The t-SNE method maps high-dimensional
data into a lower-dimensional space, preserving local and global infor-
mation [121]. The resultant clusters, ideally non-overlapping, represent
different subtypes learned by the model. Other dimensionality reduc-
tion techniques, such as UMAP and PCA, were also used for similar
purposes. The second most common approach was the analysis of
learned attention weights (n =8). Graph attention layers use these
weights to compute attention-guided embeddings for nodes, edges, sub-
graphs, or combinations [74]. Their magnitude reflects the importance
of a given node j’s features to node i, which naturally becomes an
interpretability resource.
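To make the role of attention weights concrete, the sketch below computes the coefficients of a single-head graph attention layer for one node, following the standard formulation in which a shared linear map W is applied to each node, a scoring vector a is applied to the concatenated pair, and a softmax is taken over the neighborhood. The node names, features, W, and a are toy values chosen for illustration only; in a trained model they would be learned.

```python
import math


def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def attention_weights(i, neighbors, features, W, a):
    """Return {j: alpha_ij} for node i over the given neighbors.

    e_ij = LeakyReLU(a . [W h_i || W h_j]); alpha_ij = softmax_j(e_ij).
    """
    zi = matvec(W, features[i])
    logits = []
    for j in neighbors:
        zj = matvec(W, features[j])
        concat = zi + zj  # list concatenation, i.e. [W h_i || W h_j]
        logits.append(leaky_relu(sum(ak * zk for ak, zk in zip(a, concat))))
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(e - m) for e in logits]
    total = sum(exps)
    return {j: e / total for j, e in zip(neighbors, exps)}


# Hypothetical medical-code nodes with 2-dimensional features.
features = {
    "hypertension": [1.0, 0.0],
    "diabetes": [0.0, 1.0],
    "fracture": [0.5, 0.5],
}
W = [[1.0, 0.0], [0.0, 1.0]]  # identity map, for readability only
a = [0.5, 0.5, 1.0, -1.0]     # toy scoring vector

alpha = attention_weights(
    "hypertension", ["hypertension", "diabetes", "fracture"], features, W, a
)
```

The resulting coefficients sum to one over the neighborhood, which is what allows them to be read as relative importances of neighbors j for node i.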
4.2. Graph data representation
4.2.1. Ontology-based approaches
The primary aim of representation learning for EHR is to trans-
form the input data into an expressive representation that can en-
hance downstream clinical prediction tasks [15]. The main goal of
utilizing GNNs is to incorporate the existing relations among medical
events and entities from EHRs into the learned representations. This
process is often achieved through knowledge injection, which intro-
duces a priori medical knowledge to guide and enrich deep learning
architectures [122,123].
Medical knowledge graphs and ontologies contain rich hierarchical
information (such as medical events co-occurrence, and ‘‘is the cause
of’’ and ‘‘is caused by’’ relations), which can offer a comprehensive
and reliable understanding of how clinical concepts interact. Examples
include KnowLife [124], a knowledge graph that integrates unstruc-
tured biomedical data; the International Classification of Diseases (ICD)
Fig. 8. Distribution of articles using different interpretability techniques. (For interpretation of the references to color in this figure legend, the reader is referred to the web
version of this article.).
Table 5
A priori medical knowledge used in GNNs for clinical risk prediction.
Knowledge injection sources Papers
ICD [25,28,30,31,80,82,91–93,96,107,108]
CCS [28,80,92,103,112]
CMeKG [106,108]
KnowLife [25]
Other [17,29,30,105,107,108]
ontology [43], which is the global standard for reporting and recording
diseases, symptoms, signs, and other events in all realms of healthcare;
the Clinical Classifications Software (CCS), which clusters the numerous
individual ICD codes into a smaller number of clinically meaningful
categories [125], and CMeKG, the first Chinese medical knowledge
graph [126]. All of them and more have been used for domain knowl-
edge injection in GNNs for clinical risk prediction (n =26); a summary
of the main sources can be found in Table 5.
The hierarchical relations in medical ontologies can be naturally
represented as parent–child graphs, in which the non-leaf nodes in
the tree represent broader, more general classes of medical concepts,
and leaf nodes represent more specific instances of these classes [80].
In ICD-9, for example, parent codes 390–459, referring to ‘‘Diseases
of the Circulatory System’’, represent a broad category encompassing
various circulatory system conditions. They are connected to sub-
classes that detail the conditions, such as codes 401–405 ‘‘Hypertensive
Disease’’ [127].
A frequently observed method of inputting knowledge in GNNs
involves extracting medical codes from a patient’s EHR, especially
ICD codes related to diagnoses, and combining them with the graph
structure features of hierarchical medical ontology graphs (n =12,
Table 5) [1]. More specifically, the node embeddings in the ontology-
based GNN models are learned as a combination of embeddings of the
observed medical codes for a patient and the code’s ancestors on the
medical graph [11]. This method often depicts each patient’s visit or
medical history as a graph with medical codes as nodes [25,103]. Edges
indicate mainly the hierarchical relationships among the codes ob-
served in an ontology, and can also be combined with co-occurrence in-
formation, indicating comorbidity or a significant relationship between
the observed diseases [80].
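The combination of a code's embedding with those of its ontology ancestors can be sketched as follows. The parent map encodes one real ICD-9 path (401.1 under 401, under the 401–405 block, under the 390–459 chapter); the embedding vectors are made-up values, and the uniform average is an assumption made for brevity, since the reviewed models typically learn the combination weights (e.g., through attention).

```python
# Child -> parent links for one ICD-9 path; None marks the root of this toy tree.
PARENT = {
    "401.1": "401",        # benign essential hypertension
    "401": "401-405",      # essential hypertension
    "401-405": "390-459",  # hypertensive disease
    "390-459": None,       # diseases of the circulatory system
}


def ancestors(code, parent=PARENT):
    """Return the ancestors of `code`, nearest first."""
    chain = []
    node = parent.get(code)
    while node is not None:
        chain.append(node)
        node = parent.get(node)
    return chain


def combined_embedding(code, emb, parent=PARENT):
    """Average the embedding of `code` with those of its ancestors.

    Uniform weights are an illustrative assumption, not a reviewed method.
    """
    nodes = [code] + ancestors(code, parent)
    dim = len(emb[code])
    return [sum(emb[n][k] for n in nodes) / len(nodes) for k in range(dim)]


# Hypothetical 2-dimensional base embeddings.
EMB = {
    "401.1": [1.0, 0.0],
    "401": [0.0, 1.0],
    "401-405": [1.0, 1.0],
    "390-459": [2.0, 2.0],
}

path = ancestors("401.1")
vector = combined_embedding("401.1", EMB)
```

Because rare leaf codes share ancestors with frequent ones, this kind of combination lets a sparsely observed code borrow information from its broader category.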
Some examples of this approach include the Graph Neural net-
works based Diagnosis Prediction (GNDP) model, which leverages
the ICD ontology to predict medical codes occurring at the next
visit [28]; and Sherbet, which utilizes hyperbolic embeddings to recon-
struct the disease hierarchical structure and predict temporal health
events [31]. MedPath [30] introduces the concept of extracting per-
sonalized knowledge subgraphs from ontology graphs for individual
patients, which is further explored in MedML [105]. In the Graph-
based Structural Knowledge-aware Network (GSKN) model, subgraphs
are also utilized, but with the intent to capture deep-level knowledge
graph structure information and dynamic representations of medi-
cal entities [108]. The HyperGraph-based disease prediction model
(EHR2HG) incorporates the ICD hierarchy and hypergraphs to con-
sider higher-order relations among diseases and patients [82]. The
Graph ATtention-Embedded Topic Model (GAT-ETM) simultaneously
learns node embeddings based on both the ICD (diseases) and ATC
(medications) ontology medical codes with a GNN [107].
Other sources of knowledge were also proposed to capture medical
event connections that might exist outside the scope of ontologies.
In [16], Choi et al. introduced a normalized conditional probability
matrix P, which restricts the model’s search space based on the co-
occurrence statistics of medical codes in an EHR dataset. In the Joint
Medical Ontology Representation Learning (JMRL) [25] and DUal-
GRAph Representation Learning (DUGRA) [82] models, the informa-
tion contained in both medical ontologies and co-occurrence statistics
of medical codes is explored. Variational regularization has also been
used to impose constraints on the learning process and perform struc-
ture learning without predefined guiding graphs [18]. Direct medical
expert validation has also been employed [105].
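A conditional probability matrix derived from co-occurrence statistics can be sketched as below. Each entry estimates how often code b appears in visits that contain code a, and each row is then rescaled to sum to one. The visit data, code names, and the row-wise normalization are assumptions made for illustration; the exact construction used in [16] may differ.

```python
from collections import defaultdict
from itertools import permutations


def conditional_probability_matrix(visits):
    """Return P[a][b], a row-normalized estimate of p(b | a) from visit co-occurrence."""
    code_count = defaultdict(int)
    pair_count = defaultdict(int)
    for visit in visits:
        codes = set(visit)  # count each code once per visit
        for c in codes:
            code_count[c] += 1
        for a, b in permutations(codes, 2):
            pair_count[(a, b)] += 1

    all_codes = sorted(code_count)
    P = {}
    for a in all_codes:
        # Raw conditional probabilities p(b | a) for every other code b.
        row = {b: pair_count[(a, b)] / code_count[a] for b in all_codes if b != a}
        total = sum(row.values())
        # Rescale so that each row sums to one (assumed normalization).
        P[a] = {b: (v / total if total else 0.0) for b, v in row.items()}
    return P


# Hypothetical visits, each a list of observed codes.
visits = [
    ["dx:diabetes", "lab:hba1c", "rx:metformin"],
    ["dx:diabetes", "lab:hba1c"],
    ["dx:hypertension", "rx:metformin"],
]
P = conditional_probability_matrix(visits)
```

Entries equal to zero mark code pairs never observed together, which is how such a matrix can restrict the connections a model is allowed to attend to.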
4.2.2. Heterogeneous approaches
Recently, new approaches have been leveraging more of the rich
multimodal information contained in health records (n =9, Table 3).
Instead of focusing only on hierarchical medical information and struc-
tured knowledge graphs, medical events and entities can be represented
within a heterogeneous EHR graph. This results in a dense network
of interactions that more effectively captures the complexity of a patient’s
health status and can be applied to broader clinical settings
and data resources [79,84,110]. While traditional GNN architectures
were designed for homogeneous graphs, these new methods employ
novel GNNs designed for graphs that often contain multiple types of nodes and
edges. Alternatively, some of the evaluated methods aim to transform
a heterogeneous EHR graph into a homogeneous one, allowing it to be
processed by standard GNN layers [29].
In Med2Meta [84], Chowdhury et al. use graph autoencoders to
learn feature-specific embeddings for individual medical concept categories
in the EHR: demographics, laboratory results, and clinical
notes. These are then combined into meta-embeddings for downstream
tasks, which significantly benefits predicting diagnosis in a patient’s
subsequent visit. The HarmOnized Representation learning on Dynamic
EHR graphs (HORDE) model generates harmonized medical entity
embeddings based on a multimodal dynamic EHR graph [79]. The
approach focuses on two types of nodes: time-invariant nodes with
static properties, such as events (diagnoses, procedures, laboratory
results) and medical concepts (from unstructured information from
clinical notes), and time-varying nodes with dynamic properties, such
as patients. Multimodality resulted in a more robust model for disease
classification. To minimize the impact of noise in the heterogeneous
EHR graph, the Heterogeneous Similarity Graph Neural Network (HS-
GNN) [29] model employs a preprocessing method that splits the
heterogeneous EHR graph into several homogeneous subgraphs, which
are then merged into a single homogeneous graph. The unified graph
can then be fed into a GNN for diagnosis prediction.
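The general idea of splitting a heterogeneous graph into homogeneous subgraphs and merging them can be sketched as follows. For each entity type, patients are linked by the Jaccard similarity of the entities they share, and the resulting patient–patient graphs are combined by a weighted sum. This is an illustrative simplification and not the preprocessing procedure of HSGNN [29]; the similarity measure, the equal weights, and all patient data are assumptions.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def patient_similarity(links):
    """Build a homogeneous patient-patient graph from one entity type.

    `links` maps each patient to the set of entities of that type.
    Returns {(p, q): similarity} for every unordered patient pair.
    """
    return {
        (p, q): jaccard(links[p], links[q])
        for p, q in combinations(sorted(links), 2)
    }


def merge_subgraphs(subgraphs, weights):
    """Merge several patient-patient graphs by a weighted sum of edge weights."""
    merged = {}
    for sub, w in zip(subgraphs, weights):
        for edge, sim in sub.items():
            merged[edge] = merged.get(edge, 0.0) + w * sim
    return merged


# Hypothetical patient-diagnosis and patient-medication links.
diagnoses = {"p1": {"I10", "E11"}, "p2": {"I10"}, "p3": {"J44"}}
medications = {
    "p1": {"metformin"},
    "p2": {"metformin", "lisinopril"},
    "p3": {"salbutamol"},
}

merged = merge_subgraphs(
    [patient_similarity(diagnoses), patient_similarity(medications)],
    weights=[0.5, 0.5],  # assumed equal importance of both entity types
)
```

The merged graph has a single node type and scalar edge weights, so it can be passed to a standard GNN layer.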
To handle multi-dimensional time-series EHR data, the Time-aware
Context-Gated Graph Attention Network (T-ContextGGAN) [95] model
also utilizes a heterogeneous EHR graph, where patient nodes are
connected to clinical event nodes such as lab tests, infusion drugs,
and prescriptions. The model then uses meta-paths to connect nodes
from various time steps for survival prediction. ME2Vec also addresses
both the heterogeneity of EHRs and time series data, employing a
hierarchical framework to embed medical services, doctors, and patients [85].
Medical services are embedded using a random-walk approach
to account for irregular time intervals, while a GNN and a
proximity-preserving network embedding approach are used for doctors
and patients.
Cho et al. developed a novel EHR graph-database approach using
a multi-attributed and multi-relational bipartite graph to represent
patients and their relations with hospital visits and clinical events [99].
Then, HinSAGE was employed for predicting cardiovascular disease
events, demonstrating superior performance over traditional ML algo-
rithms in a link prediction task.
4.3. Temporality
Given that clinical risk prediction tasks can benefit from considering
the temporal aspects of patient records, EHR data are often not repre-
sented as a single graph but as a series of graphs observed in different
timesteps for a particular patient; or as a dynamic graph, where new
EHR information can be added and thus its nodes and edges change
over time [79,80,102].
To handle the sequential nature of hospital encounters, several
models couple GNNs with recurrent neural networks (RNNs) such as
long short-term memory (LSTM) networks or gated recurrent units
(GRU) (n =18, Supplementary Material). These architectures operate
by passing hidden state information from one input unit to the next,
encoding information about the entire sequence of medical events [1].
When coupled with GNNs, this pairing enables the model to leverage
both structural knowledge from medical graphs and time dependency
between clinical events [23]. For example, in JMRL, an attentive GRU
is used to aggregate temporal information between visits represented as
graphs for next-visit diagnosis prediction [25]. In the Longitudinal and
Graph Integrated (LIGHTED) model, Dong et al. account for the time
between visits by concatenating the embeddings of nodes representing
patient visits with the raw features observed for that visit and feeding
it sequentially into an LSTM [19].
Another approach to handling irregular sampling in patient
time information is the use of ordinary differential equations (ODEs). In the
Graph Attention and RNN-based Neural Ordinary Differential Equations
Model (GROM) [92], the ODE represents dynamic clinical data as
a continuous trajectory influenced by local initial states and the global
dynamics of the entire time series, and a time-invariant neural network
function determines the whole of the latent trajectory. The data is then
fed back to a bidirectional RNN layer to mitigate gradient and sequence
length-related issues.
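The way an ODE accommodates irregular sampling can be sketched with a simple Euler integrator: a latent state evolves according to a time-invariant function f, and the state at each visit is obtained by integrating over the actual elapsed time since the previous visit. The linear decay used for f is a stand-in for a learned neural network, and the visit times and initial state are hypothetical; this is not the GROM architecture.

```python
def euler_integrate(h, f, dt_total, step=0.01):
    """Advance state h by dt_total under dh/dt = f(h) using Euler steps."""
    n_steps = max(1, round(dt_total / step))
    dt = dt_total / n_steps
    for _ in range(n_steps):
        dh = f(h)
        h = [x + dt * d for x, d in zip(h, dh)]
    return h


def latent_trajectory(h0, f, visit_times):
    """Return the latent state at each visit time, given the state at the first."""
    states = [list(h0)]
    for prev, curr in zip(visit_times, visit_times[1:]):
        # The integration length is the real gap between visits,
        # so unevenly spaced visits are handled without resampling.
        states.append(euler_integrate(states[-1], f, curr - prev))
    return states


def decay(h, rate=0.5):
    """Toy time-invariant dynamics (assumed): exponential decay toward zero."""
    return [-rate * x for x in h]


visit_times = [0.0, 1.0, 1.5, 4.0]  # irregular gaps of 1.0, 0.5, and 2.5
states = latent_trajectory([1.0, 2.0], decay, visit_times)
```

With this choice of f, the state after a total elapsed time of 4.0 approaches the analytic value exp(-2) times the initial state, regardless of how the visits are spaced in between.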
Other studies have handled event sequentiality by employing spa-
tiotemporal GCNs to model both the graph structure and temporality of
the EHR data (n =2, Supplementary Material). In these models, tem-
poral information was leveraged by stacking node attributes along the
timestamp and employing a convolution operation to extract features
in the temporal domain [28,80].
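The temporal half of such a spatiotemporal model can be sketched as a one-dimensional convolution applied along the time axis of stacked node attributes. The input below has one scalar attribute per node per visit, and the kernel is a fixed two-step average; both are hypothetical, and the graph (spatial) convolution that would precede or follow this step is omitted. As in most deep learning libraries, the operation is implemented as a cross-correlation without kernel flipping.

```python
def temporal_conv(x, kernel):
    """Convolve each node's attribute sequence along time (no padding).

    x: list over timesteps, each a list with one value per node.
    kernel: list of weights shared by all nodes.
    Returns a list of length len(x) - len(kernel) + 1.
    """
    k = len(kernel)
    n_nodes = len(x[0])
    out = []
    for t in range(len(x) - k + 1):
        out.append(
            [sum(kernel[d] * x[t + d][v] for d in range(k)) for v in range(n_nodes)]
        )
    return out


# Four visits, two medical-code nodes; 1.0 marks that the code was observed.
x = [
    [1.0, 0.0],
    [0.0, 0.0],
    [1.0, 1.0],
    [1.0, 1.0],
]
kernel = [0.5, 0.5]  # assumed fixed weights; a model would learn these
y = temporal_conv(x, kernel)
```

Each output row summarizes two consecutive visits per node, so patterns such as a code persisting across visits yield larger activations than isolated occurrences.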
A dynamic graph, on the other hand, can be characterized as a
pair (𝐺, 𝑂), where 𝐺 refers to a static graph that represents the initial
state of the dynamic graph, and 𝑂 represents a tuple consisting of the
event type, the specific event (e.g., edge or node addition or deletion),
and its timestamp [128]. In the HORDE model [79], medical events,
clinical notes, and patients are represented as nodes in a dynamic
graph, with edges indicating event co-occurrence. As a patient’s condition
changes over time, an LSTM captures these changes, providing a temporal
context to the evolving graph structure. The model in [12] adds a new timestamped
node to the patient’s EHR graph for each registered event. In TAGNet
[83], time-series EHR data is also represented as a dynamic graph
to depict a single patient’s physiological condition across visits. A GRU
model is then employed to create the different representations for each
timestamp.
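The (𝐺, 𝑂) characterization of a dynamic graph can be sketched as an initial graph plus a stream of timestamped events that are replayed to obtain a snapshot at any time. The `Graph` class, the event names, and the example patient data are hypothetical and chosen for illustration; edges are treated as undirected.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # each edge is a frozenset of two nodes


def apply_events(initial, events, until):
    """Replay events with timestamp <= `until` on a copy of the initial graph.

    Each event is a tuple (event_type, payload, timestamp).
    """
    g = Graph(set(initial.nodes), set(initial.edges))
    for kind, payload, ts in sorted(events, key=lambda e: e[2]):
        if ts > until:
            break
        if kind == "add_node":
            g.nodes.add(payload)
        elif kind == "remove_node":
            g.nodes.discard(payload)
            g.edges = {e for e in g.edges if payload not in e}
        elif kind == "add_edge":
            g.nodes.update(payload)  # make sure both endpoints exist
            g.edges.add(frozenset(payload))
        elif kind == "remove_edge":
            g.edges.discard(frozenset(payload))
    return g


G0 = Graph(nodes={"patient"})
EVENTS = [
    ("add_node", "dx:I10", 1.0),
    ("add_edge", ("patient", "dx:I10"), 1.0),
    ("add_edge", ("patient", "rx:lisinopril"), 3.0),
    ("remove_edge", ("patient", "dx:I10"), 5.0),
]

snapshot_t2 = apply_events(G0, EVENTS, until=2.0)
snapshot_t6 = apply_events(G0, EVENTS, until=6.0)
```

A temporal model can then be applied to the sequence of snapshots, or directly to the event stream.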
Lu et al. introduced a global dynamic disease graph, shifting the
focus from individual patients [102]. They extract subgraphs from
observed diseases during patient visits, which are then processed with
GRUs to discern temporal disease patterns and forecast potential future
outbreaks.
4.4. Clinical risk prediction tasks
4.4.1. Diagnosis prediction
Diagnosis prediction aims to forecast whether a patient will be
diagnosed with a medical condition (n =36, Fig. 5). The reviewed
studies primarily focused on predicting diagnoses within the current or
upcoming visit [16,102]. Still, some extended it to many hours before
manifestation or even up to a year or two in advance [18,98]. Most
studies have treated the task as a multilabel classification problem (n
=24, Supplementary Table). In this context, various probabilities are
determined for multiple diseases, allowing a patient to be associated
with several diseases.
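The multilabel formulation can be illustrated with independent sigmoid outputs, one per disease. The logits stand in for the output of a hypothetical GNN-based model, and the disease names and 0.5 threshold are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_multilabel(logits, threshold=0.5):
    """Map per-disease logits to probabilities and a set of predicted labels."""
    probs = {disease: sigmoid(z) for disease, z in logits.items()}
    labels = {disease for disease, p in probs.items() if p >= threshold}
    return probs, labels


# Hypothetical model outputs for one patient.
logits = {"hypertension": 2.0, "type 2 diabetes": 0.3, "COPD": -1.5}
probs, labels = predict_multilabel(logits)
```

Unlike a softmax over classes, the probabilities are not constrained to sum to one, which is what allows a patient to be assigned several diseases at once.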
Graph convolutional networks (GCNs) are frequently employed for
diagnosis prediction (n =15; Fig. 6). In [86,129], GCNs were used
to learn medical concept embeddings based on a medical ontology.
The HealGCN model utilizes a heterogeneous GCN and a symptom
retrieval system to enable online disease self-diagnosis [88]. In [28,80],
spatiotemporal GCNs were applied to capture the sequential patterns in
patient records and predict multiple diseases.
Graph attention networks (GATs) have also been widely utilized for
diagnosis prediction (n =12; Fig. 6). They assign different attention
weights to specific medical entity nodes, allowing the models to learn
which nodes are more relevant for a given prediction task [74]. For
example, Choi et al. introduced the Graph Convolutional Transformer
(GCT) with an attention mechanism to learn the graphical representa-
tion of a patient’s visit, highlighting the most significant medical events
for future diagnosis prediction [16]. The JMRL model uses GATs with a
feedback strategy, incorporating medical knowledge graph embeddings
and medical concept co-occurrence to predict multiple diagnoses [25].
The MedPath model utilizes attention weights to provide explanations
for medical paths used in diagnosis prediction [30], and the GTGAT
model employs a gated tree-based GAT with hierarchical and semantic-
aware attention to distill valuable information from nodes, enhancing
personalized disease diagnosis performance [98].
Many studies have focused on predicting specific diagnoses (n =14,
Supplementary Table), particularly cardiovascular diseases (n =5,
Supplementary Table), which can be explained by the high prevalence of
positive diagnosis cases in EHR datasets [130]. For instance, the
HinSAGE model was utilized in [99] to predict cardiovascular disease
outcomes, whereas [12,84,95] employed GNNs for heart failure pre-
diction. Other specific diagnoses include diabetes, chronic hepatitis B,
opioid overdose, lymphocytic leukemia, sepsis, and pediatric COVID, as
seen in [19,27,85,94,95,105,110,114]. Furthermore, efforts to diagnose
rare diseases have also been observed [17].
4.4.2. Mortality prediction
Mortality prediction aims to determine whether a specific patient
will pass away (n =10, Fig. 5). This can be based on the patient’s
current visit or a particular period after admission or discharge from the
hospital or the Intensive Care Unit (ICU) [18,110]. In most cases, mor-
tality prediction has been evaluated along with other clinical tasks such
as readmission and diagnosis prediction (n =5, Table 2) [18,26,109].
Furthermore, it is often formulated as a binary classification problem,
where 1 denotes death and 0 denotes survival.
Various strategies have been employed to address this problem.
In the Temporal Aware Graph Convolution Network (TAGNet) model,
EHR time-series data are represented as evolving graphs, and a GCN is
used to mine the structural information for mortality forecasting [83].
Rocheteau et al. predict in-hospital mortality using a patient similar-
ity graph [26]. Multimodal information, such as clinical notes and
patient correlation, is integrated into a multi-view approach in the
Patient Multi-view Multi-modal Feature Fusion Network (PM2F2N)
model [104]. Furthermore, AttenSurv aimed to predict patient survival
using a global attention mechanism and a GNN to extract and identify
latent correlations between clinical risk factors [87].
4.4.3. Readmission prediction
Readmission prediction involves forecasting whether a particular
patient will be readmitted to the hospital after discharge or to the
Intensive Care Unit (ICU) during the same hospital visit [89,115] (n
=8, Fig. 5). This task is often formulated as a binary classification
problem, where 1 denotes readmission and 0 denotes no readmission.
Choi et al. and Wu et al. [16,85] proposed predicting ICU readmis-
sions using a single graph-structured encounter. In [18], readmission
at discharge was predicted using a variationally regularized encoder–decoder
graph network. The authors of [109] utilized hypergraph contrastive learning
to predict readmission using patient data collected within the initial
24 h of admission to the ICU.
Recent studies have focused on extending the readmission predic-
tion horizon to 30 days. Golmaei and Luo proposed a model that
integrates clinical note information with the topological structure of
patient networks [89]. Furthermore, Tang et al. utilized a multimodal
spatiotemporal GNN incorporating patient similarity to improve read-
mission predictions [115].
5. Discussion
This section examines the main aspects and perspectives of using
GNNs for EHR analysis, returning to the topics mentioned in the Results
section.
5.1. Overview
Going beyond the limitations of tabular patient data, GNNs have the
distinguishing advantage of incorporating dependencies among medical events and
entities into predictions to generate more reliable and personalized patient results.
Approaches to this include the use of medical ontology hierarchies as an
a priori knowledge source, as seen in [25,28,80] (Table 5); patient
similarity graphs, as seen in [26,98]; and multi-view approaches, which
incorporate the structure of different medical graphs into predictions
such as treatment, medication, and diagnosis interactions, as observed
in [91,96,104]. A summary of identified EHR graph representations
was provided in Table 3, and further details can be found in the
Supplementary Material.
5.2. Multimodality
EHR data present inherent complexity owing to their multimodal
nature. They encompass diverse data types, ranging from images to continuous
and discrete attributes, and include medical images, clinical notes, lab
results, and medications. A number of studies have primarily focused on
using codes from medical ontologies to represent a patient (Supplemen-
tary Table). Such an approach is limited in capturing the rich diversity
of information contained within medical records [1]. By integrating
various data modalities, a more comprehensive view of a patient’s
health status can be achieved, thereby enhancing the performance of
clinical risk prediction.
The reviewed studies indicate that GNNs can help address the challenges
posed by this large volume of heterogeneous patient data,
especially when considering the most recent models. For example, in
the DeepNote-GNN model, a natural language processing BERT module
was also used to leverage clinical notes for readmission prediction.
Moreover, [116] used EHRs and genetic reports to predict cancer. A
single study, [115], employed imaging data, specifically chest radio-
graphs. Some further methods of accomplishing this include employing
multi-view approaches and heterogeneous graphs, as in previously
described models. Coupled with adequate GNN architectures, these rep-
resentations allow the exploration of various relevant relations within
multi-modal patient data. For instance, one model segment may focus
on learning disease–disease relations while another can emphasize
patient similarity. When coupled, they can become an important and
holistic resource for modeling the complex topology of EHR relations
within patient information.
Furthermore, beyond the data types already present in most EHRs,
the integration of novel information from omics data (e.g., genomics
and transcriptomics) (only one observed study, [116]) and sensor/
wearable data (no observed studies) has the potential to provide ad-
ditional insights, enriching even further a patient’s profile.
5.3. Model evaluation
All studies provided some sort of GNN comparison against other
machine learning and deep learning models. For example, [98] com-
pared a Gated Tree-based Graph Attention Network with a multilayer
perceptron (MLP) and achieved an improvement in accuracy of almost
9%; and MERGE [101] had an increase in the AUROC compared with
other deep learning baselines of almost 16%. It is worth mentioning
that the degree of improvement heavily depends on the conditions
of the studies, including the model architecture, data resources, and
evaluated clinical risk prediction tasks.
Furthermore, EHR datasets are predominantly organized in tabular
format. This allows researchers to design different EHR graph structures
optimized for various downstream tasks. However, this flexibility also
challenges the comparability of graph-based models. For example, even
if two GNNs are trained on an identical EHR dataset (as observed here,
23 of the 50 articles used the benchmark dataset MIMIC-III for model
deployment; Table 4), the model results may be difficult to compare
owing to the differences in the GNN architectures and the unique EHR
graph topologies each of them proposes. Furthermore, these models
may interpret tasks differently. For instance, models might vary in
defining the threshold for assigning a ‘1’ (indicating positive) or ‘0’
(indicating negative) in clinical risk prediction tasks.
This complexity shows that making predictions in the medical field
is still a delicate task that depends heavily on the available resources
and the limitations of the architectures of the models. Moreover, model
variability underscores the need for unified assessment metrics and
cross-model interpretability tools to improve the comparability of the
GNN models. This is especially crucial for clinical tasks, as differ-
ences in model predictions, whether from varying graphical struc-
tures or interpretive paradigms, can lead to different critical clinical
interventions, directly impacting a patient’s health.
5.4. Interpretability
Regarding interpretability, it was observed that half of the models
employed some form of interpretability technique (Fig. 8). Among
these, highlights include attention weights and visualizations of the em-
bedding space with dimensionality reduction techniques. For example,
in [26], the attention weights explain the model’s behavior in learning
to correctly diagnose a patient based on assigning higher weights to
other patients with shared diagnoses. In [86,90], the embedding space
was visualized, providing insights into the learned clinical clusters.
However, these interpretability resources may only partially ex-
plain the models, especially if they involve high architectural and
preprocessing complexity. Deep learning methods are often referred
to as ‘black boxes’ because they operate by inputting complex data,
learning abstract patterns, and producing difficult-to-interpret results,
hiding their internal logic from users [131]. Given this, the more
heterogeneous the data used to represent patients, the more complex
and challenging it is to understand the GNN models, making it a double-
edged sword. For example, a GNN based only on diagnosis codes can be
less complex to examine than one based on multiple medical relations
and entities. The lack of graphical ground-truth explanations makes
interpretability even more complex, especially since most EHR datasets
are tabular, and standard evaluation strategies for graph machine learn-
ing are still emerging [132]. In this sense, it is crucial to move beyond
traditional interpretability methods to ensure the practical utility of
GNN-based systems in the medical field, given that predicting health
outcomes is a critical subject that requires thorough understanding to
be implemented in real clinical settings.
Although the identified techniques provide a level of insight, efforts
should be directed towards incorporating novel GNN interpretabil-
ity approaches, visual analytics, and user experience (UX) methods,
enabling medical professionals to effectively evaluate the outputs of
GNNs and provide feedback on the system’s predictions. Some re-
cently proposed methods include GNN-Explainer [133] and neuron
analysis [134]. More possibilities are described in [120,135] and point
to relevant directions for investigation, especially when combined to
include medical professionals in the model development process.
5.5. Temporality
Most existing GNN models applied to EHR data focus primarily on
capturing the sequential nature of events, disregarding the irregular
time aspect of clinical records. Data irregularity has already been
pointed out as a major challenge in modeling patient temporal data [7].
Models often overlook the exact time gaps between medical events and
visits; for example, whether two visits occurred one day or one year apart.
This is a crucial gap, as the time intervals between clinical events can
carry important information about the progression of a patient’s health
condition. By considering irregular time intervals between events, GNN
models can offer a more comprehensive representation of temporal
dynamics and potentially enable more accurate predictions.
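The information lost by order-only models can be illustrated with two hypothetical patients who each have three visits. Their visit order is identical, but the intervals between visits differ by two orders of magnitude. The dates and function name are made up for illustration.

```python
from datetime import date


def visit_gaps(visit_dates):
    """Return the number of days between consecutive visits."""
    ordered = sorted(visit_dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


# Both patients have three visits, so an order-only encoding treats them alike.
patient_a = [date(2020, 1, 1), date(2020, 1, 2), date(2020, 1, 3)]
patient_b = [date(2020, 1, 1), date(2021, 1, 1), date(2022, 1, 1)]

gaps_a = visit_gaps(patient_a)  # daily visits
gaps_b = visit_gaps(patient_b)  # yearly visits
```

Supplying such gaps to a model, for instance as additional visit features or as integration intervals in a continuous-time module, is one way to distinguish an acute episode from routine yearly follow-up.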
Using GNNs that can handle time series information and coupling
GNNs with other deep learning architectures that can leverage irregular
sequentiality represent interesting investigation directions.
Examples include GNN architectures that can handle dynamic EHR graphs,
which can update the graph structure as new timestamped patient data
come in [79], and modules based on ordinary differential equations
(ODEs), which are well suited for modeling continuous-time data [92]. Other
GNN techniques that can be used for time series processing are listed
in [136].
5.6. Prediction tasks
Among the evaluated articles, diagnosis prediction was the predom-
inant task, as it benefits from abundant and standardized diagnostic
code information in EHRs (n =36, Fig. 5). It is essential to predict
diagnoses accurately, but clinical tasks beyond diagnosis prediction also
need to be diversified. Recent research has focused on novel topics,
such as survival analysis, hospitalization risk, patient deterioration, and
disease severity (Fig. 5). A need was also observed for more diversity
in GNN models evaluating other clinical tasks, which have significantly fewer
published articles than diagnosis prediction (Fig. 6). Models for
predicting rare diseases and conditions are also encouraged, as among the
analyzed studies, only one aimed to tackle this problem [17]. The
high concentration and good performance of GNN architectures tested
for diagnosis prediction suggest that other prediction tasks can also
benefit from these techniques. Moreover, there is still ample room for
further exploration in evaluating these clinical risk tasks with better
patient representation based on graphs, especially those based on a
more heterogeneous, comprehensive view of a patient’s health state.
Finally, clinical tasks primarily aim at predicting outcomes related
to health concerns. Although reactive medicine is vital for patient care,
exploring the favorable factors that enhance patient health or optimize
treatment processes presents valuable opportunities for a more holis-
tic and patient-centered approach. In contrast to reactive medicine,
proactive and personalized medicine dedicates more time and resources
to disease prevention, early diagnosis, and treatment at stages when
it is more cost-effective and potent. This approach also emphasizes
managing chronic conditions before they escalate and lead to severe
complications.
5.7. Limitations
This study did not cover certain aspects, which provide opportunities
for future research. These include the evaluation of GNN model bias
and fairness across diverse patient populations, an area increasingly
recognized for its importance in clinical settings, as well as patient
data confidentiality. Understanding how these models perform across
different patient demographics and how to handle patient data safely
in GNN models is critical to ensuring equitable health outcomes.
It is also important to highlight the need to investigate the in-
tegration of these models into actual clinical workflows. The direct
application and evaluation of these models in clinical environments,
alongside a thorough comparative analysis with non-ML models like
clinical expert systems, were outside the study’s focus but are valuable
directions for future research. This involves considering how these
models interact with and augment the decision-making processes of
healthcare professionals.
Additionally, the scalability of these models, especially in terms of
computational requirements, was not a primary focus of this study.
GNNs, particularly when applied to the large and complex datasets typical of
healthcare, can require significant computational power. Investigating
ways to optimize these models for more widespread and cost-effective
use in diverse clinical environments is essential for subsequent studies.
The narrative review approach is well suited to providing a broad
and comprehensive overview, yet it has limitations. Due to the rapidly
evolving nature of GNN applications in clinical risk prediction, some
recent developments may not have been included. In addition, narrative
reviews are prone to bias in the selection and interpretation of
studies, which can limit the generalizability of their findings.
These limitations highlight the need for future systematic reviews and
meta-analyses to build on the insights provided by this study.
Finally, this review focused specifically on studies that employed
GNN approaches. Given the novelty of this field and its often
inconsistent terminology, some relevant works may have been
unintentionally omitted. However, the filtering criteria were carefully
designed to maximize the inclusion of relevant studies and keep such
omissions to a minimum.
6. Conclusion
The comprehensive review presented in this paper examined the
application of GNNs for clinical risk prediction using EHRs. Initially, a
background on EHR graph representation learning was provided, along
with its relevance to relational medical data, and an introduction to
GNNs and graph analysis. Next, the paper highlighted state-of-the-art
GNN approaches that are effective in modeling EHR data, including the
graph convolutional network (GCN), graph attention network (GAT),
and graph autoencoder (GAE).
The Results section provided statistics on the analyzed articles and
detailed the analysis around three axes: data representation, temporal-
ity, and clinical prediction tasks. It was possible to identify a growing
trend of articles focusing on this topic, and the positive outcomes of
the analyzed models demonstrate the significance and potential of the
area. Among the architectures employed, GAT and GCN were the most
common, and the task on which these models focused the most was
the prediction of diagnoses, followed by mortality and readmission.
Furthermore, there was a predominance of the MIMIC-III dataset as a
resource for building the models.
Finally, the Discussion section presented open challenges in the
area. Future research directions include developing models that
effectively handle multimodal, heterogeneous, and irregularly timed
information in EHR data. Additionally, interpretability should be
emphasized, particularly to enhance clinicians’ understanding of the
predictions, and efforts should be directed towards diversifying the
scope of investigated clinical risk prediction tasks. Progress along
these lines will contribute to more informed healthcare decision-making
and ultimately improve patient care.
CRediT authorship contribution statement
Heloísa Oss Boll: Writing – review & editing, Writing – original
draft, Visualization, Methodology, Investigation, Formal analysis,
Data curation, Conceptualization. Ali Amirahmadi: Writing – review
& editing, Conceptualization. Mirfarid Musavian Ghazani: Writing –
review & editing, Conceptualization. Wagner Ourique de Morais:
Writing – review & editing, Conceptualization. Edison Pignaton de
Freitas: Writing – review & editing, Conceptualization. Amira
Soliman: Writing – review & editing, Supervision, Methodology,
Conceptualization. Farzaneh Etminani: Writing – review & editing,
Supervision, Methodology, Conceptualization. Stefan Byttner: Writing –
review & editing, Supervision, Methodology, Conceptualization.
Mariana Recamonde-Mendoza: Writing – review & editing, Visualization,
Supervision, Methodology, Conceptualization.
Declaration of competing interest
The authors declare that they have no known competing finan-
cial interests or personal relationships that could have appeared to
influence the work reported in this paper.
Acknowledgments
Funding
This work was financed in part by the Swedish Council for Higher
Education through the Linnaeus-Palme Partnership, Sweden
(3.3.1.34.16456), Coordenação de Aperfeiçoamento de Pessoal de Nível
Superior (CAPES), Brazil - Finance Code 001, and Conselho Nacional
de Desenvolvimento Científico e Tecnológico (CNPq), Brazil through
grants nr. 309505/2020-8 and 308075/2021-8. We also acknowledge
the support from Fundação de Amparo à Pesquisa do Estado do Rio
Grande do Sul (FAPERGS), Brazil through grants nr. 22/2551-0000390-
7 (Project CIARS) and 21/2551-0002052-0.
Appendix A. Supplementary data
Supplementary material related to this article can be found online
at https://doi.org/10.1016/j.jbi.2024.104616.
References
[1] Y. Si, J. Du, Z. Li, X. Jiang, T. Miller, F. Wang, W. Jim Zheng, K. Roberts,
Deep representation learning of patient data from Electronic Health Records
(EHR): A systematic review, J. Biomed. Inform. 115 (2021) 103671, http:
//dx.doi.org/10.1016/j.jbi.2020.103671, URL https://www.sciencedirect.com/
science/article/pii/S1532046420302999.
[2] N.J. Carson, B. Mullin, M.J. Sanchez, F. Lu, K. Yang, M. Menezes, B.L. Cook,
Identification of suicidal behavior among psychiatrically hospitalized adoles-
cents using natural language processing and machine learning of electronic
health records, PLoS One 14 (2) (2019) e0211116, http://dx.doi.org/10.1371/
journal.pone.0211116, URL https://dx.plos.org/10.1371/journal.pone.0211116.
[3] T. Zheng, W. Xie, L. Xu, X. He, Y. Zhang, M. You, G. Yang, Y. Chen,
A machine learning-based framework to identify type 2 diabetes through
electronic health records, Int. J. Med. Inform. 97 (2017) 120–127, http://dx.
doi.org/10.1016/j.ijmedinf.2016.09.014, URL https://linkinghub.elsevier.com/
retrieve/pii/S1386505616302155.
[4] S. Fu, L.Y. Leung, A.-O. Raulli, D.F. Kallmes, K.A. Kinsman, K.B. Nelson, M.S.
Clark, P.H. Luetmer, P.R. Kingsbury, D.M. Kent, H. Liu, Assessment of the
impact of EHR heterogeneity for clinical research through a case study of
silent brain infarction, BMC Med. Inform. Decis. Mak. 20 (1) (2020) 60, http://
dx.doi.org/10.1186/s12911-020-1072-9, URL https://bmcmedinformdecismak.
biomedcentral.com/articles/10.1186/s12911-020-1072-9.
[5] B. Theodorou, C. Xiao, J. Sun, Synthesize high-dimensional longitudinal elec-
tronic health records via hierarchical autoregressive language model, 2023,
arXiv:2304.02169 [cs] URL http://arxiv.org/abs/2304.02169.
[6] B.J. Wells, A.S. Nowacki, K. Chagin, M.W. Kattan, Strategies for handling
missing data in electronic health record derived data, eGEMs J. Electron. Health
Data Methods 1 (3) (2013) 7, http://dx.doi.org/10.13063/2327-9214.1035,
URL https://up-j-gemgem.ubiquityjournal.website/articles/30.
[7] F. Xie, H. Yuan, Y. Ning, M.E.H. Ong, M. Feng, W. Hsu, B. Chakraborty,
N. Liu, Deep learning for temporal data representation in electronic health
records: A systematic review of challenges and methodologies, J. Biomed.
Inform. 126 (2022) 103980, http://dx.doi.org/10.1016/j.jbi.2021.103980, URL
https://linkinghub.elsevier.com/retrieve/pii/S1532046421003099.
[8] Y. Bengio, A. Courville, P. Vincent, Representation learning: A review and new
perspectives, 2014, arXiv:1206.5538 [cs] URL http://arxiv.org/abs/1206.5538.
[9] S.F. Ahmed, M.S.B. Alam, M. Hassan, M.R. Rozbu, T. Ishtiak, N. Rafa, M. Mofi-
jur, A.B.M. Shawkat Ali, A.H. Gandomi, Deep learning modelling techniques:
current progress, applications, advantages, and challenges, Artif. Intell. Rev. 56
(11) (2023) 13521–13617, http://dx.doi.org/10.1007/s10462-023-10466-8.
[10] Q. Suo, H. Xue, J. Gao, A. Zhang, Risk factor analysis based on deep
learning models, in: Proceedings of the 7th ACM International Conference
on Bioinformatics, Computational Biology, and Health Informatics, BCB ’16,
Association for Computing Machinery, New York, NY, USA, 2016, pp. 394–403,
http://dx.doi.org/10.1145/2975167.2975208.
[11] C. Xiao, E. Choi, J. Sun, Opportunities and challenges in developing deep
learning models using electronic health records data: a systematic review,
J. Amer. Med. Inform. Assoc. 25 (10) (2018) 1419–1428, http://dx.doi.
org/10.1093/jamia/ocy068, URL https://academic.oup.com/jamia/article/25/
10/1419/5035024.
[12] S. Chowdhury, Y. Chen, A. Wen, X. Ma, Q. Dai, Y. Yu, S. Fu, X. Jiang, N.
Zong, Predicting physiological response in heart failure management: A graph
representation learning approach using electronic health records, 2023, http:
//dx.doi.org/10.1101/2023.01.27.23285129, URL http://medrxiv.org/lookup/
doi/10.1101/2023.01.27.23285129.
[13] H. Lu, S. Uddin, Disease prediction using graph machine learning based on
electronic health data: A review of approaches and trends, Healthcare 11
(7) (2023) http://dx.doi.org/10.3390/healthcare11071031, URL https://www.
mdpi.com/2227-9032/11/7/1031.
[14] A. Amirahmadi, M. Ohlsson, K. Etminani, Deep learning prediction mod-
els based on EHR trajectories: A systematic review, J. Biomed. Inform.
144 (2023) 104430, http://dx.doi.org/10.1016/j.jbi.2023.104430, URL https:
//www.sciencedirect.com/science/article/pii/S153204642300151X.
[15] W.-H. Weng, P. Szolovits, Representation learning for electronic health records,
2019, arXiv:1909.09248 [cs, stat] URL http://arxiv.org/abs/1909.09248.
[16] E. Choi, Z. Xu, Y. Li, M. Dusenberry, G. Flores, E. Xue, A. Dai, Learning
the graphical structure of electronic health records with graph convolutional
transformer, in: Proceedings of the AAAI Conference on Artificial Intelligence,
Vol. 34, no. 01, 2020, pp. 606–613, http://dx.doi.org/10.1609/aaai.v34i01.5400, URL
https://ojs.aaai.org/index.php/AAAI/article/view/5400.
[17] Z. Sun, H. Yin, H. Chen, T. Chen, L. Cui, F. Yang, Disease prediction via
graph neural networks, IEEE J. Biomed. Health Inform. 25 (3) (2021) 818–
826, http://dx.doi.org/10.1109/JBHI.2020.3004143, URL https://ieeexplore.
ieee.org/document/9122573/.
[18] W. Zhu, N. Razavian, Variationally regularized graph-based representation
learning for electronic health records, in: Proceedings of the Confer-
ence on Health, Inference, and Learning, ACM, 2021, pp. 1–13, http:
//dx.doi.org/10.1145/3450439.3451855, URL https://dl.acm.org/doi/10.1145/
3450439.3451855.
[19] X. Dong, R. Wong, W. Lyu, K. Abell-Hart, J. Deng, Y. Liu, J.G. Hajagos,
R.N. Rosenthal, C. Chen, F. Wang, An integrated LSTM-HeteroRGNN model
for interpretable opioid overdose risk prediction, Artif. Intell. Med. 135
(2023) 102439, http://dx.doi.org/10.1016/j.artmed.2022.102439, URL https:
//linkinghub.elsevier.com/retrieve/pii/S0933365722001919.
[20] M.M. Bronstein, J. Bruna, Y. LeCun, A. Szlam, P. Vandergheynst, Geometric
deep learning: Going beyond Euclidean data, IEEE Signal Process. Mag. 34
(4) (2017) 18–42, http://dx.doi.org/10.1109/MSP.2017.2693418.
[21] I.A. Chikwendu, X. Zhang, I.O. Agyemang, I. Adjei-Mensah, U.C. Chima, C.J.
Ejiyi, A comprehensive survey on deep graph representation learning methods,
J. Artificial Intelligence Res. 78 (2023) 287–356, http://dx.doi.org/10.1613/
jair.1.14768, URL https://jair.org/index.php/jair/article/view/14768.
[22] W. Jiang, J. Luo, Graph neural network for traffic forecasting: A survey, Expert
Syst. Appl. 207 (2022) 117921, http://dx.doi.org/10.1016/j.eswa.2022.117921,
URL https://www.sciencedirect.com/science/article/pii/S0957417422011654.
[23] J. Xu, X. Xi, J. Chen, V.S. Sheng, J. Ma, Z. Cui, A survey of deep learning for
electronic health records, Appl. Sci. 12 (22) (2022) 11709, http://dx.doi.org/
10.3390/app122211709.
[24] H. Cui, J. Lu, S. Wang, R. Xu, W. Ma, S. Yu, Y. Yu, X. Kan, T. Fu, C. Ling, J.
Ho, F. Wang, C. Yang, A survey on knowledge graphs for healthcare: Resources,
application progress, and promise, in: ICML 3rd Workshop on Interpretable
Machine Learning in Healthcare, IMLH, 2023, p. 19, URL https://openreview.
net/forum?id=CZCktJoBRh.
[25] K. Wang, N. Chen, T. Chen, Joint medical ontology representation learning for
healthcare predictions, in: 2020 International Joint Conference on Neural Net-
works (IJCNN), IEEE, 2020, pp. 1–7, http://dx.doi.org/10.1109/IJCNN48605.
2020.9207355, URL https://ieeexplore.ieee.org/document/9207355/.
[26] E. Rocheteau, C. Tong, P. Veličković, N. Lane, P. Liò, Predicting patient
outcomes with graph representation learning, 2021, arXiv:2101.03940 [cs] URL
http://arxiv.org/abs/2101.03940.
[27] Z.E. Wu, D. Xu, P.J.-H. Hu, T.-S. Huang, A hierarchical multilabel graph
attention network method to predict the deterioration paths of chronic hepatitis
B patients, J. Amer. Med. Inform. Assoc. 30 (5) (2023) 846–858, http://dx.doi.
org/10.1093/jamia/ocad008, URL https://academic.oup.com/jamia/article/30/
5/846/7040373.
[28] Y. Li, B. Qian, X. Zhang, H. Liu, Knowledge guided diagnosis prediction via
graph spatial-temporal network, in: Proceedings of the 2020 SIAM International
Conference on Data Mining, SDM, SIAM, 2020, pp. 19–27, http://
dx.doi.org/10.1137/1.9781611976236.3, URL https://epubs.siam.org/doi/abs/10.1137/1.
9781611976236.3.
[29] Z. Liu, X. Li, H. Peng, L. He, P.S. Yu, Heterogeneous similarity graph neural
network on electronic health records, in: 2020 IEEE International Conference
on Big Data (Big Data), IEEE, 2020, pp. 1196–1205, http://dx.doi.org/10.
1109/BigData50022.2020.9377795, URL https://ieeexplore.ieee.org/document/
9377795/.
[30] M. Ye, S. Cui, Y. Wang, J. Luo, C. Xiao, F. Ma, MedPath: Augmenting
health risk prediction via medical knowledge paths, in: Proceedings of the
Web Conference 2021, ACM, 2021, pp. 1397–1409, http://dx.doi.org/10.1145/
3442381.3449860, URL https://dl.acm.org/doi/10.1145/3442381.3449860.
[31] C. Lu, C.K. Reddy, Y. Ning, Self-supervised graph learning with hyperbolic
embedding for temporal health event prediction, IEEE Trans. Cybern. 53
(4) (2023) 2124–2136, http://dx.doi.org/10.1109/TCYB.2021.3109881, arXiv:
2106.04751 [cs] URL http://arxiv.org/abs/2106.04751.
[32] J. Schrodt, A. Dudchenko, P. Knaup-Gregori, M. Ganzinger, Graph-
representation of patient data: a systematic literature review, J. Med. Syst. 44
(4) (2020) 86.
[33] E. Choi, M.T. Bahadori, A. Schuetz, W.F. Stewart, J. Sun, Doctor AI: Predicting
clinical events via recurrent neural networks, 2016, arXiv:1511.05942 [cs] URL
http://arxiv.org/abs/1511.05942.
[34] E. Choi, M.T. Bahadori, J.A. Kulas, A. Schuetz, W.F. Stewart, J. Sun, RETAIN:
An interpretable predictive model for healthcare using reverse time atten-
tion mechanism, 2017, arXiv:1608.05745 [cs] URL http://arxiv.org/abs/1608.
05745.
[35] F. Ma, R. Chitta, J. Zhou, Q. You, T. Sun, J. Gao, Dipole: Diagnosis prediction
in healthcare via attention-based bidirectional recurrent neural networks, in:
Proceedings of the 23rd ACM SIGKDD International Conference on Knowl-
edge Discovery and Data Mining, 2017, pp. 1903–1911, http://dx.doi.org/
10.1145/3097983.3098088, arXiv:1706.05764 [cs] URL http://arxiv.org/abs/
1706.05764.
[36] P. Nguyen, T. Tran, N. Wickramasinghe, S. Venkatesh, Deepr: A convolutional
net for medical records, 2016, arXiv:1607.07519 [cs, stat] URL http://arxiv.
org/abs/1607.07519.
[37] E. Choi, M.T. Bahadori, E. Searles, C. Coffey, J. Sun, Multi-layer representation
learning for medical concepts, 2016, arXiv:1602.05568 [cs] URL http://arxiv.
org/abs/1602.05568.
[38] Y. Choi, C.Y.-I. Chiu, D. Sontag, Learning low-dimensional representations of
medical concepts, AMIA Summits Transl. Sci. Proc. 2016 (2016) 41–50, URL
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5001761/.
[39] Z. Che, D. Kale, W. Li, M.T. Bahadori, Y. Liu, Deep computational phenotyp-
ing, in: Proceedings of the 21th ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining, KDD ’15, Association for Computing
Machinery, 2015, pp. 507–516, http://dx.doi.org/10.1145/2783258.2783365.
[40] R. Miotto, L. Li, B.A. Kidd, J.T. Dudley, Deep patient: An unsupervised
representation to predict the future of patients from the electronic health
records, Sci. Rep. 6 (1) (2016) 26094, http://dx.doi.org/10.1038/srep26094,
URL https://www.nature.com/articles/srep26094.
[41] Y. Cheng, F. Wang, P. Zhang, J. Hu, Risk prediction with electronic health
records: A deep learning approach, in: Proceedings of the 2016 SIAM Inter-
national Conference on Data Mining, Society for Industrial and Applied Math-
ematics, 2016, pp. 432–440, http://dx.doi.org/10.1137/1.9781611974348.49,
URL https://epubs.siam.org/doi/10.1137/1.9781611974348.49.
[42] T. Pham, T. Tran, D. Phung, S. Venkatesh, DeepCare: A deep dynamic memory
model for predictive medicine, 2017, arXiv:1602.00357 [cs, stat] URL http:
//arxiv.org/abs/1602.00357.
[43] International classification of diseases (ICD), 2023, URL https://www.who.int/
standards/classifications/classification-of-diseases.
[44] J. Zhang, J. Gong, L. Barnes, HCNN: Heterogeneous convolutional neural
networks for comorbid risk prediction with electronic health records, in: 2017
IEEE/ACM International Conference on Connected Health: Applications, Systems
and Engineering Technologies (CHASE), IEEE, 2017, pp. 214–221, http://
dx.doi.org/10.1109/CHASE.2017.80, URL http://ieeexplore.ieee.org/document/
8010635/.
[45] E. Choi, M.T. Bahadori, L. Song, W.F. Stewart, J. Sun, GRAM: Graph-based
attention model for healthcare representation learning, in: Proceedings of the
23rd ACM SIGKDD International Conference on Knowledge Discovery and
Data Mining, ACM, 2017, pp. 787–795, http://dx.doi.org/10.1145/3097983.
3098126, URL https://dl.acm.org/doi/10.1145/3097983.3098126.
[46] F. Ma, Q. You, H. Xiao, R. Chitta, J. Zhou, J. Gao, KAME: Knowledge-
based attention model for diagnosis prediction in healthcare, in: Proceedings
of the 27th ACM International Conference on Information and Knowledge
Management, ACM, 2018, pp. 743–752, http://dx.doi.org/10.1145/3269206.
3271701, URL https://dl.acm.org/doi/10.1145/3269206.3271701.
[47] L. Song, C.W. Cheong, K. Yin, W.K. Cheung, B.C.M. Fung, J. Poon, Medical
concept embedding with multiple ontological representations, in: Proceedings
of the Twenty-Eighth International Joint Conference on Artificial Intelligence,
International Joint Conferences on Artificial Intelligence Organization, 2019,
pp. 4613–4619, http://dx.doi.org/10.24963/ijcai.2019/641, URL https://www.
ijcai.org/proceedings/2019/641.
[48] J. Gao, X. Wang, Y. Wang, Z. Yang, J. Gao, J. Wang, W. Tang, X. Xie, CAMP: Co-
attention memory networks for diagnosis prediction in healthcare, in: 2019 IEEE
International Conference on Data Mining (ICDM), IEEE, 2019, pp. 1036–1041,
http://dx.doi.org/10.1109/ICDM.2019.00120, URL https://ieeexplore.ieee.org/
document/8970792/.
[49] E. Choi, C. Xiao, W.F. Stewart, J. Sun, MiME: Multilevel medical embedding
of electronic health records for predictive healthcare, 2018, arXiv:1810.09593
[cs, stat] URL http://arxiv.org/abs/1810.09593.
[50] Y. Wang, W. Chen, D. Pi, R. Boots, Graph augmented triplet architecture for
fine-grained patient similarity, World Wide Web 23 (5) (2020) 2739–2752,
http://dx.doi.org/10.1007/s11280-020-00794-y, URL http://link.springer.com/
10.1007/s11280-020-00794-y.
[51] B. Hettige, Y.-F. Li, W. Wang, S. Le, W. Buntine, MedGraph: Structural and
temporal representation learning of electronic medical records, 2020, arXiv:
1912.03703 [cs, stat] URL http://arxiv.org/abs/1912.03703.
[52] R. Li, C. Yin, S. Yang, B. Qian, P. Zhang, Marrying medical domain knowledge
with deep learning on electronic health records: A deep visual analytics
approach, J. Med. Internet Res. 22 (9) (2020) e20645, http://dx.doi.org/10.
2196/20645, URL http://www.jmir.org/2020/9/e20645/.
[53] H. Jiang, D. Yang, Learning graph-based embedding from EHRs for time-aware
patient similarity, Eng. Lett. 28 (4) (2020).
[54] J. Zhou, G. Cui, S. Hu, Z. Zhang, C. Yang, Z. Liu, L. Wang, C. Li, M.
Sun, Graph neural networks: A review of methods and applications, AI Open
1 (2020) 57–81, http://dx.doi.org/10.1016/j.aiopen.2021.01.001, URL https:
//linkinghub.elsevier.com/retrieve/pii/S2666651021000012.
[55] B. Sanchez-Lengeling, E. Reif, A. Pearce, A.B. Wiltschko, A gentle introduction
to graph neural networks, Distill 6 (9) (2021) e33, http://dx.doi.org/10.23915/
distill.00033, URL https://distill.pub/2021/gnn-intro.
[56] P. Veličković, Everything is connected: Graph neural networks, 2023, arXiv:
2301.08210 [cs, stat] URL http://arxiv.org/abs/2301.08210.
[57] C. Gao, Y. Zheng, N. Li, Y. Li, Y. Qin, J. Piao, Y. Quan, J. Chang, D. Jin, X. He,
Y. Li, A survey of graph neural networks for recommender systems: Challenges,
methods, and directions, ACM Trans. Recomm. Syst. 1 (1) (2023) 3:1–3:51, http:
//dx.doi.org/10.1145/3568022, URL https://dl.acm.org/doi/10.1145/3568022.
[58] L.A. Alves, N.C.D.S. Ferreira, V. Maricato, A.V.P. Alberto, E.A. Dias, N. Jose
Aguiar Coelho, Graph neural networks as a potential tool in improving virtual
screening programs, Front. Chem. 9 (2022) 787194, http://dx.doi.org/10.3389/
fchem.2021.787194, URL https://www.frontiersin.org/articles/10.3389/fchem.
2021.787194/full.
[59] P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Liò, Y. Bengio, Graph
attention networks, 2018, arXiv:1710.10903 [cs, stat] URL http://arxiv.org/
abs/1710.10903.
[60] K. Jha, S. Saha, H. Singh, Prediction of protein–protein interaction using
graph neural networks, Sci. Rep. 12 (1) (2022) 8360, http://dx.doi.org/
10.1038/s41598-022-12201-9, URL https://www.nature.com/articles/s41598-
022-12201-9.
[61] G. Panagopoulos, G. Nikolentzos, M. Vazirgiannis, Transfer graph neural
networks for pandemic forecasting, in: Proceedings of the AAAI Conference
on Artificial Intelligence, Vol. 35, no. 6, 2021, pp. 4838–4845, http://dx.
doi.org/10.1609/aaai.v35i6.16616, URL https://ojs.aaai.org/index.php/AAAI/
article/view/16616.
[62] P. Bongini, M. Bianchini, F. Scarselli, Molecular generative graph neural
networks for drug discovery, Neurocomputing 450 (2021) 242–252, http://dx.
doi.org/10.1016/j.neucom.2021.04.039, URL https://linkinghub.elsevier.com/
retrieve/pii/S0925231221005737.
[63] Z. Lin, D. Yang, H. Jiang, H. Yin, Learning patient similarity via heterogeneous
medical knowledge graph embedding, Int. J. Comput. Sci. 48 (4) (2021).
[64] M. Gori, G. Monfardini, F. Scarselli, A new model for learning in graph
domains, in: Proceedings. 2005 IEEE International Joint Conference on Neural
Networks, 2005, Vol. 2, 2005, pp. 729–734, http://dx.doi.org/10.1109/IJCNN.
2005.1555942.
[65] F. Scarselli, M. Gori, A.C. Tsoi, M. Hagenbuchner, G. Monfardini, The graph
neural network model, IEEE Trans. Neural Netw. 20 (1) (2009) 61–80,
http://dx.doi.org/10.1109/TNN.2008.2005605, URL http://ieeexplore.ieee.org/
document/4700287/.
[66] S.K. Maurya, X. Liu, T. Murata, Feature selection: Key to enhance node
classification with graph neural networks, CAAI Trans. Intell. Technol. 8 (1)
(2023) 14–28, http://dx.doi.org/10.1049/cit2.12166, URL https://ietresearch.
onlinelibrary.wiley.com/doi/10.1049/cit2.12166.
[67] K. Xu, W. Hu, J. Leskovec, S. Jegelka, How powerful are graph neural
networks?, 2019, arXiv:1810.00826 [cs, stat] URL http://arxiv.org/abs/1810.
00826.
[68] A. Mohi ud din, S. Qureshi, A review of challenges and solutions in the design
and implementation of deep graph neural networks, Int. J. Comput. Appl.
45 (3) (2023) 221–230, http://dx.doi.org/10.1080/1206212X.2022.2133805.
[69] A. Daigavane, B. Ravindran, G. Aggarwal, Understanding convolutions on
graphs, Distill 6 (9) (2021) e32, http://dx.doi.org/10.23915/distill.00032, URL
https://distill.pub/2021/understanding-gnns.
[70] T.N. Kipf, M. Welling, Semi-supervised classification with graph convolutional
networks, 2017, arXiv:1609.02907 [cs, stat] URL http://arxiv.org/abs/1609.
02907.
[71] C. Gao, Y. Zheng, N. Li, Y. Li, Y. Qin, J. Piao, Y. Quan, J. Chang, D. Jin, X. He,
Y. Li, A survey of graph neural networks for recommender systems: Challenges,
methods, and directions, ACM Trans. Recomm. Syst. 1 (1) (2023) 1–51, http:
//dx.doi.org/10.1145/3568022, URL https://dl.acm.org/doi/10.1145/3568022.
[72] Z. Chen, F. Chen, L. Zhang, T. Ji, K. Fu, L. Zhao, F. Chen, L. Wu, C.
Aggarwal, C.-T. Lu, Bridging the gap between spatial and spectral domains:
A survey on graph neural networks, 2021, arXiv:2002.11867 [cs, stat] URL
http://arxiv.org/abs/2002.11867.
[73] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez, Ł.
Kaiser, I. Polosukhin, Attention is all you need, in: I. Guyon, U.V. Luxburg, S.
Bengio, H. Wallach, R. Fergus, S. Vishwanathan, R. Garnett (Eds.), in: Advances
in Neural Information Processing Systems, vol. 30, Curran Associates, Inc.,
2017, pp. 1–11, URL https://proceedings.neurips.cc/paper_files/paper/2017/
file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf.
[74] S. Chaudhari, V. Mithal, G. Polatkan, R. Ramanath, An attentive survey of
attention models, ACM Trans. Intell. Syst. Technol. 12 (5) (2021) 53:1–53:32,
http://dx.doi.org/10.1145/3465055.
[75] Z. Zhang, P. Cui, W. Zhu, Deep learning on graphs: A survey, IEEE Trans.
Knowl. Data Eng. 34 (1) (2022) 249–270, http://dx.doi.org/10.1109/
TKDE.2020.2981333, URL https://ieeexplore.ieee.org/abstract/document/
9039675.
[76] W.L. Hamilton, R. Ying, J. Leskovec, Representation learning on graphs:
Methods and applications, 2018, arXiv:1709.05584 [cs] URL http://arxiv.org/
abs/1709.05584.
[77] T.N. Kipf, M. Welling, Variational graph auto-encoders, 2016, arXiv:1611.07308
[cs, stat] URL http://arxiv.org/abs/1611.07308.
[78] M.J. Page, J.E. McKenzie, P.M. Bossuyt, I. Boutron, T.C. Hoffmann, C.D.
Mulrow, L. Shamseer, J.M. Tetzlaff, E.A. Akl, S.E. Brennan, R. Chou, J.
Glanville, J.M. Grimshaw, A. Hróbjartsson, M.M. Lalu, T. Li, E.W. Loder, E.
Mayo-Wilson, S. McDonald, L.A. McGuinness, L.A. Stewart, J. Thomas, A.C.
Tricco, V.A. Welch, P. Whiting, D. Moher, The PRISMA 2020 statement: an
updated guideline for reporting systematic reviews, BMJ 372 (2021) n71,
http://dx.doi.org/10.1136/bmj.n71, URL https://www.bmj.com/content/372/
bmj.n71.
[79] D. Lee, X. Jiang, H. Yu, Harmonized representation learning on dynamic
EHR graphs, J. Biomed. Inform. 106 (2020) 103426, http://dx.doi.org/
10.1016/j.jbi.2020.103426, URL https://linkinghub.elsevier.com/retrieve/pii/
S153204642030054X.
[80] Y. Li, B. Qian, X. Zhang, H. Liu, Graph neural network-based diagnosis
prediction, Big Data 8 (5) (2020) 379–390, http://dx.doi.org/10.1089/big.2020.
0070, URL https://www.liebertpub.com/doi/10.1089/big.2020.0070.
[81] B.T. Lee, O.-Y. Kwon, H. Park, K.-J. Cho, J.-M. Kwon, Y. Lee, Graph
convolutional networks-based noisy data imputation in electronic health
record, Crit. Care Med. 48 (11) (2020) e1106–e1111, http://dx.doi.org/10.
1097/CCM.0000000000004583, URL https://journals.lww.com/10.1097/CCM.
0000000000004583.
[82] Q. Wang, B.C.M. Fung, P.C.K. Hung, DUGRA: Dual-graph representation learn-
ing for health information networks, in: 2020 IEEE International Conference
on Big Data (Big Data), IEEE, 2020, pp. 4961–4970, http://dx.doi.org/10.
1109/BigData50022.2020.9378420, URL https://ieeexplore.ieee.org/document/
9378420/.
[83] S. Wang, J. Liu, TAGNet: Temporal aware graph convolution network for
clinical information extraction, in: 2020 IEEE International Conference on
Bioinformatics and Biomedicine (BIBM), IEEE, 2020, pp. 2105–2108, http:
//dx.doi.org/10.1109/BIBM49941.2020.9313530, URL https://ieeexplore.ieee.
org/document/9313530/.
[84] S. Chowdhury, C. Zhang, P. Yu, Y. Luo, Med2Meta: Learning representations
of medical concepts with meta-embeddings, in: Proceedings of the
13th International Joint Conference on Biomedical Engineering Systems and
Technologies, SCITEPRESS - Science and Technology Publications, 2020, pp.
369–376, http://dx.doi.org/10.5220/0008934403690376, URL https://www.
scitepress.org/DigitalLibrary/Link.aspx?doi=10.5220/0008934403690376.
[85] T. Wu, Y. Wang, Y. Wang, E. Zhao, Y. Yuan, Leveraging graph-based hier-
archical medical entity embedding for healthcare applications, Sci. Rep. 11
(1) (2021) 5858, http://dx.doi.org/10.1038/s41598-021-85255-w, URL https:
//www.nature.com/articles/s41598-021-85255-w.
[86] Y. Shi, Y. Guo, H. Wu, J. Li, X. Li, Multi-relational EHR representation learning
with infusing information of Diagnosis and Medication, in: 2021 IEEE 45th
Annual Computers, Software, and Applications Conference (COMPSAC), IEEE,
2021, pp. 1617–1622, http://dx.doi.org/10.1109/COMPSAC51774.2021.00241,
URL https://ieeexplore.ieee.org/document/9529837/.
[87] Z. Sun, W. Dong, J. Shi, K. He, Z. Huang, Attention-based deep recurrent model
for survival prediction, ACM Trans. Comput. Healthc. 2 (4) (2021) 1–18, http:
//dx.doi.org/10.1145/3466782, URL https://dl.acm.org/doi/10.1145/3466782.
[88] Z. Wang, R. Wen, X. Chen, S. Cao, S.-L. Huang, B. Qian, Y. Zheng, Online
disease diagnosis with inductive heterogeneous graph convolutional networks,
in: Proceedings of the Web Conference 2021, ACM, 2021, pp. 3349–3358,
http://dx.doi.org/10.1145/3442381.3449795, URL https://dl.acm.org/doi/10.
1145/3442381.3449795.
[89] S.N. Golmaei, X. Luo, DeepNote-GNN: predicting hospital readmission using
clinical notes and patient network, in: Proceedings of the 12th ACM Conference
on Bioinformatics, Computational Biology, and Health Informatics, ACM, 2021,
pp. 1–9, http://dx.doi.org/10.1145/3459930.3469547, URL https://dl.acm.org/
doi/10.1145/3459930.3469547.
[90] R. Vinas, X. Zheng, J. Hayes, A graph-based imputation method for sparse
medical records, 2021, arXiv:2111.09084 [cs] URL http://arxiv.org/abs/2111.
09084.
[91] W. Yang, S. Zhang, B. Zhang, Medical assistant diagnosis method based
on graph neural network and attention mechanism, in: 2021 the 3rd
World Symposium on Software Engineering, ACM, 2021, pp. 194–198, http:
//dx.doi.org/10.1145/3488838.3488871, URL https://dl.acm.org/doi/10.1145/
3488838.3488871.
[92] H. Qiu, C. Zhang, Z. Fei, M. Qiu, S.-Y. Kung (Eds.), Readmission prediction
with knowledge graph attention and RNN-based ordinary differential equations,
in: Lecture Notes in Computer Science, vol. 12817, Springer International
Publishing, 2021, http://dx.doi.org/10.1007/978-3-030-82153-1,
URL https://link.springer.com/10.1007/978-3-030-82153-1.
[93] C. Lu, C.K. Reddy, P. Chakraborty, S. Kleinberg, Y. Ning, Collaborative graph
learning with auxiliary text for temporal event prediction in healthcare, in:
Proceedings of the Thirtieth International Joint Conference on Artificial Intel-
ligence, International Joint Conferences on Artificial Intelligence Organization,
2021, pp. 3529–3535, http://dx.doi.org/10.24963/ijcai.2021/486, URL https:
//www.ijcai.org/proceedings/2021/486.
[94] A. Pieroni, A. Cabroni, F. Fallucchi, N. Scarpato, Predictive modeling ap-
plied to structured clinical data extracted from electronic health records:
An architectural hypothesis and A first experiment, J. Comput. Sci. 17 (9)
(2021) 762–775, http://dx.doi.org/10.3844/jcssp.2021.762.775, URL https://
thescipub.com/abstract/10.3844/jcssp.2021.762.775.
[95] Y. Xu, H. Ying, S. Qian, F. Zhuang, X. Zhang, D. Wang, J. Wu, H. Xiong,
Time-aware context-gated graph attention network for clinical risk prediction,
IEEE Trans. Knowl. Data Eng. (2022) 1–12, http://dx.doi.org/10.1109/TKDE.
2022.3181780, URL https://ieeexplore.ieee.org/document/9794568/.
[96] Z. Sun, X. Yang, Z. Feng, T. Xu, X. Fan, J. Tian, EHR2HG: Modeling of
EHRs data based on hypergraphs for disease prediction, in: 2022 IEEE Interna-
tional Conference on Bioinformatics and Biomedicine (BIBM), IEEE, 2022, pp.
1730–1733, http://dx.doi.org/10.1109/BIBM55620.2022.9995204, URL https:
//ieeexplore.ieee.org/document/9995204/.
[97] Z. Qu, L. Cui, Y. Xu, Disease risk prediction via heterogeneous graph at-
tention networks, in: 2022 IEEE International Conference on Bioinformatics
and Biomedicine (BIBM), IEEE, 2022, pp. 3385–3390, http://dx.doi.org/10.
1109/BIBM55620.2022.9995491, URL https://ieeexplore.ieee.org/document/
9995491/.
[98] J. Jiang, T. Wang, B. Wang, L. Ma, Y. Guan, Gated tree-based graph atten-
tion network (GTGAT) for medical knowledge graph reasoning, Artif. Intell.
Med. 130 (2022) 102329, http://dx.doi.org/10.1016/j.artmed.2022.102329,
URL https://linkinghub.elsevier.com/retrieve/pii/S093336572200094X.
[99] H.N. Cho, I. Ahn, H. Gwon, H.J. Kang, Y. Kim, H. Seo, H. Choi, M.
Kim, J. Han, G. Kee, T.J. Jun, Y.-H. Kim, Heterogeneous graph construction
and HinSAGE learning from electronic medical records, Sci. Rep. 12 (1)
(2022) 21152, http://dx.doi.org/10.1038/s41598-022-25693-2,
URL https://www.nature.com/articles/s41598-022-25693-2.
[100] Y. Li, D. Yang, X. Gong, Patient similarity via medical attributed heterogeneous
graph convolutional network, Int. J. Comput. Sci. (2022) 1152–1161, URL
https://www.iaeng.org/IJCS/issues_v49/issue_4/IJCS_49_4_18.pdf.
[101] Y. An, R. Li, X. Chen, MERGE: A multi-graph attentive representa-
tion learning framework integrating group information from similar pa-
tients, Comput. Biol. Med. 151 (2022) 106245, http://dx.doi.org/10.1016/
j.compbiomed.2022.106245, URL https://linkinghub.elsevier.com/retrieve/pii/
S0010482522009532.
[102] C. Lu, T. Han, Y. Ning, Context-aware health event prediction via transition
functions on dynamic disease graphs, in: Proceedings of the AAAI Conference
on Artificial Intelligence, Vol. 36, no. 4, 2022, pp. 4567–4574,
http://dx.doi.org/10.1609/aaai.v36i4.20380,
URL https://ojs.aaai.org/index.php/AAAI/article/view/20380.
[103] T. Kanchinadam, S. Gauher, Predicting clinical events via graph neural net-
works, in: 2022 21st IEEE International Conference on Machine Learning
and Applications (ICMLA), IEEE, 2022, pp. 1296–1303, http://dx.doi.org/
10.1109/ICMLA55696.2022.00207, URL https://ieeexplore.ieee.org/document/
10069726/.
[104] Y. Zhang, B. Zhou, K. Song, X. Sui, G. Zhao, N. Jiang, X. Yuan, PM2F2N: Patient
multi-view multi-modal feature fusion networks for clinical outcome prediction,
ACL Anthol. (2022).
[105] J. Gao, C. Yang, J. Heintz, S. Barrows, E. Albers, M. Stapel, S. Warfield,
A. Cross, J. Sun, MedML: Fusing medical knowledge and machine learning
models for early pediatric COVID-19 hospitalization and severity prediction,
iScience 25 (9) (2022) 104970, http://dx.doi.org/10.1016/j.isci.2022.104970,
URL https://linkinghub.elsevier.com/retrieve/pii/S2589004222012421.
[106] Q. Zhao, J. Li, L. Zhao, Z. Zhu, Knowledge guided feature aggregation
for the prediction of chronic obstructive pulmonary disease with Chinese
EMRs, IEEE/ACM Trans. Comput. Biol. Bioinform. (2022) 1–10, http://dx.doi.
org/10.1109/TCBB.2022.3198798, URL https://ieeexplore.ieee.org/document/
9857572/.
[107] Y. Zou, A. Pesaranghader, Z. Song, A. Verma, D.L. Buckeridge, Y. Li, Modeling
electronic health record data using an end-to-end knowledge-graph-informed
topic model, Sci. Rep. 12 (1) (2022) 17868, http://dx.doi.org/10.1038/s41598-022-22956-w,
URL https://www.nature.com/articles/s41598-022-22956-w.
[108] K. Zhang, B. Hu, F. Zhou, Y. Song, X. Zhao, X. Huang, Graph-based structural
knowledge-aware network for diagnosis assistant, Math. Biosci. Eng. 19 (10)
(2022) 10533–10549, http://dx.doi.org/10.3934/mbe.2022492, URL http://
www.aimspress.com/article/doi/10.3934/mbe.2022492.
[109] D. Cai, C. Sun, M. Song, B. Zhang, S. Hong, H. Li, Hypergraph Contrastive
Learning for Electronic Health Records, Society for Industrial and Applied Math-
ematics, Philadelphia, PA, 2022, http://dx.doi.org/10.1137/1.9781611977172,
URL https://epubs.siam.org/doi/book/10.1137/1.9781611977172.
[110] X. Ma, Y. Wang, X. Chu, L. Ma, W. Tang, J. Zhao, Y. Yuan, G. Wang, Patient
health representation learning via correlational sparse prior of medical features,
IEEE Trans. Knowl. Data Eng. (2022) 1–14, http://dx.doi.org/10.1109/TKDE.2022.3230454.
[111] H.-R. Yao, N. Cao, K. Russell, D.-C. Chang, O. Frieder, J. Fineman, Self-
supervised representation learning on electronic health records with graph
kernel infomax, 2022, arXiv:2209.00655 [cs] URL http://arxiv.org/abs/2209.
00655.
[112] W. Li, H. Li, B. Yang, L. Zhou, X. Yang, M. Zhang, B. Wang,
Knowledge-aware representation learning for diagnosis prediction, Expert Syst.
40 (3) (2023) e13175, http://dx.doi.org/10.1111/exsy.13175, URL https://
onlinelibrary.wiley.com/doi/10.1111/exsy.13175.
[113] T.-C. Do, H.-J. Yang, G.-S. Lee, S.-H. Kim, B.-G. Kho, Rapid response system
based on graph attention network for predicting in-hospital clinical deteriora-
tion, IEEE Access 11 (2023) 29091–29100, http://dx.doi.org/10.1109/ACCESS.
2023.3257406, URL https://ieeexplore.ieee.org/document/10070599/.
[114] Y. Li, L. Feng, Patient multi-relational graph structure learning for diabetes
clinical assistant diagnosis, Math. Biosci. Eng. 20 (5) (2023) 8428–8445, http://
dx.doi.org/10.3934/mbe.2023369, URL http://www.aimspress.com/article/doi/
10.3934/mbe.2023369.
[115] S. Tang, A. Tariq, J.A. Dunnmon, U. Sharma, P. Elugunti, D.L. Rubin, B.N. Patel,
I. Banerjee, Predicting 30-day all-cause hospital readmission using multimodal
spatiotemporal graph neural networks, IEEE J. Biomed. Health Inform. (2023)
1–12, http://dx.doi.org/10.1109/JBHI.2023.3236888, URL https://ieeexplore.
ieee.org/document/10016722/.
[116] N. Zong, V. Ngo, D.J. Stone, A. Wen, Y. Zhao, Y. Yu, S. Liu, M. Huang, C.
Wang, G. Jiang, Leveraging genetic reports and electronic health records for
the prediction of primary cancers: Algorithm development and validation study,
JMIR Med. Inform. 9 (5) (2021) e23586, http://dx.doi.org/10.2196/23586, URL
https://medinform.jmir.org/2021/5/e23586.
[117] A. Johnson, T. Pollard, R. Mark, MIMIC-III clinical database, 2015, http://dx.
doi.org/10.13026/C2XW26, URL https://physionet.org/content/mimiciii/1.4/.
[118] T.J. Pollard, A.E.W. Johnson, J.D. Raffa, L.A. Celi, R.G. Mark, O. Badawi, The
eICU Collaborative Research Database, a freely available multi-center database
for critical care research, Sci. Data 5 (1) (2018) 180178, http://dx.doi.org/10.
1038/sdata.2018.178, URL https://www.nature.com/articles/sdata2018178.
[119] A.E.W. Johnson, L. Bulgarelli, L. Shen, A. Gayles, A. Shammout, S. Horng,
T.J. Pollard, S. Hao, B. Moody, B. Gow, L.-w.H. Lehman, L.A. Celi, R.G.
Mark, MIMIC-IV, a freely accessible electronic health record dataset, Sci. Data
10 (1) (2023) 1, http://dx.doi.org/10.1038/s41597-022-01899-x,
URL https://www.nature.com/articles/s41597-022-01899-x.
[120] A. Khan, E.B. Mobaraki, Interpretability methods for graph neural networks,
2023.
[121] L. van der Maaten, G. Hinton, Visualizing data using t-SNE, J. Mach. Learn. Res.
9 (86) (2008) 2579–2605, URL http://jmlr.org/papers/v9/vandermaaten08a.
html.
[122] X. Liu, H. Wang, T. He, Y. Liao, C. Jian, Recent advances in representation
learning for electronic health records: A systematic review, J. Phys. Conf.
Ser. 2188 (1) (2022) 012007, http://dx.doi.org/10.1088/1742-6596/2188/
1/012007, URL https://iopscience.iop.org/article/10.1088/1742-6596/2188/1/
012007.
[123] C. Yin, R. Zhao, B. Qian, X. Lv, P. Zhang, Domain knowledge guided deep
learning with electronic health records, in: 2019 IEEE International Conference
on Data Mining (ICDM), IEEE, 2019, pp. 738–747, http://dx.doi.org/10.1109/
ICDM.2019.00084, URL https://ieeexplore.ieee.org/document/8970777/.
[124] P. Ernst, A. Siu, G. Weikum, KnowLife: a versatile approach for constructing a
large knowledge graph for biomedical sciences, BMC Bioinform. 16 (1) (2015)
157, http://dx.doi.org/10.1186/s12859-015-0549-5.
[125] HCUP-US tools & software page, 2023, URL https://hcup-us.ahrq.gov/
toolssoftware/ccs/ccsfactsheet.jsp.
[126] CMeKG(Chinese medical knowledge graph) Dataset_Tianchi datasets, 2020, URL
https://tianchi.aliyun.com/dataset/81506.
[127] ICD - ICD-9-CM - international classification of diseases, ninth revision, clinical
modification, 2021, URL https://www.cdc.gov/nchs/icd/icd9cm.htm.
[128] S.M. Kazemi, R. Goel, K. Jain, I. Kobyzev, A. Sethi, P. Forsyth, P. Poupart,
Representation learning for dynamic graphs: A survey, 2019.
[129] Q. Yuan, J. Chen, C. Lu, H. Huang, The graph-based mutual attentive network
for automatic diagnosis, in: Proceedings of the Twenty-Ninth International
Joint Conference on Artificial Intelligence, International Joint Conferences on
Artificial Intelligence Organization, 2020, pp. 3393–3399, http://dx.doi.org/10.
24963/ijcai.2020/469, URL https://www.ijcai.org/proceedings/2020/469.
[130] J.E. Rudy, Y. Khan, J.K. Bower, S. Patel, R.E. Foraker, Cardiovascular
health trends in electronic health record data (2012–2015): A Cross-Sectional
Analysis of The Guideline Advantage, eGEMs 7 (1) (2019) 30, http://dx.
doi.org/10.5334/egems.268, URL https://www.ncbi.nlm.nih.gov/pmc/articles/
PMC6646939/.
[131] R. Guidotti, A. Monreale, S. Ruggieri, F. Turini, F. Giannotti, D. Pedreschi, A
survey of methods for explaining Black Box Models, ACM Comput. Surv. 51 (5)
(2019) 1–42, http://dx.doi.org/10.1145/3236009, URL https://dl.acm.org/doi/
10.1145/3236009.
[132] C. Agarwal, O. Queen, H. Lakkaraju, M. Zitnik, Evaluating explainability
for graph neural networks, Sci. Data 10 (1) (2023) 144, http://dx.doi.org/10.1038/s41597-023-01974-x,
URL https://www.nature.com/articles/s41597-023-01974-x.
[133] R. Ying, D. Bourgeois, J. You, M. Zitnik, J. Leskovec, GNNExplainer: Generating
explanations for graph neural networks, 2019, arXiv:1903.03894 [cs, stat] URL
http://arxiv.org/abs/1903.03894.
[134] H. Xuanyuan, P. Barbiero, D. Georgiev, L.C. Magister, P. Liò, Global concept-based
interpretability for graph neural networks via neuron analysis, 2023,
arXiv:2208.10609 [cs], URL http://arxiv.org/abs/2208.10609.
[135] H. Yuan, H. Yu, S. Gui, S. Ji, Explainability in graph neural networks: A
taxonomic survey, 2022, arXiv:2012.15445 [cs] URL http://arxiv.org/abs/2012.
15445.
[136] M. Jin, H.Y. Koh, Q. Wen, D. Zambon, C. Alippi, G.I. Webb, I. King, S. Pan, A
survey on graph neural networks for time series: Forecasting, classification,
imputation, and anomaly detection, 2023, http://dx.doi.org/10.48550/arXiv.
2307.03759, URL http://arxiv.org/abs/2307.03759 arXiv:2307.03759 [cs].