
Graph neural networks for clinical risk prediction based on electronic health records: A survey

February 2024
Journal of Biomedical Informatics 151(2):104616


DOI:10.1016/j.jbi.2024.104616

License
CC BY 4.0

Authors:

Heloísa Oss Boll

Universidade Federal do Rio Grande do Sul


Journal of Biomedical Informatics 151 (2024) 104616

Available online 27 February 2024


journal homepage: www.elsevier.com/locate/yjbin


Heloísa Oss Boll a,b,∗, Ali Amirahmadi b, Mirfarid Musavian Ghazani b,

Wagner Ourique de Morais b, Edison Pignaton de Freitas a, Amira Soliman b, Farzaneh Etminani b,

Stefan Byttner b, Mariana Recamonde-Mendoza a,c

a Institute of Informatics, Universidade Federal do Rio Grande do Sul, Avenida Bento Gonçalves, 9500, Porto Alegre, 91501-970, RS, Brazil

b School of Information Technology, Halmstad University, Kristian IV:s väg 3, Halmstad, 301 18, Sweden

c Bioinformatics Core, Hospital de Clínicas de Porto Alegre (HCPA), Av. Protásio Alves, 211, Bloco C, Porto Alegre, 90035-903, RS, Brazil

ARTICLE INFO

Keywords:

Graph neural networks

Electronic health records

Deep learning

Artificial intelligence

Graph representation learning


ABSTRACT

Objective: This study aims to comprehensively review the use of graph neural networks (GNNs) for clinical

risk prediction based on electronic health records (EHRs). The primary goal is to provide an overview of

the state-of-the-art of this subject, highlighting ongoing research efforts and identifying existing challenges in

developing effective GNNs for improved prediction of clinical risks.

Methods: A search was conducted in the Scopus, PubMed, ACM Digital Library, and Embase databases to

identify relevant English-language papers that used GNNs for clinical risk prediction based on EHR data. The

study includes original research papers published between January 2009 and May 2023.

Results: Following the initial screening process, 50 articles were included in the data collection. A significant

increase in publications from 2020 was observed, with most selected papers focusing on diagnosis prediction (n

= 36). The study revealed that the graph attention network (GAT) (n = 19) was the most prevalent architecture,

and MIMIC-III (n = 23) was the most common data resource.

Conclusion: GNNs are relevant tools for predicting clinical risk by accounting for the relational aspects among

medical events and entities and managing large volumes of EHR data. Future studies in this area may address

challenges such as EHR data heterogeneity, multimodality, and model interpretability, aiming to develop more

holistic GNN models that can produce more accurate predictions, be effectively implemented in clinical settings,

and ultimately improve patient care.

1. Introduction

Electronic health records (EHRs) are extensive, heterogeneous, and

longitudinal repositories that document patients’ health, including

symptoms, prescriptions, clinical notes, and medical images. With the

increase in EHR data collection, there is growing interest in leveraging

this information to improve patient care, especially in the context

of clinical risk prediction [1]. Recent machine learning approaches

focused on predicting events such as disease diagnoses, mortality, and

hospital readmissions have been relevant to this endeavor [2,3].

Abbreviations: EHR, Electronic health record; GNN, Graph neural network; GCN, Graph convolutional network; GAT, Graph attention network; GAE, Graph autoencoder; CNN, Convolutional neural network; RNN, Recurrent neural network; GRU, Gated recurrent unit; LSTM, Long short-term memory.

∗ Corresponding author at: Institute of Informatics, Universidade Federal do Rio Grande do Sul, Avenida Bento Gonçalves, 9500, Porto Alegre, 91501-970, RS, Brazil.

E-mail address: hoboll@inf.ufrgs.br (H. Oss Boll).

Despite the rich information present in EHRs, translating it into

actionable insights presents challenges due to data-related problems

such as heterogeneity (multiple types of medical attributes describing

a patient), high dimensionality (a large number of attributes associated

with a patient), quality (missing values and inconsistencies) and tem-

poral dynamics (numerous patient encounters and timestamped clinical

events) [1,4–7]. Considering that the success of machine learning mod-

els depends largely on an adequate representation of the input data,

studies in representation learning – the process of learning expressive

representations of the input data for improved performance of predic-

tors [8] – are paramount for effectively transforming patient data from

the raw EHR format into adequate representations that fully capture

their health status [1].

Received 22 September 2023; Received in revised form 21 February 2024; Accepted 23 February 2024

Fig. 1. Electronic health records contain a range of multimodal patient data. This information can be used for patient graph representations, focusing on a patient’s visit or medical record. In the case of a visit, a hierarchical and homogeneous graph is shown, while medical records are made up of sequences of visit graphs. Alternatively, the entire EHR data can be modeled as a heterogeneous graph, with different types of nodes and edges represented by different colors. In all examples, nodes and edges can have feature vectors processed using GNN for further clinical risk prediction tasks. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Recent deep learning techniques have effectively addressed these

challenges. Unlike traditional machine learning approaches, which rely

heavily on expert-driven feature engineering, deep learning models

can automatically extract meaningful latent feature representations

from complex raw data [9–11]. Among these, graph neural networks

(GNNs) stand out. The goal of graph representation learning is to

encode graphs into a low-dimensional vector space while preserving

topology and node properties [12]. In this sense, GNNs are particularly

adept at representing EHRs because they can capture the intricate

relationships and dependencies between medical entities to generate

rich, context-aware embeddings for further downstream tasks [13,14].

This is a promising feature in contrast to other machine learning and

deep learning algorithms, which often treat medical concepts as a

flat ‘‘bag of features’’, disregarding structural information and variable

interdependencies during model development [15,16]. Furthermore,

GNNs are well suited to the high sparsity and frequent missing

values found in EHR data: they can propagate information through

the graph structure to densify representations, and they can infer

missing features from the attributes of neighboring nodes in the medical

graph [17–19].

The strength of GNNs lies in their capability to navigate the intrica-

cies of non-Euclidean spaces [13,20]. Unlike grid-based data structures

such as images, which have inherent locality and consistent neigh-

boring relationships, graphs often lack a natural node ordering, and

the spatial proximity of nodes does not determine their relationships,

making it more challenging to apply key operations such as convolu-

tions [21,22]. For example, EHR graphs can represent a dense web of

patient histories, diagnoses, treatments, and other clinical outcomes,

with heterogeneity in node types, nodes with varied degrees, and

edges indicating co-occurrence, causality, hierarchical relations, and

other relevant interactions, resulting in complex topological structures

(Fig. 1). In this sense, GNNs offer the necessary flexibility to capture

and exploit these relationships, yielding thorough representations and,

consequently, improved interpretability and efficiency compared to

other deep learning models [23].

Early studies on GNNs for clinical risk prediction aimed to take

advantage of hierarchical medical information, structured as ontologies

and knowledge graphs, as distant supervision [24]. They introduced

label information through structured knowledge graph propagation,

learning correlations between medical codes and paralleling them with

codes observed in patients to obtain better predictions [24]. This

approach enabled more accurate predictions than other deep learning

baseline models [25]. Subsequent approaches started to prioritize the

learning of novel graph representations based on EHR rather than the

integration of knowledge graphs. Some of these representations include

patient similarity, patient-medication interactions, and temporal relations

between medical events [26–28]. Today, given the plethora of

existing EHR multimodal information, heterogeneous graphs have also

been used to incorporate clinical notes, disease codes, medical images, and

lab results into the learned embeddings, enriching the representations

of the data used for critical health predictions [29,30].

The manifold use of GNNs in EHRs represents a transformative

paradigm in the landscape of clinical task predictions. GNNs, as pow-

erful tools for modeling complex relationships within graph-structured

data, have demonstrated remarkable efficacy in capturing intricate

dependencies inherent in healthcare systems and will likely support

future disruptive advances in this domain. Notably, recent studies have

discussed the applications of deep learning in electronic health records.

However, none have explicitly focused on using GNNs for clinical

risk prediction based on EHR. For example, [7] have concentrated on

temporal patient representation, while [1,11] have focused on general

deep learning techniques for EHR. The most recent studies concerning

graph representation have limited their scope to diagnosis prediction

only [31] or did not focus on GNNs [32].

Thus, while preceding review papers have explored the broader

landscape of deep learning applications in EHRs, a dedicated review

addressing the intricacies of utilizing GNNs for clinical risk prediction

based on EHRs remains an unexplored niche in the literature. The

narrative review presented in this paper aims to bridge this gap by of-

fering a targeted exploration of the advancements, open challenges, and

potential future research directions in this specific application domain.

Inspired by systematic protocols, the intention is to summarize the

current scope and depth of the available literature, while also setting

the stage for future systematic reviews as the field grows, providing a

fundamental overview essential for advancing research in the area (see

Table 1).

Table 1
Significance of the presented survey.

Problem: The abundance of heterogeneous EHR data and the relations among medical events and entities are two key factors that pose challenges for developing clinical risk prediction models.

What is already known: Leveraging the relational nature of medical data as graphs with deep learning approaches can improve clinical risk prediction.

What this paper adds: A narrative review of the state-of-the-art GNN techniques for EHR-based clinical risk prediction. The study highlights the main challenges related to EHR graph data representation, temporality, and prediction tasks and describes various GNN architectures used in this endeavor.

2. Background

2.1. EHR representation learning

The primary aim of representation learning for EHRs is to transform

input data into a suitable representation that can enhance downstream

clinical prediction tasks [15]. Early studies on EHR representation

learning, exemplified by Doctor AI [33], RETAIN [34], and Dipole [35],

utilized recurrent neural network (RNN)-based models to account for

the sequential nature of EHR data in clinical risk prediction. Convolu-

tional neural networks (CNNs), word embedding methods, and stacked

denoising autoencoders were also utilized [36–42]. These method-

ologies aimed to model dependencies between medical codes of the

International Classification of Diseases (ICD), an international standard

for diagnosing and classifying health conditions [43].

Subsequent studies recognized the need for integrating structured

information into models, emphasizing the increasing relevance of graph

representation learning approaches. In the Heterogeneous Convolu-

tional Neural Network (HCNN), a CNN was adapted to capture the tem-

poral relationships between medical events in EHR data, which were

represented as attributed graphs [44]. Attention-based models such as

GRAM [45], KAME [46], MMORE [47], and CAMP [48] also incorpo-

rated hierarchical diagnosis information contained in the ICD ontology

for clinical risk prediction. The MiME model employed multilevel

relationships to learn representations of medical concepts [49]. Other

similar studies that have aimed to model EHR as graphs include [50–

53].

2.2. Graph neural networks

Graph neural networks (GNNs) are deep learning models designed

for processing and analyzing data organized as graphs [54]. A graph

can represent any relationships (edges) between a collection of entities

(nodes) [55]. Graphs are a fundamental way of representing data from

the natural world, and many of the patterns observed in nature can be

expressed and understood using graph structures [56]. In this way, GNNs are a powerful resource for representation

learning, as they can distill structural information and learn powerful

high-level representations [57]. GNNs have demonstrated applicability

across many network-related areas, including traffic forecasting, social

network analysis, and improved recommendation systems [56]. In

healthcare and biology, they have been used to discover new drugs,

predict protein–protein interactions, forecast disease outbreaks, and

assess patient similarity [15,58–63].

A graph is conventionally defined as $G = (V, E)$, where $V$ is the

set of nodes, and $E$ is the set of edges that connect pairs of nodes

in $V$ [56]. In EHRs, depending on the approach, nodes can represent

various healthcare entities such as patients, diagnoses, and medications

(Fig. 1). Each node $u \in V$ has an associated feature vector $x_u \in \mathbb{R}^{k}$.

These features are organized into a node feature matrix $X \in \mathbb{R}^{|V| \times k}$,

where each row denotes a node’s feature vector, making them suitable

for machine learning models [56]. For instance, a feature vector for

a patient node can include encoded attributes such as age, gender,

medical history information, and vital signs. Furthermore, an edge

set $E$ is typically represented by an adjacency matrix $A \in \mathbb{R}^{|V| \times |V|}$,

indicating whether node $u$ is related to another node $v$ through binary

indicators ($a_{uv} = 1$ or $0$). The presence of an edge between a patient

node and a medication node could indicate, for example, whether a

patient received specific medication during a visit. Nonetheless, this

is a simplified representation where edges are not attributed, meaning

they do not encapsulate additional information, such as the strength

or type of relationship. Even in such cases, the core properties of the

GNNs remain the same.
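As a minimal sketch of these definitions, the snippet below builds the adjacency matrix and the node feature matrix for a toy patient-medication graph. The node names, the edges, and the three feature values per node are hypothetical placeholders chosen only for illustration; they do not come from any reviewed study.

```python
import numpy as np

# Hypothetical toy graph: 2 patients and 2 medications (illustrative names only).
nodes = ["patient_0", "patient_1", "med_aspirin", "med_insulin"]
index = {name: i for i, name in enumerate(nodes)}

# Undirected, unattributed edges: the patient received the medication during a visit.
edges = [("patient_0", "med_aspirin"),
         ("patient_0", "med_insulin"),
         ("patient_1", "med_aspirin")]

n = len(nodes)
A = np.zeros((n, n), dtype=int)
for u, v in edges:
    A[index[u], index[v]] = 1
    A[index[v], index[u]] = 1  # symmetric because the edge is undirected

# Node feature matrix X with k = 3 made-up features per node
# (for example scaled age, a sex flag, and a node-type flag).
X = np.array([[0.63, 1.0, 0.0],
              [0.41, 0.0, 0.0],
              [0.00, 0.0, 1.0],
              [0.00, 0.0, 1.0]])

degree = A.sum(axis=1)  # number of neighbors of each node
```

Here row `i` of `X` is the feature vector of node `i`, and `A[i, j] = 1` marks an edge between nodes `i` and `j`. Attributed edges would require an additional edge-feature structure, which this sketch leaves out.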

GNNs operate by aggregating information from a node’s immediate

neighborhood, which is then combined with the node’s features to

create a richer latent representation (Fig. 2) [64,65]. This is achieved

with matrix multiplication of the adjacency ($A$) and feature ($X$) matrices

[66]. After $k$ aggregation steps, the structural information within a

node’s k-hop neighborhood is captured, and a final graph representa-

tion can be obtained by pooling the transformed feature vectors of all

nodes [67]. This propagation mechanism, often referred to as message

passing, enables GNNs to learn expressive representations and solve

tasks such as node classification (predicting the label of a node), link

prediction (predicting whether there is an edge between two nodes),

and graph classification (predicting a label for the entire graph) [68].
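The aggregation described above can be sketched with plain matrix products. The example assumes mean aggregation with self-loops and mean pooling, which are only one of several possible choices; the function name `propagate` and the toy path graph are illustrative.

```python
import numpy as np

def propagate(A, X, steps):
    """Mean-aggregate features over each node's neighborhood (self included), `steps` times."""
    A_hat = A + np.eye(A.shape[0])                 # self-loops keep a node's own features
    P = A_hat / A_hat.sum(axis=1, keepdims=True)   # row-normalize so each row is an average
    H = X.astype(float)
    for _ in range(steps):
        H = P @ H                                  # one round of message passing
    return H

# Path graph 0 - 1 - 2 - 3; only node 0 carries a non-zero feature.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
X = np.array([[1.0], [0.0], [0.0], [0.0]])

H1 = propagate(A, X, steps=1)       # node 3 has not yet received anything from node 0
H3 = propagate(A, X, steps=3)       # node 0 lies within the 3-hop neighborhood of node 3
graph_embedding = H3.mean(axis=0)   # mean pooling over nodes gives a graph-level vector
```

After one step the signal from node 0 has reached only node 1; after three steps it has reached node 3, which illustrates how the number of aggregation steps sets the size of the neighborhood a node can see.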

GNNs must follow the principles of permutation invariance and

equivariance to handle graph structures. This ensures that their out-

put remains consistent, regardless of how nodes are presented [56].

Furthermore, GNNs must handle varying numbers of node neighbors.

For instance, one patient node might be linked to several medication

nodes during a visit, while another patient might be connected to only

a few, and the same model must address both cases. This is achieved by

employing local functions (𝑓) that compute node-level outputs (ℎ𝑢) by

considering not only the features of the target node (𝑢) but also those of

its specific number of neighboring nodes [56]. The choice of particular

local functions depends on the GNN architecture; these variations are

further discussed in the following sections.
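The two properties can be checked numerically on a small example. The sketch below assumes a deliberately simple local function (sum of neighbor features plus the node's own features) and sum pooling; relabeling the nodes permutes the node-level outputs in the same way (equivariance) and leaves the pooled output unchanged (invariance).

```python
import numpy as np

def layer(A, X):
    """Sum-aggregate neighbor features and add the node's own features."""
    return A @ X + X

rng = np.random.default_rng(0)
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)
X = rng.normal(size=(3, 2))

perm = np.array([2, 0, 1])   # an arbitrary relabeling of the nodes
P = np.eye(3)[perm]          # permutation matrix: row i selects old node perm[i]

H = layer(A, X)
H_perm = layer(P @ A @ P.T, P @ X)   # same graph, nodes presented in another order

equivariant = np.allclose(H_perm, P @ H)                    # node outputs permute alike
invariant = np.allclose(H_perm.sum(axis=0), H.sum(axis=0))  # pooled output is unchanged
```

Because the same function is applied to every node and the aggregation is a sum, the layer also handles nodes with different numbers of neighbors without any change.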

Note that this study focuses on GNNs utilized for EHR data analysis.

For a comprehensive review encompassing various types of GNNs,

interested readers are referred to [54–56,67–69].

2.2.1. Graph convolutional networks

Graph convolutional networks (GCNs), introduced by Kipf and

Welling [70], propagate information across the graph and aggregate

it to update node representations, extending convolutions from the

Euclidean domain to the graph domain — which is characterized by

data structured as nodes and edges [20]. In the spatial approach, each

GCN layer computes new node representations based on their current

features and those of their neighbors, similar to how convolutional

layers in CNNs aggregate local information in images — but in a

configuration where operation objects are non-fixed in size [54,71].

As nodes progress through the GCN layers, the learned representations

encapsulate a broader neighborhood, similar to the increasing receptive

fields in CNNs. The local function that provides the update rule for node

𝑣in a GCN involves a weighted sum of the features of the node and its

neighbors and is given by

$$h_v^{(l+1)} = \sigma\left( \sum_{u \in N(v) \cup \{v\}} c_{vu} \, W^{(l)} h_u^{(l)} \right)$$

where $h_v^{(l)}$ is the feature vector of node $v$ in layer $l$, $N(v) \cup \{v\}$ is the set of neighbors of $v$ and $v$ itself, $c_{vu}$ is a normalization constant, $W^{(l)}$ is a weight matrix, and $\sigma$ is a nonlinear activation function.
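A minimal NumPy sketch of this update rule follows. It assumes the symmetric normalization 1 / sqrt(d_v d_u) for the normalization constant, with degrees counted on the self-looped graph, and ReLU as the activation; the text above leaves both choices open. Features are stored as rows, so the weight matrix multiplies from the right.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN update in matrix form: ReLU(C H W), with C the normalized self-looped adjacency."""
    A_hat = A + np.eye(A.shape[0])        # N(v) plus v itself
    d = A_hat.sum(axis=1)                 # degrees on the self-looped graph
    C = A_hat / np.sqrt(np.outer(d, d))   # assumed c_vu = 1 / sqrt(d_v * d_u) on existing edges
    return np.maximum(C @ H @ W, 0.0)     # assumed sigma = ReLU

# Two connected nodes with one scalar feature each.
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
H = np.array([[2.0],
              [4.0]])
W = np.array([[1.0]])

out = gcn_layer(A, H, W)   # each node averages itself and its neighbor
```

With two mutually connected nodes, every entry of the normalized matrix is 0.5, so both nodes end up with the average of the two input features.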

GCNs can also be developed using spectral approaches [72]. Spec-

tral models treat graphs as signals processed through graph convolution

in the spectral domain. In particular, graph signals are first transformed

into the spectral domain using a Graph Fourier Transform (GFT), which

leverages the eigenvectors of the graph Laplacian as its basis functions.

Specifically, given a graph with the Laplacian matrix 𝐿, its eigenvalue

decomposition leads to eigenvectors that serve as harmonic modes for

the GFT. Once in the spectral domain, the graph signals are filtered

and subsequently transformed back to the spatial domain with the

inverse GFT [54].

Fig. 2. Simple representation of how a GNN layer operates for clinical risk prediction. The double-edge arrows represent the message-passing process between neighboring nodes. This interaction allows nodes to aggregate information and generate context-aware embeddings. The model’s design, including its outputs and loss functions, is adapted based on the specific requirements of the tasks.

Despite their different starting points, both spectral and spatial

models iteratively collect neighborhood information (organized as low-

dimensional vectors) to capture high-order correlations among the

analyzed graphs [57].
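The spectral pipeline described in this section (transform, filter, transform back) can be sketched on a toy graph. The example assumes the combinatorial Laplacian L = D - A and a hypothetical low-pass response 1 / (1 + lambda); in a spectral GCN the filter would be learned rather than fixed.

```python
import numpy as np

# Path graph 0 - 1 - 2 - 3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
D = np.diag(A.sum(axis=1))
L = D - A                               # combinatorial graph Laplacian (assumed variant)

eigvals, U = np.linalg.eigh(L)          # columns of U form the graph Fourier basis

x = np.array([1.0, -1.0, 1.0, -1.0])    # a rapidly varying signal on the nodes
x_hat = U.T @ x                         # graph Fourier transform
g = 1.0 / (1.0 + eigvals)               # hypothetical low-pass filter response
y = U @ (g * x_hat)                     # filter, then transform back to the node domain

roughness_before = x @ L @ x            # sum of squared differences across edges
roughness_after = y @ L @ y
```

For this particular filter, the result equals solving `(I + L) y = x`, which gives a way to check the spectral computation against a purely spatial one.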

2.2.2. Graph attention networks

Graph attention networks (GATs), proposed by Veličković et al. in

2018, extend GCNs by introducing an attention mechanism to weigh

the importance of neighboring nodes [59]. This mechanism is inspired

by the attention of the Transformer model, which allows the network

to simultaneously focus on different parts of the input [73].

The local function that provides the update rule for GATs involves

two main steps: calculating attention scores for each pair of nodes in

the graph, which indicate the relevance of neighbor nodes’ features

to a given central node; and a weighted aggregation of the features

of the central node and its neighbors, where the previously calculated

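attention scores determine the weights [59]. The rule is then defined as

$$h_i^{(l+1)} = \sigma\left( \sum_{j \in N(i)} \alpha_{ij} \, W h_j^{(l)} \right)$$

where $\alpha_{ij}$ is the attention coefficient computed as

$$\alpha_{ij} = \frac{\exp\left(\mathrm{LeakyReLU}\left(a^{T} [W h_i \,\Vert\, W h_j]\right)\right)}{\sum_{k \in N(i)} \exp\left(\mathrm{LeakyReLU}\left(a^{T} [W h_i \,\Vert\, W h_k]\right)\right)}$$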

GATs offer particular value in interpretability, as learned attention

weights enable a deeper understanding of the relevance of specific

nodes for a given end task [74].
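The attention coefficients and the weighted aggregation can be sketched for a single attention head as follows. The sketch assumes ReLU as the activation, a LeakyReLU slope of 0.2, random untrained parameters, and an adjacency matrix that already contains self-loops so that every node has at least one neighbor; the function names are illustrative.

```python
import numpy as np

def leaky_relu(z, slope=0.2):
    return np.where(z > 0, z, slope * z)

def gat_layer(A, H, W, a):
    """Single-head GAT update; returns the new features and the attention matrix."""
    Z = H @ W                                     # row i holds W h_i (row-vector convention)
    f = Z.shape[1]
    src = Z @ a[:f]                               # part of a^T [W h_i || W h_j] that depends on i
    dst = Z @ a[f:]                               # part that depends on j
    e = leaky_relu(src[:, None] + dst[None, :])   # raw score e[i, j] for every pair
    e = np.where(A > 0, e, -np.inf)               # keep only j in N(i)
    e = e - e.max(axis=1, keepdims=True)          # stabilize the softmax
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return np.maximum(alpha @ Z, 0.0), alpha      # assumed sigma = ReLU

rng = np.random.default_rng(0)
A = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]])                         # self-loops included (assumption)
H = rng.normal(size=(3, 4))
W = rng.normal(size=(4, 2))
a = rng.normal(size=4)                            # attention vector of length 2 * 2

H_new, alpha = gat_layer(A, H, W, a)
```

Each row of `alpha` sums to one and is zero outside the node's neighborhood, so a row can be read as the relative weight a node assigns to each of its neighbors.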

2.2.3. Graph autoencoders

Graph autoencoders (GAEs) are unsupervised learning models that

aim to learn compact representations of graph nodes, acting as a type

of dimensionality reduction technique [75]. The local function in GAEs

is employed in encoding and decoding graph information. During en-

coding, node features are transformed into a lower-dimensional latent

representation, capturing information both about its neighboring nodes

and the general graph structure. During decoding, the latent represen-

tations are used to reconstruct the original features while minimizing

the reconstruction error [76].
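A minimal sketch of the encode, decode, and reconstruction-error loop is given below. It assumes one common formulation in which a single linear graph-convolution step acts as the encoder and an inner-product decoder reconstructs the adjacency matrix (with self-loops treated as edges), rather than the node features. The toy graph, latent size, learning rate, and number of steps are arbitrary choices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(target, pred, eps=1e-9):
    """Mean binary cross-entropy, used here as the reconstruction error."""
    return -np.mean(target * np.log(pred + eps) + (1 - target) * np.log(1 - pred + eps))

# Path graph 0 - 1 - 2 - 3 with one-hot node features (toy data).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.eye(4)

A_target = A + np.eye(4)                      # assumption: self-loops count as edges
d = A_target.sum(axis=1)
A_norm = A_target / np.sqrt(np.outer(d, d))   # symmetric normalization
M = A_norm @ X                                # neighborhood-aggregated features

rng = np.random.default_rng(0)
W = 0.5 * rng.standard_normal((4, 2))         # encoder weights, latent size 2

loss_before = bce(A_target, sigmoid((M @ W) @ (M @ W).T))
for _ in range(300):
    Z = M @ W                                 # encoder: latent node embeddings
    A_rec = sigmoid(Z @ Z.T)                  # decoder: edge probabilities
    grad_Z = 2.0 * (A_rec - A_target) @ Z / A.size   # gradient of the loss w.r.t. Z
    W -= 0.2 * (M.T @ grad_Z)                 # plain gradient descent on W
loss_after = bce(A_target, sigmoid((M @ W) @ (M @ W).T))
```

The rows of `Z` are the compact node representations; in practice the encoder would be deeper and trained with an automatic differentiation library instead of a hand-written gradient.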

In 2016, Kipf and Welling extended the GAE framework by in-

troducing a variational inference approach called variational graph

autoencoder (VGAE) [77]. In VGAE, the encoder not only generates a

low-dimensional vector representation for each node but also models

the underlying probability distribution of these representations, usually

with a Gaussian distribution. The decoder, then, aims to minimize the

difference between the distributions of the reconstructed data and the

actual data, allowing the model to account for uncertainty and leading

to more robust graph reconstructions.
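The probabilistic part of a VGAE can be sketched separately from the graph encoder. The snippet assumes a diagonal Gaussian per node, a standard normal prior, and the usual reparameterization trick; `mu` and `log_var` are placeholder arrays standing in for the outputs of a graph encoder.

```python
import numpy as np

def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps with eps drawn from N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(sigma^2)) to N(0, I), averaged over nodes."""
    kl_per_node = -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=1)
    return kl_per_node.mean()

rng = np.random.default_rng(0)
mu = np.ones((3, 2))          # placeholder encoder means for 3 nodes, latent size 2
log_var = np.zeros((3, 2))    # placeholder log-variances (unit variance)

z = sample_latent(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)                            # regularization term
kl_at_prior = kl_to_standard_normal(np.zeros((3, 2)), log_var)     # zero when q equals the prior
```

During training, this KL term is typically added to the reconstruction error, so that the learned node distributions stay close to the prior while still allowing the graph to be reconstructed.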

3. Methods

3.1. Search strategy and information sources

A comprehensive, narrative review of the use of GNNs in EHR-based

clinical risk prediction was conducted. The PRISMA protocol (Preferred

Reporting Items for Systematic Reviews and Meta-Analyses) [78] was

used as inspiration for the adopted review process, as detailed in Fig. 3.

However, a more adaptable strategy was chosen to provide

a comprehensive summary and perspective on GNNs, which is why

the review is described as "narrative" rather than "systematic". This

decision reflects the intention to offer a broad, interpretive overview of

the literature, emphasizing context and insight rather than a narrower

quantitative analysis, which we believe is particularly valuable in the

developing field of GNNs for clinical risk prediction. Four databases

were used: Scopus, PubMed, ACM Digital Library, and Embase. The

search term used was as follows:

("Graph" OR "Graph neural network" OR "Graph neural" OR "GNN" OR "Graph convolutional" OR "GCN" OR "Graph autoencoder" OR "GAE" OR "Graph attention" OR "Graph attention network" OR "GAT" OR "Graph self-attention" OR "GSA" OR "Graph transformer" OR "Graph transformer network" OR "GTN" OR "Graph transformer" OR "Graph-based" OR "Graph embedding") AND ("Electronic health record" OR "EHR" OR "Electronic medical record" OR "EMR" OR "electronic health data") AND ("Deep learning" OR "Neural network" OR "Representation learning" OR "Artificial intelligence")

An initial assessment of the literature using these terms revealed

a large number of potentially relevant articles. Therefore, to maintain

specificity, we chose not to include the more general terms ‘‘machine

learning’’ and ‘‘clinical’’, which could dilute the focus of the analysis.

The gray literature, specifically the arXiv repository, was also searched

using similar terms. The search was limited to articles published be-

tween January 1, 2009, and May 14, 2023, as it was anticipated that

there would be no relevant articles published before 2009 since GNNs

started to become more popular after Scarselli et al. (2009) [65].

Fig. 3. Flowchart for literature search and selection of articles for the review.

3.2. Eligibility criteria

This review aimed to include all related articles published in English

that used GNNs for clinical risk prediction based on EHRs within the

search period. The exclusion criteria employed during the screening

and full-text review stages encompassed the following attributes:

utilization of unstructured EHR data as the main focus; absence of clinical

risk prediction focus; non-research or conference content; and paper not

in the English language. Review articles, duplicate records, and studies

that did not include EHRs or GNNs were also excluded. Three authors

screened the abstracts, while the first author reviewed the full texts and

extracted the data. Data extraction results were revised by the other

authors. The search strategy and selection process are summarized in

Fig. 3.

3.3. Data extraction, synthesis, and analysis

First, the selected papers were categorized based on the specific

clinical risk prediction task(s) they intended to solve. Then, details were

gathered about the task prediction levels (patient or visit), graph rep-

resentation, and node definitions, including whether they incorporated

temporal information and the type of resource employed to help in

the learning process, if any. Concerning the model’s implementation,

data were extracted regarding the EHR datasets used, patient counts,

evaluation metrics implemented in each study, available repository

links, and techniques utilized for interpretability, if any. Finally, all

the collected information was consolidated for subsequent qualitative

and quantitative analyses, centered on three axes: data representation,

incorporation of temporality, and prediction tasks.

3.4. Summary of study selection

After deduplication of the initial 449 articles, 351 papers were

obtained. Out of these, 50 articles were selected for full-text review

and data collection following the steps detailed in Fig. 3. Among the

69 preprints retrieved from arXiv, only one study was eligible for the

review and included in the selected articles. Table 2 summarizes the

statistics of the 50 papers included in this review. Detailed information

for all papers is provided in Supplementary Table, and an overview of

the results is provided in the next sections.

4. Results

The Results section is structured as follows: first, a brief summary

of the main results is presented. Next, a more detailed description of

the main graph data representation approaches is provided, including

the incorporation of temporality into clinical predictions, and the most

common clinical risk prediction tasks. Finally, a discussion addresses

the unique challenges of modeling EHR data with GNNs, presenting

perspectives in the field, and describing study limitations.

4.1. Overview of study characteristics

Using GNNs for clinical risk prediction is a rising trend, as shown in

Fig. 4. There were numerous approaches to graph representation in the

evaluated articles, which often overlap. One of the main ones involves

using a hierarchical graph based on codes from medical ontologies,

such as ICD (diseases, procedures) and ATC (medications), which de-

scribe well-known parent–child relations (n = 15, Table 3). In this

case, the hierarchical medical code graph is considered homogeneous.

Another technique involves learning heterogeneous EHR graphs, which

include node types other than medical codes, such as patients, lab

values, and doctors, and in which edge relations may vary (n = 9, Table 3).

These two approaches are often combined to create a heterogeneous

graph where medical ontology knowledge is also an aspect integrated

into the learned embeddings (n =5, Table 3). Other techniques in-

clude similarity networks, where nodes are linked based on feature

similarities (n =6); bipartite graphs, with two types of nodes (n =

5); hypergraphs, where edges can connect any number of nodes (n =

2); dynamic graphs, where features change over time; and multi-view

graphs, which include a combination of different types of networks of

medical entities interactions (n =3). For a summary, please refer to

Table 3; more information can be found in the Supplementary Material.
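As an illustration of one of these representations, a similarity network can be built by linking each node to its most similar peers. The sketch assumes cosine similarity and a k-nearest-neighbor rule, which is only one possible construction; the feature values are made up, and rows of the feature matrix are assumed to be non-zero.

```python
import numpy as np

def knn_similarity_graph(X, k):
    """Connect each node to its k most cosine-similar nodes, then symmetrize."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)   # unit-length rows
    S = Xn @ Xn.T                                       # pairwise cosine similarity
    np.fill_diagonal(S, -np.inf)                        # exclude self-matches
    nearest = np.argsort(-S, axis=1)[:, :k]             # k most similar nodes per row
    A = np.zeros(S.shape, dtype=int)
    rows = np.repeat(np.arange(X.shape[0]), k)
    A[rows, nearest.ravel()] = 1
    return np.maximum(A, A.T)                           # undirected edges

# Four hypothetical patients described by two made-up features.
X = np.array([[1.0, 0.0],
              [0.9, 0.1],
              [0.0, 1.0],
              [0.1, 0.9]])

A_sim = knn_similarity_graph(X, k=1)
```

With `k = 1`, patients 0 and 1 are linked to each other, as are patients 2 and 3, giving two disconnected pairs of similar patients.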

Out of all the clinical risk prediction tasks, diagnosis prediction has

received the most attention (Fig. 5, n =36), followed by mortality and

readmission. The diagnosis task was also addressed with the most diverse set

of GNN architectures (Fig. 6). An overview of GNNs used for these three main

tasks is provided in Section 4.4.

When considering model architectures, it was possible to observe

that graph attention networks (GAT) were the most employed (n =

19), followed by graph convolutional networks (GCN) (n =18) and

GraphSAGE (n =4) (Fig. 7). Different architectures were also employed

to handle unique medical data requirements, such as GRU, LSTM, and

Journal of Biomedical Informatics 151 (2024) 104616

H. Oss Boll et al.

Table 2

A summary of the selected articles.

Year Ref. Clinical risk prediction task Data sourceaInterp.bMetrics Reprod.

Diag. Mort. Read. Otr. ACC AUPRC AUROC F1

2020 [16] x x x EI, SY AW 0.77 0.53; 0.59 x

2020 [25] x M3 0.65 x

2020 [28] x M3, PV 0.94

2020 [79] x M3, UT 0.48 0.72

2020 [80] x M3, PV 0.92

2020 [81] x PC 0.91 0.09 0.81 0.15 x

2020 [82] x M3 0.57

2020 [83] x M3 0.34 0.9

2020 [29] x M3 TS x

2020 [84] x M3 TS 0.63 0.67 0.68

2021 [18] x x x M3, EI, NY SVA 0.71; 0.39 x

2021 [26] x x EI AW 0.43 0.85 x

2021 [85] x x IQ, EI PCA 0.82; 0.2 0.86; 0.67 x

2021 [30] x PV AW 0.74 0.66 x

2021 [17] x PV 0.45 x

2021 [86] x M3 TS, H 0.60

2021 [87] x M3, WH, SP, MB AW 0.82 0.93 x

2021 [88] x PV

2021 [89] x M3 0.84; 0.67 0.85; 0.67 x

2021 [90] x IB U, C 0.79 x

2021 [91] x PV 0.53; 0.62

2021 [92] x M3 0.37 0.78 0.42

2021 [93] x M3 0.85 0.72; 0.48 x

2021 [94] x x SY 0.9

2022 [95] x M4, EI AW 0.96; 0.15 0.60; 0.49 0.89; 0.8 x

2022 [96] x M3 TS 0.21

2022 [97] x PV AW 0.44

2022 [98] x EM 0.75

2022 [99] x CN 0.72 0.15

2022 [100] x M3 TS 0.87 0.84

2022 [101] x x M3, XM 0.53; 0.85 0.87; 0.90 0.5; 0.84

2022 [102] x M3, M4 0.9 0.73; 0.26 x

2022 [103] x PV TS, GE 0.91

2022 [104] x x M3 TS 0.62; 0.65 0.90; 0.75 x

2022 [105] x NC CIC 0.15 0.75 x

2022 [106] x CD

2022 [107] x PH C 0.15 x

2022 [108] x PV 0.81

2022 [109] x x M3, EI 0.62; 0.39 0.91; 0.72 0.53; 0.37

2022 [110] x x M3, CK, TJ, HM 0.76; 0.8; 0.54 0.83; 0.95; 0.86 x

2022 [111] x M4 0.68 0.53

2022 [112] x M3, M4 TS 0.93 x

2023 [19] x CF C 0.89 0.8

2023 [113] x CU, UV AW 0.88 0.95

2023 [27] x PV 0.83 0.27

2023 [114] x M4, PE 0.93 0.87; 0.89

2023 [115] x M4, PV GE 0.78 x

2023 [31] x M3, EI TS, AW 0.86 0.64; 0.74

2023 [116] x MC S 0.99 x

2023 [12] x MC TS

ᵃ EI: eICU, M3: MIMIC-III, M4: MIMIC-IV, PV: Private, UT: UTP, PC: Physionet Challenge, NY: NYU Langone Health, IQ: IQVIA US, WH: WHAS, SP: SUPPORT, MB: METABRIC, IB: IBM Explorys, SY: Synthetic, EM: EMRNet, CN: Cardionet, XM: Xiangya Medical, NC: N3C, CD: COPD, PH: PopHR, CK: CKD, CG: Cardiology, TJ: TJH, HM: HMH, CF: Cerner’s Health Facts, CU: CNUH, UV: UV, PE: P-EHRs, MC: Mayo Clinic.

ᵇ TS: t-SNE, A: Attention, SV: SVA, P: PCA, U: UMAP, GE: GNN Explainer, CIC: Clinically intuitive concepts, S: SHAP, EGE: Extended GNNExplainer, C: Clustering, AW: Attention weights.

Table 3

Types of EHR graph representation employed in the analyzed papers.

Graph representation Papers

Hierarchical [25,27,28,30,31,80,91–93,96,103,107,108,112]

Heterogeneous [29,88,95,97,100,110,111,114,116]

Hierarchical and heterogeneous [16,19,82,105,106]

Similarity [26,89,94,101,104,115]

Bipartite [17,85,90,93,99]

Dynamic [79,83,102]

Hypergraph [96,109]

Other [12,18,81,84,86,87,98,113]

BERT (Supplementary Table). In these cases, GNNs were responsible for

the graph representation learning step, while the others were used for

processing temporal aspects and multimodal data.

Regarding incorporating temporal patient information, most eval-

uated approaches consider some sort of clinical event sequentiality,

such as the order of patient visits (Supplementary Table, n =32).

However, most of these works disregard irregular time intervals. More

information about the temporal aspects of reviewed models is described

in Section 5.5.

GNNs were evaluated against and consistently outperformed other

machine learning and deep learning techniques in all evaluated studies,

indicating that graph representations are valuable for clinical risk

prediction. A summary of metrics is shown in Table 2, and a full report

can be found in the Supplementary Table. Most authors compared the

proposed GNN models with traditional and baseline approaches using


Fig. 4. Yearly distribution of selected articles. The cutoff date for including articles was May 14, 2023.

Fig. 5. Distribution of articles addressing different clinical risk prediction tasks.

Table 4

A summary of EHR data resources used in the evaluated articles.

Data resources Papers

M3 (MIMIC-III) [18,25,28,29,31,79,80,82–84,86,87,89,92,93,96,100–102,104,109,110,112]

M4 (MIMIC-IV) [95,102,111,112,114,115]

EI (eICU) [16,18,26,31,85,95,109]

SY (Synthetic) [16,94]

Other [12,17–19,27,28,30,79–81,85,87,88,90,91,97–99,101,103,105–108,110,113–116]

classic ML metrics such as Accuracy, Precision, Recall, and F1-score, as well as AUPRC (Area Under the Precision–Recall Curve) and AUROC (Area Under the Receiver Operating Characteristic Curve).
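To make these metrics concrete, the following minimal sketch computes precision, recall, F1-score, and AUROC from scratch. The labels and scores are invented toy values and the function names are hypothetical. AUROC is computed through its rank interpretation (the probability that a randomly chosen positive case is scored above a randomly chosen negative one), which is one of several equivalent ways to obtain it.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def auroc(y_true, scores):
    """AUROC as P(positive score > negative score); ties count as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (invented values, not taken from any reviewed study).
y_true = [1, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.7, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
area = auroc(y_true, scores)
```

Note that precision, recall, and F1 depend on the 0.5 cut-off chosen here, whereas AUROC is computed from the raw scores and is threshold-independent.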

The availability of resources and documentation highlights the as-

pect of reproducibility in the evaluated models. Of the 50 selected

articles, 22 included a link to a repository containing the project’s

source code, as shown in Table 2 (links in the Supplementary Mate-

rial). Most papers provided comprehensive details about their model

architecture and parameters, including the specifics of GNN layers,

frameworks, learning rates, optimizers, and data split percentages.

Furthermore, the use of benchmark datasets also helped with repro-

ducibility efforts, as it allows for validations against a known standard.

For example, the Medical Information Mart for Intensive Care III,

or MIMIC-III [117], was the most frequently used dataset (n =23,

Table 2). MIMIC-III is a freely accessible database, one of the most widely

employed EHR datasets, encompassing data from over 40,000 patients

admitted to the Beth Israel Deaconess Medical Center in the USA,

spanning the years 2001 to 2012. Other datasets include the eICU

dataset [118], which provides data of over 200,000 admissions to

various intensive care units (ICUs), MIMIC-IV [119], an updated and

more extensive version of MIMIC-III, as well as synthetic EHR datasets,

which are created based on the real structure of EHR datasets. Aside

from these, a number of studies make use of proprietary and private

datasets obtained directly from hospitals for their GNN analyses. A

summary of utilized datasets and the corresponding articles is presented

in Table 4.


Fig. 6. Heatmap of articles using different GNN architectures for solving different clinical risk prediction tasks. (For interpretation of the references to color in this figure legend,

the reader is referred to the web version of this article.).

Fig. 7. Distribution of articles using different GNN architectures.

In the evaluated articles, half employed interpretability techniques

(n =25, Fig. 8). These methods aim to identify critical nodes, edges,

subgraphs, and their features responsible for GNN outputs [120]. The

most prevalent approach involved analyzing the t-SNE plot of the

embedding space (n =10). The t-SNE method maps high-dimensional

data into a lower-dimensional space, preserving local and global infor-

mation [121]. The resultant clusters, ideally non-overlapping, represent

different subtypes learned by the model. Other dimensionality reduc-

tion techniques, such as UMAP and PCA, were also used for similar

purposes. The second most common approach was the analysis of

learned attention weights (n =8). Graph attention layers use these

weights to compute attention-guided embeddings for nodes, edges, sub-

graphs, or combinations [74]. Their magnitude reflects the importance

of a given node j’s features to node i, which naturally becomes an

interpretability resource.
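The way attention weights arise can be sketched in a few lines. The example below follows the general recipe of graph attention layers (score each neighbor, then normalize with a softmax) under simplifying assumptions: the feature vectors are taken as already projected, the scoring vector `a` is fixed by hand instead of learned, and all node names and values are invented for illustration.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def attention_weights(h_i, neighbors, a):
    """Softmax-normalized attention of node i over its neighbors.

    h_i       : projected feature vector of the target node i
    neighbors : dict mapping neighbor id j -> projected feature vector h_j
    a         : scoring vector of length 2 * len(h_i) (learned in a real model)
    """
    raw = {}
    for j, h_j in neighbors.items():
        concat = list(h_i) + list(h_j)
        raw[j] = leaky_relu(sum(w * x for w, x in zip(a, concat)))
    max_raw = max(raw.values())  # subtract the max for numerical stability
    exp = {j: math.exp(e - max_raw) for j, e in raw.items()}
    total = sum(exp.values())
    return {j: v / total for j, v in exp.items()}


# Toy values: node names, features, and `a` are illustrative assumptions.
h_patient = [1.0, 0.0]
neighbors = {"hypertension": [1.0, 0.5], "fracture": [-1.0, 0.2]}
a = [0.5, 0.5, 1.0, 1.0]
alpha = attention_weights(h_patient, neighbors, a)
```

The resulting weights sum to one over the neighborhood, so a larger value for one neighbor can be read as a larger relative contribution to the target node's updated embedding.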

4.2. Graph data representation

4.2.1. Ontology-based approaches

The primary aim of representation learning for EHR is to trans-

form the input data into an expressive representation that can en-

hance downstream clinical prediction tasks [15]. The main goal of

utilizing GNNs is to incorporate the existing relations among medical

events and entities from EHRs into the learned representations. This

process is often achieved through knowledge injection, which intro-

duces a priori medical knowledge to guide and enrich deep learning

architectures [122,123].

Medical knowledge graphs and ontologies contain rich hierarchical

information (such as medical events co-occurrence, and ‘‘is the cause

of’’ and ‘‘is caused by’’ relations), which can offer a comprehensive

and reliable understanding of how clinical concepts interact. Examples

include KnowLife [124], a knowledge graph that integrates unstruc-

tured biomedical data; the International Classification of Diseases (ICD)


Fig. 8. Distribution of articles using different interpretability techniques. (For interpretation of the references to color in this figure legend, the reader is referred to the web

version of this article.).

Table 5

A priori medical knowledge used in GNNs for clinical risk prediction.

Knowledge injection sources Papers

ICD [25,28,30,31,80,82,91–93,96,107,108]

CCS [28,80,92,103,112]

CMeKG [106,108]

KnowLife [25]

Other [17,29,30,105,107,108]

ontology [43], which is the global standard for reporting and recording

diseases, symptoms, signs, and other events in all realms of healthcare;

the Clinical Classifications Software (CCS), which clusters the numerous

individual ICD codes into a smaller number of clinically meaningful

categories [125], and CMeKG, the first Chinese medical knowledge

graph [126]. All of them and more have been used for domain knowl-

edge injection in GNNs for clinical risk prediction (n =26); a summary

of the main sources can be found in Table 5.

The hierarchical relations in medical ontologies can be naturally

represented as parent–child graphs, in which the non-leaf nodes in

the tree represent broader, more general classes of medical concepts,

and leaf nodes represent more specific instances of these classes [80].

In ICD-9, for example, parental codes 390–459, referring to ‘‘Diseases

of the Circulatory System’’, represent a broad category encompassing

various circulatory system conditions. They are connected to sub-

classes that detail the conditions, such as codes 401–405 ‘‘Hypertensive

Disease’’ [127].
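The parent–child structure can be sketched as a simple child-to-parent map. The two ranges below are the ones quoted in the text; the codes 401 and 401.9 are added from ICD-9 for illustration, and the `ROOT` node and the function name are hypothetical conveniences.

```python
# Child -> parent links for a tiny fragment of the ICD-9 hierarchy.
PARENT = {
    "401.9": "401",        # Unspecified essential hypertension
    "401": "401-405",      # Essential hypertension
    "401-405": "390-459",  # Hypertensive disease
    "390-459": "ROOT",     # Diseases of the circulatory system
}


def ancestors(code, parent=PARENT):
    """Return the chain of ancestors from the code's parent up to the root."""
    chain = []
    while code in parent:
        code = parent[code]
        chain.append(code)
    return chain


path = ancestors("401.9")
```

Here `path` lists progressively broader categories, which is the information an ontology-based model can attach to a specific leaf code observed in a patient record.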

A frequently observed method of inputting knowledge in GNNs

involves extracting medical codes from a patient’s EHR, especially

ICD codes related to diagnoses, and combining them with the graph

structure features of hierarchical medical ontology graphs (n =12,

Table 5) [1]. More specifically, the node embeddings in the ontology-

based GNN models are learned as a combination of embeddings of the

observed medical codes for a patient and the code’s ancestors on the

medical graph [11]. This method often depicts each patient’s visit or

medical history as a graph with medical codes as nodes [25,103]. Edges

indicate mainly the hierarchical relationships among the codes ob-

served in an ontology, and can also be combined with co-occurrence in-

formation, indicating comorbidity or a significant relationship between

the observed diseases [80].
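The idea of combining a code's embedding with those of its ancestors can be sketched as follows. This is a deliberately simplified illustration: it uses a uniform average, whereas a trained model may learn the combination weights (for example, through attention). The embeddings, the hierarchy fragment, and the function name are all assumptions made for the example.

```python
def combine_with_ancestors(code, embeddings, parent):
    """Average the embedding of `code` with the embeddings of all its ancestors."""
    chain = [code]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    dim = len(embeddings[code])
    return [sum(embeddings[c][d] for c in chain) / len(chain) for d in range(dim)]


# Invented two-dimensional embeddings for a leaf code and two ancestors.
embeddings = {
    "401.9": [1.0, 0.0],
    "401": [0.0, 1.0],
    "401-405": [1.0, 1.0],
}
parent = {"401.9": "401", "401": "401-405"}
final = combine_with_ancestors("401.9", embeddings, parent)
```

Because ancestors are shared among related codes, a rarely observed leaf code still inherits information from its better-represented parents.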

Some examples of this approach include the Graph Neural net-

works based Diagnosis Prediction (GNDP) model, which leverages

the ICD ontology to predict medical codes occurring at the next

visit [28]; and Sherbet, which utilizes hyperbolic embeddings to recon-

struct the disease hierarchical structure and predict temporal health

events [31]. MedPath [30] introduces the concept of extracting per-

sonalized knowledge subgraphs from ontology graphs for individual

patients, which is further explored in MedML [105]. In the Graph-

based Structural Knowledge-aware Network (GSKN) model, subgraphs

are also utilized, but with the intent to capture deep-level knowledge

graph structure information and dynamic representations of medi-

cal entities [108]. The HyperGraph-based disease prediction model

(EHR2HG) incorporates the ICD hierarchy and hypergraphs to con-

sider higher-order relations among diseases and patients [82]. The

Graph ATtention-Embedded Topic Model (GAT-ETM) simultaneously

learns node embeddings based on both the ICD (diseases) and ATC

(medications) ontology medical codes with a GNN [107].

Other sources of knowledge were also proposed to capture medical

event connections that might exist outside the scope of ontologies.

In [16], Choi et al. introduced a normalized conditional probability

matrix P, which restricts the model’s search space based on the co-

occurrence statistics of medical codes in an EHR dataset. In the Joint

Medical Ontology Representation Learning (JMRL) [25] and DUal-

GRAph Representation Learning (DUGRA) [82] models, the informa-

tion contained in both medical ontologies and co-occurrence statistics

of medical codes is explored. Variational regularization has also been

used to impose constraints on the learning process and perform struc-

ture learning without predefined guiding graphs [18]. Direct medical

expert validation has also been employed [105].
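Co-occurrence statistics of the kind mentioned above can be estimated directly from visit records. The sketch below estimates P(j | i) as the fraction of visits containing code i that also contain code j; this is one straightforward estimator and not necessarily the exact normalization used in [16]. The code names and visits are invented.

```python
from collections import Counter
from itertools import permutations


def conditional_probabilities(visits):
    """Estimate P(j | i) = (# visits with both i and j) / (# visits with i)."""
    single = Counter()
    pair = Counter()
    for visit in visits:
        codes = set(visit)
        single.update(codes)
        pair.update(permutations(codes, 2))  # ordered pairs (i, j), i != j
    return {(i, j): n / single[i] for (i, j), n in pair.items()}


# Toy visits with invented code names.
visits = [
    {"diabetes", "hba1c_test"},
    {"diabetes", "hba1c_test", "metformin"},
    {"hypertension", "creatinine_test"},
]
P = conditional_probabilities(visits)
```

Pairs that never co-occur are absent from `P`, which is how such a matrix can narrow the set of code-to-code connections a model needs to consider.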

4.2.2. Heterogeneous approaches

Recently, new approaches have been leveraging more of the rich

multimodal information contained in health records (n =9, Table 3).

Instead of focusing only on hierarchical medical information and struc-

tured knowledge graphs, medical events and entities can be represented

within a heterogeneous EHR graph. This results in a dense network

of interactions that more effectively capture the complexity of a pa-

tient’s health status, and can be applied to broader clinical settings

and data resources [79,84,110]. While traditional GNN architectures

were designed for homogeneous graphs, these new methods employ

novel GNNs designed for graphs with often multiple types of nodes and

edges. Alternatively, some of the evaluated methods aim to transform

a heterogeneous EHR graph into a homogeneous one, allowing it to be

processed by standard GNN layers [29].
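A heterogeneous EHR graph and one possible way of reducing it to a homogeneous graph can be sketched as below. Two patients are linked when they share a neighbor through a given relation (the meta-path patient, entity, patient). This is a generic projection shown for illustration, not the specific preprocessing of any reviewed model, and all identifiers and data are invented.

```python
from collections import defaultdict
from itertools import combinations

# Typed nodes and typed edges of a tiny heterogeneous EHR graph (invented data).
node_type = {
    "p1": "patient", "p2": "patient", "p3": "patient",
    "d_hf": "diagnosis", "d_dm": "diagnosis",
    "m_insulin": "medication",
}
edges = [
    ("p1", "has_diagnosis", "d_hf"),
    ("p2", "has_diagnosis", "d_hf"),
    ("p2", "has_diagnosis", "d_dm"),
    ("p3", "has_diagnosis", "d_dm"),
    ("p3", "takes", "m_insulin"),
]


def project_patients(edges, relation):
    """Link two patients when they share a neighbor through `relation`.

    Returns a weighted homogeneous graph: {(patient_a, patient_b): shared_count}.
    """
    members = defaultdict(set)
    for src, rel, dst in edges:
        if rel == relation:
            members[dst].add(src)
    weights = defaultdict(int)
    for patients in members.values():
        for a, b in combinations(sorted(patients), 2):
            weights[(a, b)] += 1
    return dict(weights)


patient_graph = project_patients(edges, "has_diagnosis")
```

The projected graph contains a single node type and a single edge type, so it can be passed to a standard GNN layer, at the cost of discarding the distinction between relation types.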

In Med2Meta [84], Chowdhury et al. use graph autoencoders to learn feature-specific embeddings for individual medical concept categories in the EHR: demographics, laboratory results, and clinical notes. These are then combined into meta-embeddings for downstream tasks, which significantly improves diagnosis prediction for a patient’s subsequent visit. The HarmOnized Representation learning on Dynamic

EHR graphs (HORDE) model generates harmonized medical entity

embeddings based on a multimodal dynamic EHR graph [79]. The


approach focuses on two types of nodes: time-invariant nodes with

static properties, such as events (diagnoses, procedures, laboratory

results) and medical concepts (from unstructured information from

clinical notes), and time-varying nodes with dynamic properties, such

as patients. Multimodality resulted in a more robust model for disease

classification. To minimize the impact of noise in the heterogeneous

EHR graph, the Heterogeneous Similarity Graph Neural Network (HS-

GNN) [29] model employs a preprocessing method that splits the

heterogeneous EHR graph into several homogeneous subgraphs, which

are then merged into a single homogeneous graph. The unified graph

can then be fed into a GNN for diagnosis prediction.

To handle multi-dimensional time-series EHR data, the Time-aware

Context-Gated Graph Attention Network (T-ContextGGAN) [95] model

also utilizes a heterogeneous EHR graph, where patient nodes are

connected to clinical event nodes such as lab tests, infusion drugs,

and prescriptions. The model then uses meta-paths to connect nodes

from various time steps for survival prediction. ME2Vec also addresses

both the heterogeneity of EHRs and time series data, employing a

hierarchical framework to embed medical services, doctors, and patients [85]. Medical services are embedded using a random-walk approach to account for irregular time intervals, while a GNN and a proximity-preserving network embedding approach are used for doctors and patients.

Cho et al. developed a novel EHR graph-database approach using

a multi-attributed and multi-relational bipartite graph to represent

patients and their relations with hospital visits and clinical events [99].

Then, HinSAGE was employed for predicting cardiovascular disease

events, demonstrating superior performance over traditional ML algo-

rithms in a link prediction task.

4.3. Temporality

Given that clinical risk prediction tasks can benefit from considering

the temporal aspects of patient records, EHR data are often not repre-

sented as a single graph but as a series of graphs observed in different

timesteps for a particular patient; or as a dynamic graph, where new

EHR information can be added and thus its nodes and edges change

over time [79,80,102].

To handle the sequential nature of hospital encounters, several

models couple GNNs with recurrent neural networks (RNNs) such as

long short-term memory (LSTM) networks or gated recurrent units

(GRU) (n =18, Supplementary Material). These architectures operate

by passing hidden state information from one input unit to the next,

encoding information about the entire sequence of medical events [1].

When coupled with GNNs, this pairing enables the model to leverage

both structural knowledge from medical graphs and time dependency

between clinical events [23]. For example, in JMRL, an attentive GRU

is used to aggregate temporal information between visits represented as

graphs for next-visit diagnosis prediction [25]. In the Longitudinal and

Graph Integrated (LIGHTED) model, Dong et al. account for the time

between visits by concatenating the embeddings of nodes representing

patient visits with the raw features observed for that visit and feeding

it sequentially into an LSTM [19].

Another approach to handling irregular sampling in patient temporal data is the use of ordinary differential equations (ODEs). In the Graph Attention and RNN-based Neural Ordinary Differential Equations Model (GROM) [92], the ODE represents dynamic clinical data as

a continuous trajectory influenced by local initial states and the global

dynamics of the entire time series, and a time-invariant neural network

function determines the whole of the latent trajectory. The data is then

fed back to a bidirectional RNN layer to mitigate gradient and sequence

length-related issues.
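The appeal of ODE-based modules for irregular sampling can be illustrated with a small sketch: a latent state is evolved in continuous time between observations, so the amount of change depends on the actual elapsed time. The dynamics below are a fixed linear decay and the solver is the explicit Euler method, both chosen for simplicity; a neural ODE would instead learn the dynamics function, and this is not the implementation of any reviewed model.

```python
def euler_evolve(h, t_start, t_end, f, step=0.1):
    """Integrate dh/dt = f(h) from t_start to t_end with the explicit Euler method."""
    t = t_start
    while t < t_end - 1e-12:
        dt = min(step, t_end - t)
        h = [x + dt * dx for x, dx in zip(h, f(h))]
        t += dt
    return h


def decay(h, rate=0.5):
    """Toy dynamics: each latent dimension decays toward zero."""
    return [-rate * x for x in h]


# Irregularly spaced observation times (in days) for one hypothetical patient.
times = [0.0, 1.0, 1.5, 30.0]
h = [1.0, -2.0]
states = [h]
for t_prev, t_next in zip(times, times[1:]):
    h = euler_evolve(h, t_prev, t_next, decay)
    states.append(h)
```

The state changes little across the half-day gap and substantially across the 28.5-day gap, a distinction that a purely order-based sequence model does not see.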

Other studies have handled event sequentiality by employing spa-

tiotemporal GCNs to model both the graph structure and temporality of

the EHR data (n =2, Supplementary Material). In these models, tem-

poral information was leveraged by stacking node attributes along the

timestamp and employing a convolution operation to extract features

in the temporal domain [28,80].
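Stacking node attributes along time and convolving over the temporal axis can be sketched as follows. The kernel is fixed by hand and shared across nodes and features, whereas in a trained model it would be learned; the operation is a cross-correlation, as is conventional in deep learning libraries. This is an illustration of the general idea and not the exact operator of [28,80].

```python
def temporal_conv(x, kernel):
    """Valid 1D convolution along time for every node and feature.

    x      : x[t][node][feature], node attributes stacked over T timesteps
    kernel : list of K temporal weights shared across nodes and features
    Returns a nested list of shape (T - K + 1, num_nodes, num_features).
    """
    T, K = len(x), len(kernel)
    num_nodes, num_feats = len(x[0]), len(x[0][0])
    out = []
    for t in range(T - K + 1):
        frame = [
            [sum(kernel[k] * x[t + k][n][f] for k in range(K)) for f in range(num_feats)]
            for n in range(num_nodes)
        ]
        out.append(frame)
    return out


# Three visits, two code nodes, one feature each (toy values).
x = [
    [[1.0], [0.0]],
    [[1.0], [1.0]],
    [[0.0], [1.0]],
]
y = temporal_conv(x, kernel=[0.5, 0.5])
```

Each output frame summarizes two consecutive visits per node; a spatial graph convolution would then mix information across neighboring nodes within each frame.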

A dynamic graph, on the other hand, can be characterized as a

pair (𝐺, 𝑂), where 𝐺 refers to a static graph that represents the initial state of the dynamic graph, and 𝑂 represents a set of update events, each a tuple consisting of the event type, the specific event (e.g., edge or node addition or deletion), and its timestamp [128]. In the HORDE model [79], medical events,

clinical notes, and patients are represented as nodes in a dynamic

graph, with edges indicating event co-occurrence. As a patient’s condition changes over time, an LSTM captures these changes, providing a temporal context to the evolving graph structure. The approach in [12] adds a new timestamped

node in the patient’s EHR graph for each registered event. In TAG-

Net [83], time-series EHR data is also represented as a dynamic graph

to depict a single patient’s physiological condition across visits. A GRU

model is then employed to create the different representations for each

timestamp.
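A dynamic graph in the (𝐺, 𝑂) sense can be sketched as an initial graph plus a log of timestamped update events. The class, event names, and node labels below are hypothetical and chosen only to illustrate the data structure.

```python
from dataclasses import dataclass, field


@dataclass
class DynamicGraph:
    """A static initial graph G plus a list of timestamped update events O."""
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)
    events: list = field(default_factory=list)  # (event_type, payload, timestamp)

    def add_event(self, event_type, payload, timestamp):
        self.events.append((event_type, payload, timestamp))

    def snapshot(self, until):
        """Replay all events with timestamp <= until on top of the initial graph."""
        nodes, edges = set(self.nodes), set(self.edges)
        for event_type, payload, ts in sorted(self.events, key=lambda e: e[2]):
            if ts > until:
                break
            if event_type == "add_node":
                nodes.add(payload)
            elif event_type == "add_edge":
                edges.add(payload)
            elif event_type == "remove_edge":
                edges.discard(payload)
        return nodes, edges


g = DynamicGraph(nodes={"patient"})
g.add_event("add_node", "dx_hypertension", 1)
g.add_event("add_edge", ("patient", "dx_hypertension"), 1)
g.add_event("add_node", "lab_creatinine", 5)
g.add_event("add_edge", ("patient", "lab_creatinine"), 5)
early_nodes, early_edges = g.snapshot(until=1)
late_nodes, late_edges = g.snapshot(until=5)
```

Taking snapshots at successive timestamps yields the sequence of graphs that a recurrent module can then process.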

Lu et al. introduced a global dynamic disease graph, shifting the

focus from individual patients [102]. They extract subgraphs from

observed diseases during patient visits, which are then processed with

GRUs to discern temporal disease patterns and forecast potential future

outbreaks.

4.4. Clinical risk prediction tasks

4.4.1. Diagnosis prediction

Diagnosis prediction aims to forecast whether a patient will be

diagnosed with a medical condition (n =36, Fig. 5). The reviewed

studies primarily focused on predicting diagnoses within the current or

upcoming visit [16,102]. Still, some extended it to many hours before

manifestation or even up to a year or two in advance [18,98]. Most

studies have treated the task as a multilabel classification problem (n

=24, Supplementary Table). In this context, various probabilities are

determined for multiple diseases, allowing a patient to be associated

with several diseases.
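The multilabel formulation can be sketched with an independent sigmoid per disease, in contrast to a softmax that would force the probabilities to sum to one. The logits, disease names, and the 0.5 threshold are illustrative assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_multilabel(logits, threshold=0.5):
    """Independent sigmoid per disease, so several labels can be active at once."""
    probs = {disease: sigmoid(z) for disease, z in logits.items()}
    labels = {disease for disease, p in probs.items() if p >= threshold}
    return probs, labels


# Invented logits for one patient; in practice they come from the model's output layer.
logits = {"heart_failure": 2.0, "diabetes": 0.3, "sepsis": -3.0}
probs, labels = predict_multilabel(logits)
```

In this toy case two diseases exceed the threshold simultaneously, which a single-label (multiclass) formulation could not express.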

Graph convolutional networks (GCNs) are frequently employed for

diagnosis prediction (n =15; Fig. 6). In [86,129], GCNs were used

to learn medical concept embeddings based on a medical ontology.

The HealGCN model utilizes a heterogeneous GCN and a symptom

retrieval system to enable online disease self-diagnosis [88]. In [28,80],

spatiotemporal GCNs were applied to capture the sequential patterns in

patient records and predict multiple diseases.

Graph attention networks (GATs) have also been widely utilized for

diagnosis prediction (n =12; Fig. 6). They assign different attention

weights to specific medical entity nodes, allowing the models to learn

which nodes are more relevant for a given prediction task [74]. For

example, Choi et al. introduced the Graph Convolutional Transformer

(GCT) with an attention mechanism to learn the graphical representa-

tion of a patient’s visit, highlighting the most significant medical events

for future diagnosis prediction [16]. The JMRL model uses GATs with a

feedback strategy, incorporating medical knowledge graph embeddings

and medical concept co-occurrence to predict multiple diagnoses [25].

The MedPath model utilizes attention weights to provide explanations

for medical paths used in diagnosis prediction [30], and the GTGAT

model employs a gated tree-based GAT with hierarchical and semantic-

aware attention to distill valuable information from nodes, enhancing

personalized disease diagnosis performance [98].

Many studies have focused on predicting specific diagnoses (n =14,

Supplementary Table), particularly cardiovascular diseases (n = 5, Supplementary Table), which can be explained by the high prevalence of positive diagnosis cases in EHR datasets [130]. For instance, the

HinSAGE model was utilized in [99] to predict cardiovascular disease

outcomes, whereas [12,84,95] employed GNNs for heart failure pre-

diction. Other specific diagnoses include diabetes, chronic hepatitis B,

opioid overdose, lymphocytic leukemia, sepsis, and pediatric COVID, as

seen in [19,27,85,94,95,105,110,114]. Furthermore, efforts to diagnose

rare diseases have also been observed [17].


4.4.2. Mortality prediction

Mortality prediction aims to determine whether a specific patient

will pass away (n =10, Fig. 5). This can be based on the patient’s

current visit or a particular period after admission or discharge from the

hospital or the Intensive Care Unit (ICU) [18,110]. In most cases, mor-

tality prediction has been evaluated along with other clinical tasks such

as readmission and diagnosis prediction (n =5, Table 2) [18,26,109].

Furthermore, it is often formulated as a binary classification problem,

where 1 denotes death and 0 denotes survival.

Various strategies have been employed to address this problem.

In the Temporal Aware Graph Convolution Network (TAGNet) model,

EHR time-series data are represented as evolving graphs, and a GCN is

used to mine the structural information for mortality forecasting [83].

Rocheteau et al. predict in-hospital mortality using a patient similar-

ity graph [26]. Multimodal information, such as clinical notes and

patient correlation, is integrated into a multi-view approach in the

Patient Multi-view Multi-modal Feature Fusion Network (PM2F2N)

model [104]. Furthermore, AttenSurv aimed to predict patient survival

using a global attention mechanism and a GNN to extract and identify

latent correlations between clinical risk factors [87].

4.4.3. Readmission prediction

Readmission prediction involves forecasting whether a particular

patient will be readmitted to the hospital after discharge or to the

Intensive Care Unit (ICU) during the same hospital visit [89,115] (n

=8, Fig. 5). This task is often formulated as a binary classification

problem, where 1 denotes readmission and 0 denotes no readmission.

Choi et al. and Wu et al. [16,85] proposed predicting ICU readmis-

sions using a single graph-structured encounter. In [18], readmission

at discharge was predicted using a variationally regularized encoder–decoder graph network. The authors of [109] utilized hypergraph contrastive learning to predict readmission using patient data collected within the initial 24 h of admission to the ICU.

Recent studies have focused on extending the readmission predic-

tion horizon to 30 days. Golmaei and Luo proposed a model that

integrates clinical note information with the topological structure of

patient networks [89]. Furthermore, Tang et al. utilized a multimodal

spatiotemporal GNN incorporating patient similarity to improve read-

mission predictions [115].

5. Discussion

This section examines the main aspects and perspectives of using

GNNs for EHR analysis, returning to the topics mentioned in the Results

section.

5.1. Overview

Going beyond the limitations of tabular patient data, GNNs have the distinctive advantage of incorporating dependencies among medical events and entities into predictions, generating more reliable and personalized patient results. Strategies for doing so include the use of medical ontology hierarchies as an

a priori knowledge source, as seen in [25,28,80] (Table 5); patient

similarity graphs, as seen in [26,98]; and multi-view approaches, which

incorporate the structure of different medical graphs into predictions

such as treatment, medication, and diagnosis interactions, as observed

in [91,96,104]. A summary of identified EHR graph representations

was provided in Table 3, and further details can be found in the

Supplementary Material.

5.2. Multimodality

EHR data present inherent complexity owing to their multimodal

nature. They encompass diverse data types, from images to continuous

and discrete attributes, including medical images, clinical notes, lab

results, and medications. A number of studies have primarily focused on

using codes from medical ontologies to represent a patient (Supplemen-

tary Table). Such an approach is limited in capturing the rich diversity

of information contained within medical records [1]. By integrating

various data modalities, a more comprehensive view of a patient’s

health status can be achieved, thereby enhancing the performance of

clinical risk prediction.

The reviewed studies show that GNNs can help address the challenges posed by this large volume of heterogeneous patient data,

especially when considering the most recent models. For example, in

the DeepNote-GNN model, a natural language processing BERT module

was also used to leverage clinical notes for readmission prediction.

Moreover, [116] used EHRs and genetic reports to predict cancer. A

single study, [115], employed imaging data, specifically chest radio-

graphs. Some further methods of accomplishing this include employing

multi-view approaches and heterogeneous graphs, as in previously

described models. Coupled with adequate GNN architectures, these rep-

resentations allow the exploration of various relevant relations within

multi-modal patient data. For instance, one model segment may focus

on learning disease–disease relations while another can emphasize

patient similarity. When coupled, they can become an important and

holistic resource for modeling the complex topology of EHR relations

within patient information.

Furthermore, beyond the data types already present in most EHRs,

the integration of novel information from omics data (e.g., genomics

and transcriptomics) (only one observed study, [116]) and sensor/

wearable data (no observed studies) has the potential to provide ad-

ditional insights, enriching even further a patient’s profile.

5.3. Model evaluation

All studies provided some sort of GNN comparison against other

machine learning and deep learning models. For example, [98] com-

pared a Gated Tree-based Graph Attention Network with a multilayer

perceptron (MLP) and achieved an improvement in accuracy of almost

9%; and MERGE [101] had an increase in the AUROC compared with

other deep learning baselines of almost 16%. It is worth mentioning

that the degree of improvement heavily depends on the conditions

of the studies, including the model architecture, data resources, and

evaluated clinical risk prediction tasks.

Furthermore, EHR datasets are predominantly organized in tabular

format. This allows researchers to design different EHR graph structures

optimized for various downstream tasks. However, this flexibility also

challenges the comparability of graph-based models. For example, even

if two GNNs are trained on an identical EHR dataset (as observed here, where 23 of the 50 articles used the MIMIC-III benchmark dataset for model training and evaluation; Table 4), the model results may be difficult to compare owing to the differences in the GNN architectures and the unique EHR graph topologies each proposes. Furthermore, these models

may interpret tasks differently. For instance, models might vary in

defining the threshold for assigning a ‘1’ (indicating positive) or ‘0’

(indicating negative) in clinical risk prediction tasks.
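The effect of the decision threshold on reported results can be seen in a small sketch. The same set of predicted scores is binarized at two different cut-offs; the labels, scores, and thresholds are invented toy values.

```python
def confusion_at(y_true, scores, threshold):
    """Count TP, FP, FN, TN when scores >= threshold are labeled positive."""
    tp = fp = fn = tn = 0
    for t, s in zip(y_true, scores):
        pred = 1 if s >= threshold else 0
        if pred and t:
            tp += 1
        elif pred and not t:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Same toy predictions, two different decision thresholds.
y_true = [1, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.6, 0.2, 0.55]
low = confusion_at(y_true, scores, threshold=0.3)
high = confusion_at(y_true, scores, threshold=0.7)
```

With the lower threshold every positive case is found at the cost of one false alarm, while the higher threshold misses two positive cases; threshold-dependent metrics reported by two studies are therefore only comparable when the cut-off is stated.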

This complexity shows that making predictions in the medical field

is still a delicate task that depends heavily on the available resources

and the limitations of the architectures of the models. Moreover, model

variability underscores the need for unified assessment metrics and

cross-model interpretability tools to improve the comparability of the

GNN models. This is especially crucial for clinical tasks, as differ-

ences in model predictions, whether from varying graphical struc-

tures or interpretive paradigms, can lead to different critical clinical

interventions, directly impacting a patient’s health.


5.4. Interpretability

Regarding interpretability, it was observed that half of the models

employed some form of interpretability technique (Fig. 8). Among

these, highlights include attention weights and visualizations of the em-

bedding space with dimensionality reduction techniques. For example,

in [26], the attention weights explain the model’s behavior in learning

to correctly diagnose a patient based on assigning higher weights to

other patients with shared diagnoses. In [86,90], the embedding space

was visualized, providing insights into the learned clinical clusters.

However, these interpretability resources may only partially ex-

plain the models, especially if they involve high architectural and

preprocessing complexity. Deep learning methods are often referred

to as ‘black boxes’ because they operate by inputting complex data,

learning abstract patterns, and producing difficult-to-interpret results,

omitting their internal logic to users [131]. Given this, overall, the more

heterogeneous the data used to represent patients, the more complex

and challenging it is to understand the GNN models, making it a double-

edged sword. For example, a GNN based only on diagnosis codes can be

less complex to examine than one based on multiple medical relations

and entities. The lack of graphical ground-truth explanations makes

interpretability even more complex, especially since most EHR datasets

are tabular, and standard evaluation strategies for graph machine learn-

ing are still emerging [132]. In this sense, it is crucial to move beyond

traditional interpretability methods to ensure the practical utility of

GNN-based systems in the medical field, given that predicting health

outcomes is a critical subject that requires thorough understanding to

be implemented in real clinical settings.

Although the identified techniques provide a level of insight, efforts

should be directed towards incorporating novel GNN interpretabil-

ity approaches, visual analytics, and user experience (UX) methods,

enabling medical professionals to effectively evaluate the outputs of

GNNs and provide feedback on the system’s predictions. Some re-

cently proposed methods include GNN-Explainer [133] and neuron

analysis [134]. More possibilities are described in [120,135] and point

to relevant directions for investigation, especially when combined to

include medical professionals in the model development process.

5.5. Temporality

Most existing GNN models applied to EHR data focus primarily on

capturing the sequential nature of events, disregarding the irregular

time aspect of clinical records. Data irregularity has already been

pointed out as a major challenge in modeling patient temporal data [7].

Models often overlook the exact time gaps between medical events and

visits; for example, whether two visits occurred one day or one year apart.

This is a crucial gap, as the time intervals between clinical events can

carry important information about the progression of a patient’s health

condition. By considering irregular time intervals between events, GNN

models can offer a more comprehensive representation of temporal

dynamics and potentially enable more accurate predictions.
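One simple way to expose irregular intervals to a model is to compute the elapsed time since the previous visit and attach it to each visit as an extra feature. The sketch below shows only that preprocessing step; the dates and the function name are invented, and how the gaps are consumed (for example, as input features or as decay factors) is left to the model design.

```python
from datetime import date


def visit_gaps(visit_dates):
    """Days elapsed since the previous visit (0 for the first visit)."""
    ordered = sorted(visit_dates)
    gaps = [0]
    for prev, curr in zip(ordered, ordered[1:]):
        gaps.append((curr - prev).days)
    return gaps


# Invented visit dates for one patient.
dates = [date(2020, 1, 1), date(2020, 1, 2), date(2021, 1, 2)]
gaps = visit_gaps(dates)
```

A sequence-only model sees three ordered visits here, whereas the gaps distinguish a next-day return from a visit a year later.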

Using GNNs that can handle time series information and coupling

GNNs with other deep learning architectures that can leverage irreg-

ular sequentiality represent interesting investigation directions. For

example, promising options include GNN architectures that can handle dynamic EHR graphs, updating the graph structure as new timestamped patient data arrive [79], and modules based on ordinary differential equations (ODEs), which are well suited to modeling continuous-time data [92]. Other

GNN techniques that can be used for time series processing are listed

in [136].

5.6. Prediction tasks

Among the evaluated articles, diagnosis prediction was the predominant task (n = 36, Fig. 5), as it benefits from abundant and standardized diagnostic code information in EHRs. Accurate diagnosis prediction is essential, but the range of clinical tasks addressed also needs to be diversified. Recent research has focused on novel topics, such as survival analysis, hospitalization risk, patient deterioration, and disease severity (Fig. 5). A need was also observed for more diversity in the GNN models evaluated on these other clinical tasks, which have significantly fewer published articles than diagnosis prediction (Fig. 6). Models for predicting rare diseases and conditions are likewise encouraged, as only one of the analyzed studies aimed to tackle this problem [17]. The high concentration and good performance of GNN architectures tested for diagnosis prediction suggest that other prediction tasks can also benefit from these techniques. Moreover, there is still ample room for further exploration in evaluating these clinical risk tasks with better graph-based patient representations, especially those based on a more heterogeneous, comprehensive view of a patient’s health state.

Finally, clinical tasks primarily aim at predicting outcomes related to health concerns. Although reactive medicine is vital for patient care, exploring the favorable factors that enhance patient health or optimize treatment processes presents valuable opportunities for a more holistic and patient-centered approach. In contrast to reactive medicine, proactive and personalized medicine dedicates more time and resources to disease prevention, early diagnosis, and treatment at stages when it is more cost-effective and potent. This approach also emphasizes managing chronic conditions before they escalate and lead to severe complications.

5.7. Limitations

This study did not cover certain aspects, providing opportunities for future research. These include the evaluation of GNN model bias and fairness across diverse patient populations, an area increasingly recognized for its importance in clinical settings, as well as patient data confidentiality. Understanding how these models perform across different patient demographics and how to handle patient data safely in GNN models is critical to ensure equitable health outcomes.

It is also important to highlight the need to investigate the integration of these models into actual clinical workflows. The direct application and evaluation of these models in clinical environments, alongside a thorough comparative analysis with non-ML models like clinical expert systems, were outside the study’s focus but are valuable directions for future research. This involves considering how these models interact with and augment the decision-making processes of healthcare professionals.

Additionally, the scalability of these models, especially in terms of computational requirements, was not a primary focus of this study. GNNs, particularly when dealing with the large and complex datasets typical in healthcare, can require significant computational power. Investigating ways to optimize these models for more widespread and cost-effective use in diverse clinical environments is essential for subsequent studies.

The narrative review approach is beneficial for providing a broad and comprehensive overview, yet it presents some limitations. Due to the rapidly evolving nature of GNN applications in clinical risk prediction, some recent developments may not have been included. In addition, narrative reviews can limit the generalizability of findings and are prone to bias in the selection and interpretation of studies. These limitations highlight the need for future systematic reviews and meta-analyses to build on the foundational insights provided by this study.

Finally, this review focused specifically on studies that employed GNN approaches. Given the novelty of this field and the often inconsistent terminology, there is a possibility of unintentionally omitting specific works. However, the filtering criteria were carefully designed to maximize the inclusion of relevant studies, ensuring a comprehensive listing despite the potential for omissions.

6. Conclusion

The comprehensive review presented in this paper examined the application of GNNs for clinical risk prediction using EHRs. Initially, a background on EHR graph representation learning was provided, along with its relevance to relational medical data, and an introduction to GNNs and graph analysis. Next, the paper highlighted state-of-the-art GNN approaches that are effective in modeling EHR data, including the graph convolutional network (GCN), graph attention network (GAT), and graph autoencoder (GAE).

The Results section provided statistics on the analyzed articles and detailed the analysis around three axes: data representation, temporality, and clinical prediction tasks. It was possible to identify a growing trend of articles focusing on this topic, and the positive outcomes of the analyzed models demonstrate the significance and potential of the area. Among the architectures employed, GAT and GCN were the most common, and the task on which these models focused the most was the prediction of diagnoses, followed by mortality and readmission. Furthermore, there was a predominance of the MIMIC-III dataset as a resource for building the models.

Finally, the Discussion section presented open challenges in the area. Future research directions include developing models that effectively handle multimodal, heterogeneous, and irregular time information in the EHR data. Additionally, interpretability should be emphasized, especially with a view to enhancing clinicians’ understanding of the predictions, and efforts should be directed towards diversifying the scope of investigated clinical risk prediction tasks. These perspectives will contribute to more informed healthcare decision-making and ultimately improve patient care.

CRediT authorship contribution statement

Heloísa Oss Boll: Writing – review & editing, Writing – original draft, Visualization, Methodology, Investigation, Formal analysis, Data curation, Conceptualization. Ali Amirahmadi: Writing – review & editing, Conceptualization. Mirfarid Musavian Ghazani: Writing – review & editing, Conceptualization. Wagner Ourique de Morais: Writing – review & editing, Conceptualization. Edison Pignaton de Freitas: Writing – review & editing, Conceptualization. Amira Soliman: Writing – review & editing, Supervision, Methodology, Conceptualization. Farzaneh Etminani: Writing – review & editing, Supervision, Methodology, Conceptualization. Stefan Byttner: Writing – review & editing, Supervision, Methodology, Conceptualization. Mariana Recamonde-Mendoza: Writing – review & editing, Visualization, Supervision, Methodology, Conceptualization.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

Funding

This work was financed in part by the Swedish Council for Higher Education through the Linnaeus-Palme Partnership, Sweden (3.3.1.34.16456), Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), Brazil - Finance Code 001, and Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Brazil through grants nr. 309505/2020-8 and 308075/2021-8. We also acknowledge the support from Fundação de Amparo à Pesquisa do Estado do Rio Grande do Sul (FAPERGS), Brazil through grants nr. 22/2551-0000390-7 (Project CIARS) and 21/2551-0002052-0.

Appendix A. Supplementary data

Supplementary material related to this article can be found online

at https://doi.org/10.1016/j.jbi.2024.104616.

References

[1] Y. Si, J. Du, Z. Li, X. Jiang, T. Miller, F. Wang, W. Jim Zheng, K. Roberts,

Deep representation learning of patient data from Electronic Health Records

(EHR): A systematic review, J. Biomed. Inform. 115 (2021) 103671, http:

//dx.doi.org/10.1016/j.jbi.2020.103671, URL https://www.sciencedirect.com/

science/article/pii/S1532046420302999.

[2] N.J. Carson, B. Mullin, M.J. Sanchez, F. Lu, K. Yang, M. Menezes, B.L. Cook,

Identification of suicidal behavior among psychiatrically hospitalized adoles-

cents using natural language processing and machine learning of electronic

health records, PLoS One 14 (2) (2019) e0211116, http://dx.doi.org/10.1371/

journal.pone.0211116, URL https://dx.plos.org/10.1371/journal.pone.0211116.

[3] T. Zheng, W. Xie, L. Xu, X. He, Y. Zhang, M. You, G. Yang, Y. Chen,

A machine learning-based framework to identify type 2 diabetes through

electronic health records, Int. J. Med. Inform. 97 (2017) 120–127, http://dx.

doi.org/10.1016/j.ijmedinf.2016.09.014, URL https://linkinghub.elsevier.com/

retrieve/pii/S1386505616302155.

[4] S. Fu, L.Y. Leung, A.-O. Raulli, D.F. Kallmes, K.A. Kinsman, K.B. Nelson, M.S.

Clark, P.H. Luetmer, P.R. Kingsbury, D.M. Kent, H. Liu, Assessment of the

impact of EHR heterogeneity for clinical research through a case study of

silent brain infarction, BMC Med. Inform. Decis. Mak. 20 (1) (2020) 60, http://dx.doi.org/10.1186/s12911-020-1072-9, URL https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-020-1072-9.

[5] B. Theodorou, C. Xiao, J. Sun, Synthesize high-dimensional longitudinal elec-

tronic health records via hierarchical autoregressive language model, 2023,

arXiv:2304.02169 [cs] URL http://arxiv.org/abs/2304.02169.

[6] B.J. Wells, A.S. Nowacki, K. Chagin, M.W. Kattan, Strategies for handling

missing data in electronic health record derived data, eGEMs J. Electron. Health

Data Methods 1 (3) (2013) 7, http://dx.doi.org/10.13063/2327-9214.1035,

URL https://up-j-gemgem.ubiquityjournal.website/articles/30.

[7] F. Xie, H. Yuan, Y. Ning, M.E.H. Ong, M. Feng, W. Hsu, B. Chakraborty,

N. Liu, Deep learning for temporal data representation in electronic health

records: A systematic review of challenges and methodologies, J. Biomed.

Inform. 126 (2022) 103980, http://dx.doi.org/10.1016/j.jbi.2021.103980, URL

https://linkinghub.elsevier.com/retrieve/pii/S1532046421003099.

[8] Y. Bengio, A. Courville, P. Vincent, Representation learning: A review and new

perspectives, 2014, arXiv:1206.5538 [cs] URL http://arxiv.org/abs/1206.5538.

[9] S.F. Ahmed, M.S.B. Alam, M. Hassan, M.R. Rozbu, T. Ishtiak, N. Rafa, M. Mofi-

jur, A.B.M. Shawkat Ali, A.H. Gandomi, Deep learning modelling techniques:

current progress, applications, advantages, and challenges, Artif. Intell. Rev. 56

(11) (2023) 13521–13617, http://dx.doi.org/10.1007/s10462-023-10466-8.

[10] Q. Suo, H. Xue, J. Gao, A. Zhang, Risk factor analysis based on deep

learning models, in: Proceedings of the 7th ACM International Conference

on Bioinformatics, Computational Biology, and Health Informatics, BCB ’16,

Association for Computing Machinery, New York, NY, USA, 2016, pp. 394–403,

http://dx.doi.org/10.1145/2975167.2975208.

[11] C. Xiao, E. Choi, J. Sun, Opportunities and challenges in developing deep

learning models using electronic health records data: a systematic review,

J. Amer. Med. Inform. Assoc. 25 (10) (2018) 1419–1428, http://dx.doi.

org/10.1093/jamia/ocy068, URL https://academic.oup.com/jamia/article/25/

10/1419/5035024.

[12] S. Chowdhury, Y. Chen, A. Wen, X. Ma, Q. Dai, Y. Yu, S. Fu, X. Jiang, N.

Zong, Predicting physiological response in heart failure management: A graph

representation learning approach using electronic health records, 2023, http:

//dx.doi.org/10.1101/2023.01.27.23285129, URL http://medrxiv.org/lookup/

doi/10.1101/2023.01.27.23285129.

[13] H. Lu, S. Uddin, Disease prediction using graph machine learning based on

electronic health data: A review of approaches and trends, Healthcare 11

(7) (2023) http://dx.doi.org/10.3390/healthcare11071031, URL https://www.

mdpi.com/2227-9032/11/7/1031.

[14] A. Amirahmadi, M. Ohlsson, K. Etminani, Deep learning prediction mod-

els based on EHR trajectories: A systematic review, J. Biomed. Inform.

144 (2023) 104430, http://dx.doi.org/10.1016/j.jbi.2023.104430, URL https:

//www.sciencedirect.com/science/article/pii/S153204642300151X.

[15] W.-H. Weng, P. Szolovits, Representation learning for electronic health records,

2019, arXiv:1909.09248 [cs, stat] URL http://arxiv.org/abs/1909.09248.

[16] E. Choi, Z. Xu, Y. Li, M. Dusenberry, G. Flores, E. Xue, A. Dai, Learning

the graphical structure of electronic health records with graph convolutional

transformer, in: Proceedings of the AAAI Conference on Artificial Intelligence,

Vol. 34, no. 01, 2020, pp. 606–613, http://dx.doi.org/10.1609/aaai.v34i01.5400, URL https://ojs.aaai.org/index.php/AAAI/article/view/5400.

[17] Z. Sun, H. Yin, H. Chen, T. Chen, L. Cui, F. Yang, Disease prediction via

graph neural networks, IEEE J. Biomed. Health Inform. 25 (3) (2021) 818–

826, http://dx.doi.org/10.1109/JBHI.2020.3004143, URL https://ieeexplore.

ieee.org/document/9122573/.

[18] W. Zhu, N. Razavian, Variationally regularized graph-based representation

learning for electronic health records, in: Proceedings of the Confer-

ence on Health, Inference, and Learning, ACM, 2021, pp. 1–13, http:

//dx.doi.org/10.1145/3450439.3451855, URL https://dl.acm.org/doi/10.1145/

3450439.3451855.

[19] X. Dong, R. Wong, W. Lyu, K. Abell-Hart, J. Deng, Y. Liu, J.G. Hajagos,

R.N. Rosenthal, C. Chen, F. Wang, An integrated LSTM-HeteroRGNN model

for interpretable opioid overdose risk prediction, Artif. Intell. Med. 135

(2023) 102439, http://dx.doi.org/10.1016/j.artmed.2022.102439, URL https:

//linkinghub.elsevier.com/retrieve/pii/S0933365722001919.

[20] M.M. Bronstein, J. Bruna, Y. LeCun, A. Szlam, P. Vandergheynst, Geometric

deep learning: Going beyond Euclidean data, IEEE Signal Process. Mag. 34

(4) (2017) 18–42, http://dx.doi.org/10.1109/MSP.2017.2693418.

[21] I.A. Chikwendu, X. Zhang, I.O. Agyemang, I. Adjei-Mensah, U.C. Chima, C.J.

Ejiyi, A comprehensive survey on deep graph representation learning methods,

J. Artificial Intelligence Res. 78 (2023) 287–356, http://dx.doi.org/10.1613/

jair.1.14768, URL https://jair.org/index.php/jair/article/view/14768.

[22] W. Jiang, J. Luo, Graph neural network for traffic forecasting: A survey, Expert

Syst. Appl. 207 (2022) 117921, http://dx.doi.org/10.1016/j.eswa.2022.117921,

URL https://www.sciencedirect.com/science/article/pii/S0957417422011654.

[23] J. Xu, X. Xi, J. Chen, V.S. Sheng, J. Ma, Z. Cui, A survey of deep learning for

electronic health records, Appl. Sci. 12 (22) (2022) 11709, http://dx.doi.org/

10.3390/app122211709.

[24] H. Cui, J. Lu, S. Wang, R. Xu, W. Ma, S. Yu, Y. Yu, X. Kan, T. Fu, C. Ling, J.

Ho, F. Wang, C. Yang, A survey on knowledge graphs for healthcare: Resources,

application progress, and promise, in: ICML 3rd Workshop on Interpretable

Machine Learning in Healthcare, IMLH, 2023, p. 19, URL https://openreview.

net/forum?id=CZCktJoBRh.

[25] K. Wang, N. Chen, T. Chen, Joint medical ontology representation learning for

healthcare predictions, in: 2020 International Joint Conference on Neural Net-

works (IJCNN), IEEE, 2020, pp. 1–7, http://dx.doi.org/10.1109/IJCNN48605.

2020.9207355, URL https://ieeexplore.ieee.org/document/9207355/.

[26] E. Rocheteau, C. Tong, P. Veličković, N. Lane, P. Liò, Predicting patient

outcomes with graph representation learning, 2021, arXiv:2101.03940 [cs] URL

http://arxiv.org/abs/2101.03940.

[27] Z.E. Wu, D. Xu, P.J.-H. Hu, T.-S. Huang, A hierarchical multilabel graph

attention network method to predict the deterioration paths of chronic hepatitis

B patients, J. Amer. Med. Inform. Assoc. 30 (5) (2023) 846–858, http://dx.doi.

org/10.1093/jamia/ocad008, URL https://academic.oup.com/jamia/article/30/

5/846/7040373.

[28] Y. Li, B. Qian, X. Zhang, H. Liu, Knowledge guided diagnosis prediction via

graph spatial-temporal network, in: Proceedings of the 2020 SIAM International

Conference on Data Mining, SDM, SIAM, 2020, pp. 19–27, http://dx.doi.org/10.1137/1.9781611976236.3, URL https://epubs.siam.org/doi/abs/10.1137/1.9781611976236.3.

[29] Z. Liu, X. Li, H. Peng, L. He, P.S. Yu, Heterogeneous similarity graph neural

network on electronic health records, in: 2020 IEEE International Conference

on Big Data (Big Data), IEEE, 2020, pp. 1196–1205, http://dx.doi.org/10.

1109/BigData50022.2020.9377795, URL https://ieeexplore.ieee.org/document/

9377795/.

[30] M. Ye, S. Cui, Y. Wang, J. Luo, C. Xiao, F. Ma, MedPath: Augmenting

health risk prediction via medical knowledge paths, in: Proceedings of the

Web Conference 2021, ACM, 2021, pp. 1397–1409, http://dx.doi.org/10.1145/

3442381.3449860, URL https://dl.acm.org/doi/10.1145/3442381.3449860.

[31] C. Lu, C.K. Reddy, Y. Ning, Self-supervised graph learning with hyperbolic

embedding for temporal health event prediction, IEEE Trans. Cybern. 53

(4) (2023) 2124–2136, http://dx.doi.org/10.1109/TCYB.2021.3109881, arXiv:2106.04751 [cs] URL http://arxiv.org/abs/2106.04751.

[32] J. Schrodt, A. Dudchenko, P. Knaup-Gregori, M. Ganzinger, Graph-

representation of patient data: a systematic literature review, J. Med. Syst. 44

(4) (2020) 86.

[33] E. Choi, M.T. Bahadori, A. Schuetz, W.F. Stewart, J. Sun, Doctor AI: Predicting

clinical events via recurrent neural networks, 2016, arXiv:1511.05942 [cs] URL

http://arxiv.org/abs/1511.05942.

[34] E. Choi, M.T. Bahadori, J.A. Kulas, A. Schuetz, W.F. Stewart, J. Sun, RETAIN:

An interpretable predictive model for healthcare using reverse time atten-

tion mechanism, 2017, arXiv:1608.05745 [cs] URL http://arxiv.org/abs/1608.

05745.

[35] F. Ma, R. Chitta, J. Zhou, Q. You, T. Sun, J. Gao, Dipole: Diagnosis prediction

in healthcare via attention-based bidirectional recurrent neural networks, in:

Proceedings of the 23rd ACM SIGKDD International Conference on Knowl-

edge Discovery and Data Mining, 2017, pp. 1903–1911, http://dx.doi.org/

10.1145/3097983.3098088, arXiv:1706.05764 [cs] URL http://arxiv.org/abs/1706.05764.

[36] P. Nguyen, T. Tran, N. Wickramasinghe, S. Venkatesh, Deepr: A convolutional

net for medical records, 2016, arXiv:1607.07519 [cs, stat] URL http://arxiv.

org/abs/1607.07519.

[37] E. Choi, M.T. Bahadori, E. Searles, C. Coffey, J. Sun, Multi-layer representation

learning for medical concepts, 2016, arXiv:1602.05568 [cs] URL http://arxiv.

org/abs/1602.05568.

[38] Y. Choi, C.Y.-I. Chiu, D. Sontag, Learning low-dimensional representations of

medical concepts, AMIA Summits Transl. Sci. Proc. 2016 (2016) 41–50, URL

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5001761/.

[39] Z. Che, D. Kale, W. Li, M.T. Bahadori, Y. Liu, Deep computational phenotyp-

ing, in: Proceedings of the 21th ACM SIGKDD International Conference on

Knowledge Discovery and Data Mining, KDD ’15, Association for Computing

Machinery, 2015, pp. 507–516, http://dx.doi.org/10.1145/2783258.2783365.

[40] R. Miotto, L. Li, B.A. Kidd, J.T. Dudley, Deep patient: An unsupervised

representation to predict the future of patients from the electronic health

records, Sci. Rep. 6 (1) (2016) 26094, http://dx.doi.org/10.1038/srep26094,

URL https://www.nature.com/articles/srep26094.

[41] Y. Cheng, F. Wang, P. Zhang, J. Hu, Risk prediction with electronic health

records: A deep learning approach, in: Proceedings of the 2016 SIAM Inter-

national Conference on Data Mining, Society for Industrial and Applied Math-

ematics, 2016, pp. 432–440, http://dx.doi.org/10.1137/1.9781611974348.49,

URL https://epubs.siam.org/doi/10.1137/1.9781611974348.49.

[42] T. Pham, T. Tran, D. Phung, S. Venkatesh, DeepCare: A deep dynamic memory

model for predictive medicine, 2017, arXiv:1602.00357 [cs, stat] URL http:

//arxiv.org/abs/1602.00357.

[43] International classification of diseases (ICD), 2023, URL https://www.who.int/standards/classifications/classification-of-diseases.

[44] J. Zhang, J. Gong, L. Barnes, HCNN: Heterogeneous convolutional neural

networks for comorbid risk prediction with electronic health records, in: 2017

IEEE/ACM International Conference on Connected Health: Applications, Systems

and Engineering Technologies (CHASE), IEEE, 2017, pp. 214–221, http://

dx.doi.org/10.1109/CHASE.2017.80, URL http://ieeexplore.ieee.org/document/

8010635/.

[45] E. Choi, M.T. Bahadori, L. Song, W.F. Stewart, J. Sun, GRAM: Graph-based

attention model for healthcare representation learning, in: Proceedings of the

23rd ACM SIGKDD International Conference on Knowledge Discovery and

Data Mining, ACM, 2017, pp. 787–795, http://dx.doi.org/10.1145/3097983.

3098126, URL https://dl.acm.org/doi/10.1145/3097983.3098126.

[46] F. Ma, Q. You, H. Xiao, R. Chitta, J. Zhou, J. Gao, KAME: Knowledge-

based attention model for diagnosis prediction in healthcare, in: Proceedings

of the 27th ACM International Conference on Information and Knowledge

Management, ACM, 2018, pp. 743–752, http://dx.doi.org/10.1145/3269206.

3271701, URL https://dl.acm.org/doi/10.1145/3269206.3271701.

[47] L. Song, C.W. Cheong, K. Yin, W.K. Cheung, B.C.M. Fung, J. Poon, Medical

concept embedding with multiple ontological representations, in: Proceedings

of the Twenty-Eighth International Joint Conference on Artificial Intelligence,

International Joint Conferences on Artificial Intelligence Organization, 2019,

pp. 4613–4619, http://dx.doi.org/10.24963/ijcai.2019/641, URL https://www.

ijcai.org/proceedings/2019/641.

[48] J. Gao, X. Wang, Y. Wang, Z. Yang, J. Gao, J. Wang, W. Tang, X. Xie, CAMP: Co-

attention memory networks for diagnosis prediction in healthcare, in: 2019 IEEE

International Conference on Data Mining (ICDM), IEEE, 2019, pp. 1036–1041,

http://dx.doi.org/10.1109/ICDM.2019.00120, URL https://ieeexplore.ieee.org/

document/8970792/.

[49] E. Choi, C. Xiao, W.F. Stewart, J. Sun, MiME: Multilevel medical embedding

of electronic health records for predictive healthcare, 2018, arXiv:1810.09593

[cs, stat] URL http://arxiv.org/abs/1810.09593.

[50] Y. Wang, W. Chen, D. Pi, R. Boots, Graph augmented triplet architecture for

fine-grained patient similarity, World Wide Web 23 (5) (2020) 2739–2752,

http://dx.doi.org/10.1007/s11280-020-00794-y, URL http://link.springer.com/10.1007/s11280-020-00794-y.

[51] B. Hettige, Y.-F. Li, W. Wang, S. Le, W. Buntine, MedGraph: Structural and

temporal representation learning of electronic medical records, 2020, arXiv:

1912.03703 [cs, stat] URL http://arxiv.org/abs/1912.03703.

[52] R. Li, C. Yin, S. Yang, B. Qian, P. Zhang, Marrying medical domain knowledge

with deep learning on electronic health records: A deep visual analytics

approach, J. Med. Internet Res. 22 (9) (2020) e20645, http://dx.doi.org/10.

2196/20645, URL http://www.jmir.org/2020/9/e20645/.

[53] H. Jiang, D. Yang, Learning graph-based embedding from EHRs for time-aware

patient similarity, Eng. Lett. 28 (4) (2020).

[54] J. Zhou, G. Cui, S. Hu, Z. Zhang, C. Yang, Z. Liu, L. Wang, C. Li, M.

Sun, Graph neural networks: A review of methods and applications, AI Open

1 (2020) 57–81, http://dx.doi.org/10.1016/j.aiopen.2021.01.001, URL https:

//linkinghub.elsevier.com/retrieve/pii/S2666651021000012.

[55] B. Sanchez-Lengeling, E. Reif, A. Pearce, A.B. Wiltschko, A gentle introduction

to graph neural networks, Distill 6 (9) (2021) e33, http://dx.doi.org/10.23915/

distill.00033, URL https://distill.pub/2021/gnn-intro.

[56] P. Veličković, Everything is connected: Graph neural networks, 2023, arXiv:

2301.08210 [cs, stat] URL http://arxiv.org/abs/2301.08210.

[57] C. Gao, Y. Zheng, N. Li, Y. Li, Y. Qin, J. Piao, Y. Quan, J. Chang, D. Jin, X. He,

Y. Li, A survey of graph neural networks for recommender systems: Challenges,

methods, and directions, ACM Trans. Recomm. Syst. 1 (1) (2023) 3:1–3:51, http:

//dx.doi.org/10.1145/3568022, URL https://dl.acm.org/doi/10.1145/3568022.

[58] L.A. Alves, N.C.D.S. Ferreira, V. Maricato, A.V.P. Alberto, E.A. Dias, N. Jose

Aguiar Coelho, Graph neural networks as a potential tool in improving virtual

screening programs, Front. Chem. 9 (2022) 787194, http://dx.doi.org/10.3389/

fchem.2021.787194, URL https://www.frontiersin.org/articles/10.3389/fchem.

2021.787194/full.

[59] P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Liò, Y. Bengio, Graph

attention networks, 2018, arXiv:1710.10903 [cs, stat] URL http://arxiv.org/

abs/1710.10903.

[60] K. Jha, S. Saha, H. Singh, Prediction of protein–protein interaction using

graph neural networks, Sci. Rep. 12 (1) (2022) 8360, http://dx.doi.org/

10.1038/s41598-022-12201-9, URL https://www.nature.com/articles/s41598-022-12201-9.

[61] G. Panagopoulos, G. Nikolentzos, M. Vazirgiannis, Transfer graph neural

networks for pandemic forecasting, in: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35, no. 6, 2021, pp. 4838–4845, http://dx.doi.org/10.1609/aaai.v35i6.16616, URL https://ojs.aaai.org/index.php/AAAI/article/view/16616.

[62] P. Bongini, M. Bianchini, F. Scarselli, Molecular generative graph neural

networks for drug discovery, Neurocomputing 450 (2021) 242–252, http://dx.

doi.org/10.1016/j.neucom.2021.04.039, URL https://linkinghub.elsevier.com/

retrieve/pii/S0925231221005737.

[63] Z. Lin, D. Yang, H. Jiang, H. Yin, Learning patient similarity via heterogeneous

medical knowledge graph embedding, Int. J. Comput. Sci. 48 (4) (2021).

[64] M. Gori, G. Monfardini, F. Scarselli, A new model for learning in graph

domains, in: Proceedings. 2005 IEEE International Joint Conference on Neural

Networks, 2005, Vol. 2, 2005, pp. 729–734, http://dx.doi.org/10.1109/IJCNN.2005.1555942.

[65] F. Scarselli, M. Gori, A.C. Tsoi, M. Hagenbuchner, G. Monfardini, The graph

neural network model, IEEE Trans. Neural Netw. 20 (1) (2009) 61–80,

http://dx.doi.org/10.1109/TNN.2008.2005605, URL http://ieeexplore.ieee.org/

document/4700287/.

[66] S.K. Maurya, X. Liu, T. Murata, Feature selection: Key to enhance node

classification with graph neural networks, CAAI Trans. Intell. Technol. 8 (1)

(2023) 14–28, http://dx.doi.org/10.1049/cit2.12166, URL https://ietresearch.

onlinelibrary.wiley.com/doi/10.1049/cit2.12166.

[67] K. Xu, W. Hu, J. Leskovec, S. Jegelka, How powerful are graph neural

networks?, 2019, arXiv:1810.00826 [cs, stat] URL http://arxiv.org/abs/1810.

00826.

[68] A. Mohi ud din, S. Qureshi, A review of challenges and solutions in the design

and implementation of deep graph neural networks, Int. J. Comput. Appl.

45 (3) (2023) 221–230, http://dx.doi.org/10.1080/1206212X.2022.2133805.

[69] A. Daigavane, B. Ravindran, G. Aggarwal, Understanding convolutions on

graphs, Distill 6 (9) (2021) e32, http://dx.doi.org/10.23915/distill.00032, URL

https://distill.pub/2021/understanding-gnns.

[70] T.N. Kipf, M. Welling, Semi-supervised classification with graph convolutional

networks, 2017, arXiv:1609.02907 [cs, stat] URL http://arxiv.org/abs/1609.

02907.

[71] C. Gao, Y. Zheng, N. Li, Y. Li, Y. Qin, J. Piao, Y. Quan, J. Chang, D. Jin, X. He,

Y. Li, A survey of graph neural networks for recommender systems: Challenges,

methods, and directions, ACM Trans. Recomm. Syst. 1 (1) (2023) 1–51, http:

//dx.doi.org/10.1145/3568022, URL https://dl.acm.org/doi/10.1145/3568022.

[72] Z. Chen, F. Chen, L. Zhang, T. Ji, K. Fu, L. Zhao, F. Chen, L. Wu, C.

Aggarwal, C.-T. Lu, Bridging the gap between spatial and spectral domains:

A survey on graph neural networks, 2021, arXiv:2002.11867 [cs, stat] URL

http://arxiv.org/abs/2002.11867.

[73] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez, Ł.

Kaiser, I. Polosukhin, Attention is all you need, in: I. Guyon, U.V. Luxburg, S.

Bengio, H. Wallach, R. Fergus, S. Vishwanathan, R. Garnett (Eds.), in: Advances

in Neural Information Processing Systems, vol. 30, Curran Associates, Inc.,

2017, pp. 1–11, URL https://proceedings.neurips.cc/paper_files/paper/2017/

file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf.

[74] S. Chaudhari, V. Mithal, G. Polatkan, R. Ramanath, An attentive survey of

attention models, ACM Trans. Intell. Syst. Technol. 12 (5) (2021) 53:1–53:32,

http://dx.doi.org/10.1145/3465055.

[75] Z. Zhang, P. Cui, W. Zhu, Deep learning on graphs: A survey, IEEE Trans.

Knowl. Data Eng. 34 (1) (2022) 249–270, http://dx.doi.org/10.1109/

TKDE.2020.2981333, URL https://ieeexplore.ieee.org/abstract/document/9039675.

[76] W.L. Hamilton, R. Ying, J. Leskovec, Representation learning on graphs:

Methods and applications, 2018, arXiv:1709.05584 [cs] URL http://arxiv.org/

abs/1709.05584.

[77] T.N. Kipf, M. Welling, Variational graph auto-encoders, 2016, arXiv:1611.07308

[cs, stat] URL http://arxiv.org/abs/1611.07308.

[78] M.J. Page, J.E. McKenzie, P.M. Bossuyt, I. Boutron, T.C. Hoffmann, C.D.

Mulrow, L. Shamseer, J.M. Tetzlaff, E.A. Akl, S.E. Brennan, R. Chou, J.

Glanville, J.M. Grimshaw, A. Hróbjartsson, M.M. Lalu, T. Li, E.W. Loder, E.

Mayo-Wilson, S. McDonald, L.A. McGuinness, L.A. Stewart, J. Thomas, A.C.

Tricco, V.A. Welch, P. Whiting, D. Moher, The PRISMA 2020 statement: an

updated guideline for reporting systematic reviews, BMJ 372 (2021) n71,

http://dx.doi.org/10.1136/bmj.n71, URL https://www.bmj.com/content/372/bmj.n71.

[79] D. Lee, X. Jiang, H. Yu, Harmonized representation learning on dynamic

EHR graphs, J. Biomed. Inform. 106 (2020) 103426, http://dx.doi.org/

10.1016/j.jbi.2020.103426, URL https://linkinghub.elsevier.com/retrieve/pii/

S153204642030054X.

[80] Y. Li, B. Qian, X. Zhang, H. Liu, Graph neural network-based diagnosis

prediction, Big Data 8 (5) (2020) 379–390, http://dx.doi.org/10.1089/big.2020.

0070, URL https://www.liebertpub.com/doi/10.1089/big.2020.0070.

[81] B.T. Lee, O.-Y. Kwon, H. Park, K.-J. Cho, J.-M. Kwon, Y. Lee, Graph

convolutional networks-based noisy data imputation in electronic health

record, Crit. Care Med. 48 (11) (2020) e1106–e1111, http://dx.doi.org/10.

1097/CCM.0000000000004583, URL https://journals.lww.com/10.1097/CCM.

0000000000004583.

[82] Q. Wang, B.C.M. Fung, P.C.K. Hung, DUGRA: Dual-graph representation learn-

ing for health information networks, in: 2020 IEEE International Conference

on Big Data (Big Data), IEEE, 2020, pp. 4961–4970, http://dx.doi.org/10.

1109/BigData50022.2020.9378420, URL https://ieeexplore.ieee.org/document/

9378420/.

[83] S. Wang, J. Liu, TAGNet: Temporal aware graph convolution network for

clinical information extraction, in: 2020 IEEE International Conference on

Bioinformatics and Biomedicine (BIBM), IEEE, 2020, pp. 2105–2108, http:

//dx.doi.org/10.1109/BIBM49941.2020.9313530, URL https://ieeexplore.ieee.

org/document/9313530/.

[84] S. Chowdhury, C. Zhang, P. Yu, Y. Luo, Med2Meta: Learning representations of medical concepts with meta-embeddings, in: Proceedings of the

13th International Joint Conference on Biomedical Engineering Systems and

Technologies, SCITEPRESS - Science and Technology Publications, 2020, pp.

369–376, http://dx.doi.org/10.5220/0008934403690376, URL https://www.

scitepress.org/DigitalLibrary/Link.aspx?doi=10.5220/0008934403690376.

[85] T. Wu, Y. Wang, Y. Wang, E. Zhao, Y. Yuan, Leveraging graph-based hier-

archical medical entity embedding for healthcare applications, Sci. Rep. 11

(1) (2021) 5858, http://dx.doi.org/10.1038/s41598-021-85255-w, URL https://www.nature.com/articles/s41598-021-85255-w.

[86] Y. Shi, Y. Guo, H. Wu, J. Li, X. Li, Multi-relational EHR representation learning

with infusing information of Diagnosis and Medication, in: 2021 IEEE 45th

Annual Computers, Software, and Applications Conference (COMPSAC), IEEE,

2021, pp. 1617–1622, http://dx.doi.org/10.1109/COMPSAC51774.2021.00241,

URL https://ieeexplore.ieee.org/document/9529837/.

[87] Z. Sun, W. Dong, J. Shi, K. He, Z. Huang, Attention-based deep recurrent model

for survival prediction, ACM Trans. Comput. Healthc. 2 (4) (2021) 1–18, http:

//dx.doi.org/10.1145/3466782, URL https://dl.acm.org/doi/10.1145/3466782.

[88] Z. Wang, R. Wen, X. Chen, S. Cao, S.-L. Huang, B. Qian, Y. Zheng, Online

disease diagnosis with inductive heterogeneous graph convolutional networks,

in: Proceedings of the Web Conference 2021, ACM, 2021, pp. 3349–3358,

http://dx.doi.org/10.1145/3442381.3449795, URL https://dl.acm.org/doi/10.

1145/3442381.3449795.

[89] S.N. Golmaei, X. Luo, DeepNote-GNN: predicting hospital readmission using

clinical notes and patient network, in: Proceedings of the 12th ACM Conference

on Bioinformatics, Computational Biology, and Health Informatics, ACM, 2021,

pp. 1–9, http://dx.doi.org/10.1145/3459930.3469547, URL https://dl.acm.org/

doi/10.1145/3459930.3469547.

[90] R. Vinas, X. Zheng, J. Hayes, A graph-based imputation method for sparse

medical records, 2021, arXiv:2111.09084 [cs] URL http://arxiv.org/abs/2111.

09084.

[91] W. Yang, S. Zhang, B. Zhang, Medical assistant diagnosis method based

on graph neural network and attention mechanism, in: 2021 the 3rd

World Symposium on Software Engineering, ACM, 2021, pp. 194–198, http:

//dx.doi.org/10.1145/3488838.3488871, URL https://dl.acm.org/doi/10.1145/

3488838.3488871.

[92] H. Qiu, C. Zhang, Z. Fei, M. Qiu, S.-Y. Kung (Eds.), Readmission prediction

with knowledge graph attention and RNN-based ordinary differential equations,

in: Lecture Notes in Computer Science, vol. 12817, Springer International

Publishing, 2021, http://dx.doi.org/10.1007/978-3-030-82153-1, URL https:

//link.springer.com/10.1007/978-3-030-82153-1.

[93] C. Lu, C.K. Reddy, P. Chakraborty, S. Kleinberg, Y. Ning, Collaborative graph

learning with auxiliary text for temporal event prediction in healthcare, in:

Proceedings of the Thirtieth International Joint Conference on Artificial Intel-

ligence, International Joint Conferences on Artificial Intelligence Organization,

2021, pp. 3529–3535, http://dx.doi.org/10.24963/ijcai.2021/486, URL https:

//www.ijcai.org/proceedings/2021/486.

[94] A. Pieroni, A. Cabroni, F. Fallucchi, N. Scarpato, Predictive modeling ap-

plied to structured clinical data extracted from electronic health records:

An architectural hypothesis and a first experiment, J. Comput. Sci. 17 (9)

(2021) 762–775, http://dx.doi.org/10.3844/jcssp.2021.762.775, URL https://

thescipub.com/abstract/10.3844/jcssp.2021.762.775.

[95] Y. Xu, H. Ying, S. Qian, F. Zhuang, X. Zhang, D. Wang, J. Wu, H. Xiong,

Time-aware context-gated graph attention network for clinical risk prediction,

IEEE Trans. Knowl. Data Eng. (2022) 1–12, http://dx.doi.org/10.1109/TKDE.

2022.3181780, URL https://ieeexplore.ieee.org/document/9794568/.

[96] Z. Sun, X. Yang, Z. Feng, T. Xu, X. Fan, J. Tian, EHR2HG: Modeling of

EHRs data based on hypergraphs for disease prediction, in: 2022 IEEE Interna-

tional Conference on Bioinformatics and Biomedicine (BIBM), IEEE, 2022, pp.

1730–1733, http://dx.doi.org/10.1109/BIBM55620.2022.9995204, URL https:

//ieeexplore.ieee.org/document/9995204/.

[97] Z. Qu, L. Cui, Y. Xu, Disease risk prediction via heterogeneous graph at-

tention networks, in: 2022 IEEE International Conference on Bioinformatics

and Biomedicine (BIBM), IEEE, 2022, pp. 3385–3390, http://dx.doi.org/10.

1109/BIBM55620.2022.9995491, URL https://ieeexplore.ieee.org/document/

9995491/.

[98] J. Jiang, T. Wang, B. Wang, L. Ma, Y. Guan, Gated tree-based graph atten-

tion network (GTGAT) for medical knowledge graph reasoning, Artif. Intell.

Med. 130 (2022) 102329, http://dx.doi.org/10.1016/j.artmed.2022.102329,

URL https://linkinghub.elsevier.com/retrieve/pii/S093336572200094X.

[99] H.N. Cho, I. Ahn, H. Gwon, H.J. Kang, Y. Kim, H. Seo, H. Choi, M.

Kim, J. Han, G. Kee, T.J. Jun, Y.-H. Kim, Heterogeneous graph construction

and HinSAGE learning from electronic medical records, Sci. Rep. 12 (1)

(2022) 21152, http://dx.doi.org/10.1038/s41598-022-25693-2, URL https://

www.nature.com/articles/s41598-022-25693-2.

[100] Y. Li, D. Yang, X. Gong, Patient similarity via medical attributed heterogeneous

graph convolutional network, Int. J. Comput. Sci. (2022) 1152–1161, URL

https://www.iaeng.org/IJCS/issues_v49/issue_4/IJCS_49_4_18.pdf, 10 pp.

[101] Y. An, R. Li, X. Chen, MERGE: A multi-graph attentive representa-

tion learning framework integrating group information from similar pa-

tients, Comput. Biol. Med. 151 (2022) 106245, http://dx.doi.org/10.1016/

j.compbiomed.2022.106245, URL https://linkinghub.elsevier.com/retrieve/pii/

S0010482522009532.

[102] C. Lu, T. Han, Y. Ning, Context-aware health event prediction via transition

functions on dynamic disease graphs, in: Proceedings of the AAAI Conference

on Artificial Intelligence, Vol. 36, no. 4, 2022, pp. 4567–4574, http://dx.

doi.org/10.1609/aaai.v36i4.20380, URL https://ojs.aaai.org/index.php/AAAI/

article/view/20380.

[103] T. Kanchinadam, S. Gauher, Predicting clinical events via graph neural net-

works, in: 2022 21st IEEE International Conference on Machine Learning

and Applications (ICMLA), IEEE, 2022, pp. 1296–1303, http://dx.doi.org/

10.1109/ICMLA55696.2022.00207, URL https://ieeexplore.ieee.org/document/

10069726/.

[104] Y. Zhang, B. Zhou, K. Song, X. Sui, G. Zhao, N. Jiang, X. Yuan, PM2F2N: Patient

multi-view multi-modal feature fusion networks for clinical outcome prediction,

ACL Anthol. (2022).

[105] J. Gao, C. Yang, J. Heintz, S. Barrows, E. Albers, M. Stapel, S. Warfield,

A. Cross, J. Sun, MedML: Fusing medical knowledge and machine learning

models for early pediatric COVID-19 hospitalization and severity prediction,

iScience 25 (9) (2022) 104970, http://dx.doi.org/10.1016/j.isci.2022.104970,

URL https://linkinghub.elsevier.com/retrieve/pii/S2589004222012421.

[106] Q. Zhao, J. Li, L. Zhao, Z. Zhu, Knowledge guided feature aggregation

for the prediction of chronic obstructive pulmonary disease with Chinese

EMRs, IEEE/ACM Trans. Comput. Biol. Bioinform. (2022) 1–10, http://dx.doi.

org/10.1109/TCBB.2022.3198798, URL https://ieeexplore.ieee.org/document/

9857572/.

[107] Y. Zou, A. Pesaranghader, Z. Song, A. Verma, D.L. Buckeridge, Y. Li, Modeling

electronic health record data using an end-to-end knowledge-graph-informed

topic model, Sci. Rep. 12 (1) (2022) 17868, http://dx.doi.org/10.1038/s41598-

022-22956-w, URL https://www.nature.com/articles/s41598-022-22956-w.

[108] K. Zhang, B. Hu, F. Zhou, Y. Song, X. Zhao, X. Huang, Graph-based structural

knowledge-aware network for diagnosis assistant, Math. Biosci. Eng. 19 (10)

(2022) 10533–10549, http://dx.doi.org/10.3934/mbe.2022492, URL http://

www.aimspress.com/article/doi/10.3934/mbe.2022492.

[109] D. Cai, C. Sun, M. Song, B. Zhang, S. Hong, H. Li, Hypergraph Contrastive

Learning for Electronic Health Records, Society for Industrial and Applied Math-

ematics, Philadelphia, PA, 2022, http://dx.doi.org/10.1137/1.9781611977172,

URL https://epubs.siam.org/doi/book/10.1137/1.9781611977172.

[110] X. Ma, Y. Wang, X. Chu, L. Ma, W. Tang, J. Zhao, Y. Yuan, G. Wang, Patient

health representation learning via correlational sparse prior of medical features,

IEEE Trans. Knowl. Data Eng. (2022) 1–14, http://dx.doi.org/10.1109/TKDE.

2022.3230454.

[111] H.-R. Yao, N. Cao, K. Russell, D.-C. Chang, O. Frieder, J. Fineman, Self-

supervised representation learning on electronic health records with graph

kernel infomax, 2022, arXiv:2209.00655 [cs] URL http://arxiv.org/abs/2209.

00655.

[112] W. Li, H. Li, B. Yang, L. Zhou, X. Yang, M. Zhang, B. Wang,

Knowledge-aware representation learning for diagnosis prediction, Expert Syst.

40 (3) (2023) e13175, http://dx.doi.org/10.1111/exsy.13175, URL https://

onlinelibrary.wiley.com/doi/10.1111/exsy.13175.

[113] T.-C. Do, H.-J. Yang, G.-S. Lee, S.-H. Kim, B.-G. Kho, Rapid response system

based on graph attention network for predicting in-hospital clinical deteriora-

tion, IEEE Access 11 (2023) 29091–29100, http://dx.doi.org/10.1109/ACCESS.

2023.3257406, URL https://ieeexplore.ieee.org/document/10070599/.

[114] Y. Li, L. Feng, Patient multi-relational graph structure learning for diabetes

clinical assistant diagnosis, Math. Biosci. Eng. 20 (5) (2023) 8428–8445, http://

dx.doi.org/10.3934/mbe.2023369, URL http://www.aimspress.com/article/doi/

10.3934/mbe.2023369.

[115] S. Tang, A. Tariq, J.A. Dunnmon, U. Sharma, P. Elugunti, D.L. Rubin, B.N. Patel,

I. Banerjee, Predicting 30-day all-cause hospital readmission using multimodal

spatiotemporal graph neural networks, IEEE J. Biomed. Health Inform. (2023)

1–12, http://dx.doi.org/10.1109/JBHI.2023.3236888, URL https://ieeexplore.

ieee.org/document/10016722/.

[116] N. Zong, V. Ngo, D.J. Stone, A. Wen, Y. Zhao, Y. Yu, S. Liu, M. Huang, C.

Wang, G. Jiang, Leveraging genetic reports and electronic health records for

the prediction of primary cancers: Algorithm development and validation study,

JMIR Med. Inform. 9 (5) (2021) e23586, http://dx.doi.org/10.2196/23586, URL

https://medinform.jmir.org/2021/5/e23586.

[117] A. Johnson, T. Pollard, R. Mark, MIMIC-III clinical database, 2015, http://dx.

doi.org/10.13026/C2XW26, URL https://physionet.org/content/mimiciii/1.4/.

[118] T.J. Pollard, A.E.W. Johnson, J.D. Raffa, L.A. Celi, R.G. Mark, O. Badawi, The

eICU Collaborative Research Database, a freely available multi-center database

for critical care research, Sci. Data 5 (1) (2018) 180178, http://dx.doi.org/10.

1038/sdata.2018.178, URL https://www.nature.com/articles/sdata2018178.

[119] A.E.W. Johnson, L. Bulgarelli, L. Shen, A. Gayles, A. Shammout, S. Horng,

T.J. Pollard, S. Hao, B. Moody, B. Gow, L.-w.H. Lehman, L.A. Celi, R.G.

Mark, MIMIC-IV, a freely accessible electronic health record dataset, Sci. Data

10 (1) (2023) 1, http://dx.doi.org/10.1038/s41597-022-01899-x, URL https:

//www.nature.com/articles/s41597-022-01899-x.

[120] A. Khan, E.B. Mobaraki, Interpretability methods for graph neural networks,

2023.

[121] L. van der Maaten, G. Hinton, Visualizing data using t-SNE, J. Mach. Learn. Res.

9 (86) (2008) 2579–2605, URL http://jmlr.org/papers/v9/vandermaaten08a.

html.

[122] X. Liu, H. Wang, T. He, Y. Liao, C. Jian, Recent advances in representation

learning for electronic health records: A systematic review, J. Phys. Conf.

Ser. 2188 (1) (2022) 012007, http://dx.doi.org/10.1088/1742-6596/2188/

1/012007, URL https://iopscience.iop.org/article/10.1088/1742-6596/2188/1/

012007.

[123] C. Yin, R. Zhao, B. Qian, X. Lv, P. Zhang, Domain knowledge guided deep

learning with electronic health records, in: 2019 IEEE International Conference

on Data Mining (ICDM), IEEE, 2019, pp. 738–747, http://dx.doi.org/10.1109/

ICDM.2019.00084, URL https://ieeexplore.ieee.org/document/8970777/.

[124] P. Ernst, A. Siu, G. Weikum, KnowLife: a versatile approach for constructing a

large knowledge graph for biomedical sciences, BMC Bioinform. 16 (1) (2015)

157, http://dx.doi.org/10.1186/s12859-015-0549-5.

[125] HCUP-US tools & software page, 2023, URL https://hcup-us.ahrq.gov/

toolssoftware/ccs/ccsfactsheet.jsp.

[126] CMeKG(Chinese medical knowledge graph) Dataset_Tianchi datasets, 2020, URL

https://tianchi.aliyun.com/dataset/81506.

[127] ICD - ICD-9-CM - international classification of diseases, ninth revision, clinical

modification, 2021, URL https://www.cdc.gov/nchs/icd/icd9cm.htm.

[128] S.M. Kazemi, R. Goel, K. Jain, I. Kobyzev, A. Sethi, P. Forsyth, P. Poupart,

Representation learning for dynamic graphs: A survey, 2019.

[129] Q. Yuan, J. Chen, C. Lu, H. Huang, The graph-based mutual attentive network

for automatic diagnosis, in: Proceedings of the Twenty-Ninth International

Joint Conference on Artificial Intelligence, International Joint Conferences on

Artificial Intelligence Organization, 2020, pp. 3393–3399, http://dx.doi.org/10.

24963/ijcai.2020/469, URL https://www.ijcai.org/proceedings/2020/469.

[130] J.E. Rudy, Y. Khan, J.K. Bower, S. Patel, R.E. Foraker, Cardiovascular

health trends in electronic health record data (2012–2015): A Cross-Sectional

Analysis of The Guideline Advantage™, eGEMs 7 (1) (2019) 30, http://dx.

doi.org/10.5334/egems.268, URL https://www.ncbi.nlm.nih.gov/pmc/articles/

PMC6646939/.

[131] R. Guidotti, A. Monreale, S. Ruggieri, F. Turini, F. Giannotti, D. Pedreschi, A

survey of methods for explaining black box models, ACM Comput. Surv. 51 (5)

(2019) 1–42, http://dx.doi.org/10.1145/3236009, URL https://dl.acm.org/doi/

10.1145/3236009.

[132] C. Agarwal, O. Queen, H. Lakkaraju, M. Zitnik, Evaluating explainability

for graph neural networks, Sci. Data 10 (1) (2023) 144, http://dx.doi.org/

10.1038/s41597-023-01974-x, URL https://www.nature.com/articles/s41597-

023-01974-x.

[133] R. Ying, D. Bourgeois, J. You, M. Zitnik, J. Leskovec, GNNExplainer: Generating

explanations for graph neural networks, 2019, arXiv:1903.03894 [cs, stat] URL

http://arxiv.org/abs/1903.03894.

[134] H. Xuanyuan, P. Barbiero, D. Georgiev, L.C. Magister, P. Lió, Global concept-

based interpretability for graph neural networks via neuron analysis, 2023,

arXiv:2208.10609 [cs], URL http://arxiv.org/abs/2208.10609.

[135] H. Yuan, H. Yu, S. Gui, S. Ji, Explainability in graph neural networks: A

taxonomic survey, 2022, arXiv:2012.15445 [cs] URL http://arxiv.org/abs/2012.

15445.

[136] M. Jin, H.Y. Koh, Q. Wen, D. Zambon, C. Alippi, G.I. Webb, I. King, S. Pan, A

survey on graph neural networks for time series: Forecasting, classification,

imputation, and anomaly detection, 2023, http://dx.doi.org/10.48550/arXiv.

2307.03759, URL http://arxiv.org/abs/2307.03759 arXiv:2307.03759 [cs].
