Figure 2 - uploaded by Soheila Talebi
a) Distribution of patients' length of stay in the hospital; b) Left: test and training loss versus the number of epochs. Right: average precision score and area under the receiver operating characteristic curve (AUROC).

Source publication
Article
Incorporating repeated measurements of vitals and laboratory values can improve mortality risk prediction and identify key risk factors in the individualized treatment of hospitalized COVID-19 patients. In this observational study, demographic and laboratory data of all patients admitted to 5 hospitals of the Mount Sinai Health System, New York, with...

Contexts in source publication

Context 1
... median age is 67 years, and 57% are male, with median BMI of 27.7. Figure 2a provides the distribution of patients' hospital length of stay. The median length of stay is 5 days. ...
Context 2
... Optimization: To avoid overfitting and optimize the number of epochs (passes through the data), we plot the test and training loss of our next-day mortality model, fitted on the entire length of patients' stay, against the number of epochs in Figure 2b. The model converges quickly, after 4-10 epochs, which is approximately when the average precision score is maximized. Therefore, to avoid overfitting, we use only 10 epochs in our next-day mortality model. ...
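The epoch-selection reasoning in Context 2 can be illustrated with a minimal sketch. The source gives no code, so everything here is an assumption: the function name `best_epoch` is hypothetical, the tie-breaking rule (lowest test loss) is our own choice, and the metric values are made up for illustration and are not results from the publication.

```python
from typing import Sequence


def best_epoch(avg_precision: Sequence[float], test_loss: Sequence[float]) -> int:
    """Return the 1-based epoch with the highest average precision.

    Ties are broken by the lowest test loss (an assumed rule, not from the source).
    """
    if not avg_precision or len(avg_precision) != len(test_loss):
        raise ValueError("metric histories must be non-empty and of equal length")
    # Compare epochs by (average precision, negated test loss).
    idx = max(
        range(len(avg_precision)),
        key=lambda i: (avg_precision[i], -test_loss[i]),
    )
    return idx + 1


# Made-up per-epoch histories, for illustration only.
ap_history = [0.30, 0.42, 0.51, 0.55, 0.56, 0.56, 0.55, 0.54]
loss_history = [0.70, 0.55, 0.47, 0.44, 0.43, 0.44, 0.46, 0.49]

chosen = best_epoch(ap_history, loss_history)
print(f"Train for {chosen} epochs")
```

In practice the same idea is often applied by inspecting the plotted curves, as the authors describe, rather than by an automatic rule.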

Citations

... observations) would increase the resources required due to the increased length of the vector. In addition, there are a variety of models that either combine the BERT architecture with other machine learning models (Shang et al., 2019; Poulain et al., 2022; Li et al., 2021) or focus exclusively on specific use cases (Azhir et al., 2022; Prakash et al., 2021; Rao et al., 2022). All the aforementioned approaches either lack generalizability to different domains due to specific pre-training (a key advantage of transfer learning using transformers), do not incorporate enough variety in patient information to generate informed decisions or are limited in the amount of data of a single patient they can process. ...
Preprint
In this study, we introduce ExBEHRT, an extended version of BEHRT (BERT applied to electronic health records), and apply different algorithms to interpret its results. While BEHRT considers only diagnoses and patient age, we extend the feature space to several multimodal records, namely demographics, clinical characteristics, vital signs, smoking status, diagnoses, procedures, medications, and laboratory tests, by applying a novel method to unify the frequencies and temporal dimensions of the different features. We show that additional features significantly improve model performance for various downstream tasks in different diseases. To ensure robustness, we interpret model predictions using an adaptation of expected gradients, which has not been previously applied to transformers with EHR data and provides more granular interpretations than previous approaches such as feature and token importances. Furthermore, by clustering the model representations of oncology patients, we show that the model has an implicit understanding of the disease and is able to classify patients with the same cancer type into different risk groups. Given the additional features and interpretability, ExBEHRT can help make informed decisions about disease trajectories, diagnoses, and risk factors of various diseases.
Chapter
In this study, we introduce ExBEHRT, an extended version of BEHRT (BERT applied to electronic health record data), and apply various algorithms to interpret its results. While BEHRT only considers diagnoses and patient age, we extend the feature space to several multi-modal records, namely demographics, clinical characteristics, vital signs, smoking status, diagnoses, procedures, medications and lab tests, by applying a novel method to unify the frequencies and temporal dimensions of the different features. We show that additional features significantly improve model performance for various downstream tasks in different diseases. To ensure robustness, we interpret the model predictions using an adaptation of expected gradients, which has not been applied to transformers with EHR data so far and provides more granular interpretations than previous approaches such as feature and token importances. Furthermore, by clustering the model's representations of oncology patients, we show that the model has an implicit understanding of the disease and is able to classify patients with the same cancer type into different risk groups. Given the additional features and interpretability, ExBEHRT can help make informed decisions about disease progressions, diagnoses and risk factors of various diseases. Keywords: BERT, RWE, Patient Subtyping, Interpretability