Jared Dunnmon's research while affiliated with Stanford University and other places

Publications (43)

Article
Reduction in 30-day readmission rate is an important quality factor for hospitals as it can reduce the overall cost of care and improve patient post-discharge outcomes. While deep-learning-based studies have shown promising empirical results, several limitations exist in prior models for hospital readmission prediction, such as: (a) only patients w...
Preprint
Full-text available
Multivariate signals are prevalent in various domains, such as healthcare, transportation systems, and space sciences. Modeling spatiotemporal dependencies in multivariate signals is challenging due to (1) long-range temporal dependencies and (2) complex spatial correlations between sensors. To address these challenges, we propose representing mult...
Article
Full-text available
Several machine learning algorithms have demonstrated high predictive capability in the identification of cancer within digitized pathology slides. The Augmented Reality Microscope (ARM) has allowed these algorithms to be seamlessly integrated within the pathology workflow by overlaying their inferences onto its microscopic field of view in real ti...
Preprint
Full-text available
Several machine learning algorithms have demonstrated high predictive capability in the identification of cancer within digitized pathology slides. The Augmented Reality Microscope (ARM) has allowed these algorithms to be seamlessly integrated within the current pathology workflow by overlaying their inferences onto its microscopic field of view in...
Preprint
Full-text available
Measures to predict 30-day readmission are considered an important quality factor for hospitals as accurate predictions can reduce the overall cost of care by identifying high risk patients before they are discharged. While recent deep learning-based studies have shown promising empirical results on readmission prediction, several limitations exist...
Preprint
Full-text available
Machine learning models that achieve high overall accuracy often make systematic errors on important subsets (or slices) of data. Identifying underperforming slices is particularly challenging when working with high-dimensional inputs (e.g. images, audio), where important slices are often unlabeled. In order to address this issue, recent studies ha...
Article
Although recent scientific studies suggest that artificial intelligence (AI) could provide value in many radiology applications, much of the hard engineering work required to consistently realize this value in practice remains to be done. In this article, we summarize the various ways in which AI can benefit radiology practice, identify key challen...
Chapter
Deep learning models have demonstrated favorable performance on many medical image classification tasks. However, they rely on expensive hand-labeled datasets that are time-consuming to create. In this work, we explore a new supervision source to training deep learning models by using gaze data that is passively and cheaply collected during a clini...
Preprint
FDG PET/CT imaging is a resource-intensive examination critical for managing malignant disease and is particularly important for longitudinal assessment during therapy. Approaches to automate longitudinal analysis present many challenges including lack of available longitudinal datasets, managing complex large multimodal imaging examinations, and ne...
Article
Purpose: To develop a convolutional neural network (CNN) to triage head CT (HCT) studies and investigate the effect of upstream medical image processing on the CNN's performance. Materials and methods: A total of 9776 HCT studies were retrospectively collected from 2001 through 2014, and a CNN was trained to triage them as normal or abnormal. CN...
Article
Full-text available
The reliability of machine learning models can be compromised when trained on low quality data. Many large-scale medical imaging datasets contain low quality labels extracted from sources such as medical reports. Moreover, images within a dataset may have heterogeneous quality due to artifacts and biases arising from equipment or measurement errors...
Preprint
Full-text available
Automated seizure detection and classification from electroencephalography (EEG) can greatly improve the diagnosis and treatment of seizures. While prior studies mainly used convolutional neural networks (CNNs) that assume image-like structure in EEG signals or spectrograms, this modeling choice does not reflect the natural geometry of or connectiv...
Article
Full-text available
Computational decision support systems could provide clinical value in whole-body FDG-PET/CT workflows. However, limited availability of labeled data combined with the large size of PET/CT imaging exams make it challenging to apply existing supervised machine learning systems. Leveraging recent advancements in natural language processing, we descri...
Article
Full-text available
Pulmonary embolism (PE) is a life-threatening clinical problem and computed tomography pulmonary angiography (CTPA) is the gold standard for diagnosis. Prompt diagnosis and immediate treatment are critical to avoid high morbidity and mortality rates, yet PE remains among the diagnoses most frequently missed or delayed. In this study, we developed a...
Article
Purpose To compare machine learning methods for classifying mass lesions on mammography images that use predefined image features computed over lesion segmentations to those that leverage segmentation-free representation learning on a standard, public evaluation dataset. Methods We apply several classification algorithms to the public Curated Brea...
Preprint
Full-text available
This work describes multiple weak supervision strategies for video processing with neural networks in the context of epilepsy. To study seizure onset, researchers have designed automated methods to detect seizures from electroencephalography (EEG), a modality used for recording electrical brain activity. However, the EEG signal alone is sometimes n...
Preprint
In real-world classification tasks, each class often comprises multiple finer-grained "subclasses." As the subclass labels are frequently unavailable, models trained using only the coarser-grained class labels often exhibit highly variable performance across different subclasses. This phenomenon, known as hidden stratification, has important conseq...
Preprint
Full-text available
The reliability of machine learning models can be compromised when trained on low quality data. Many large-scale medical imaging datasets contain low quality labels extracted from sources such as medical reports. Moreover, images within a dataset may have heterogeneous quality due to artifacts and biases arising from equipment or measurement errors...
Preprint
Full-text available
A popular way to estimate the causal effect of a variable x on y from observational data is to use an instrumental variable (IV): a third variable z that affects y only through x. The more strongly z is associated with x, the more reliable the estimate is, but such strong IVs are difficult to find. Instead, practitioners combine more commonly avail...
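The instrumental-variable estimate described in this abstract is usually computed via two-stage least squares (2SLS); the following is the standard textbook formulation for a single instrument and a single regressor, not notation taken from the preprint itself:

```latex
% First stage: regress x on the instrument z to isolate exogenous variation
\hat{x} = z\hat{\gamma}, \qquad \hat{\gamma} = (z^\top z)^{-1} z^\top x
% Second stage: regress y on the fitted values; for one instrument this
% reduces to the classical IV estimator
\hat{\beta}_{\mathrm{IV}} = (\hat{x}^\top \hat{x})^{-1}\hat{x}^\top y = (z^\top x)^{-1} z^\top y
```

The "strength" of the instrument referred to above corresponds to the magnitude of the first-stage association between z and x: when z^⊤x is small, the estimator's variance blows up, which is why strong IVs are prized but hard to find.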
Conference Paper
Machine learning models for medical image analysis often suffer from poor performance on important subsets of a population that are not identified during training or testing. For example, overall performance of a cancer detection model may be high, but the model may still consistently miss a rare but aggressive cancer subtype. We refer to this prob...
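The evaluation idea behind this abstract, surfacing subclasses on which an otherwise accurate model fails, can be sketched with a per-subclass accuracy breakdown. This is a minimal illustration assuming subclass labels are available at evaluation time; the function name is illustrative, not from the paper's code:

```python
# Sketch: measure per-subclass performance to expose hidden stratification.
# Overall accuracy below is 75%, but the "rare" subclass is only 50%.
def subclass_accuracy(y_true, y_pred, subclasses):
    """Return accuracy computed separately within each subclass."""
    groups = {}
    for t, p, s in zip(y_true, y_pred, subclasses):
        correct, total = groups.get(s, (0, 0))
        groups[s] = (correct + (t == p), total + 1)
    return {s: c / n for s, (c, n) in groups.items()}

acc = subclass_accuracy(
    y_true=[1, 1, 0, 0],
    y_pred=[1, 0, 0, 0],
    subclasses=["rare", "rare", "common", "common"],
)
print(acc)  # -> {'rare': 0.5, 'common': 1.0}
```

A breakdown like this only detects the problem when subclass labels exist; the works listed here address the harder setting where they do not.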
Article
Full-text available
A major bottleneck in developing clinically impactful machine learning models is a lack of labeled training data for model supervision. Thus, medical researchers increasingly turn to weaker, noisier sources of supervision, such as leveraging extractions from unstructured text reports to supervise image classification. A key challenge in weak superv...
Article
Full-text available
Automated seizure detection from electroencephalography (EEG) would improve the quality of patient care while reducing medical costs, but achieving reliably high performance across patients has proven difficult. Convolutional Neural Networks (CNNs) show promise in addressing this problem, but they are limited by a lack of large labeled training dat...
Preprint
Automated medical image classification with convolutional neural networks (CNNs) has great potential to impact healthcare, particularly in resource-constrained healthcare systems where fewer trained radiologists are available. However, little is known about how well a trained CNN can perform on images with the increased noise levels, different acqu...
Article
Full-text available
Musculoskeletal disorders are a major healthcare challenge around the world. We investigate the utility of convolutional neural networks (CNNs) in performing generalized abnormality detection on lower extremity radiographs. We also explore the effect of pretraining, dataset size and model architecture on model performance to provide recommendations...
Chapter
Recent deep learning models for intracranial hemorrhage (ICH) detection on computed tomography of the head have relied upon large datasets hand-labeled at either the full-scan level or at the individual slice-level. Though these models have demonstrated favorable empirical performance, the hand-labeled datasets upon which they rely are time-consumi...
Preprint
Machine learning models for medical image analysis often suffer from poor performance on important subsets of a population that are not identified during training or testing. For example, overall performance of a cancer detection model may be high, but the model still consistently misses a rare but aggressive cancer subtype. We refer to this proble...
Preprint
Full-text available
Mendelian Randomization (MR) is an important causal inference method primarily used in biomedical research. This work applies contemporary techniques in machine learning to improve the robustness and power of traditional MR tools. By denoising and combining candidate genetic variants through techniques from unsupervised probabilistic graphical mode...
Article
As machine learning models continue to increase in complexity, collecting large hand-labeled training sets has become one of the biggest roadblocks in practice. Instead, weaker forms of supervision that provide noisier but cheaper labels are often used. However, these weak supervision sources have diverse and unknown accuracies, may output correlat...
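The simplest way to combine the noisy, possibly-abstaining supervision sources this abstract describes is an unweighted majority vote, which is the baseline that learned label models improve upon. A minimal sketch with illustrative names (not any specific library's API):

```python
# Combine noisy labels from several weak supervision sources by majority
# vote, ignoring abstentions. Learned label models instead estimate each
# source's accuracy and correlations, and weight votes accordingly.
from collections import Counter

ABSTAIN = None

def majority_vote(votes):
    """Return the most common non-abstaining label, or ABSTAIN if none."""
    counted = Counter(v for v in votes if v is not ABSTAIN)
    if not counted:
        return ABSTAIN  # every source abstained on this example
    return counted.most_common(1)[0][0]

# Three hypothetical labeling sources voting on one example:
print(majority_vote([1, 1, ABSTAIN]))  # -> 1
```

Majority vote treats every source as equally accurate and independent; the unknown-accuracy, correlated-outputs setting in the abstract is precisely where that assumption breaks down.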
Article
Full-text available
Biomedical repositories such as the UK Biobank provide increasing access to prospectively collected cardiac imaging, however these data are unlabeled, which creates barriers to their use in supervised machine learning. We develop a weakly supervised deep learning model for classification of aortic valve malformations using up to 4,000 unlabeled car...
Preprint
Full-text available
Labeling training datasets has become a key barrier to building medical machine learning models. One strategy is to generate training labels programmatically, for example by applying natural language processing pipelines to text reports associated with imaging studies. We propose cross-modal data programming, which generalizes this intuitive strate...
Preprint
Full-text available
The ability to obtain accurate food security metrics in developing areas where relevant data can be sparse is critically important for policy makers tasked with implementing food aid programs. As a result, a great deal of work has been dedicated to predicting important food security metrics such as annual crop yields using a variety of methods incl...
Preprint
Full-text available
Obtaining reliable data describing local Food Security Metrics (FSM) at a granularity that is informative to policy-makers requires expensive and logistically difficult surveys, particularly in the developing world. We train a CNN on publicly available satellite data describing land cover classification and use both transfer learning and direct tra...
Article
Purpose To assess the ability of convolutional neural networks (CNNs) to enable high-performance automated binary classification of chest radiographs. Materials and Methods In a retrospective study, 216 431 frontal chest radiographs obtained between 1998 and 2012 were procured, along with associated text reports and a prospective label from the att...
Preprint
As machine learning models continue to increase in complexity, collecting large hand-labeled training sets has become one of the biggest roadblocks in practice. Instead, weaker forms of supervision that provide noisier but cheaper labels are often used. However, these weak supervision sources have diverse and unknown accuracies, may output correlat...
Conference Paper
Many real-world machine learning problems are challenging to tackle for two reasons: (i) they involve multiple sub-tasks at different levels of granularity; and (ii) they require large volumes of labeled training data. We propose Snorkel MeTaL, an end-to-end system for multi-task learning that leverages weak supervision provided at multiple levels...
Preprint
Full-text available
Recent releases of population-scale biomedical repositories such as the UK Biobank have enabled unprecedented access to prospectively collected medical imaging data. Applying machine learning methods to analyze these data holds great promise in facilitating new insights into the genetic and epidemiological associations between anatomical structures...
Article
Data augmentation is a ubiquitous technique for increasing the size of labeled training sets by leveraging task-specific data transformations that preserve class labels. While it is often easy for domain experts to specify individual transformations, constructing and tuning the more sophisticated compositions typically needed to achieve state-of-th...
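The compositions of label-preserving transformations discussed in this abstract can be illustrated with a tiny pipeline combinator; the transforms below are toy stand-ins on a 1-D list, not the paper's learned augmentation policies:

```python
# Sketch: compose label-preserving data transformations into one pipeline.
# Real augmentation operates on images/tensors; a list stands in here.
def compose(transforms):
    """Chain a sequence of transformations into a single callable."""
    def pipeline(x):
        for t in transforms:
            x = t(x)
        return x
    return pipeline

flip = lambda x: x[::-1]          # hypothetical "horizontal flip"
shift = lambda x: x[1:] + x[:1]   # hypothetical circular shift

augment = compose([flip, shift])
print(augment([1, 2, 3, 4]))  # -> [3, 2, 1, 4]
```

The hard part the abstract points at is not applying individual transforms but choosing and tuning which compositions to apply, which is what makes hand-construction costly.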
Article
Breast cancer has the highest incidence and second highest mortality rate for women in the US. Our study aims to utilize deep learning for benign/malignant classification of mammogram tumors using a subset of cases from the Digital Database for Screening Mammography (DDSM). Though the dataset was small by deep learning standards (about 1000 pat...
Article
Full-text available
The ability to obtain accurate food security metrics in developing areas where relevant data can be sparse is critically important for policy makers tasked with implementing food aid programs. As a result, a great deal of work has been dedicated to predicting important food security metrics such as annual crop yields using a variety of methods incl...
Article
Full-text available
Obtaining reliable data describing local Food Security Metrics (FSM) at a granularity that is informative to policy-makers requires expensive and logistically difficult surveys, particularly in the developing world. We train a CNN on publicly available satellite data describing land cover classification and use both transfer learning and direct tra...

Citations

... Multimodal learning for healthcare. In the quest for a thorough comprehension of patient patterns to enhance the precision of clinical event prediction, researchers have delved into the realm of multimodal learning utilizing healthcare data [4,19,23,31,34,38,39,41]. HAIM [28] leverages different pre-trained feature extraction models to process multimodal inputs and obtains the overall representation of the patient. ...
... If a similar study as the one presented here is conducted five years from now using the same tool and methodology, the tool will identify new frequent author-defined keywords, and the experts will adapt the taxonomy accordingly. For instance, if the recent trends in multi-modal machine learning [62,63] turn out to be important five years from now, this will be reflected in the author-defined keywords found by the tool and, as a consequence, be reflected in future versions of the expert-defined taxonomy. In this way, the tool and methodology will identify emerging topics. ...
... However, it is essential to recognize that AI solutions are tools, not magic. They come with limitations and pitfalls, including overfitting, model drift, and automation bias [3,4]. End-users must be well-acquainted with these aspects. ...
... Prior knowledge about the disease region can also be derived from radiologists' gaze, which indicates their visual cognitive behavior during diagnosis [26,7]. The eye gaze data obtained by eye trackers indicate the specific regions of interest for radiologists, which are also related to the disease and contain task-relevant knowledge [19,1]. Moreover, they can be collected passively and cheaply during image reading, without incurring extra costs. ...
... Data quality issues (e.g., outliers and missing values) have a significant impact on the validity of ML pipelines because they can easily corrupt ML models and ultimately result in inaccurate analytics (Schelter et al. 2021). Many previous studies have shown that a lack of proper upstream handling of data quality issues can have a devastating effect on downstream ML applications by reducing their efficiency, accuracy, and robustness (Jäger et al. 2021; Hooper et al. 2021; Gupta et al. 2021). Therefore, data validation/cleansing, as one of the main upstream tasks, needs to be performed to tackle data quality issues and ensure the validity, applicability, and generalizability of an ML pipeline (Nguyen et al. 2021). ...
... The Shapley value [12,13] measures the weighted average utility change when adding a point to all possible training subsets, making it a primary tool for assessing the valuation of individual samples [20]. Shapley-value based methods have found extensive application in various domains, including variable selection [21,22], feature importance [23,24,25], model valuation [26], health care [27,28], federated learning [29], collaborative learning [30], data debugging [31], and distribution analysis [32,33]. Building upon this concept, Beta Shapley [34] and Banzhaf value [35] are developed by relaxing the efficiency axiom of the Shapley value. ...
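For reference, the Shapley value this excerpt describes assigns point i its marginal contribution to the utility v, averaged over all subsets S of the other n−1 training points; this is the standard game-theoretic definition:

```latex
\varphi_i = \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,(n-|S|-1)!}{n!}
  \left[\, v(S \cup \{i\}) - v(S) \,\right]
```

The combinatorial weight averages uniformly over orderings, which is what the "weighted average utility change when adding a point to all possible training subsets" phrasing above refers to; Beta Shapley and the Banzhaf value replace this weighting scheme.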
... This improvement may be due to the accuracy of the information extraction schema. A previous study on positron-emission tomography/CT applied information extraction algorithms to create labeled datasets from radiology reports to train an anomaly-detection model, 26 and achieved a maximum F1 score of 0·888 in label creation. Our schema yielded higher F1 scores across all organs, suggesting that the automatic extraction of information from reports contributed to improved abnormality detection. ...
... The performance comparison of the proposed framework on both the CBIS-DDSM and MIAS databases with other existing frameworks (CNN-based and texture-feature-based) is shown in Table IV. For the CBIS-DDSM database, the proposed framework has given better performance than the CNN-based approaches [4], [40], [14], [42], [43], [46]. In [40], the authors used a bag of visual words with a CNN for both segmentation-free and segmentation-dependent classification, and the segmentation-free method showed better performance. ...
... These improvements suggest that our transfer learning method yields better extracted features. In addition, we also test our transfer learning method on 3D-ResNet18 and PENet (54). Table 2 shows that our transfer learning method can improve ACC by 1.8% and 0.7%. ...
... Nevertheless, a study by Saab et al. 5 demonstrates that the expansive volume of data accessible through workflow notes can compensate for these inaccuracies, facilitating the development of highly proficient EEG ML models. This underscores the statistical principle that, at times, leveraging a larger dataset with inherent noise can be more advantageous in modeling than utilizing a smaller, meticulously hand-labeled dataset, due to the diversity and the variety it offers 23 . In the ensuing experiment, we further validate the assertion that scaling training data with workflow notes greatly benefits the performance of ML models for detecting seizure onset. ...