Jared Dunnmon's research while affiliated with Stanford University and other places

Publications (43)

Article
Reduction in 30-day readmission rate is an important quality factor for hospitals as it can reduce the overall cost of care and improve patient post-discharge outcomes. While deep-learning-based studies have shown promising empirical results, several limitations exist in prior models for hospital readmission prediction, such as: (a) only patients w...
Preprint
Full-text available
Multivariate signals are prevalent in various domains, such as healthcare, transportation systems, and space sciences. Modeling spatiotemporal dependencies in multivariate signals is challenging due to (1) long-range temporal dependencies and (2) complex spatial correlations between sensors. To address these challenges, we propose representing mult...
Article
Full-text available
Several machine learning algorithms have demonstrated high predictive capability in the identification of cancer within digitized pathology slides. The Augmented Reality Microscope (ARM) has allowed these algorithms to be seamlessly integrated within the pathology workflow by overlaying their inferences onto its microscopic field of view in real ti...
Preprint
Full-text available
Several machine learning algorithms have demonstrated high predictive capability in the identification of cancer within digitized pathology slides. The Augmented Reality Microscope (ARM) has allowed these algorithms to be seamlessly integrated within the current pathology workflow by overlaying their inferences onto its microscopic field of view in...
Preprint
Full-text available
Measures to predict 30-day readmission are considered an important quality factor for hospitals as accurate predictions can reduce the overall cost of care by identifying high risk patients before they are discharged. While recent deep learning-based studies have shown promising empirical results on readmission prediction, several limitations exist...
Preprint
Full-text available
Machine learning models that achieve high overall accuracy often make systematic errors on important subsets (or slices) of data. Identifying underperforming slices is particularly challenging when working with high-dimensional inputs (e.g. images, audio), where important slices are often unlabeled. In order to address this issue, recent studies ha...
Article
Although recent scientific studies suggest that artificial intelligence (AI) could provide value in many radiology applications, much of the hard engineering work required to consistently realize this value in practice remains to be done. In this article, we summarize the various ways in which AI can benefit radiology practice, identify key challen...
Chapter
Deep learning models have demonstrated favorable performance on many medical image classification tasks. However, they rely on expensive hand-labeled datasets that are time-consuming to create. In this work, we explore a new supervision source to training deep learning models by using gaze data that is passively and cheaply collected during a clini...
Preprint
FDG PET/CT imaging is a resource-intensive examination critical for managing malignant disease and is particularly important for longitudinal assessment during therapy. Approaches to automate longitudinal analysis present many challenges including lack of available longitudinal datasets, managing complex large multimodal imaging examinations, and ne...
Article
Purpose: To develop a convolutional neural network (CNN) to triage head CT (HCT) studies and investigate the effect of upstream medical image processing on the CNN's performance. Materials and methods: A total of 9776 HCT studies were retrospectively collected from 2001 through 2014, and a CNN was trained to triage them as normal or abnormal. CN...
Article
Full-text available
The reliability of machine learning models can be compromised when trained on low quality data. Many large-scale medical imaging datasets contain low quality labels extracted from sources such as medical reports. Moreover, images within a dataset may have heterogeneous quality due to artifacts and biases arising from equipment or measurement errors...
Preprint
Full-text available
Automated seizure detection and classification from electroencephalography (EEG) can greatly improve the diagnosis and treatment of seizures. While prior studies mainly used convolutional neural networks (CNNs) that assume image-like structure in EEG signals or spectrograms, this modeling choice does not reflect the natural geometry of or connectiv...
Article
Full-text available
Computational decision support systems could provide clinical value in whole-body FDG-PET/CT workflows. However, limited availability of labeled data combined with the large size of PET/CT imaging exams make it challenging to apply existing supervised machine learning systems. Leveraging recent advancements in natural language processing, we descri...
Article
Full-text available
Pulmonary embolism (PE) is a life-threatening clinical problem and computed tomography pulmonary angiography (CTPA) is the gold standard for diagnosis. Prompt diagnosis and immediate treatment are critical to avoid high morbidity and mortality rates, yet PE remains among the diagnoses most frequently missed or delayed. In this study, we developed a...
Article
Purpose To compare machine learning methods for classifying mass lesions on mammography images that use predefined image features computed over lesion segmentations to those that leverage segmentation-free representation learning on a standard, public evaluation dataset. Methods We apply several classification algorithms to the public Curated Brea...
Preprint
Full-text available
This work describes multiple weak supervision strategies for video processing with neural networks in the context of epilepsy. To study seizure onset, researchers have designed automated methods to detect seizures from electroencephalography (EEG), a modality used for recording electrical brain activity. However, the EEG signal alone is sometimes n...
Preprint
In real-world classification tasks, each class often comprises multiple finer-grained "subclasses." As the subclass labels are frequently unavailable, models trained using only the coarser-grained class labels often exhibit highly variable performance across different subclasses. This phenomenon, known as hidden stratification, has important conseq...
Preprint
Full-text available
The reliability of machine learning models can be compromised when trained on low quality data. Many large-scale medical imaging datasets contain low quality labels extracted from sources such as medical reports. Moreover, images within a dataset may have heterogeneous quality due to artifacts and biases arising from equipment or measurement errors...
Preprint
Full-text available
A popular way to estimate the causal effect of a variable x on y from observational data is to use an instrumental variable (IV): a third variable z that affects y only through x. The more strongly z is associated with x, the more reliable the estimate is, but such strong IVs are difficult to find. Instead, practitioners combine more commonly avail...
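The instrumental-variable estimate described in this abstract is usually computed via two-stage least squares (2SLS); the following is the standard textbook formulation for a single instrument and a single regressor, not notation taken from the preprint itself:

```latex
% First stage: regress x on the instrument z to isolate exogenous variation
\hat{x} = z\hat{\gamma}, \qquad \hat{\gamma} = (z^\top z)^{-1} z^\top x
% Second stage: regress y on the fitted values; for one instrument this
% reduces to the classical IV estimator
\hat{\beta}_{\mathrm{IV}} = (\hat{x}^\top \hat{x})^{-1}\hat{x}^\top y = (z^\top x)^{-1} z^\top y
```

The "strength" of the instrument referred to above corresponds to the magnitude of the first-stage association between z and x: when z^⊤x is small, the estimator's variance blows up, which is why strong IVs are prized but hard to find.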
Conference Paper
Machine learning models for medical image analysis often suffer from poor performance on important subsets of a population that are not identified during training or testing. For example, overall performance of a cancer detection model may be high, but the model may still consistently miss a rare but aggressive cancer subtype. We refer to this prob...
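The evaluation idea behind this abstract, surfacing subclasses on which an otherwise accurate model fails, can be sketched with a per-subclass accuracy breakdown. This is a minimal illustration assuming subclass labels are available at evaluation time; the function name is illustrative, not from the paper's code:

```python
# Sketch: measure per-subclass performance to expose hidden stratification.
# Overall accuracy below is 75%, but the "rare" subclass is only 50%.
def subclass_accuracy(y_true, y_pred, subclasses):
    """Return accuracy computed separately within each subclass."""
    groups = {}
    for t, p, s in zip(y_true, y_pred, subclasses):
        correct, total = groups.get(s, (0, 0))
        groups[s] = (correct + (t == p), total + 1)
    return {s: c / n for s, (c, n) in groups.items()}

acc = subclass_accuracy(
    y_true=[1, 1, 0, 0],
    y_pred=[1, 0, 0, 0],
    subclasses=["rare", "rare", "common", "common"],
)
print(acc)  # -> {'rare': 0.5, 'common': 1.0}
```

A breakdown like this only detects the problem when subclass labels exist; the works listed here address the harder setting where they do not.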
Article
Full-text available
A major bottleneck in developing clinically impactful machine learning models is a lack of labeled training data for model supervision. Thus, medical researchers increasingly turn to weaker, noisier sources of supervision, such as leveraging extractions from unstructured text reports to supervise image classification. A key challenge in weak superv...
Article
Full-text available
Automated seizure detection from electroencephalography (EEG) would improve the quality of patient care while reducing medical costs, but achieving reliably high performance across patients has proven difficult. Convolutional Neural Networks (CNNs) show promise in addressing this problem, but they are limited by a lack of large labeled training dat...
Preprint
Automated medical image classification with convolutional neural networks (CNNs) has great potential to impact healthcare, particularly in resource-constrained healthcare systems where fewer trained radiologists are available. However, little is known about how well a trained CNN can perform on images with the increased noise levels, different acqu...
Article
Full-text available
Musculoskeletal disorders are a major healthcare challenge around the world. We investigate the utility of convolutional neural networks (CNNs) in performing generalized abnormality detection on lower extremity radiographs. We also explore the effect of pretraining, dataset size and model architecture on model performance to provide recommendations...
Chapter
Recent deep learning models for intracranial hemorrhage (ICH) detection on computed tomography of the head have relied upon large datasets hand-labeled at either the full-scan level or at the individual slice-level. Though these models have demonstrated favorable empirical performance, the hand-labeled datasets upon which they rely are time-consumi...
Preprint
Machine learning models for medical image analysis often suffer from poor performance on important subsets of a population that are not identified during training or testing. For example, overall performance of a cancer detection model may be high, but the model still consistently misses a rare but aggressive cancer subtype. We refer to this proble...
Preprint
Full-text available
Mendelian Randomization (MR) is an important causal inference method primarily used in biomedical research. This work applies contemporary techniques in machine learning to improve the robustness and power of traditional MR tools. By denoising and combining candidate genetic variants through techniques from unsupervised probabilistic graphical mode...
Article
As machine learning models continue to increase in complexity, collecting large hand-labeled training sets has become one of the biggest roadblocks in practice. Instead, weaker forms of supervision that provide noisier but cheaper labels are often used. However, these weak supervision sources have diverse and unknown accuracies, may output correlat...
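The simplest way to combine the noisy, possibly-abstaining supervision sources this abstract describes is an unweighted majority vote, which is the baseline that learned label models improve upon. A minimal sketch with illustrative names (not any specific library's API):

```python
# Combine noisy labels from several weak supervision sources by majority
# vote, ignoring abstentions. Learned label models instead estimate each
# source's accuracy and correlations, and weight votes accordingly.
from collections import Counter

ABSTAIN = None

def majority_vote(votes):
    """Return the most common non-abstaining label, or ABSTAIN if none."""
    counted = Counter(v for v in votes if v is not ABSTAIN)
    if not counted:
        return ABSTAIN  # every source abstained on this example
    return counted.most_common(1)[0][0]

# Three hypothetical labeling sources voting on one example:
print(majority_vote([1, 1, ABSTAIN]))  # -> 1
```

Majority vote treats every source as equally accurate and independent; the unknown-accuracy, correlated-outputs setting in the abstract is precisely where that assumption breaks down.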
Article
Full-text available
Biomedical repositories such as the UK Biobank provide increasing access to prospectively collected cardiac imaging, however these data are unlabeled, which creates barriers to their use in supervised machine learning. We develop a weakly supervised deep learning model for classification of aortic valve malformations using up to 4,000 unlabeled car...
Preprint
Full-text available
Labeling training datasets has become a key barrier to building medical machine learning models. One strategy is to generate training labels programmatically, for example by applying natural language processing pipelines to text reports associated with imaging studies. We propose cross-modal data programming, which generalizes this intuitive strate...
Preprint
Full-text available
The ability to obtain accurate food security metrics in developing areas where relevant data can be sparse is critically important for policy makers tasked with implementing food aid programs. As a result, a great deal of work has been dedicated to predicting important food security metrics such as annual crop yields using a variety of methods incl...
Preprint
Full-text available
Obtaining reliable data describing local Food Security Metrics (FSM) at a granularity that is informative to policy-makers requires expensive and logistically difficult surveys, particularly in the developing world. We train a CNN on publicly available satellite data describing land cover classification and use both transfer learning and direct tra...
Article
Purpose To assess the ability of convolutional neural networks (CNNs) to enable high-performance automated binary classification of chest radiographs. Materials and Methods In a retrospective study, 216 431 frontal chest radiographs obtained between 1998 and 2012 were procured, along with associated text reports and a prospective label from the att...
Preprint
As machine learning models continue to increase in complexity, collecting large hand-labeled training sets has become one of the biggest roadblocks in practice. Instead, weaker forms of supervision that provide noisier but cheaper labels are often used. However, these weak supervision sources have diverse and unknown accuracies, may output correlat...
Conference Paper
Many real-world machine learning problems are challenging to tackle for two reasons: (i) they involve multiple sub-tasks at different levels of granularity; and (ii) they require large volumes of labeled training data. We propose Snorkel MeTaL, an end-to-end system for multi-task learning that leverages weak supervision provided at multiple levels...
Preprint
Full-text available
Recent releases of population-scale biomedical repositories such as the UK Biobank have enabled unprecedented access to prospectively collected medical imaging data. Applying machine learning methods to analyze these data holds great promise in facilitating new insights into the genetic and epidemiological associations between anatomical structures...
Article
Data augmentation is a ubiquitous technique for increasing the size of labeled training sets by leveraging task-specific data transformations that preserve class labels. While it is often easy for domain experts to specify individual transformations, constructing and tuning the more sophisticated compositions typically needed to achieve state-of-th...
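The compositions of label-preserving transformations discussed in this abstract can be illustrated with a tiny pipeline combinator; the transforms below are toy stand-ins on a 1-D list, not the paper's learned augmentation policies:

```python
# Sketch: compose label-preserving data transformations into one pipeline.
# Real augmentation operates on images/tensors; a list stands in here.
def compose(transforms):
    """Chain a sequence of transformations into a single callable."""
    def pipeline(x):
        for t in transforms:
            x = t(x)
        return x
    return pipeline

flip = lambda x: x[::-1]          # hypothetical "horizontal flip"
shift = lambda x: x[1:] + x[:1]   # hypothetical circular shift

augment = compose([flip, shift])
print(augment([1, 2, 3, 4]))  # -> [3, 2, 1, 4]
```

The hard part the abstract points at is not applying individual transforms but choosing and tuning which compositions to apply, which is what makes hand-construction costly.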
Article
Breast cancer has the highest incidence and second highest mortality rate for women in the US. Our study aims to utilize deep learning for benign/malignant classification of mammogram tumors using a subset of cases from the Digital Database for Screening Mammography (DDSM). Though the dataset was small by deep learning standards (about 1000 pat...
Article
Full-text available
The ability to obtain accurate food security metrics in developing areas where relevant data can be sparse is critically important for policy makers tasked with implementing food aid programs. As a result, a great deal of work has been dedicated to predicting important food security metrics such as annual crop yields using a variety of methods incl...
Article
Full-text available
Obtaining reliable data describing local Food Security Metrics (FSM) at a granularity that is informative to policy-makers requires expensive and logistically difficult surveys, particularly in the developing world. We train a CNN on publicly available satellite data describing land cover classification and use both transfer learning and direct tra...

Citations

... Multimodal learning for healthcare. In the quest for a thorough comprehension of patient patterns to enhance the precision of clinical event prediction, researchers have delved into the realm of multimodal learning utilizing healthcare data [4,19,23,31,34,38,39,41]. HAIM [28] leverages different pre-trained feature extraction models to process multimodal inputs and obtains the overall representation of the patient. ...
... If a similar study as the one presented here is conducted five years from now using the same tool and methodology, the tool will identify new frequent author-defined keywords, and the experts will adapt the taxonomy accordingly. For instance, if the recent trends in multi-modal machine learning [62,63] turn out to be important five years from now, this will be reflected in the author-defined keywords found by the tool and, as a consequence, be reflected in future versions of the expert-defined taxonomy. In this way, the tool and methodology will identify emerging topics. ...
... However, it is essential to recognize that AI solutions are tools, not magic. They come with limitations and pitfalls, including overfitting, model drift, and automation bias [3,4]. End-users must be well-acquainted with these aspects. ...
... Prior knowledge about the disease region can also be derived from radiologists' gaze, which indicates their visual cognitive behavior during diagnosis [26,7]. The eye gaze data obtained by eye trackers indicate the specific regions of interest for radiologists, which are also related to the disease and contain task-relevant knowledge [19,1]. Moreover, they can be collected passively and cheaply during image reading, without incurring extra costs. ...
... Data quality issues (e.g., outliers and missing values) have a significant impact on the validity of ML pipelines because they can easily corrupt ML models and ultimately result in inaccurate analytics (Schelter et al. 2021). Many previous studies have shown that a lack of proper upstream handling of data quality issues can have a devastating effect on downstream ML applications by reducing their efficiency, accuracy, and robustness (Jäger et al. 2021; Hooper et al. 2021; Gupta et al. 2021). Therefore, data validation/cleansing, as one of the main upstream tasks, needs to be performed to tackle data quality issues and ensure the validity, applicability, and generalizability of an ML pipeline (Nguyen et al. 2021). ...
... The Shapley value [12,13] measures the weighted average utility change when adding a point to all possible training subsets, making it a primary tool for assessing the valuation of individual samples [20]. Shapley-value based methods have found extensive application in various domains, including variable selection [21,22], feature importance [23,24,25], model valuation [26], health care [27,28], federated learning [29], collaborative learning [30], data debugging [31], and distribution analysis [32,33]. Building upon this concept, Beta Shapley [34] and Banzhaf value [35] are developed by relaxing the efficiency axiom of the Shapley value. ...
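For reference, the Shapley value this excerpt describes assigns point i its marginal contribution to the utility v, averaged over all subsets S of the other n−1 training points; this is the standard game-theoretic definition:

```latex
\varphi_i = \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,(n-|S|-1)!}{n!}
  \left[\, v(S \cup \{i\}) - v(S) \,\right]
```

The combinatorial weight averages uniformly over orderings, which is what the "weighted average utility change when adding a point to all possible training subsets" phrasing above refers to; Beta Shapley and the Banzhaf value replace this weighting scheme.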
... This improvement may be due to the accuracy of the information extraction schema. A previous study on positron-emission tomography/CT applied information extraction algorithms to create labeled datasets from radiology reports to train an anomaly-detection model, 26 and achieved a maximum F1 score of 0·888 in label creation. Our schema yielded higher F1 scores across all organs, suggesting that the automatic extraction of information from reports contributed to improved abnormality detection. ...
... The performance comparison of the proposed framework on both the CBIS-DDSM and MIAS databases with other existing frameworks (CNN-based and texture-feature-based) is shown in Table IV. For the CBIS-DDSM database, the proposed framework has given better performance than the CNN-based approaches [4], [40], [14], [42], [43], [46]. In [40], the authors used a bag of visual words with a CNN for both segmentation-free and segmentation-dependent classification, and the segmentation-free method showed better performance. ...
... These improvements suggest that our transfer learning method yields better extracted features. In addition, we also test our transfer learning method on 3D-ResNet18 and PENet (54). Table 2 shows that our transfer learning method can improve ACC by 1.8% and 0.7%. ...
... Nevertheless, a study by Saab et al. 5 demonstrates that the expansive volume of data accessible through workflow notes can compensate for these inaccuracies, facilitating the development of highly proficient EEG ML models. This underscores the statistical principle that, at times, leveraging a larger dataset with inherent noise can be more advantageous in modeling than utilizing a smaller, meticulously hand-labeled dataset, due to the diversity and the variety it offers 23 . In the ensuing experiment, we further validate the assertion that scaling training data with workflow notes greatly benefits the performance of ML models for detecting seizure onset. ...