Figure 6 - uploaded by Thilo Stadelmann
Example of our dataset showing an original scan of a newspaper page (left), the transformed representation (middle) and the manually created segmentation mask (right).

Source publication
Chapter
Full-text available
Deep learning (DL) methods have gained considerable attention since 2014. In this chapter we briefly review the state of the art in DL and then give several examples of applications from diverse areas. We will focus on convolutional neural networks (CNNs), which have since the seminal work of Krizhevsky et al. (ImageNet classificatio...

Context in source publication

Context 1
... is done by replacing illustrations with gray areas and by blackening lines of text (after OCR). In the end, our dataset contains 507 images with two channels each (plus a ground-truth segmentation mask, see Figure 6): the original scan, and the abovementioned transformation. We use approximately 85% of the dataset for training and hold out the remaining 15% for testing. ...
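The hold-out split described above can be sketched as follows; the 507-image count and the 85/15 ratio are from the text, while the array shapes and the random seed are illustrative assumptions.

```python
import numpy as np

# Illustrative stand-in for the 507 two-channel images described above
# (shapes are hypothetical; only the image count and 85/15 split are from the text).
rng = np.random.default_rng(0)
n_images = 507
images = rng.random((n_images, 2, 64, 64))        # (sample, channel, H, W)
masks = rng.integers(0, 2, (n_images, 64, 64))    # ground-truth segmentation masks

# Shuffle once, then hold out ~15% for testing.
perm = rng.permutation(n_images)
n_train = int(round(0.85 * n_images))
train_idx, test_idx = perm[:n_train], perm[n_train:]

X_train, X_test = images[train_idx], images[test_idx]
y_train, y_test = masks[train_idx], masks[test_idx]
print(len(X_train), len(X_test))  # 431 76
```

Shuffling before the split avoids any ordering bias in how the scans were collected.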

Citations

... Besides the huge energy consumption of these computational infrastructures (Benoit et al., 2018), their availability is also limited. Furthermore, parameter tuning, sensitivity analysis (Borgonovo and Plischke, 2016), and uncertainty quantification (Abbaszadeh Shahri et al., 2022; Soize, 2017) demand up to millions of simulations. ...
... Machine learning (ML) methods have gained significant traction across various fields (Brunton and Kutz, 2022; Stadelmann et al., 2019), including geothermal applications (Okoroafor et al., 2022). In this context, data-driven and physics-informed ML (physics-informed neural network, PINN) techniques are of great interest (Carleo et al., 2019; Raissi et al., 2019). ...
Article
Including uncertainty is essential for accurate decision-making in underground applications. We propose a novel approach to consider structural uncertainty in two enhanced geothermal systems (EGSs) using machine learning (ML) models. The results of numerical simulations show that a small change in the structural model can cause a significant variation in the tracer breakthrough curves (BTCs). To develop a more robust method for including structural uncertainty, we train three different ML models: decision tree regression (DTR), random forest regression (RFR), and gradient boosting regression (GBR). DTR and RFR predict the entire BTC at once, but they are susceptible to overfitting and underfitting. In contrast, GBR predicts each time step of the BTC as a separate target variable, considering the possible correlation between consecutive time steps. This approach is implemented using a chain of regression models. The chain model achieves an acceptable increase in RMSE from train to test data, confirming its ability to capture both the general trend and small-scale heterogeneities of the BTCs. Additionally, using the ML model instead of the numerical solver reduces the computational time by six orders of magnitude. This time efficiency allows us to calculate BTCs for 2′000 different reservoir models, enabling a more comprehensive structural uncertainty quantification for EGS cases. The chain model is particularly promising, as it is robust to overfitting and underfitting and can generate BTCs for a large number of structural models efficiently.
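The chained-regression idea from this abstract, where each time step of the curve is a separate target and earlier predictions feed into later ones, can be sketched with scikit-learn's `RegressorChain`. The data below is synthetic: the reservoir features and breakthrough-curve shapes are stand-ins, not the paper's models.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.multioutput import RegressorChain

# Synthetic stand-in: each sample maps hypothetical reservoir features
# to a breakthrough curve sampled at 20 time steps.
rng = np.random.default_rng(0)
X = rng.random((200, 5))
t = np.linspace(0.0, 1.0, 20)
Y = np.exp(-((t[None, :] - X[:, [0]]) ** 2) / 0.05)  # smooth BTC-like curves

# Each time step is one target in the chain; the chain appends earlier
# predictions to the inputs of later regressors, modeling the correlation
# between consecutive time steps.
chain = RegressorChain(GradientBoostingRegressor(n_estimators=50))
chain.fit(X[:150], Y[:150])

Y_pred = chain.predict(X[150:])
rmse = np.sqrt(np.mean((Y_pred - Y[150:]) ** 2))
print(f"test RMSE: {rmse:.3f}")
```

Once trained, predicting a curve is a single forward pass through the chain, which is the source of the speed-up over a numerical solver that the abstract reports.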
... There are two basic ways to detect anomalies: for supervised anomaly detection, labels (normal/abnormal) are needed per time series to build a binary classifier [86]. For unsupervised anomaly detection, an anomaly score or confidence value that is conditioned purely on normal data can be used to differentiate abnormal from normal instances [87], [88]. ...
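A minimal sketch of the unsupervised variant: a score conditioned purely on normal data. The per-feature z-distance used here is an illustrative choice, not the cited papers' method, and the data is synthetic.

```python
import numpy as np

# Fit simple statistics on normal data only; no abnormal labels are used.
rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, (1000, 4))
mu, sigma = normal.mean(axis=0), normal.std(axis=0)

def anomaly_score(x):
    """Mean absolute z-score relative to the normal-data statistics."""
    return np.abs((x - mu) / sigma).mean(axis=-1)

ok = rng.normal(0.0, 1.0, 4)   # instance drawn from the normal regime
odd = np.full(4, 8.0)          # obviously abnormal instance
print(anomaly_score(ok) < anomaly_score(odd))  # True
```

Thresholding the score then separates abnormal from normal instances without ever seeing an abnormal training sample.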
Article
Full-text available
Automating the monitoring of industrial processes has the potential to enhance efficiency and optimize quality by promptly detecting abnormal events and thus facilitating timely interventions. Deep learning, with its capacity to discern non-trivial patterns within large datasets, plays a pivotal role in this process. Standard deep learning methods are suitable to solve a specific task given a specific type of data. During training, deep learning demands large volumes of labeled data. However, due to the dynamic nature of the industrial processes and environment, it is impractical to acquire large-scale labeled data for standard deep learning training for every slightly different case anew. Deep transfer learning offers a solution to this problem. By leveraging knowledge from related tasks and accounting for variations in data distributions, the transfer learning framework solves new tasks with little or even no additional labeled data. The approach bypasses the need to retrain a model from scratch for every new setup and dramatically reduces the labeled data requirement. This survey first provides an in-depth review of deep transfer learning, examining the problem settings of transfer learning and classifying the prevailing deep transfer learning methods. Moreover, we delve into applications of deep transfer learning in the context of a broad spectrum of time series anomaly detection tasks prevalent in primary industrial domains, e.g., manufacturing process monitoring, predictive maintenance, energy management, and infrastructure facility monitoring. We discuss the challenges and limitations of deep transfer learning in industrial contexts and conclude the survey with practical directions and actionable suggestions to address the need to leverage diverse time series data for anomaly detection in an increasingly dynamic production environment.
... The proposed method demonstrates a novel implementation of recent developments in machine learning that could extend existing engineering applications of deep learning for industrial practice 30,31. Bridging the gap between data-driven and physical models raises new challenges that are less frequently discussed in the deep learning literature: First, CNN network architectures are designed and implemented for classification tasks in the majority of cases. ...
Article
Full-text available
Physical models can help improve solar cell efficiency during the design phase and for quality control after the fabrication process. We present a data-driven approach to inverse modeling that can predict the underlying parameters of a finite element method solar cell model based on an electroluminescence (EL) image of a solar cell with known cell geometry and laser scribed defects. For training the inverse model, 75 000 synthetic EL images were generated with randomized parameters of the physical cell model. We combine 17 deep convolutional neural networks based on a modified VGG19 architecture into a deep ensemble to add uncertainty estimates. Using the silicon solar cell model, we show that such a novel approach to data-driven statistical inverse modeling can help apply recent developments in deep learning to new engineering applications that require real-time parameterizations of physical models augmented by confidence intervals. The trained network was tested on four different physical solar cell samples, and the estimated parameters were used to create the corresponding model representations. Resimulations of the measurements yielded relative deviations of the calculated and the measured junction voltage values of 0.2% on average with a maximum of 10%, demonstrating the validity of the approach.
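The deep-ensemble idea from this abstract, where the spread of independently trained models provides the uncertainty estimate, reduces to a mean and a standard deviation over member predictions. The sketch below uses synthetic stand-ins for the 17 ensemble members rather than trained networks.

```python
import numpy as np

# 17 ensemble members, as in the paper; the parameter values and the
# per-member error model are illustrative assumptions.
rng = np.random.default_rng(0)
n_models, n_params = 17, 3
true_params = np.array([1.0, 0.5, 2.0])

# Each member returns the underlying parameters plus its own error.
predictions = true_params + rng.normal(0.0, 0.1, (n_models, n_params))

estimate = predictions.mean(axis=0)      # point estimate of the inverse model
uncertainty = predictions.std(axis=0)    # spread -> confidence estimate
print(estimate.round(2), uncertainty.round(2))
```

The standard deviation across members can then be turned into the confidence intervals the abstract mentions, e.g. estimate ± 2 standard deviations.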
... While it loses the color information, the additional information about pixel changes and bounding boxes from the earlier frame more than compensates for this loss, evidenced by the performance increase and shortened training time. What seems like a hack is typical for deep learning in practice: In absence of large training sets and conditions as found in public benchmarks [17], the available information has to be exploited optimally while considering computational boundary conditions. ...
Preprint
Full-text available
Patient monitoring in intensive care units, although assisted by biosensors, needs continuous supervision of staff. To reduce the burden on staff members, IT infrastructures are built to record monitoring data and develop clinical decision support systems. These systems, however, are vulnerable to artifacts (e.g. muscle movement due to ongoing treatment), which are often indistinguishable from real and potentially dangerous signals. Video recordings could facilitate the reliable classification of biosignals using object detection (OD) methods to find sources of unwanted artifacts. Due to privacy restrictions, only blurred videos can be stored, which severely impairs the possibility to detect clinically relevant events such as interventions or changes in patient status with standard OD methods. Hence, new kinds of approaches are necessary that exploit every kind of available information due to the reduced information content of blurred footage and that are at the same time easily implementable within the IT infrastructure of a normal hospital. In this paper, we propose a new method for exploiting information in the temporal succession of video frames. To be efficiently implementable using off-the-shelf object detectors that comply with given hardware constraints, we repurpose the image color channels to account for temporal consistency, leading to an improved detection rate of the object classes. Our method outperforms a standard YOLOv5 baseline model by +1.7% mAP@.5 while also training over ten times faster on our proprietary dataset. We conclude that this approach has shown effectiveness in the preliminary experiments and holds potential for more general video OD in the future.
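Repurposing the three color channels for temporal context can be sketched as below. The specific layout, current grayscale frame, previous frame, and their pixel difference, is an assumption for illustration; the paper's exact channel encoding may differ.

```python
import numpy as np

def temporal_stack(prev_frame, curr_frame):
    """Stack two grayscale frames and their pixel change as a 3-channel image."""
    # Compute the absolute difference in int16 to avoid uint8 wrap-around.
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return np.stack([curr_frame, prev_frame, diff.astype(np.uint8)], axis=-1)

rng = np.random.default_rng(0)
prev = rng.integers(0, 256, (48, 64), dtype=np.uint8)
curr = rng.integers(0, 256, (48, 64), dtype=np.uint8)

stacked = temporal_stack(prev, curr)
print(stacked.shape)  # (48, 64, 3)
```

Because the result has the shape of an RGB image, it can be fed to an off-the-shelf detector such as YOLOv5 without architectural changes, which is the practical point the citing text makes.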
... Having gained traction in numerous fields including CT imaging, [28][29][30] deep-learning approaches have been used for metal artifact reduction (MAR). 31,32 A dual-domain network (DuDoNet) 33 was introduced to jointly compensate for metal-induced artifacts in both projection and volume domains. ...
Article
Full-text available
Background Cone beam computed tomography (CBCT) is often employed on radiation therapy treatment devices (linear accelerators) used in image‐guided radiation therapy (IGRT). For each treatment session, it is necessary to obtain the image of the day in order to accurately position the patient and to enable adaptive treatment capabilities including auto‐segmentation and dose calculation. Reconstructed CBCT images often suffer from artifacts, in particular those induced by patient motion. Deep‐learning based approaches promise ways to mitigate such artifacts. Purpose We propose a novel deep‐learning based approach with the goal to reduce motion induced artifacts in CBCT images and improve image quality. It is based on supervised learning and includes neural network architectures employed as pre‐ and/or post‐processing steps during CBCT reconstruction. Methods Our approach is based on deep convolutional neural networks which complement the standard CBCT reconstruction, which is performed either with the analytical Feldkamp‐Davis‐Kress (FDK) method, or with an iterative algebraic reconstruction technique (SART‐TV). The neural networks, which are based on refined U‐net architectures, are trained end‐to‐end in a supervised learning setup. Labeled training data are obtained by means of a motion simulation, which uses the two extreme phases of 4D CT scans, their deformation vector fields, as well as time‐dependent amplitude signals as input. The trained networks are validated against ground truth using quantitative metrics, as well as by using real patient CBCT scans for a qualitative evaluation by clinical experts. 
Results The presented novel approach is able to generalize to unseen data and yields significant reductions in motion-induced artifacts as well as improvements in image quality compared with existing state-of-the-art CBCT reconstruction algorithms (up to +6.3 dB and +0.19 improvements in peak signal-to-noise ratio, PSNR, and structural similarity index measure, SSIM, respectively), as evidenced by validation with an unseen test dataset, and confirmed by a clinical evaluation on real patient scans (up to 74% preference for motion artifact reduction over standard reconstruction). Conclusions For the first time, it is demonstrated, also by means of clinical evaluation, that inserting deep neural networks, trained end-to-end, as pre- and post-processing plugins into the existing 3D CBCT reconstruction yields significant improvements in image quality and reduction of motion artifacts.
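The PSNR figure quoted in the results is a standard metric and can be computed as below; the images here are synthetic stand-ins, not CBCT reconstructions, and the noise levels are arbitrary assumptions.

```python
import numpy as np

def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    mse = np.mean((reference - estimate) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

rng = np.random.default_rng(0)
ref = rng.random((32, 32))
noisy = ref + rng.normal(0.0, 0.05, ref.shape)    # stand-in for an artifact-laden image
cleaned = ref + rng.normal(0.0, 0.01, ref.shape)  # stand-in for a corrected image

gain = psnr(ref, cleaned) - psnr(ref, noisy)
print(f"PSNR improvement: {gain:.1f} dB")
```

An improvement such as the "+6.3 dB" above is this difference evaluated between the baseline reconstruction and the network-assisted one, each against the ground truth.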
... unrivaled) work in the field of AI that need not shy away from international competition per se. This presentation will substantiate this with examples from our teaching [2], research [for example: [3][4][5][6][7][8][9]], and organization [founding of the CAI in 2021; acquisition of the first endowed professorship in 2022]. ...
Conference Paper
Full-text available
With artificial intelligence (AI), a megatrend has arrived at the heart of society that is proving to be a stroke of luck for the universities of applied sciences: AI is not only on everyone's lips, it is at its core an applied science. AI therefore offers not only growth opportunities in all four performance areas, but also very good arguments for the excellent scientific work of our universities. These should be leveraged for the reputation of our universities beyond the discipline of AI itself.
... We extend the Detection module with post-processing and the Identification module with a new Domain Sanity Loss (DSL) based on "sanity checks". We build upon their work for the following reasons: (i) the average distance between the predicted and the actual vertebrae centroids is small and considered state-of-the-art; (ii) the models are pure CNN architectures which can easily be extended within the framework of deep learning [23]; (iii) no assumptions are made about either the shape of the spine or the visible vertebrae. This way, the model is adapted to the target data, which in our experience is considerably easier to train than the alternative of adapting the data to the model [24]. ...
Article
Full-text available
A variety of medical computer vision applications analyze 2D slices of computed tomography (CT) scans, whereas axial slices from the body trunk region are usually identified based on their relative position to the spine. A limitation of such systems is that either the correct slices must be extracted manually or labels of the vertebrae are required for each CT scan to develop an automated extraction system. In this paper, we propose an unsupervised domain adaptation (UDA) approach for vertebrae detection and identification based on a novel Domain Sanity Loss (DSL) function. With UDA the model’s knowledge learned on a publicly available (source) data set can be transferred to the target domain without using target labels, where the target domain is defined by the specific setup (CT modality, study protocols, applied pre- and processing) at the point of use (e.g., a specific clinic with its specific CT study protocols). With our approach, a model is trained on the source and target data set in parallel. The model optimizes a supervised loss for labeled samples from the source domain and the DSL loss function based on domain-specific “sanity checks” for samples from the unlabeled target domain. Without using labels from the target domain, we are able to identify vertebra centroids with an accuracy of 72.8%. By adding only ten target labels during training the accuracy increases to 89.2%, which is on par with the current state-of-the-art for full supervised learning, while using about 20 times less labels. Thus, our model can be used to extract 2D slices from 3D CT scans on arbitrary data sets fully automatically without requiring an extensive labeling effort, contributing to the clinical adoption of medical imaging by hospitals.
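The combined objective described above, a supervised loss on labeled source samples plus a "sanity check" penalty on unlabeled target samples, can be sketched schematically. The specific check below (predicted centroids must be ordered along the spine axis) is an illustrative assumption, not the paper's exact DSL formulation.

```python
import numpy as np

def supervised_loss(pred_centroids, true_centroids):
    """Mean squared error on labeled source-domain centroids."""
    return np.mean((pred_centroids - true_centroids) ** 2)

def domain_sanity_loss(pred_centroids):
    """Penalize target-domain predictions that violate vertebral ordering."""
    gaps = np.diff(pred_centroids)           # gaps between consecutive centroids
    return np.mean(np.maximum(0.0, -gaps))   # penalty only where the order flips

# Hypothetical 1D centroid positions along the spine axis.
source_pred = np.array([10.0, 20.5, 29.0])
source_true = np.array([10.0, 20.0, 30.0])   # labels exist only in the source domain
target_pred = np.array([12.0, 11.0, 25.0])   # unlabeled; second point is out of order

total = supervised_loss(source_pred, source_true) + domain_sanity_loss(target_pred)
print(round(total, 3))  # 0.917
```

The key property is that the second term needs no target labels: it only encodes domain knowledge about what a plausible prediction looks like.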
... Utilizing machine learning in an industrial application poses additional challenges compared to research lab environments [1], [2], e.g., in the form of data quality and data quantity issues [3]. "Garbage in, garbage out" is an often-stressed dictum in machine learning - even more so in industrial applications, where the collection of data samples and labels is difficult and costly [4]. ...
... The recent success of machine learning (ML) and deep learning (DL) has triggered enormous interest in practical applications of these algorithms in many organizations [23,24]. The emergence of automated ML (AutoML), which includes automated DL (AutoDL), further expands the horizons of such machine learning applications for non-experts and broadens the feasibility of exploring larger search spaces during development. ...
Chapter
Full-text available
With great power comes great responsibility. The success of machine learning, especially deep learning, in research and practice has attracted a great deal of interest, which in turn necessitates increased trust. Sources of mistrust include matters of model genesis (“Is this really the appropriate model?”) and interpretability (“Why did the model come to this conclusion?”, “Is the model safe from being easily fooled by adversaries?”). In this paper, two partners for the trustworthiness tango are presented: recent advances and ideas, as well as practical applications in industry in (a) Automated machine learning (AutoML), a powerful tool to optimize deep neural network architectures and fine-tune hyperparameters, which promises to build models in a safer and more comprehensive way; (b) Interpretability of neural network outputs, which addresses the vital question regarding the reasoning behind model predictions and provides insights to improve robustness against adversarial attacks.
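The hyperparameter-optimization side of AutoML can be illustrated with a toy random search; the search space, the objective, and its optimum below are synthetic stand-ins for a real validation score, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

def validation_score(lr, width):
    """Hypothetical objective peaking near lr=1e-2 and width=256."""
    return -((np.log10(lr) + 2.0) ** 2) - ((width - 256) / 256) ** 2

# Random search: sample configurations and keep the best-scoring one.
best = max(
    (
        {"lr": 10 ** rng.uniform(-5, 0), "width": int(rng.integers(16, 1025))}
        for _ in range(200)
    ),
    key=lambda cfg: validation_score(cfg["lr"], cfg["width"]),
)
print(best)
```

Real AutoML systems replace the random sampler with smarter search strategies (e.g. Bayesian optimization or evolutionary search), but the loop structure, propose a configuration, evaluate it, keep the best, is the same.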
... Deep learning [1] has demonstrated outstanding performance for many tasks such as computer vision, audio analysis, natural language processing, or game playing [2][3][4][5], and across a wide variety of domains such as the medical, industrial, sports, and retail sectors [6][7][8][9]. However, the design, training, and deployment of high-performance deep-learning models requires human expert knowledge. ...
Article
Full-text available
We present an extensive evaluation of a wide variety of promising design patterns for automated deep-learning (AutoDL) methods, organized according to the problem categories of the 2019 AutoDL challenges, which set the task of optimizing both model accuracy and search efficiency under tight time and computing constraints. We propose structured empirical evaluations as the most promising avenue to obtain design principles for deep-learning systems due to the absence of strong theoretical support. From these evaluations, we distill relevant patterns which give rise to neural network design recommendations. In particular, we establish (a) that very wide fully connected layers learn meaningful features faster; we illustrate (b) how the lack of pretraining in audio processing can be compensated by architecture search; we show (c) that in text processing deep-learning-based methods only pull ahead of traditional methods for short text lengths with less than a thousand characters under tight resource limitations; and lastly we present (d) evidence that in very data- and computing-constrained settings, hyperparameter tuning of more traditional machine-learning methods outperforms deep-learning systems.