Figure 6 - uploaded by Thilo Stadelmann
Example of our dataset showing an original scan of a newspaper page (left), the transformed representation (middle) and the manually created segmentation mask (right).

Source publication
Chapter
Full-text available
Deep learning (DL) methods have gained considerable attention since 2014. In this chapter we briefly review the state of the art in DL and then give several examples of applications from diverse areas. We will focus on convolutional neural networks (CNNs), which have since the seminal work of Krizhevsky et al. (ImageNet classificatio...

Context in source publication

Context 1
... is done by replacing illustrations with gray areas and by blackening lines of text (after OCR). In the end, our dataset contains 507 images with two channels each (plus a ground-truth segmentation mask, see Figure 6): the original scan, and the abovementioned transformation. We use approximately 85% of the dataset for training and hold out the remaining 15% for testing. ...
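The hold-out split described above can be sketched as follows; the 507-image count and the 85/15 ratio are from the text, while the array shapes and the random seed are illustrative assumptions.

```python
import numpy as np

# Illustrative stand-in for the 507 two-channel images described above
# (shapes are hypothetical; only the image count and 85/15 split are from the text).
rng = np.random.default_rng(0)
n_images = 507
images = rng.random((n_images, 2, 64, 64))        # (sample, channel, H, W)
masks = rng.integers(0, 2, (n_images, 64, 64))    # ground-truth segmentation masks

# Shuffle once, then hold out ~15% for testing.
perm = rng.permutation(n_images)
n_train = int(round(0.85 * n_images))
train_idx, test_idx = perm[:n_train], perm[n_train:]

X_train, X_test = images[train_idx], images[test_idx]
y_train, y_test = masks[train_idx], masks[test_idx]
print(len(X_train), len(X_test))  # 431 76
```

Shuffling before the split avoids any ordering bias in how the scans were collected.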

Citations

... Besides the huge energy consumption of these computational infrastructures (Benoit et al., 2018), their availability is also limited. Furthermore, parameter tuning, sensitivity analysis (Borgonovo and Plischke, 2016), and uncertainty quantification (Abbaszadeh Shahri et al., 2022; Soize, 2017) demand up to millions of simulations. ...
... Machine learning (ML) methods have gained significant traction across various fields (Brunton and Kutz, 2022; Stadelmann et al., 2019), including geothermal applications (Okoroafor et al., 2022). In this context, data-driven and physics-informed ML (physics-informed neural network, PINN) techniques are of great interest (Carleo et al., 2019; Raissi et al., 2019). ...
Article
Including uncertainty is essential for accurate decision-making in underground applications. We propose a novel approach to consider structural uncertainty in two enhanced geothermal systems (EGSs) using machine learning (ML) models. The results of numerical simulations show that a small change in the structural model can cause a significant variation in the tracer breakthrough curves (BTCs). To develop a more robust method for including structural uncertainty, we train three different ML models: decision tree regression (DTR), random forest regression (RFR), and gradient boosting regression (GBR). DTR and RFR predict the entire BTC at once, but they are susceptible to overfitting and underfitting. In contrast, GBR predicts each time step of the BTC as a separate target variable, considering the possible correlation between consecutive time steps. This approach is implemented using a chain of regression models. The chain model achieves an acceptable increase in RMSE from train to test data, confirming its ability to capture both the general trend and small-scale heterogeneities of the BTCs. Additionally, using the ML model instead of the numerical solver reduces the computational time by six orders of magnitude. This time efficiency allows us to calculate BTCs for 2′000 different reservoir models, enabling a more comprehensive structural uncertainty quantification for EGS cases. The chain model is particularly promising, as it is robust to overfitting and underfitting and can generate BTCs for a large number of structural models efficiently.
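The chained-regression idea from this abstract, where each time step of the curve is a separate target and earlier predictions feed into later ones, can be sketched with scikit-learn's `RegressorChain`. The data below is synthetic: the reservoir features and breakthrough-curve shapes are stand-ins, not the paper's models.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.multioutput import RegressorChain

# Synthetic stand-in: each sample maps hypothetical reservoir features
# to a breakthrough curve sampled at 20 time steps.
rng = np.random.default_rng(0)
X = rng.random((200, 5))
t = np.linspace(0.0, 1.0, 20)
Y = np.exp(-((t[None, :] - X[:, [0]]) ** 2) / 0.05)  # smooth BTC-like curves

# Each time step is one target in the chain; the chain appends earlier
# predictions to the inputs of later regressors, modeling the correlation
# between consecutive time steps.
chain = RegressorChain(GradientBoostingRegressor(n_estimators=50))
chain.fit(X[:150], Y[:150])

Y_pred = chain.predict(X[150:])
rmse = np.sqrt(np.mean((Y_pred - Y[150:]) ** 2))
print(f"test RMSE: {rmse:.3f}")
```

Once trained, predicting a curve is a single forward pass through the chain, which is the source of the speed-up over a numerical solver that the abstract reports.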
... There are two basic ways to detect anomalies: for supervised anomaly detection, labels (normal/abnormal) are needed per time series to build a binary classifier [86]. For unsupervised anomaly detection, an anomaly score or confidence value that is conditioned purely on normal data can be used to differentiate abnormal from normal instances [87], [88]. ...
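A minimal sketch of the unsupervised variant: a score conditioned purely on normal data. The per-feature z-distance used here is an illustrative choice, not the cited papers' method, and the data is synthetic.

```python
import numpy as np

# Fit simple statistics on normal data only; no abnormal labels are used.
rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, (1000, 4))
mu, sigma = normal.mean(axis=0), normal.std(axis=0)

def anomaly_score(x):
    """Mean absolute z-score relative to the normal-data statistics."""
    return np.abs((x - mu) / sigma).mean(axis=-1)

ok = rng.normal(0.0, 1.0, 4)   # instance drawn from the normal regime
odd = np.full(4, 8.0)          # obviously abnormal instance
print(anomaly_score(ok) < anomaly_score(odd))  # True
```

Thresholding the score then separates abnormal from normal instances without ever seeing an abnormal training sample.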
Article
Full-text available
Automating the monitoring of industrial processes has the potential to enhance efficiency and optimize quality by promptly detecting abnormal events and thus facilitating timely interventions. Deep learning, with its capacity to discern non-trivial patterns within large datasets, plays a pivotal role in this process. Standard deep learning methods are suitable to solve a specific task given a specific type of data. During training, deep learning demands large volumes of labeled data. However, due to the dynamic nature of the industrial processes and environment, it is impractical to acquire large-scale labeled data for standard deep learning training for every slightly different case anew. Deep transfer learning offers a solution to this problem. By leveraging knowledge from related tasks and accounting for variations in data distributions, the transfer learning framework solves new tasks with little or even no additional labeled data. The approach bypasses the need to retrain a model from scratch for every new setup and dramatically reduces the labeled data requirement. This survey first provides an in-depth review of deep transfer learning, examining the problem settings of transfer learning and classifying the prevailing deep transfer learning methods. Moreover, we delve into applications of deep transfer learning in the context of a broad spectrum of time series anomaly detection tasks prevalent in primary industrial domains, e.g., manufacturing process monitoring, predictive maintenance, energy management, and infrastructure facility monitoring. We discuss the challenges and limitations of deep transfer learning in industrial contexts and conclude the survey with practical directions and actionable suggestions to address the need to leverage diverse time series data for anomaly detection in an increasingly dynamic production environment.
... The proposed method demonstrates a novel implementation of recent developments in machine learning that could extend existing engineering applications of deep learning for industrial practice 30,31. Bridging the gap between data-driven and physical models raises new challenges that are less frequently discussed in the deep learning literature: First, CNN network architectures are designed and implemented for classification tasks in the majority of cases. ...
Article
Full-text available
Physical models can help improve solar cell efficiency during the design phase and for quality control after the fabrication process. We present a data-driven approach to inverse modeling that can predict the underlying parameters of a finite element method solar cell model based on an electroluminescence (EL) image of a solar cell with known cell geometry and laser scribed defects. For training the inverse model, 75 000 synthetic EL images were generated with randomized parameters of the physical cell model. We combine 17 deep convolutional neural networks based on a modified VGG19 architecture into a deep ensemble to add uncertainty estimates. Using the silicon solar cell model, we show that such a novel approach to data-driven statistical inverse modeling can help apply recent developments in deep learning to new engineering applications that require real-time parameterizations of physical models augmented by confidence intervals. The trained network was tested on four different physical solar cell samples, and the estimated parameters were used to create the corresponding model representations. Resimulations of the measurements yielded relative deviations of the calculated and the measured junction voltage values of 0.2% on average with a maximum of 10%, demonstrating the validity of the approach.
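The deep-ensemble idea from this abstract, where the spread of independently trained models provides the uncertainty estimate, reduces to a mean and a standard deviation over member predictions. The sketch below uses synthetic stand-ins for the 17 ensemble members rather than trained networks.

```python
import numpy as np

# 17 ensemble members, as in the paper; the parameter values and the
# per-member error model are illustrative assumptions.
rng = np.random.default_rng(0)
n_models, n_params = 17, 3
true_params = np.array([1.0, 0.5, 2.0])

# Each member returns the underlying parameters plus its own error.
predictions = true_params + rng.normal(0.0, 0.1, (n_models, n_params))

estimate = predictions.mean(axis=0)      # point estimate of the inverse model
uncertainty = predictions.std(axis=0)    # spread -> confidence estimate
print(estimate.round(2), uncertainty.round(2))
```

The standard deviation across members can then be turned into the confidence intervals the abstract mentions, e.g. estimate ± 2 standard deviations.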
... While it loses the color information, the additional information about pixel changes and bounding boxes from the earlier frame more than compensates for this loss, evidenced by the performance increase and shortened training time. What seems like a hack is typical for deep learning in practice: In absence of large training sets and conditions as found in public benchmarks [17], the available information has to be exploited optimally while considering computational boundary conditions. ...
Preprint
Full-text available
Patient monitoring in intensive care units, although assisted by biosensors, needs continuous supervision of staff. To reduce the burden on staff members, IT infrastructures are built to record monitoring data and develop clinical decision support systems. These systems, however, are vulnerable to artifacts (e.g. muscle movement due to ongoing treatment), which are often indistinguishable from real and potentially dangerous signals. Video recordings could facilitate the reliable classification of biosignals using object detection (OD) methods to find sources of unwanted artifacts. Due to privacy restrictions, only blurred videos can be stored, which severely impairs the possibility to detect clinically relevant events such as interventions or changes in patient status with standard OD methods. Hence, new kinds of approaches are necessary that exploit every kind of available information due to the reduced information content of blurred footage and that are at the same time easily implementable within the IT infrastructure of a normal hospital. In this paper, we propose a new method for exploiting information in the temporal succession of video frames. To be efficiently implementable using off-the-shelf object detectors that comply with given hardware constraints, we repurpose the image color channels to account for temporal consistency, leading to an improved detection rate of the object classes. Our method outperforms a standard YOLOv5 baseline model by +1.7% mAP@.5 while also training over ten times faster on our proprietary dataset. We conclude that this approach has shown effectiveness in the preliminary experiments and holds potential for more general video OD in the future.
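Repurposing the three color channels for temporal context can be sketched as below. The specific layout, current grayscale frame, previous frame, and their pixel difference, is an assumption for illustration; the paper's exact channel encoding may differ.

```python
import numpy as np

def temporal_stack(prev_frame, curr_frame):
    """Stack two grayscale frames and their pixel change as a 3-channel image."""
    # Compute the absolute difference in int16 to avoid uint8 wrap-around.
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return np.stack([curr_frame, prev_frame, diff.astype(np.uint8)], axis=-1)

rng = np.random.default_rng(0)
prev = rng.integers(0, 256, (48, 64), dtype=np.uint8)
curr = rng.integers(0, 256, (48, 64), dtype=np.uint8)

stacked = temporal_stack(prev, curr)
print(stacked.shape)  # (48, 64, 3)
```

Because the result has the shape of an RGB image, it can be fed to an off-the-shelf detector such as YOLOv5 without architectural changes, which is the practical point the citing text makes.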
... Having gained traction in numerous fields including CT imaging, [28][29][30] deep-learning approaches have been used for metal artifact reduction (MAR). 31,32 A dual-domain network (DuDoNet) 33 was introduced to jointly compensate for metal-induced artifacts in both projection and volume domains. ...
Article
Full-text available
Background Cone beam computed tomography (CBCT) is often employed on radiation therapy treatment devices (linear accelerators) used in image‐guided radiation therapy (IGRT). For each treatment session, it is necessary to obtain the image of the day in order to accurately position the patient and to enable adaptive treatment capabilities including auto‐segmentation and dose calculation. Reconstructed CBCT images often suffer from artifacts, in particular those induced by patient motion. Deep‐learning based approaches promise ways to mitigate such artifacts. Purpose We propose a novel deep‐learning based approach with the goal to reduce motion induced artifacts in CBCT images and improve image quality. It is based on supervised learning and includes neural network architectures employed as pre‐ and/or post‐processing steps during CBCT reconstruction. Methods Our approach is based on deep convolutional neural networks which complement the standard CBCT reconstruction, which is performed either with the analytical Feldkamp‐Davis‐Kress (FDK) method, or with an iterative algebraic reconstruction technique (SART‐TV). The neural networks, which are based on refined U‐net architectures, are trained end‐to‐end in a supervised learning setup. Labeled training data are obtained by means of a motion simulation, which uses the two extreme phases of 4D CT scans, their deformation vector fields, as well as time‐dependent amplitude signals as input. The trained networks are validated against ground truth using quantitative metrics, as well as by using real patient CBCT scans for a qualitative evaluation by clinical experts. 
Results The presented novel approach is able to generalize to unseen data and yields significant reductions in motion-induced artifacts as well as improvements in image quality compared with existing state-of-the-art CBCT reconstruction algorithms (up to +6.3 dB and +0.19 improvements in peak signal-to-noise ratio, PSNR, and structural similarity index measure, SSIM, respectively), as evidenced by validation with an unseen test dataset, and confirmed by a clinical evaluation on real patient scans (up to 74% preference for motion artifact reduction over standard reconstruction). Conclusions For the first time, it is demonstrated, also by means of clinical evaluation, that inserting deep neural networks, trained end-to-end, as pre- and post-processing plugins into the existing 3D CBCT reconstruction yields significant improvements in image quality and reduction of motion artifacts.
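The PSNR figure quoted in the results is a standard metric and can be computed as below; the images here are synthetic stand-ins, not CBCT reconstructions, and the noise levels are arbitrary assumptions.

```python
import numpy as np

def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    mse = np.mean((reference - estimate) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

rng = np.random.default_rng(0)
ref = rng.random((32, 32))
noisy = ref + rng.normal(0.0, 0.05, ref.shape)    # stand-in for an artifact-laden image
cleaned = ref + rng.normal(0.0, 0.01, ref.shape)  # stand-in for a corrected image

gain = psnr(ref, cleaned) - psnr(ref, noisy)
print(f"PSNR improvement: {gain:.1f} dB")
```

An improvement such as the "+6.3 dB" above is this difference evaluated between the baseline reconstruction and the network-assisted one, each against the ground truth.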
... unrivaled) work in the field of AI that need not shy away from international competition per se. This presentation will substantiate this with examples from our teaching [2], research [for example: [3][4][5][6][7][8][9]], and organization [founding of the CAI in 2021; acquisition of the first endowed professorship in 2022]. ...
Conference Paper
Full-text available
With artificial intelligence (AI), a megatrend has arrived at the heart of society that is proving to be a stroke of luck for the universities of applied sciences: AI is not only on everyone's lips, it is at its core an applied science. AI therefore offers not only growth opportunities in all four performance areas, but also very good arguments for the excellent scientific work of our universities. These should be leveraged for the reputation of our universities beyond the discipline of AI itself.
... We extend the Detection module with post-processing and the Identification module with a new Domain Sanity Loss (DSL) based on "sanity checks". We build upon their work for the following reasons: (i) the average distance between the predicted and the actual vertebrae centroids is small and considered state-of-the-art; (ii) the models are pure CNN architectures which can easily be extended within the framework of deep learning [23]; (iii) no assumptions are made about either the shape of the spine or the visible vertebrae. This way, the model is adapted to the target data, which in our experience is considerably easier to train than the alternative of adapting the data to the model [24]. ...
Article
Full-text available
A variety of medical computer vision applications analyze 2D slices of computed tomography (CT) scans, whereas axial slices from the body trunk region are usually identified based on their relative position to the spine. A limitation of such systems is that either the correct slices must be extracted manually or labels of the vertebrae are required for each CT scan to develop an automated extraction system. In this paper, we propose an unsupervised domain adaptation (UDA) approach for vertebrae detection and identification based on a novel Domain Sanity Loss (DSL) function. With UDA the model’s knowledge learned on a publicly available (source) data set can be transferred to the target domain without using target labels, where the target domain is defined by the specific setup (CT modality, study protocols, applied pre- and processing) at the point of use (e.g., a specific clinic with its specific CT study protocols). With our approach, a model is trained on the source and target data set in parallel. The model optimizes a supervised loss for labeled samples from the source domain and the DSL loss function based on domain-specific “sanity checks” for samples from the unlabeled target domain. Without using labels from the target domain, we are able to identify vertebra centroids with an accuracy of 72.8%. By adding only ten target labels during training the accuracy increases to 89.2%, which is on par with the current state-of-the-art for full supervised learning, while using about 20 times less labels. Thus, our model can be used to extract 2D slices from 3D CT scans on arbitrary data sets fully automatically without requiring an extensive labeling effort, contributing to the clinical adoption of medical imaging by hospitals.
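The combined objective described above, a supervised loss on labeled source samples plus a "sanity check" penalty on unlabeled target samples, can be sketched schematically. The specific check below (predicted centroids must be ordered along the spine axis) is an illustrative assumption, not the paper's exact DSL formulation.

```python
import numpy as np

def supervised_loss(pred_centroids, true_centroids):
    """Mean squared error on labeled source-domain centroids."""
    return np.mean((pred_centroids - true_centroids) ** 2)

def domain_sanity_loss(pred_centroids):
    """Penalize target-domain predictions that violate vertebral ordering."""
    gaps = np.diff(pred_centroids)           # gaps between consecutive centroids
    return np.mean(np.maximum(0.0, -gaps))   # penalty only where the order flips

# Hypothetical 1D centroid positions along the spine axis.
source_pred = np.array([10.0, 20.5, 29.0])
source_true = np.array([10.0, 20.0, 30.0])   # labels exist only in the source domain
target_pred = np.array([12.0, 11.0, 25.0])   # unlabeled; second point is out of order

total = supervised_loss(source_pred, source_true) + domain_sanity_loss(target_pred)
print(round(total, 3))  # 0.917
```

The key property is that the second term needs no target labels: it only encodes domain knowledge about what a plausible prediction looks like.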
... Utilizing machine learning in an industrial application poses additional challenges compared to research lab environments [1], [2], e.g., in the form of data quality and data quantity issues [3]. "Garbage in, garbage out" is an often-stressed dictum in machine learning - even more so in industrial applications, where the collection of data samples and labels is difficult and costly [4]. ...
... The recent success of machine learning (ML) and deep learning (DL) has triggered enormous interest in practical applications of these algorithms in many organizations [23,24]. The emergence of automated ML (AutoML), which includes automated DL (AutoDL), further expands the horizons of such machine learning applications for non-experts and broadens the feasibility of exploring larger search spaces during development. ...
Chapter
Full-text available
With great power comes great responsibility. The success of machine learning, especially deep learning, in research and practice has attracted a great deal of interest, which in turn necessitates increased trust. Sources of mistrust include matters of model genesis (“Is this really the appropriate model?”) and interpretability (“Why did the model come to this conclusion?”, “Is the model safe from being easily fooled by adversaries?”). In this paper, two partners for the trustworthiness tango are presented: recent advances and ideas, as well as practical applications in industry in (a) Automated machine learning (AutoML), a powerful tool to optimize deep neural network architectures and fine-tune hyperparameters, which promises to build models in a safer and more comprehensive way; (b) Interpretability of neural network outputs, which addresses the vital question regarding the reasoning behind model predictions and provides insights to improve robustness against adversarial attacks.
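The hyperparameter-optimization side of AutoML can be illustrated with a toy random search; the search space, the objective, and its optimum below are synthetic stand-ins for a real validation score, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

def validation_score(lr, width):
    """Hypothetical objective peaking near lr=1e-2 and width=256."""
    return -((np.log10(lr) + 2.0) ** 2) - ((width - 256) / 256) ** 2

# Random search: sample configurations and keep the best-scoring one.
best = max(
    (
        {"lr": 10 ** rng.uniform(-5, 0), "width": int(rng.integers(16, 1025))}
        for _ in range(200)
    ),
    key=lambda cfg: validation_score(cfg["lr"], cfg["width"]),
)
print(best)
```

Real AutoML systems replace the random sampler with smarter search strategies (e.g. Bayesian optimization or evolutionary search), but the loop structure, propose a configuration, evaluate it, keep the best, is the same.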
... Deep learning [1] has demonstrated outstanding performance for many tasks such as computer vision, audio analysis, natural language processing, or game playing [2][3][4][5], and across a wide variety of domains such as the medical, industrial, sports, and retail sectors [6][7][8][9]. However, the design, training, and deployment of high-performance deep-learning models requires human expert knowledge. ...
Article
Full-text available
We present an extensive evaluation of a wide variety of promising design patterns for automated deep-learning (AutoDL) methods, organized according to the problem categories of the 2019 AutoDL challenges, which set the task of optimizing both model accuracy and search efficiency under tight time and computing constraints. We propose structured empirical evaluations as the most promising avenue to obtain design principles for deep-learning systems due to the absence of strong theoretical support. From these evaluations, we distill relevant patterns which give rise to neural network design recommendations. In particular, we establish (a) that very wide fully connected layers learn meaningful features faster; we illustrate (b) how the lack of pretraining in audio processing can be compensated by architecture search; we show (c) that in text processing deep-learning-based methods only pull ahead of traditional methods for short text lengths with less than a thousand characters under tight resource limitations; and lastly we present (d) evidence that in very data- and computing-constrained settings, hyperparameter tuning of more traditional machine-learning methods outperforms deep-learning systems.