Figure 1 - uploaded by Alexandr A. Kalinin
Sample images from the training set for the Angiodysplasia detection and localization challenge (MICCAI 2017 Endoscopic Vision Challenge). The top row shows normal images. The middle row shows images containing angiodysplasia areas, visible as red spots. The bottom row shows the masks for the angiodysplasia areas in the middle row.

Source publication
Article
Full-text available
Accurate detection and localization of angiodysplasia lesions is an important problem in the early-stage diagnosis of gastrointestinal bleeding and anemia. The gold standard for angiodysplasia detection and localization is wireless capsule endoscopy. This pill-like device is able to produce thousands of sufficiently high-resolution images durin...

Contexts in source publication

Context 1
... pixels in the masks correspond to lesion localization. Several examples from the training set are given in Fig. 1, where the first row corresponds to images without pathology, the second row to images with several AD lesions each, and the last row contains the masks corresponding to the pathology images in the second row. ...
Context 2
... dataset consists of 1200 color images obtained with WCE, Fig. 2. The images are in 24-bit PNG format, with 576 × 576 pixel resolution. The dataset is split into two equal parts, 600 images for training and 600 for evaluation. Each subset is composed of 300 images with apparent AD and 300 without any pathology. The training subset is annotated by a human expert and contains 300 binary masks in JPEG format of the same 576 × 576 pixel resolution. White pixels in the masks correspond to lesion localization. Several examples from the training set are given in Fig. 1, where the first row corresponds to images without pathology, the second row to images with several AD lesions each, and the last row contains the masks corresponding to the pathology images in the second row. In the dataset each image contains up to 6 lesions; their distribution is shown in Fig. 3 (left). As shown, most images contain only one lesion. In addition, Fig. 3 (right) shows the distribution of AD lesion areas, which reach a maximum of approximately 12,000 pixels with a median value of 1,648 pixels. ...
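As a hedged illustration of how the per-image lesion counts and areas summarized in Fig. 3 could be recomputed from the training masks (a sketch only: the directory layout is hypothetical, and masks are assumed to load as grayscale images with white lesion pixels):

```python
# Sketch: per-image lesion counts and areas from the binary training
# masks, as summarized in Fig. 3. The directory layout is hypothetical.
from glob import glob

import numpy as np
from PIL import Image
from scipy import ndimage

counts, areas = [], []
for path in glob("train/masks/*.jpg"):
    mask = np.array(Image.open(path).convert("L")) > 127   # binarize
    labeled, n_lesions = ndimage.label(mask)               # connected components
    counts.append(n_lesions)
    # pixel area of every connected lesion region in this mask
    areas.extend(ndimage.sum(mask, labeled, range(1, n_lesions + 1)))

print("images per lesion count:", np.bincount(counts))
print("median lesion area (px):", np.median(areas))
```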

Similar publications

Article
Full-text available
Background Gastrointestinal (GI) tract bleeding is a major cause of mortality among patients with GI malignancies. We aimed to assess the technical and clinical efficacy of trans-arterial embolization (TAE) as a symptomatic treatment of tumor-related GI bleeding. This study was conducted for patients with GI bleeding secondary to histopathologicall...
Article
Full-text available
Aortoenteric fistula (AEF) is a rare but potentially fatal condition causing massive gastrointestinal bleeding. It is defined as a fistulous communication between the gastrointestinal tract and the aorta, which is subclassified into primary and secondary. Primary AEF refers to communication between a native aorta and the gastrointestinal tract. Secon...
Article
Full-text available
We present an alternative treatment to resolve lower gastrointestinal bleeding by the application of FloSeal, a haemostatic matrix. Fundamentally, the treatment consists of inserting the tube containing the Sengstaken-Blakemore probe impregnated with FloSeal into the rectosigmoid. This procedure is simple, easy to reproduce and can be very useful t...
Article
Full-text available
Non-variceal gastrointestinal bleeding (GIB) is a significant cause of mortality and morbidity worldwide which is encountered in the ambulatory and hospital settings. Hemorrhage from the gastrointestinal (GI) tract is categorized as upper GIB, small bowel bleeding (also formerly referred to as obscure GIB) or lower GIB. Although the etiologies of G...
Article
Full-text available
Blue Rubber Bleb Nevus Syndrome is a rare condition characterized by skin lesions caused by vascular malformations, most frequently associated with lesions of the gastrointestinal tract; although rare, it can present with lesions in the central nervous system, thyroid, liver, spleen and lungs. Common symptoms are digestive tract bleeding and iron...

Citations

... The study presents a solution for detecting and localizing angiodysplasia using deep neural networks and binary segmentation. The approach provides state-of-the-art performance and ranks among the top results for all subcategories in the task [3]. ...
Article
Our aim is to detect anemia through a comparative analysis of three convolutional neural network (CNN) models, namely EfficientNet B3, DenseNet121, and CNN AllNet. A collection of 3,000 microscopic palm pictures, including 1,500 anemic and 1,500 non-anemic samples, was used to train and test the algorithms. The dataset was preprocessed to balance the classes, augment the images, and normalize the pixel values. The models were trained using transfer learning on the ImageNet dataset and fine-tuned on the anemia dataset. The performance of the models was evaluated based on accuracy, precision, recall, and F1-score. The results showed that CNN AllNet achieved the highest accuracy of 96.8%, followed by DenseNet121 with 94.4%, and EfficientNet B3 with 91.2%. The precision, recall, and F1-score also followed a similar trend. The study concludes that CNN AllNet is the optimal model for anemia detection due to its high accuracy and overall better performance compared with the other models. The findings of this research could provide a basis for further studies on anemia detection using CNN models, ultimately improving the accuracy and efficiency of anemia diagnosis and treatment.
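A minimal sketch of the transfer-learning setup the abstract describes, assuming torchvision's ImageNet-pretrained DenseNet-121 (one of the three compared models); the hyperparameters are illustrative, and the custom CNN AllNet is not reproduced here:

```python
# Sketch: fine-tuning an ImageNet-pretrained DenseNet-121 for binary
# anemic / non-anemic classification. Hyperparameters are illustrative.
import torch
import torch.nn as nn
from torchvision import models

model = models.densenet121(weights="IMAGENET1K_V1")            # ImageNet transfer
model.classifier = nn.Linear(model.classifier.in_features, 2)  # new 2-class head

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    """One fine-tuning step on a batch of preprocessed palm images."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```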
... Recently, with the rapid development of DL, researchers proposed several models based on CNNs to distinguish bleeding and non-bleeding WCE images, including V-GAN [49], the semantic segmentation method SegNet [62], U-Net, TernausNet, and AlbuNet34 [64], as well as transfer learning with pre-trained AlexNet [80]. Surprisingly, the optimal color space for these models has been shown to be hue, saturation, value (HSV) rather than RGB. ...
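A minimal sketch of the HSV preprocessing step mentioned above, using OpenCV; the frame path is hypothetical:

```python
# Sketch: converting a WCE frame to the HSV color space before it is
# fed to a bleeding classifier. The file name is hypothetical.
import cv2

bgr = cv2.imread("wce_frame.png")            # OpenCV loads images as BGR
hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)   # hue, saturation, value
h, s, v = cv2.split(hsv)                     # individual channels
```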
Article
Full-text available
The clinical application of a real-time artificial intelligence (AI) image processing system to diagnose upper gastrointestinal (GI) malignancies remains an experimental research and engineering problem. Understanding the commonly used techniques is required to appreciate the scientific quality and novelty of AI studies. Clinicians frequently lack this technical background, and AI experts may be unaware of the clinical relevance and implications in daily practice. As a result, there is a growing need for a multidisciplinary, international assessment of how to conduct high-quality AI research in upper GI malignancy detection. This research will help endoscopists build approaches or models to increase diagnostic accuracy for upper GI malignancies despite variations in experience, education, personnel, and resources, as it offers real-time and retrospective opportunities to improve upper GI malignancy diagnosis and screening. This comprehensive review sheds light on potential enhancements to computer-aided diagnostic (CAD) systems for GI endoscopy. The survey includes 65 studies on automatic upper GI malignancy diagnosis and evaluation, which are compared by endoscopic modality, image count, model, validation method, and results. The main goal of this research is to assess and compare each AI method's current stage and potential improvement to boost performance, maturity, and the possibility of opening new research areas for the application of a real-time AI image recognition system that diagnoses upper GI malignancies. The findings of this study suggest that Support Vector Machines (SVM) are frequently utilized in gastrointestinal (GI) image processing within the context of machine learning (ML). Moreover, the analysis reveals that CNN-based supervised learning object detection models are widely employed in GI image analysis within the deep learning (DL) context. The results of this study also suggest that RGB is the most commonly used image modality for GI analysis, with color playing a vital role in detecting bleeding locations. Researchers rely on public datasets from 2018-2019 to develop AI systems, but combining them is challenging due to their unique classes. To overcome the problem of insufficient data to train a new DL model, a standardized database is needed to hold different datasets for the development of AI-based GI endoscopy systems.
... These U-Net variants showed excellent performance in biomedical image segmentation tasks with challenges similar to those of chest X-ray diagnosis. Lastly, we implemented AlbuNet [17], which deploys ResNet as an encoder. The architecture of our customized AlbuNet is demonstrated in Fig. 2. All networks were pre-trained with ImageNet [18] and fine-tuned on an image repository of 485 images with lung boundary annotations and 461 images with heart boundary annotations. ...
Article
Full-text available
Background Artificial intelligence, particularly the deep learning (DL) model, can provide reliable results for automated cardiothoracic ratio (CTR) measurement on chest X-ray (CXR) images. In everyday clinical use, however, this technology is usually implemented in a non-automated (AI-assisted) capacity because it still requires approval from radiologists. We investigated the performance and efficiency of our recently proposed models for the AI-assisted method intended for clinical practice. Methods We validated four proposed DL models (AlbuNet, SegNet, VGG-11, and VGG-16) to find the best model for clinical implementation using a dataset of 7517 CXR images from manual operations. These models were investigated in single-model and combined-model modes to find the model with the highest percentage of results where the user could accept the results without further interaction (excellent grade), and with measurement variation within ± 1.8% of the human-operating range. The best model from the validation study was then tested on an evaluation dataset of 9386 CXR images using the AI-assisted method with two radiologists to measure the yield of excellent grade results, observer variation, and operating time. A Bland–Altman plot with coefficient of variation (CV) was employed to evaluate agreement between measurements. Results The VGG-16 gave the highest excellent grade result (68.9%) of any single-model mode with a CV comparable to manual operation (2.12% vs 2.13%). No DL model produced a failure-grade result. The combined-model mode of AlbuNet + VGG-11 model yielded excellent grades in 82.7% of images and a CV of 1.36%. Using the evaluation dataset, the AlbuNet + VGG-11 model produced excellent grade results in 77.8% of images, a CV of 1.55%, and reduced CTR measurement time by almost ten-fold (1.07 ± 2.62 s vs 10.6 ± 1.5 s) compared with manual operation. Conclusion Due to its excellent accuracy and speed, the AlbuNet + VGG-11 model could be clinically implemented to assist radiologists with CTR measurement.
... Several studies have shown that the U-Net architecture [40] is able to deliver state-of-the-art results in water segmentation tasks using either multispectral (e.g., [19], [41]) or SAR data (e.g., [22], [23], [30], [34]). After initial tests, a slightly modified version of the U-Net architecture, AlbuNet-34 (AN-34) [42], is chosen as the base model; it has been shown to provide better segmentation results than U-Net while using fewer parameters. The main differences between AN-34 and the standard U-Net architecture are as follows. ...
... Fig. 3 shows the schematics of the AN-34 architecture. For a detailed description, refer to [42]. ...
... Schematic architecture of AlbuNet-34 [42], which builds upon the common U-Net architecture. The encoder is replaced by a ResNet-34, and the information flow from the encoder to the decoder is accomplished by summation, whereas the original U-Net concatenates the feature maps; summation reduces the number of trainable parameters. ...
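A minimal PyTorch sketch of that design difference, with illustrative channel counts: the AlbuNet-34-style skip adds the encoder feature map to the upsampled decoder map, while the U-Net-style skip concatenates them and therefore doubles the channels the following convolutions must process:

```python
# Sketch: summation vs. concatenation skip connections.
# Channel counts are illustrative assumptions.
import torch
import torch.nn as nn

up = nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2)

def albunet_skip(dec, enc):
    """AlbuNet-34 style: summation keeps the channel count at 64."""
    return up(dec) + enc                    # enc: (N, 64, H, W)

def unet_skip(dec, enc):
    """Original U-Net style: concatenation doubles the channels to 128,
    so subsequent convolutions need more trainable parameters."""
    return torch.cat([up(dec), enc], dim=1)

dec = torch.randn(1, 128, 16, 16)
enc = torch.randn(1, 64, 32, 32)
print(albunet_skip(dec, enc).shape)  # torch.Size([1, 64, 32, 32])
print(unet_skip(dec, enc).shape)     # torch.Size([1, 128, 32, 32])
```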
Article
Full-text available
In this study, the effectiveness of several convolutional neural network architectures (AlbuNet-34/FCN/DeepLabV3+/U-Net/U-Net++) for water and flood mapping using Sentinel-1 amplitude data is compared to an operational rule-based processor (S-1FS). This comparison is made using a globally distributed dataset of Sentinel-1 scenes and the corresponding ground truth water masks derived from Sentinel-2 data to evaluate the performance of the classifiers on a global scale in various environmental conditions. The impact of using single versus dual-polarized input data on the segmentation capabilities of AlbuNet-34 is evaluated. The weighted cross entropy loss is combined with the Lovász loss and various data augmentation methods are investigated. Furthermore, the concept of atrous spatial pyramid pooling used in DeepLabV3+ and the multiscale feature fusion inherent in U-Net++ are assessed. Finally, the generalization capacity of AlbuNet-34 is tested in a realistic flood mapping scenario by using additional data from two flood events and the Sen1Floods11 dataset. The model trained using dual-polarized data outperforms the S-1FS significantly and increases the intersection over union (IoU) score by 5%. Using a weighted combination of the cross entropy and the Lovász loss increases the IoU score by another 2%. Geometric data augmentation degrades the performance while radiometric data augmentation leads to better testing results. FCN/DeepLabV3+/U-Net/U-Net++ do not perform significantly differently from AlbuNet-34. Models trained on data showing no distinct inundation perform very well in mapping the water extent during two flood events, reaching IoU scores of 0.96 and 0.94, respectively, and perform comparatively well on the Sen1Floods11 dataset.
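A hedged sketch of the loss combination evaluated above; lovasz_loss is assumed to be an available implementation (not defined here), and the weighting scheme is an illustrative default rather than the paper's exact formulation:

```python
# Sketch: weighting cross entropy against a Lovász loss. lovasz_loss is
# a hypothetical callable taking (logits, targets); the 50/50 weighting
# is an illustrative assumption.
import torch.nn as nn

class CombinedLoss(nn.Module):
    def __init__(self, lovasz_loss, ce_weight=0.5, class_weights=None):
        super().__init__()
        self.ce = nn.CrossEntropyLoss(weight=class_weights)  # weighted CE
        self.lovasz = lovasz_loss
        self.w = ce_weight

    def forward(self, logits, targets):
        return (self.w * self.ce(logits, targets)
                + (1 - self.w) * self.lovasz(logits, targets))
```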
... Therefore, an encoder-decoder architecture with skip connections like U-Net [37] was selected as the baseline feature-learning framework, which has established its efficiency in many segmentation tasks with limited amounts of data, e.g., medical [39] and satellite imagery tasks [40]; the evaluation criteria for the baseline selection are further elucidated in the supplementary materials. Moreover, in order to achieve rapid processing speed, we implement a lightweight U-Net for the task of dendrite segmentation, with MobileNet [41,42] as the encoder and a reduced form of the U-Net decoder, incurring only slight performance degradation while greatly improving the inference speed of the network. ...
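A hedged sketch of such an encoder swap, using torchvision's MobileNetV2 feature extractor in place of the U-Net encoder; the stage split points are assumptions, and the cited work's reduced decoder is not reproduced:

```python
# Sketch: torchvision's MobileNetV2 backbone as a lightweight U-Net
# encoder. The stage split points for skip connections are assumptions.
import torch
import torch.nn as nn
from torchvision import models

features = models.mobilenet_v2(weights="IMAGENET1K_V1").features
encoder_stages = nn.ModuleList([
    features[:2],    # 1/2 resolution
    features[2:4],   # 1/4
    features[4:7],   # 1/8
    features[7:14],  # 1/16
])

def encode(x):
    """Run the backbone, keeping one feature map per stage for skips."""
    skips = []
    for stage in encoder_stages:
        x = stage(x)
        skips.append(x)
    return skips  # consumed by a reduced U-Net decoder (not shown)

skips = encode(torch.randn(1, 3, 256, 256))
print([s.shape[1:] for s in skips])
```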
Article
The dendrite morphology significantly affects the formation of micro-segregation, intermetallic precipitation, and rheology of the mushy zone during solidification. Among the parameters that describe the morphology of dendrites, the specific interface area is critical since it characterizes the overall morphology of dendrites in a universal and general sense. In this work, a novel radiography-based method has been proposed to achieve in-situ determination of the evolving specific interface area during thin-sample solidification. Employing our proposed deep-learning-based dendrite segmentation model and image processing method, a generally authentic three-dimensional solidification microstructure can be obtained, which underlies the measurement of specific interface area from radiographs with negligible relative error. This method can be employed to study the evolution of 3D microstructure under large cooling rates requiring high temporal resolution. Based on this method, the morphology evolution of the overall solidification microstructure during directional solidification of Al-15 wt% Cu alloy has been studied. The results indicate that the evolution of interfacial area density SV conforms to the relation suggested by Rath with asymmetric law. Besides, the asymmetric evolution of SV concerning solid fraction can be attributed to the concurrent growth and coarsening of the solid phase with uneven change rates under temperature gradient.
... Handcrafted (color and texture) and DL-based features were compared in [83] for angioectasia lesion detection in a database of 600 frames, with a reported sensitivity and specificity of 62% and 78%, respectively. A DL-based architecture for pixel-wise segmentation was proposed in [84] using the AlbuNet and TernausNet networks, and a Dice coefficient of 85% was reported on a database of 600 images. Although it is stated that it can be used for detection purposes, the paper does not present these results. ...
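For reference, the Dice coefficient reported above measures the overlap between a predicted mask P and a ground-truth mask G as 2|P∩G| / (|P| + |G|); a minimal NumPy sketch:

```python
# Sketch: Dice coefficient between a predicted and a ground-truth
# binary mask, as used to report the 85% figure above.
import numpy as np

def dice(pred, truth, eps=1e-8):
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / (pred.sum() + truth.sum() + eps)
```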
Thesis
Full-text available
The wireless capsule endoscopy is a medical device with the main advantage of being able to visualize the whole gastrointestinal tract. This non-invasive exam is especially used for the diagnosis of small bowel pathologies, since conventional endoscopy is not able to visualize this organ. To analyze these exams the medical staff need specialized training, and it was recently proven that the massive quantity of images that are generated leads to medical errors and consequently the underdiagnosis of certain pathologies. In this thesis the main objective was to develop systems for automatic detection of different lesions present in the small bowel. These developments included the use of segmentation algorithms based on probabilistic methods (namely Expectation-Maximization), with the presentation of an acceleration method and a new approach for improving the borders of the segmentation based on Markov Random Fields. Beyond that, several supervised classification strategies were studied, with the use of single classifiers and ensemble classifiers for detection of single lesions, and convolutional neural networks and instance segmentation for multipathology detection and segmentation. With the support of the Hospital of Braga, a clinical study was performed with the developed method for angioectasia detection. This work had the main purpose of comparing the efficiency and performance of the method with the performance of different physicians when analyzing wireless capsule endoscopy exams. The developed methods were tested in different applications and it was found that the performance was improved when compared to the most recent literature. It is important to state that all this work allowed us to conclude that these systems need greater adoption in clinical practice. While there are many advances in computer vision methods for lesion detection, better clinical studies and larger public databases are still lacking to improve the testing of these methodologies.
... Jonas et al. proposed a transfer learning approach [28] which utilizes the ResNet34 encoder. The authors extended AlbuNet [29], proposed by A. Shvets et al., and dropped the T1 modality from the BraTS2020 dataset to match the 3-channel input of ResNet34. ...
Chapter
Full-text available
The incidence of gliomas has been on the rise, and they are the most common malignant brain tumours diagnosed upon medical appointments. A common approach to identify and diagnose brain tumours is to use Magnetic Resonance Imaging (MRI) to pinpoint tumour regions. However, manual segmentation of brain tumours is highly time-consuming and challenging due to the multimodal structure of MRI scans coupled with the task of delineating boundaries of different brain tissues. As such, there is a need for automated and accurate segmentation techniques in the medical domain to reduce both time and task complexity. Various Deep Learning techniques such as Convolutional Neural Networks (CNN) and Fully Convolutional Networks (FCN) have been introduced to address this challenge with promising segmentation results on various datasets. FCNs such as U-Net in recent literature achieve state-of-the-art performance on segmentation tasks and have been adapted to tackle various domains. In this paper, we propose an improved extension of an existing transfer learning method on the Brain Tumour Segmentation (BraTS) 2020 dataset, achieving marginally better results compared to the original approach.
... Presently, all DL techniques for CTR calculation are based on the U-Net model, the most successful convolutional network for biomedical image segmentation [13]. Since its conception, U-Net has inspired many successors, and previous studies have reported that modification of the encoder architecture can further improve accuracy [14][15][16][17]. While DL techniques for CTR calculation have been technically validated, only two reports [9,11] with a small sample size (n = 100) were conducted in the clinical setting. ...
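A hedged sketch of the measurement these models automate: the CTR is commonly taken as the maximal horizontal cardiac width divided by the maximal horizontal thoracic width, both of which can be read off binary segmentation masks (the mask inputs below are hypothetical):

```python
# Sketch: CTR from binary segmentation masks. heart_mask and thorax_mask
# are hypothetical 2D boolean arrays from a segmentation network.
import numpy as np

def max_width(mask):
    """Widest horizontal extent (in pixels) of a binary mask."""
    cols = np.where(mask.any(axis=0))[0]  # columns touched by the structure
    return cols.max() - cols.min() + 1

def ctr(heart_mask, thorax_mask):
    return max_width(heart_mask) / max_width(thorax_mask)
```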
Article
Full-text available
Background Artificial Intelligence (AI) is a promising tool for cardiothoracic ratio (CTR) measurement that has been technically validated but not clinically evaluated on a large dataset. We observed and validated AI and manual methods for CTR measurement using a large dataset and investigated the clinical utility of the AI method. Methods Five thousand normal chest X-rays and 2,517 images with cardiomegaly and CTR values were analyzed using manual, AI-assisted, and AI-only methods. AI-only methods obtained CTR values from a VGG-16 U-Net model. In-house software was used to aid the manual and AI-assisted measurements and to record operating time. Intra- and inter-observer experiments were performed on manual and AI-assisted methods and the averages were used in a method variation study. AI outcomes were graded in the AI-assisted method as excellent (accepted by both users independently), good (required adjustment), and poor (failed outcome). A Bland–Altman plot with coefficient of variation (CV), and the coefficient of determination (R-squared), were used to evaluate agreement and correlation between measurements. Finally, the performance of a cardiomegaly classification test was evaluated using a CTR cutoff at the standard (0.5), optimum, and maximum sensitivity. Results Manual CTR measurements on cardiomegaly data were comparable to previous radiologist reports (CV of 2.13% vs 2.04%). The observer and method variations from the AI-only method were about three times higher than from the manual method (CV of 5.78% vs 2.13%). AI assistance resulted in 40% excellent, 56% good, and 4% poor grading. AI assistance significantly improved agreement on inter-observer measurement compared to manual methods (CV; bias: 1.72%; − 0.61% vs 2.13%; − 1.62%) and was faster to perform (2.2 ± 2.4 secs vs 10.6 ± 1.5 secs). The R-squared and classification tests were not reliable indicators to verify that the AI-only method could replace manual operation. Conclusions AI alone is not yet suitable to replace manual operations due to its high variation, but it is useful to assist the radiologist because it can reduce observer variation and operation time. Agreement of measurement should be used to compare AI and manual methods, rather than R-squared or classification performance tests.
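A minimal sketch of the agreement statistics used above: Bland–Altman bias (mean difference), 95% limits of agreement, and one common within-subject coefficient-of-variation estimate for paired measurements (the data below are placeholders):

```python
# Sketch: Bland-Altman bias, 95% limits of agreement, and a common
# within-subject CV estimate for paired measurements (placeholder data).
import numpy as np

a = np.array([0.48, 0.52, 0.55, 0.61])  # e.g. manual CTR values
b = np.array([0.47, 0.53, 0.56, 0.60])  # e.g. AI-assisted CTR values

diff = a - b
bias = diff.mean()                      # Bland-Altman bias
sd = diff.std(ddof=1)
loa = (bias - 1.96 * sd, bias + 1.96 * sd)
cv = 100 * np.sqrt((diff ** 2).mean() / 2) / np.concatenate([a, b]).mean()

print(f"bias={bias:.4f}, LoA=({loa[0]:.4f}, {loa[1]:.4f}), CV={cv:.2f}%")
```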
... However, there is a compelling need for deep convolutional neural networks on mobile devices and in embedded systems. This is particularly important for video processing in, for example, autonomous cars and medical applications [5], [6], which demand high-accuracy, real-time object recognition. ...
Preprint
Full-text available
Modern mobile neural networks with a reduced number of weights and parameters do a good job with image classification tasks, but even they may be too complex to be implemented in an FPGA for video processing tasks. The article proposes a neural network architecture for the practical task of recognizing images from a camera, which has several advantages in terms of speed. This is achieved by reducing the number of weights, moving from floating-point to fixed-point arithmetic, and through a number of hardware-level optimizations associated with storing weights in blocks, a shift register, and an adjustable number of convolutional blocks that work in parallel. The article also proposes methods for adapting an existing dataset to a different task. As the experiments showed, the proposed neural network copes well with real-time video processing even on cheap FPGAs.
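A hedged sketch of the float-to-fixed-point weight conversion such designs rely on, here targeting a signed 8-bit representation with four fractional bits (the Q-format choice is illustrative, not the article's exact scheme):

```python
# Sketch: quantizing float weights to signed 8-bit fixed point for FPGA
# deployment. Four fractional bits (a Q3.4-style format) is illustrative.
import numpy as np

def to_fixed_point(weights, frac_bits=4, total_bits=8):
    scale = 2 ** frac_bits
    lo, hi = -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1
    q = np.clip(np.round(weights * scale), lo, hi).astype(np.int8)
    return q  # stored in block RAM; recover values as q / scale

w = np.array([-0.71, 0.033, 1.9])
print(to_fixed_point(w))  # [-11   1  30] -> q/16 ~ [-0.6875, 0.0625, 1.875]
```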
... Vemuri et al. [3] give an exhaustive overview of applications of computer vision and machine learning in gastrointestinal (GI) endoscopy. Shvets et al. [4], Sornapudi et al. [5] and Hajabdollahi et al. [6] contributed to segmentation of images acquired by wireless capsule endoscopy (WCE). The employed models range from the standard multilayer perceptron (MLP) to more sophisticated architectures like TernausNet and region-based CNNs. ...
Article
Full-text available
Minimally invasive surgery is increasingly utilized for mitral valve repair and replacement. The intervention is performed with an endoscopic field of view on the arrested heart. Extracting the necessary information from the live endoscopic video stream is challenging due to the moving camera position, the high variability of defects, and occlusion of structures by instruments. During such minimally invasive interventions there is no time to segment regions of interest manually. We propose a real-time-capable deep-learning-based approach to detect and segment the relevant anatomical structures and instruments. For the universal deployment of the proposed solution, we evaluate them on pixel accuracy as well as distance measurements of the detected contours. The U-Net, Google’s DeepLab v3, and the Obelisk-Net models are cross-validated, with DeepLab showing superior results in pixel accuracy and distance measurements.