Figure - available from: Applied Intelligence
Schematic diagram of the proposed algorithm


Source publication
Brain structure segmentation in Magnetic Resonance Images (MRI) is essential to the assessment and treatment of medical disorders, especially neuropsychiatric diseases. The key to semantic segmentation is to understand the low-level visual semantics and the high-level spatial semantics of the image. Due to the complex anatomical structures, the cur...

Citations

... To enhance semantic features, U-net++ [7] and U-net3+ [8] incorporated multiscale skip connections and reduced the semantic gap between the encoder and decoder. To enhance target features, Attention UW-Net [9], SVF-Net [10] and HT-Net [11] focused on target features and suppressed irrelevant ones. Dhamija et al. [12] proposed the USeg Transformer, which combines transformer-based and convolution-based encoders to segment medical images with high precision. ...
Medical image segmentation can provide a reliable basis for clinical analysis and diagnosis. However, this task is challenging due to low contrast, boundary ambiguity between organs or lesions and surrounding tissues, and image noise. To address this challenge, which is unique to medical images, and to further improve segmentation accuracy and precision, a medical image segmentation model (TransDiff) is proposed from the perspective of improving model robustness and enriching semantic information. TransDiff comprises three parts: a variational autoencoder (VAE), a diffusion transformer model and a Swin Transformer. The VAE constructs a latent space that provides an environment for fully extracting and fusing features. The diffusion model predicts and removes noise by inferring semantics through the propagation of information between nodes. The Swin Transformer serves as the conditional part and enriches discriminative features. TransDiff inherits the diffusion model's robustness to noise and missing data and the feature enrichment of the Swin Transformer, and thus exhibits a better understanding of semantic information. It performs well on medical datasets with three different image modalities, outperforms existing medical image segmentation methods in terms of segmentation precision and accuracy, and has good generalizability. The code and trained models will be publicly available at https://github.com/xiaoxiao1997/TransDiff.
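The phrase "predicts and removes noise" can be illustrated with the standard DDPM-style noising formula, x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε. The sketch below is a generic illustration under that assumption, not the TransDiff implementation; the abstract does not state the noise schedule or parameterization, and the helper names (`linear_betas`, `cumulative_alphas`, `add_noise`, `remove_noise`) and the default schedule values are hypothetical. In a real model, the noise passed to `remove_noise` would come from a trained network rather than being known.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed schedule, not from the paper)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta_t), usually written alpha-bar_t."""
    products, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        products.append(running)
    return products


def add_noise(x0, t, alpha_bar, rng):
    """Forward process: mix the clean signal with Gaussian noise at step t."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    signal_scale = math.sqrt(alpha_bar[t])
    noise_scale = math.sqrt(1.0 - alpha_bar[t])
    x_t = [signal_scale * x + noise_scale * e for x, e in zip(x0, eps)]
    return x_t, eps


def remove_noise(x_t, eps_hat, t, alpha_bar):
    """Invert the forward formula given a noise estimate eps_hat."""
    signal_scale = math.sqrt(alpha_bar[t])
    noise_scale = math.sqrt(1.0 - alpha_bar[t])
    return [(x - noise_scale * e) / signal_scale for x, e in zip(x_t, eps_hat)]


if __name__ == "__main__":
    alpha_bar = cumulative_alphas(linear_betas(100))
    clean = [0.0, 1.0, 1.0, 0.0]  # toy stand-in for a flattened mask
    noisy, eps = add_noise(clean, 50, alpha_bar, random.Random(0))
    # Using the true noise in place of a network prediction recovers the input.
    print(remove_noise(noisy, eps, 50, alpha_bar))
```

With a perfect noise estimate the clean signal is recovered exactly (up to floating-point error); segmentation quality in such models therefore depends on how well the network predicts the noise.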
... For example, Jo et al. [6] propagate features from the encoder to the decoder using skip connections with attention gates, which can automatically learn target structures of varying shapes and sizes. Recent works [7][8][9][10][11] have followed the encoder-decoder architecture and improved multi-scale feature integration strategies. However, limited by intrinsically localized receptive fields, CNN-based networks tend to neglect global spatial context and long-range pixel relationships [12][13][14][15]. ...
3D volumetric medical image segmentation is a crucial task in computer-aided diagnosis, but it remains challenging due to low contrast and boundary ambiguity between organs and surrounding tissues. Accurate boundary voxels are important for organ segmentation, which relies on rich, detailed feature information. Recent convolutional neural networks and transformer networks have attempted to enhance 2D boundaries during feature extraction, but few approaches focus on preserving boundary voxels in 3D scenarios. To address these issues, we propose the Hybrid Transformer-CNN with Boundary-awareness (HTCB-Net) network, which follows an encoder-decoder segmentation paradigm with learnable boundary modules. A learnable boundary extracting module (BEM) embeds an auxiliary object-related boundary map into the 3D Swin Transformer encoder, which helps the model obtain rich and discriminative feature representations. The boundary map obtained from the BEM explicitly supervises the feature extraction process. Subsequently, a boundary preserving module (BPM) adopts a novel fusion strategy that integrates the extracted boundary map with the corresponding encoder features. Through this module, boundary position awareness is combined with spatial complement and channel attention for feature enhancement. We evaluate the proposed method with quantitative experiments on three publicly available datasets covering CT and MRI modalities: OAI-ZIB, Spleen, and Pancreas. The comparative experimental results demonstrate that HTCB-Net preserves more precise 3D boundaries and obtains significant improvements, particularly in terms of Average Symmetric Surface Distance (ASSD).
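For readers unfamiliar with the ASSD metric named above, the following is a minimal sketch of one common definition: the distances from every surface point of each mask to the nearest surface point of the other mask, pooled and averaged. It is simplified to 2D binary masks with brute-force nearest-neighbour search; the function names, the 4-neighbour surface rule, and the `spacing` parameter are assumptions made for illustration and are not taken from the HTCB-Net paper. Some implementations instead average the two directed mean distances, which gives a different value when the two surfaces have different numbers of points.

```python
from math import hypot


def surface_pixels(mask):
    """Foreground pixels with a 4-neighbour that is background or off-grid."""
    rows, cols = len(mask), len(mask[0])
    surface = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                outside = not (0 <= nr < rows and 0 <= nc < cols)
                if outside or not mask[nr][nc]:
                    surface.append((r, c))
                    break
    return surface


def assd(pred, target, spacing=(1.0, 1.0)):
    """Pooled average symmetric surface distance between two binary masks."""
    pred_surface = surface_pixels(pred)
    target_surface = surface_pixels(target)
    if not pred_surface or not target_surface:
        raise ValueError("both masks must contain foreground pixels")

    def directed_sum(source, destination):
        # Sum of nearest-surface distances, scaled by physical pixel spacing.
        return sum(
            min(
                hypot((r - r2) * spacing[0], (c - c2) * spacing[1])
                for r2, c2 in destination
            )
            for r, c in source
        )

    total = directed_sum(pred_surface, target_surface)
    total += directed_sum(target_surface, pred_surface)
    return total / (len(pred_surface) + len(target_surface))


if __name__ == "__main__":
    target = [[0] * 5 for _ in range(5)]
    pred = [[0] * 5 for _ in range(5)]
    target[1][1] = 1  # single-pixel object
    pred[1][3] = 1    # same object shifted two columns
    print(assd(pred, target))
```

Lower values indicate closer agreement between predicted and reference boundaries, and identical masks give 0. For real 3D volumes, a distance-transform-based implementation would be far more efficient than this brute-force search.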