FIGURE 3
Original image and initialization result.

Source publication
Article
Although Generative Adversarial Networks (GANs) have shown remarkable success in various computer vision tasks, they still face challenges in the image season style transfer task. In this paper, we propose multi-season Generative Adversarial Networks (MSGANs) aimed at transferring input images into other season styles. To improve the quality of the sim...

Contexts in source publication

Context 1
... R, G, and B represent the three color channels of an RGB image, respectively. Eq. 1 is the classical formula for converting an RGB image to a grayscale image, and its weights are in line with the human eye's relative perceptual sensitivity to each color. An example of the initialization result is shown in Fig. ...
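The classical formula referenced here is presumably the ITU-R BT.601 luminance weighting, Y = 0.299R + 0.587G + 0.114B; under that assumption, a minimal NumPy sketch of this grayscale initialization step:

```python
import numpy as np

def rgb_to_gray(rgb: np.ndarray) -> np.ndarray:
    """Convert an H x W x 3 RGB image to grayscale using the classical
    ITU-R BT.601 weights, which mirror the human eye's relative
    sensitivity to red, green, and blue."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb[..., :3] @ weights

# Example: a random 4 x 4 RGB image with values in [0, 255]
rgb = np.random.randint(0, 256, size=(4, 4, 3)).astype(np.float64)
gray = rgb_to_gray(rgb)
print(gray.shape)  # (4, 4)
```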

Similar publications

Article
To study numerical simulation methods for metro transfer stations, three-dimensional finite element models with and without the fluid-solid coupling effect are established, based on a metro transfer station in Jinan. The difference between the displacement of the diaphragm wall, as well as the ground surface settlement, under the two c...

Citations

... Hong et al. [7] introduced a self-attention mechanism in deep learning to blend Chinese landscape painting with classical private garden virtual scenes, while Li et al. [8] incorporated the Canny operator into the CNN model to sharpen the edges of generated transferred images. The latter category includes image style transfer based on CycleGAN proposed by Zhu et al. [9], a GAN approach for seasonal style transfer by Zhang et al. [10], a twin-GAN method for Chinese landscape painting style transfer by Way et al. [11], the integration of the Fourier transform and the MiDaS depth estimation model into GANs by Han et al. [12] to achieve artistic image style transfer, and the design by Li et al. [13] of an adversarial network model with a steganographic hidden-information embedding function to realize image transfer effects. Overall, while each of these research endeavours has its own strengths and limitations, collectively they have achieved commendable image style transfer results. ...
Article
To reduce the occurrence of information loss and distortion in image style transfer, a method is proposed for researching and designing image style transfer technology based on multi-scale convolutional neural network (CNN) feature fusion. Initially, the VGG19 model is designed for coarse and fine-scale networks to achieve multi-scale CNN feature extraction of target image information. Subsequently, while setting the corresponding feature loss function, an additional least-squares penalty parameter is introduced to balance the optimal total loss function. Finally, leveraging the characteristics of stochastic gradient descent iteration, image features are fused and reconstructed to obtain better style transfer images. Experimental evaluations utilize peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), information entropy (IE), and mean squared error (MSE) as metrics for assessing the transferred images, comparing them with three typical image style transfer methods. Results demonstrate that the proposed method achieves optimal performance across all metrics, realizing superior image style transfer effects.
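The four metrics named above (PSNR, SSIM, IE, MSE) are standard; a minimal sketch of how they could be computed with NumPy and scikit-image, assuming 8-bit grayscale inputs (illustrative only, not the paper's evaluation code):

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def mse(a: np.ndarray, b: np.ndarray) -> float:
    """Mean squared error between two images."""
    return float(np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2))

def information_entropy(img: np.ndarray) -> float:
    """Shannon entropy of the 8-bit intensity histogram, in bits."""
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

# Example on two random 8-bit grayscale images
a = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
b = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
print("MSE :", mse(a, b))
print("PSNR:", peak_signal_noise_ratio(a, b, data_range=255))
print("SSIM:", structural_similarity(a, b, data_range=255))
print("IE  :", information_entropy(b))
```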
... Outside the medical sphere, generative models find utility in NLP settings, particularly in text-to-image models like DALL·E 2 and Midjourney (Liao et al., 2022; Ramesh et al., 2022; Oppenlaender, 2022). Additionally, they are employed in style transfer and other aesthetic computer vision techniques (Cao et al., 2018; Liu et al., 2018; Palsson et al., 2018; Zhang and Wang, 2020). Within the biomedical realm, generative models have proven efficacious in creating virtual stains for unstained histopathological tissues that would typically undergo hematoxylin/eosin staining. ...
Article
Machine learning (ML) applications in medical artificial intelligence (AI) systems have shifted from traditional and statistical methods to increasing application of deep learning models. This survey navigates the current landscape of multimodal ML, focusing on its profound impact on medical image analysis and clinical decision support systems. Emphasizing challenges and innovations in addressing multimodal representation, fusion, translation, alignment, and co-learning, the paper explores the transformative potential of multimodal models for clinical predictions. It also highlights the need for principled assessments and practical implementation of such models, bringing attention to the dynamics between decision support systems and healthcare providers and personnel. Despite advancements, challenges such as data biases and the scarcity of “big data” in many biomedical domains persist. We conclude with a discussion on principled innovation and collaborative efforts to further the mission of seamless integration of multimodal ML models into biomedical practice.
... The algorithm can also generate candidate textures through different iterations to provide users with choices. However, this algorithm is not suitable for every texture; it works only for textures containing many repeated natural patterns, such as pebbles, leaves, shrubs, flowers, and branches [5]. Related researchers have developed a texture transfer algorithm based on image gradients. ...
... In practical application, the three coefficients are often all set to 1; substituting Eqs. (3)-(17) into Eqs. (3)-(16) gives the following formula (20). ...
Chapter
To address some inherent defects of generative adversarial network models in artistic image processing, such as unstable training, vanishing gradients, and mode collapse, a study on artistic image style transfer based on an artificial-intelligence generative adversarial network is put forward (overall method). Building on the generative adversarial network, the network model structure is optimized: spectral normalization is introduced and the residual structure is improved. Normalization constrains the parameter learning range of the network model, accelerates the model's learning, and has a certain regularization effect; in the generator network, a new residual structure is used to optimize signal propagation and reduce training error. Through identity mappings, the ResNet structure alleviates the vanishing gradients and network degradation caused by deep network layers. For evaluation, 10 subjects aged between 22 and 30, including 3 females and 7 males, were invited to participate in a test. None had previously seen the 30 pictures in the two groups, and they were not told which model produced each group. From the results, the stylized pictures produced by the proposed model were more popular among young viewers, with a preference rate of 58% and a better visual experience. Conclusion: experiments show that the improved algorithm outperforms the original algorithm in image style transfer.
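As a rough illustration of the two ingredients described above, spectral normalization and a residual structure, here is a minimal PyTorch sketch using the library's `torch.nn.utils.spectral_norm` utility; the channel sizes are arbitrary and the block is not the chapter's actual architecture:

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

class SNResidualBlock(nn.Module):
    """Residual block whose convolutions are spectrally normalized,
    constraining each layer's Lipschitz constant to stabilize GAN training."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            spectral_norm(nn.Conv2d(channels, channels, 3, padding=1)),
            nn.ReLU(inplace=True),
            spectral_norm(nn.Conv2d(channels, channels, 3, padding=1)),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Identity mapping: the skip connection lets gradients bypass the
        # convolutions, mitigating vanishing gradients in deep networks.
        return x + self.body(x)

x = torch.randn(1, 64, 32, 32)
print(SNResidualBlock(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```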
... Transfer learning (TL) is one of the emerging techniques that has been widely exploited in text processing [6, 26], speech recognition [23], and image processing [20, 25] to reuse learned semantic representations. Furthermore, G-MLTL uses the MMD distance in the "feature-label" subspace to minimize the distribution discrepancy between the source and target domains. ...
Article
Multi-label transfer learning, which aims to learn robust classifiers for the target domain by leveraging knowledge from a source domain, has received considerable attention recently. The core part of such research is similarity measurement. Nevertheless, the existing similarity measurement functions for probability distributions are still too simple to fully describe the similarity of probability distributions. To address this problem, we propose Multi-label Transfer Learning via Latent Graph Alignment (G-MLTL). G-MLTL uses subspace learning to make the feature distribution of the target domain consistent with the source domain. At the same time, G-MLTL decomposes the label matrix to ensure that data points sharing the same labels have identical latent semantic representations in the new reconstruction space. The proposed G-MLTL also focuses on directly utilizing latent graph alignment to guide the knowledge transfer process. Extensive experiments demonstrate that G-MLTL significantly outperforms existing multi-label transfer learning methods. Especially when the number of labels is more than four, the mean Average Precision of G-MLTL is higher than the baseline algorithm by 2.1-10.5%.
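The MMD distance mentioned in the citing context is the standard maximum mean discrepancy between distributions; a minimal NumPy sketch with a Gaussian kernel (the kernel choice and bandwidth are illustrative assumptions, not taken from G-MLTL):

```python
import numpy as np

def gaussian_kernel(x: np.ndarray, y: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Pairwise Gaussian (RBF) kernel matrix between row vectors of x and y."""
    d2 = np.sum(x**2, 1)[:, None] + np.sum(y**2, 1)[None, :] - 2 * x @ y.T
    return np.exp(-d2 / (2 * sigma**2))

def mmd2(x: np.ndarray, y: np.ndarray, sigma: float = 1.0) -> float:
    """Squared MMD: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]."""
    return float(gaussian_kernel(x, x, sigma).mean()
                 + gaussian_kernel(y, y, sigma).mean()
                 - 2 * gaussian_kernel(x, y, sigma).mean())

rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, (100, 8))   # "source domain" features
tgt = rng.normal(0.5, 1.0, (100, 8))   # shifted "target domain" features
print(mmd2(src, tgt))  # > 0 when the two distributions differ
```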
... translation, and MR motion correction [1]. Other potential generative models for unpaired image translation include Multimodal UNIT (MUNIT) [10], Disentangled Representation for Image-to-Image Translation++ (DRIT++) [17], Multi-Season GAN (MSGAN) [30], and StarGAN v2 [4]. ...
Preprint
Medical image translation has the potential to reduce the imaging workload, by removing the need to capture some sequences, and to reduce the annotation burden for developing machine learning methods. GANs have been used successfully to translate images from one domain to another, such as MR to CT. At present, paired data (registered MR and CT images) or extra supervision (e.g. segmentation masks) is needed to learn good translation models. Registering multiple modalities or annotating structures within each of them is a tedious and laborious task. Thus, there is a need to develop improved translation methods for unpaired data. Here, we introduce modified pix2pix models for the CT$\rightarrow$MR and MR$\rightarrow$CT tasks, trained with unpaired CT and MR data, and MRCAT pairs generated from the MR scans. The proposed modifications utilize the paired MR and MRCAT images to ensure good alignment between input and translated images, while unpaired CT images ensure the MR$\rightarrow$CT model produces realistic-looking CT and the CT$\rightarrow$MR model works well with real CT as input. The proposed pix2pix variants outperform baseline pix2pix, pix2pixHD and CycleGAN in terms of FID and KID, and generate more realistic-looking CT and MR translations.
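Of the two metrics used above, FID compares Gaussian fits of deep feature statistics; a sketch of the Fréchet distance computation from precomputed feature matrices, with the Inception feature extraction omitted (the random features below are placeholders):

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    """Frechet Inception Distance between two sets of feature vectors:
    ||mu_a - mu_b||^2 + Tr(C_a + C_b - 2 (C_a C_b)^(1/2))."""
    mu_a, mu_b = feats_a.mean(0), feats_b.mean(0)
    c_a = np.cov(feats_a, rowvar=False)
    c_b = np.cov(feats_b, rowvar=False)
    covmean = sqrtm(c_a @ c_b)
    if np.iscomplexobj(covmean):   # discard tiny numerical imaginary parts
        covmean = covmean.real
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(c_a + c_b - 2 * covmean))

rng = np.random.default_rng(0)
print(fid(rng.normal(size=(256, 64)), rng.normal(0.1, 1.0, size=(256, 64))))
```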
... Reference [33] proposed a dedicated data compression scheme. References [34-36] proposed wavelet transforms and feature comparison to convert data and compress the data volume. References [37, 38] proposed effective data management and flow strategies to improve data processing and analysis speed. ...
Article
This paper proposes a parallel computing analysis model, HPM, and analyzes the CPU–GPU parallel architecture based on this model. On this basis, we study the parallel optimization of the ray-tracing algorithm on the CPU–GPU parallel architecture, exploiting the parallelism between nodes, within each node's multi-core CPU, and on the GPU, which improves the computation speed of the ray-tracing algorithm. This paper uses space-division technology to partition the ground data, constructs a KD-tree organization structure, and improves the KD-tree construction method to reduce the algorithm's time complexity. The ground data are evenly distributed across the computing nodes, which use a combined CPU–GPU approach for parallel optimization. This method dramatically improves rendering speed while preserving image quality and provides an effective means of quickly generating photorealistic images.
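To make the space-division step concrete, a minimal median-split KD-tree build in Python follows; this is the generic textbook construction, not the paper's improved variant:

```python
import numpy as np

class KDNode:
    def __init__(self, point=None, axis=0, left=None, right=None):
        self.point, self.axis, self.left, self.right = point, axis, left, right

def build_kdtree(points: np.ndarray, depth: int = 0):
    """Recursively split points on alternating axes at the median.
    Median splitting keeps the tree balanced, giving O(n log n) build time."""
    if len(points) == 0:
        return None
    axis = depth % points.shape[1]
    points = points[points[:, axis].argsort()]
    mid = len(points) // 2
    return KDNode(point=points[mid], axis=axis,
                  left=build_kdtree(points[:mid], depth + 1),
                  right=build_kdtree(points[mid + 1:], depth + 1))

rng = np.random.default_rng(0)
tree = build_kdtree(rng.random((100, 3)))   # e.g. 3D scene points
print(tree.point, tree.axis)
```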
... These types of methods usually do not pay enough attention to the localization problem of detection and often produce missed detections and inaccurate locations. In addition, various types of neural networks have been applied in a range of applications, e.g., graph neural networks for creative works [12], LVQ neural networks for traffic prediction [13], generative adversarial networks (GANs) for style transfer [14], and 3D GANs for simulating creative stage scenes [15]. Most importantly, GGO detection requires a huge amount of computation in 3D CT and generally requires a more efficient detection method to meet actual needs in medical IoT. ...
Article
We present a 3D deep neural network known as URDNet for detecting ground-glass opacity (GGO) nodules in 3D CT images. Prior work on GGO detection repurposes classifiers over a large number of windows to perform detection, or fine-tunes by box regression based on a previous window classification step. Instead, we consider GGO detection as a multitarget regression problem that focuses on the location of GGOs. Furthermore, to capture multiscale information, we introduce a backbone network with a contracting-expanding structure similar to 2D U-net, but we inject the source CT inputs into each layer in the contracting pathway to prevent source information loss at different scales. Finally, we propose a two-stage training method for URDNet. In the first stage, the backbone of the network for feature extraction is trained, and in the second, the overall URDNet is fine-tuned based on the previously pretrained weights. By using this training method in conjunction with data augmentation and hard negative mining, URDNet can be effectively trained even on a small number of annotated CT images. We evaluate the proposed method on the LIDC-IDRI dataset. It achieves a sensitivity of 90.8% with only 1 false positive per scan. Experimental results show that our detection method achieves superior detection performance over state-of-the-art methods. Due to its simplicity and effectiveness, URDNet can be easily applied to medical IoT systems to improve the efficiency of overall health systems.
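A rough PyTorch sketch of the input-injection idea described above: each level of a contracting path concatenates a downsampled copy of the raw volume with the pooled features. The channel widths and depth are invented for illustration and do not reproduce URDNet:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InjectedContractingPath(nn.Module):
    """Contracting path that re-injects a downsampled copy of the raw
    input at every scale, so fine-grained source information is not
    lost as the spatial resolution shrinks."""
    def __init__(self, in_ch=1, widths=(16, 32, 64)):
        super().__init__()
        chans = [in_ch] + list(widths)
        # After the first level, each block also sees an in_ch-channel
        # downsampled copy of the input concatenated to the features.
        self.blocks = nn.ModuleList(
            nn.Conv3d(chans[i] + (in_ch if i > 0 else 0), chans[i + 1], 3, padding=1)
            for i in range(len(widths))
        )

    def forward(self, x):
        feats, src = x, x
        for i, block in enumerate(self.blocks):
            if i > 0:
                feats = F.max_pool3d(feats, 2)
                src = F.avg_pool3d(src, 2)                # downsampled raw input
                feats = torch.cat([feats, src], dim=1)    # inject source at this scale
            feats = F.relu(block(feats))
        return feats

x = torch.randn(1, 1, 32, 32, 32)  # toy 3D CT patch
print(InjectedContractingPath()(x).shape)  # torch.Size([1, 64, 8, 8, 8])
```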
Article
Unsupervised heterogeneous face translation requires obtaining heterogeneous images with the same identities at training time, limiting its use in unconstrained real-world scenarios. Taking a step further towards unconstrained heterogeneous face translation, the authors explore unsupervised zero-shot heterogeneous face translation for the first time, which is expected to synthesize images that resemble the style of target images and whose identities, preserved from the source domain, have never been seen in the target domain during training. Essentially, the asymmetry between heterogeneous faces under the zero-shot setting further exacerbates the distortion and blurring of the translated images. The authors therefore propose a novel frequency-structure-guided regularization, which jointly encourages the network to capture detailed textures and maintain identity consistency. Through extensive experimental validation and comparisons to several baseline methods on benchmark datasets, the authors verify the effectiveness of the proposed framework.
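The exact form of the frequency-structure-guided regularization is not given in this abstract; as a generic illustration of a frequency-domain penalty, the sketch below compares FFT amplitude spectra of translated and reference images (an assumed, simplified loss, not the authors' formulation):

```python
import torch

def amplitude_spectrum_loss(pred: torch.Tensor, ref: torch.Tensor) -> torch.Tensor:
    """L1 distance between 2D FFT amplitude spectra; penalizing amplitude
    differences encourages the output to keep high-frequency texture detail."""
    amp_pred = torch.fft.fft2(pred).abs()
    amp_ref = torch.fft.fft2(ref).abs()
    return (amp_pred - amp_ref).abs().mean()

pred = torch.rand(1, 3, 64, 64)   # translated image (toy data)
ref = torch.rand(1, 3, 64, 64)    # reference image (toy data)
print(amplitude_spectrum_loss(pred, ref))
```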
Article
In computer vision and artistic expression, the synthesis of visually compelling images and the transfer of artistic styles onto videos have gained significant attention. This research addresses the challenges in achieving realistic image synthesis and style transfer in the dynamic context of videos. Existing methods often struggle to maintain temporal coherence and fail to capture intricate details, prompting the need for innovative approaches. Conventional methods for image synthesis and style transfer in videos have difficulty preserving the natural flow of motion and consistency across frames. This research aims to bridge this gap by leveraging the power of Generative Adversarial Networks (GANs) to enhance the quality and temporal coherence of synthesized images in video sequences. While GANs have demonstrated success in image generation, their application to video synthesis and style transfer remains an underexplored domain. The research seeks to address this gap by proposing a novel methodology that optimizes GANs for video-specific challenges, aiming for realistic, high-quality, and temporally consistent results. Our approach involves the development of a specialized GAN architecture tailored for video synthesis, incorporating temporal-aware modules to ensure smooth transitions between frames. Additionally, a style transfer mechanism is integrated, enabling the transfer of artistic styles onto videos seamlessly. The model is trained on diverse datasets to enhance its generalization capabilities. Experimental results showcase the efficacy of the proposed methodology in generating lifelike images and seamlessly transferring styles across video frames. Comparative analyses demonstrate the superiority of our approach over existing methods, highlighting its ability to address the temporal challenges inherent in video synthesis and style transfer.
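The abstract does not specify the temporal-aware modules; a minimal generic temporal consistency penalty between consecutive stylized frames is sketched below (real systems typically warp frame t onto frame t+1 with optical flow before differencing, which is omitted here):

```python
import torch

def temporal_consistency_loss(frames: torch.Tensor) -> torch.Tensor:
    """Mean L1 difference between consecutive frames of a stylized clip
    (shape T x C x H x W); smaller values mean smoother transitions."""
    return (frames[1:] - frames[:-1]).abs().mean()

clip = torch.rand(8, 3, 64, 64)  # 8 stylized frames (toy data)
print(temporal_consistency_loss(clip))
```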