Figure - available from: Neural Computing and Applications
Schematic diagram of face data augmentation. The real and augmented samples were generated by TL-GAN (transparent latent-space generative adversarial network) [40]


Source publication
Article
The quality and size of the training set have a great impact on the results of deep learning-based face-related tasks. However, collecting and labeling adequate samples with high quality and balanced distributions remains laborious and expensive work, and various data augmentation techniques have thus been widely used to enrich the training data...

Similar publications

Preprint
In this contribution, we show how to incorporate prior knowledge into a deep neural network architecture in a principled manner. We enforce feature space invariances using a novel layer based on invariant integration. This allows us to construct a complete feature space invariant to finite transformation groups. We apply our proposed layer to explicitly...
Article
A core challenge for both physics and artificial intelligence (AI) is symbolic regression: finding a symbolic expression that matches data from an unknown function. Although this problem is likely to be NP-hard in principle, functions of practical interest often exhibit symmetries, separability, compositionality, and other simplifying properties. I...
Article
Bridges play an important role in transportation, but because of overloading and natural factors, bridges will inevitably be damaged, which affects traffic and can even lead to major accidents. Therefore, timely and accurate identification of bridge damage is extremely necessary. Because of the great danger of manual detection, in order to identify the...
Article
A new non-linear variant of a quantitative extension of the uniform boundedness principle is used to show sharpness of error bounds for univariate approximation by sums of sigmoid and ReLU functions. Single hidden layer feedforward neural networks with one input node perform such operations. Errors of best approximation can be expressed using modul...

Citations

... Over the past decade, additional general image augmentations have been presented [5,7,17,18], some of which benefit facial analysis as well [19,20,21]. In addition, other works have presented face-specific augmentations that take advantage of experts' insights during training [19,22,20,23]. ...
... (3) If a controllable generation method is adopted, faces with specific features and attributes can be obtained. (4) Face data augmentation has some special advantages, such as generating faces without self-occlusion and building a balanced dataset with more intra-class variations [23]. ...
Preprint
The proliferation of deep learning solutions and the scarcity of large annotated datasets pose significant challenges in real-world applications. Various strategies have been explored to overcome this challenge, with data augmentation (DA) approaches emerging as prominent solutions. DA approaches involve generating additional examples by transforming existing labeled data, thereby enriching the dataset and helping deep learning models achieve improved generalization without succumbing to overfitting. One real application where deep learning solutions are widely used is facial expression recognition (FER), which plays an essential role in human communication and benefits a range of fields (e.g., medicine, security, and marketing). In this paper, we propose a simple and comprehensive face data augmentation approach based on mixed face component regularization that outperforms classical DA approaches from the literature, including MixAugment, an approach designed specifically for the target task, on two well-known FER datasets.
... These models have millions of trainable parameters, allowing them to capture intricate relationships and features in the input data. To generalize effectively in unconstrained real-world scenarios, these models require extensive exposure to diverse and varied examples during the training process [12]. Acquiring, collecting, and labeling a substantial number of authentic samples is commonly acknowledged as a labor-intensive, costly, and error-prone process. Moreover, existing datasets frequently suffer from imbalanced data distributions [13]. ...
Article
The Information Retrieval (IR) system aims to discover relevant documents and display them as query responses. However, the ever-changing nature of user queries poses a substantial research problem in defining the data needed to respond accurately. The main goal of this study is to enhance the retrieval of relevant information in response to user queries, and the aim is to develop an advanced IR system that adapts to changing user requirements. By introducing WMO_DBN, we seek to improve the efficiency and accuracy of information retrieval, catering to both general and specific user searches. The proposed methodology comprises three important steps: pre-processing, feature selection, and categorization. Initially, unstructured data is pre-processed to transform it into a structured format. Subsequently, relevant features are selected to optimize the retrieval process. The final step employs WMO_DBN, a novel deep learning model designed for information retrieval based on the query data. Additionally, similarity calculation is employed to improve the effectiveness of the network training model. An experimental evaluation of the proposed model was conducted, and its performance was measured in terms of recall, precision, accuracy, and F1 score. The results demonstrate the superiority of WMO_DBN in retrieving relevant information compared to traditional approaches. This research introduces a novel method for addressing the challenges of information retrieval through the integration of WMO_DBN. By applying pre-processing, feature selection, and a deep belief neural network, the proposed system achieves more accurate and efficient retrieval of relevant information. The study contributes to the advancement of information retrieval systems and emphasizes the importance of adapting to users' evolving search queries. The success of WMO_DBN in retrieving relevant information highlights its potential for enhancing retrieval processes in various applications.
... Using sophisticated forecasting techniques, such as those outlined by the study in [47], enhances the quality of input data for DEA models, thus increasing the accuracy and reliability of the analysis. In this context, Data Augmentation methods come into play by enriching the dataset with additional information, simulating a more robust environment to test our DEA model, and improving the reliability of the results [48]. This methodological approach ensures a strong foundation for our efficiency assessments, offering actionable insights with greater confidence in their validity. ...
Article
This research evaluates the production efficiency of broiler batches using the production efficiency factor and unit cost of production. Employing Data Envelopment Analysis (DEA), it considers variables like poultry housing, age at slaughter, feed consumed, mortality, and unit cost, with the total available weight as the output. Among 31 decision-making units (DMUs), only DMU 4 and DMU 23 approached maximum efficiency. Efficient DMUs serve as benchmarks for disseminating best practices to less efficient ones, enhancing overall efficiency and financial sustainability in poultry farming. This study highlights the significance of unit cost in evaluating production efficiency and proposes actionable insights for improving practices in the sector.
... • FER: 765 images from CK+48 with surprise, sad, happy, fearful and angry expressions; a calm emotion class was added from 162 manually cropped images taken from live streams (camera and YouTube (San Mateo, CA, USA)); in addition, the dataset was augmented with 186 images using a data augmentation generative adversarial network (DAGAN [41]); • SER: 1440 files from RAVDESS with surprise, sad, happy, fearful, angry, calm and disgust expressions; the disgust emotion files were eliminated; • TER: 47,291 short texts with surprise, sad, happy, fearful, angry and calm emotions. ...
Article
The paper aims to develop an information system for human emotion recognition in streaming data obtained from a PC or smartphone camera, using different methods of modality merging (image, sound and text). The objects of research are facial expressions, the emotional color of the tone of a conversation and the text transmitted by a person. The paper proposes different neural network structures for emotion recognition based on unimodal flows, as well as models for the merging of the multimodal data. The analysis determined that the best classification accuracy is obtained by systems that fuse data after processing each channel separately and obtaining individual characteristics. The final analysis of the model, based on data from a camera and microphone or a recording or broadcast of the screen received in "live" mode, made clear that the quality of the obtained results depends strongly on the quality of data preparation and labeling; the data on which the neural network is trained must be of high quality. The neural network with data combined on the penultimate layer achieves a psycho-emotional state recognition accuracy of 0.90. The spatial distribution of emotions was also analyzed for each data modality. The model with late fusion of multimodal data demonstrated the best recognition accuracy.
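A minimal PyTorch sketch of the late-fusion design that performed best above, where each modality is encoded separately and the penultimate-layer features are concatenated before classification. The encoder sizes, input feature dimensions, and six-class output are illustrative assumptions, not the authors' architecture.

    # Late fusion: per-modality encoders, features merged on the penultimate layer.
    import torch
    import torch.nn as nn

    class LateFusionNet(nn.Module):
        def __init__(self, img_dim=512, aud_dim=128, txt_dim=300, n_classes=6):
            super().__init__()
            self.img_enc = nn.Sequential(nn.Linear(img_dim, 64), nn.ReLU())
            self.aud_enc = nn.Sequential(nn.Linear(aud_dim, 64), nn.ReLU())
            self.txt_enc = nn.Sequential(nn.Linear(txt_dim, 64), nn.ReLU())
            self.classifier = nn.Linear(64 * 3, n_classes)  # fused penultimate layer

        def forward(self, img, aud, txt):
            fused = torch.cat([self.img_enc(img), self.aud_enc(aud),
                               self.txt_enc(txt)], dim=1)
            return self.classifier(fused)

    # Toy usage with random per-modality feature vectors (batch of 4).
    net = LateFusionNet()
    logits = net(torch.randn(4, 512), torch.randn(4, 128), torch.randn(4, 300))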
... Simultaneously, the conditional variational autoencoder (CVAE), a powerful deep learning method, employs an encoder-decoder framework with conditional variables: the encoder maps the input image into a latent space, compressing the high-dimensional image data into a lower-dimensional representation, while the decoder reconstructs the image from the latent-space representation. The CVAE has been demonstrated to be effective in classification tasks in the fields of structural health monitoring [18,19], biomedicine [20,21], and face recognition [22,23]. ...
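For readers who want this structure spelled out, here is a minimal CVAE sketch in PyTorch: the encoder compresses the input into a latent mean and log-variance, and the decoder reconstructs it from the sampled code and the condition. All layer sizes, the flattened 784-dimensional input, and the 10-class condition are illustrative assumptions, not details from the cited works.

    # Minimal conditional variational autoencoder (CVAE) sketch.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CVAE(nn.Module):
        def __init__(self, x_dim=784, y_dim=10, z_dim=20, h_dim=400):
            super().__init__()
            # Encoder: (image, condition) -> latent mean and log-variance.
            self.enc = nn.Linear(x_dim + y_dim, h_dim)
            self.mu = nn.Linear(h_dim, z_dim)
            self.logvar = nn.Linear(h_dim, z_dim)
            # Decoder: (latent code, condition) -> reconstructed image.
            self.dec1 = nn.Linear(z_dim + y_dim, h_dim)
            self.dec2 = nn.Linear(h_dim, x_dim)

        def forward(self, x, y):
            h = F.relu(self.enc(torch.cat([x, y], dim=1)))
            mu, logvar = self.mu(h), self.logvar(h)
            z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization
            x_hat = torch.sigmoid(self.dec2(F.relu(self.dec1(torch.cat([z, y], dim=1)))))
            return x_hat, mu, logvar

    def cvae_loss(x_hat, x, mu, logvar):
        # Reconstruction term plus KL divergence to the standard normal prior.
        bce = F.binary_cross_entropy(x_hat, x, reduction="sum")
        kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return bce + kld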
Article
In this paper, vibration-based image representation and data fusion demonstrate a distinctive benefit in feature extraction, yielding superior performance for damage identification in railway engineering. Specifically, based on vehicle-track coupled dynamics, rail vibration datasets under diverse fastener damage conditions are generated. By converting 1-D vibration signals into 2-D grayscale images with recurrence plots (RPs), with the aid of a conditional variational autoencoder (CVAE), the acceleration RPs and displacement RPs are fused to enhance feature extraction. It is demonstrated that detecting the variation in texture patterns and color distribution of the vibration-based images facilitates effective damage identification, mitigating the sensitivity of damage recognition to the deterioration of track irregularity. The results show that the displacement RPs, characterised by quasi-static features, are more suitable for fastener damage identification. Further, by employing data fusion that combines both the random dynamic features of the acceleration RPs and the quasi-static features of the displacement RPs, the tolerance of the measurement range for accurate fastener damage identification can be extended. The robustness of the proposed method is validated by testing different sampling frequencies and additional noise.
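For readers unfamiliar with recurrence plots, the following NumPy sketch shows the core of the 1-D-signal-to-2-D-image conversion described above; the toy signal and the grayscale mapping are illustrative assumptions, since the paper's exact RP settings are not given here.

    # Recurrence plot: compare every pair of time steps of a 1-D signal.
    import numpy as np

    def recurrence_plot(x):
        # Pairwise distances between all time steps: d[i, j] = |x_i - x_j|.
        d = np.abs(x[:, None] - x[None, :])
        # Map distances to a grayscale image (a thresholded binary RP is
        # another common variant).
        return np.uint8(255 * (1.0 - d / d.max()))

    t = np.linspace(0, 4 * np.pi, 256)
    signal = np.sin(t) + 0.05 * np.random.randn(256)  # toy vibration signal
    rp_image = recurrence_plot(signal)                # 256x256 grayscale image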
... After the images were cropped and resized to fit the network requirements, the dataset was extended by rotating the images. This was an effective way to increase the diversity and size of the dataset, thus improving the performance and accuracy of the model [40]. Through data augmentation, we obtained images at different angles, orientations, and perspectives, making the model more robust and better able to handle images from different scenes. ...
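A minimal torchvision sketch of the crop-resize-rotate pipeline this passage describes; the crop size, the 224x224 target size, the rotation range, and the file name are assumed values, as the excerpt does not state them.

    # Crop, resize to the network input size, then rotate to extend the dataset.
    from PIL import Image
    from torchvision import transforms

    augment = transforms.Compose([
        transforms.CenterCrop(400),             # crop to the region of interest
        transforms.Resize((224, 224)),          # resize to fit network requirements
        transforms.RandomRotation(degrees=30),  # random rotation up to +/-30 degrees
    ])

    img = Image.open("jujube.jpg")  # hypothetical input image
    variants = [augment(img) for _ in range(8)]  # eight augmented variants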
Article
Winter jujube (Ziziphus jujuba Mill. cv. Dongzao) has been cultivated in China for a long time and has a rich history; its maturity grade determines its postharvest quality. Traditional methods for identifying the fundamental quality of winter jujube are time-consuming and labor-intensive, creating significant difficulties for winter jujube resource management. Applications of deep learning in this regard will help manufacturers and orchard workers quickly identify fundamental quality information. In our study, the best fundamental quality of winter jujube was determined from the correlation between maturity and fundamental quality by testing three simple physicochemical indexes: total soluble solids (TSS), total acid (TA) and puncture force of fruit at five maturity stages, which were classified by color and appearance. The results showed that the fully red fruits (the 4th grade) had the optimal eating quality parameters. Additionally, five maturity grades of winter jujube were photographed as datasets and used to train the ResNet-50 and iResNet-50 models. The iResNet-50 model was improved to overlap double residuals in the first Main Stage, achieving an accuracy of 98.35%, a precision of 98.40%, a recall of 98.35%, and an F1 score of 98.36%, which provides an important basis for automatic fundamental quality detection of winter jujube. This study provides ideas for fundamental quality classification of winter jujube during harvesting, fundamental quality screening in assembly line production, and real-time monitoring during transportation and storage.
... [13] reviews GAN architectures for imbalanced learning in computer vision tasks. [29] reviews Generative Adversarial Network architectures for medical imaging. [30,31] and [32] review face data augmentation techniques. ...
Article
The generation of synthetic data can be used for anonymization, regularization, oversampling, semi-supervised learning, self-supervised learning, and several other tasks. Such broad potential motivated the development of new algorithms, specialized in data generation for specific data formats and Machine Learning (ML) tasks. However, one of the most common data formats used in industrial applications, tabular data, is generally overlooked: literature analyses are scarce, state-of-the-art methods are spread across domains and ML tasks, and there is little to no distinction among the main types of mechanisms underlying synthetic data generation algorithms. In this paper, we analyze tabular and latent-space synthetic data generation algorithms. Specifically, we propose a unified taxonomy as an extension and generalization of previous taxonomies, review 70 generation algorithms across six ML problems, group the generation mechanisms identified into six categories, describe each type of generation mechanism, discuss metrics to evaluate the quality of synthetic data, and provide recommendations for future research. We expect this study to assist researchers and practitioners in identifying relevant gaps in the literature and designing better and more informed practices with synthetic data.
... Among the various existing RV forecasting models, ANN models stand out due to their higher forecast precision and robustness in various contexts (Bucci, 2020; D'Ecclesia & Clementi, 2021; Li, 2022; Souto, 2023; Souto & Moradi, 2023). Nonetheless, ANN models require a substantial amount of data for their superiority to become clear (Wang et al., 2020; Wen et al., 2021). As a result, ANN models tend not to yield remarkably accurate and robust forecasts for novel stocks that have been publicly traded for less than seven years, since there is not enough data for proper parameter optimization (Wen et al., 2021). ...
... As a result, ANN models tend not to yield remarkably accurate and robust forecasts for novel stocks that have been publicly traded for less than seven years, since there is not enough data for proper parameter optimization (Wen et al., 2021). Though data augmentation can address data scarcity in various cases (Guennec et al., 2016; Hernández & König, 2019; Wang et al., 2020), there is still no ideal data augmentation procedure for time-series data (Wen et al., 2021). ...
... Data augmentation is an effective method to generate augmented views in SSCL [46], [201]. The widely used methods for time series data include jitter, scaling, rotation, permutation, and warping [33], [114]-[116], [202]. ...
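A minimal NumPy sketch of three of the augmentations listed above (jitter, scaling, permutation); the noise level, scale spread, and segment count are illustrative choices, not values from the cited works.

    # Simple time-series augmentations used to create views for SSCL.
    import numpy as np

    rng = np.random.default_rng(0)

    def jitter(x, sigma=0.03):
        # Add small Gaussian noise to every time step.
        return x + rng.normal(0.0, sigma, size=x.shape)

    def scale(x, sigma=0.1):
        # Multiply the whole series by a random factor around 1.
        return x * rng.normal(1.0, sigma)

    def permute(x, n_segments=4):
        # Split the series into segments and shuffle their order.
        segments = np.array_split(x, n_segments)
        rng.shuffle(segments)
        return np.concatenate(segments)

    x = np.sin(np.linspace(0, 6 * np.pi, 128))  # toy series
    views = [jitter(x), scale(x), permute(x)]   # augmented views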
Preprint
Self-supervised learning (SSL) has recently achieved impressive performance on various time series tasks. The most prominent advantage of SSL is that it reduces the dependence on labeled data. Based on the pre-training and fine-tuning strategy, even a small amount of labeled data can achieve high performance. Compared with many published self-supervised surveys on computer vision and natural language processing, a comprehensive survey for time series SSL is still missing. To fill this gap, we review current state-of-the-art SSL methods for time series data in this article. To this end, we first comprehensively review existing surveys related to SSL and time series, and then provide a new taxonomy of existing time series SSL methods. We summarize these methods into three categories: generative-based, contrastive-based, and adversarial-based. All methods can be further divided into ten subcategories. To facilitate the experiments and validation of time series SSL methods, we also summarize datasets commonly used in time series forecasting, classification, anomaly detection, and clustering tasks. Finally, we present the future directions of SSL for time series analysis.
... Examples include CenterCrop, RandomCrop, RandomResizedCrop, and CropAndPad. Region cropping/padding methods are widely used in image classification and target detection tasks to cut out or remove the targets to be classified and detected, and both can effectively improve the generalization ability of the model [28]. ...
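A minimal sketch of the named transforms using the albumentations library, which implements all four; the sizes and probabilities are illustrative choices.

    # Region cropping/padding transforms for image augmentation.
    import albumentations as A
    import numpy as np

    augment = A.Compose([
        A.RandomResizedCrop(height=224, width=224, p=1.0),  # random crop, then resize
        A.CropAndPad(percent=(-0.1, 0.1), p=0.5),           # crop or pad by up to 10%
    ])
    # A.CenterCrop(height=200, width=200) and A.RandomCrop(height=200, width=200)
    # are drop-in alternatives for the first step.

    image = (np.random.rand(256, 256, 3) * 255).astype(np.uint8)  # toy image
    out = augment(image=image)["image"]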
Article
The deep learning-based image segmentation approach has evolved into the mainstream of target detection and shape characterization in microscopic image analysis. However, the accuracy and generalizability of deep learning approaches are still hindered by the insufficient data problem that results from the high expense of human and material resources for microscopic image acquisition and annotation. Generally, image augmentation can increase the amount of data in a short time by means of mathematical simulation, and has become a necessary module for deep learning-based material microscopic image analysis. In this work, we first review the commonly used image augmentation methods and divide more than 60 basic image augmentation methods into eleven categories based on different implementation strategies. Secondly, we conduct experiments to verify the effectiveness of various basic image augmentation methods for the image segmentation task of two classical material microscopic images using evaluation metrics with different applicabilities. The U-Net model was selected as a representative benchmark model for image segmentation tasks, as it is the classic and most widely used model in this field. We utilize this model to verify the improvement of segmentation performance by various augmentation methods. Then, we discuss the advantages and applicability of various image augmentation methods in the material microscopic image segmentation task. The evaluation experiments and conclusions in this work can serve as a guide for the creation of intelligent modeling frameworks in the materials industry.