Fig 4 - uploaded by Ruslan Bainazarov
Absolute value of covariance matrices between dimensions of latent space. Dimensions are sorted in the same order as in Figure 2.

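The figure's quantity can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the latent dimensions, batch size, and random codes are assumptions, and real codes would come from an encoder.

```python
import numpy as np

# Hypothetical batch of 16-dimensional latent codes; in the paper these
# would be produced by encoding a corpus of texts.
rng = np.random.default_rng(0)
z = rng.standard_normal((1000, 16))

# np.cov expects variables in rows, so transpose: the result is a
# (16, 16) matrix of absolute covariances between latent dimensions.
abs_cov = np.abs(np.cov(z.T))

# Strong off-diagonal entries would indicate correlated (collapsed)
# latent dimensions; independent random codes are nearly uncorrelated.
print(abs_cov.shape)  # (16, 16)
```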

Source publication
Chapter
Full-text available
Variational Autoencoders play an important role in text generation tasks where a semantically consistent latent space is needed. However, training a VAE for text is not a trivial task due to the mode collapse issue. In this paper, an autoencoder with a binary latent space trained using the straight-through estimator is shown to have advantages over the VAE on text modelin...

Contexts in source publication

Context 1
... could be seen as a sort of latent collapse. Figure 4b shows that no such correlations are observed. The same covariance matrix for VAE in Figure 4a demonstrates latent collapse. ...
Context 2
... 4b shows that no such correlations are observed. The same covariance matrix for VAE in Figure 4a demonstrates latent collapse. Results similar to the binary case were obtained for Gumbel-softmax VAE. ...

Similar publications

Conference Paper
Full-text available
In this paper we present an overview of our participation in TRECVID 2019 [1]. We participated in the Ad-hoc Video Search (AVS) task and in the Description Generation and Matching and Ranking subtasks of the Video to Text (VTT) task. First, for the AVS task, we developed a system architecture that we call "Word2AudioVisualVec++" (W2AVV++), based on Word2Visu...

Citations

... We applied our multi-objective optimization software MOQA 16 to optimize the following three objectives: (1) the average of the prediction scores, (2) their standard deviation, and (3) solubility predicted by NetSolP 17. Note that MOQA is based on quantum annealing 18 and deep learning 19, and was previously applied to antimicrobial peptide design 16. See the Supplementary Information for the MOQA algorithm. ...
Article
Full-text available
In designing functional biological sequences with machine learning, the activity predictor tends to be inaccurate due to a shortage of data. Top-ranked sequences are thus unlikely to contain effective ones. This paper proposes to take prediction stability into account to provide domain experts with a reasonable list of sequences to choose from. In our approach, multiple prediction models are trained by subsampling the training set, and the multi-objective optimization problem, where one objective is the average activity and the other is the standard deviation, is solved. The Pareto front represents a list of sequences spanning the whole spectrum of activity and stability. Using this method, we designed VHH (Variable domain of Heavy chain of Heavy chain) antibodies based on a dataset obtained from deep mutational screening. To solve the multi-objective optimization, we employed our sequence design software MOQA, which uses quantum annealing. By applying several selection criteria to 19,778 designed sequences, five sequences were selected for wet-lab validation. One sequence, 16 mutations away from the closest training sequence, was successfully expressed and found to possess the desired binding specificity. Our whole-spectrum approach provides a balanced way of dealing with prediction uncertainty, and can possibly be applied to an extensive search of functional sequences.
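The Pareto-front idea in the abstract above can be illustrated with a minimal sketch. This is not the MOQA software; the candidate values are made up, and the two objectives are assumed to be "maximize mean activity" and "minimize standard deviation":

```python
# Minimal sketch of extracting a Pareto front over (mean activity, std),
# where a higher mean and a lower standard deviation are both preferred.
def pareto_front(points):
    """points: list of (mean_activity, std_dev). Returns non-dominated points."""
    front = []
    for i, (m_i, s_i) in enumerate(points):
        # A point is dominated if another point is at least as good in
        # both objectives and strictly better in at least one.
        dominated = any(
            (m_j >= m_i and s_j <= s_i) and (m_j > m_i or s_j < s_i)
            for j, (m_j, s_j) in enumerate(points) if j != i
        )
        if not dominated:
            front.append((m_i, s_i))
    return front

# Hypothetical (mean activity, std) pairs for four candidate sequences.
candidates = [(0.9, 0.30), (0.8, 0.10), (0.7, 0.05), (0.6, 0.20)]
print(pareto_front(candidates))  # [(0.9, 0.3), (0.8, 0.1), (0.7, 0.05)]
```

The surviving points form the "whole spectrum" the abstract describes: each trades some activity for some stability, and none is strictly beaten by another.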
... Finding noisy data in various datasets has always been important in traditional machine learning [3] and deep learning [21]. Using autoencoders is the most popular approach, and it has also proved useful in natural language processing [8]. ...
Article
Full-text available
The advancement of social media contributes to the growing amount of content shared on these platforms, which provide a convenient place for people to report various real-life events. Detecting these events with the help of natural language processing has received researchers’ attention, and various algorithms have been developed for this goal. In this paper, we propose a Semantic Modular Model (SMM) consisting of five modules, namely Distributional Denoising Autoencoder, Incremental Clustering, Semantic Denoising, Defragmentation, and Ranking and Processing. The proposed model aims to (1) cluster various documents and ignore the documents that might not contribute to the identification of events, and (2) identify more important and descriptive keywords. Compared to the state-of-the-art methods, the results show that the proposed model performs better at identifying events with lower ranks and extracting keywords for more important events in three English Twitter datasets: FACup, SuperTuesday, and USElection. The proposed method outperformed the best reported results in the mean keyword-precision metric by 7.9%.
... We used the bVAE implementation by Baynazarov and Piontkovskaya. 21 To predict antimicrobial activity, a deep-learning model with a gated recurrent unit is trained. See Tucs et al. 5 for details of the predictor. ...
Article
Full-text available
Increasing the variety of antimicrobial peptides is crucial to meeting the global challenge of multi-drug-resistant bacterial pathogens. While several deep-learning-based peptide design pipelines have been reported, they may not be optimal in data efficiency. High efficiency requires a well-compressed latent space, where optimization is likely to fail due to numerous local minima. We present a multi-objective peptide design pipeline based on a discrete latent space and the D-Wave quantum annealer, with the aim of solving the local-minima problem. To achieve multi-objective optimization, multiple peptide properties are encoded into a single score using non-dominated sorting. Our pipeline is applied to design therapeutic peptides that are antimicrobial and non-hemolytic at the same time. Of the 200 000 peptides designed by our pipeline, four proceeded to wet-lab validation. Three of them showed high antimicrobial activity, and two are non-hemolytic. Our results demonstrate how quantum-based optimizers can be exploited in real-world medical studies.
... Variational Autoencoders [27] (VAEs) play an important role in text generation tasks where a semantically consistent latent space is needed; however, VAE training generally suffers from mode collapse issues. The authors of [28] developed an autoencoder with a binary latent space using a straight-through estimator: experiments showed that this approach retains the main features of a VAE, e.g., semantic consistency and good latent space coverage, while not suffering from mode collapse, in addition to being much easier to train. One of the most successful uses of autoencoders is for noise removal. ...
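The straight-through estimator mentioned in this excerpt has a simple core, which the following NumPy sketch illustrates. It is not the cited implementation, and the function names are invented for illustration: the forward pass thresholds real-valued logits into a hard binary code, while the backward pass pretends the threshold was the identity and passes gradients through unchanged.

```python
import numpy as np

def binarize_forward(logits):
    """Forward pass: hard 0/1 latent code from real-valued logits."""
    return (logits > 0).astype(np.float64)

def binarize_backward(grad_output):
    """Backward pass under the straight-through estimator: the
    non-differentiable threshold is treated as identity, so the
    incoming gradient is passed through unchanged."""
    return grad_output

logits = np.array([-1.5, 0.2, 3.0, -0.1])
code = binarize_forward(logits)           # hard binary latent code
grad = binarize_backward(np.ones(4))      # gradient w.r.t. logits under STE
print(code)  # [0. 1. 1. 0.]
```

In an autodiff framework the same effect is typically obtained with a custom gradient rule for the thresholding op, so the binarization stays hard in the forward pass yet remains trainable.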
Article
Full-text available
One of the main issues in the navigation of underwater robots is accurate vehicle positioning, which heavily depends on the orientation estimation phase. The systems employed to this end are affected by different types of noise, mainly related to the sensors and to the irregular noise of the underwater environment. Filtering algorithms can reduce their effect if properly configured, but this process usually requires fine-tuning and time. This paper presents DANAE++, an improved denoising autoencoder based on DANAE (deep Denoising AutoeNcoder for Attitude Estimation), which is able to recover Kalman Filter (KF) IMU/AHRS orientation estimations from any kind of noise, independently of its nature. This deep-learning-based architecture has already proved to be robust and reliable, and its enhanced implementation achieves significant improvements in both results and performance. In fact, DANAE++ is able to denoise the three attitude angles at the same time, as verified also using the estimations provided by an extended KF. Further tests could make this method suitable for real-time applications in navigation tasks.
Article
The Gumbel-max trick is a method for drawing a sample from a categorical distribution given its unnormalized (log-)probabilities. Over the past years, the machine learning community has proposed several extensions of this trick to facilitate, e.g., drawing multiple samples, sampling from structured domains, or gradient estimation for error backpropagation in neural network optimization. The goal of this survey article is to present background on the Gumbel-max trick and to provide a structured overview of its extensions to ease algorithm selection. Moreover, it presents a comprehensive outline of the (machine learning) literature in which Gumbel-based algorithms have been leveraged, reviews commonly made design choices, and sketches a future perspective.
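The trick this survey covers can be demonstrated in a few lines. As a hedged sketch (the distribution and sample count are arbitrary): adding i.i.d. Gumbel(0, 1) noise to the log-probabilities and taking the argmax yields an exact sample from the categorical distribution.

```python
import numpy as np

rng = np.random.default_rng(42)

def gumbel_max_sample(log_probs, rng):
    # Gumbel(0, 1) noise via the inverse-CDF: -log(-log(U)), U ~ Uniform(0, 1)
    gumbel = -np.log(-np.log(rng.uniform(size=log_probs.shape)))
    return int(np.argmax(log_probs + gumbel))

# Works with unnormalized log-probabilities as well; the normalizing
# constant only shifts every entry by the same amount.
log_probs = np.log(np.array([0.1, 0.6, 0.3]))
samples = [gumbel_max_sample(log_probs, rng) for _ in range(10_000)]
freqs = np.bincount(samples, minlength=3) / len(samples)
print(freqs)  # approximately [0.1, 0.6, 0.3]
```

The extensions surveyed in the article (e.g. Gumbel-softmax) relax the argmax so that this sampling step becomes differentiable.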