Dmitry Vetrov
National Research University Higher School of Economics | HSE · Faculty of Computer Sciences

PhD

About

Publications: 131
Reads: 19,214
Citations: 3,565

Skills and Expertise

Machine Learning

Support Vector Machine

January 2007 - July 2015

Lomonosov Moscow State University

Faculty of Computational Mathematics and Cybernetics
Moscow, Russia

Position

Professor (Associate)

Publications

Regularized Distribution Matching Distillation for One-step Unpaired Image-to-Image Translation

Preprint

Full-text available

Jun 2024

Diffusion distillation methods aim to compress the diffusion models into efficient one-step generators while trying to preserve quality. Among them, Distribution Matching Distillation (DMD) offers a suitable framework for training general-form one-step generators, applicable beyond unconditional generation. In this work, we introduce its modificati...

The Devil is in the Details: StyleFeatureEditor for Detail-Rich StyleGAN Inversion and High Quality Image Editing

Preprint

Jun 2024

The task of manipulating real image attributes through StyleGAN inversion has been extensively researched. This process involves searching latent variables from a well-trained StyleGAN generator that can synthesize a real image, modifying these latent variables, and then synthesizing an image with the desired edits. A balance must be struck between...

StyleDomain: Efficient and Lightweight Parameterizations of StyleGAN for One-shot and Few-shot Domain Adaptation

Conference Paper

Oct 2023

UnDiff: Unsupervised Voice Restoration with Unconditional Diffusion Model

Conference Paper

Aug 2023

HIFI++: A Unified Framework for Bandwidth Extension and Speech Enhancement

Conference Paper

Jun 2023

UnDiff: Unsupervised Voice Restoration with Unconditional Diffusion Model

Preprint

Full-text available

Jun 2023

This paper introduces UnDiff, a diffusion probabilistic model capable of solving various speech inverse tasks. Being once trained for speech waveform generation in an unconditional manner, it can be adapted to different tasks including degradation inversion, neural vocoding, and source separation. In this paper, we, first, tackle the challenging pr...

To Stay or Not to Stay in the Pre-train Basin: Insights on Ensembling in Transfer Learning

Preprint

Mar 2023

Transfer learning and ensembling are two popular techniques for improving the performance and robustness of neural networks. Due to the high cost of pre-training, ensembles of models fine-tuned from a single pre-trained checkpoint are often used in practice. Such models end up in the same basin of the loss landscape and thus have limited diversity....

Star-Shaped Denoising Diffusion Probabilistic Models

Preprint

Feb 2023

Methods based on Denoising Diffusion Probabilistic Models (DDPM) became a ubiquitous tool in generative modeling. However, they are mostly limited to Gaussian and discrete diffusion processes. We propose Star-Shaped Denoising Diffusion Probabilistic Models (SS-DDPM), a model with a non-Markovian diffusion-like noising process. In the case of Gaussi...

StyleDomain: Analysis of StyleSpace for Domain Adaptation of StyleGAN

Preprint

Dec 2022

Domain adaptation of GANs is a problem of fine-tuning the state-of-the-art GAN models (e.g. StyleGAN) pretrained on a large dataset to a specific domain with few samples (e.g. painting faces, sketches, etc.). While there are a great number of methods that tackle this problem in different ways there are still many important questions that remain una...

Finemap-MiXeR: A variational Bayesian approach for genetic finemapping

Preprint

Dec 2022

Discoveries from genome-wide association studies often contain large clusters of highly correlated genetic variants, which makes them hard to interpret. In such cases, finemapping the underlying causal variants becomes important. Here we present a new method, the Finemap-MiXeR, based on a variational Bayesian approach for finemapping genomic data, i...

HyperDomainNet: Universal Domain Adaptation for Generative Adversarial Networks

Preprint

Oct 2022

The domain adaptation framework for GANs has achieved great progress in recent years as a leading approach to training contemporary GANs in the case of very limited training data. In this work, we significantly improve this framework by proposing an extremely compact parameter space for fine-tuning the generator. We introduce a novel domain-modu...

Variational Autoencoders for Precoding Matrices with High Spectral Efficiency

Chapter

Sep 2022

Neural networks are used for channel decoding, channel detection, channel evaluation, and resource management in multi-input and multi-output (MIMO) wireless communication systems. In this paper, we consider the problem of finding precoding matrices with high spectral efficiency (SE) using variational autoencoder (VAE). We propose a computationally...

Training Scale-Invariant Neural Networks on the Sphere Can Happen in Three Regimes

Preprint

Sep 2022

A fundamental property of deep learning normalization techniques, such as batch normalization, is making the pre-normalization parameters scale invariant. The intrinsic domain of such parameters is the unit sphere, and therefore their gradient optimization dynamics can be represented via spherical optimization with varying effective learning rate (...

FFC-SE: Fast Fourier Convolution for Speech Enhancement

Preprint

Full-text available

Apr 2022

Fast Fourier convolution (FFC) is the recently proposed neural operator showing promising performance in several computer vision problems. The FFC operator allows employing large receptive field operations within early layers of the neural network. It was shown to be especially helpful for inpainting of periodic structures which are common in audio...

HiFi++: a Unified Framework for Neural Vocoding, Bandwidth Extension and Speech Enhancement

Preprint

Mar 2022

Generative adversarial networks have recently demonstrated outstanding performance in neural vocoding outperforming best autoregressive and flow-based models. In this paper, we show that this success can be extended to other tasks of conditional audio generation. In particular, building upon HiFi vocoders, we propose a novel HiFi++ general framewor...

Machine Learning Methods for Spectral Efficiency Prediction in Massive MIMO Systems

Preprint

Full-text available

Dec 2021

Channel decoding, channel detection, channel assessment, and resource management for wireless multiple-input multiple-output (MIMO) systems are all examples of problems where machine learning (ML) can be successfully applied. In this paper, we study several ML approaches to solve the problem of estimating the spectral efficiency (SE) value for a ce...

Leveraging Recursive Gumbel-Max Trick for Approximate Inference in Combinatorial Spaces

Preprint

Oct 2021

Structured latent variables allow incorporating meaningful prior knowledge into deep learning models. However, learning with such variables remains challenging because of their discrete nature. Nowadays, the standard learning approach is to define a latent variable as a perturbed algorithm output and to use a differentiable surrogate for training....

Automating Control of Overestimation Bias for Continuous Reinforcement Learning

Preprint

Full-text available

Oct 2021

Bias correction techniques are used by most of the high-performing methods for off-policy reinforcement learning. However, these techniques rely on a pre-defined bias correction policy that is either not flexible enough or requires environment-specific tuning of hyperparameters. In this work, we present a simple data-driven approach for guiding bia...

Quantization of Generative Adversarial Networks for Efficient Inference: a Methodological Study

Preprint

Full-text available

Aug 2021

Generative adversarial networks (GANs) have an enormous potential impact on digital content creation, e.g., photo-realistic digital avatars, semantic content editing, and quality enhancement of speech and images. However, the performance of modern GANs comes together with massive amounts of computations performed during the inference and high energ...

On the Periodic Behavior of Neural Network Training with Batch Normalization and Weight Decay

Preprint

Jun 2021

Despite the conventional wisdom that using batch normalization with weight decay may improve neural network training, some recent works show their joint usage may cause instabilities at the late stages of training. Other works, in contrast, show convergence to the equilibrium, i.e., the stabilization of training metrics. In this paper, we study thi...

Mean Embeddings with Test-Time Data Augmentation for Ensembling of Representations

Preprint

Full-text available

Jun 2021

Averaging predictions over a set of models -- an ensemble -- is widely used to improve predictive performance and uncertainty estimation of deep learning models. At the same time, many machine learning systems, such as search, matching, and recommendation systems, heavily rely on embeddings. Unfortunately, due to misalignment of features of indepen...

Towards Practical Credit Assignment for Deep Reinforcement Learning

Preprint

Jun 2021

Credit assignment is a fundamental problem in reinforcement learning, the problem of measuring an action's influence on future rewards. Improvements in credit assignment methods have the potential to boost the performance of RL algorithms on many tasks, but thus far have not seen widespread adoption. Recently, a family of methods called Hindsight C...

On Power Laws in Deep Ensembles

Preprint

Jul 2020

Ensembles of deep neural networks are known to achieve state-of-the-art performance in uncertainty estimation and lead to accuracy improvement. In this work, we focus on a classification problem and investigate the behavior of both non-calibrated and calibrated negative log-likelihood (CNLL) of a deep ensemble as a function of the ensemble size and...

Involutive MCMC: a Unifying Framework

Preprint

Jun 2020

Markov Chain Monte Carlo (MCMC) is a computational approach to fundamental problems such as inference, integration, optimization, and simulation. The field has developed a broad spectrum of algorithms, varying in the way they are motivated, the way they are applied and how efficiently they sample. Despite all the differences, many of them share the...

MARS: Masked Automatic Ranks Selection in Tensor Decompositions

Preprint

Jun 2020

Tensor decomposition methods have recently proven to be efficient for compressing and accelerating neural networks. However, the problem of optimal decomposition structure determination is still not well studied while being quite important. Specifically, decomposition ranks present the crucial parameter controlling the compression-accuracy trade-of...

Reintroducing Straight-Through Estimators as Principled Methods for Stochastic Binary Networks

Preprint

Full-text available

Jun 2020

Training neural networks with binary weights and activations is a challenging problem due to the lack of gradients and difficulty of optimization over discrete weights. Many successful experimental results have been recently achieved using the empirical straight-through estimation approach. This approach has generated a variety of ad-hoc rules for...

Deep Ensembles on a Fixed Memory Budget: One Wide Network or Several Thinner Ones?

Preprint

May 2020

One of the generally accepted views of modern deep learning is that increasing the number of parameters usually leads to better quality. The two easiest ways to increase the number of parameters are to increase the size of the network, e.g. width, or to train a deep ensemble; both approaches improve the performance in practice. In this work, we cons...

Controlling Overestimation Bias with Truncated Mixture of Continuous Distributional Quantile Critics

Preprint

May 2020

The overestimation bias is one of the major impediments to accurate off-policy learning. This paper investigates a novel way to alleviate the overestimation bias in a continuous control setting. Our method---Truncated Quantile Critics, TQC,---blends three ideas: distributional representation of a critic, truncation of critics prediction, and ensemb...

Structured Sparsification of Gated Recurrent Neural Networks

Article

Apr 2020

One of the most popular approaches for neural network compression is sparsification, i.e. learning sparse weight matrices. In structured sparsification, weights are set to zero by groups corresponding to structure units, e.g. neurons. We further develop the structured sparsification approach for the gated recurrent neural networks, e.g. Long Short-Te...

Low-Variance Black-Box Gradient Estimates for the Plackett-Luce Distribution

Article

Apr 2020

Learning models with discrete latent variables using stochastic gradient descent remains a challenge due to the high variance of gradient estimates. Modern variance reduction techniques mostly consider categorical distributions and have limited applicability when the number of possible outcomes becomes large. In this work, we consider models with l...

Deterministic Decoding for Discrete Data in Variational Autoencoders

Preprint

Mar 2020

Variational autoencoders are prominent generative models for modeling discrete data. However, with flexible decoders, they tend to ignore the latent codes. In this paper, we study a VAE model with a deterministic decoder (DD-VAE) for sequential data that selects the highest-scoring tokens instead of sampling. Deterministic decoding solely relies on...

Stochasticity in Neural ODEs: An Empirical Study

Preprint

Feb 2020

Stochastic regularization of neural networks (e.g. dropout) is a wide-spread technique in deep learning that allows for better generalization. Despite its success, continuous-time models, such as neural ordinary differential equation (ODE), usually rely on a completely deterministic feed-forward operation. This work provides an empirical study of s...

Greedy Policy Search: A Simple Baseline for Learnable Test-Time Augmentation

Preprint

Feb 2020

Test-time data augmentation---averaging the predictions of a machine learning model across multiple augmented samples of data---is a widely used technique that improves the predictive performance. While many advanced learnable data augmentation techniques have emerged in recent years, they are focused on the training phase. Such techniques are not...

Pitfalls of In-Domain Uncertainty Estimation and Ensembling in Deep Learning

Preprint

Feb 2020

Uncertainty estimation and ensembling methods go hand-in-hand. Uncertainty estimation is one of the main benchmarks for assessment of ensembling performance. At the same time, deep learning ensembles have provided state-of-the-art results in uncertainty estimation. In this work, we focus on in-domain uncertainty for image classification. We explore...

User-controllable Multi-texture Synthesis with Generative Adversarial Networks

Conference Paper

Jan 2020

MLRG Deep Curvature

Preprint

Dec 2019

We present the MLRG Deep Curvature suite, a PyTorch-based, open-source package for analysis and visualisation of neural network curvature and loss landscape. Despite providing rich information about the properties of neural networks and being useful for a variety of tasks, curvature information is still underused for various reasons, and our...

A Simple Method to Evaluate Support Size and Non-uniformity of a Decoder-Based Generative Model

Chapter

Dec 2019

Theoretical analysis in [1] suggested that adversarially trained generative models are naturally inclined to learn distribution with low support. In particular, this effect is caused by the limited capacity of the discriminator network. To verify this claim, [2] proposed a statistical test based on the birthday paradox that partially confirmed the...

Low-variance Black-box Gradient Estimates for the Plackett-Luce Distribution

Preprint

Nov 2019

Structured Sparsification of Gated Recurrent Neural Networks

Preprint

Nov 2019

Recently, a lot of techniques were developed to sparsify the weights of neural networks and to remove networks' structure units, e.g. neurons. We adjust the existing sparsification approaches to the gated recurrent architectures. Specifically, in addition to the sparsification of weights and neurons, we propose sparsifying the preactivations of gat...

A Prior of a Googol Gaussians: a Tensor Ring Induced Prior for Generative Models

Preprint

Full-text available

Oct 2019

Generative models produce realistic objects in many domains, including text, image, video, and audio synthesis. Most popular models---Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs)---usually employ a standard Gaussian distribution as a prior. Previous works show that the richer family of prior distributions may help to a...

Subspace Inference for Bayesian Deep Learning

Preprint

Full-text available

Jul 2019

Bayesian inference was once a gold standard for learning with neural networks, providing accurate full predictive distributions and well calibrated uncertainty. However, scaling Bayesian inference techniques to deep neural networks is challenging due to the high dimensionality of the parameter space. In this paper, we construct low-dimensional subs...

Uncertainty Estimation via Stochastic Batch Normalization

Chapter

Jun 2019

In this work, we investigate Batch Normalization technique and propose its probabilistic interpretation. We propose a probabilistic model and show that Batch Normalization maximizes the lower bound of its marginal log-likelihood. Then, according to the new probabilistic model, we design an algorithm which acts consistently during train and test. Ho...

The Implicit Metropolis-Hastings Algorithm

Preprint

Full-text available

Jun 2019

Recent works propose using the discriminator of a GAN to filter out unrealistic samples of the generator. We generalize these ideas by introducing the implicit Metropolis-Hastings algorithm. For any implicit probabilistic model and a target distribution represented by a set of samples, implicit Metropolis-Hastings operates by learning a discriminat...

Importance Weighted Hierarchical Variational Inference

Preprint

May 2019

Variational Inference is a powerful tool in the Bayesian modeling toolkit, however, its effectiveness is determined by the expressivity of the utilized variational distributions in terms of their ability to match the true posterior distribution. In turn, the expressivity of the variational family is largely limited by the requirement of having a tr...

Semi-Conditional Normalizing Flows for Semi-Supervised Learning

Preprint

Full-text available

May 2019

This paper proposes a semi-conditional normalizing flow model for semi-supervised learning. The model uses both labelled and unlabeled data to learn an explicit model of joint distribution over objects and labels. Semi-conditional architecture of the model allows us to efficiently compute a value and gradients of the marginal likelihood for unlabel...

User-Controllable Multi-Texture Synthesis with Generative Adversarial Networks

Preprint

Full-text available

Apr 2019

We propose a novel multi-texture synthesis model based on generative adversarial networks (GANs) with a user-controllable mechanism. The user control ability allows one to explicitly specify the texture which should be generated by the model. This property follows from using an encoder part which learns a latent representation for each texture from the...

A Simple Baseline for Bayesian Uncertainty in Deep Learning

Preprint

Full-text available

Feb 2019

We propose SWA-Gaussian (SWAG), a simple, scalable, and general purpose approach for uncertainty representation and calibration in deep learning. Stochastic Weight Averaging (SWA), which computes the first moment of stochastic gradient descent (SGD) iterates with a modified learning rate schedule, has recently been shown to improve generalization i...

Efficient Language Modeling with Automatic Relevance Determination in Recurrent Neural Networks

Conference Paper

Jan 2019

Bayesian Sparsification of Gated Recurrent Neural Networks

Preprint

Dec 2018

Bayesian methods have been successfully applied to sparsify weights of neural networks and to remove structure units from the networks, e.g. neurons. We apply and further develop this approach for gated recurrent architectures. Specifically, in addition to sparsification of individual weights and neurons, we propose to sparsify preactivations of g...

ReSet: Learning Recurrent Dynamic Routing in ResNet-like Neural Networks

Preprint

Full-text available

Nov 2018

Neural Network is a powerful Machine Learning tool that shows outstanding performance in Computer Vision, Natural Language Processing, and Artificial Intelligence. In particular, recently proposed ResNet architecture and its modifications produce state-of-the-art results in image classification problems. ResNet and most of the previously proposed a...

Variational Dropout via Empirical Bayes

Preprint

Nov 2018

We study the Automatic Relevance Determination procedure applied to deep neural networks. We show that ARD applied to Bayesian DNNs with Gaussian approximate posterior distributions leads to a variational bound similar to that of variational dropout, and in the case of a fixed dropout rate, objectives are exactly the same. Experimental results show...

Bayesian Compression for Natural Language Processing

Preprint

Oct 2018

In natural language processing, many tasks are successfully solved with recurrent neural networks, but such models have a huge number of parameters. The majority of these parameters are often concentrated in the embedding layer, whose size grows proportionally to the vocabulary length. We propose a Bayesian sparsification technique for RNNs...

Metropolis-Hastings view on variational inference and adversarial training

Preprint

Oct 2018

In this paper we propose to view the acceptance rate of the Metropolis-Hastings algorithm as a universal objective for learning to sample from target distribution -- given either as a set of samples or in the form of unnormalized density. This point of view unifies the goals of such approaches as Markov Chain Monte Carlo (MCMC), Generative Adversar...

The Deep Weight Prior. Modeling a prior distribution for CNNs using generative models

Preprint

Full-text available

Oct 2018

Bayesian inference is known to provide a general framework for incorporating prior knowledge or specific properties into machine learning models via carefully choosing a prior distribution. In this work, we propose a new type of prior distributions for convolutional neural networks, deep weight prior, that in contrast to previously published techni...

Pairwise Augmented GANs with Adversarial Reconstruction Loss

Preprint

Oct 2018

We propose a novel autoencoding model called Pairwise Augmented GANs. We train a generator and an encoder jointly and in an adversarial manner. The generator network learns to sample realistic objects. In turn, the encoder network at the same time is trained to map the true data distribution to the prior in latent space. To ensure good reconstructi...

Doubly Semi-Implicit Variational Inference

Preprint

Oct 2018

We extend the existing framework of semi-implicit variational inference (SIVI) and introduce doubly semi-implicit variational inference (DSIVI), a way to perform variational inference and learning when both the approximate posterior and the prior distribution are semi-implicit. In other words, DSIVI performs inference in models where the prior and...

Conditional Generators of Words Definitions

Preprint

Jun 2018

We explore the recently introduced definition modeling technique, which provides a tool for evaluating different distributed vector representations of words by modeling their dictionary definitions. In this work, we study the problem of word ambiguities in definition modeling and propose a possible solution by employing latent variable model...

Universal Conditional Machine

Preprint

Jun 2018

We propose a single neural probabilistic model based on variational autoencoder that can be conditioned on an arbitrary subset of observed features and then sample the remaining features in "one shot". The features may be both real-valued and categorical. Training of the model is performed by stochastic variational Bayes. The experimental evaluatio...

Averaging Weights Leads to Wider Optima and Better Generalization

Article

Full-text available

Mar 2018

Deep neural networks are typically trained by optimizing a loss function with an SGD variant, in conjunction with a decaying learning rate, until convergence. We show that simple averaging of multiple points along the trajectory of SGD, with a cyclical or constant learning rate, leads to better generalization than conventional training. We also sho...

Bayesian Incremental Learning for Deep Neural Networks

Article

Feb 2018

In industrial machine learning pipelines, data often arrive in parts. Particularly in the case of deep neural networks, it may be too expensive to train the model from scratch each time, so one would rather use a previously learned model and the new data to improve performance. However, deep neural networks are prone to getting stuck in a suboptima...

Uncertainty Estimation via Stochastic Batch Normalization

Article

Feb 2018

In this work, we investigate the Batch Normalization technique and propose its probabilistic interpretation. We propose a probabilistic model and show that Batch Normalization maximizes the lower bound of its marginalized log-likelihood. Then, according to the new probabilistic model, we design an algorithm which acts consistently during train and test...

Conditional Generators of Words Definitions

Conference Paper

Jan 2018

Bayesian Compression for Natural Language Processing

Conference Paper

Jan 2018

Probabilistic Adaptive Computation Time

Article

Full-text available

Dec 2017

We present a probabilistic model with discrete latent variables that control the computation time in deep learning models such as ResNets and LSTMs. A prior on the latent variables expresses the preference for faster computation. The amount of computation for an input is determined via amortized maximum a posteriori (MAP) inference. MAP inference i...

Bayesian Sparsification of Recurrent Neural Networks

Article

Jul 2017

Recurrent neural networks show state-of-the-art results in many text analysis tasks but often require a lot of memory to store their weights. Recently proposed Sparse Variational Dropout eliminates the majority of the weights in a feed-forward neural network without significant loss of quality. We apply this technique to sparsify recurrent neural n...

Spatially Adaptive Computation Time for Residual Networks

Conference Paper

Jul 2017

Structured Bayesian Pruning via Log-Normal Multiplicative Noise

Article

May 2017

Dropout-based regularization methods can be regarded as injecting random noise with pre-defined magnitude to different parts of the neural network during training. It was recently shown that Bayesian dropout procedure not only improves generalization but also leads to extremely sparse neural architectures by automatically setting the individual noi...

Putting MRFs on a Tensor Train. Supplementary Material

Data

Feb 2017

Variational Dropout Sparsifies Deep Neural Networks

Article

Full-text available

Jan 2017

We explore the recently proposed variational dropout technique, which provides an elegant Bayesian interpretation of dropout. We extend variational dropout to the case when the dropout rate is unknown and show that it can be found by optimizing the variational evidence lower bound. We show that it is possible to assign and find individual dropout rates to each...

Fast Adaptation in Generative Models with Generative Matching Networks

Article

Full-text available

Dec 2016

Despite recent advances, the remaining bottlenecks in deep generative models are the necessity of extensive training and difficulties with generalization from a small number of training examples. Both problems may be addressed by conditional generative models that are trained to adapt the generative distribution to additional input data. So far this idea...

Spatially Adaptive Computation Time for Residual Networks

Article

Full-text available

Dec 2016

This paper proposes a deep learning architecture based on Residual Network that dynamically adjusts the number of executed layers for the regions of the image. This architecture is end-to-end trainable, deterministic and problem-agnostic. It is therefore applicable without any modifications to a wide range of computer vision problems such as image...

Robust Variational Inference

Article

Full-text available

Nov 2016

Variational inference is a powerful tool for approximate inference. However, it mainly focuses on the evidence lower bound as variational objective and the development of other measures for variational inference is a promising area of research. This paper proposes a robust modification of evidence and a lower bound for the evidence, which is applic...

Ultimate tensorization: compressing convolutional and FC layers alike

Article

Full-text available

Nov 2016

Convolutional neural networks excel in image recognition tasks, but this comes at the cost of high computational and memory complexity. To tackle this problem, [1] developed a tensor factorization framework to compress fully-connected layers. In this paper, we focus on compressing convolutional layers. We show that while the direct application of t...

A new approach for sparse Bayesian channel estimation in SCMA uplink systems

Conference Paper

Oct 2016

The rapid growth of traffic and of the number of simultaneously available devices leads to new challenges in constructing fifth-generation (5G) wireless networks. To handle them, various schemes of non-orthogonal multiple access (NOMA) were proposed. One of these schemes is Sparse Code Multiple Access (SCMA), which is shown to achieve better link...

Deep Part-Based Generative Shape Model with Latent Variables

Conference Paper

Jan 2016

Inferring M-Best Diverse Labelings in a Single One

Conference Paper

Full-text available

Dec 2015

We consider the task of finding M-best diverse solutions in a graphical model. In a previous work by Batra et al., an algorithmic approach for finding such solutions was proposed, and its usefulness was shown in numerous applications. Contrary to previous work, we propose a novel formulation of the problem in the form of a single energy minimization pro...

M-Best-Diverse Labelings for Submodular Energies and Beyond

Conference Paper

Full-text available

Dec 2015

We consider the problem of finding M best diverse solutions of energy minimization problems for graphical models. Contrary to the sequential method of Batra et al., which greedily finds one solution after another, we infer all M solutions jointly. It was shown recently that such jointly inferred labelings not only have smaller total energy but also...

Tensorizing Neural Networks

Article

Full-text available

Sep 2015

Deep neural networks currently demonstrate state-of-the-art performance in several domains. At the same time, models of this class are very demanding in terms of computational resources. In particular, a large amount of memory is required by commonly used fully-connected layers, making it hard to use the models on low-end devices and stopping the f...

PerforatedCNNs: Acceleration through Elimination of Redundant Convolutions

Article

Full-text available

Apr 2015

This paper proposes a novel approach to reduce the computational cost of evaluation of convolutional neural networks, a factor that has hindered their deployment in low-power devices such as mobile phones. Our method is inspired by the loop perforation technique from source code optimization and accelerates the evaluation of bottleneck convolutiona...

Breaking Sticks and Ambiguities with Adaptive Skip-gram

Article

Full-text available

Feb 2015

The recently proposed Skip-gram model is a powerful method for learning high-dimensional word representations that capture rich semantic relationships between words. However, Skip-gram, as well as most prior work on learning word representations, does not take into account word ambiguity and maintains only a single representation per word. Although a number...

Submodular Relaxation for Inference in Markov Random Fields

Article

Full-text available

Jan 2015

In this paper we address the problem of finding the most probable state of a discrete Markov random field (MRF), also known as the MRF energy minimization problem. The task is known to be NP-hard in general and its practical importance motivates numerous approximate algorithms. We propose a submodular relaxation approach (SMR) based on a Lagrangian...

Relevance tagging machine

Article

Full-text available

Jan 2015

Multi-utility Learning: Structured-Output Learning with Multiple Annotation-Specific Loss Functions

Conference Paper

Full-text available

Jun 2014

Structured-output learning is a challenging problem; particularly so because of the difficulty in obtaining large datasets of fully labelled instances for training. In this paper we try to overcome this difficulty by presenting a multi-utility learning framework for structured prediction that can learn from training instances with different forms o...

Putting MRFs on a tensor train

Article

Full-text available

Jan 2014

In the paper we present a new framework for dealing with probabilistic graphical models. Our approach relies on the recently proposed Tensor Train format (TT-format) of a tensor that while being compact allows for efficient application of linear algebra operations. We present a way to convert the energy of a Markov random field to the TT-format and...

Variational inference for sequential distance dependent Chinese restaurant process

Article

Jan 2014

The recently proposed distance dependent Chinese Restaurant Process (ddCRP) generalizes the extensively used Chinese Restaurant Process (CRP) by accounting for dependencies between data points. Its posterior is intractable, and so far only MCMC methods were used for inference. Because of the very different nature of ddCRP, no prior developments in variational m...

Learning a Model for Shape-Constrained Image Segmentation from Weakly Labeled Data

Conference Paper

Aug 2013

In the paper we address a challenging problem of incorporating preferences on possible shapes of an object in a binary image segmentation framework. We extend the well-known conditional random fields model by adding new variables that are responsible for the shape of an object. We describe the shape via a flexible graph augmented with vertex positi...

Spatial Inference Machines

Conference Paper

Jun 2013

This paper addresses the problem of semantic segmentation of 3D point clouds. We extend the inference machines framework of Ross et al. by adding spatial factors that model mid-range and long-range dependencies inherent in the data. The new model is able to account for semantic spatial context. During training, our method automatically isolates and...

An approach to segmentation of mouse brain images via intermodal registration

Article

Apr 2013

One way to perform a segmentation of the images of mouse brain sections is to register them to some reference images with known segmentation. We designed a new registration algorithm for this task. It is based on hierarchical mutual information maximization and draws on several recent methods. Besides combining them in a novel way, we identify their...

Statistic Parametric Mapping of Changes in Gene Activity in Animal Brain during Acoustic Stimulation

Article

Full-text available

Mar 2013

We analyzed the expression of transcription factor c-Fos induced by neural activity in the mouse brain after acoustic stimulation. The brain sections of the animals subjected to acoustic stimulation and controls were immunohistochemically stained for c-Fos protein. Statistical parametric mapping (SPM) was used to identify group differences in th...

Automatic determination of cell division rate using microscope images

Article

Full-text available

Mar 2013

Estimating the dynamics of cell culture development using fluorescent microscopy images is of great interest in modern cellular and molecular biology. In large-scale studies of cell populations involving a great number of images taken within a certain interval, manual analysis has proved to be time consuming and inaccurate; therefore, various compu...

Statistic Parametric Mapping of Changes in Gene Activity in Animal Brain during Acoustic Stimulation

Article

Full-text available

Feb 2013

Machine learning: State of the art and perspectives

Article

Jan 2013

Dmitry Vetrov

In the paper we briefly present the main active areas in modern machine learning and highlight several new paradigms which have become extremely popular since the end of the 90s. These paradigms make it possible to include prior domain- and task-specific knowledge in the data model. Among them are Bayesian inference, reinforcement learning, big data processing,...

Submodular Relaxation for MRFs with High-Order Potentials

Conference Paper

Full-text available

Oct 2012

In the paper we propose a novel dual decomposition scheme for approximate MAP-inference in Markov Random Fields with sparse high-order potentials, i.e. potentials encouraging a relatively small number of variable configurations. We construct a Lagrangian dual of the problem in such a way that it can be efficiently evaluated by minimizing a submodul...

Automated Atlas-Based Segmentation of NISSL-Stained Mouse Brain Sections Using Supervised Learning

Article

Full-text available

Sep 2011

The problem of segmentation of mouse brain images into anatomical structures is an important stage of practically every analytical procedure for these images. The present study suggests a new approach to automated segmentation of anatomical structures in the images of NISSL-stained histological sections of mouse brain. The segmentation algorithm is...

Image Segmentation with a Shape Prior Based on Simplified Skeleton

Conference Paper

Full-text available

Jul 2011

In the paper we propose a new deformable shape model that is based on a simplified skeleton graph. Such a shape model makes it possible to account for different shape variations and to introduce global constraints like known orientation or scale of the object. We combine the model with low-level image segmentation techniques based on Markov random fields and deri...

Submodular Decomposition Framework for Inference in Associative Markov Networks with Global Constraints

Conference Paper

Full-text available

Mar 2011

In the paper we address the problem of finding the most probable state of a discrete Markov random field (MRF) with associative pairwise terms. Although of practical importance, this problem is known to be NP-hard in general. We propose a new type of MRF decomposition, submodular decomposition (SMD). Unlike existing decomposition approaches, SMD decom...

Short-term solar flare forecast

Article

Full-text available

Jan 2011

In this paper a new automated hybrid method for short-term flare forecasting is introduced and suggested for future use. At the initial stage we created a flare base and an image base for the 1996–2009 interval. Further, we derived a simple and efficient parametric precedent model, which turned our prediction problem into two-class classification...

An Interactive Method of Anatomical Segmentation and Gene Expression Estimation for an Experimental Mouse Brain Slice

Conference Paper

Full-text available

Sep 2010

We consider the problem of statistical analysis of gene expression in a mouse brain during cognitive processes. In particular, we focus on the problems of anatomical segmentation of a histological brain slice and estimation of the slice’s gene expression level. The first problem is solved by interactive registration of an experimental brain slice into...

Variational segmentation algorithms with label frequency constraints

Article

Full-text available

Sep 2010

We consider image and signal segmentation problems within the Markov random field (MRF) approach and try to take into account label frequency constraints. Incorporating these constraints into MRF leads to an NP-hard optimization problem. For solving this problem we present a two-step approximation scheme that allows one to use hard, interval and so...

Adaptation of Mouse Brain Gene Expression Data for further Statistical Parametrical Mapping Analysis

Article

Full-text available

Aug 2010

The paper describes a method for fully automatic 3D reconstruction of a mouse brain voxel model from a sequence of coronal 2D slices for statistical analysis of gene expression. Two images of each brain slice with different stains are used. The first stain highlights the histology of the brain, which is used for slice matching. The second stain highligh...