Contours of the density functions of the entropy-regularized optimal couplings of N(0, 1) and N(5, 2) for three different regularization parameters λ = 0.1, 1, 10. All of the optimal couplings are bivariate normal distributions.


Source publication
Article
Distances and divergences between probability measures play a central role in statistics, machine learning, and many other related fields. The Wasserstein distance has received much attention in recent years because of the distinctive properties that set it apart from other distances and divergences. Although computing the Wasserstein distance is costly, entropy-regularized o...

Context in source publication

Context 1
... which is equal to the original optimal coupling of the non-regularized optimal transport problem, and as λ → ∞, Σ_λ converges to 0. This is a special case of Corollary 1. The larger λ becomes, the less correlated the optimal coupling is. We visualize this behavior by computing the optimal couplings of two one-dimensional normal distributions in Figure 2. The left panel shows the original version. ...
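To reproduce this qualitative behavior, the following sketch discretizes N(0, 1) and N(5, 2) on a grid and runs the Sinkhorn solver from the POT library for λ = 0.1, 1, 10. It is an illustration under our own assumptions, not the authors' code: POT's `reg` argument is taken to play the role of λ (the conventions may differ by a constant factor), and N(5, 2) is read as mean 5 and variance 2. The printed correlation of the coupling shrinks toward 0 as λ grows.

import numpy as np
from scipy.stats import norm
import ot  # Python Optimal Transport (POT)

# Discretize the two marginals on a common grid (illustrative resolution).
x = np.linspace(-4.0, 12.0, 300)
a = norm.pdf(x, loc=0.0, scale=1.0)
a /= a.sum()
b = norm.pdf(x, loc=5.0, scale=np.sqrt(2.0))  # assumption: N(5, 2) means variance 2
b /= b.sum()
M = (x[:, None] - x[None, :]) ** 2  # squared-distance ground cost

for lam in (0.1, 1.0, 10.0):
    # assumption: POT's entropic regularization weight `reg` stands in for lambda
    P = ot.sinkhorn(a, b, M, reg=lam, numItermax=5000)
    # correlation of the discretized coupling: shrinks toward 0 as lambda grows
    mx = (P.sum(axis=1) * x).sum()
    my = (P.sum(axis=0) * x).sum()
    cov = ((x[:, None] - mx) * (x[None, :] - my) * P).sum()
    sx = np.sqrt(((x - mx) ** 2 * P.sum(axis=1)).sum())
    sy = np.sqrt(((x - my) ** 2 * P.sum(axis=0)).sum())
    print(f"lambda = {lam}: correlation of coupling ~ {cov / (sx * sy):.3f}")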

Similar publications

Article
We propose an effective framework for computing the prepotential of the topological B-model on a class of local Calabi–Yau geometries related to the circle compactification of five-dimensional N=1 ...
Article
We study Benamou’s domain decomposition algorithm for optimal transport in the entropy regularized setting. The key observation is that the regularized variant converges to the globally optimal solution under very mild assumptions. We prove linear convergence of the algorithm with respect to the Kullback–Leibler divergence and illustrate the (poten...

Citations

... However, it is crucial to ensure the convergence of our latent representations of similar pairs across their entire characteristics. Notably, as Tong and Kobayashi [83] demonstrated, differences in the diagonal covariances of multivariate normal distributions can significantly influence the optimal transport cost and Wasserstein distance, even when the means are aligned. This highlights the importance of considering both mean and covariance differences for accurate distribution comparison. ...
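As a concrete illustration of that point (our own sketch, not code from the cited works): the squared 2-Wasserstein distance between Gaussians is the squared distance between the means plus a Bures term in the covariances, and for diagonal covariances the Bures term reduces to a sum of squared differences of per-dimension standard deviations, so the distance stays strictly positive even when the means coincide.

import numpy as np

def w2_squared_diag_gaussians(m1, var1, m2, var2):
    # Squared 2-Wasserstein distance between N(m1, diag(var1)) and N(m2, diag(var2)).
    m1, var1, m2, var2 = map(np.asarray, (m1, var1, m2, var2))
    mean_term = np.sum((m1 - m2) ** 2)
    bures_term = np.sum((np.sqrt(var1) - np.sqrt(var2)) ** 2)
    return mean_term + bures_term

# Identical means, different diagonal covariances: the distance is still positive.
print(w2_squared_diag_gaussians([0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [4.0, 9.0]))  # 5.0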
Preprint
Addressing challenges in domain invariance within single-cell genomics necessitates innovative strategies to manage the heterogeneity of multi-source datasets while maintaining the integrity of biological signals. We introduce TarDis, a novel deep generative model designed to disentangle intricate covariate structures across diverse biological datasets, distinguishing technical artifacts from true biological variations. By employing tailored covariate-specific loss components and a self-supervised approach, TarDis effectively generates multiple latent space representations that capture each continuous and categorical target covariate separately, along with unexplained variation. Our extensive evaluations demonstrate that TarDis outperforms existing methods in data integration, covariate disentanglement, and robust out-of-distribution predictions. The model's capacity to produce interpretable and structured latent spaces, including ordered latent representations for continuous covariates, enhances its utility in hypothesis-driven research. Consequently, TarDis offers a promising analytical platform for advancing scientific discovery, providing insights into cellular dynamics, and enabling targeted therapeutic interventions.
... As a further extension, Janati et al. proposed an entropy-regularized OMT method between two Gaussian measures, by solving the fixed-point equation underpinning the Sinkhorn algorithm for both the balanced and unbalanced cases [5,15,45]. Kernel methods are extensively employed in machine learning, providing a powerful capability to efficiently handle data in a non-linear space by implicitly mapping the data into a high-dimensional space, a method known as the kernel trick. Ghojogh et al. ...
Conference Paper
The Wasserstein distance from optimal mass transport (OMT) is a powerful mathematical tool with numerous applications that provides a natural measure of the distance between two probability distributions. Several methods to incorporate OMT into widely used probabilistic models, such as Gaussian or Gaussian mixture, have been developed to enhance the capability of modeling complex multimodal densities of real datasets. However, very few studies have explored the OMT problems in a reproducing kernel Hilbert space (RKHS), wherein the kernel trick is utilized to avoid the need to explicitly map input data into a high-dimensional feature space. In the current study, we propose a Wasserstein-type metric to compute the distance between two Gaussian mixtures in a RKHS via the kernel trick, i.e., kernel Gaussian mixture models.
... In optimal transport theory, there have been many studies in recent years on the case of discrete Tsallis regularization, comparing its properties with those of regularization by the KL divergence [3,22]. However, to our knowledge, few results focus on the properties of Tsallis regularization in the continuous setting [36]. ...
Preprint
In this paper, we consider Tsallis entropic regularized optimal transport and discuss the convergence rate as the regularization parameter $\varepsilon$ goes to $0$. In particular, we establish the convergence rate of the Tsallis entropic regularized optimal transport using the quantization and shadow arguments developed by Eckstein--Nutz. We compare this to the convergence rate of the entropic regularized optimal transport with Kullback--Leibler (KL) divergence and show that the KL divergence yields the fastest convergence rate among Tsallis relative entropies.
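For orientation, the Tsallis relative entropy of order $q$ in its common discrete form (the preprint's continuous definition may differ in normalization) is
$D_q(p \,\|\, r) = \frac{1}{q-1}\left(\sum_i p_i^{\,q}\, r_i^{\,1-q} - 1\right)$,
which recovers the KL divergence $\mathrm{KL}(p \,\|\, r)$ in the limit $q \to 1$.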
... The proposed algorithm, based on the work by Sinkhorn (1964, 1967), solves the corresponding optimization problem in about O(N^2) elementary operations (Altschuler et al., 2017; Dvurechensky et al., 2018). Since then, entropy regularized OT (EROT) has become a frequently used computational scheme for the approximation of OT (Amari et al., 2019; Clason et al., 2021; Tong & Kobayashi, 2021). ...
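A minimal sketch of the Sinkhorn–Knopp iteration referred to above (illustrative only; names and defaults are ours, not those of the cited implementations): each update is a matrix–vector product with the N x N Gibbs kernel, which is where the roughly O(N^2) per-iteration cost comes from.

import numpy as np

def sinkhorn_plan(a, b, M, reg, n_iter=1000):
    # Entropy-regularized transport plan between histograms a and b for cost matrix M.
    K = np.exp(-M / reg)                 # N x N Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        v = b / (K.T @ u)                # scale columns to match marginal b
        u = a / (K @ v)                  # scale rows to match marginal a
    return u[:, None] * K * v[None, :]   # plan = diag(u) K diag(v)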
Preprint
For probability measures supported on countable spaces we derive limit distributions for empirical entropic optimal transport quantities. In particular, we prove that the corresponding plan converges weakly to a centered Gaussian process. Furthermore, its optimal value is shown to be asymptotically normal. The results are valid for a large class of ground cost functions and generalize recently obtained limit laws for empirical entropic optimal transport quantities on finite spaces. Our proofs are based on a sensitivity analysis with respect to a weighted $\ell^1$-norm relying on the dual formulation of entropic optimal transport as well as necessary and sufficient optimality conditions for the entropic transport plan. This can be used to derive weak convergence of the empirical entropic optimal transport plan and value that results in weighted Borisov-Dudley-Durst conditions on the underlying probability measures. The weights are linked to an exponential penalty term for dual entropic optimal transport and the underlying ground cost function under consideration. Finally, statistical applications, such as bootstrap, are discussed.
Article
We study optimal transport for stationary stochastic processes taking values in finite spaces. In order to reflect the stationarity of the underlying processes, we restrict attention to stationary couplings, also known as joinings. The resulting optimal joining problem captures differences in the long-run average behavior of the processes of interest. We introduce estimators of both optimal joinings and the optimal joining cost, and establish consistency of the estimators under mild conditions. Furthermore, under stronger mixing assumptions we establish finite-sample error rates for the estimated optimal joining cost that extend the best known results in the iid case. We also extend the consistency and rate analysis to an entropy-penalized version of the optimal joining problem. Finally, we validate our convergence results empirically as well as demonstrate the computational advantage of the entropic problem in a simulation experiment.
Article
The body of most multivariate financial data sets can be well modeled by log-normal distributions. Yet not many multivariate log-normal distributions are available in the literature. In this paper, we propose many new multivariate log-normal distributions. An application to an insurance data set is given.