
Convergence of the majorization method for multidimensional scaling

Authors:

Jan de Leeuw

Abstract

In this paper we study the convergence properties of an important class of multidimensional scaling algorithms. We unify and extend earlier qualitative results on convergence, which tell us when the algorithms are convergent. In order to prove global convergence results we use the majorization method. We also derive, for the first time, some quantitative convergence theorems, which give information about the speed of convergence. It turns out that in almost all cases convergence is linear, with a convergence rate close to unity. This has the practical consequence that convergence will usually be very slow, and this makes techniques to speed up convergence very important. It is pointed out that step-size techniques will generally not succeed in producing marked improvements in this respect.
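The abstract's quantitative point, a linear rate close to unity, is easy to observe numerically. The sketch below is a minimal, unweighted SMACOF-style majorization loop (my toy setup: random dissimilarities, raw stress, unit weights, not the paper's notation); the ratio of successive configuration changes settles just below 1, which is the slow linear convergence described above.

```python
import numpy as np

def smacof(delta, p=2, n_iter=300, seed=0):
    """Minimal majorization (SMACOF-style) loop for raw stress with
    unit weights: X <- B(X) X / n, the Guttman transform."""
    rng = np.random.default_rng(seed)
    n = delta.shape[0]
    X = rng.standard_normal((n, p))
    changes = []
    for _ in range(n_iter):
        D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
        with np.errstate(divide="ignore", invalid="ignore"):
            ratio = np.where(D > 0, delta / D, 0.0)
        B = -ratio
        B[np.diag_indices(n)] = ratio.sum(axis=1)  # zero row sums
        X_new = B @ X / n
        changes.append(np.linalg.norm(X_new - X))
        X = X_new
    return X, np.array(changes)

# Toy dissimilarities from random 5-D points, embedded in 2-D.
rng = np.random.default_rng(1)
Y = rng.standard_normal((20, 5))
delta = np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)
X, changes = smacof(delta)
rates = changes[1:] / changes[:-1]
print(rates[-5:])  # ratios hover just below 1: slow linear convergence
```

Since each iteration costs the same, a rate of, say, 0.99 means hundreds of iterations per additional digit of accuracy, which is the practical slowness the abstract warns about.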
... The SMACOF algorithm (Scaling by MAjorizing a COmplicated Function) is another well-known iterative method used in multidimensional scaling (MDS) to find a configuration of points in a lower-dimensional space that best preserves the distances between points in the original higher-dimensional space [4]. De Leeuw and Mair [5] extend the basic SMACOF theory in terms of configuration constraints, three-way data, unfolding models, and projection of the resulting configurations onto spheres and other quadratic surfaces. ...
... where $T_{ij} = (t_{rs})$ is the matrix with $t_{ii} = t_{jj} = 1$, $t_{ij} = t_{ji} = -1$, and all other entries zero. In addition, $U^{\dagger}$ is the Moore-Penrose inverse of $U$, and $C^{(t)} = (c^{(t)}_{ij})$ is a symmetric matrix with suitably defined entries. It is proved that the sequence $\|X_{t-1} - X_t\|_F$ converges to zero [4]. For more details about multidimensional scaling-based techniques see [26] ...
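The $T_{ij}$ defined in this excerpt is just the outer product $(e_i - e_j)(e_i - e_j)^T$, and $U^{\dagger}$ can be formed numerically; a small sketch (my notation and unit weights; the cited paper's $U$ and $C^{(t)}$ may differ):

```python
import numpy as np

def T(n, i, j):
    """The matrix T_ij from the excerpt: t_ii = t_jj = 1, t_ij = t_ji = -1."""
    e = np.zeros(n)
    e[i], e[j] = 1.0, -1.0
    return np.outer(e, e)              # equals (e_i - e_j)(e_i - e_j)^T

n = 5
# A typical U in MDS is a weighted sum of the T_ij (unit weights here).
U = sum(T(n, i, j) for i in range(n) for j in range(i + 1, n))
U_pinv = np.linalg.pinv(U)             # the Moore-Penrose inverse U†
print(np.allclose(U @ U_pinv @ U, U))  # defining property, prints True
```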
Article
Full-text available
Classic multidimensional scaling (MDS) and scaling by majorizing a complicated function (SMACOF) are well-known centralized algorithms used to solve the MDS problem. In this paper, we present a distributed algorithm for solving the MDS problem. Estimations of coordinates are performed concurrently under the assumption that each item knows only its own position and its distances from its neighbors, together with their current approximate locations. The update process calculates the average of the current coordinate of each object and its projections on the solution spaces allocated to it by its neighbors. We apply the method to the problem of sensor localization and obtain numerical results that demonstrate the efficacy of the suggested strategy.
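One way to read the update described above: a node's "solution space" for neighbor $j$ is the sphere $\{x : \|x - x_j\| = d_{ij}\}$, and the new coordinate is the average of the current one and the projections onto these spheres. The sketch below is my reconstruction under that reading, not the authors' exact scheme.

```python
import numpy as np

def local_update(x_i, neighbor_pos, neighbor_dist):
    """One node's update: average its coordinate with its projections
    onto the spheres ||x - x_j|| = d_ij defined by each neighbor j.
    A sketch of the scheme described in the abstract; details may differ."""
    points = [x_i]                       # include the current coordinate
    for x_j, d_ij in zip(neighbor_pos, neighbor_dist):
        v = x_i - x_j
        nv = np.linalg.norm(v)
        if nv == 0:                      # degenerate case: pick a direction
            v, nv = np.ones_like(x_i), np.sqrt(float(x_i.size))
        points.append(x_j + d_ij * v / nv)   # projection onto the sphere
    return np.mean(points, axis=0)

# Example: a node at the origin with two neighbors.
x = np.zeros(2)
x = local_update(x, [np.array([3.0, 0.0]), np.array([0.0, 4.0])], [1.0, 2.0])
print(x)  # mean of (0,0), (2,0), (0,2): [0.667, 0.667]
```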
... (i) On the squared-stress loss function. An important concept in embedding theory is usability, proposed by De Leeuw (1988). A set of embedding points $\{x_i\}$ is said to be usable if it satisfies a condition that prevents neighbouring points from collapsing to a single point (the degenerate embedding). ...
Article
Full-text available
Maximum Variance Unfolding (MVU) is among the first methods in nonlinear dimensionality reduction for data visualization and classification. It aims to preserve local data structure while making the variance among the data as large as possible. However, MVU in general remains a computationally challenging problem, which may explain why it is less popular than other leading methods such as Isomap and t-SNE. In this paper, based on the key observation that the structure-preserving term in MVU is actually the squared stress in Multi-Dimensional Scaling (MDS), we replace the term with the stress function from MDS, resulting in a model that is usable. The usability property guarantees that the “crowding phenomenon” will not occur in the dimension-reduced results. The new model also allows us to combine label information, and hence we call it the supervised MVU (SMVU). We then develop a fast algorithm that is based on Euclidean distance matrix optimization. By making use of the majorization-minimization technique, the algorithm at each iteration solves a number of one-dimensional optimization problems, each having a closed-form solution. This strategy significantly speeds up the computation. We demonstrate the advantage of SMVU on some standard data sets against a few leading algorithms including Isomap and t-SNE.
... Because the stress is a non-convex function of z, there is no guarantee of finding the global minimum, and the result may depend on the initialization [8,26]. This has also led to the proposal of several optimization methods, including 2D Newton-Raphson [13], (stochastic [26]) gradient descent [15], divide-and-conquer [19,25], and majorization [7]. ...
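Of the methods listed, plain gradient descent on the raw stress is the simplest to write down; a minimal sketch with unit weights and a fixed step size (both my assumptions) follows.

```python
import numpy as np

def stress(X, delta):
    """Raw stress: sum over pairs of (||x_i - x_j|| - delta_ij)^2."""
    D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    return ((D - delta) ** 2).sum() / 2.0   # each pair counted once

def stress_grad(X, delta):
    """Gradient: grad_i = sum_j 2 (d_ij - delta_ij) (x_i - x_j) / d_ij."""
    diff = X[:, None, :] - X[None, :, :]
    D = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(D, 1.0)                # avoid 0/0 on the diagonal
    coef = 2.0 * (D - delta) / D
    np.fill_diagonal(coef, 0.0)
    return (coef[:, :, None] * diff).sum(axis=1)

rng = np.random.default_rng(0)
Y = rng.standard_normal((15, 4))
delta = np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)
X = rng.standard_normal((15, 2))
for _ in range(500):                        # fixed step; no line search
    X -= 0.01 * stress_grad(X, delta)
print(stress(X, delta))                     # decreases from the random start
```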
Preprint
Full-text available
Multidimensional scaling (MDS) is an unsupervised learning technique that preserves pairwise distances between observations and is commonly used for analyzing multivariate biological datasets. Recent advances in MDS have achieved successful classification results, but the configurations heavily depend on the choice of hyperparameters, limiting its broader application. Here, we present a self-supervised MDS approach informed by the dispersions of observations that share a common binary label ($F$-ratio). Our visualization accurately configures the $F$-ratio while consistently preserving the global structure with a low data distortion compared to existing dimensionality reduction tools. Using an algal microbiome dataset, we show that this new method better illustrates the community's response to the host, suggesting its potential impact on microbiology and ecology data analysis.
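The $F$-ratio mentioned here is, in its classical one-way form, the between-group dispersion over the within-group dispersion; a small sketch for a binary label (my normalisation, which may differ from the paper's):

```python
import numpy as np

def f_ratio(X, labels):
    """Between-group over within-group dispersion for labelled points.
    Standard one-way ANOVA form; the paper's normalisation may differ."""
    X, labels = np.asarray(X), np.asarray(labels)
    groups = [X[labels == g] for g in np.unique(labels)]
    grand = X.mean(axis=0)
    k, n = len(groups), len(X)
    between = sum(len(g) * np.sum((g.mean(axis=0) - grand) ** 2) for g in groups)
    within = sum(np.sum((g - g.mean(axis=0)) ** 2) for g in groups)
    return (between / (k - 1)) / (within / (n - k))

# Two well-separated groups give a large F-ratio.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (30, 2)), rng.normal(3, 1, (30, 2))])
labels = np.array([0] * 30 + [1] * 30)
print(f_ratio(X, labels))
```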
... MM is itself the subject of a large literature; see [19] for a tutorial, and see the textbooks by Lange [22] and de Leeuw [8] for a thorough introduction. Majorizers have been derived by hand for many specific problems of interest, including logistic regression [5], quantile regression [18], multidimensional scaling [7,9,12], generalized Bradley-Terry models [17], and support vector machines [13]. ...
Preprint
Majorization-minimization (MM) is a family of optimization methods that iteratively reduce a loss by minimizing a locally-tight upper bound, called a majorizer. Traditionally, majorizers were derived by hand, and MM was only applicable to a small number of well-studied problems. We present optimizers that instead derive majorizers automatically, using a recent generalization of Taylor mode automatic differentiation. These universal MM optimizers can be applied to arbitrary problems and converge from any starting point, with no hyperparameter tuning.
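The classical, hand-derived instance of this idea: if $\nabla f$ is $L$-Lipschitz, then $g_y(x) = f(y) + \nabla f(y)^T(x-y) + \frac{L}{2}\|x-y\|^2$ is tangent to $f$ at $y$ and dominates it everywhere, and minimizing $g_y$ gives a gradient step of size $1/L$. The sketch below is this manual construction, not the automatic Taylor-mode derivation the preprint describes.

```python
import numpy as np

def mm_quadratic(grad, L, x0, n_iter=200):
    """MM with the majorizer g_y(x) = f(y) + grad(y)(x-y) + L/2 (x-y)^2:
    each step minimizes the majorizer, i.e. x <- y - grad(y)/L, so f
    decreases monotonically whenever grad is L-Lipschitz."""
    x = x0
    for _ in range(n_iter):
        x = x - grad(x) / L
    return x

# f(x) = log(1 + e^x) + x^2 has f'' <= 1/4 + 2, so L = 2.25 works.
grad = lambda x: 1.0 / (1.0 + np.exp(-x)) + 2.0 * x
print(mm_quadratic(grad, L=2.25, x0=5.0))   # root of f', about -0.222
```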
... The iterative solution that guarantees monotone convergence of the stress [7] is given by equation (12). ...
Preprint
Learning on graphs is becoming prevalent in a wide range of applications including social networks, robotics, communication, medicine, etc. These datasets often contain critical private information about the entities they describe. The use of such data for graph learning is hampered by growing privacy concerns among users about data sharing. Existing privacy-preserving methods pre-process the data to extract user-side features, and only these features are used for subsequent learning. Unfortunately, these methods are vulnerable to adversarial attacks that infer private attributes. We present a novel privacy-respecting framework for distributed graph learning and graph-based machine learning. In order to perform graph learning and other downstream tasks on the server side, the framework learns features as well as distances without requiring the actual features, while preserving the original structural properties of the raw data. The proposed framework is quite generic and highly adaptable. We demonstrate its utility in Euclidean space, but it can be applied with any existing method of distance approximation and graph learning for the relevant spaces. Through extensive experimentation on both synthetic and real datasets, we demonstrate the efficacy of the framework by comparing the results obtained without data sharing to those obtained with data sharing as a benchmark. This is, to our knowledge, the first privacy-preserving distributed graph learning framework.
Article
Let \(D=\{a_1,\dots ,a_n\}\) be a finite set endowed with a metric \(d\), and let \(X\) be an arbitrary strictly convex space. In this paper, we propose an algorithm for solving a certain optimization problem. We will discuss the convergence of the algorithm and, in the case where \(X\) is an inner product space, prove that the proposed algorithm is convergent.
Article
This article introduces neural graph distance embedding (nGDE), a method for generating 3D molecular geometries. Leveraging a graph neural network trained on the OE62 dataset of molecular geometries, nGDE predicts interatomic distances based on molecular graphs. These distances are then used in multidimensional scaling to produce 3D geometries, subsequently refined with standard bioorganic forcefields. The machine learning‐based graph distance introduced herein is found to be an improvement over the conventional shortest path distances used in graph drawing. Comparative analysis with a state‐of‐the‐art distance geometry method demonstrates nGDE's competitive performance, particularly showcasing robustness in handling polycyclic molecules—a challenge for existing methods.
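The second stage mentioned here, turning predicted interatomic distances into coordinates, can be done with classical (Torgerson) MDS; a generic sketch (not nGDE itself):

```python
import numpy as np

def classical_mds(D, dim=3):
    """Classical (Torgerson) MDS: embed points from a distance matrix.
    Double-center the squared distances, then take the top eigenvectors."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # Gram matrix of centered points
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:dim]           # largest eigenvalues first
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

# Round-trip check on random 3-D coordinates.
rng = np.random.default_rng(0)
Y = rng.standard_normal((10, 3))
D = np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)
X = classical_mds(D)
D_hat = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
print(np.allclose(D, D_hat))                  # True up to numerical error
```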
Article
Full-text available
For supervised classification we propose to use restricted multidimensional unfolding in a multinomial logistic framework. Where previous research proposed similar models based on squared distances, we propose to use usual (i.e., not squared) Euclidean distances. This change in functional form results in several interpretational advantages of the resulting biplot, a graphical representation of the classification model. First, the conditional probability of any class peaks at the location of the class in the Euclidean space. Second, the interpretation of the biplot is in terms of distances towards the class points, whereas in the squared distance model the interpretation is in terms of the distance towards the decision boundary. Third, the distance between two class points represents an upper bound for the estimated log-odds of choosing one of these classes over the other. For our multinomial restricted unfolding, we develop and test a Majorization Minimization algorithm that monotonically decreases the negative log-likelihood. With two empirical applications we point out the advantages of the distance model and show how to apply multinomial restricted unfolding in practice, including model selection.
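A hedged reading of the first advantage: if the conditional class probabilities are a softmax over negative unsquared Euclidean distances to class points $z_k$, each probability peaks exactly at its class point. The sketch below illustrates that functional form; it is my illustration, not the authors' fitted model.

```python
import numpy as np

def class_probs(x, Z):
    """Softmax over negative Euclidean (not squared) distances to the
    class points Z; an illustrative form of the distance model, not the
    authors' estimated likelihood."""
    d = np.linalg.norm(Z - x, axis=1)
    e = np.exp(-(d - d.min()))                # shift for numerical stability
    return e / e.sum()

Z = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])  # hypothetical class points
print(class_probs(np.array([0.1, 0.2]), Z))   # highest mass on the first class
```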
Article
Full-text available
Spherical embedding is an important tool in several fields of data analysis, including environmental data, spatial statistics, text mining, gene expression analysis, medical research and, in general, areas in which the geodesic distance is a relevant factor. Many technologies involve massive data acquisition, and the resulting high-dimensional vectors are often normalised and transformed into spherical data. In this representation of data on spherical surfaces, multidimensional scaling plays an important role. Traditionally, methods of clustering and representation have been combined, since the precision of the representation tends to decrease when a large number of objects are involved, which makes interpretation difficult. In this paper, we present a model that partitions objects into classes while simultaneously representing the cluster centres on a spherical surface based on geodesic distances. The model combines a partition algorithm, based on the approximation of dissimilarities by geodesic distances, with a representation procedure for geodesic distances. In this process, the dissimilarities are transformed in order to optimise the radius of the sphere. The efficiency of the procedure is analysed by means of an extensive Monte Carlo experiment, and its usefulness is illustrated for real data sets. Supplementary material to this paper is provided online.
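The geodesic distance underlying such spherical representations is the great-circle distance; a short sketch for a sphere of radius $r$ (the generic formula, not the paper's full partitioning model):

```python
import numpy as np

def geodesic(x, y, r):
    """Great-circle distance between points x, y on a sphere of radius r."""
    c = np.dot(x, y) / r**2
    return r * np.arccos(np.clip(c, -1.0, 1.0))  # clip guards rounding errors

# Two orthogonal points on the unit sphere are pi/2 apart.
print(geodesic(np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0]), 1.0))
```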
Article
Full-text available
Majorization–minimization (MM) is a versatile optimization technique that operates on surrogate functions satisfying tangency and domination conditions. Our focus is on differentiable optimization using inexact MM with quadratic surrogates, which amounts to approximately solving a sequence of symmetric positive definite systems. We begin by investigating the convergence properties of this process, from subconvergence to R-linear convergence, with emphasis on tame objectives. Then we provide a numerically stable implementation based on truncated conjugate gradient. Applications to multidimensional scaling and regularized inversion are discussed and illustrated through numerical experiments on graph layout and X-ray tomography. In the end, quadratic MM not only offers solid guarantees of convergence and stability, but is robust to the choice of its control parameters.
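The scheme described here, quadratic surrogates whose minimization is a symmetric positive definite solve carried out only approximately, can be sketched on the classical logistic-regression majorizer $A = X^T X/4 + \lambda I$, which dominates the Hessian everywhere; a capped conjugate-gradient solve makes each step inexact. This is my toy instance, not the paper's numerically stable implementation.

```python
import numpy as np
from scipy.sparse.linalg import cg

# L2-regularised logistic regression on labels from a noisy linear model.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
y = np.sign(X @ rng.standard_normal(10) + rng.standard_normal(200))
lam = 1e-2
# Fixed SPD majorizer matrix: dominates the Hessian X^T diag(p(1-p)) X + lam*I.
A = X.T @ X / 4.0 + lam * np.eye(10)

def grad(w):
    p = 1.0 / (1.0 + np.exp(y * (X @ w)))     # = sigmoid(-y * margin)
    return -X.T @ (y * p) + lam * w

w = np.zeros(10)
for _ in range(50):
    d, _ = cg(A, -grad(w), maxiter=5)         # truncated CG: inexact MM step
    w = w + d
print(np.linalg.norm(grad(w)))                # small: near the optimum
```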
Article
Full-text available
In this paper we discuss the convergence of an algorithm for metric and nonmetric multidimensional scaling that is very similar to the C-matrix algorithm of Guttman. The paper improves earlier results in two respects: first, the analysis is extended to cover general Minkowski metrics; second, a more elementary proof of convergence, based on results of Robert, is presented.
Article
Full-text available
In this paper the relationship between the two formulas for stress proposed by Kruskal in 1964 is studied. It is shown that stress formula one has a system of nontrivial upper bounds, and minimization of this loss function is liable to produce solutions for which this upper bound is small; these are regularly shaped configurations. Stress formula two yields less equivocal results, but minimization of this loss function can be expected to produce configurations in which the points are clumped. These results give no clue as to which of the two loss functions is to be preferred.
Article
Full-text available
It is shown that Kruskal's multidimensional scaling loss function is differentiable at a local minimum. Or, to put it differently, that in multidimensional scaling solutions using Kruskal's stress distinct points cannot coincide.
Article
Proposes that the classical problem of the sequencing of objects along a continuum may be solved by means of linear algebra.