Figure 11 - uploaded by Yasuaki Hiraoka
(a) shows the persistence diagram reconstructed from the learned vector w, and (b) shows the positive and negative areas of (a) at a certain threshold. Recall that, from the 0/1 assignment, the generators in the blue (resp. red) area contribute to classification as PPP (resp. GPP). From the learned persistence diagram, we observe that the red area is located in the region with large birth values. This is consistent with the fact that GPP has a repulsive interaction, which prevents the point cloud from forming rings with small birth values. (c) and (d) show the death positions of the generators in the blue and red areas of (b) with the same colors, where (c) (resp. (d)) corresponds to PPP (resp. GPP). As in the discussion of Figure 5, these death positions express more explicitly the characteristic geometric features used for learning. We remark that PPP and GPP can also be distinguished by using other descriptors such as average nearest neighbor distances. An advantage of our method is that it does not need any prior knowledge, which makes it more universal than problem-specific descriptors. In fact, the analysis using average nearest neighbor distance can be realized by the 0th persistence diagram.
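The average nearest neighbor distance mentioned in the caption can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name and the two toy point sets are assumptions introduced here, and the brute-force search is only meant for small point clouds.

```python
import math


def average_nearest_neighbor_distance(points):
    """Mean Euclidean distance from each point to its closest other point."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        # Brute-force search over all other points (O(n^2) overall).
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    return total / len(points)


# Hypothetical toy inputs: a clumped configuration and an evenly spread one.
clumped = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0)]
spread = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

clumped_score = average_nearest_neighbor_distance(clumped)
spread_score = average_nearest_neighbor_distance(spread)
```

In this toy case the evenly spread set gives the larger value, which is the direction one would expect for a repulsive process such as GPP compared with PPP at the same intensity.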


Source publication
Article
Persistence diagrams have been widely recognized as a compact descriptor for characterizing multiscale topological features in data. When many datasets are available, statistical features embedded in those persistence diagrams can be extracted by applying machine learning. In particular, the ability for explicitly analyzing the inverse in the orig...

Citations

... The SC represents the relative position of the two phases, while the PH can quantify the connectivity of the phases, both of which are increasingly gaining attention in recent materials science. The persistence diagrams (PDs) were computed by HomCloud [25]. Principal component analysis (PCA) was applied to SC and PH for dimensionality reduction. ...
Article
Empirical formulas were derived for the interface shape, mechanical properties, and electrical characteristics of accumulative roll bonded (ARB) Cu/Nb laminated materials, based on relevant literature data. These formulas were incorporated into a forward analysis model using finite element analysis, enabling the calculation of yield stress and conductivity from the spatial distribution of Cu/Nb two phases. By randomly varying the layer thickness and interface shape in the two-phase spatial distribution and conducting repeated forward analyses, a database linking microstructural descriptors with yield stress and conductivity was created. These microstructural descriptors include volume fraction, geometric features, topological features, spatial correlation functions, and persistent homology. The significance of each microstructural descriptor on yield stress and conductivity was quantified using machine learning techniques. The results revealed that the Cu volume fraction, layer thickness, and 0th Betti number are crucial for yield stress, while for conductivity, the Cu volume fraction has the strongest influence, followed by layer thickness and layer continuity. Based on these outcomes, the Pareto front for ARB Cu/Nb laminates in the strength-conductivity space was presented.
... This is realized by tracking the evolution of connected pieces and holes by calculating the homology groups of an increasing sequence of complexes (called a filtration). The specific process is as follows [17]: First, a Manhattan distance is assigned to each pixel based on the boundary between black and white pixels ( Fig. 2(a)). There are two types of assignments: white-based, which considers the white direction as positive, and black-based, which is the opposite. ...
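The pixel-wise Manhattan distance assignment described in this excerpt can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, the image is a small made-up example, and the exact convention used by HomCloud (for instance, which value pixels adjacent to the boundary receive) is not given in the excerpt, so the choice made here is an assumption.

```python
from collections import deque


def manhattan_to_nearest(image, target):
    """Manhattan distance from every pixel to the nearest pixel equal to `target`."""
    rows, cols = len(image), len(image[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == target:
                dist[r][c] = 0
                queue.append((r, c))
    if not queue:
        raise ValueError("image contains no pixel with the target value")
    # Multi-source breadth-first search over 4-connected neighbours.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


def signed_manhattan(image, positive):
    """Positive inside the `positive` phase, negative in the other phase."""
    other = 1 - positive
    to_other = manhattan_to_nearest(image, other)
    to_positive = manhattan_to_nearest(image, positive)
    return [
        [
            to_other[r][c] if image[r][c] == positive else -to_positive[r][c]
            for c in range(len(image[0]))
        ]
        for r in range(len(image))
    ]


# Hypothetical 5x5 binary image: 1 = white, 0 = black.
image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
white_based = signed_manhattan(image, positive=1)
black_based = signed_manhattan(image, positive=0)
```

With this convention the black-based assignment is simply the negation of the white-based one, matching the "opposite" description in the excerpt.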
... Therefore, it is expected that the PD will be a useful descriptor for analyzing energetics in domain structure. In this analysis, feature extraction was performed using HomCloud, a Python API capable of executing PH analysis [17]. To extract diverse information due to the significant changes in domain shapes depending on frequency, we extracted both 0th and 1st homology information. ...
... During the conversion, we created PDs based on white and black for both 0th and 1st homology groups, resulting in four PDs for each magnetic domain image. Next, the PDs were transformed into feature vectors whose elements are the values of the generators blurred by a Gaussian kernel [17], and these vectors were combined to create an 8,320-dimensional vector. By stacking these vectors for all images, we created a feature matrix of size 6,000 × 8,320. ...
Article
The kinetics of magnetic domain structure in soft magnetic materials is crucial for the understanding of their functional properties, such as coercivity and loss. We have developed a high-speed and real-time magnetic domain measurement system based on the magnetic-optical Kerr effect (MOKE) microscope. High-speed evolution of domain structures of YIG (yttrium iron garnet) thin film under AC magnetic field and its frequency-dependent hysteresis curves were measured by the system. Subsequently, we combined persistent homology (PH) with principal component analysis (PCA), a dimensionality reduction method, to extract topological information of domain structure and analyzed complex magnetization process. We successfully extracted physically meaningful features of the frequency-dependent magnetic domain structures. As a result, by using the machine-learning outputted features, the coercivity contributing factors of magnetization reversal process were visualized onto domain structures. We also found that the occurrence of the coercivity factors increases along excitation magnetic field frequency, indicating the increase of loss. These findings provide new insights into the relationship between coercivity and magnetic domain structure dynamics.
... covariate), that is, as a vector (Adams 2017; Berry et al. 2020). Obayashi et al. (2018) have applied this prescription to carry out TDA with persistence diagrams in machine learning (see also Adams et al. (2017)). They specifically used the following transformation to construct persistence image functions (Obayashi et al. 2018): ...
Article
This paper presents a new approach to survival analysis using topological data analysis (TDA) within Bayesian statistics combined with machine learning algorithms suitable to time-to-event data. The paper brings into the analysis aspects of topological invariance through what is known as persistent homology. TDA demonstrates the existence and statistical significance of a kind of unmeasured heterogeneity originating from the topology of the data as a whole. Combined with machine learning tools, persistent homology provides us with new tools to construct a rich set of ways to analyze data and build predictive models that are optimized using inherent topological invariants such as one-dimensional loops as regularization. Specifically, this paper incorporates persistent homology effects in different ways in the analysis of survival data through the technique of functional principal component analysis (FPCA): first, by using topological invariants converted into FPCA factors that shape Bayesian statistical analysis of time-to-event data; second, by using FPCA measures of topological invariants in regularizing the process of optimizing the data and the posterior distributions of the Bayesian estimation; third, by using FPCA factors of measures of topological invariants in machine learning algorithms and deep neural networks suitable for analyzing survival data as a way of going beyond usual parametric and semi-parametric models of survival analysis. The approach is illustrated through a running example of multi-frailty survival analysis of democracies in the period 1950–2010.
... We perform a similar analysis to discriminate between two types of point processes: a Poisson point process (PPP) and a Ginibre point process (GPP). This setup has been introduced in Obayashi et al. (2018). The specificity of Ginibre processes lies in repulsive interactions between points. ...
Preprint
In this article, we study Euler characteristic techniques in topological data analysis. Pointwise computing the Euler characteristic of a family of simplicial complexes built from data gives rise to the so-called Euler characteristic profile. We show that this simple descriptor achieves state-of-the-art performance in supervised tasks at a very low computational cost. Inspired by signal analysis, we compute hybrid transforms of Euler characteristic profiles. These integral transforms mix Euler characteristic techniques with Lebesgue integration to provide highly efficient compressors of topological signals. As a consequence, they show remarkable performance in unsupervised settings. On the qualitative side, we provide numerous heuristics on the topological and geometric information captured by Euler profiles and their hybrid transforms. Finally, we prove stability results for these descriptors as well as asymptotic guarantees in random settings.
... Some methods such as landscapes (Bubenik et al., 2015), persistence images (Adams et al., 2017), or ATOL immediately get rid of the measure representation and transform the data into a vector. It then becomes possible to plug these vector representations into a standard classifier; we refer to Obayashi et al. (2018) for classification using linear classifiers. Some papers use kernel methods, such as Carriere et al. (2017) or Le and Yamada (2018), while some other works make use of neural networks, such as , and more recently Reinauer et al. (2021). ...
... The boosting algorithm aggregates these classifiers and improves the classification performance by up to 10 % as opposed to considering a single rectangle. A second experiment conducted is based on the experimental set-up from Obayashi et al. (2018). We sample Poisson (PPP) and Ginibre (GPP) point processes on the disk, with 30 points on average and compute their one-dimensional persistence diagrams. ...
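The Poisson half of this experimental set-up can be sketched with the standard library alone. The code below is an illustration, not the cited authors' implementation: the function names, the unit radius, and the random seed are assumptions made here, and sampling the Ginibre process is not attempted.

```python
import math
import random


def sample_poisson_count(mean, rng):
    """Knuth's multiplication method; adequate for moderate means such as 30."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ppp_on_disk(mean_points, radius, rng):
    """Homogeneous Poisson point process on a disk centred at the origin."""
    n = sample_poisson_count(mean_points, rng)
    points = []
    for _ in range(n):
        # The square root makes the radial density proportional to r,
        # so that points are uniform with respect to area.
        r = radius * math.sqrt(rng.random())
        theta = 2.0 * math.pi * rng.random()
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points


# Assumed parameters: unit disk, 30 points on average, fixed seed for repeatability.
rng = random.Random(0)
cloud = sample_ppp_on_disk(mean_points=30, radius=1.0, rng=rng)
```

The number of points varies from sample to sample; only its mean is fixed at 30.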
... We have reached similar classification accuracy (around 94% in both cases). Obayashi et al. (2018) apply a logistic regression to a persistence image transform of the persistence diagrams. When using an L1 penalty, this induces sparsity and highlights a zone of the persistence image useful for discrimination. ...
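How an L1 penalty produces sparse weights can be illustrated with a small proximal-gradient (ISTA) solver. This is a sketch under stated assumptions, not the solver used in the cited work: the function names, step size, penalty value, and the two-feature toy data are all hypothetical, and the model has no intercept.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def soft_threshold(value, amount):
    """Proximal operator of the L1 norm: shrink toward zero by `amount`."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0


def fit_logistic_l1(features, labels, penalty, step=0.5, iterations=500):
    """Proximal gradient descent (ISTA) on the mean logistic loss, no intercept."""
    n, dim = len(features), len(features[0])
    weights = [0.0] * dim
    for _ in range(iterations):
        residuals = [
            sigmoid(sum(w * x for w, x in zip(weights, row))) - y
            for row, y in zip(features, labels)
        ]
        gradient = [
            sum(res * row[k] for res, row in zip(residuals, features)) / n
            for k in range(dim)
        ]
        # Gradient step on the smooth loss, then shrinkage for the L1 term.
        weights = [
            soft_threshold(w - step * g, step * penalty)
            for w, g in zip(weights, gradient)
        ]
    return weights


# Hypothetical toy data: feature 0 separates the classes, feature 1 is only weakly related.
features = [[1.0, 0.2], [1.0, -0.1], [-1.0, 0.1], [-1.0, -0.1]]
labels = [1, 1, 0, 0]
sparse_weights = fit_logistic_l1(features, labels, penalty=0.1)
dense_weights = fit_logistic_l1(features, labels, penalty=0.0)
```

With the penalty switched on, the weight of the weak feature stays exactly zero, while without it the same feature receives a small nonzero weight. Applied to a persistence image, the surviving nonzero weights would mark the region of the image that drives the classification.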
Preprint
We consider a binary supervised learning classification problem where instead of having data in a finite-dimensional Euclidean space, we observe measures on a compact space $\mathcal{X}$. Formally, we observe data $D_N = (\mu_1, Y_1), \ldots, (\mu_N, Y_N)$ where $\mu_i$ is a measure on $\mathcal{X}$ and $Y_i$ is a label in $\{0, 1\}$. Given a set $\mathcal{F}$ of base-classifiers on $\mathcal{X}$, we build corresponding classifiers in the space of measures. We provide upper and lower bounds on the Rademacher complexity of this new class of classifiers that can be expressed simply in terms of corresponding quantities for the class $\mathcal{F}$. If the measures $\mu_i$ are uniform over a finite set, this classification task boils down to a multi-instance learning problem. However, our approach allows more flexibility and diversity in the input data we can deal with. While such a framework has many possible applications, this work strongly emphasizes classifying data via topological descriptors called persistence diagrams. These objects are discrete measures on $\mathbb{R}^2$, where the coordinates of each point correspond to the range of scales at which a topological feature exists. We will present several classifiers on measures and show how they can heuristically and theoretically enable a good classification performance in various settings in the case of persistence diagrams.
... The presented analysis framework is similar to our previous work [8], but the introduction of NMF with concatenated PIs improves the interpretability of the result. ...
Article
This paper proposes a data analysis method using persistent homology and nonnegative matrix factorization. A concatenated persistence image technique is used to extract coexisting structures from the persistence diagrams of different dimensions hidden behind the data. To demonstrate the potential of our method, we apply the method to 3D voxel data of iron ore sinters obtained by X-ray computed tomography. The analysis successfully captures the coexistence structures in these iron ore sinters.
... In order to present the process of the creation of the persistence diagram, it is necessary to introduce a number of concepts related to the TDA methods. Homology is the topological invariant (a parameter that is immutable during the transformation of a given topological space) that represents a set of n-dimensional holes (Obayashi et al. [2018]). It characterizes the topological properties of a given scale of the analyzed object, allowing for comparison of two different topological spaces (Pereira and de Mello [2015]). ...
... where $T_r$ is the set of all points within distance $r$ of some element of $P$, $B_r(x_i) = \{y \in \mathbb{R}^N : \|y - x_i\| \le r\}$ is a ball centered at $x_i$ with radius $r$, and $N$ denotes the dimensionality of the space (Obayashi et al. [2018]). As the radius increases, segments (1-simplices) are created which delimit a certain two-dimensional space (Fig. 4(c)). ...
... The results can be obtained both in the form of a persistence diagram and a text file containing the coordinates of the birth-death points. The algorithm on which the HomCloud software is based applies, for analyzing binary images, a function that uses the Manhattan distance to assign appropriate values to pixels (or voxels) depending on their position relative to the phase boundary (Obayashi et al. [2018]). In the example of PD0 formation presented in Fig. 6 (based on Obayashi et al. [2018]), the gray phase has positive values and the white phase negative values. ...
Preprint
Uncovering microstructure evolution mechanisms that accompany the long-term operation of solid oxide fuel cells is a fundamental challenge in designing a more durable energy system for the future. To date, the study of fuel cell stack degradation has focused mainly on electrochemical performance and, more rarely, on averaged microstructural parameters. Here we show an alternative approach in which an evolution of three-dimensional microstructural features is studied using electron tomography coupled with topological data analysis. The latter produces persistent images of microstructure before and after long-term operation of electrodes. Those images unveil a new insight into the degradation process of three involved phases: nickel, pores, and yttrium-stabilized zirconium.
... en. html) developed by Obayashi et al. for PH analysis, machine learning, and visualization [25,45]. Before PH, we took the absolute value of $M_z$ and normalized its intensity from 0 to 1. ...
... To perform RR and PCA, the PDs were converted to a persistence image [25]. We set the dispersion in the PD to σ = 0.03; the mesh range to [0, 1]; and the mesh size to 100. ...
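A persistence image with the parameters quoted in this excerpt can be sketched as follows. This is a simplified illustration and not the HomCloud implementation: the function name and the example diagram are hypothetical, σ is assumed to be the standard deviation of the Gaussian, and any weighting function or normalization constant used in practice is omitted.

```python
import math


def persistence_image(diagram, sigma=0.03, lower=0.0, upper=1.0, mesh=100):
    """Sum of isotropic Gaussians centred at each (birth, death) pair,
    evaluated at the centres of a mesh x mesh grid and flattened row by row
    (rows index death, columns index birth)."""
    cell = (upper - lower) / mesh
    centres = [lower + (k + 0.5) * cell for k in range(mesh)]
    vector = []
    for death_axis in centres:
        for birth_axis in centres:
            vector.append(
                sum(
                    math.exp(
                        -((birth_axis - b) ** 2 + (death_axis - d) ** 2)
                        / (2.0 * sigma ** 2)
                    )
                    for b, d in diagram
                )
            )
    return vector


# Hypothetical diagram with two generators, given as (birth, death) pairs.
diagram = [(0.205, 0.605), (0.405, 0.455)]
vector = persistence_image(diagram)
```

The resulting vector has 100 × 100 = 10,000 entries and can be fed to methods such as PCA or ridge regression, as described in the excerpt.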
Article
The magnetization reversal in nanomagnets is causally analyzed using an extended Landau free-energy model. This model draws an energy landscape in the information space using physics-based features. Thus, the origin of the magnetic effect in macroscopic pinning phenomena can be identified. The microscopic magnetic domain beyond the hierarchy can be explained using energy gradient analysis and its decomposition. Structural features from the magnetic domains are extracted using persistent homology. Extended energy is visualized using ridge regression, principal component analysis, and Hadamard products. We found that the demagnetization energy concentration near a defect causes the demagnetization effect, which quantitatively dominates the pinning phenomenon. The exchange energy inhibits pinning, promotes saturation, and shows slight interactions with the defect. Furthermore, the energy distributions are visualized in real space. Left-position defects reduce the energy barrier and are useful for the topological inverse design of recording devices.
... This is because machine learning can extract the features inherent in the data. It can also enable logical causal analyses based on the accumulated results of correlation analyses [20][21][22][23]. ...
... TDA is a novel analysis method for constructing a super-hierarchical and quantitative linkage between microscopic fine structures and macroscopic physical properties. TDA, which uses a combination of a novel topological concept called persistent homology (PH) and machine learning, has attracted increasing attention in recent years [20]. PH is a powerful tool that can quantitatively describe the features (size, shape, fluctuation, and connectivity of holes and islands) of fine structures. ...
... PH is a powerful tool that can quantitatively describe the features (size, shape, fluctuation, and connectivity of holes and islands) of fine structures. Machine learning enables the construction of correlations between features and various physical properties [20][21][22][23]. ...
Article
The microstructures of magnetic domains are crucial in determining the functions of spintronic devices. However, the magnetization reversal mechanism is still not fully understood because of the difficulty in quantifying the drastic and complex changes in the magnetic domain structure. Here, we used topological data analysis and developed a super-hierarchical and explanatory analysis method for magnetic reversal processes. We quantified the complexity of a magnetic domain structure using persistent homology and visualized the magnetization reversal process in a two-dimensional space using principal component analysis. The first principal component (PC1) was a descriptor explaining the magnetization, and the second principal component (PC2) was a crucial descriptor characterizing the stability of the magnetic domain structure. Interestingly, PC2 detected slight changes in the structure, which indicates a hidden feature dominating the metastable/stable reversal processes. We successfully determined the cause of the branching of the macroscopic reversal process on the original microscopic magnetic domain structure. This super-hierarchical and explanatory analysis would improve the reliability of spintronics devices and understanding of stochastic/deterministic magnetization reversal phenomena.
... In this paper we apply the technique of simplification [69] developed within persistent homology theory. Another widely used related method is the analysis of persistence diagrams for revealing their correlations with different geometrical, topological and physical properties of porous media [70][71][72][73][74]. Edelsbrunner et al. [65] were the first to explore the connections between Morse theory and persistent homology, and the persistent homology algorithms found their way into practically all the latest works on image analysis in the framework of discrete Morse theory. ...
... Discrete Morse theory was also recently successfully applied for the image analysis in astronomy [76,77], topography [78] and medicine biology [79]. The results of image analysis using discrete Morse theory and persistent homology are becoming a trend in the machine learning or deep learning algorithms where they are converted into training features [72,73,78,79]. ...
Article
Pore-scale modeling based on the 3D structural information of porous materials has enormous potential in assessing physical properties beyond the capabilities of laboratory methods. Such capabilities are pricey in terms of computational expenses, and this limits the applicability of the direct simulations to a small volume and requires high-performance computational resources, especially for multiphase flow simulations. The only pore-scale technique capable of dealing with large representative volumes of porous samples is pore-network (PNM) based modeling. The problem of the PNM approach is that 3D pore geometry first needs to be simplified into a graph of pores and throats that conserve topological and geometrical properties of the original 3D image. While significant progress has been achieved in terms of geometry representation, no methodology provides full conservation of the topological features of the pore structure. In this paper we present a pore-network extraction algorithm for binary 3D images based on discrete Morse theory and persistent homology that by design targets topology preservation. In addition to methodological developments, we also clarify the relationship between topological characteristics of constructed Morse chain complex and pore-network elements. We show that the Euler numbers calculated for PNMs based on our methodology coincide with those obtained using the direct topological analysis. The characteristics of the extracted pore network are calculated for several 3D porous binary images and compared with the results of maximum inscribed balls-based and watershed-based approaches as well as a hybrid approach to support our methodology.