FIG 10 - uploaded by Lexing Ying
Three data sets in 3D. Left: densities distributed on the unit sphere. Middle: densities distributed uniformly in the unit cube. Right: densities distributed at the eight corners of the unit cube.

Source publication
Article
We present a new fast multipole method for particle simulations. The main feature of our algorithm is that it does not require the implementation of multipole expansions of the underlying kernel, and it is based only on kernel evaluations. Instead of using analytic expansions to represent the potential generated by sources inside a box of the hiera...

Context in source publication

Context 1
... all density distributions the densities are chosen randomly from [0, 1). The three data sets for the 3D case are shown in Figure 10. ...
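
As a rough illustration of the three point distributions named in the caption, the following sketch generates each one with NumPy and draws the densities uniformly from [0, 1), as the excerpt states. The function names, the `spread` parameter, and the choice to cluster points near the corners with folded Gaussian offsets are assumptions made for this example; the source does not say how the corner distribution was sampled.

```python
import numpy as np

def sphere_points(n, rng):
    """Points uniformly distributed on the surface of the unit sphere."""
    v = rng.normal(size=(n, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def cube_points(n, rng):
    """Points uniformly distributed in the unit cube [0, 1)^3."""
    return rng.random((n, 3))

def corner_points(n, rng, spread=0.05):
    """Points clustered near the eight corners of the unit cube.

    `spread` (standard deviation of the offsets) is a hypothetical
    parameter chosen for illustration only.
    """
    corners = np.array(
        [[i, j, k] for i in (0, 1) for j in (0, 1) for k in (0, 1)],
        dtype=float,
    )
    which = rng.integers(0, 8, size=n)           # corner assigned to each point
    offsets = np.abs(rng.normal(scale=spread, size=(n, 3)))
    inward = 1.0 - 2.0 * corners[which]          # +1 at coordinate 0, -1 at coordinate 1
    return np.clip(corners[which] + inward * offsets, 0.0, 1.0)

def make_dataset(kind, n, seed=0):
    """Return (points, densities) for kind in {'sphere', 'cube', 'corners'}."""
    rng = np.random.default_rng(seed)
    generators = {
        "sphere": sphere_points,
        "cube": cube_points,
        "corners": corner_points,
    }
    points = generators[kind](n, rng)
    densities = rng.random(n)                    # uniform in [0, 1)
    return points, densities
```

For example, `make_dataset("sphere", 1000)` returns a 1000-by-3 array of points and 1000 densities.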

Similar publications

Article
Three-dimensional Boundary Element Method accelerated using a heterogeneous Fast Multipole Method is employed to study two-phase flow in a microchannel of different cross-sections. The flow of a mixture of two Newtonian liquids with equal viscosities and densities is described by Stokes equations. A comparison of the simulation results of the dropl...
Conference Paper
We present a preconditioner for an intrusive stochastic Galerkin method (SGM) based scattering solver that also leverages the multilevel fast multipole method (MLFMM). The proposed preconditioner is essential in developing a general and intrusive SGM method. Simulation results are obtained for a canonical scattering structure with perfect electrica...
Article
The fast multipole method (FMM) is commonly used to reduce the time to solution of a wide variety of N-body type problems. To use the FMM, the elements that constitute the geometry of the problem are clustered into groups of a given size (D), a choice that may strongly affect the time to solution of the FMM. The optimal value of D, in the sense of minimizing...
Preprint
We present an algorithm to parallelize the inverse fast multipole method (IFMM), which is an approximate direct solver for dense linear systems. The parallel scheme is based on a greedy coloring algorithm, where two nodes in the hierarchy with the same color are separated by at least $\sigma$ nodes. We proved that when $\sigma \ge 6$, the workload...

Citations

... In this category, there are certain versions of Adaptive Cross Approximation (ACA) [2,3], black-box fast multipole method [11], and Nyström approximation [6]. There are also certain analytic techniques such as multipole expansions [14], Taylor expansions, equivalent densities [35], and the proxy point method [34]. For the global case, and for kernels that define positive semi-definite matrices, there are various techniques such as Nyström's method [33], random Fourier features [26,1], and randomly pivoted Cholesky [8]. ...
Preprint
Computing low-rank approximations of kernel matrices is an important problem with many applications in scientific computing and data science. We propose methods to efficiently approximate and store low-rank approximations to kernel matrices that depend on certain hyperparameters. The main idea behind our method is to use multivariate Chebyshev function approximation along with the tensor train decomposition of the coefficient tensor. The computations are in two stages: an offline stage, which dominates the computational cost and is parameter-independent, and an online stage, which is inexpensive and instantiated for specific hyperparameters. A variation of this method addresses the case that the kernel matrix is symmetric and positive semi-definite. The resulting algorithms have linear complexity in terms of the sizes of the kernel matrices. We investigate the efficiency and accuracy of our method on parametric kernel matrices induced by various kernels, such as the Matérn kernel, through various numerical experiments. Our methods have speedups up to $200\times$ in the online time compared to other methods with similar complexity and comparable accuracy.
... Originally, the Fast Multipole Method (FMM) [38,27,26] and the panel clustering method [31,39] used explicit expansions of g which are kernel-specific. This drawback is removed in kernel-independent variants [15,41,22,34] but the utilized general expansions can become less efficient. This is demonstrated by comparing spherical harmonics with polynomials in three dimensions. ...
Preprint
Boundary integral equation formulations of elliptic partial differential equations lead to dense system matrices when discretized, yet they are data-sparse. Using the $\mathcal{H}$-matrix format, this sparsity is exploited to achieve $\mathcal{O}(N\log N)$ complexity for storage and multiplication by a vector. This is achieved purely algebraically, based on low-rank approximations of subblocks, and hence the format is also applicable to a wider range of problems. The $\mathcal{H}^2$-matrix format improves the complexity to $\mathcal{O}(N)$ by introducing a recursive structure onto subblocks on multiple levels. However, in practice this comes with a large proportionality constant, making the $\mathcal{H}^2$-matrix format advantageous mostly for large problems. In this paper we investigate the usefulness of a matrix format that lies in between these two: Uniform $\mathcal{H}$-matrices. An algebraic compression algorithm is introduced to transform a regular $\mathcal{H}$-matrix into a uniform $\mathcal{H}$-matrix, which maintains the asymptotic complexity.
... The dashed lines highlight a band of radius $r = \sqrt{2}/10$ around the diagonal. (b) $L^2$-norm of the error between the Green's function $G$ and its truncation $G_r$ along a bandwidth of radius $r$ along the diagonal of the domain. Rokhlin, 1997; Ying et al., 2004). It allows for the evaluation of the integral operation in Eq. (3) in linear complexity. ...
... Recently, more variants of the FMA are developed [27], [28], [29], [30], [31], [32], [33], [34]. Specifically, for the static cases with smooth kernels, several kernel-independent FMAs (KI-FMA) [27], [28], [29] are developed. Some of these FMAs leverage the spatial sampling expansions [27], [28], while some are based on the spectral Fourier expansions [29]. ...
... Specifically, for the static cases with smooth kernels, several kernel-independent FMAs (KI-FMA) [27], [28], [29] are developed. Some of these FMAs leverage the spatial sampling expansions [27], [28], while some are based on the spectral Fourier expansions [29]. Furthermore, for the dynamic cases with the oscillatory kernels, several directional FMAs (D-FMA) [30], [31], [32], [33], [34] are developed. ...
Article
A directional multi-level complex-space fast multipole algorithm (DMLCSFMA) is proposed for solving electrically large problems of various dimensions. This algorithm implements a high-frequency generalization of the well-known mid-frequency multi-level fast multipole algorithm (MLFMA). It is established by exploring the fundamental connection between the conventional MLFMA and the recently developed directional fast multipole algorithms (D-FMA), as well as the plane wave expansion induced from the complex source beam (Gaussian beam). Different from the conventional MLFMA which exhibits the complexity of $O(N^2)$ for certain situations such as the quasi-one dimensional elongated object, the proposed high-frequency generalized version is capable of achieving a stable complexity of $O(N \log N)$, irrespective of the dimensional features of the objects. Besides, the proposed algorithm also manifests itself as a spectral counterpart of the traditional D-FMAs. However, unlike the traditional D-FMAs which leverage the equivalent source based sampling expansions, the proposed algorithm is established using the plane wave based exponential expansions. Thus, the feasibility of building a D-FMA with analytically diagonalized translators is also demonstrated in this work. Several numerical examples are provided to illustrate the complexity and accuracy of the proposed algorithm.
... The finite element method holds a distinct advantage over the finite difference method due to its capacity to represent smooth head tissue surfaces through adaptable polyhedral-type elements. The boundary element method formulates the problem as equivalent integral equations assuming constant conductivities between the tissue surfaces (29)(30)(31)(32)(33). ...
... Since this class of problems is ubiquitous, the corresponding literature is vast. We will not attempt a comprehensive review here, but simply point out that there are, broadly speaking, two classes of fast algorithms for such problems: (a) tree-based methods such as the fast multipole method (FMM) and its variants [12,16,27,31,36,37,60,72,73] or multilevel summation [9,10,62,41,42,47], and (b) uniform grid-based methods that rely on the fast Fourier transform (FFT) such as Ewald summation (see, for example, [18,46,61,65]). The tree-based methods have the advantage of permitting adaptive discretization in the case of either continuous or discrete sources, and can achieve linear scaling. ...
... DMK is a dual-space method, like Ewald summation, making use of the Fourier transform to diagonalize these interactions with a significant cost savings. Diagonal translation plays a role in some existing kernel-independent FMMs, such as [72] and [73], where Fourier-based convolution is used to account for well-separated interactions. ...
Preprint
We introduce a new class of multilevel, adaptive, dual-space methods for computing fast convolutional transforms. These methods can be applied to a broad class of kernels, from the Green's functions for classical partial differential equations (PDEs) to power functions and radial basis functions such as those used in statistics and machine learning. The DMK (dual-space multilevel kernel-splitting) framework uses a hierarchy of grids, computing a smoothed interaction at the coarsest level, followed by a sequence of corrections at finer and finer scales until the problem is entirely local, at which point direct summation is applied. The main novelty of DMK is that the interaction at each scale is diagonalized by a short Fourier transform, permitting the use of separation of variables, but without requiring the FFT for its asymptotic performance. The DMK framework substantially simplifies the algorithmic structure of the fast multipole method (FMM) and unifies the FMM, Ewald summation, and multilevel summation, achieving speeds comparable to the FFT in work per gridpoint, even in a fully adaptive context. For continuous source distributions, the evaluation of local interactions is further accelerated by approximating the kernel at the finest level as a sum of Gaussians with a highly localized remainder. The Gaussian convolutions are calculated using tensor product transforms, and the remainder term is calculated using asymptotic methods. We illustrate the performance of DMK for both continuous and discrete sources with extensive numerical examples in two and three dimensions.
... Therefore, K(x, y) can be approximated by a low-order expansion, and low rank approximations for off-diagonal matrix blocks corresponding to far-field interactions can be constructed. Representative fast algorithms following this idea include the fast multipole method (FMM) and its variants [10][11][12][13], treecode algorithm [14], $\mathcal{H}^2$-matrix [15], adaptive cross approximation [16], and nested cross approximation [17,18], etc. One of the main differences between these algorithms is that they use different functions in the low-order expansion of the kernel, or different algebraic schemes in deriving the low rank approximation for matrix blocks. ...
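
The low-rank claim in the excerpt above can be checked numerically. The sketch below is an illustration rather than any cited paper's method: it assumes the 3D Laplace kernel 1/|x - y| (constant factor omitted), random points in unit boxes, and a relative singular-value tolerance of 1e-6, and it compares the numerical rank of a well-separated block with that of an adjacent block. All names and parameter values are chosen for this example.

```python
import numpy as np

def kernel_block(targets, sources):
    """Dense block K[i, j] = 1 / |x_i - y_j| (3D Laplace kernel, constant omitted)."""
    diff = targets[:, None, :] - sources[None, :, :]
    return 1.0 / np.linalg.norm(diff, axis=2)

def numerical_rank(block, tol=1e-6):
    """Number of singular values above tol times the largest singular value."""
    s = np.linalg.svd(block, compute_uv=False)
    return int(np.sum(s > tol * s[0]))

rng = np.random.default_rng(0)
n = 200
sources = rng.random((n, 3))                                   # box [0, 1)^3
far_targets = rng.random((n, 3)) + np.array([5.0, 0.0, 0.0])   # well-separated box
near_targets = rng.random((n, 3)) + np.array([1.0, 0.0, 0.0])  # adjacent box

far_rank = numerical_rank(kernel_block(far_targets, sources))
near_rank = numerical_rank(kernel_block(near_targets, sources))
print(f"far-field rank: {far_rank}, near-field rank: {near_rank}, full size: {n}")
```

The far-field block should need far fewer singular values than the adjacent block to reach the same tolerance, which is the property that FMM-type and hierarchical-matrix methods exploit.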
Preprint
A nearly optimal explicitly-sparse representation for oscillatory kernels is presented in this work by developing a curvelet based method. Multilevel curvelet-like functions are constructed as the transform of the original nodal basis. Then the system matrix in a new non-standard form is derived with respect to the curvelet basis, which would be nearly optimally sparse due to the directional low rank property of the oscillatory kernel. Its sparsity is further enhanced via a-posteriori compression. Finally its nearly optimal log-linear computational complexity with controllable accuracy is demonstrated with numerical results. This explicitly-sparse representation is expected to lay the groundwork for future work related to fast direct solvers and effective preconditioners for high frequency problems. It may also be viewed as the generalization of wavelet based methods to high frequency cases, and used as a new wideband fast algorithm for wave problems.
... In principle, we can use the kernel-independent fast multipole method [54] to accelerate the evaluations of $\rho_T(q_i)$ for all $q_i \in Q_{\mathrm{int}}$. Here we consider (6.2) as an N-body problem by treating $Q$, $Q_{\mathrm{int}}$ and $W$ as source points, target points and source densities, respectively. ...
Preprint
We present a finite element scheme for fractional diffusion problems with varying diffusivity and fractional order. We consider a symmetric integral form of these nonlocal equations defined on general geometries and in arbitrary bounded domains. A number of challenges are encountered when discretizing these equations. The first comes from the heterogeneous kernel singularity in the fractional integral operator. The second comes from the dense discrete operator with its quadratic growth in memory footprint and arithmetic operations. An additional challenge comes from the need to handle volume conditions, the generalization of classical local boundary conditions to the nonlocal setting. Satisfying these conditions requires that the effect of the whole domain, including both the interior and exterior regions, can be computed on every interior point in the discretization. Performed directly, this would result in quadratic complexity. To address these challenges, we propose a strategy that decomposes the stiffness matrix into three components. The first is a sparse matrix that handles the singular near-field separately and is computed by adapting singular quadrature techniques available for the homogeneous case to the case of spatially variable order. The second component handles the remaining smooth part of the near-field as well as the far-field and is approximated by a hierarchical $\mathcal{H}^{2}$ matrix that maintains linear complexity in storage and operations. The third component handles the effect of the global mesh at every node and is written as a weighted mass matrix whose density is computed by a fast-multipole type method. The resulting algorithm has therefore overall linear space and time complexity. Analysis of the consistency of the stiffness matrix is provided and numerical experiments are conducted to illustrate the convergence and performance of the proposed algorithm.
... The use of a hierarchical grid allows coarser levels of approximation with larger interaction distances. The difficulty in formulating the multipole expansion of the desired kernel motivated development of the kernel-independent FMM (KIFMM) [11] or similar methods such as the multi-level multi-interaction cluster (MLMIC) [12,13]. An advantage of the FMM is that it naturally handles unbounded BCs, although periodic or homogeneous Neumann or Dirichlet BCs may also be treated by using the method of images [14,15]. ...
Preprint
A solver for the Poisson equation for 1D, 2D and 3D regular grids is presented. The solver applies the convolution theorem in order to efficiently solve the Poisson equation in spectral space over a rectangular computational domain. Conversion to and from the spectral space is achieved through the use of discrete Fourier transforms, allowing for the application of highly optimised O(NlogN) algorithms. The data structure is configured to be modular such that the underlying interface for operations to, from and within the spectral space may be interchanged. For computationally demanding tasks, the library is optimised by making use of parallel processing architectures. A range of boundary conditions can be applied to the domain including periodic, Dirichlet, Neumann and fully unbounded. In the case of Neumann and Dirichlet boundary conditions, arbitrary inhomogeneous boundary conditions may be specified. The desired solution may be found either on regular (cell-boundary) or staggered (cell-centre) grid configurations. For problems with periodic, Dirichlet or Neumann boundary conditions either a pseudo-spectral or a second-order finite difference operator may be applied. For unbounded boundary conditions a range of Green's functions are available. In addition to this, a range of differential operators may be applied in the spectral space in order to treat different forms of the Poisson equation or to extract highly accurate gradients of the input fields. The underlying framework of the solver is first detailed, followed by a range of validations for each of the available boundary condition types. Finally, the performance of the library is investigated. The code is free and publicly available under a GNU v3.0 license.
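
To make the spectral approach concrete, here is a minimal sketch of a pseudo-spectral Poisson solve on a periodic 2D grid using NumPy's FFT routines. It is not the library described in the abstract: the function name `solve_poisson_periodic`, the sign convention (Laplacian of u equals f), and the choice to return the zero-mean solution are assumptions made for this example.

```python
import numpy as np

def solve_poisson_periodic(f, length=2.0 * np.pi):
    """Solve laplacian(u) = f on a periodic square domain of side `length`.

    `f` is sampled on a regular nx-by-ny grid. The solution is only defined
    up to a constant, so the zero-mean solution is returned.
    """
    nx, ny = f.shape
    # Angular wavenumbers along each axis.
    kx = 2.0 * np.pi * np.fft.fftfreq(nx, d=length / nx)
    ky = 2.0 * np.pi * np.fft.fftfreq(ny, d=length / ny)
    kxx, kyy = np.meshgrid(kx, ky, indexing="ij")
    k2 = kxx**2 + kyy**2

    f_hat = np.fft.fft2(f)
    k2[0, 0] = 1.0               # placeholder to avoid division by zero
    u_hat = -f_hat / k2          # in spectral space the Laplacian is -|k|^2
    u_hat[0, 0] = 0.0            # fix the free constant: zero-mean solution
    return np.real(np.fft.ifft2(u_hat))

# Usage: recover u = sin(x) cos(2y), whose Laplacian is -5 u.
n = 32
x = np.arange(n) * 2.0 * np.pi / n
X, Y = np.meshgrid(x, x, indexing="ij")
u_exact = np.sin(X) * np.cos(2.0 * Y)
u = solve_poisson_periodic(-5.0 * u_exact)
max_error = np.max(np.abs(u - u_exact))
```

The unbounded, Dirichlet, and Neumann cases mentioned in the abstract need different transforms or Green's functions and are not covered by this sketch.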
... Actually, a similar scheme can be derived for any kernel K(x, y) (not only the Coulomb one) that satisfies a particular assumption, namely asymptotically smooth behavior [9]. There exist different approaches to obtain a fast separable formula in a kernel-independent way [46,18]. Among them, we are especially interested in the interpolation-based FMM (IBFMM) because of its flexibility, its well documented error bounds, and the particular form of the approximate expansion of K it provides. ...
... Hence, only w has to be computed, meaning that only a single FFT needs to be performed during evaluation of Eq. (46). As another important consequence of Cor. 2, once this FFT has been performed, one can keep only the real part of the output (the imaginary part being equal to 0). ...
Preprint
To evaluate electrostatic interactions, molecular dynamics (MD) simulations rely on Particle Mesh Ewald (PME), an O(N log N) algorithm that uses Fast Fourier Transforms (FFTs), or alternatively on O(N) Fast Multipole Method (FMM) approaches. However, the low scalability of FFTs remains a strong bottleneck for large-scale PME simulations on supercomputers. In contrast, FFT-free FMM techniques are able to deal efficiently with such systems, but they fail to reach PME performance for small- to medium-size systems, limiting their real-life applicability. We propose ANKH, a strategy grounded on interpolated Ewald summations and designed to remain efficient and scalable for any system size. The method is generalized to distributed point multipoles, and hence to induced dipoles, which makes it suitable for high-performance simulations using new-generation polarizable force fields towards exascale computing.