Journal of Sensors (Hindawi/Wiley), Volume 2023, Article ID 6691332, https://doi.org/10.1155/2023/6691332
Research Article
SSF-Align: Point Cloud Registration Based on Statistical Shape
Features with Manifold Metric
Pu Ren,¹ Chongbin Xu,² Xiaomin Sun,² Yuan Li,³ and Haiying Tao¹

¹Beijing Institute of Graphic Communication, Beijing, China
²Beijing Institute of Space Mechanics & Electricity, Beijing, China
³Xi'an Museum, Xi'an, China
Correspondence should be addressed to Pu Ren; renpu@bigc.edu.cn
Received 19 July 2023; Revised 30 October 2023; Accepted 6 November 2023; Published 28 November 2023
Academic Editor: Yunchao Tang
Copyright ©2023 Pu Ren et al. This is an open access article distributed under the Creative Commons Attribution License, which
permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
As an important topic in 3D vision, point cloud registration has been widely used in various applications, including localization, reconstruction, and shape recognition. In this paper, we propose a new registration method for this topic, which utilizes statistical shape features (SSFs) and manifold metrics to estimate the transformation matrix. The SSFs are extracted to establish a compact representation of the original point cloud. Then, the representation is mapped into a manifold to reduce the influences of different scales and translations. Finally, the manifold metric is used to minimize the distance based on the compact representation, and the pose can be estimated. The advantages of our method include robustness to nonuniform densities, insensitivity to missing parts, and better performance in handling large pose differences. Experimental results show that our method achieves significant improvements compared to state-of-the-art methods.
1. Introduction
Following the development of 3D scanning technology, 3D point clouds and related analysis have been widely used in practical scenarios. Compared to 2D images, 3D point clouds carry complete geometric information with texture, which supports more accurate quantitative analysis. As one important topic in this field, point cloud registration methods have been studied for decades and can be regarded as a fundamental issue in a series of 3D vision tasks. Related achievements have been successfully used in numerous commercial applications, including digital entertainment, intelligent healthcare, and remote sensing data processing.
The mainstream technological routes can be roughly divided into two categories: global correspondence and local matching. The global correspondence-based route searches for a transformation that minimizes a certain metric in a global view. It is robust to local random variations produced by noise and nonuniform densities. However, the performance is limited by the selected metric, and local optima cannot be completely avoided. In addition, missing parts in point clouds have significant impacts that cannot be ignored. The local matching route prefers to establish local region-based correspondence directly. It reflects the relationships of significant geometric features and does not require a complex searching operation in the global view. Therefore, it is not sensitive to missing parts. Obviously, the drawback of this route is the dependence on the quality of the local feature analysis; noisy points and random distributions introduce instability.
Recently, a new solution (KSS-ICP) [1] was proposed, which can be regarded as a global correspondence scheme. The novelty of the solution is that a manifold metric is used to improve the accuracy of the measurement. It significantly reduces the influences of scaling, translation, and rotation by alignment while improving the robustness to defective parts. However, it still has some limitations: low computational efficiency, a lack of local feature analysis, and a restricted metric. KSS-ICP requires many comparisons to achieve the final alignment. Even when GPU-based parallel acceleration is used, the efficiency is still lower than others (>1 min for a point cloud with more than 50k points). Because the method does not consider local features, it is still sensitive to missing parts to a certain degree, which cannot be solved by the Hausdorff distance-based metric.
In this paper, we propose a new registration scheme that combines statistical shape feature (SSF) analysis and a manifold metric to align point clouds. Inspired by KSS-ICP, we establish a compact representation for the original point cloud, which reduces the influences of scaling and translation. At the same time, we extract the SSFs from the point cloud to further simplify the compact representation. The salient geometric features are kept. Based on the representation, we design a new manifold metric to align point clouds. It is used to improve the performance of the local feature analysis. Finally, we implement the registration scheme, which achieves a balance between global correspondence and local matching. The pipeline is shown in Figure 1. The contributions are summarized as follows:

(1) We present a compact representation for the raw point cloud, which reduces the influences of scaling and translation while considering SSFs. It takes less space and carries more salient geometric features, which is more suitable for the manifold metric.

(2) We propose a new manifold metric for the alignment. Compared to traditional metrics, the proposed manifold metric has the property of shape feature invariance. It improves the performance of global correspondence by considering local features.

(3) We report a comprehensive analysis of the proposed method on two classical datasets. It provides quantitative evidence of the performance of our method.
The rest of the paper is organized as follows. In Section 2, we introduce related works on point cloud registration. In Section 3, we introduce some fundamental details of the shape space and the SSF. In Section 4, we explain the construction of the compact representation for point clouds. In Section 5, we show the details of the manifold metric for alignment. The experiments in Section 6 show the performance of our method.
2. Related Works

Global correspondence schemes estimate the transformation matrix based on a global-view metric. The representative methods are ICP [2] and its variants [1, 3–5]. ICP-based registration aligns point clouds according to the potential point-based distance. During the iterative correction, correspondences between points are estimated and the transformation matrix is obtained. However, the registration process is sensitive to the initial poses of the point clouds, which makes the registration drop into local optima with high probability. The variants are proposed to reduce this probability. Fast-ICP [5] utilized the Welsch function and the Lie algebra form to represent the registration error and the transformation matrix, which greatly improved the accuracy and convergence speed of the ICP algorithm.
As a well-known solution, Go-ICP [6] established a global searching strategy to find the potential solution with a BnB scheme [7]. It can be regarded as a controllable exhaustive searching process. KSS-ICP [1] is a new solution that combines the ICP scheme with a manifold metric on Kendall shape space. Influences produced by noisy points and different scales are reduced significantly. Following the development of deep learning frameworks, some researchers employ the related technologies to implement the registration. PointNetLK [8] creatively combined the classic Lucas–Kanade (LK) algorithm for the 2D image registration task with the well-known PointNet for 3D point clouds, which opened up a new path for the application of deep learning to 3D point cloud registration. Similarly, many solutions improve the efficiency of the registration with effective feature coding [9, 10].
FIGURE 1: Pipeline of our method (template and source point clouds; SSF extraction; shape space mapping; rotation searching; registration).

Local matching schemes search the point-based correspondence directly. The advantage is that such schemes can solve the low-overlap registration task with a local view, so incomplete point clouds can be processed. Point-based correspondences are estimated by local geometric features, including the normal distributions transform, the 3D context descriptor, and the Point Feature Histograms [11–13]. As a representative method, the fast point feature histograms (FPFH) [12] provide a more robust local shape feature that benefits from normal vector-based statistical analysis. Due to the dependency on local features, such methods are sensitive to noisy points and regions with nonuniform densities. Recent works attempt to use deep neural networks to improve the robustness of feature analysis [14, 15]. Benefiting from prior knowledge learning, such frameworks can extract semantic features to guide point-based alignment. However, the feature coding process suffers on point clouds with complex similarity transformations, especially with scale differences and nonuniform densities.
Recent research has become more focused on specific issues. For instance, a method based on a geometric attention network was proposed to solve the partial point cloud registration problem [16]. Qin et al. [17] proposed a solution that is both keypoint-free and RANSAC-free. Yang et al. [18] proposed a mutual voting method for ranking 3D correspondences, which can be directly utilized in 3D point cloud registration.
Our registration method inherits the advantages of the global correspondence scheme and combines them with local feature analysis at the same time. Inspired by KSS-ICP, we introduce the manifold metric to align point clouds in an efficient way. In the following parts, we discuss the implementation details.
3. Fundamentals
3.1. Shape Space. For quantitative shape analysis between point cloud models, influences of different similarity transformations (shape-preserving transformations) should be removed. Following this basic requirement, Kendall [19] provided the prior work to measure 3D models on Kendall shape space. The Kendall shape space is a manifold space constructed from a discrete shape form (point sequence). It is also a quotient space that removes the influences of similarity transformations. For point cloud-based measurement, the Kendall shape space provides a manifold metric that is not affected by the Euclidean-space representation under similarity transformations, including translations, scaling, and rotations. Such a property can be used to align point clouds. If the metric on Kendall shape space is considered as the optimization objective, then the shape space can be regarded as a solution space in which to search for optimal similarity transformations as the registration result. We provide a mathematical representation for Kendall shape space:
M = \mathbb{R}^{3n} \setminus \{0\}, \quad K_s = M/G, \quad G = \{T, S, O\}, \qquad (1)
where M is a manifold space constructed by a discrete shape form with n 3D points and G represents the similarity transformation group with translation T, scaling S, and rotation O. K_s represents the Kendall shape space constructed by M/G, which is still a manifold and, at the same time, a quotient group space. For point clouds, once we define and implement the mapping from Euclidean space to Kendall shape space, the influence of similarity transformations can be removed. The mapping is represented as follows:

K_s = T \cdot S \cdot O(a), \qquad (2)

where a is a discrete point sequence with point number n, and T·S·O represent the related normalizations that remove the influences of translation, scaling, and rotation. After the normalization, a similarity transformation cannot change the representation of a (for instance, T(a) = T(t·a), where t is a translation that changes the position of a). The distance on Kendall shape space is a manifold metric that reflects the shape similarity between point clouds. The registration result can be achieved from the reverse operation of T·S·O. In Section 4, we discuss an implementation of the normalization based on the Kendall theory.
3.2. Statistical Shape Feature. The manifold metric on Kendall shape space represents the global matching between point clouds. It can be used to describe the alignment from the global view. Naturally, the drawback of the metric is that some local geometric features are lost. This has a serious impact when the convex hull of the point cloud model has a significant symmetry structure, as in indoor scenes, transportation, and furniture. In such cases, the local geometric information of the model becomes an important feature that cannot be ignored. Therefore, the local geometric information should be considered in the registration.

To formulate this information, some shape operators have been proposed to provide a quantitative representation as the local shape feature. Such operators describe local geometric details statistically based on normal vectors or curvature values. Representative operators include the normal distributions transform and FPFH [12]. However, such methods depend on the quality of normal estimation, and the performance is inevitably reduced for noisy point clouds. Therefore, a reasonable SSF should not depend on normal vectors and should be robust to random disturbances of points. Mathematically, it can be expressed as follows:
\mathrm{SSF}(p_i, n_i) = \mathrm{SSF}(p_i + \alpha, n_i + \beta) + \varepsilon, \qquad (3)

where p_i is a point with normal vector n_i, SSF(p_i) represents the required SSF, α and β are random disturbances, and ε is an acceptable error. The SSF represents the mapping distance based on the fitting plane in a local region. The interference caused by random noise on the mapping distance of a point is significantly reduced when constrained by the fitting plane. In the following parts, we provide the implementation details according to the requirements of the SSF.
4. Compact Representation with SSF
Generally, a scanned point cloud contains redundant points and random outliers that reduce the computational efficiency. To improve the performance of the alignment, such points should be processed first. We employ simplification processing [20] and outlier deletion [21] as a preprocessing step. After preprocessing, the density of the point cloud is optimized and the point number is made uniform. By default, the simplification number is set to 50k.

To implement the manifold metric, the simplified point cloud needs to be normalized in order to reduce the influence of scaling and translation, which can be regarded as the functions T and S mentioned before. The normalization can be formulated as follows:
K(P) = (p_1 - \bar{p}, \ldots, p_n - \bar{p}) / \|P\|, \quad \bar{p} = \frac{1}{n} \sum_{i=1}^{n} p_i, \quad \|P\| = \left( \sum_{i=1}^{n} \|p_i - \bar{p}\|^2 \right)^{1/2}, \qquad (4)
where P is the simplified point cloud with the specified number n and p_i is a point of P. In KSS-ICP, such a representation is directly used in the alignment. Local features are ignored completely, which affects the accuracy of the further alignment to a certain extent. To solve the problem, we introduce an SSF to combine the local feature analysis into the alignment. Some feature points are labeled according to the SSF, which provides a reference for the global correspondence.
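As a concrete illustration, the normalization K(P) of eq. (4) can be sketched in a few lines; the function name and the use of NumPy are our own choices, not part of the paper:

```python
import numpy as np

def normalize_cloud(P):
    """Center and scale a point cloud as in eq. (4): K(P) = (p_i - mean) / ||P||."""
    centered = P - P.mean(axis=0)          # remove translation T
    norm = np.sqrt((centered ** 2).sum())  # scale factor ||P||
    return centered / norm                 # remove scaling S

# Translated and uniformly scaled copies map to the same representation:
P = np.random.rand(100, 3)
Q = 2.4 * P + np.array([5.0, -3.0, 1.0])   # similarity transform without rotation
assert np.allclose(normalize_cloud(P), normalize_cloud(Q))
```

This invariance is exactly the property the manifold metric relies on: only the rotation remains to be searched.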
To establish a functional SSF, a mature solution is to build a local shape description based on normal vectors or curvatures, such as FPFH [12]. It represents shape features by statistically analyzing the normal vector-based angles of a local region. Naturally, it is sensitive to noisy points. To solve this problem, we provide a tangent space-based statistical measure to represent the SSF. It can be formulated as follows:
\mathrm{SSF}(p_i) = \sum_{p_j \in N(p_i)} \mathrm{dis}(p_j, T), \quad T = \mathrm{PCA}(N(p_i)), \qquad (5)
where N(p_i) is the neighbor set of p_i, T is the tangent plane defined by the largest eigenvector of N(p_i), and dis represents the mapping distance between the point and the plane. It is clear that if a point is located in a region with sharp curvature changes, the value of SSF(p_i) is larger. On the contrary, if the point lies on a plane, even if there are noisy points in its neighbor region, the related SSF value is still small. According to the SSF value, we label the top 10% of points (500) as feature points. An instance is shown in Figure 2. Finally, we achieve the compact representation of the original point cloud with SSF-based labels.
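The tangent space-based statistic of eq. (5) can be sketched as follows; the neighborhood size k and the brute-force neighbor search are illustrative assumptions (a KD-tree and the paper's preprocessing would be used in practice):

```python
import numpy as np

def ssf(points, k=16):
    """Per-point SSF: sum of distances from the k nearest neighbors to the
    PCA fitting plane of that neighborhood (our reading of eq. (5))."""
    scores = np.empty(len(points))
    for i in range(len(points)):
        d2 = ((points - points[i]) ** 2).sum(axis=1)
        nbrs = points[np.argsort(d2)[:k]]          # k nearest neighbors (incl. self)
        centered = nbrs - nbrs.mean(axis=0)
        # plane normal = direction of least variance (smallest singular value)
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        scores[i] = np.abs(centered @ vt[-1]).sum()  # distances to fitting plane
    return scores

# Points on a plane score near zero; a point off the plane scores higher.
rng = np.random.default_rng(0)
plane = np.c_[rng.random((50, 2)), np.zeros(50)]
spike = np.array([[0.5, 0.5, 0.3]])
s = ssf(np.vstack([plane, spike]))
assert s[50] > np.median(s[:50])
```

Because the score is a deviation from a fitted plane rather than a per-point normal, small random perturbations of the neighbors change it only slightly, matching the robustness requirement of eq. (3).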
5. SSF-Based Manifold Metric

Based on the compact representation with SSF, we propose the SSF-based manifold metric for the registration task. It is used to evaluate the quality of the alignment between point clouds. As a classical measurement, the Hausdorff distance has been widely used to measure geometric consistency between two shapes [20, 22]. However, it only provides an upper-bound estimation and ignores internal feature correspondence. For point clouds with symmetrical structures, the accuracy of the measurement is reduced significantly.
In KSS-ICP, the Hausdorff distance is used to simulate the manifold metric on Kendall shape space. We modify the metric with the SSF to enhance the local feature analysis. Then, a new manifold metric is proposed for shape similarity measurement on Kendall shape space, represented as follows:
FIGURE 2: Visualization of SSF extraction. (a) Original point cloud, (b) preprocessing result, and (c) SSF-based labels (red points).
E(K(P_1), K(P_2)) = \alpha \, \mathrm{dis}_m(K(P_1), K(P_2)) + \beta \, \mathrm{dis}_m(K(P_1)_{\mathrm{SSF}}, K(P_2)_{\mathrm{SSF}}), \qquad (6)

\mathrm{dis}_m(K(P_1), K(P_2)) = \frac{1}{n} \sum_{i=1}^{n} \sqrt{\|p_i - q_i\|^2}, \quad p_i \in K(P_1), \; q_i \in K(P_2), \qquad (7)
where E represents the new manifold metric between the two compact representations of point clouds P_1 and P_2, dis_m is the mean distance between the representations, q_i is the point with minimum distance to p_i, and α and β are weights that control the influences of the global measurement and the local feature alignment. It is clear that the new metric considers global correspondence and local feature analysis at the same time. The registration task is thereby transferred into an optimization problem.
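A minimal sketch of the metric in eqs. (6) and (7); the nearest-neighbor pairing, the boolean feature masks from the SSF labeling, and the default weights are our assumptions:

```python
import numpy as np

def mean_nn_distance(A, B):
    """dis_m of eq. (7): for each point of A, distance to its nearest point
    in B, averaged over A."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)  # pairwise distances
    return d.min(axis=1).mean()

def ssf_metric(P1, P2, feat1, feat2, alpha=1.0, beta=1.0):
    """E of eq. (6): global term over all points plus a term over SSF feature points."""
    return (alpha * mean_nn_distance(P1, P2)
            + beta * mean_nn_distance(P1[feat1], P2[feat2]))

rng = np.random.default_rng(1)
P = rng.random((200, 3))
feat = np.zeros(200, dtype=bool)
feat[:20] = True
# identical clouds give a zero metric
assert ssf_metric(P, P.copy(), feat, feat) == 0.0
```

The second term penalizes misaligned feature points, which is what breaks ties between the near-symmetric global configurations the text mentions.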
To optimize the manifold metric, we design a discrete solution-searching scheme based on a parallel structure. The optimization target is the rotation, represented as follows:

O_r = \arg\min_{O} \{ E(K(P_1), O(K(P_2))) \}, \qquad (8)
where the rotation O_r is the target result that corresponds to the global minimum of E. To implement the optimization, we generate a candidate rotation set {O} with certain steps for each coordinate axis:

\{O\} = \{R_x, R_y, R_z\}, \qquad (9)

where R_x, R_y, R_z are rotation angles around the corresponding axes. Each one corresponds to a discrete range of angles with a certain step, e.g., R_x ∈ {0°, 30°, …, 300°, 330°}. The number of steps is 12 and the size of {O} is 12³. To improve the efficiency, we use a GPU-based parallel structure to accelerate the search for O_r. The performance approaches real time and the achieved rotation is the discrete global result. Figure 3 shows the GPU-based rotation searching scheme.
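The discrete search of eq. (8) over the 12³ candidate rotations can be sketched on the CPU as follows (the paper evaluates the candidates in parallel on the GPU; the mean nearest-neighbor distance here is a stand-in for the full metric E):

```python
import itertools
import numpy as np

def rotation_matrix(rx, ry, rz):
    """Compose per-axis rotations (radians) as Rz @ Ry @ Rx."""
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def search_rotation(P1, P2, steps=12):
    """Brute-force eq. (8): try all steps^3 axis-angle combinations and keep
    the rotation minimizing the mean nearest-neighbor distance."""
    angles = np.arange(steps) * (2 * np.pi / steps)
    best, best_err = None, np.inf
    for rx, ry, rz in itertools.product(angles, repeat=3):
        R = rotation_matrix(rx, ry, rz)
        Q = P2 @ R.T                                   # rotate candidate cloud
        d = np.linalg.norm(P1[:, None, :] - Q[None, :, :], axis=2)
        err = d.min(axis=1).mean()
        if err < best_err:
            best, best_err = R, err
    return best, best_err

rng = np.random.default_rng(2)
P1 = rng.random((40, 3)) - 0.5
P2 = P1 @ rotation_matrix(0, 0, np.pi / 2).T           # 90° about z, on the grid
R_est, err = search_rotation(P1, P2)
assert err < 1e-8
```

Since every candidate is evaluated independently, the loop maps directly onto one GPU thread per rotation, which is the parallel structure of Figure 3.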
6. Experiments
The performance of our method is evaluated in this section. The experimental machine is equipped with an Intel i7 3.2 GHz CPU, 16 GB RAM, and an RTX 2080 GPU, with Windows 10 as its operating system and Visual Studio 2019 as the development platform. The test dataset is collected from ModelNet40, RGB-D scenes, and multisource sampling data. The experiments include the following parts: first, we introduce some details of the test dataset and the related metrics we used; second, we compare different methods on the test dataset with similarity transformations and noisy point clouds; third, we provide a comprehensive analysis based on the experimental data; and finally, we test our method in a practical application. In addition, we utilize the 3D point clouds from the practical application to verify the effectiveness of our method on real data.
6.1. Dataset and Metrics. In this part, the test dataset is collected from ModelNet40 [23] and RGB-D scenes [24]. We randomly select 500 models from ModelNet40 and all models from RGB-D scenes for the experiments. To evaluate the performance of registration with similarity transformations, we add operations of random rotations ([30°, 90°]), scaling ([1.2, 2.4]), and translations ([1, 10]) to generate a source dataset. The original test dataset is used as the template one. In addition, to evaluate the noise robustness, we add random noisy points into the source dataset to generate two noisy datasets with different Gaussian distributions, as in Lv et al.'s [1] study.
The quantitative metrics are the average values of the mean squared error (MSE) and the mean absolute error (MAE) based on Euclidean distances and normal vector-based angles. Based on the test dataset and related metrics, the experimental results are reported.
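For reference, the two point-based error terms can be computed as below; this is a sketch over corresponded point pairs, and the normal vector angle-based variants follow the same pattern on angles instead of distances:

```python
import numpy as np

def registration_errors(aligned, template):
    """MSE and MAE over corresponded point pairs (Euclidean residuals)."""
    residuals = np.linalg.norm(aligned - template, axis=1)
    return (residuals ** 2).mean(), np.abs(residuals).mean()

A = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
B = np.array([[0.1, 0.0, 0.0], [1.0, 0.0, 0.0]])
mse, mae = registration_errors(A, B)
assert np.isclose(mse, 0.005) and np.isclose(mae, 0.05)
```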
FIGURE 3: GPU-based rotation searching scheme.
6.2. Comparisons
6.2.1. Similarity Transformations. Some representative methods are selected for comparison, including ICP [2], FPFH [12], Go-ICP [6], PointNetLK [8], Fast-ICP [5], and KSS-ICP [1]. ICP and FPFH are implemented with the PCL library. The other methods provide related code (GitHub) referenced in their papers. In Figures 4 and 5, we show some registration results by different methods. Our method achieves better results, especially for complex 3D scenes with symmetric structures. The related metrics are reported in Tables 1 and 2 as a further explanation.
6.2.2. Noise Robustness. As mentioned before, the noisy datasets are used to evaluate the noise robustness of the different methods. In Figure 6, we show instances of noisy point clouds with different Gaussian distributions. Related metrics are reported in Table 3. It is clear that our method also achieves better results on these noisy data. Compared to FPFH, our SSF is extracted from the tangent space without
FIGURE 4: Comparisons of different methods on models of ModelNet40. Blue: template point cloud; yellow: source point cloud and registration results by different methods.
FIGURE 5: Comparisons of different methods on models of RGB-D scenes.
TABLE 1: Metrics of different registration methods on models of ModelNet40.
MSE MSE (n) MAE MAE (n)
ICP [2] 9.4E-3 5.3E-1 1.3E-1 7.3E-1
FPFH [12] 2.3E-3 2.8E-1 3.1E-2 3.6E-1
Go-ICP [6] 1.5E-3 2.6E-1 2.8E-2 3.2E-1
PointNetLK [8] 2.2E-2 7.5E-1 8.9E-2 7.8E-1
Fast-ICP [5] 1.1E-2 5.4E-1 5.8E-2 5.3E-1
KSS-ICP [1] 3.6E-4 1.1E-1 8.7E-3 1.2E-1
Ours 3.5E-4 1.1E-1 8.2E-3 1.1E-1
MSE(n) and MAE(n) represent the normal vector angle-based MSE and MAE. Bold values signify the best results.
TABLE 2: Metrics of different registration methods on models of RGB-D scenes.
MSE MSE (n) MAE MAE (n)
ICP [2] 1.4E-2 5.6E-1 7.2E-2 5.6E-1
FPFH [12] 1.6E-2 5.2E-1 7.4E-2 5.4E-1
Go-ICP [6] 8.3E-3 4.5E-1 5.5E-2 4.9E-1
PointNetLK [8] 2.1E-1 7.7E-1 2.8E-1 7.2E-1
Fast-ICP [5] 5.8E-2 5.1E-1 1.2E-1 5.3E-1
KSS-ICP [1] 4.4E-4 1.8E-2 3.9E-1 4.5E-1
Ours 3.2E-4 1.3E-1 1.7E-2 1.2E-1
Bold values signify the best results.
FIGURE 6: Instances of noisy point clouds with different Gaussian distributions (original, σ = 0.33, σ = 0.66).
TABLE 3: Metrics of different registration methods on noisy models of RGB-D scenes.
σ = 0.33 σ = 0.66
Method MSE MSE (n) MAE MAE (n)
ICP [2] 3.4E-2 2.2E-1 2.5E-2 8.6E-2
FPFH [12] 6.7E-3 4.3E-2 4.4E-3 4.6E-2
Go-ICP [6] 5.2E-1 5.6E-1 4.7E-1 4.3E-1
PointNetLK [8] 5.6E-2 2.5E-1 5.3E-2 2.8E-1
Fast-ICP [5] 3.9E-2 2.1E-1 3.5E-2 9.3E-2
KSS-ICP [1] 5.9E-3 2.6E-2 3.6E-3 5.5E-2
Ours 1.8E-3 1.1E-2 2.6E-3 3.7E-2
σ is the parameter of the Gaussian distribution for noisy point generation. Bold values signify the best results.
point-based normal vector analysis, which improves the noise robustness. Due to the SSFs, the symmetrical structures are well handled while maintaining noise robustness. Our method efficiently solves the problem of the traditional manifold metric provided by KSS-ICP.
6.3. Analysis. The proposed method combines the SSF and manifold metrics to implement the registration task. It inherits the advantages of the global shape analysis scheme while enhancing the local shape feature correspondence. The traditional ICP scheme depends on the initial poses, which increases the probability of local optima. FPFH proposes a local shape feature to implement point-based correspondence. The feature is established from normal vectors that are sensitive to noisy points. The SSF of our method is extracted in tangent space. It measures the deviation degree between a point and its neighborhood-based fitting plane. Even if noisy points introduce some random movements, the deviation degrees are kept in a small range, which improves the noise robustness. KSS-ICP presents a global shape analysis framework based on Kendall shape space. Our method fully exploits the advantages of its framework while addressing its issues through adjustments to the metric based on the SSF. Therefore, we achieve better results on the standard test dataset, especially on symmetrical structures of 3D points. The time cost is reported in Table 4. Benefiting from the SSF, our method can simplify the point cloud to fewer points than KSS-ICP, which reduces the computation for the alignment.
6.4. Application. We also test our method in a practical 3D digitization application for Chinese ancient architecture. The target building is the Small Wild Goose Pagoda, a world-famous cultural relic located in Xi'an, China. This pagoda was built in the Tang Dynasty (618-907 AD) and is one of the most well-known landmarks of the city. As a typical representative of the dense-eaves pagoda of the Tang Dynasty, it has rich historical and cultural value. Thus, the 3D digitization data are very important for related academic research.
In the past, using 3D laser scanning for such a massive piece of architecture was not only costly but also inefficient. The most challenging aspect is that the height of the building is more than 40 m, with a large amount of brick carving detail on the facades. As a result, we cannot acquire the complete 3D data with a single technique. Fortunately, with the maturity of unmanned aerial vehicle (UAV) hardware and the continuous progress of aerial
TABLE 4: Time cost report for different registration methods.
ModelNet40 (s) RGB-D scenes (s)
ICP [2] 14.5 156.3
FPFH [12] 79.3 267.3
Go-ICP [6] 48.6 >300
PointNetLK [8] 5.3 78.3
Fast-ICP [5] 76.3 163.5
KSS-ICP [1] 41.2 50.3
Ours 9.8 19.6
Bold values signify the best results.
FIGURE 7: Registration results from real point clouds (Small Wild Goose Pagoda). The blue point cloud is the scanned model from the real object, and the orange ones are reconstructed points from images.
oblique photography measurement technology, 3D digitization of tower-style ancient architecture at relatively low cost became possible. However, it also presents higher challenges for registration algorithms.

In this practical application, aerial oblique photography and ground laser scanning are used simultaneously. On the ground, we utilized a FARO Focus Premium terrestrial laser scanner to collect high-precision 3D point clouds with an accuracy of 0.2 cm. In the aerial photography stage, two UAVs were used to cruise different paths to collect aerial images around the pagoda. We collected 7,400 images in total and reconstructed the 3D point cloud in the software ContextCapture. In the data processing part, we need to conduct the 3D point cloud registration not only between different views but also on data from various sources.
The complex data sources brought challenges to the registration algorithm: there are many more outliers in the real-world data, and different sources generate point clouds with various densities and features. It is worth mentioning that our proposed method is very suitable for dealing with these practical problems. Figure 7 shows the comparison results of our method and the KSS-ICP algorithm. The two input point clouds are generated by different techniques. The laser-scanned point cloud is colored blue, while the reconstructed point cloud is orange. Both pieces of data are postprocessed with downsampling to reduce the data volume.
In the middle of Figure 7, we can see that the KSS-ICP algorithm cannot achieve accurate alignment for the input data. The reason is that the two inputs are generated at different scales, and there is a partial defect in the scanned data compared with the reconstructed one. Because it considers SSFs, our method reduces the influences of scaling and translation, and thus achieves ideal registration results. Figure 8 further illustrates our advantages in applications.
In Figure 8, the blue point cloud is the entire dataset that we generate from reconstructed and scanned points. To improve the point density, we captured additional pictures with a digital camera from the ground. The reconstruction program was also executed on these images, and the reconstructed 3D points are colored orange in Figure 8. The features of the input data differ much more than those of the previous inputs in point cloud density, scale, and integrity. The experimental results show that our method achieves better results than KSS-ICP, which proves the effectiveness of our method on real data.
7. Conclusions

In this paper, we propose an SSF-based method for 3D point cloud registration. It combines global correspondence and local shape feature analysis during the registration. The SSF is extracted from the tangent space, which improves the noise robustness. The SSF-based manifold metric provides a more accurate measurement for alignment. With a parallel structure, the computational efficiency of our method is improved significantly. Experiments show that our method achieves a better balance among accuracy, robustness, and efficiency.
Data Availability

The data used to support the findings of this study are included within the article.
Conflicts of Interest

The authors declare that they have no conflicts of interest.
FIGURE 8: Registration results from real point clouds (Small Wild Goose Pagoda). The blue one is the entire reconstructed point cloud, and the orange one is the supplement reconstructed from the ground pictures.

Authors' Contributions

Pu Ren and Chongbin Xu did the methodology; Pu Ren and Xiaomin Sun worked on the software; Yuan Li took part in the 3D scanning of the Small Wild Goose Pagoda; and Haiying Tao worked on the project administration of the study.
Acknowledgments
This research is supported by the Beijing Natural Science Foundation (4214064), BIGC Project (Eb202308 and Ea202316), General Projects of the Social Science Program of the Beijing Municipal Education Commission (SM202210015003), and BIGC Educational Project (22150121033/002). The authors thank the data and code providers on GitHub.
References
[1] C. Lv, W. Lin, and B. Zhao, "KSS-ICP: point cloud registration based on Kendall shape space," IEEE Transactions on Image Processing, vol. 32, pp. 1681–1693, 2023.
[2] P. J. Besl and N. D. McKay, "A method for registration of 3-D shapes," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 2, pp. 239–256, 1992.
[3] A. W. Fitzgibbon, "Robust registration of 2D and 3D point sets," Image and Vision Computing, vol. 21, no. 13–14, pp. 1145–1153, 2003.
[4] S. Ying, J. Peng, S. Du, and H. Qiao, "A scale stretch method based on ICP for 3D data registration," IEEE Transactions on Automation Science and Engineering, vol. 6, no. 3, pp. 559–565, 2009.
[5] J. Zhang, Y. Yao, and B. Deng, "Fast and robust iterative closest point," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 7, pp. 3450–3466, 2022.
[6] J. Yang, H. Li, D. Campbell, and Y. Jia, "Go-ICP: a globally optimal solution to 3D ICP point-set registration," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 11, pp. 2241–2254, 2016.
[7] C. Olsson, F. Kahl, and M. Oskarsson, "Branch-and-bound methods for Euclidean registration problems," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 5, pp. 783–794, 2009.
[8] Y. Aoki, H. Goforth, R. A. Srivatsan, and S. Lucey, "PointNetLK: robust & efficient point cloud registration using PointNet," in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7156–7165, IEEE, Long Beach, CA, USA, 2019.
[9] A. Kurobe, Y. Sekikawa, K. Ishikawa, and H. Saito, "CorsNet: 3D point cloud registration by deep neural network," IEEE Robotics and Automation Letters, vol. 5, no. 3, pp. 3960–3966, 2020.
[10] W. Liu, C. Wang, S. Chen et al., "Y-Net: learning domain robust feature representation for ground camera image and large-scale image-based point cloud registration," Information Sciences, vol. 581, pp. 655–677, 2021.
[11] A. Ioannidou, E. Chatzilari, S. Nikolopoulos, and I. Kompatsiaris, "Deep learning advances in computer vision with 3D data: a survey," ACM Computing Surveys, vol. 50, no. 2, pp. 1–38, Article ID 20, 2017.
[12] R. B. Rusu, N. Blodow, and M. Beetz, "Fast point feature histograms (FPFH) for 3D registration," in 2009 IEEE International Conference on Robotics and Automation, pp. 3212–3217, IEEE, Kobe, Japan, 2009.
[13] P. Li, J. Wang, Y. Zhao, Y. Wang, and Y. Yao, "Improved algorithm for point cloud registration based on fast point feature histograms," Journal of Applied Remote Sensing, vol. 10, no. 4, Article ID 045024, 2016.
[14] Y. Wang and J. M. Solomon, "PRNet: self-supervised learning for partial-to-partial registration," in Advances in Neural Information Processing Systems, vol. 32, 2019.
[15] J. Yang, Z. Cao, and Q. Zhang, "A fast and robust local descriptor for 3D point cloud registration," Information Sciences, vol. 346–347, pp. 163–179, 2016.
[16] Y. Chen, Y. Wang, J. Li, Y. Zhang, and X. Gao, "A partial-to-partial point cloud registration method based on geometric attention network," Journal of Sensors, vol. 2023, Article ID 3427758, 12 pages, 2023.
[17] Z. Qin, H. Yu, C. Wang et al., "GeoTransformer: fast and robust point cloud registration with geometric transformer," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 8, pp. 9806–9821, 2023.
[18] J. Yang, X. Zhang, S. Fan, C. Ren, and Y. Zhang, "Mutual voting for ranking 3D correspondences," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1–18, 2023.
[19] D. G. Kendall, "Shape manifolds, procrustean metrics, and complex projective spaces," Bulletin of the London Mathematical Society, vol. 16, no. 2, pp. 81–121, 1984.
[20] C. Lv, W. Lin, and B. Zhao, "Approximate intrinsic voxel structure for point cloud simplification," IEEE Transactions on Image Processing, vol. 30, pp. 7241–7255, 2021.
[21] C. Lv, W. Lin, and B. Zhao, "Intrinsic and isotropic resampling for 3D point clouds," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 3, pp. 3274–3291, 2023.
[22] C. Lv, W. Lin, and B. Zhao, "Voxel structure-based mesh reconstruction from a 3D point cloud," IEEE Transactions on Multimedia, vol. 24, pp. 1815–1829, 2022.
[23] S. Khan, M. Naseer, M. Hayat, S. W. Zamir, F. S. Khan, and M. Shah, "Transformers in vision: a survey," ACM Computing Surveys, vol. 54, no. 10s, pp. 1–41, Article ID 200, 2022.
[24] H. Yu, F. Li, M. Saleh, B. Busam, and S. Ilic, "CoFiNet: reliable coarse-to-fine correspondences for robust point cloud registration," in Advances in Neural Information Processing Systems, vol. 34, pp. 23872–23884, 2021.