10-fold cross-validation technique

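As a rough illustration of the technique named in the title, the sketch below splits sample indices into 10 folds so that each sample is held out for testing exactly once. It is a generic, minimal sketch and not the procedure used in the source publication; the function name `k_fold_indices`, the sample count of 25, and the fixed seed are assumptions made for illustration.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Hypothetical helper for illustration only.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Spread the remainder so that fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Assumed toy setting: 25 samples, 10 folds.
folds = list(k_fold_indices(25, k=10))
for fold_id, (train_idx, test_idx) in enumerate(folds):
    print(fold_id, len(train_idx), len(test_idx))
```

In a full evaluation, a model would be trained on `train_idx` and scored on `test_idx` for every fold, and the ten scores would then be averaged.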

Source publication
Article
The ability to decide if a solution to a pose-graph problem is globally optimal is of high significance for safety-critical applications. Converging to a local minimum may result in severe estimation errors along the estimated trajectory. In this paper, we propose a graph neural network based on a novel implementation of a graph convolutional-like...

Similar publications

Preprint
Multi-robot SLAM systems in GPS-denied environments require loop closures to maintain a drift-free centralized map. With an increasing number of robots and size of the environment, checking and computing the transformation for all the loop closure candidates becomes computationally infeasible. In this work, we describe a loop closure module that is...
Preprint
This paper presents a novel non-Gaussian inference algorithm, Normalizing Flow iSAM (NF-iSAM), for solving SLAM problems with non-Gaussian factors and/or nonlinear measurement models. NF-iSAM exploits the expressive power of neural networks to model normalizing flows that can accurately approximate the joint posterior of highly nonlinear and non-Ga...
Preprint
In this paper, we present BAMF-SLAM, a novel multi-fisheye visual-inertial SLAM system that utilizes Bundle Adjustment (BA) and recurrent field transforms (RFT) to achieve accurate and robust state estimation in challenging scenarios. First, our system directly operates on raw fisheye images, enabling us to fully exploit the wide Field-of-View (FoV...
Article
Pose graph optimization is a non-convex optimization problem encountered in many areas of robotics perception. Its convergence to an accurate solution is conditioned by two factors: the non-linearity of the cost function in use and the initial configuration of the pose variables. In this paper, we present HiPE, a novel hierarchical algorithm for...
Preprint
Camera pose estimation or camera relocalization is the centerpiece in numerous computer vision tasks such as visual odometry, structure from motion (SfM) and SLAM. In this paper we propose a neural network approach with a graph transformer backbone, namely TransCamP, to address the camera relocalization problem. In contrast with prior work where th...

Citations

... The method has also been used to fuse consecutive pose estimation results (Agarwal et al., 2019; Yang et al., 2018). Pose graph optimisation has been improved in multiple ways (Azzam et al., 2021; Chen et al., 2021) and these improved variants are more robust and suitable to perform mobile mapping tasks (Bai et al., 2021; Orsulić et al., 2021). ...
Article
LiDAR odometry enables localising vehicles and robots in the environments where global navigation satellite systems (GNSS) are not available. An inherent limitation of LiDAR odometry is the accumulation of local motion estimation errors. Current approaches heavily rely on loop closure to optimise the estimated sensor poses and to eliminate the drift of the estimated trajectory. Consequently, these systems cannot perform real-time localization and are therefore not practical for a navigation task. This paper presents MoLO, a novel model-based LiDAR odometry approach to achieve real-time and drift-free localization using a 3D model of the environment containing planar surfaces, namely the structural elements of buildings. The proposed approach uses a 3D model of the environment to initialise the LiDAR pose and includes a scan-to-scan registration to estimate the pose for consecutive LiDAR scans. Re-registering LiDAR scans to the 3D model at a certain frequency provides the global sensor pose and eliminates the drift of the trajectory. Pose graphs are built frequently to acquire a smooth and accurate trajectory. A geometry-based method and a learning-based method to register LiDAR scans with the 3D model are tested and compared. Experimental results show that MoLO can eliminate drift and achieve real-time localization while providing an accuracy equivalent to loop closure optimization.
... The proposed modular encoder-agent architecture is comprised of two main components. The first component is a neural network initially proposed in [9] for graph optimality classification. The second is a recurrent-based soft actor-critic (SAC) [10] agent whose policy predicts the optimal orientation retractions [7]. ...
... The sparse bounded degree sum-of-squares (Sparse-BSOS) optimization method [22] formulates SLAM problems as polynomial optimization programs and demonstrates the ability to achieve global minimum solutions without initialization. A deep learning approach for pose-graph global optimality classification was also recently proposed in [9]. ...
... We adopt the architecture proposed in [9], formally presented for the task of global optimality prediction. The poses, or in other words the nodes, of each graph input store a cost feature and the absolute orientation $R_i$ of the node itself. ...
Preprint
The objective of pose SLAM or pose-graph optimization (PGO) is to estimate the trajectory of a robot given odometric and loop closing constraints. State-of-the-art iterative approaches typically involve the linearization of a non-convex objective function and then repeatedly solve a set of normal equations. Furthermore, these methods may converge to a local minimum, yielding sub-optimal results. In this work, we present to the best of our knowledge the first Deep Reinforcement Learning (DRL) based environment and proposed agent for 2D pose-graph optimization. We demonstrate that the pose-graph optimization problem can be modeled as a partially observable Markov Decision Process and evaluate performance on real-world and synthetic datasets. The proposed agent outperforms state-of-the-art solver g2o on challenging instances where traditional nonlinear least-squares techniques may fail or converge to unsatisfactory solutions. Experimental results indicate that iterative-based solvers bootstrapped with the proposed approach allow for significantly higher quality estimations. We believe that reinforcement learning-based PGO is a promising avenue to further accelerate research towards globally optimal algorithms. Thus, our work paves the way to new optimization strategies in the 2D pose SLAM domain.
... [10] improved SLAM sparse algorithm based on non-factor descending, using sparse network to approximate the original dense factor to accelerate the operation. [11] proposed a graph neural network based on a novel implementation of a graph convolutional-like layer to convey messages that facilitate the optimality verification of a 2D pose-graph, which can learn to classify candidate solutions of 2D pose-graphs as optimal or sub-optimal. Although this method sacrifices a slight accuracy, it is rewarded with a significant speed increase. ...
Article
Simultaneous Localization and Mapping (SLAM) is the core technology of intelligent substation inspection robots. Because of its lightweight computation, the Rao-Blackwellized Particle Filter (RBPF) is widely used in two-dimensional SLAM. However, it suffers from poor positioning accuracy, low robustness and rapid cumulative errors despite recent improvements. This paper presents a lidar SLAM system based on RBPF and graph optimization that can adapt to the unstructured operating environment of a substation. Firstly, the diversity of particles is increased by rebuilding the resampling algorithm to improve the robustness of the system, and high-quality poses are estimated in submaps. Secondly, a multi-submap system is established to construct odometry constraints (one pose corresponds to two submaps). Furthermore, the loop detector is an important part of the optimization algorithm, and the branch-and-bound method is used to reduce the computational burden and accelerate loop detection. Finally, the global poses of the robot are optimized using all odometry and loop constraints in real time. Experimental results show that the proposed method is more accurate than other methods, and can maintain and produce high-precision positioning and mapping in complex substation operation and maintenance environments. It provides a new approach to intelligent substation inspection and positioning.
... number of connections and dimension) to address the problem in question. For instance, the works proposed in [36] and [37] designed graphs to represent point clouds and ground vehicle poses, respectively. The features of the nodes and edges in each graph encode the information necessary to solve the problem at hand, such as the 3D point coordinates and the 2D pose of the robot. ...
... In [36], a stack of EdgeConv layers is proposed to capture and exploit fine-grained geometric properties of point clouds which are then employed to carry out classification and segmentation for point cloud data. Another graph convolutional layer is proposed in [37], called PoseConv, to carry out global optimality verification of 2D pose graph SLAM. ...
Preprint
Neuromorphic vision is a bio-inspired technology that has triggered a paradigm shift in the computer-vision community and is serving as a key-enabler for a multitude of applications. This technology has offered significant advantages including reduced power consumption, reduced processing needs, and communication speed-ups. However, neuromorphic cameras suffer from significant amounts of measurement noise. This noise deteriorates the performance of neuromorphic event-based perception and navigation algorithms. In this paper, we propose a novel noise filtration algorithm to eliminate events which do not represent real log-intensity variations in the observed scene. We employ a Graph Neural Network (GNN)-driven transformer algorithm, called GNN-Transformer, to classify every active event pixel in the raw stream into real-log intensity variation or noise. Within the GNN, a message-passing framework, called EventConv, is carried out to reflect the spatiotemporal correlation among the events, while preserving their asynchronous nature. We also introduce the Known-object Ground-Truth Labeling (KoGTL) approach for generating approximate ground truth labels of event streams under various illumination conditions. KoGTL is used to generate labeled datasets, from experiments recorded in challenging lighting conditions. These datasets are used to train and extensively test our proposed algorithm. When tested on unseen datasets, the proposed algorithm outperforms existing methods by 12% in terms of filtration accuracy. Additional tests are also conducted on publicly available datasets to demonstrate the generalization capabilities of the proposed algorithm in the presence of illumination variations and different motion dynamics. Compared to existing solutions, qualitative results verified the superior capability of the proposed algorithm to eliminate noise while preserving meaningful scene events.
Article
In this letter, we present, to the best of our knowledge, the first deep reinforcement learning (DRL) based approach to 2D pose-graph optimization (PGO). We demonstrate that the pose-graph optimization problem can be modeled as a partially observable Markov Decision Process. The proposed agent outperforms state-of-the-art solver $\mathrm{g}^{2}\mathrm{o}$ on challenging instances where traditional nonlinear least-squares techniques may fail or converge to unsatisfactory solutions. Experimental results indicate that iterative-based solvers bootstrapped with the proposed approach allow for significantly higher quality estimations.
Article
The development of autonomous vehicles has prompted an interest in exploring various techniques in navigation. One such technique is simultaneous localization and mapping (SLAM), which enables a vehicle to comprehend its surroundings, build a map of the environment in real-time, and locate itself within that map. Although traditional techniques have been used to perform SLAM for a long time, recent advancements have seen the incorporation of neural network techniques into various stages of the SLAM pipeline. This review paper provides a focused analysis of the recent developments in neural network techniques for SLAM-based localization of autonomous ground vehicles. In contrast to the previous review studies that covered general navigation and SLAM techniques, this work specifically addresses the unique challenges and opportunities presented by the integration of neural networks in this context. Existing review studies have highlighted the limitations of conventional visual SLAM, and this paper aims to explore the potential of deep learning methods. The paper discusses the functions required for localization, as well as several neural network-based techniques proposed by researchers to carry out such functions. Firstly, it presents a general background of the issue, the relevant review studies that have already been done, and the adopted methodology in this review. Then, it provides a thorough review of the findings regarding localization and odometry. Finally, it presents our analysis of the findings, open research questions in the field, and a conclusion. A semi-systematic approach is used to carry out the review.
Article
Neuromorphic vision is a bio-inspired technology that has triggered a paradigm shift in the computer vision community and is serving as a key enabler for a wide range of applications. This technology has offered significant advantages, including reduced power consumption, reduced processing needs, and communication speedups. However, neuromorphic cameras suffer from significant amounts of measurement noise. This noise deteriorates the performance of neuromorphic event-based perception and navigation algorithms. In this article, we propose a novel noise filtration algorithm to eliminate events that do not represent real log-intensity variations in the observed scene. We employ a graph neural network (GNN)-driven transformer algorithm, called GNN-Transformer, to classify every active event pixel in the raw stream into real log-intensity variation or noise. Within the GNN, a message-passing framework, referred to as EventConv, is carried out to reflect the spatiotemporal correlation among the events while preserving their asynchronous nature. We also introduce the known-object ground-truth labeling (KoGTL) approach for generating approximate ground-truth labels of event streams under various illumination conditions. KoGTL is used to generate labeled datasets, from experiments recorded in challenging lighting conditions, including moonlight. These datasets are used to train and extensively test our proposed algorithm. When tested on unseen datasets, the proposed algorithm outperforms state-of-the-art methods by at least 8.8% in terms of filtration accuracy. Additional tests are also conducted on publicly available datasets (ETH Zürich Color-DAVIS346 datasets) to demonstrate the generalization capabilities of the proposed algorithm in the presence of illumination variations and different motion dynamics. Compared to state-of-the-art solutions, qualitative results verified the superior capability of the proposed algorithm to eliminate noise while preserving meaningful events in the scene.
Article
Graph convolutional networks (GCNs) have been introduced to effectively process non-Euclidean graph data. However, GCNs incur large amounts of irregularity in computation and memory access, which prevents efficient use of traditional neural network accelerators. Moreover, existing dedicated GCN accelerators demand high memory volumes and are difficult to implement on resource-limited edge devices. In this work, we propose LW-GCN, a lightweight FPGA-based accelerator with a software-hardware co-designed process to tackle irregularity in computation and memory access in GCN inference. LW-GCN decomposes the main GCN operations into Sparse Matrix-Matrix Multiplication (SpMM) and Matrix-Matrix Multiplication (MM). We propose a novel compression format to balance workload across PEs and prevent data hazards. Moreover, we apply data quantization and workload tiling, and map both SpMM and MM of GCN inference onto a uniform architecture on resource-limited hardware. Evaluations on GCN and GraphSAGE are performed on a Xilinx Kintex-7 FPGA with three popular datasets. Compared to existing CPU, GPU, and state-of-the-art FPGA-based accelerators, LW-GCN reduces latency by up to 60x, 12x and 1.7x and increases power efficiency by up to 912x, 511x and 3.87x, respectively. Furthermore, compared with NVIDIA's latest edge GPU Jetson Xavier NX, LW-GCN achieves speedup and energy savings of 32x and 84x, respectively.
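The decomposition mentioned in the abstract above can be illustrated with a small software sketch: a single GCN layer of the standard form ReLU(A · X · W), computed as a dense MM (X · W) followed by an SpMM with a sparse adjacency matrix A. This is only an illustrative sketch and not LW-GCN's implementation; the CSR storage, the ordering of the two products, the unnormalized adjacency matrix, and all function names and toy values are assumptions.

```python
def mm(a, b):
    """Dense matrix-matrix multiplication (MM) on lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def spmm(indptr, indices, data, dense):
    """Sparse (CSR) times dense matrix multiplication (SpMM)."""
    n_cols = len(dense[0])
    out = []
    for row in range(len(indptr) - 1):
        acc = [0.0] * n_cols
        # Visit only the stored non-zeros of this row.
        for p in range(indptr[row], indptr[row + 1]):
            col, val = indices[p], data[p]
            for j in range(n_cols):
                acc[j] += val * dense[col][j]
        out.append(acc)
    return out


def gcn_layer(indptr, indices, data, x, w):
    """One GCN layer: ReLU(A @ (X @ W)), with A given in CSR form."""
    xw = mm(x, w)                            # feature transform (MM)
    agg = spmm(indptr, indices, data, xw)    # neighbour aggregation (SpMM)
    return [[max(0.0, v) for v in row] for row in agg]


# Assumed toy graph: path 0-1-2 with self-loops, unnormalized adjacency.
indptr = [0, 2, 5, 7]
indices = [0, 1, 0, 1, 2, 1, 2]
data = [1.0] * 7
x = [[1, 0], [0, 1], [1, 1]]   # node features (3 nodes, 2 features)
w = [[1, -1], [2, 0]]          # layer weights (2 in, 2 out)

out = gcn_layer(indptr, indices, data, x, w)
print(out)  # [[3.0, 0.0], [6.0, 0.0], [5.0, 0.0]]
```

The irregular memory access the abstract refers to shows up in the inner loop of `spmm`, where the rows of the dense operand that are read depend on the sparsity pattern of each adjacency row.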