Figure - available from: Autonomous Robots
Block diagram of the visual odometry software system

Source publication
Article
Full-text available
Visual odometry can be augmented by depth information such as provided by RGB-D cameras, or from lidars associated with cameras. However, such depth information can be limited by the sensors, leaving large areas in the visual images where depth is unavailable. Here, we propose a method to utilize the depth, even if sparsely available, in recovery o...

Citations

... Many studies propose methods that combine and complement the strengths of vision sensors and LiDAR sensors, which have contrasting characteristics. The approaches suggested in [15][16][17][18] involve extracting visual features from the vision sensor and measuring depth with the LiDAR sensor. Although these methods leverage the advantages of both sensors, the point cloud generated by the LiDAR sensor is much sparser than the image from the vision sensor, resulting in 3D-2D depth association errors. ...
... The loosely coupled approach includes LiDAR-assisted Visual SLAM and visual-assisted LiDAR SLAM. LiDAR-assisted Visual SLAM [15][16][17][18] addresses one of the main issues in Visual SLAM, namely the depth inaccuracy of image feature points, by correcting them with LiDAR data. While this greatly enhances the accuracy of the feature points, the relatively low resolution of LiDAR data can lead to errors in the 3D-2D mapping. ...
... Previous studies [15][16][17][18] have utilized LiDAR-assisted visual odometry based on image feature points. However, as illustrated in Figure 3, due to the density difference between vision sensors and LiDAR sensors, not all image feature points can be matched with LiDAR 3D points. ...
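The 3D-2D depth association these excerpts describe can be illustrated with a minimal sketch: LiDAR points are projected into the image with assumed extrinsics and intrinsics, and each tracked feature takes the depth of the nearest projected point within a pixel radius. All names, the fixed radius, and the thresholds below are illustrative assumptions, not code from any of the cited systems.

```python
import numpy as np

def associate_depth(features_uv, lidar_xyz, K, T_cam_lidar, max_px_dist=3.0):
    """Assign a depth to each 2D image feature from a sparse LiDAR scan.

    features_uv : (N, 2) pixel coordinates of tracked features
    lidar_xyz   : (M, 3) LiDAR points in the LiDAR frame
    K           : (3, 3) camera intrinsic matrix
    T_cam_lidar : (4, 4) extrinsic transform from the LiDAR to the camera frame
    Returns an (N,) array of depths; NaN where no LiDAR point projects close enough.
    """
    # Transform the LiDAR points into the camera frame and keep points in front of it.
    pts_h = np.hstack([lidar_xyz, np.ones((lidar_xyz.shape[0], 1))])
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]
    pts_cam = pts_cam[pts_cam[:, 2] > 0.1]

    depths = np.full(len(features_uv), np.nan)
    if len(pts_cam) == 0:
        return depths

    # Project the remaining points into the image plane.
    uvw = (K @ pts_cam.T).T
    uv = uvw[:, :2] / uvw[:, 2:3]

    for i, f in enumerate(np.asarray(features_uv, dtype=float)):
        # Nearest projected LiDAR point; with a sparse cloud this is exactly
        # where the 3D-2D association errors mentioned above creep in.
        d2 = np.sum((uv - f) ** 2, axis=1)
        j = np.argmin(d2)
        if d2[j] <= max_px_dist ** 2:
            depths[i] = pts_cam[j, 2]
    return depths
```

Because a projected LiDAR scan covers only a small fraction of the image pixels, many features receive no depth at all, which is the sparsity issue the excerpts point out.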
Article
Full-text available
In this study, we enhanced odometry performance by integrating vision sensors with LiDAR sensors, which exhibit contrasting characteristics. Vision sensors provide extensive environmental information but are limited in precise distance measurement, whereas LiDAR offers high accuracy in distance metrics but lacks detailed environmental data. By utilizing data from vision sensors, this research compensates for the inadequate descriptors of LiDAR sensors, thereby improving LiDAR feature matching performance. Traditional fusion methods, which rely on extracting depth from image features, depend heavily on vision sensors and are vulnerable under challenging conditions such as rain, darkness, or light reflection. Utilizing vision sensors as primary sensors under such conditions can lead to significant mapping errors and, in the worst cases, system divergence. Conversely, our approach uses LiDAR as the primary sensor, mitigating the shortcomings of previous methods and enabling vision sensors to support LiDAR-based mapping. This maintains LiDAR Odometry performance even in environments where vision sensors are compromised, thus enhancing performance with the support of vision sensors. We adopted five prominent algorithms from the latest LiDAR SLAM open-source projects and conducted experiments on the KITTI odometry dataset. This research proposes a novel approach by integrating a vision support module into the top three LiDAR SLAM methods, thereby improving performance. By making the source code of VA-LOAM publicly available, this work enhances the accessibility of the technology, fostering reproducibility and transparency within the research community.
... Zhang et al. [25,72] utilized LiDAR depth information to enhance visual odometry in DEMO. They used the estimated camera pose to register a depth map, to which new points from the point clouds in front of the camera are added. ...
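A rough sketch of the depth-map registration idea mentioned in this excerpt, assuming a simple world-frame point buffer that is re-expressed in the current camera frame with the estimated pose and pruned to points in front of the camera; the class and its parameters are hypothetical, not DEMO's actual data structure.

```python
import numpy as np

class DepthMap:
    """Accumulates point-cloud points in the world frame and exposes the subset
    currently lying in front of the camera (illustrative sketch only)."""

    def __init__(self, max_points=200_000):
        self.points_world = np.empty((0, 3))
        self.max_points = max_points

    def add_scan(self, scan_xyz_cam, T_world_cam):
        # Register a new scan (given in the camera frame) into the world frame
        # using the estimated camera pose T_world_cam (4x4).
        pts_h = np.hstack([scan_xyz_cam, np.ones((len(scan_xyz_cam), 1))])
        new_world = (T_world_cam @ pts_h.T).T[:, :3]
        self.points_world = np.vstack([self.points_world, new_world])[-self.max_points:]

    def points_in_front(self, T_world_cam, min_depth=0.1):
        # Re-express the map in the current camera frame and keep points ahead of it.
        T_cam_world = np.linalg.inv(T_world_cam)
        pts_h = np.hstack([self.points_world, np.ones((len(self.points_world), 1))])
        pts_cam = (T_cam_world @ pts_h.T).T[:, :3]
        return pts_cam[pts_cam[:, 2] > min_depth]
```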
Article
In recent years, Simultaneous Localization And Mapping (SLAM) technology has prevailed in a wide range of applications, such as autonomous driving, intelligent robots, Augmented Reality (AR), and Virtual Reality (VR). Multi-sensor fusion using the three most popular types of sensors (e.g., visual sensors, LiDAR sensors, and IMUs) is becoming ubiquitous in SLAM, in part because of their complementary sensing capabilities and the inevitable shortcomings (e.g., low precision and long-term drift) of any stand-alone sensor in challenging environments. In this article, we survey thoroughly the research efforts taken in this field and strive to provide a concise but complete review of the related work. Firstly, a brief introduction to the state estimator formulation in SLAM is presented. Secondly, the state-of-the-art multi-sensor fusion algorithms are reviewed. Then we analyze the deficiencies associated with the reviewed approaches and formulate some future research considerations. This paper can be considered a brief guide for newcomers and a comprehensive reference for experienced researchers and engineers to explore new research directions.
... The visual-LiDAR SLAM systems [42,45,57] provide a low-cost, low-compute, high-precision approach to mapping and odometry. DEMO [54,55] was the first to associate LiDAR depth measurements with Harris corner features. However, due to the lack of loop closure, the accumulated error cannot be optimized and eliminated. ...
Preprint
Full-text available
The mobile robot relies on SLAM (Simultaneous Localization and Mapping) to provide autonomous navigation and task execution in complex and unknown environments. However, it is hard to develop a dedicated algorithm for mobile robots in dynamic and challenging situations, such as poor lighting conditions and motion blur. To tackle this issue, we propose a tightly-coupled LiDAR-visual SLAM based on geometric features, which includes two sub-systems (LiDAR and monocular visual SLAM) and a fusion framework. The fusion framework associates the depth and semantics of the multi-modal geometric features to complement the visual line landmarks and to add direction optimization in Bundle Adjustment (BA), which further constrains the visual odometry. On the other hand, the entire line segments detected by the visual subsystem overcome the limitation of the LiDAR subsystem, which can only compute geometric features locally; they adjust the direction of linear feature points and filter out outliers, leading to a more accurate odometry system. Finally, we employ a module that monitors the subsystems' operation and provides the LiDAR subsystem's output as a complementary trajectory when visual tracking fails. The evaluation results on the public dataset M2DGR, gathered by ground robots across various indoor and outdoor scenarios, show that our system achieves more accurate and robust pose estimation compared to current state-of-the-art multi-modal methods.
... Independent use of LiDAR SLAM in various scenarios, such as long tunnels or environments with occlusions, remains challenging due to degradation and outliers. Therefore, the combination of cameras and LiDAR in SLAM shows more robust capabilities. DEMO [ZKS14, ZKS17] was the first to associate LiDAR depth measurements with Harris corner features. It builds two different constraints, for features with and without depth, that are used together for pose optimization. ...
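A minimal sketch of the two constraint types mentioned in this excerpt, assuming normalized image coordinates and a relative motion (R, t): a reprojection residual for features with known depth and a depth-free epipolar residual for features without depth. These are generic forms chosen here for illustration, not DEMO's exact equations.

```python
import numpy as np

def reprojection_residual(x1, x2, depth, R, t):
    """Constraint for a feature with known depth: back-project it in frame 1,
    transform with the relative motion (R, t), and compare the reprojection
    against the observation x2 in frame 2 (normalized image coordinates)."""
    p1 = depth * np.array([x1[0], x1[1], 1.0])
    p2 = R @ p1 + t
    return p2[:2] / p2[2] - np.asarray(x2, dtype=float)

def epipolar_residual(x1, x2, R, t):
    """Constraint for a feature without depth: scale-free epipolar residual
    x2^T [t]_x R x1, which involves only the relative motion."""
    tx = np.array([[0.0, -t[2], t[1]],
                   [t[2], 0.0, -t[0]],
                   [-t[1], t[0], 0.0]])  # skew-symmetric matrix of t
    E = tx @ R                           # essential matrix
    x1h = np.array([x1[0], x1[1], 1.0])
    x2h = np.array([x2[0], x2[1], 1.0])
    return float(x2h @ E @ x1h)
```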
Thesis
Full-text available
Providing stable navigation for the visually impaired in various unknown scenarios is a significant challenge. Simultaneous Localization And Mapping (SLAM) technology is essential for navigation with precise and reliable real-time map and position. It helps visually impaired people reach their destinations smoothly. Several SLAM algorithms are currently designed for such navigation. However, tracking in unfamiliar environments without prior information poses a significant challenge for achieving reliable results. Single-sensor SLAM algorithms face challenges in improving robustness due to sensor limitations. Multi-modal SLAM has a higher potential for improving robustness. Multi-modal SLAM systems supplement the information in different dimensions by combining the different sensors. For example, cameras capture rich texture information such as color and brightness but provide only 2D information. Depth, an essential parameter for map reconstruction, is missing. On the other hand, the LiDAR sensor captures depth information in full 360 degrees, but the resulting point clouds are massive, unordered, and lack semantic information. We also found that algorithms based on point features are susceptible to noise. Geometric features, such as linear and planar features, are more robust and are receiving more attention in this field. In this master thesis, we propose a tightly-coupled LiDAR-visual SLAM based on geometric features. It consists of two parallel subsystems - a LiDAR and a visual subsystem. These subsystems independently generate rich geometric features. While the system is running, we construct a fusion framework that acquires geometric features from the front end of both subsystems. As the dimensions of the geometric features generated by the two subsystems are inconsistent, we establish a spherical coordinate system as a fusion reference system. It ensures the spatial and temporal consistency of the geometric features within a unified system. Our LiDAR-visual system, also known as a multi-modal SLAM, employs geometric information provided by the visual subsystem, such as 2D detected lines, and linear and planar features with depth and direction supplied by the LiDAR subsystem. Through neighborhood search, projection, and computation of the feature's direction, our multi-modal framework generates high-quality and diverse spatial lines. These lines return as new optimization terms to both subsystems. The LiDAR subsystem uses 2D line features from the multi-modal frame to optimize the direction of its linear feature points. In the monocular visual subsystem, the reconstructed lines with depth and direction contribute as new optimization terms to the visual odometry estimation and back-end optimization, thus improving the accuracy of the multi-modal subsystem. Finally, we employ a module to detect whether the subsystem is working. We choose the visual subsystem output as the final result of our algorithm since visual SLAM has higher accuracy, even though it may have lower robustness. If the detection module detects a failure in the visual odometry tracking, we employ the LiDAR subsystem's trajectory and map as the final result. With our selected dataset M2DGR, we completed experiments in various scenes, including narrow and spacious indoor and outdoor environments with varying lighting conditions. These scenes match the diverse, complex, and unfamiliar navigational requirements of the visually impaired. 
We executed the proposed SLAM algorithm and analyzed the results qualitatively and quantitatively. We found that the feature fusion in our multi-modal framework was effective in various scenarios, and our algorithm achieved higher accuracy than its predecessor subsystems. Furthermore, our algorithm produced complete trajectories and maps in every scene. This proves the robustness of our algorithm. In conclusion, our system explores the fusion of geometric features in visual-LiDAR multi-modal SLAM and has made significant progress in this area. It provides navigation systems with more accurate position and environment information in unknown scenarios. Our system also adapts to different scenes and provides stable and reliable performance.
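The spherical coordinate system used as a fusion reference in the thesis above can be sketched with a standard Cartesian-to-spherical conversion; the function name and the exact angle conventions are illustrative assumptions, not the thesis' implementation.

```python
import numpy as np

def to_spherical(points_xyz):
    """Convert Cartesian points (N, 3) in a common sensor frame to spherical
    coordinates (range, azimuth, elevation), which can serve as a shared
    reference for LiDAR and visual geometric features."""
    x, y, z = points_xyz[:, 0], points_xyz[:, 1], points_xyz[:, 2]
    r = np.sqrt(x**2 + y**2 + z**2)
    azimuth = np.arctan2(y, x)  # angle in the x-y plane
    elevation = np.arcsin(np.clip(z / np.maximum(r, 1e-9), -1.0, 1.0))
    return np.stack([r, azimuth, elevation], axis=1)
```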
... However, this method performs sensor fusion in a limited space, did not perform accurately in experiments on real-time datasets, and was not compared with other methods. The visual odometry DEMO [18] proposed by Zhang et al. combines monocular vision and depth information, classifying feature points into those with depth associated from LiDAR, those with depth obtained by triangulation, and those without depth information, and then combines them for pose estimation. The experiments were conducted on the KITTI Visual Odometry dataset, and the results show that the positioning accuracy is even higher than that of some stereo methods, but DEMO is more sensitive to features at specific angles. ...
... We compared our method with other multi-sensor fusion SLAM systems. In Table 3, we compare the average translation error of our method with DEMO [18] and DVL-SLAM [19] on the KITTI Visual Odometry dataset. Out of 11 sequences, we outperform DEMO in 7 sequences and DVL-SLAM in 6 sequences. ...
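The three-way feature classification that the excerpt above attributes to DEMO can be sketched as follows; the depth cutoff and return labels are hypothetical, only the categorization logic follows the description.

```python
def classify_feature(lidar_depth, triangulated_depth, max_depth=80.0):
    """Sort a tracked feature point into one of the three categories named in
    the excerpt above. Inputs may be None when the corresponding depth source
    is unavailable; the max_depth cutoff is an arbitrary illustrative value."""
    if lidar_depth is not None and 0.0 < lidar_depth < max_depth:
        return "depth_from_lidar", lidar_depth
    if triangulated_depth is not None and triangulated_depth > 0.0:
        return "depth_from_triangulation", triangulated_depth
    return "no_depth", None
```

Features in the first two categories can contribute depth-based constraints, while the last category contributes only depth-free constraints, which is how the combined pose estimation described above uses all three.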
Article
Full-text available
Monocular cameras and LiDAR are the two most commonly used sensors in unmanned vehicles. Combining the advantages of the two is a current research focus of SLAM and semantic analysis. In this paper, we propose an improved SLAM and semantic reconstruction method based on the fusion of LiDAR and monocular vision. We fuse the semantic image with the low-resolution 3D LiDAR point clouds and generate dense semantic depth maps. In the visual odometry, ORB feature points with depth information are selected to improve positioning accuracy. Our method uses parallel threads to aggregate 3D semantic point clouds while positioning the unmanned vehicle. Experiments are conducted on the public CityScapes and KITTI Visual Odometry datasets, and the results show that, compared with ORB-SLAM2 and DynaSLAM, our positioning error is reduced by approximately 87%; compared with DEMO and DVL-SLAM, our positioning accuracy improves in most sequences. Our 3D reconstruction quality is better than that of DynSLAM and contains semantic information. The proposed method has engineering application value in the unmanned vehicle field.
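A small sketch of how ORB feature points might be assigned depth from the dense semantic depth map described in this abstract; the depth-map format and the validity checks below are assumptions, not the paper's implementation.

```python
import numpy as np

def select_keypoints_with_depth(keypoints_uv, depth_map, max_range=100.0):
    """Keep only keypoints that fall on valid pixels of a dense depth map.

    keypoints_uv : (N, 2) array of (u, v) pixel coordinates (e.g., ORB keypoints)
    depth_map    : (H, W) array of metric depths; 0 or inf marks invalid pixels
    Returns the kept (u, v) coordinates and their depths.
    """
    h, w = depth_map.shape
    uv = np.round(np.asarray(keypoints_uv, dtype=float)).astype(int)
    in_bounds = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    uv = uv[in_bounds]
    d = depth_map[uv[:, 1], uv[:, 0]]
    valid = np.isfinite(d) & (d > 0.0) & (d < max_range)
    return uv[valid], d[valid]
```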
... At present, there are many methods for calibrating the extrinsic parameters between LiDAR and camera [37][38][39]. These calibration methods are mainly divided into two categories: one is based on dynamic calibration, and the other is based on point, line, and plane calibration. ...
Article
Full-text available
The sensing system consisting of Light Detection and Ranging (LiDAR) and a camera provides complementary information about the surrounding environment. To take full advantage of the multi-source data provided by different sensors, an accurate fusion of multi-source sensor information is needed. Time synchronization and space registration are the key technologies that affect the fusion accuracy of multi-source sensors. Due to the difference in data acquisition frequency and the deviation in startup time between the LiDAR and the camera, asynchronous data acquisition can easily occur, which has a significant influence on subsequent data fusion. Therefore, a time synchronization method for multi-source sensors based on frequency self-matching is developed in this paper. Without changing the sensor frequencies, the sensor data are processed to obtain the same number of data frames with matching ID numbers, so that the LiDAR and camera data correspond one-to-one. Finally, the data frames are merged into new data packets to realize time synchronization between the LiDAR and the camera. Building on the time synchronization, spatial synchronization is achieved with a nonlinear optimization of the joint calibration parameters, which effectively reduces the reprojection error during sensor spatial registration. The accuracy of the proposed time synchronization method is 99.86% and the space registration accuracy is 99.79%, which is better than that of the MATLAB calibration toolbox.
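The frame-matching step of such a time-synchronization scheme can be sketched roughly as below, assuming both sensor streams carry timestamps; pairing by nearest timestamp and assigning a shared ID is one illustrative reading of the abstract, not the authors' exact procedure.

```python
def pair_frames(lidar_stamps, camera_stamps, max_offset=0.05):
    """Pair LiDAR and camera frames so the two streams correspond one-to-one.

    lidar_stamps, camera_stamps : sorted lists of timestamps in seconds
    Returns a list of (shared_id, lidar_index, camera_index) tuples.
    """
    pairs, j = [], 0
    for i, tl in enumerate(lidar_stamps):
        # Advance the camera index to the timestamp closest to this LiDAR frame.
        while j + 1 < len(camera_stamps) and abs(camera_stamps[j + 1] - tl) <= abs(camera_stamps[j] - tl):
            j += 1
        if camera_stamps and abs(camera_stamps[j] - tl) <= max_offset:
            pairs.append((len(pairs), i, j))
    return pairs
```

Paired frames sharing an ID can then be merged into a single data packet, which is the packaging step the abstract describes.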
... Visual navigation systems have been widely used in autonomous mobile robots. For visual navigation based on a monocular camera, the camera motion can be recovered except for the translation scale [1]. An inertial measurement unit (IMU) can be incorporated to construct a visual-inertial navigation system (VINS) [2], [3] and retrieve the scale. ...
... LiDAR can directly obtain long-distance, centimeter-level depth measurements. In recent years, it has been incorporated into visual navigation systems to provide depth estimates for visual landmarks [1], [10]-[16]. A spinning 3D LiDAR achieves a 360° horizontal field of view (FOV) but a much smaller vertical FOV, such as 30° for the Velodyne VLP-16. ...
... Using a LiDAR with more laser beams can mitigate this effect. In [1], [10]-[12], a 64-beam LiDAR is adopted to provide depth for feature-based visual navigation systems. However, the more laser beams a LiDAR has, the more expensive it is. ...
Article
Accurate and long-distance depth estimation for visual landmarks is challenging in visual-inertial navigation systems (VINS). In visually degraded scenes with illumination changes, moving objects, or weak texture, depth estimation becomes even more difficult, resulting in poor robustness and accuracy. For low-speed robot navigation, we present a solid-state-LiDAR-enhanced VINS (LE-VINS) to improve system robustness and accuracy in challenging environments. The point clouds from the solid-state LiDAR are projected to the visual keyframe with the inertial navigation system (INS) pose for depth association, while compensating for motion distortion. A robust depth-association method with an effective plane-checking algorithm is proposed to estimate the landmark depth. With the estimated depth, we present a LiDAR depth factor to construct accurate depth measurements for visual landmarks in factor graph optimization (FGO). The visual feature, LiDAR depth, and IMU measurements are tightly fused within the FGO framework to achieve maximum a posteriori estimation. Field tests were conducted on a low-speed robot in large-scale challenging environments. The results demonstrate that the proposed LE-VINS yields significantly improved robustness and accuracy compared to the original VINS. Besides, LE-VINS exhibits superior accuracy compared to a state-of-the-art LiDAR-visual-inertial navigation system. LE-VINS also outperforms the existing LiDAR-enhanced method, benefiting from the robust depth-association algorithm and the effective LiDAR depth factor.
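A rough sketch of the plane-checking idea behind such a depth association: fit a plane to the LiDAR points whose projections lie near a visual feature and accept the interpolated depth only if the neighborhood is sufficiently planar. All thresholds and names are assumptions for illustration, not LE-VINS internals.

```python
import numpy as np

def depth_with_plane_check(feature_ray, neighbor_pts_cam, max_rms=0.05):
    """Estimate a landmark depth from nearby LiDAR points with a planarity check.

    feature_ray      : (3,) unit bearing vector of the visual feature in the camera frame
    neighbor_pts_cam : (K, 3) LiDAR points (camera frame) whose projections are near the feature
    Returns the depth along the ray, or None if the local patch is not planar enough.
    """
    if len(neighbor_pts_cam) < 3:
        return None
    centroid = neighbor_pts_cam.mean(axis=0)
    # Plane normal = direction of smallest spread of the local patch.
    _, _, vt = np.linalg.svd(neighbor_pts_cam - centroid)
    normal = vt[-1]
    # Planarity check: RMS distance of the points to the fitted plane.
    rms = np.sqrt(np.mean(((neighbor_pts_cam - centroid) @ normal) ** 2))
    if rms > max_rms:
        return None
    # Intersect the feature ray with the plane: depth * (n . ray) = n . centroid.
    denom = normal @ feature_ray
    if abs(denom) < 1e-6:
        return None
    depth = (normal @ centroid) / denom
    return depth if depth > 0 else None
```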
... In order to estimate the disturbance during hover conditions, the VINS filter should either maintain feature depths as part of the state [13] or have additional depth-information feedback [14]. For GPS-denied dynamic flights, depth-enhanced visual navigation is proposed in [15], because relying on monocular or stereo estimated depth at high altitudes becomes unreliable [16]. The work in [15] proposes a depth-enhanced update rule that has shown stable results in full-scale vertical take-off and landing (VTOL) applications. ...
... For GPS-denied dynamic flights, depth-enhanced visual navigation is proposed in [15], because relying on monocular or stereo estimated depth at high altitudes becomes unreliable [16]. The work in [15] proposes a depth-enhanced update rule that has shown stable results in full-scale vertical take-off and landing (VTOL) applications. However, disturbance estimation and the related observability limitations were not studied. ...
... where g_ij is the j-th column of the matrix g_i of system (15); the value of the matrix is omitted for brevity. The matrix Ξ was found to be of full column rank; therefore, the observability properties of the system can be found using the matrix B in (16), which contains the gradients of the basis set, where ...
... Loosely coupled methods of this type usually extract depth from LiDAR to enhance visual odometry. DEMO [24] was the first to integrate the depth information of point clouds into visual SLAM. LIMO-PL [10] utilizes line features with depths from LiDAR. ...
Preprint
The ability of a moving agent to localize itself in its environment is a basic requirement for emerging applications such as autonomous driving. Many existing methods based on multiple sensors still suffer from drift. We propose a scheme that fuses a map prior and vanishing points from images to establish an energy term constrained only on rotation, called the direction projection error. We then embed these direction priors into a visual-LiDAR SLAM system that integrates camera and LiDAR measurements in a tightly coupled way at the back end. Specifically, our method generates visual reprojection errors and point-to-Implicit-Moving-Least-Squares (IMLS) surface constraints from the scans, and solves them jointly with the direction projection error in a global optimization. Experiments on KITTI, KITTI-360, and Oxford Radar RobotCar show that we achieve lower localization error, measured as Absolute Pose Error (APE), than the prior map, which validates the effectiveness of our method.
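A minimal sketch of a rotation-only direction error in the spirit of the direction projection error described above: a known 3D direction from the map prior is rotated into the camera frame and compared with the direction implied by a detected vanishing point. All quantities and the exact error form are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def direction_projection_error(R_cam_world, map_dir_world, vp_uv, K):
    """Angular error between a map-prior direction and a detected vanishing point.

    R_cam_world   : (3, 3) rotation taking world-frame vectors into the camera frame
    map_dir_world : (3,) unit direction of a dominant structure (e.g., road) in the world frame
    vp_uv         : (2,) pixel coordinates of the corresponding vanishing point
    K             : (3, 3) camera intrinsics
    The residual depends only on rotation, not on translation.
    """
    # Vanishing point back-projected to a bearing direction in the camera frame.
    d_vp = np.linalg.inv(K) @ np.array([vp_uv[0], vp_uv[1], 1.0])
    d_vp /= np.linalg.norm(d_vp)
    # Map-prior direction expressed in the camera frame.
    d_map = R_cam_world @ map_dir_world
    d_map /= np.linalg.norm(d_map)
    # Sign-invariant angular residual (a vanishing point defines a line direction).
    return float(np.arccos(np.clip(abs(d_vp @ d_map), -1.0, 1.0)))
```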
... In 2014, Zhang et al. presented DEMO [132,133], which used depth information from LiDAR to recover camera motion. Harris corner features are extracted and tracked from RGB images; if the corresponding depth is available from the laser data, it is directly associated with the feature, otherwise the depth is triangulated using the previously estimated motion. ...
Article
Full-text available
Simultaneous Localization and Mapping (SLAM) has been widely studied in recent years for autonomous vehicles. SLAM achieves its purpose by constructing a map of the unknown environment while keeping track of the vehicle's location. A major challenge, which is paramount during the design of SLAM systems, lies in the efficient use of onboard sensors to perceive the environment. The most widely applied algorithms are camera-based SLAM and LiDAR-based SLAM. Recent research focuses on the fusion of camera-based and LiDAR-based frameworks, which shows promising results. In this paper, we present a study of commonly used sensors and the fundamental theories behind SLAM algorithms. The study then presents the hardware architectures used to process these algorithms and, where available, the performance obtained. Secondly, we highlight state-of-the-art methodologies in each modality and in the multi-modal framework. A brief comparison is then given, followed by future challenges. Additionally, we provide insights into possible fusion approaches that can increase the robustness and accuracy of modern SLAM algorithms, thereby enabling the hardware-software co-design of embedded systems that accounts for algorithmic complexity, embedded architectures, and real-time constraints.