Fig 1 - uploaded by Sébastien Lefèvre
Overview of the SegNet architecture with LiDAR rasters as input.


Source publication
Article
Full-text available
LiDAR point clouds are receiving growing interest in remote sensing as they provide rich information that can be used independently or together with optical data sources, such as aerial imagery. However, their unstructured and sparse nature makes them difficult to handle, in contrast to raster imagery for which many efficient tools are available. To overco...

Citations

... Due to these advantages, more researchers have begun exploring 3D space. Semantic segmentation and recognition algorithms for three-dimensional point clouds are a long-standing research topic of significance in computer vision [8]. However, the disorder and irregularity of point clouds in 3D space pose challenges for the automatic and accurate classification of point clouds [9,10]. ...
Article
Full-text available
With the development and popularization of LiDAR technology, point clouds are becoming widely used in multiple fields. Point cloud classification plays an important role in segmentation, geometric analysis, and vegetation description. However, existing point cloud classification algorithms have problems such as high computational complexity, a lack of feature optimization, and low classification accuracy. This paper proposes an efficient point cloud classification algorithm based on dynamic spatial–spectral feature optimization. It can eliminate redundant features, optimize features, reduce computational costs, and improve classification accuracy. It achieves feature optimization through three key steps. First, the proposed method extracts spatial, geometric, spectral, and other features from point cloud data. Then, the Gini index and Fisher score are used to calculate the importance and relevance of features, and redundant features are filtered. Finally, feature importance factors are used to dynamically enhance the discriminative power of highly distinguishable features to strengthen their contribution to point cloud classification. Four real-scene datasets from STPLS3D are utilized for experimentation. Compared to the other five algorithms, the proposed algorithm achieves at least a 37.97% improvement in mean intersection over union (mIoU). Meanwhile, the results indicate that the proposed algorithm can achieve high-precision point cloud classification with low computational complexity.
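The Fisher-score filtering step summarized in the abstract above can be illustrated with a minimal NumPy sketch; the function name, the top-k selection rule, and the toy data are illustrative assumptions, not the authors' implementation (the paper also uses the Gini index, which is omitted here):

```python
import numpy as np

def fisher_scores(X, y):
    """Per-feature Fisher score: between-class scatter over within-class scatter."""
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    num = np.zeros(X.shape[1])
    den = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        num += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        den += len(Xc) * Xc.var(axis=0)
    return num / np.maximum(den, 1e-12)

# Illustrative use: keep the top-k most discriminative features.
X = np.random.rand(100, 6)          # 100 points, 6 candidate features
y = np.random.randint(0, 3, 100)    # 3 classes
scores = fisher_scores(X, y)
keep = np.argsort(scores)[::-1][:4]
X_reduced = X[:, keep]
```

A higher score means the feature separates the classes better; features with near-zero scores would be the redundant ones filtered out.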
... LiDAR data are often treated as digital elevation models, i.e. images with pixel elevations [24,41,21]. Thus, our work is related to image-based primitive prediction [57,23,29,52] and unsupervised multi-object image segmentation [40,64,48,75]. ...
Preprint
Full-text available
We propose an unsupervised method for parsing large 3D scans of real-world scenes into interpretable parts. Our goal is to provide a practical tool for analyzing 3D scenes with unique characteristics in the context of aerial surveying and mapping, without relying on application-specific user annotations. Our approach is based on a probabilistic reconstruction model that decomposes an input 3D point cloud into a small set of learned prototypical shapes. Our model provides an interpretable reconstruction of complex scenes and leads to relevant instance and semantic segmentations. To demonstrate the usefulness of our results, we introduce a novel dataset of seven diverse aerial LiDAR scans. We show that our method outperforms state-of-the-art unsupervised methods in terms of decomposition accuracy while remaining visually interpretable. Our method offers significant advantage over existing approaches, as it does not require any manual annotations, making it a practical and efficient tool for 3D scene analysis. Our code and dataset are available at https://imagine.enpc.fr/~loiseaur/learnable-earth-parser
... Each point is described by a set of attributes, such as the intensity of the backscattered signal, information about the number of returned pulses, and the scan angle. Rasterization of the attribute data encoded in a point cloud is used to overcome the irregular sampling characteristic of LiDAR point clouds and to produce continuous raster grid datasets depicting various aspects of the Earth's surface, depending on the parameters used to produce the raster (Guiotte et al. 2020). ...
Article
Full-text available
The integration of machine learning algorithms with LiDAR-derived datasets has long been used in the development of canopy classification models, with the main objective of differentiating between tree canopies and other types of land cover. However, these integrated canopy classification models often require investigators to provide training reference data, use vendor classification codes that vary in quality and availability, and/or rely on site-specific information to achieve the required accuracy. In this study, a generalizable canopy classification model based solely on LiDAR-derived datasets is proposed and evaluated. Ten watersheds located in different regions of the continental USA were selected to represent a wide range of physiography, topography, climate, tree diversity, and LiDAR data characteristics, ensuring model applicability to different environments. Three canopy classification model development strategies were considered: general, specific, and single. The final decision-tree-based general canopy classification model contains five datasets, with the roughness of the filtered DHM yielding the highest normalized feature importance of 0.9. The developed general canopy detection model's accuracy was comparable to that of the specific/single models, and it generated an average kappa statistic of 0.90 and 0.96 when applied to training/testing and testing datasets, respectively. This study demonstrated the existence of a consistent canopy signal in LiDAR datasets across the contiguous US that can be used to create a general canopy classification model that is functional regardless of the study area or LiDAR quality.
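The rasterization described in the citing passage above (encoding point-cloud attributes such as intensity into a continuous raster grid, as in Guiotte et al. 2020) can be sketched as a simple binning-and-aggregation step; the function name and the per-cell aggregation choice are illustrative assumptions:

```python
import numpy as np

def rasterize_attribute(points, values, cell_size, agg=np.mean):
    """Bin 2D point locations into a regular grid and aggregate one
    attribute (intensity, elevation, return count, ...) per cell."""
    xy = points[:, :2]
    mins = xy.min(axis=0)
    idx = np.floor((xy - mins) / cell_size).astype(int)
    shape = idx.max(axis=0) + 1
    raster = np.full(shape, np.nan)       # NaN marks empty cells
    flat = idx[:, 0] * shape[1] + idx[:, 1]
    for cell in np.unique(flat):
        mask = flat == cell
        raster[cell // shape[1], cell % shape[1]] = agg(values[mask])
    return raster
```

Swapping `agg` (e.g. `np.max` for a surface model, `np.min` for a terrain proxy) yields the different raster "aspects" the passage mentions.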
... Rasterization [8,10] is a process commonly used to encode a point cloud into an image [34,35]. Appropriate rasterization reduces the size of the data while retaining the relevant information. ...
Article
Full-text available
This paper considers the problem of determining the time-varying location of a nearly full hatch during cyclic transloading operations. Hatch location determination is a necessary step for automation of transloading, so that the crane can safely operate on the cargo in the hatch without colliding with the hatch edges. A novel approach is presented and evaluated by using data from a light detection and ranging (LiDAR) sensor mounted on a pan-tilt unit (PT). Within each cycle, the hatch area is scanned, the data is processed, and the hatch corner locations are extracted. Computations complete less than 5 ms after the LiDAR scan completes, which is well within the time constraints imposed by the crane transloading cycle. Although the approach is designed to solve the challenging problem of a full hatch scenario, it also works when the hatch is not full, because in that case the hatch edges can be more easily distinguished from the cargo data. Therefore, the approach can be applied during the whole duration of either loading or unloading. Experimental results for hundreds of cycles are presented to demonstrate the ability to track the hatch location as it moves and to assess the accuracy (standard deviation less than 0.30 m) and reliability (worst case error less than 0.35 m).
... 3D ALS point clouds inherently lack topological information that 2D image-like DTMs embrace. In this paper, we propose using rasterization techniques to bridge the representation gap, which has proven to largely retain point-cloud information [20]. As such, point-cloudbased DTM extraction can be formulated as an image-to-image translation problem [21] and thus off-the-shelf computer vision methods can be applied. ...
Preprint
Full-text available
Despite the popularity of deep neural networks in various domains, the extraction of digital terrain models (DTMs) from airborne laser scanning (ALS) point clouds is still challenging. This might be due to the lack of a dedicated large-scale annotated dataset and the data-structure discrepancy between point clouds and DTMs. To promote data-driven DTM extraction, this paper collects from open sources a large-scale dataset of ALS point clouds and corresponding DTMs with various urban, forested, and mountainous scenes. A baseline method is proposed as the first attempt to train a Deep neural network to extract digital Terrain models directly from ALS point clouds via Rasterization techniques, coined DeepTerRa. Extensive studies with well-established methods are performed to benchmark the dataset and analyze the challenges in learning to extract DTMs from point clouds. The experimental results show the interest of the agnostic data-driven approach, with sub-metric error levels compared to methods designed for DTM extraction. The data and source code are provided at https://lhoangan.github.io/deepterra/ for reproducibility and further similar research.
... 3-D ALS point clouds inherently lack topological information that 2-D image-like DTMs embrace. In this article, we propose using rasterization techniques to bridge the representation gap, which has proven to largely retain point cloud information [20]. As such, point cloud-based DTM extraction can be formulated as an image-to-image translation problem [21], and thus off-the-shelf computer vision methods can be applied. ...
Article
Full-text available
Despite the popularity of deep neural networks in various domains, the extraction of digital terrain models (DTMs) from airborne laser scanning (ALS) point clouds is still challenging. This might be due to the lack of a dedicated large-scale annotated dataset and the data-structure discrepancy between point clouds and DTMs. To promote data-driven DTM extraction, this article collects from open sources a large-scale dataset of ALS point clouds and corresponding DTMs with various urban, forested, and mountainous scenes. A baseline method is proposed as the first attempt to train a deep neural network to extract DTMs directly from ALS point clouds via rasterization techniques, coined DeepTerRa. Extensive studies with well-established methods are performed to benchmark the dataset and analyze the challenges in learning to extract DTMs from point clouds. The experimental results show the interest of the agnostic data-driven approach, with submetric error levels compared to methods designed for DTM extraction. The data and source code are available online at https://lhoangan.github.io/deepterra/ for reproducibility and further similar research.
... More recently, several semantic segmentation methods have been proposed specifically to deal with different aspects of remote sensing images such as spatial constraints (Nogueira et al. 2016;Maggiori et al. 2017;Marmanis et al. 2018;Wang et al. 2017;Audebert et al. 2016;Nogueira et al. 2019) or non-RGB data (Kemker et al. 2018;Guiotte et al. 2020). Nogueira et al. (2016) use patchwise semantic segmentation in RS imaging for both urban and agricultural scenarios. ...
... In Kemker et al. (2018), the authors adapt state-of-the-art semantic segmentation approaches to work with multi-spectral images. Guiotte et al. (2020) proposed an approach for semantic segmentation from LiDAR point clouds. ...
Article
Full-text available
In traditional semantic segmentation, knowledge of all existing classes is essential for the majority of existing approaches to yield effective results. However, methods trained on a Closed Set of classes fail when new classes appear in the test phase, being unable to recognize that an unseen class has been presented. This means that they are not suitable for Open Set scenarios, which are very common in real-world computer vision and remote sensing applications. In this paper, we discuss the limitations of Closed Set segmentation and propose two fully convolutional approaches to effectively address Open Set semantic segmentation: OpenFCN and OpenPCS. OpenFCN is based on the well-known OpenMax algorithm, configuring a new application of this approach in segmentation settings. OpenPCS is a fully novel approach based on the feature space of DNN activations, which serves as input for computing PCA and a multivariate Gaussian likelihood in a lower-dimensional space. In addition to OpenPCS, and aiming to reduce the RAM requirements of the methodology, we also propose a slight variation of the method (OpenIPCS) that uses an iterative version of PCA that can be trained in small batches. Experiments were conducted on the well-known ISPRS Vaihingen/Potsdam and the 2018 IEEE GRSS Data Fusion Challenge datasets. OpenFCN showed little-to-no improvement over the simpler and much more time-efficient SoftMax thresholding, while being some orders of magnitude slower. OpenPCS achieved promising results in almost all experiments, overcoming both OpenFCN and SoftMax thresholding. OpenPCS is also a reasonable compromise between the runtime performances of the extremely fast SoftMax thresholding and the extremely slow OpenFCN, being able to run close to real time.
Experiments also indicate that OpenPCS is effective, robust, and suitable for Open Set segmentation, improving the recognition of unknown-class pixels without reducing the accuracy on known-class pixels. We also tested the scenario of hiding multiple known classes to simulate multimodal unknowns, resulting in an even larger gap between OpenPCS/OpenIPCS and both SoftMax thresholding and OpenFCN, implying that Gaussian modeling is more robust to settings with greater openness.
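The core of OpenPCS as described in the abstract (PCA on DNN activations followed by a multivariate Gaussian likelihood in the reduced space, with low-likelihood pixels flagged as unknown) can be sketched as follows; this is a simplified reconstruction from the abstract with illustrative function names, not the authors' code:

```python
import numpy as np

def fit_lowdim_gaussian(features, n_components=8):
    """Fit PCA on known-class activations, then a multivariate Gaussian
    in the reduced space (the scoring model behind OpenPCS, sketched)."""
    mean = features.mean(axis=0)
    X = features - mean
    cov = np.cov(X, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    W = vecs[:, ::-1][:, :n_components]   # top principal directions
    Z = X @ W
    mu, sigma = Z.mean(axis=0), np.cov(Z, rowvar=False)
    return mean, W, mu, sigma

def log_likelihood(features, model):
    """Gaussian log-density in PCA space; low values suggest 'unknown'."""
    mean, W, mu, sigma = model
    Z = (features - mean) @ W
    d = Z - mu
    reg = sigma + 1e-6 * np.eye(sigma.shape[0])   # numerical stabilizer
    maha = np.einsum("ij,jk,ik->i", d, np.linalg.inv(reg), d)
    _, logdet = np.linalg.slogdet(reg)
    k = Z.shape[1]
    return -0.5 * (maha + logdet + k * np.log(2 * np.pi))
```

Thresholding the returned log-likelihood then separates known-class pixels (high density) from open-set pixels (low density).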
... where Z_th is a vertical height threshold, set to 15 cm in our work, then this point can be classified as a ground point. Digital terrain modelling methods based on morphological operations have been developed in previous studies [28,29], but they are mostly used to process airborne LiDAR point clouds, which differ significantly from the point clouds generated by vehicle-mounted equipment. In our work, online ground estimation is performed with the assistance of an elevation map. ...
Article
Full-text available
Drivable area detection is one of the essential functions of autonomous vehicles. However, due to the complexity and diversity of unknown environments, it remains a challenge, specifically on rough roads. In this paper, we propose a systematic framework for drivable area detection, including ground segmentation and road labelling. For each scan, the point cloud is projected onto two different planes simultaneously, generating an elevation map and a range map. Unlike existing methods based on mathematical models, we accomplish the ground segmentation using image processing methods. Subsequently, road points are filtered out from ground points and used to generate the road area with the assistance of a range map. Meanwhile, a two-step search method is utilized to create the reachable area from an elevation map. For robustness of drivable area detection, Bayesian decision theory is introduced in the final step to fuse the road area and the reachable area. Since we explicitly avoid complex three-dimensional computation, our method has, from both empirical and theoretical perspectives, a high real-time capability, and experimental results also show that it has promising detection performance in various traffic situations.
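The elevation-map ground test quoted in the citing passage above (points within a vertical threshold Z_th of 15 cm are classified as ground) can be sketched as a per-cell minimum-elevation check; this is a plausible reconstruction, not the authors' exact procedure, and the grid cell size is an assumed parameter:

```python
import numpy as np

def segment_ground(points, cell_size=0.5, z_th=0.15):
    """Label a point as ground if it lies within z_th (metres) of the
    minimum elevation of its elevation-map grid cell."""
    xy = points[:, :2]
    mins = xy.min(axis=0)
    idx = np.floor((xy - mins) / cell_size).astype(int)
    flat = idx[:, 0] * (idx[:, 1].max() + 1) + idx[:, 1]
    ground = np.zeros(len(points), dtype=bool)
    for cell in np.unique(flat):
        mask = flat == cell
        z_min = points[mask, 2].min()
        ground[mask] = points[mask, 2] - z_min <= z_th
    return ground
```

Obstacle points (e.g. a curb or vehicle) rise well above the cell minimum and are rejected, while pavement points pass the 15 cm test.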
... Other authors rather prefer to rasterize/voxelize the point cloud and use more conventional computer vision strategies to analyze structures (Lodha et al., 2006). In a recent work, we demonstrated that Digital Elevation Models (DEMs) are reductive of the vertical complexity describing objects in urban environments (Guiotte et al., 2020). These results highlighted the necessity of preserving the 3D structure of the point cloud as long as possible in the processing. ...
... Because of their ability to capture complex structures, many domains related to geosciences and earth observation are making increasing use of LiDAR data. Such systems indeed provide accurate 3D point clouds of the scanned scene, which have a large number of applications ranging from urban scene analysis (Chehata et al., 2009, Guiotte et al., 2020, Shan, Aparajithan, 2005) to geology and erosion (Brodu, Lague, 2012), archaeology (Witharana et al., 2018), or even ecology (Eitel et al., 2016). ...
... While first works have been focused on the characterization of single points (often through height and intensity) without including information related to their neighbours (Lodha et al., 2006), more advanced approaches have included spatial relationships using a set of spheres or cylinders (of variable radius) around each point to extract consistent geometric features (Mallet et al., 2011, Weinmann et al., 2015, Niemeyer et al., 2014). Among others, we have demonstrated in (Guiotte et al., 2019b, Guiotte et al., 2020) that the various rasterization strategies may have an important impact on the final result. ...
Article
Full-text available
LiDAR data are widely used in various domains related to geosciences (flow, erosion, rock deformations, etc.), computer graphics (3D reconstruction) or earth observation (detection of trees, roads, buildings, etc.). Because of the unstructured nature of the 3D points and because of the cost of acquisition, LiDAR data processing is still challenging (few learning data, difficult spatial neighboring relationships, etc.). In practice, one can directly analyze the 3D points using feature extraction and then classify the points via machine learning techniques (Brodu, Lague, 2012, Niemeyer et al., 2014, Mallet et al., 2011). In addition, recent neural network developments have allowed precise point cloud segmentation, especially using the seminal PointNet network and its extensions (Qi et al., 2017a, Riegler et al., 2017). Other authors rather prefer to rasterize/voxelize the point cloud and use more conventional computer vision strategies to analyze structures (Lodha et al., 2006). In a recent work, we demonstrated that Digital Elevation Models (DEMs) are reductive of the vertical complexity describing objects in urban environments (Guiotte et al., 2020). These results highlighted the necessity of preserving the 3D structure of the point cloud as long as possible in the processing. In this paper, we therefore rely on ortho-waveforms to compute a land cover map. Ortho-waveforms are computed directly from the waveforms in a regular 3D grid. This method provides volumes somewhat “similar” to hyperspectral data, where each pixel is associated with one ortho-waveform. Then, we exploit efficient neural networks adapted to the classification of hyperspectral data when few samples are available. Our results, obtained on the 2018 Data Fusion Contest dataset (DFC), demonstrate the efficiency of the approach.
... Many DL models are well adapted to structured data such as images or videos. Therefore, it is advantageous to create regular raster grids such as DEMs from the ALS point clouds, which can be fed to DL models for training (Guiotte et al., 2020). Values represented by DEM cells, however, show either the absolute distance from the terrain to the acquisition device or relative elevations based on a reference surface; in cases where the shape of objects and structures is relevant regardless of how high or low a terrain they are located at, only elevations relative to neighboring cells matter. ...
Article
Full-text available
Automated recognition of terrain structures is a major research problem in many application areas. These structures can be investigated in raster products such as Digital Elevation Models (DEMs) generated from Airborne Laser Scanning (ALS) data. Following the success of deep learning and computer vision techniques on color images, researchers have focused on the application of such techniques in their respective fields. One example is the detection of structures in DEM data. DEM data can be used to train deep learning models, but recently, Du et al. (2019) proposed a multi-modal deep learning approach (hereafter referred to as MM), proving that combining geomorphological information helps improve the performance of deep learning models. They reported that combining DEM, slope, and RGB-shaded relief gives the best result among other combinations consisting of curvature, flow accumulation, topographic wetness index, and grey-shaded relief. In this work, we confirm and build on top of this approach. First, we use MM and show that combinations of other information such as sky-view factors, (simple) local relief models, openness, and local dominance improve model performance even further. Secondly, based on the recently proposed HR-Net (Sun et al., 2019), we build a tinier, Multi-Modal High Resolution network called MM-HR that outperforms MM. MM-HR learns with fewer parameters (4 million) and gives an accuracy of 84.2 percent on ZISM50m data, compared to 79.2 percent accuracy by MM, which learns with more parameters (11 million). On the dataset of archaeological mining structures from the Harz, the top accuracy by MM-HR is 91.7 percent compared to 90.2 by MM.
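The idea of keeping only elevations relative to neighboring cells, e.g. the (simple) local relief model mentioned in the abstract above, can be sketched by subtracting a smoothed trend surface from the DEM; this minimal version assumes SciPy's uniform filter as the smoother, and the window size is an illustrative parameter:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_relief(dem, window=5):
    """Subtract a moving-average trend surface from a DEM so that only
    elevations relative to the local neighbourhood remain."""
    trend = uniform_filter(dem.astype(float), size=window, mode="nearest")
    return dem - trend
```

On a flat or gently sloping terrain the result is near zero everywhere, while small structures (mounds, pits, walls) stand out as positive or negative residuals regardless of their absolute elevation.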