Fig 1 - uploaded by Sébastien Lefèvre
Overview of the SegNet architecture with LiDAR rasters as input.


Source publication
Article
Full-text available
LiDAR point clouds are receiving growing interest in remote sensing as they provide rich information that can be used independently or together with optical data sources, such as aerial imagery. However, their unstructured and sparse nature makes them difficult to handle, in contrast to raster imagery for which many efficient tools are available. To overco...

Citations

... Due to these advantages, more researchers have begun exploring 3D space. Semantic segmentation and recognition algorithms for three-dimensional point clouds are a long-standing research topic of significance in computer vision [8]. However, the disorder and irregularity of point clouds in 3D space pose challenges for the automatic and accurate classification of point clouds [9,10]. ...
Article
Full-text available
With the development and popularization of LiDAR technology, point clouds are becoming widely used in multiple fields. Point cloud classification plays an important role in segmentation, geometric analysis, and vegetation description. However, existing point cloud classification algorithms have problems such as high computational complexity, a lack of feature optimization, and low classification accuracy. This paper proposes an efficient point cloud classification algorithm based on dynamic spatial–spectral feature optimization. It can eliminate redundant features, optimize features, reduce computational costs, and improve classification accuracy. It achieves feature optimization through three key steps. First, the proposed method extracts spatial, geometric, spectral, and other features from point cloud data. Then, the Gini index and Fisher score are used to calculate the importance and relevance of features, and redundant features are filtered. Finally, feature importance factors are used to dynamically enhance the discriminative power of highly distinguishable features to strengthen their contribution to point cloud classification. Four real-scene datasets from STPLS3D are utilized for experimentation. Compared to the other five algorithms, the proposed algorithm achieves at least a 37.97% improvement in mean intersection over union (mIoU). Meanwhile, the results indicate that the proposed algorithm can achieve high-precision point cloud classification with low computational complexity.
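The Fisher-score filtering step summarized in the abstract above can be illustrated with a minimal NumPy sketch; the function name, the top-k selection rule, and the toy data are illustrative assumptions, not the authors' implementation (the paper also uses the Gini index, which is omitted here):

```python
import numpy as np

def fisher_scores(X, y):
    """Per-feature Fisher score: between-class scatter over within-class scatter."""
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    num = np.zeros(X.shape[1])
    den = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        num += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        den += len(Xc) * Xc.var(axis=0)
    return num / np.maximum(den, 1e-12)

# Illustrative use: keep the top-k most discriminative features.
X = np.random.rand(100, 6)          # 100 points, 6 candidate features
y = np.random.randint(0, 3, 100)    # 3 classes
scores = fisher_scores(X, y)
keep = np.argsort(scores)[::-1][:4]
X_reduced = X[:, keep]
```

A higher score means the feature separates the classes better; features with near-zero scores would be the redundant ones filtered out.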
... LiDAR data are often treated as digital elevation models, i.e. images with pixel elevations [24,41,21]. Thus, our work is related to image-based primitive prediction [57,23,29,52] and unsupervised multi-object image segmentation [40,64,48,75]. ...
Preprint
Full-text available
We propose an unsupervised method for parsing large 3D scans of real-world scenes into interpretable parts. Our goal is to provide a practical tool for analyzing 3D scenes with unique characteristics in the context of aerial surveying and mapping, without relying on application-specific user annotations. Our approach is based on a probabilistic reconstruction model that decomposes an input 3D point cloud into a small set of learned prototypical shapes. Our model provides an interpretable reconstruction of complex scenes and leads to relevant instance and semantic segmentations. To demonstrate the usefulness of our results, we introduce a novel dataset of seven diverse aerial LiDAR scans. We show that our method outperforms state-of-the-art unsupervised methods in terms of decomposition accuracy while remaining visually interpretable. Our method offers significant advantage over existing approaches, as it does not require any manual annotations, making it a practical and efficient tool for 3D scene analysis. Our code and dataset are available at https://imagine.enpc.fr/~loiseaur/learnable-earth-parser
... Each point is described by a set of attributes, such as the intensity of the backscattered signal, information about the number of returned pulses, and the scan angle. Rasterization of the attribute data encoded in a point cloud is used to overcome the irregular sampling characteristic of LiDAR point clouds and to produce continuous raster grid datasets depicting various aspects of the Earth's surface, depending on the parameters used to produce the raster (Guiotte et al. 2020). ...
Article
Full-text available
The integration of machine learning algorithms with LiDAR-derived datasets has long been used in the development of canopy classification models, with the main objective of differentiating between tree canopies and other types of land cover. However, these integrated canopy classification models often require investigators to provide training reference data, use vendor classification codes that vary in quality and availability, and/or rely on site-specific information to achieve the required accuracy. In this study, a generalizable canopy classification model based solely on LiDAR-derived datasets is proposed and evaluated. Ten watersheds located in different regions of the continental USA were selected to represent a wide range of physiography, topography, climate, tree diversity, and LiDAR data characteristics, ensuring model applicability to different environments. Three canopy classification model development strategies were considered: general, specific, and single. The final decision-tree-based general canopy classification model contains five datasets, with the roughness of the filtered DHM yielding the highest normalized feature importance of 0.9. The developed general canopy detection model's accuracy was comparable to that of the specific/single models, and it generated an average kappa statistic of 0.90 and 0.96 when applied to training/testing and testing datasets, respectively. This study demonstrated the existence of a consistent canopy signal in LiDAR datasets across the contiguous US that can be used to create a general canopy classification model that is functional regardless of the study area or LiDAR quality.
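The rasterization described in the citing passage above (encoding point-cloud attributes such as intensity into a continuous raster grid, as in Guiotte et al. 2020) can be sketched as a simple binning-and-aggregation step; the function name and the per-cell aggregation choice are illustrative assumptions:

```python
import numpy as np

def rasterize_attribute(points, values, cell_size, agg=np.mean):
    """Bin 2D point locations into a regular grid and aggregate one
    attribute (intensity, elevation, return count, ...) per cell."""
    xy = points[:, :2]
    mins = xy.min(axis=0)
    idx = np.floor((xy - mins) / cell_size).astype(int)
    shape = idx.max(axis=0) + 1
    raster = np.full(shape, np.nan)       # NaN marks empty cells
    flat = idx[:, 0] * shape[1] + idx[:, 1]
    for cell in np.unique(flat):
        mask = flat == cell
        raster[cell // shape[1], cell % shape[1]] = agg(values[mask])
    return raster
```

Swapping `agg` (e.g. `np.max` for a surface model, `np.min` for a terrain proxy) yields the different raster "aspects" the passage mentions.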
... Rasterization [8,10] is a process commonly used to encode a point cloud into an image [34,35]. Appropriate rasterization reduces the size of the data while retaining the relevant information. ...
Article
Full-text available
This paper considers the problem of determining the time-varying location of a nearly full hatch during cyclic transloading operations. Hatch location determination is a necessary step for automation of transloading, so that the crane can safely operate on the cargo in the hatch without colliding with the hatch edges. A novel approach is presented and evaluated by using data from a light detection and ranging (LiDAR) sensor mounted on a pan-tilt unit (PT). Within each cycle, the hatch area is scanned, the data is processed, and the hatch corner locations are extracted. Computations complete less than 5 ms after the LiDAR scan completes, which is well within the time constraints imposed by the crane transloading cycle. Although the approach is designed to solve the challenging problem of a full hatch scenario, it also works when the hatch is not full, because in that case the hatch edges can be more easily distinguished from the cargo data. Therefore, the approach can be applied during the whole duration of either loading or unloading. Experimental results for hundreds of cycles are presented to demonstrate the ability to track the hatch location as it moves and to assess the accuracy (standard deviation less than 0.30 m) and reliability (worst case error less than 0.35 m).
... 3D ALS point clouds inherently lack topological information that 2D image-like DTMs embrace. In this paper, we propose using rasterization techniques to bridge the representation gap, which has proven to largely retain point-cloud information [20]. As such, point-cloudbased DTM extraction can be formulated as an image-to-image translation problem [21] and thus off-the-shelf computer vision methods can be applied. ...
Preprint
Full-text available
Despite the popularity of deep neural networks in various domains, the extraction of digital terrain models (DTMs) from airborne laser scanning (ALS) point clouds is still challenging. This might be due to the lack of a dedicated large-scale annotated dataset and the data-structure discrepancy between point clouds and DTMs. To promote data-driven DTM extraction, this paper collects from open sources a large-scale dataset of ALS point clouds and corresponding DTMs with various urban, forested, and mountainous scenes. A baseline method is proposed as the first attempt to train a Deep neural network to extract digital Terrain models directly from ALS point clouds via Rasterization techniques, coined DeepTerRa. Extensive studies with well-established methods are performed to benchmark the dataset and analyze the challenges in learning to extract DTMs from point clouds. The experimental results show the interest of the agnostic data-driven approach, with sub-metric error levels compared to methods designed for DTM extraction. The data and source code are provided at https://lhoangan.github.io/deepterra/ for reproducibility and further similar research.
... 3-D ALS point clouds inherently lack topological information that 2-D image-like DTMs embrace. In this article, we propose using rasterization techniques to bridge the representation gap, which has proven to largely retain point cloud information [20]. As such, point cloud-based DTM extraction can be formulated as an image-to-image translation problem [21], and thus off-the-shelf computer vision methods can be applied. ...
Article
Full-text available
Despite the popularity of deep neural networks in various domains, the extraction of digital terrain models (DTMs) from airborne laser scanning (ALS) point clouds is still challenging. This might be due to the lack of a dedicated large-scale annotated dataset and the data-structure discrepancy between point clouds and DTMs. To promote data-driven DTM extraction, this article collects from open sources a large-scale dataset of ALS point clouds and corresponding DTMs with various urban, forested, and mountainous scenes. A baseline method is proposed as the first attempt to train a deep neural network to extract DTMs directly from ALS point clouds via rasterization techniques, coined DeepTerRa. Extensive studies with well-established methods are performed to benchmark the dataset and analyze the challenges in learning to extract DTMs from point clouds. The experimental results show the interest of the agnostic data-driven approach, with submetric error levels compared to methods designed for DTM extraction. The data and source code are available online at https://lhoangan.github.io/deepterra/ for reproducibility and further similar research.
... More recently, several semantic segmentation methods have been proposed specifically to deal with different aspects of remote sensing images such as spatial constraints (Nogueira et al. 2016;Maggiori et al. 2017;Marmanis et al. 2018;Wang et al. 2017;Audebert et al. 2016;Nogueira et al. 2019) or non-RGB data (Kemker et al. 2018;Guiotte et al. 2020). Nogueira et al. (2016) use patchwise semantic segmentation in RS imaging for both urban and agricultural scenarios. ...
... In Kemker et al. (2018), the authors adapt state-of-the-art semantic segmentation approaches to work with multi-spectral images. Guiotte et al. (2020) proposed an approach for semantic segmentation from LiDAR point clouds. ...
Article
Full-text available
In traditional semantic segmentation, knowledge of all existing classes is essential for the majority of existing approaches to yield effective results. However, methods trained on a Closed Set of classes fail when new classes appear in the test phase, being unable to recognize that an unseen class has been presented. This means that they are not suitable for Open Set scenarios, which are very common in real-world computer vision and remote sensing applications. In this paper, we discuss the limitations of Closed Set segmentation and propose two fully convolutional approaches to effectively address Open Set semantic segmentation: OpenFCN and OpenPCS. OpenFCN is based on the well-known OpenMax algorithm, configuring a new application of this approach in segmentation settings. OpenPCS is a fully novel approach based on the feature space of DNN activations, which serves as input for computing PCA and a multivariate Gaussian likelihood in a lower-dimensional space. In addition to OpenPCS, and aiming to reduce the RAM requirements of the methodology, we also propose a slight variation of the method (OpenIPCS) that uses an iterative version of PCA that can be trained in small batches. Experiments were conducted on the well-known ISPRS Vaihingen/Potsdam and the 2018 IEEE GRSS Data Fusion Challenge datasets. OpenFCN showed little-to-no improvement over the simpler and much more time-efficient SoftMax thresholding, while being some orders of magnitude slower. OpenPCS achieved promising results in almost all experiments, overcoming both OpenFCN and SoftMax thresholding. OpenPCS is also a reasonable compromise between the runtime performances of the extremely fast SoftMax thresholding and the extremely slow OpenFCN, being able to run close to real time.
Experiments also indicate that OpenPCS is effective, robust, and suitable for Open Set segmentation, improving the recognition of unknown-class pixels without reducing the accuracy on known-class pixels. We also tested the scenario of hiding multiple known classes to simulate multimodal unknowns, resulting in an even larger gap between OpenPCS/OpenIPCS and both SoftMax thresholding and OpenFCN, implying that Gaussian modeling is more robust to settings with greater openness.
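The core of OpenPCS as described in the abstract (PCA on DNN activations followed by a multivariate Gaussian likelihood in the reduced space, with low-likelihood pixels flagged as unknown) can be sketched as follows; this is a simplified reconstruction from the abstract with illustrative function names, not the authors' code:

```python
import numpy as np

def fit_lowdim_gaussian(features, n_components=8):
    """Fit PCA on known-class activations, then a multivariate Gaussian
    in the reduced space (the scoring model behind OpenPCS, sketched)."""
    mean = features.mean(axis=0)
    X = features - mean
    cov = np.cov(X, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    W = vecs[:, ::-1][:, :n_components]   # top principal directions
    Z = X @ W
    mu, sigma = Z.mean(axis=0), np.cov(Z, rowvar=False)
    return mean, W, mu, sigma

def log_likelihood(features, model):
    """Gaussian log-density in PCA space; low values suggest 'unknown'."""
    mean, W, mu, sigma = model
    Z = (features - mean) @ W
    d = Z - mu
    reg = sigma + 1e-6 * np.eye(sigma.shape[0])   # numerical stabilizer
    maha = np.einsum("ij,jk,ik->i", d, np.linalg.inv(reg), d)
    _, logdet = np.linalg.slogdet(reg)
    k = Z.shape[1]
    return -0.5 * (maha + logdet + k * np.log(2 * np.pi))
```

Thresholding the returned log-likelihood then separates known-class pixels (high density) from open-set pixels (low density).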
... where Z_th is a vertical height threshold, set to 15 cm in our work, then this point can be classified as a ground point. Digital terrain modelling methods based on morphological operations have been developed in previous studies [28,29], but they are mostly used to process airborne LiDAR point clouds, which differ significantly from the point clouds generated by vehicle-mounted equipment. In our work, online ground estimation is performed with the assistance of an elevation map. ...
Article
Full-text available
Drivable area detection is one of the essential functions of autonomous vehicles. However, due to the complexity and diversity of unknown environments, it remains a challenge, specifically on rough roads. In this paper, we propose a systematic framework for drivable area detection, including ground segmentation and road labelling. For each scan, the point cloud is projected onto two different planes simultaneously, generating an elevation map and a range map. Unlike existing methods based on mathematical models, we accomplish the ground segmentation using image processing methods. Subsequently, road points are filtered out from ground points and used to generate the road area with the assistance of a range map. Meanwhile, a two-step search method is utilized to create the reachable area from an elevation map. For robustness of drivable area detection, Bayesian decision theory is introduced in the final step to fuse the road area and the reachable area. Since we explicitly avoid complex three-dimensional computation, our method has, from both empirical and theoretical perspectives, a high real-time capability, and experimental results also show that it has promising detection performance in various traffic situations.
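The elevation-map ground test quoted in the citing passage above (points within a vertical threshold Z_th of 15 cm are classified as ground) can be sketched as a per-cell minimum-elevation check; this is a plausible reconstruction, not the authors' exact procedure, and the grid cell size is an assumed parameter:

```python
import numpy as np

def segment_ground(points, cell_size=0.5, z_th=0.15):
    """Label a point as ground if it lies within z_th (metres) of the
    minimum elevation of its elevation-map grid cell."""
    xy = points[:, :2]
    mins = xy.min(axis=0)
    idx = np.floor((xy - mins) / cell_size).astype(int)
    flat = idx[:, 0] * (idx[:, 1].max() + 1) + idx[:, 1]
    ground = np.zeros(len(points), dtype=bool)
    for cell in np.unique(flat):
        mask = flat == cell
        z_min = points[mask, 2].min()
        ground[mask] = points[mask, 2] - z_min <= z_th
    return ground
```

Obstacle points (e.g. a curb or vehicle) rise well above the cell minimum and are rejected, while pavement points pass the 15 cm test.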
... Other authors rather prefer to rasterize/voxelize the point cloud and use more conventional computer vision strategies to analyze structures (Lodha et al., 2006). In a recent work, we demonstrated that Digital Elevation Models (DEMs) are reductive of the vertical complexity describing objects in urban environments (Guiotte et al., 2020). These results highlighted the necessity of preserving the 3D structure of the point cloud as long as possible in the processing. ...
... Because of their ability to capture complex structures, many domains related to geosciences and earth observation are making increasing use of LiDAR data. Such systems indeed provide accurate 3D point clouds of the scanned scene, which have a large number of applications ranging from urban scene analysis (Chehata et al., 2009, Guiotte et al., 2020, Shan, Aparajithan, 2005) to geology and erosion (Brodu, Lague, 2012), archaeology (Witharana et al., 2018), or even ecology (Eitel et al., 2016). ...
... While first works have been focused on the characterization of single points (often through height and intensity) without including information related to their neighbours (Lodha et al., 2006), more advanced approaches have included spatial relationships using a set of spheres or cylinders (of variable radius) around each point to extract consistent geometric features (Mallet et al., 2011, Weinmann et al., 2015, Niemeyer et al., 2014). Among others, we have demonstrated in (Guiotte et al., 2019b, Guiotte et al., 2020) that the various rasterization strategies may have an important impact on the final result. ...
Article
Full-text available
LiDAR data are widely used in various domains related to geosciences (flow, erosion, rock deformations, etc.), computer graphics (3D reconstruction) or earth observation (detection of trees, roads, buildings, etc.). Because of the unstructured nature of the 3D points and because of the cost of acquisition, LiDAR data processing is still challenging (few learning data, difficult spatial neighboring relationships, etc.). In practice, one can directly analyze the 3D points using feature extraction and then classify the points via machine learning techniques (Brodu, Lague, 2012, Niemeyer et al., 2014, Mallet et al., 2011). In addition, recent neural network developments have allowed precise point cloud segmentation, especially using the seminal PointNet network and its extensions (Qi et al., 2017a, Riegler et al., 2017). Other authors rather prefer to rasterize/voxelize the point cloud and use more conventional computer vision strategies to analyze structures (Lodha et al., 2006). In a recent work, we demonstrated that Digital Elevation Models (DEMs) are reductive of the vertical complexity describing objects in urban environments (Guiotte et al., 2020). These results highlighted the necessity of preserving the 3D structure of the point cloud as long as possible in the processing. In this paper, we therefore rely on ortho-waveforms to compute a land cover map. Ortho-waveforms are computed directly from the waveforms in a regular 3D grid. This method provides volumes somewhat “similar” to hyperspectral data, where each pixel is associated with one ortho-waveform. Then, we exploit efficient neural networks adapted to the classification of hyperspectral data when few samples are available. Our results, obtained on the 2018 Data Fusion Contest dataset (DFC), demonstrate the efficiency of the approach.
... Many DL models are well adapted to structured data such as images or videos. Therefore, it is advantageous to create regular raster grids such as DEMs from the ALS point clouds, which can be fed to DL models for training (Guiotte et al., 2020). Values represented by DEM cells, however, show either the absolute distance from the terrain to the acquisition device or relative elevations based on a reference surface; in cases where the shape of objects and structures is relevant regardless of how high or low a terrain they are located at, only elevations relative to neighboring cells matter. ...
Article
Full-text available
Automated recognition of terrain structures is a major research problem in many application areas. These structures can be investigated in raster products such as Digital Elevation Models (DEMs) generated from Airborne Laser Scanning (ALS) data. Following the success of deep learning and computer vision techniques on color images, researchers have focused on the application of such techniques in their respective fields. One example is the detection of structures in DEM data. DEM data can be used to train deep learning models, but recently, Du et al. (2019) proposed a multi-modal deep learning approach (hereafter referred to as MM), proving that combining geomorphological information helps improve the performance of deep learning models. They reported that combining DEM, slope, and RGB-shaded relief gives the best result among other combinations consisting of curvature, flow accumulation, topographic wetness index, and grey-shaded relief. In this work, we confirm and build on top of this approach. First, we use MM and show that combinations of other information such as sky-view factors, (simple) local relief models, openness, and local dominance improve model performance even further. Secondly, based on the recently proposed HR-Net (Sun et al., 2019), we build a tinier, Multi-Modal High Resolution network called MM-HR that outperforms MM. MM-HR learns with fewer parameters (4 million) and gives an accuracy of 84.2 percent on ZISM50m data, compared to 79.2 percent accuracy by MM, which learns with more parameters (11 million). On the dataset of archaeological mining structures from the Harz, the top accuracy by MM-HR is 91.7 percent compared to 90.2 by MM.
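The idea of keeping only elevations relative to neighboring cells, e.g. the (simple) local relief model mentioned in the abstract above, can be sketched by subtracting a smoothed trend surface from the DEM; this minimal version assumes SciPy's uniform filter as the smoother, and the window size is an illustrative parameter:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_relief(dem, window=5):
    """Subtract a moving-average trend surface from a DEM so that only
    elevations relative to the local neighbourhood remain."""
    trend = uniform_filter(dem.astype(float), size=window, mode="nearest")
    return dem - trend
```

On a flat or gently sloping terrain the result is near zero everywhere, while small structures (mounds, pits, walls) stand out as positive or negative residuals regardless of their absolute elevation.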