Article

Semantic segmentation of point clouds of building interiors with deep learning: Augmenting training datasets with synthetic BIM-based point clouds


Abstract

This paper investigates the viability of using synthetic point clouds generated from building information models (BIMs) to train deep neural networks to perform semantic segmentation of point clouds of building interiors. To this end, the paper first presents a procedure for converting digital 3D BIMs into synthetic point clouds using three commercially available software systems. The generated synthetic point clouds are then used to train a deep neural network. Semantic segmentation performance is compared for models trained on real point clouds, synthetic point clouds, and combinations of the two. A key finding is the 7.1% IoU boost achieved when a small real point cloud dataset is augmented with synthetic point clouds for training, compared to training the classifier on the real data alone. The experimental results confirm the viability of combining synthetic point clouds generated from building information models with small datasets of real point clouds. This opens up the possibility of developing a segmentation model for building interiors that can be applied to as-built modeling of buildings containing unseen indoor structures.
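The headline comparison above rests on intersection over union (IoU). As a minimal, illustrative sketch (not the paper's own evaluation code), per-class IoU and its mean over classes can be computed from point-wise integer labels like this:

```python
import numpy as np

def per_class_iou(pred, gt, num_classes):
    """Per-class intersection over union for integer label arrays."""
    ious = []
    for c in range(num_classes):
        inter = np.sum((pred == c) & (gt == c))
        union = np.sum((pred == c) | (gt == c))
        ious.append(inter / union if union > 0 else float("nan"))
    return ious

def mean_iou(pred, gt, num_classes):
    """Mean IoU, ignoring classes absent from both prediction and ground truth."""
    return float(np.nanmean(per_class_iou(pred, gt, num_classes)))

# Toy example: 6 points, 2 classes.
pred = np.array([0, 0, 1, 1, 1, 0])
gt   = np.array([0, 1, 1, 1, 0, 0])
print(mean_iou(pred, gt, 2))  # -> 0.5
```

The same routine applied to two segmentation models' outputs gives the kind of IoU difference the abstract reports.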


... In the context of office environments, some studies (J. W. Ma et al., 2020;Zhai et al., 2022) used the S3DIS dataset to investigate the potential of synthetic point cloud data for training a neural network for point cloud semantic segmentation. ...
... For the experiment presented by J. W. Ma et al. (2020), a subset of the S3DIS dataset ("Area 1") was remodeled manually in an engineering application. The pure geometry of the objects in the model was then exported, and evenly spaced points were sampled on a 3D grid within the objects' volumes to generate synthetic training data; finally, the point cloud was annotated at the instance level with correspondingly reduced manual effort. ...
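The evenly-spaced volumetric sampling described in this excerpt can be sketched as follows; the axis-aligned box and the spacing are illustrative stand-ins for a real BIM element's geometry:

```python
import numpy as np

def grid_sample_volume(bbox_min, bbox_max, spacing):
    """Evenly spaced 3D grid points inside an axis-aligned bounding box."""
    axes = [np.arange(lo, hi + 1e-9, spacing)
            for lo, hi in zip(bbox_min, bbox_max)]
    xx, yy, zz = np.meshgrid(*axes, indexing="ij")
    return np.stack([xx.ravel(), yy.ravel(), zz.ravel()], axis=1)

# A 1 m x 1 m x 2 m "wall segment" sampled at 0.5 m spacing.
pts = grid_sample_volume((0, 0, 0), (1, 1, 2), 0.5)
print(pts.shape)  # 3 * 3 * 5 grid -> (45, 3)
```

Every point sampled from an element can then inherit that element's class label, which is what makes the annotation effort "accordingly reduced".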
... As depicted in Figure 7a, the distribution of points over the various classes is quite imbalanced, which is challenging for applying machine learning algorithms but very common for point cloud scenes, as can be seen in similar experiments (J. W. Ma et al., 2020;Soilán et al., 2021). The class analysis is omitted for the factory hall dataset to keep the study concise. ...
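One common way to quantify, and partially compensate for, the class imbalance noted in the excerpt above is inverse-frequency class weighting in the training loss. This is a generic remedy, not one prescribed by the cited papers; the toy label counts are invented:

```python
import numpy as np

def class_distribution(labels, num_classes):
    """Fraction of points belonging to each class."""
    counts = np.bincount(labels, minlength=num_classes)
    return counts / counts.sum()

def inverse_frequency_weights(labels, num_classes):
    """Normalized per-class loss weights: rare classes get larger weights."""
    freq = class_distribution(labels, num_classes)
    w = 1.0 / np.maximum(freq, 1e-6)
    return w / w.sum()

labels = np.array([0] * 90 + [1] * 9 + [2] * 1)    # heavily imbalanced
print(class_distribution(labels, 3))               # frequencies 0.9, 0.09, 0.01
print(inverse_frequency_weights(labels, 3))        # rarest class weighted highest
```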
Article
Digitizing existing structures is essential for applying digital methods in architecture, engineering, and construction. However, the adoption of data‐driven techniques for transforming point cloud data into useful digital models faces challenges, particularly in the industrial domain, where ground truth datasets for training are scarce. This paper investigates a solution leveraging synthetic data to train data‐driven models effectively. In the investigated industrial domain, the complex geometry of building elements often leads to occlusions, limiting the effectiveness of conventional sampling‐based synthetic data generation methods. Our approach proposes the automatic generation of realistic and semantically enriched ground truth data using surface‐based sampling methods and laser scan simulation on industry‐standard 3D models. In the presented experiments, we use a neural network for point cloud semantic segmentation to demonstrate that compared to sampling‐based alternatives, simulation‐based synthetic data significantly improve mean class intersection over union performance on real point cloud data, achieving up to 7% absolute increase.
... This rich geometric and semantic information makes it possible to render a synthetic point cloud from a BIM model, which may be used for data enhancement during 3D understanding model training. Generating synthetic point clouds through BIMs is characterized by the ability to generate unlimited amounts of data from different views, which meets the requirements for improving the performance of deep learning models [7]. Fig. 1 shows the sample synthetic data generated from a BIM model by our approach. ...
... The general process of generating a synthetic point cloud from as-built BIM models consists of three steps: (1) geometric and semantic parsing of the BIM model, (2) fully labeled point cloud generation via ray tracing, and (3) modeling the noise based on the reflective properties of different objects. Currently, although large-scale efforts have created real annotated datasets of indoor scenes, they have typically focused on the generation of red-green-blue (RGB) and depth images [8][9][10][11] and there remains no fully automated way to generate perfect synthetic point clouds from BIM models [7]. ...
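Step (3) of the pipeline above, modeling noise from reflective properties, can be sketched as per-class Gaussian range noise applied along each scan ray. The class-to-sigma mapping below is purely illustrative, not taken from the cited work:

```python
import numpy as np

rng = np.random.default_rng(42)

def add_range_noise(points, scanner_pos, sigma_per_class, labels):
    """Perturb each point along its ray from the scanner by Gaussian range
    noise whose magnitude depends on the point's class (a stand-in for
    material reflectance)."""
    rays = points - scanner_pos
    dists = np.linalg.norm(rays, axis=1, keepdims=True)
    dirs = rays / dists
    sigma = np.array([sigma_per_class[c] for c in labels])[:, None]
    noise = rng.normal(0.0, 1.0, (len(points), 1)) * sigma
    return points + dirs * noise

points = np.array([[5.0, 0.0, 0.0], [0.0, 3.0, 0.0]])
labels = [0, 1]                      # e.g. 0 = wall, 1 = window (assumed classes)
noisy = add_range_noise(points, np.zeros(3), {0: 0.005, 1: 0.02}, labels)
print(noisy.shape)  # (2, 3)
```

Because the perturbation acts along the ray, each noisy point stays on its line of sight, mimicking range (not angular) measurement error.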
... Similarly, Wang et al. [32] devised a method to automatically extract 3D LiDAR point clouds with point-level ground truth labels from the CARLA simulator for 3D data analysis. Ma et al. [7] exploited the complete geometry and rich semantic information in BIM models to develop a framework to simulate a 3D point cloud from an as-built BIM model. They used this framework to confirm that synthetic point clouds generated from BIM models could be used in combination with small datasets of real point clouds for effective 3D point cloud segmentation. ...
Article
The limited amount of high-quality training data available for indoor scene understanding with deep learning is a major problem. A possible solution is to use synthetic data to improve network training. In this study, a fully automatic method to generate synthetic noisy point clouds from as-built building information modeling (BIM) models is presented, and the potential of these synthetic point clouds to improve deep neural network training is assessed. Based on a skeleton-guided strategy, all hypothetical scanning sites are located along the central axes of the buildings and obtained through equidistant sampling. The synthetic labeled point cloud is then generated station by station, and data augmentation is achieved using random combinations of data from different stations. The proposed approach generates over 44 sets of synthetic noisy point clouds based on BIM models. The performance of state-of-the-art (SOTA) deep learning methods for indoor scene understanding enhanced by the synthetic point clouds is thoroughly assessed, and the effectiveness of various combinations of real and synthetic datasets is investigated. The experimental results demonstrate that leveraging synthetic point clouds generated from BIM models leads to a remarkable 5%-10% improvement in 3D semantic segmentation accuracy. The research signifies the value of synthetic point clouds as an effective tool for improving deep neural network training. All simulation datasets are publicly available, including original BIM models, full synthetic point clouds, and point clouds after IHPR processing, accessible via the BIMSyn Dataset link. Future research will explore how synthetic point clouds can be further improved by considering specific characteristics of objects such as color, material reflectance, and illumination.
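The skeleton-guided placement of hypothetical scanning sites via equidistant sampling can be sketched for a single straight corridor axis; the corridor endpoints, scanner height, and station spacing are assumed values:

```python
import numpy as np

def stations_along_axis(start, end, step):
    """Scan sites placed equidistantly along a corridor's central axis
    (the skeleton), from the start point up to the end point."""
    start, end = np.asarray(start, float), np.asarray(end, float)
    length = np.linalg.norm(end - start)
    n = int(length // step) + 1
    t = np.linspace(0.0, (n - 1) * step / length, n)
    return start + t[:, None] * (end - start)

# A 10 m corridor axis at 1.6 m scanner height, sampled every 2.5 m.
sites = stations_along_axis([0, 0, 1.6], [10, 0, 1.6], 2.5)
print(len(sites))  # 5 stations
```

As the abstract describes, point clouds simulated from random subsets of such stations can then be combined to augment the training set.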
... The challenge is mainly induced by the complexity of the BIM generation process, which involves distinguishing architectural elements, colorization, creating visual representations, and assembling these components into a semantically-enriched BIM. Despite the proposition of several automated BIM modeling frameworks, most primarily focus on reconstructing the 3D geometry of high-rise buildings [5,18,22,41] and extracting the semantic information of indoor building components [15,22,42]. Yet, methods for outdoor facade-level or infrastructure projects remain largely unexplored due to the lack of data sources, robust machine learning methods, or a streamlined approach [30]. ...
Preprint
Full-text available
The adoption of Building Information Modeling (BIM) is beneficial in construction projects. However, it faces challenges due to the lack of a unified and scalable framework for converting 3D model details into BIM. This paper introduces SRBIM, a unified semantic reconstruction architecture for BIM generation. Our approach's effectiveness is demonstrated through extensive qualitative and quantitative evaluations, establishing a new paradigm for automated BIM modeling.
... The challenge is mainly induced by the complexity of the BIM generation process, which involves distinguishing architectural elements, colorization, creating visual representations, and assembling these components into a semantically-enriched BIM. Despite the proposition of several automated BIM modeling frameworks, most primarily focus on reconstructing the 3D geometry of high-rise buildings [5,18,22,41] and extracting the semantic information of indoor building components [15,22,42]. Yet, methods for outdoor facade-level or infrastructure projects remain largely unexplored due to the lack of data sources, robust machine learning methods, or a streamlined approach [30]. ...
Conference Paper
Full-text available
The adoption of Building Information Modeling (BIM) is beneficial in construction projects. However, it faces challenges due to the lack of a unified and scalable framework for converting 3D model details into BIM. This paper introduces SRBIM, a unified semantic reconstruction architecture for BIM generation. Our approach's effectiveness is demonstrated through extensive qualitative and quantitative evaluations, establishing a new paradigm for automated BIM modeling.
... In construction informatics research dealing with 2D and 3D data, such as in Scan-to-BIM, semantic segmentation is increasingly regarded as an essential step after data collection, providing further information useful to subsequent tasks such as object detection [8]. This technique has been applied to different types of data, including: (1) 2D images of indoor scenes [9] and aerial images of different architectures [10]; (2) 3D point clouds of building interiors [8], plumbing and structural components [11], and autonomous vehicles and robot navigation [12]. This paper reports on the development and comparison of well-established deep-learning-based semantic segmentation models for segmenting orthophotos of individual roof panels into 'background', 'slate', 'lead', and 'other' classes. ...
... In this study, the existing indoor scenes of buildings involving a large number of non-forward designed objects (such as tables, chairs, etc., in conference rooms) were clearly not suitable for BIM reconstructions produced from the remote data. The reconstruction of on-site data mainly combined on-site point cloud data [45,46] and image data [47]. Image-based methods mainly use image processing techniques to extract features in order to recognize target building objects and reconstruct 3D models from building images [48]. ...
... These original point clouds cannot be directly used to generate 3D models [50]. Therefore, a series of point cloud processing technologies are being continuously developed, such as point cloud preprocessing, point cloud segmentation, and point cloud object recognition [45,51,52]. This study used point cloud data as the raw data to reconstruct the indoor scenes of existing buildings. ...
Article
Full-text available
Building information models (BIMs) offer advantages, such as visualization and collaboration, making them widely used in the management of existing buildings. Currently, most BIMs for existing indoor spaces are manually created, consuming a significant amount of manpower and time, severely impacting the efficiency of building operations and maintenance management. To address this issue, this study proposes an automated reconstruction method for an indoor scene BIM based on a feature-enhanced point transformer and an octree. This method enhances the semantic segmentation performance of point clouds by using feature position encoding to strengthen the point transformer network. Subsequently, the data are partitioned into multiple segments using an octree, collecting the geometric and spatial information of individual objects in the indoor scene. Finally, the BIM is automatically reconstructed using Dynamo in Revit. The research results indicate that the proposed feature-enhanced point transformer algorithm achieves a high segmentation accuracy of 71.3% mIoU on the S3DIS dataset. The BIM automatically generated from the field point cloud data, when compared to the original data, has an average error of ±1.276 mm, demonstrating a good reconstruction quality. This method achieves the high-precision, automated reconstruction of the indoor BIM for existing buildings, avoiding extensive manual operations and promoting the application of BIMs for the maintenance processes of existing buildings.
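The octree partitioning step described above, splitting a segmented cloud into per-object segments, can be sketched as a recursive octant split; the stopping criterion (a maximum leaf size) and the uniform random input are assumptions for illustration:

```python
import numpy as np

def octree_partition(points, max_points=100, depth=0, max_depth=8):
    """Recursively split a point set into octants until each leaf holds
    at most max_points points; returns the list of leaf point sets."""
    if len(points) <= max_points or depth >= max_depth:
        return [points]
    center = (points.min(axis=0) + points.max(axis=0)) / 2.0
    leaves = []
    # Each of the 8 octants is one below/above-center combination per axis.
    for octant in range(8):
        sel = np.ones(len(points), dtype=bool)
        for axis in range(3):
            above = bool(octant >> axis & 1)
            sel &= (points[:, axis] >= center[axis]) == above
        if sel.any():
            leaves.extend(octree_partition(points[sel], max_points,
                                           depth + 1, max_depth))
    return leaves

rng = np.random.default_rng(1)
pts = rng.random((1000, 3))
leaves = octree_partition(pts, max_points=200)
print(sum(len(leaf) for leaf in leaves))  # 1000: each point lands in exactly one leaf
```

Each leaf gathers the geometric extent of one local cluster, which is the kind of per-object information the reconstruction step consumes.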
... Detecting changes between the real environment and the 3D model is the process of detecting differences between LiDAR scans and the 3D model. Learning-based change detection approaches have been studied and developed recently (Czerniawski et al., 2021, Voelsen et al., 2021, Ma et al., 2020). Such learning-based methods rely on sufficient amounts of annotated real LiDAR data or generated synthetic LiDAR data, which require a prolonged labelling procedure and training stage. ...
... Learning-based change detection methods are inspired by LiDAR segmentation networks (Czerniawski et al., 2021, Voelsen et al., 2021, Ma et al., 2020) and satellite image-based change detection methods (Xu et al., 2021, Meshkini et al., 2022, Bai et al., 2022). LiDAR segmentation techniques can be modified to take as input two LiDAR point clouds from different epochs and perform change detection between them. ...
Article
Full-text available
Indoor change detection is important for building monitoring, building management and model-based localization and navigation systems because the real building environment may not always be the same as the design model. This paper presents a novel indoor building change detection method based on entropy. A sequence of real LiDAR scans is acquired with a static LiDAR scanner and the pose of the LiDAR scanner for each scan is then estimated. Synthetic LiDAR scans are generated with the pose of the LiDAR scanner using the 3D model. The real LiDAR scans and synthetic LiDAR scans are sliced horizontally with a certain angular interval and the entropy of all slices of LiDAR scans is then calculated. Differenced entropy between two corresponding slices of real LiDAR scans and synthetic LiDAR scans is calculated for the classification of the changes. Each slice of real LiDAR scans will be classified into one of the four categories of changes: unchanged, moving objects, structural change and non-structural change. Experimental results show that unchanged slices and slices containing moving objects can be accurately detected, achieving 100% accuracy while non-structural and structural changes are detected with an accuracy of 98.5% and 86.3% respectively.
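The entropy differencing at the core of this method can be sketched for one angular slice: histogram the ranges, compute Shannon entropy, and compare the real slice against the synthetic one. The bin count, range limits, and the toy "wall" data are illustrative choices, not the paper's parameters:

```python
import numpy as np

def slice_entropy(ranges, bins=16):
    """Shannon entropy of the range histogram within one angular slice."""
    hist, _ = np.histogram(ranges, bins=bins, range=(0.0, 20.0))
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def entropy_difference(real_slice, synth_slice, bins=16):
    """|H(real) - H(synthetic)| per slice; large values suggest change."""
    return abs(slice_entropy(real_slice, bins) - slice_entropy(synth_slice, bins))

unchanged = np.full(100, 5.0)                        # flat wall at 5 m
moved_box = np.concatenate([np.full(50, 5.0), np.full(50, 2.0)])
print(entropy_difference(unchanged, unchanged))      # 0.0: no change
print(entropy_difference(unchanged, moved_box))      # 1.0: an object appeared
```

Thresholds on this differenced entropy (plus slice-level heuristics) would then sort slices into the four change categories the abstract lists.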
... Since performing these tasks manually is tedious and error-prone, much research effort is put into automation. For point clouds in particular, deep learning has recently gained attention [30]. Most algorithms in use, including neural networks, are supervised, meaning that labeled data is needed to train them. ...
... Several sampling attempts have been performed. [30] presents a process of transforming Revit models to sampled point clouds using AutoCAD, Sketchup, and FME Workbench; labeling is done manually, and the produced point cloud is volumetric (not only the exterior faces of geometries have been sampled). In [6], Blender is used to create CSV files with coordinates and labels of sampled points of historical buildings' models in OBJ format. ...
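In contrast to the volumetric sampling noted for [30], surface-based generation samples only the exterior faces of geometry. A minimal sketch of uniform sampling on one mesh triangle follows; the square-root barycentric warp is a standard technique, not specific to the cited tools, and the wall-panel coordinates are invented:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_triangle(a, b, c, n):
    """Uniform random samples on a triangle via the square-root warp of
    barycentric coordinates (avoids clustering near one vertex)."""
    a, b, c = map(np.asarray, (a, b, c))
    r1 = np.sqrt(rng.random(n))
    r2 = rng.random(n)
    return ((1 - r1)[:, None] * a
            + (r1 * (1 - r2))[:, None] * b
            + (r1 * r2)[:, None] * c)

# Sample the exterior face of a 4 m x 3 m wall panel (one triangle of its mesh).
pts = sample_triangle([0, 0, 0], [4, 0, 0], [0, 0, 3], 1000)
print(pts.shape)  # (1000, 3), all points on the y = 0 face
```

Repeating this per face, with sample counts proportional to face area, yields a surface-only point cloud rather than a volumetric one.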
Article
Full-text available
The increasing availability of point clouds has led to intensive research into automating point cloud processing using machine learning. While supervised systems require large and diverse labeled datasets, the cost and time of manual data creation can be overcome with synthetic data. This paper introduces DynamoPCSim, a versatile scanning simulator based on visual programming, implementing ray tracing, and operating on BIM models. The simulator collects measurements of digital models and transfers the model semantic data to generated point clouds, enabling automated labeling. Customizable scanning parameters allow for the reflection of real scanners (including imperfections) and the transformation of synthetic point clouds, making the data more realistic. The evaluation of generated point clouds against real-world data through a neural network segmentation experiment provides a foundation for the effective utilization of DynamoPCSim synthetic point clouds in machine learning training.
... Xue et al.'s (2019b) approach fundamentally differs from existing semantic segmentation approaches in that it is training-free, capable of processing complex scenes, and able to cleverly reuse existing information such as BIM models. From the perspective of machine learning, semantic segmentation is based on supervised learning with expensive training and labeling (Ma et al. 2020; Xia et al. 2022), while semantic registration is like unsupervised learning (Xue et al. 2019a; 2019b). ...
Article
Existing construction activity-monitoring technologies, such as CCTV cameras and IoT devices, have limitations, such as lack of depth information, 3D measurement errors, or wireless signal vulnerability. The limitations are particularly problematic for activities related to mobile cranes due to their high mobility and flexibility. This paper presents a 4D point cloud (4DPC)-based spatial-temporal semantic registration method to overcome the limitations. The proposed method applies spatial-temporal semantic registration to process site 4DPC with as-designed BIM semantics. Results from a one-hour on-site experiment demonstrated that the proposed method achieved 99.93-100% F1 accuracy in detecting BIM objects, and high resolution (centimeter-second granularity) of the trajectories of hoisting activities. This paper offers a twofold contribution. First, spatial-temporal semantic registration represents an innovative approach to 4D point cloud (4DPC) processing. Second, the hoisting activities are comprehensively analyzed based on semantic registration, which can improve safety and productivity monitoring for smarter construction in the future.
... However, training data scarcity is the main barrier to this method. Although synthetic datasets [12,26] and synthetic data augmentation strategies [11,27] can be used as a remedy for the data scarcity challenge, these datasets are idealised point clouds primarily converted from as-designed BIM models and fail to consider data incompleteness, different noises, and other uncertainties in the real-world data collection process. Additionally, bridges in the real world are likely to suffer from various damage, such as component deficiency and deformation, which is less considered in the synthetic data in previous research. ...
Article
Full-text available
Utilising domain knowledge (DK) to semantically segment bridge point clouds has attracted growing research interest. However, current approaches are often tailored to specific bridges, limiting their general applicability. To address this problem, this paper introduces a DK-enhanced Region Growing (DKRG) framework for point cloud semantic segmentation of reinforced concrete (RC) girder bridges. Inspired by the vertical layout characteristics of bridges, the generation of DK-based point features from Finite Element Analysis (FEA) is first proposed. Then, DKRG is employed to segment bridge components from substructures to superstructures by leveraging an “easy-to-difficult” strategy. Validation results demonstrate the effectiveness of our method, achieving the lowest mean Intersection over Union (mIoU) of 95.47% for the entire bridge and 93.44% for different component types. This study provides a new DK-based framework for semantic segmentation of RC girder bridges and sheds new light on using FEA-generated point features.
... Only a few valuable training datasets such as S3DIS (Stanford 2D-3D-Semantics) [47], ScanNet [48] or Paris-Lille 3D [49] are publicly available. Current research is therefore turning to innovative approaches using synthetic point cloud generation in order to enrich the availability of varied and representative training data [50]. However, this type of method has its limits in terms of geometric representations, whether volumetric or in terms of detail accuracy. ...
... (Table excerpt: limitations include being based on synthetic data [26], [27] and requiring manual intervention [26], [28].) This study (presented at the 41st International Symposium on Automation and Robotics in Construction, ISARC 2024) revealed the main challenges in current point cloud segmentation methods for applications in the construction domain. In practice, despite the increasing adoption of technology on construction sites, there is still a significant reliance on manual and hybrid (semi-automated) construction information processing. ...
Conference Paper
Full-text available
The construction industry is witnessing a transformative shift with the integration of advanced technologies, especially in the topic of 3D segmentation. This study underscores the current state and challenges of 3D segmentation, with special emphasis on construction research, and provides an insightful understanding of the latest research developments and trends. The study also looks at the performance metrics of the most relevant techniques, as well as the main limitations and research gaps, highlighting the need for further research in highly-performing techniques based on Deep Learning for point cloud segmentation in construction applications.
... Du et al. (2021) have conducted a comprehensive review of this area of computer vision-based robotic grasping processes, and identify object localization, pose estimation, and grasp estimation as the core components to achieving this task. Many pose estimation models are trained on synthetically generated data, constructed by simulating different views or positions of 3D CAD models (Litvak et al. 2019;Ma et al. 2020). While many pose estimation models focus on small-scale objects, our research extends those techniques to the larger scale required for construction materials, drawing inspiration from Tish et al. (2020) and their strategy for adaptive robotic construction of large facade panels. ...
Article
Full-text available
Automated robotic construction of wood frames faces significant challenges, particularly in the perception of large studs and maintaining tight assembly tolerances amidst the natural variability and dimensional instability of wood. To address these challenges, we introduce a novel multi-modal, multi-stage perception strategy for adaptive robotic construction, particularly for wood light-frame assembly. Our strategy employs a coarse-to-fine method of perception by integrating deep learning-based stud pose estimation with subsequent stages of pose refinement, combining the flexibility of AI-based approaches with the precision of traditional computer vision techniques. We demonstrate this strategy through experimental validation and construction of two different wall designs, using both low- and high-quality framing lumber, and achieve far better precision than construction industry guidelines suggest for designs of similar dimension.
... As such, DL can provide a general framework to address a wide variety of segmentation tasks. Its competitive point cloud segmentation performance has been demonstrated by Ma et al. (2020) and Matrone et al. (2020) for heritage buildings and by Xia et al. (2022) for reinforced concrete slab and beam-slab bridge structures. ...
Article
Full-text available
Transformer architecture based on the attention mechanism achieves impressive results in natural language processing (NLP) tasks. This paper transfers the successful experience to a 3D point cloud segmentation task. Inspired by newly proposed 3D Transformer neural networks, this paper introduces a new Transformer-based module, which is called Local Geo-Transformer. To alleviate the heavy memory consumption of the original Transformer, Local Geo-Transformer only performs the attention mechanism in local regions. It is designed to mitigate the information loss caused by the subsampling of point clouds for segmentation. Global Geo-Transformer is proposed to exploit the global relationships between features with the lowest resolution. The new architecture is validated on a masonry bridge dataset developed by the authors for their earlier work on a previous segmentation network called BridgeNet. The new version of the network with Transformer architecture, BridgeNetv2, outperforms BridgeNet in all metrics. BridgeNetv2 is also shown to be lightweight and memory efficient, well-adapted to large-scale point cloud segmentation tasks in civil engineering.
... For example, the semantic segmentation of airborne electrical substation point clouds simplifies industrial defect detection and facilitates the development of digital twins in the power sectors [3][4][5][6]. With the development of remote sensing technology, using drones equipped with RGB cameras and LiDAR sensors to acquire remote sensing images and airborne LiDAR point clouds offers an efficient way to acquire comprehensive scene information [7,8], providing basic data for intelligent industrial O&M and applications of building information modeling (BIM) model in various fields, such as cultural heritage protection [9], smart cities [10,11], and energy [12]. ...
Article
Full-text available
The semantic segmentation of drone LiDAR data is important in intelligent industrial operation and maintenance. However, current methods are not effective in directly processing airborne true-color point clouds that contain geometric and color noise. To overcome this challenge, we propose a novel hybrid learning framework, named SSGAM-Net, which combines supervised and semi-supervised modules for segmenting objects from airborne noisy point clouds. To the best of our knowledge, we are the first to build a true-color industrial point cloud dataset, which is obtained by drones and covers 90,000 m². Secondly, we propose a plug-and-play module, named the Global Adjacency Matrix (GAM), which utilizes only few labeled data to generate the pseudo-labels and guide the network to learn spatial relationships between objects in semi-supervised settings. Finally, we build our point cloud semantic segmentation network, SSGAM-Net, which combines a semi-supervised GAM module and a supervised Encoder–Decoder module. To evaluate the performance of our proposed method, we conduct experiments to compare our SSGAM-Net with existing advanced methods on our expert-labeled dataset. The experimental results show that our SSGAM-Net outperforms the current advanced methods, reaching 85.3% in mIoU, which ranges from 4.2 to 58.0% higher than other methods, achieving a competitive level.
... In the point cloud domain, existing work leveraging CAD models for semantic segmentation labeling is scarce. 3D models have primarily been used as a means of generating synthetic scans with "free" semantic labels [17,[34][35][36][37]. In contrast, we tackle the problem of having to label real pre-existing point clouds, which were not sampled from a 3D model but from a real-world scene. ...
Article
Full-text available
We propose a fully automatic annotation scheme that takes a raw 3D point cloud with a set of fitted CAD models as input and outputs convincing point-wise labels that can be used as cheap training data for point cloud segmentation. Compared with manual annotations, we show that our automatic labels are accurate while drastically reducing the annotation time and eliminating the need for manual intervention or dataset-specific parameters. Our labeling pipeline outputs semantic classes and soft point-wise object scores, which can either be binarized into standard one-hot-encoded labels, thresholded into weak labels with ambiguous points left unlabeled, or used directly as soft labels during training. We evaluate the label quality and segmentation performance of PointNet++ on a dataset of real industrial point clouds and Scan2CAD, a public dataset of indoor scenes. Our results indicate that reducing supervision in areas that are more difficult to label automatically is beneficial compared with the conventional approach of naively assigning a hard “best guess” label to every point.
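The thresholding of soft point-wise scores into weak labels described above can be sketched as follows; the 0.8 confidence threshold and the -1 ignore index are illustrative choices, not the paper's settings:

```python
import numpy as np

def to_weak_labels(soft_scores, threshold=0.8, ignore_index=-1):
    """Binarize soft point-wise class scores: confident points get a hard
    class label, ambiguous points are left unlabeled (ignored in training)."""
    best = soft_scores.argmax(axis=1)
    conf = soft_scores.max(axis=1)
    return np.where(conf >= threshold, best, ignore_index)

scores = np.array([[0.95, 0.05],    # confident class 0
                   [0.55, 0.45],    # ambiguous -> left unlabeled
                   [0.10, 0.90]])   # confident class 1
print(to_weak_labels(scores))       # [0, -1, 1]
```

Passing the ignore index to the loss function then reduces supervision exactly in the areas that are hardest to label automatically, as the abstract argues.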
... A virtual model can contain anything from a small object to a city. For example, building models can be used to create indoor point clouds (Ma et al., 2020) or depth and semantics, as in Hypersim (Roberts et al., 2021). Studies related to autonomous driving have also benefited from the developments of synthetic data creation. ...
... Later, with the advent of DL-based object detection, Braun et al. (2020) proposed the integration of DL to further improve the efficiency of building element detection. Recently, with the development of more advanced point cloud segmentation algorithms (Ma et al., 2020; Maalek et al., 2019), CPM by building element detection directly from point clouds has become more accessible. However, as these methods only rely on geometry information, understanding the operation-level progress status is still challenging. ...
Article
Full-text available
Effective progress monitoring is inevitable for completing the construction of building and infrastructure projects successfully. In this digital transformation era, with the data-centric management and control approach, the effectiveness of monitoring methods is expected to improve dramatically. "Digital Twin," which creates a bidirectional communication flow between a physical entity and its digital counterpart, is found to be a crucial enabling technology for information-aware decision-making systems in the manufacturing and automotive industries. Recognizing the benefits of this technology in production management in construction, researchers have proposed Digital Twin Construction (DTC). DTC leverages building information modeling technology and processes, lean construction practices, on-site digital data collection mechanisms, and Artificial Intelligence (AI) based data analytics for improving construction production planning and control processes. Progress monitoring, a key component in construction production planning and control, can significantly benefit from DTC. However, some knowledge gaps still need to be filled for the practical implementation of DTC for progress monitoring in the built environment domain. This research reviews the existing vision-based progress monitoring methods, studies the evolution of automated vision-based construction progress monitoring research, and highlights the methodological and technological knowledge gaps that must be addressed for DTC-based predictive progress monitoring. Subsequently, it proposes a framework for closed-loop construction control through DTC. Finally, the way forward for fully automated, real-time construction progress monitoring built upon the DTC concept is proposed.
... However, we assume that all the steps are perfectly implemented with the exception of the last step. Regarding the 3D deep-learning-based approach, the compared network is selected as PointNet, which has been widely used recently (e.g., Xiong and Wang 2021; Park et al. 2022; Ma et al. 2020) to convert point clouds into BIMs. The Stanford 2D-3D-Semantics data set (Armeni et al. 2017), a popular set of point clouds collected from six buildings, is adopted to train the PointNet. ...
... As explained previously, the STR and two-step HMSTR approaches leverage image synthesis and augmentation techniques to supplement imbalanced data [29]. Figure 6 provides an overview of how a few types of marked semantic text account for most occurrences in the dataset. ...
Article
Full-text available
Automated text recognition techniques have made significant advancements; however, certain tasks still present challenges. This study is motivated by the need to automatically recognize hand-marked text on construction defect tags among millions of photographs. To address this challenge, we investigated three methods for automating hand-marked semantic text recognition (HMSTR): a modified scene text recognition-based (STR) approach, a two-step HMSTR approach, and a lumped approach. The STR approach involves locating marked text using an object detection model and recognizing it using a competition-winning STR model. Similarly, the two-step HMSTR approach first localizes the marked text and then recognizes the semantic text using an image classification model. By contrast, the lumped approach performs both localization and identification of marked semantic text in a single step using object detection. Among these approaches, the two-step HMSTR approach achieved the highest F1 score (0.92) for recognizing circled text, followed by the STR approach (0.87) and the lumped approach (0.78). To validate the generalizability of the two-step HMSTR approach, subsequent experiments were conducted using check-marked text, resulting in an F1 score of 0.88. Although the proposed methods have been tested specifically with tags, they can be extended to recognize marked text in reports or books.
... Another example introduced a Markov Random Field-based approach for segmenting textured meshes into urban classes [5]. On the other hand, deep learning techniques have been employed to segment point clouds of building interiors [6]. Other research efforts [7] used convolutional neural networks to semantically segment point clouds of structural and mechanical components. ...
... Machine learning methods are less flexible and intelligent than deep learning methods, so deep learning-based point cloud segmentation attracts more researchers. Ma et al. [44] presented a deep learning-based approach with PointNet and DGCNN to segment point clouds of building structures, in which a synthetic point cloud dataset generated from BIM models was used to train the deep neural network. One way of improving the ability of a neural network is to extract multi-scale features, so Lee et al. [45] proposed a graph-based hierarchical DGCNN for segmenting railway bridges more accurately. ...
... The second step is to generate labelled synthetic images that closely resemble photographs from the BIM images using the trained GAN, and the final step is to combine the synthetic images to create a comprehensive dataset of high-quality synthetic data. Ma et al. (2020) conducted an investigation on the utilization of synthetic point cloud data for training deep models and facilitating the development of as-built BIM. They introduced a workflow that involved converting existing BIM models into synthetic point clouds using three different software tools. ...
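The BIM-to-point-cloud conversion mentioned above can be approximated for a single object by sampling a regular 3D grid inside its volume and labelling every sampled point with the object's class. A minimal NumPy sketch, assuming axis-aligned box geometry and a hypothetical class id (the software systems in the paper handle arbitrary solids):

```python
import numpy as np

def sample_box_grid(bbox_min, bbox_max, spacing):
    """Sample evenly spaced points on a 3D grid inside an axis-aligned box.

    Illustrates the idea of generating a synthetic point cloud from the
    pure geometry of a BIM object; real exporters work on arbitrary solids.
    """
    bbox_min = np.asarray(bbox_min, dtype=float)
    bbox_max = np.asarray(bbox_max, dtype=float)
    axes = [np.arange(lo, hi + 1e-9, spacing) for lo, hi in zip(bbox_min, bbox_max)]
    grid = np.meshgrid(*axes, indexing="ij")
    return np.stack([g.ravel() for g in grid], axis=1)  # (N, 3) points

# Label every sampled point with the object's class to get annotated data.
points = sample_box_grid([0, 0, 0], [1, 2, 0.2], spacing=0.1)  # e.g. a slab
labels = np.full(len(points), fill_value=3)  # hypothetical class id for "floor"
```

Repeating this per object and concatenating the labelled clouds yields an instance-annotated synthetic dataset with no manual labelling effort.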
Article
Full-text available
Over the past decade, the use of machine learning and deep learning algorithms to support 3D semantic segmentation of point clouds has significantly increased, and their impressive results have led to the application of such algorithms for the semantic modeling of heritage buildings. Nevertheless, such applications still face several significant challenges, caused in particular by the large amount of training data required, by the lack of domain-specific data in heritage building scenarios, and by the time-consuming data collection and annotation operations. This paper aims to address these challenges by proposing a workflow for synthetic image data generation in heritage building scenarios. Specifically, the procedure allows for the generation of multiple rendered images from various viewpoints based on a 3D model of a building. Additionally, it enables the generation of per-pixel segmentation maps associated with these images. In the first part, the procedure is tested by generating a synthetic simulation of a real-world scenario using the case study of Spedale del Ceppo. In the second part, several experiments are conducted to assess the impact of synthetic data during training. Specifically, three neural network architectures are trained using the generated synthetic images, and their performance in predicting the corresponding real scenarios is evaluated.
... The network contains local aggregation operator modules to extract local geometric structures. Ma et al. [32] trained a neural network for the semantic segmentation of real point clouds using point cloud data with labels generated from an existing BIM model. Lu et al. [33][34][35] created BIM models by combining the information obtained from drawings and from the real building. ...
... Studies, such as Nikolenko [67] and Tremblay et al. [68], showed that synthetic data, such as PCDs sampled from 3D models, can assist in overcoming the lack of real-world data. Researchers compared strategies for training models when real-world ground truth data were limited [69]. They claimed that neural networks trained on a large amount of synthetic data and real data (20% of the available real dataset) achieved worse performance, roughly 10% lower than a model trained on the entire real dataset. ...
Article
Full-text available
Most of the buildings that exist today were built based on 2D drawings. Building information models that represent design-stage product information have become prevalent in the second decade of the 21st century. Still, it will take many decades before such models become the norm for all existing buildings. In the meantime, the building industry lacks the tools to leverage the benefits of digital information management for construction, operation, and renovation. To this end, this paper reviews the state-of-the-art practice and research for constructing (generating) and maintaining (updating) geometric digital twins. This paper also highlights the key limitations preventing current research from being adopted in practice and derives a new geometry-based object class hierarchy that mainly focuses on the geometric properties of building objects, in contrast to widely used existing object categorisations that are mainly function-oriented. We argue that this new class hierarchy can serve as the main building block for prioritising the automation of the most frequently used object classes for geometric digital twin construction and maintenance. We also draw novel insights into the limitations of current methods and uncover further research directions to tackle these problems. Specifically, we believe that adapting deep learning methods can increase the robustness of object detection and segmentation of various types; involving design intents can achieve a high resolution of model construction and maintenance; using images as a complementary input can help to detect transparent and specular objects; and combining synthetic data for algorithm training can overcome the lack of real labelled datasets.
... Synthetic LiDAR scans are more regular and neat than real LiDAR scans because of the simplicity and low level of detail of the BIM, such as the lack of transparent surfaces. Unseen environments and objects cause errors in change identification, which makes the trained network vulnerable to an unseen environment [35]. Using satellite images from two epochs to detect changes in urban environments and on the earth's surface based on deep learning techniques has been widely studied [36][37][38]. ...
Article
Full-text available
Detecting changes of indoor environments with respect to a 3D model is important for building monitoring and management. Existing change detection methods based on LiDAR segmentation and comparison with 3D models are limited to simple environments without temporary changes or moving objects. The aim of this paper is to propose a novel change detection method based on LiDAR segmentation for complex environments. We formulate the problem of detecting differences between the 3D model and the real environment as the detection of differences between real LiDAR scans captured in the environment and synthetic LiDAR scans generated in the 3D model. This allows real-time building change detection and classification using a mobile LiDAR. A two-branch convolutional network is proposed to detect differences between the 3D model and LiDAR scans. Synthetic LiDAR scans are generated in the 3D model using the estimated poses of a set of real LiDAR scans. The network is trained with pairs of synthetic and real LiDAR scans and tested with new real LiDAR scans. Each point of a real LiDAR scan is classified into one of four categories: unchanged, structural change, moving object, and temporary change. The performance of four backbone architectures is compared to find one suitable for change detection networks. Experimental results show that the proposed approach can achieve 94% overall change classification accuracy with the SqueezeNet-based change detection network, and the trained network is transferable to comparable indoor environments. This research enables efficiently updating 3D models of complex indoor environments using a mobile LiDAR scanner.
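The real-versus-synthetic scan comparison described above can be illustrated with a much simpler geometric baseline: flag each real point whose nearest synthetic point lies beyond a distance threshold. This brute-force sketch is only a stand-in for the paper's learned two-branch network, and the threshold value is an assumption:

```python
import numpy as np

def flag_changes(real_scan, synthetic_scan, threshold=0.1):
    """Flag each real point as a candidate change if it is far from every
    synthetic point (brute-force nearest-neighbour distance)."""
    # Pairwise distances between real (N, 3) and synthetic (M, 3) points.
    d = np.linalg.norm(real_scan[:, None, :] - synthetic_scan[None, :, :], axis=2)
    nearest = d.min(axis=1)
    return nearest > threshold  # True = candidate change

synthetic = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
real = np.array([[0.02, 0.0, 0.0], [1.0, 0.01, 0.0], [0.5, 0.5, 0.0]])
changed = flag_changes(real, synthetic, threshold=0.1)
# first two points match the model; the third is a candidate change
```

A purely geometric test like this cannot distinguish structural changes from moving objects or temporary changes, which is precisely what the learned classifier in the paper adds.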
... Some researchers examined the quality of 3D imagery data in terms of accuracy, level of comprehensiveness, and detail [132,133]. Some researchers define data quality metrics for quantifying the information losses and uncertainties caused by missing data [134][135][136][137][138][139][140][141][142][143]. The data can be tabular data, multimedia data such as images and audio, natural language reports, and sensory time series. ...
Article
Full-text available
Trustworthy and explainable structural health monitoring (SHM) of bridges is crucial for ensuring the safe maintenance and operation of deficient structures. Unfortunately, existing SHM methods pose various challenges that interweave cognitive, technical, and decision-making processes. Recent development of emerging sensing devices and technologies enables intelligent acquisition and processing of massive spatiotemporal data. However, such processes always involve human-in-the-loop (HITL), which introduces redundancies and errors that lead to unreliable SHM and service safety diagnosis of bridges. Comprehending human-cyber (HC) reliability issues during SHM processes is necessary for ensuring the reliable SHM of bridges. This study aims at synthesizing studies related to HC reliability for supporting the trustworthy and explainable SHM of bridges. The authors use a bridge inspection case to lead a synthesis of studies that examined techniques relevant to the identified HC reliability issues. This synthesis revealed challenges that impede the industry from monitoring, predicting, and controlling HC reliability in bridges. In conclusion, a research road map was provided for addressing the identified challenges.
... In addition, recent studies have taken advantage of ML/DL to improve the level of automation in converting point clouds to geometric and BIM models. The main steps include point cloud segmentation, object detection and classification via support vector machines (SVMs) (Bassier et al. 2017), convolutional neural networks (CNNs) (Perez-Perez et al. 2020, 2021), recurrent neural networks (RNNs) (Perez-Perez et al. 2021), and PointNet/PointNet++ (Agapaki and Brilakis 2020; Chen et al. 2019a; Koo et al. 2021; Ma et al. 2020; Park et al. 2020; Park and Cho 2022). ...
Article
Full-text available
This paper presents an innovative and fully automatic solution for generating as-built computer-aided design (CAD) drawings for landscape architecture (LA) from three-dimensional (3D) reality data scanned via drone, camera, and LiDAR. At the start of the pipeline, 2D feature images (an ortho-image and an elevation map) are converted from the reality data. A light deep learning-based convolutional encoder-decoder was developed, and compared with U-Net (a binary segmentation model), for pixelwise image segmentation to realize automatic site surface classification, object detection, and ground control point identification. Then, the proposed elevation clustering and segmentation algorithms automatically extract contours for each instance from each surface or object category. Experimental results showed that the developed light model achieved comparable results with U-Net in landing pad segmentation, with an average intersection over union (IoU) of 0.900 versus 0.969. With the proposed data augmentation strategy, the light model had a testing pixel accuracy of 0.9764 and a mean IoU of 0.8922 in the six-class segmentation testing task. Additionally, for surfaces with continuous elevation changes (i.e., ground), the contours created by the developed algorithm have an average elevation difference of only 1.68 cm compared to dense point clouds from drone and image-based reality data. For objects with discrete elevation changes (i.e., stair treads), the generated contours accurately represent objects' elevations with zero difference using light detection and ranging (LiDAR) data. The contribution of this research is to develop algorithms that automatically transfer the scanned LA scenes into contours with real-world coordinates to create as-built CAD drawings, which can further assist building information modeling (BIM) model creation and support inspection of the scanned LA scenes with augmented reality. The optimized parameters for the developed algorithms are analyzed and recommended for future applications.
... proposed a straight skeleton-based navigation network that generates a complete 3D indoor navigation network for indoor wayfinding. Won et al. (2020) used deep learning to semantically segment synthetic point clouds of BIM-based building interiors. Experiments have proven the feasibility of combining synthetic point clouds generated from BIM with small datasets of real point clouds. ...
Article
Full-text available
Building Information Modeling (BIM) has been applied across the whole life cycle of construction projects, becoming the latest “engineering brain”. Current research on BIM spans various stages, but most reviews cover a single field and lack systematic analysis. In order to comprehensively analyze BIM research trends in the field of engineering management, this paper adopts a holistic analysis method as its framework. In the first stage, 2066 research projects were quantitatively analyzed using bibliometrics to clarify the research environment. In the second stage, a scientometric analysis was adopted to identify influential and productive scholars, countries, keywords, and journal sources in BIM research. In the last stage, an in-depth qualitative discussion pursues three objectives: (1) dividing the literature across the whole life cycle and summarizing the research hotspots at each stage; (2) identifying BIM application problems; (3) determining future research directions. This work helps researchers and practitioners in this field quickly find influential and productive research or journals, and understand current research hotspots and trends for planning their next research.
Article
Full-text available
Variational autoencoders (VAEs) play an important role in high-dimensional data generation based on their ability to fuse the stochastic data representation with the power of recent deep learning techniques. The main advantages of these types of generators lie in their ability to encode the information with the possibility to decode and generalize new samples. This capability was heavily explored for 2D image processing; however, only limited research focuses on VAEs for 3D data processing. In this article, we provide a thorough review of the latest achievements in 3D data processing using VAEs. These 3D data types are mostly point clouds, meshes, and voxel grids, which are the focus of a wide range of applications, especially in robotics. First, we shortly present the basic autoencoder with the extensions towards the VAE, with further subcategories relevant to discrete point cloud processing. Then, the 3D data-specific VAEs are presented according to how they operate on spatial data. Finally, a few comprehensive tables summarizing the methods, codes, and datasets, as well as a citation map, are presented for a better understanding of the VAEs applied to 3D data. The structure of the analyzed papers follows a taxonomy, which differentiates the algorithms according to their primary data types and application domains.
Article
Full-text available
The latest advances in mobile platforms, such as robots, have enabled the automatic acquisition of full coverage point cloud data from large areas with terrestrial laser scanning. Despite this progress, the crucial post-processing step of registration, which aligns raw point cloud data from separate local coordinate systems into a unified coordinate system, still relies on manual intervention. To address this practical issue, this study presents an automated point cloud registration approach optimized for a stop-and-go scanning system based on a quadruped walking robot. The proposed approach comprises three main phases: perpendicular constrained wall-plane extraction; coarse registration with plane matching using point-to-point displacement calculation; and fine registration with horizontality constrained iterative closest point (ICP). Experimental results indicate that the proposed method successfully achieved automated registration with an accuracy of 0.044 m and a successful scan rate (SSR) of 100% within a time frame of 424.2 s with 18 sets of scan data acquired from the stop-and-go scanning system in a real-world indoor environment. Furthermore, it surpasses conventional approaches, ensuring reliable registration for point cloud pairs with low overlap in specific indoor environmental conditions.
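The fine-registration stage described above can be sketched as a minimal point-to-point ICP loop: match each point to its nearest neighbour, solve for the best rigid transform (Kabsch algorithm), and iterate. This sketch omits the paper's horizontality constraint and plane-based coarse alignment, and its brute-force nearest-neighbour search is only viable for small clouds:

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping src onto dst (Kabsch)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:           # avoid reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cd - R @ cs

def icp(src, dst, iters=20):
    """Minimal point-to-point ICP with brute-force nearest neighbours."""
    cur = src.copy()
    for _ in range(iters):
        d = np.linalg.norm(cur[:, None, :] - dst[None, :, :], axis=2)
        matched = dst[d.argmin(axis=1)]         # nearest neighbour in dst
        R, t = best_rigid_transform(cur, matched)
        cur = cur @ R.T + t
    return cur

# Recover a known small rotation + translation on a toy cloud.
theta = 0.2
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
dst = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]], dtype=float)
src = dst @ Rz.T + np.array([0.3, -0.1, 0.2])
aligned = icp(src, dst)
```

ICP of this form only converges from a good initial guess, which is exactly why the paper performs a plane-matching coarse registration first.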
Chapter
Quality assurance (QA) plays an essential role in the construction project life cycle. During the construction phase, discrepancies between as-built structures and as-designed models can lead to schedule delays and cost overruns. Currently, QA for buildings is primarily conducted by inspectors physically touring the building to visually inspect and manually measure discrepancies between the design model and the finished structure. This manual approach is time-consuming, costly, and error-prone. In this study, we proposed a vision-based approach toward automated QA using images collected via virtual cameras in a game engine. Specifically, our approach aimed to address the problem of the lack of large-scale labeled open datasets for training reliable machine learning models for the task of semantic segmentation of building components (i.e., labeling of each pixel as a specific class of building component). The approach leveraged Building Information Modeling (BIM) authoring tools and a game engine to automatically generate images of virtual buildings with pixel-wise labels. Given the labeled images, a convolutional neural network (CNN)-based model can be implemented for accurate segmentation of the images. To validate the approach, we used one building information model as the testbed. In total, 20,700 images (18,000 for training, 2,200 for validation, and 500 for testing) were generated from the BIM. Performance of the CNN segmentation model was measured by the mean Intersection over Union (MIoU), which achieved 0.89. The result is significant since it rivals the current state-of-the-art from the Architecture, Engineering and Construction (AEC) domain. The approach proposed in this study lays a concrete step toward automated QA, where inspectors can leverage the trained CNN model to automatically label images collected onsite during or after construction to avoid labor-intensive manual inspections.
Keywords: Quality assurance; Building components; Semantic segmentation; Synthetic images
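The MIoU metric reported above is the per-class intersection over union, averaged over classes, computed between a predicted and a ground-truth label map. A small NumPy sketch with a toy two-class example:

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection-over-Union over classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:                  # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

# Toy 2x4 label maps with classes {0: background, 1: wall}.
target = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1]])
pred   = np.array([[0, 0, 0, 1],
                   [0, 0, 1, 1]])
# class 0: 4/5 = 0.8, class 1: 3/4 = 0.75, mean = 0.775
```

Averaging over classes rather than pixels prevents large classes like walls and floors from masking poor performance on rare classes.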
Article
Graph Neural Networks (GNNs) have emerged as a promising solution for effectively handling non-Euclidean data in construction, including building information models (BIM) and scanned point clouds. However, despite their potential, there is a lack of comprehensive scholarly work providing a holistic understanding of the application of GNNs in the construction domain. This paper addresses this gap by conducting a thorough review of 34 publications on GNNs in construction, presenting a comprehensive overview of the current research landscape. By analyzing the existing literature, this paper aims to identify opportunities and challenges for further advancing the application of GNNs in construction. The findings from this review shed light on diverse approaches for constructing graph data from common construction data types and demonstrate the significant potential of GNNs for the industry. Moreover, this paper contributes to the existing body of knowledge by increasing awareness of the current state of GNNs in the construction industry and offering practical recommendations to overcome challenges in real-world practice.
Preprint
Full-text available
Building Information Modeling (BIM) technology is a key component of modern construction engineering and project management workflows. As-is BIM models that represent the spatial reality of a project site can offer crucial information to stakeholders for construction progress monitoring, error checking, and building maintenance purposes. Geometric methods for automatically converting raw scan data into BIM models (Scan-to-BIM) often fail to make use of higher-level semantic information in the data, whereas semantic segmentation methods output labels only at the point level, without creating the object-level models necessary for BIM. To address these issues, this research proposes a hybrid semantic-geometric approach for clutter-resistant floorplan generation from laser-scanned building point clouds. The input point clouds are first pre-processed by normalizing the coordinate system and removing outliers. Then, a semantic segmentation network based on PointNet++ is used to label each point as ceiling, floor, wall, door, stair, or clutter. The clutter points are removed, whereas the wall, door, and stair points are used for 2D floorplan generation. A region-growing segmentation algorithm paired with geometric reasoning rules is applied to group the points into individual building elements. Finally, a 2-fold Random Sample Consensus (RANSAC) algorithm is applied to parameterize the building elements into 2D lines, which are used to create the output floorplan. The proposed method is evaluated using the metrics of precision, recall, Intersection-over-Union (IOU), Betti error, and warping error.
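The final RANSAC parameterization step described above can be sketched for a single wall segment: repeatedly sample two points, hypothesize the line through them, and keep the hypothesis with the most inliers. A minimal NumPy version with an assumed tolerance and iteration count:

```python
import numpy as np

def ransac_line(points, iters=200, tol=0.05, rng=None):
    """Fit a 2D line to wall points with RANSAC; returns the inlier mask.

    Sketch of the parameterization step only: the full pipeline first
    segments and groups the points into individual building elements.
    """
    rng = np.random.default_rng(rng)
    best_mask = np.zeros(len(points), dtype=bool)
    for _ in range(iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        p, q = points[i], points[j]
        d = q - p
        n = np.array([-d[1], d[0]])        # normal to the candidate line
        norm = np.linalg.norm(n)
        if norm == 0:                      # degenerate sample
            continue
        n = n / norm
        dist = np.abs((points - p) @ n)    # point-to-line distances
        mask = dist < tol
        if mask.sum() > best_mask.sum():
            best_mask = mask
    return best_mask

# Wall points along y = 0 plus a few clutter points.
pts = np.array([[x, 0.0] for x in np.linspace(0, 3, 20)] + [[1.0, 1.0], [2.0, 0.8]])
inliers = ransac_line(pts, rng=0)
```

The consensus criterion is what makes the step clutter-resistant: stray points far from the dominant line simply never join the largest inlier set.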
Article
Full-text available
The manual creation of digital models of existing buildings for operations and maintenance is difficult and time-consuming. Machine learning and deep learning techniques have recently emerged to help automate this process. To assess the numerous publications in the field, this paper presents a systematic literature review and highlights potential research gaps and development opportunities. Following the procedure suggested by PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses), 95 eligible publications are selected for the final review. The findings indicate that future research should explore alternative data sources, extract component attributes alongside geometries, and address retrospective infrastructure modeling, which remains widely unexplored. This paper offers new insights into the latest research on using ML approaches to generate digital models of existing buildings, with the aim of providing guidance for researchers seeking ideas for future studies in this area.
Article
This research was conducted in response to social phenomena related to the high level of student anxiety in facing the development of the digital era during industrial practice, caused by low confidence in students' abilities. BIM is software for simulating all information of a construction project in a three-dimensional model that is not only an image but also encompasses the project system, management, methods, and work steps. Therefore, this study aims to determine the relationship between students' self-efficacy and the work readiness of Vocational High Schools in facing the Building Information Modelling (BIM) technology era that developed during construction work. BIM, as a form of management information system, requires middle-class employees to work jointly as a team. In the field of AEC, the processes of planning and design, development, and maintenance are performed collaboratively. Furthermore, a total of 360 students in West Java were selected as participants using proportional stratified random sampling, and the data were analyzed using product-moment analysis. The results showed that self-efficacy was positively and significantly related to work readiness, with a correlation coefficient of r = 0.692 and p = 0.000 (p < 0.05). This means that an increase in self-efficacy helps to develop work readiness and vice versa. The coefficient of determination shows that self-efficacy and other variables contribute 47.9% and 52.1%, respectively, to students' work readiness in facing the BIM technology era. This can also be interpreted as follows: the higher the self-efficacy, the lower the presentation anxiety, and vice versa. Based on these results, it is recommended to carry out additional learning activities to increase student skills, so that student confidence will increase when working.
Article
The dimensional quality of precast concrete (PC) subcomponents (concrete and rebars) should be inspected in advance to ensure assembly quality. Currently, PC components are mainly inspected manually using tools such as tape measures, which is error-prone and inefficient. This study developed an innovative approach for automatic dimensional quality assessment of PC components using point cloud-based deep learning techniques. The approach consists of 1) a dataset-generating method to automatically create a synthetic dataset of PC components' point clouds, 2) an enhanced focal loss-based precast concrete component recognition net (PCCR-Net) employing hierarchical feature learning to segment the synthetic point cloud dataset into rebars and concrete (i.e., the generated synthetic dataset is used to train the PCCR-Net), and 3) a quantitative measurement protocol that can estimate the dimensional quality of the segmented concrete and rebars. Experiments were conducted to test the capability of the approach, and the results show that the proposed approach was able to yield satisfactory performance. First, the dataset-generating method can address the shortage of point cloud datasets in engineering practice. Second, the PCCR-Net segmentation network can simultaneously realize high-precision identification of various typical PC components, including PC columns, beams, slabs, and walls. Third, the average dimension deviations between experimental results and manual measurements demonstrate that the assessment approach can accurately estimate the dimensions of PC components.
Conference Paper
Full-text available
We present a new, publicly-available image dataset generated by the NVIDIA Deep Learning Data Synthesizer intended for use in object detection, pose estimation, and tracking applications. This dataset contains 144k stereo image pairs that synthetically combine 18 camera viewpoints of three photorealistic virtual environments with up to 10 objects (chosen randomly from the 21 object models of the YCB dataset [2]) and flying distractors. Object and camera pose, scene lighting, and the quantity of objects and distractors were randomized. Each provided view includes RGB, depth, segmentation, and surface normal images, all at the pixel level. We describe our approach for domain randomization and provide insight into the decisions that produced the dataset.
Article
Full-text available
Over the past decade a considerable number of studies have focused on generating semantically rich as‐built building information models (BIMs). However, the prevailing methods rely on laborious manual segmentation or automatic but error‐prone segmentation. In addition, the methods failed to make good use of existing semantics sources. This article presents a novel segmentation‐free derivative‐free optimization (DFO) approach that translates the generation of as‐built BIMs from 2D images into an optimization problem of fitting BIM components regarding architectural and topological constraints. The semantics of the BIMs are subsequently enriched by linking the fitted components with existing semantics sources. The approach was prototyped in two experiments using an outdoor and an indoor case, respectively. The results showed that in the outdoor case 12 out of 13 BIM components were correctly generated within 1.5 hours, and in the indoor case all target BIM components were correctly generated with a root‐mean‐square deviation (RMSD) of 3.9 cm in about 2.5 hours. The main computational novelties of this study are: (1) to translate the automatic as‐built BIM generation from 2D images as an optimization problem; (2) to develop an effective and segmentation‐free approach that is fundamentally different from prevailing methods; and (3) to exploit online open BIM component information for semantic enrichment, which, to a certain extent, alleviates the dilemma between information inadequacy and information overload in BIM development.
Conference Paper
Full-text available
In this paper, we present a novel 3D segmentation approach operating on point clouds generated from overlapping images. The aim of the proposed hybrid approach is to effectively segment co-planar objects by leveraging the structural information originating from the 3D point cloud and the visual information from the 2D images, without resorting to learning-based procedures. More specifically, the proposed hybrid approach, H-RANSAC, is an extension of the well-known RANSAC plane-fitting algorithm, incorporating an additional consistency criterion based on the results of 2D segmentation. Our expectation that the integration of 2D data into 3D segmentation will achieve more accurate results is validated experimentally in the domain of 3D city models. Results show that H-RANSAC can successfully delineate building components like main facades and windows, and provide more accurate segmentation results compared to the typical RANSAC plane-fitting algorithm.
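The plane-fitting core that H-RANSAC extends can be sketched in a few lines: sample three points, form the plane normal via a cross product, and count inliers within a distance tolerance. The 2D-segmentation consistency criterion that distinguishes H-RANSAC is omitted here, and the tolerance is an assumption:

```python
import numpy as np

def ransac_plane(points, iters=200, tol=0.02, rng=None):
    """Basic RANSAC plane fitting; returns the inlier mask of the best plane.

    H-RANSAC additionally requires the inliers of a candidate plane to be
    consistent with a 2D image segmentation, which is not modeled here.
    """
    rng = np.random.default_rng(rng)
    best = np.zeros(len(points), dtype=bool)
    for _ in range(iters):
        sample = points[rng.choice(len(points), size=3, replace=False)]
        n = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(n)
        if norm < 1e-12:                   # degenerate (collinear) sample
            continue
        n = n / norm
        dist = np.abs((points - sample[0]) @ n)   # point-to-plane distances
        mask = dist < tol
        if mask.sum() > best.sum():
            best = mask
    return best

# A facade-like plane z = 0 with two off-plane points.
rng = np.random.default_rng(1)
xy = rng.random((30, 2))
facade = np.column_stack([xy, np.zeros(30)])
cloud = np.vstack([facade, [[0.5, 0.5, 0.4], [0.2, 0.8, 0.6]]])
mask = ransac_plane(cloud, rng=1)
```

Pure geometric consensus like this merges co-planar objects (e.g. a window flush with its facade), which is exactly the failure mode the 2D consistency criterion is designed to fix.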
Article
Full-text available
Point clouds provide a flexible and scalable geometric representation suitable for countless applications in computer graphics; they also comprise the raw output of most 3D data acquisition devices. Hence, the design of intelligent computational models that act directly on point clouds is critical, especially when efficiency considerations or noise preclude the possibility of expensive denoising and meshing procedures. While hand-designed features on point clouds have long been proposed in graphics and vision, the recent overwhelming success of convolutional neural networks (CNNs) for image analysis suggests the value of adapting insight from CNNs to the point cloud world. To this end, we propose a new neural network module dubbed EdgeConv suitable for CNN-based high-level tasks on point clouds, including classification and segmentation. EdgeConv is differentiable and can be plugged into existing architectures. Compared to existing modules operating largely in extrinsic space or treating each point independently, EdgeConv has several appealing properties: it incorporates local neighborhood information; it can be stacked or recurrently applied to learn global shape properties; and in multi-layer systems, affinity in feature space captures semantic characteristics over potentially long distances in the original embedding. Beyond proposing this module, we provide extensive evaluation and analysis revealing that EdgeConv captures and exploits fine-grained geometric properties of point clouds. The proposed approach achieves state-of-the-art performance on standard benchmarks including ModelNet40 and S3DIS.
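The EdgeConv operation described above can be illustrated without a deep learning framework: for each point, gather its k nearest neighbours, form edge features [x_i, x_j - x_i], apply a shared transformation, and max-pool over the neighbourhood. In this NumPy sketch the learned MLP is replaced by a random projection, so it demonstrates only the data flow, not the trained operator:

```python
import numpy as np

def edgeconv_features(points, k=4, rng=None):
    """One EdgeConv step: per-point edge features over a k-NN graph,
    a shared linear map + ReLU, then max-pooling over neighbours."""
    rng = np.random.default_rng(rng)
    n, d = points.shape
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)                     # exclude self-edges
    knn = np.argsort(dists, axis=1)[:, :k]              # (n, k) neighbour ids
    centers = np.repeat(points[:, None, :], k, axis=1)  # x_i
    offsets = points[knn] - centers                     # x_j - x_i
    edges = np.concatenate([centers, offsets], axis=2)  # (n, k, 2d)
    W = rng.standard_normal((2 * d, 16))                # stand-in for the MLP
    return np.maximum(edges @ W, 0.0).max(axis=1)       # (n, 16) features

feats = edgeconv_features(np.random.default_rng(0).random((32, 3)), k=4, rng=0)
```

Because the [x_i, x_j - x_i] pairing mixes absolute position with local offsets, stacking such layers lets the network learn both global context and fine local geometry, which is the property the abstract highlights.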
Conference Paper
Full-text available
Semantically rich three-dimensional models such as Building Information Models (BIMs) are increasingly used in digital heritage. They provide the required information to varying stakeholders during the different stages of the historic building's life cycle, which is crucial in the conservation process. The creation of as-built BIM models is based on point cloud data. However, manually interpreting this data is labour intensive and often leads to misinterpretations. By automatically classifying the point cloud, the information can be processed more efficiently. A key aspect in this automated scan-to-BIM process is the classification of building objects. In this research we look to automatically recognise elements in existing buildings to create compact semantic information models. Our algorithm efficiently extracts the main structural components such as floors, ceilings, roofs, walls and beams despite the presence of significant clutter and occlusions. More specifically, Support Vector Machines (SVMs) are proposed for the classification. The algorithm is evaluated using real data from a variety of existing buildings. The results prove that the used classifier recognises the objects with both high precision and recall. As a result, entire data sets are reliably labelled at once. The approach enables experts to better document and process heritage assets.
Article
Full-text available
Ventricular tachycardia (VT) and ventricular fibrillation (VFib) are life-threatening shockable arrhythmias which require immediate attention. Cardiopulmonary resuscitation (CPR) and defibrillation are highly recommended means of immediate treatment of these shockable arrhythmias and of resuming spontaneous circulation. However, to increase the efficacy of defibrillation by an automated external defibrillator (AED), an accurate distinction of shockable ventricular arrhythmias from non-shockable ones needs to be provided upfront. Therefore, in this work, we have proposed a novel tool for the automated differentiation of shockable and non-shockable ventricular arrhythmias from 2 s electrocardiogram (ECG) segments. Segmented ECGs are processed by an eleven-layer convolutional neural network (CNN) model. Our proposed system was 10-fold cross-validated and achieved maximum accuracy, sensitivity and specificity of 93.18%, 95.32% and 91.04% respectively. Its high performance indicates that shockable life-threatening arrhythmia can be immediately detected, increasing the chance of survival while CPR or AED-based support is performed. Our tool can also be seamlessly integrated with ECG acquisition systems in intensive care units (ICUs).
Article
Full-text available
Few prior works study deep learning on point sets. PointNet by Qi et al. is a pioneer in this direction. However, by design PointNet does not capture local structures induced by the metric space points live in, limiting its ability to recognize fine-grained patterns and generalizability to complex scenes. In this work, we introduce a hierarchical neural network that applies PointNet recursively on a nested partitioning of the input point set. By exploiting metric space distances, our network is able to learn local features with increasing contextual scales. With further observation that point sets are usually sampled with varying densities, which results in greatly decreased performance for networks trained on uniform densities, we propose novel set learning layers to adaptively combine features from multiple scales. Experiments show that our network called PointNet++ is able to learn deep point set features efficiently and robustly. In particular, results significantly better than state-of-the-art have been obtained on challenging benchmarks of 3D point clouds.
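The sampling step that such hierarchical set-abstraction networks use to pick centroids for each nested partition is greedy farthest-point sampling, which can be sketched in plain NumPy (an illustration of the standard algorithm, not the authors' implementation):

```python
import numpy as np

def farthest_point_sampling(points, m, start=0):
    """Greedy farthest-point sampling of m centroids from an (N, 3) array.

    Returns the indices of the m chosen points.
    """
    chosen = np.empty(m, dtype=int)
    chosen[0] = start
    # Track each point's distance to the nearest already-chosen centroid.
    d = np.linalg.norm(points - points[start], axis=1)
    for i in range(1, m):
        chosen[i] = int(np.argmax(d))   # farthest remaining point
        d = np.minimum(d, np.linalg.norm(points - points[chosen[i]], axis=1))
    return chosen
```

Each chosen centroid then gathers its metric-space neighborhood, on which a small PointNet is applied recursively.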
Conference Paper
Full-text available
Recent progress in computer vision has been driven by high-capacity models trained on large datasets. Unfortunately, creating large datasets with pixel-level labels has been extremely costly due to the amount of human effort required. In this paper, we present an approach to rapidly creating pixel-accurate semantic label maps for images extracted from modern computer games. Although the source code and the internal operation of commercial games are inaccessible, we show that associations between image patches can be reconstructed from the communication between the game and the graphics hardware. This enables rapid propagation of semantic labels within and across images synthesized by the game, with no access to the source code or the content. We validate the presented approach by producing dense pixel-level semantic annotations for 25 thousand images synthesized by a photorealistic open-world computer game. Experiments on semantic segmentation datasets show that using the acquired data to supplement real-world images significantly increases accuracy and that the acquired data enables reducing the amount of hand-labeled real-world data: models trained with game data and just 1/3 of the CamVid training set outperform models trained on the complete CamVid training set.
Article
Full-text available
Diverse approaches to laser point segmentation have been proposed since the emergence of the laser scanning system. Most of these segmentation techniques, however, suffer from limitations such as sensitivity to the choice of seed points, lack of consideration of the spatial relationships among points, and inefficient performance. In an effort to overcome these drawbacks, this paper proposes a segmentation methodology that: (1) reduces the dimensions of the attribute space; (2) considers the attribute similarity and the proximity of the laser point simultaneously; and (3) works well with both airborne and terrestrial laser scanning data. A neighborhood definition based on the shape of the surface increases the homogeneity of the laser point attributes. The magnitude of the normal position vector is used as an attribute for reducing the dimension of the accumulator array. The experimental results demonstrate, through both qualitative and quantitative evaluations, the outcomes’ high level of reliability. The proposed segmentation algorithm provided 96.89% overall correctness, 95.84% completeness, a 0.25 m overall mean value of centroid difference, and less than 1° of angle difference. The performance of the proposed approach was also verified with a large dataset and compared with other approaches. Additionally, the evaluation of the sensitivity of the thresholds was carried out. In summary, this paper proposes a robust and efficient segmentation methodology for abstraction of an enormous number of laser points into plane information.
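The two attributes at the heart of this methodology — a surface normal from a local neighborhood and the magnitude of the normal position vector (the origin-to-local-plane distance |n·p|) — can be estimated with a small PCA-based sketch (the neighborhood size is illustrative, and the paper's shape-adaptive neighborhood definition is not reproduced):

```python
import numpy as np

def plane_attributes(points, k=10):
    """Per-point normal via local PCA plus the magnitude of the normal
    position vector |n.p| (a generic sketch, not the authors' code).
    """
    diff = points[:, None, :] - points[None, :, :]
    knn = np.argsort((diff ** 2).sum(-1), axis=1)[:, :k]   # includes self
    normals = np.empty_like(points)
    for i, idx in enumerate(knn):
        cov = np.cov(points[idx].T)
        # Eigenvector of the smallest eigenvalue approximates the normal.
        w, v = np.linalg.eigh(cov)
        normals[i] = v[:, 0]
    rho = np.abs(np.einsum('ij,ij->i', normals, points))   # |n.p| per point
    return normals, rho
```

Using |n·p| as a single scalar attribute is what lets the accumulator array drop a dimension compared to voting over the full normal.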
Article
Full-text available
We present ShapeNet: a richly-annotated, large-scale repository of shapes represented by 3D CAD models of objects. ShapeNet contains 3D models from a multitude of semantic categories and organizes them under the WordNet taxonomy. It is a collection of datasets providing many semantic annotations for each 3D model such as consistent rigid alignments, parts and bilateral symmetry planes, physical sizes, keywords, as well as other planned annotations. Annotations are made available through a public web-based interface to enable data visualization of object attributes, promote data-driven geometric analysis, and provide a large-scale quantitative benchmark for research in computer graphics and vision. At the time of this technical report, ShapeNet has indexed more than 3,000,000 models, 220,000 models out of which are classified into 3,135 categories (WordNet synsets). In this report we describe the ShapeNet effort as a whole, provide details for all currently available datasets, and summarize future plans.
Article
Full-text available
As-built spatial data are useful in many construction-related applications, such as quality control and progress monitoring. These data can be collected using a number of imaging and time-of-flight-based (e.g., laser scanning) sensor methods. Each application will demand a particular level of data accuracy and quality, yet little information is available to help engineers choose the most cost-effective approach. This paper presents an analytical and quantitative comparison of photogrammetric, videogrammetric, and time-of-flight-based methods. This comparison is done with respect to accuracy, quality, time efficiency, and cost. To this end, representative image-based three-dimensional reconstruction software and commercially available hardware (two cameras and a time-of-flight-based laser scanner) are evaluated. Spatial data of typical infrastructure (two bridges and a building) are collected under different settings. The experimental parameters include camera type, resolution, and shooting distance for the imaging sensors. By comparing these data with the ground truth collected by a total station, it is revealed that video/photogrammetry can produce results of moderate accuracy and quality but at a much lower cost as compared to laser scanning. The obtained information is useful to help engineers make cost-effective decisions and help researchers better understand the performance impact of these settings for the sensor technologies. DOI: 10.1061/(ASCE)CO.1943-7862.0000565. (C) 2013 American Society of Civil Engineers.
Article
Full-text available
Purpose – Building information modelling (BIM) implementation is a major change management task, involving a diversity of risk areas. The identification of the challenges and barriers is therefore an imperative precondition of this change process. This paper aims to diagnose the UK's construction industry to develop a clear understanding of BIM adoption, to form an imperative step in consolidating collective movements towards wider BIM implementation, and to provide strategies and recommendations for the UK construction industry for BIM implementation.
Design/methodology/approach – Through a comprehensive literature review, the paper initially establishes the BIM maturity concept, which paves the way for the analysis via qualitative and quantitative methods: interviews are carried out with high-profile organisations in Finland to gauge best practice before combining the results with the analysis of a survey questionnaire amongst the major contractors in the UK.
Findings – The results are established in the form of the initial phase of sound BIM implementation guidance at strategic and operational levels. The findings suggest three structured patterns to systematically tackle technology, process and people issues in BIM implementation: organisational culture, education and training, and information management. The outcome is expressed as a roadmap for the implementation of BIM in the UK, entailing issues that require consideration for organisations to progress on the BIM maturity ladder.
Practical implications – It paves a solid foundation for organisations to make informed decisions on BIM adoption within the overall organisation structure.
Originality/value – This research consolidates collective movements towards wider implementation of BIM in the UK and forms a base for developing a sound BIM strategy and guidance.
Article
Full-text available
This article outlines a new semi-automatic approach for generating accurate BIM facade models for existing buildings from laser and image data. Two new developments for as-built BIM modelling are presented in this article. The first is a new library of reusable parametric objects designed for modelling classical architectural elements from survey data. These library objects are dynamic and have parameters that can instantly alter the shape, size and other properties of objects. The parametric architectural objects have been designed from historic manuscripts and architectural pattern books. These parametric objects were built using an embedded programming language within the ArchiCAD BIM software called Geometric Description Language (GDL). The second development which is described in more detail in this article is a complete parametric building façade. This parametric building facade can be used as a template for fast and efficient generation of BIM facade geometry. The design of this parametric façade incorporates concepts from procedural modelling which is an automated approach to generating 3D geometries based on rules and algorithms. Parametric architectural objects are automatically combined with rules to generate many different façade arrangements which are controlled by user parameters. When automatically generating a façade, the initial position and size of elements are estimated using classical architectural proportions. Object can then be graphically edited individually or in groups to match the computer generated geometry to survey data. The parametric façade template has also been implemented with the Geometric Description Language for ArchiCAD BIM software. 
This enables the tools developed to utilise the full benefits of BIM software, which include automated construction or conservation documents, semantic object-oriented objects based on IFC semantic classes, automatic lists of objects and materials, and the ability to add and link additional information to the model. Initial user tests have indicated that the parametric façade is more efficient than previous methods for creating accurate façade models from survey data. The façade template also provides an easier solution for generating façade models when compared to existing methods. Non-specialist users with little experience in 3D modelling can easily generate and modify the façade template by altering parameters graphically or from a dialogue box. REFERENCE: Conor Dore, Maurice Murphy (2014). Semi-automatic generation of as-built BIM façade geometry from laser and image data.
Article
Full-text available
While BIM processes are established for new buildings, the majority of existing buildings are not yet maintained, refurbished or deconstructed with BIM. The promising benefits of efficient resource management motivate research to overcome the uncertainties of building condition and deficient documentation prevalent in existing buildings. Due to rapid developments in BIM research, involved stakeholders demand a state-of-the-art overview of BIM implementation and research in existing buildings. This paper presents a review of over 180 recent publications on the topic. Results show that BIM implementation in existing buildings is still scarce, due to the challenges of (1) the high modeling/conversion effort from captured building data into semantic BIM objects, (2) the updating of information in BIM and (3) the handling of uncertain data, objects and relations in BIM occurring in existing buildings. Despite fast developments and spreading standards, challenging research opportunities arise from process automation and BIM adaption to existing buildings' requirements.
Article
Full-text available
Convolutional Neural Networks are extremely efficient architectures in image and audio recognition tasks, thanks to their ability to exploit the local translational invariance of signal classes over their domain. In this paper we consider possible generalizations of CNNs to signals defined on more general domains without the action of a translation group. In particular, we propose two constructions, one based upon a hierarchical clustering of the domain, and another based on the spectrum of the graph Laplacian. We show through experiments that for low-dimensional graphs it is possible to learn convolutional layers with $O(1)$ parameters, resulting in efficient deep architectures.
Article
Full-text available
The rapid development of information technologies in the Architecture, Engineering and Construction (AEC) industry, as well as in Architecture, Engineering, Construction and Owner/Operator (AECO), is consistently changing the definition of Building Information Modeling (BIM). BIM technology takes on new meanings, highlighting the generic concepts of this universal term for product deliverables built on the use of an intelligent 3D virtual building model associated with processes like project inception, design, evaluation, construction, operation and demolition. In this article, the authors review the stages and trends of BIM concept development, presenting case studies of four real projects in which elements of BIM technology were adopted by project participants, reviewing benefits as well as obstacles and problems of practical BIM implementation, and providing recommendations for future applications of BIM.
Article
Full-text available
We propose a new approach for building detection using high-resolution satellite imagery based on an adaptive fuzzy-genetic algorithm. This novel approach improves object detection accuracy by reducing the premature convergence problem encountered when using genetic algorithms. We integrate the fundamental image processing operators with genetic algorithm concepts such as population, chromosome, gene, crossover and mutation. To initiate the approach, training samples are selected that represent the specified two feature classes, in this case “building” and “non-building”. The image processing operations are carried out on a chromosome-by-chromosome basis to reveal the attribute planes. These planes are then reduced to one hyperplane that is optimal for discriminating between the specified feature classes. For each chromosome, the fitness values are calculated through the analysis of detection and mis-detection rates. This analysis is followed by genetic algorithm operations such as selection, crossover and mutation. At the end of each generation cycle, the adaptive-fuzzy module determines the new (adjusted) probabilities of crossover and mutation. This evolutionary process repeats until a specified number of generations has been reached. To enhance the detected building patches, morphological image processing operations are applied. The approach was tested on ten different test scenes of the Batikent district of the city of Ankara, Turkey using 1 m resolution pan-sharpened IKONOS imagery. The kappa statistics computed for the proposed adaptive fuzzy-genetic algorithm approach were between 0.55 and 0.88. The extraction performance of the algorithm was better for urban and suburban buildings than for buildings in rural test scenes.
Conference Paper
Full-text available
The construction industry has been facing a paradigm shift to (i) increase productivity, efficiency, infrastructure value, quality and sustainability, and (ii) reduce lifecycle costs, lead times and duplication, via effective collaboration and communication of stakeholders in construction projects. Digital construction is a political initiative to address low productivity in the sector. It seeks to integrate processes throughout the entire lifecycle by utilising building information modelling (BIM) systems. The focus is to create and reuse consistent digital information by the stakeholders throughout the lifecycle. However, the implementation and use of BIM systems require dramatic changes in current business practices, bringing new challenges for stakeholders, e.g., the emerging knowledge and skill gap. This paper reviews and discusses the status of implementation of BIM systems around the globe and their implications for the industry. Moreover, based on the lessons learnt, it provides a guide to tackling these challenges and facilitating a successful transition towards utilising BIM systems in construction projects.
Article
The recent success of deep learning in 3-D data analysis relies upon the availability of large annotated data sets. However, creating 3-D data sets with point-level labels is extremely challenging and requires a huge amount of human effort. This paper presents a novel open-source method to extract light detection and ranging point clouds with ground-truth annotations from a simulator automatically. The virtual sensor can be configured to simulate various real devices, from 2-D laser scanners to 3-D real-time sensors. Experiments are conducted to show that using additional synthetic data for training can: 1) achieve a visible performance boost in accuracy; 2) reduce the amount of manually labeled real-world data; and 3) help to improve the generalization performance across data sets.
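A toy version of such a virtual sensor — horizontal rays cast from a scanner position against labeled axis-aligned boxes, keeping the nearest hit per ray — conveys the idea (a sketch only; the cited simulator models real device noise and scan patterns, which this omits):

```python
import numpy as np

def ray_box(origin, direction, box_min, box_max):
    """Slab-test ray/AABB intersection; returns hit distance t or None."""
    with np.errstate(divide='ignore', invalid='ignore'):
        t1 = (box_min - origin) / direction
        t2 = (box_max - origin) / direction
    t_near = np.minimum(t1, t2).max()
    t_far = np.maximum(t1, t2).min()
    if t_near <= t_far and t_far > 0:
        return t_near if t_near > 0 else t_far
    return None

def simulate_scan(origin, boxes, n_rays=360):
    """Toy 2-D 'virtual laser scanner' over labeled boxes.

    boxes : list of (label, box_min, box_max) with (3,) numpy corners.
    Returns (points, labels) for every ray that hits something.
    """
    pts, labels = [], []
    for a in np.linspace(0, 2 * np.pi, n_rays, endpoint=False):
        d = np.array([np.cos(a), np.sin(a), 0.0])
        hits = [(t, lab) for lab, lo, hi in boxes
                if (t := ray_box(origin, d, lo, hi)) is not None]
        if hits:
            t, lab = min(hits)              # keep the closest surface
            pts.append(origin + t * d)
            labels.append(lab)
    return np.array(pts), labels
```

Because every box carries a label, the resulting point cloud comes annotated for free — the property that makes simulator-generated data attractive for training.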
Article
Currently, fully automated as-built modeling of building interiors using point-cloud data still remains an open challenge, due to several problems that repeatedly arise: (1) complex indoor environments containing multiple rooms; (2) time-consuming and labor-intensive noise filtering; (3) difficulties of representation of volumetric and detail-rich objects such as windows and doors. This study aimed to overcome such limitations while improving the amount of details reproduced within the model for further utilization in BIM. First, we input just the registered three-dimensional (3D) point-cloud data and segmented the point cloud into separate rooms for more effective performance of the later modeling phases for each room. For noise filtering, an offset space from the ceiling height was used to determine whether the scan points belonged to clutter or architectural components. The filtered points were projected onto a binary map in order to trace the floor-wall boundary, which was further refined through subsequent segmentation and regularization procedures. Then, the wall volumes were estimated in two ways: inside- and outside-wall-component modeling. Finally, the wall points were segmented and projected onto an inverse binary map, thereby enabling detection and modeling of the hollow areas as windows or doors. The experimental results on two real-world data sets demonstrated, through comparison with manually-generated models, the effectiveness of our approach: the calculated RMSEs of the two resulting models were 0.089 m and 0.074 m, respectively.
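The ceiling-offset filter and binary-map projection described above can be illustrated with a few lines of NumPy (the cell size and offset are invented for the sketch; the paper's boundary tracing and regularization steps are not shown):

```python
import numpy as np

def project_to_binary_map(points, cell=0.1, ceiling_offset=0.3):
    """Keep points within `ceiling_offset` of the ceiling (treated as
    clutter-free architectural structure) and rasterize them onto a 2-D
    boolean occupancy grid — a minimal sketch of the 'binary map' idea.
    """
    ceiling_z = points[:, 2].max()
    keep = points[points[:, 2] > ceiling_z - ceiling_offset]
    # Rasterize the surviving points into a boolean XY grid.
    xy_min = keep[:, :2].min(axis=0)
    ij = np.floor((keep[:, :2] - xy_min) / cell).astype(int)
    grid = np.zeros(tuple(ij.max(axis=0) + 1), dtype=bool)
    grid[ij[:, 0], ij[:, 1]] = True
    return grid
```

Tracing the occupied/empty boundary of such a grid yields the floor-wall outline that the wall-volume estimation then builds on.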
Article
The aim of this study is to propose a method for generating as-built BIMs from laser-scan data obtained during the construction phase, particularly during ongoing structural works. The proposed method consists of three steps: region-of-interest detection to distinguish the 3D points that are part of the structural elements to be modeled, scene segmentation to partition the 3D points into meaningful parts comprising different types of elements (e.g., floors, columns, walls, girders, beams, and slabs) using local concave and convex properties between structural elements, and volumetric representation. The proposed method was tested in field experiments by acquiring and processing laser-scan data from construction sites. The performance of the proposed method was evaluated by quantitatively measuring how accurately each of the structural elements was recognized as its functional semantics. Overall, 139 elements of the 141 structural elements (99%) in the two construction sites combined were recognized and modeled according to their actual functional semantics. As the experimental results imply, the proposed method can be used for as-built BIMs without any prior information from as-planned models.
Article
Automatic 3D plane segmentation is necessary for many applications including point cloud registration, building information model (BIM) reconstruction, simultaneous localization and mapping (SLAM), and point cloud compression. However, most of the existing 3D plane segmentation methods still suffer from low precision and recall, and inaccurate and incomplete boundaries, especially for low-quality point clouds collected by RGB-D sensors. To overcome these challenges, this paper formulates the plane segmentation problem as a global energy optimization because it is robust to high levels of noise and clutter. First, the proposed method divides the raw point cloud into multiscale supervoxels, and considers planar supervoxels and individual points corresponding to nonplanar supervoxels as basic units. Then, an efficient hybrid region growing algorithm is utilized to generate initial plane set by incrementally merging adjacent basic units with similar features. Next, the initial plane set is further enriched and refined in a mutually reinforcing manner under the framework of global energy optimization. Finally, the performances of the proposed method are evaluated with respect to six metrics (i.e., plane precision, plane recall, under-segmentation rate, over-segmentation rate, boundary precision, and boundary recall) on two benchmark datasets. Comprehensive experiments demonstrate that the proposed method obtained good performances both in high-quality TLS point clouds (i.e., SEMANTIC3D.NET dataset) and low-quality RGB-D point clouds (i.e., S3DIS dataset) with six metrics of (94.2%, 95.1%, 2.9%, 3.8%, 93.6%, 94.1%) and (90.4%, 91.4%, 8.2%, 7.6%, 90.8%, 91.7%) respectively.
Article
In the spectrum of vision-based autonomous driving, vanilla end-to-end models are not interpretable and suboptimal in performance, while mediated perception models require additional intermediate representations such as segmentation masks or detection bounding boxes, whose annotation can be prohibitively expensive as we move to a larger scale. Raw images and existing intermediate representations are also loaded with nuisance details that are irrelevant to the prediction of vehicle commands, e.g. the style of the car in front or the view beyond the road boundaries. More critically, all prior works fail to deal with the notorious domain shift if we were to merge data collected from different sources, which greatly hinders the model generalization ability. In this work, we address the above limitations by taking advantage of virtual data collected from driving simulators, and present DU-drive, an unsupervised real to virtual domain unification framework for end-to-end driving. It transforms real driving data to its canonical representation in the virtual domain, from which vehicle control commands are predicted. Our framework has several advantages: 1) it maps driving data collected from different source distributions into a unified domain, 2) it takes advantage of annotated virtual data which is free to obtain, 3) it learns an interpretable, canonical representation of driving image that is specialized for vehicle command prediction. Extensive experiments on two public highway driving datasets clearly demonstrate the performance superiority and interpretive capability of DU-drive.
Article
Owing to the increasing complexity of modern industrial plants such as naval and ocean plants and industrial factories, there is an urgent need for an efficient method of inspecting such structures in terms of as-built inspection and maintenance. In this paper, we propose a method called ‘parametric comparing’ which utilizes laser scan technology to support the inspection process of industrial plants. In our approach, component parameters are extracted from laser scan data and then compared with their as-designed parameters. The results of this process can support engineers in assessing the quality of the as-built model. Because most components in an industrial plant can be classified as piping elements, we focus on two classes of components: straight pipes and connecting components. Validations on two prototype data sets have proved that our approach is practical and fit for industrial application.
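For a straight pipe with a known axis, the comparison reduces to estimating the as-built radius from the scan and checking it against the design value — a deliberately simplified sketch of the idea (the function name, tolerance, and known-axis assumption are ours, not the paper's):

```python
import numpy as np

def check_pipe_radius(scan_points, axis_point, axis_dir, design_radius, tol=0.005):
    """Estimate a straight pipe's as-built radius from scan points and flag
    deviations from the as-designed radius beyond a tolerance (in metres).
    """
    axis_dir = axis_dir / np.linalg.norm(axis_dir)
    rel = scan_points - axis_point
    # Radial distance = length of the component orthogonal to the axis.
    axial = rel @ axis_dir
    radial = np.linalg.norm(rel - np.outer(axial, axis_dir), axis=1)
    built_radius = radial.mean()
    return built_radius, abs(built_radius - design_radius) <= tol
```

In practice the axis itself would also be fitted from the scan (e.g., by cylinder fitting) before this parameter comparison is made.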
Article
Building information models (BIMs) have proven to be data-rich, object-oriented, intelligent, and parametric digital representations of buildings to support diverse activities throughout the lifecycle of the building. Despite the growing use of BIMs for new construction projects in recent years, most existing buildings today often do not have complete as-is information documents or a meaningful BIM. Thus, incomplete or even incorrect information in as-is records is still one of the main reasons for the low level of efficiency in facilities management for existing buildings. Furthermore, creating an as-is BIM for an existing building is considered a time-consuming and expensive process that requires great effort, time, costs, and skilled workers. Convenient, efficient, and economical approaches with high accuracy for constructing as-is BIMs would essentially be the foremost step for effective operation and maintenance of existing buildings. To this end, this study aims at categorizing and analyzing the state-of-the-art technologies in image-based BIM construction processes as the first step. For effective review, a general framework for image-based as-is BIM construction processes is proposed with the following key steps: (1) data capturing and processing, (2) object recognition, and (3) as-is BIM construction. Detailed comparative analyses of methods commonly used in each step were conducted and a prospective model is proposed. Finally, based on the prospective model, knowledge gaps and future development of image-based as-is BIM construction processes are identified. This paper presents the results of systematic review and analyses for image-based as-is BIM construction processes, contributing to the development and widespread adoption of this technology in the construction industry.
Article
A key requirement for leveraging supervised deep learning methods is the availability of large, labeled datasets. Unfortunately, in the context of RGB-D scene understanding, very little data is available -- current datasets cover a small range of scene views and have limited semantic annotations. To address this issue, we introduce ScanNet, an RGB-D video dataset containing 2.5M views in 1513 scenes annotated with 3D camera poses, surface reconstructions, and semantic segmentations. To collect this data, we designed an easy-to-use and scalable RGB-D capture system that includes automated surface reconstruction and crowdsourced semantic annotation. We show that using this data helps achieve state-of-the-art performance on several 3D scene understanding tasks, including 3D object classification, semantic voxel labeling, and CAD model retrieval. The dataset is freely available at http://www.scan-net.org.
Article
Signal processing is the key component of any vibration-based structural health monitoring (SHM). The goal of signal processing is to extract subtle changes in the vibration signals in order to detect, locate and quantify the damage and its severity in the structure. This paper presents a state-of-the-art review of recent articles on signal processing techniques for vibration-based SHM. The focus is on civil structures including buildings and bridges. The paper also presents new signal processing techniques proposed in the past few years as potential candidates for future SHM research. The biggest challenge in realization of health monitoring of large real-life structures is automated detection of damage out of the huge amount of very noisy data collected from dozens of sensors on a daily, weekly, and monthly basis. The new methodologies for on-line SHM should handle noisy data effectively, and be accurate, scalable, portable, and efficient computationally.
Article
Various methods are available for the planar segmentation of point clouds from terrestrial laser scanning. In this letter, a new method is proposed to extract planar features from the range image of a point cloud scanned from one standpoint. In this method, a plane is parameterized by its normal vector and the distance from the origin. The algebraic derivation of the parameters is presented in this letter. The parameters are calculated based on the gradient value of a pixel in the range image. The multiple-band synthetic image of planar parameters is segmented using the Iso cluster unsupervised classification method. Experimental plane segmentation results using range images of two point clouds are illustrated. In comparison with existing methods, the proposed method gives an exact estimation of the planar parameters and can handle planes of any orientation.
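The parameterization described above — a plane given by its normal vector and its distance from the origin, estimated per pixel of a gridded range image — can be sketched with NumPy as follows. This is an illustrative finite-difference version, not the paper's exact algebraic derivation of the parameters from the gradient values:

```python
import numpy as np

def plane_params_from_range_image(points):
    """Estimate per-pixel plane parameters (normal n, distance d) on a
    gridded point cloud of shape (H, W, 3).

    Illustrative sketch: tangent vectors come from finite-difference
    gradients along the image rows and columns; the normal is their
    cross product; the plane distance is d = n . p."""
    du = np.gradient(points, axis=0)   # tangent along image rows, (H, W, 3)
    dv = np.gradient(points, axis=1)   # tangent along image columns
    n = np.cross(du, dv)               # unnormalized surface normal
    norm = np.linalg.norm(n, axis=2, keepdims=True)
    n = n / np.where(norm > 0, norm, 1.0)
    d = np.sum(n * points, axis=2)     # signed distance from the origin
    return n, d
```

The resulting (n, d) maps form the multiple-band synthetic image that the letter then segments with an unsupervised clustering step.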
Article
Building information modeling (BIM) is one of the most promising recent developments in the architecture, engineering, and construction (AEC) industry. With BIM technology, an accurate virtual model of a building is digitally constructed. This model, known as a building information model, can be used for planning, design, construction, and operation of the facility. It helps architects, engineers, and constructors visualize what is to be built in a simulated environment to identify any potential design, construction, or operational issues. BIM represents a new paradigm within AEC, one that encourages integration of the roles of all stakeholders on a project. In this paper, current trends, benefits, possible risks, and future challenges of BIM for the AEC industry are discussed. The findings of this study provide useful information for AEC industry practitioners considering implementing BIM technology in their projects.
Article
Accurate and rapidly produced 3D models of the as-built environment can be significant assets for a variety of Engineering scenarios. Starting with a point cloud of a scene—generated using laser scanners or image-based reconstruction methods—the user must first identify collections of points that belong to individual surfaces, and then, fit surfaces and solid geometry objects appropriate for the analysis. When performed manually, this task is often prohibitively time consuming and, in response, several research groups have recently focused on developing methods for automating the modeling process. Due to the limitations of the data collection processes as well as the complexity of as-built scenes, automated 3D modeling still presents many challenges. To overcome existing limitations, in this paper, we propose a new region growing method for robust context-free segmentation of unordered point clouds based on geometrical continuities. In our method, the user sets a single parameter which accounts for the desired level of abstraction. We treat this parameter as a locally adaptive threshold to account for local context. Our method of segmentation starts with a multi-scale feature detection, describing surface roughness and curvature around each 3D point, and is followed by seed finding and region growing steps. Experimental results from seven challenging point clouds of the built environment demonstrate that our method can account for variability in point cloud density, surface roughness, curvature, and clutter within a single scene.
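The seed-finding and region-growing steps described above can be sketched as follows. This is a minimal single-threshold version using a fixed normal-deviation criterion and a brute-force neighborhood query; the paper's method additionally uses multi-scale roughness/curvature features and treats the user parameter as a locally adaptive threshold:

```python
import numpy as np

def region_grow(points, normals, radius=0.5, angle_thresh_deg=10.0):
    """Greedy region growing on a point cloud: start from an unlabeled
    seed and absorb neighbors whose normals deviate by less than a
    threshold. Minimal sketch; a k-d tree would replace the brute-force
    distance query in practice."""
    n_pts = len(points)
    labels = -np.ones(n_pts, dtype=int)      # -1 means "not yet assigned"
    cos_t = np.cos(np.radians(angle_thresh_deg))
    region = 0
    for seed in range(n_pts):
        if labels[seed] != -1:
            continue
        stack = [seed]
        labels[seed] = region
        while stack:
            i = stack.pop()
            # All unassigned points within the search radius of point i
            d = np.linalg.norm(points - points[i], axis=1)
            for j in np.where((d < radius) & (labels == -1))[0]:
                # Grow only across geometrically continuous surfaces
                if abs(np.dot(normals[i], normals[j])) > cos_t:
                    labels[j] = region
                    stack.append(j)
        region += 1
    return labels
```

On two parallel planar patches separated by more than the search radius, this returns two distinct region labels.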
Conference Paper
In this paper, we propose a data-driven approach to leverage repositories of 3D models for scene understanding. Our ability to relate what we see in an image to a large collection of 3D models allows us to transfer information from these models, creating a rich understanding of the scene. We develop a framework for auto-calibrating a camera, rendering 3D models from the viewpoint an image was taken, and computing a similarity measure between each 3D model and an input image. We demonstrate this data-driven approach in the context of geometry estimation and show the ability to find the identities and poses of object in a scene. Additionally, we present a new dataset with annotated scene geometry. This data allows us to measure the performance of our algorithm in 3D, rather than in the image plane.
Article
In this work we present a framework for the recognition of natural scene text. We use purely data-driven, deep learning models to perform word recognition on the whole image at the same time, departing from the character based recognition systems of the past. These models are trained solely on data produced by a synthetic text generation engine -- synthetic data that is highly realistic and sufficient to replace real data, giving us infinite amounts of training data. This excess of data exposes new possibilities for word recognition models, and here we introduce three novel models, each one "reading" words in a complementary way: via large-scale dictionary encoding, character sequence encoding, and bag-of-N-gram encoding. In the scenarios of language/lexicon based and completely unconstrained text recognition we demonstrate state-of-the-art performance on standard datasets, using our fast, simple machinery and requiring zero data-acquisition costs.
Article
This paper presents a global plane fitting approach for roof segmentation from lidar point clouds. Starting with a conventional plane fitting approach (e.g., plane fitting based on region growing), an initial segmentation is first derived from roof lidar points. Such initial segmentation is then optimized by minimizing a global energy function consisting of the distances of lidar points to initial planes (labels), spatial smoothness between data points, and the number of planes. As a global solution, the proposed approach can determine multiple roof planes simultaneously. Two lidar data sets of Indianapolis (USA) and Vaihingen (Germany) are used in the study. Experimental results show that the completeness and correctness are increased from 80.1% to 92.3%, and 93.0% to 100%, respectively; and the detection cross-lap rate and reference cross-lap rate are reduced from 11.9% to 2.2%, and 24.6% to 5.8%, respectively. As a result, the incorrect segmentation that often occurs at plane transitions is satisfactorily resolved; and the topological consistency among segmented planes is correctly retained even for complex roof structures.
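The global energy described above combines a data term (point-to-plane distances), a smoothness term (label disagreement between neighboring points), and a label term (number of planes in use). A minimal evaluation sketch is below; the weights `lam` and `beta` are illustrative placeholders, not the paper's values, and a real implementation would minimize this energy (e.g. via graph cuts) rather than merely evaluate it:

```python
import numpy as np

def segmentation_energy(points, labels, planes, neighbors, lam=1.0, beta=1.0):
    """Evaluate a global plane-segmentation energy of the form:
    sum of point-to-plane distances + lam * neighbor label disagreements
    + beta * number of planes used. Each plane is a (normal, distance) pair."""
    # Data term: distance of each point to its assigned plane
    data = 0.0
    for p, l in zip(points, labels):
        n, d = planes[l]
        data += abs(np.dot(n, p) - d)
    # Smoothness term: penalize neighboring points with different labels
    smooth = sum(1 for i, j in neighbors if labels[i] != labels[j])
    # Label term: one cost unit per plane actually in use
    return data + lam * smooth + beta * len(set(labels))
```

A labeling that assigns coplanar points to a single well-fitting plane scores lower than one that splits them across planes, which is what drives the optimization toward topologically consistent roof segments.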
Article
3D object recognition in cluttered scenes is a rapidly growing research area. Based on the used types of features, 3D object recognition methods can broadly be divided into two categories – global or local feature based methods. Intensive research has been done on local surface feature based methods as they are more robust to occlusion and clutter which are frequently present in a real-world scene. This paper presents a comprehensive survey of existing local surface feature based 3D object recognition methods. These methods generally comprise three phases: 3D keypoint detection, local surface feature description, and surface matching. This paper covers an extensive literature survey of each phase of the process. It also enlists a number of popular and contemporary databases together with their relevant attributes.
Article
As-built models and drawings are essential documents used during the operations and maintenance (O&M) of buildings for a variety of purposes including the management of facility spaces, equipment, and energy systems. These documents undergo continuous verification and updating procedures both immediately after construction during the initial handover process to reflect construction changes and during the occupancy stage for the changes that occur throughout the building's lifespan. Current as-built verification and updating procedures involve largely time-consuming on-site surveys, where measurements are taken and recorded manually. In an attempt to streamline this process, the paper investigates the advantages and limitations of using photogrammetric image processing to document and verify actual as-built conditions. A test bed of both the interior and exterior of a university building is used to compare the dimensions generated by automated image processing to dimensions gathered through the manual survey process currently employed by facilities management, and strategies for improved accuracy are investigated. Both manual and image-based dimensions are then used to verify dimensions of an existing as-built Building Information Model (BIM). Finally, the potential of the image-based spatial data is assessed for accurately generating 3D models.
Article
Building information models (BIMs) are maturing as a new paradigm for storing and exchanging knowledge about a facility. BIMs constructed from a CAD model do not generally capture details of a facility as it was actually built. Laser scanners can be used to capture dense 3D measurements of a facility's as-built condition and the resulting point cloud can be manually processed to create an as-built BIM — a time-consuming, subjective, and error-prone process that could benefit significantly from automation. This article surveys techniques developed in civil engineering and computer science that can be utilized to automate the process of creating as-built BIMs. We sub-divide the overall process into three core operations: geometric modeling, object recognition, and object relationship modeling. We survey the state-of-the-art methods for each operation and discuss their potential application to automated as-built BIM creation. We also outline the main methods used by these algorithms for representing knowledge about shape, identity, and relationships. In addition, we formalize the possible variations of the overall as-built BIM creation problem and outline performance evaluation measures for comparing as-built BIM creation algorithms and tracking progress of the field. Finally, we identify and discuss technology gaps that need to be addressed in future research.
Article
This paper highlights potential problems in the construction industry concerning the large quantities of information produced and the lack of an adequate information structure within which to coordinate this information. The Information Engineering Method (IEM) and Information Engineering Facility (IEF) CASE tool are described and put forward as a means of establishing an information structure at a strategic level, thus providing a framework for the implementation of lower-level application systems. The paper describes how the ICON (Integration/Information for Construction) project at Salford University is establishing and modelling the information requirements for the construction industry at the strategic level. The IEM and IEF are demonstrated using activity, data and interaction models with particular attention being paid to the function of building design within the broader context of design, procurement and the management of construction. Implications for future practice are also discussed.
Article
Terrestrial laser scanning is becoming a common surveying technique to measure quickly and accurately dense point clouds in 3-D. It simplifies measurement tasks on site. However, the massive volume of 3-D point measurements presents a challenge not only because of acquisition time and management of huge volumes of data, but also because of processing limitations on PCs. Raw laser scanner point clouds require a great deal of processing before final products can be derived. Thus, segmentation becomes an essential step whenever grouping of points with common attributes is required, and it is necessary for applications requiring the labelling of point clouds, surface extraction and classification into homogeneous areas. Segmentation algorithms can be classified as surface growing algorithms or clustering algorithms. This paper presents an unsupervised robust clustering approach based on fuzzy methods. Fuzzy parameters are analysed to adapt the unsupervised clustering methods to segmentation of laser scanner data. Both the Fuzzy C-Means (FCM) algorithm and the Possibilistic C-Means (PCM) mode-seeking algorithm are reviewed and used in combination with a similarity-driven cluster merging method. They constitute the kernel of the unsupervised fuzzy clustering method presented herein. It is applied to three point clouds acquired with different terrestrial laser scanners and scenarios: the first is an artificial (synthetic) data set that simulates a structure with different planar blocks; the second is a composition of three metric ceramic gauge blocks (Grade 0, flatness tolerance ± 0.1 μm) recorded with a Konica Minolta Vivid 9i optical triangulation digitizer; the last is an outdoor data set covering a modern architectural building, collected from the centre of an open square. The amplitude-modulated continuous-wave (AMCW) terrestrial laser scanner system, the Faro 880, was used for the acquisition of the latter data set.
Experimental analyses of the results from the proposed unsupervised planar segmentation process are shown to be promising.
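The FCM kernel the abstract above builds on can be sketched as follows. This minimal version alternates membership and centroid updates only; it omits the possibilistic (PCM) mode-seeking variant and the similarity-driven cluster merging used in the paper:

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, iters=100, seed=0):
    """Plain Fuzzy C-Means: alternate between (1) fuzzy centroid updates
    weighted by memberships raised to the fuzzifier m, and (2) membership
    updates from inverse distances to the centroids."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)          # memberships sum to 1 per point
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Distance of every point to every centre (epsilon avoids div by zero)
        D = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U = 1.0 / (D ** (2.0 / (m - 1.0)))
        U /= U.sum(axis=1, keepdims=True)
    return centers, U
```

Hardening the fuzzy memberships with an argmax over clusters yields the segment labels; on well-separated groups of points this recovers the expected partition.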
Article
This paper presents a method for automatically registering multiple rigid three-dimensional (3D) data sets, a process we call multi-view surface matching. Previous approaches required manual registration or relied on specialized hardware to record the sensor position. In contrast, our method does not require any pose measuring hardware or manual intervention. We do not assume any knowledge of initial poses or which data sets overlap. Our multi-view surface matching algorithm begins by converting the input data into surface meshes, which are pair-wise registered using a surface matching engine. The resulting matches are tested for surface consistency, but some incorrect matches may be indistinguishable from correct ones at this local level. A global optimization process searches a graph constructed from the pair-wise matches for a connected sub-graph containing only correct matches, employing a global consistency measure to eliminate incorrect, but locally consistent, matches. From this sub-graph, the rigid-body transforms that register all the views can be computed directly. We apply our algorithm to the problem of 3D digital reconstruction of real-world objects and show results for a collection of automatically digitized objects.
Conference Paper
In this work we address the question of how to exploit typical architectural structures to improve recovery for CAD modeling of built environments from 3D data. In doing so we have examined the applicability of the GENOCOP III algorithm to the model fitting. The algorithm uses explicit domain knowledge, specifically geometric constraints in the form of parameterized surface models and a Euclidean fitting of geometric primitives that describe the parameterized surface models. Beside some results fitting parameterized surface models to real 3D datasets, example times for convergence and comparison with known ground truth are given.
Deep learning brain conductivity mapping using a patch-based 3D U-net
  • Hampe
Localization recall precision (LRP): A new performance metric for object detection
  • Oksuz
Automatic reconstruction of tree skeletal structures from point clouds
  • Livny
Deep leaf segmentation using synthetic data
  • Ward