Figure 4 - uploaded by Tanittha Sutjaritvorakul
Proposed workflow for each individual network

Source publication
Conference Paper
In this paper, we propose an approach and workflow to detect humans in the environment around a crane using monocular images. The considered area is split into a zone around the crane truck and one around the load. The load will be monitored with an optical zoom camera whose zoom we can control. We discretize the zoom levels and a Co...

Contexts in source publication

Context 1
... also will use simulated data since we will need to generate it anyway, as explained further below. The workflow will be similar to figure 4, but keep in mind that we start with a pretrained network and do not need to train from scratch. With this in hand we have all the necessary parts to combine it into a human detection algorithm around the crane truck. ...
Context 2
... so to speak need a DatasetZoom0, DatasetZoom25, DatasetZoom50 and DatasetZoom75. Since such datasets are not publicly available we follow the workflow proposed in figure 4 for each network individually. Assuming this training is finished we take an additional network which we will call MetaNetwork, which differs in its architecture. ...

Similar publications

Article
Truck-lifting accidents are common in container-lifting operations. Previously, operation sites had to assign workers for observation and guidance. However, with the development of automated equipment in container terminals, an automated accident detection method is required to replace manual workers. Considering the development of visi...

Citations

... As the dataset can provide the same scene in a variety of atmospheric conditions, e.g., foggy, clear, and rainy, its variety is enhanced. According to [14], synthetic data can be used for training and testing neural networks for detecting workers around crane loads. Using a game-engine driven synthetic dataset, [15] evaluates the workers' and safety hooks' detection systems when they perform scaffolding tasks. ...
Conference Paper
Real-time construction site monitoring is essential for ensuring compliance with safety measures to prevent fatal accidents among workers. This monitoring is carried out using computer vision-based object detection models that require high-quality datasets collected with various viewpoints, specularities, and illumination levels. It is, however, extremely difficult and labor-intensive to collect large-scale datasets. Although several previous studies propose innovative approaches to augmenting existing data and creating synthetic data, they are still largely ineffective in dynamic construction work environments. In this study, we propose using OpenAI's DALL·E 2 original synthetic image generation tool to improve CV-based object detection accuracy. The study aims to improve monitoring technologies by developing a DALL·E 2 synthetic dataset, resulting in fewer construction site fatalities.
... The research is mainly focused on adaptive zoom conditions rather than how to implement an effective zoom control. In other words, the previous studies [6,15] do not provide evidence on how to conduct the zoom control to reach the desired zoom level, but only zoom constraints. For instance, the authors in [6] merely mentioned that the camera is zoomed to hold a defined pixel range. ...
... The zoom function of the crane load-view camera is essential for the crane operator. Vierling et al. [15] propose an automatic zoom load-view camera based on the working zone and load occlusion. The authors trained four CNNs with the load-view images as input. ...
Conference Paper
A zoom camera is essential for detecting objects from the top view. A deep learning detection algorithm can fail to handle scale invariance, especially for detectors whose input size changes over an extremely wide range. An adaptive zoom feature can enhance the quality of deep learning worker detection. In this paper, we introduce an automatic zoom control approach and demonstrate its efficacy in real-world top-view object detection. To avoid further data gathering and extensive re-training, the zoom adaptability method of the load-view crane camera is able to support the deep learning algorithm, specifically in the high scale-variance problem. A finite state machine is employed for the control strategies to adapt the zoom level to cope not only with inconsistent detection but also with abrupt camera movement during lifting operation. As a result, the detector is able to detect a small object through smooth continuous zoom control without additional training.
... The load is often not in direct sight of the crane operator. Therefore surveillance with a camera mounted on the boom top is used to increase the safety [1]. Due to the birds-eye-view, vertical and horizontal translational robustness is a natural aspect the detection algorithm should fulfill. ...
... The high-visibility color feature of an emergency vest is used. Vierling et al. [21] adapt the zoom camera level of the crane based on the crane working zone. ...
Chapter
Safety assistance is important in the construction industry specifically for crane operation due to the high fatality rate of the crane. Semantic visibility assistance as a crucial part of the safety system is highly dependent on the image database. In this paper, we demonstrate how to develop the simulation platform to generate the data that is needed for worker detection from the load-view crane camera.
... The role of the crane is to lower the boom, pick up the load, and move the load to the desired position. The camera takes a snapshot of the load and the ground while the boom is moving [22]. ...
Preprint
This paper addresses the problem of dense depth prediction from sparse distance sensor data and a single camera image in challenging weather conditions. This work explores the significance of different sensor modalities such as camera, Radar, and Lidar for estimating depth by applying Deep Learning approaches. Although Lidar has higher depth-sensing abilities than Radar and has been integrated with camera images in many previous works, depth estimation using CNNs on the fusion of robust Radar distance data and camera images has not been explored much. In this work, a deep regression network is proposed utilizing a transfer learning approach consisting of an encoder, initialized from a high-performing pre-trained model, for extracting dense features and a decoder for upsampling and predicting the desired depth. The results are demonstrated on nuScenes, KITTI, and a synthetic dataset which was created using the CARLA simulator. Also, top-view zoom-camera images captured from the crane on a construction site are evaluated to estimate the distance of the crane boom carrying heavy loads from the ground to show the usability in safety-critical applications.
... They applied YOLO for detection but only achieved a precision of about 75%. Vierling et al. [25] proposed a convolutional neural network (CNN)-based concept detecting workers in top-view images. To cope with high altitudes, their approach relies on several zoom levels each with a separate CNN for detection. ...
Article
Keeping an overview of all ongoing processes on construction sites is almost unfeasible, especially for the construction workers executing their tasks. It is difficult for workers to concentrate on their work while paying attention to other processes. If their workflows in hazardous areas do not run properly, this can lead to dangerous accidents. Tracking pedestrian workers could improve the productivity and safety management on construction sites. For this, vision-based tracking approaches are suitable, but the training and evaluation of such a system requires a large amount of data originating from construction sites. Such data are rarely available, which complicates deep learning approaches. Thus, we use a small generic dataset and juxtapose a deep learning detector with an approach based on classical machine learning techniques. We identify workers using a YOLOv3 detector and compare its performance with an approach based on a soft cascaded classifier. Afterwards, tracking is done by a Kalman filter. In our experiments, the classical approach outperforms YOLOv3 on the detection task given a small training dataset. However, the Kalman filter is sufficiently robust to compensate for the drawbacks of YOLOv3. We found that both approaches generally yield satisfying tracking performance but feature different characteristics.
... The role of the crane is to lower the boom, pick up the load, and move the load to the desired position. The camera takes a snapshot of the load and the ground while the boom is moving [21]. ...
... Hu et al. [7] use YOLOv3 to detect non-compliant workers without a helmet. Vierling et al. [29] propose an automatic zoom load-view camera based on the working zone and load occlusion. The authors train the convolutional neural network with the load-view image and current zoom level, which then outputs the optimal zoom level for the operator. ...
Preprint
Cranes, as an essential part of construction machinery, are one of the prominent sources of fatalities on construction sites. The camera assistance system can contribute significantly to the safety of crane operation, particularly in blind lift tasks, where the operator relies heavily on the load-view camera. In this paper, we address worker detection from an off-the-shelf load-view crane camera using a data-driven approach. Due to the difficulties in collecting data, we generate five training datasets via a simulation platform to build up the synthetic samples to improve the state-of-the-art detector. Despite the fact that only simulation data is used for training, the trained network demonstrates an average precision of up to 66.84% in two real-world scenarios.
... Hu et al. [9] use YOLOv3 to detect non-compliant workers without a helmet. Vierling et al. [30] propose an automatic zoom load-view camera based on the working zone and load occlusion. The authors train the convolutional neural network with the load-view image and current zoom level, which then outputs the optimal zoom level for the operator. ...
Conference Paper
Cranes, as an essential part of construction machinery, are one of the prominent sources of fatalities on construction sites. The camera assistance system can contribute significantly to the safety of crane operation, particularly in blind lift tasks, where the operator relies heavily on the load-view camera. In this paper, we address worker detection from an off-the-shelf load-view crane camera using a data-driven approach. Due to the difficulties in collecting data, we generate five training datasets via a simulation platform to build up the synthetic samples to improve the state-of-the-art detector. Despite the fact that only simulation data is used for training, the trained network demonstrates an average precision of up to 66.84% in two real-world scenarios.
... Secondly, workers are detected in video images using Support Vector Machine (SVM) and k-Nearest Neighbors (k-NN) classifiers [13] and are then tracked over time [14]. More recent approaches employ Convolutional Neural Networks (CNNs) for both detection and tracking purposes [15,16]. However, the detection results of such approaches are in need of improvement. ...
Article
Vision-based tracking systems enable the optimization of the productivity and safety management on construction sites by monitoring the workers' movements. However, training and evaluation of such a system requires a vast amount of data. Sufficient datasets rarely exist for this purpose. We investigate the use of synthetic data to overcome this issue. Using 3D computer graphics software, we model virtual construction site scenarios. These are rendered for use as a synthetic dataset which augments a self-recorded real-world dataset. Our approach is verified by means of a tracking system. For this, we train a YOLOv3 detector to identify pedestrian workers. Kalman filtering is applied to the detections to track them over consecutive video frames. First, the detector's performance is examined when using synthetic data of various environmental conditions for training. Second, we compare the evaluation results of our tracking system on real-world and synthetic scenarios. With an increase of about 7.5 percentage points in mean average precision, our findings show that a synthetic extension is beneficial for otherwise small datasets. The similarity of synthetic and real-world results allows for the conclusion that 3D scenes are an alternative for evaluating vision-based tracking systems on hazardous scenes without exposing workers to risks.