Typical image grabbed by a walking humanoid robot. As can be seen, the image is highly affected by motion blur. 

Source publication
Conference Paper
Full-text available
Motion blur is a severe problem in images grabbed by legged robots and, in particular, by small humanoid robots. Standard feature extraction and tracking approaches typically fail when applied to sequences of images strongly affected by motion blur. In this paper, we propose a new feature detection and tracking scheme that is robust even to non-uni...

Context in source publication

Context 1
... its camera moves in a jerky and sometimes unpredictable way. This causes an undesired motion blur in the images grabbed by the robot's camera that negatively affects the performance of feature detectors and, especially, of classic feature tracking algorithms. A typical image affected by motion blur grabbed by a walking robot is depicted in Fig. ...

Similar publications

Conference Paper
Full-text available
Synopsis Dynamic Radial VIBE (DRV) DCE-MRI allows imaging with sufficient spatio-temporal resolution for functional imaging of kidneys. However, fast movements of babies during the scan corrupt individual lines in k-space, severely compromise the quality of the reconstructed images, and limit the clinical utility of non-sedated imaging. In this...

Citations

... Estimation of camera pose from images can be hindered by a range of environmental circumstances [8,10]. Many visual complexities have been considered in the literature, and methods have been proposed to overcome specific difficulties: motion blur [11][12][13], illumination change [14,15], dynamic scenes [16,17], textures [18][19][20], indoor/outdoor transitions, and specular highlights [18,21]. General approaches to tackling complexities have recently been proposed [18,22], but they do so irrespective of the source of errors. ...
Article
Full-text available
Visual simultaneous localisation and mapping (vSLAM) finds applications in indoor and outdoor navigation that routinely subject it to visual complexities, particularly mirror reflections. The effect of mirror presence (time visible and its average size in the frame) was hypothesised to impact localisation and mapping performance, with systems using direct techniques expected to perform worse. Thus, a dataset, MirrEnv, of image sequences recorded in mirror environments was collected and used to evaluate the performance of existing representative methods. RGBD ORB-SLAM3 and BundleFusion appear to show moderate degradation of absolute trajectory error with increasing mirror duration, whilst the remaining results did not show significantly degraded localisation performance. The mesh maps generated proved to be very inaccurate, with real and virtual reflections colliding in the reconstructions. A discussion is given of the likely sources of error and robustness in mirror environments, outlining future directions for validating and improving vSLAM performance in the presence of planar mirrors. The MirrEnv dataset is available at https://doi.org/10.17035/d.2023.0292477898.
... The success of robot localization methods based on vision relies on the quality of the camera exposure. While many methods exist to mitigate poor exposure effects after images have been acquired (e.g., motion blur [1][2][3][4][5][6][7], saturation [8], low contrast [9][10][11]), these often jeopardize the real-time capabilities of state estimation. Moreover, the performance of these specialized visual odometry (VO) and simultaneous localization and mapping (SLAM) pipelines can at best match that of equivalent generic pipelines fed with appropriately acquired images. ...
Article
Full-text available
The success of robot localization based on visual odometry (VO) largely depends on the quality of the acquired images. In challenging light conditions, specialized auto-exposure (AE) algorithms that purposely select camera exposure time and gain to maximize the image information can therefore greatly improve localization performance. In this work, an AE algorithm is introduced which, unlike existing algorithms, fully leverages the camera’s photometric response function to accurately predict the optimal exposure of future frames. It also features feedback that compensates for prediction inaccuracies due to image saturation and explicitly balances motion blur and image noise effects. For validation, stereo cameras mounted on a custom-built motion table allow different AE algorithms to be benchmarked on the same repeated reference trajectory using the stereo implementation of ORB-SLAM3. Experimental evidence shows that (1) the gradient information metric appropriately serves as a proxy of indirect/feature-based VO performance; (2) the proposed prediction model based on simulated exposure changes is more accurate than using γ transformations; and (3) the overall accuracy of the estimated trajectory achieved using the proposed algorithm equals or surpasses that of classic exposure control approaches. The source code of the algorithm and all datasets used in this work are shared openly with the robotics community.
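As an informal illustration of the gradient-information idea mentioned in the abstract, the sketch below scores candidate exposure gains by the total gradient magnitude of a simulated re-exposure of the current frame. The linear simulate_exposure model, the candidate gains, and all function names are assumptions made here for illustration; they are not the paper's photometric-response-based prediction model.

```python
import numpy as np

def gradient_information(img):
    """Sum of gradient magnitudes; a simple proxy for feature-based VO quality."""
    gy, gx = np.gradient(img.astype(np.float64))
    return np.sum(np.hypot(gx, gy))

def simulate_exposure(img, gain):
    """Hypothetical linear re-exposure model with saturation (an assumption here,
    not the paper's calibrated photometric response function)."""
    return np.clip(img.astype(np.float64) * gain, 0.0, 255.0)

def select_exposure_gain(img, candidate_gains=(0.5, 0.75, 1.0, 1.5, 2.0)):
    """Pick the exposure change that maximizes the predicted gradient information."""
    scores = {g: gradient_information(simulate_exposure(img, g)) for g in candidate_gains}
    return max(scores, key=scores.get)

# Usage on a synthetic, under-exposed 8-bit grayscale frame:
frame = (np.random.rand(120, 160) * 60).astype(np.uint8)
print("suggested exposure gain:", select_exposure_gain(frame))
```

In the paper's framework, the predicted appearance of future frames would come from the calibrated photometric response function rather than from this linear approximation.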
... The degree of viewpoint variation that arises during scene perception by a humanoid robot is appreciably more complex than the viewpoint variations experienced by mobile robots [27]. When a humanoid robot is walking, squatting, or turning, its head moves in a jerky and sometimes unpredictable way [29]. ...
... (1) the vertical alignment of the humanoid chain amplifies motion effects and causes blurriness during image acquisition, (2) the periodic motions followed by the robot's CoM aggravate data association, and (3) ground impacts propagate through the robot's kinematic chain and cause trembling and jerkiness. Pretto et al. [3] proposed one of the first VO frameworks to tackle motion blurriness and localize small-size humanoid robots with monocular feature-based vision. More recently, Oriolo et al. [4] proposed an Extended Kalman Filter to fuse joint encoder, pressure, and Inertial Measurement Unit (IMU) data with the VO computed with PTAM [5]. ...
Conference Paper
Full-text available
In the current paper we investigate the challenges of localizing walking humanoid robots using Visual SLAM (VSLAM). We propose a novel dense RGB-D SLAM framework that seamlessly integrates with the dynamic state of a humanoid to provide real-time localization and dense mapping of its surroundings. Following the path of recent research in humanoid localization, in the current work we explore the integration between a VSLAM system and the humanoid state by considering the gait cycle and the foot contacts. We analyze how these effects undermine the quality of data acquisition and association for VSLAM, by capturing the unilateral ground forces at the robot's feet, and design a system that mitigates their impact. We evaluate our framework on both open- and closed-loop bipedal gaits, using a low-cost humanoid platform, and demonstrate that it outperforms kinematic odometry and state-of-the-art dense RGB-D VSLAM methods by continuously localizing the robot, even in the face of highly irregular and unstable motions.
... The degree of viewpoint variation that takes place during scene perception by a humanoid robot is far more complex than the viewpoint variations experienced by mobile robots [7]. When a humanoid robot is walking, turning, or squatting, its head-mounted camera moves in a jerky and sometimes unpredictable way [8]. Motion blur, one of the biggest problems for feature-based SLAM systems, causes inaccuracies and location losses during map construction. ...
Chapter
Full-text available
Current approaches to visual place recognition for loop closure do not provide information about the confidence of decisions. In this work we present an algorithm for place recognition based on graph-based decisions on deep embeddings and blur detections. The graph, constructed in advance, permits, together with information about the room category, an inference about the usefulness of place recognition and, in particular, enables evaluating the confidence of the final decision. We demonstrate experimentally that, thanks to the proposed blur detection, the accuracy of scene recognition is much higher. We evaluate the performance of place recognition on the basis of manually selected places for recognition, with corresponding sets of relevant and irrelevant images. The algorithm has been evaluated on a large dataset for visual place recognition that contains both images with severe (unknown) blur and sharp images. Images with 6-DOF viewpoint variations were recorded using a humanoid robot.
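The excerpt above does not spell out the blur detector itself. As a hedged sketch of how a simple blur score could gate frames before place recognition, the snippet below uses the common variance-of-the-Laplacian heuristic; the threshold value and function names are illustrative assumptions, not the chapter's actual method.

```python
import numpy as np
from scipy.ndimage import laplace

def blur_score(gray):
    """Variance of the Laplacian: low values indicate little high-frequency
    content, i.e. a likely blurred image (a common heuristic, not the
    chapter's actual detector)."""
    return float(np.var(laplace(gray.astype(np.float64))))

def accept_for_place_recognition(gray, threshold=100.0):
    """Assumed gating rule: only sufficiently sharp frames are passed on to
    the place-recognition stage."""
    return blur_score(gray) >= threshold

# Usage on a synthetic grayscale frame:
frame = np.random.rand(240, 320) * 255
print("blur score:", blur_score(frame),
      "accepted:", accept_for_place_recognition(frame))
```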
... The proposed taxonomy of SLAM-related tasks takes into account two principles: sensor-oriented and functionality-oriented. Fig. 7. Camera motion blur under walking shake (left), and LiDAR motion distortion under rapid rotation (right) [42], [43]. By sensor setup, there are five main branches of SLAM-related tasks: LiDAR-based, vision-based, vision-LiDAR fusion, RGB-D-based, and image-based. ...
Preprint
Full-text available
Due to the complicated procedure and costly hardware, Simultaneous Localization and Mapping (SLAM) has been heavily dependent on public datasets for drill and evaluation, leading to many impressive demos and good benchmark scores. However, in stark contrast, SLAM is still struggling on the way towards mature deployment, which sounds a warning: some of the datasets are overexposed, causing biased usage and evaluation. This raises the question of how to comprehensively assess the existing datasets and correctly select them. Moreover, limitations do exist in current datasets; how should new ones be built, and in which directions should they go? Nevertheless, a comprehensive survey that can tackle the above issues does not yet exist, although it is urgently demanded by the community. To fill the gap, this paper strives to cover a range of cohesive topics about SLAM-related datasets, including general collection methodology and fundamental characteristic dimensions, a taxonomy of SLAM-related tasks and dataset categorization, an introduction to the state of the art, an overview and comparison of existing datasets, a review of evaluation criteria, and analyses and discussions about current limitations and future directions, aiming not only to guide dataset selection but also to promote dataset research.
... The third systematic error source we consider is blur. Images can be blurred due to motion or objects not being in proper focus [100]. Since the amount and the characteristics of the blur depend on many different factors, many of which cannot be modeled due to missing information about the environment, the correct probability distribution of the position of a feature cannot always be determined [101]. ...
Thesis
Full-text available
With the advent of autonomous driving, the localization of mobile robots, especially without GNSS information, is becoming increasingly important. It must be ensured that the localization works robustly and timely warnings are provided if the pose estimates are too uncertain to assure a safe operation of the system. To meet these requirements, autonomous systems require reliable and trustworthy information about their environment. To improve the reliability and the integrity of information, and to be robust with respect to sensor failures, information from multiple sensors should be fused. However, this requires inter-sensor properties (e.g. the transformation between sensor coordinate systems) to be known. Naturally, neither the actual sensor measurements nor the inter-sensor properties can be determined without errors, and thus must be modeled accordingly during sensor fusion. To localize autonomous vehicles without GNSS information in 3D, this work introduces a dead reckoning approach relying on information from a camera, a laser scanner and an IMU. First, novel error models for the individual sensors are introduced. Here, the errors are assumed to be unknown but bounded, which requires bounds (i.e. intervals) that are not exceeded by the actual sensor errors to be known. However, no further assumptions are required. In particular, the error distribution within the bounds does not need to be known, which is a frequently overlooked assumption of established approaches. Furthermore, interval-based error models are compatible with unknown systematic errors and can be used to guarantee results. Second, to determine the inter-sensor properties and the corresponding uncertainties, this thesis presents new approaches for the spatiotemporal calibration between camera, laser scanner and IMU that employ the proposed error models. Third, an innovative method that considers both sensor and inter-sensor errors for guaranteed sensor fusion is proposed. The fused information is subsequently used to perform interval-based dead reckoning of a mobile robot. To evaluate the developed methods, both simulated and real data are analyzed. It becomes evident that all proposed approaches are guaranteed to enclose the true solution if the sensor error bounds are correct. Moreover, although interval-based approaches consider the “worst case”, i.e. the maximum sensor errors, the results are reasonably accurate. In particular, it can be determined in which instances a state-of-the-art method computes a result that deviates significantly from the actual solution.
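As a minimal illustration of the "unknown but bounded" error model described in the abstract, the sketch below propagates bounded speed and time-step errors through a 1-D dead-reckoning loop using interval arithmetic. The step model and the bound values are assumptions chosen for illustration, not the thesis' actual sensor or calibration models.

```python
from dataclasses import dataclass

@dataclass
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        # Interval product: take the min/max over all corner products.
        products = [a * b for a in (self.lo, self.hi) for b in (other.lo, other.hi)]
        return Interval(min(products), max(products))

# 1-D dead reckoning with bounded errors (illustrative values): measured speed
# 1.0 m/s with error bound +/- 0.05 m/s, time step 0.1 s with bound +/- 0.001 s.
# As long as the bounds hold, the true position is guaranteed to lie in the result.
position = Interval(0.0, 0.0)
speed = Interval(1.0 - 0.05, 1.0 + 0.05)
dt = Interval(0.1 - 0.001, 0.1 + 0.001)

for _ in range(100):  # 100 steps of roughly 0.1 s each
    position = position + speed * dt

print(f"position enclosure after ~10 s: [{position.lo:.3f}, {position.hi:.3f}]")
```

No assumption about the error distribution inside the bounds is needed, which mirrors the abstract's point that interval-based models remain valid under unknown systematic errors.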
... As many computer vision algorithms such as semantic segmentation, object detection, or visual odometry rely on visual input, blurry images challenge the performance of these algorithms. It is well known that many algorithms (e.g., depth prediction, feature detection, motion estimation, or object recognition) suffer from motion blur [17], [25], [26], [33]. The motion deblurring problem has thus received considerable attention in the past [7], [17], [21], [28], [32]. ...
Preprint
Motion-blurred images challenge many computer vision algorithms, e.g., feature detection, motion estimation, or object recognition. Deep convolutional neural networks are the state of the art for image deblurring. However, obtaining training data with corresponding sharp and blurry image pairs can be difficult. In this paper, we present a differentiable reblur model for self-supervised motion deblurring, which enables the network to learn from real-world blurry image sequences without relying on sharp images for supervision. Our key insight is that motion cues obtained from consecutive images yield sufficient information to inform the deblurring task. We therefore formulate deblurring as an inverse rendering problem, taking into account the physical image formation process: we first predict two deblurred images from which we estimate the corresponding optical flow. Using these predictions, we re-render the blurred images and minimize the difference with respect to the original blurry inputs. We use both synthetic and real datasets for experimental evaluations. Our experiments demonstrate that self-supervised single-image deblurring is feasible and leads to visually compelling results.
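A minimal numpy sketch of the reblur-and-compare idea described in the abstract is given below: a predicted sharp frame is warped along fractions of an estimated optical flow, the warped copies are averaged to re-render a blurred image, and the photometric difference to the observed blurry input serves as the self-supervised loss. The warping scheme, sample count, and L1 loss are simplifications assumed here; the paper's differentiable reblur model and network components are not reproduced.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warp(img, flow, t):
    """Backward-warp img by a fraction t of the optical flow
    (flow[..., 0] = dx, flow[..., 1] = dy), with bilinear interpolation."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    coords = [ys - t * flow[..., 1], xs - t * flow[..., 0]]
    return map_coordinates(img, coords, order=1, mode="nearest")

def reblur(sharp, flow, n_samples=9):
    """Approximate a motion-blurred exposure by averaging the sharp prediction
    warped along fractions of the estimated flow (a simple reblur model)."""
    ts = np.linspace(-0.5, 0.5, n_samples)
    return np.mean([warp(sharp, flow, t) for t in ts], axis=0)

def self_supervised_loss(blurry, sharp_pred, flow):
    """Photometric difference between the re-rendered blur and the original
    blurry input; no sharp ground truth is needed."""
    return float(np.mean(np.abs(reblur(sharp_pred, flow) - blurry)))

# Toy usage with a constant horizontal motion of 4 px:
sharp = np.zeros((64, 64)); sharp[:, 30:34] = 1.0   # a sharp vertical bar
flow = np.zeros((64, 64, 2)); flow[..., 0] = 4.0    # estimated flow (dx = 4)
blurry = reblur(sharp, flow)                        # stand-in for the observed blurry frame
print("reconstruction loss:", self_supervised_loss(blurry, sharp, flow))
```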
... The latter (Marziliano et al., 2004; Ferzli and Karam, 2009; Narvekar and Karam, 2011) classifies images as blurred or clear and then eliminates the blurred ones. Alberto Pretto et al. (2009) proposed a robust VO method based on SIFT (Lowe, 2004). However, only Gaussian blur is considered in that work, whereas the other three common types of blurring effects are ignored. ...