Fig 8 - uploaded by Ke Wang
Small pedestrian detection at three scales on the Caltech dataset (data summarized on October 1, 2019).

Contexts in source publication

Context 1
... the COCO dataset, general small object detection reaches an average precision of 0.343, compared with 0.556 for medium objects and 0.660 for large objects. This also shows the difficulty of small object detection. As for small pedestrian detection, Caltech researchers summarized the performance of detectors at three scales, shown in Fig. 8. As expected, detectors at the near scale perform best, with the lowest miss rate close to 0%. At the medium scale, however, performance drops dramatically, with the lowest miss rate at 23%. Worse still, detectors at the far scale have a poor miss rate of 60%. As analyzed above, the medium-scale ...
Context 2
... objects of 0.660. This also shows the difficulty of small object detection. As for small pedestrian detection, Caltech researchers summarized the performance of detectors at three scales, shown in Fig. 8. As expected, detectors at the near scale perform best, with the lowest miss rate close to 0%. At the medium scale, however, performance drops dramatically, with the lowest miss rate at 23%. Worse still, detectors at the far scale have a poor miss rate of 60%. As analyzed above, the medium-scale ...
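The per-scale miss rates quoted in the contexts above can be illustrated with a minimal Python sketch. It assumes the height thresholds commonly associated with the Caltech benchmark (near: 80 pixels or taller, medium: 30 to 80 pixels, far: under 30 pixels); the excerpts do not state these thresholds. The function names and the sample data are hypothetical. The sketch computes a plain miss rate (missed pedestrians divided by all pedestrians), which is simpler than the log-average miss rate usually reported for benchmark results.

```python
from collections import defaultdict

# Assumed pixel-height thresholds for the three scales (not given in the excerpts).
NEAR_MIN_HEIGHT = 80
MEDIUM_MIN_HEIGHT = 30


def scale_of(height_px):
    """Map a pedestrian's bounding-box height in pixels to a scale label."""
    if height_px >= NEAR_MIN_HEIGHT:
        return "near"
    if height_px >= MEDIUM_MIN_HEIGHT:
        return "medium"
    return "far"


def miss_rate_by_scale(ground_truth):
    """Compute the plain miss rate per scale.

    ground_truth: iterable of (height_px, detected) pairs, one per
    annotated pedestrian, where detected is True if a detector matched it.
    """
    totals = defaultdict(int)
    misses = defaultdict(int)
    for height_px, detected in ground_truth:
        scale = scale_of(height_px)
        totals[scale] += 1
        if not detected:
            misses[scale] += 1
    return {scale: misses[scale] / totals[scale] for scale in totals}


# Hypothetical annotations: (box height in pixels, was it detected?)
sample = [
    (120, True), (95, True),                # near
    (60, True), (45, False),                # medium
    (25, False), (20, False), (18, True),   # far
]
rates = miss_rate_by_scale(sample)
for scale in ("near", "medium", "far"):
    print(f"{scale}: {rates[scale]:.2f}")
```

With the hypothetical sample, the miss rate rises from the near scale to the far scale, mirroring the trend the contexts describe, although the numbers themselves are illustrative only.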
Context 3
... the interaction helps to boost acceptance and develop proper mental models when AVs are originally put into markets. More importantly, it is crucial to inform pedestrians when AVs have a failure state (Steinfeld et al., 2018). ...

Similar publications

Article
The safety of vulnerable road users is of paramount importance as transport moves towards fully automated driving. The richness of real-world data required for testing autonomous vehicles is limited and furthermore, available data do not present a fair representation of different scenarios and rare events. Before deploying autonomous vehicles publi...
Preprint
Pedestrians are arguably one of the most safety-critical road users to consider for autonomous vehicles in urban areas. In this paper, we address the problem of jointly detecting pedestrians and recognizing 32 pedestrian attributes. These encompass visual appearance and behavior, and also include the forecasting of road crossing, which is a main sa...
Article
Numerous traffic crashes occur every year on zebra crossings in China. Pedestrians are vulnerable road users who are usually injured severely or fatally during human-vehicle collisions. The development of an effective pedestrian street-crossing decision-making model is essential to improving pedestrian street-crossing safety. For this purpose, this...
Article
Predicting the future trajectories of multiple pedestrians in certain scenes has become a key task for ensuring that autonomous vehicles, socially interactive robots and other autonomous mobile platforms can navigate safely. The social interactions between people and the multimodal nature of pedestrian movement make pedestrian trajectory prediction...
Preprint
Autonomous Vehicles (AV) will transform transportation, but also the interaction between vehicles and pedestrians. In the absence of a driver, it is not clear how an AV can communicate its intention to pedestrians. One option is to use visual signals. To advance their design, we conduct four human-participant experiments and evaluate six representa...

Citations

... Yet, many safety-related unknowns need to be investigated toward deploying autonomous vehicles (AVs) with aforementioned dynamic driving tasks. As stated by Wang et al. (2020), China is the world's largest automotive market and is ambitious for AVs development. Moreover, according to Chen et al. (2023), "AD accident" is among the top three most-searched keywords in China. ...
Article
Rear-end collisions between autonomous vehicles (AVs) and human-driven vehicles (HVs) represent critical scenarios in road networks. Few studies have focused on scenarios where an HV hits an AV from behind (called an HV-AV collision). This paper aims to investigate the occurrence of HV-AV collisions in the stop-in-lane (SiL) scenario where an HV follows an AV. A humanlike brake control (HLBC) model is first proposed to simulate the driver's brake control. The HLBC model considers human driving intention, vision-based expectancy, and certain inherent characteristics of human driving to achieve dynamic humanlike braking. Additionally, the joint distribution of off-road-glance and time-headway is originally introduced to simulate the glance distraction of drivers during their dynamic vehicle control. Subsequently, we apply the HLBC model to the SiL scenario to investigate how the HV-AV collision probability changes with respect to various dynamic driving parameters. The results of the case study provide a thorough understanding of the dynamic driving conditions that lead to HV-AV collisions and pave the way for identifying practical countermeasures to improve road safety involving AVs.
... Pedestrian detection is a widely studied object detection problem that finds extensive applications in domains such as intelligent video surveillance [1], intelligent transportation [2], and autonomous driving systems [3,4]. It also serves as a fundamental technology supporting tasks like pedestrian pose estimation [5,6] and pedestrian re-identification [7,8]. ...
Article
Pedestrian detection is crucial for various applications, including intelligent transportation and video surveillance systems. Although recent research has advanced pedestrian detection models like the YOLO series, they still face limitations in handling diverse pedestrian scales, leading to performance challenges. To address these issues, we propose HF-YOLO, an advanced pedestrian detection model. HF-YOLO tackles the complexities of pedestrian detection in complex scenes by addressing scale variations and occlusions among pedestrians. In the feature fusion stage, our algorithm leverages both shallow localization information and deep semantic information. This involves fusing P2 layer features and adding a high-resolution detection layer, significantly improving the detection of small-scale pedestrians and occluded instances. To enhance feature representation, HF-YOLO incorporates the HardSwish activation function, introducing more non-linear factors and strengthening the model’s ability to represent complex and discriminative features. Additionally, to address regression imbalance, a balance factor is introduced to the CIoU loss function. This modification effectively resolves the imbalance problem and enhances pedestrian localization accuracy. Experimental results demonstrate the effectiveness of our proposed algorithm. HF-YOLO achieves notable improvements, including a 3.52% increase in average precision, a 1.35% boost in accuracy, and a 4.83% enhancement in recall. Moreover, the algorithm maintains real-time performance with a detection time of 8.5ms, meeting the stringent requirements of real-time applications.
... Despite the advancements in autonomous vehicles, there remain many challenges that must be addressed. Interacting with vulnerable road users, such as pedestrians and bicyclists, is one of the current challenges facing the development of fully autonomous driving technology in urban settings [13], [14]. These road users are more susceptible to injury in accidents [15]. ...
Preprint
Pedestrian trajectory prediction is a critical component of autonomous driving in urban environments, allowing vehicles to anticipate pedestrian movements and facilitate safer interactions. While egocentric-view-based algorithms can reduce the sensing and computation burdens of 3D scene reconstruction, accurately predicting pedestrian trajectories and interpreting their intentions from this perspective requires a better understanding of the coupled vehicle (camera) and pedestrian motions, which has not been adequately addressed by existing models. In this paper, we present a novel ego-centric pedestrian trajectory prediction approach that uses a two-tower structure and multi-modality inputs. One tower, the vehicle module, receives only the initial pedestrian position and ego-vehicle actions and speed, while the other, the pedestrian module, receives additional prior pedestrian trajectory and visual features. Our proposed action-aware loss function allows the two-tower model to decompose pedestrian trajectory predictions into two parts, caused by ego-vehicle movement and pedestrian movement, respectively, even when only trained on combined ego-view motions. This decomposition increases model flexibility and provides better estimation of pedestrian actions and intentions, enhancing overall performance. Experiments on three publicly available benchmark datasets show that our proposed model outperforms all existing algorithms in ego-view pedestrian trajectory prediction accuracy.
... Moreover, incorrect information in the map can hinder its reusability for future mapping tasks. Detecting dynamic feature points is of utmost importance, and common solutions in VSLAM are illustrated in Fig. 4. One approach involves leveraging multi-view geometry techniques or other conventional methods to identify outliers or dynamic regions with substantial residuals [20] [21]. Alternatively, deep learning methods are used for object detection and instance segmentation, facilitating the isolation and rejection of dynamic objects [22] [23] [24]. ...
Article
In automated driving, accurate localization is paramount for highly automated vehicles (HAVs) and autonomous vehicles (AVs) navigating challenging urban environments with dynamic objects, large-scale scenes, and varying lighting and weather conditions. Imprecise localization can greatly impact the subsequent decision-making that manages an HAV or AV’s motion (planning and control tasks). In recent years, visual simultaneous localization and mapping (VSLAM) has shown substantial progress, and equipping vehicles with it can help handle non-standardized situations in real-world scenes and achieve higher localization and mapping accuracy. In this article, we present a comprehensive analysis of the current research status of VSLAM and its potential application to HAVs or AVs operating in complex urban environments. We first discuss the criteria for assessing how well VSLAM methods address these challenges, which include real-time performance, accuracy, robustness, and system operating cost. By employing these assessment criteria, we evaluate various VSLAM methods in four essential aspects: rejection and tracking of highly dynamic objects, map construction in large-scale environments, loop detection and error correction, and sustainable operation and map updating. This evaluation provides valuable insights into the effectiveness of different VSLAM techniques. We then discuss potential research directions for leveraging VSLAM methods to achieve high-level automated driving in complex settings. We hope this article will serve as a timely update on recent progress and advances in VSLAM that are applicable to HAVs or AVs. To facilitate future research, we create a repository that includes links to relevant reviews and methodological papers for learning at https://github.com/bumblebee15138/VSLAM for HAVs and AVs.
... Interpersonal communication between the human driver and other fellow road users is a vital component for avoiding accidents (Stanciu et al., 2018). Road users communicate with each other to coordinate movements by exchanging gestures, facial expressions, eye contact, etc. and thus contribute to ensuring road safety, which will be absent when there is no human driver paying attention in a level 4 AV (Wang, K. et al., 2020). So in order to adapt to this human-centric behavior to operate in a human-centric environment, there should be an external HMI that will declare its status and intention as well as be able to read the other road users' intentions by monitoring body language. ...
Conference Paper
Although autonomous vehicles have come into the limelight both in practical use and in the research domain, the ongoing complexity of shared decision-making, liability, and ethics in a human-autonomy team remains unresolved. Especially in level 4 autonomous vehicles, where almost everything is done by the autonomous system, the role of humans needs to be more structured. In this paper, we discuss the safety concerns of a level 4 autonomous vehicle, keeping the human users in focus. We identify the critical safety concerns that can occur in a level 4 autonomous vehicle due to the autonomous system’s error and how humans in and out of the loop can play a role there. We discuss three major topics regarding the safety of level 4 autonomous vehicles: 1) identification of major safety issues of a level 4 autonomous vehicle, where we discuss the safety issues that can arise and whether the human user’s intervention can prevent the hazard, 2) discussion of the factors that need to be considered to build an efficient Human-Machine Interface, through which the human user will be in the loop of the operation of the autonomous system and can play a crucial role in preventing potential safety hazards caused by autonomous system error, 3) potential use of brain-machine interface technology to provide more robustness to human-autonomy teaming in an autonomous vehicle context.
... [30,31] state that policies need to intervene to reduce emissions from AVs, and [32] emphasizes that accidents may increase due to excessive trust in AVs. Ref. [33] shows that autonomous vehicles have trouble reacting to the complex pedestrian environment. Thus, FAV drivers need to pay additional attention to protect pedestrian safety. ...
... Li et al. (2019) and Raj et al. (2020) point out that resolving the problems regarding the responsibility for damages is crucial for promoting FAV use [35,36]. These problems eventually discourage people from choosing FAVs [33,36]. Therefore, Morita and Managi (2020) mention that to promote use, credibility should be guaranteed [37]. ...
Article
This study investigates the impact of environmental concerns, concerns about potential accidents, and the perceived advantages of Fully Autonomous Vehicles on individuals' willingness to buy and perceived value of these vehicles. Our research, conducted through a comprehensive survey with over 180,000 respondents in Japan and analyzed using structural equation modeling, reveals a nuanced disparity between willingness to buy and perceived value. We find that individuals concerned with natural environment conservation are more likely to purchase Fully Autonomous Vehicles due to their broader interest in societal issues and belief in the potential of new technologies like Fully Autonomous Vehicles as solutions. However, these individuals attribute a lower perceived value to these vehicles, mainly because their adoption does not directly contribute to natural environment conservation. Additionally, our results show that those recognizing the potential advantages of Fully Autonomous Vehicle technology have a higher willingness to buy and perceived value, while those with apprehensions about the technology are less likely to purchase and attribute a lower perceived value to these vehicles. This study offers vital insights for policy and planning, highlighting the complex interplay of factors influencing the willingness to buy and perceived value of Fully Autonomous Vehicles, critical for strategizing their adoption.
... for about 22%, and in some countries even as high as 65% (Wang et al., 2020). ...
Preprint
Traffic control is weak at unsignalized crosswalks, and pedestrians and motor vehicles act on their own recognition of the surrounding road conditions, environment, and degree of danger. This is fundamentally a game process of mutual compliance and obstruction. Currently, the characteristics and mechanisms of this game behavior are still insufficiently understood. In this paper, a large number of pedestrian-vehicle interaction examples at unsignalized pedestrian crossings are collected by UAV to analyze the pedestrian-vehicle interaction mode, and a comprehensive index called the Pedestrian-Vehicle Game Index (PVGI), which depicts the pedestrian-vehicle game process considering the change of motion state, is proposed. Then, the Markov-chain Monte Carlo (MCMC) method is used to identify the critical conditions for game modes. Additionally, a BN model based on the Gaussian Mixture Model (GMM) and the Expectation-Maximization (EM) algorithm is applied to model and analyze multiple games between pedestrians and vehicles. The results show that pedestrian-vehicle interaction includes 11 typical game modes in 3 categories, and there are significant differences between the interaction modes. MCMC identified the PVGI domain of the pedestrian-vehicle game as [-4.0 s, 2.0 s]. In this game interval, the game mode is divided into "pedestrian yield - vehicle dominant" and "vehicle yield - pedestrian dominant", with corresponding game intervals of [-4.0, 0] and [0, 2.0]. The Naive Bayes (NB) model for second-round game recognition based on the EM algorithm and GMM model performs better, with a total accuracy of 83.78%.
... Although these studies are imperative for comprehending the fundamental principles of pedestrian-vehicle interactions, they may not reflect the intricacies of real-world traffic situations that typically involve multiple entities with varying intentions and contexts. Moreover, in China, intersections tend to be more intricate and contain more occlusions (Wang et al., 2020) (Figure 2), further emphasizing the necessity of investigating the scalability of the eHMI approach's impact on pedestrian crossing intentions in one-to-multiple and multiple-to-multiple AV-pedestrian interaction scenarios. As a consequence, knowledge gained from interaction design in other domains can aid in advancing eHMI research. ...
... Information overload could easily lead to the inability to identify the most critical pieces of information [23,73] and, thus, pose a high level of safety risks [47,48]. Traffic efficiency may also be adversely affected in multi-lane crossing scenarios where pedestrians must process multiple cues to cross the road safely [46,89]. Holländer et al. [46] found that the individual AV signals (e.g., projected crosswalks) caused participants to stay longer in the first lane while waiting for the second lane to clear. ...
Conference Paper
Autonomous vehicles (AVs) may use external interfaces, such as LED light bands, to communicate with pedestrians safely and intuitively. While previous research has demonstrated the effectiveness of these interfaces in simple traffic scenarios involving one pedestrian and one vehicle, their performance in more complex scenarios with multiple road users remains unclear. The scalability of AV external communication has therefore attracted increasing attention, prompting the need for further investigation. This scoping review synthesises information from 54 papers to identify seven key scalability issues in multi-vehicle and multi-pedestrian environments, with Clarity of Recipients, Information Overload, and Multi-Lane Safety emerging as the most pressing concerns. To guide future research in scalable AV-pedestrian interactions, we propose high-level design directions focused on three communication loci: vehicle, infrastructure, and pedestrian. Our work contributes the groundwork and a roadmap for designing simplified, coordinated, and targeted external AV communication, ultimately improving safety and efficiency in complex traffic scenarios.
... Yet, many safety-related unknowns need to be investigated toward deploying autonomous vehicles (AVs) with aforementioned dynamic driving tasks. As stated by Wang et al. (2020), China is the world's largest automotive market and is ambitious for AVs development. Moreover, according to Chen et al. (2023), "AD accident" is among the top three most-searched keywords in China. ...
Chapter
Future roadways will carry a mix of automated vehicles (AVs) and human-driven vehicles (HVs). While many existing studies have investigated collisions where an AV hits an HV from behind, few have focused on scenarios where an HV hits an AV from behind (called HV-AV collisions). In this paper, we investigate the HV-AV collision risk in the Stop-in-Lane (SiL) scenario. To achieve this aim, a Human-like Brake (HLB) model is first proposed to simulate the driver's brake control. In particular, the joint distribution of Off-Road-Glance and Time-Headway is originally introduced to simulate the glance distraction of drivers during their dynamic vehicle control. Subsequently, a case study of HV-AV collisions in the SiL scenario of autonomous driving (AD) is conducted based on the HLB model, to reveal how the collision probability changes with respect to various parameters. The results of the case study provide an in-depth understanding of the dynamic driving conditions that lead to rear-end collisions in the SiL scenario.