Fig 3 - uploaded by Xue Yang
The Structure of the NAO Robot

Source publication
Article
Full-text available
Talking and literary reading are important activities for children, especially for children with Autism Spectrum Disorder (ASD). We integrate these activities with NAO robots to encourage the children's willingness to communicate. Choosing books according to the conversation is one task for the NAO robot. In this paper, a novel multi-modal picture book rec...

Context in source publication

Context 1
... and behavior, children with ASD often show depression and anxiety when they interact with other people [11]. Some existing studies in the literature show that children with ASD have a great affinity with mechanical components, computers, and robots. The NAO robot was designed to look approachable and to portray emotions like a toddler. Fig. 3 shows the overall appearance of the NAO robot. Our research team has developed new features that make the NAO robot behave like a future family service robot. Additionally, some research work in the area of robotics has been conducted [12-15]. Currently, the NAO robot is already widely used in research for medical ...

Similar publications

Article
Full-text available
Individuals with Autism Spectrum Disorder (ASD) often exhibit difficulty in movement preparation and allocating attention towards different Regions of Interest (ROIs) of a visual stimulus. Though research has alluded to differences in movement preparation for aiming tasks between individuals with ASD and typically developing (TD) individuals, there...

Citations

... However, images are a form of visual data, while text is linguistic data, and they represent information with inherent differences. To bridge the gap between images and text, image-text matching technology for robots requires a deep understanding of both modalities and their seamless integration, which adds complexity to the task of feature extraction (Russell et al., 2002; Yang et al., March 2019). Furthermore, reducing the model's complexity while enhancing its representation capabilities and interpretability is a significant challenge in this context (Paolanti et al., 2019). ...
Article
Full-text available
With the rapid development of artificial intelligence and deep learning, image-text matching has gradually become an important research topic in cross-modal fields. Achieving correct image-text matching requires a strong understanding of the correspondence between visual and textual information. In recent years, deep learning-based image-text matching methods have achieved significant success. However, image-text matching requires both a deep understanding of intra-modal information and the exploration of fine-grained alignment between image regions and textual words, and how to integrate these two aspects into a single model remains a challenge. Additionally, reducing the internal complexity of the model and effectively constructing and utilizing prior knowledge are worth exploring, in order to address the excessive computational complexity of existing fine-grained matching methods and their lack of multi-perspective matching.
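The fine-grained region-word alignment this abstract refers to can be illustrated with a small sketch. Everything here is a hedged toy: the cosine-similarity matrix and the max-then-mean aggregation are one common way to score image-text pairs, not the specific model proposed in the article.

```python
import numpy as np

def cosine_matrix(regions: np.ndarray, words: np.ndarray) -> np.ndarray:
    # regions: (R, D) image-region features; words: (W, D) word features.
    r = regions / (np.linalg.norm(regions, axis=1, keepdims=True) + 1e-8)
    w = words / (np.linalg.norm(words, axis=1, keepdims=True) + 1e-8)
    return r @ w.T  # (R, W) pairwise cosine similarities

def match_score(regions: np.ndarray, words: np.ndarray) -> float:
    # Fine-grained alignment: each word attends to its best-matching region,
    # and the overall image-text score averages those per-word maxima.
    sim = cosine_matrix(regions, words)
    return float(sim.max(axis=0).mean())
```

With perfectly aligned features the score approaches 1; unrelated (orthogonal) features score near 0. Real models would learn the region and word features jointly rather than take them as given.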
... Such an inclination has been found to overlook the synergistic potential of textual and visual information. Moreover, where amalgamation has been attempted, rudimentary methods such as averaging or linear weighting dominate, often neglecting the varying importance of different features [20][21][22]. In light of these issues, the primary objective of this study was the design of a book recommendation algorithm synergizing multimodal image processing and deep learning. ...
... 6. Studies (Kumazaki, Muramatsu, Yoshikawa, Yoshimura, et al., 2019) reported that the ADOS results comparing children with ASD and TD could not be confirmed. In addition, Yang et al. (2018) mentioned that the outcomes were not validated with the ASD population. As a result, there was doubt regarding the study's validity and the data's reliability in the absence of examination of gold-standard evaluations and experiment groups. ...
... Their findings demonstrated that ASD children learn more intricately than both adults and TD children. Further, Yang et al. (2018) proposed a multimodal picture book recommendation method for ASD children capturing a live conversation between the child and the NAO Robot. They ...
Article
Full-text available
Background: A severe shortage of skilled clinicians and infrastructure limits the delivery of early intervention programmes for autism spectrum disorder (ASD) children, which are labour and duration intensive and most advantageous in the first 3 years. Aim: Assess the role of robot-mediated intervention (RMI) in the rehabilitation of ASD individuals by responding to five research questions in the areas of (1) technology maturity; (2) skill improvement areas; (3) research design, including participant demographics, datasets, intervention details, and evaluation tools; (4) data gathering, analysis, and technology contribution; and (5) the role of robots in intervention and its effectiveness. Methods: The scoping review included RMI studies for ASD individuals published in the PUBMED, SCOPUS, and IEEE Xplore databases between January 1, 2011, and December 31, 2020. The publications were evaluated using the PRISMA scoping review (PRISMA-ScR) checklist and the Critical Appraisal Skills Program (CASP). Results: The 59 selected publications demonstrated that RMI improved skills for ASD individuals in 12 areas. During RMI, extensive joint attention stimuli were given to ASD individuals, and the therapy promoted ASD children's eye contact, imitation, socio-communication, and academic skills. However, various ethical, privacy, and safety concerns were reported in the review. Conclusion: RMI can improve access, quality, and affordability in ASD intervention. The acceptance and use of the technology can be fast-tracked by (1) incorporating statistically valid study designs; (2) carrying out field trials including diverse participant groups; (3) standardizing datasets with quality parameters; (4) recruiting statistically appropriate participant groups from ASD, neurotypical (NT), and diverse developmental disorder populations; and (5) addressing ethical, privacy, safety, trust, and other stakeholder concerns.
... Another approach used to detect ASD via machine learning is to analyze interaction behavior. One such effort was made by Xue et al. (Yang et al. 2019). They proposed a system that analyzes the interaction between children and a NAO robot. ...
Article
Full-text available
Autism Spectrum Disorder (ASD) is linked with reduced ability in social behavior. Scientists working in the broader domain of cognitive sciences have done considerable research to discover the root cause of ASD, but its biomarkers are still unknown. Some studies from the domain of neuroscience have highlighted the fact that the corpus callosum and intracranial brain volume hold significant information for the detection of ASD. Taking inspiration from such findings, in this article we propose a machine learning based framework for the automatic detection of ASD using features extracted from the corpus callosum and intracranial brain volume. Our proposed framework not only achieves good recognition accuracy but also reduces the complexity of training the machine learning model by selecting the features that are most significant in terms of discriminative capability for the classification of ASD. Second, for benchmarking and to verify the potential of deep learning for analyzing neuroimaging data, we present results achieved using a transfer learning approach. For this purpose, we used the pre-trained VGG16 model for the classification of ASD.
... This robot has been useful due to its high capacity for interaction with the environment and all types of public, its naturalness, and its highly versatile programmability [17]. This set of characteristics has allowed projects developed on this robot to achieve a higher level of independence, generating therapies and treatments in which the patient with ASD and the robot can interact almost without the presence of a human therapist [18], [19], [20]. The use of more than one robot notably improves specific characteristics in patients, such as multiple interactions or joint attention [21], [22]. ...
... The NAO robot was used to develop a multimodal picture book recommendation framework based on the conversation content [26]. The proposed framework consisted of textual information extraction, image information extraction, and a multimodality information integration module. ...
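The three-module pipeline this context describes (textual information extraction, image information extraction, and multimodality integration) might be sketched as below. All of the encoders and the fusion rule here are deliberately crude, deterministic stand-ins for illustration only; they are not the framework of [26], which would use learned feature extractors.

```python
import numpy as np

DIM = 64  # feature dimensionality for this sketch

def _bucket(word: str) -> int:
    # Deterministic stand-in for a learned text encoder's vocabulary mapping.
    return sum(ord(c) for c in word) % DIM

def extract_text_features(text: str) -> np.ndarray:
    # Hypothetical textual information extraction: hashed bag of words.
    vec = np.zeros(DIM)
    for word in text.lower().split():
        vec[_bucket(word)] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-8)

def extract_image_features(image: np.ndarray) -> np.ndarray:
    # Hypothetical image information extraction: block-mean intensities.
    blocks = np.array_split(image.astype(float).ravel(), DIM)
    vec = np.array([b.mean() if b.size else 0.0 for b in blocks])
    return vec / (np.linalg.norm(vec) + 1e-8)

def integrate(text_vec: np.ndarray, image_vec: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    # Multimodality integration: a simple weighted fusion of the two vectors.
    return alpha * text_vec + (1.0 - alpha) * image_vec

def recommend(conversation: str, books) -> str:
    # books: iterable of (title, cover_image, blurb) triples.
    query = extract_text_features(conversation)
    best_title, best_score = None, -np.inf
    for title, cover, blurb in books:
        fused = integrate(extract_text_features(blurb), extract_image_features(cover))
        score = float(query @ fused)
        if score > best_score:
            best_title, best_score = title, score
    return best_title
```

The point of the sketch is only the division of labour: two per-modality extractors feed one integration step, and the fused book representation is scored against the conversation.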
Article
Full-text available
Autism Spectrum Disorder is a neurological and developmental disorder. Children diagnosed with this disorder have persistent deficits in social-emotional reciprocity, nonverbal communication, and developing, maintaining, and understanding relationships. In addition, autistic children usually have motor deficits that affect their imitation and gesture-production abilities. The present study reviews and analyzes current research findings on using robot-based and virtual-reality-based interventions to support therapy for improving the social, communication, emotional, and academic deficits of children with autism. Experimental data from the surveyed works are analyzed with respect to the target behaviors and how each technology, robot or virtual reality, was used during therapy sessions to improve the targeted behaviors. Furthermore, this study explores the different therapeutic roles that robots and virtual reality were observed to play. Finally, this study shares perspectives on the affordances and challenges of applying these technologies.
... First, such a convenient manner of interaction is conducive to the development of special eye-control devices for patients with hand disabilities, ALS, and polio, facilitating their lives and improving their quality of life. Second, fixation is advantageous for diagnosing certain mental illnesses, such as autism spectrum disorder (ASD) [38], [39] and schizophrenia spectrum disorders (SSD) [40], [41]. This task involves understanding personal fixations at the object level, which helps improve the accuracy of disease diagnosis. ...
Article
Fixations, as representations of a viewer's attention, intuitively reflect the viewer's observation procedure. The viewer's observation behavior can be further revealed by analyzing fixation features. In this paper, we propose a fixation-based personalized salient object segmentation method involving personal observation behavior learning. Concretely, we design three neural networks and deploy a meta-learning method. The first is a base segmentation network that can be converted into a meta-segmentation network by meta-learning. The meta-segmentation network can learn a viewer's observation behavior from only one sample and then generate the viewer's segmentation network to segment the other samples. Moreover, a fusion network plays an important role in alleviating an unsuitable transmission problem and generating the final segmentation result. The experimental results demonstrate the reasonableness of our observation behavior learning and the effectiveness of the three proposed neural networks.
Preprint
Full-text available
As a natural way for human-computer interaction, fixation provides a promising solution for interactive image segmentation. In this paper, we focus on Personal Fixations-based Object Segmentation (PFOS) to address issues in previous studies, such as the lack of an appropriate dataset and the ambiguity of fixation-based interaction. In particular, we first construct a new PFOS dataset by carefully collecting pixel-level binary annotation data over an existing fixation prediction dataset; such a dataset is expected to greatly facilitate research along this line. Then, considering the characteristics of personal fixations, we propose a novel network based on Object Localization and Boundary Preservation (OLBP) to segment the gazed objects. Specifically, the OLBP network utilizes an Object Localization Module (OLM) to analyze personal fixations and locate the gazed objects based on the interpretation. Then, a Boundary Preservation Module (BPM) is designed to introduce additional boundary information to guard the completeness of the gazed objects. Moreover, OLBP is organized in a mixed bottom-up and top-down manner with multiple types of deep supervision. Extensive experiments on the constructed PFOS dataset show the superiority of the proposed OLBP network over 17 state-of-the-art methods, and demonstrate the effectiveness of the proposed OLM and BPM components. The constructed PFOS dataset and the proposed OLBP network are available at https://github.com/MathLee/OLBPNet4PFOS.
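The core idea of the object localization step, picking out which object the fixations indicate, can be illustrated with a classical (non-neural) sketch: accumulate the fixations into a heatmap and keep the connected foreground component carrying the most fixation mass. This is a rough analogue of what the abstract attributes to the OLM, not the OLBP network itself.

```python
import numpy as np
from collections import deque

def fixation_heatmap(fixations, shape, sigma=2.0):
    # Accumulate a Gaussian bump at each (row, col) fixation point.
    ys, xs = np.mgrid[:shape[0], :shape[1]]
    heat = np.zeros(shape)
    for fy, fx in fixations:
        heat += np.exp(-((ys - fy) ** 2 + (xs - fx) ** 2) / (2 * sigma ** 2))
    return heat

def locate_gazed_object(mask, fixations, sigma=2.0):
    # Keep the 4-connected foreground component with the highest fixation mass.
    heat = fixation_heatmap(fixations, mask.shape, sigma)
    labels = np.zeros(mask.shape, dtype=int)
    next_label = 0
    for sy, sx in zip(*np.nonzero(mask)):           # flood-fill labelling
        if labels[sy, sx]:
            continue
        next_label += 1
        labels[sy, sx] = next_label
        q = deque([(sy, sx)])
        while q:
            y, x = q.popleft()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = next_label
                    q.append((ny, nx))
    if next_label == 0:
        return np.zeros_like(mask)
    scores = [heat[labels == k].sum() for k in range(1, next_label + 1)]
    best = 1 + int(np.argmax(scores))
    return (labels == best).astype(mask.dtype)
```

Given a binary foreground mask with two blobs and fixations landing on one of them, only that blob survives; the OLBP network replaces both the heatmap and the component selection with learned modules, and adds the BPM to sharpen boundaries.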
... However, most works focus only on recognizing certain targets involved in driving rather than providing driving suggestions. Image captioning originates from machine translation [60] and Natural Language Processing (NLP) [43], and is mainly applied in human-robot interaction [61], early childhood education [62], information retrieval [63], and visually impaired assistance [64]. In this paper, we attempt to introduce the image captioning method, an emerging technology in artificial intelligence, to ITS so as to generate high-level semantic information for driving suggestions or strategies. ...
Article
Full-text available
Traffic scene understanding is a core technology in Intelligent Transportation Systems (ITS) and Advanced Driver Assistance Systems (ADAS), and it is becoming increasingly important for smart or autonomous vehicles. Recent methods for traffic scene understanding, such as Traffic Sign Recognition (TSR), Pedestrian Detection, and Vehicle Detection, have three major shortcomings. First, most models are customized for recognizing a specific category of traffic target instead of general traffic targets. Second, these recognition modules treat traffic scene understanding as recognizing objects rather than making driving suggestions or strategies. Third, numerous independent recognition modules make it difficult to fuse multi-modal information into a comprehensive decision for driving operation in complicated traffic scenes. In this paper, we introduce an image captioning model to alleviate these shortcomings. Different from existing methods, our primary idea is to accurately identify all categories of traffic objects, understand traffic scenes by making full use of all available information, and produce suggestions or strategies for driving operation in natural language, using a Long Short-Term Memory network (LSTM) rather than keywords. The proposed solution naturally addresses the problems of feature fusion, general object recognition, and low-level semantic understanding. We tested the solution on our traffic scene image dataset, created for the evaluation of image captioning. Extensive experiments, including quantitative and qualitative comparisons, demonstrate that the proposed solution can identify more objects and produce higher-level semantic information than the state-of-the-art.
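The decoding loop behind "image features in, natural-language suggestion out" can be sketched with a single LSTM cell and greedy decoding. The vocabulary, sizes, and randomly initialised weights below are illustrative assumptions standing in for a trained model; the output text is therefore not meaningful, only the mechanics are.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB = ["<start>", "<end>", "slow", "down", "pedestrian", "ahead"]
H, E = 16, 8  # hidden and embedding sizes (illustrative)

# Randomly initialised weights stand in for a trained captioning model.
W = rng.standard_normal((4 * H, H + E)) * 0.1
b = np.zeros(4 * H)
W_out = rng.standard_normal((len(VOCAB), H)) * 0.1
embed = rng.standard_normal((len(VOCAB), E)) * 0.1

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c):
    # One LSTM cell update: input, forget, cell, and output gates.
    z = W @ np.concatenate([x, h]) + b
    i, f, g, o = np.split(z, 4)
    c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)
    return h, c

def greedy_caption(image_feature, max_len=6):
    # Condition the LSTM state on the image feature, then decode greedily.
    h, c = np.tanh(image_feature[:H]), np.zeros(H)
    token = VOCAB.index("<start>")
    words = []
    for _ in range(max_len):
        h, c = lstm_step(embed[token], h, c)
        token = int(np.argmax(W_out @ h))
        if VOCAB[token] == "<end>":
            break
        words.append(VOCAB[token])
    return " ".join(words)
```

A real system would replace the random weights with parameters trained on paired scene images and suggestion sentences, and the raw image feature with a CNN encoding of the traffic scene.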