HATSUKI: An anime-character-like robot figure platform with anime-style
expressions and imitation-learning-based action generation
Pin-Chu Yang1, Mohammed Al-Sada2, Chang-Chieh Chiu3, Kevin Kuo4, Tito Pradhono Tomo5,
Kanata Suzuki6, Nelson Yalta7, Kuo-Hao Shu8 and Tetsuya Ogata9
Abstract— Japanese character figurines are popular and hold a pivotal position in Otaku culture. Although numerous robots have been developed, few have focused on Otaku culture or on embodying anime character figurines. Therefore, we take the first steps to bridge this gap by developing Hatsuki, a humanoid robot platform with an anime-based design. Hatsuki's novelty lies in its aesthetic design, 2D facial expressions, and anime-style behaviors, which allow it to deliver rich interaction experiences resembling anime characters. We explain the design and implementation process of Hatsuki, followed by our evaluations. To explore user impressions of and opinions toward Hatsuki, we conducted a questionnaire at the world's largest anime-figurine event. The results indicate that participants were generally very satisfied with Hatsuki's design, and proposed various use-case scenarios and deployment contexts for Hatsuki. The second evaluation focused on imitation learning, as such a method can provide better interaction ability in the real world and generate rich, context-adaptive behavior in different situations. We made Hatsuki learn 11 actions, combining voice, facial expressions and motions, through a neural-network-based policy model with our proposed interface. The results show that our approach successfully generates the actions through self-organized contexts, which indicates the potential of generalizing our approach to further actions under different contexts. Lastly, we present our future research directions for Hatsuki and provide our conclusion.
I. INTRODUCTION
The Japanese term Otaku refers to a person who is a fan
of a specific subculture, yet the term has become almost
exclusively associated with fans of Japanese anime
(animated cartoons), manga (comics), and video games [1],
[2]. Overall, the Otaku culture has become a worldwide
phenomenon, fostering many local communities, societies
and global events revolving around related hobbies [2], [3].
An essential aspect of the Otaku culture is figurines, which
present a physical embodiment of virtual characters from
Japanese anime, manga or video games. These figurines
1Pin-Chu Yang, 2Mohammed Al-Sada, 5Tito Pradhono Tomo, 6Kanata
Suzuki, 7Nelson Yalta, and 9Tetsuya Ogata are with Waseda University,
Tokyo, Japan
1kcy.komayang@akane.waseda.jp
2alsada@dcl.cs.waseda.ac.jp
5tito@toki.waseda.jp
6suzuki.kanata@jp.fujitsu.com
7nelson.yalta@ieee.org
9ogata@waseda.jp
1Pin-Chu Yang, 3Chang-Chieh Chiu, 4Kevin Kuo, 7Nelson Yalta and
8Kuo-Hao Shu are (also) with Cutieroid Project: www.cutieroid.com
3kyumasaki@gmail.com
4freemonk6436@gmail.com
8hudmc2000@gmail.com
2Mohammed Al-Sada is also with Qatar University, Doha, Qatar
6Kanata Suzuki is also with Fujitsu Laboratories LTD, Japan
Fig. 1. Hatsuki is an interactive humanoid robot design that embodies
Japanese anime characters, allowing for various entertainment-based
interaction scenarios.
resemble the 2D facial features commonly found
in Japanese anime and manga designs. The rising global
popularity of the Otaku culture and advancements in mass
production have made figurines highly desired items among
fans of the Otaku culture worldwide.
Despite the massive popularity of this culture worldwide,
we believe robotics has made a minimal contribution to it.
There is a scarcity of research literature investigating
potential applications of robotics within the Otaku culture.
In particular, we believe that potential applications can span
beyond previously investigated applications of entertainment
robots, where they can directly contribute to business
profitability and value creation [4], similarly to figurines.
In this work, we introduce Hatsuki, a humanoid
robot that is uniquely designed to resemble 2D anime designs,
as shown in Fig. 1. Hatsuki bridges the gap between figurines
and humanoid robots through its unique design and functional
capabilities. As a platform, Hatsuki and its design process
can be used to embody various anime-like characters in
terms of aesthetics, expressions, and behavior.
Accordingly, we start by explaining the design and im-
plementation specifications of Hatsuki, followed by two
evaluations. Similar to previous approaches [5], [6], [7], we
focused our first evaluation on investigating users’ impres-
sions of Hatsuki through a survey, which was handed out to
visitors of Hatsuki’s exhibition booth at the largest figurine
exhibition in the world (Wonder Festival [8]). Results show
that participants regarded Hatsuki as a combination of
a figurine and a humanoid robot, proposed various intriguing
use cases of Hatsuki within public and private usage contexts
[9], and were generally very satisfied with Hatsuki.
Interaction in the real world is hard to predict, and creating
predefined robot actions for every situation is impossible.
Imitation learning is one approach to enabling context-adaptive
interaction in different situations; it is especially useful
in enabling the system to perform actions step by step through
a policy learned from contextual inputs (e.g., sensory
information, motor information or any internal states), and
thereby to perform various behaviors. Moreover, this approach
does not require predefining every action-state situation one
by one; instead, it learns directly from operators' experience
to generate a policy that produces more human-like behavior.
Our second evaluation focuses on performing imitation
learning for eleven expressive actions of Hatsuki, which were
acquired through kinesthetic teaching. The results indicate
that the trained neural-network-based policy successfully
generates the actions and self-organizes the context neurons
for each trajectory.
Lastly, we conclude that Hatsuki's evaluation results are
very encouraging for pursuing future work. We highlight a
number of future research directions that allow Hatsuki to
be applicable within a variety of novel interactive contexts
of use.
We summarize the main contributions of our work as
follows: 1) the design and implementation of Hatsuki, which
embodies anime-character designs in an interactive humanoid
robot; and 2) evaluation results that explore overall impressions
of Hatsuki and the applicability of imitation learning in
different interactive contexts.
II. RELATED WORKS
Our work extends three strands of related works on
Humanoid Robots, Animatronics and Entertainment Robots.
We discuss each of these domains as follows:
Humanoid robots such as Twendy-one [10] and Asimo [11]
are designed for indoor daily-life support. A subcategory of
these robots attempts to resemble realistic human designs.
For example, Geminoid [12] and Sophia [13] present very
realistic human-like appearances. Such an approach requires
comprehensive design, makeup skills and integration efforts
to craft every aesthetic detail. Hatsuki takes a different
approach, as it is based on anime-character figurines. In
addition, unlike the mentioned works that emphasize daily-life
services, Hatsuki is designed to emphasize entertainment
applications related to the Otaku culture.
The design direction of Hatsuki is similar to "animatronics",
which are electro-mechanically animated robots
that aim to mimic lifelike characters or creatures [14].
Various previous efforts presented vivid robots, such as
humans [14] or animals [15], for the entertainment industry.
Hatsuki shares similarities with works in animatronics,
yet extends them through a novel aesthetic design and
behaviors that mimic Japanese anime characters beyond
existing works.
Entertainment robots are a subcategory of robots
mainly concerned with applications like singing, dancing and
various performances [16]. For example, Kousaka Kokona is
an adult-size humanoid robot designed for entertainment, such as
singing and dancing [17]. Similarly, other robots [18] provide
similar functionalities in smaller body proportions. Although
some of the mentioned robots [17], [18] are designed
with doll-like aesthetics, they are limited; they lack
anime-like facial expressions, speech and autonomous
interactivity. On the contrary, Hatsuki advances the state
of the art through its design and interactive modalities,
like speech, facial expressions and body gestures.
Therefore, Hatsuki presents a thorough embodiment
of anime-character designs beyond previous works, thereby
providing various novel interaction potentials. The novelty
of Hatsuki can translate to profitability and create value for
consumers, in a similar fashion to previous efforts within
entertainment robots [4].
III. DESIGN AND IMPLEMENTATION OF HATSUKI
Hatsuki was designed using the outside-in approach [19],
which refers to an aesthetics-oriented design process.
For typical engineering product design, the inside-out
approach is considered easier to apply because it is
function-oriented: the functional components of the system
are designed first, followed by the aesthetic aspects. On the
contrary, with the outside-in approach we start from and
emphasize the aesthetic design of the robot, then proceed
to implement its technical/mechanical design in an iterative
fashion. Overall, this design process is iterative and combines
CAD and engineering design as well as common high-polygon
3D modeling. Accordingly, we followed the above
process to design and implement the various components
of Hatsuki. First, we discuss the appearance of Hatsuki,
followed by the facial expression system. Next, we introduce
our implementation of the mechanical and structural
components, control and action generation in Hatsuki.
A. Appearance
Fig. 2. Hatsuki extends popular anime culture character designs by
embodying a Mecha-Musume character model. Such a design direction
combines mechanical and anthropomorphic attributes into the aesthetics of
the character.
A Japanese anime character is usually drawn with simplified 2D
characteristics, which people use to distinguish the character
and which are largely favored in Otaku culture [20]. These
characteristics include hairstyle, hair color, eye shape, pupil
style, pupil color and eye-highlight style, especially for the main
protagonist of the story, who is usually designed delicately
[21]. A character who is not highly recognizable
through these characteristics is usually aided with clothing
or decoration designs that improve recognizability (e.g., a special
hairpin at a specific position). We considered all the aspects
mentioned above when designing Hatsuki.
The art style of the design follows "Mecha-Musume",
a popular category in Otaku culture. This art style refers
to a female-like character with mechanized decorations
or body parts. Blending a mechanized
design with a humanoid character helps people
recognize Hatsuki as a non-human form. The final character
design compared with the actual finished prototype is shown in
Fig. 2, and the appearance and dimensions of Hatsuki are shown
in Fig. 3.
Fig. 3. Hatsuki has short brown hair and purple eyes, is 145 cm tall,
and has 1:5.5 body proportions (head-to-body ratio) [22], resembling common
anime-character design attributes [23].
B. Facial Expression
The facial expressions of anime characters are sometimes
abstract, representing the character's state of mind directly [21].
This characteristic is important for enriching the character's
personality. Therefore, we designed a wide variety of facial
expressions that are common in anime characters. A rear-projection
system is integrated into Hatsuki to project the expressions
in 2D onto Hatsuki's face, as in Fig. 4.
We apply projection mapping to project facial expressions
onto a 3D organic face screen. Unlike [25], which attempts to apply
real human textures, Hatsuki's facial expressions are essentially
2D animations, which suffer much less from mesh-mismatch
issues.
The calibration of the projection mapping is shown in Fig. 5;
it is applied with a parameter-controllable facial texture
model to perform vivid facial expressions.
Fig. 4. Hatsuki uses a rear-projection facial expression system to provide rich
anime-like facial expressions. Unlike physical facial expression mechanisms
(e.g., [24]), our system can express a wider variety of facial
expressions without limitation.
Fig. 5. Calibration of facial expressions on the 3D organic face: (1) Trace
the 2D facial illustration and animation over the front view of the 3D face
model. (2) After creating the facial material, project the texture onto the 3D
mesh. (3) Recapture the projected texture from the projector-lens position
and output the image to the projector.
C. Mechanical Structure and Sensing
Hatsuki's body is constructed from 3D-printed PLA
(polylactic acid) parts. We chose PLA as it is lightweight
yet robust enough to withstand the weight of the various body
parts. Our current implementation focuses on the upper-torso
design and motions. Therefore, Hatsuki currently (Mk. I
version) has 17 DoFs in total, and we used a variety of
servo motors to actuate the different sections of Hatsuki (Table
I). We explain each of these sections as follows:
Head and Arms: We used three servomotors (XM430) to
actuate the head and each arm. The shoulder joints
use XM540 servos, which provide the higher torque
needed for lifting or holding objects.
Fingers: To actuate the fingers, we use Futaba S3114
RC servo motors connected directly to the Arduino Nano's
PWM (Pulse Width Modulation) pins. We implemented a
tendon-driven mechanism to achieve a human-sized hand.
Each tendon is attached to a servo horn, whose
position can be controlled by changing the PWM signal of
the Arduino Nano directly from a PC via serial communication.
Ears: We use two HobbyKing HK15148 analog servo motors
(also connected to an Arduino Nano) to actuate the ears.
Therefore, the ears can be used as an additional modality to
communicate emotions, such as happiness or sadness.

TABLE I
SERVO MOTOR SPECIFICATIONS

Specification        XM430        XM540        S3114       HK15148
Weight (g)           82           165          7.8         15
Stall torque (N·m)   4.1 (12 V)   10.6 (12 V)  0.17 (6 V)  0.2 (6 V)
Speed (rpm)          46 (12 V)    30 (12 V)    100 (6 V)   55 (6 V)
Voltage (V)          10.0–14.8    10.0–14.8    4.8–6.0     4.8–6.0
Overall, we selected the above servo motors because they
offer good trade-offs between size and torque. Moreover,
we chose the Robotis servomotors as they provide
advanced controls (such as PID) and were successfully
used in imitation learning studies [26]. Lastly, the Robotis
servo motors communicate via RS-485 (4 wires) and can
be daisy-chained through serial communication. This setup
enabled good overall cable management within
Hatsuki's confined design.
Control: The robot is controlled via a PC using an
RS-485-to-USB converter (U2D2), which allows us to
directly control the Robotis servomotors. We used two Arduino Nano
microcontrollers to control all the other servomotors; they are
connected to the PC via USB.
Power: The Robotis servomotors are powered using a
12 V power supply, while the Futaba and HobbyKing
servo motors are powered from a 6 V power supply.
Sensors: Hatsuki's head embeds an Intel RealSense D435
RGBD camera and a generic Bluetooth speaker. The Robotis
servomotors also provide feedback, including position and
output current, which enables estimating the torque applied
to the motors.
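As a hedged illustration of this control path, the following Python sketch commands one XM-series servo through the U2D2 with the Dynamixel SDK and reads back its position and current; the servo ID, port name and baud rate are assumptions:

```python
from dynamixel_sdk import PortHandler, PacketHandler

# XM-series control-table addresses (Protocol 2.0)
ADDR_TORQUE_ENABLE = 64
ADDR_GOAL_POSITION = 116
ADDR_PRESENT_CURRENT = 126
ADDR_PRESENT_POSITION = 132

port = PortHandler('/dev/ttyUSB0')   # U2D2 (RS-485 to USB)
packet = PacketHandler(2.0)
port.openPort()
port.setBaudRate(1000000)            # assumed bus speed

DXL_ID = 1                           # assumed ID of one daisy-chained servo
packet.write1ByteTxRx(port, DXL_ID, ADDR_TORQUE_ENABLE, 1)
packet.write4ByteTxRx(port, DXL_ID, ADDR_GOAL_POSITION, 2048)   # mid-range tick

pos, _, _ = packet.read4ByteTxRx(port, DXL_ID, ADDR_PRESENT_POSITION)
cur, _, _ = packet.read2ByteTxRx(port, DXL_ID, ADDR_PRESENT_CURRENT)
print(f"position={pos} ticks, current={cur} raw units")  # current ~ torque estimate
```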
D. Control Infrastructure
The main controller of Hatsuki is a gaming PC,
which allows us to use the PC for VR control or for
machine learning. The control interface is built on the game engine
"Unity3D" due to its flexibility for developing interactive
applications. The Robotis SDK and serial communication are used to
control the Robotis servos and the Arduino Nanos, respectively. Our
control architecture is modular and was implemented using multiple
Unity scenes, as shown in Fig. 6. Each scene is created
to control a specific aspect of the robot, including various
control parameters and attributes. New robot functionalities
or sensors can easily be integrated by creating
new scenes. We chose a modular implementation
of our system as it provides benefits in terms of development
and maintainability of interactive applications.
E. Action Recording
An action refers to an observable performance of the robot,
such as a motion, facial expression or audio performance.
For motion, Hatsuki provides two styles of
motion recording: kinesthetic teaching and recording through a
Fig. 6. Our control infrastructure has three main scenes: the Motor
Control Scene, the Expression Scene and the Integration Scene. The Motor Control
Scene is mainly for controlling the robot's actuators and related attributes. The
Expression Scene is for facial expression and sound-system control. The Integration
Scene combines the two mentioned scenes, and also provides a unified
structure to develop different robot applications that combine robotic motion,
voice and facial expressions.
VR device. Facial expressions and audio can be added to
the recorded trajectory to create a command synchronized
with the motion.
1) Kinesthetic Teaching: Kinesthetic teaching is a common
method for teaching a robot motor skills [27]. An
operator moves the robot's body to perform motions while the
torque supply to the actuators is cut and the encoder data are
retrieved continuously. Kinesthetic teaching can be hard to apply
to most industrial robots because of their inertia; however,
since Hatsuki is mostly built of 3D-printed material, its low
weight allows it to be moved easily.
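A minimal sketch of such a record loop, reusing the Dynamixel SDK handles from the previous snippet (the joint IDs, sampling rate and duration are assumptions):

```python
import time

JOINT_IDS = range(1, 18)        # assumed IDs of the 17 joints
RATE_HZ, DURATION_S = 20.0, 10.0

# Cut torque so the operator can move the limbs freely.
for dxl_id in JOINT_IDS:
    packet.write1ByteTxRx(port, dxl_id, ADDR_TORQUE_ENABLE, 0)

trajectory = []
t_end = time.time() + DURATION_S
while time.time() < t_end:
    frame = [packet.read4ByteTxRx(port, i, ADDR_PRESENT_POSITION)[0]
             for i in JOINT_IDS]
    trajectory.append(frame)    # one 17-dimensional joint-angle sample
    time.sleep(1.0 / RATE_HZ)
```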
2) Recording through a VR device: Hatsuki also provides
a Virtual Reality (VR) control interface based on the HTC Vive to
capture the movements of the operator and transfer them to the robot.
We use the VR devices as a motion-capture system with a limited
number of trackers to control multiple effectors (motion-generating
reference objectives) corresponding to the joints of
the robot, and generate motion from them through an inverse
kinematics (IK) algorithm. As the IK solver we use
"BioIK" [28], an evolutionary-algorithm-based method that
can create full-body, multi-objective and highly continuous
motions.
F. Implementation of Imitation Learning
A policy refers to a set of rules that describes
how an AI chooses the action to take; in this case, it is
a model or function that outputs an action
given a state input. Imitation learning can learn
policies ranging from task performance [29] to dynamic motion
generation [30]. Learning from a human
operator's experience not only yields human-like behaviors,
but is also good at interacting with environments and objects.
By applying our neural-network-based policy model, we can
generate context-dependent actions that provide a greater variety
of actions. The context can be sensory-motor information or
any designed or generated context.
The implementation of imitation learning is shown in Fig. 7;
the trained policy model is embedded into the
Integration Scene of our control infrastructure. For the
policy, we used a multiple-timescale recurrent neural
network (MTRNN) [31], owing to its powerful performance [32],
to learn the relationship between facial-sound expressions and
motor information. The MTRNN is composed of three types
of neurons: input-output (IO), fast-context (Cf), and slow-context
(Cs) neurons. The model effectively memorizes the
trained sequences as combinations of the dynamics of the Cf
and Cs neurons.
Fig. 7. Outline of training the imitation policy.
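The MTRNN update itself is compact; the following PyTorch sketch shows the leaky-integrator dynamics with per-group time constants (the unit counts and time-constant values are illustrative assumptions, not the paper's settings):

```python
import torch
import torch.nn as nn

class MTRNN(nn.Module):
    """Minimal multiple-timescale RNN with IO, fast-context (Cf) and
    slow-context (Cs) units, after Yamashita and Tani [31]."""
    def __init__(self, n_io=25, n_cf=60, n_cs=10,
                 tau_io=2.0, tau_cf=5.0, tau_cs=50.0):
        super().__init__()
        self.n_io = n_io
        n = n_io + n_cf + n_cs
        # Per-unit time constants: small tau reacts quickly, large tau slowly.
        self.tau = torch.cat([torch.full((n_io,), tau_io),
                              torch.full((n_cf,), tau_cf),
                              torch.full((n_cs,), tau_cs)])
        self.w = nn.Linear(n, n)   # full recurrent connectivity

    def step(self, x, u):
        """One step: x is the current IO input, u holds internal potentials."""
        h = torch.tanh(u)
        h = torch.cat([x, h[..., self.n_io:]], dim=-1)  # clamp IO units to input
        u = (1.0 - 1.0 / self.tau) * u + (1.0 / self.tau) * self.w(h)
        return u, torch.tanh(u)[..., :self.n_io]        # next state, predicted IO

model = MTRNN()
u = torch.zeros(1, 95)                       # 25 IO + 60 Cf + 10 Cs potentials
u, y = model.step(torch.zeros(1, 25), u)     # one closed-loop step
```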
IV. EVALUATION 1: INVESTIGATING IMPRESSIONS AND
OPINIONS TOWARD HATSUKI
A. Study Design and Procedure
Objective: To investigate the challenges and opportunities
of novel robotic platforms from the user's perspective, various
previous works have utilized user-centered design approaches
[5], [7], [33], [6] to gain insights into user impressions
and expectations of such platforms. Accordingly, our main
evaluation objective is to investigate impressions and
opinions about Hatsuki's design and to gain insights into
highly desired applications. We focused on how Hatsuki's
aesthetics, behavior and tasks are perceived by anime fans
when compared to common anime characters and figurines.
Method: We extended the questionnaires in previous works
[5], [7], [6], [34] to design a survey that measured users'
satisfaction with Hatsuki's design, as well as interaction
expectations during public and private usage contexts. As Hatsuki
mainly targets anime and figurine fans, we
carried out our survey at Wonder Festival 2020 [8], which is
one of the biggest anime-figurine conventions in the world.
We exhibited various interactive experiences of Hatsuki (Fig.
8), such as greeting visitors, talking to them and posing
for selfies. All experiences utilized various capabilities of
Hatsuki, such as speech, facial expressions and hand gestures.
Visitors were first given a chance to interact with Hatsuki
and discuss its various capabilities with four researchers.
Next, visitors were handed the survey (described below),
and the researchers followed up with visitors to ensure they
understood and answered the questions correctly. Each visitor
took around 10-15 minutes to complete the survey and was
handed a Hatsuki seal as a reward.
Survey: The survey included 16 questions, separated into
four sections. The first section included demographic
questions. To ensure the participants were within our
targeted audience, the second section included questions that
gauged how much each participant is into the anime and figurine
cultures; we asked about the number of anime figurines
they collected and the time spent on each of the mentioned
hobbies. In general, participants who are fond of this culture
would own many figurines and would dedicate time on a
daily basis to related activities.
Figurine culture has an associated collectability value [35];
it is very common for people to collect and exhibit figurines
of their favorite anime characters. Therefore, we wanted to
know whether or not Hatsuki is perceived as a figurine, and
thereby has collectability value.
The third section focused on understanding basic impressions
of Hatsuki. Accordingly, we included brainstorming
questions about the visitors' most desired use cases, whether
they perceived Hatsuki as a figurine or a robot, and their
overall satisfaction with Hatsuki. The fourth section included
a detailed rating of Hatsuki's various body locations. We
chose to focus on the mentioned aspects as they
provide insights into our design direction for Hatsuki and how it
is perceived from the viewpoint of our target audience.
Participants: We asked 51 visitors to our booth to take
our survey. Participants were aged between 20 and 51 (m=41.6,
females=2). Most participants were Japanese (33), while the
remaining 18 came from various Asian and European
countries. Participants reported spending 4.7 hours (SD=5.02) on
Otaku-culture activities (anime or figurines), and reported
owning an average of 31 anime figures (SD=36.32), with 30
participants owning 10 to 100 figures. Therefore, in addition
to being visitors to one of the largest anime-figurine conventions, we
conclude that the participants fell within our target audience.
Fig. 8. Hatsuki demonstrated at the Wonder Festival 2020 Winter event.
B. Results and Analysis
The gathered results indicate a variety of intriguing aspects
regarding participants' expectations of and impressions toward
Hatsuki. Accordingly, we classified the results based on the
aforementioned survey sections and discuss them in the
following subsections.
1) Impressions of Hatsuki's Design: We asked participants
to rate Hatsuki's aesthetic design (Fig. 9). Overall,
the results indicate that participants liked Hatsuki's design
(m=5.20, SD=1.55). Qualitative results reveal further insights
behind these ratings. Participants especially praised the
rich 2D expressions of Hatsuki's face, its hairstyle and its smooth
movements. They also liked the Mecha-Musume-based anime
Fig. 9. Participants' ratings of Hatsuki's aesthetic design, with emphasis on
specific body sections (1-7 Likert scale, 7 is best).
design and body proportions, which they thought resemble
cute anime figurines.
Participants also thought some aspects of Hatsuki should
be improved. Participants wanted Hatsuki to walk around
the exhibition area. They also criticized the
arms due to exposed wiring, thereby assigning them a slightly
lower rating. However, a one-way ANOVA with pairwise
comparisons using the Bonferroni correction turned out
negative; therefore, although the hand rating is slightly lower,
the difference is not significant. Some participants criticized
the facial-expression system, mentioning that it is too bright and a
little pixelated. We intend to address these issues through the
full implementation of a bipedal walking system in Hatsuki,
better wire management and the installation of a better projection
unit.
2) The Use Cases of Hatsuki: Participants detailed a
variety of intriguing use cases for Hatsuki. In total, we
gathered 108 use cases, and we extended the classifications in
previous works [5], [7] with further categories to classify
the use cases within public and private contexts [9], as
described below:
In private contexts, such as at home, participants proposed
a total of 49 use cases, which we classified under four main
categories. First, the majority of tasks fell under companionship
applications (25 use cases), where Hatsuki is expected to
greet users and engage in conversations about daily topics
in different contexts (e.g., during dinner or before going
to work). Daily-life assistance (15) included tasks similar
to those of common service robots, such as doing house chores
(e.g., cleaning and organizing house items) and providing
services like waking the user up and serving food items.
Participants thought Hatsuki would be useful for private performances
(5), such as singing, dancing or talk shows. Lastly, digital
assistance (4) comprised tasks like telling the latest news,
playing music and scheduling, similar to smart-home devices.
In public contexts, such as during events or conventions,
participants proposed a total of 59 use cases, which we classified
under three categories. First, public performances at events
and conventions (e.g., as entertainment robots [16]), such as
dancing, singing and posing in a cute manner, constituted
33 use cases. Interacting with users in public contexts,
such as by talking, hugging, shaking hands and playing
games (e.g., rock-paper-scissors), included 21 use cases.
Lastly, providing public services (5) included tasks like
shop-keeping, working as a receptionist and serving drinks to users.
We asked participants about the most suitable interaction
context for Hatsuki. 65.22% of the participants thought that
Hatsuki is better suited to public contexts, citing
anime events, conferences and tourist attractions as
potential deployment venues. Moreover, participants provided
many references from pop culture and anime to
give examples of proposed tasks in public contexts, like virtual
entertainment performers (e.g., Hatsune Miku [36]).
3) Discussion: The results indicate that our design direction
is highly favored, and participants provided various
insights to further enhance it. Likewise,
the proposed use cases provided insights into potential
applications and deployment contexts. Participants rated their
overall satisfaction with Hatsuki at 5.65 (SD=1.48, 7 is
best); therefore, we believe Hatsuki was generally well
received. A correlation analysis (Pearson product-moment
correlation) to understand which aspects of Hatsuki's aesthetic
ratings affected satisfaction revealed significant results for
the face (r=0.687, n=50, p<0.001). The tests for the other
body parts turned out negative. Therefore, we conclude that
Hatsuki's face was the most significant factor affecting the overall
satisfaction score, which indicates the importance of designing
robust facial features and expressions for this type of
robot.
Although some proposed applications of Hatsuki have
been investigated before (e.g., [24], [4]), we believe Hatsuki
advances the state of the art through its unique design;
Hatsuki can be designed to embody any virtual character
from anime or pop culture, thereby enabling experiences
beyond what has mainly been investigated in social robots.
For example, users' familiarity with anime and figurine
characters can serve as a pretext to initiate and carry out
various tasks. Such a pretext can be a significant factor in
establishing familiar and trustworthy interactive experiences.
Another interesting result is whether or not participants
would consider Hatsuki a figurine with robotic components.
We asked visitors to rate whether they consider Hatsuki
a robot, similar to common service or companion robots,
or a figurine (7 means Hatsuki is a figurine). The average
response was 4.05 (SD=1.59), where most participants
indicated that Hatsuki is both a figurine and a robot, since
Hatsuki is designed to resemble common figurines, yet
can move and interact with visitors. We believe this result
verifies our design direction for Hatsuki, as it confirms that
Hatsuki's design appeals to visitors to pose with,
interact with and potentially buy Hatsuki, similar to common
anime figurines.
There is a lack of female participants in our questionnaire,
which is due to the event mainly targeting male
visitors [8]. Therefore, we intend to carry out a survey
with other demographics, such as with predominantly female
participants or in other countries. We believe such a research
direction would yield deeper insights into Hatsuki's appeal
in varied target groups.
Overall, the results indicate that Otaku-culture fans highly
appreciated Hatsuki and provided a variety of desired tasks
within public and private contexts. Therefore, we are
encouraged to further use Hatsuki as a research platform
by realizing the proposed tasks and deploying Hatsuki at
upcoming Otaku events.
V. EVALUATION 2: IMITATION LEARNING
We evaluated our method from the viewpoint of an
imitation learning platform with an MTRNN policy. By using
imitation learning, the proposed method enables the robot
to generalize actions without requiring the experimenters to design
many action details. This is useful for making the character
more lifelike, with natural motions based on observed context
rather than pre-recorded ones. In this study, we performed
imitation learning with time-series data obtained using
the Hatsuki platform. Moreover, to evaluate the introduction
cost of imitation learning, we estimated the person-hour
cost, comparing our platform with the conventional
imitation learning process that requires synchronization of
multiple time-series data.
A. Experimental Setup
For the training data, we obtained eleven motion patterns by the
kinesthetic teaching described in Section III-E. The eleven
actions are the following: 1) self-introduction, 2) feeling
challenged, 3) angry, 4) annoyed, 5) confused, 6) rejection, 7)
hating something, 8) joy, 9) sad, and 10) & 11) two expressions
of agreeing with the user. Each motion pattern consists of
17-dimensional joint angles, facial expression commands
and audio commands. We converted the facial expression and
audio commands to a one-hot vector format and incorporated
them into the time-series data. The values input to the MTRNN
were scaled to [-1.0, 1.0]. We set the parameters of our model
according to the previous study [32].
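A hedged sketch of this preprocessing step (array names, command-set sizes and shapes are illustrative, not the paper's):

```python
import numpy as np

def build_sequence(joints, face_ids, audio_ids, n_face, n_audio):
    """Concatenate 17-dim joint angles with one-hot facial/audio command
    vectors, then scale every dimension to [-1.0, 1.0]."""
    face = np.eye(n_face)[face_ids]        # (T, n_face) one-hot rows
    audio = np.eye(n_audio)[audio_ids]     # (T, n_audio) one-hot rows
    seq = np.concatenate([joints, face, audio], axis=1)
    lo, hi = seq.min(axis=0), seq.max(axis=0)
    return 2.0 * (seq - lo) / np.maximum(hi - lo, 1e-8) - 1.0

# Example: 100 steps, 17 joints, 8 facial and 12 audio commands (made-up sizes).
rng = np.random.default_rng(0)
seq = build_sequence(rng.uniform(-1.5, 1.5, (100, 17)),
                     rng.integers(0, 8, 100), rng.integers(0, 12, 100), 8, 12)
```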
B. Results
Hatsuki successfully generated the sequences of actions,
including motions, facial expressions and voice expressions.
The generated actions are shown in Fig. 10, and all actions
can be found in the supplementary video. Each motion is
generated by inputting the initial value of Cs and the
initial posture of the robot to the MTRNN. The errors in the
generated trajectories are small, indicating that the model
successfully learned the high-dimensional motions. We also
visualized the internal state of the MTRNN by principal
component analysis (Fig. 11). Each color in Fig. 11 indicates
a trained sequence. We confirmed that each motion pattern
was separated and that the MTRNN can generate robot
behaviors from the initial value of Cs.
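As a hedged illustration of this visualization step (random stand-in data replaces the recorded Cs activations; all shapes are assumptions):

```python
import numpy as np
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

# Stand-in for recorded slow-context (Cs) activations:
# 11 sequences x 100 steps x 10 Cs units.
rng = np.random.default_rng(0)
cs_states = rng.normal(size=(11, 100, 10)).cumsum(axis=1)

pca = PCA(n_components=2).fit(cs_states.reshape(-1, 10))
for seq in cs_states:                      # one colored trace per action
    xy = pca.transform(seq)
    plt.plot(xy[:, 0], xy[:, 1])
plt.xlabel("PC1"); plt.ylabel("PC2")
plt.show()
```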
Fig. 10. Four samples of the motions generated by the MTRNN model.

Fig. 11. PCA of the context neurons of the eleven motion patterns.

We discuss the effectiveness of the proposed method
from the viewpoint of the reduction in the number of
work steps of imitation learning. In the conventional imitation
learning process, linking the robot controller
with the projection mapping, facial expressions and audio
performance requires the following four steps: (1) acquisition
of the robot motion via kinesthetic teaching; (2)
synchronization of the motion data, facial expressions and audio
performance; (3) training of the deep neural network; and (4)
loading the trained model into the system and synchronizing
the robot motion, the acquisition of sensor information and
the robot's expressions.
Our platform, which integrates multiple scenes, can eliminate the
cost of switching between multiple operating systems. Further, the
synchronization of motion data, facial expressions
and audio performance is simplified. Therefore, a significant
reduction in the required person-hours is expected.
VI. CONCLUSION AND FUTURE WORK
In this paper, we presented Hatsuki, a humanoid
robot designed to resemble anime figurines in terms
of aesthetics, expressions and movements. We explained our
implementation specifications and potential applications. We
carried out an evaluation to understand user impressions of
Hatsuki's design, as well as potential usage scenarios.
We also carried out an evaluation using imitation
learning, which shows that Hatsuki can successfully perform action
generation through a learned policy model.
Overall, the results are very encouraging for pursuing further
work. Moreover, we believe Hatsuki was generally liked
by visitors, who thought Hatsuki is an embodiment of
anime characters in real life. Therefore, we will focus on
investigating applications and available hardware designs for
Hatsuki within the Otaku culture worldwide. In the future, we
intend to improve our design by implementing a better grasping
hand for environment and human interaction, and a bipedal
system so the robot can walk and demonstrate full-body
movements.
Lastly, we believe that features like branching stories [37]
and dating simulators [38] are highly sought-after features
of anime-style games. Such features provide interactive and
unpredictable elements, which were also found to be greatly
liked features of entertainment robots [39]. Therefore, we
intend to realize similar features in Hatsuki. Our evaluation
results show the potential of imitation learning to provide
behavior that adapts to different interactive contexts, such as
motions, facial expressions or speech, which can be used to
provide traits similar to anime-style games. Therefore, we
intend to advance our work in this direction, using Hatsuki
as a deployment platform.
REFERENCES
[1] A. Newitz, “Anime otaku: Japanese animation fans outside japan,”
1994.
[2] M. Hills and E. G. McGregor, “Transcultural otaku: Japanese repre-
sentations of fandom and representations of japan in anime/manga fan
cultures,” 2002.
[3] A. S. Lu, "The many faces of internationalization in japanese anime,"
Animation, vol. 3, no. 2, pp. 169–187, 2008.
[4] R. Matsumura, M. Shiomi, and N. Hagita, “Does an animation
character robot increase sales?” in Proceedings of the 5th International
Conference on Human Agent Interaction, ser. HAI ’17. New York,
NY, USA: Association for Computing Machinery, 2017, p. 479–482.
[5] V. Vatsal and G. Hoffman, “Wearing your arm on your sleeve:
Studying usage contexts for a wearable robotic forearm,” in 2017
26th IEEE International Symposium on Robot and Human Interactive
Communication (RO-MAN), Aug 2017, pp. 974–980.
[6] H. Jiang, S. Lin, V. Prabakaran, M. R. Elara, and L. Sun, “A
survey of users’ expectations towards on-body companion robots,” in
Proceedings of the 2019 on Designing Interactive Systems Conference,
ser. DIS ’19. New York, NY, USA: Association for Computing
Machinery, 2019, p. 621–632.
[7] M. Al-Sada, T. Höglund, M. Khamis, J. Urbani, and T. Nakajima,
"Orochi: Investigating requirements and expectations for multipurpose
daily used supernumerary robotic limbs," in Proceedings of the 10th
Augmented Human International Conference 2019, ser. AH2019. New
York, NY, USA: Association for Computing Machinery, 2019.
[8] "About Wonder Festival" (in Japanese).
[Online]. Available: https://wonfes.jp/knowledge/about/ (Accessed: 10 March 2020).
[9] A. K. Pandey and R. Gelin, “A mass-produced sociable humanoid
robot: pepper: the first machine of its kind,” IEEE Robotics &
Automation Magazine, vol. 25, no. 3, pp. 40–48, 2018.
[10] Sugano Lab., "TWENDY-ONE." [Online]. Available: http://twendyone.com/ (Accessed: 9 March 2020).
[11] Honda Motor Co., Ltd., "ASIMO." [Online]. Available: https://www.honda.co.jp/ASIMO/ (Accessed: 9 March 2020).
[12] Hiroshi Ishiguro Laboratories, ATR, "Understanding and transmitting
human presence." [Online]. Available: http://www.geminoid.jp/en/projects.html (Accessed: 29 February 2020).
[13] J. Retto, "Sophia, first citizen robot of the world," ResearchGate,
https://www.researchgate.net, pp. 2–9, 2017.
[14] G. M. Poor and R. J. K. Jacob, “Introducing animatronics to hci:
Extending reality-based interaction,” in Human-Computer Interaction.
Interaction Techniques and Environments, J. A. Jacko, Ed. Berlin,
Heidelberg: Springer Berlin Heidelberg, 2011, pp. 593–602.
[15] Y. Terada and I. Yamamoto, “An animatronic system including lifelike
robotic fish,” Proceedings of the IEEE, vol. 92, no. 11, pp. 1814–1820,
Nov 2004.
[16] Y. Kuroki, M. Fujita, T. Ishida, K. Nagasaka, and J. Yamaguchi,
"A small biped entertainment robot exploring attractive applications,"
in 2003 IEEE International Conference on Robotics and Automation
(Cat. No.03CH37422), vol. 1, Sep. 2003, pp. 471–476.
[17] Speecys, "Product." [Online]. Available: http://robo-pro.com/speecys/products/index.html (Accessed: 29 February 2020).
[18] Asratec Corp., "SE-01." [Online]. Available: https://www.asratec.co.jp/portfolio_page/se-01/ (Accessed: 10 March 2020).
[19] K. M. Kim and K. P. Lee, “Two types of design approaches regarding
industrial design and engineering design in product design,” 11th
International Design Conference, DESIGN 2010, pp. 1795–1806,
2010.
[20] T. Matsuda, D. Kim, and T. Ishii, “An evaluation study of preferences
between combinations of 2d look shading and limited animation in 3d
computer animation,” International Journal of Asia Digital Art and
Design Association, vol. 19, no. 3, pp. 73–82, 2015.
[21] CC, "[Classic facial expressions in manga]" (in Chinese), in Manga Skill Bible 1: Basic, 1st ed. Taipei, 2012, ch. 4.2, pp. 86–92.
[22] CC, "[Understanding body proportions]" (in Chinese), in Manga Skill Bible 1: Basic, 1st ed. Taipei, 2012, ch. 5.1, pp. 94–102.
[23] P. W. Galbraith, “Moe: Exploring Virtual Potential in Post-Millennial
Japan,” electronic journal of contemporary japanese studies, no. 5,
2009.
[24] J. Nakanishi, I. Kuramoto, J. Baba, K. Ogawa, Y. Yoshikawa, and
H. Ishiguro, “Continuous hospitality with social robots at a hotel,” SN
Applied Sciences, vol. 2, 03 2020.
[25] T. Kuratate, B. Pierce, and G. Cheng, ""Mask-bot" - a life-size talking
head animated robot for AV speech and human-robot communication
research," in Proc. AVSP, 2011.
[26] H. Ito, K. Yamamoto, and T. Ogata, “Development of Integration
Method of Element Motions using Deep Learning,” The Proceedings of
JSME annual Conference on Robotics and Mechatronics (Robomec),
vol. 2018, no. 0, pp. 1A1–D09, 2018.
[27] B. Akgun, M. Cakmak, J. W. Yoo, and A. L. Thomaz, "Trajectories
and keyframes for kinesthetic teaching: A human-robot interaction
perspective," in HRI '12 - Proceedings of the 7th Annual ACM/IEEE
International Conference on Human-Robot Interaction, pp. 391–398,
2012.
[28] S. Starke, N. Hendrich, D. Krupke, and J. Zhang, "Evolutionary
multi-objective inverse kinematics on highly articulated and humanoid
robots," in IEEE International Conference on Intelligent Robots and
Systems (IROS), Sep. 2017, pp. 6959–6966.
[29] P.-C. Yang, K. Sasaki, K. Suzuki, K. Kase, S. Sugano, and T. Ogata,
“Repeatable folding task by humanoid robot worker using deep
learning,” IEEE Robotics and Automation Letters, vol. 2, no. 2, pp.
397–403, April 2017.
[30] X. B. Peng, P. Abbeel, S. Levine, and M. van de Panne, “Deepmimic:
Example-guided deep reinforcement learning of physics-based char-
acter skills,” ACM Trans. Graph., vol. 37, no. 4, pp. 143:1–143:14,
July 2018.
[31] Y. Yamashita and J. Tani, "Emergence of functional hierarchy in a
multiple timescale neural network model: A humanoid robot experiment,"
PLoS Computational Biology, vol. 4, no. 11, e1000220, 2008.
[32] K. Suzuki, H. Mori, and T. Ogata, “Motion switching with sensory
and instruction signals by designing dynamical systems using deep
neural network,” IEEE Robotics and Automation Letters, vol. 3, no. 4,
pp. 3481–3488, Oct 2018.
[33] M. A. Sada, M. Khamis, A. Kato, S. Sugano, T. Nakajima, and F. Alt,
“Challenges and opportunities of supernumerary robotic limbs,” 2017.
[34] J. P. Chin, V. A. Diehl, and K. L. Norman, “Development of an instru-
ment measuring user satisfaction of the human-computer interface,” in
Proceedings of the SIGCHI conference on Human factors in computing
systems, 1988, pp. 213–218.
[35] H. Masuda, T. Sudo, K. Rikukawa, Y. Mori, N. Ito, Y. Kameyama,
and M. Onouchi, "Anime Industry Report 2019" (in Japanese),
The Association of Japanese Animation, Tech. Rep., 2019.
[Online]. Available: https://www.spi-information.com/report/24755.html
[36] F. Greenwood, “The girl at the center of the world: Gender, genre,
and remediation in bishojo media works,” Mechademia, vol. 9, pp.
237–252, 2014.
[37] 5pb.jp, "PS Vita: Yahari Game demo Ore no Seishun (Zoku)" (in Japanese). [Online]. Available: http://5pb.jp/games/oregairu/story/ (Accessed: 10 March 2020).
[38] Konami Digital Entertainment Co., Ltd., "LovePlus EVERY official site" (in Japanese). [Online]. Available: https://www.konami.com/games/loveplus/every/ (Accessed: 10 March 2020).
[39] H. Oh, S. S. Kwak, and M. Kim, “Application of unexpectedness to the
behavioral design of an entertainment robot,” in 2010 5th ACM/IEEE
International Conference on Human-Robot Interaction (HRI), March
2010, pp. 119–120.