Figure - uploaded by Man To Tang
Individual animation ratings, VTuber participants (N = 56).

Source publication
Conference Paper
Full-text available
VTubers are live streamers who embody computer animation virtual avatars. VTubing is a rapidly rising form of online entertainment in East Asia, most notably in Japan and China, and it has been more recently introduced in the West. However, animating an expressive VTuber avatar remains a challenge due to budget and usability limitations of current...

Contexts in source publication

Context 1
... 2.80 (1.09), 3.04 (1.10); VMagicMirror (C2): 3.55 (0.92), 3.11 (1.13), 3.22 (1.11); AlterEcho (E): 4.16 (0.89), 4.02 (1.04), 3.94 (1.08) ... "Agree"). The average individual ratings of the animations are given in Table 3. Table 4 gives the ratings for the VTuber participants. ...

Citations

... Virtual Uploader (VUP) VUP has become an international sensation, particularly in East Asia (Tang et al., 2021). In 2021, business driven by VUP in China was valued at US$17 billion (Stanford, 2022). ...
Article
In response to the rapid growth in the popularity of virtual humans, this study investigates the attitudes and perceptions of young viewers, from Generation Z in particular, toward virtual uploaders (VTubers). We qualitatively and quantitatively compared, through a data-mining approach, the online mourning directed at "demised" virtual uploaders, deceased human uploaders, and deceased human celebrities. Two salient patterns emerge. First, the mourning remarks for virtual uploaders differ considerably from those concerning human celebrities. Second, the mourning remarks for disembodied human uploaders are more consistent with those for virtual uploaders, while the remarks for embodied human uploaders are more in line with those for offline celebrities. Our findings suggest that young viewers are becoming accustomed to virtual beings in online environments and are beginning to treat humans like machines based on their similarities (here, the degree of embodiment). Young generations immersed in virtual spaces may develop different concepts of life, demise, and even humanity.
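The quantitative side of such a data-mining comparison often reduces to measuring how similar two bodies of comments are. A minimal sketch, assuming a simple bag-of-words cosine similarity (an illustrative stand-in, not necessarily the metric used in the study):

```python
# Illustrative sketch: comparing two sets of mourning remarks via
# bag-of-words cosine similarity. All names here are hypothetical.
from collections import Counter
from math import sqrt

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between word-count vectors of two texts."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(count * b[word] for word, count in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0
```

A score near 1.0 would indicate near-identical vocabulary between the two comment sets; near 0.0 indicates little overlap.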
... Considering the current use of stand-alone HMDs, the system targets photorealistic avatars and avatars with limited control rigs, such as in manga and anime. In the future, this could be controlled by machine learning, and we already know that a classifier for unique facial expressions is possible, as shown in AlterEcho: Loose Avatar-Streamer Coupling for Expressive VTubing (Tang et al.) [TZP21]. This study examines the generation of avatar facial expressions using recorded speech in multiple language environments while highlighting the limitations of expression in singing choruses. ...
... Tang et al. [TZP21] propose a solution to dynamically play animations using various sources of input such as voice, facial capture from ARKit, and personality factors. In this case, speech recognition cannot control facial expressions directly; it triggers premade animations. ...
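The trigger mechanism described in this excerpt, in which speech recognition does not drive facial expressions directly but instead fires premade animations, can be illustrated with a toy keyword lookup. All names here (ANIMATION_TRIGGERS, triggered_clips) are hypothetical, not the actual AlterEcho API:

```python
# Toy illustration: recognized words in a speech transcript trigger
# premade animation clips rather than controlling the face directly.
ANIMATION_TRIGGERS = {
    "haha": "laugh_clip",
    "wow": "surprise_clip",
    "hmm": "thinking_clip",
}

def triggered_clips(transcript: str) -> list[str]:
    """Return the premade clips whose trigger word appears in the transcript."""
    words = set(transcript.lower().split())
    return [clip for word, clip in ANIMATION_TRIGGERS.items() if word in words]
```

In a real system the transcript would come from a speech recognizer, and the returned clip names would be played back on the avatar rig.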
Preprint
This paper contributes to building a standard research and development (R&D) process for new user experiences (UX) in metaverse services. We tested this R&D process on a new UX proof of concept (PoC) for Meta Quest head-mounted displays (HMDs): a school-life karaoke experience built on the hypothesis that avatars can be designed with only the necessary functions and rendering costs. The school-life metaverse is a relevant subject for discovering issues and problems in this type of simultaneous multi-user connection. To qualitatively evaluate the potential of a multi-person metaverse experience, this study investigated scenarios in which each avatar requires expressive skills. Avatar play experiences feature artistic expressions such as dancing, playing musical instruments, and drawing, which can be used to qualitatively evaluate operability and expressive capability; however, the Quest's tracking capabilities are insufficient for full-body performance and graphical art expression. Considering these hardware limitations, this study evaluated the Quest with a focus on UX simplicity, using AI Fusion techniques, and on expressiveness in instrumental scenes played by approximately four avatars. This research reports methods for multi-user metaverse communication and its supporting technologies, such as head-mounted devices and their graphics performance, special interaction techniques, and complementary tools, as well as the importance of PoC development, its evaluation, and its iterations. The results are notable for further research: these expressive technologies in a multi-user context are directly related to the quality of communication within the metaverse and the value of the user-generated content (UGC) produced there.
... It raises questions about the types of usage, areas of focus, and core benefits of XR applications. In Table I, Vtubing identified by a respondent refers to using avatars as a replacement for using an actual image of a streamer on YouTube, in which motion capture or keyboard input provides the image's animation [96], [97]. Most respondents chose the need to have users experience an active viewing environment (76.6%). ...
Article
Full-text available
XR provides benefits in innovation, competitiveness, and sustainability that offset disruptions in, and enhance, physical reality. The Caribbean's metaverse evolution started before the pandemic with the development of XR projects and creatives' NFTs. The physical isolation during the COVID-19 pandemic accelerated the Caribbean's interest in the metaverse and XR. In 2020, only 83 participants from Trinidad and Tobago entered the CARIRI AR/VR Challenge to demonstrate their XR ideas, so there is a need to encourage and accelerate regional XR development. The purpose of this research is to explore Caribbean XR developers' experiences to provide an understanding of the factors affecting their XR development. This paper addresses the question: What factors of influence will encourage the development of XR projects in the Caribbean to advance its metaverse development? Online questionnaires issued to Caribbean XR developers from July to December 2021 obtained responses from 77 participants across 13 regional countries. The primary data were statistically insignificant and skewed towards two countries (Jamaica and Trinidad & Tobago). Comparative and inferential analyses identified factors of influence, industry sectors, and design foci. The originality of this research is an XR development strategy that incorporates I4.0, UX, and financial strategies. It establishes the XR project design foci (the user, the purpose, and the location); the factors-of-influence minimum criteria and the relevant industry sector(s) shape each design focus. An initial reference list of industry sectors is education (the preferred option), healthcare, tourism, culture, manufacturing for export, construction, entertainment, game development, agriculture, and environmental protection. The strategy's value is in enabling content creators to design XR applications that meet consumers' needs and increase the regional adoption of XR.
The impact of the research on the Caribbean is to facilitate a path to the regional metaverse evolution. This research identified the need for a regional XR development policy.
... Anyone worldwide can watch live streaming through YouTube, Twitch, Bilibili, and TikTok. These platforms allow anyone with a webcam and an internet connection to become a streamer, broadcasting live content and interacting with online viewers in real time [3]. Recent advances in motion capture and artificial intelligence technologies have enabled streamers to represent their appearances with 2D or 3D animated avatars. ...
... Recent advances in motion capture and artificial intelligence technologies have enabled streamers to represent their appearances with 2D or 3D animated avatars. Such streamers embodying computer-animated avatars are often called virtual streamers (Vtubers) [3]. They have rapidly gained international popularity since their debut in 2016 [4], are highly sought after by young people, and have produced many excellent achievements. ...
... They have rapidly gained international popularity since their debut in 2016 [4], are highly sought after by young people, and have produced many notable achievements. For example, the Vtuber Gawr Gura surpassed 1 million subscribers in only 41 days on YouTube [3]. A Chinese entertainment company, Yuehua Entertainment, created a virtual idol group called "A-SOUL," which set a record of 12 million in popularity value in a single live stream [5]. ...
Article
Full-text available
Vtubing (Vtuber live streaming, where a Vtuber is a virtual streamer) is an emerging communication practice with many research gaps, including a lack of analysis of its psychological attributes. Through in-depth qualitative interviews, this study comprehensively explores the key psychological attributes of viewers when watching Vtubing, including perceived persona attractiveness, perceived appearance attractiveness, perceived voice attractiveness, perceived reliability, perceived anthropomorphism, immersion, psychological distance, and imagination. This study provides suggestions for Vtuber owners to design and manage Vtubers, with both theoretical and practical significance.
... Platforms like YouTube and Twitch allow anyone with a webcam and an Internet connection to become a streamer, sharing their life experiences and creative content while engaging with an online audience in real-time [14]. More recently, advances in motion capture and computer animation have empowered streamers to represent themselves with virtual avatars without revealing their real self [18]. ...
Article
Niche preference communities on social media are gradually increasing, along with related research, but there are few studies on the motivations of viewers of live-streaming-oriented Vtubers. In this study, five motivational factors were set up, using time spent as a measure of platform use, and valid data were collected through an interview and survey study of members of Generation Z who watch Vtuber live streams on the YouTube platform. Comparing and analyzing the survey results revealed the main motivation that helps explain participation in Vtuber live streaming: social anxiety. Compared with typical live-stream viewers, Vtuber watchers are more prone to social phobia in real life and show a stronger dependence on the host. In addition, the appearance of the Vtuber is one of the main motivations attracting viewers. These results identify the main motivations of Vtuber viewers and lay a foundation for future research.
... In addition to appearance, research has found that anchors of different genders have different effects on audiences' emotions, including during clothing brand live broadcasts [17] and other fields. Gender differences, including the impact of gender differences on live anchors and virtual images, have become popular in image research in recent years [18,19]. ...
... On the other hand, the non-humanoid AI news anchors were taken from the famous cartoon image of a Japanese Vtuber, which is a category of anchors who use virtual images to conduct activities on video sites. This kind of AI non-humanoid characters originated in Japan; thus, their appearances are highly animated and resemble Japanese cartoon characters [17]. ...
Article
Full-text available
Since the concept of artificial intelligence was introduced in 1956, AI technology has been gradually applied in various fields, including journalism. This paper focuses on research related to AI news anchors, and two correlated experiments are applied to examine audiences’ perceived attractiveness of AI news anchors from a psychological dimension. Study 1 focuses on the different variables that influence the behavioral willingness of AI news anchor viewers, while Study 2 focuses on the mediating and moderating variables that influence audiences’ psychological changes. The results indicate that non-humanoid female AI news anchors who use anthropomorphic voices to broadcast news obtain the highest perceived attractiveness among audiences. Additionally, the mediating effect of perceived attractiveness and the negative moderating effect on the inherent impression of traditional news anchors are both verified in the study. Based on the research findings, the implications and suggestions are addressed accordingly.
... Expressive avatar animation has benefited from the advancement of data-driven deep learning and motion capture technologies, which have a wide range of applications in virtual reality (VR) fields. In particular, human-driven animated avatars with expressive motions are in high demand to act as virtual characters in telecommunication, virtual try-on, 3D games, or entertainment production, replacing time-consuming and expensive manual animation production [1]. ...
... We design the user study in terms of realistic animation, fluency, and the user's preference of the animated avatars. We model the evaluations and experience quantitatively as 5 levels, from 1 ("No satisfaction") to 5 ("High satisfaction"), following the previous work [1]. ...
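A 5-point scale like the one described in this excerpt is typically summarized per condition by a mean and a standard deviation, which is also how the table excerpt earlier on this page reads. A minimal sketch using Python's standard library:

```python
# Minimal sketch: summarizing 5-point ratings (1 = "No satisfaction"
# to 5 = "High satisfaction") as a mean and sample standard deviation.
from statistics import mean, stdev

def summarize_ratings(ratings):
    """Mean and sample SD of a list of 1-5 ratings, rounded to 2 decimals."""
    return round(mean(ratings), 2), round(stdev(ratings), 2)
```

For example, a condition rated [4, 5, 4, 3, 4] summarizes to a mean of 4.0 with an SD of 0.71.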
... A virtual YouTuber, or "Vtuber", is a fictional character in YouTube videos and live streams. These are 3D models that most commonly exist in the digital form and are typically associated with some voice to provide vocal performances [7]. With 3 million subscribers on YouTube, Kizuna AI (Fig. 2) is one of the most famous Vtubers. ...
Article
Full-text available
Influencers are people on social media who distinguish themselves by their high number of followers and their ability to influence other users. While influencers are a long-standing phenomenon in social media, Virtual Influencers have appeared on such platforms only recently: they are Computer-Generated Imagery (CGI) characters that act and resemble humans, even though they do not physically exist in the real world. This recent phenomenon has sparked interest in society, and several questions arise regarding their evolution, opinions, ethics, purpose in marketing, and future perspectives. In this article, we conduct an exhaustive review of the virtual influencer phenomenon. Through an extensive study of the literature, press articles, social platform data, blogs, and interviews, we give a comprehensive reflection on Virtual Influencers. Starting from their evolution, we analyze their opportunities and threats. We provide detailed information about the most popular ones and their marketing collaborations, with a comparative analysis of virtual and real (human) influencers. Moreover, we conducted an online survey to grasp people's perspectives. From the 360 participants' answers, we draw conclusions about Virtual Influencers' ethics, importance, overall feelings, and future. Results show controversial opinions on this recent phenomenon.
... The scanning phase will allow AR frameworks to collect useful environment information, which will further be used as input for a lighting estimation module to output lighting information [43]. The environment lighting information, often represented in the form of environment map [13], will then be used by rendering frameworks to overlay the virtual objects either in a user-specified world position [57] or a position based on tracking results [47]. Finally, each video frame augmented with virtual objects can be piped to existing streaming software such as OBSStudio [11] to use services like Twitch. ...
... Figure 2 presents an overview of the AR streaming workflow where reflective rendering can lead to creators' physical environment information being recovered by viewers. AR streaming, as we defined previously, is an emerging and popular form for indie streamers to reach out to followers via platforms such as Tiktok, YouTube, and Twitch [33,47]. We assume that streamers are using existing AR software to create videos with seamlessly overlayed virtual objects. ...
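The AR streaming workflow excerpted above (environment scanning, lighting estimation, environment-map-based rendering of virtual objects, then piping augmented frames to streaming software) can be sketched as a chain of stages. Every function and data shape below is a hypothetical placeholder, not a real AR framework or OBS API:

```python
# Hypothetical sketch of the AR streaming workflow described above:
# scan environment -> estimate lighting -> render virtual overlay -> stream.

def scan_environment(frame):
    # Stand-in for the scanning phase that collects environment information.
    return {"geometry": "plane", "frame": frame}

def estimate_lighting(scene_info):
    # Stand-in for a lighting-estimation module producing an environment map.
    return {"environment_map": "equirect_hdr", "intensity": 0.8}

def render_overlay(frame, lighting, anchor="user_specified"):
    # Stand-in for overlaying a virtual object lit by the estimated map.
    return f"{frame}+virtual_object(anchor={anchor}, light={lighting['intensity']})"

def process_frame(frame):
    scene = scan_environment(frame)
    lighting = estimate_lighting(scene)
    # The augmented frame would then be piped to streaming software.
    return render_overlay(frame, lighting)
```

Each augmented frame returned by `process_frame` would, in a real pipeline, be handed to software such as OBS Studio for delivery to a service like Twitch.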
Preprint
Full-text available
Many augmented reality (AR) applications rely on omnidirectional environment lighting to render photorealistic virtual objects. When the virtual objects consist of reflective materials, such as a metallic sphere, the required lighting information to render such objects can consist of privacy-sensitive information that is outside the current camera view. In this paper, we show, for the first time, that accuracy-driven multi-view environment lighting can reveal out-of-camera scene information and compromise privacy. We present a simple yet effective privacy attack that extracts sensitive scene information such as human face and text information from the rendered objects, under a number of application scenarios. To defend against such attacks, we develop a novel IPC²S defense and a conditional R² defense. Our IPC²S defense, used in conjunction with a generic lighting reconstruction method, preserves the scene geometry while obfuscating the privacy-sensitive information. As a proof-of-concept, we leverage existing OCR and face detection models to identify text and human faces from past camera observations and blur the color pixels associated with detected regions. We evaluate the visual quality impact of our defense by comparing rendered virtual objects to ones rendered with a generic multi-lighting reconstruction technique, ARKit, and the R² defense. Our visual and quantitative results demonstrate that our defense leads to structurally similar reflections with up to 0.98 SSIM score across a variety of rendering scenarios while preserving sensitive information by reducing the automatic extraction success rate to at most 8.8%.
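The proof-of-concept defense described in this abstract, blurring or overwriting pixels in detected sensitive regions, reduces to a small region-masking routine once detection (OCR, face models) has produced bounding boxes. A minimal sketch on a plain 2D pixel list, with boxes as hypothetical (row, col, height, width) tuples:

```python
# Proof-of-concept sketch: overwrite the pixels inside each detected
# sensitive region so reflections no longer reveal them. The image is a
# plain 2D list of pixel values; all names here are illustrative.

def obfuscate(image, regions, fill=0):
    """Return a copy of `image` with every pixel in each region set to `fill`."""
    out = [row[:] for row in image]
    for r, c, h, w in regions:
        for i in range(r, min(r + h, len(out))):
            for j in range(c, min(c + w, len(out[i]))):
                out[i][j] = fill
    return out
```

A real implementation would blur rather than zero the region and operate on camera observations before lighting reconstruction, but the masking step is the same idea.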
... Virtual characters and agents digitally represent humans in a variety of contexts, such as in computer games and motion pictures, as virtual tutors [44], streamers [59], medical practitioners [8], and 3D virtual assistants 1 . These characters are important for standalone and interconnected ("metaverse") 3D virtual worlds [16], which in addition to entertainment, are expected to accommodate future virtual currencies, businesses, jobs, laws and properties [6]. ...
Conference Paper
The portrayed personality of virtual characters and agents is understood to influence how we perceive and engage with digital applications. Understanding how the features of speech and animation drive portrayed personality allows us to intentionally design characters to be more personalized and engaging. In this study, we use performance capture data of unscripted conversations from a variety of actors to explore the perceptual outcomes associated with the modalities of speech and motion. Specifically, we contrast full performance-driven characters to those portrayed by generated gestures and synthesized speech, analysing how the features of each influence portrayed personality according to the Big Five personality traits. We find that processing speech and motion can have mixed effects on such traits, with our results highlighting motion as the dominant modality for portraying extraversion and speech as dominant for communicating agreeableness and emotional stability. Our results can support the Extended Reality (XR) community in development of virtual characters, social agents and 3D User Interface (3DUI) agents portraying a range of targeted personalities.