Equipping social robots with culturally-sensitive facial expressions of
emotion using data-driven methods
Chaona Chen1,2, Laura B. Hensel2, Yaocong Duan2, Robin A. A. Ince1,
Oliver G. B. Garrod1, Jonas Beskow3, Rachael E. Jack1,2, and Philippe G. Schyns1,2
1Institute of Neuroscience and Psychology, University of Glasgow, G12 8QB, Scotland, UK
2School of Psychology, University of Glasgow, G12 8QB, Scotland, UK
3Furhat Robotics, 11428 Stockholm, Sweden
Email: rachael.jack@glasgow.ac.uk
Abstract Social robots must be able to generate realistic
and recognizable facial expressions to engage their human
users. Many social robots are equipped with standardized
facial expressions of emotion that are widely considered to be
universally recognized across all cultures. However, mounting
evidence shows that these facial expressions are not univer-
sally recognized – for example, they elicit significantly lower
recognition accuracy in East Asian cultures than they do
in Western cultures. Therefore, without culturally sensitive
facial expressions, state-of-the-art social robots are restricted
in their ability to engage a culturally diverse range of human
users, which in turn limits their global marketability. To
develop culturally sensitive facial expressions, novel data-driven
methods are used to model the dynamic face movement patterns
that convey basic emotions (e.g., happy, sad, anger) in a given
culture using cultural perception. Here, we tested whether such
dynamic facial expression models, derived in an East Asian
culture and transferred to a popular social robot, improved
the social signalling generation capabilities of the social robot
with East Asian participants. Results showed that, compared to
the social robot's existing set of ‘universal’ facial expressions,
the culturally-sensitive facial expression models are recognized
with generally higher accuracy and judged as more human-
like by East Asian participants. We also detail the specific
dynamic face movements (Action Units) that are associated with
high recognition accuracy and judgments of human-likeness,
including those that further boost performance. Our results
therefore demonstrate the utility of using data-driven methods
that employ human cultural perception to derive culturally-
sensitive facial expressions that improve the social face signal
generation capabilities of social robots. We anticipate that these
methods will continue to inform the design of social robots and
broaden their usability and global marketability.
I. INTRODUCTION
Facial expressions are widely considered to be the univer-
sal language of emotion. Based on Darwin's ground-breaking
theory on the biological origins of facial expressions of
emotion [1] and Ekman's seminal cross-cultural recognition
studies (e.g., [2]), several dominant theories in the field of
psychology have argued that six basic emotions – happy,
surprise, fear, disgust, anger and sad – are expressed and
recognized in the same way across all cultures (e.g., [2-7]).

This work was supported by The Economic and Social Research Council
and Medical Research Council (United Kingdom; ESRC/MRC-060-25-0010),
British Academy (BA SG171783), Wellcome Trust (107802/Z/15/Z)
and Multidisciplinary University Research Initiative (MURI)/Engineering
and Physical Sciences Research Council (EP/N019261/1).

Fig. 1. A. Examples of the six standardized universal facial expressions of
basic emotions and individual face movements, called Action Units (AUs).
B. Color-coded points show the average recognition accuracy of these facial
expressions in 40 locations across the world as reported in 15 previous
studies [2, 5, 6, 21-32]. Figure adapted from [33] with permission.

To represent these universal facial expressions, the field
established a set of six standardized facial expressions (see
Fig. 1A for examples), each comprising a specific pattern
of face movements called Action Units (AUs), such as Lid
Tightener (AU7) and Lip Corner Puller (AU12) [8]. These
standardized facial expression
images quickly became the gold standard in research and
consequently influenced a broad range of fields including
affective computing [see 9 for a review] and social robotics
[10-12]. For example, state-of-the-art social robots such as
Feelix [13], SAYA [14] and Furhat [15, see also 16 for a re-
view] generate facial expressions based on these standardized
Action Unit patterns.
However, mounting evidence shows that these facial ex-
pressions are not recognized well across different cultures.
Whereas all six facial expressions are recognized with com-
parably high accuracy in Western cultures, facial expressions
such as fear, disgust, and anger elicit significantly lower
accuracy in a number of other cultures [17, 18, see also
reviews of 19, 20]. To illustrate, Fig. 1B shows the recog-
nition accuracy of these standardized facial expressions in
40 locations across the world reported in 15 well-known
studies [2, 5, 6, 21-32].

Fig. 2. Data-driven, perception-based method to model culturally-sensitive dynamic facial expressions of emotion and their transference to social robotics.
A. Stimulus generation and task procedure. B. Facial expression modelling procedure. C. Transference of facial expression models to social robotics and
cultural validation.

Red represents high recognition
accuracy (i.e., >75%) and blue represents lower accuracy
(i.e., <75%) [33]. As shown by the distribution of red and
blue points, these standardized facial expressions tend to
be recognized primarily in Western cultures but less so in
Eastern cultures. These consistent cultural differences there-
fore suggest that the facial expressions widely considered
to be universal are instead more representative of Western
culture [see also reviews of 18, 34]. Indeed, findings of
substantial cultural differences in a variety of psychological
phenomena once thought to be universal are now increasing
[17, 35, 36] because most current psychological knowledge is
derived from Western (more specifically, Western, educated,
industrialized, rich and democratic – WEIRD) [37] popula-
tions and using Western-centric theories and confirmatory
methods [38]. A further limitation of these standardized
facial expressions of emotion is that they are static and
therefore do not represent the naturalistic dynamics of human
facial expressions [39]. Traditional theory-driven approaches
in psychology have therefore restricted understanding of the
specific dynamic face movement patterns that convey basic
emotions in different cultures. In turn, this has impacted
related fields such as social robotics where expressive ca-
pacity remains limited (e.g., primarily to Western cultures,
without naturalistic dynamics). For example, social robots
using these standardized universal facial expressions tend
to elicit low recognition accuracy (<50%) amongst non-
Western participants [40].
II. RELATED WORK
To better understand facial expression communication
across cultures, new data-driven methods have been used
to model the specific dynamic face movement patterns that
convey the six basic emotions in different cultures [e.g., 17,
41]. Fig. 2A-B illustrates this approach. On each experimental
trial, cultural participants view a random facial animation
generated by a facial animation platform [42] that randomly
samples a subset of Action Units (AUs) from a core set of 42
AUs. For example, in Fig. 2A, three AUs are selected – Outer
Brow Raiser (AU2) color-coded in green, Lip Corner Puller
(AU12) in blue, and Lips Part (AU25) in red. Each AU is
then independently activated with a random movement (in
Fig. 2A, see the color-coded temporal activation curve for each
AU; temporal parameters are labelled on the green curve). For
each Action Unit, we generated a time course using a cubic
Hermite spline interpolation of three 2-dimensional (time,
amplitude) control points, randomly generating six values
by sampling a uniform distribution on the interval [0,1].
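To make this concrete, the following is a minimal illustrative sketch (a simplification, not the implementation in [42]) of how six uniform samples could parameterize one AU's activation curve; the mapping of the sampled values to control points, and the omission of acceleration and deceleration shaping, are assumptions of the sketch.

```python
# Minimal sketch (not the implementation in [42]): one random AU time course
# from six uniform samples. The parameter ordering and spline construction
# below are illustrative assumptions.
import numpy as np
from scipy.interpolate import CubicHermiteSpline

rng = np.random.default_rng(1)
DURATION = 1.25  # seconds, the stimulus duration used in this study

def random_au_time_course(n_frames=30):
    # Six uniform samples on [0, 1] -> temporal parameters on the unit interval
    onset, accel, peak_amp, peak_lat, decel, offset = rng.uniform(0, 1, 6)

    # Order the three latencies so that onset < peak < offset within the animation;
    # acceleration/deceleration shaping from [42] is omitted in this sketch.
    t_on, t_peak, t_off = np.sort([onset, peak_lat, offset]) * DURATION

    # Three (time, amplitude) control points (rise to peak, return to neutral),
    # interpolated with a cubic Hermite spline; zero slopes give a smooth profile.
    times = np.array([0.0, t_on, t_peak, t_off, DURATION])
    amps = np.array([0.0, 0.0, peak_amp, 0.0, 0.0])
    spline = CubicHermiteSpline(times, amps, dydx=np.zeros_like(times))

    frames = np.linspace(0.0, DURATION, n_frames)
    return frames, np.clip(spline(frames), 0.0, 1.0)
```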
These values are then transformed on the unit interval into
temporal parameters that represent the properties of onset
latency, acceleration, peak amplitude, peak latency, decel-
eration, and offset latency, according to the rules for each
parameter (see [42] for full details). Participants view the
resulting randomly generated facial animation and classify it
according to one of six emotions (‘happy,’ ‘surprise,’ ‘fear,’
‘disgust,’ ‘anger’ or ‘sad’) and rate its intensity on a 5-point
scale (‘very weak’ to ‘very strong’). If the facial animation
does not correspond to any of these emotions, participants se-
lect ‘other.’ Therefore, each facial animation that is classified
by the participant as a particular emotion at a given intensity
contains a dynamic face movement pattern that elicits the
perception of that particular emotion in the participant. After
many such trials, a statistical relationship is built between the
stimulus information (here, dynamic Action Units) presented
on each trial and the participant's corresponding responses
(e.g., ‘happy,’ ‘very strong’) as depicted in Fig. 2B. This
procedure therefore produces a statistically robust model of
a dynamic facial expression pattern that elicits the perception
of a given emotion in a participant from the culture of interest
as demonstrated by a perceptual validation task (see [42]
for full details of the modelling procedure1). Importantly,
because these models are quantifiable representations of
facial expressions, they can be directly transferred to social
robotics to generate culturally-sensitive facial expressions, as
illustrated in Fig. 2C. Therefore, this data-driven approach of
agnostically sampling face movements and using subjective
human cultural perception to model the dynamic Action
Unit patterns that represent different emotions (or any social
category such as different smiles [43], personality traits [44],
pain and pleasure [35] and mental states [45]) in a bottom-
up manner is particularly suitable for objectively exploring
facial expressions in diverse cultures [38].
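As a rough sketch of how such a stimulus-response relationship might be computed (a simplified assumption, not the authors' modelling pipeline), one could compare each AU's rate of occurrence on trials classified as a given emotion against its base rate across all trials:

```python
# Rough sketch (simplified assumption, not the authors' pipeline): for one
# participant, flag the AUs whose presence rate on trials classified as the
# target emotion exceeds their base rate across all trials.
import numpy as np

def au_model_for_emotion(au_present, responses, emotion, z_thresh=1.96):
    """au_present: (n_trials, n_aus) binary matrix of the randomly sampled AUs.
    responses: (n_trials,) array of the participant's emotion labels.
    Returns a binary AU-presence vector for the target emotion."""
    chosen = responses == emotion
    if not chosen.any():
        return np.zeros(au_present.shape[1], dtype=int)
    base_rate = au_present.mean(axis=0)          # expected rate under random sampling
    hit_rate = au_present[chosen].mean(axis=0)   # rate on trials labelled `emotion`
    # Binomial z-approximation of whether each AU is over-represented
    se = np.sqrt(base_rate * (1 - base_rate) / chosen.sum())
    with np.errstate(divide='ignore', invalid='ignore'):
        z = (hit_rate - base_rate) / se
    return ((se > 0) & (z > z_thresh)).astype(int)
```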
Using this approach, Jack et al. [17] modelled a set of 30
dynamic facial expressions of each of the six basic emotions
using the cultural perception of 30 East Asian participants
with each model derived from an individual participant.
Comparison of these 30 individual models in each emotion
category showed high consistency across participants as
measured by the average Hamming distance: Happy: Median =
0.07 (SE = 0.002); Surprise: Median = 0.12 (SE = 0.003);
Fear: Median = 0.19 (SE = 0.003); Disgust: Median = 0.19
(SE = 0.004); Anger: Median = 0.21 (SE = 0.004) and Sad:
Median = 0.14 (SE = 0.003; see also similarity matrix in
Fig. 2 in [17]). Here, we aim to transfer these 30 culturally-
derived dynamic facial expression models to a popular social
robot head – Furhat https://www.furhatrobotics.com/ – and
examine whether they improve recognition accuracy amongst
East Asian participants compared to the social robot's exist-
ing ‘universal’ facial expressions.
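For reference, the pairwise consistency measure above can be sketched as the normalized Hamming distance between binary AU-presence vectors, under the assumption that each model is represented as a 42-element binary AU vector:

```python
# Minimal sketch (assuming each model is a 42-element binary AU-presence vector):
# pairwise consistency of the 30 models of one emotion as normalized Hamming distances.
import numpy as np
from itertools import combinations

def pairwise_hamming(models):
    """models: (n_models, n_aus) binary matrix, e.g., 30 x 42.
    Returns the normalized Hamming distance for every pair of models."""
    return np.array([np.mean(a != b) for a, b in combinations(models, 2)])
```

Summary statistics such as the medians reported above can then be taken over the resulting distribution of pairwise distances.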
III. METHOD
A. Transference of culturally-derived dynamic facial expres-
sion models to a social robot
To display the culturally-derived dynamic facial expression
models on the social robot head, we first supplemented
the social robot's existing facial movement vocabulary of
7 pre-set universal facial expressions of emotion (2 happy,
1 surprise, 1 fear, 1 disgust, 1 anger and 1 sad) with a
set of 42 individual dynamic Action Units and all of their
combinations (see [46] for full details of transforming the AU
shape deviation data to the social robot's mesh topologies).
With this development, we displayed each of the culturally-
derived dynamic facial expression models of the six basic
emotions (n = 30 models per emotion) on the social robot
head along with the existing set of standardized universal
facial expressions of emotion (n = 7).
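For intuition, a generic linear blendshape scheme of the following kind can drive a face mesh from time-varying AU activations; this is an illustrative sketch only (the actual transfer to the social robot's mesh topologies follows [46]), and all array names below are hypothetical.

```python
# Illustrative sketch only (hypothetical arrays; the actual mesh transfer follows [46]):
# a generic linear blendshape scheme that drives a face mesh from AU activations.
import numpy as np

def animate_face(neutral_vertices, au_deltas, au_activations):
    """neutral_vertices: (n_vertices, 3) neutral face mesh.
    au_deltas: (n_aus, n_vertices, 3) per-AU shape deviations at full activation.
    au_activations: (n_frames, n_aus) AU activation over time, each in [0, 1].
    Returns (n_frames, n_vertices, 3) animated vertex positions."""
    # Each frame is the neutral mesh plus the weighted sum of AU shape deviations.
    return neutral_vertices[None] + np.einsum('fa,avc->fvc', au_activations, au_deltas)
```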
In a first experiment, we asked a group of East Asian
participants to classify all of these facial expressions by
emotion (section B). In a second experiment, we asked the
same group of East Asian participants to judge their human-
likeness (section C). We blocked and counterbalanced these
two tasks across participants and describe the design and
results below.

1 The values of temporal parameters reported in this study are normalized
within their unit interval (i.e., [0,1]). Formulas (11) to (16) in [42] should
be used to transform these temporal parameter values into their real
values (e.g., time in seconds for peak latency). The Action Unit patterns and
temporal parameters of each facial expression model have been deposited
on the Open Science Framework (available at https://osf.io/nxe9q/).

For both experiments, we recruited 10 East Asian
participants (10 Chinese, 5 females, mean age 23.6 years, SD
= 2.12 years) with minimal exposure to and engagement with
other cultures as assessed by questionnaire (see [17] for an
example). All participants had normal or corrected-to-normal
vision, were free from any emotion related atypicalities (e.g.
Autism Spectrum Disorder, depression), learning difficulties
(e.g. dyslexia), synaesthesia, and disorders of face perception
(e.g. prosopagnosia) as per self-report. All participants had
a minimum International English Language Testing System
(IELTS) score of 6.0 (competent user). Each participant gave
written informed consent and received a standard rate of £6
per hour for their participation. The Ethics Committee of the
College of Science and Engineering, University of Glasgow
provided ethical approval (Ref No: 300160186).
Fig. 3. A. Recognition accuracy of culturally-derived facial expressions
and the social robot's existing facial expressions. B. Judgments of human-
likeness. In both panels, red circles represent the culturally-derived facial
expression models; blue represents the social robot's existing standardized
facial expressions. Circle size represents the number of facial expression
models (e.g., in happy, six models are recognized at 95% accuracy; in
disgust, 1 model is recognized at 25% accuracy).
B. Recognition of universal versus culturally-derived facial
expressions of emotion
On each trial, participants viewed a facial expression
displayed on the social robot head and classified it according
to one of six emotions – happy, surprise, fear, disgust, anger
or sad – in a 6-alternative forced choice task.

Fig. 4. A. Classification performance of culturally-derived facial expressions of the six basic emotions. The color-coded matrix shows the proportion of
trials on which participants classified the input facial expression as a given emotion (see the colorbar to the right). B. Color-coded face maps show the
Action Unit patterns of the models that participants classified correctly (diagonal squares) and incorrectly (off-diagonal squares). Color-coding indicates
the proportion of trials (see colorbar to the right). For example, Upper Lip Raiser Left and Right and Cheek Raiser Left are common Action Units in
disgust expressions, which likely causes the confusion of anger as disgust, as shown in A.

Each participant
viewed a total of 374 facial animations ([30 culturally-
derived facial expression models × 6 emotions + 7 existing
standardized universal facial expressions] × 2 repetitions)
presented in random order across the experiment. We pre-
sented each facial animation on one of the social robot's 7
in-built face textures (‘Default,’ ‘Male,’ ‘Female,’ ‘Obama,’
‘iRobot,’ ‘Gabriel,’ and ‘Avatar’), pseudo-randomly selected
for each participant so that each face texture appeared an
equal number of times across the experiment. We blocked all
trials by face texture and randomized the order of the blocks
for each participant. We presented each facial animation once
for a duration of 1.25 seconds. After each animation, the face
returned to a neutral expression. Participants then responded
using a Graphical User Interface (GUI) displayed on a 19-inch
flat panel Dell monitor next to the robot head. We instructed
participants to respond quickly and accurately. Following
participant response, two beeps sounded to cue participants
to prepare for the next trial. Participants then viewed the
social robot and pressed the spacebar to start the next trial.
We displayed the social robot head (size 22.5 cm × 16 cm)
in the participant's central visual field at a constant viewing
distance of 90 cm using a chin rest. The facial expressions
therefore subtended 14.25° (vertical) and 10.16° (horizontal)
of visual angle, which reflects the average size of a human
face [47] during natural social interaction [48]. We used
MATLAB 2016a to display the GUI and record participant
responses.
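For reference, these visual angles follow directly from the display size and viewing distance:

```latex
\theta_{\text{vertical}} = 2\arctan\!\left(\frac{22.5\,\text{cm}}{2 \times 90\,\text{cm}}\right) \approx 14.25^{\circ},
\qquad
\theta_{\text{horizontal}} = 2\arctan\!\left(\frac{16\,\text{cm}}{2 \times 90\,\text{cm}}\right) \approx 10.16^{\circ}.
```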
To compare the recognition accuracy of the culturally-
derived facial expression models with the social robot's
existing facial expressions, we computed the proportion of
correct responses for each facial expression model (n = 30
per emotion) and each of the social robot's existing facial
expressions (n = 7 total) by pooling the data across all
trials and participants. Fig. 3A shows the results for each
emotion. Red circles represent each culturally-derived facial
expression model; blue represents the social robot's existing
facial expressions. Circle size represents the number of facial
expression models with a specific accuracy (e.g., in happy,
6 models are recognized at 95% accuracy). As shown by
the distribution of red circles in each emotion category, the
majority of the culturally-derived facial expression models
elicited higher recognition accuracy than the social robot's
existing facial expressions. One exception is anger where
only 1 model showed higher performance than the social
robot's existing facial expression. We will explore and re-
port on this lower recognition performance later in the
manuscript.
C. Judgments of human-likeness of universal versus
culturally-derived facial expressions of emotion
Next, we compared the participants' judgments of human-
likeness for the culturally-derived facial expression models
and the social robot's existing universal facial expressions.
On each trial, we presented a pair of facial expressions
of the same emotion (e.g., happy) – one culturally-derived
facial expression and one of the social robot's existing facial
expressions – each displayed on the same face texture and
in pseudo-random sequential order. We presented each facial
expression once for a duration of 1.25 seconds, with an inter-
stimulus interval (ISI) of 0.5 seconds between them. After
displaying each pair of facial expressions, the social robot
face returned to a neutral expression. Participants indicated
which facial expression they thought looked most human-
like using a GUI displayed on a 19-inch flat panel Dell
monitor positioned next to the social robot head. Following
participant response, two beeps sounded to cue participants
to prepare for the next trial. Participants then viewed the
social robot head and pressed the spacebar to start the next
trial. We randomly assigned one of the social robot's 7 in-
built face textures to each emotion category, blocked the
trials by emotion, and randomized the order of the blocks
across the experiment for each participant. Each participant
completed a total of 420 trials ([60 pairs of happy facial
expressions + 30 pairs of facial expressions for each of
the other 5 emotions] × 2 pair orders). We used the same
viewing conditions and equipment as in section B (Recognition
of universal versus culturally-derived facial expressions of
emotion) above.
To compare the human-likeness judgments of the
culturally-derived facial expression models and the social
robot's existing facial expressions, we computed the propor-
tion of times that participants selected each facial expression
as most human-like by pooling data across all trials and
participants. Fig. 3B shows the results. As shown by the
distribution of red circles in Fig. 3B, participants consistently
judged the culturally-derived facial expression models as
more human-like than the social robot's existing standardized
universal facial expressions.
We now return to exploring the low recognition accuracy
amongst the culturally-derived facial expression models of
anger (see Fig. 3A). First, we examined the distribution
of correct and incorrect classifications as shown by the
confusion matrix in Fig. 4A. The y axis represents the
emotion of the input stimulus and the x axis represents
the participants' classification response. Each color-coded
cell of the matrix shows the proportion of trials on which
participants classified the input stimulus (e.g., facial expres-
sion model of anger) as a given emotion category (e.g.,
disgust) with proportions derived by pooling data across
all participants and trials. Squares on the diagonal show
the correct classifications; off-diagonal squares show the
incorrect classifications. Brighter colors indicate a higher
proportion of trials; darker colors indicate a lower proportion
of trials (see colorbar to the right). For the anger models, the
off-diagonal squares show that participants tended to mis-
classify these facial expression models as disgust. Similarly,
participants misclassified disgust models as anger, although
to a lower degree. To explore the potential face signalling
source of these misclassifications, we examined the Action
Units distributed across correct and incorrect responses. Fig.
4B shows the results, where each face map shows the Action
Unit patterns that participants classified correctly (diagonal
squares) and incorrectly (off-diagonal squares). Red indicates
a high proportion of trials; blue indicates a low proportion
of trials (see colorbar to the right). For anger, the off-
diagonal squares show that participants tended to misclassify
as disgust the facial expression models that comprised Action
Units that are prevalent in disgust such as the Upper Lip
Raiser, bilaterally (AU10R and AU10L), and Cheek Raiser
Left (AU6L). Similarly, participants misclassified disgust
facial expressions as anger when they contained Action Units
that are common in correctly classified anger expressions
such as Lip Pressor (AU24) and Lip Tightener (AU23).
D. Dynamic Action Units associated with high performance
We showed that the culturally-derived facial expression
models are recognized with higher accuracy and are judged
as more human-like compared to the social robot's existing
facial expressions. To identify which specific face move-
ments are associated with these improved performances, we
used an information-theoretic approach based on mutual
information (MI) [49, 50]. Specifically, MI quantifies the
relationship between two variables – here, the presence
of an Action Unit and performance of a facial expression
model (i.e., recognition accuracy or judgments of human-
likeness). High MI would indicate that an Action Unit (e.g.,
Brow Lowerer, AU4) is strongly associated with performance
(e.g., correct emotion classifications); low MI would in-
dicate a weak relationship. To identify, for each emotion,
the AUs that are strongly associated with performance, we
applied the following analysis for recognition accuracy and
human-likeness separately: We computed the MI between
each Action Unit (i.e., whether it is present or absent in the
culturally-derived facial expression model) and performance
(e.g., correct emotion classifications) by pooling the partic-
ipants' responses to the culturally-derived facial expressions
collected in B. Recognition of universal versus culturally-
derived facial expressions of emotion, resulting in 600 trials
per emotion category (30 models ×10 participants ×2
repetitions). We computed the MI for each Action Unit
except three that are present in 100% of the facial expres-
sion models – i.e., in happy, Lip Corner Puller (AU12)
and Dimpler (AU14); in surprise, Inner/Outer Brow Raiser
(AU1-2) – which therefore provide no variance to compute
MI. We established the statistical significance of high MI
values using a Monte Carlo simulation method by shuffling
the participants' responses 1000 times, computing MI for
each Action Unit at each iteration and using the random
distribution of MI values to identify the Action Units with
MI values that are significantly higher than chance (i.e.,
>95% of the distribution, uncorrected). All Action Units
with significantly high MI are displayed on color-coded face
maps in Fig. 5A for recognition accuracy and in Fig. 5B
for judgements of human-likeness. The color-coded matrices
next to the face maps indicate these Action Units in the first
column.
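A minimal sketch of this analysis (an assumption about the computation, not the authors' code) is shown below: MI is computed between a binary AU-presence variable and binary response accuracy, and significance is assessed against a permutation-based null distribution of the kind described above.

```python
# Minimal sketch (an assumption about the analysis, not the authors' code): MI between
# binary AU presence and binary accuracy, with a permutation (Monte Carlo) threshold.
import numpy as np

def mutual_info_discrete(x, y):
    """MI in bits between two discrete 1-D arrays, from their joint distribution."""
    xs, ys = np.unique(x), np.unique(y)
    joint = np.array([[np.mean((x == a) & (y == b)) for b in ys] for a in xs])
    px, py = joint.sum(axis=1, keepdims=True), joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (px @ py)[nz])))

def significant_aus(au_present, correct, n_perm=1000, alpha=0.05, seed=0):
    """au_present: (n_trials, n_aus) binary; correct: (n_trials,) binary accuracy.
    Returns a boolean mask of AUs whose MI exceeds the 95th percentile of the null."""
    rng = np.random.default_rng(seed)
    n_aus = au_present.shape[1]
    observed = np.array([mutual_info_discrete(au_present[:, j], correct)
                         for j in range(n_aus)])
    null = np.empty((n_perm, n_aus))
    for i in range(n_perm):
        shuffled = rng.permutation(correct)        # break the stimulus-response link
        null[i] = [mutual_info_discrete(au_present[:, j], shuffled)
                   for j in range(n_aus)]
    return observed > np.percentile(null, 100 * (1 - alpha), axis=0)  # uncorrected
```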
Certain Action Units could also improve recognition per-
formance based on their specific dynamic properties such
as high amplitude, early peak latency, or fast acceleration.
Fig. 5. Dynamic Action Unit patterns associated with high recognition accuracy (panel A) and high judgments of human-likeness (panel B). In each panel,
the face maps show the Action Units that are associated with high performance; the color-coded matrices also indicate any specific (unit interval) temporal
parameter values associated with performance (see legend at bottom). Action Units that further boost performance are indicated with white asterisks.
For example, panel A shows that in sad, Chin Raiser at high amplitude further boosts recognition accuracy. Panel B shows that in fear, judgments of
human-likeness are boosted by Mouth Stretch with medium peak latency.
To identify any such Action Units, we computed, for each
Action Unit separately, the MI between performance (e.g.,
correct versus incorrect emotion classifications) and each of
four main temporal parameters – amplitude, peak latency,
acceleration, and deceleration – using three levels of
temporal parameter values (high, medium, low). We estab-
lished statistical significance of high MI values for each
temporal parameter using a Monte Carlo simulation method
as described above. Action Unit dynamics with significantly
high MI are also displayed in the face maps shown in Fig.
5A for recognition accuracy and Fig. 5B for judgments of
human-likeness. Next, to specify the level of these dynamic
properties (i.e., high, medium, or low) we computed the
frequency of each level across the high-performance trials
(e.g., correct emotion classifications) and took the temporal
parameter level with the highest frequency. The results
are shown in Fig. 5A in the color-coded matrices, where
distinct colours indicate the value of each temporal parameter
significantly associated with high recognition accuracy (blue
– low [0.01, 0.4], green – medium [0.41, 0.8] and yellow
– high [0.81, 1] for the unit interval of each parameter;
see legends below). Together, these results show that for
each emotion, several specific Action Units and/or their
specific dynamic properties are strongly associated with high
recognition accuracy and judgments of human-likeness. For
example, for happy, the presence of Inner Brow Raiser (AU1)
and Cheek Raiser (AU6) is strongly associated with high
recognition accuracy. For surprise, the presence of Mouth
Stretch (AU27) is strongly associated with judgments of
human-likeness.
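The level coding described above can be sketched as follows (the bin edges are taken from the Fig. 5 legend; everything else is an illustrative assumption):

```python
# Sketch of the level coding used above: bin a unit-interval temporal parameter into
# low/medium/high and report the level most frequent on high-performance trials.
import numpy as np

def parameter_level(value):
    # Bin edges follow the legend: low [0.01, 0.4], medium [0.41, 0.8], high [0.81, 1]
    if value <= 0.40:
        return 'low'
    if value <= 0.80:
        return 'medium'
    return 'high'

def dominant_level(values_on_correct_trials):
    """Most frequent level across high-performance (e.g., correctly classified) trials."""
    labels = [parameter_level(v) for v in values_on_correct_trials]
    names, counts = np.unique(labels, return_counts=True)
    return names[int(np.argmax(counts))]
```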
E. Dynamic Action Units that further boost performance
Above, we identified the Action Units and their dynamic
properties that are strongly associated with (and therefore
important for) the correct classification of emotions and
judgments of human-likeness. As shown in Fig. 3, certain
facial expression models elicit particularly high performance.
To identify the specific face movements that further boost
performance, we first computed the MI between each Action
Unit and very high performance for each emotion separately.
High MI would indicate that an Action Unit boosts perfor-
mance. We defined very high performance as the accuracy
elicited by the top 25% of facial expression models in each
task – i.e., recognition accuracy and judgments of human-
likeness – separately. We established statistical significance
using the Monte Carlo method described above. These high-
performance Action Units are also displayed in the face maps
in Figs. 5A-B and indicated with white asterisks in the color-
coded matrices. Next, to identify and characterize the specific
dynamic Action Unit properties that boost performance, we
conducted a similar MI analysis as before. These Action
Units are also displayed in the face maps in Figs. 5A-B
with the color-coded matrices showing the specific level of
each temporal parameter. For example, in fear, Brow Lowerer
(AU4) boosts recognition accuracy, and Mouth Stretch with
medium peak latency boosts judgments of human-likeness.
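A sketch of the 'very high performance' labelling used in this analysis (assuming per-model accuracies are available as an array) is:

```python
# Sketch (assumption): label the top 25% of models per emotion as 'very high
# performance' for the boost analysis above.
import numpy as np

def very_high_performance_mask(model_accuracy):
    """model_accuracy: (n_models,) accuracy per facial expression model (e.g., 30 values).
    Returns True for models in the top quartile."""
    return model_accuracy >= np.percentile(model_accuracy, 75)
```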
IV. CONCLUSIONS
Here, we transferred a set of 30 culturally-derived dynamic
facial expression models of the six basic emotions to a pop-
ular social robot and compared the recognition accuracy and
human-likeness judgments of East Asian participants with
those of the social robot's existing standardized universal
facial expressions. Results show that these culturally-derived
dynamic facial expression models generally outperformed the
social robot's existing facial expressions on both recognition
accuracy and judgements of human-likeness. Further anal-
ysis of the facial expression models revealed that certain
Action Units and temporal dynamic properties drive high
performance on both recognition accuracy and judgements
of human-likeness. We also showed that the misclassification
of the anger facial expression models as disgust could be
due to shared Action Units – i.e., Upper Lip Raiser (AU10R
and AU10L) and Cheek Raiser Left (AU6L) – as shown
in Fig. 4B (see also, e.g., [27, 51, 52]) that could reflect
a latent signalling structure across emotion categories [17].
Identifying such common face movement patterns and those
that clearly distinguish specific emotion categories could
therefore better inform the design of social robots to further
enhance performance.
Together, our results highlight the advantage of using
culturally-sensitive dynamic facial expressions that are de-
rived from cultural perception using data-driven methods
over the theoretically-derived standardized facial expressions
of emotion that are currently in common use. We therefore
anticipate that our data-driven and user-centred approach
to modelling dynamic facial expressions will be used to
further diversify and refine the social face signal gener-
ation capabilities of social robots. Such facial expression
models could also be used, in conjunction with additional
information about cultural context, to improve the social
sensing capabilities of social robots. User-directed selection
of culturally-sensitive facial expressions in artificial
agents could also accommodate personal preferences, such as learning
culture-specific facial expressions. Together, we anticipate
that the use of data-driven approaches will further inform
the design of culturally-sensitive digital agents to improve
their performance, accessibility, and marketability within a
culturally diverse global market.
REFERENCES
[1] C. Darwin, The Expression of the Emotions in Man and Animals, 3rd
ed. London: Fontana Press, 1999/1872.
[2] P. Ekman, E. R. Sorenson, and W. V. Friesen, ”Pan-Cultural Elements
in Facial Displays of Emotion,” Science, vol. 164, pp. 86-88, April 4,
1969.
[3] P. Ekman and W. Friesen, ”A new pan-cultural facial expression of
emotion,” Motivation and Emotion, vol. 10, pp. 159-168, 1986.
[4] D. Matsumoto, ”American-Japanese Cultural Differences in the
Recognition of Universal Facial Expressions,” Journal of Cross-
Cultural Psychology, vol. 23, pp. 72-84, 1992.
[5] M. Biehl, D. Matsumoto, P. Ekman, V. Hearn, K. Heider, T. Kudoh,
et al., ”Matsumoto and Ekman's Japanese and Caucasian Facial Ex-
pressions of Emotion (JACFEE): Reliability Data and Cross-National
Differences,” Journal of Nonverbal Behavior, vol. 21, pp. 3-21, 1997.
[6] J. D. Boucher and G. E. Carlson, ”Recognition of Facial Expression
in Three Cultures,” Journal of Cross-Cultural Psychology, vol. 11, pp.
263-280, 1980.
[7] Kimiko Shimoda, M. Argyle, and P. R. Bitti, ”The intercultural
recognition of emotional expressions by three national racial groups:
English, Italian and Japanese,” European Journal of Social Psychology,
vol. 8, pp. 169-179, 1978.
[8] P. Ekman and W. V. Friesen, Manual for the Facial Action Coding
System. Palo Alto, CA: Consulting Psychologists Press, 1978.
[9] M. Pantic and L. J. M. Rothkrantz, ”Automatic analysis of facial
expressions: The state of the art,” IEEE Transactions on pattern
analysis and machine intelligence, vol. 22, pp. 1424-1445, 2000.
[10] N. Lazzeri, D. Mazzei, A. Greco, A. Rotesi, A. Lanatà, and D. E. De
Rossi, ”Can a humanoid face be expressive? A psychophysiological
investigation,” Frontiers in Bioengineering and Biotechnology, vol. 3,
p. 64, 2015.
[11] C. C. Bennett and S. Sabanovic, ”Deriving minimal features for
human-like facial expressions in robotic faces,” International Journal
of Social Robotics, vol. 6, pp. 367-381, 2014.
[12] E. G. Krumhuber and K. R. Scherer, ”The look of fear from the eyes
varies with the dynamic sequence of facial actions,” Swiss Journal of
Psychology, 2016.
[13] L. Cañamero and J. Fredslund, ”I show you how I like you – can you
read it in my face? [robotics],” IEEE Transactions on Systems, Man,
and Cybernetics – Part A: Systems and Humans, vol. 31, pp. 454-459,
2001.
[14] T. Hashimoto, S. Hiramatsu, T. Tsuji, and H. Kobayashi, ”Develop-
ment of the face robot SAYA for rich facial expressions,” in SICE-
ICASE, 2006. International Joint Conference, 2006, pp. 5423-5428.
[15] S. Al Moubayed, J. Beskow, G. Skantze, and B. Granström, ”Furhat: a
back-projected human-like robot head for multiparty human-machine
interaction,” in Cognitive Behavioural Systems, ed: Springer, 2012, pp.
114-130.
[16] T. Fong, I. Nourbakhsh, and K. Dautenhahn, ”A survey of socially
interactive robots,” Robotics and Autonomous Systems, vol. 42, pp.
143-166, 2003.
[17] R. E. Jack, O. G. B. Garrod, H. Yu, R. Caldara, and P. G. Schyns,
”Facial expressions of emotion are not culturally universal,” Proceed-
ings of the National Academy of Sciences, vol. 109, pp. 7241-7244,
2012.
[18] H. A. Elfenbein and N. Ambady, ”On the universality and cultural
specificity of emotion recognition: A meta-analysis,” Psychological
Bulletin, vol. 128, pp. 203-235, 2002.
[19] B. Mesquita and N. H. Frijda, ”Cultural variations in emotions: a
review,” Psychological Bulletin, vol. 112, pp. 179-204, Sep 1992.
[20] J. A. Russell, ”Is there universal recognition of emotion from facial
expression? A review of the cross-cultural studies,” Psychological
Bulletin, vol. 115, pp. 102-41, Jan 1994.
[21] L. Ducci, L. Arcuri, and T. Sineshaw, ”Emotion Recognition in
Ethiopia: The Effect of Familiarity with Western Culture on Accuracy
of Recognition,” Journal of Cross-Cultural Psychology, vol. 13, pp.
340-351, 1982.
[22] P. Ekman, ”Universals and cultural differences in facial expressions of
emotion,” presented at the Nebraska Symposium on Motivation, 1972.
[23] P. Ekman, W. V. Friesen, M. O'Sullivan, A. Chan, I. Diacoyanni-
Tarlatzis, K. Heider, et al., ”Universals and cultural differences in the
judgments of facial expressions of emotion,” Journal of Personality
and Social Psychology, vol. 53, pp. 712-7, Oct 1987.
[24] H. A. Elfenbein and N. Ambady, ”Universals and Cultural Differences
in Recognizing Emotions,” Current Directions in Psychological Sci-
ence, vol. 12, pp. 159-164, October 1, 2003.
[25] H. A. Elfenbein, M. Mandal, N. Ambady, S. Harizuka, and S.
Kumar, ”Hemifacial differences in the in-group advantage in emotion
recognition,” Cognition and Emotion, vol. 18, pp. 613-629, 2004.
[26] Y. Huang, S. Tang, D. Helmeste, T. Shioiri, and T. Someya, ”Dif-
ferential judgement of static facial expressions of emotions in three
cultures,” Psychiatry and clinical neurosciences, vol. 55, pp. 479-483,
2001.
[27] R. E. Jack, C. Blais, C. Scheepers, P. G. Schyns, and R. Caldara,
”Cultural confusions show that facial expressions are not universal,”
Current Biology, vol. 19, pp. 1543-1548, 2009.
[28] G. Kirouac and F. Y. Doré, ”Accuracy of the judgment of facial
expression of emotions as a function of sex and level of education,”
Journal of Nonverbal Behavior, vol. 9, pp. 3-7, 1985.
[29] G. Kirouac and F. Y. Doré, ”Accuracy and latency of judgment of
facial expressions of emotions,” Perceptual and Motor Skills, vol. 57,
pp. 683-686, 1983.
[30] D. Matsumoto and P. Ekman, ”American-Japanese cultural differences
in intensity ratings of facial expressions of emotion,” Motivation and
Emotion, vol. 13, pp. 143-157, 1989.
[31] F. T. McAndrew, ”A cross-cultural study of recognition thresholds for
facial expressions of emotion,” Journal of Cross-Cultural Psychology,
vol. 17, pp. 211-224, 1986.
[32] T. Shioiri, T. Someya, D. Helmeste, and S. W. Tang, ”Misinterpretation
of facial expression: A cross-cultural study,” Psychiatry and Clinical
Neurosciences, vol. 53, pp. 45-50, 1999.
[33] R. E. Jack, ”Culture and facial expressions of emotion,” Visual
Cognition, vol. 21, pp. 1248-1286, Sep 1 2013.
[34] N. L. Nelson and J. A. Russell, ”Universality revisited,” Emotion
Review, vol. 5, pp. 8-15, 2013.
[35] C. Chen, C. Crivelli, O. G. B. Garrod, P. G. Schyns, J.-M. Fernández-
Dols, and R. E. Jack, ”Distinct facial expressions represent pain and
pleasure across cultures,” Proceedings of the National Academy of
Sciences, 2018.
[36] C. Crivelli, J. A. Russell, S. Jarillo, and J.-M. Fernández-Dols, ”The fear
gasping face as a threat display in a Melanesian society,” Proceedings
of the National Academy of Sciences, vol. 113, pp. 12403-12407, 2016.
[37] J. Henrich, S. Heine, and A. Norenzayan, ”The weirdest people in the
world?,” Behavioral and Brain Sciences, vol. 33, pp. 61-83, 2010.
[38] R. E. Jack, C. Crivelli, and T. Wheatley, ”Data-driven methods
to diversify knowledge of human psychology,” Trends in Cognitive
Sciences, vol. 22, pp. 1-5, 2018.
[39] E. Krumhuber, A. S. Manstead, D. Cosker, D. Marshall, P. L. Rosin,
and A. Kappas, ”Facial dynamics as indicators of trustworthiness and
cooperative behavior,” Emotion, vol. 7, p. 730, 2007.
[40] G. Trovato, T. Kishi, N. Endo, K. Hashimoto, and A. Takanishi,
”A cross-cultural study on generation of culture dependent facial
expressions of humanoid social robot,” in International Conference
on Social Robotics, 2012, pp. 35-44.
[41] R. E. Jack, W. Sun, I. Delis, O. G. Garrod, and P. G. Schyns, ”Four
not six: Revealing culturally common facial expressions of emotion,”
Journal of Experimental Psychology: General, vol. 145, p. 708, 2016.
[42] H. Yu, O. G. B. Garrod, and P. G. Schyns, ”Perception-driven facial
expression synthesis,” Computers and Graphics, vol. 36, pp. 152-162,
2012.
[43] M. Rychlowska, R. E. Jack, O. G. Garrod, P. G. Schyns, J. D. Martin,
and P. M. Niedenthal, ”Functional smiles: Tools for love, sympathy,
and war,” Psychological Science, vol. 28, pp. 1259-1270, 2017.
[44] D. Gill, O. G. Garrod, R. E. Jack, and P. G. Schyns, ”Facial
movements strategically camouflage involuntary social signals of face
morphology,” Psychological Science, vol. 25, pp. 1079-1086, 2014.
[45] C. Chen, O. Garrod, P. Schyns, and R. Jack, ”The Face is the Mirror
of the Cultural Mind,” Journal of vision, vol. 15, pp. 928-928, 2015.
[46] C. Chen, O. G. Garrod, J. Zhan, J. Beskow, P. G. Schyns, and R. E.
Jack, ”Reverse Engineering Psychologically Valid Facial Expressions
of Emotion into Social Robots,” in Automatic Face and Gesture
Recognition (FG 2018), 2018 13th IEEE International Conference on,
2018, pp. 448-452.
[47] L. Ibrahimagic-Seper, A. Celebic, N. Petricevic, and E. Selimovic,
”Anthropometric differences between males and females in face di-
mensions and dimensions of central maxillary incisors,” Medicinski
glasnik, vol. 3, pp. 58-62, 2006.
[48] E. Hall, The Hidden Dimension. Garden City, NY: Doubleday, 1966.
[49] T. E. Nichols and A. P. Holmes, ”Nonparametric permutation tests
for functional neuroimaging: a primer with examples,” Human Brain
Mapping, vol. 15, pp. 1-25, 2002.
[50] T. M. Cover and J. A. Thomas, Elements of Information Theory. John
Wiley and Sons, 2012.
[51] S. Du, Y. Tao, and A. M. Martinez, ”Compound facial expressions
of emotion,” Proceedings of the National Academy of Sciences, p.
201322355, 2014.
[52] V. Shuman, E. Clark-Polner, B. Meuleman, D. Sander, and K. R.
Scherer, ”Emotion perception from a componential perspective,” Cog-
nition and Emotion, vol. 31, pp. 47-56, 2017.