Conference Paper

The Use of Deep and Machine Learning for Face Expression Recognition: A Literature Review

Abstract

In the current era, Artificial Intelligence (AI) and Computer Vision play a major role in people's daily activities. Face expression has become an interesting topic to explore. Face expression recognition or detection can be applied in many areas, such as student focus detection based on face expression, intruder detection systems, lie detection and many more. Given the usefulness of face expression detection, much research has focused on methodologies to detect or recognize face expressions. Several approaches are used, such as Deep Learning, Machine Learning and statistical methods. This paper focuses on identifying the best approach or method to recognize face expressions by using the Systematic Literature Review (SLR) process. From hundreds of retrieved papers with similar interest, 6 papers in total fulfil our requirements based on title, result and methods. Subsequently, every research paper was reviewed and compared with the others to find the best approaches proposed and the datasets used.
Gusti Pangestu*
Computer Science Department, BINUS
Graduate Program - Doctor of
Computer Science
Computer Science Department,
School of Computer Science
Bina Nusantara University
Jakarta, Indonesia 11480
gusti.pangestu@binus.ac.id
Harco Leslie Hendric Spits Warnars
Computer Science Department, BINUS
Graduate Program - Doctor of
Computer Science
Bina Nusantara University
Jakarta, Indonesia 11480
Spits.hendric@binus.ac.id
Ford Lumban Gaol
Computer Science Department, BINUS
Graduate Program - Doctor of
Computer Science
Bina Nusantara University
Jakarta, Indonesia 11480
fgaol@binus.edu
Benfano Soewito
Computer Science Department, BINUS
Graduate Program - Doctor of
Computer Science
Bina Nusantara University
Jakarta, Indonesia 11480
bsoewito@binus.edu
I. INTRODUCTION
Advances in technology, especially in computational
power and in the availability of sensing techniques and
engineering, are leading us into a new era. These advances
have also made computers more intelligent [1]. As the
capabilities of computers grow, the interaction between
humans and computers keeps improving. Computers have
become one of the most widely used tools for human-related
tasks. Facial expressions are one of the most frequently
chosen research focuses, and a considerable amount of
research has focused on reading human facial expressions.
Many products make use of human facial expressions, in
fields ranging from education [2] and health [3] to social
aspects [4][5]. These three fields make the most use of the
results of facial expression analysis.
There are quite a number of methods for detecting facial
expressions. Many researchers use an image processing
approach to detect and recognize facial expressions. One of
the methods used by researchers is Neural Networks,
including Convolutional Neural Networks (CNN) and
conventional Neural Networks. Given the many models and
choices of Neural Network architecture, it is necessary to
investigate further how successful and accurate each Neural
Network model is.
In this paper, we discuss the methods and approaches
for the facial expression recognition process, especially
those using Neural Networks. The review is carried out in
several steps, which are discussed in detail in the Method
section. This research also examines the results of several
studies related to the use of Neural Networks to recognize
facial expressions. The final result of this research is a
concluding analysis that is expected to help other
researchers determine the most appropriate architecture for
their case study, especially in detecting facial expressions.
II. METHOD
There are three main steps for reaching the objective:
planning the review, conducting and managing the review,
and reporting and analyzing the review [6].
These three main steps are expanded into the stages
shown in Fig. 1. The stage mechanism was adopted from the
literature review research written by Huda et al. [7].
Each sub-section of this section explains one of the steps
shown in Fig. 1. There are 10 steps in total, starting from
defining the Research Question [8], [9] and concluding with
the result of the selected papers [10]. The process is
arranged sequentially, but in practice it is not strictly
sequential, because specific parts can be repeated to obtain
the best results and conclusion.
A. Defining Research Questions
Defining the Research Question is the most important
step in the research because it guides the methodology,
methods, data and instruments of the research [11].
In this research, there are 3 Research Questions (RQ)
that represent the needs and goals of this research. The
RQs were defined using the FINERMAPS approach
introduced by [12]. The Research Questions are shown in
Fig. 2.
B. Convert RQ to Keywords
After the Research Questions (RQ) were defined, the next
step is to convert the RQ into keywords. The keywords are
needed to find related research in manuscript databases
such as Scopus, Web of Science, etc. In this research,
according to the problem defined in the previous section,
the keywords obtained from the translation of the RQ are
shown in Fig. 3.
C. Define the Inclusion and Exclusion criteria
The keywords will return a lot of data, and not all of it
will be included in this research. Therefore, we need a
mechanism to filter the results down to the data we need.
In this stage, the inclusion and exclusion criteria were
applied to produce deeper knowledge [13]. The inclusion and
exclusion criteria are shown in Table I.
D. Define the Consulted Database
There are many research paper databases, such as Scopus,
Web of Science, IEEE, Springer, Elsevier, etc. These
databases contain a huge number of indexed manuscripts.
Among these databases, this research uses two: Scopus and
Google Scholar. The choice of these databases is based on
the access that we have. Furthermore, the Scopus database
contains many high-quality, curated research papers [14].
However, the inclusion criteria stated above require papers
from 2018-2022, and Scopus does not index new papers as
quickly as we expected. Therefore, we use the Google
Scholar database to obtain the latest research papers,
especially those published in 2022.
E. Retrieve the paper
Retrieving the papers relies on the inclusion and
exclusion criteria. The first result, without the inclusion
and exclusion criteria, contains a large number of papers
because the query has no limitation.
The second result of retrieved papers is the result after
applying the inclusion and exclusion criteria. This result
is smaller because of the limitation. From the second
result, we then choose several papers, called “selected
papers”. Several criteria are taken into consideration,
including year, topic, and result. The selected papers then
enter the last step, which produces the output and answers
the Research Questions.
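The filtering step described above can be sketched in Python. This is only an illustration under stated assumptions: the `Paper` record layout, the `TOPIC_TERMS` list and the title-matching rule are hypothetical and are not taken from the reviewed databases; the thresholds follow the criteria in Table I.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    # Hypothetical record layout; real database exports use other field names.
    title: str
    year: int
    doc_type: str   # e.g. "journal", "conference", "preprint"
    language: str
    published: bool


# Assumed topic terms; matching on the title only is a simplification.
TOPIC_TERMS = ("face expression", "facial expression",
               "face emotion", "facial emotion")


def meets_criteria(paper: Paper) -> bool:
    """Apply the inclusion and exclusion criteria of Table I to one record."""
    in_period = 2018 <= paper.year <= 2022
    reviewed_venue = paper.doc_type in {"journal", "conference"}
    on_topic = any(term in paper.title.lower() for term in TOPIC_TERMS)
    excluded = paper.language.lower() != "english" or not paper.published
    return in_period and reviewed_venue and on_topic and not excluded


def select(papers: list[Paper]) -> list[Paper]:
    """Keep only the records that pass every criterion."""
    return [p for p in papers if meets_criteria(p)]
```

A record passes only if it satisfies all inclusion criteria and none of the exclusion criteria, which mirrors how the second retrieval result is smaller than the first.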
F. Produce the Result
In this step, the selected papers are fully reviewed,
from the dataset used and the model, methods or approaches
through to the result. Furthermore, from the reviewed
selected papers, the conclusion and the answers to the
Research Questions (RQ) are obtained. The conclusion will
determine the best model and scenario of Deep Learning
model as an approach for Face Expression recognition or
detection.
Fig 1. Steps of Literature Review Process
RQ 1: What is the problem?
RQ 2: What kind of model is being used?
RQ 3: What kind of data is being used?
Fig 2. The Research Question (RQ)
String 1: Face Expression Recognition Deep Learning
String 2: Face Expression Detection Deep Learning
String 3: Face Emotion Detection Deep Learning
String 4: Face Emotion Recognition Deep Learning
Fig 3. Keywords derived from the Research Questions
TABLE I. INCLUSION AND EXCLUSION CRITERIA
Inclusion Criteria | Justification
Paper published in 2018-2022 | Using the most recent published papers only
Journal and conference papers only | Avoiding unreviewed research
The study must match the keywords (Face expression recognition) | Avoiding research that does not meet the criteria
Exclusion Criteria | Justification
Paper not in English | Papers must be in English, which is the standard international language
Paper is in review or not yet published | To focus on primary research only
III. RESULT AND DISCUSSION
This section discusses the result of retrieving data
from the databases. As mentioned before, we use two
databases, Scopus and Google Scholar.
The first retrieval process produced around 1,848
documents in the Scopus database and around 2,420,000
documents in the Google Scholar database. Therefore, the
total of the first retrieval process is 2,421,848 documents.
For a better-quality result, we use the data retrieved from
Scopus for the projection shown in Fig. 4.
According to Fig. 4, research on Face Expression
detection using Deep Learning has increased significantly
since 2015. The Scopus database also shows that the number
of publications on the topic peaked in 2021, bearing in mind
that 2022 was still in progress.
The next process is to apply the inclusion and exclusion
criteria to the retrieved data. After applying the inclusion
and exclusion criteria, about 1,420 papers remained, with
China being the country that published the most research
papers about face expression detection.
A fast review was also carried out on the 1,420 selected
papers. In the fast-reviewing process, we made the
selection by reading the title, abstract and conclusion.
As a result, 63 papers are quite relevant to our topic.
The query for the search process in the database became “(
TITLE-ABS-KEY ( face AND expression AND recognition
AND deep AND learning ) OR TITLE-ABS-KEY ( face
AND expression AND detection AND deep AND learning
) OR TITLE-ABS-KEY ( face AND emotion AND
detection AND deep AND learning ) OR TITLE-ABS-KEY
( face AND emotion AND recognition AND deep
AND learning ) ) AND ( LIMIT-TO ( PUBYEAR , 2022 )
OR LIMIT-TO ( PUBYEAR , 2021 ) OR LIMIT-TO (
PUBYEAR , 2020 ) OR LIMIT-TO ( PUBYEAR , 2019 )
OR LIMIT-TO ( PUBYEAR , 2018 ) ) AND ( LIMIT-TO
( DOCTYPE , "cp" ) OR LIMIT-TO ( DOCTYPE , "ar" )
) AND ( LIMIT-TO ( SUBJAREA , "COMP" ) OR
LIMIT-TO ( SUBJAREA , "ENGI" ) )”.
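The structure of this query can be made explicit with a short Python sketch that assembles the same string from the four keyword strings and the three filters. The helper names (`title_abs_key`, `limit_to`, `build_query`) are hypothetical, and the sketch only builds text: it does not call any Scopus interface.

```python
KEYWORD_STRINGS = [
    "face expression recognition deep learning",
    "face expression detection deep learning",
    "face emotion detection deep learning",
    "face emotion recognition deep learning",
]


def title_abs_key(keywords: str) -> str:
    """AND together the words of one keyword string inside TITLE-ABS-KEY."""
    return "TITLE-ABS-KEY ( " + " AND ".join(keywords.split()) + " )"


def limit_to(field: str, values: list, quoted: bool = True) -> str:
    """OR together one LIMIT-TO clause per value of the given field."""
    def fmt(value) -> str:
        return f'"{value}"' if quoted else str(value)

    clauses = [f"LIMIT-TO ( {field} , {fmt(v)} )" for v in values]
    return "( " + " OR ".join(clauses) + " )"


def build_query() -> str:
    """Combine the keyword block with year, document type and subject filters."""
    keyword_part = "( " + " OR ".join(
        title_abs_key(k) for k in KEYWORD_STRINGS) + " )"
    years = limit_to("PUBYEAR", list(range(2022, 2017, -1)), quoted=False)
    doc_types = limit_to("DOCTYPE", ["cp", "ar"])      # conference paper, article
    subjects = limit_to("SUBJAREA", ["COMP", "ENGI"])
    return " AND ".join([keyword_part, years, doc_types, subjects])


print(build_query())
```

Keeping the keyword strings and the filter values in separate lists makes it easier to see which part of the query comes from the Research Questions and which part comes from the inclusion criteria.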
To answer the Research Questions (RQ), we first applied
the inclusion and exclusion criteria to the 63 papers
previously obtained and selected the most closely related
papers (selected papers).
From the 63 papers, 6 papers meet our criteria and
present high-quality research. For those 6 papers, we
analyze the method used, the accuracy and the dataset. The
analysis results of the 6 papers are used to answer the
following RQs:
A. RQ 1, what is the problem?
This research paper was developed to satisfy the
curiosity about the best approach to detect face
expressions. Moreover, most of the 6 selected papers focus
on how to obtain the best accuracy with the model or
approach they use. Four papers propose a deep learning
(CNN) approach to detect face expressions, one paper
proposes a Recurrent Neural Network (RNN) approach, and one
paper proposes the use of Machine Learning, as shown in
Table II and Fig. 5.
TABLE II. APPROACHES FOR FACE EXPRESSION DETECTION IN THE SELECTED PAPER
Approach | Selected Paper
Deep Learning / CNN | [18], [22], [23], [19]
Recurrent Neural Network (RNN) + Facial landmark | [17]
Machine Learning | [24]
TABLE III. METHODS AND MODEL USED BY SELECTED PAPER
Methods or Model | Selected Paper
Proposed Model 1 | [18]
Proposed Model 2 | [18]
Efficient Net | [22]
VGG-F | [23]
VGG Face | [23]
ResNet-50 | [23]
RNN + landmark | [17]
CNN-LCDRC | [19]
Gaussian NB | [24]
DT | [24]
KNN | [24]
SVM | [24]
MLP | [24]
QDA | [24]
RF | [24]
LR | [24]
Fig. 4. Timeline of the Face Expression research
Fig. 5. Comparison of Face Expression Detection Methods (chart title: Face Expression Detection Approaches; legend: CNN, RNN, Machine Learning; shares: 67%, 16%, 17%)
Based on reviewing the results and approach of each
paper, the answer to RQ 1 (what is the problem?) is to
determine the best approach that has been used previously
by other researchers.
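The shares reported in Fig. 5 follow directly from the paper counts in Table II. A minimal Python sketch of that arithmetic is given below; the function and variable names are illustrative, and the paper-to-approach mapping is copied from Table II.

```python
from collections import Counter

# Approach of each selected paper, as listed in Table II.
APPROACH_BY_PAPER = {
    "[18]": "CNN",
    "[19]": "CNN",
    "[22]": "CNN",
    "[23]": "CNN",
    "[17]": "RNN",
    "[24]": "Machine Learning",
}


def approach_shares(approach_by_paper: dict) -> dict:
    """Return the percentage of selected papers that use each approach."""
    counts = Counter(approach_by_paper.values())
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}


shares = approach_shares(APPROACH_BY_PAPER)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

Four of six papers give roughly 67% for CNN, and one of six gives roughly 16.7% each for RNN and Machine Learning; the 16% and 17% labels in Fig. 5 are presumably rounded so that the chart sums to 100%.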
B. RQ 2, What kind of model is being used?
According to Table II, the 6 selected papers use 16
models and methods in total (shown in Table III). Each of
those methods has its own accuracy measurement. The
limitation of this research is only
Table III shows the methods or approaches used by
the selected papers. As shown in Table III, Deep Learning is
the most used approach for Face Expression Detection.
However, many different Deep Learning or CNN models are
used. Machine Learning methods such as SVM also produce
significant results [15][16].
Based on the in-depth review of Table II and Table
III, each method or approach has its own result for face
expression recognition. The accuracy differs from one
dataset to another. Hence, the accuracies were averaged
per approach to obtain the overall results shown in
Fig. 6.
According to Fig. 6, the highest accuracy value was
obtained by the RNN (Recurrent Neural Network) method [17].
In contrast, the model proposed by Rusia and Singh [18]
produced a comparatively low result, with an average
accuracy of around 63.51% using the CNN approach, whereas
the highest accuracy produced by a CNN approach in those 6
research papers was 99%, using the customized CNN model
proposed by Hosgurmath et al. [19].
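The per-approach averaging behind Fig. 6 can be sketched as follows. This is an illustration only: the `REPORTED` list contains just the two CNN figures quoted in the text above (63.51% for [18] and 99% for [19]), not the full per-dataset values of the selected papers, so the resulting number is not the value plotted in Fig. 6.

```python
from collections import defaultdict
from statistics import mean

# (approach, source paper, accuracy in %). Only the two CNN figures quoted
# in the text are used; the full per-dataset values are not reproduced here.
REPORTED = [
    ("CNN", "[18]", 63.51),
    ("CNN", "[19]", 99.0),
]


def average_by_approach(rows):
    """Group accuracies by approach and return the mean of each group."""
    grouped = defaultdict(list)
    for approach, _paper, accuracy in rows:
        grouped[approach].append(accuracy)
    return {approach: mean(values) for approach, values in grouped.items()}


averages = average_by_approach(REPORTED)
```

The gap between the two CNN inputs shows why an average over different datasets should be read with care: a single approach can span a wide accuracy range depending on the model and the data.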
C. RQ 3, What kind of data is being used?
Machine Learning and its developments are mostly used
for classification [20]. Classification is a challenging
step, especially when the dataset is small. The dataset is
a significant aspect of the classification process [21].
Face Expression Recognition uses the classification process
to determine the expression on the face. Therefore, the
dataset plays a significant role in face expression
detection.
Across the 6 studies we have reviewed, a total of 11
datasets are used in the experiments. Every dataset also
produces its own result and accuracy. The details of each
dataset and its accuracy are shown in Fig. 7.
Referring to Fig. 7, the highest accuracy of face
expression recognition was produced using the YALE and ORL
datasets, followed by the CK+ dataset.
Fig. 6. Average accuracy comparison for each method (chart title: Methods Accuracy Comparison; methods: CNN, SVM, RNN, MLP and KNN, all customized; accuracy axis: 0% to 100%)
Fig. 7. Average accuracy produced using each dataset (chart title: Accuracies on Every Dataset; data labels: 66.93%, 63.55%, 64.74%, 65.10%, 55.43%, 95.75%, 91.40%, 85.50%, 99%, 99%, 91%)
IV. CONCLUSION
Based on the in-depth review of the six research papers
on face expression recognition, several points can be
concluded. First, based on Fig. 4, face expression
recognition has been gaining popularity since 2015, driven
by the rise of Artificial Intelligence (AI). Second, face
expression recognition achieves its best and highest
accuracy with several approaches, especially Support Vector
Machine (SVM) and Deep Learning. The accuracy is also
influenced by the dataset used. Several datasets are
commonly used in the research; the highest accuracy of face
expression recognition or detection is obtained using the
YALE and ORL datasets, followed by the CK+ dataset.
Indeed, the combination of method and dataset plays a
major role. Therefore, in the future, it will be very
challenging to find the best approaches and methods across
combinations of datasets, considering that results differ
from one dataset to another.
ACKNOWLEDGMENT
This research paper was supported by the Doctor of Computer
Science program, Bina Nusantara University, Indonesia. This
research was carried out to fulfil the Software Matrix
course requirements.
REFERENCES
[1] D. G. Ganakwar, “A Case Study of various Face
Detection Methods,” International Journal for
Research in Applied Science and Engineering
Technology, vol. 7, no. 11, pp. 496-500, 2019, doi:
10.22214/ijraset.2019.11080.
[2] A. R. Dores, F. Barbosa, C. Queirós, I. P. Carvalho,
and M. D. Griffiths, “Recognizing emotions through
facial expressions: A large-scale experimental study,”
International Journal of Environmental Research
and Public Health, vol. 17, no. 20, pp. 1-13, Oct.
2020, doi: 10.3390/ijerph17207420.
[3] Y. S. Lee and W. H. Park, “Diagnosis of Depressive
Disorder Model on Facial Expression Based on Fast
R-CNN,” Diagnostics, vol. 12, no. 2, Feb. 2022, doi:
10.3390/diagnostics12020317.
[4] “Supplemental Material for Facial Expression
Predictions as Drivers of Social Perception,” Journal
of Personality and Social Psychology, 2018, doi:
10.1037/pspa0000108.supp.
[5] S. Jia, S. Wang, C. Hu, P. J. Webster, and X. Li,
“Detection of Genuine and Posed Facial Expressions
of Emotion: Databases and Methods,” Frontiers in
Psychology, vol. 11. Frontiers Media S.A., Jan. 15,
2021. doi: 10.3389/fpsyg.2020.580287.
[6] B. Kitchenham, “Procedures for Performing
Systematic Reviews,” 2004.
[7] C. Huda, A. Ramadhan, A. Trisetyarso, E.
Abdurachman, and Y. Heryadi, “Smart Tourism
Recommendation Model: A Systematic Literature
Review.” [Online]. Available: www.ijacsa.thesai.org
[8] J. Agee, “Developing qualitative research questions:
A reflective process,” International Journal of
Qualitative Studies in Education, vol. 22, no. 4, pp.
431-447, Jul. 2009, doi:
10.1080/09518390902736512.
[9] “CONTINUING MEDICAL EDUCATION
FORMATION MÉDICALE CONTINUE
PRACTICAL TIPS FOR SURGICAL
RESEARCH,” 2010.
[10] H. Alkharusi, “Literature review on achievement
goals and classroom goal structure: implications for
future research,” Electronic Journal of Research in
Education Psychology, vol. 8, no. 22, pp. 1363-1386,
Nov. 2017, doi: 10.25115/ejrep.v8i22.1425.
[11] O. Doody and M. E. Bailey, “Setting a research
question, aim and objective,” 2016.
[12] S. K. Ratan, T. Anand, and J. Ratan, “Formulation of
Research Question - Stepwise Approach,” J Indian
Assoc Pediatr Surg, vol. 24, no. 1, pp. 15-20, 2019,
doi: 10.4103/jiaps.JIAPS_76_18.
[13] C. M. Patino and J. C. Ferreira, “Inclusion and
exclusion criteria in research studies: Definitions and
why they matter,” Jornal Brasileiro de Pneumologia,
vol. 44, no. 2. Sociedade Brasileira de Pneumologia
e Tisiologia, p. 84, Mar. 01, 2018. doi:
10.1590/s1806-37562018000000088.
[14] R. Pranckutė, “Web of science (Wos) and scopus:
The titans of bibliographic information in today’s
academic world,” Publications, vol. 9, no. 1. MDPI
AG, Mar. 01, 2021. doi:
10.3390/publications9010012.
[15] H. Hasan, H. Z. M. Shafri, and M. Habshi, “A
Comparison between Support Vector Machine
(SVM) and Convolutional Neural Network (CNN)
Models for Hyperspectral Image Classification,” in
IOP Conference Series: Earth and Environmental
Science, Nov. 2019, vol. 357, no. 1. doi:
10.1088/1755-1315/357/1/012035.
[16] S. Chen and C. Liu, “Eye detection using
discriminatory Haar features and a new efficient
SVM,” Image and Vision Computing, vol. 33, pp.
68-77, 2015, doi: 10.1016/j.imavis.2014.10.007.
[17] S. A. Rizwan, Y. Y. Ghadi, A. Jalal, and K. Kim,
“Automated Facial Expression Recognition and Age
Estimation Using Deep Learning,” Computers,
Materials and Continua, vol. 71, no. 2, pp. 5235-5252,
2022, doi: 10.32604/cmc.2022.023328.
[18] M. K. Rusia and D. K. Singh, “An efficient CNN
approach for facial expression recognition with some
measures of overfitting,” International Journal of
Information Technology (Singapore), vol. 13, no. 6,
pp. 2419-2430, Dec. 2021, doi:
10.1007/s41870-021-00803-x.
[19] S. Hosgurmath, V. V. Mallappa, N. B. Patil, and V.
Petli, “A face recognition system using convolutional
feature extraction with linear collaborative
discriminant regression classification,” International
Journal of Electrical and Computer Engineering,
vol. 12, no. 2, pp. 1468-1476, Apr. 2022, doi:
10.11591/ijece.v12i2.pp1468-1476.
[20] S. B. Kotsiantis, “Supervised Machine Learning: A
Review of Classification Techniques,” 2007.
[21] A. Althnian et al., “Impact of dataset size on
classification performance: An empirical evaluation
in the medical domain,” Applied Sciences
(Switzerland), vol. 11, no. 2, pp. 1-18, Jan. 2021, doi:
10.3390/app11020796.
[22] N. Kumari and R. Bhatia, “Efficient facial emotion
recognition model using deep convolutional neural
network and modified joint trilateral filter,” Soft
Computing, 2022, doi: 10.1007/s00500-022-06804-7.
[23] M. I. Georgescu, G. E. Duţǎ, and R. T. Ionescu,
“Teacher–student training and triplet loss to reduce
the effect of drastic face occlusion: Application to
emotion recognition, gender identification and age
estimation,” in Machine Vision and Applications,
Jan. 2022, vol. 33, no. 1. doi:
10.1007/s00138-021-01270-x.
[24] A. I. Siam, N. F. Soliman, A. D. Algarni, F. E. Abd
El-Samie, and A. Sedik, “Deploying Machine
Learning Techniques for Human Emotion
Detection,” Computational Intelligence and
Neuroscience, vol. 2022, 2022, doi:
10.1155/2022/8032673.
IEEE conference templates contain guidance text for
composing and formatting conference papers. Please
ensure that all template text is removed from your
conference paper prior to submission to the
conference. Failure to remove template text from
your paper may result in your paper not being
published.
... Both segmentation and classification tasks are done mostly using deep learning with different architectures and modifications [14], [15]. U-net is one of the most used architectures in medical image processing [16]- [18]. ...
Article
Full-text available
In recent years, deep learning has found widespread applications in tasks like segmentation and classification. Fine-tuning hyperparameters is crucial for improving performance, with learning rate being a key parameter. Various methods, including adaptive learning rates, learning rate scheduling, and cyclical learning rates, have been used to optimize learning rates. Cyclical learning rates offer significant benefits with minimal computational cost, as seen in prior research. This study introduces a novel approach to cyclical learning rate tuning, incorporating Exponential Moving Average. These methods are applied to the BraTS 2021 dataset for segmentation tasks, resulting in superior performance compared to previous approach. Our proposed method reduces the epochs required to reach convergence by 19 and 54 epochs for U-Net and Dense U-net, respectively. For Res U-net, the epoch needed to convergence is 10 epochs more. However, the proposed method produces lower loss values with 0.707, 0.657, and 0.665 compared to previous method with 0.712, 0.685, and 0.725 for U-net, Res U-net, and Dense U-net, respectively.
... Since then, there have been three approaches for eliciting and collecting facial expression data (Adyapady and Annappa, 2023): (1) Posed expressions, which involve deliberate displays based on instructions and placed in a controlled environment that may not reflect real-life situations (Weber et al., 2018); (2) Spontaneous expressions, which occur naturally, and are crucial for developing intelligent human-computer interaction systems but challenging to annotate due to the dynamics of facial actions (Donia et al., 2014);and (3) In-the-wild expressions, which are acquired from unconstrained environments, essential for advancing facial expression analysis research with challenges in feature extraction due to the realistic conditions they encompass . The transition from lab-controlled environments to naturalistic data collection is an ongoing challenge (Pangestu et al., 2022). ...
Preprint
Full-text available
The effectiveness of computerized cognitive training in slowing cognitive decline and brain aging in dementia is often limited by the engagement of participants in the training. Monitoring older users' real-time engagement in domains of attention, motivation, and affect is crucial to understanding the overall effectiveness of such training. In this paper, we propose to predict engagement, quantified via an established mental fatigue measure assessing users' perceived attention, motivation, and affect throughout computerized cognitive training sessions, in older adults with mild cognitive impairment (MCI), by monitoring their real-time video-recorded facial gestures in training sessions. To achieve the goal, we used computer vision, analyzing video frames every 5 seconds to optimize the balance between information retention and data size, and developed a novel Recurrent Video Transformer (RVT). Our RVT model, which combines a clip-wise transformer encoder module and a session-wise Recurrent Neural Network (RNN) classifier, achieved the highest balanced accuracy, F1 score, and precision compared to other state-of-the-art models for both detecting mental fatigue/disengagement cases (binary classification) and rating the level of mental fatigue (multi-class classification). By leveraging dynamic temporal information, the RVT model demonstrates the potential to accurately predict engagement among computerized cognitive training users, which lays the foundation for future work to modulate the level of engagement in computerized cognitive training interventions. The code will be released.
Chapter
One of the most distinctive elements that enhances a woman’s facial features, in terms of esthetic assessments, is her hair. According to beauty experts, the haircut or hairstyle accounts for 70% of the total appearance of the face. Yet one of the most time-consuming choices a woman ever has to make is choosing the appropriate hairstyle or hairdo. The initiation of this work is by an approach that incorporates face recognition and landmark detection to classify face shapes into 5 different shapes: heart, long, oval, round, square. Five machine learning models, namely K-Nearest Neighbors (KNN), Random Forest classifier (RF), Gradient Boosting (GB), Linear Discriminant Analysis (LDA) and Multilayer Perceptron (MLP) classifier have been implemented in this work in order to accurately classify the user’s face into one of the five aforementioned shapes. MLP classifier yielded the highest accuracy of 88%. Furthermore, the hairstyle recommendation software is an adaptive software that evolves over time based on user feedback.
Article
Full-text available
Face recognition is one of the important biometric authentication research areas for security purposes in many fields such as pattern recognition and image processing. However, the human face recognitions have the major problem in machine learning and deep learning techniques, since input images vary with poses of people, different lighting conditions, various expressions, ages as well as illumination conditions and it makes the face recognition process poor in accuracy. In the present research, the resolution of the image patches is reduced by the max pooling layer in convolutional neural network (CNN) and also used to make the model robust than other traditional feature extraction technique called local multiple pattern (LMP). The extracted features are fed into the linear collaborative discriminant regression classification (LCDRC) for final face recognition. Due to optimization using CNN in LCDRC, the distance ratio between the classes has maximized and the distance of the features inside the class reduces. The results stated that the CNN-LCDRC achieved 93.10% and 87.60% of mean recognition accuracy, where traditional LCDRC achieved 83.35% and 77.70% of mean recognition accuracy on ORL and YALE databases respectively for the training number 8 (i.e. 80% of training and 20% of testing data).
Article
Full-text available
Facial emotion recognition extracts the human emotions from the images and videos. As such, it requires an algorithm to understand and model the relationships between faces and facial expressions and to recognize human emotions. Recently, deep learning models are utilized to improve the performance of facial emotion recognition. However, the deep learning models suffer from the overfitting issue. Moreover, deep learning models perform poorly for images which have poor visibility and noise. Therefore, in this paper, an efficient deep learning-based facial emotion recognition model is proposed. Initially, contrast-limited adaptive histogram equalization (CLAHE) is applied to improve the visibility of input images. Thereafter, a modified joint trilateral filter is applied to the obtained enhanced images to remove the impact of impulsive noise. Finally, an efficient deep convolutional neural network is designed. Adam optimizer is also utilized to optimize the cost function of deep convolutional neural networks. Experiments are conducted by using the benchmark dataset and competitive human emotion recognition models. Comparative analysis demonstrates that the proposed facial emotion recognition model performs considerably better compared to the competitive models
Article
Full-text available
Emotion recognition is one of the trending research fields. It is involved in several applications. Its most interesting applications include robotic vision and interactive robotic communication. Human emotions can be detected using both speech and visual modalities. Facial expressions can be considered as ideal means for detecting the persons' emotions. This paper presents a real-time approach for implementing emotion detection and deploying it in the robotic vision applications. The proposed approach consists of four phases: preprocessing, key point generation, key point selection and angular encoding, and classification. The main idea is to generate key points using MediaPipe face mesh algorithm, which is based on real-time deep learning. In addition, the generated key points are encoded using a sequence of carefully designed mesh generator and angular encoding modules. Furthermore, feature decomposition is performed using Principal Component Analysis (PCA). This phase is deployed to enhance the accuracy of emotion detection. Finally, the decomposed features are enrolled into a Machine Learning (ML) technique that depends on a Support Vector Machine (SVM), k-Nearest Neighbor (KNN), Naïve Bayes (NB), Logistic Regression (LR), or Random Forest (RF) classifier. Moreover, we deploy a Multilayer Perceptron (MLP) as an efficient deep neural network technique. The presented techniques are evaluated on different datasets with different evaluation metrics. The simulation results reveal that they achieve a superior performance with a human emotion detection accuracy of 97%, which ensures superiority among the efforts in this field.
This study examines related literature to propose a model based on artificial intelligence (AI), that can assist in the diagnosis of depressive disorder. Depressive disorder can be diagnosed through a self-report questionnaire, but it is necessary to check the mood and confirm the consistency of subjective and objective descriptions. Smartphone-based assistance in diagnosing depressive disorders can quickly lead to their identification and provide data for intervention provision. Through fast region-based convolutional neural networks (R-CNN), a deep learning method that recognizes vector-based information, a model to assist in the diagnosis of depressive disorder can be devised by checking the position change of the eyes and lips, and guessing emotions based on accumulated photos of the participants who will repeatedly participate in the diagnosis of depressive disorder.
With the advancement of computer vision techniques in surveillance systems, there is a growing need for more proficient, intelligent, and sustainable facial expression and age recognition. The main purpose of this study is to develop an accurate facial expression and age recognition system that is capable of error-free recognition of human expression and age in both indoor and outdoor environments. The proposed system first takes an input image, pre-processes it, and then detects faces in the entire image. After that, landmark localization helps in the formation of a synthetic face mask prediction. A novel set of features is extracted and passed to a classifier for the accurate classification of expressions and age group. The proposed system is tested on two benchmark datasets, namely, the Gallagher collection person dataset and the Images of Groups dataset. The system achieved remarkable results on these benchmark datasets in terms of recognition accuracy and computational time. The proposed system would also be applicable in different consumer application domains such as online business negotiations, consumer behavior analysis, E-learning environments, and emotion robotics.
The tourism industry has become a potential sector to leverage economic growth. Many attractions are listed on several platforms. Machine learning and data mining are potential technologies for improving tourism services by recommending specific attractions to tourists according to their location and profile. This research applied a systematic literature review on tourism, digital tourism, smart tourism, and recommender systems in tourism. It aims to evaluate the most relevant and accurate techniques in tourism that focus on recommendations or similar efforts. Several research questions were defined and translated into search strings. The review identified 41 studies that discussed tourism, digital tourism, smart tourism, and recommender systems. All of the literature was reviewed on several aspects, for example the problem addressed, the methodology used, the data used, the strengths, and the limitations that can be an opportunity for improvement in future research. This study proposed some references for further study based on the reviewed papers regarding tourism management, tourist experience, tourist motivation, and tourist recommendation systems. Further research can be conducted with more data, especially for a smart recommender system in tourism, through many types of recommendation techniques such as content-based, collaborative filtering, demographic, knowledge-based, community-based, and hybrid recommender systems.
We study a series of recognition tasks in two realistic scenarios requiring the analysis of faces under strong occlusion. On the one hand, we aim to recognize facial expressions of people wearing virtual reality headsets. On the other hand, we aim to estimate the age and identify the gender of people wearing surgical masks. For all these tasks, the common ground is that half of the face is occluded. In this challenging setting, we show that convolutional neural networks trained on fully visible faces exhibit very low performance levels. While fine-tuning the deep learning models on occluded faces is extremely useful, we show that additional performance gains can be obtained by distilling knowledge from models trained on fully visible faces. To this end, we study two knowledge distillation methods, one based on teacher–student training and one based on triplet loss. Our main contribution consists in a novel approach for knowledge distillation based on triplet loss, which generalizes across models and tasks. Furthermore, we consider combining distilled models learned through conventional teacher–student training or through our novel teacher–student training based on triplet loss. We provide empirical evidence showing that, in most cases, both individual and combined knowledge distillation methods bring statistically significant performance improvements. We conduct experiments with three different neural models (VGG-f, VGG-face and ResNet-50) on various tasks (facial expression recognition, gender recognition, age estimation), showing consistent improvements regardless of the model or task.
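For readers unfamiliar with triplet loss, a minimal sketch of the standard formulation follows. The abstract does not say how anchors, positives, and negatives are drawn for distillation. The toy example assumes the anchor is a student embedding of an occluded face while the positive and negative are teacher embeddings of a matching and a non-matching class, which is an illustrative guess rather than the authors' stated setup. The function names and the margin value are likewise hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: the anchor should be closer to the positive
    than to the negative by at least `margin`. Some variants use squared
    distances instead."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy 2-D embeddings (assumed roles, see the text above).
student_anchor = [0.0, 0.0]
teacher_same_class = [0.0, 1.0]
teacher_other_class = [3.0, 4.0]

satisfied = triplet_loss(student_anchor, teacher_same_class, teacher_other_class)
violated = triplet_loss(student_anchor, teacher_other_class, teacher_same_class)
```

The loss is zero once the margin is satisfied, so training pressure applies only to triplets in which the student embedding is still too close to the wrong class or too far from the right one.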
A person's emotion can be represented through facial expressions in non-vocal communication. Nowadays, automatic facial expression recognition systems have attracted considerable interest in applications such as face biometric-based authentication, behavior analysis (psychology), health monitoring (cerebral palsy), recommendation systems, and many others. Deep learning-based solutions have become the most convenient method for solving image and video processing problems in recent times. Nevertheless, these CNN models include many hidden layers with complex predefined mathematical functions, resulting in increased complexity. Deep architectures therefore face the challenge of handling a large number of learning parameters. This manuscript proposes two customized CNN models, named Proposed_Model_1 and Proposed_Model_2, to classify universal facial expressions without overfitting. In this paper, the effects of hyper-parameters such as activation function, learning rate, kernel size, and convolutional block are investigated and optimized efficiently. Experimental results reveal that both of our proposed models outperform other existing methods considering all universal facial expressions, with an accuracy of 67.24% and 66.61%, respectively, on the well-known public benchmark dataset (Facial Expression Recognition-2013).