Fig 1 - uploaded by Seba Susan
Original Image. Step 2: Transform the image into the CIE L*a*b* color space and obtain the a* (red-green axis) and b* (blue-yellow axis) color dimensions. Binarize the a* and b* matrices using Otsu's threshold. The largest connected component in the intersection of a* and b* represents the face. Since a* values are high for reddish hues and b* values are high for yellowish shades, the skin is segmented out with high accuracy, as shown in Fig. 2(a).
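A minimal sketch of this segmentation step, assuming OpenCV's 8-bit L*a*b* conversion (an illustration of the described procedure, not the authors' code):

```python
import cv2
import numpy as np

def segment_face(bgr_image):
    # Convert to L*a*b* and take the a* (red-green) and b* (blue-yellow) channels
    lab = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2LAB)
    a_chan, b_chan = lab[:, :, 1], lab[:, :, 2]

    # Otsu binarization of each chromatic channel
    _, a_bin = cv2.threshold(a_chan, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    _, b_bin = cv2.threshold(b_chan, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Intersection of the two binary maps marks candidate skin pixels
    skin = cv2.bitwise_and(a_bin, b_bin)

    # Keep the largest connected component, assumed to be the face
    n, labels, stats, _ = cv2.connectedComponentsWithStats(skin, connectivity=8)
    if n <= 1:
        return skin  # no foreground component found
    largest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])
    return np.where(labels == largest, 255, 0).astype(np.uint8)
```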

Source publication
Conference Paper
Full-text available
Most of the face recognition approaches in the literature rely on automatic eye-pair detection algorithms for locating the eye positions, followed by normalization of the face image based on the distance between the eyes. In this paper we propose a supervised fuzzy eye-pair detection algorithm that can be executed in real time and requires minimal tra...

Contexts in source publication

Context 1
... Membership of the distance between the eyes; x1, x2, x3, x4, x5, x6, x7, x8 are the mean values of the respective features and σ1, σ2, σ3, σ4, σ5, σ6, σ7, σ8 are their standard deviations. These values are obtained from the means and standard deviations of measurements made on the manually annotated eye pairs in ten reference images, selected randomly from the database and subjected to the binarization procedure explained above. ...
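A hypothetical sketch of how such memberships could be computed, assuming Gaussian membership functions built from the feature means x1...x8 and standard deviations σ1...σ8 estimated from the ten annotated reference images (the paper's exact membership form is not reproduced here):

```python
import numpy as np

def fit_memberships(reference_features):
    """reference_features: (n_images, n_features) measurements from annotated eye pairs."""
    means = reference_features.mean(axis=0)        # x1 ... x8
    stds = reference_features.std(axis=0, ddof=1)  # sigma1 ... sigma8
    return means, stds

def membership(candidate_features, means, stds):
    """Per-feature fuzzy membership of a candidate eye pair, each value in [0, 1]."""
    return np.exp(-((candidate_features - means) ** 2) / (2.0 * stds ** 2))
```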

Citations

... This face-centering process with focused attention resulting from eye-schema detection would be analogous in process to the algorithms employed for face-pattern matching in artificial intelligence programs (cf. Seba and Kadyan 2013; Campadelli et al. 2009). Despite habituation to previously viewed models, detection of the discrepancy between these ... (Fig. 7 caption: Average elapsed time for 10 opercular beats for the model-1 group A and model-2 group B; means and standard-error values are shown.) ...
Article
Full-text available
Previous research has shown that African jewel fish (Hemichromis bimaculatus) recognize pair-bonded mates during their exchanges of egg-guarding duties. The current research examined the perceptual cues for face recognition by comparing two face models displaying anatomically realistic arrangements of blue iridophores derived from discriminant function analysis of distinct sibling groups. Four groups, each consisting of 9 subadults, were examined using a narrow compartment restraining lateral movement, where face models were presented at eye level for eight trials. Because respiratory movement of the operculum can mechanically displace the eye, thereby shifting the retinal image, jewel fish reduce their respiration rate during increased attention. When two experimental groups were presented with the same face models on four trials following initial model presentations, both groups exhibited stable respiration rates indicative of model habituation. When the habituated face models were switched to novel face models on the fifth trial, the rates of respiration decreased, as measured by reliable increases in the elapsed times of opercular beats. Switching the models back to the habituated models on the sixth trial caused reliable decreases in the elapsed times of opercular beats, resembling the earlier trials for the habituated models. Switching the face models again to the formerly novel models on the seventh trial produced respiration rates that resembled those of the habituated models. The two control groups viewing the same models for all eight trials exhibited no substantial change in respiration rates. Together, these findings indicate that jewel fish can learn to recognize novel faces displaying unique arrangements of iridophores after one trial of exposure.
... The matching results obtained by comparing parts of the image are more useful than those obtained by comparing the image as a whole, as observed in the facial expression recognition experiment in [20] wherein the facial image was divided into grids. In most practices, the parts that are learnt pertain to specific parts of objects such as car wheels [29], cat heads, human bodies and faces [30], and specific facial features like the eye pair [31]. In our case, for numeral images that are static over time, we assume that semantic interpretations are related to the top-half, bottom-half, left-half and right-half parts of the numeral image. ...
Article
Full-text available
A novel sub-part learning scheme is introduced in our work for the purpose of recognizing handwritten numeral images. The idea is borrowed from the concept of visual perception and part-wise integration of visual information by the cortical regions of the brain. In this context, each numeral image is divided into four half-parts: top-half, bottom-half, left-half and right-half, with the other half of the image kept masked. An efficient data representation is derived in an unsupervised manner from each image part, using convolutional auto-encoders (CAE), for our learning scheme that involves both early and late fusion of features. The chief advantage of the features derived from convolutional auto-encoders is the preservation of 2D spatial locality while the features are being filtered layer-by-layer through the convolutional architecture. The features derived from each individual CAE are fused by concatenation in our early fusion scheme and learnt using an appropriate classifier. The late fusion strategy involves learning the probability density pertaining to the predicted values emanating from the four base classifiers using a meta-learner classifier. The early-cum-late fusion is proposed in the later stage of our work to combine the goodness of both schemes and enhance the performance. The support vector machine is used in all the classification stages. Experiments on the benchmark MNIST dataset of handwritten English numerals prove that our method competes favorably with the state of the art, as inferred from the high classification scores achieved. Our method thus provides a computationally simple and effective methodology for sub-part learning and part-wise integration of information from different parts of the image. The method also contributes to savings in computational expense since only a small part of the image is processed at a time, speeding up the inferencing process.
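As a rough illustration of the half-part views described above (function and variable names are ours, not the paper's), each view keeps one half of the numeral image and masks the other half before being fed to its own convolutional auto-encoder:

```python
import numpy as np

def half_part_views(image):
    """image: 2D grayscale numeral image, e.g. a 28x28 MNIST digit."""
    h, w = image.shape
    views = {}
    for name in ("top", "bottom", "left", "right"):
        view = np.zeros_like(image)      # the complementary half stays masked (zero)
        if name == "top":
            view[: h // 2, :] = image[: h // 2, :]
        elif name == "bottom":
            view[h // 2 :, :] = image[h // 2 :, :]
        elif name == "left":
            view[:, : w // 2] = image[:, : w // 2]
        else:
            view[:, w // 2 :] = image[:, w // 2 :]
        views[name] = view
    return views
```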
... The distance- or similarity-based classifiers are still popular in the field of Computer Vision despite advances in deep learning [20,24]. Sometimes a non-linear transformation of the data space helps in constructing improved classifiers for real-world datasets [21,27,28]. ...
Chapter
Full-text available
Most of the real-world datasets suffer from the problem of imbalanced class representation, with some classes having more than sufficient samples while some other classes are heavily underrepresented. The imbalance in the class distributions of the majority and minority samples renders conventional classifiers like the k-Nearest Neighbor (kNN) classifier ineffective due to the heavy bias towards the majority class. In this paper, we propose data space transformation of the training set by distance metric learning prior to an enhanced classification phase. The classification phase comprises partitioning the set of k-nearest neighbors of each transformed test sample into two clusters based on their distances from the two extreme members in the set. A majority voting of the training samples within the ‘elite’ cluster that is closest to the test sample indicates the label of the test sample. Our proposed method is called Data Space Transformation with Metric Learning and Elite k-Nearest Neighbor cluster formation (DST-ML-EkNN). Extensive experiments on benchmark datasets using a variety of Metric Learning methods, with comparisons to the state-of-the-art, establish the supremacy of our learning approach for imbalanced datasets. Link to code: https://github.com/AmiteshDTU/DST-ML-EkNN
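A sketch of the elite-cluster voting step as we read it from the abstract (not the released code; see the GitHub link above for the reference implementation): the k nearest neighbors in the transformed space are split according to their distances to the two extreme members of the set, and the label is voted within the cluster closest to the test sample.

```python
import numpy as np
from collections import Counter

def elite_knn_label(test_x, train_X, train_y, k=7):
    # Distances in the metric-learned (transformed) feature space
    d = np.linalg.norm(train_X - test_x, axis=1)
    idx = np.argsort(d)[:k]                           # k nearest neighbors
    near, far = train_X[idx[0]], train_X[idx[-1]]     # two extreme members of the set
    to_near = np.linalg.norm(train_X[idx] - near, axis=1)
    to_far = np.linalg.norm(train_X[idx] - far, axis=1)
    elite = idx[to_near <= to_far]                    # cluster closer to the test sample
    return Counter(train_y[elite]).most_common(1)[0][0]
```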
... In the case of random gaze, time plays a crucial role in understanding the chronological order of the attention pattern of the individual [7]. Accurate localization of the eyes is likewise crucial for gaze fixation [8]. The absence of iris and retinal data, due to coarse resolution in most surveillance videos, complicates the gaze analysis problem. ...
Conference Paper
Full-text available
This paper presents a novel perspective on human attention span modeling based on gaze estimation from head pose data extracted from videos. This is achieved by devising specialized 2D visualization plots that capture gaze progression and gaze sustenance over time. In doing so, a low-resolution analysis is assumed, as is the case with most crowd surveillance videos, wherein retinal analysis and iris pattern extraction of individual subjects are impossible. The information is useful for studies involving the random gaze behavior pattern of humans in a crowded place, or in a controlled environment such as seminars or office meetings. The extraction of useful information regarding the attention span of the individual from the spatial and temporal analysis of gaze points is the subject of study in this paper. Different solutions, ranging from temporal gaze plots to sustained attention span graphs, are investigated, and the results are compared with existing techniques of attention span modeling and visualization.
... Since then, much research has been directed towards developing more efficient machine learning algorithms for deciphering human emotions from facial expressions [17]. Automated facial expression analysis usually proceeds by segmenting the skin pixels, locating the eye, nose or mouth coordinates, and cropping the face, followed by normalization to a standard size [18][19][20][21][22][23][24]. Automated face cropping tools such as the Viola-Jones face detector [25] are readily downloadable from the web. ...
Article
Full-text available
In this paper we investigate information-theoretic image coding techniques that assign longer codes to improbable, imprecise and non-distinct intensities in the image. The variable length coding techniques, when applied to cropped facial images of subjects with different facial expressions, highlight the set of low-probability intensities that characterize the facial expression, such as the creases in the forehead, the widening of the eyes and the opening and closing of the mouth. A new coding scheme based on maximum entropy partitioning is proposed in our work, particularly to identify the improbable intensities related to different emotions. The improbable intensities, when used as a mask, decode the facial expression correctly, providing an effective platform for future emotion categorization experiments.
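An illustrative sketch of the underlying idea: intensities are assigned code lengths of -log2(p) from the image histogram, and pixels whose intensities receive long codes form the mask. The paper's maximum entropy partitioning scheme is not reproduced here, and the threshold below is illustrative.

```python
import numpy as np

def improbable_intensity_mask(gray, code_length_threshold=8.0):
    """gray: 2D uint8 facial image; returns a boolean mask over improbable pixels."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    with np.errstate(divide="ignore"):
        code_len = -np.log2(p)  # rarer intensities receive longer codes; p = 0 maps to inf
    return code_len[gray] > code_length_threshold
```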
... Automated face recognition is usually a challenging task since it involves the pre-processing steps of skin segmentation, location of the eye, nose and mouth coordinates, and face cropping and normalization to a standard size [1][2][3]. Automated face cropping tools such as the Viola-Jones face detector [4] are, however, now readily downloadable from the web. The pre-processing steps are followed by the extraction of suitable facial features from the cropped and normalized face image and subsequent classification into subject classes. ...
Conference Paper
Full-text available
We define a new set of spatio-temporal features called the 3D-Difference Theoretic Texture Features (3D-DTTF) for dynamic face recognition from videos. The Difference Theoretic Texture Features (DTTF) is a low-dimensional 2D scale-, rotation- and illumination-invariant texture descriptor set which reported high accuracies for texture recognition experiments in [6]. The 3D-DTTF extends the gray-level difference statistics along the Front (F), Front-Diagonal Horizontal (FDH) and Front-Diagonal Vertical (FDV) directions in addition to the existing horizontal, vertical and diagonal directions in the two-dimensional DTTF. The new 3D features are affine-invariant similar to their 2D counterpart, a property useful for recognizing faces in a video irrespective of the change in facial expressions. Experimentation on the Cohn-Kanade facial expression video dataset yields higher accuracy for the proposed dynamic face recognition as compared to other methods.
... Recently, SIFT features have even been used for stitching images into panoramas [9][10]. SIFT descriptors have been used for face recognition in [11][12], but this requires knowledge of the face geometry and of facial landmarks such as the eyes and mouth, which can be either manually annotated or automatically detected [13]. In [11], three methodologies for SIFT-based face recognition are discussed: 1) the distances between all pairs of keypoints of the test and training images are computed and the minimum distance is used to assign the test image to a class, 2) the use of features associated with the eyes and mouth only, and 3) the use of a regular grid for localizing features. ...
... It is a characteristic of the product function that a significant drop in any one of its inputs pulls the product low, and we use this to detect a match through either of our two fuzzy measures in Eq. (8) and (11). We define an error function between the two fuzzy classifiers in Eq. (8) and (11), which together constitute the product classifier in Eq. (13). The error is defined as the absolute difference between the sum of fuzzy memberships and the sum of entropy-weighted fuzzy memberships, as shown below. ...
... Fig. 2(b), for the same subject but with a different background, is correctly classified by the entropy-weighted classifier in (11) as predicted, while Fig. 2(g), with a very strong facial expression match, is correctly identified by the fuzzy classifier in (8) but not by the entropy-weighted fuzzy classifier. In both these cases the product SIFT classifier in Eq. (13) gives the correct result since it combines the goodness of both Eq. (8) and (11). Tables 2 and 3 summarize the classification results for all 100 test images for the six methods: the three proposed fuzzy SIFT classifiers, David Lowe's method [1], Leonardo's SIFT classifier [23] and SPDA for single-template face recognition [31]. ...
Article
Full-text available
A fuzzy match index for scale-invariant feature transform (SIFT) features is proposed in this study that cumulatively involves all the test SIFT keypoints in the decision-making process. The new fuzzy SIFT classifier is adapted successfully for robust face recognition from complex backgrounds without using any face cropping tools and using only a single training template. The further incorporation of entropy weights ensures that the facial features have a greater role in the soft decision-making as compared with the background features. The highlights of the authors' work are: (i) the development of a novel, highly efficient fuzzy SIFT descriptor matching tool; (ii) the incorporation of feature entropy weights to highlight the contribution of facial features; (iii) the application to robust face recognition from uncropped images having diverse backgrounds with a single template for each subject. The authors thus allow for weak supervision of the face recognition experiment and obtain high accuracy for 20 subjects of the CALTECH-256 face database, 133 subjects of the Labeled Faces in the Wild dataset and 994 subjects of the FERET database, with state-of-the-art comparisons indicating the supremacy of the authors' approach. Link to code: https://in.mathworks.com/matlabcentral/fileexchange/82933-fuzzy-sift-keypoint-matching
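A rough sketch of the product combination described in the contexts and abstract above: a plain fuzzy score over the test keypoints (Eq. (8) in the paper), an entropy-weighted score (Eq. (11)), their product as the final match score (Eq. (13)), and the error as the absolute difference of the two. The exact equations are not reproduced here and the normalization is an assumption.

```python
import numpy as np

def product_match_score(memberships, entropy_weights):
    """memberships: per-keypoint fuzzy match values in [0, 1];
    entropy_weights: per-keypoint weights emphasizing facial over background features."""
    fuzzy_score = memberships.mean()              # plain fuzzy measure
    weighted_score = np.sum(entropy_weights * memberships) / np.sum(entropy_weights)
    product_score = fuzzy_score * weighted_score  # drops if either measure drops
    error = abs(fuzzy_score - weighted_score)     # disagreement between the two measures
    return product_score, error
```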
Conference Paper
Full-text available
Direct classification of normalized and flattened 3D facial landmarks reconstructed from 2D images is proposed in this paper for recognizing eight types of facial expressions depicting the emotions of sadness, anger, contempt, disgust, fear, happiness, neutral and surprise. The first stage is the 3D projection of 2D facial landmarks. The pre-trained convolutional Face Alignment Network (FAN) proposed recently for 2D/3D face alignment is used for the purpose. The 3D Cartesian coordinates are translated to the spherical coordinate system prior to the classification stage. A variety of classifiers are tried and tested for classifying the 68 facial landmarks, for both coordinate systems. The benchmark CK+ video dataset is used for the experimentation; the last frame of each video, which depicts the peak of each emotion, is used as the input image. The experimental results indicate that direct classification of normalized and flattened 3D facial landmarks in the spherical coordinate system yields the highest accuracy for the support vector machine classifier with grid search for determining optimal parameters.
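A minimal sketch of the coordinate translation mentioned above: the 68 3D landmarks are converted from Cartesian (x, y, z) to spherical (r, theta, phi) coordinates and flattened into a single feature vector. The normalization and angle conventions of the paper are assumed rather than reproduced.

```python
import numpy as np

def landmarks_to_spherical_features(landmarks_xyz):
    """landmarks_xyz: (68, 3) array of 3D facial landmarks, e.g. from FAN."""
    x, y, z = landmarks_xyz[:, 0], landmarks_xyz[:, 1], landmarks_xyz[:, 2]
    r = np.sqrt(x ** 2 + y ** 2 + z ** 2)
    theta = np.arccos(np.clip(z / np.maximum(r, 1e-12), -1.0, 1.0))  # polar angle
    phi = np.arctan2(y, x)                                           # azimuth
    return np.column_stack([r, theta, phi]).ravel()                  # 68 x 3 = 204-dim feature vector
```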
Chapter
Tele-medicine systems run the risk of unauthorized access to medical records, and there is a greater possibility of unlawful sharing of sensitive patient information, including that of children, possibly showing their private parts. Aside from violating their right to privacy, such practices discourage patients from subjecting themselves to tele-medicine. The authors thus present an automatic identity concealment system for pictures, as designed in the GetBetter tele-medicine system developed under a WHO/TDR grant. Based on open-source face- and eye-detection algorithms, identity concealment is executed by blurring the eye region of a detected face using pixel shuffling. This method is shown to be effective not only in concealing the identity of the patient, but also in preserving the exact distribution of pixel values in the image. This is useful when subsequent image processing techniques are employed, such as when identifying the type of lesions based on images of the skin.
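A simple sketch of such concealment by pixel shuffling, assuming the eye bounding box comes from an eye detector (this is an illustration, not the GetBetter implementation): permuting the pixels inside the region scrambles it while leaving the image's overall pixel-value distribution unchanged.

```python
import numpy as np

def shuffle_eye_region(image, eye_box, rng=None):
    """image: H x W (x C) array; eye_box: (x, y, w, h) from an eye detector."""
    rng = rng if rng is not None else np.random.default_rng()
    x, y, w, h = eye_box
    roi = image[y:y + h, x:x + w].copy()
    # Flatten to one row per pixel so channels stay together when shuffling
    flat = roi.reshape(-1, roi.shape[-1]) if roi.ndim == 3 else roi.reshape(-1, 1)
    rng.shuffle(flat)                          # in-place permutation of the pixel rows
    image[y:y + h, x:x + w] = flat.reshape(roi.shape)
    return image
```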