EnD: Entangling and Disentangling deep representations for bias correction
Enzo Tartaglione
University of Turin,
Computer Science Dept.
enzo.tartaglione@unito.it
Carlo Alberto Barbano
University of Turin,
Computer Science Dept.
carlo.barbano@unito.it
Marco Grangetto
University of Turin,
Computer Science Dept.
marco.grangetto@unito.it
Abstract
Artificial neural networks achieve state-of-the-art performance in an ever-growing number of tasks and are nowadays used to solve an incredibly large variety of problems. However, issues such as the presence of biases in the training data call into question the generalization capability of these models. In this work we propose EnD, a regularization strategy whose aim is to prevent deep models from learning unwanted biases. In particular, we insert an "information bottleneck" at a chosen point of the deep neural network, where we disentangle the information about the bias while still letting the information useful for the training task propagate forward through the rest of the model. One big advantage of EnD is that it requires no additional training complexity (such as decoders or extra layers in the model), since it is a regularizer applied directly to the model being trained. Our experiments show that EnD effectively improves generalization on unbiased test sets, and that it can be applied to real-world scenarios, such as removing hidden biases in COVID-19 detection from radiographic images.
1. Introduction
In the last two decades, artificial neural network (ANN) models have received huge interest from the research community. Nowadays, complex and even ill-posed problems can be tackled, provided that one can train a deep enough ANN model on a large enough dataset. Furthermore, ANNs are becoming a powerful tool to help us make a variety of decisions: for example, AI is currently used for scouting and hiring people [17]. These ANNs are trained to produce a desired output from some inputs, yet we have no clear idea of how the information is actually processed inside them. Recently, AI trustworthiness has been recognized as a major prerequisite for people and societies to use and accept such systems [14, 33]. In April 2019, the High-Level Expert Group on AI of the European Commission defined the three main aspects of trustworthy AI [14]: it should be lawful, ethical and robust. Providing a warranty on this topic is currently a matter of study and discussion.

(This work has been accepted as a conference paper at the 2021 Conference on Computer Vision and Pattern Recognition, CVPR 2021.)
Focusing on the concept of robustness for AI, Attenberg et al. discussed the problem of finding the so-called "unknown unknowns" [3] in data. These unknown unknowns relate to cases where the deep model elaborates information in an unintended way, yet shows high confidence in its predictions. Such behavior affected many recent works proposing AI-based solutions for COVID detection from radiographic images. Unfortunately, the datasets available at the beginning of the pandemic were heavily biased. This often resulted in models predicting a COVID diagnosis with high confidence due to the presence of unwanted biases: for example, by detecting the presence of catheters or medical devices in positive patients, their age (at the beginning of the pandemic, most ill patients were elderly people), or even by recognizing the origin of the data itself (when negative cases were augmented by borrowing samples from other datasets) [2, 25, 26].
In this work we propose a regularization strategy which Entangles the deep features extracted from patterns belonging to the same target class and Disentangles the biased features: we name it EnD, and with it we wish to put an end to bias propagation in any deep model. We assume we know that the data might contain some bias (as in the COVID case, the origin of the data) but we do not know what it translates into (we have no prior knowledge on whether the bias is the presence of some color, a specific feature in the image, or anything else). EnD regularizes the output of some layer Γ within the deep model in order to create an "information bottleneck" where the regularizer:

- entangles the feature vectors extracted from data belonging to the same target class;
- disentangles the features extracted from data having the same "bias label".
Since the deep model is trained by minimizing both the loss and EnD, the biased features are discouraged from being extracted in favor of the unbiased ones. Compared to other de-biasing techniques, we have no training overhead: we do not train extra models to perform gradient inversion on the biased information, we do not involve GANs, nor do we de-bias the input data. EnD works directly on the target model and is minimized via standard back-propagation.
In general, directly tackling the minimization of mutual information is hard, given both its non-differentiability and the computational complexity involved. Nonetheless, previous works have already shown that adding further constraints to the learning problem can be effective [28] since, typically, trained ANN models are over-sized and allow a large number of solutions to the same learning task [27]. Our experiments show that EnD effectively favors the choice of unbiased features over the biased ones at training time, yielding competitive generalization capabilities compared to models trained with other de-biasing techniques.
The rest of the work is structured as follows. In Sec. 2 we review some works close to our problem. Then, in Sec. 3 we introduce EnD in detail, providing intuitions on its effect. Sec. 4 shows some empirical results and finally, in Sec. 5, the conclusions are drawn.
2. Related works
In this section we review state-of-the-art techniques designed to prevent models from learning biases. These techniques can be grouped into (but are not limited to) three main approaches: direct data de-biasing at the source, use of GANs/ensembling for data de-biasing, and learning the de-biasing directly within the trained model.
De-biasing from the data source. It is known that datasets are typically affected by biases. In their work, Torralba and Efros [30] showed how biases affect some of the most commonly used datasets, drawing considerations on the generalization performance and classification capability of the trained ANN models. Following a similar approach, Tommasi et al. [29] conducted experiments reporting differences between a number of datasets and verifying how final performance varies when applying different de-biasing strategies in order to balance the data. Working at the dataset level is in general a critical aspect, and it greatly helps in understanding the data and its structure [8]. The concept of removing bias by using data borrowed from different sources has been explored in a practical and empirical context by Gupta et al. [11]. In particular, they designed a de-biasing strategy to minimize the effects of imperfect execution and calibration errors by reducing the effect of unbalanced data, showing improvements in the generalization of the final model.
Adversarial and ensembling approaches. Having an explicit formulation for the bias contribution in the loss term is typically hard. One possible approach is to use additional models to learn the biases in the data and use them to condition the primary model so that it avoids them. Kim et al. use adversarial learning and gradient inversion to eliminate the information related to the biases in the model [16]. Another possibility is to use the gray-level co-occurrence matrix to extract unbiased features and to train the model on those, as proposed by Wang et al. with HEX [32]. Alvi et al. propose the BlindEye [1] technique, where they train a classifier on the extracted deep features to retrieve information about the biases: then, they modify the deep features so that the "bias classifier" is no longer able to retrieve bias-related information. Bahng et al. [4] develop an ensembling-based technique, called ReBias. It consists of solving a min-max problem whose target is to promote independence between the network prediction and all biased predictions. Identifying the "known unknowns" [3] and optimizing on those using a neural network ensemble is the approach proposed by Nam et al. with their LfF [21]. A similar approach is followed by Clark et al. in their LearnedMixin [6].
De-biasing within the deep model. Dataset de-biasing helps the learning process, as training is performed with no biases; however, with such an approach we typically have no direct control over the information we are removing from the dataset itself, or we incur an extremely high computational complexity, as when training GANs. A context in which, on the contrary, we can have direct access to these biases is presented by Hendricks et al. [13]. In that work it was possible to explicitly introduce a corrective loss term (coherent with the formulation introduced by Vinyals et al. [31]) with the aim of helping the ANN model to focus on the correct features. Similarly, Cadene et al. propose RUBi [5], where they use logit re-weighting to lower the impact of the bias in the learning process, and Sagawa et al., with Group-DRO [23], avoid bias overfitting by defining prior data sub-groups and controlling their generalization. EnD belongs to this class of approaches, since we directly regularize the trained model, with no additional parameters to be learned. In Sec. 3 we describe in detail the approach we take in order to EnD bias propagation in the trained model.
3. Entangling and Disentangling deep representations
In this section, after introducing the notation, we present EnD, our proposed regularization term, whose aim is to regularize the deep features in order to discourage the deep model from learning biases.
3.1. Preliminaries
In this section we first introduce the notation we use in the rest of this work and we provide some intuitions on how EnD works.

Figure 1: Model overview. The features for EnD are extracted at the output of Γ, after a normalization layer performing the operation in (3).

Figure 2: Toy example of EnD's effect. Each arrow represents the feature vector associated with a sample. Biases are represented by three different colors (green, orange and blue), while the target class is represented by the arrow's marker (triangle, square and circle). While in un-regularized training the deep model may strongly correlate with the bias (a), with EnD we aim at enforcing the choice of different features (b).

Let us assume we focus our attention on some layer Γ, at the output of which we are going to apply EnD. Let $T$ be the cardinality of the target classes of the learning problem and $B$ the cardinality of the bias classes in the dataset. We say the output of Γ is $y \in \mathbb{R}^{N_\Gamma \times M}$, where $M$ is the batch size and $N_\Gamma$ is the output size of Γ.
We also define:

- $M_{t,b}$ as the cardinality of the samples having the same target $t$ and the same bias $b$;
- $M_{t,\cdot}$ as the cardinality of the samples having the same target $t$, regardless of the bias;
- $M_{\cdot,b}$ as the cardinality of the samples having the same bias $b$, regardless of the target class;
- $y^{t,b}$ as the subset of the features $y$ belonging to the inputs having the same target class $t$ and showing the same bias $b$;
- $y^{t,\cdot}$ as the subset of the features $y$ belonging to the inputs having the same target class $t$, regardless of the bias;
- $y^{\cdot,b}$ as the subset of the features $y$ belonging to the inputs having the same bias $b$, regardless of the target class;
- $y_i$ as the $i$-th sample in the minibatch;
- $T(y_i)$, which extracts the target class of $y_i$;
- $B(y_i)$, which extracts the bias class of $y_i$.
In our work, EnD works alongside the loss minimization, discouraging the selection of biased deep features and encouraging the unbiased ones at training time. Hence, the overall objective function we aim to minimize is

$$J = \mathcal{L} + \mathcal{R}, \tag{1}$$

where $\mathcal{L}$ is the loss function for the trained task and $\mathcal{R}$ is our proposed EnD term, applied at the output of Γ. Fig. 1 shows the overall structure of the trained model.
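As an illustration of (1), a minimal PyTorch sketch (our own hypothetical code, not the authors' released implementation) can capture the output of Γ with a forward hook and add the EnD term to the task loss; `end_regularizer`, together with the choice of the hook point and of the default multipliers, is an assumption of this sketch and is defined after Eq. (10):

```python
import torch
import torch.nn.functional as F
import torchvision

# Hypothetical sketch of the objective in Eq. (1): J = L + R.
# A ResNet-18 stands in for the trained model; its `avgpool` output
# plays the role of Gamma (matching the setup of Sec. 4.2).
model = torchvision.models.resnet18(num_classes=2)
features = {}

def grab_features(module, inputs, output):
    features["gamma"] = output.flatten(start_dim=1)  # (M, N_Gamma)

model.avgpool.register_forward_hook(grab_features)

def training_step(x, target, bias, alpha=0.1, beta=0.1):
    logits = model(x)
    task_loss = F.cross_entropy(logits, target)             # L in Eq. (1)
    reg = end_regularizer(features["gamma"], target, bias,  # R in Eq. (5);
                          alpha, beta)                      # sketched below
    return task_loss + reg                                  # J = L + R
```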
Let us consider, as a toy example, some classification problem having three target classes, but also three different bias classes (Fig. 2 shows the feature vectors extracted at Γ). We encode the biases as three different colors (green, orange and blue), while the target class is represented by the arrow's marker (triangle, square and circle). Typically, training a deep model without taking biases into account produces feature representations like those shown in Fig. 2a: here, the loss on the target classes is minimized (three distinct groups are formed depending on the arrow marker), but it is driven by a heavy bias (the colors of the arrows). The purpose of EnD is to disentangle the representations belonging to the same bias class (color) and to entangle the representations with the same target class (the arrow's marker). Fig. 2b represents the effect of EnD on the deep representations: while the disentangling term un-groups the representations of the biased examples, i.e. makes the corresponding vectors almost orthogonal, the entangling one promotes correlations between samples having the same target.
3.2. Data correlations
Our main goal is to train our model to correctly classify the data into the $T$ possible classes, preventing the use of the bias features present in the data. Towards this end, we aim at inserting an information bottleneck: the information related to these biases should be used as little as possible for the target classification task.
We can build a similarity matrix $G \in \mathbb{R}^{M \times M}$:

$$G = \tilde{y}^\top \cdot \tilde{y}, \tag{2}$$

where $(\cdot)^\top$ indicates the transposed matrix and $\tilde{y}$ indicates a per-representation normalization

$$\tilde{y}_i = \frac{y_i}{\|y_i\|_2} \quad \forall i \in [1, M]. \tag{3}$$
Hence, every entry $g_{i,j}$ between two patterns $i, j$ in $G$ indicates their correlation:

$$g_{i,j} = \tilde{y}_i^\top \cdot \tilde{y}_j. \tag{4}$$

$G$ is a special case of Gramian matrix, as any $g_{i,j} \in [-1, +1]$ and indicates the difference in direction between any two $y_i$ and $y_j$. $G$ has some properties:

- it is a symmetric, positive semi-definite matrix;
- all the elements on the main diagonal are exactly 1 by construction;
- if the subset of outputs $\tilde{y}$ forms an orthonormal basis (or $G$ is full-rank), then $G = I$ by definition.
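In code, computing $G$ takes a couple of lines; a minimal sketch follows (our convention here stores features as rows, i.e. a $(M, N_\Gamma)$ tensor, the transpose of the notation above):

```python
import torch
import torch.nn.functional as F

def gramian(y: torch.Tensor) -> torch.Tensor:
    """Correlation matrix of Eqs. (2)-(4) for a batch of features.

    y: (M, N_Gamma) tensor, one (row) feature vector per sample.
    Returns G of shape (M, M), with every entry in [-1, +1] and
    ones on the main diagonal.
    """
    y_tilde = F.normalize(y, p=2, dim=1)  # per-sample L2 norm, Eq. (3)
    return y_tilde @ y_tilde.t()          # Eq. (2)
```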
Handling these relations, we build our regularization strategy, which consists of two terms:

- a disentangling term, whose task is to de-correlate as much as possible all the patterns belonging to the same bias class $b$;
- an entangling term, which attempts to force correlations between data from different bias classes but having the same target class $t$.
3.3. The EnD regularizer

The regularization $\mathcal{R}$ we propose blends the disentangling term $\mathcal{R}^\perp$ and the entangling term $\mathcal{R}^\parallel$ by setting

$$\mathcal{R} = \alpha \mathcal{R}^\perp + \beta \mathcal{R}^\parallel, \tag{5}$$

where $\alpha$ and $\beta$ are proper multipliers. In the following, we describe in detail the disentangling and the entangling terms.
3.3.1 Disentangling term

In order to disentangle biased representations, at training time we select the patterns belonging to a bias class $b$ and build the corresponding Gramian matrix

$$G^{\cdot,b} = (\tilde{y}^{\cdot,b})^\top \cdot \tilde{y}^{\cdot,b}. \tag{6}$$

Then, we enforce de-correlation between the features belonging to the same bias class: ideally, we would like to get $G^{\cdot,b} \rightarrow I \;\; \forall b$. To this end, we introduce the regularization term

$$\mathcal{R}^\perp = \frac{1}{B} \sum_{b=1}^{B} \frac{1}{(M_{\cdot,b})^2} \sum_{i,j} \left| g^{\cdot,b}_{i,j} \right| \tag{7}$$

which promotes the minimization of the off-diagonal elements of $G^{\cdot,b}$, for every bias class $b$.
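A sketch of (7), assuming per-sample integer bias labels and reusing the `gramian` helper above (hypothetical code, not the authors' release; note that the diagonal of each $G^{\cdot,b}$ is constant and only shifts the term by a constant):

```python
def disentangling_term(y: torch.Tensor, bias: torch.Tensor) -> torch.Tensor:
    """R_perp of Eq. (7): penalize correlations among samples that
    share the same bias class, pushing each G^{.,b} towards identity."""
    bias_classes = bias.unique()
    total = y.new_zeros(())
    for b in bias_classes:
        idx = (bias == b).nonzero(as_tuple=True)[0]
        G_b = gramian(y[idx])                          # Eq. (6)
        total = total + G_b.abs().sum() / (len(idx) ** 2)
    return total / len(bias_classes)
```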
3.3.2 Entangling term

While $\mathcal{R}^\perp$ discourages the model from learning biases, the model should also build strong correlations between patterns belonging to different bias classes but to the same target class $t$. With an approach orthogonal to the one used to derive (6), we compute the Gramian matrix for the patterns belonging to the same target class $t$:

$$G^{t,\cdot} = (\tilde{y}^{t,\cdot})^\top \cdot \tilde{y}^{t,\cdot}. \tag{8}$$
Let us now focus on the vector $g^{t,\cdot}_i$, extracted from the $i$-th column of $G^{t,\cdot}$: it expresses how the $i$-th pattern correlates with all the other patterns that will be grouped into the same $t$-th target class. As a first option, we might ask the model to correlate the $i$-th pattern with all the other patterns having the same target class $t$, deriving the pattern entangling rule as the opposite of the disentangling rule in (7):

$$\hat{\mathcal{R}}^\parallel = 1 - \frac{1}{T} \sum_{t=1}^{T} \frac{1}{(M_{t,\cdot})^2} \sum_{i,j} g^{t,\cdot}_{i,j}. \tag{9}$$
In this formulation we are asking all the $g^{t,\cdot}_{i,j} \rightarrow 1$, correlating the features as much as possible. However, (9) has a major shortcoming: it simply forces correlations according to the target class $t$ regardless of the bias information, which might thus be re-introduced. This is already done at a more general level by the loss function minimization as in (1): it is desirable to have a term which entangles features having the same target class, but belonging to different bias classes. Towards this end, we can re-write (9) maximizing the correlations between each single example $y_i$ and every other example $y_j$ such that $T(y_i) = T(y_j)$ but, at the same time, $B(y_i) \neq B(y_j)$. Hence, our entangling term reads

$$\mathcal{R}^\parallel = 1 - \frac{1}{M} \sum_{i=1}^{M} \frac{1}{\sum_{b \neq B(y_i)} M_{T(y_i),b}} \sum_{j} \bar{\delta}_{B(y_i),B(y_j)} \cdot g^{T(y_i),\cdot}_{i,j}, \tag{10}$$

where

$$\bar{\delta}(a, b) = \begin{cases} 0 & a = b \\ 1 & a \neq b. \end{cases} \tag{11}$$
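Putting (5), (7) and (10) together, a sketch of the full regularizer follows (again hypothetical code building on the helpers above; samples for which no valid $j$ exists are skipped, consistently with the applicability caveat discussed in the ablation study of Sec. 4):

```python
def entangling_term(y: torch.Tensor, target: torch.Tensor,
                    bias: torch.Tensor) -> torch.Tensor:
    """R_parallel of Eq. (10): reward correlations between samples
    with the same target class but different bias classes."""
    G = gramian(y)
    M = y.shape[0]
    total = y.new_zeros(())
    for i in range(M):
        # j such that T(y_i) = T(y_j) and B(y_i) != B(y_j); the mask
        # plays the role of delta-bar restricted to the target class.
        mask = (target == target[i]) & (bias != bias[i])
        if mask.any():
            total = total + G[i, mask].sum() / mask.sum()
    return 1.0 - total / M

def end_regularizer(y, target, bias, alpha, beta):
    # Eq. (5): R = alpha * R_perp + beta * R_parallel
    return (alpha * disentangling_term(y, bias)
            + beta * entangling_term(y, target, bias))
```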
4. Experiments
In the experiments presented in this section, we aim to remove different types of biases, such as color, age and gender, which can have a high impact on classification performance when recognizing, for example, attributes such as hair color and the presence of makeup in facial images. Additionally, we show how this technique can help in sensitive tasks such as the medical field, specifically in COVID-19 detection from CXR images. In all the results tables, the best results are denoted in boldface and the second best results are underlined. "Vanilla" denotes the baseline model performance for the learning problem, with no debiasing technique applied. All of EnD's results are averaged over three different runs.¹

Figure 3: Biased MNIST by Bahng et al. [4], where the background colors highly correlate with the digit classes.

Table 1: Biased MNIST performance on the unbiased test set.

Method           | ρ = 0.999    | ρ = 0.997    | ρ = 0.995    | ρ = 0.990
Vanilla          | 10.4         | 33.4         | 72.1         | 89.1
HEX [32]         | 10.8         | 16.6         | 19.7         | 24.7
LearnedMixin [6] | 12.1         | 50.2         | 78.2         | 88.3
RUBi [5]         | 13.7         | 43.0         | 90.4         | 93.6
ReBias [4]       | 22.7         | 64.2         | 76.0         | 88.1
EnD              | 52.30 ± 2.39 | 83.70 ± 1.03 | 93.92 ± 0.35 | 96.02 ± 0.08
4.1. Controlled experiments
In this section we describe the controlled experiments that we performed in order to assess the performance of EnD. Full control over the amount and type of bias allows us to correctly analyze EnD's behavior, excluding the noise and uncertainty of real-world data.
4.1.1 Biased MNIST
We test our method on a synthetic dataset, where we can control the bias in the training data. We use the Biased MNIST dataset proposed by Bahng et al. [4]. This dataset is constructed from the MNIST dataset [18] by injecting a color into the image background, as shown in Figure 3. Each digit is associated with one of ten pre-defined colors. To assign the color bias to an image of a given target class, the pre-defined color is selected with probability ρ, and any other color is chosen with probability (1 − ρ). To vary the level of difficulty of the dataset, the authors select ρ ∈ {0.990, 0.995, 0.997, 0.999}. Higher values of ρ correspond to a higher correlation between target class and bias class (color). Two testing datasets are constructed with the same criterion: biased, with ρ = 1.0, and unbiased, with ρ = 0.1. Given the low correlation between color and digit class in the unbiased test set, models must learn to classify shapes instead of colors in order to reach a high accuracy.

¹ The source code, written using PyTorch 1.7, will be made publicly available in the final version of the article. The hyperparameters used for the proposed experiments were optimized using a validation set or k-fold cross-validation, depending on the dataset.
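To make the protocol concrete, here is a sketch of the color-injection step (our own illustration, not Bahng et al.'s code; the ten-color palette and the background-mask threshold are placeholders):

```python
import random
import torch

# Placeholder palette: one pre-defined RGB color per digit class (0-9).
PALETTE = torch.tensor(
    [[255, 0, 0], [0, 255, 0], [0, 0, 255], [255, 255, 0], [255, 0, 255],
     [0, 255, 255], [255, 128, 0], [128, 0, 255], [0, 128, 128],
     [128, 128, 0]], dtype=torch.float32) / 255.0

def colorize(img: torch.Tensor, digit: int, rho: float) -> torch.Tensor:
    """Inject a background color into a (1, 28, 28) MNIST image in [0, 1].

    With probability rho the digit's pre-defined color is used;
    otherwise one of the other nine colors is drawn at random.
    """
    if random.random() < rho:
        color = PALETTE[digit]
    else:
        color = PALETTE[random.choice([c for c in range(10) if c != digit])]
    rgb = img.repeat(3, 1, 1)             # grayscale -> 3 channels
    background = (img < 0.1).float()      # assumed background-pixel mask
    return rgb * (1 - background) + color.view(3, 1, 1) * background
```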
Setup. We use the network architecture proposed by Bahng et al. [4], consisting of four convolutional layers with 7×7 kernels. The EnD regularization term is applied to the average pooling layer, before the fully connected classifier of the network.
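A hypothetical re-implementation of this setup follows (the channel widths and normalization layers are our own assumptions, as they are not specified above):

```python
import torch.nn as nn

class BiasedMNISTNet(nn.Module):
    """Four 7x7 convolutional layers, as in Bahng et al. [4]; EnD is
    hooked on the average-pooled features, before the classifier."""

    def __init__(self, num_classes: int = 10, width: int = 16):
        super().__init__()
        blocks, in_ch = [], 3
        for out_ch in (width, width * 2, width * 4, width * 8):
            blocks += [nn.Conv2d(in_ch, out_ch, kernel_size=7, padding=3),
                       nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True)]
            in_ch = out_ch
        self.features = nn.Sequential(*blocks)
        self.avgpool = nn.AdaptiveAvgPool2d(1)  # Gamma: EnD hook point
        self.fc = nn.Linear(in_ch, num_classes)

    def forward(self, x):
        z = self.avgpool(self.features(x)).flatten(1)
        return self.fc(z)
```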
Results. The results are shown in Table 1. EnD's results are averaged across three different runs for each value of ρ. For all values of ρ we report the accuracy obtained by EnD on the unbiased evaluation set, compared with other debiasing algorithms.

EnD successfully mitigates bias propagation. The improvement obtained with EnD with respect to the baseline model is noticeable, especially at the higher levels of difficulty. We observe an increase in accuracy across all values of ρ. Notably, for ρ = 0.999 the vanilla model reaches 10.4% accuracy, meaning that the background color is used as the only cue for classifying the digits, whereas employing EnD yields an accuracy of 52.30%. Figure 4 shows the effect of EnD, using Grad-CAM [24] to highlight the regions of the input image that are important for the model prediction. We observe that the vanilla model (Figure 4a) focuses on the background, while the EnD-regularized model (Figure 4b) correctly learns to focus on the digit shape.
Comparison with other techniques. We observe that EnD yields the highest results among all of the compared debiasing algorithms. The gap is especially large in the most difficult settings, ρ ∈ {0.999, 0.997}, where many algorithms are unable to generalize to the unbiased set, especially HEX [32] and LearnedMixin [6]. Some of the compared algorithms even show a collapse in accuracy compared to the vanilla baseline in certain cases (HEX for most values of ρ, LearnedMixin and ReBias for ρ = 0.990).
Ablation study. We also perform an ablation study of EnD to analyze how each of EnD's terms affects the performance of the trained model. For a fixed ρ = 0.997, we evaluate only the contribution of the disentangling term $\mathcal{R}^\perp$ and disable the entangling term $\mathcal{R}^\parallel$ by setting β = 0. We then perform the opposite evaluation by setting α = 0, to only take into account the entangling term. The results are shown in Table 2. We observe that both regularization terms contribute to boosting the model's generalization capability. As expected, the best results are achieved when both of them are jointly applied. The entangling term yields a higher increase in performance compared to the disentangling one; however, it is in general not always applicable, for example when, given some $i$-th sample $y_i$,

$$\nexists j \mid T(y_i) = T(y_j) \wedge B(y_i) \neq B(y_j).$$
Figure 4: Grad-CAM [24] on Colored MNIST: vanilla model (a) and EnD-regularized model (b).
Figure 5: EnD learning curves on Colored MNIST for ρ = 0.995. Biased test accuracy (a), unbiased test accuracy (b), $\mathcal{L}$ (train cross-entropy) value (c) and $\mathcal{R}$ value on the training set (d); the kick-in region is highlighted in each plot.
Table 2: Ablation study of EnD on the Biased MNIST dataset, ρ = 0.997.

Setting            | α      | β      | Unbiased accuracy
Vanilla            | 0      | 0      | 33.4
Disentangling only | [0; 1] | 0      | 45.67 ± 0.67
Entangling only    | 0      | [0; 1] | 75.36 ± 0.94
EnD                | [0; 1] | [0; 1] | 83.70 ± 1.03
The disentangling term provides a smaller benefit in this case but, on the other hand, it can always be applied. We find that the ideal case for EnD is when both terms can be used in the learning process, leading to better generalization capabilities. Furthermore, we observe a similar pattern in the learning process when employing the full EnD regularization for different values of ρ. Figure 5 shows the learning curves for ρ = 0.995. We notice how models tend to quickly learn the color bias in the first few epochs, as the accuracy on the biased test set is close to 100% (Figure 5a). However, once the value of the loss (in this case, the cross-entropy loss, Figure 5c) falls below a certain threshold, the contribution $\mathcal{R}$ of the EnD term becomes predominant (Figure 5d). In this phase, which we call the kick-in region, the optimization process begins to rapidly minimize $\mathcal{R}$, stopping the model from relying on the bias-related features. This can be observed in the rapid increase of the accuracy on the unbiased test set (Figure 5b), whereas the biased accuracy momentarily drops as the models shift their focus from the background color to the digit shape.
4.2. Real world datasets
After benchmarking EnD in a controlled scenario on synthetic data, we move to real-world datasets, where biases might be subtle and harder to handle. In this section we aim at removing age and gender biases in different datasets. We also apply EnD to a computer-aided diagnosis task, where hidden biases might lead to sub-optimal generalization of the model.

Setup. For CelebA and IMDB Face, we use the ResNet-18 model proposed by He et al. [12]. The network is pre-trained on ImageNet [9], except for the last fully connected layer. The EnD regularization is applied to the average pooling layer, before the fully connected classifier. For CORDA,² we use a DenseNet-121 [15] encoder pre-trained on publicly available CXR data, followed by a two-layer fully connected classifier.
4.2.1 CelebA
CelebA [19] is a dataset for face-recognition tasks, providing 40 attributes for every image. Following Nam et al. [21], we select BlondHair and HeavyMakeup as target attributes $t$ and Male as the bias attribute $b$. This choice is dictated by the fact that there is a high correlation between the target and the bias attributes (i.e. most women have blond hair or wear heavy makeup in this dataset). The dataset contains a total of 202,599 images; following the official train-validation split, we obtain 162,770 images for training and 19,867 images for testing our models. Nam et al. [21] build two types of testing datasets: unbiased, by selecting the same number of samples for every possible value of the pair $(t, b)$, and bias-conflicting, by removing from the unbiased set all of the samples where $b$ and $t$ are equal.

² This dataset's name and the involved institutions are kept anonymous (just) in the reviewing process, since the dataset has not been publicly released yet.

Table 3: Performance on CelebA.

Learn HairColor   | Unbiased     | Bias-conflicting
Vanilla           | 70.25 ± 0.35 | 52.52 ± 0.19
Group DRO [23]    | 85.43 ± 0.53 | 83.40 ± 0.67
LfF [21]          | 84.24 ± 0.37 | 81.24 ± 1.38
EnD               | 91.21 ± 0.22 | 87.45 ± 1.06

Learn HeavyMakeup | Unbiased     | Bias-conflicting
Vanilla           | 62.00 ± 0.02 | 33.75 ± 0.28
Group DRO [23]    | 64.88 ± 0.42 | 50.24 ± 0.68
LfF [21]          | 66.20 ± 1.21 | 45.48 ± 4.33
EnD               | 75.93 ± 1.31 | 53.70 ± 5.24
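For illustration, a sketch of how the two evaluation sets described above can be derived from binary target/bias annotations (our own reading of the protocol in [21], with hypothetical helper names):

```python
import numpy as np

def build_eval_sets(targets: np.ndarray, biases: np.ndarray, seed: int = 0):
    """Return (unbiased, bias_conflicting) index arrays over a test split.

    targets, biases: binary (0/1) attribute arrays, e.g. BlondHair / Male.
    """
    rng = np.random.default_rng(seed)
    groups = [np.flatnonzero((targets == t) & (biases == b))
              for t in (0, 1) for b in (0, 1)]
    n = min(len(g) for g in groups)
    # Unbiased: the same number of samples for every (t, b) pair.
    unbiased = np.concatenate(
        [rng.choice(g, size=n, replace=False) for g in groups])
    # Bias-conflicting: drop samples where target and bias agree.
    conflicting = unbiased[targets[unbiased] != biases[unbiased]]
    return unbiased, conflicting
```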
Results. Following Nam et al. [21], the accuracy is computed as the average accuracy over all the $(t, b)$ pairs. Table 3 shows the results obtained on the CelebA dataset. We observe how the vanilla model heavily relies on the bias attribute, scoring a low accuracy especially on the bias-conflicting sets. EnD, on the other hand, outperforms the baseline in both tasks. We report reference results [21] of other debiasing algorithms, specifically Group DRO [23] and LfF [21], for comparison with EnD. The results we obtain are significantly higher across most of the evaluation sets, and comparable with Group DRO and LfF on the bias-conflicting set when the target attribute is HeavyMakeup.
4.2.2 IMDB Face
The IMDB Face dataset [22] contains 460,723 face images annotated with age and gender information. To filter out the misannotated labels of this dataset [22, 30], Kim et al. [16] use a model trained on the Adience benchmark [10], keeping the images where the prediction matches the provided label. Following Kim et al.'s proposed data split, 20% of IMDB is used as the test set, containing samples with age 0-29 or 40+. The remaining data is then split into two extreme-bias subsets: EB1 contains women in the age range 0-29 and men aged 40+, while EB2 contains men aged 0-29 and women aged 40+.
Figure 6: IMDB training splits: EB1 (a) and EB2 (b).
Table 4: Performance on IMDB Face. When gender is learned, age is the bias; when age is learned, gender is the bias.

Learn Gender    | Trained on EB1: EB2 | Test         | Trained on EB2: EB1 | Test
Vanilla         | 59.86               | 84.42        | 57.84               | 69.75
BlindEye [1]    | 63.74               | 85.56        | 57.33               | 69.90
Kim et al. [16] | 68.00               | 86.66        | 64.18               | 74.50
EnD             | 65.49 ± 0.81        | 87.15 ± 0.31 | 69.40 ± 2.01        | 78.19 ± 1.18

Learn Age       | Trained on EB1: EB2 | Test         | Trained on EB2: EB1 | Test
Vanilla         | 54.30               | 77.17        | 48.91               | 61.97
BlindEye [1]    | 66.80               | 75.13        | 64.16               | 62.40
Kim et al. [16] | 65.27               | 77.43        | 62.18               | 63.04
EnD             | 76.04 ± 0.25        | 80.15 ± 0.96 | 74.25 ± 2.26        | 78.80 ± 1.48
Thus, when learning to predict the gender attribute, the bias is given by the age, and vice versa. An example of the EB1 and EB2 training sets is shown in Figure 6.
Results. Table 4 shows the results obtained on the IMDB Face dataset. We performed two main experiments: gender and age prediction. Besides the performance evaluation on the test set, when training on EB1 we also tested the model's performance on EB2, and vice versa. This allows us to better evaluate the influence of the bias features on the model prediction. We notice how the baseline model is heavily biased towards age when predicting gender, and towards gender when predicting age. This can be observed in the performance achieved on the EB2 and EB1 sets, both for gender and age prediction. When employing our regularization term, we observe an increase across all of the obtained results: in particular, when training on EB2 for age prediction, we notice an increase from 48.91% to 74.25% on the EB1 set. We also report reference results of other debiasing algorithms, specifically BlindEye [1] and the adversarial approach proposed by Kim et al. [16]. In general, EnD obtains the best results among all the debiasing algorithms we compared against.
Table 5: Performance on CORDA, sorted by collecting institution.

Test on CORDA-CDSS | TPR          | TNR          | BA
Vanilla            | 69.99 ± 3.27 | 59.26 ± 2.09 | 64.63 ± 2.50
EnD                | 68.16 ± 2.08 | 76.30 ± 2.10 | 72.22 ± 0.01

Test on CORDA-SLG  | TPR          | TNR          | BA
Vanilla            | 52.14 ± 3.20 | 87.63 ± 4.37 | 69.88 ± 2.95
EnD                | 68.37 ± 6.04 | 84.51 ± 3.04 | 75.94 ± 1.62
4.2.3 COVID CXR dataset
CORDA is a dataset comprising 898 chest X-ray (CXR) images collected during March and April 2020 by the radiology units at Città della Salute e della Scienza and San Luigi Gonzaga, in Italy. Virus testing (nasopharyngeal swab) was used to determine the presence or absence of COVID-19 infection. The dataset can be split by collecting institution, resulting in CORDA-CDSS, with 297 images of COVID-19 positive patients and 150 of negative ones, and CORDA-SLG, with 129 positives and 322 negatives. Recent literature [7, 20, 26] shows that merging CXRs coming from different sources poses bias issues, since differences in acquisition techniques given by the scan machines, or in the composition of the population sample, might be used by the deep model to distinguish the provenance of the data itself, even when pre-processing techniques are employed. For CORDA, we notice that the data coming from Città della Salute e della Scienza contain a majority of positive samples, while the data coming from San Luigi Gonzaga have a majority of negative samples. Hence, if distinguishing features are embedded in the scans, the networks might learn to discriminate the source of the data, instead of actually classifying between COVID positives and negatives. To build the test sets, we use 30% of CORDA-CDSS and 30% of CORDA-SLG. The remaining data are then merged and used as the training set. Testing on the two separate sets allows us to assess whether the predictions of the models are biased towards the origin of the data.
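The metrics reported in Table 5 are the standard binary rates, with balanced accuracy defined as BA = (TPR + TNR) / 2; for reference, a minimal sketch of their computation:

```python
import torch

def binary_rates(pred: torch.Tensor, label: torch.Tensor):
    """True positive rate, true negative rate and balanced accuracy
    for binary predictions and labels (1 = COVID positive)."""
    pred, label = pred.bool(), label.bool()
    tpr = (pred & label).float().sum() / label.float().sum()
    tnr = (~pred & ~label).float().sum() / (~label).float().sum()
    return tpr, tnr, (tpr + tnr) / 2
```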
Results. The results obtained on CORDA-CDSS and CORDA-SLG are presented in Table 5. We observe how the vanilla model is in fact biased towards the source of the data. On CORDA-CDSS (which contains mostly positive samples) the vanilla model shows a higher true positive rate (TPR) and a lower true negative rate (TNR). On the other hand, on CORDA-SLG (which contains mostly negative samples) we notice a lower TPR compared to the considerably higher TNR. Employing EnD helps in improving the results in this case too. While maintaining a similar TPR on CORDA-CDSS and TNR on CORDA-SLG, we obtain an improvement of the TNR from 59.26% to 76.30% on CORDA-CDSS and of the TPR from 52.14% to 68.37% on CORDA-SLG. This also results in an increased balanced accuracy (BA) on both test sets.
Figure 7: Grad-CAM on CORDA: vanilla model (a) and EnD-regularized model (b).

As a further insight, we observe in Figure 7a that the vanilla model focuses on irrelevant regions outside the lung area, while the EnD-regularized model mainly focuses on the lower lobes of the lungs (Figure 7b).
5. Conclusion
In this work we aimed at EnD-ing the selection of biased features in deep models trained on biased datasets. Towards this end, we designed a regularizer whose task is both to disentangle deep feature representations sharing the same bias and to entangle deep features with different biases but belonging to the same target class. Differently from other de-biasing techniques, we do not introduce any additional parameters to be learned and we do not modify the input data: the model itself is naturally driven into choosing deep features which are unbiased, without introducing additional priors on the data. Our experiments show the effectiveness of EnD when compared to other state-of-the-art techniques, excelling in the cases of heavily-biased data (like ρ = 0.999 for Biased MNIST, or IMDB). As an application case, we also tested the effect of EnD on COVID diagnosis from CXR images, where the bias is given by the data source and is not straightforward to detect. In this case too we observed an overall improvement of the performance on the test set, showing that our technique may be employed to build more reliable models even in sensitive tasks.
References
[1] Mohsan Alvi, Andrew Zisserman, and Christoffer Nellåker. Turning a blind eye: Explicit removal of biases and variation from deep neural network embeddings. In Proceedings of the European Conference on Computer Vision (ECCV), 2018.
[2] Ioannis D. Apostolopoulos and Tzani A. Mpesiana. Covid-19: automatic detection from x-ray images utilizing transfer learning with convolutional neural networks. Physical and Engineering Sciences in Medicine, 2020.
[3] Joshua Attenberg, Panos Ipeirotis, and Foster Provost. Beat the machine: Challenging humans to find a predictive model's "unknown unknowns". Journal of Data and Information Quality (JDIQ), 6(1):1-17, 2015.
[4] Hyojin Bahng, Sanghyuk Chun, Sangdoo Yun, Jaegul Choo, and Seong Joon Oh. Learning de-biased representations with biased representations. In International Conference on Machine Learning (ICML), 2020.
[5] Remi Cadene, Corentin Dancette, Matthieu Cord, Devi Parikh, et al. RUBi: Reducing unimodal biases for visual question answering. In Advances in Neural Information Processing Systems, pages 841-852, 2019.
[6] Christopher Clark, Mark Yatskar, and Luke Zettlemoyer. Don't take the easy way out: Ensemble based methods for avoiding known dataset biases. In Proceedings of EMNLP-IJCNLP 2019, pages 4067-4080. Association for Computational Linguistics, 2019.
[7] Beatriz Garcia Santa Cruz, J. Sölter, M. Bossa, and A. Husch. On the composition and limitations of publicly available covid-19 x-ray imaging datasets. ArXiv, abs/2008.11572, 2020.
[8] Ekin D. Cubuk, Barret Zoph, Dandelion Mane, Vijay Vasudevan, and Quoc V. Le. AutoAugment: Learning augmentation strategies from data. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
[9] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In CVPR, 2009.
[10] Eran Eidinger, Roee Enbar, and Tal Hassner. Age and gender estimation of unfiltered faces. IEEE Transactions on Information Forensics and Security, 9(12):2170-2179, 2014.
[11] Abhinav Gupta, Adithyavairavan Murali, Dhiraj Prakashchand Gandhi, and Lerrel Pinto. Robot learning in homes: Improving generalization and reducing dataset bias. In Advances in Neural Information Processing Systems, pages 9094-9104, 2018.
[12] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770-778, 2016.
[13] Lisa Anne Hendricks, Kaylee Burns, Kate Saenko, Trevor Darrell, and Anna Rohrbach. Women also snowboard: Overcoming bias in captioning models. In European Conference on Computer Vision, pages 793-811. Springer, 2018.
[14] European Commission (AI HLEG). Ethics guidelines for trustworthy AI. High-Level Expert Group on Artificial Intelligence, 2019.
[15] Gao Huang, Zhuang Liu, Laurens van der Maaten, and Kilian Q. Weinberger. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4700-4708, 2017.
[16] Byungju Kim, Hyunwoo Kim, Kyungsu Kim, Sungjin Kim, and Junmo Kim. Learning not to learn: Training deep neural networks with biased data. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
[17] Sven Laumer, Christian Maier, and Andreas Eckhardt. The impact of business process management and applicant tracking systems on recruiting process performance: an empirical study. Journal of Business Economics, 85(4):421-453, 2015.
[18] Yann LeCun, Corinna Cortes, and CJ Burges. MNIST handwritten digit database. ATT Labs [Online]. Available: http://yann.lecun.com/exdb/mnist, 2, 2010.
[19] Ziwei Liu, Ping Luo, Xiaogang Wang, and Xiaoou Tang. Deep learning face attributes in the wild. In Proceedings of the International Conference on Computer Vision (ICCV), December 2015.
[20] Gianluca Maguolo and Loris Nanni. A critic evaluation of methods for covid-19 automatic detection from x-ray images. arXiv preprint arXiv:2004.12823, 2020.
[21] Junhyun Nam, Hyuntak Cha, Sungsoo Ahn, Jaeho Lee, and Jinwoo Shin. Learning from failure: Training debiased classifier from biased classifier. In Advances in Neural Information Processing Systems, 2020.
[22] Rasmus Rothe, Radu Timofte, and Luc Van Gool. Deep expectation of real and apparent age from a single image without facial landmarks. International Journal of Computer Vision, 126(2-4):144-157, 2018.
[23] Shiori Sagawa, Pang Wei Koh, Tatsunori B. Hashimoto, and Percy Liang. Distributionally robust neural networks. In International Conference on Learning Representations, 2019.
[24] Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision, pages 618-626, 2017.
[25] Prabira Kumar Sethy and Santi Kumari Behera. Detection of coronavirus disease (covid-19) based on deep features. Preprints, 2020030300:2020, 2020.
[26] Enzo Tartaglione, Carlo Alberto Barbano, Claudio Berzovini, Marco Calandri, and Marco Grangetto. Unveiling covid-19 from chest x-ray with deep learning: a hurdles race with small data. Int. J. Environ. Res. Public Health, 17(18):6933, 2020.
[27] Enzo Tartaglione and Marco Grangetto. Take a ramble into solution spaces for classification problems in neural networks. In International Conference on Image Analysis and Processing, pages 345-355. Springer, 2019.
[28] Enzo Tartaglione and Marco Grangetto. A non-discriminatory approach to ethical deep learning. In 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), pages 943-950, 2020.
[29] Tatiana Tommasi, Novi Patricia, Barbara Caputo, and Tinne Tuytelaars. A deeper look at dataset bias. In Domain Adaptation in Computer Vision Applications, pages 37-55. Springer, 2017.
[30] Antonio Torralba, Alexei A. Efros, et al. Unbiased look at dataset bias. In CVPR, 2011.
[31] Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan. Show and tell: A neural image caption generator. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3156-3164, 2015.
[32] Haohan Wang, Zexue He, Zachary L. Lipton, and Eric P. Xing. Learning robust representations by projecting superficial statistics out. In International Conference on Learning Representations, 2019.
[33] Baobao Zhang and Allan Dafoe. Artificial intelligence: American attitudes and trends. Available at SSRN 3312874, 2019.