A new approach for the detection of abnormal heart sound signals using TQWT, VMD and neural networks

March 2021
Artificial Intelligence Review 54(1)

DOI:10.1007/s10462-020-09875-w

Authors:

Wei Zeng

Longyan University

Chengzhi Yuan

University of Rhode Island

Phonocardiogram (PCG) plays an important role in evaluating many cardiac abnormalities, such as the valvular heart disease, congestive heart failure and anatomical defects of the heart. However, effective cardiac auscultation requires trained physicians whose work is tough, laborious and subjective. The objective of this study is to develop an automatic classification method for anomaly (normal vs. abnormal) detection of PCG recordings without any segmentation of heart sound signals. Hybrid signal processing and artificial intelligence tools, including tunable Q-factor wavelet transform (TQWT), variational mode decomposition (VMD), phase space reconstruction (PSR) and neural networks, are utilized to extract representative features in order to model, identify and detect abnormal patterns in the dynamics of PCG system caused by heart disease. First, heart sound signal is decomposed into a set of frequency subbands with a number of decomposition levels by using the TQWT method. Second, VMD is employed to decompose the subband of the heart sound signal into different intrinsic modes, in which the first four intrinsic modes contain the majority of the heart sound signal’s energy and are considered to be the predominant intrinsic modes. They are selected to construct the reference variable for analysis. Third, phase space of the reference variable is reconstructed, in which the properties associated with the nonlinear PCG system dynamics are preserved. Three-dimensional PSR together with Euclidean distance has been utilized to derive features, which demonstrate significant difference in PCG system dynamics between normal and abnormal heart sound signals. Finally, PhysioNet/CinC Challenge heart sound database is used for evaluation and the synthetic minority over-sampling technique method is applied to balance the datasets. 
By using the 10-fold cross-validation style, experimental results demonstrate that the proposed features with dynamical neural networks based classifier yield classification performance with sensitivity, specificity, overall score and accuracy values of 97.73%, 98.05%, 97.89%, and 97.89%, respectively. The results verify the effectiveness of the proposed method which can serve as a potential candidate for the automatic anomaly detection in the clinical application.

WeiZeng1 · JianYuan1· ChengzhiYuan2· QinghuiWang1· FenglinLiu1· YingWang1

Keywords: Heart sound · Phonocardiogram (PCG) · Tunable Q-factor wavelet transform (TQWT) · Variational mode decomposition (VMD) · Phase space reconstruction (PSR) · System dynamics · Synthetic minority over-sampling technique (SMOTE) · Neural networks

Extended author information available on the last page of the article

W.Zeng et al.

1 3

1 Introduction

Cardiac auscultation is one of the most popular non-invasive and cost-effective procedures for the early diagnosis of various cardiac abnormalities, such as valvular heart disease, congestive heart failure and anatomical defects of the heart (Alam et al. 2010). However, effective cardiac auscultation requires trained physicians, who are not accessible in remote regions and low-income countries. In addition, the physicians' work is tough, tedious and subjective. Therefore, machine learning based automated heart sound classification systems can have a significant impact on the early diagnosis of cardiac diseases (Humayun et al. 2020).

Automated classification of heart sound signals (i.e., the phonocardiogram, PCG) has attracted increasing attention and has been extensively studied in the past few decades. It can be generally divided into two areas: (1) segmentation of the heart sound signals; and (2) detection of heart sound recordings as pathologic or physiologic (Humayun et al. 2020). For the former, several PCG signal segmentation methods have been proposed based on digital filters (Varghees et al. 2014), the Fourier transform (FT), short-time Fourier transform (STFT) and time-frequency representation (Boutana et al. 2011), the Hilbert transform (HT) (Sun et al. 2014), homomorphic filtering (Hassani et al. 2014), the empirical wavelet transform (EWT) (Varghees and Ramachandran 2017), the wavelet packet transform (WPT) (Safara et al. 2013), empirical mode decomposition (EMD) (Cheema and Singh 2019), ensemble EMD (EEMD) (Papadaniil and Hadjileontiadis 2013), variational mode decomposition (VMD) (Sujadevi et al. 2019), Mel frequency cepstral coefficients (MFCC) (Nogueira et al. 2019), and higher order statistics (Xie et al. 2019). Springer et al. (2015) proposed a logistic regression based hidden semi-Markov model (HSMM) for the segmentation of the first (S1) and second (S2) heart sounds within noisy, real-world PCG recordings. Varghees and Ramachandran (2017) proposed an EWT based algorithm for PCG signal decomposition. Messner et al. (2018) proposed an event detection approach with deep recurrent neural networks (DRNNs) for heart sound segmentation, i.e. the detection of the state sequence of the S1 and S2 heart sounds. In contrast, Deng and Han (2016) proposed a new framework for heart sound classification without any segmentation. They extracted autocorrelation features from the sub-band envelopes by computing the sub-band coefficients of the heart sound signal with the discrete wavelet decomposition (DWT). The autocorrelation features were then used to obtain a unified feature representation with diffusion maps.

For the detection of heart sound recordings as pathologic or physiologic, researchers have utilized various machine learning algorithms, such as the support vector machine (SVM) (Li et al. 2019a), neural network (NN) (Beritelli et al. 2018), hidden semi-Markov model (HSMM) (Noman et al. 2020), k-nearest neighbor (KNN) (Singh and Majumder 2019), decision tree (Langley and Murray 2017), and convolutional neural network (CNN) (Xiao et al. 2019). Zhang et al. (2017) proposed a method based on a scaled spectrogram and partial least squares regression (PLSR) for the extraction of effective features from PCG signals. These features were then fed to an SVM for the classification of PCG signals. Whitaker et al. (2017) combined sparse coding features with time-domain features to classify PCG signals using an SVM classifier. Hamidi et al. (2018) utilized curve fitting and Mel frequency cepstrum coefficients (MFCC) fused with the fractal dimension to extract features from heart sound signals. The nearest neighbor classifier with Euclidean distance was then used for the classification task. Zhang et al. (2019) proposed a method for abnormal heart sound detection using temporal quasi-periodic features and long short-term memory (LSTM) without segmentation. Bozkurt et al. (2018) fed MFCC and Mel-spectrogram features into a convolutional neural network (CNN) for PCG signal classification.

The above-mentioned works have achieved excellent performance by using different signal processing and machine learning methods. Nonetheless, since abnormal heart sound detection is based upon PCG signals, the choice of signal processing techniques and of feature extraction and selection methods becomes critical and challenging in the design of specialized computerized systems. Due to the discrete-time, oscillatory and nonlinear characteristics of heart sound signals (Li et al. 2019b), numerous methods combining time-domain, frequency-domain and nonlinear analysis have been developed to handle the classification problem. For time-frequency-domain analysis, the tunable Q-factor wavelet transform (TQWT) has recently become popular in biomedical signal processing as a flexible discrete wavelet transform that is particularly suited to analysing oscillatory signals (Selesnick 2011; Nishad et al. 2018; Patidar et al. 2017; Hassan et al. 2016). The TQWT is capable of adjusting its Q-factor and has thus emerged as a powerful tool for oscillatory signal analysis. By changing the Q-factor and redundancy, the oscillatory behavior of the wavelet basis can better reflect the oscillatory behavior of the signal (Selesnick 2011). A sparse signal representation can then be obtained, which in turn improves the performance of sparsity-based signal processing for applications in denoising, classification and signal separation. Patidar and Pachori (2014) proposed a constrained TQWT based segmentation of cardiac sound signals into heart beat cycles. The features obtained from heart beat cycles of separately reconstructed heart sounds and murmurs can better represent the various types of cardiac sound signals than those obtained from cycles containing both. Therefore, heart sounds and murmurs were separated using the constrained TQWT. Jain and Tiwari (2018) presented a segmentation method for the PCG signal in which the parameters of the TQWT were tuned to vary the frequency range of the approximation level such that its kurtosis was maximized.

The intrinsic characteristics of the heart sound signal can also be revealed from the nonlinear perspective, which provides important information for characterizing the signal. Nonlinear parameters, extracted through different types of entropies (Cheema and Singh 2019), multifractal analysis (Gavrovska et al. 2016), and recurrence quantification analysis (RQA) (Liang et al. 2015), have been employed for the automatic detection of abnormal heart sound signals. Considering that the heart sound signal is highly random, nonlinear and nonstationary in nature (Li et al. 2019b), self-adaptive signal processing methods, such as empirical mode decomposition (EMD) (Huang et al. 1998; Huang and Kunoth 2013) and local mean decomposition (LMD) (Park et al. 2011), have been employed to extract effective and predominant features from heart sound signals (Cheema and Singh 2019; Salman et al. 2016; Liu et al. 2010). EMD decomposes a multi-component signal into a number of individual monocomponents, that is, intrinsic mode functions and a residual signal, while LMD decomposes any complicated signal into a series of product functions. However, these methods have some drawbacks: the EMD method suffers from over-envelope, mode mixing, end effects and unexplainable negative frequencies caused by the Hilbert transformation (Chen et al. 2011), while the LMD method suffers from distorted components, mode mixing and time-consuming decomposition (Li et al. 2015). Recently, variational mode decomposition (VMD) was proposed by Dragomiretskiy and Zosso (2014) as an alternative to EMD and LMD for the separation of composite real-valued time series into their respective modes. VMD has been extensively used in biomedical signal processing, speech signal processing and seismic signal processing (Mert 2016; Lal et al. 2018; Xue et al. 2016). It has been reported that VMD is theoretically better founded than the sequential iterative sifting of EMD. VMD is based on a clear variational model and the resulting minimization steps perform concurrent mode extraction in an intuitive way (Wang et al. 2017). Dragomiretskiy and Zosso (2014) also pointed out that VMD has some advantages over EMD in tone separation and is less sensitive to noise and sampling. VMD captures the relevant center frequencies, which ensures good frequency separation and is efficient for identifying various discontinuities present in a non-stationary signal (Dragomiretskiy and Zosso 2014; Mert 2016). Sujadevi et al. (2019) used a group sparsity algorithm to denoise the measured PCG signals by exploiting the group sparse (GS) property of PCG signals. The denoised GS-PCG signals were then decomposed into modes with specific spectral characteristics using the VMD algorithm. The appropriate mode for further processing was selected based on the mode central frequencies and mode energy. This was followed by extraction of the Hilbert envelope and thresholding of the selected mode to segment the S1 and S2 heart sounds. Mishra et al. (2018) employed the VMD technique for the separation of heart sound (HS) and lung sound (LS) signals, thereby minimizing the HS interference in LS signals. Mishra et al. (2020) used VMD to generate a set of amplitude and frequency modulated narrow band-limited components (NBCs). The VMD-based decomposition of PCG signals in terms of NBCs was used for quantifying the nonlinear and non-stationary nature of PCG signals.

In the present work we have developed a novel technique to compute representative features based on the TQWT and VMD algorithms applied to heart sound signals. We hypothesize that these features reflect the abnormal alterations in the dynamics of the PCG system and can achieve high sensitivity and specificity simultaneously as a discriminator of abnormal heart sound signals. The ultimate goal of the present study is to propose a novel method for the detection of abnormal PCG signals. It can provide practitioners with a more robust, simple and computationally efficient computer-aided tool compared with classical cardiac auscultation schemes based on the physicians' experience.

The main contributions of this work are highlighted as follows:

• TQWT decomposes the heart sound signal into different frequency bands, from which the main subband containing the majority of the heart sound signal's energy is extracted.

• The VMD method captures most of the signal information and preserves important waveform features such as slight asymmetry. It resolves mode mixing and aliasing problems with high computational efficiency, and it can measure the variability of the heart sound signal. The first four intrinsic modes, which contain the majority of the heart sound signal's energy, are then extracted as predominant modes.

• The 3D phase space of the predominant intrinsic modes is reconstructed, in which properties associated with the PCG system dynamics are preserved.

• A reliable model for the anomaly detection of PCG recordings is proposed based on the difference in PCG system dynamics between normal and abnormal heart sound signals.

The rest of this paper is organized as follows. Section 2 introduces the details of the proposed method, including the PhysioNet/CinC Challenge 2016 heart sound database, TQWT, VMD, PSR, ED, feature extraction and selection, and the learning and classification mechanisms. Section 3 presents experimental results. Sections 4 and 5 give some discussions and conclusions, respectively.

2 Method

In this section, we propose a method to discriminate between normal and abnormal heart sound signals using the information obtained from nonlinear PCG system dynamics for anomaly detection of PCG recordings. It is divided into a training stage and a classification stage, which include the following steps. In the first step, TQWT is employed to decompose the heart sound signal into different frequency bands. In the second step, VMD is applied to decompose the predominant subband of the heart sound signal into several intrinsic modes, from which the predominant modes are extracted. In the third step, PSR is applied to extract the nonlinear dynamics of the PCG system and Euclidean distances are computed. Finally, feature vectors are fed into the neural networks for the modeling and identification of PCG system dynamics. The difference in PCG system dynamics between normal and abnormal heart sound signals is then used for the classification task. The procedure of the proposed algorithm is illustrated in Fig. 1.

2.1 Heart sound database

In this study we utilize the popular, publicly available PhysioNet/CinC Challenge 2016 heart sound database (Liu et al. 2016; Goldberger et al. 2003), which is available at the following website: https://physionet.org/content/challenge-2016. This database consists of six heart sound datasets (a through f) from different research groups. The heart sound signals in these datasets were sourced from several contributors around the world and include both healthy subjects and pathological patients with certain heart diseases. Specifically, the Challenge set consists of 3153 heart sound recordings from 764 subjects/patients, lasting from 5 s to just over 120 s, which were resampled to 2000 Hz. Figure 2 shows sample waveforms of a normal and an abnormal heart sound signal.

Fig. 1 Flowchart of the proposed method for the anomaly detection of PCG recordings using TQWT,

VMD, PSR, ED and neural networks

W.Zeng et al.

1 3

The heart sound recordings were collected from different locations on the body, of which the four typical locations are the aortic, pulmonic, tricuspid and mitral areas. In the database, heart sound recordings were divided into two types: normal and abnormal. The normal recordings were from healthy subjects while the abnormal ones were from patients with a confirmed cardiac diagnosis. The patients suffered from a variety of illnesses (which we do not provide on a case-by-case basis), but typically they were patients with heart valve defects and coronary artery disease. All the recordings from the patients were generally labelled as abnormal. The two groups are further divided into training and testing datasets using the 10-fold cross-validation method. The details of the datasets are given in Table 1. The number of normal recordings is 2488 while the number of abnormal recordings is 665. All six datasets are unbalanced, i.e., the number of normal recordings does not equal that of abnormal recordings.

A balanced heart sound database is selected (otherwise, without prior probabilities on the illness, a prevalence bias would be created), in which the abnormal and normal signals have the same number of recordings, as shown in Table 1. However, selecting a balanced database in this way reduces the number of raw recordings, especially in Datasets b and e. Therefore, we adopt the synthetic minority over-sampling technique (SMOTE) algorithm (Chawla et al. 2002) to over-sample the minority class, so as to balance the database without greatly reducing the raw recordings of normal and abnormal heart sound signals. SMOTE is a popular over-sampling technique for handling imbalanced class data, which creates synthetic samples in the minority group by applying an iterative search and selection approach (Feng et al. 2019; Wang et al. 2019; Rivera and Xanthopoulos 2016). Each observation from the minority class is iterated through until the needed number is reached.

Fig. 2 The waveforms of heart sound signals: (a) normal heart sound signal; (b) abnormal heart sound signal

Table 1 Numbers of raw and balanced recordings for each dataset. Here # represents 'number of'

Dataset name | Raw recordings (# abnormal) | Raw recordings (# normal) | Recordings after selected balanced (# abnormal) | Recordings after selected balanced (# normal) | Recordings after balanced with SMOTE method (# abnormal) | Recordings after balanced with SMOTE method (# normal)
a | 292 | 117 | 117 | 117 | 292 | 234
b | 104 | 386 | 104 | 104 | 312 | 386
c | 24 | 7 | 7 | 7 | 24 | 21
d | 28 | 27 | 27 | 27 | 28 | 27
e | 183 | 1871 | 183 | 183 | 1830 | 1871
f | 34 | 80 | 34 | 34 | 68 | 80
Total | 665 | 2488 | 472 | 472 | 2554 | 2619

Ratio of abnormal to normal: 0.27 (raw recordings), 1 (after selected balanced), 0.98 (after balanced with SMOTE method)

The working principle of the SMOTE algorithm is briefly described as follows. For details, please refer to Chawla et al. (2002).

• Required: minority data \(D = \{x_i \in X\}\), where \(i = 1, 2, \ldots, T\); number of minority instances (T), SMOTE percentage (N), number of nearest neighbors (k).

• for \(i = 1, 2, \ldots, T\):
    find the k nearest minority neighbors of \(x_i\)
    set the counter \(n = \lfloor N/100 \rfloor\)
    while \(n \neq 0\):
        select one of the k nearest neighbors, \(\bar{x}\)
        select a random number \(\alpha \in [0, 1]\)
        \(\hat{x} = x_i + \alpha(\bar{x} - x_i)\)
        append \(\hat{x}\) to S
        \(n = n - 1\)

• Output: synthetic data S

With the SMOTE algorithm, the balanced datasets are as listed in Table 1.
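The SMOTE steps listed above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `smote_sketch`, the brute-force Euclidean neighbor search, the fixed random seed and the default values of `n_percent` and `k` are all hypothetical choices made for the example.

```python
import numpy as np

def smote_sketch(minority, n_percent=200, k=5, seed=0):
    """Create synthetic minority samples by interpolating toward neighbors.

    minority  : (T, n_features) array of minority-class feature vectors
    n_percent : SMOTE percentage N (200 means two synthetic samples per instance)
    k         : number of nearest minority neighbors to draw from
    """
    rng = np.random.default_rng(seed)          # seed is an assumption, for repeatability
    X = np.asarray(minority, dtype=float)
    T = len(X)
    k = min(k, T - 1)                          # cannot have more neighbors than other points
    per_instance = n_percent // 100            # integer part of N / 100
    synthetic = []
    for i in range(T):
        # Brute-force Euclidean distances from x_i to every minority sample.
        dists = np.linalg.norm(X - X[i], axis=1)
        neighbors = np.argsort(dists)[1:k + 1]  # index 0 is x_i itself (distance 0)
        for _ in range(per_instance):
            x_bar = X[rng.choice(neighbors)]    # one of the k nearest neighbors
            alpha = rng.uniform(0.0, 1.0)       # random interpolation weight
            synthetic.append(X[i] + alpha * (x_bar - X[i]))
    return np.array(synthetic)
```

As a rough consistency check against Table 1, a percentage of N = 200 applied to the 104 abnormal recordings of dataset b would add 208 synthetic samples, giving 312 in total, which matches the tabulated value. The percentages actually used per dataset are not stated in the text, so this is only an inference.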

The heart sound is subjected to the following de-noising preprocessing step. Heart sound signals obtained using diagnostic tools are usually contaminated with noise from various sources. This noise hinders the early detection of mild heart sounds in the PCG signals, so filtering to remove such artifacts becomes essential (Shervegar and Bhat 2018). The filtering should remove the unwanted noise while preserving all the diagnostic information required for analysis of the PCG signals. The heart sounds taken from the PhysioNet database are contaminated with various types of noise and are therefore heavily filtered to remove as much noise as possible. A 6th-order Chebyshev low-pass filter with a cut-off frequency of 140 Hz is used for this purpose. The noise lies at high frequencies while the diagnostic information lies at low frequencies, so the filtering removes the high-frequency noise.
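The de-noising step can be sketched with SciPy as follows. The text specifies only the order (6), the cut-off (140 Hz) and the sampling rate (2000 Hz); the Chebyshev type (type I is assumed here), the 0.5 dB passband ripple, the zero-phase forward-backward filtering and the helper name `lowpass_pcg` are assumptions made for illustration.

```python
import numpy as np
from scipy import signal

FS = 2000.0        # sampling rate of the resampled recordings (Hz), from the text
CUTOFF = 140.0     # low-pass cut-off frequency (Hz), from the text
ORDER = 6          # filter order, from the text
RIPPLE_DB = 0.5    # passband ripple in dB: an assumption, not given in the text

def lowpass_pcg(x, fs=FS, cutoff=CUTOFF, order=ORDER, ripple_db=RIPPLE_DB):
    """Low-pass a PCG recording with a Chebyshev type I filter (assumed type)."""
    # Second-order sections are numerically safer than (b, a) coefficients.
    sos = signal.cheby1(order, ripple_db, cutoff, btype="low", fs=fs, output="sos")
    # Forward-backward filtering gives zero phase distortion (a design choice
    # here; it also doubles the effective attenuation of the filter).
    return signal.sosfiltfilt(sos, x)
```

A call such as `clean = lowpass_pcg(recording)` would then precede the TQWT decomposition. Replacing `sosfiltfilt` with `signal.sosfilt` gives ordinary causal filtering if a single pass is preferred.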

2.2 Tunable Q‑factor wavelet transform (TQWT)

Wavelet transform is an effective time-frequency tool for the analysis of non-stationary signals. The tunable Q-factor wavelet transform (TQWT) is a flexible fully-discrete wavelet transform suitable for the analysis of oscillatory signals (Selesnick 2011). TQWT depends on three adjustable parameters: the Q-factor (Q), the redundancy (R), and the decomposition level (J). Generally, Q measures the oscillatory behavior and waveform shape of the wavelet. R helps localize the wavelet in the time domain without affecting its shape. The decomposition level J controls the expansion extent and bandpass location of the wavelet, and there are a total of \(J+1\) subbands. The wavelet transform should have a low Q-factor when the signal exhibits little or no oscillatory behavior, and a relatively high Q-factor for the analysis and processing of oscillatory signals. Q is often set to a high value because heart sound signals are highly oscillatory. It is worth noting that unwanted excessive ringing of the wavelets needs to be prevented when performing TQWT by choosing a value of R greater than or equal to 3 (Selesnick 2011); generally, a value of \(R=3\) is recommended. The TQWT decomposes heart sound signals into subbands over a number of decomposition levels using the input parameters (Q, R, and J). TQWT consists of two iterative band-pass filter banks, i.e., the high resonance component filter \(H_{\mathrm{filter}}(\omega)\) and the low resonance component filter \(L_{\mathrm{filter}}(\omega)\). The resonance characteristics of an oscillatory signal can be represented by the quality factor Q, i.e. the ratio of its center frequency to its bandwidth, \(Q = f_c/B_w\), where \(f_c\) denotes the center frequency and \(B_w\) represents the bandwidth of the signal.

Let the low-pass and high-pass scaling factors of the two-channel filter bank be denoted \(\lambda\) and \(\sigma\), respectively. In order to prevent excessive redundancy and achieve perfect reconstruction, the scaling factors should satisfy \(0<\lambda<1\), \(0<\sigma\le 1\) and \(\lambda+\sigma>1\). Mathematically, the low-pass filter \(L_{\mathrm{filter}}(\omega)\) and high-pass filter \(H_{\mathrm{filter}}(\omega)\) are expressed as follows (Selesnick 2011), respectively:

\[ L_{\mathrm{filter}}(\omega) = \begin{cases} 1, & \text{if } |\omega| \le (1-\sigma)\pi \\ \theta\!\left(\dfrac{\omega + (\sigma-1)\pi}{\lambda+\sigma-1}\right), & \text{if } (1-\sigma)\pi < |\omega| < \lambda\pi \\ 0, & \text{if } \lambda\pi \le |\omega| \le \pi \end{cases} \tag{1} \]

and

\[ H_{\mathrm{filter}}(\omega) = \begin{cases} 0, & \text{if } |\omega| \le (1-\sigma)\pi \\ \theta\!\left(\dfrac{\lambda\pi - \omega}{\lambda+\sigma-1}\right), & \text{if } (1-\sigma)\pi < |\omega| < \lambda\pi \\ 1, & \text{if } \lambda\pi \le |\omega| \le \pi \end{cases} \tag{2} \]

where \(\theta(\omega)\) is the frequency response of the Daubechies filter and is defined with the following expression:

\[ \theta(\omega) = 0.5\,(1+\cos\omega)\sqrt{2-\cos\omega}, \quad |\omega| \le \pi \tag{3} \]

The Q-factor, R and the maximum number of decomposition levels \(J_{\max}\) can be expressed in terms of the parameters \(\lambda\) and \(\sigma\) as follows:

\[ Q = \frac{2-\sigma}{\sigma}; \quad R = \frac{\sigma}{1-\lambda}; \quad J_{\max} = \frac{\log(\sigma L/8)}{\log(1/\lambda)} \tag{4} \]

where L is the length of the analysed heart sound signal. Detailed derivations of these expressions are provided in Selesnick (2011).

In order to extract efficient heart sound signal bands, 10 levels (\(J=10\), \(J+1=11\) subbands) of TQWT with \(Q=3\) and \(R=3\) have been empirically selected in this study. Figures 3 and 4 show the decomposed TQWT coefficient plot and the energy distribution over sample values for normal and abnormal PCG signals. Here, subband 1 corresponds to the high frequencies and subband 11 corresponds to the low frequencies. It is deduced that heart sound activity shows significant variations in value over all frequency subbands. However, the low frequency subbands show larger variation in heart sound activity and carry a higher amount of energy than the high frequency subbands. It is observed from these figures that the majority of the heart sound signal's energy is concentrated in the 11th subband (marked as Sub11), especially for the abnormal heart sound signal. In comparison, nearly 2% of the normal heart sound signal's energy is distributed in each of subbands 9 and 10, which means the energy is relatively decentralized. Since the majority of the heart sound signal's energy is concentrated in the 11th subband, Sub11 is selected for feature acquisition.

Fig. 3 Examples of subbands of 10 levels TQWT of the normal and abnormal heart sound signals
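As a worked example of Eqs. (1) to (4), the relations in Eq. (4) can be inverted to obtain the scaling factors from the chosen Q and R, giving \(\sigma = 2/(Q+1)\) and \(\lambda = 1 - \sigma/R\). The sketch below computes these values for \(Q=3\), \(R=3\) and evaluates the two filter responses. The function names are hypothetical, the signal length of 10000 samples is only an example, and rounding \(J_{\max}\) down to an integer is an assumption added so that it can be used as a level count.

```python
import math
import numpy as np

def tqwt_scaling(Q, R):
    """Invert Eq. (4): return (lam, sigma) for a given Q-factor and redundancy."""
    sigma = 2.0 / (Q + 1.0)        # from Q = (2 - sigma) / sigma
    lam = 1.0 - sigma / R          # from R = sigma / (1 - lam)
    return lam, sigma

def max_levels(lam, sigma, L):
    """J_max of Eq. (4), rounded down to an integer (rounding is an assumption)."""
    return math.floor(math.log(sigma * L / 8.0) / math.log(1.0 / lam))

def theta(w):
    """Eq. (3): transition function with Daubechies-type frequency response."""
    return 0.5 * (1.0 + np.cos(w)) * np.sqrt(2.0 - np.cos(w))

def lowpass(w, lam, sigma):
    """Eq. (1): low-pass response on |w| <= pi."""
    w = np.abs(np.asarray(w, dtype=float))
    out = np.zeros_like(w)
    out[w <= (1 - sigma) * np.pi] = 1.0
    band = (w > (1 - sigma) * np.pi) & (w < lam * np.pi)
    out[band] = theta((w[band] + (sigma - 1) * np.pi) / (lam + sigma - 1))
    return out

def highpass(w, lam, sigma):
    """Eq. (2): high-pass response on |w| <= pi."""
    w = np.abs(np.asarray(w, dtype=float))
    out = np.ones_like(w)
    out[w <= (1 - sigma) * np.pi] = 0.0
    band = (w > (1 - sigma) * np.pi) & (w < lam * np.pi)
    out[band] = theta((lam * np.pi - w[band]) / (lam + sigma - 1))
    return out

lam, sigma = tqwt_scaling(Q=3, R=3)          # sigma = 0.5, lam = 1 - 0.5/3
j_max = max_levels(lam, sigma, L=10000)      # example length: 5 s at 2000 Hz
w = np.linspace(0.0, np.pi, 1001)
power_sum = lowpass(w, lam, sigma) ** 2 + highpass(w, lam, sigma) ** 2
```

For these settings \(\lambda + \sigma > 1\) holds, the chosen \(J = 10\) lies well below `j_max`, and `power_sum` equals 1 at every frequency. The last property follows from \(\theta^2(\omega) + \theta^2(\pi - \omega) = 1\) together with the fact that the two arguments of \(\theta\) in Eqs. (1) and (2) sum to \(\pi\), which is what makes perfect reconstruction possible.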

2.3 Variational mode decomposition (VMD)

VMD is aiming to decompose a composite input signal x(t) into n number of intrinsic

modes

𝜇n(t)

which have speciﬁc sparsity properties while reproducing the input signal.

The decomposition process can be written as a constrained variational problem with the

following function:

where K is the number of decomposition modes,

𝜕

𝜕t

[⋅

]

denotes the partial deriva-

tive of a function,

𝛿

is the Dirac function, ‘

∗

’ represents convolution computation,

𝜇n={𝜇1,𝜇2,…,𝜇n}

is the set of all modes,

𝜔n={𝜔1,𝜔2,…,𝜔n}

is the set of center fre-

quency, t is the time script, j is the complex square root of

−1

Considering a quadratic penalty term and Lagrange multipliers

𝜂

, the above-men-

tioned constrained variational problem can be transferred into an unconstrained optimi-

zation problem, which is represented as follows:

(5)

min

𝜇

n,𝜔n

{

∑

n=1

‖

𝜕

𝜕t[(𝛿(t)+ j

𝜋t)∗𝜇n(t)]e−j𝜔kt

‖

}

, subject to

∑

n=1

𝜇n(t)=x(t)

SUBBAND

100

SUBBAND ENERGY (% OF TOTAL)

DISTRIBUTION OF SIGNAL ENERGY

(a)Normal

123456789101112345678910 11

SUBBAND

100

SUBBAND ENERGY (% OF TOTAL)

DISTRIBUTION OF SIGNAL ENERGY

(b)Abnormal

Fig. 4 Examples of the energy distribution of the subbands of TQWT of the normal and abnormal heart

sound signals

A new approach forthedetection ofabnormal heart sound signals…

1 3

$$
L(\{\mu_n\},\{\omega_n\},\eta) = \alpha \sum_{n} \left\| \partial_t\left[\left(\delta(t) + \frac{j}{\pi t}\right) * \mu_n(t)\right] e^{-j\omega_n t} \right\|_2^2 + \left\| x(t) - \sum_{n} \mu_n(t) \right\|_2^2 + \left\langle \eta(t),\; x(t) - \sum_{n} \mu_n(t) \right\rangle \tag{6}
$$

where $L$ denotes the augmented Lagrangian, $\alpha$ is the balancing parameter of the data-fidelity constraint, and $\langle \cdot , \cdot \rangle$ represents the inner product.

The alternating direction method of multipliers (ADMM) is used to generate the decomposed modes and their centre frequencies during the shifting operation of each mode (Dragomiretskiy and Zosso 2014). The solution of Eq. (6) can be derived by using ADMM, in which the update of $\mu_n$ and $\omega_n$ mainly consists of the following steps:

• Step 1 Intrinsic mode update. Wiener filtering is embedded for updating the mode directly in the Fourier domain with a filter tuned to the current center frequency. The updated mode is obtained as follows:

$$
\hat{\mu}_n^{\kappa+1}(\omega) = \frac{\hat{x}(\omega) - \sum_{i \neq n} \hat{\mu}_i(\omega) + \frac{\hat{\eta}(\omega)}{2}}{1 + 2\alpha(\omega - \omega_n)^2} \tag{7}
$$

where $\kappa$ is the number of iterations, and $\hat{x}(\omega)$, $\hat{\mu}_i(\omega)$ and $\hat{\eta}(\omega)$ represent the Fourier transforms of $x(t)$, $\mu_i(t)$ and $\eta(t)$, respectively.

• Step 2 Center frequency update. The center frequency is updated as the center of gravity of the corresponding mode's power spectrum, which is represented as follows:

$$
\omega_n^{\kappa+1} = \frac{\int_0^{\infty} \omega \, |\hat{\mu}_n(\omega)|^2 \, d\omega}{\int_0^{\infty} |\hat{\mu}_n(\omega)|^2 \, d\omega} \tag{8}
$$

The complete VMD algorithm can be found in Dragomiretskiy and Zosso (2014). Unlike wavelet transform based decomposition, whose subbands have fixed bandwidths, the VMD method can effectively capture both narrow-band and wide-band modes (Babu et al. 2018). It is also more robust to noisy data: since each mode is updated by Wiener filtering in the Fourier domain during the optimization process, the updated mode is less affected by noisy disturbances. Therefore, VMD can be more efficient for capturing the signal's short and long variations (Mishra et al. 2018; Sujadevi et al. 2019). Hence we apply the VMD method to make up for the disadvantage of TQWT and to serve as a complementary tool to more effectively extract features from PCG signals.

Figure 5 demonstrates examples of the VMD of the 11th subband $Sub_{11}$ of the normal and abnormal heart sound signals. Each $Sub_{11}$ is decomposed into 6 intrinsic modes, which are respectively denoted by $\mu_1, \mu_2, \ldots, \mu_6$. The lower modes are slowly varying in the time domain while the higher modes exhibit faster variation. Results show that the dominant components of the PCG signal are the fundamental heart sounds, which may appear in the first few modes of the signal decomposition.
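The two update steps of Eqs. (7) and (8) can be sketched on a discrete frequency grid as follows. This is only a minimal illustration under stated assumptions, not the implementation used in the paper: the function names (`wiener_gain`, `update_mode`, `update_center_frequency`) are hypothetical, the spectra are assumed to be given as plain lists sampled on non-negative frequencies, and the integrals of Eq. (8) are replaced by sums.

```python
def wiener_gain(omega, omega_n, alpha):
    """Filter gain 1 / (1 + 2*alpha*(omega - omega_n)^2) from Eq. (7)."""
    return 1.0 / (1.0 + 2.0 * alpha * (omega - omega_n) ** 2)


def update_mode(x_hat, other_modes_hat, eta_hat, freqs, omega_n, alpha):
    """Step 1: update one mode spectrum (Eq. 7).

    x_hat, eta_hat: spectra of the signal and of the multiplier (hypothetical inputs).
    other_modes_hat: list of spectra of all modes except the one being updated.
    """
    updated = []
    for k, omega in enumerate(freqs):
        residual = x_hat[k] - sum(m[k] for m in other_modes_hat) + eta_hat[k] / 2.0
        updated.append(residual * wiener_gain(omega, omega_n, alpha))
    return updated


def update_center_frequency(mode_hat, freqs):
    """Step 2: center of gravity of the mode's power spectrum (discrete Eq. 8)."""
    power = [abs(c) ** 2 for c in mode_hat]
    total = sum(power)
    if total == 0.0:
        raise ValueError("mode spectrum has zero energy")
    return sum(w * p for w, p in zip(freqs, power)) / total
```

In a full VMD loop these two functions would be applied to every mode in turn, followed by an update of the multiplier, until the modes stop changing; that outer loop is omitted here.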

W.Zeng et al.

1 3

2.4 Phase space reconstruction (PSR)

It is sometimes necessary to search for patterns in a time series and in a higher dimensional transformation of the time series (Sun et al. 2015). Phase space reconstruction is a method used to reconstruct the so-called phase space. The concept of phase space is a useful tool for characterizing any low-dimensional or high-dimensional dynamic system. A dynamic system can be described using a phase space diagram, which essentially provides a coordinate system where the coordinates are all the variables comprising the mathematical formulation of the system. A point in the phase space represents the state of the system at any given time (Sivakumar 2002; Lee et al. 2014). Every intrinsic mode of the subbands of the normal and abnormal heart sound signals can be written as the time series vector $\upsilon = \{\upsilon_1, \upsilon_2, \upsilon_3, \ldots, \upsilon_K\}$, where $K$ is the total number of data points. The phase space can be reconstructed according to (Lee et al. 2014):

$$
Y_j = (\upsilon_j, \upsilon_{j+\tau}, \upsilon_{j+2\tau}, \ldots, \upsilon_{j+(d-1)\tau}) \tag{9}
$$

where $j = 1, 2, \ldots, K-(d-1)\tau$, $d$ is the embedding dimension of the phase space and $\tau$ is a time lag. It is worthwhile to mention that the properties associated with the PCG system dynamics are preserved in the reconstructed phase space.

The behaviour of the signal over time can be visualized using PSR (especially when $d = 2$ or $3$). In this work, we have confined our discussion to the embedding dimension $d = 3$ because of its visualization simplicity. In addition, different studies have found this value to best represent the attractor for human biological systems (Venkataraman and Turaga 2016; Som et al. 2016). For $\tau$, one can either use the first zero crossing of the autocorrelation function for each time series or the average $\tau$ value obtained from all the time series in the training dataset using the method proposed in Michael (2005). In this study, we use the time lag $\tau = 5$ to test the classification performance. PSR for $d = 3$ is referred to as 3D PSR.

Reconstructed phase spaces have been proven to be topologically equivalent to the original system and therefore are capable of recovering the nonlinear dynamics of the generating system (Takens 1981; Xu et al. 2013). This implies that the full dynamics of the PCG system are accessible in this space, and for this reason, features extracted from it can potentially contain more and/or different information than common feature extraction methods (Chen et al. 2014).

3D PSR is the plot of three delayed vectors $\upsilon_j$, $\upsilon_{j+1}$ and $\upsilon_{j+2}$ to visualize the dynamics of the PCG system. The Euclidean distance (ED) of a point $(\upsilon_j, \upsilon_{j+1}, \upsilon_{j+2})$, which is the distance of the point from the origin in the 3D PSR, can be defined as (Lee et al. 2014):

$$
ED_j = \sqrt{\upsilon_j^2 + \upsilon_{j+1}^2 + \upsilon_{j+2}^2} \tag{10}
$$

ED measures can be used in feature extraction and have been studied and applied in many fields, such as clustering algorithms and induced aggregation operators (Merigó and Casanovas 2011).
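The reconstruction of Eq. (9), the distance of Eq. (10), and the autocorrelation-based choice of the time lag can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, indices are 0-based (the paper counts from 1), and the distance is taken over whatever delay vector is passed in, so with `tau=1` it reduces to the consecutive-sample form written in Eq. (10).

```python
import math


def phase_space(series, d=3, tau=5):
    """Delay vectors Y_j = (v_j, v_{j+tau}, ..., v_{j+(d-1)tau}) as in Eq. (9)."""
    n_points = len(series) - (d - 1) * tau
    if n_points <= 0:
        raise ValueError("series is too short for the chosen d and tau")
    return [tuple(series[j + i * tau] for i in range(d)) for j in range(n_points)]


def euclidean_distances(points):
    """Distance of every reconstructed point from the origin (cf. Eq. 10)."""
    return [math.sqrt(sum(c * c for c in p)) for p in points]


def first_zero_crossing_lag(series):
    """Smallest lag at which the mean-removed autocorrelation is no longer positive.

    Returns None if the autocorrelation stays positive for every lag.
    """
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    for lag in range(1, n):
        r = sum(centered[t] * centered[t + lag] for t in range(n - lag))
        if r <= 0.0:
            return lag
    return None
```

A typical use would be `euclidean_distances(phase_space(mode, d=3, tau=5))`, which yields one ED value per reconstructed point of an intrinsic mode.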

Fig. 5 Examples of VMD of $Sub_{11}$ of the normal and abnormal heart sound signals: (a) original $Sub_{11}$ of the normal heart sound signal and its VMD; (b) original $Sub_{11}$ of the abnormal heart sound signal and its VMD. Each panel plots the amplitude of the 11th subband and of the modes $\mu_1$ to $\mu_6$ against the sample index.

2.5 Feature extraction and selection

In order to obtain more efficient features, this paper proposes the following extraction scheme.

(1) A ten-level TQWT is employed to decompose the heart sound signal into eleven subbands, in which the 11th subband $Sub_{11}$ contains the majority of the heart sound signal's energy and is selected for analysis.

(2) VMD of the $Sub_{11}$ of the heart sound signal and derivation of predominant intrinsic modes. The signals obtained by the VMD method, which are a series of decomposed signals, cannot be directly used for classification because of the high feature dimension. To solve this problem, Pearson's correlation coefficient is calculated to measure the correlation between the first six intrinsic modes and the original $Sub_{11}$ of the heart sound signal. The intrinsic modes with higher correlation coefficients are more highly correlated to the original signal, which means the signal energy is mostly concentrated in these intrinsic modes as well. In the present study most of the energy is concentrated in the first four intrinsic modes ($\mu_1$, $\mu_2$, $\mu_3$ and $\mu_4$), which contain the most important information from the heart sound signal and are considered to be the predominant intrinsic modes (see Table 2). In addition, an independent t-test analysis (SPSS Inc., IL, USA) is used to compare the difference of the first six intrinsic modes between normal and abnormal heart sound signals in the PhysioNet/CinC Challenge 2016 database. A p value of

Table 2 The average correlation coefficients and their statistical analysis between each intrinsic mode and the original 11th subband ($Sub_{11}$) of TQWT of all the raw normal and abnormal heart sound signals from the PhysioNet/CinC Challenge 2016 heart sound database. A p value of < 0.05 in bold is considered to indicate statistical significance

Heart sound type                        μ1       μ2       μ3       μ4       μ5       μ6
Normal of Dataset a                     0.4082   0.5159   0.4388   0.3119   0.1679   0.1523
Abnormal of Dataset a                   0.4342   0.5217   0.4154   0.3045   0.1651   0.1426
Difference between groups (p value)     0.002    0.042    0.001    0.044    0.549    0.158
Normal of Dataset b                     0.4538   0.4502   0.3163   0.2133   0.1343   0.1479
Abnormal of Dataset b                   0.4885   0.4515   0.3193   0.2148   0.1311   0.1412
Difference between groups (p value)     0.013    0.036    0.034    0.042    0.485    0.612
Normal of Dataset c                     0.4582   0.5036   0.4206   0.2967   0.1613   0.1521
Abnormal of Dataset c                   0.4818   0.5023   0.3918   0.2623   0.1624   0.1376
Difference between groups (p value)     0.003    0.048    <0.001   <0.001   0.852    0.109
Normal of Dataset d                     0.4635   0.4925   0.3690   0.2321   0.1456   0.1180
Abnormal of Dataset d                   0.4535   0.5074   0.4221   0.2999   0.1650   0.1709
Difference between groups (p value)     0.037    0.047    <0.001   <0.001   0.068    <0.001
Normal of Dataset e                     0.4161   0.4688   0.4284   0.3306   0.1245   0.1166
Abnormal of Dataset e                   0.3320   0.5326   0.4250   0.2798   0.1728   0.1549
Difference between groups (p value)     <0.001   <0.001   0.474    <0.001   <0.001   <0.001
Normal of Dataset f                     0.4463   0.4999   0.4678   0.3787   0.1299   0.1152
Abnormal of Dataset f                   0.4389   0.4960   0.4528   0.3865   0.1438   0.1447
Difference between groups (p value)     0.019    0.175    <0.001   0.039    0.52     <0.001
Mean value of correlation coefficients  0.4396   0.4952   0.4056   0.2926   0.1503   0.1412


< 0.05 is considered to indicate statistical significance. It is seen from Table 2 that there exist significant differences in most cases of the first four intrinsic modes between normal and abnormal heart sound signals in the six datasets. Hence, based on Pearson's correlation coefficient and its statistical analysis, $\mu_1$, $\mu_2$, $\mu_3$ and $\mu_4$ of the $Sub_{11}$ of the heart sound signal are selected as the reference variable $[Sub_{11}^{\mu_1}, Sub_{11}^{\mu_2}, Sub_{11}^{\mu_3}, Sub_{11}^{\mu_4}]$ and are used for the following feature derivation.

(3) Reconstruct the phase space of the reference variable with the selected values of $d$ and $\tau$;

(4) Compute the ED of the 3D PSR of the reference variables. Concatenate them to form a feature vector $[ED_{Sub_{11}^{\mu_1}}, ED_{Sub_{11}^{\mu_2}}, ED_{Sub_{11}^{\mu_3}}, ED_{Sub_{11}^{\mu_4}}]^T$.
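The correlation-based screening of intrinsic modes in step (2) can be sketched as follows. This is a hedged illustration only: the function names `pearson` and `rank_modes` are hypothetical, and ranking the modes by the absolute value of the coefficient is an assumption made here for the example, whereas the paper selects the first four modes on the basis of the averages in Table 2.

```python
import math


def pearson(a, b):
    """Pearson's correlation coefficient between two equally long sequences."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("sequences must have the same length (at least 2)")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_a * var_b)


def rank_modes(subband, modes, keep=4):
    """Return the indices of the `keep` modes most correlated with the subband,
    together with the absolute correlation score of every mode."""
    scores = [abs(pearson(mode, subband)) for mode in modes]
    order = sorted(range(len(modes)), key=lambda i: scores[i], reverse=True)
    return order[:keep], scores
```

With six VMD modes of $Sub_{11}$ as `modes`, `rank_modes(sub11, modes, keep=4)` would return the four indices to carry forward to the PSR step.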

For the PhysioNet/CinC Challenge 2016 heart sound database, heart sound signals are analyzed and PCG system dynamics are extracted by using TQWT, VMD and 3D PSR. First, the ten-level TQWT of the normal and abnormal heart sound signals is demonstrated in Fig. 3. The VMD of the 11th subband of TQWT of the heart sound signals is exhibited in Fig. 5. The first four intrinsic modes are utilized to form the reference variable $[Sub_{11}^{\mu_1}, Sub_{11}^{\mu_2}, Sub_{11}^{\mu_3}, Sub_{11}^{\mu_4}]$. Samples of the 3D PSR of the reference variable for normal and abnormal PCG signals are exhibited in Figs. 6 and 7. It can be observed that the phase space trajectories of the abnormal heart sound signals are in a more chaotic state in comparison to the normal heart sound signals. The asymmetric nature of the portraits fitted on the 3D space portrays the erratic time-varying phase space dynamics of the abnormal PCG signals. These figures show that patterns related to the higher dimensional transformations can be more discriminative than those in the time series

Fig. 6 Samples of 3D PSR of $[Sub_{11}^{\mu_1}, Sub_{11}^{\mu_2}, Sub_{11}^{\mu_3}, Sub_{11}^{\mu_4}]$ of the normal heart sound signal: panels (a) to (d) show the 3D PSR of $Sub_{11}^{\mu_1}$ to $Sub_{11}^{\mu_4}$ for $\mu_1$ to $\mu_4$, with axes $\upsilon_j$, $\upsilon_{j+1}$ and $\upsilon_{j+2}$.

itself. The disparity of the PCG system dynamics between the normal and abnormal PCG signals is treated as the differentiation criterion in the present study. After 3D PSR, the features $[ED_{Sub_{11}^{\mu_1}}, ED_{Sub_{11}^{\mu_2}}, ED_{Sub_{11}^{\mu_3}}, ED_{Sub_{11}^{\mu_4}}]^T$ for normal and abnormal heart sound signals are derived through ED computation. It can be observed from Figs. 8 and 9 that the Euclidean distances calculated from the 3D PSR in normal and abnormal heart sound signals are different from each other. This implies that the Euclidean distances can serve as useful features in classifying normal and abnormal PCG signals. They are fed into the neural networks for the following modeling, identification and classification of the PCG system dynamics between the two groups.

2.6 Training and modeling mechanism based on selected features

In this section, we present a scheme for modeling and derivation of nonlinear PCG system

dynamics derived from heart sound signals of normal and abnormal subjects based on the

extracted features.

Consider a temporal data sequence $\varphi_\zeta$ generated from the following discrete-time PCG dynamical system:

$$
Y(k) = F(Y(k-1), \ldots, Y(k-m); p) + v(Y(k-1), \ldots, Y(k-m); p), \tag{11}
$$

where $Y(k) = [y_1(k), \ldots, y_n(k)]^T \in R^n$ is the state of the system, which is measurable and represents the feature $[ED_{Sub_{11}^{\mu_1}}, ED_{Sub_{11}^{\mu_2}}, ED_{Sub_{11}^{\mu_3}}, ED_{Sub_{11}^{\mu_4}}]^T$, $p = [p_1, \ldots, p_n]^T$ is a constant

Fig. 7 Samples of 3D PSR of $[Sub_{11}^{\mu_1}, Sub_{11}^{\mu_2}, Sub_{11}^{\mu_3}, Sub_{11}^{\mu_4}]$ of the abnormal heart sound signal: panels (a) to (d) show the 3D PSR of $Sub_{11}^{\mu_1}$ to $Sub_{11}^{\mu_4}$ for $\mu_1$ to $\mu_4$, with axes $\upsilon_j$, $\upsilon_{j+1}$ and $\upsilon_{j+2}$.

vector of system parameters (different $p$ will generate different dynamical behaviors), $F(\cdot;p) = [f_1(\cdot;p_1), \ldots, f_n(\cdot;p_n)]^T$ is a smooth but unknown nonlinear PCG system dynamics, and $v(\cdot;p) = [v_1(\cdot;p_1), \ldots, v_n(\cdot;p_n)]^T$ is the modeling uncertainty. Since the modeling uncertainty $v(\cdot;p)$ and the PCG system dynamics $F(\cdot;p)$ cannot be decoupled from each other, we consider the two terms together as an undivided term, and define $\phi(\cdot;p) := F(\cdot;p) + v(\cdot;p)$ as the general PCG system dynamics. The objective of the training or learning stage is to identify or approximate the general PCG system dynamics $\phi(\cdot;p) = [\phi_1(\cdot;p_1), \ldots, \phi_n(\cdot;p_n)]^T$ to a desired accuracy via deterministic learning (Wang and Hill 2006, 2007, 2009).

In the first step, standard radial basis function (RBF) neural networks are constructed in the following form

$$
f_{nn}(Z) = \sum_{i=1}^{N} w_i s_i(Z) = W^T S(Z) \tag{12}
$$

where $Z$ is the input vector, $W = [w_1, \ldots, w_N]^T \in R^N$ is the weight vector, $N$ is the node number of the neural networks, and $S(Z) = [s_1(\| Z - \mu_1 \|), \ldots, s_N(\| Z - \mu_N \|)]^T$, with $s_i(\| Z - \mu_i \|) = \exp\left[-\frac{(Z - \mu_i)^T (Z - \mu_i)}{\eta_i^2}\right]$ being a Gaussian function, $\mu_i$ $(i = 1, \ldots, N)$ being distinct points in state space, and $\eta_i$ being the width of the receptive field.
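The network of Eq. (12) can be evaluated with a few lines of code. The sketch below is only an illustration under simplifying assumptions: the function names are hypothetical, inputs and centers are plain lists, and a single common width is used for all nodes, whereas Eq. (12) allows a separate width $\eta_i$ per node.

```python
import math


def rbf_vector(z, centers, width):
    """Regressor S(Z): one Gaussian activation per center (cf. Eq. 12)."""
    activations = []
    for center in centers:
        sq_dist = sum((zi - ci) ** 2 for zi, ci in zip(z, center))
        activations.append(math.exp(-sq_dist / width ** 2))
    return activations


def rbf_output(z, centers, weights, width):
    """Network output f_nn(Z) = W^T S(Z)."""
    return sum(w * s for w, s in zip(weights, rbf_vector(z, centers, width)))
```

Because each activation decays quickly with the distance from its center, only the nodes near the input contribute noticeably to the output, which is the localization property used later in the convergence argument.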

Fig. 8 Samples of the Euclidean distance of 3D PSR of $[Sub_{11}^{\mu_1}, Sub_{11}^{\mu_2}, Sub_{11}^{\mu_3}, Sub_{11}^{\mu_4}]$ of the normal heart sound signal: panels (a) to (d) show the Euclidean distance $ED_j$ of the 3D PSR of $Sub_{11}^{\mu_1}$ to $Sub_{11}^{\mu_4}$ for $\mu_1$ to $\mu_4$, plotted against the number of the data points (1000 to 10000).

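In the second step, the following dynamical RBF neural networks are employed to model and derive the general PCG system dynamics $\phi(\cdot;p)$:

$$
\hat{Y}(k) = A(\hat{Y}(k-1) - Y(k-1)) + \hat{W}^T(k) S_k(Z), \tag{13}
$$

where $\hat{Y}(k) = [\hat{y}_1(k), \ldots, \hat{y}_n(k)]^T \in R^n$ is the state vector of the dynamical model, $A = \mathrm{diag}\{a_1, \ldots, a_n\}$ is a diagonal matrix, with $|a_i| < 1$ being design constants, the localized RBF networks $\hat{W}^T(k) S_k(Z) = [\hat{W}_1^T(k) S_k(Z), \ldots, \hat{W}_n^T(k) S_k(Z)]^T$ are used to approximate the unknown $\phi(\cdot;p) = [\phi_1(\cdot;p), \ldots, \phi_n(\cdot;p)]^T$, $\hat{W}(k) = [\hat{W}_1(k), \ldots, \hat{W}_n(k)]$ is the weight estimate of the neural networks, $S_k(Z) = S(Y(k-1), \ldots, Y(k-m))$, and $Z = [Y(k-1), \ldots, Y(k-m)]$ is the input of the neural networks.

From Eqs. (11) and (13), the state estimation error $e_i = \hat{y}_i(k) - y_i(k)$ satisfies:

$$
\begin{aligned}
e_i(k+1) &= \hat{y}_i(k+1) - y_i(k+1) \\
&= a_i(\hat{y}_i(k) - y_i(k)) + \tilde{W}_i^T(k+1) S_k(Z) - \epsilon_i \\
&= a_i e_i(k) + \tilde{W}_i^T(k+1) S_k(Z) - \epsilon_i
\end{aligned} \tag{14}
$$

where $\tilde{W}_i = \hat{W}_i - W_i^*$, $W_i^*$ is the ideal constant neural network weight, $\phi_i(\cdot;p) = W_i^{*T} S_k(Z) + \epsilon_i$, and $\epsilon_i$ is the ideal neural network approximation error. The weight estimate $\hat{W}_i$ is updated by the following Lyapunov-based learning law: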

Fig. 9 Samples of the Euclidean distance of 3D PSR of $[Sub_{11}^{\mu_1}, Sub_{11}^{\mu_2}, Sub_{11}^{\mu_3}, Sub_{11}^{\mu_4}]$ of the abnormal heart sound signal: panels (a) to (d) show the Euclidean distance $ED_j$ of the 3D PSR of $Sub_{11}^{\mu_1}$ to $Sub_{11}^{\mu_4}$ for $\mu_1$ to $\mu_4$, plotted against the number of the data points (1000 to 10000).

$$
\hat{W}_i(k+1) = \hat{W}_i(k) - \frac{\alpha P \left(\hat{y}_i(k) - y_i(k) - a_i(\hat{y}_i(k-1) - y_i(k-1))\right) S_{k-1}(Z)}{1 + \lambda_{\max}(P) S_{k-1}^T(Z) S_{k-1}(Z)} \tag{15}
$$

where $0 < |\alpha| < 2$, $P$ is any symmetric positive definite matrix, and the weight estimation error of the neural networks $\tilde{W}_i$ satisfies:

$$
\begin{aligned}
\tilde{W}_i(k+1) &= \hat{W}_i(k+1) - W_i^* \\
&= \tilde{W}_i(k) - \frac{\alpha P (\tilde{W}_i^T(k) S_{k-1}(Z) - \epsilon_i) S_{k-1}(Z)}{1 + \lambda_{\max}(P) S_{k-1}^T(Z) S_{k-1}(Z)} \\
&= \tilde{W}_i(k)\left[I - \frac{\alpha P S_{k-1}^T(Z) S_{k-1}(Z)}{1 + \lambda_{\max}(P) S_{k-1}^T(Z) S_{k-1}(Z)}\right] + \frac{\alpha P S_{k-1}(Z) \epsilon_i}{1 + \lambda_{\max}(P) S_{k-1}^T(Z) S_{k-1}(Z)}
\end{aligned} \tag{16}
$$

Assumption 1 There exists a constant $S_M > 0$ such that for all $k \geq 0$, the following bound is satisfied:

$$
\| S(Z(k)) \| \leq S_M \tag{17}
$$

The following theorem indicates the learning ability of the above-mentioned identification algorithm for the discrete-time PCG system.

Theorem 1 Consider the adaptive system consisting of the nonlinear PCG system (11), the dynamical RBF network (13) and the neural network weight updating law (15). For almost any recurrent trajectory $\varphi_\zeta$ with initial condition $\hat{W}_i(0) = 0$, we have: (1) the state estimation error $e_i(k)$ exponentially converges to a small neighborhood of zero, and the neural network weight estimate $\hat{W}_{\zeta i}$ exponentially converges to a small neighborhood of the ideal weight $W_{\zeta i}^*$; (2) a locally accurate approximation for the unknown $\phi_i(\cdot;p_i)$ to the desired error level $\epsilon_i$ is obtained along the trajectory $\varphi_\zeta$.

Proof We construct the following form:

$$
\begin{bmatrix} z_i(k) \\ \tilde{W}_i(k) \end{bmatrix} = \begin{bmatrix} 1 & -S_{k-1}^T(Z) \\ 0 & I \end{bmatrix} \begin{bmatrix} e_i(k) \\ \tilde{W}_i(k) \end{bmatrix}
$$

Then, the state estimation error and neural network weight estimation error become:

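$$
\begin{aligned}
z_i(k+1) &= e_i(k+1) - S_k^T(Z) \tilde{W}_i(k+1) \\
&= a_i e_i(k) + \tilde{W}_i^T(k+1) S_k(Z) - \epsilon_i - S_k^T(Z)\left\{ \tilde{W}_i(k)\left[I - \frac{\alpha P S_{k-1}^T(Z) S_{k-1}(Z)}{1 + \lambda_{\max}(P) S_{k-1}^T(Z) S_{k-1}(Z)}\right] + \frac{\alpha P S_{k-1}(Z) \epsilon_i}{1 + \lambda_{\max}(P) S_{k-1}^T(Z) S_{k-1}(Z)} \right\} \\
&= a_i e_i(k) + \left\{ \tilde{W}_i(k)\left[I - \frac{\alpha P S_{k-1}^T(Z) S_{k-1}(Z)}{1 + \lambda_{\max}(P) S_{k-1}^T(Z) S_{k-1}(Z)}\right] + \frac{\alpha P S_{k-1}(Z) \epsilon_i}{1 + \lambda_{\max}(P) S_{k-1}^T(Z) S_{k-1}(Z)} \right\}^T S_k(Z) \\
&\quad - \epsilon_i - S_k^T(Z)\left\{ \tilde{W}_i(k)\left[I - \frac{\alpha P S_{k-1}^T(Z) S_{k-1}(Z)}{1 + \lambda_{\max}(P) S_{k-1}^T(Z) S_{k-1}(Z)}\right] + \frac{\alpha P S_{k-1}(Z) \epsilon_i}{1 + \lambda_{\max}(P) S_{k-1}^T(Z) S_{k-1}(Z)} \right\} \\
&= a_i e_i(k) - \epsilon_i - a_i \tilde{W}_i^T(k) S_{k-1}(Z) + a_i \tilde{W}_i^T(k) S_{k-1}(Z) \\
&= a_i z_i(k) + a_i \tilde{W}_i^T(k) S_{k-1}(Z) - \epsilon_i
\end{aligned} \tag{18}
$$

and

$$
\tilde{W}_i(k+1) = \tilde{W}_i(k)\left[I - \frac{\alpha P S_{k-1}^T(Z) S_{k-1}(Z)}{1 + \lambda_{\max}(P) S_{k-1}^T(Z) S_{k-1}(Z)}\right] + \frac{\alpha P S_{k-1}(Z) \epsilon_i}{1 + \lambda_{\max}(P) S_{k-1}^T(Z) S_{k-1}(Z)} \tag{19}
$$

Equations (18) and (19) can be transformed into the form of a state equation:

$$
\begin{bmatrix} z_i(k+1) \\ \tilde{W}_i(k+1) \end{bmatrix} = \begin{bmatrix} a_i & a_i S_{k-1}^T(Z) \\ 0 & I - \dfrac{\alpha P S_{k-1}^T(Z) S_{k-1}(Z)}{1 + \lambda_{\max}(P) S_{k-1}^T(Z) S_{k-1}(Z)} \end{bmatrix} \begin{bmatrix} z_i(k) \\ \tilde{W}_i(k) \end{bmatrix} + \begin{bmatrix} -\epsilon_i \\ \dfrac{\alpha P S_{k-1}(Z) \epsilon_i}{1 + \lambda_{\max}(P) S_{k-1}^T(Z) S_{k-1}(Z)} \end{bmatrix} \tag{20}
$$

By using the local approximation properties of RBF networks, the state estimation error and the weight estimate learning law can be expressed in a unified form as follows:

$$
\begin{bmatrix} z_i(k+1) \\ \tilde{W}_{\zeta i}(k+1) \end{bmatrix} = \begin{bmatrix} a_i & a_i S_{\zeta(k-1)}^T(Z) \\ 0 & I - \dfrac{\alpha P_\zeta S_{\zeta(k-1)}^T(Z) S_{\zeta(k-1)}(Z)}{1 + \lambda_{\max}(P_\zeta) S_{\zeta(k-1)}^T(Z) S_{\zeta(k-1)}(Z)} \end{bmatrix} \begin{bmatrix} z_i(k) \\ \tilde{W}_{\zeta i}(k) \end{bmatrix} + \begin{bmatrix} -\epsilon'_{\zeta i} \\ \dfrac{\alpha P_\zeta S_{\zeta(k-1)}(Z) \epsilon'_{\zeta i}}{1 + \lambda_{\max}(P_\zeta) S_{\zeta(k-1)}^T(Z) S_{\zeta(k-1)}(Z)} \end{bmatrix} \tag{21}
$$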

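and

$$
\tilde{W}_{\bar{\zeta} i}(k+1) = \tilde{W}_{\bar{\zeta} i}(k)\left[I - \frac{\alpha P_{\bar{\zeta}} S_{\bar{\zeta}(k-1)}^T(Z) S_{\bar{\zeta}(k-1)}(Z)}{1 + \lambda_{\max}(P_{\bar{\zeta}}) S_{\bar{\zeta}(k-1)}^T(Z) S_{\bar{\zeta}(k-1)}(Z)}\right] + \frac{\alpha P_{\bar{\zeta}} S_{\bar{\zeta}(k-1)}(Z) \epsilon'_{\zeta i}}{1 + \lambda_{\max}(P_{\bar{\zeta}}) S_{\bar{\zeta}(k-1)}^T(Z) S_{\bar{\zeta}(k-1)}(Z)} \tag{22}
$$

where $(\cdot)_{\zeta i}$ and $(\cdot)_{\bar{\zeta} i}$ stand for terms which are close to the orbit $\varphi_\zeta$ and far away from the orbit $\varphi_\zeta$, respectively. $S_{\zeta k}$ is a subvector of $S_k$, and $W_{\zeta i}$ is the corresponding weight subvector. $\epsilon'_{\zeta i} = \epsilon_{\zeta i} - \tilde{W}_{\bar{\zeta} i}^T S_{\bar{\zeta} k}(Z) = O(\epsilon_{\zeta i})$ is the approximation error along the trajectory $\varphi_\zeta$.

Now, we first prove the stability of the nominal part of Eq. (21). Based on the properties of RBF networks (Wang and Hill 2006, 2007, 2009), almost any periodic or recurrent trajectory $\varphi_\zeta$ ensures persistence of excitation (PE) of the regressor subvector $S_{\zeta k}$ (Gorinevsky 1995). With Assumption 1, $S_{\zeta k}$ in (21) satisfies the PE condition. Then, there exist constants $\alpha_1 > 0$, $n > n_\zeta > 0$, such that:

$$
\alpha_1 I \leq \sum_{k=j}^{j+n-1} S_{\zeta(k-1)}(Z) S_{\zeta(k-1)}^T(Z), \quad \forall j \tag{23}
$$

where $n_\zeta$ is the dimension of $S_{\zeta k}$.

Consider the following Lyapunov function candidate:

$$
V_i(k) = \beta z_i^2(k) + \tilde{W}_{\zeta i}^T(k) P_\zeta^{-1} \tilde{W}_{\zeta i}(k) \tag{24}
$$

where $\beta > 0$. Then, we have:

$$
\begin{aligned}
\Delta V_i(k) &= V_i(k+1) - V_i(k) \\
&= \beta z_i^2(k+1) + \tilde{W}_{\zeta i}^T(k+1) P_\zeta^{-1} \tilde{W}_{\zeta i}(k+1) - \beta z_i^2(k) - \tilde{W}_{\zeta i}^T(k) P_\zeta^{-1} \tilde{W}_{\zeta i}(k) \\
&= -\beta(1 - a_i^2) z_i^2(k) + 2\beta a_i^2 z_i(k) S_{\zeta(k-1)}^T(Z) \tilde{W}_{\zeta i}(k) + \beta a_i^2 S_{\zeta(k-1)}^T(Z) \tilde{W}_{\zeta i}(k) S_{\zeta(k-1)}^T(Z) \tilde{W}_{\zeta i}(k) \\
&\quad - S_{\zeta(k-1)}^T(Z) \tilde{W}_{\zeta i}(k) \, \frac{2\alpha I - \dfrac{\alpha^2 P_\zeta S_{\zeta(k-1)}^T(Z) S_{\zeta(k-1)}(Z)}{1 + \lambda_{\max}(P_\zeta) S_{\zeta(k-1)}^T(Z) S_{\zeta(k-1)}(Z)}}{1 + \lambda_{\max}(P_\zeta) S_{\zeta(k-1)}^T(Z) S_{\zeta(k-1)}(Z)} \, \tilde{W}_{\zeta i}^T(k) S_{\zeta(k-1)}(Z)
\end{aligned} \tag{25}
$$

Equation (25) can also be written as:

$$
\Delta V_i(k) = -\begin{bmatrix} z_i(k) & S_{\zeta(k-1)}^T(Z) \tilde{W}_{\zeta i}(k) \end{bmatrix} \begin{bmatrix} \beta(1 - a_i^2) & -\beta a_i^2 \\ -\beta a_i^2 & \dfrac{2\alpha I - \dfrac{\alpha^2 P_\zeta S_{\zeta(k-1)}^T(Z) S_{\zeta(k-1)}(Z)}{1 + \lambda_{\max}(P_\zeta) S_{\zeta(k-1)}^T(Z) S_{\zeta(k-1)}(Z)}}{1 + \lambda_{\max}(P_\zeta) S_{\zeta(k-1)}^T(Z) S_{\zeta(k-1)}(Z)} - \beta a_i^2 \end{bmatrix} \begin{bmatrix} z_i(k) \\ \tilde{W}_{\zeta i}^T(k) S_{\zeta(k-1)}(Z) \end{bmatrix} \tag{26}
$$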

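Let

$$
D(k) = \begin{bmatrix} \beta(1 - a_i^2) & -\beta a_i^2 \\ -\beta a_i^2 & \dfrac{2\alpha I - \dfrac{\alpha^2 P_\zeta S_{\zeta(k-1)}^T(Z) S_{\zeta(k-1)}(Z)}{1 + \lambda_{\max}(P_\zeta) S_{\zeta(k-1)}^T(Z) S_{\zeta(k-1)}(Z)}}{1 + \lambda_{\max}(P_\zeta) S_{\zeta(k-1)}^T(Z) S_{\zeta(k-1)}(Z)} - \beta a_i^2 \end{bmatrix}
$$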

, when

𝛼I−

𝛼

P𝜁S

𝜁(k−1)(Z)S𝜁(k−1)(Z)

1+𝜆max(P𝜁)ST

𝜁(k−1)(Z)S𝜁(k−1)(Z)

1+𝜆max(P𝜁)ST

𝜁(

−

)

(Z)S𝜁(k−1)(Z)

−

𝛽[a2

i+a

1−a2

, that is,

2𝛼I−

𝛼

P𝜁S

𝜁(k−1)(Z)S𝜁(k−1)(Z)

1+𝜆max(P𝜁)ST

𝜁(k−1)(Z)S𝜁(k−1)(Z)

+𝜆max(P𝜁)[a2

1−a2

]ST

k−1(Z)Sk−1(Z)

≥

2𝛼I−

𝛼

P𝜁S

𝜁(k−1)(Z)S𝜁(k−1)(Z)

𝜆max(P𝜁)ST

𝜁(k−1)(Z)S𝜁(k−1)(Z)

1+𝜆max(P𝜁)[a2

1−a2

]S2

>𝛽>

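Then, $D(k) > 0$ leads to the result that

$$
\begin{cases} \Delta V(k) < 0, & \text{for } \{z_i(k), \tilde{W}_{\zeta i}^T(k) S_{\zeta(k-1)}(Z)\} \neq 0 \\ \Delta V(k) \leq 0, & \text{for } \{z_i(k), \tilde{W}_{\zeta i}(k)\} \neq 0 \end{cases} \tag{27}
$$

Equation (27) means: (1) $z_i(k)$ and $\tilde{W}_{\zeta i}^T(k) S_{\zeta(k-1)}(Z)$ converge exponentially to zero when $k \to \infty$. Hence, $e_i(k)$ converges exponentially to zero when $k \to \infty$; (2) $\tilde{W}_{\zeta i}$ is uniformly ultimately bounded. Since $\tilde{W}_{\zeta i}^T(k) S_{\zeta(k-1)}(Z)$ converges exponentially to zero, this implies that $\tilde{W}_{\zeta i}$ converges to a constant vector $\tilde{W}_c$. It can be deduced from Eq. (27) that $S_{\zeta(k-1)}^T(Z) \tilde{W}_c = 0$. Then, we have:

$$
S_{\zeta(k-1)}(Z) S_{\zeta(k-1)}^T(Z) \tilde{W}_c = 0, \; \ldots, \; S_{\zeta(k-1+n)}(Z) S_{\zeta(k-1+n)}^T(Z) \tilde{W}_c = 0
$$

Summing up the above equations, we have $\sum_{k=j}^{j+n-1} S_{\zeta(k-1+n)}(Z) S_{\zeta(k-1+n)}^T(Z) \tilde{W}_c = 0$. For $S_{\zeta k}(Z)$ satisfying Eq. (23), the matrix $\sum_{k=j}^{j+n-1} S_{\zeta(k-1)}(Z) S_{\zeta(k-1)}^T(Z)$ is positive definite; then $\tilde{W}_c = 0$, hence $\tilde{W}_{\zeta i}$ converges exponentially to zero.

Thus, the nominal part of Eq. (21) is exponentially stable. Since $\epsilon'_{\zeta i}$ is small, both the state estimation error $z_i(k+1)$ and the parameter error $\tilde{W}_{\zeta i}(k+1)$ in Eq. (21) converge exponentially to small neighborhoods of zero, and the range of the neighborhood is determined by the parameter $\epsilon'_{\zeta i}$.

The convergence of $\hat{W}_{\zeta i}$ to a small neighborhood of $W_{\zeta i}^*$ implies that along the trajectory $\varphi_\zeta$

$$
\begin{aligned}
\phi_i(\varphi_\zeta; p_i) &= W_{\zeta i}^{*T} S_{\zeta k}(\varphi_\zeta) + \epsilon_{\zeta i} \\
&= \hat{W}_{\zeta i}^T S_{\zeta k}(\varphi_\zeta) - \tilde{W}_{\zeta i}^T S_{\zeta k}(\varphi_\zeta) + \epsilon_{\zeta i} \\
&= \hat{W}_{\zeta i}^T S_{\zeta k}(\varphi_\zeta) + \epsilon_{\zeta i 1}
\end{aligned} \tag{28}
$$

where $\epsilon_{\zeta i 1} = \epsilon_{\zeta i} - \tilde{W}_{\zeta i}^T S_{\zeta k}(\varphi_\zeta) = O(\epsilon_{\zeta i}) = O(\epsilon_i)$ is the practical approximation error for using $\hat{W}_{\zeta i}^T S_{\zeta k}$, which is small due to the exponential convergence of $\tilde{W}_{\zeta i}$.

By this convergence result, we can obtain a constant vector of neural weights according to

$$
\bar{W}_i = \frac{1}{k_b - k_a + 1} \sum_{k=k_a}^{k_b} \hat{W}_i(k) \tag{29}
$$

where $\{k_a, \ldots, k_b\}$ represents a time segment after the transient process. Thus, using $\bar{W}_{\zeta i}^T S_{\zeta k}(\varphi_\zeta)$, where $\bar{W}_{\zeta i}$ is the subvector of $\bar{W}_i$, we have: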

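$$
\begin{aligned}
\phi_i(\varphi_\zeta; p_i) &= \hat{W}_{\zeta i}^T S_{\zeta k}(\varphi_\zeta) + \epsilon_{\zeta i 1} \\
&= \bar{W}_{\zeta i}^T S_{\zeta k}(\varphi_\zeta) + \epsilon_{\zeta i 2}
\end{aligned} \tag{30}
$$

where $\epsilon_{\zeta i 2}$ is the practical approximation error for using $\bar{W}_{\zeta i}^T S_{\zeta k}$. It is clear that after the transient process, $\epsilon_{\zeta i 2} = O(\epsilon_{\zeta i 1}) = O(\epsilon_i)$.

It can be seen from Eq. (22) that for the neurons with centers far away from the trajectory $\varphi_\zeta$, $S_{\bar{\zeta} k}$ will become very small due to the localization property of RBF networks. In this case, the neural weights $\hat{W}_{\bar{\zeta} i}$ will only be slightly updated. Both $\hat{W}_{\bar{\zeta} i}$ and $\hat{W}_{\bar{\zeta} i}^T S_{\bar{\zeta} k}$, as well as $\bar{W}_{\bar{\zeta} i}$ and $\bar{W}_{\bar{\zeta} i}^T S_{\bar{\zeta} k}$, will remain very small. This means that the entire RBF network can approximate the unknown $\phi_i(\varphi_\zeta; p_i)$ along the trajectory $\varphi_\zeta$ as follows:

$$
\begin{aligned}
\phi_i(\varphi_\zeta; p_i) &= \hat{W}_{\zeta i}^T S_{\zeta k}(\varphi_\zeta) + \epsilon_{\zeta i 1} \\
&= \hat{W}_{\zeta i}^T S_{\zeta k}(\varphi_\zeta) + \hat{W}_{\bar{\zeta} i}^T S_{\bar{\zeta} k}(\varphi_\zeta) + \epsilon_{\zeta i 1} - \hat{W}_{\bar{\zeta} i}^T S_{\bar{\zeta} k}(\varphi_\zeta) \\
&= \hat{W}_i^T S_k(\varphi_\zeta) + \epsilon_{i1}
\end{aligned} \tag{31}
$$

where $\epsilon_{i1} = \epsilon_{\zeta i 1} - \hat{W}_{\bar{\zeta} i}^T S_{\bar{\zeta} k}(\varphi_\zeta) = O(\epsilon_{\zeta i 1}) = O(\epsilon_i)$. Similarly, using Eq. (30), we have

$$
\begin{aligned}
\phi_i(\varphi_\zeta; p_i) &= \bar{W}_{\zeta i}^T S_{\zeta k}(\varphi_\zeta) + \epsilon_{\zeta i 2} \\
&= \bar{W}_{\zeta i}^T S_{\zeta k}(\varphi_\zeta) + \bar{W}_{\bar{\zeta} i}^T S_{\bar{\zeta} k}(\varphi_\zeta) + \epsilon_{\zeta i 2} - \bar{W}_{\bar{\zeta} i}^T S_{\bar{\zeta} k}(\varphi_\zeta) \\
&= \bar{W}_i^T S_k(\varphi_\zeta) + \epsilon_{i2}
\end{aligned} \tag{32}
$$

where $\epsilon_{i2} = \epsilon_{\zeta i 2} - \bar{W}_{\bar{\zeta} i}^T S_{\bar{\zeta} k}(\varphi_\zeta) = O(\epsilon_{\zeta i 2}) = O(\epsilon_i)$. Equations (31) and (32) mean that locally accurate identification of the system dynamics $\phi_i(\cdot;p_i)$ to the desired level $\epsilon_i$ along the trajectory $\varphi_\zeta$ can be achieved by using the RBF network. This completes the proof.

It is seen that the employment of localized RBF networks under periodic or periodic-like (recurrent) inputs yields a guaranteed PE condition. This condition, with the localization property of RBF networks, leads to the exponential stability of a localized adaptive discrete-time PCG system. In this way, parameter convergence and accurate local approximation of PCG system dynamics can be achieved naturally.

2.7 Classification mechanism

In this section, we present a scheme to classify normal and abnormal heart sound signals. Consider a set of training temporal data sequences $\varphi_\zeta^s$, $s = 1, \ldots, M$, among which the $s$th training temporal data sequence $\varphi_\zeta^s$ is generated from the following system:

$$
Y^s(k) = F^s(Y^s(k-1), \ldots, Y^s(k-m); p^s) + v^s(Y^s(k-1), \ldots, Y^s(k-m); p^s) \tag{33}
$$

where $Y^s(k) = [y_1^s(k), \ldots, y_n^s(k)]^T \in R^n$ is the state of the system, which is measurable, $p^s$ is a constant vector of system parameters, $F^s(\cdot;p^s) = [f_1^s(\cdot;p_1^s), \ldots, f_n^s(\cdot;p_n^s)]^T$ denotes the PCG system dynamics, and $v^s(\cdot;p^s) = [v_1^s(\cdot;p_1^s), \ldots, v_n^s(\cdot;p_n^s)]^T$ denotes the modeling uncertainty.

As mentioned above, the general PCG system dynamics $\phi^s(\cdot;p) := F^s(\cdot;p) + v^s(\cdot;p)$ can be accurately derived and preserved in constant RBF neural networks $\bar{W}^{sT} S(Z)$.

Consider $\varphi_\varsigma$ generated from Eq. (11) as a test temporal data sequence. For the $s$th training temporal data sequence $\varphi_\zeta^s$, a dynamical model is constructed by using the time-invariant representation $\bar{W}^{sT} S_k(Z)$ as: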

$$
\bar{Y}^s(k) = B(\bar{Y}^s(k-1) - Y(k-1)) + \bar{W}^{sT} S_k(Z), \tag{34}
$$

where $\bar{Y}^s(k) = [\bar{y}_1^s(k), \ldots, \bar{y}_n^s(k)]^T \in R^n$ is the state vector of the dynamical model, and $B = \mathrm{diag}\{b_1, \ldots, b_n\}$ is a diagonal matrix that is kept the same for all training sequences. $S_k(Z) = S(Y(k-1), \ldots, Y(k-m))$, and $Z = [Y(k-1), \ldots, Y(k-m)]$ is the test temporal data sequence $\varphi_\varsigma$ generated from Eq. (11). Then, corresponding to the test temporal data sequence $\varphi_\varsigma$ and the dynamical model (34), we obtain the following recognition error system:

$$
e_i^s(k) = b_i e_i^s(k-1) + \left(\phi_i(\cdot;p) - \bar{W}_i^{sT} S_k(Z)\right), \quad i = 1, \ldots, n, \; s = 1, \ldots, M \tag{35}
$$

where $e_i^s(k) = y_i(k) - \bar{y}_i^s(k)$ and $|b_i| < 1$.

The error $|e_i^s(k)|$ can effectively measure the similarity between the test sequence $\varphi_\varsigma$ and the training sequences $\varphi_\zeta^s$. Compute the average $L_p$ norm of the error $e_i^s(k)$; for example, for $p = 1$:

$$
\| e_i^s(k) \|_{L_1} = \frac{1}{k} \sum_{j=1}^{k} | e_i^s(j) |, \quad s = 1, \ldots, M \tag{36}
$$

Hence, we have the following classification method for temporal data sequences: Consider the recognition error system consisting of Eqs. (11), (34) and (35). Among the $M$ dynamical models, if the error $\| e_i^s(k) \|_{L_1}$ between the $s$th dynamical model and the test temporal data sequence $\varphi_\varsigma$ is the smallest one, then the test temporal data sequence $\varphi_\varsigma$ is said to be most similar to the training temporal data sequence $\varphi_\zeta^s$.

The fundamental idea of the classification of abnormal heart sound signals is that if a test heart sound signal pattern is similar to the trained heart sound signal pattern $s$ $(s \in \{1, \ldots, k\})$, the constant RBF network $\bar{W}^{sT} S_k$ embedded in the matched estimator $s$ will quickly recall the learned knowledge by providing an accurate approximation to the PCG system dynamics. Thus, the corresponding error $\| e_i^s(k) \|_{L_1}$ will become the smallest among the error norms of all the dynamical models. Based on the smallest error principle, the appearing test heart sound signal pattern can be classified.

Classification scheme If there exist some finite time $t^s$, $s \in \{1, \ldots, k\}$, and some $i \in \{1, \ldots, n\}$ such that $\| e_i^s(k) \|_{L_1}$ is smaller than the corresponding $L_1$ error norm of every other dynamical model for all $t > t^s$, then the appearing PCG system pattern can be classified and the abnormal heart sound signal can be detected.

3 Experimental results

Experiments are implemented using MATLAB software and tested on an Intel Core i7 6700K 3.5 GHz computer with 64 GB RAM. We assign feature vector sequences for all the normal and abnormal heart sound signals in the PhysioNet/CinC Challenge 2016 heart sound database. According to the method described in Sect. 2.5, we extract features, which means the input of the RBF neural networks is $[ED_{Sub_{11}^{\mu_1}}, ED_{Sub_{11}^{\mu_2}}, ED_{Sub_{11}^{\mu_3}}, ED_{Sub_{11}^{\mu_4}}]^T$. In order to eliminate scale differences between different features, all feature data are normalized to $[-1, 1]$.

Several experiments are carried out to verify the effectiveness of the proposed method. The classification results are evaluated with 10-fold cross-validation, in which the variance of the estimate for the classifiers is reduced. The data are divided into training and test subsets. For the 10-fold cross-validation, the data set is divided into ten subsets.

Each time, one of the ten subsets is used as the test set and the other nine subsets are put together to form a training set. As such, every fold has been used nine times as training data and one time as test data. The final result is the average of the 10 implementations. For the evaluation, the sensitivity, the specificity, the overall score of the sensitivity and the specificity, and the accuracy (ACC) are used and defined as follows (Clifford et al. 2016):

$$
\text{Sensitivity} = \frac{TP}{TP + FN} \times 100 \, (\%) \tag{37}
$$

$$
\text{Specificity} = \frac{TN}{TN + FP} \times 100 \, (\%) \tag{38}
$$

$$
\text{Overall score} = \frac{\text{Sensitivity} + \text{Specificity}}{2} \tag{39}
$$

$$
ACC = \frac{TP + TN}{TP + TN + FN + FP} \times 100 \, (\%) \tag{40}
$$

where TP is the number of true positives referring to the correctly detected abnormal heart sound signals, FN is the number of false negatives referring to the misidentified abnormal heart sound signals, TN is the number of true negatives referring to the correctly detected normal heart sound signals, and FP is the number of false positives referring to the misidentified normal heart sound signals. The overall score is also defined as mean accuracy (MACC) in some of the literature.

The classification results on normal and abnormal heart sound signals (with the two different data balance methods mentioned before) are given in Tables 3 and 4 with 10-fold cross-validation. We apply three types of features to verify and compare their classification performance: (1) derived from TQWT+PSR/ED; (2) derived from VMD+PSR/ED; and (3) derived from TQWT, VMD, PSR and ED (proposed features). Here when only applying TQWT+PSR/ED, we use the 11th subband of the ten-level TQWT of the heart sound signal together with PSR/ED as the features, which are represented as $ED_{Sub_{11}}$. When only applying VMD+PSR/ED, we use the first four intrinsic modes of the heart sound signal together with PSR/ED as the features, which are represented as $[ED_{\mu_1}, ED_{\mu_2}, ED_{\mu_3}, ED_{\mu_4}]$. It is seen from Tables 3 and 4 that the classification

Table 3 Classification performance of the proposed features and its comparison with the other two features on selected balanced recordings evaluated by 10-fold cross-validation. Total numbers of the abnormal and normal recordings are 472 and 472, respectively

Evaluated features                                  Actual group   Predicted normal   Predicted abnormal   Sensitivity (%)   Specificity (%)   Overall score (%)   ACC (%)
TQWT+PSR/ED: ED_{Sub11}                             Normal         392                80                   85.38             83.05             84.22               84.22
                                                    Abnormal       69                 403
VMD+PSR/ED: [ED_{μ1}, ED_{μ2}, ED_{μ3}, ED_{μ4}]    Normal         404                68                   87.29             85.59             86.44               86.44
                                                    Abnormal       60                 412
Proposed features: [ED_{Sub11^μ1}, ED_{Sub11^μ2},   Normal         461                11                   97.46             97.67             97.57               97.56
ED_{Sub11^μ3}, ED_{Sub11^μ4}]                       Abnormal       12                 460

performance of the proposed features is superior to that of the other two features. Overall,

our classiﬁcation approach achieves good performance, which indicates that the proposed

pattern classiﬁcation system can eﬀectively detect abnormal heart sound signals by using

nonlinear features and neural network based classiﬁcation tools.
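The performance measures can be checked directly against the confusion-matrix counts reported in the tables. The following is a minimal Python sketch, not the authors' code. It assumes that abnormal recordings are the positive class and that the overall score is the mean of sensitivity and specificity (both assumptions are consistent with the tabulated values); the names `ConfusionCounts` and `performance` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a two-class problem with 'abnormal' as the positive class."""
    tp: int  # abnormal recordings classified as abnormal
    fn: int  # abnormal recordings classified as normal
    tn: int  # normal recordings classified as normal
    fp: int  # normal recordings classified as abnormal


def performance(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, overall score and accuracy in percent."""
    sensitivity = 100.0 * c.tp / (c.tp + c.fn)
    specificity = 100.0 * c.tn / (c.tn + c.fp)
    # Assumption: overall score (MACC) is the mean of sensitivity and specificity.
    overall = (sensitivity + specificity) / 2.0
    accuracy = 100.0 * (c.tp + c.tn) / (c.tp + c.tn + c.fn + c.fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "overall": overall,
        "accuracy": accuracy,
    }


# Counts read from the "Proposed features" rows of Table 4 (SMOTE-balanced data).
table4_proposed = ConfusionCounts(tp=2496, fn=58, tn=2568, fp=51)
for name, value in performance(table4_proposed).items():
    print(f"{name}: {value:.2f}")
```

Rounded to two decimals, these counts reproduce the 97.73, 98.05, 97.89 and 97.89 reported for the proposed features.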

4 Discussion

Experimental results of this study demonstrate that abnormal heart sound signals can be detected automatically by means of nonlinear features and neural-network-based artificial intelligence tools. The proposed scheme focuses not only on providing evidence to support the claim that pathological patients demonstrate altered PCG system dynamics compared to normal subjects, but also on providing an automatic, objective and computationally convenient method to distinguish between normal and abnormal heart sound signals.

Potes et al. (2016) used two classifiers, namely an AdaBoost classifier and a CNN. They first extracted 124 time-frequency features from the PCG signal and used them as input to a variant of the AdaBoost classifier. Then they decomposed the PCG cardiac cycles into four frequency bands, which were used as input to the CNN for training. Finally, they classified the normal and abnormal heart sound signals with an ensemble of classifiers combining the outputs of AdaBoost and the CNN. The reported best performance was a sensitivity of 94.24%, a specificity of 77.81%, and an overall score of 86.02%.

Dominguez-Morales et al. (2017) divided the heart sound recordings into windows of a specific time length. They then sent these segments of the original sound to a Neuromorphic Auditory Sensor, which decomposes the audio into frequency bands and packetizes the information. Finally, this information was converted to sonogram images, which were fed to a CNN for classification by deep learning algorithms. The reported best performance with 10-fold cross-validation was an accuracy of 97.05%, a sensitivity of 95.12%, a specificity of 93.20%, and an overall score of 94.16%.

Beritelli et al. (2018) extracted features from PCG signals by using Gram polynomials and the Fourier transform. Afterwards, the features were fed to probabilistic neural networks for classification. The reported best performance with 10-fold cross-validation was an accuracy of 94%, a sensitivity of 93%, a specificity of 91%, and an overall score of 92%.

Table 4 Classification performance of the proposed features and its comparison with the other two feature sets on recordings balanced with the SMOTE method, evaluated by 10-fold cross-validation. The total numbers of abnormal and normal recordings are 2554 and 2619, respectively

| Evaluated features | Actual group | Predicted normal | Predicted abnormal | Sensitivity (%) | Specificity (%) | Overall score (%) | ACC (%) |
|---|---|---|---|---|---|---|---|
| TQWT+PSR/ED: $ED_{Sub11}$ | Normal | 2235 | 384 | 84.30 | 85.34 | 84.82 | 84.83 |
| | Abnormal | 401 | 2153 | | | | |
| VMD+PSR/ED: $[ED_{\mu_1}, ED_{\mu_2}, ED_{\mu_3}, ED_{\mu_4}]$ | Normal | 2277 | 342 | 86.06 | 86.94 | 86.50 | 86.51 |
| | Abnormal | 356 | 2198 | | | | |
| Proposed features: $[EDSub_{\mu_1}, EDSub_{\mu_2}, EDSub_{\mu_3}, EDSub_{\mu_4}]$ | Normal | 2568 | 51 | 97.73 | 98.05 | 97.89 | 97.89 |
| | Abnormal | 58 | 2496 | | | | |

Bozkurt et al. (2018) extracted features from the heart sound signal by using the Mel-Spectrogram, MFCC and subband envelopes. These features were used as input to a CNN classifier, and the reported best performance with 10-fold cross-validation was an accuracy of 81.5%, a sensitivity of 84.5%, a specificity of 78.5%, and an overall score of 81.5%.

Zhang et al. (2019) extracted the spectrogram of the heart sound signal by using the short-time Fourier transform. Following that, they calculated temporal quasi-periodic features with the average magnitude difference function in each frequency band of the heart sound spectrogram. The extracted features were fed to a two-layer LSTM neural network for classification. The reported best performance with 10-fold cross-validation was a sensitivity of 96.15%, a specificity of 93.18%, and an overall score of 94.66%.

Adiban et al. (2019) constructed a fixed-length feature vector from the heart sound signal by using MFCC features. Afterwards, a Principal Component Analysis (PCA) transform and a Variational Autoencoder (VA) were used to reduce the feature dimension. Finally, the reduced-size feature vector was fed to Gaussian Mixture Models and an SVM for classification. The reported best performance was a sensitivity of 92.28%, a specificity of 94.95%, and an overall score of 93.61%.

Xiao et al. (2019) took 3-s 1-D PCG waveforms as the inputs of a CNN. First, the initial low-level features were extracted by 64 convolutional filters. Then max pooling layers were used to further reduce the spatial size of the feature maps. After that, the feature maps were fed to stacked clique blocks. The reported best performance with 10-fold cross-validation was an accuracy of 93%, a sensitivity of 86%, a specificity of 95%, and an overall score of 91%.

Das et al. (2019) extracted three kinds of features from the PCG signal, namely MFCC, short-time Fourier transform and Cochleagram features, and then fed them to a supervised artificial neural network for classification. The reported best performance with 10-fold cross-validation was an accuracy of 93.7%, a sensitivity of 84.5%, a specificity of 95.2%, and an overall score of 89.9%.

Different from the methods discussed above, this study proposes a hybrid method to extract nonlinear features using TQWT, VMD, PSR and ED techniques. These features are fed into dynamical estimators consisting of constant RBF neural networks to classify normal and abnormal heart sound signals. A comparison of the classification performance with other state-of-the-art methods on the same database is given in Table 5. The proposed method provides sensitivity, specificity, overall score and accuracy values of 97.73%, 98.05%, 97.89% and 97.89%, respectively, with 10-fold cross-validation. In contrast to the other methods, which feed feature vectors directly into a classifier, the proposed method performs modeling, identification and classification of the PCG system dynamics. This provides another candidate tool for the detection of abnormal heart sound signals.

Table 5 Summary of classification performance on normal and abnormal heart sound signals with 10-fold cross-validation, obtained from the same PhysioNet/CinC Challenge 2016 heart sound database in the literature

| References | Features | Classifier | Sensitivity (%) | Specificity (%) | Overall score (%) | Accuracy (%) |
|---|---|---|---|---|---|---|
| Potes et al. (2016) | Time-frequency features | AdaBoost and CNN | 94.24 | 77.81 | 86.02 | Not mentioned |
| Dominguez-Morales et al. (2017) | Sonogram images converted from frequency bands of PCG | CNN | 95.12 | 93.20 | 94.16 | 97.05 |
| Beritelli et al. (2018) | Features extracted from Gram polynomials and the Fourier transform | Probabilistic neural networks classifier | 93 | 91 | 92 | 94 |
| Bozkurt et al. (2018) | Features extracted from Mel-Spectrogram, MFCC, subband envelopes | CNN | 84.5 | 78.5 | 81.5 | 81.5 |
| Zhang et al. (2019) | Heart sound spectrogram features | LSTM | 96.15 | 93.18 | 94.66 | Not mentioned |
| Adiban et al. (2019) | MFCC features | Gaussian Mixture Models and SVM | 92.28 | 94.95 | 93.61 | Not mentioned |
| Xiao et al. (2019) | 3-s 1-D waveform with 64 convolutional filters | CNN | 86 | 95 | 91 | 93 |
| Das et al. (2019) | MFCC, short-time Fourier transform and Cochleagram features | Supervised artificial neural network | 84.5 | 95.2 | 89.9 | 93.7 |
| Proposed work | Extracted through TQWT, VMD, PSR and ED | Dynamical estimators consisting of neural networks | 97.73 | 98.05 | 97.89 | 97.89 |

In TQWT, the Q-factor affects the features computed at different oscillatory levels. Selecting a proper value of Q improves the system accuracy up to its best performance; any further increase in Q then reduces the system performance. Increasing R while keeping Q unchanged has the effect of increasing the overlap between adjacent frequency responses. The parameter R does not affect the general shape of the wavelet or of the frequency response spectrum (these are controlled by Q). With a larger R, the number of levels J should be increased in order to cover the same frequency range because of the increased overlap. The value of J has been restricted to 15 in the present study because higher values of J lead to higher-dimensional feature matrices, which in turn increase the computational burden. Several experiments are performed to select the optimum Q-factor and J values. The R value is fixed at 3, because the overlap between adjacent frequency responses increases as R increases. For Q and J the minimum value is selected as 1. Hence, Q is varied from 1 to 10 and J is varied from 1 to 15. Then the features are computed from the subband containing the majority of the heart sound signal's energy and fed into RBF neural networks for the modeling, identification and classification of PCG system dynamics based on deterministic learning theory. Figures 10 and 11 depict the effect of varying the Q-factor and the level J on the classification performance. It can be observed from Fig. 10 that the classification accuracy varies significantly with the Q-factor value. The highest classification accuracy is obtained for Q = 3, and the accuracy decreases as the Q-factor is increased further. Therefore, the optimum value of the Q-factor is found to be 3 in the present study. The optimal value of J is determined in the same manner. It can be observed from Fig. 11 that the maximum accuracy is achieved for J = 10. The experimental results demonstrate that features based on the time-frequency properties of TQWT are quite effective in representing the behavior of cardiac sound signals, giving higher classification performance. One way to increase the classification performance of our method could be to fine-tune the TQWT parameters on a subject-by-subject basis, so as to account for inter-individual differences. To what extent the performance can be improved by modifying the tunable parameters of TQWT (globally or for each individual) is not clear and could be the focus of future investigation.
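The interplay between Q, R and J described above can be illustrated numerically. The sketch below is not taken from this study: it uses the TQWT scaling relations as given by Selesnick (2011), quoted here from memory and worth checking against that reference, namely a high-pass scaling factor of 2/(Q+1), a low-pass scaling factor of 1 minus that value divided by R, and an approximate centre frequency for level j. The sampling rate of 2000 Hz and the 50 Hz target are illustrative assumptions, and all function names are hypothetical.

```python
def tqwt_scaling(q: float, r: float) -> tuple:
    """Return (alpha, beta): low-pass and high-pass scaling factors of the TQWT.

    Relations assumed from Selesnick (2011): beta = 2/(Q+1), alpha = 1 - beta/R.
    """
    if q < 1 or r <= 1:
        raise ValueError("TQWT requires Q >= 1 and R > 1")
    beta = 2.0 / (q + 1.0)
    alpha = 1.0 - beta / r
    return alpha, beta


def center_frequency(j: int, q: float, r: float, fs: float) -> float:
    """Approximate centre frequency (Hz) of the level-j band-pass subband."""
    alpha, beta = tqwt_scaling(q, r)
    return alpha ** j * (2.0 - beta) / (4.0 * alpha) * fs


def levels_to_reach(f_target: float, q: float, r: float, fs: float) -> int:
    """Smallest level j whose centre frequency is at or below f_target."""
    j = 1
    while center_frequency(j, q, r, fs) > f_target:
        j += 1
    return j


fs = 2000.0  # assumed sampling rate in Hz (illustrative)
for r in (3, 4, 5):
    alpha, beta = tqwt_scaling(q=3, r=r)
    j_needed = levels_to_reach(50.0, q=3, r=r, fs=fs)
    print(f"R={r}: alpha={alpha:.4f}, beta={beta:.4f}, levels to reach 50 Hz: {j_needed}")
```

With Q held at 3, a larger R pushes the low-pass scaling factor closer to 1, so each level lowers the centre frequency by a smaller ratio and more levels are needed to reach the same frequency. This matches the remark that J should grow with R.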

Fig. 10 Variation of classification accuracy (%) with Q-factor (1 to 10) on balanced recordings with SMOTE method

PSR can reduce the effects of noise or outliers in the PCG signals. Hence, features extracted in phase space might help improve the classification results. The most visual way to observe the dynamic behavior of a chaotic system is through the phase space, which records the trajectory of the system and reflects the changes of the system state. For convenience of observation, the phase space is often studied to directly judge the nonlinear dynamic behavior of chaotic systems. For example, for periodic motion, the phase diagram trajectory is a simple closed curve. Because the heart sound is a quasi-periodic signal, we further use the phase space to analyze the chaotic characteristics of the heart sound. In this work, we have confined our discussion to the embedding dimension d = 3 because of its visualization simplicity. In addition, different studies have found this value to best represent the attractor for human biological systems (Venkataraman and Turaga 2016; Som et al. 2016). From a theoretical viewpoint, the time lag τ has little impact on the classification performance, and in fact there are no limitations or assumptions placed upon it with respect to the underlying time-lag reconstruction theorems for discrete-time signals (Sauer et al. 1991). However, since topological invariance of systems does not equate to identical phase spaces or attractors, from a practical viewpoint the lag must be selected with respect to some relevant criterion (Johnson et al. 2005), such as the first zero crossing of the autocorrelation function of each time series or the average τ value obtained from all the time series in the training dataset using the method proposed in Michael (2005). The dimension d is held constant and the classification task is implemented with the time lag varying from 1 to 20. It can be observed from Fig. 12 that the accuracy is highest for a lag of 5, with a decline followed by a second, lower peak at lag 12. However, to what extent the classification performance can be improved by modifying the dimension and time lag is not clear, and the construction of a regulation principle for the PSR parameters will be considered in future research.
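The reconstruction step and the lag criterion mentioned above can be sketched in a few lines of Python with NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: the Euclidean distance is taken here as the distance of each reconstructed point from the origin, the lag rule is the first zero crossing of the sample autocorrelation, the test signal is synthetic rather than PCG data, and all function names are hypothetical.

```python
import numpy as np


def delay_embed(x, dim=3, lag=5):
    """Time-delay embedding: row t is [x(t), x(t+lag), ..., x(t+(dim-1)*lag)]."""
    x = np.asarray(x, dtype=float)
    n_points = x.size - (dim - 1) * lag
    if n_points <= 0:
        raise ValueError("signal too short for this dim and lag")
    return np.column_stack([x[i * lag : i * lag + n_points] for i in range(dim)])


def distances_from_origin(points):
    """Euclidean norm of every reconstructed point (assumed ED definition)."""
    return np.linalg.norm(points, axis=1)


def first_zero_crossing_lag(x, max_lag=100):
    """Smallest lag at which the sample autocorrelation is no longer positive."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    for lag in range(1, min(max_lag, x.size - 1) + 1):
        if np.dot(x[:-lag], x[lag:]) <= 0.0:
            return lag
    return None  # no crossing found within max_lag


# Synthetic periodic test signal with a period of 40 samples (not PCG data).
signal = np.sin(2 * np.pi * np.arange(400) / 40)
tau = first_zero_crossing_lag(signal)
points = delay_embed(signal, dim=3, lag=tau)
ed = distances_from_origin(points)
print("lag:", tau, "embedded shape:", points.shape, "mean ED:", round(float(ed.mean()), 4))
```

For a sinusoid the selected lag falls near a quarter of the period, which spreads the trajectory out into an open closed curve instead of collapsing it onto the diagonal. Sweeping `lag` from 1 to 20 with `dim` fixed at 3 would mirror the experiment summarized in Fig. 12.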

Fig. 11 Variation of classification accuracy (%) with decomposition level J (1 to 15) on balanced recordings with SMOTE method

The TQWT can be used to extract the dynamical changes in abnormal PCG signals with respect to normal ones. It is a nonlinear method and hence able to capture the subtle variations in the PCG signals, which results in high accuracy. Decomposing signals with VMD is considered insightful because it provides more descriptive details about the original signal. For example, a signal that is decomposed into 4 intrinsic modes is more descriptive than one decomposed into 2 intrinsic modes. VMD is essentially a set of adaptive Wiener filter banks; it transforms signal decomposition into a variational problem and decomposes a signal into an ensemble of band-limited modes concurrently, in a non-recursive way. The 3D phase spaces of the predominant intrinsic modes are reconstructed, in which properties associated with the PCG system dynamics are preserved. PSR plots the PCG system dynamics along the selected $\mu_1$, $\mu_2$, $\mu_3$ and $\mu_4$ intrinsic modes of the 11th subband as trajectories in a 3D phase space diagram and thus visualizes the PCG system dynamics. Features derived from TQWT, VMD, 3D PSR and ED may better reflect the abnormal alterations in the dynamics of the PCG system and can achieve high sensitivity and specificity simultaneously as a discriminator of abnormal heart sound signals. Feeding these features into the RBF neural networks for the modeling and identification of PCG system dynamics can greatly improve the modeling accuracy, which is effective for the anomaly (normal vs. abnormal) detection of PCG recordings.
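The description of VMD as a bank of adaptive Wiener filters can be made concrete with the formulation of Dragomiretskiy and Zosso (2014). It is restated below from that reference as a reading aid, so the notation is theirs and not necessarily that of the present paper: $f$ is the signal being decomposed, $u_k$ and $\omega_k$ are the $k$-th mode and its centre frequency, $K$ is the number of modes, $\lambda$ is the Lagrange multiplier and $\alpha$ is the bandwidth penalty (unrelated to any TQWT parameter).

```latex
% Constrained variational problem: K band-limited modes that sum to the signal
\min_{\{u_k\},\{\omega_k\}} \; \sum_{k=1}^{K}
  \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right]
  e^{-j\omega_k t} \right\|_2^2
\quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t)

% Mode update in the frequency domain: a Wiener-type filter centred at \omega_k
\hat{u}_k^{\,n+1}(\omega) =
  \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}(\omega)/2}
       {1 + 2\alpha\,(\omega - \omega_k)^2}

% Centre-frequency update: centre of gravity of the mode's power spectrum
\omega_k^{\,n+1} =
  \frac{\int_0^{\infty} \omega \, |\hat{u}_k(\omega)|^2 \, d\omega}
       {\int_0^{\infty} |\hat{u}_k(\omega)|^2 \, d\omega}
```

The denominator of the mode update is what gives each mode its band-limited, Wiener-filter character, and all modes are updated together within each iteration instead of being peeled off one after another, which is the sense in which the decomposition is concurrent and non-recursive.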

5 Conclusions

In this study, we propose a new approach combining TQWT, VMD, PSR and ED for the detection of abnormal heart sound signals, which is computationally simple and easy to implement. The results of this study indicate that the pattern classification of heart sound signals can offer an objective method to assess the disparity of PCG system dynamics between normal subjects and pathological patients with heart diseases. However, some limitations remain to be overcome, such as the limited size of the database and the lack of a regulation principle for the TQWT and PSR parameters. Future work will include a clinical validation of the proposed technique with a larger number of pathological patients with different heart diseases. Assessments of the mathematical relationship between the embedding dimension, time lag, Q-factor, redundancy, decomposition level and the classification accuracy can also be considered in future investigations. In the present study we did not regroup the PhysioNet/CinC Challenge 2016 database in a patient-wise manner, since we did not analyze the patients with their variety of illnesses on a case-by-case basis. In future research we will regroup the database in a patient-wise manner and consider the impact of the patients' illnesses (such as heart valve defects and coronary artery disease) on the effectiveness of the stratified classification model. Features introduced in other methods, such as various entropies, the Hurst exponent, the fractal dimension and other nonlinear features, can also be explored in the proposed framework to evaluate its classification performance. The proposed automated detection system can assist physicians in cross-checking their diagnosis of heart diseases.

Fig. 12 Variation of classification accuracy (%) with time lag τ (1 to 20) at dimension 3 on balanced recordings with SMOTE method

Acknowledgements This work was supported by the National Natural Science Foundation of China (Grant No. 61773194), by the Natural Science Foundation of Fujian Province (Grant No. 2018J01542), by the Program for New Century Excellent Talents in Fujian Province University and by the Training Program of Innovation and Entrepreneurship for Undergraduates (Grant No. 201911312009).

Compliance with ethical standards

Conflict of interest There is no conflict of interest.

References

Adiban M, BabaAli B, Shehnepoor S (2019) I-vector based features embedding for heart sound classification. arXiv preprint arXiv:1904.11914

Alam U, Asghar O, Khan SQ, Hayat S, Malik RA (2010) Cardiac auscultation: an essential clinical skill in decline. Br J Cardiol 17(1):8

Babu KA, Ramkumar B, Manikandan MS (2018) Automatic identification of S1 and S2 heart sounds using simultaneous PCG and PPG recordings. IEEE Sens J 18(22):9430–9440

Beritelli F, Capizzi G, Sciuto GL, Napoli C, Scaglione F (2018) Automatic heart activity diagnosis based on Gram polynomials and probabilistic neural networks. Biomed Eng Lett 8(1):77–85

Boutana D, Benidir M, Barkat B (2011) Segmentation and identification of some pathological phonocardiogram signals using time-frequency analysis. IET Signal Process 5(6):527–537

Bozkurt B, Germanakis I, Stylianou Y (2018) A study of time-frequency features for CNN-based automatic heart sound classification for pathology detection. Comput Biol Med 100:132–143

Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP (2002) SMOTE: synthetic minority over-sampling technique. J Artif Intell Res 16:321–357

Cheema A, Singh M (2019) An application of phonocardiography signals for psychological stress detection using non-linear entropy based features in empirical mode decomposition domain. Appl Soft Comput 77:24–33

Chen B, He Z, Chen X, Cao H, Cai G, Zi Y (2011) A demodulating approach based on local mean decomposition and its applications in mechanical fault diagnosis. Meas Sci Technol 22(5):055704

Chen M, Fang Y, Zheng X (2014) Phase space reconstruction for improving the classification of single trial EEG. Biomed Signal Process Control 11:10–16

Clifford GD, Liu C, Moody B, Springer D, Silva I, Li Q, Mark RG (2016) Classification of normal/abnormal heart sound recordings: the PhysioNet/computing in cardiology challenge 2016. In: 2016 Computing in cardiology conference (CinC), pp 609–612

Das S, Pal S, Mitra M (2019) Supervised model for Cochleagram feature based fundamental heart sound identification. Biomed Signal Process Control 52:32–40

Deng SW, Han JQ (2016) Towards heart sound classification without segmentation via autocorrelation feature and diffusion maps. Future Gener Comput Syst 60:13–21

A new approach forthedetection ofabnormal heart sound signals…

1 3

Dominguez-Morales JP, Jimenez-Fernandez AF, Dominguez-Morales MJ, Jimenez-Moreno G (2017) Deep neural networks for the recognition and classification of heart murmurs using neuromorphic auditory sensors. IEEE Trans Biomed Circuits Syst 12(1):24–34

Dragomiretskiy K, Zosso D (2014) Variational mode decomposition. IEEE Trans Signal Process 62(3):531–544

Feng W, Dauphin G, Huang W, Quan Y, Bao W, Wu M, Li Q (2019) Dynamic synthetic minority over-sampling technique-based rotation forest for the classification of imbalanced hyperspectral data. IEEE J Sel Top Appl Earth Obs Remote Sens 12(7):2159–2169

Gavrovska A, Zajic G, Bogdanovic V, Reljin I, Reljin B (2016) Paediatric heart sound signal analysis towards classification using multifractal spectra. Physiol Meas 37(9):1556

Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PCh, Mark RG, Mietus JE, Moody GB, Peng CK, Stanley HE (2003) PhysioBank, physioToolkit, and physioNet: components of a new research resource for complex physiologic signals. Circulation 101(23):e215–e220

Gorinevsky D (1995) On the persistency of excitation in radial basis function network identification of nonlinear systems. IEEE Trans Neural Netw 6(5):1237–1244

Hamidi M, Ghassemian H, Imani M (2018) Classification of heart sound signal using curve fitting and fractal dimension. Biomed Signal Process Control 39:351–359

Hassan AR, Siuly S, Zhang Y (2016) Epileptic seizure detection in EEG signals using tunable-Q factor wavelet transform and bootstrap aggregating. Comput Methods Programs Biomed 137:247–259

Hassani K, Bajelani K, Navidbakhsh M, Doyle DJ, Taherian F (2014) Heart sound segmentation based on homomorphic filtering. Perfusion 29(4):351–359

Huang B, Kunoth A (2013) An optimization based empirical mode decomposition scheme. J Comput Appl Math 240:174–183

Huang NE, Shen Z, Long SR, Wu MC, Shih HH, Zheng Q, Liu HH (1998) The empirical mode decomposition and Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc R Soc Lond A Math Phys Eng Sci 454(1971):903–995

Humayun AI, Ghaffarzadegan S, Ansari MI, Feng Z, Hasan T (2020) Towards domain invariant heart sound abnormality detection using learnable filterbanks. IEEE J Biomed Health Inform. https://doi.org/10.1109/JBHI.2020.2970252

Jain PK, Tiwari AK (2018) A robust algorithm for segmentation of phonocardiography signal using tunable quality wavelet transform. J Med Biol Eng 38(3):396–410

Johnson MT, Povinelli RJ, Lindgren AC, Ye J, Liu X, Indrebo KM (2005) Time-domain isolated phoneme classification using reconstructed phase spaces. IEEE Trans Speech Audio Process 13(4):458–466

Lal GJ, Gopalakrishnan EA, Govind D (2018) Epoch estimation from emotional speech signals using variational mode decomposition. Circuits Syst Signal Process 37(8):3245–3274

Langley P, Murray A (2017) Heart sound classification from unsegmented phonocardiograms. Physiol Meas 38(8):1658

Lee SH, Lim JS, Kim JK, Yang J, Lee Y (2014) Classification of normal and epileptic seizure EEG signals using wavelet transform, phase-space reconstruction, and Euclidean distance. Comput Methods Programs Biomed 116(1):10–25

Li Y, Xu M, Wei Y, Huang W (2015) Rotating machine fault diagnosis based on intrinsic characteristic-scale decomposition. Mech Mach Theory 94:9–27

Li J, Ke L, Du Q, Ding X, Chen X, Wang D (2019a) Heart sound signal classification algorithm: a combination of wavelet scattering transform and twin support vector machine. IEEE Access 7:179339–179348

Li J, Ke L, Du Q (2019b) Classification of heart sounds based on the wavelet fractal and twin support vector machine. Entropy 21(5):472

Liang QZ, Guo XM, Zhang WY, Dai WD, Zhu XH (2015) Identification of heart sounds with arrhythmia based on recurrence quantification analysis and Kolmogorov entropy. J Med Biol Eng 35(2):209–217

Liu L, Wang H, Wang Y, Tao T, Wu X (2010) Feature analysis of heart sound based on the improved Hilbert-Huang transform. In: 3rd IEEE international conference on computer science and information technology, pp 378–381

Liu C, Springer D, Li Q, Moody B, Juan RA, Chorro FJ, Syed Z (2016) An open access database for the evaluation of heart sound algorithms. Physiol Meas 37(12):2181

Merigó JM, Casanovas M (2011) Induced aggregation operators in the Euclidean distance and its application in financial decision making. Expert Syst Appl 38:7603–7608

Mert A (2016) ECG feature extraction based on the bandwidth properties of variational mode decomposition. Physiol Meas 37(4):530

Messner E, Zohrer M, Pernkopf F (2018) Heart sound segmentation-an event detection approach using deep recurrent neural networks. IEEE Trans Biomed Eng 65(9):1964–1974

W.Zeng et al.

1 3

Michael S (2005) Applied nonlinear time series analysis: applications in physics, physiology and finance (Vol 52). World Scientific, Singapore

Mishra M, Banerjee S, Thomas DC, Dutta S, Mukherjee A (2018) Detection of third heart sound using variational mode decomposition. IEEE Trans Instrum Meas 67(7):1713–1721

Mishra M, Pratiher S, Menon H, Mukherjee A (2020) Identification of S1 and S2 heart sounds using spectral and convex hull features. IEEE Sens J 20(8):4311–4320

Nishad A, Pachori RB, Acharya UR (2018) Application of TQWT based filter-bank for sleep apnea screening using ECG signals. J Ambient Intell Humaniz Comput. https://doi.org/10.1007/s12652-018-0867-3

Nogueira DM, Ferreira CA, Gomes EF, Jorge AM (2019) Classifying heart sounds using images of Motifs, MFCC and temporal features. J Med Syst 43(6):168

Noman FM, Salleh SH, Ting CM, Samdin SB, Ombao H, Hussain H (2020) A Markov-switching model approach to heart sound segmentation and classification. IEEE J Biomed Health Inform 24(3):705–716

Papadaniil CD, Hadjileontiadis LJ (2013) Efficient heart sound segmentation and extraction using ensemble empirical mode decomposition and kurtosis features. IEEE J Biomed Health Inform 18(4):1138–1152

Park C, Looney D, Van Hulle MM, Mandic DP (2011) The complex local mean decomposition. Neurocomputing 74(6):867–875

Patidar S, Pachori RB (2014) Classification of cardiac sound signals using constrained tunable-Q wavelet transform. Expert Syst Appl 41(16):7161–7170

Patidar S, Pachori RB, Upadhyay A, Acharya UR (2017) An integrated alcoholic index using tunable-Q wavelet transform based features extracted from EEG signals for diagnosis of alcoholism. Appl Soft Comput 50:71–78

Potes C, Parvaneh S, Rahman A, Conroy B (2016) Ensemble of feature-based and deep learning-based classifiers for detection of abnormal heart sounds. In: 2016 computing in cardiology conference (CinC), pp 621–624

Rivera WA, Xanthopoulos P (2016) A priori synthetic over-sampling methods for increasing classification sensitivity in imbalanced data sets. Expert Syst Appl 66:124–135

Safara F, Doraisamy S, Azman A, Jantan A, Ramaiah ARA (2013) Multi-level basis selection of wavelet packet decomposition tree for heart sound classification. Comput Biol Med 43(10):1407–1414

Salman AH, Ahmadi N, Mengko R, Langi AZ, Mengko TL (2016) Empirical mode decomposition (EMD) based denoising method for heart sound signal and its performance analysis. Int J Electr Comput Eng 6(5):1–8

Sauer T, Yorke JA, Casdagli M (1991) Embedology. J Stat Phys 65(3–4):579–616

Selesnick I (2011) Wavelet transform with tunable Q-factor. IEEE Trans Signal Process 59(8):3560–3575

Shervegar MV, Bhat GV (2018) Heart sound classification using Gaussian mixture model. Porto Biomed J 3(1):e4

Singh SA, Majumder S (2019) Classification of unsegmented heart sound recording using KNN classifier. J Mech Med Biol 19(04):1950025

Sivakumar B (2002) A phase-space reconstruction approach to prediction of suspended sediment concentration in rivers. J Hydrol 258(1–4):149–162

Som A, Krishnamurthi N, Venkataraman V, Turaga P (2016) Attractor-shape descriptors for balance impairment assessment in Parkinson's disease. In: IEEE conference on engineering in medicine and biology society, pp 3096–3100

Springer DB, Tarassenko L, Clifford GD (2015) Logistic regression-HSMM-based heart sound segmentation. IEEE Trans Biomed Eng 63(4):822–832

Sujadevi VG, Mohan N, Kumar SS, Akshay S, Soman KP (2019) A hybrid method for fundamental heart sound segmentation using group-sparsity denoising and variational mode decomposition. Biomed Eng Lett 9(4):413–424

Sun S, Jiang Z, Wang H, Fang Y (2014) Automatic moment segmentation and peak detection analysis of heart sound pattern via short-time modified Hilbert transform. Comput Methods Programs Biomed 114(3):219–230

Sun Y, Li J, Liu J, Chow C, Sun B, Wang R (2015) Using causal discovery for feature selection in multivariate numerical time series. Mach Learn 101(1–3):377–395

Takens F (1981) Detecting strange attractors in turbulence. In: Rand DA, Young L-S (eds) Dynamical systems and turbulence, Warwick 1980. Springer, Berlin, pp 366–381

Varghees VN, Ramachandran KI (2014) A novel heart sound activity detection framework for automated heart sound analysis. Biomed Signal Process Control 13:174–188

Varghees VN, Ramachandran KI (2017) Effective heart sound segmentation and murmur classification using empirical wavelet transform and instantaneous phase for electronic stethoscope. IEEE Sens J 17(12):3861–3872

A new approach forthedetection ofabnormal heart sound signals…

1 3

Aliations

WeiZeng1 · JianYuan1· ChengzhiYuan2· QinghuiWang1· FenglinLiu1· YingWang1

* Wei Zeng

zw0597@126.com

1 School ofPhysics andMechanical andElectrical Engineering, Longyan University,

Longyan364012, People’sRepublicofChina

2 Department ofMechanical, Industrial andSystems Engineering, University ofRhode Island,

Kingston, RI02881, USA

Venkataraman V, Turaga P (2016) Shape distributions of nonlinear dynamical systems for video-based

inference. IEEE Trans Pattern Anal Mach Intell 38(12):2531–2543

Wang C, Hill DJ (2006) Learning from neural control. IEEE Trans Neural Networks 17(1):130–146

Wang C, Hill DJ (2007) Deterministic learning and rapid dynamical pattern recognition. IEEE Trans Neural

Netw 18(3):617–630

Wang C, Hill DJ (2009) Deterministic learning theory for identiﬁcation, recognition and control. CRC

Press, Boca Raton

Wang Y, Liu F, Jiang Z, He S, Mo Q (2017) Complex variational mode decomposition for signal processing

applications. Mech Syst Signal Process 86:75–85

Wang Q, Zhou X, Wang C, Liu Z, Huang J, Zhou Y, Cheng JZ (2019) WGAN-based synthetic minor-

ity over-sampling technique: improving semantic ﬁne-grained classiﬁcation for lung nodules in CT

images. IEEE Access 7:18450–18463

Whitaker BM, Suresha PB, Liu C, Cliﬀord GD, Anderson DV (2017) Combining sparse coding and time-

domain features for heart sound classiﬁcation. Physiol Meas 38(8):1701

Xiao B, Xu Y, Bi X, Zhang J, Ma X (2019) Heart sounds classiﬁcation using a novel 1-D convolutional neu-

ral network with extremely low parameter consumption. Neurocomputing. https ://doi.org/10.1016/j.

neuco m.2018.09.101

Xie Y, Xie K, Xie S (2019) Underdetermined blind source separation for heart sound using higher-order

statistics and sparse representation. IEEE Access 7:87606–87616

Xu B, Jacquir S, Laurent G, Bilbault JM, Binczak S (2013) Phase space reconstruction of an experimental

model of cardiac ﬁeld potential in normal and arrhythmic conditions. In: 35th annual international

conference of the IEEE engineering in medicine and biology society, pp 3274–3277

Xue YJ, Cao JX, Wang DX, Du HK, Yao Y (2016) Application of the variational-mode decomposition for

seismic time-frequency analysis. IEEE J Sel Top Appl Earth Obs Remote Sens 9(8):3821–3831

Zhang WJ, Han JQ, Deng SW (2017) Heart sound classiﬁcation based on scaled spectrogram and partial

least squares regression. Biomed Signal Process Control 32:20–28

Zhang WJ, Han JQ, Deng SW (2019) Abnormal heart sound detection using temporal quasi-periodic fea-

tures and long short-term memory without segmentation. Biomed Signal Process Control 53:101560

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A preview of this full-text is provided by Springer Nature.

Learn more

Content available from Artificial Intelligence Review

This content is subject to copyright. Terms and conditions apply.

