Artificial Intelligence Review
https://doi.org/10.1007/s10462-020-09875-w
A new approach for the detection of abnormal heart sound signals using TQWT, VMD and neural networks
Wei Zeng1 · Jian Yuan1 · Chengzhi Yuan2 · Qinghui Wang1 · Fenglin Liu1 · Ying Wang1
© Springer Nature B.V. 2020
Abstract
Phonocardiogram (PCG) plays an important role in evaluating many cardiac abnormalities, such as valvular heart disease, congestive heart failure and anatomical defects of the heart. However, effective cardiac auscultation requires trained physicians whose work is tough, laborious and subjective. The objective of this study is to develop an automatic classification method for anomaly (normal vs. abnormal) detection of PCG recordings without any segmentation of heart sound signals. Hybrid signal processing and artificial intelligence tools, including tunable Q-factor wavelet transform (TQWT), variational mode decomposition (VMD), phase space reconstruction (PSR) and neural networks, are utilized to extract representative features in order to model, identify and detect abnormal patterns in the dynamics of the PCG system caused by heart disease. First, the heart sound signal is decomposed into a set of frequency subbands with a number of decomposition levels by using the TQWT method. Second, VMD is employed to decompose the subband of the heart sound signal into different intrinsic modes, in which the first four intrinsic modes contain the majority of the heart sound signal's energy and are considered to be the predominant intrinsic modes. They are selected to construct the reference variable for analysis. Third, the phase space of the reference variable is reconstructed, in which the properties associated with the nonlinear PCG system dynamics are preserved. Three-dimensional PSR together with Euclidean distance has been utilized to derive features, which demonstrate a significant difference in PCG system dynamics between normal and abnormal heart sound signals. Finally, the PhysioNet/CinC Challenge heart sound database is used for evaluation and the synthetic minority over-sampling technique is applied to balance the datasets. Using 10-fold cross-validation, experimental results demonstrate that the proposed features with a dynamical neural network based classifier yield classification performance with sensitivity, specificity, overall score and accuracy values of 97.73%, 98.05%, 97.89%, and 97.89%, respectively. The results verify the effectiveness of the proposed method, which can serve as a potential candidate for automatic anomaly detection in clinical applications.
Keywords Heart sound · Phonocardiogram (PCG) · Tunable Q-factor wavelet transform (TQWT) · Variational mode decomposition (VMD) · Phase space reconstruction (PSR) · System dynamics · Synthetic minority over-sampling technique (SMOTE) · Neural networks
Extended author information available on the last page of the article
1 Introduction
Cardiac auscultation is one of the most popular non-invasive and cost-effective procedures for the early diagnosis of various cardiac abnormalities, such as valvular heart disease, congestive heart failure and anatomical defects of the heart (Alam et al. 2010). However, effective cardiac auscultation requires trained physicians, who are often not accessible in remote regions and low-income countries. In addition, physicians' work is tough, tedious and subjective. Therefore, machine learning based automated heart sound classification systems can have a significant impact on the early diagnosis of cardiac diseases (Humayun et al. 2020).
Automated classification of heart sound signals (i.e., the phonocardiogram, PCG) has attracted increasing attention and has been extensively studied in the past few decades. It can be generally divided into two areas: (1) segmentation of the heart sound signals; and (2) detection of heart sound recordings as pathologic or physiologic (Humayun et al. 2020). For the former, several PCG signal segmentation methods have been proposed in previous studies based on digital filters (Varghees et al. 2014), Fourier transform (FT), short-time Fourier transform (STFT) and time-frequency representation (Boutana et al. 2011), Hilbert transform (HT) (Sun et al. 2014), homomorphic filtering (Hassani et al. 2014), empirical wavelet transform (EWT) (Varghees and Ramachandran 2017), wavelet packet transform (WPT) (Safara et al. 2013), empirical mode decomposition (EMD) (Cheema and Singh 2019), ensemble EMD (EEMD) (Papadaniil and Hadjileontiadis 2013), variational mode decomposition (VMD) (Sujadevi et al. 2019), Mel frequency cepstral coefficient (MFCC) (Nogueira et al. 2019), and higher order statistics (Xie et al. 2019). Springer et al. (2015) proposed a logistic regression based hidden semi-Markov model (HSMM) for the segmentation of the first (S1) and second (S2) heart sounds within noisy, real-world PCG recordings. Varghees and Ramachandran (2017) proposed an EWT based algorithm for PCG signal decomposition. Messner et al. (2018) proposed an event detection approach with deep recurrent neural networks (DRNNs) for heart sound segmentation, i.e. the detection of the state sequence of the S1 and S2 heart sounds. In contrast, Deng and Han (2016) proposed a new framework for heart sound classification without any segmentation. They extracted autocorrelation features from the sub-band envelopes by computing the sub-band coefficients of the heart sound signal with the discrete wavelet decomposition (DWT). The autocorrelation features were then used to obtain a unified feature representation with diffusion maps.
For the detection of heart sound recordings as pathologic or physiologic, researchers have utilized various machine learning algorithms, such as support vector machine (SVM) (Li et al. 2019a), neural network (NN) (Beritelli et al. 2018), hidden semi-Markov model (HSMM) (Noman et al. 2020), k-nearest neighbor (KNN) (Singh and Majumder 2019), decision tree (Langley and Murray 2017), and convolutional neural network (CNN) (Xiao et al. 2019). Zhang et al. (2017) proposed a scaled spectrogram and partial least squares regression (PLSR) based method for the extraction of effective features from PCG signals. These features were then fed to an SVM for the classification of PCG signals. Whitaker et al. (2017) combined sparse coding features with time-domain features to classify PCG signals by using an SVM classifier. Hamidi et al. (2018) utilized curve fitting and Mel frequency cepstrum coefficients (MFCC) fused with the fractal dimension to extract features from heart sound signals. The nearest neighbor classifier with Euclidean distance was then used for the classification task. Zhang et al. (2019) proposed a method for abnormal heart sound detection using temporal quasi-periodic features and long short-term memory (LSTM) without segmentation. Bozkurt et al. (2018) fed MFCC and Mel-spectrogram features into a CNN for PCG signal classification.
The above-mentioned works have achieved excellent performance by using different signal processing and machine learning methods. Nonetheless, since abnormal heart sound detection is based upon PCG signals, the choice of signal processing techniques and of feature extraction and selection becomes critical and challenging in the design of specialized computerized systems. Due to the discrete-time, oscillatory and nonlinear characteristics of heart sound signals (Li et al. 2019b), numerous methods combining time-domain, frequency-domain and nonlinear analysis have been developed to handle the classification problem. For time-frequency-domain analysis, the tunable Q-factor wavelet transform (TQWT) has recently become popular in biomedical signal processing as a flexible, discrete wavelet transform that is particularly applicable to analysing oscillatory signals (Selesnick 2011; Nishad et al. 2018; Patidar et al. 2017; Hassan et al. 2016). The TQWT is capable of adjusting its Q-factor and has thus emerged as a powerful tool for oscillatory signal analysis. By changing the Q-factor and redundancy, the oscillatory behavior of the wavelet basis can better reflect the oscillatory behavior of the signal (Selesnick 2011). A sparse signal representation can then be obtained, which in turn improves the performance of sparsity-based signal processing for applications in denoising, classification and signal separation. Patidar and Pachori (2014) proposed a constrained TQWT based segmentation of cardiac sound signals into heart beat cycles. The features obtained from heart beat cycles of separately reconstructed heart sounds and murmurs can better represent the various types of cardiac sound signals than those obtained from signals containing both. Therefore, heart sounds and murmurs were separated using constrained TQWT. Jain and Tiwari (2018) presented a segmentation method for the PCG signal, in which the parameters of TQWT were tuned to vary the frequency range of the approximation level such that its kurtosis was maximized.
The intrinsic characteristics of the heart sound signal can also be revealed from a nonlinear perspective, which provides important information for feature extraction. Nonlinear parameters, extracted through different types of entropies (Cheema and Singh 2019), multifractal analysis (Gavrovska et al. 2016), and recurrence quantification analysis (RQA) (Liang et al. 2015), have been employed for the automatic detection of abnormal heart sound signals. Considering that the heart sound signal is highly random, nonlinear and nonstationary in nature (Li et al. 2019b), self-adaptive signal processing methods, such as empirical mode decomposition (EMD) (Huang et al. 1998; Huang and Kunoth 2013) and local mean decomposition (LMD) (Park et al. 2011), have been employed to extract effective and predominant features from heart sound signals (Cheema and Singh 2019; Salman et al. 2016; Liu et al. 2010). EMD decomposes a multi-component signal into a number of individual monocomponents, that is, intrinsic mode functions and a residual signal, while LMD decomposes any complicated signal into a series of product functions. However, these methods have some drawbacks: the EMD method suffers from over-envelope, mode mixing, end effects and unexplainable negative frequencies caused by the Hilbert transformation (Chen et al. 2011), while the LMD method suffers from distorted components, mode mixing and time-consuming decomposition (Li et al. 2015). Recently, variational mode decomposition (VMD) was proposed by Dragomiretskiy and Zosso (2014) as an alternative to EMD and LMD for the separation of composite real-valued time series into their respective modes. VMD has been extensively used in the areas of biomedical signal processing, speech signal processing and seismic signal processing (Mert 2016; Lal et al. 2018; Xue et al. 2016). It has been reported that VMD is theoretically better founded than the sequential iterative sifting of EMD. VMD is based on a clear variational model and the resulting minimization steps perform concurrent mode extraction in an intuitive way (Wang et al. 2017). It was also pointed out by Dragomiretskiy and Zosso (2014) that VMD has some advantages over EMD in tone separation and is less sensitive to noise and sampling. VMD captures the relevant center frequencies, which can ensure good frequency separation, and is efficient for identifying various discontinuities present in a non-stationary signal (Dragomiretskiy and Zosso 2014; Mert 2016). Sujadevi et al. (2019) used a group sparsity algorithm to denoise the measured PCG signals by exploiting the group sparse (GS) property of PCG signals. The denoised GS-PCG signals were then decomposed into modes with specific spectral characteristics using the VMD algorithm. The appropriate mode for further processing was selected based on mode central frequencies and mode energy. This was followed by the extraction of the Hilbert envelope and a thresholding on the selected mode to segment S1 and S2 heart sounds. Mishra et al. (2018) employed the VMD technique for the separation of heart sound (HS) and lung sound (LS) signals, thereby minimizing the HS interference in LS signals. Mishra et al. (2020) used VMD to generate a set of amplitude and frequency modulated narrow band-limited components (NBCs). The VMD-based decomposition of PCG signals in terms of NBCs was used for quantifying the nonlinear and non-stationary nature of PCG signals. In the present work we have developed a novel technique to compute representative features based on the TQWT and VMD algorithms, which are applied to the heart sound signals. We hypothesize that these features reflect the abnormal alterations in the dynamics of the PCG system and can achieve high sensitivity and specificity simultaneously as a discriminator of abnormal heart sound signals. The ultimate goal of the present study is to propose a novel method for the detection of abnormal PCG signals. It can provide practitioners with a more robust, simple and computationally efficient computer-aided tool compared with classical cardiac auscultation schemes based on the physicians' experience.
The main contributions of this work are highlighted as follows:
• TQWT decomposes the heart sound signal into different frequency bands, which are used to extract the main subband containing the majority of the heart sound signal's energy.
• The VMD method captures most of the signal information, preserving important waveform features such as slight asymmetry. It resolves mode mixing and aliasing problems with high computational efficiency. With the employment of VMD, the variability of the heart sound signal can be measured. The first four intrinsic modes are then extracted as predominant modes, which contain the majority of the heart sound signal's energy.
• The 3D phase space of the predominant intrinsic modes is reconstructed, in which properties associated with the PCG system dynamics are preserved.
• A reliable model for the anomaly detection of PCG recordings is proposed based on the difference in PCG system dynamics between normal and abnormal heart sound signals.
The rest of this paper is organized as follows. Section 2 introduces the details of the proposed method, including the PhysioNet/CinC Challenge 2016 heart sound database, TQWT, VMD, PSR, ED, feature extraction and selection, and learning and classification mechanisms. Section 3 presents experimental results. Sections 4 and 5 give some discussions and conclusions, respectively.
2 Method
In this section, we propose a method to discriminate between normal and abnormal heart sound signals using the information obtained from nonlinear PCG system dynamics for anomaly detection of PCG recordings. It is divided into the training stage and the classification stage, which include the following steps. In the first step, TQWT is employed to decompose the heart sound signal into different frequency bands. In the second step, VMD is applied to decompose the predominant subband of the heart sound signal into several intrinsic modes to extract predominant modes. In the third step, PSR is applied to extract the nonlinear dynamics of the PCG system and Euclidean distances are computed. Finally, feature vectors are fed into the neural networks for the modeling and identification of PCG system dynamics. The difference in PCG system dynamics between normal and abnormal heart sound signals is then used for the classification task. The procedure of the proposed algorithm is illustrated in Fig. 1.
2.1 Heart sound database
In this study we utilize the popular, public PhysioNet/CinC Challenge 2016 heart sound database (Liu et al. 2016; Goldberger et al. 2003), which is available at https://physionet.org/content/challenge-2016. This database consists of six heart sound datasets (a through f) from different research groups. In these datasets, heart sound signals were sourced from several contributors around the world, from both healthy subjects and pathological patients with certain heart diseases. Specifically, the Challenge set consists of 3153 heart sound recordings from 764 subjects/patients, lasting from 5 s to just over 120 s, which were resampled to 2000 Hz. Figure 2 shows sample waveforms of a normal and an abnormal heart sound signal.
Fig. 1 Flowchart of the proposed method for the anomaly detection of PCG recordings using TQWT,
VMD, PSR, ED and neural networks
The heart sound recordings were collected from different locations on the body, of which the four typical locations are the aortic area, pulmonic area, tricuspid area and mitral area. In the database, heart sound recordings were divided into two types: normal and abnormal. The normal recordings were from healthy subjects while the abnormal ones were from patients with a confirmed cardiac diagnosis. The patients suffered from a variety of illnesses (which we do not provide on a case-by-case basis), but typically they were heart valve defects and coronary artery disease patients. All the recordings from the patients were generally labelled as abnormal. The grouped types are further divided into the training dataset and testing dataset using the 10-fold cross-validation method. The details of the datasets are shown in Table 1. The number of normal recordings is 2488 while the number of abnormal recordings is 665. All six datasets are unbalanced, i.e., the number of normal recordings does not equal that of abnormal recordings.
Fig. 2 The waveforms of heart sound signals: (a) normal heart sound signal; (b) abnormal heart sound signal

Table 1 Numbers of raw and balanced recordings for each dataset (# denotes 'number of')

Dataset            # Raw recordings      # Recordings after      # Recordings after
                                         selected balancing      balancing with SMOTE
                   Abnormal   Normal     Abnormal   Normal       Abnormal   Normal
a                  292        117        117        117          292        234
b                  104        386        104        104          312        386
c                  24         7          7          7            24         21
d                  28         27         27         27           28         27
e                  183        1871       183        183          1830       1871
f                  34         80         34         34           68         80
Total              665        2488       472        472          2554       2619
Ratio of abnormal
to normal          0.27                  1                       0.98

A balanced heart sound database is first selected, in which the abnormal and normal signals have the same number of recordings, as shown in Table 1 (otherwise, without prior probabilities on the illness, a prevalence bias would be created). However, the selected balanced database might reduce the number of raw recordings, especially in Datasets b and e. Therefore, we then adopt the synthetic minority over-sampling technique (SMOTE) algorithm (Chawla et al. 2002) to over-sample the minority class, so as to balance the database without greatly reducing the number of raw recordings of normal and abnormal heart sound signals. SMOTE is a popular over-sampling technique for handling imbalanced class data, which creates synthetic samples in the minority group by applying an iterative search and selection approach (Feng et al. 2019; Wang et al. 2019; Rivera and Xanthopoulos 2016). Each observation from the minority class is iterated through until the needed number is reached.
The working principle of the SMOTE algorithm is briefly described as follows. For details, please refer to Chawla et al. (2002).
• Required: Minority data $D = \{x_i \in X\}$, where $i = 1, 2, \ldots, T$. Number of minority instances ($T$), SMOTE percentage ($N$), number of nearest neighbors ($k$).
• for $i = 1, 2, \ldots, T$ do
    find the $k$ nearest minority neighbors of $x_i$
    $\hat{N} = \lfloor N/100 \rfloor$
    while $\hat{N} \neq 0$ do
        select one of the $k$ nearest neighbors, $\bar{x}$
        select a random number $\alpha \in [0, 1]$
        $\hat{x} = x_i + \alpha(\bar{x} - x_i)$
        append $\hat{x}$ to $S$
        $\hat{N} = \hat{N} - 1$
• Output: Synthetic data $S$
The datasets balanced with the SMOTE algorithm are listed in Table 1.
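The SMOTE steps listed above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the implementation used in the study: the function name `smote`, the `seed` argument and the brute-force neighbor search are assumptions made for the example, and feature vectors are assumed to be plain lists of floats.

```python
import math
import random

def smote(minority, n_percent, k, seed=0):
    """Sketch of the SMOTE loop described above (hypothetical helper).

    minority  : list of feature vectors (lists of floats), the set D
    n_percent : SMOTE percentage N (e.g. 200 gives two synthetic samples per instance)
    k         : number of nearest minority neighbors
    """
    rng = random.Random(seed)
    per_instance = n_percent // 100        # N_hat = floor(N / 100)
    synthetic = []                         # output set S
    for i, x_i in enumerate(minority):
        # Brute-force search for the k nearest minority neighbors of x_i.
        others = [x for j, x in enumerate(minority) if j != i]
        others.sort(key=lambda x: math.dist(x, x_i))
        neighbors = others[:k]
        if not neighbors:                  # a single-sample class has no neighbors
            continue
        for _ in range(per_instance):
            x_bar = rng.choice(neighbors)  # one of the k nearest neighbors
            alpha = rng.random()           # random number in [0, 1)
            # x_hat = x_i + alpha * (x_bar - x_i), a point on the joining segment
            synthetic.append([a + alpha * (b - a) for a, b in zip(x_i, x_bar)])
    return synthetic
```

Because every synthetic sample lies on the segment between a minority instance and one of its neighbors, the new samples stay inside the region already occupied by the minority class.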
The heart sound signals are subjected to the following de-noising preprocessing step. Heart sound signals obtained using diagnostic tools are usually contaminated with noise from various sources. This noise hinders the early detection of mild heart sounds in the PCG signals, so filtering to remove such artifacts becomes essential (Shervegar and Bhat 2018). Filtering should preserve all diagnostic information required for the analysis of the PCG signals while removing the unwanted components, i.e., noise. The heart sounds taken from the PhysioNet database are contaminated with various types of noise and are therefore heavily filtered to remove as much noise as possible. A 6th-order Chebyshev low-pass filter with a cut-off frequency of 140 Hz is used for this purpose. The noise lies at high frequencies while the diagnostic information lies at low frequencies, so filtering removes the high-frequency noise.
2.2 Tunable Q‑factor wavelet transform (TQWT)
Wavelet transform is an effective time-frequency tool for the analysis of non-stationary signals. The tunable Q-factor wavelet transform (TQWT) is a flexible, fully discrete wavelet transform suitable for the analysis of oscillatory signals (Selesnick 2011). TQWT depends on three adjustable parameters: the Q-factor ($Q$), the redundancy ($R$), and the decomposition level ($J$). Generally, $Q$ measures the oscillatory behavior and waveform shape of the wavelet. $R$ helps localize the wavelet in the time domain without affecting its shape. The decomposition level $J$ controls the expansion extent and bandpass location of the wavelet. There will be a total of $J+1$ subbands. For the TQWT parameters, the wavelet transform should have a low Q-factor when the signal exhibits little or no oscillatory behavior. On the other hand, the wavelet transform should have a relatively high Q-factor for the analysis and processing of oscillatory signals. $Q$ is often set to a high value because heart sound signals have more oscillations. It is worth noting that unwanted excessive ringing of wavelets needs to be prevented while performing TQWT by choosing a value of $R$ greater than or equal to 3 (Selesnick 2011). Generally, a value of $R=3$ is recommended. The TQWT decomposes heart sound signals into subbands with a number of decomposition levels by using the input parameters ($Q$, $R$, and $J$). TQWT consists of two iterative band-pass filter banks, i.e., the high resonance component filter $H_{\mathrm{filter}}(\omega)$ and the low resonance component filter $L_{\mathrm{filter}}(\omega)$. The resonance characteristics of an oscillatory signal can be represented by the quality factor $Q$, i.e. the ratio of its center frequency to its bandwidth, $Q = f_c / B_w$, where $f_c$ denotes the center frequency and $B_w$ represents the bandwidth of the signal.
Let the low-pass and high-pass scaling factors of the two-channel filter bank be denoted by $\lambda$ and $\sigma$, respectively. In order to prevent excessive redundancy and achieve perfect reconstruction, the scaling factors should satisfy $0 < \lambda < 1$, $0 < \sigma \leq 1$, $\lambda + \sigma > 1$. Mathematically, the low-pass filter $L_{\mathrm{filter}}(\omega)$ and high-pass filter $H_{\mathrm{filter}}(\omega)$ are expressed as follows (Selesnick 2011), respectively:

$$L_{\mathrm{filter}}(\omega) = \begin{cases} 1, & \text{if } |\omega| \leq (1-\sigma)\pi \\ \theta\!\left(\dfrac{\omega + (\sigma-1)\pi}{\lambda+\sigma-1}\right), & \text{if } (1-\sigma)\pi < |\omega| < \lambda\pi \\ 0, & \text{if } \lambda\pi \leq |\omega| \leq \pi \end{cases} \tag{1}$$

and

$$H_{\mathrm{filter}}(\omega) = \begin{cases} 0, & \text{if } |\omega| \leq (1-\sigma)\pi \\ \theta\!\left(\dfrac{\lambda\pi - \omega}{\lambda+\sigma-1}\right), & \text{if } (1-\sigma)\pi < |\omega| < \lambda\pi \\ 1, & \text{if } \lambda\pi \leq |\omega| \leq \pi \end{cases} \tag{2}$$

where $\theta(\omega)$ is the frequency response of the Daubechies filter and is defined by the following expression:

$$\theta(\omega) = 0.5 \times (1 + \cos(\omega)) \times \sqrt{2 - \cos(\omega)}, \quad |\omega| \leq \pi. \tag{3}$$

The Q-factor, $R$ and the maximum number of decomposition levels $J_{\max}$ can be expressed in terms of the parameters $\lambda$ and $\sigma$ as follows:

$$Q = \frac{f_c}{B_w} = \frac{2-\sigma}{\sigma}; \quad R = \frac{\sigma}{1-\lambda}; \quad J_{\max} = \frac{\log(\sigma L / 8)}{\log(1/\lambda)}, \tag{4}$$

where $L$ is the length of the analysed heart sound signal. Detailed expressions of $Q$, $R$, $J_{\max}$, $f_c$ and $B_w$ are provided in Selesnick (2011).

Fig. 3 Examples of subbands of 10 levels TQWT of the normal and abnormal heart sound signals

In order to extract efficient heart sound signal bands, 10 levels ($J=10$, $J+1=11$ subbands) of TQWT with $Q=3$ and $R=3$ have been empirically selected in this study. Figures 3 and 4 show the decomposed TQWT coefficients and the energy distribution over the subbands for normal and abnormal PCG signals. Here, subband 1 corresponds to the high frequencies and subband 11 corresponds to the low frequencies. It is deduced that heart sound activity shows significant variations in value over all frequency subbands. However, low frequency subbands show large variation in heart sound activity and carry a high amount of energy compared to high frequency subbands. It is observed from these figures that the majority of the heart sound signal's energy is concentrated in the 11th subband (marked as $\mathrm{Sub}_{11}$), especially for the abnormal heart sound signal. In comparison, nearly 2% of the normal heart sound signal's energy is distributed in subbands 9 and 10, respectively, which means the energy is relatively decentralized. Since the majority of the heart sound signal's energy is concentrated in the 11th subband, $\mathrm{Sub}_{11}$ is selected for feature acquisition.
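As a worked illustration of Eqs. (1) to (4), the following Python sketch inverts Eq. (4) to recover the scaling factors from the chosen Q and R, evaluates J_max, and samples the two filter responses. It is only a sketch under stated assumptions: the function names are hypothetical, the responses are taken to be symmetric in ω (so |ω| is used inside θ), and the signal length of 10000 samples is an illustrative value corresponding to a 5 s recording at 2000 Hz.

```python
import math

def tqwt_scaling(q, r):
    """Invert Eq. (4): Q = (2 - sigma) / sigma and R = sigma / (1 - lambda)."""
    sigma = 2.0 / (q + 1.0)
    lam = 1.0 - sigma / r
    return lam, sigma

def j_max(lam, sigma, length):
    """Maximum number of decomposition levels from Eq. (4)."""
    return math.log(sigma * length / 8.0) / math.log(1.0 / lam)

def theta(w):
    """Transition function of Eq. (3)."""
    return 0.5 * (1.0 + math.cos(w)) * math.sqrt(2.0 - math.cos(w))

def l_filter(w, lam, sigma):
    """Low-pass response of Eq. (1); assumes symmetry in w."""
    a = abs(w)
    if a <= (1.0 - sigma) * math.pi:
        return 1.0
    if a < lam * math.pi:
        return theta((a + (sigma - 1.0) * math.pi) / (lam + sigma - 1.0))
    return 0.0

def h_filter(w, lam, sigma):
    """High-pass response of Eq. (2); assumes symmetry in w."""
    a = abs(w)
    if a <= (1.0 - sigma) * math.pi:
        return 0.0
    if a < lam * math.pi:
        return theta((lam * math.pi - a) / (lam + sigma - 1.0))
    return 1.0

# Parameters used in this study: Q = 3, R = 3.
lam, sigma = tqwt_scaling(3, 3)          # sigma = 0.5, lambda = 5/6
levels = j_max(lam, sigma, 10000)        # illustrative length: 5 s at 2000 Hz
```

With Q = 3 and R = 3 the scaling factors satisfy λ + σ > 1 as required, and the resulting J_max for this illustrative length is well above the J = 10 levels used here. In the transition band the two arguments of θ sum to π, so the squared responses of the two filters add up to one.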
2.3 Variational mode decomposition (VMD)
VMD aims to decompose a composite input signal $x(t)$ into $K$ intrinsic modes $\mu_n(t)$, $n = 1, 2, \ldots, K$, which have specific sparsity properties while reproducing the input signal. The decomposition process can be written as the following constrained variational problem:

$$\min_{\{\mu_n\},\{\omega_n\}} \left\{ \sum_{n=1}^{K} \left\| \frac{\partial}{\partial t}\left[\left(\delta(t) + \frac{j}{\pi t}\right) * \mu_n(t)\right] e^{-j\omega_n t} \right\|_2^2 \right\}, \quad \text{subject to } \sum_{n=1}^{K} \mu_n(t) = x(t), \tag{5}$$

where $K$ is the number of decomposition modes, $\frac{\partial}{\partial t}[\cdot]$ denotes the partial derivative of a function, $\delta$ is the Dirac function, '$*$' represents convolution, $\{\mu_n\} = \{\mu_1, \mu_2, \ldots, \mu_K\}$ is the set of all modes, $\{\omega_n\} = \{\omega_1, \omega_2, \ldots, \omega_K\}$ is the set of center frequencies, $t$ is the time index, and $j$ is the complex square root of $-1$.

Fig. 4 Examples of the energy distribution of the subbands of TQWT of the normal and abnormal heart sound signals: (a) normal; (b) abnormal (horizontal axis: subband; vertical axis: subband energy, % of total)

Considering a quadratic penalty term and Lagrange multipliers $\eta$, the above-mentioned constrained variational problem can be transformed into an unconstrained optimization problem, which is represented as follows:

$$L(\{\mu_n\},\{\omega_n\},\eta) = \alpha \sum_{n=1}^{K} \left\| \frac{\partial}{\partial t}\left[\left(\delta(t) + \frac{j}{\pi t}\right) * \mu_n(t)\right] e^{-j\omega_n t} \right\|_2^2 + \left\| x(t) - \sum_{n=1}^{K} \mu_n(t) \right\|_2^2 + \left\langle \eta(t),\, x(t) - \sum_{n=1}^{K} \mu_n(t) \right\rangle, \tag{6}$$

where $L$ denotes the augmented Lagrangian, $\alpha$ is the balancing parameter of the data-fidelity constraint, and '$\langle \cdot \rangle$' represents the inner product.
The alternate direction method of multipliers (ADMM) is used to generate the decomposed modes and their center frequencies, which are shifted at each update of a mode (Dragomiretskiy and Zosso 2014). The solution of Eq. (6) can be derived by using ADMM, in which the solution for $\mu_n$ and $\omega_n$ mainly consists of the following steps:
• Step 1 Intrinsic mode update. Wiener filtering is embedded for updating the mode directly in the Fourier domain with a filter tuned to the current center frequency. The updated mode is obtained as follows:

$$\hat{\mu}_n^{\kappa+1}(\omega) = \frac{\hat{x}(\omega) - \sum_{i \neq n} \hat{\mu}_i(\omega) + \dfrac{\hat{\eta}(\omega)}{2}}{1 + 2\alpha(\omega - \omega_n)^2}, \tag{7}$$

where $\kappa$ is the number of iterations, and $\hat{x}(\omega)$, $\hat{\mu}_i(\omega)$ and $\hat{\eta}(\omega)$ represent the Fourier transforms of $x(t)$, $\mu_i(t)$ and $\eta(t)$, respectively.
• Step 2 Center frequency update. The center frequency is updated as the center of gravity of the corresponding mode's power spectrum, which is represented as follows:

$$\hat{\omega}_n^{\kappa+1} = \frac{\int_0^{\infty} \omega\, |\hat{\mu}_n(\omega)|^2 \, d\omega}{\int_0^{\infty} |\hat{\mu}_n(\omega)|^2 \, d\omega}. \tag{8}$$

The complete algorithm of VMD can be found in Dragomiretskiy and Zosso (2014). The VMD method can effectively capture narrow-band and wide-band modes, unlike wavelet transform based decomposition approaches, whose subbands have fixed bandwidths (Babu et al. 2018). It is also more robust to noisy data: since each mode is updated by Wiener filtering in the Fourier domain during the optimization process, the updated mode is less affected by noisy disturbances. Therefore, VMD can be more efficient for capturing the signal's short and long variations (Mishra et al. 2018; Sujadevi et al. 2019). Hence we apply the VMD method to make up for the disadvantage of TQWT and to serve as a complementary tool for more effectively extracting features from PCG signals.
Figure 5 shows examples of the VMD of the 11th subband $\mathrm{Sub}_{11}$ of the normal and abnormal heart sound signals. Each $\mathrm{Sub}_{11}$ is decomposed into 6 intrinsic modes, which are respectively denoted by $\mu_1, \mu_2, \ldots, \mu_6$. The lower modes vary slowly in the time domain while the higher modes exhibit faster variation. The results show that the dominant components of the PCG signal are the fundamental heart sounds, which may appear in the first few modes of the signal decomposition.
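The two update steps in Eqs. (7) and (8) can be illustrated on a discrete frequency grid. The sketch below is a simplified, standard-library illustration rather than a full VMD implementation: the function names are hypothetical, spectra are assumed to be given as lists of complex values sampled at the non-negative frequencies in `omegas`, and the integrals in Eq. (8) are replaced by sums over a uniform grid. Only the two updates written out above are covered; the remaining steps of the algorithm are described in Dragomiretskiy and Zosso (2014).

```python
def update_mode(x_hat, modes_hat, eta_hat, omegas, n, omega_n, alpha):
    """Eq. (7): Wiener-filter update of mode n, bin by bin.

    x_hat     : spectrum of the input signal
    modes_hat : list of current mode spectra (one list per mode)
    eta_hat   : spectrum of the Lagrange multiplier
    omegas    : frequency of each bin
    omega_n   : current center frequency of mode n
    alpha     : balancing parameter of the data-fidelity constraint
    """
    updated = []
    for idx, w in enumerate(omegas):
        # Residual after removing all other modes from the input spectrum.
        others = sum(m[idx] for i, m in enumerate(modes_hat) if i != n)
        numerator = x_hat[idx] - others + eta_hat[idx] / 2.0
        updated.append(numerator / (1.0 + 2.0 * alpha * (w - omega_n) ** 2))
    return updated

def update_center_frequency(mode_hat, omegas):
    """Eq. (8): center of gravity of the mode's power spectrum (discrete sums)."""
    power = [abs(v) ** 2 for v in mode_hat]
    return sum(w * p for w, p in zip(omegas, power)) / sum(power)
```

The denominator in `update_mode` equals one at the current center frequency and grows quadratically away from it, so a larger `alpha` yields a narrower band around `omega_n`.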
2.4 Phase space reconstruction (PSR)
It is sometimes necessary to search for patterns in a time series and in a higher dimen-
sional transformation of the time series (Sun et al. 2015). Phase space reconstruction is
a method used to reconstruct the so-called phase space. The concept of phase space is a
useful tool for characterizing any low-dimensional or high-dimensional dynamic system. A
dynamic system can be described using a phase space diagram, which essentially provides
a coordinate system where the coordinates are all the variables comprising the mathematical
formulation of the system. A point in the phase space represents the state of the system at
any given time (Sivakumar 2002; Lee et al. 2014). Every intrinsic mode of the subbands
of the normal and abnormal heart sound signals can be written as the time series vector
$\upsilon=\{\upsilon_1,\upsilon_2,\upsilon_3,\ldots,\upsilon_K\}$, where $K$ is the total number of data points. The phase
space can be reconstructed according to (Lee et al. 2014):

$$Y_j=(\upsilon_j,\upsilon_{j+\tau},\upsilon_{j+2\tau},\ldots,\upsilon_{j+(d-1)\tau}) \tag{9}$$

where $j=1,2,\ldots,K-(d-1)\tau$, $d$ is the embedding dimension of the phase space and $\tau$ is
a time lag. It is worthwhile to mention that the properties associated with the PCG system
dynamics are preserved in the reconstructed phase space.
The behaviour of the signal over time can be visualized using PSR (especially when $d=2$ or 3).
In this work, we have confined our discussion to the embedding dimension $d=3$ because of its
visualization simplicity. In addition, different studies have found this value to best represent
the attractor for human biological systems (Venkataraman and Turaga 2016; Som et al. 2016).
For $\tau$, we either use the first zero crossing of the autocorrelation function for each time
series or the average $\tau$ value obtained from all the time series in the training dataset using
the method proposed in Michael (2005). In this study, we consider the time lag $\tau=5$ to test
the classification performance. PSR for $d=3$ is referred to as 3D PSR.
Reconstructed phase spaces have been proven to be topologically equivalent to the original
system and therefore are capable of recovering the nonlinear dynamics of the generating
system (Takens 1981; Xu et al. 2013). This implies that the full dynamics of the
PCG system are accessible in this space, and for this reason, features extracted from it can
potentially contain more and/or different information than common feature extraction
methods (Chen et al. 2014).
3D PSR is the plot of three delayed vectors $\upsilon_j$, $\upsilon_{j+1}$ and $\upsilon_{j+2}$ to visualize the dynamics of
the PCG system. The Euclidean distance (ED) of a point $(\upsilon_j,\upsilon_{j+1},\upsilon_{j+2})$, which is the distance of
the point from the origin in 3D PSR, can be defined as (Lee et al. 2014)

$$ED_j=\sqrt{\upsilon_j^2+\upsilon_{j+1}^2+\upsilon_{j+2}^2} \tag{10}$$

ED measures can be used in feature extraction and have been studied and applied in many
fields, such as clustering algorithms and induced aggregation operators (Merigó and
Casanovas 2011).
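The reconstruction in Eq. (9) and the distance in Eq. (10) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' MATLAB implementation: the function names `phase_space_reconstruction` and `euclidean_distance` are hypothetical, and the lag is left as a parameter because Eq. (10) is written with unit lags while the text also reports $\tau=5$.

```python
import numpy as np


def phase_space_reconstruction(v, d=3, tau=5):
    """Return the delay vectors Y_j of Eq. (9) as rows of a (K-(d-1)*tau, d) array."""
    v = np.asarray(v, dtype=float)
    n_vectors = v.size - (d - 1) * tau
    if n_vectors <= 0:
        raise ValueError("time series too short for the chosen d and tau")
    # Column i holds the series delayed by i*tau samples.
    return np.column_stack([v[i * tau: i * tau + n_vectors] for i in range(d)])


def euclidean_distance(points):
    """Distance of every reconstructed point from the origin (cf. Eq. (10))."""
    return np.sqrt(np.sum(points ** 2, axis=1))


# Toy series standing in for one intrinsic mode (an assumption for illustration).
mode = np.sin(2 * np.pi * 0.01 * np.arange(500))
psr = phase_space_reconstruction(mode, d=3, tau=1)  # tau=1 mirrors Eq. (10) literally
ed = euclidean_distance(psr)
print(psr.shape, ed.shape)
```

Setting `tau=1` gives the points $(\upsilon_j,\upsilon_{j+1},\upsilon_{j+2})$ used in Eq. (10); any other lag follows Eq. (9).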
Fig. 5 Examples of VMD of $Sub_{11}$ of the normal and abnormal heart sound signals: (a) original $Sub_{11}$ of the normal heart sound signal and its VMD (11th subband and modes $\mu_1$ to $\mu_6$ plotted against samples); (b) original $Sub_{11}$ of the abnormal heart sound signal and its VMD
2.5 Feature extraction and selection
In order to obtain more efficient features, this paper proposes the following extraction
scheme.
(1) Ten-level TQWT is employed to decompose the heart sound signal into eleven
subbands, in which the 11th subband $Sub_{11}$ contains the majority of the heart sound signal's
energy and is selected for analysis.
(2) VMD of the $Sub_{11}$ of the heart sound signal and derivation of predominant intrinsic
modes. The signals obtained by the VMD method, which are a series of decomposed signals,
cannot be directly used for classification because of the high feature dimension. To solve
this problem, Pearson's correlation coefficient is calculated to measure the correlation
between the first six intrinsic modes and the original $Sub_{11}$ of the heart sound signal.
The intrinsic modes with higher correlation coefficients are more highly correlated to the
original signal, which means the signal energy is mostly concentrated in these intrinsic
modes as well. In the present study most of the energy is concentrated in the first four
intrinsic modes ($\mu_1$, $\mu_2$, $\mu_3$ and $\mu_4$), which contain the most important information from the
heart sound signal and are considered to be the predominant intrinsic modes (see
Table 2). In addition, an independent t-test analysis of variance (SPSS Inc., IL, USA) is
used to compare the difference of the first six intrinsic modes between normal and abnormal
heart sound signals in the PhysioNet/CinC Challenge 2016 database. A p value of
Table 2 The average correlation coefficients and their statistical analysis between each intrinsic mode and the original 11th subband ($Sub_{11}$) of TQWT of all the raw normal and abnormal heart sound signals from the PhysioNet/CinC Challenge 2016 heart sound database. Columns $\mu_1$ to $\mu_6$ give the average correlation coefficients. A p value of < 0.05 is considered to indicate statistical significance

| Heart sound type | $\mu_1$ | $\mu_2$ | $\mu_3$ | $\mu_4$ | $\mu_5$ | $\mu_6$ |
|---|---|---|---|---|---|---|
| Normal of Dataset a | 0.4082 | 0.5159 | 0.4388 | 0.3119 | 0.1679 | 0.1523 |
| Abnormal of Dataset a | 0.4342 | 0.5217 | 0.4154 | 0.3045 | 0.1651 | 0.1426 |
| Difference between groups (p value) | 0.002 | 0.042 | 0.001 | 0.044 | 0.549 | 0.158 |
| Normal of Dataset b | 0.4538 | 0.4502 | 0.3163 | 0.2133 | 0.1343 | 0.1479 |
| Abnormal of Dataset b | 0.4885 | 0.4515 | 0.3193 | 0.2148 | 0.1311 | 0.1412 |
| Difference between groups (p value) | 0.013 | 0.036 | 0.034 | 0.042 | 0.485 | 0.612 |
| Normal of Dataset c | 0.4582 | 0.5036 | 0.4206 | 0.2967 | 0.1613 | 0.1521 |
| Abnormal of Dataset c | 0.4818 | 0.5023 | 0.3918 | 0.2623 | 0.1624 | 0.1376 |
| Difference between groups (p value) | 0.003 | 0.048 | <0.001 | <0.001 | 0.852 | 0.109 |
| Normal of Dataset d | 0.4635 | 0.4925 | 0.3690 | 0.2321 | 0.1456 | 0.1180 |
| Abnormal of Dataset d | 0.4535 | 0.5074 | 0.4221 | 0.2999 | 0.1650 | 0.1709 |
| Difference between groups (p value) | 0.037 | 0.047 | <0.001 | <0.001 | 0.068 | <0.001 |
| Normal of Dataset e | 0.4161 | 0.4688 | 0.4284 | 0.3306 | 0.1245 | 0.1166 |
| Abnormal of Dataset e | 0.3320 | 0.5326 | 0.4250 | 0.2798 | 0.1728 | 0.1549 |
| Difference between groups (p value) | <0.001 | <0.001 | 0.474 | <0.001 | <0.001 | <0.001 |
| Normal of Dataset f | 0.4463 | 0.4999 | 0.4678 | 0.3787 | 0.1299 | 0.1152 |
| Abnormal of Dataset f | 0.4389 | 0.4960 | 0.4528 | 0.3865 | 0.1438 | 0.1447 |
| Difference between groups (p value) | 0.019 | 0.175 | <0.001 | 0.039 | 0.52 | <0.001 |
| Mean value of correlation coefficients | 0.4396 | 0.4952 | 0.4056 | 0.2926 | 0.1503 | 0.1412 |
$<0.05$ is considered to indicate statistical significance. It is seen from Table 2 that there
exist significant differences in most cases of the first four intrinsic modes between normal
and abnormal heart sound signals in the six datasets. Hence, based on Pearson's correlation
coefficient and its statistical analysis, $\mu_1$, $\mu_2$, $\mu_3$ and $\mu_4$ of the $Sub_{11}$ of the heart sound
signal are selected as the reference variable $[Sub_{11}^{\mu_1},Sub_{11}^{\mu_2},Sub_{11}^{\mu_3},Sub_{11}^{\mu_4}]^T$ and are used for the
following feature derivation.
(3) Reconstruct the phase space of the reference variable with selected values of $d$ and $\tau$;
(4) Compute ED of 3D PSR of the reference variables. Concatenate them to form a feature
vector $[ED_j^{Sub_{11}^{\mu_1}},ED_j^{Sub_{11}^{\mu_2}},ED_j^{Sub_{11}^{\mu_3}},ED_j^{Sub_{11}^{\mu_4}}]^T$.
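Steps (2) to (4) above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the six "modes" are synthetic sinusoids standing in for a real VMD output, the helper names (`select_predominant_modes`, `ed_of_3d_psr`, `build_feature_matrix`) are hypothetical, and ranking modes by Pearson correlation and keeping the best four is one plausible reading of the selection rule (the paper itself retains the first four modes).

```python
import numpy as np


def select_predominant_modes(subband, modes, n_keep=4):
    """Rank modes by Pearson correlation with the subband and keep the n_keep best.

    Returns the kept mode indices (ascending) and the correlation of every mode.
    """
    subband = np.asarray(subband, dtype=float)
    modes = np.asarray(modes, dtype=float)
    r = np.array([np.corrcoef(subband, mode)[0, 1] for mode in modes])
    keep = np.sort(np.argsort(r)[::-1][:n_keep])
    return keep, r


def ed_of_3d_psr(v, tau=1):
    """Euclidean distance from the origin of each 3D delay vector of v."""
    v = np.asarray(v, dtype=float)
    n = v.size - 2 * tau
    points = np.column_stack([v[0:n], v[tau:tau + n], v[2 * tau:2 * tau + n]])
    return np.sqrt(np.sum(points ** 2, axis=1))


def build_feature_matrix(modes, keep, tau=1):
    """Stack the ED sequences of the kept modes column-wise, one row per index j."""
    return np.column_stack([ed_of_3d_psr(modes[i], tau) for i in keep])


# Synthetic stand-in for six VMD modes: orthogonal sinusoids, the last two much weaker.
t = np.arange(1000) / 1000.0
amplitudes = [4.0, 5.0, 3.0, 2.0, 0.5, 0.4]
modes = np.array([a * np.sin(2 * np.pi * (i + 1) * t) for i, a in enumerate(amplitudes)])
subband = modes.sum(axis=0)

keep, r = select_predominant_modes(subband, modes)
features = build_feature_matrix(modes, keep)
print(keep, np.round(r, 3), features.shape)
```

Each row of `features` plays the role of one feature vector indexed by $j$; in practice the modes would come from a VMD of $Sub_{11}$ rather than from the synthetic signals used here.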
For the PhysioNet/CinC Challenge 2016 heart sound database, heart sound signals are analyzed
and PCG system dynamics are extracted by using TQWT, VMD and 3D PSR. First, ten-level
TQWT of the normal and abnormal heart sound signals is demonstrated in Fig. 3. VMD
of the 11th subband of TQWT of the heart sound signals is exhibited in Fig. 5. The first four
intrinsic modes are utilized to form the reference variable $[Sub_{11}^{\mu_1},Sub_{11}^{\mu_2},Sub_{11}^{\mu_3},Sub_{11}^{\mu_4}]^T$.
Samples of the 3D PSR of the reference variable for normal and abnormal PCG signals are
exhibited in Figs. 6 and 7. It can be observed that phase space tracks of the abnormal heart sound
signals are in a more chaotic state in comparison to the normal heart sound signals. The
asymmetric nature of the portraits fitted on the 3D space portrays the erratic time-varying phase
space dynamics of the abnormal PCG signals. These figures show that patterns related to the
higher dimensional transformations can be more discriminative than those in the time series
Fig. 6 Samples of 3D PSR of $[Sub_{11}^{\mu_1},Sub_{11}^{\mu_2},Sub_{11}^{\mu_3},Sub_{11}^{\mu_4}]^T$ of the normal heart sound signal: (a) 3D PSR of $Sub_{11}^{\mu_1}$ for $\mu_1$; (b) 3D PSR of $Sub_{11}^{\mu_2}$ for $\mu_2$; (c) 3D PSR of $Sub_{11}^{\mu_3}$ for $\mu_3$; (d) 3D PSR of $Sub_{11}^{\mu_4}$ for $\mu_4$. Axes: $\upsilon_j$, $\upsilon_{j+1}$, $\upsilon_{j+2}$
itself. The disparity of the PCG system dynamics between the normal and abnormal PCG signals
is treated as the differentiation criterion in the present study. After 3D PSR, features of
$[ED_j^{Sub_{11}^{\mu_1}},ED_j^{Sub_{11}^{\mu_2}},ED_j^{Sub_{11}^{\mu_3}},ED_j^{Sub_{11}^{\mu_4}}]^T$ for normal and abnormal heart sound signals are derived
through ED computation. It can be observed from Figs. 8 and 9 that the Euclidean distances
calculated from the 3D PSR in normal and abnormal heart sound signals are different from each
other. This implies that the Euclidean distances can serve as useful features in classifying the
normal and abnormal PCG signals. They are fed into the neural networks for the following
modeling, identification and classification of the PCG system dynamics between the two groups.
2.6 Training and modeling mechanism based on selected features
In this section, we present a scheme for modeling and derivation of nonlinear PCG system
dynamics derived from heart sound signals of normal and abnormal subjects based on the
extracted features.
Consider a temporal data sequence $\varphi_\zeta=[Y(1),\ldots,Y(k)]^T\in R^n$ generated from the following
discrete-time PCG dynamical system:

$$Y(k)=F(Y(k-1),\ldots,Y(k-m);p)+v(Y(k-1),\ldots,Y(k-m);p), \tag{11}$$

where $Y(k)=[y_1(k),\ldots,y_n(k)]^T\in R^n$ is the state of the system, which is measurable and
represents the feature $[ED_j^{Sub_{11}^{\mu_1}},ED_j^{Sub_{11}^{\mu_2}},ED_j^{Sub_{11}^{\mu_3}},ED_j^{Sub_{11}^{\mu_4}}]^T$, $p=[p_1,\ldots,p_n]^T$ is a constant
Fig. 7 Samples of 3D PSR of $[Sub_{11}^{\mu_1},Sub_{11}^{\mu_2},Sub_{11}^{\mu_3},Sub_{11}^{\mu_4}]^T$ of the abnormal heart sound signal: (a) 3D PSR of $Sub_{11}^{\mu_1}$ for $\mu_1$; (b) 3D PSR of $Sub_{11}^{\mu_2}$ for $\mu_2$; (c) 3D PSR of $Sub_{11}^{\mu_3}$ for $\mu_3$; (d) 3D PSR of $Sub_{11}^{\mu_4}$ for $\mu_4$. Axes: $\upsilon_j$, $\upsilon_{j+1}$, $\upsilon_{j+2}$
vector of system parameters (different $p$ will generate different dynamical behaviors),
$F(\cdot;p)=[f_1(\cdot;p_1),\ldots,f_n(\cdot;p_n)]^T$ is a smooth but unknown nonlinear PCG system dynamics,
$v(\cdot;p)=[v_1(\cdot;p_1),\ldots,v_n(\cdot;p_n)]^T$ is the modeling uncertainty.
Since the modeling uncertainty $v(\cdot;p)$ and the PCG system dynamics $F(\cdot;p)$ cannot be
decoupled from each other, we consider the two terms together as an undivided term, and
define $\phi(\cdot;p):=F(\cdot;p)+v(\cdot;p)$ as the general PCG system dynamics. The objective of the
training or learning stage is to identify or approximate the general PCG system dynamics
$\phi(\cdot;p)=[\phi_1(\cdot;p_1),\ldots,\phi_n(\cdot;p_n)]^T$ to a desired accuracy via deterministic learning (Wang
and Hill 2006, 2007, 2009).
In the first step, standard radial basis function (RBF) neural networks are constructed in
the following form

$$f_{nn}(Z)=\sum_{i=1}^{N}w_is_i(Z)=W^TS(Z), \tag{12}$$

where $Z$ is the input vector, $W=[w_1,\ldots,w_N]^T\in R^N$ is the weight vector, $N$ is the node
number of the neural networks, and $S(Z)=[s_1(\|Z-\mu_1\|),\ldots,s_N(\|Z-\mu_N\|)]^T$, with
$s_i(\|Z-\mu_i\|)=\exp\left[\frac{-(Z-\mu_i)^T(Z-\mu_i)}{\eta_i^2}\right]$ being a Gaussian function, $\mu_i\,(i=1,\ldots,N)$ being distinct
points in state space, and $\eta_i$ being the width of the receptive field.
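As an illustration of Eq. (12), the following sketch evaluates a Gaussian RBF network for a single input vector. The names `rbf_regressor` and `rbf_output` are hypothetical, and the lattice of centers over $[-1,1]^2$ with a common width is an assumption chosen only to make the example concrete; the paper does not state its center layout at this point.

```python
import numpy as np


def rbf_regressor(z, centers, widths):
    """S(Z): Gaussian basis values s_i = exp(-||Z - mu_i||^2 / eta_i^2)."""
    z = np.asarray(z, dtype=float)
    centers = np.asarray(centers, dtype=float)
    widths = np.asarray(widths, dtype=float)
    squared_dist = np.sum((centers - z) ** 2, axis=1)
    return np.exp(-squared_dist / widths ** 2)


def rbf_output(z, weights, centers, widths):
    """f_nn(Z) = W^T S(Z) as in Eq. (12)."""
    return float(np.dot(weights, rbf_regressor(z, centers, widths)))


# Assumed layout: a 5 x 5 lattice of centers over [-1, 1]^2 with a common width.
grid = np.linspace(-1.0, 1.0, 5)
centers = np.array([[a, b] for a in grid for b in grid])
widths = np.full(len(centers), 0.5)
weights = np.zeros(len(centers))  # weights start at zero before any learning

s = rbf_regressor([0.0, 0.0], centers, widths)
print(s.shape, rbf_output([0.0, 0.0], weights, centers, widths))
```

Because each basis function decays quickly away from its center, only the nodes near the input contribute noticeably to the output, which is the localization property the later convergence argument relies on.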
Fig. 8 Samples of the Euclidean distance of 3D PSR of $[Sub_{11}^{\mu_1},Sub_{11}^{\mu_2},Sub_{11}^{\mu_3},Sub_{11}^{\mu_4}]^T$ of the normal heart sound signal: (a) to (d) show $ED_j$ against the number of the data points for $Sub_{11}^{\mu_1}$, $Sub_{11}^{\mu_2}$, $Sub_{11}^{\mu_3}$ and $Sub_{11}^{\mu_4}$, respectively
In the second step, the following dynamical RBF neural networks are employed to model
and derive the general PCG system dynamics $\phi(\cdot;p)$:

$$\hat Y(k)=A(\hat Y(k-1)-Y(k-1))+\hat W^T(k)S_k(Z), \tag{13}$$

where $\hat Y(k)=[\hat y_1(k),\ldots,\hat y_n(k)]^T\in R^n$ is the state vector of the dynamical model,
$A=diag\{a_1,\ldots,a_n\}$ is a diagonal matrix, with $|a_i|<1$ being design constants, localized
RBF networks $\hat W^T(k)S_k=[\hat W_1^T(k)S_1,\ldots,\hat W_n^T(k)S_n]^T$ are used to approximate the unknown
$\phi(\cdot;p)=[\phi_1(\cdot;p),\ldots,\phi_n(\cdot;p)]^T$, $\hat W^T(k)=[\hat W_1(k),\ldots,\hat W_n(k)]$ is the weight estimate of the
neural networks, $S_k(Z)=S(Y(k-1),\ldots,Y(k-m))$, and $Z=[Y(k-1),\ldots,Y(k-m)]$ is the
input of the neural networks.
From Eqs. (11) and (13), the dynamics of the state estimation error $e_i(k)=\hat y_i(k)-y_i(k)$
satisfy:

$$\begin{aligned}e_i(k+1)&=\hat y_i(k+1)-y_i(k+1)\\&=a_i(\hat y_i(k)-y_i(k))+\tilde W_i^T(k+1)S_k(Z)-\epsilon_i\\&=a_ie_i(k)+\tilde W_i^T(k+1)S_k(Z)-\epsilon_i,\end{aligned} \tag{14}$$

where $\tilde W_i=\hat W_i-W_i^*$, $W_i^*$ is the ideal constant neural network weight, $\phi_i(\cdot;p)=W_i^{*T}S_k+\epsilon_i$,
and $\epsilon_i$ is the ideal neural network approximation error. The weight estimate $\hat W_i$ is updated by the
following Lyapunov-based learning law:
Fig. 9 Samples of the Euclidean distance of 3D PSR of $[Sub_{11}^{\mu_1},Sub_{11}^{\mu_2},Sub_{11}^{\mu_3},Sub_{11}^{\mu_4}]^T$ of the abnormal heart sound signal: (a) to (d) show $ED_j$ against the number of the data points for $Sub_{11}^{\mu_1}$, $Sub_{11}^{\mu_2}$, $Sub_{11}^{\mu_3}$ and $Sub_{11}^{\mu_4}$, respectively
$$\hat W_i(k+1)=\hat W_i(k)-\frac{\alpha P(\hat y_i(k)-y_i(k)-a_i(\hat y_i(k-1)-y_i(k-1)))S_{k-1}(Z)}{1+\lambda_{\max}(P)S_{k-1}^T(Z)S_{k-1}(Z)}, \tag{15}$$

where $0<|\alpha|<2$, $P$ is any symmetric positive definite matrix, and the weight estimation
error of neural networks $\tilde W$ satisfies:

$$\begin{aligned}\tilde W_i(k+1)&=\hat W_i(k+1)-W_i^*\\&=\tilde W_i(k)-\frac{\alpha P(\tilde W_i(k)S_{k-1}(Z)-\epsilon_i)S_{k-1}(Z)}{1+\lambda_{\max}(P)S_{k-1}^T(Z)S_{k-1}(Z)}\\&=\tilde W_i(k)\left[I-\frac{\alpha PS_{k-1}^T(Z)S_{k-1}(Z)}{1+\lambda_{\max}(P)S_{k-1}^T(Z)S_{k-1}(Z)}\right]+\frac{\alpha PS_{k-1}(Z)\epsilon_i}{1+\lambda_{\max}(P)S_{k-1}^T(Z)S_{k-1}(Z)}\end{aligned} \tag{16}$$

Assumption 1 There exists a constant $S_M>0$ such that for all $k\ge 0$, the following
bound is satisfied:

$$\|S(Z(k))\|\le S_M \tag{17}$$

The following theorem indicates the learning ability of the above-mentioned identification
algorithm for the discrete-time PCG system.
Theorem 1 Consider the adaptive system consisting of the nonlinear PCG system (11), the
dynamical RBF network (13) and the neural network weight updating law (15). For almost
any recurrent trajectory $\varphi_\zeta$ with initial condition $\hat W_i(0)=0$, we have: (1) the state
estimation error $e_i(k)$ exponentially converges to a small neighborhood of zero, and the neural
network weight estimate $\hat W_{\zeta i}$ exponentially converges to a small neighborhood of the ideal weight
$W^*_{\zeta i}$; (2) a locally accurate approximation for the unknown $\phi_i(\cdot;p_i)$ to the desired error level
$\epsilon_i$ is obtained along the trajectory $\varphi_\zeta$ by $\bar W_i^TS_k$.
Proof We construct the following form:

$$\begin{bmatrix}z_i(k)\\\tilde W_i(k)\end{bmatrix}=\begin{bmatrix}1&-S_{k-1}^T(Z)\\0&1\end{bmatrix}\begin{bmatrix}e_i(k)\\\tilde W_i(k)\end{bmatrix}$$

Then, the state estimation error and neural network weight estimation error become:
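$$\begin{aligned}z_i(k+1)&=e_i(k+1)-S_k^T(Z)\tilde W_i(k+1)\\&=a_ie_i(k)+\tilde W_i^T(k+1)S_k(Z)-\epsilon_i\\&\quad-S_k^T(Z)\left\{\tilde W_i(k)\left[I-\frac{\alpha PS_{k-1}^T(Z)S_{k-1}(Z)}{1+\lambda_{\max}(P)S_{k-1}^T(Z)S_{k-1}(Z)}\right]+\frac{\alpha PS_{k-1}(Z)\epsilon_i}{1+\lambda_{\max}(P)S_{k-1}^T(Z)S_{k-1}(Z)}\right\}\\&=a_ie_i(k)+\left\{\tilde W_i(k)\left[I-\frac{\alpha PS_{k-1}^T(Z)S_{k-1}(Z)}{1+\lambda_{\max}(P)S_{k-1}^T(Z)S_{k-1}(Z)}\right]+\frac{\alpha PS_{k-1}(Z)\epsilon_i}{1+\lambda_{\max}(P)S_{k-1}^T(Z)S_{k-1}(Z)}\right\}S_k(Z)\\&\quad-\epsilon_i-S_k^T(Z)\left\{\tilde W_i(k)\left[I-\frac{\alpha PS_{k-1}^T(Z)S_{k-1}(Z)}{1+\lambda_{\max}(P)S_{k-1}^T(Z)S_{k-1}(Z)}\right]+\frac{\alpha PS_{k-1}(Z)\epsilon_i}{1+\lambda_{\max}(P)S_{k-1}^T(Z)S_{k-1}(Z)}\right\}\\&=a_ie_i(k)-\epsilon_i-a_i\tilde WS_{k-1}^T(Z)+a_i\tilde W^TS_{k-1}^T(Z)\\&=a_iz_i(k)+a_i\tilde WS_{k-1}^T(Z)-\epsilon_i\end{aligned} \tag{18}$$

and

$$\tilde W_i(k+1)=\tilde W_i(k)\left[I-\frac{\alpha PS_{k-1}^T(Z)S_{k-1}(Z)}{1+\lambda_{\max}(P)S_{k-1}^T(Z)S_{k-1}(Z)}\right]+\frac{\alpha PS_{k-1}(Z)\epsilon_i}{1+\lambda_{\max}(P)S_{k-1}^T(Z)S_{k-1}(Z)} \tag{19}$$

Equations (18) and (19) can be transformed into the form of a state equation:

$$\begin{bmatrix}z_i(k+1)\\\tilde W_i(k+1)\end{bmatrix}=\begin{bmatrix}a_i&a_iS_{k-1}^T(Z)\\0&I-\frac{\alpha PS_{k-1}^T(Z)S_{k-1}(Z)}{1+\lambda_{\max}(P)S_{k-1}^T(Z)S_{k-1}(Z)}\end{bmatrix}\begin{bmatrix}z_i(k)\\\tilde W_i(k)\end{bmatrix}+\begin{bmatrix}-\epsilon_i\\\frac{\alpha PS_{k-1}(Z)\epsilon_i}{1+\lambda_{\max}(P)S_{k-1}^T(Z)S_{k-1}(Z)}\end{bmatrix} \tag{20}$$

By using the local approximation properties of RBF networks, the state estimation error
and the weight estimate learning law can be expressed in a unified form as follows:

$$\begin{bmatrix}z_i(k+1)\\\tilde W_{\zeta i}(k+1)\end{bmatrix}=\begin{bmatrix}a_i&a_iS_{\zeta(k-1)}^T(Z)\\0&I-\frac{\alpha P_\zeta S_{\zeta(k-1)}^T(Z)S_{\zeta(k-1)}(Z)}{1+\lambda_{\max}(P_\zeta)S_{\zeta(k-1)}^T(Z)S_{\zeta(k-1)}(Z)}\end{bmatrix}\begin{bmatrix}z_i(k)\\\tilde W_{\zeta i}(k)\end{bmatrix}+\begin{bmatrix}-\epsilon'_{\zeta i}\\\frac{\alpha P_\zeta S_{\zeta(k-1)}^T(Z)\epsilon'_{\zeta i}}{1+\lambda_{\max}(P_\zeta)S_{\zeta(k-1)}^T(Z)S_{\zeta(k-1)}(Z)}\end{bmatrix} \tag{21}$$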
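and

$$\tilde W_{\bar\zeta i}(k+1)=\tilde W_{\bar\zeta i}(k)\left[I-\frac{\alpha P_{\bar\zeta}S_{\bar\zeta(k-1)}^T(Z)S_{\bar\zeta(k-1)}(Z)}{1+\lambda_{\max}(P_{\bar\zeta})S_{\bar\zeta(k-1)}^T(Z)S_{\bar\zeta(k-1)}(Z)}\right]+\frac{\alpha P_{\bar\zeta}S_{\bar\zeta(k-1)}(Z)\epsilon'_{\zeta i}}{1+\lambda_{\max}(P_{\bar\zeta})S_{\bar\zeta(k-1)}^T(Z)S_{\bar\zeta(k-1)}(Z)} \tag{22}$$

$(\cdot)_{\zeta i}$ and $(\cdot)_{\bar\zeta i}$ stand for terms which are close to the orbit $\varphi_\zeta$ and far away from the orbit
$\varphi_\zeta$, respectively. $S_{\zeta k}$ is a subvector of $S_k$. $\hat W_{\zeta i}$ is the corresponding weight subvector.
$\epsilon'_{\zeta i}=\epsilon_{\zeta i}-\hat W_{\bar\zeta i}^TS_{\bar\zeta k}(Z)=O(\epsilon_{\zeta i})$ is the approximation error along the trajectory $\varphi_\zeta$.
Now, we first prove the stability of the nominal part of Eq. (21). Based on the properties
of RBF networks (Wang and Hill 2006, 2007, 2009), almost any periodic or recurrent
trajectory $\varphi_\zeta$ ensures persistence of excitation (PE) of the regressor subvector $S_{\zeta k}$ (Gorinevsky
1995). With Assumption 1, $S_{\zeta k}$ in (21) satisfies the PE condition. Then, there exist constants
$\alpha_1>0$, $n>n_\zeta>0$, such that:

$$\alpha_1I\le\sum_{k=j}^{j+n-1}S_{\zeta(k-1)}(Z)S_{\zeta(k-1)}^T(Z),\quad\forall j\ge 1 \tag{23}$$

where $n_\zeta$ is the dimension of $S_{\zeta k}$.
Consider the following Lyapunov function candidate:

$$V_i(k)=\beta z_i^2(k)+\tilde W_{\zeta i}^T(k)P_\zeta^{-1}\tilde W_{\zeta i}(k) \tag{24}$$

where $\beta>0$. Then, we have:

$$\begin{aligned}\Delta V_i(k)&=V_i(k+1)-V_i(k)\\&=\beta z_i^2(k+1)+\tilde W_{\zeta i}^T(k+1)P_\zeta^{-1}\tilde W_{\zeta i}(k+1)-\beta z_i^2(k)-\tilde W_{\zeta i}^T(k)P_\zeta^{-1}\tilde W_{\zeta i}(k)\\&=-\beta(1-a_i^2)z_i^2(k)+2\beta a_i^2z_i(k)S_{\zeta(k-1)}^T(Z)\tilde W_{\zeta i}(k)+\beta a_i^2S_{\zeta(k-1)}^T(Z)\tilde W_{\zeta i}(k)S_{\zeta(k-1)}^T\tilde W_{\zeta i}(k)\\&\quad-S_{\zeta(k-1)}^T(Z)\tilde W_{\zeta i}(k)\,\frac{2\alpha I-\frac{\alpha^2P_\zeta S_{\zeta(k-1)}^T(Z)S_{\zeta(k-1)}(Z)}{1+\lambda_{\max}(P_\zeta)S_{\zeta(k-1)}^T(Z)S_{\zeta(k-1)}(Z)}}{1+\lambda_{\max}(P_\zeta)S_{\zeta(k-1)}^T(Z)S_{\zeta(k-1)}(Z)}\,\tilde W_{\zeta i}^T(k)S_{\zeta(k-1)}(Z)\end{aligned} \tag{25}$$

Equation (25) can also be written as:

$$\Delta V_i(k)=-\begin{bmatrix}z_i(k)&S_{\zeta(k-1)}^T(Z)\tilde W_{\zeta i}(k)\end{bmatrix}\begin{bmatrix}\beta(1-a_i^2)&-\beta a_i^2\\-\beta a_i^2&\frac{2\alpha I-\frac{\alpha^2P_\zeta S_{\zeta(k-1)}^T(Z)S_{\zeta(k-1)}(Z)}{1+\lambda_{\max}(P_\zeta)S_{\zeta(k-1)}^T(Z)S_{\zeta(k-1)}(Z)}}{1+\lambda_{\max}(P_\zeta)S_{\zeta(k-1)}^T(Z)S_{\zeta(k-1)}(Z)}-\beta a_i^2\end{bmatrix}\begin{bmatrix}z_i(k)\\\tilde W_{\zeta i}^T(k)S_{\zeta(k-1)}(Z)\end{bmatrix} \tag{26}$$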
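Let

$$D(k)=\begin{bmatrix}\beta(1-a_i^2)&-\beta a_i^2\\-\beta a_i^2&\frac{2\alpha I-\frac{\alpha^2P_\zeta S_{\zeta(k-1)}^T(Z)S_{\zeta(k-1)}(Z)}{1+\lambda_{\max}(P_\zeta)S_{\zeta(k-1)}^T(Z)S_{\zeta(k-1)}(Z)}}{1+\lambda_{\max}(P_\zeta)S_{\zeta(k-1)}^T(Z)S_{\zeta(k-1)}(Z)}-\beta a_i^2\end{bmatrix},$$

when

$$\frac{2\alpha I-\frac{\alpha^2P_\zeta S_{\zeta(k-1)}^T(Z)S_{\zeta(k-1)}(Z)}{1+\lambda_{\max}(P_\zeta)S_{\zeta(k-1)}^T(Z)S_{\zeta(k-1)}(Z)}}{1+\lambda_{\max}(P_\zeta)S_{\zeta(k-1)}^T(Z)S_{\zeta(k-1)}(Z)}-\beta\left[a_i^2+\frac{a_i^4}{1-a_i^2}\right]>0,$$

that is,

$$\frac{2\alpha I-\frac{\alpha^2P_\zeta S_{\zeta(k-1)}^T(Z)S_{\zeta(k-1)}(Z)}{1+\lambda_{\max}(P_\zeta)S_{\zeta(k-1)}^T(Z)S_{\zeta(k-1)}(Z)}}{1+\lambda_{\max}(P_\zeta)\left[a_i^2+\frac{a_i^4}{1-a_i^2}\right]S_{k-1}^T(Z)S_{k-1}(Z)}\ge\frac{2\alpha I-\frac{\alpha^2P_\zeta S_{\zeta(k-1)}^T(Z)S_{\zeta(k-1)}(Z)}{\lambda_{\max}(P_\zeta)S_{\zeta(k-1)}^T(Z)S_{\zeta(k-1)}(Z)}}{1+\lambda_{\max}(P_\zeta)\left[a_i^2+\frac{a_i^4}{1-a_i^2}\right]S_M^2}>\beta>0.$$

Then, $D(k)>0$ leads to the result that

$$\begin{cases}\Delta V(k)<0,&\text{for }\{z_i(k),\tilde W_{\zeta i}(k)S_{\zeta(k-1)}(Z)\}\\\Delta V(k)\le 0,&\text{for }\{z_i(k),\tilde W_{\zeta i}(k)\}\end{cases} \tag{27}$$

Equation (27) means: (1) $z_i(k)$ and $\tilde W_{\zeta i}(k)S_{\zeta(k-1)}(Z)$ converge exponentially to zero when
$k\to\infty$. Hence, $e_i(k)$ converges exponentially to zero when $k\to\infty$; (2) $\tilde W_{\zeta i}$ is uniformly
ultimately bounded. Since $\tilde W_{\zeta i}(k)S_{\zeta(k-1)}(Z)$ converges exponentially to zero, this implies
that $\tilde W_{\zeta i}$ converges to a constant vector $\tilde W_c$.
It can be deduced from Eq. (27) that $S_{\zeta(k-1)}^T(Z)\tilde W_c=0$. Then, we have:

$$S_{\zeta(k-1)}(Z)S_{\zeta(k-1)}^T(Z)\tilde W_c=0,\ \ldots,\ S_{\zeta(k-1+n)}(Z)S_{\zeta(k-1+n)}^T(Z)\tilde W_c=0$$

Summing up the above equations, we have $\tilde W_c\sum_{k=j}^{j+n-1}S_{\zeta(k-1+n)}(Z)S_{\zeta(k-1+n)}^T(Z)=0$. For $S_{\zeta k}(Z)$
satisfying Eq. (23), the matrix $\sum_{k=j}^{j+n-1}S_{\zeta(k-1)}(Z)S_{\zeta(k-1)}^T(Z)$ is positive definite, so $\tilde W_c=0$,
hence $\tilde W_{\zeta i}$ converges exponentially to zero.
Thus, the nominal part of Eq. (21) is exponentially stable. Since $\epsilon_{\zeta i}$ is small, both the state
estimation error $z_i(k+1)$ and the parameter error $\tilde W_{\zeta i}(k+1)$ in Eq. (21) converge exponentially
to small neighborhoods of zero, and the range of the neighborhood is determined by the
parameter $\epsilon_{\zeta i}$.
The convergence of $\hat W_{\zeta i}$ to a small neighborhood of $W^*_{\zeta i}$ implies that along the
trajectory $\varphi_\zeta$,

$$\begin{aligned}\phi_i(\varphi_\zeta;p_i)&=W_{\zeta i}^{*T}S_{\zeta k}(\varphi_\zeta)+\epsilon_{\zeta i}\\&=\hat W_{\zeta i}^TS_{\zeta k}(\varphi_\zeta)-\tilde W_{\zeta i}^TS_{\zeta k}(\varphi_\zeta)+\epsilon_{\zeta i}\\&=\hat W_{\zeta i}^TS_{\zeta k}(\varphi_\zeta)+\epsilon_{\zeta i1}\end{aligned} \tag{28}$$

where $\epsilon_{\zeta i1}=\epsilon_{\zeta i}-\tilde W_{\zeta i}^TS_{\zeta k}(\varphi_\zeta)=O(\epsilon_{\zeta i})=O(\epsilon_i)$ is the practical approximation error for
using $\hat W_{\zeta i}^TS_{\zeta k}$, which is small due to the exponential convergence of $\tilde W_{\zeta i}$.
By this convergence result, we can obtain a constant vector of neural weights according to

$$\bar W_i=\frac{1}{k_b-k_a+1}\sum_{k=k_a}^{k_b}\hat W_i(k) \tag{29}$$

where $\{k_a,\ldots,k_b\}$ represents a piece of time segment after the transient process. Thus,
using $\bar W_{\zeta i}^TS_{\zeta k}(\varphi_\zeta)$, where $\bar W_{\zeta i}$ is the subvector of $\bar W_i$, we have: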
$$\begin{aligned}\phi_i(\varphi_\zeta;p_i)&=\hat W_{\zeta i}^TS_{\zeta k}(\varphi_\zeta)+\epsilon_{\zeta i1}\\&=\bar W_{\zeta i}^TS_{\zeta k}(\varphi_\zeta)+\epsilon_{\zeta i2}\end{aligned} \tag{30}$$

where $\epsilon_{\zeta i2}$ is the practical approximation error for using $\bar W_{\zeta i}^TS_{\zeta k}$. It is clear that after the
transient process, $\epsilon_{\zeta i2}=O(\epsilon_{\zeta i1})=O(\epsilon_i)$.
It can be seen from Eq. (22) that for the neurons with centers far away from the
trajectory $\varphi_\zeta$, $S_{\bar\zeta k}$ will become very small due to the localization property of RBF networks. In
this case, the neural weights $\hat W_{\bar\zeta i}$ will only be slightly updated. Both $\hat W_{\bar\zeta i}$ and $\hat W_{\bar\zeta i}^TS_{\bar\zeta k}$, as well
as $\bar W_{\bar\zeta i}$ and $\bar W_{\bar\zeta i}^TS_{\bar\zeta k}$, will remain very small. This means that the entire RBF network
$\hat W_i^TS_k$ can approximate the unknown $\phi_i(\varphi_\zeta;p_i)$ along the trajectory $\varphi_\zeta$ as follows:

$$\begin{aligned}\phi_i(\varphi_\zeta;p_i)&=\hat W_{\zeta i}^TS_{\zeta k}(\varphi_\zeta)+\epsilon_{\zeta i1}\\&=\hat W_{\zeta i}^TS_{\zeta k}(\varphi_\zeta)+\hat W_{\bar\zeta i}^TS_{\bar\zeta k}(\varphi_\zeta)+\epsilon_{\zeta i1}-\hat W_{\bar\zeta i}^TS_{\bar\zeta k}(\varphi_\zeta)\\&=\hat W_i^TS_k(\varphi_\zeta)+\epsilon_{i1}\end{aligned} \tag{31}$$

where $\epsilon_{i1}=\epsilon_{\zeta i1}-\hat W_{\bar\zeta i}^TS_{\bar\zeta k}(\varphi_\zeta)=O(\epsilon_{\zeta i1})=O(\epsilon_i)$. Similarly, using Eq. (30), we have

$$\begin{aligned}\phi_i(\varphi_\zeta;p_i)&=\bar W_{\zeta i}^TS_{\zeta k}(\varphi_\zeta)+\epsilon_{\zeta i2}\\&=\bar W_{\zeta i}^TS_{\zeta k}(\varphi_\zeta)+\bar W_{\bar\zeta i}^TS_{\bar\zeta k}(\varphi_\zeta)+\epsilon_{\zeta i2}-\bar W_{\bar\zeta i}^TS_{\bar\zeta k}(\varphi_\zeta)\\&=\bar W_i^TS_k(\varphi_\zeta)+\epsilon_{i2}\end{aligned} \tag{32}$$

where $\epsilon_{i2}=\epsilon_{\zeta i2}-\bar W_{\bar\zeta i}^TS_{\bar\zeta k}(\varphi_\zeta)=O(\epsilon_{\zeta i2})=O(\epsilon_i)$. Equations (31) and (32) mean that locally
accurate identification of the system dynamics $\phi_i(\cdot;p_i)$ to the desired level $\epsilon_i$ along the
trajectory $\varphi_\zeta$ can be achieved by using the RBF network. This completes the proof.
It is seen that the employment of localized RBF networks under periodic or periodic-like
(recurrent) inputs yields a guaranteed PE condition. This condition, with
the localization property of RBF networks, leads to the exponential stability of a localized
adaptive discrete-time PCG system. In this way, parameter convergence and accurate local
approximation of PCG system dynamics can be achieved naturally.
2.7 Classification mechanism
In this section, we present a scheme to classify normal and abnormal heart sound signals.
Consider a set of training temporal data sequences $\varphi_\zeta^s$, $s=1,\ldots,M$, among which the
sth training temporal data sequence $\varphi_\zeta^s$ is generated from the following system:

$$Y^s(k)=F^s(Y^s(k-1),\ldots,Y^s(k-m);p^s)+v^s(Y^s(k-1),\ldots,Y^s(k-m);p^s) \tag{33}$$

where $Y^s(k)=[y_1^s(k),\ldots,y_n^s(k)]^T\in R^n$ is the state of the system, which is measurable, $p^s$ is
a constant vector of system parameters, $F^s(\cdot;p^s)=[f_1^s(\cdot;p_1^s),\ldots,f_n^s(\cdot;p_n^s)]^T$ denotes the PCG
system dynamics, and $v^s(\cdot;p^s)=[v_1^s(\cdot;p_1^s),\ldots,v_n^s(\cdot;p_n^s)]^T$ denotes the modeling uncertainty.
As mentioned above, the general PCG system dynamics $\phi^s(\cdot;p):=F^s(\cdot;p)+v^s(\cdot;p)$ can
be accurately derived and preserved in constant RBF neural networks $\bar W_i^TS_k(Y^s(k))$.
Consider $\varphi_\varsigma$ generated from Eq. (11) as a test temporal data sequence. For the sth
training temporal data sequence $\varphi_\zeta^s$, a dynamical model is constructed by using the
time-invariant representation $\bar W^{sT}S_k$ as:
$$\bar Y^s(k)=B(\bar Y^s(k-1)-Y(k-1))+\bar W^{sT}S_k, \tag{34}$$

where $\bar Y^s(k)=[\bar y_1^s(k),\ldots,\bar y_n^s(k)]^T\in R^n$ is the state vector of the dynamical model,
$B=diag\{b_1,\ldots,b_n\}$ is a diagonal matrix that is kept the same for all training sequences,
$S_k(Z)=S(Y(k-1),\ldots,Y(k-m))$, and $[Y(k-1),\ldots,Y(k-m)]$ is the test temporal data
sequence $\varphi_\varsigma$ generated from Eq. (11). Then, corresponding to the test temporal data
sequence $\varphi_\varsigma$ and the dynamical model (34), we obtain the following recognition error
system:

$$e_i^s(k)=b_ie_i^s(k-1)+(\phi_i(\cdot;p_i)-\bar W_i^{sT}S_k),\quad i=1,\ldots,n,\ k=1,\ldots,M, \tag{35}$$

where $e_i^s(k)=y_i(k)-\bar y_i^s(k)$, $|b_i|<1$.
We have that the error $|e_i^s(k)|$ can effectively measure the similarity between the test
sequence $\varphi_\varsigma$ and the training sequences $\varphi_\zeta^s$. Compute the average $L_p$ norm of $|e_i^s(k)|$, for
example, for $p=1$,

$$\|e_i^s(k)\|_{L_1}=\frac{1}{k}\sum_{j=1}^{k}|e_i^s(j)|,\quad s=1,\ldots,M \tag{36}$$

Hence, we have the following classification method for temporal data sequences:
Consider the recognition error system consisting of Eqs. (11), (34) and (35). Among the
$M$ dynamical models, if the error $\|e_i^s(k)\|_{L_1}$ between the sth dynamical model and the test
temporal data sequence $\varphi_\varsigma$ is the smallest one, then the test temporal data sequence $\varphi_\varsigma$ is
said to be most similar to the training temporal data sequence $\varphi_\zeta^s$.
The fundamental idea of the classification of abnormal heart sound signals is that if
a test heart sound signal pattern is similar to the trained heart sound signal pattern
$s\,(s\in\{1,\ldots,k\})$, the constant RBF network $\bar W^{sT}S_k$ embedded in the matched estimator $s$
will quickly recall the learned knowledge by providing accurate approximation to PCG system
dynamics. Thus, the corresponding error $\|e_i^s(k)\|_{L_1}$ will become the smallest among all
the errors $\|e_i^k(k)\|_{L_1}$. Based on the smallest error principle, the appearing test heart sound
signal pattern can be classified.
Classification scheme If there exists some finite time $t_s$, $s\in\{1,\ldots,k\}$ and some
$i\in\{1,\ldots,n\}$ such that $\|e_i^s(k)\|_{L_1}<\|e_i^k(k)\|_{L_1}$ for all $t>t_s$, then the appearing PCG system
pattern can be classified and abnormal heart sound signals can be detected.
3 Experimental results
Experiments are implemented using MATLAB software and tested on an Intel Core i7 6700K
3.5 GHz computer with 64 GB RAM. We assign feature vector sequences for all the normal
and abnormal heart sound signals in the PhysioNet/CinC Challenge 2016 heart sound database.
According to the method described in Sect. 2.5, we extract features, which means the
input of the RBF neural networks is $[ED_j^{Sub_{11}^{\mu_1}},ED_j^{Sub_{11}^{\mu_2}},ED_j^{Sub_{11}^{\mu_3}},ED_j^{Sub_{11}^{\mu_4}}]^T$. In order to
eliminate data difference between different features, all feature data are normalized to $[-1,1]$.
Several experiments are carried out to verify the effectiveness of the proposed method.
The classification results will be evaluated with 10-fold cross-validation, in which
the variance of the estimate for the classifiers is reduced. The data are divided into the
training and test subsets. For the 10-fold cross-validation, the data set is divided into ten subsets.
Each time, one of the ten subsets is used as the test set and the other nine subsets are put
together to form a training set. As such, every fold has been used nine times as training data
and one time as test data. The final result is the average of the 10 implementations. For the
evaluation, the sensitivity ($S_e$), the specificity ($S_p$), the overall score ($S_c$) of the sensitivity and
the specificity, and the accuracy ($ACC$) are used and defined as follows (Clifford et al. 2016):

$$S_e=\frac{TP}{TP+FN}\times 100(\%), \tag{37}$$

$$S_p=\frac{TN}{TN+FP}\times 100(\%), \tag{38}$$

$$S_c=\frac{S_e+S_p}{2}, \tag{39}$$

$$ACC=\frac{TP+TN}{TP+TN+FN+FP}\times 100(\%), \tag{40}$$

where TP is the number of true positives referring to the abnormal heart sound signals,
FN is the number of false negatives referring to the misidentified abnormal heart sound
signals, TN is the number of true negatives referring to the correctly detected normal heart
sound signals, and FP is the number of false positives referring to the misidentified normal
heart sound signals. The overall score is also defined as mean accuracy ($MACC$) in some
of the literature.
The classification results on normal and abnormal heart sound signals (with two different
data balance methods mentioned before) are given in Tables 3 and 4
with 10-fold cross-validation. We apply three types of features to verify and compare
their classification performance: (1) derived from TQWT+PSR/ED; (2) derived
from VMD+PSR/ED; and (3) derived from TQWT, VMD, PSR and ED (proposed features).
Here when only applying TQWT+PSR/ED, we use the 11th subband of ten-level
TQWT of the heart sound signal together with PSR/ED as the features, which are represented
as $ED_j^{Sub_{11}}$. When only applying VMD+PSR/ED, we use the first four intrinsic
modes of the heart sound signal together with PSR/ED as the features, which are represented
as $[ED_j^{\mu_1},ED_j^{\mu_2},ED_j^{\mu_3},ED_j^{\mu_4}]^T$. It is seen from Tables 3 and 4 that the classification
performance of the proposed features is superior to that of the other two features. Overall,
our classification approach achieves good performance, which indicates that the proposed
pattern classification system can effectively detect abnormal heart sound signals by using
nonlinear features and neural network based classification tools.

Table 3 Classification performance of the proposed features and its comparison with other two features on selected balanced recordings evaluated by 10-fold cross-validation. Total numbers of the abnormal and normal recordings are 472 and 472, respectively

| Evaluated features | Predicted groups | Actual groups: Normal | Actual groups: Abnormal | $S_e$ (%) | $S_p$ (%) | $S_c$ (%) | $ACC$ (%) |
|---|---|---|---|---|---|---|---|
| TQWT+PSR/ED: $ED_j^{Sub_{11}}$ | Normal | 392 | 80 | 85.38 | 83.05 | 84.22 | 84.22 |
| | Abnormal | 69 | 403 | | | | |
| VMD+PSR/ED: $[ED_j^{\mu_1},ED_j^{\mu_2},ED_j^{\mu_3},ED_j^{\mu_4}]^T$ | Normal | 404 | 68 | 87.29 | 85.59 | 86.44 | 86.44 |
| | Abnormal | 60 | 412 | | | | |
| Proposed features: $[ED_j^{Sub_{11}^{\mu_1}},ED_j^{Sub_{11}^{\mu_2}},ED_j^{Sub_{11}^{\mu_3}},ED_j^{Sub_{11}^{\mu_4}}]^T$ | Normal | 461 | 11 | 97.46 | 97.67 | 97.57 | 97.56 |
| | Abnormal | 12 | 460 | | | | |
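The evaluation measures in Eqs. (37) to (40) are simple ratios of confusion-matrix counts, and a short sketch makes them easy to recompute. The function name `classification_metrics` is hypothetical; the counts used below are read from the "Proposed features" rows of Table 3 under the assumption that 460 abnormal recordings were detected (TP), 12 were missed (FN), 461 normal recordings were recognized (TN) and 11 were misidentified (FP).

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, overall score and accuracy, all in percent."""
    se = 100.0 * tp / (tp + fn)                      # Eq. (37)
    sp = 100.0 * tn / (tn + fp)                      # Eq. (38)
    sc = (se + sp) / 2.0                             # Eq. (39)
    acc = 100.0 * (tp + tn) / (tp + tn + fn + fp)    # Eq. (40)
    return {"Se": se, "Sp": sp, "Sc": sc, "ACC": acc}


# Counts assumed from the "Proposed features" rows of Table 3.
proposed = classification_metrics(tp=460, fn=12, tn=461, fp=11)
for name, value in proposed.items():
    print(f"{name}: {value:.2f}")
```

With these counts the sensitivity, specificity and accuracy agree with the tabulated values, and the overall score agrees to within rounding.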
4 Discussion
Experimental results of this study demonstrate that abnormal heart sound signals could be
detected automatically by means of nonlinear features and a neural network based artificial
intelligence tool. The proposed scheme focuses not only on providing evidence to support
the claim that pathological patients demonstrate altered PCG system dynamics compared
to normal subjects, but also on providing an automatic, objective and computationally
convenient method to distinguish between normal and abnormal heart sound signals.
Potes et al. (2016) used two classifiers, in which the AdaBoost classifier and the CNN
were included. They first extracted 124 time-frequency features from the PCG signal and
used them as input to a variant of the AdaBoost classifier. Then they decomposed the PCG
cardiac cycles into four frequency bands, which were used as input of the CNN for training.
Finally, they classified the normal and abnormal heart sound signals based on an ensemble
of classifiers combining the outputs of AdaBoost and the CNN. The reported best performance
was with the sensitivity of 94.24%, the specificity of 77.81%, and the overall score
of 86.02%, respectively.
Dominguez-Morales et al. (2017) divided the heart sound recordings into windows of a
specific time length. Then they sent these segments of the original sound to a Neuromorphic
Auditory Sensor, which could decompose the audio into frequency bands and packetize
the information. Finally, this information was converted to sonogram images, which
were fed to the CNN for classification by using deep learning algorithms. The reported best
performance with 10-fold cross-validation was with the accuracy of 97.05%, the sensitivity
of 95.12%, the specificity of 93.20%, and the overall score of 94.16%, respectively.
Beritelli et al. (2018) extracted features from PCG signals by using Gram polynomials
and the Fourier transform. Afterwards, features were fed to the probabilistic neural networks
for classification. The reported best performance with 10-fold cross-validation was
with the accuracy of 94%, the sensitivity of 93%, the specificity of 91%, and the overall
score of 92%, respectively.

Table 4 Classification performance of the proposed features and its comparison with other two features on balanced recordings with SMOTE method evaluated by 10-fold cross-validation. Total numbers of the abnormal and normal recordings are 2554 and 2619, respectively

| Evaluation methods | Predicted groups | Actual groups: Normal | Actual groups: Abnormal | $S_e$ (%) | $S_p$ (%) | $S_c$ (%) | $ACC$ (%) |
|---|---|---|---|---|---|---|---|
| TQWT+PSR/ED: $ED_j^{Sub_{11}}$ | Normal | 2235 | 384 | 84.30 | 85.34 | 84.82 | 84.83 |
| | Abnormal | 401 | 2153 | | | | |
| VMD+PSR/ED: $[ED_j^{\mu_1},ED_j^{\mu_2},ED_j^{\mu_3},ED_j^{\mu_4}]^T$ | Normal | 2277 | 342 | 86.06 | 86.94 | 86.50 | 86.51 |
| | Abnormal | 356 | 2198 | | | | |
| Proposed features: $[ED_j^{Sub_{11}^{\mu_1}},ED_j^{Sub_{11}^{\mu_2}},ED_j^{Sub_{11}^{\mu_3}},ED_j^{Sub_{11}^{\mu_4}}]^T$ | Normal | 2568 | 51 | 97.73 | 98.05 | 97.89 | 97.89 |
| | Abnormal | 58 | 2496 | | | | |
Bozkurt et al. (2018) extracted features from the heart sound signal by using Mel-Spectrogram,
MFCC and subband envelopes. These features were used as input of the CNN classifier
and the reported best performance with 10-fold cross-validation was with the accuracy
of 81.5%, the sensitivity of 84.5%, the specificity of 78.5%, and the overall score of 81.5%,
respectively.
Zhang et al. (2019) extracted the spectrogram of the heart sound signal by using the
short-time Fourier transform. Following that, they calculated the temporal quasi-periodic
features by the average magnitude difference function in each frequency band of the heart
sound spectrogram. The extracted features were fed to the two-layer LSTM neural network
for classification. The reported best performance with 10-fold cross-validation was
with the sensitivity of 96.15%, the specificity of 93.18%, and the overall score of 94.66%,
respectively.
Adiban et al. (2019) constructed a fixed-length feature vector from the heart sound signal by using MFCC features. Afterwards, the Principal Component Analysis (PCA) transform and a Variational Autoencoder (VA) were used to reduce the feature dimension. Finally, the reduced-size feature vector was fed to Gaussian Mixture Models and SVM for classification. The best reported performance was achieved with a sensitivity of 92.28%, a specificity of 94.95%, and an overall score of 93.61%.
Xiao et al. (2019) took 3-s 1-D PCG waveforms as the inputs of a CNN. At first, the initial low-level features were extracted by 64 convolutional filters. Then max pooling layers were used to further reduce the spatial size of the feature maps. After that, the feature maps were fed to the stacked clique blocks. The best reported performance with 10-fold cross-validation was achieved with an accuracy of 93%, a sensitivity of 86%, a specificity of 95%, and an overall score of 91%.
Das et al. (2019) extracted three kinds of features from the PCG signal, including MFCC, short-time Fourier transform and Cochleagram features, and then fed them to a supervised artificial neural network for classification. The best reported performance with 10-fold cross-validation was achieved with an accuracy of 93.7%, a sensitivity of 84.5%, a specificity of 95.2%, and an overall score of 89.9%.
Different from the methods discussed above, this study proposes a hybrid method to extract nonlinear features using TQWT, VMD, PSR and ED techniques. These features are fed into dynamical estimators consisting of constant RBF neural networks to classify normal and abnormal heart sound signals. A comparison of the classification performance with other state-of-the-art methods on the same database is given in Table 5. The proposed method provides sensitivity, specificity, overall score and accuracy values of 97.73%, 98.05%, 97.89%, and 97.89%, respectively, under 10-fold cross-validation. In contrast to the other methods, which feed feature vectors directly into a classifier, the proposed method performs modeling, identification and classification of the PCG system dynamics. This provides another candidate tool for the detection of abnormal heart sound signals.
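The reported metrics can be recomputed from the confusion counts in Table 4. The following is a minimal sketch, assuming that the abnormal class is treated as positive and that the overall score is the mean of sensitivity and specificity (a convention that reproduces the tabulated values); the function name `pcg_metrics` is illustrative only.

```python
def pcg_metrics(tn, fp, fn, tp):
    """Return (Se, Sp, Sc, ACC) in percent from confusion counts.

    Assumption: 'abnormal' is the positive class, so
    tp = abnormal recordings classified as abnormal,
    tn = normal recordings classified as normal.
    """
    se = 100.0 * tp / (tp + fn)                    # sensitivity
    sp = 100.0 * tn / (tn + fp)                    # specificity
    sc = (se + sp) / 2.0                           # overall score (assumed mean of Se and Sp)
    acc = 100.0 * (tp + tn) / (tp + tn + fp + fn)  # accuracy
    return se, sp, sc, acc


# Counts for the proposed features in Table 4
se, sp, sc, acc = pcg_metrics(tn=2568, fp=51, fn=58, tp=2496)
print(f"Se={se:.2f}  Sp={sp:.2f}  Sc={sc:.2f}  ACC={acc:.2f}")
# prints: Se=97.73  Sp=98.05  Sc=97.89  ACC=97.89
```

The same function applied to the TQWT+PSR/ED and VMD+PSR/ED rows of Table 4 reproduces their tabulated values as well.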
In TQWT the variation of Q-factor affects the computed features in different oscillatory
levels. Selecting the proper value of Q improves the system accuracy until it reaches its
best performance, and then any further increase in the value of Q will reduce the system
performance. Increasing R, while keeping Q unchanged, has the effect of increasing the
overlap between adjacent frequency responses. The parameter R does not affect the general
shape of the wavelet or of the frequency response spectrum (both are controlled by Q). With a
larger R, the number of levels J should be increased in order to cover the same frequency
range because of the increased overlap. The value of J has been restricted to 15 in the
Table 5 Summary of classification performance on the normal and abnormal heart sound signals with 10-fold cross-validation style obtained from the same PhysioNet/CinC Challenge 2016 heart sound database in the literature

| References | Features | Classifier | Sensitivity (%) | Specificity (%) | Overall score (%) | Accuracy (%) |
|---|---|---|---|---|---|---|
| Potes et al. (2016) | Using time-frequency features | AdaBoost and CNN | 94.24 | 77.81 | 86.02 | Not mentioned |
| Dominguez-Morales et al. (2017) | Using sonogram images converted from frequency bands of PCG | CNN | 95.12 | 93.20 | 94.16 | 97.05 |
| Beritelli et al. (2018) | Using features extracted from Gram polynomials and the Fourier transform | Probabilistic neural networks classifier | 93 | 91 | 92 | 94 |
| Bozkurt et al. (2018) | Using features extracted from Mel-Spectrogram, MFCC, subband envelopes | CNN | 84.5 | 78.5 | 81.5 | 81.5 |
| Zhang et al. (2019) | Using heart sound spectrogram features | LSTM | 96.15 | 93.18 | 94.66 | Not mentioned |
| Adiban et al. (2019) | Using MFCC features | Gaussian Mixture Models and SVM | 92.28 | 94.95 | 93.61 | Not mentioned |
| Xiao et al. (2019) | 3-s 1-D waveform with 64 convolutional filters | CNN | 86 | 95 | 91 | 93 |
| Das et al. (2019) | MFCC, short-time Fourier transform and Cochleagram features | Supervised artificial neural network | 84.5 | 95.2 | 89.9 | 93.7 |
| Proposed work | Extracted through TQWT, VMD, PSR and ED | Dynamical estimators consisting of neural networks | 97.73 | 98.05 | 97.89 | 97.89 |
present study because higher values of J lead to higher-dimensional feature matrices, which in turn increase the computational burden. Several experiments are performed to select optimal values of the Q-factor and J. The R value is fixed at 3, because the overlap between adjacent frequency responses grows as R increases. The minimum value of both Q and J is 1. Hence, Q is varied from 1 to 10 and J is varied from 1 to 15. The features are then computed from the subband containing the majority of the heart sound signal's energy and fed into RBF neural networks for the modeling, identification and classification of PCG system dynamics based on deterministic learning theory. Figures 10 and 11 depict the effect of varying the Q-factor and the level J on the classification performance. Figure 10 shows that the classification accuracy varies significantly with the Q-factor. The highest classification accuracy is obtained for Q = 3, and the accuracy decreases as the Q-factor increases further. Therefore, the optimum value of the Q-factor is found to be 3 in the present study. The optimal value of J is determined in the same manner. Figure 11 shows that the maximum accuracy is achieved for J = 10. The experimental results demonstrate that features based on the time-frequency properties of the TQWT are quite effective in representing the behavior of cardiac sound signals and yield higher classification performance. One way to further increase the classification performance of our method could be to fine-tune the TQWT parameters on a subject-by-subject basis, so as to account for inter-individual differences. To what extent the performance can be improved by modifying the tunable parameters of the TQWT (globally or for each individual) is not clear and could be the focus of future investigation.
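The interplay between Q, R and J described above can be illustrated numerically. The sketch below assumes the standard TQWT parameter relations as we understand them from Selesnick (2011): the high-pass scaling factor is beta = 2/(Q+1), the low-pass scaling factor is alpha = 1 - beta/R, and the center frequency of the level-j subband is approximately alpha^j (2 - beta)/(4 alpha) times the sampling rate. The sampling rate of 2000 Hz, the 50 Hz target and all function names are illustrative assumptions, not values taken from this study.

```python
def tqwt_scaling(Q, r):
    """Low-pass (alpha) and high-pass (beta) scaling factors of the TQWT."""
    beta = 2.0 / (Q + 1.0)
    alpha = 1.0 - beta / r
    return alpha, beta


def center_frequency(Q, r, j, fs):
    """Approximate center frequency (Hz) of the level-j subband, j >= 1."""
    alpha, beta = tqwt_scaling(Q, r)
    return alpha ** j * (2.0 - beta) / (4.0 * alpha) * fs


def levels_to_reach(Q, r, f_target, fs):
    """Smallest level j whose center frequency is at or below f_target.

    Assumes Q >= 1 and r > 1, so that 0 < alpha < 1 and the loop terminates.
    """
    j = 1
    while center_frequency(Q, r, j, fs) > f_target:
        j += 1
    return j


fs = 2000.0  # assumed sampling rate in Hz (illustrative only)
for r in (3, 4, 5):
    # With Q fixed, a larger redundancy r needs more levels to reach 50 Hz
    print(r, levels_to_reach(3, r, 50.0, fs))
```

With Q held fixed, a larger R moves alpha closer to 1, so the center frequencies decay more slowly with j and more levels are needed to reach the same low-frequency band, which is the effect noted in the text.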
PSR can reduce the effects of the noise or outliers of the PCG signals. Hence, features
extracted in phase space might help improve the classification results. The most visual way
to observe the dynamic behavior of a chaotic system is through the phase space, which is
the track record of the chaotic system and can reflect the changes of the system state. For
Fig. 10 Variation of classification accuracy with Q-factor on balanced recordings with SMOTE method (x-axis: Q-factor, 1 to 10; y-axis: accuracy (%))
the convenience of observation, a phase space is often studied to directly judge the nonlinear dynamic behavior of chaotic systems. For example, for periodic motion, the phase diagram trajectory is a simple closed curve. Because the heart sound is a quasi-periodic signal, we further use the phase space to analyze the chaotic characteristics of the heart sound. In this work, we have confined our discussion to the embedding dimension d = 3 because of its visualization simplicity. In addition, different studies have found this value to best represent the attractor for human biological systems (Venkataraman and Turaga 2016; Som et al. 2016). From a theoretical viewpoint, the time lag $\tau$ has little impact on the classification performance, and in fact there are no limitations or assumptions placed upon it with respect to the underlying time-lag reconstruction theorems for discrete-time signals (Sauer et al. 1991). However, since topological invariance of systems does not equate to identical phase spaces or attractors, from a practical viewpoint the lag must be selected with respect to some relevant criterion (Johnson et al. 2005), such as the first zero crossing of the autocorrelation function for each time series or the average $\tau$ value obtained from all the time series in the training dataset using the method proposed in Michael (2005). The dimension d is held constant and the classification task is implemented with the time lag varying from 1 to 20. Figure 12 shows that the accuracy is highest for a lag of 5, followed by a decline and a second, lower peak at lag 12. However, to what extent the classification performance can be improved by modifying the dimension and time lag is not clear, and the construction of a regulation principle for the PSR parameters will be considered in future research.
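The reconstruction and lag-selection steps discussed above can be sketched as follows. This is a minimal illustration, not the implementation used in this study: the Euclidean distance is assumed here to be the distance of each phase-space point from the origin (the exact definition is given earlier in the paper and may differ), the sine wave is a toy stand-in for one intrinsic mode, and all function names are hypothetical.

```python
import numpy as np


def delay_embed(x, d=3, tau=5):
    """Time-delay embedding: row k is [x[k], x[k+tau], ..., x[k+(d-1)*tau]]."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (d - 1) * tau
    if n <= 0:
        raise ValueError("signal too short for the chosen d and tau")
    return np.column_stack([x[i * tau: i * tau + n] for i in range(d)])


def euclidean_distance(points):
    """Distance of every phase-space point from the origin (assumed ED definition)."""
    return np.linalg.norm(points, axis=1)


def first_zero_crossing_lag(x):
    """First lag at which the autocorrelation of x becomes non-positive."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    # Keep only the non-negative lags of the full autocorrelation sequence
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    nonpos = np.flatnonzero(acf <= 0)
    return int(nonpos[0]) if nonpos.size else None


# Toy quasi-periodic signal with a period of 40 samples
t = np.arange(400)
x = np.sin(2 * np.pi * t / 40)

traj = delay_embed(x, d=3, tau=5)   # 3D trajectory with lag 5
ed = euclidean_distance(traj)       # one ED value per trajectory point
print(traj.shape, ed.shape, first_zero_crossing_lag(x))
```

For a sinusoid the first zero crossing of the autocorrelation lies near a quarter of the period, which gives a data-driven starting point for the lag before a sweep such as the one in Fig. 12 is carried out.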
The TQWT can be used to extract the dynamical changes in abnormal PCG signals relative to normal ones. It is a nonlinear method and hence able to capture the subtle variations in the PCG signals, which results in high accuracy. Decomposing signals with VMD is considered insightful because it provides more descriptive details about the
Fig. 11 Variation of classification accuracy with decomposition level J on balanced recordings with SMOTE method (x-axis: decomposition level J, 1 to 15; y-axis: accuracy (%))
original signal. For example, a signal that is decomposed into 4 intrinsic modes is more descriptive than one decomposed into 2 intrinsic modes. VMD is essentially a set of adaptive Wiener filter banks, which transforms signal decomposition into a variational problem and can decompose a signal into an ensemble of band-limited modes concurrently in a non-recursive way. 3D phase spaces of the predominant intrinsic modes are reconstructed, in which properties associated with the PCG system dynamics are preserved. PSR plots the trajectories of the selected $\mu_1$, $\mu_2$, $\mu_3$ and $\mu_4$ intrinsic modes of the 11th subband in a 3D phase space diagram and thereby visualizes the PCG system dynamics. Features derived from TQWT, VMD, 3D PSR and ED may better reflect the abnormal alterations in the dynamics of the PCG system and can achieve high sensitivity and specificity simultaneously as a discriminator of abnormal heart sound signals. Feeding these features into the RBF neural networks for the modeling and identification of PCG system dynamics could greatly improve the modeling accuracy, which is effective for the anomaly (normal vs. abnormal) detection of PCG recordings.
5 Conclusions
In this study, we propose a new approach including TQWT, VMD, PSR and ED for the
detection of abnormal heart sound signals, which is computationally simple and easy to
implement. The results of this study indicate that the pattern classification of heart sound
signal can offer an objective method to assess the disparity of PCG system dynamics
between normal subjects and pathological patients with heart diseases. However, some
limitations still need to be improved and overcome, such as the limited size of the database,
Fig. 12 Variation of classification accuracy with time lag $\tau$ at dimension 3 on balanced recordings with SMOTE method (x-axis: time lag $\tau$, 1 to 20; y-axis: accuracy (%))
and the lack of a regulation principle for the TQWT and PSR parameters. Future work will include a clinical validation of the proposed technique with a larger number of pathological patients with different heart diseases. The mathematical relationship between the embedding dimension, time lag, Q-factor, redundancy, decomposition level and the classification accuracy can also be assessed in future investigations. In the present study we did not regroup the PhysioNet/CinC Challenge 2016 database in a patient-wise manner, since we did not analyze the patients' various illnesses on a case-by-case basis. In future research we will regroup the database in a patient-wise manner and consider the impact of the patients' illnesses (such as heart valve defects and coronary artery disease) on the effectiveness of the stratified classification model. Features introduced in other methods, such as various entropies, the Hurst exponent, fractal dimension and other nonlinear features, can also be explored in the proposed framework to evaluate its classification performance. The proposed automated detection system can assist physicians in cross-checking their diagnosis of heart diseases.
Acknowledgements This work was supported by the National Natural Science Foundation of China (Grant
No. 61773194), by the Natural Science Foundation of Fujian Province (Grant No. 2018J01542), by the Pro-
gram for New Century Excellent Talents in Fujian Province University and by the Training Program of
Innovation and Entrepreneurship for Undergraduates (Grant No. 201911312009).
Compliance with ethical standards
Conflict of interest There is no conflict of interest.
References
Adiban M, BabaAli B, Shehnepoor S (2019) I-vector based features embedding for heart sound classification. arXiv preprint arXiv:1904.11914
Alam U, Asghar O, Khan SQ, Hayat S, Malik RA (2010) Cardiac auscultation: an essential clinical skill in
decline. Br J Cardiol 17(1):8
Babu KA, Ramkumar B, Manikandan MS (2018) Automatic identification of S1 and S2 heart sounds using
simultaneous PCG and PPG recordings. IEEE Sens J 18(22):9430–9440
Beritelli F, Capizzi G, Sciuto GL, Napoli C, Scaglione F (2018) Automatic heart activity diagnosis based on
Gram polynomials and probabilistic neural networks. Biomed Eng Lett 8(1):77–85
Boutana D, Benidir M, Barkat B (2011) Segmentation and identification of some pathological phonocardio-
gram signals using time-frequency analysis. IET Signal Process 5(6):527–537
Bozkurt B, Germanakis I, Stylianou Y (2018) A study of time-frequency features for CNN-based automatic
heart sound classification for pathology detection. Comput Biol Med 100:132–143
Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP (2002) SMOTE: synthetic minority over-sampling
technique. J Artif Intell Res 16:321–357
Cheema A, Singh M (2019) An application of phonocardiography signals for psychological stress detection
using non-linear entropy based features in empirical mode decomposition domain. Appl Soft Comput
77:24–33
Chen B, He Z, Chen X, Cao H, Cai G, Zi Y (2011) A demodulating approach based on local mean decom-
position and its applications in mechanical fault diagnosis. Meas Sci Technol 22(5):055704
Chen M, Fang Y, Zheng X (2014) Phase space reconstruction for improving the classification of single trial
EEG. Biomed Signal Process Control 11:10–16
Clifford GD, Liu C, Moody B, Springer D, Silva I, Li Q, Mark RG (2016) Classification of normal/abnor-
mal heart sound recordings: the PhysioNet/computing in cardiology challenge 2016. In: 2016 Comput-
ing in cardiology conference (CinC), pp 609–612
Das S, Pal S, Mitra M (2019) Supervised model for Cochleagram feature based fundamental heart sound
identification. Biomed Signal Process Control 52:32–40
Deng SW, Han JQ (2016) Towards heart sound classification without segmentation via autocorrelation fea-
ture and diffusion maps. Future Gener Comput Syst 60:13–21
Dominguez-Morales JP, Jimenez-Fernandez AF, Dominguez-Morales MJ, Jimenez-Moreno G (2017) Deep
neural networks for the recognition and classification of heart murmurs using neuromorphic auditory
sensors. IEEE Trans Biomed Circuits Syst 12(1):24–34
Dragomiretskiy K, Zosso D (2014) Variational mode decomposition. IEEE Trans Signal Process
62(3):531–544
Feng W, Dauphin G, Huang W, Quan Y, Bao W, Wu M, Li Q (2019) Dynamic synthetic minority over-
sampling technique-based rotation forest for the classification of imbalanced hyperspectral data. IEEE
J Sel Top Appl Earth Obs Remote Sens 12(7):2159–2169
Gavrovska A, Zajic G, Bogdanovic V, Reljin I, Reljin B (2016) Paediatric heart sound signal analysis
towards classification using multifractal spectra. Physiol Meas 37(9):1556
Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PCh, Mark RG, Mietus JE, Moody GB, Peng
CK, Stanley HE (2003) PhysioBank, physioToolkit, and physioNet: components of a new research
resource for complex physiologic signals. Circulation 101(23):e215–e220
Gorinevsky D (1995) On the persistency of excitation in radial basis function network identification of non-
linear systems. IEEE Trans Neural Netw 6(5):1237–1244
Hamidi M, Ghassemian H, Imani M (2018) Classification of heart sound signal using curve fitting and frac-
tal dimension. Biomed Signal Process Control 39:351–359
Hassan AR, Siuly S, Zhang Y (2016) Epileptic seizure detection in EEG signals using tunable-Q factor
wavelet transform and bootstrap aggregating. Comput Methods Programs Biomed 137:247–259
Hassani K, Bajelani K, Navidbakhsh M, Doyle DJ, Taherian F (2014) Heart sound segmentation based on
homomorphic filtering. Perfusion 29(4):351–359
Huang B, Kunoth A (2013) An optimization based empirical mode decomposition scheme. J Comput Appl
Math 240:174–183
Huang NE, Shen Z, Long SR, Wu MC, Shih HH, Zheng Q, Liu HH (1998) The empirical mode decomposi-
tion and Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc R Soc Lond A
Math Phys Eng Sci 454(1971):903–995
Humayun AI, Ghaffarzadegan S, Ansari MI, Feng Z, Hasan T (2020) Towards domain invariant heart sound abnormality detection using learnable filterbanks. IEEE J Biomed Health Inform. https://doi.org/10.1109/JBHI.2020.2970252
Jain PK, Tiwari AK (2018) A robust algorithm for segmentation of phonocardiography signal using tunable
quality wavelet transform. J Med Biol Eng 38(3):396–410
Johnson MT, Povinelli RJ, Lindgren AC, Ye J, Liu X, Indrebo KM (2005) Time-domain isolated phoneme
classification using reconstructed phase spaces. IEEE Trans Speech Audio Process 13(4):458–466
Lal GJ, Gopalakrishnan EA, Govind D (2018) Epoch estimation from emotional speech signals using vari-
ational mode decomposition. Circuits Syst Signal Process 37(8):3245–3274
Langley P, Murray A (2017) Heart sound classification from unsegmented phonocardiograms. Physiol Meas
38(8):1658
Lee SH, Lim JS, Kim JK, Yang J, Lee Y (2014) Classification of normal and epileptic seizure EEG signals
using wavelet transform, phase-space reconstruction, and Euclidean distance. Comput Methods Pro-
grams Biomed 116(1):10–25
Li Y, Xu M, Wei Y, Huang W (2015) Rotating machine fault diagnosis based on intrinsic characteristic-
scale decomposition. Mech Mach Theory 94:9–27
Li J, Ke L, Du Q, Ding X, Chen X, Wang D (2019a) Heart sound signal classification algorithm: a combina-
tion of wavelet scattering transform and twin support vector machine. IEEE Access 7:179339–179348
Li J, Ke L, Du Q (2019b) Classification of heart sounds based on the wavelet fractal and twin support vector
machine. Entropy 21(5):472
Liang QZ, Guo XM, Zhang WY, Dai WD, Zhu XH (2015) Identification of heart sounds with arrhythmia
based on recurrence quantification analysis and Kolmogorov entropy. J Med Biol Eng 35(2):209–217
Liu L, Wang H, Wang Y, Tao T, Wu X (2010) Feature analysis of heart sound based on the improved
Hilbert-Huang transform. In: 3rd IEEE international conference on computer science and information
technology, pp 378–381
Liu C, Springer D, Li Q, Moody B, Juan RA, Chorro FJ, Syed Z (2016) An open access database for the
evaluation of heart sound algorithms. Physiol Meas 37(12):2181
Merigó JM, Casanovas M (2011) Induced aggregation operators in the Euclidean distance and its applica-
tion in financial decision making. Expert Syst Appl 38:7603–7608
Mert A (2016) ECG feature extraction based on the bandwidth properties of variational mode decomposi-
tion. Physiol Meas 37(4):530
Messner E, Zohrer M, Pernkopf F (2018) Heart sound segmentation-an event detection approach using deep
recurrent neural networks. IEEE Trans Biomed Eng 65(9):1964–1974
Michael S (2005) Applied nonlinear time series analysis: applications in physics, physiology and finance
(Vol 52). World Scientific, Singapore
Mishra M, Banerjee S, Thomas DC, Dutta S, Mukherjee A (2018) Detection of third heart sound using
variational mode decomposition. IEEE Trans Instrum Meas 67(7):1713–1721
Mishra M, Pratiher S, Menon H, Mukherjee A (2020) Identification of S1 and S2 heart sounds using
spectral and convex hull features. IEEE Sens J 20(8):4311–4320
Nishad A, Pachori RB, Acharya UR (2018) Application of TQWT based filter-bank for sleep apnea screening using ECG signals. J Ambient Intell Humaniz Comput. https://doi.org/10.1007/s12652-018-0867-3
Nogueira DM, Ferreira CA, Gomes EF, Jorge AM (2019) Classifying heart sounds using images of
Motifs, MFCC and temporal features. J Med Syst 43(6):168
Noman FM, Salleh SH, Ting CM, Samdin SB, Ombao H, Hussain H (2020) A Markov-switching model
approach to heart sound segmentation and classification. IEEE J Biomed Health Inform 24(3):705–716
Papadaniil CD, Hadjileontiadis LJ (2013) Efficient heart sound segmentation and extraction using ensemble
empirical mode decomposition and kurtosis features. IEEE J Biomed Health Inform 18(4):1138–1152
Park C, Looney D, Van Hulle MM, Mandic DP (2011) The complex local mean decomposition. Neuro-
computing 74(6):867–875
Patidar S, Pachori RB (2014) Classification of cardiac sound signals using constrained tunable-Q wave-
let transform. Expert Syst Appl 41(16):7161–7170
Patidar S, Pachori RB, Upadhyay A, Acharya UR (2017) An integrated alcoholic index using tunable-Q
wavelet transform based features extracted from EEG signals for diagnosis of alcoholism. Appl Soft
Comput 50:71–78
Potes C, Parvaneh S, Rahman A, Conroy B (2016) Ensemble of feature-based and deep learning-based
classifiers for detection of abnormal heart sounds. In: 2016 computing in cardiology conference
(CinC), pp 621–624
Rivera WA, Xanthopoulos P (2016) A priori synthetic over-sampling methods for increasing classifica-
tion sensitivity in imbalanced data sets. Expert Syst Appl 66:124–135
Safara F, Doraisamy S, Azman A, Jantan A, Ramaiah ARA (2013) Multi-level basis selection of wavelet
packet decomposition tree for heart sound classification. Comput Biol Med 43(10):1407–1414
Salman AH, Ahmadi N, Mengko R, Langi AZ, Mengko TL (2016) Empirical mode decomposition
(EMD) based denoising method for heart sound signal and its performance analysis. Int J Electr
Comput Eng 6(5):1–8
Sauer T, Yorke JA, Casdagli M (1991) Embedology. J Stat Phys 65(3–4):579–616
Selesnick I (2011) Wavelet transform with tunable Q-factor. IEEE Trans Signal Process 59(8):3560–3575
Shervegar MV, Bhat GV (2018) Heart sound classification using Gaussian mixture model. Porto Biomed
J 3(1):e4
Singh SA, Majumder S (2019) Classification of unsegmented heart sound recording using KNN classi-
fier. J Mech Med Biol 19(04):1950025
Sivakumar B (2002) A phase-space reconstruction approach to prediction of suspended sediment con-
centration in rivers. J Hydrol 258(1–4):149–162
Som A, Krishnamurthi N, Venkataraman V, Turaga P (2016) Attractor-shape descriptors for balance
impairment assessment in Parkinson’s disease. In: IEEE conference on engineering in medicine and
biology society, pp 3096–3100
Springer DB, Tarassenko L, Clifford GD (2015) Logistic regression-HSMM-based heart sound segmen-
tation. IEEE Trans Biomed Eng 63(4):822–832
Sujadevi VG, Mohan N, Kumar SS, Akshay S, Soman KP (2019) A hybrid method for fundamental heart
sound segmentation using group-sparsity denoising and variational mode decomposition. Biomed
Eng Lett 9(4):413–424
Sun S, Jiang Z, Wang H, Fang Y (2014) Automatic moment segmentation and peak detection analysis of
heart sound pattern via short-time modified Hilbert transform. Comput Methods Programs Biomed
114(3):219–230
Sun Y, Li J, Liu J, Chow C, Sun B, Wang R (2015) Using causal discovery for feature selection in multi-
variate numerical time series. Mach Learn 101(1–3):377–395
Takens F (1981) Detecting strange attractors in turbulence. In: Rand DA, Young L-S (eds) Dynamical
systems and turbulence, Warwick 1980. Springer, Berlin, pp 366–381
Varghees VN, Ramachandran KI (2014) A novel heart sound activity detection framework for automated
heart sound analysis. Biomed Signal Process Control 13:174–188
Varghees VN, Ramachandran KI (2017) Effective heart sound segmentation and murmur classification
using empirical wavelet transform and instantaneous phase for electronic stethoscope. IEEE Sens J
17(12):3861–3872
Affiliations
Wei Zeng1 · Jian Yuan1 · Chengzhi Yuan2 · Qinghui Wang1 · Fenglin Liu1 · Ying Wang1
* Wei Zeng
zw0597@126.com
1 School of Physics and Mechanical and Electrical Engineering, Longyan University, Longyan 364012, People's Republic of China
2 Department of Mechanical, Industrial and Systems Engineering, University of Rhode Island, Kingston, RI 02881, USA
Venkataraman V, Turaga P (2016) Shape distributions of nonlinear dynamical systems for video-based
inference. IEEE Trans Pattern Anal Mach Intell 38(12):2531–2543
Wang C, Hill DJ (2006) Learning from neural control. IEEE Trans Neural Networks 17(1):130–146
Wang C, Hill DJ (2007) Deterministic learning and rapid dynamical pattern recognition. IEEE Trans Neural
Netw 18(3):617–630
Wang C, Hill DJ (2009) Deterministic learning theory for identification, recognition and control. CRC
Press, Boca Raton
Wang Y, Liu F, Jiang Z, He S, Mo Q (2017) Complex variational mode decomposition for signal processing
applications. Mech Syst Signal Process 86:75–85
Wang Q, Zhou X, Wang C, Liu Z, Huang J, Zhou Y, Cheng JZ (2019) WGAN-based synthetic minor-
ity over-sampling technique: improving semantic fine-grained classification for lung nodules in CT
images. IEEE Access 7:18450–18463
Whitaker BM, Suresha PB, Liu C, Clifford GD, Anderson DV (2017) Combining sparse coding and time-
domain features for heart sound classification. Physiol Meas 38(8):1701
Xiao B, Xu Y, Bi X, Zhang J, Ma X (2019) Heart sounds classification using a novel 1-D convolutional neural network with extremely low parameter consumption. Neurocomputing. https://doi.org/10.1016/j.neucom.2018.09.101
Xie Y, Xie K, Xie S (2019) Underdetermined blind source separation for heart sound using higher-order
statistics and sparse representation. IEEE Access 7:87606–87616
Xu B, Jacquir S, Laurent G, Bilbault JM, Binczak S (2013) Phase space reconstruction of an experimental
model of cardiac field potential in normal and arrhythmic conditions. In: 35th annual international
conference of the IEEE engineering in medicine and biology society, pp 3274–3277
Xue YJ, Cao JX, Wang DX, Du HK, Yao Y (2016) Application of the variational-mode decomposition for
seismic time-frequency analysis. IEEE J Sel Top Appl Earth Obs Remote Sens 9(8):3821–3831
Zhang WJ, Han JQ, Deng SW (2017) Heart sound classification based on scaled spectrogram and partial
least squares regression. Biomed Signal Process Control 32:20–28
Zhang WJ, Han JQ, Deng SW (2019) Abnormal heart sound detection using temporal quasi-periodic fea-
tures and long short-term memory without segmentation. Biomed Signal Process Control 53:101560
Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.