UNCORRECTED PROOF
CBM 467
ARTICLE IN PRESS
Computers in Biology and Medicine ( ) –
www.elsevier.com/locate/compbiomed
Neural classification of lung sounds using wavelet coefficients
A. Kandaswamy^a, C. Sathish Kumar^b,*, Rm. Pl. Ramanathan^c, S. Jayaraman^a, N. Malmurugan^a
^a Department of Electronics and Communication Engineering, PSG College of Technology, Coimbatore-641 004, India
^b Department of Electrical and Electronics Engineering, PSG College of Technology, Coimbatore-641 004, India
^c Department of Pulmonology, PSG Institute of Medical Sciences and Research, Coimbatore-641 004, India
Received 17 March 2003; accepted 16 July 2003
Abstract
Electronic auscultation is an efficient technique to evaluate the condition of the respiratory system using lung sounds. As lung sound signals are non-stationary, the conventional method of frequency analysis is not highly successful in diagnostic classification. This paper deals with a novel method of analysis of lung sound signals using the wavelet transform, and classification using an artificial neural network (ANN). Lung sound signals were decomposed into frequency subbands using the wavelet transform, and a set of statistical features was extracted from the subbands to represent the distribution of wavelet coefficients. An ANN-based system, trained using the resilient backpropagation algorithm, was implemented to classify the lung sounds into one of six categories: normal, wheeze, crackle, squawk, stridor, or rhonchus.
© 2003 Published by Elsevier Ltd.
Keywords: Respiratory system diagnosis; Auscultation; Lung sound analysis; Discrete wavelet transform; Artificial neural network
* Corresponding author. Department of Electronics and Communication Engineering, PSG College of Technology, Coimbatore, Tamil Nadu 641 004, India. Tel.: +91-9447-15-4467; fax: +91-422-257-3833.
E-mail address: csathish k@rediffmail.com (C.S. Kumar).
0010-4825/$ - see front matter © 2003 Published by Elsevier Ltd.
doi:10.1016/S0010-4825(03)00092-1

1. Introduction
Chest auscultation is an inexpensive and efficient way to evaluate pulmonary dysfunction. As pathological changes of the lung produce characteristic sounds, auscultation gives direct information about the function of the lung. The conventional method of auscultation with a stethoscope has many limitations. It is a subjective process that depends on the physician’s own hearing, experience, and ability to differentiate between different sound patterns. Moreover, the stethoscope has a frequency response that attenuates frequency components of the lung sound signal above about 120 Hz, and the human ear is not very sensitive to the lower frequency band that remains [1].
Consequently, auscultation is often performed in a cursory manner, and many physicians rely heavily on the chest X-ray, spirogram, and arterial blood gas analysis to evaluate the patient’s pulmonary status.
The only reliable and quantitative method for an objective assessment of lung sounds for the diagnosis of pulmonary diseases is digital recording with subsequent analysis. This can be easily accomplished using a computer. It is precisely the application of computer technology that has provided new insights into the analysis of lung sounds for diagnosis [2]. Over the last 30 years, computerized methods for the recording and analysis of lung sounds have overcome many limitations of simple auscultation. The study of lung sounds using computers offers immense advantages with respect to the storage, analysis and communication of sounds, but it has not yet found a major place in the diagnosis of respiratory diseases [3]. Among the commonly reported applications of computerized lung sound analysis are the graphical presentation of features of importance, correlation of lung sounds with other physiological signals, comparison of lung sounds obtained at different times during the progression of respiratory diseases or their treatment, monitoring of lung sounds of adults in critical care settings, and detection of features and patterns that are not easily recognized by the human ear.
Significant diagnostic information can be obtained from the frequency distribution of lung sounds. However, selecting the signal processing technique is very important. Many efforts have been reported in the literature on the classification of lung sound signals using frequency analysis [4–8]. Lung sound signals are non-stationary even when observed in a perfectly healthy normal subject. This non-stationarity is severe in the case of abnormal subjects. Thus the mutually exclusive time and frequency domain representations are not highly successful in the diagnostic classification of lung sounds. Hence there is a need to represent lung sound signals in two dimensions, with time and frequency as coordinates. The wavelet transform (WT) is a suitable technique for obtaining the time-frequency distribution of signals.
Recently, work on time-frequency analysis of lung sound signals for detecting typical pneumonia using WT has been reported [9]. We could not find many other serious attempts reported in the international literature on diagnostic analysis of lung sound signals using WT. Compared with the conventional method of frequency analysis using the Fourier transform or short time Fourier transform, wavelets enable analysis with a coarse-to-fine multiresolution perspective of the signal [10]. In this work, WT has been applied for the time-frequency analysis of lung sound signals, and an artificial neural network (ANN) for classification using the wavelet coefficients.
Lung sounds recorded from various subjects will be of different loudness levels. Hence, before processing, the signals were normalized so that they would have approximately the same loudness irrespective of the subject. After normalization, the signals were decomposed into frequency subbands using the discrete wavelet transform (DWT). A set of statistical features was extracted from the subbands to represent the distribution of wavelet coefficients. An ANN-based system was implemented to classify the lung sound signal into one of the categories: normal, wheeze, crackle, squawk, stridor, or rhonchus. A block diagram of the proposed method is shown in Fig. 1.
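The paper does not state which normalization was applied. As one possible reading, the following minimal sketch scales each recording to a common root-mean-square (RMS) level; the function name `normalize_rms` and the target level of 0.1 are assumptions made for illustration only.

```python
import math

def normalize_rms(signal, target_rms=0.1):
    """Scale a signal so that its RMS level equals target_rms (assumed target)."""
    rms = math.sqrt(sum(v * v for v in signal) / len(signal))
    if rms == 0.0:
        return list(signal)  # silent recording: nothing to scale
    gain = target_rms / rms
    return [v * gain for v in signal]

# Two toy recordings with the same shape but different loudness.
quiet = [0.01, -0.02, 0.015, -0.005]
loud = [0.5, -1.0, 0.75, -0.25]

# After normalization both have the same RMS level.
print(normalize_rms(quiet))
print(normalize_rms(loud))
```

Peak normalization would be an equally plausible choice; RMS is used here only because it tracks perceived loudness somewhat better than the single largest sample.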
Fig. 1. Schematic of the classification method.
2. Material and methods
2.1. Categories of lung sound
Sound signals produced by the lungs due to airflow during inspiration and expiration form a powerful source of information about the condition of the respiratory system. Lung sounds are empirically known to be closely correlated with pulmonary pathology and they can be divided into two major categories: normal and adventitious sounds. The breathing-associated sound of a healthy person is called the normal lung sound. Normal lung sound spans the frequency range 100–1000 Hz and is devoid of any discrete peaks. Adventitious sounds are abnormal sounds and usually indicate some type of respiratory disorder.
2.1.1. Adventitious sounds
Wheezes: Wheezes are adventitious, continuous sounds having a musical character. Acoustically, a wheeze is characterized by periodic waveforms with a dominant frequency usually over 100 Hz and with a duration of >100 ms. Wheezing is a common sign of obstructive lung disease.
Crackles: Crackles are discontinuous adventitious lung sounds, explosive and transient in nature, and occur frequently in cardiorespiratory diseases [11]. Their duration is less than 20 ms, and their frequency content is typically wide.
Stridors: Stridors are very loud wheezes, which are the consequence of a morphologic or dynamic obstruction in the larynx or trachea. Stridor is usually characterized by a prominent peak at about 1000 Hz in its frequency spectrum.
Squawks: Squawks are short inspiratory wheezes that occur primarily in restrictive lung diseases. They always occur along with crackles, and often begin with a crackle. Their duration rarely exceeds 400 ms. Squawks are assumed to originate from oscillation of small airways after sudden opening, and their timing seems to depend on the transpulmonary pressure in a similar manner as in crackles.
Rhonchi: Rhonchi often have a low-pitched, rattling, rumbling or bubbling quality. They may even sound similar to wheezes on occasion, and therefore may be difficult to distinguish from them. They may have an even more liquid sound than either wheezes or crackles, but they could also sound dry. The dominant frequency of rhonchi is less than 200 Hz.
There are some more categories of lung sounds, such as pleural friction rub and death rattle. Pleural friction rub is the characteristic sound produced when the pleural space has fluid in it. Under this condition, there is less room for expansion of lung tissue on inspiration and the pleura rub together. The death rattle is an ominous sound that generally describes a patient with lungs that are filling up with fluid. As the pleural friction rub and death rattle can easily be identified by a physician, mostly without the use of a stethoscope, these categories were not included in the classification.
Table 1
Types of adventitious sounds and possible lung diseases

| Types of adventitious sounds | Possible lung diseases |
| --- | --- |
| Crackles | Alveolitis, pulmonary fibrosis, atelectasis, pneumonia, asbestosis, chronic bronchitis, bronchiectasis, congestive heart failure |
| Wheezes | Obstructive lung diseases (e.g. asthma), cystic fibrosis, adults exposed to occupational hazards |
| Stridors | Laryngitis, laryngomalacia, anatomic hypothesis, vocal cord paralysis, airway inflammation following extubation, tumors, tracheal stenosis |
| Squawks | Allergic alveolitis, pulmonary fibrosis, interstitial fibrosis |
| Rhonchi | Chronic bronchitis, tumors, pneumonia, obstructive pulmonary disease |
Various types of adventitious sounds and the possible lung diseases are shown in Table 1 [11,12]. The usefulness of lung sound classification can be well understood from this table. Identification of the type of adventitious sound will aid the physician in the diagnosis of the respiratory disease.
2.2. Acquisition of lung sounds
Lung sound signals collected earlier for frequency analysis [6] were used in this study. Recording of the signals was done at the Department of Pulmonology, PSG Institute of Medical Sciences and Research, Coimbatore, India. All recordings were made with the subjects in a relaxed condition and in the supine position, under the supervision of a senior physician (third author) specialized in pulmonology and respiratory care. The classification of the lung sounds was done by the pulmonology group headed by the third author. The signals belong to typical chronic cases of inspiratory wheezes, fine crackles, stridor, squawk, and rhonchus, apart from normal vesicular sounds. Other types of lung sounds were not considered. Some of the lung sound signals available on internet sites [13,14] were also used. The sampling frequency was 11,025 Hz in all cases.
2.3. Analysis using discrete wavelet transform
2.3.1. Wavelet transform
The process of converting a signal from the time domain to the frequency domain is achieved conventionally with the Fourier transform (FT). The FT does not provide enough information when used on non-stationary signals: it determines only the frequency components of a signal, but not their location in time. In order to overcome this drawback, the short time Fourier transform (STFT), using a technique called windowing, was proposed. STFT maps the signal into a two-dimensional space of time and frequency using a single fixed window. The wavelet transform enables analysis with multiple window durations that allow for a coarse-to-fine multiresolution perspective of the signal. Because the variable-sized window region (the wavelet) can be dilated or compressed, different features of the signal are extracted in WT. A comparison between the constant window regions used in STFT analysis and the variable window region used in WT analysis is exhibited in Fig. 2.
Fig. 2. Window regions of STFT and WT analyses.
The mathematical equation describing the continuous wavelet transform (CWT) of the signal x(t) is [10]
\[
\mathrm{CWT}_x^{\psi}(\tau, s) = \frac{1}{\sqrt{|s|}} \int x(t)\, \psi^{*}\!\left(\frac{t - \tau}{s}\right) dt. \tag{1}
\]
The quantity s is referred to as the scaling (dilation) parameter of the wavelet, which can be considered to represent the inverse of frequency, and τ is the translation parameter that locates the wavelet ψ in time. The wavelet is compressed if the scale is low and stretched if the scale is high, which is also evident in Fig. 2. When the signal is sampled at discrete intervals, as in the case of acquisition by computers, the discrete wavelet transform (DWT) is used.
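For orientation, the DWT is commonly obtained by restricting the scale s and the translation τ of the continuous transform to a dyadic grid, with ψ denoting the mother wavelet. The following is a sketch of that standard discretization; the integer indices j and k are notation introduced here and do not appear in the paper.

```latex
s = 2^{j}, \qquad \tau = k\,2^{j}, \qquad j, k \in \mathbb{Z},
\qquad
\psi_{j,k}(t) = \frac{1}{\sqrt{2^{j}}}\;
\psi\!\left(\frac{t - k\,2^{j}}{2^{j}}\right).
```

Each increase of j by one doubles the width of the wavelet and halves the centre frequency it responds to, which is why the subbands of a DWT are octave bands.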
2.3.2. Denoising the lung sound signals
One of the major hurdles in computerized analysis of lung sounds is the presence of noise in the signals. Noise from equipment such as ventilators and air-conditioners, and other ambient noise, may contaminate the lung sound signals. The noisy nature of the lung sounds is a serious impediment to further processing aimed at identifying useful diagnostic features. Hence denoising of the lung sound signals is a must for their effective utilization in diagnosis. Since the frequency bands of these noises may overlap with those of the lung sounds, the conventional method of using filters is not suitable for removal of noise. In this work, a DWT-based denoising technique, namely wavelet shrinkage denoising, was used.
Wavelet shrinkage denoising consists of three steps:
• obtain the wavelet transform of the signal,
• apply nonlinear shrinking to the wavelet coefficients, and
• obtain the inverse wavelet transform of the modified coefficients.
There are various types of wavelet shrinkage denoising techniques, classified according to the thresholding method used in nonlinear shrinking. SureShrink [15], which uses a hybrid of the universal threshold and Stein’s unbiased estimate of risk (SURE), was used for denoising in this work. The denoising results are shown in Fig. 3.
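The three steps of wavelet shrinkage can be sketched in a few lines. This is a simplified stand-in and not the procedure used in the paper: it uses the Haar wavelet and the plain universal threshold with soft thresholding, whereas the paper reports SureShrink. All function names, the test signal and the noise level are assumptions for illustration.

```python
import math
import random
import statistics

SQRT2 = math.sqrt(2.0)

def haar_forward(x, levels):
    """Step 1: Haar DWT. len(x) must be divisible by 2**levels."""
    approx, details = list(x), []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / SQRT2 for a, b in pairs])
        approx = [(a + b) / SQRT2 for a, b in pairs]
    return approx, details  # details[0] is the finest level

def haar_inverse(approx, details):
    """Step 3: inverse Haar DWT."""
    x = list(approx)
    for d in reversed(details):
        merged = []
        for a, c in zip(x, d):
            merged.extend([(a + c) / SQRT2, (a - c) / SQRT2])
        x = merged
    return x

def soft_threshold(coeffs, t):
    """Step 2: shrink each coefficient towards zero by t."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def denoise(x, levels=3):
    approx, details = haar_forward(x, levels)
    # Noise level estimated from the median absolute finest-detail coefficient.
    sigma = statistics.median(abs(c) for c in details[0]) / 0.6745
    t = sigma * math.sqrt(2.0 * math.log(len(x)))  # universal threshold
    return haar_inverse(approx, [soft_threshold(d, t) for d in details])

def mse(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)

random.seed(0)
clean = [math.sin(2.0 * math.pi * n / 128.0) for n in range(256)]
noisy = [c + random.gauss(0.0, 0.3) for c in clean]
restored = denoise(noisy)
print("MSE before denoising:", mse(noisy, clean))
print("MSE after denoising: ", mse(restored, clean))
```

SureShrink differs in that it selects a threshold per level by minimizing Stein's unbiased risk estimate and falls back to the universal threshold when the coefficients are sparse; the overall transform, shrink, invert structure is the same.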
Fig. 3. Lung sound signal (a) with noise and (b) after denoising. (c) and (d) are the time-expanded representations of (a) and (b), respectively. Axes: time (s) versus amplitude.
2.3.3. Multiresolution decomposition of lung sound signals
The procedure of multiresolution decomposition of a signal x[n] is schematically shown in Fig. 4. Each stage of this scheme consists of two digital filters and two downsamplers by 2. The first filter, h[·], is the discrete mother wavelet, high-pass in nature, and the second, g[·], is its mirror version, low-pass in nature. The downsampled outputs of the first high-pass and low-pass filters provide the detail, D1, and the approximation, A1, respectively. The first approximation, A1, is further decomposed and this process is continued as shown in Fig. 4.
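The filter-and-downsample stage described above can be sketched as follows. The two-tap Haar filters are used here only because they are short; the paper's final choice is db8. The function names and the toy input are assumptions, and boundary samples are simply dropped rather than extended.

```python
import math

def analysis_stage(x, low_pass, high_pass):
    """One stage: filter with g (low-pass) and h (high-pass), keep every 2nd sample."""
    def filter_and_downsample(taps):
        out = []
        # Start where the filter fully overlaps the signal (no boundary extension).
        for n in range(len(taps) - 1, len(x), 2):
            out.append(sum(taps[k] * x[n - k] for k in range(len(taps))))
        return out
    return filter_and_downsample(low_pass), filter_and_downsample(high_pass)

def decompose(x, levels, low_pass, high_pass):
    """Repeatedly split the approximation; returns (A_levels, [D1, D2, ...])."""
    details, approx = [], list(x)
    for _ in range(levels):
        approx, detail = analysis_stage(approx, low_pass, high_pass)
        details.append(detail)
    return approx, details

r = 1.0 / math.sqrt(2.0)
g = [r, r]    # Haar low-pass filter (stand-in for the db8 filter)
h = [r, -r]   # Haar high-pass filter

x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
a2, (d1, d2) = decompose(x, 2, g, h)
print("A2:", a2)
print("D1:", d1)
print("D2:", d2)
```

Because each stage halves the number of samples, the total number of coefficients stays roughly equal to the length of the input, however many levels are used.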
Selection of wavelet and number of levels: Selection of a suitable wavelet and of the number of levels of decomposition is very important in the analysis of signals using WT. The typical way is to visually inspect the data first; if the data are discontinuous, Haar or other sharp wavelet functions are applied, otherwise a smoother wavelet can be employed. Usually, tests are performed with different types of wavelets and the one which gives maximum efficiency is selected for the particular application.
Fig. 4. Subband decomposition of DWT implementation; h[n] is the high-pass filter, g[n] the low-pass filter.
Table 2
Ranges of frequency bands in wavelet decomposition

| Decomposed signal | Frequency range (Hz) |
| --- | --- |
| D1 | 2756.25–5512.50 |
| D2 | 1378.13–2756.25 |
| D3 | 689.06–1378.13 |
| D4 | 344.53–689.06 |
| D5 | 172.26–344.53 |
| D6 | 86.13–172.26 |
| D7 | 43.07–86.13 |
| A7 | 0–43.07 |
The number of levels of decomposition is chosen based on the dominant frequency components of the signal. The levels are chosen such that those parts of the signal that correlate well with the frequencies required for classification of the signal are retained in the wavelet coefficients. Since the lung sounds do not have any useful frequency components below 50 Hz, the number of levels was chosen to be 7. Thus the signal is decomposed into the details D1–D7 and one final approximation, A7. The ranges of the various frequency bands are shown in Table 2.
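The band edges follow directly from the sampling frequency, since each level halves the upper band edge, starting from the Nyquist frequency. A minimal sketch that computes them for the 11,025 Hz sampling rate and 7 levels reported in the paper (the function name is an assumption):

```python
def subband_ranges(fs, levels):
    """Nominal frequency range of each DWT subband for sampling frequency fs (Hz)."""
    bands = {}
    upper = fs / 2.0  # Nyquist frequency
    for level in range(1, levels + 1):
        lower = upper / 2.0
        bands[f"D{level}"] = (lower, upper)
        upper = lower
    bands[f"A{levels}"] = (0.0, upper)
    return bands

for name, (low, high) in subband_ranges(11025, 7).items():
    print(f"{name}: {low:.2f}-{high:.2f} Hz")
```

The values agree with Table 2 up to rounding of the last digit. With 7 levels the final approximation A7 covers 0 to about 43 Hz, which lies below the 50 Hz limit mentioned above.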
2.4. Feature extraction
The extracted wavelet coefficients provide a compact representation that shows the energy distribution of the signal in time and frequency. It was observed that the values of the coefficients are very close to zero in D1, D2 and A7. This is anticipated, as the lung sound frequency spectrum ranges from 50 to 1000 Hz. So the coefficients corresponding to the frequency bands D1, D2 and A7 were discarded, thus reducing the number of feature vectors representing the signal.
In order to further reduce the dimensionality of the extracted feature vectors, statistics over the set of the wavelet coefficients were used [16]. The following statistical features were used to represent the time-frequency distribution of the lung sound signals:
(1) Mean of the absolute values of the coefficients in each subband.
(2) Average power of the wavelet coefficients in each subband.
(3) Standard deviation of the coefficients in each subband.
(4) Ratio of the absolute mean values of adjacent subbands.
Features 1 and 2 represent the frequency distribution of the signal, and features 3 and 4 the amount of change in the frequency distribution. These feature vectors, calculated for the frequency bands D3–D7, were used for classification of the lung sound signals. The classifier is based on a multilayer artificial neural network.
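The four statistics listed above can be sketched as follows. The exact definitions are assumptions: the standard deviation is taken as the population form, the ratio is taken as each subband's mean absolute value divided by that of the next coarser subband, and the function name and toy coefficients are illustrative.

```python
import math

def subband_features(subbands):
    """subbands: list of coefficient lists ordered D3, D4, D5, D6, D7."""
    mean_abs = [sum(abs(c) for c in band) / len(band) for band in subbands]
    avg_power = [sum(c * c for c in band) / len(band) for band in subbands]
    std_dev = []
    for band in subbands:
        mu = sum(band) / len(band)
        std_dev.append(math.sqrt(sum((c - mu) ** 2 for c in band) / len(band)))
    # Ratio of mean absolute values of adjacent subbands (assumes non-zero means).
    ratios = [mean_abs[i] / mean_abs[i + 1] for i in range(len(mean_abs) - 1)]
    return mean_abs + avg_power + std_dev + ratios

# Toy coefficients standing in for the five retained subbands.
toy = [
    [0.2, -0.1, 0.4, -0.3],
    [0.5, -0.6, 0.1, 0.2],
    [1.0, -0.8, 0.3, -0.2],
    [0.7, 0.9, -1.1, 0.4],
    [0.3, -0.2, 0.1, 0.6],
]
features = subband_features(toy)
print(len(features), "features")
```

Under this reading, five subbands give 3 × 5 = 15 per-band statistics plus 4 adjacent-band ratios, that is 19 values, which is consistent with the 19 input nodes reported in the results section, although the paper does not spell out the count.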
2.5. Classification using artificial neural networks
Artificial neural networks (ANNs) are formed of cells simulating the low-level functions of biological neurons. In an ANN, knowledge about the problem is distributed in the neurons and in the connection weights of the links between neurons. The neural network has to be trained to adjust the connection weights and biases in order to produce the desired mapping. At the training stage, the feature vectors are applied as input to the network and the network adjusts its variable parameters, the weights and biases, to capture the relationship between the input patterns and outputs. ANNs are particularly useful for complex pattern recognition and classification tasks. The capability of learning from examples, the ability to reproduce arbitrary nonlinear functions of the input, and the highly parallel and regular structure of ANNs make them especially suitable for pattern classification tasks [17].
ANNs are widely used in the biomedical field for modelling, data analysis, and diagnostic classification [18–20]. The most frequently used training algorithm in classification problems is the backpropagation (BP) algorithm, which is also used in this work. As the conventional BP algorithm with gradient descent, and gradient descent with momentum, are slow, a few of the modified BP algorithms were tried. Adaptive learning rate BP, resilient BP, Levenberg–Marquardt, and scaled conjugate gradient BP algorithms were examined for training the ANN.
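To illustrate why resilient BP tends to be fast, the sketch below applies a resilient update rule to a toy quadratic error surface. Only the sign of each partial derivative is used, and each weight keeps its own step size, which grows while the sign is stable and shrinks when it flips. This is the iRprop− variant with commonly quoted default factors, chosen here for brevity; it is not necessarily the variant implemented in the toolbox used by the authors, and all names and constants are assumptions.

```python
def sign(v):
    return (v > 0) - (v < 0)

def rprop_minimize(grad_fn, weights, epochs=200, step_init=0.1,
                   eta_plus=1.2, eta_minus=0.5, step_min=1e-6, step_max=50.0):
    """Minimize a function given its gradient, using sign-based resilient updates."""
    w = list(weights)
    steps = [step_init] * len(w)
    prev_grad = [0.0] * len(w)
    for _ in range(epochs):
        grad = list(grad_fn(w))
        for i in range(len(w)):
            agreement = grad[i] * prev_grad[i]
            if agreement > 0:
                # Same sign as last time: accelerate.
                steps[i] = min(steps[i] * eta_plus, step_max)
            elif agreement < 0:
                # Sign flipped: a minimum was overshot, so slow down and
                # skip this weight's update for the current epoch.
                steps[i] = max(steps[i] * eta_minus, step_min)
                grad[i] = 0.0
            w[i] -= sign(grad[i]) * steps[i]
            prev_grad[i] = grad[i]
    return w

# Toy error surface E(w) = (w0 - 3)^2 + 10 (w1 + 1)^2 with minimum at (3, -1).
def toy_gradient(w):
    return [2.0 * (w[0] - 3.0), 20.0 * (w[1] + 1.0)]

print(rprop_minimize(toy_gradient, [0.0, 0.0]))
```

Because the magnitude of the gradient is ignored, the badly scaled second coordinate converges about as quickly as the first, which plain gradient descent with a single learning rate would not achieve.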
2.5.1. Encoding of data for ANN
The classification scheme of 1-of-C coding has been used for classifying the signal into one of the output categories. For each type of lung sound, a corresponding output class is associated. The feature vector set, x, represents the ANN inputs, and the corresponding class, once coded, constitutes the ANN outputs. In order to make the neural network training more efficient, the input feature vectors were normalized so that they fall in the range [0, 1.0]. Since the number of output classes is 6, the ANN has 6 outputs which produce a code for each class, as shown below.
\[
y =
\begin{cases}
[0.9\ \ 0.1\ \ 0.1\ \ 0.1\ \ 0.1\ \ 0.1]^{T} & \text{for the first class,}\\
[0.1\ \ 0.9\ \ 0.1\ \ 0.1\ \ 0.1\ \ 0.1]^{T} & \text{for the second class,}\\
\qquad\qquad\vdots & \\
[0.1\ \ 0.1\ \ 0.1\ \ 0.1\ \ 0.1\ \ 0.9]^{T} & \text{for the sixth class.}
\end{cases}
\]
Each dummy variable is given the value 0.1 except for the one corresponding to the correct category, which is given the value 0.9. Using target values of 0.1 and 0.9 instead of the common practice of 0 and 1 prevents the outputs of the network from being directly interpretable as posterior probabilities [21]. The output vector associated with the modified input vector x_k, k = 1, 2, ..., K, is denoted y_k, with K the number of lung sound signals.
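The encoding and its inverse can be sketched as follows, together with a column-wise min-max scaling of the inputs to [0, 1]. The order of the classes in `CLASSES` is an assumption (the paper does not say which sound is the first class), as are the function names; min-max scaling is one common way to reach the stated range and may differ from what the authors used.

```python
CLASSES = ["normal", "wheeze", "crackle", "squawk", "stridor", "rhonchus"]

def encode_target(label, low=0.1, high=0.9):
    """1-of-C target vector with 0.9 for the true class and 0.1 elsewhere."""
    return [high if name == label else low for name in CLASSES]

def decode_output(outputs):
    """Class whose output node has the highest activation."""
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return CLASSES[best]

def scale_features(vectors):
    """Min-max scale each feature (column) to the range [0, 1]."""
    columns = list(zip(*vectors))
    lows = [min(c) for c in columns]
    spans = [max(c) - min(c) for c in columns]
    return [[(v - lo) / sp if sp else 0.0 for v, lo, sp in zip(row, lows, spans)]
            for row in vectors]

print(encode_target("wheeze"))
print(decode_output([0.2, 0.1, 0.8, 0.3, 0.1, 0.2]))
print(scale_features([[1, 10], [3, 20], [2, 30]]))
```

Decoding by the largest output mirrors the rule, stated later in this section, that the signal class is indicated by the index of the highest ANN output.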
Once the coding processes are completed, a set of K input/output pairs D = {x_k, y_k | k = 1, 2, ..., K} is available. This data set is divided into two subsets, a training set and a test set.
(1) D_train = {x_k, y_k | k = 1, 2, ..., K_train} is used to perform the ANN training, which consists of the determination of the ANN running parameters, i.e. the ANN connection weights and biases.
(2) D_test = {x_k, y_k | k = 1, 2, ..., K_test} is used to validate the off-line classification ability and quality of the ANN once the training has been completed.
Normally about 80% of the dataset is used as the training set and the remainder as the test set. During the validation phase, the signal class (i.e. the type of lung sound) is indicated by the index of the highest output of the ANN.
2.5.2. Cross validation
Cross validation (CV) [22,23] is often used for comparing two or more learning ANN models to estimate which model will perform the best on the problem at hand. With n-fold CV, the available data is partitioned into n disjoint subsets, the union of which is equal to the original set. Each learning model is trained on n − 1 of the available subsets, and then tested on the one subset which was not used during training. This process is repeated n times, each time using a different test set chosen from the n available partitions of the training data, until all possible choices for the test set have been exhausted. The n test set scores for each learning model are then averaged, and the model with the highest average test set score is chosen as the one most likely to perform well on unseen data.
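The partitioning step of n-fold CV can be sketched as below. The example sizes, 126 samples and 6 folds, are those reported in the results section; the shuffling seed, the function names and the way samples are dealt into folds are assumptions, and `score_fn` is a hypothetical callback that would train and test a model on the given index lists.

```python
import random

def k_fold_splits(n_samples, n_folds, seed=0):
    """Yield (train_indices, test_indices) for each of the n_folds held-out subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]  # disjoint subsets
    for held_out in range(n_folds):
        test = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, test

def cross_validate(score_fn, n_samples, n_folds):
    """Average the test score over all folds."""
    scores = [score_fn(train, test) for train, test in k_fold_splits(n_samples, n_folds)]
    return sum(scores) / len(scores)

for train, test in k_fold_splits(126, 6):
    print(len(train), "training samples,", len(test), "test samples")
```

Every sample appears in exactly one test subset, so the averaged score uses all of the data for testing without ever testing a model on a sample it was trained on.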
3. Results and discussion
One cycle of breathing, an inspiration followed by an expiration, was selected from each of the lung sound signals. Although it is better to select a cycle which is less noisy, it is not mandatory. The denoising performed prior to the application of the wavelet transform would remove any noise present. After normalization, the lung sound signals were decomposed using the wavelet transform and the statistical features were extracted from the subbands. A classification system based on a feedforward ANN was implemented using the statistical features as inputs. The DWT and ANN training were performed using the toolboxes available with the technical computing software MATLAB [24,25].
In order to improve the confidence intervals on the performance estimates, 6-fold cross validation was performed. The 126 samples were partitioned into 6 disjoint subsets, and each time 105 samples were used for training and the remaining 21 for validation. This procedure was repeated 5 more times, each time using a different test set chosen from the 6 divisions of the data, until all 6 possible choices for the test set had been used. For each ANN model, this type of training and testing process was done 25 times with each set of D_train and D_test of the CV, and the average value was taken.
Fig. 5. Comparison of classification efficiencies using different wavelets (coif4, sym10, db12, db8). Axes: sum squared error goal (0.1, 0.01, 0.001, 0.0001) versus classification efficiency (%).
The classification efficiency, which is defined as the percentage ratio of the number of lung sounds correctly classified to the total number of lung sounds considered for classification, also depends on the type of wavelet chosen for the application. In the previous work on the application of WT in lung sound analysis [9], the Daubechies wavelet of order 8 (db8) was used and found to yield good results. In order to investigate the effect of other wavelets on classification efficiency, tests were also carried out using the Symmlet of order 10 (sym10), the Coiflet of order 4 (coif4), and the Daubechies wavelet of order 12 (db12). The average efficiency obtained for each wavelet, when lung sound signals were classified using various ANN structures, is shown in Fig. 5. It can be seen that the Daubechies wavelets offer better efficiency than the others, and db8 is marginally better than db12. Hence the db8 wavelet was chosen for this application.
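Written as a formula, the classification efficiency defined above is the following, where the symbol η and the names of the two counts are notation chosen here for illustration:

```latex
\eta = \frac{N_{\text{correct}}}{N_{\text{total}}} \times 100\,\%
```

As a purely hypothetical example, if 20 of the 21 sounds in one validation subset were assigned to the right class, the efficiency for that subset would be 20/21 × 100 % ≈ 95.2 %.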
Hornik et al. [26] showed that multilayer feedforward ANNs with at least one hidden layer of computational units are capable of approximating any finite function to any degree of accuracy, and hence they can be regarded as universal approximators. Subsequently it was proved that a feedforward ANN with one hidden layer having p − 1 neurons can exactly implement an arbitrary training set with p training samples [27]. This is a sufficient condition for exactly implementing the training set. An important corollary to this result, in the context of a classification problem, is that ANNs with sigmoidal activation functions and two layers can approximate any decision boundary to arbitrary accuracy. Therefore, we started the simulation with a two-layer ANN architecture with the hidden layer having 104 neurons, 1 less than the number of training samples, and both layers having sigmoidal transfer functions. The number of input nodes was fixed at 19, equal to the number of input features, and the number of output nodes at 6, equal to the number of output classes.
Table 3
Performance of the various ANN architectures

| Model no. | ANN architecture | No. of weights | No. of epochs | Training time ratio with model 1 | η1 (%) | η2 (%) | η3 (%) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 19-104-6 | 2600 | 172 | 1.00 | 100 | 94.56 | 91.33 |
| 2 | 19-70-6 | 1750 | 190 | 0.8377 | 100 | 93.09 | 92.00 |
| 3 | 19-55-6 | 1375 | 211 | 0.7954 | 100 | 94.45 | 90.33 |
| 4 | 19-40-6 | 1000 | 265 | 0.7317 | 100 | 94.02 | 91.67 |
| 5 | 19-25-6 | 625 | 422 | 1.8044 | 100 | 83.37 | 73.67 |
| 6 | 19-10-6 | 250 | 5786 | 10.2727 | 100 | 59.15 | 46.33 |
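The weight counts in Table 3 can be reproduced from the layer sizes alone. The short sketch below counts one weight per connection between consecutive layers and excludes biases, which is the convention that matches the tabulated values; the function name is an assumption.

```python
def weight_count(layers):
    """Number of connection weights in a fully connected feedforward net (biases excluded)."""
    return sum(a * b for a, b in zip(layers, layers[1:]))

for hidden in (104, 70, 55, 40, 25, 10):
    print(f"19-{hidden}-6: {weight_count([19, hidden, 6])} weights")
```

For a 19-H-6 network this reduces to 25 × H, so halving the hidden layer halves the number of parameters to be trained.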
The optimum number of neurons in the hidden layer, the training algorithm, the parameters of the training algorithm, and the activation functions of the two layers were determined by repeated simulation. The conventional backpropagation (BP) algorithm was found to be too slow in converging to the specified sum squared error (sse). The resilient backpropagation algorithm, which normally performs very well on pattern recognition problems [25,28], was selected initially for training the ANN.
It was observed that the classification efficiency was better and the training time was less when the tan-sigmoid function was used for the first layer and the log-sigmoid function for the second layer. Hence the activation functions were selected accordingly. Many ANN models having fewer than 100 hidden layer neurons were investigated to ascertain how changes in the number of neurons in the hidden layer contribute to the overall performance of the classification system. We noted that learning of the training set does not necessarily guarantee successful diagnostic classification of the test set. The results of the neural network models trained with the resilient backpropagation algorithm, which was found to be the best training algorithm, are summarized in Table 3. The average values of the classification efficiencies obtained in simulation are shown in the table: η1 is the average efficiency when the training set is presented to the trained ANN, and η2 is that when the validation test set is presented. In order to assess the performance of the trained ANN, a separate test set of 12 lung sound signals was also used; η3 is the average efficiency obtained when this test set was submitted.
It was noticed that the best performance on the training set, validation test set, and separate test set was obtained with those models whose hidden layer had 40 neurons or more. Since larger hidden layers gave no consistent further improvement (Table 3), the optimum number of neurons required in the hidden layer is taken to be 40, and hence we have chosen the ANN configuration 19-40-6.
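A forward pass through a 19-40-6 network with a tan-sigmoid hidden layer and a log-sigmoid output layer can be sketched as below. The weights are random placeholders rather than trained values, and the parameter layout, initialization range and function names are assumptions made for illustration.

```python
import math
import random

def tansig(v):
    """Tan-sigmoid (hyperbolic tangent) activation, output in (-1, 1)."""
    return math.tanh(v)

def logsig(v):
    """Log-sigmoid (logistic) activation, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

def layer(inputs, weights, biases, activation):
    """Weighted sum plus bias for each neuron, followed by the activation."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(x, params):
    hidden = layer(x, params["W1"], params["b1"], tansig)
    return layer(hidden, params["W2"], params["b2"], logsig)

def random_params(n_in=19, n_hidden=40, n_out=6, seed=0):
    """Untrained placeholder weights; a real network would learn these."""
    rng = random.Random(seed)
    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {"W1": matrix(n_hidden, n_in), "b1": [0.0] * n_hidden,
            "W2": matrix(n_out, n_hidden), "b2": [0.0] * n_out}

outputs = forward([0.5] * 19, random_params())
print(outputs)
```

The log-sigmoid output layer keeps every output between 0 and 1, which suits targets of 0.1 and 0.9, while the zero-centred tan-sigmoid hidden layer is a common choice for faster training.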
One of the problems that occurs during neural network training is called overfitting. The error on the training set is driven to a very small value, but when new data is presented to the network the error is large. The network has memorized the training examples, but it has not learned to generalize to new situations. One method for improving network generalization is to use a network that is just large enough to provide an adequate fit [25,29]. The larger the network used, the more complex the functions the network can create. If a small enough network is used, it will not have enough power to overfit the data. Thus in this application, by using a network that is just large enough to provide an adequate fit, the risk of overfitting the training data is reduced.
It is also very difficult to know which training algorithm will be the fastest for a given problem. It will depend on many factors, including the complexity of the problem, the number of data points in the training set, the number of weights and biases in the network, the error goal, and whether the network is being used for pattern recognition or function approximation. In order to obtain the most efficient training algorithm for this work, we have investigated four high performance backpropagation algorithms, namely, adaptive learning rate BP (GDA), resilient BP (RP), scaled conjugate gradient (SCG), and Levenberg–Marquardt (LM) algorithms.
Fig. 6. Comparison of the performances of training algorithms (RP, LM, GDA, SCG). Axes: time (s) versus sum squared error, both on logarithmic scales; sse goal = 0.001.
A comparison of the performances of the training algorithms is illustrated in Fig. 6, which plots the time required to converge to the error goal versus the sum squared error convergence goal for the ANN architecture 19-40-6. It can be seen that resilient BP is the fastest algorithm for this classification problem, hence its selection is justified. The SCG algorithm also performs well; it is almost as fast as RP. The GDA algorithm is much slower than the SCG and RP algorithms, and the LM algorithm appears to be the slowest.
Thus the lung sound classification system proposed in this paper uses the db8 wavelet for time-frequency analysis of lung sounds. The 19-40-6 ANN architecture was found to be the optimum model for classification using the statistical features extracted from the wavelet coefficients. The RP algorithm was used for training the ANN. This system was tested using a new set of lung sound signals which were originally noisy.
4. Summary17
Acoustic signals generated by the lungs during inspiration and expiration give important
information about the condition of the respiratory system. Electronic auscultation can be effectively used to
assess respiratory dysfunction. The conventional method of classifying lung sounds using mutually
exclusive time and frequency domain representations does not give efficient results. In this work, a
novel method of diagnostic classification of lung sound signals is proposed.
The lung sound signals were decomposed into time-frequency representations using the wavelet
transform, and statistical features were calculated to depict their distribution. An ANN-based system was
implemented for the classification of lung sounds using the statistical features as inputs. Simulation
results showed that the Daubechies wavelet of order 8 gives better classification efficiency than some of
the other common wavelets. ANN architectures having two layers were chosen for the application.
Some of the high-performance backpropagation algorithms, namely, adaptive learning rate BP,
resilient BP, scaled conjugate gradient, and Levenberg–Marquardt algorithms, were tested for training
the ANN. It was concluded after many simulations with various combinations of ANN architectures,
activation functions, and training algorithms that an ANN architecture 19-40-6, with a tan-sigmoid
transfer function for the first layer and a log-sigmoid function for the second layer, is the optimum
structure for this application. Of the various training algorithms tested, resilient BP was found to be
the best algorithm, taking the least time to meet the error goal.
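The structure of the selected network can be illustrated with a forward pass through a 19-40-6 architecture, using tanh for the tan-sigmoid first layer and the logistic function for the log-sigmoid second layer. This is a hedged Python sketch rather than the trained system: the weights are random, the helper names (`init_network`, `forward`, `sse`, `classify`) are hypothetical, and the ordering of the class labels on the six outputs is an assumption.

```python
import math
import random

# Output ordering is an assumption; the six class names come from the paper.
CLASSES = ["normal", "wheeze", "crackle", "squawk", "stridor", "rhonchus"]


def init_network(n_in, n_hidden, n_out, rng):
    """Random weights and biases for a two-layer feedforward network."""
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
    b2 = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return w1, b1, w2, b2


def forward(x, w1, b1, w2, b2):
    """Tan-sigmoid hidden layer followed by a log-sigmoid output layer."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return [1.0 / (1.0 + math.exp(-(sum(w * h for w, h in zip(row, hidden)) + b)))
            for row, b in zip(w2, b2)]


def sse(outputs, targets):
    """Sum squared error, the quantity compared against the error goal."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets))


def classify(outputs):
    """Index of the largest output, read as the predicted class."""
    return max(range(len(outputs)), key=lambda i: outputs[i])


if __name__ == "__main__":
    rng = random.Random(0)
    net = init_network(19, 40, 6, rng)
    x = [rng.uniform(-1.0, 1.0) for _ in range(19)]  # placeholder features
    y = forward(x, *net)
    target = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]          # one-hot "normal"
    print(CLASSES[classify(y)], sse(y, target))
```

The log-sigmoid output layer keeps every output strictly between 0 and 1, which suits one-hot class targets and a sum squared error goal.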
Apart from serving as an aid to physicians in the diagnosis of pulmonary diseases, the proposed
method could prove particularly useful for patients in critical care units and for children, who often find
it difficult to blow hard several times in the course of a lung function test. Recording and analysis
of lung sounds over longer terms, weeks or months, will give a detailed and objective picture of a
patient’s condition and a good view of the course of the disorder in daily life. This information
could then be used by the physician to set up an adequate treatment. However, it may be noted
that we have not considered all the classes of lung sounds in our study. More exhaustive work
may be necessary for identifying all the types of lung sounds, and for differentiating between the
subcategories such as fine crackles, coarse crackles, fine wheezing, coarse wheezing, etc.
Acknowledgements
The authors wish to thank the Principal, PSG College of Technology, Coimbatore, and the
management of PSG institutions for providing the research facilities and encouragement. This work was
supported by the Department of Science and Technology, Govt. of India, vide Grant No. III.5(47)/96-ET.
References21
[1] A.R.A. Sovijarvi, J. Vanderschoot, J.E. Earis, Standardization of computerised respiratory sound analysis, Eur. Respir.
Rev. 10 (77) (2000) 585.
23
[2] H. Pasterkamp, S.S. Kraman, G.R. Wodicka, Respiratory sounds: advances beyond the stethescope, Am. J. Respir.
Crit. Care Med. 156 (1997) 974–987.
25
[3] J.E. Earis, B.M.G. Cheetham, Future perspectives for respiratory sound research, Eur. Respir. Rev. 10 (77) (2000)
641–646.
27
[4] F.T. Wooten, W.W. Waring, M.J. Wegmann, W.F. Anderson, J.D. Conley, Method for respiratory sound analysis,
Med. Instrum. 12 (1978) 254–257.
29
[5] S.K. Chowdhury, A.K. Majumder, Frequency analysis of adventitious lung sounds, J. Biomed. Eng. 4 (1982)
305–312.
31
[6] A. Kandaswamy, S. Rajkumar, A.S. Kumar, S. Jayaraman, Respiratory system diagnosis through lung sound
processing, J. Systems Sci. Eng. 4 (1) (1999) 32–36.
33
[7] S. Rietveld, M. Oud, E.H. Dooijes, Classication of asthmatic breath sounds: preliminary results of the classifying
capacity of human examiners versus articial neural networks, Comput. Biomed. Res. 32 (1999) 440–448.
35
[8] M. Oud, E. Dooijes, J. van der Zee, Asthmatic airways obstruction assessment based on detailed analysis of
respiratory sound spectra, IEEE Trans. Biomed. Eng. 47 (11) (2000) 1450–1455.
UNCORRECTED PROOF
14 A. Kandaswamy et al. / Computers in Biology and Medicine ( ) –
CBM 467
ARTICLE IN PRESS
[9] V. Gross, T. Penzel, L. Hadjileontiadis, U. Kochler, Wavelet based lung sound analysis, in: J. Jan, J. Kozumplik, I.1
Provaznik (Eds.), Proceedings of the 16th International Eurasip Conference (Biosignal 2002), Vutium Press, BRNO
University of Technology, Czech Republic, 2002, pp. 150–152.
3
[10] S.G. Mallat, A theory for multiresolution signal decomposition: the wavelet representation, IEEE Trans. Pattern Anal.
Mach. Intel. 11 (1989) 674–693.
5
[11] A.R.A. Sovijarvi, L.P. Malmberg, G. Charbonneau, J. Vanderschoot, F. Dalmasso, C. Sacco, M. Rossi, J.E. Earis,
Characteristics of breath sounds and adventitious respiratory sounds, Eur. Respir. Rev. 10 (77) (2000) 591–596.
7
[12] R.L. Wilkins, J.E. Hodgkin, B. Lopez, Lung Sounds: A Practical Guide, The C.V. Mosby Company, Missouri,
1997.
9
[13] www.ief.u-psud.fr.
[14] www.library.uthscsa.edu.
11
[15] D.L. Donoho, I.M. Johnstone, Adapting to unknown smoothness via wavelet shrinkage, J. Am. Stat. Assoc. 90 (432)
(1995) 1200–1224.
13
[16] G. Tzanetakis, G. Essl, P. Cook, Audio analysis using the discrete wavelet transform, in: C.E.D’ Attellis, V.V. Kluev,
N. Mastorakis (Eds.), Mathematics and Simlulation with Biological Economical and Musicoacoustical Applications,
15
WSES Press, New York, 2001, pp. 318–323.
[17] R.O. Duda, P.E. Hart, D.G. Stork, Pattern Classication, Wiley, New York, 2001.
17
[18] T. Villmann, Neural network approaches in medicine—a review of actual developments, in: Proceedings of the
European Symposium on Articial Neural Networks (ESANN 2000), Bruges, Belgium, 2000, pp. 165–176.
19
[19] F. Gurgen, Neural network based decision making in diagnostic applications, IEEE Eng. Med. Biol. Mag. 18 (4)
(1999) 89–93.
21
[20] P.J.G. Lisboa, A review of evidence of health benet from articial neural networks in medical intervention, Neural
Networks 15 (1) (2002) 11–39.
23
[21] W.S. Sarle, Neural network FAQ—Part 2 (2002), ftp://ftp.sas.com/pub/neural/FAQ2.html.
[22] C. Goutte, Note on free lunches and cross-validation, Neural Comput. 9 (1997) 1211–1215.
25
[23] A. Tim, T. Martinez, Cross validation and MLP architecture selection, in: Proceedings of the IEEE International
Joint Conference on Neural Networks (IJCNN’99), Washington, DC, USA, 1999, p. CD Paper #192.
27
[24] M. Misiti, Y. Misiti, G. Oppenheim, J.M. Poggi, Wavelet Toolbox Users Guide for Use with MATLAB, The
MathWorks, Inc., Natick, MA, 2001.
29
[25] H. Demuth, M. Beale, Neural Network Toolbox Users Guide for Use with MATLAB, The MathWorks, Inc., Natick,
MA, 2001.
31
[26] K. Hornik, M. Stinchcombe, H. White, Multilayer feedforward networks are universal approximators, Neural
Networks 2 (1989) 359–366.
33
[27] M.A. Sartori, P.J. Antsaklis, A simple method to derive bounds on the size and to train multilayer neural networks,
IEEE Trans. Neural Networks 2 (1991) 467–471.
35
[28] M. Riedmiller, H. Braun, A direct adaptive method for faster backpropagation learning: The RPROP algorithm, in:
H. Ruspini (Ed.), Proceedings of the IEEE International Conference on Neural Networks (ICNN), San Francisco,
37
CA, USA, 1993, pp. 586–591.
[29] M.T. Hagan, H.B. Demuth, M.H. Beale, Neural Network Design, PWS Publishing, Boston, MA, 1996.
39
A. Kandaswamy received his B.E. (Honors) degree in Electrical Engineering and M.Sc. (Eng.) degree in Applied
Electronics both from Madras University, India in 1969 and 1974, respectively, and the Ph.D. degree from Bharathiar University,
India in 1992. Since 1969 he has been with PSG College of Technology, Coimbatore, India where at present he is the
Dean of Electrical Sciences. His research interests include biomedical signal processing, medical image processing, and
medical expert systems.
C. Sathish Kumar received his B.Tech. degree in Electronics and Communication Engineering from University of Kerala,
India and M.Tech. degree in Electrical Engineering from I.I.T. Bombay, India in 1988 and 1996, respectively. At present
he is working on biomedical signal processing for his Ph.D. degree at PSG College of Technology, Coimbatore, India. His
areas of interest include biomedical signal processing, modelling of physiological systems, and applications of soft computing
in biomedical engineering.
Rm. Pl. Ramanathan received his M.B.B.S. degree from PSG Institute of Medical Sciences and Research, Coimbatore,
India in 1991 and subsequently he completed the M.D. degree in Paediatrics in 1994. He received D.M. in Pulmonology
and Critical Care Medicine from Postgraduate Institute of Medical Education and Research, Chandigarh, India in 1999.
At present he is with PSG Institute of Medical Sciences and Research where he is an Associate Professor in charge
of pulmonology and respiratory intensive care unit. His other areas of research include pulmonary physiology and sleep
disordered breathing.
S. Jayaraman received his B.E. degree in Electronics and Communication Engineering and M.E. degree in Communication
Systems both from Madras University, India in 1974 and 1976, respectively. He received his Ph.D. degree on stability
of multidimensional systems from Bharathiar University, India in 1993. His research interests include stability analysis
of multidimensional systems, efficient digital filter design, signal analysis based on independent component analysis and
principal component analysis, and non-linear signal processing. Currently he is working as Professor of Electronics and
Communication Engineering at PSG College of Technology, Coimbatore, India.
N. Malmurugan received his B.E. and M.Tech. degrees in Electronics and Communication Engineering from Madurai
Kamaraj University and Pondicherry University, India in 1987 and 1990, respectively. He is currently working for his
Ph.D. degree on data compression and wavelet analysis. His research interests include wavelet-based signal analysis,
multimedia signal compression and VLSI signal processing. At present, he is working as Assistant Professor of Electronics
and Communication Engineering at PSG College of Technology, Coimbatore, India.