Training and Analysis of Support Vector Machine
using Sequential Minimal Optimization
S.Shahbudin1, A. Hussain2, S. A. Samad3
Electrical, Electronics & System Engineering Department
Faculty of Engineering and Built Environment,
National University of Malaysia
Bangi, Selangor Darul Ehsan, Malaysia
shaqay@yahoo.com1, aini@vlsi.eng.ukm.my2
N. Md Tahir4
Faculty of Electrical Engineering
Universiti Teknologi MARA,
Shah Alam, Selangor Darul Ehsan, Malaysia
Abstract— Maximizing the classification performance on the training data is a typical procedure in training a classifier. It is well known that training a Support Vector Machine (SVM) requires the solution of an enormous quadratic programming (QP) optimization problem, and this training burden poses serious challenges that can be addressed using Sequential Minimal Optimization (SMO). This paper investigates the performance of the SMO solver in terms of CPU time, number of support vectors, and decision boundaries when applied to 2-dimensional datasets. The chunking algorithm is employed for comparison purposes. Initial results demonstrate that the SMO algorithm enhances the performance of training, and both algorithms yield similar decision-boundary patterns. The classification rates achieved by both solvers are excellent.
Keywords— Sequential Minimal Optimization, Chunking
algorithm, decision boundaries, support vector machine.
I. INTRODUCTION
Support Vector Machine (SVM) is an eminent technique for solving classification problems. The goal of SVM is to determine a classifier or regression machine that minimizes both the empirical risk, namely the training set error, and the confidence interval, which corresponds to the generalization or test set error [1]-[2].
To obtain an SVM classifier with the best generalization performance, appropriate training is required. Training an SVM entails the solution of a very large quadratic programming (QP) optimization problem. Sequential Minimal Optimization (SMO), with its fixed working-set size, is among the most popular decomposition methods for training, even on very large datasets. Several studies [3]-[7] have shown that SMO performs exceptionally well when training on large N-dimensional data. For example, in [3], SMO is applied to train SVMs for classifying large datasets such as the UCI "adult" dataset, text categorization data, and sparse datasets. The results indicated that the SMO algorithm scales better for both linear and nonlinear SVMs with the Gaussian RBF kernel, and that it also performs extremely well on sparse datasets, even for nonlinear SVMs. In [7], various sample datasets such as the ionosphere, breast cancer, and adult datasets were tested, and the results showed that the SMO-based algorithm is significantly more efficient than other methods available in optimization toolboxes.
However, in the training of 2-dimensional (2D) data, the performance of the SMO algorithm is rarely visualized, analyzed, and studied. Thus, the purpose of this paper is to explore SMO's capability in training 2D data as compared to the chunking algorithm, along with visualization of the decision boundaries of both algorithms. In this study, to analyze the performance of the SMO algorithm, parameters such as the CPU time, the number of support vectors, and the shape of the decision boundaries are examined.
A description of SVM is detailed in Section 2. In Section 3, several previous SVM training algorithms are explained. Experimental results of both algorithms are presented in Section 4. Finally, in Section 5 we conclude our findings.
II. OVERVIEW OF SVM
In general, a Support Vector Machine (SVM) is a learning machine for two-class classification problems. It is given a labeled training dataset $(x_1, y_1), \ldots, (x_l, y_l)$, where $x_i \in \mathbb{R}^N$ is a feature vector and $y_i \in \{-1, 1\}$ is a class label.
The algorithm seeks a decision surface that gives the largest margin of separation between the data classes while at the same time minimizing the number of errors. This decision surface is not constructed in the input space, however, but rather in a very high-dimensional feature space. The resulting model is nonlinear, which is accomplished through the use of kernel functions. The kernel function $K$ provides a measure of similarity between a pattern $x_i$ and a pattern $x_j$ from the stored training set. Using the kernel, the dual QP problem in terms of the Lagrange multipliers $\alpha_i$ in the feature space is given in equation (1): maximize
$$ W(\alpha) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{l} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \qquad (1) $$
subject to the constraints
$$ \sum_{i=1}^{l} \alpha_i y_i = 0, \qquad 0 \le \alpha_i \le C \qquad (2) $$
where i=1,…,l.
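For concreteness, a minimal numerical sketch of the objective (1) and the feasibility conditions (2) is given below in Python/NumPy; the function names are our own illustrations, not part of any toolbox used in this paper.

```python
import numpy as np

def rbf_kernel(X1, X2, sigma=1.0):
    # Gaussian RBF kernel K(x_i, x_j) = exp(-||x_i - x_j||^2 / (2*sigma^2))
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def dual_objective(alpha, X, y, sigma=1.0):
    # W(alpha) = sum_i alpha_i - 1/2 sum_{i,j} alpha_i alpha_j y_i y_j K(x_i, x_j)
    K = rbf_kernel(X, X, sigma)
    return alpha.sum() - 0.5 * (alpha * y) @ K @ (alpha * y)

def is_feasible(alpha, y, C, tol=1e-8):
    # Constraints (2): sum_i alpha_i y_i = 0 and 0 <= alpha_i <= C
    return (abs(np.dot(alpha, y)) < tol
            and np.all(alpha >= -tol) and np.all(alpha <= C + tol))
```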
After finding the optimal values of $\alpha_i$, the decision boundary to be constructed is of the form
$$ f(x) = \sum_{i=1}^{l} \alpha_i y_i K(x_i, x) + b \qquad (3) $$
where the class of $x$ is determined from the sign of $f(x)$. Those $x_i$ whose corresponding $\alpha_i \neq 0$ are called support vectors. The value $b$ is the offset of the decision boundary from the origin. The regularization parameter $C$ is the margin parameter that determines the trade-off between maximizing the margin and minimizing the classification error, and it is chosen by means of a validation set [7].
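As a hedged sketch, reusing the rbf_kernel function from the previous snippet, the decision function (3) can be evaluated as follows:

```python
def decision_function(x, X_train, y_train, alpha, b, sigma=1.0):
    # f(x) = sum_i alpha_i * y_i * K(x_i, x) + b; class label is sign(f(x))
    k = rbf_kernel(X_train, x[None, :], sigma).ravel()
    return float(np.dot(alpha * y_train, k) + b)
```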
One of the attractive features of SVM classification is the sparse representation of its decision boundary. According to [8], the position of the separating hyperplane in the feature space is determined via real-valued weights on the training examples. Training examples that are situated far away from the hyperplane do not participate in its specification and thus receive weights of zero; only the training examples that lie close to the decision boundary between the two classes receive nonzero weights. These training examples are called the support vectors, since removing them would change the location of the separating hyperplane. As an example, Fig. 1 illustrates the support vectors in a two-dimensional feature space. Typically, the SVM learning algorithm is such that the number of support vectors is small compared to the total number of training examples, allowing the SVM to classify new examples efficiently, since the majority of the training examples can be safely ignored.
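This sparsity can be exploited directly at prediction time. The fragment below, a small illustration continuing the earlier sketch, keeps only the nonzero-weight examples:

```python
def support_vector_indices(alpha, tol=1e-6):
    # Only training points with alpha_i > 0 (the support vectors) influence
    # the decision boundary; all others can be dropped without changing f(x).
    return np.flatnonzero(alpha > tol)

# Example: prediction restricted to the (typically small) support set.
# idx = support_vector_indices(alpha)
# f_x = decision_function(x, X_train[idx], y_train[idx], alpha[idx], b)
```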
III. PREVIOUS SVM TRAINING METHODS
The first training of an SVM on small datasets was introduced by Vapnik [9], using a constrained conjugate gradient algorithm. Conjugate gradient ascent starts with an initial estimate of the solution, denoted by $\alpha^0$, and then updates the vector iteratively along the steepest ascent path, that is, moving in the direction of the gradient of $W(\alpha)$ evaluated at the position $\alpha^t$ for update $t+1$. At each iteration, the direction of the update is determined by the steepest ascent strategy, while the step length is kept fixed. In this method, every time an $\alpha_i$ reaches zero, the corresponding data point is eliminated and the process is restarted.
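A minimal sketch of such a fixed-step ascent on the dual is given below; for brevity it ignores the equality constraint and the upper bound C, which the full method must also handle, and it reuses rbf_kernel from the earlier snippet:

```python
def steepest_ascent(X, y, sigma=1.0, step=0.01, iters=1000):
    # Fixed-step steepest ascent on W(alpha); whenever alpha_i falls to zero
    # it is clamped there, mimicking the elimination of that data point.
    K = rbf_kernel(X, X, sigma)
    H = (y[:, None] * y[None, :]) * K      # H_ij = y_i * y_j * K(x_i, x_j)
    alpha = np.zeros(len(y))
    for _ in range(iters):
        grad = 1.0 - H @ alpha             # gradient of W(alpha) = 1 - H.alpha
        alpha = np.maximum(alpha + step * grad, 0.0)
    return alpha
```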
In this work, the decomposition or working-set method, namely SMO, which is known to be an excellent method for training on large datasets, is investigated for its capability in dealing with 2D data. SMO's training capability is explored and compared to another training method, the chunking algorithm.
A. Chunking
The chunking algorithm was proposed by Vapnik [10]. It starts with an arbitrary subset, or 'chunk', of the data and trains an SVM on it. The support vectors are then retained in the chunk, while the other points are discarded and replaced by a new working set containing the gross violators of the KKT (Karush-Kuhn-Tucker) conditions. In the final iteration the entire set of non-zero Lagrange multipliers is extracted, and the algorithm thereby solves the QP problem.
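A hedged sketch of this loop follows. The sub-problem solver qp_solver is a hypothetical placeholder (any dense QP routine would do), while the KKT check is written out for the L1 soft-margin dual:

```python
def kkt_violators(K, y, alpha, b, C, tol=1e-3):
    # Gross violators of the KKT conditions for the L1 soft-margin dual:
    #   alpha_i = 0      requires y_i * f(x_i) >= 1
    #   0 < alpha_i < C  requires y_i * f(x_i) == 1
    #   alpha_i = C      requires y_i * f(x_i) <= 1
    m = y * (K @ (alpha * y) + b)          # margins y_i * f(x_i)
    bad = (((alpha <= 0) & (m < 1 - tol)) |
           ((alpha >= C) & (m > 1 + tol)) |
           ((alpha > 0) & (alpha < C) & (np.abs(m - 1) > tol)))
    return np.flatnonzero(bad)

def chunking(K, y, C, qp_solver, b=0.0, chunk_size=50, max_iter=100):
    # qp_solver(K_sub, y_sub, C) is a hypothetical routine returning the
    # optimal alphas of the sub-problem restricted to the working chunk.
    n = len(y)
    work = np.arange(min(chunk_size, n))   # start from an arbitrary chunk
    alpha = np.zeros(n)
    for _ in range(max_iter):
        alpha[work] = qp_solver(K[np.ix_(work, work)], y[work], C)
        sv = work[alpha[work] > 1e-8]      # support vectors stay in the chunk
        viol = np.setdiff1d(kkt_violators(K, y, alpha, b, C), sv)
        if viol.size == 0:
            break                          # all KKT conditions satisfied
        work = np.concatenate([sv, viol])[:chunk_size]
    return alpha
```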
B. Sequential Minimal Optimization (SMO)
Recently, SMO has been employed to rapidly train SVMs. The idea behind SMO is that the QP problem can be broken up into a series of the smallest possible QP sub-problems, each solved analytically by optimizing two $\alpha_i$ at each iteration while keeping the remaining $\alpha_i$ fixed. These two values can be computed easily and rapidly, which avoids large matrix computations. Details of SMO can be found in [3]-[4].
The main difference between the SMO method and the chunking algorithm is that SMO solves each QP sub-problem analytically without any extra matrix storage, whereas the chunking algorithm must solve the QP problem iteratively, which involves exhaustive numerical QP steps and thus requires considerably more memory [3].
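The heart of SMO is the analytic solution of this two-variable sub-problem. The sketch below shows one such pair update in the style of Platt [3]; the working-pair selection heuristics and the update of the threshold b are omitted for brevity:

```python
import numpy as np

def smo_pair_update(i, j, alpha, y, K, b, C):
    # Analytic two-variable QP step: optimize alpha_i and alpha_j jointly
    # while all other multipliers stay fixed.
    f = K @ (alpha * y) + b                # current outputs f(x_k)
    Ei, Ej = f[i] - y[i], f[j] - y[j]      # prediction errors
    if y[i] != y[j]:                       # box constraints on alpha_j
        L, H = max(0, alpha[j] - alpha[i]), min(C, C + alpha[j] - alpha[i])
    else:
        L, H = max(0, alpha[i] + alpha[j] - C), min(C, alpha[i] + alpha[j])
    eta = K[i, i] + K[j, j] - 2 * K[i, j]  # curvature along the update line
    if eta <= 0 or L == H:
        return alpha                       # skip degenerate pairs in this sketch
    aj = np.clip(alpha[j] + y[j] * (Ei - Ej) / eta, L, H)
    ai = alpha[i] + y[i] * y[j] * (alpha[j] - aj)  # keeps sum_i y_i alpha_i fixed
    alpha[i], alpha[j] = ai, aj
    return alpha
```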
IV. EXPERIMENTAL RESULTS
The SMO solver is implemented to train a binary SVM classifier with an L1 soft margin. Following [12], for each solver the convergence tolerance is $\varepsilon = 0.001$, the kernel argument is set to 1, and the Gaussian radial basis function (RBF) kernel $K(x_i, x_j) = \exp(-\|x_i - x_j\|^2 / (2\sigma^2))$ is used. The CPU times of both algorithms were measured on a 3.0 GHz Pentium computer with 768 MB of RAM. Both the SMO and chunking solvers were implemented using the Statistical Pattern Recognition Toolbox [12] under Matlab 7.0.
A total of 190 2-D data points acquired from [11] are divided equally into training and testing sets. These values represent the feature vectors of the second and fourth eigenpostures, generated using the PCA technique. Both the SMO and chunking solvers are used. The system is trained on the training data and its performance measured on the test data. The trained SVM classifier is evaluated on the 2D data using various values of the regularization parameter C. An example of the decision boundary attained with C=10 is shown in Fig. 2.
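The measurements reported here were made with the Matlab toolbox above; an analogous experiment can be sketched in Python with scikit-learn, whose SVC classifier is backed by LIBSVM's SMO-type solver. The synthetic 2D data below merely stands in for the eigenposture features of [11], which are not reproduced here:

```python
import time
import numpy as np
from sklearn.svm import SVC

# Illustrative stand-in for the 190-point 2D eigenposture data.
rng = np.random.default_rng(0)
X = rng.normal(size=(190, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)
X_train, y_train, X_test, y_test = X[:95], y[:95], X[95:], y[95:]

clf = SVC(kernel="rbf", C=10, gamma=1 / (2 * 1.0 ** 2))  # sigma = 1
t0 = time.process_time()
clf.fit(X_train, y_train)
print("CPU time (s):     ", time.process_time() - t0)
print("support vectors:  ", int(clf.n_support_.sum()))
print("test accuracy (%):", 100 * clf.score(X_test, y_test))
```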
Figure 1. A 2D feature space with a separating hyperplane for a nonlinear boundary. The classification boundary and the accompanying soft margins are represented by a solid line and dotted lines, respectively, with positive and negative examples falling on opposite sides of the decision boundary. The circled points are the support vectors that lie closest to the decision boundary.
A similar decision boundary is obtained using the chunking solver, as illustrated in Fig. 3.
From both figures, it is observed that the patterns of the decision boundaries are alike. However, there are differences in the CPU time, as tabulated in Table I. The CPU time of the chunking algorithm is 1.6853 seconds, whilst the SMO algorithm recorded a CPU time of only 0.1288 seconds. The numbers of support vectors for the two solvers are almost equal, namely 54 for the chunking solver and 52 for the SMO solver. The same goes for the classification rates, with both solvers achieving accuracies of about 91% (Table I).
Further, the decision boundaries of the SMO and chunking solvers with regularization parameter C=100 are depicted in Fig. 4 and Fig. 5, respectively. It is again observed that the two decision boundaries are almost identical, with an equal number of support vectors (30). Again, the classification rates of the two solvers are similar, specifically 93.75% for SMO and 93.47% for the chunking solver. As before, the CPU time for SMO is shorter than for the chunking solver. Further results for various values of C are tabulated in Table I.
TABLE I. PERFORMANCE COMPARISON OF SMO AND CHUNKING SOLVERS FROM C=10 TO C=500 (EIGENPOSTURES DATASET)

Training     Regularization   Number of         CPU Time   Classification
algorithm    parameter C      support vectors   (s)        accuracy (%)
------------------------------------------------------------------------
SMO solver         10               52           0.1288        91.67
                   50               38           0.2591        92.78
                  100               30           0.4097        93.75
                  500               24           1.6363        93.75
Chunking           10               54           1.6853        91.51
solver             50               37           1.3853        92.65
                  100               30           1.7758        93.47
                  500               25           2.4234        96.76
Figure 2. Sample of 2D eigenpostures dataset using SMO solver (C=10)
Figure 3. Sample of 2D eigenpostures dataset using chunking solver (C=10)
Figure 4. Sample of 2D eigenpostures dataset using SMO solver (C=100)
Figure 5. Sample of 2D eigenpostures dataset using chunking solver (C=100)
From the results, the decision boundaries of the two solvers depict similar patterns but record different CPU times. The CPU time of the SMO solver is shorter than that of the chunking solver irrespective of the value of C. It is also observed that the regularization parameter C is inversely related to the number of support vectors: as its value increases, the number of support vectors decreases. Additionally, even though the CPU times of the two solvers differ, their classification error rates are essentially unaffected.
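Continuing the earlier scikit-learn sketch, the inverse relationship between C and the number of support vectors can be checked with a small sweep (the exact counts will, of course, depend on the data):

```python
for C in (10, 50, 100, 500):
    clf = SVC(kernel="rbf", C=C, gamma=0.5).fit(X_train, y_train)
    print(f"C={C:4d}  SVs={int(clf.n_support_.sum()):3d}  "
          f"acc={100 * clf.score(X_test, y_test):.2f}%")
```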
Next, the Iris and Ripley datasets, acquired from the software [12] and the UCI Machine Learning Repository respectively, are used to verify these findings. Both datasets are in 2D form. The Ripley dataset [12]-[13] comprises 250 training patterns and 1000 testing patterns. The Iris dataset consists of 120 patterns, of which 60 are used for training and the remainder for testing. The trained SVM classifier is evaluated on both datasets using various values of the regularization parameter, from C=10 to C=500. The results are summarized in Table II and Table III, respectively.
TABLE II. PERFORMANCE COMPARISON OF SMO AND CHUNKING SOLVERS FROM C=10 TO C=500 (IRIS DATASET)

Training     Regularization   Number of         CPU Time   Classification
algorithm    parameter C      support vectors   (s)        accuracy (%)
------------------------------------------------------------------------
SMO solver         10               12           0.0556        95.00
                   50               10           0.0563        95.00
                  100                9           0.0420        98.33
                  500                8           0.0161        98.33
Chunking           10               12           0.3042        95.00
solver             50                9           0.7916        96.67
                  100                9           0.4533        98.33
                  500                8           0.2266        98.33
TABLE III. PERFORMANCE COMPARISON OF SMO AND CHUNKING SOLVERS FROM C=10 TO C=500 (RIPLEY DATASET)

Training     Regularization   Number of         CPU Time   Classification
algorithm    parameter C      support vectors   (s)        accuracy (%)
------------------------------------------------------------------------
SMO solver         10               94           0.3469        85.60
                   50               85           0.4931        87.60
                  100               83           0.5743        87.60
                  500               77           2.9142        89.20
Chunking           10               94           0.7706        82.60
solver             50               85           0.9766        84.35
                  100               83           1.4559        84.40
                  500               76           3.0644        89.45
From Tables II and III, it is observed for both datasets that SMO is faster than the chunking solver in terms of the CPU time attained. In addition, the numbers of support vectors obtained by the two solvers are equivalent on both the Ripley and Iris datasets, and the number of support vectors decreases as C increases. For both solvers, excellent classification rates are obtained on the Iris and Ripley datasets across the various values of C. It is also observed that the two solvers generate similar decision boundaries, as depicted in Fig. 6(a)-(d) and Fig. 7(a)-(d) for the different values of C applied. Fig. 6 depicts the results for the Iris dataset, whereas Fig. 7 depicts the results for the Ripley dataset.
V. CONCLUSIONS
In conclusion, the SMO algorithm provides the better training method for SVMs on 2D data based on the CPU times attained. From the visualized decision boundaries, both solvers exhibit similar patterns for each value of C applied, along with similar and excellent classification rates. Furthermore, the numbers of support vectors obtained by the two solvers are nearly equal across the different values of C. These initial results demonstrate that SMO is an efficient approach for training SVMs, even on 2D data samples.
REFERENCES
[1] V. N. Vapnik, The Nature of Statistical Learning Theory, Springer, New York, 1995.
[2] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, Cambridge University Press, New York, 2000.
[3] J. Platt, "Fast training of support vector machines using sequential minimal optimization," in Advances in Kernel Methods - Support Vector Learning, B. Schölkopf, C. J. C. Burges, and A. J. Smola, editors, pages 185-208, MIT Press, Cambridge, MA, 1999.
[4] G. Mak, "The implementation of support vector machines using the sequential minimal optimization algorithm," Master's thesis, 2000.
[5] E. Osuna, R. Freund, and F. Girosi, "Support vector machines: training and applications," A.I. Memo AIM-1602, MIT A.I. Lab, 1996.
[6] F. R. Bach, G. R. G. Lanckriet, and M. I. Jordan, "Fast kernel learning using sequential minimal optimization," Technical Report UCB/CSD-04-1307, February 2004.
[7] S. Abe, Support Vector Machines for Pattern Classification, Advances in Pattern Recognition, Springer, 2005.
[8] M. P. S. Brown, W. N. Grundy, D. Lin, N. Cristianini, C. Sugnet, M. Ares, Jr., and D. Haussler, "Support vector machine classification of microarray gene expression data," Technical Report UCSC-CRL-99-09, June 12, 1999.
[9] C. Burges and V. Vapnik, "A new method for constructing artificial neural networks," Technical report, AT&T Bell Laboratories, May 1995.
[10] V. Vapnik, Estimation of Dependences Based on Empirical Data, Springer-Verlag, 1982.
[11] N. Md Tahir, A. Hussain, S. A. Samad, H. Husain, and M. M. Mustafa, "Eigenposture for classification," Journal of Applied Sciences, Asian Network for Scientific Information (ANSINET), 6(2), 2006.
[12] Statistical Pattern Recognition Toolbox, http://cmp.felk.cvut.cz/cmp/software/stprtool/index.html
[13] B. D. Ripley, "Neural networks and related methods for classification," J. Royal Statistical Society, Series B, 56:409-456, 1994.
Figure 6. Results obtained using the Iris dataset: (a) C=10, SMO solver; (b) C=10, chunking solver; (c) C=100, SMO solver; (d) C=100, chunking solver. (Axes: x1 versus x2.)
Figure 7. Results obtained using the Ripley dataset: (a) C=10, SMO solver; (b) C=10, chunking solver; (c) C=100, SMO solver; (d) C=100, chunking solver.