Figure 1 - uploaded by Puspa Inayat Khalid
Procedure of 10-fold cross validation.

Source publication
Article
Full-text available
A support vector machine (SVM) was successfully applied to classification in this paper. The paper first discusses the basic principle of the SVM; SVM classifiers with a polynomial kernel and a Gaussian radial basis function kernel are then chosen to identify pupils who have difficulties in writing. The 10-fold cross-validation method...

Context in source publication

Context 1
... this project, we used 10-fold cross validation (k = 10), as it is the most commonly used setting in data mining and machine learning. As shown in Figure 1, the darker sections of the data are used for training, while the remaining, lighter sections are used to validate the model. This process is repeated 10 times until all sections have been validated. ...
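The procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `k_fold_indices`, the shuffling step, the fixed seed, and the sample count of 50 are all assumptions made for the example. Each of the 10 rounds holds out one fold for validation and trains on the other nine, so every sample is validated exactly once.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, val_idx) index lists for k-fold cross validation.

    Hypothetical helper for illustration; shuffling and the seed are assumptions.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread samples as evenly as possible: the first (n_samples % k) folds
    # receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]                  # held-out fold
        train_idx = indices[:start] + indices[start + size:]   # remaining k - 1 folds
        yield train_idx, val_idx
        start += size


# 50 hypothetical samples split into 10 folds of 5 samples each.
folds = list(k_fold_indices(n_samples=50, k=10))
for round_no, (train_idx, val_idx) in enumerate(folds, start=1):
    print(f"round {round_no}: train on {len(train_idx)}, validate on {len(val_idx)}")
```

In practice a model would be fitted on `train_idx` and scored on `val_idx` in each round, and the 10 scores averaged.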

Similar publications

Article
Full-text available
Early detection of pancreatic cancer is difficult, and thus many cases of pancreatic cancer are diagnosed late. When pancreatic cancer is detected, the cancer is usually well developed. Machine learning is an approach that is part of artificial intelligence and can detect pancreatic cancer early. This paper proposes a machine learning approach with...
Article
Full-text available
The Support Vector Machine (SVM) classifier is a popular classification method. However, most users may not select the tuning parameters well because this step is time consuming. In practice, the tuning parameters are chosen by evaluating parameter candidates via cross validation. It is shown that the performance of SVM is sensitive to the values...
Article
Full-text available
In this paper, to combine the advantages of both the polynomial kernel and Mahalanobis distance metric learning (DML) methods, we propose a Mahalanobis DML based polynomial kernel for the classification of hyperspectral images. To keep the method computationally efficient, we adapt a fast iterative method to learn the Mahalanobis matrix. Simulation expe...
Article
Full-text available
In this study, we propose a new approach that can be used as a kernel-like function for support vector machines (SVMs) in order to get nonlinear classification surfaces. We combined polyhedral conic functions (PCFs) with the SVM method. To get nonlinear classification surfaces, kernel functions are used with SVMs. However, the parameter selection o...
Article
Full-text available
Sports talent recognition is one of the intensively discussed topics in this day and age. Cricket is a sport of keen interest and has fascinated researchers all over the world to ponder and work in this domain. In this era of technological competence, incorporating technology in cricket talent identification is an incumbent task. Also, early-age ta...

Citations

... SVM additionally proves effective on data with large dimensions [13]. SVM has also been used satisfactorily in a variety of applications, including COVID-19 forecasting [14], identifying leaf diseases [15] [16], forecasting stock prices [17], classifying handwriting [18], identifying banking fraud [19], identifying credit card fraud [20], face recognition [21], and numerous others. Based on the results of the previous research, this research will develop the SVM method to predict health insurance claims. ...
Article
Full-text available
The health insurance industry is much needed by the community for handling financial risks in the health sector. The number of claims greatly affects profits and the sustainability of the health insurance industry. It is therefore important for an insurance firm to predict the claims filed by insurance users from year to year. The Machine Learning (ML) method promises to be a good solution for predicting health insurance claims compared to conventional data analytics methods. Support Vector Machine (SVM) is one of the superior ML approaches. Nonetheless, SVM performance depends on a suitable selection of SVM parameters. The SVM parameters are typically selected by trial and error, which sometimes results in suboptimal performance and takes a long time to complete. Swarm intelligence-based algorithms can be used to select the best parameters for the SVM. This approach is capable of locating the global best solution, is simple to implement, and doesn't involve derivatives. One of the best swarm intelligence algorithms is the Bat Algorithm (BA). BA has a faster convergence rate than other algorithms, for example Particle Swarm Optimization (PSO). On this basis, this paper offers a new classification model for predicting health insurance claims based on SVM and BA. The metrics used for evaluation are accuracy, recall, precision, F1-score, and computing time. The experimental outcomes show that the proposed approach is superior to the conventional SVM and the hybrid of SVM and PSO in forecasting health insurance claims. In addition, the proposed method has a substantially shorter computing time than the hybrid of SVM and PSO. The experiments also indicate that the new classification model based on SVM and BA can avoid over-fitting.
... SVM also works well on high-dimensional data [7]. SVM has been implemented successfully in many applications, such as COVID-19 prediction [8], leaf disease detection [9], [10], stock price prediction [11], handwriting classification [12], bank fraud detection [13], credit card fraud detection [14], face recognition [15], and many other applications. However, the SVM performance is determined by the parameters and the features [16]. ...
Article
Full-text available
The number of claims plays an important role in the profit of health insurance companies. Predicting the number of claims could have significant implications for the profit margins of a health insurance company. Insurance companies therefore need to predict the claims that insurance users will submit in a given year. Machine learning methods promise a good solution for claim prediction for health insurance users. Several machine learning methods can be used for claim prediction, such as the Naïve Bayes method, Decision Tree (DT), Artificial Neural Networks (ANN) and Support Vector Machine (SVM). Previous studies show that the SVM has some advantages over the other methods. However, the performance of the SVM is determined by some parameters. Parameter selection for the SVM is normally done by trial and error, so the performance is less than optimal. Heuristic optimization algorithms can be used to determine the best parameter values of the SVM, for example Particle Swarm Optimization (PSO) and Genetic Algorithm (GA). They are able to search for the global optimum, are easy to implement, and do not need derivatives in their computation. Several studies show that PSO gives better solutions than GA. All particles in PSO are able to find a solution near the global optimum. For these reasons, this article proposes health insurance claim prediction using SVM with PSO. The experimental results show that SVM with PSO performs well in health insurance claim prediction and that it performs better than the standard SVM.
... Bouadjenek et al. [17] introduced two gradient features to classify the writer's age, gender and handedness: the histogram of oriented gradients and gradient local binary patterns. They used the Support Vector Machine (SVM) [18] method to classify the documents. The IAM and Khatt datasets were used to evaluate the system. ...
Article
Handwriting analysis is the science of determining an individual’s personality from his or her handwriting by assessing features such as slant, pen pressure, word spacing, and other factors. Handwriting analysis has a wide range of uses and applications, including dating and socialising, roommates and landlords, business and professional, employee hiring, and human resources. This study used the ResNet and GoogleNet CNN architectures as fixed feature extractors from handwriting samples. SVM was used to classify the writer’s gender and age based on the extracted features. We built an Arabic dataset named FSHS to analyse and test the proposed system. In the gender detection system, applying the automatic feature extraction method to the FSHS dataset produced accuracy rates of 84.9% and 82.2% using ResNet and GoogleNet, respectively. The age detection system using the automatic feature extraction method achieved accuracy rates of 69.7% and 61.1% using ResNet and GoogleNet, respectively.
... Support Vector Machine (SVM) is a discriminative classifier used for both regression and classification. SVM has been used for the recognition of handwritten digits [13], speaker identification [14], face detection in images [15] and so on. ...
Conference Paper
Full-text available
This paper aims to recognize emotions from speech on a more realistic database using various classifiers. For this purpose, experiments are conducted using the standard 6373 dimensional Computational Paralinguistic Challenge (ComParE) feature set. The features extracted are modeled using Support Vector Machine (SVM) and Deep Neural Network (DNN) classifiers. The effectiveness of the proposed system has been validated on the Emotional Sensitivity Assistance System for People with Disabilities (EmotAsS) database, provided as part of the INTERSPEECH 2018 Computational Paralinguistics Challenge. Besides, experiments have also been performed on a reduced subset of the standard ComParE acoustic feature set consisting of 873 prosodic features. Experimental results suggest that the reduced prosodic feature set provides comparable performance with the original feature set. It is also observed that DNN classifier provides better performance than SVM.
... The data set is assigned to the class that generates the maximum conditional posterior probability, with the available attributes as input, using Bayes rule [15]. The K-fold cross validation method is implemented to split the features extracted in the proposed model into training and testing sets [16]. This method has the advantage that it makes full use of the limited sample dataset for classification, so as to evaluate the performance of the proposed feature set for glioma grade identification. ...
Article
Gliomas are the most common brain tumors in children and adults worldwide and account for 80% of all malignant tumors. In this work, we propose a novel method for glioma grade classification using a texture feature set extracted from T2-weighted magnetic resonance images (MRI). Gray-level co-occurrence matrix (GLCM) parameters are computed from local Optimal Oriented Pattern (LOOP) transformed images to differentiate low grade and high grade glioma. Classification is carried out using support vector machine (SVM), Naive Bayes and k-nearest neighbor (k-NN) classifiers, and their performance for glioma grade classification is assessed. The SVM classifier outperforms the other classifiers and achieves an accuracy of 95%, sensitivity of 93% and specificity of 100% for classifying gliomas using the proposed LOOP transform based GLCM texture features.
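For readers unfamiliar with the three figures quoted above, the sketch below shows how accuracy, sensitivity and specificity are conventionally derived from the four cells of a binary confusion matrix. The function name `binary_metrics` and the counts are hypothetical and chosen only for illustration; they are not taken from the glioma study.

```python
def binary_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: positive cases classified correctly/incorrectly.
    tn/fp: negative cases classified correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,   # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),   # share of positive cases detected
        "specificity": tn / (tn + fp),   # share of negative cases correctly rejected
    }


# Hypothetical counts (not from the paper): 20 positive and 20 negative cases.
metrics = binary_metrics(tp=18, fn=2, tn=19, fp=1)
print(metrics)
```

A specificity of 100%, as reported above, corresponds to `fp == 0`, that is, no negative case was labelled positive.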
... Cross-validation is a popular and widely used model validation technique [19][20][21][22] for assessing how the results of a machine learning analysis will generalize to an independent dataset. It is mainly used in settings where the goal is prediction, and one wants to estimate how accurately a predictive model will perform in practice. ...
Article
Full-text available
Upper body power (UBP) is an important determinant of cross-country ski race performance. Although numerous studies measure the UBP of cross-country skiers, to date no study has attempted to predict it. The purpose of this paper was to develop prediction models for estimating 10-s UBP (UBP10) and 60-s UBP (UBP60) of cross-country skiers using support vector machines (SVM). Four types of SVMs have been considered: SVM using the radial basis function kernel (SVM-RBF), SVM using the sigmoid kernel, SVM using the polynomial kernel, and SVM using the linear kernel. For comparison purposes, UBP prediction models based on multilayer perceptron and multiple linear regression have also been developed. The dataset used in this study includes data of 77 subjects. Age, gender, height, weight, body mass index, maximal heart rate, maximal oxygen uptake, and exercise time are the predictor variables, and UBP10 and UBP60 are the target variables. Several UBP prediction models have been developed by using combinations of the predictor variables to predict UBP10 and UBP60. By using 10-fold cross-validation on the datasets, the performance of the models has been evaluated by calculating their standard errors of estimate (SEEs) and multiple correlation coefficients (Rs). The results show that SVM-RBF based UBP prediction models perform much better (i.e., yield lower SEEs and higher Rs) than the prediction models developed by other regression methods and can be safely used for the prediction of UBP of cross-country skiers.
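The two evaluation measures named above can be illustrated as follows. This is a sketch under stated assumptions, since the abstract does not give the formulas: SEE is taken here as the square root of the mean squared residual (some texts divide by n - p - 1 instead of n), and R is taken as the Pearson correlation between observed and predicted values. The function name `see_and_r` and the data are hypothetical.

```python
import math


def see_and_r(observed, predicted):
    """Return (SEE, R) for paired observed and predicted values.

    Assumptions for this sketch: SEE = sqrt(sum of squared residuals / n),
    R = Pearson correlation between observed and predicted values.
    """
    n = len(observed)
    sse = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    see = math.sqrt(sse / n)
    mean_y = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((y - mean_y) * (p - mean_p) for y, p in zip(observed, predicted))
    var_y = sum((y - mean_y) ** 2 for y in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_y * var_p)
    return see, r


# Hypothetical data: every prediction is 0.5 too high, so the ordering is
# perfect (R close to 1) while the error is a constant 0.5.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]
see, r = see_and_r(observed, predicted)
print(see, r)
```

The example also shows why both measures are reported together: a constant bias leaves R unchanged but is visible in the SEE.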
Chapter
Age detection from handwritten documents is a crucial research area in many disciplines such as forensic analysis and medical diagnosis. Furthermore, this task is challenging due to the high similarity and overlap between individuals’ handwriting. The performance of document recognition and analysis systems depends on the features extracted from handwritten documents, which can be a challenging task as it depends on extracting the most relevant information from raw text. In this paper, a set of age-related features suggested by a graphologist, to detect the age of the writers, has been proposed. These features include irregularity in slant, irregularity in pen pressure, irregularity in text lines, and the percentage of black and white pixels. A Support Vector Machine (SVM) classifier has been used to train, validate and test the proposed approach on two different datasets: the FSHS and the Khatt dataset. The proposed method has achieved a classification rate of 71% when applied to the FSHS dataset. Meanwhile, our method outperformed state-of-the-art methods when applied to the Khatt dataset with a classification rate of 65.2%. Currently, these are the best rates in this field.
Keywords: Age detection, Machine learning, Image processing, Handwriting analysis
Article
Gender detection from handwritten documents is a crucial research area in many disciplines such as psychology, paleography, graphology, and forensic analysis. Furthermore, this task is challenging due to the high similarity and overlap between individuals’ handwriting. The performance of document recognition and analysis systems depends on the features extracted from handwritten documents, which can be a challenging task as it depends on extracting the most relevant information from raw text. In this paper, a set of gender-related features suggested by a graphologist, to detect the gender of the writers, has been proposed. These features include margins, space between words, pen pressure and handwriting irregularity. Both SVM and ANN classifiers have been used to train, validate and test the proposed approach on two different datasets: our FSHS dataset and the ICDAR2013 dataset. The proposed method has achieved high classification rates of 94.7% and 97.1% using SVM and ANN respectively. Meanwhile, our method outperformed state-of-the-art methods when applied to the ICDAR2013 dataset with classification rates of 91.4% and 92.5% using SVM and ANN respectively.