Fig 13 - uploaded by Asa Ben-Hur
4. The effect of the degree of a polynomial kernel. Higher degree polynomial kernels allow a more flexible decision boundary. The style follows that of 3.
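The caption's claim can be illustrated with a small pure-Python sketch. It is not taken from the source publication: it swaps the SVM for a kernel perceptron to avoid any library dependency, uses a four-point XOR-style toy set, and all helper names (`polynomial_kernel`, `decision_score`, `train_kernel_perceptron`, `training_accuracy`) are hypothetical. A degree-1 kernel yields only linear boundaries and cannot separate these points, whereas degree 2 can.

```python
def polynomial_kernel(x, z, degree, coef0=1.0):
    """Polynomial kernel k(x, z) = (<x, z> + coef0) ** degree."""
    dot = sum(xi * zi for xi, zi in zip(x, z))
    return (dot + coef0) ** degree


def decision_score(x, points, labels, alphas, degree):
    """Kernel expansion f(x) = sum_j alpha_j * y_j * k(x_j, x)."""
    return sum(a * y_j * polynomial_kernel(x_j, x, degree)
               for a, x_j, y_j in zip(alphas, points, labels))


def train_kernel_perceptron(points, labels, degree, epochs=100):
    """Kernel perceptron (a stand-in for an SVM; assumption of this sketch)."""
    alphas = [0] * len(points)
    for _ in range(epochs):
        mistakes = 0
        for i, (x, y) in enumerate(zip(points, labels)):
            # A point is a mistake if it is not strictly on its own side.
            if y * decision_score(x, points, labels, alphas, degree) <= 0:
                alphas[i] += 1
                mistakes += 1
        if mistakes == 0:  # converged: every training point is separated
            break
    return alphas


def training_accuracy(points, labels, alphas, degree):
    """Fraction of training points strictly on the correct side."""
    correct = sum(1 for x, y in zip(points, labels)
                  if y * decision_score(x, points, labels, alphas, degree) > 0)
    return correct / len(points)


# XOR-style toy data (hypothetical): the label is the sign of x1 * x2.
POINTS = [(-1, -1), (1, 1), (-1, 1), (1, -1)]
LABELS = [1, 1, -1, -1]

if __name__ == "__main__":
    for degree in (1, 2):
        alphas = train_kernel_perceptron(POINTS, LABELS, degree)
        print(degree, training_accuracy(POINTS, LABELS, alphas, degree))
```

The degree-2 kernel succeeds because its implicit feature space contains the product term x1 * x2, which a linear boundary in the original space cannot express.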


Source publication
Article
Full-text available
The Support Vector Machine (SVM) is a widely used classifier in bioinformatics. Obtaining the best results with SVMs requires an understanding of their workings and the various ways a user can influence their accuracy. We provide the user with a basic understanding of the theory behind SVMs and focus on their use in practice. We describe the effect...

Similar publications

Conference Paper
Full-text available
In recent work, it was shown that combining multi-kernel based support vector machines (SVMs) can lead to near state-of-the-art performance on an action recognition dataset (HMDB-51 dataset). This was 0.4% lower than frameworks that used hand-crafted features in addition to the deep convolutional feature extractors. In the present work, we show th...
Thesis
The negative impact of reduced visibility on driver performance has been recognized as one of the major causes of motor vehicle crashes. Proper assessment of real-time visibility condition is therefore crucial for safe driving, especially during adverse weather including fog. Although many studies have investigated various visibility detection meth...
Conference Paper
Full-text available
With the emergence of data streaming applications that produce large data in motion, anomaly detection in non-stationary environments has become a major research focus. Unknown and unstable behaviour of data over time limits the application of traditional anomaly detection methods that have been designed for stationary data. Moreover, basic assump...
Article
Full-text available
Detecting and monitoring of abnormal movement behaviors in patients with Parkinson’s Disease (PD) and individuals with Autism Spectrum Disorders (ASD) are beneficial for adjusting care and medical treatment in order to improve the patient’s quality of life. Supervised methods commonly used in the literature need annotation of data, which is a time-...
Article
Full-text available
Time series classification (TSC) arises in many fields and has a wide range of applications. Here, we adopt the bag-of-words (BoW) framework to classify time series. Our algorithm first samples local subsequences from time series at feature-point locations when available. It then builds local descriptors, and models their distribution by Gaussian m...

Citations

... Some popular kernel functions include polynomial kernel, Gaussian kernel (known as the radial basis function), and Sigmoid kernel. The Gaussian kernel usually outperforms the polynomial kernel in both accuracy and convergence time [8]. Gaussian kernel has interpolation ability and is effective at identifying local properties, and Sigmoid kernel is better suited for identifying global characteristics but has a relatively weak interpolation ability [11]. ...
... how to optimize hyperparameters in SVMs and kernels? Uninformed choices may result in severely reduced accuracy [8], [15]; however, there is little insight about the uninformed choices. Most approaches for SVM research and applications focused on tuning three main ratios (mixed kernel ratio, Sigmoid ratio, and Gaussian ratio) to find the best combination for the highest accuracy of the mixed-kernel SVMs. However, the selection of hyperparameters in the SVMs and the kernels may vary greatly for different applications and datasets, and choosing the optimal kernel is critical for a high classification accuracy of the mixed-kernel SVMs [11]. ...
... Nevertheless, little research has been reported in the SVM literature on both of these parameters. In [8], a smaller value of C (=10) allows points close to the boundary to be ignored and increases the margin, whereas a larger value of C (=100) decreases the margin. In [15], the performance impact of C with the values of 1 and 16 was discussed. ...
Preprint
Full-text available
Support Vector Machine (SVM) is a state-of-the-art classification method widely used in science and engineering due to its high accuracy, its ability to deal with high dimensional data, and its flexibility in modeling diverse sources of data. In this paper, we propose an autotuning-based optimization framework to quantify the ranges of hyperparameters in SVMs to identify their optimal choices, and apply the framework to two SVMs with the mixed-kernel between Sigmoid and Gaussian kernels for smart pixel datasets in high energy physics (HEP) and mixed-kernel heterojunction transistors (MKH). Our experimental results show that the optimal selection of hyperparameters in the SVMs and the kernels greatly varies for different applications and datasets, and choosing their optimal choices is critical for a high classification accuracy of the mixed kernel SVMs. Uninformed choices of hyperparameters C and coef0 in the mixed-kernel SVMs result in severely low accuracy, and the proposed framework effectively quantifies the proper ranges for the hyperparameters in the SVMs to identify their optimal choices to achieve the highest accuracy 94.6% for the HEP application and the highest average accuracy 97.2% with far less tuning time for the MKH application.
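The trade-off that the excerpts above attribute to C can be sketched with the standard soft-margin objective, 0.5 * ||w||^2 + C * (sum of hinge losses). The example below is a minimal illustration, not the cited authors' framework: the one-dimensional data set, the two candidate classifiers `WIDE` and `NARROW`, and the C values 0.1 and 10 are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def hinge_loss(w, b, x, y):
    """Hinge loss max(0, 1 - y * (w * x + b)) for one labelled point."""
    return max(0.0, 1.0 - y * (w * x + b))


def soft_margin_objective(w, b, data, C):
    """Soft-margin SVM objective: 0.5 * w^2 + C * sum of hinge losses."""
    penalty = sum(hinge_loss(w, b, x, y) for x, y in data)
    return 0.5 * w * w + C * penalty


def margin_width(w):
    """Distance from the decision boundary to either margin line (1 / |w|)."""
    return 1.0 / abs(w)


# Hypothetical 1-D data: the positive point at x = 0.2 lies close to the negatives.
DATA = [(-3.0, -1), (-2.0, -1), (0.2, 1), (2.0, 1), (3.0, 1)]

# Two hand-picked candidate classifiers (w, b), both assumptions of this sketch:
WIDE = (0.5, 0.0)    # wide margin, but the point at x = 0.2 violates it
NARROW = (1.0, 0.8)  # narrower margin, no point violates it


def preferred(C):
    """Return whichever candidate has the lower objective for a given C."""
    return min((WIDE, NARROW),
               key=lambda wb: soft_margin_objective(wb[0], wb[1], DATA, C))


if __name__ == "__main__":
    for C in (0.1, 10.0):
        w, b = preferred(C)
        print(f"C={C}: w={w}, b={b}, margin={margin_width(w)}")
```

With the small C the margin violation is cheap, so the wide-margin candidate has the lower objective; with the large C the violation dominates and the narrow-margin candidate wins. This matches the qualitative behaviour described in the excerpt, although a real solver optimizes over all (w, b) rather than comparing two candidates.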
... SVMs examine and group labelled data into classes, split by the widest plane (support vector). They are often employed when there is a nonlinear correlation among data, and, as such, a separation line is not easily recognizable [98]. In their study, Raji et al. identified three white matter regions distinguishing mild TBI from controls using edge density imaging maps. ...
Article
Full-text available
The dawn of Artificial intelligence (AI) in healthcare stands as a milestone in medical innovation. Different medical fields are heavily involved, and pediatric emergency medicine is no exception. We conducted a narrative review structured in two parts. The first part explores the theoretical principles of AI, providing all the necessary background to feel confident with these new state-of-the-art tools. The second part presents an informative analysis of AI models in pediatric emergencies. We examined PubMed and Cochrane Library from inception up to April 2024. Key applications include triage optimization, predictive models for traumatic brain injury assessment, and computerized sepsis prediction systems. In each of these domains, AI models outperformed standard methods. The main barriers to a widespread adoption include technological challenges, but also ethical issues, age-related differences in data interpretation, and the paucity of comprehensive datasets in the pediatric context. Future feasible research directions should address the validation of models through prospective datasets with more numerous sample sizes of patients. Furthermore, our analysis shows that it is essential to tailor AI algorithms to specific medical needs. This requires a close partnership between clinicians and developers. Building a shared knowledge platform is therefore a key step.
... SVMs examine and group labelled data into classes, split by the widest plane (support vector). They are often employed when there is a non-linear correlation among data, and, as such, a separation line is not easily recognizable [96]. In their study, Raji et al. identified three white matter regions distinguishing mild TBI from controls using edge density imaging maps. ...
Preprint
Full-text available
The dawn of Artificial Intelligence (AI) in healthcare stands as a milestone in medical innovation. A plethora of different medical fields are heavily involved and pediatric emergency medicine is no exception. These new tools do not merely provide more advanced and efficient systems for patients’ diagnosis, management and treatment. They rather concern a strict shift from traditional methods based upon broad categories towards a more personalized healthcare. AI offers many promises in pediatric emergency medicine with a wide range of applications involving clinical decision making, patients’ flows management and prioritization. Main barriers to a widespread diffusion involve technological challenges but also ethical issues and the paucity of extensive datasets in pediatric contexts. We conducted a narrative review structured in two parts. The first part explores the theoretical principles of AI, providing all the necessary background to feel confident with these new state-of-the-art tools. The second part presents an informative analysis of AI models in pediatric emergencies, pointing out the actual applications and challenges until future feasible research perspectives.
... where w denotes the weight vector and b denotes the deviation. Together, w and b define the position of the separating hyperplane [60]. ...
Article
Sound classification has obtained considerable attention in recent years due to its wide range of applications in various fields, such as speech recognition, sound surveillance, music analysis, and environmental monitoring. Because of its success, audio classification can also be employed in medical applications. Coughing is the most common disease symptom, and cough sounds might be used to diagnose them. This research focuses on identifying observable features of cough and classifying them into positive, negative, or symptomatic categories. A novel ensemble learning model based on the super learner (SL) is proposed to diagnose the disease using cough sounds utilizing various audio features such as Frequency Distribution, Time Domain Features, Spectral Features, and Time-Frequency Features. The SL method is a cross-validated approach to stacked generalization, and it can select an optimal learner from a set of learners and improve performance by selecting and merging models using cross-validation. The proposed SL model comprises DT, RF, LR, SVM, ET, and k-NN algorithms. We use the public Coughvid dataset, and the proposed model achieves a correct classification rate for symptomatic cases, which was 90.90%, and the positive predictive value for COVID-19 cases was 84.50%. The SL3 model attains 72%, 78%, 73%, 74.4%, and 78.85% precision, recall, f1-score, accuracy, and average AUC values, respectively. The numerical results show that the proposed model might be implemented to diagnose various other diseases that can be determined from respiratory sounds.
... x is the feature vector, and b is the bias [35]. The distance between feature vector x and hyperplane for each class is measured by the equation: ...
Article
Full-text available
The continuous developments in information technologies have resulted in a significant rise in security concerns, including cybercrimes, unauthorized access, and cyberattacks. Recently, researchers have increasingly turned to social media platforms like X to investigate cyberattacks. Analyzing and collecting news about cyberattacks from tweets can efficiently provide crucial insights into the attacks themselves, including their impacts, occurrence regions, and potential mitigation strategies. However, there is a shortage of labeled datasets related to cyberattacks. This paper describes CybAttT, a dataset of 36,071 English cyberattack-related tweets. These tweets are manually labeled into three classes: high-risk news, normal news, and not news. Our final overall Inner Annotation agreement was 0.99 (Fleiss kappa), which represents high agreement. To ensure dataset reliability and accuracy, we conducted rigorous experiments using different supervised machine learning algorithms and various fine-tuned language models to assess its quality and suitability for its intended purpose. A high F1-score of 87.6% achieved using the CybAttT dataset not only demonstrates the potential of our approach but also validates the high quality and thoroughness of its annotations. We have made our CybAttT dataset accessible to the public for research purposes.
... The Acquisition function of the Bayesian optimizer was set as "expected improvement per second", training time was not limited and maximum number of allowed iterations was 30. More detail regarding the SVM model and the effects of various optimization parameters are presented elsewhere [23]. ...
Article
Full-text available
Introduction: The rising incidence of incidental detection of pancreatic cystic neoplasms has compelled radiologists to determine new diagnostic methods for the differentiation of various kinds of lesions. We aim to demonstrate the utility of texture features extracted from ADC maps in differentiating intraductal papillary mucinous neoplasms (IPMN) from serous cystadenomas (SCA). Methods: This retrospective study was performed on 136 patients (IPMN = 87, SCA = 49) split into testing and training datasets. A total of 851 radiomics features were extracted from volumetric contours drawn by an expert radiologist on ADC maps of the lesions. LASSO regression analysis was used to determine the most predictive set of features and a radiomics score was developed based on their respective coefficients. A hyper-optimized support vector machine was then utilized to classify the lesions based on their radiomics score. Results: A total of four Wavelet features (LHL/GLCM/LCM2, HLL/GLCM/LCM2, /LLL/First Order/90percent, /LLL/GLCM/MCC) were selected from all of the features to be included in our classifier. The classifier was optimized by altering hyperparameters and the trained model was applied to the validation dataset. The model achieved a sensitivity of 92.8%, specificity of 90%, and an AUC of 0.97 in the training dataset, and a sensitivity of 83.3%, specificity of 66.7%, and AUC of 0.90 in the testing dataset. Conclusion: A support vector machine model trained and validated on volumetric texture features extracted from ADC maps showed the possible benefit of these features in differentiating IPMNs from SCAs. These results are in line with previous work regarding the role of ADC maps in classifying cystic lesions and offer new evidence regarding the role of texture features in differentiation of potentially neoplastic and benign lesions.
... Decision trees (DT) and k-nearest neighbor (KNN) algorithms are commonly used to identify gait phases and disorders [23,39,40]. For analyzing and sorting different activities such as running, jumping, or walking, support vector machines (SVM) are usually employed [23,41]. ...
Article
Full-text available
Background: Gait is the manner or style of walking, involving motor control and coordination to adapt to the surrounding environment. Knowing the kinesthetic markers of normal gait is essential for the diagnosis of certain pathologies or the generation of intelligent ortho-prostheses for the treatment or prevention of gait disorders. The aim of the present study was to identify the key features of normal human gait using inertial measurement unit (IMU) recordings in a walking test. Methods: Gait analysis was conducted on 32 healthy participants (age range 19–29 years) at speeds of 2 km/h and 4 km/h using a treadmill. Dynamic data were obtained using a microcontroller (Arduino Nano 33 BLE Sense Rev2) with IMU sensors (BMI270). The collected data were processed and analyzed using a custom script (MATLAB 2022b), including the labeling of the four relevant gait phases and events (Stance, Toe-Off, Swing, and Heel Strike), computation of statistical features (64 features), and application of machine learning techniques for classification (8 classifiers). Results: Spider plot analysis revealed significant differences in the four events created by the most relevant statistical features. Among the different classifiers tested, the Support Vector Machine (SVM) model using a Cubic kernel achieved an accuracy rate of 92.4% when differentiating between gait events using the computed statistical features. Conclusions: This study identifies the optimal features of acceleration and gyroscope data during normal gait. The findings suggest potential applications for injury prevention and performance optimization in individuals engaged in activities involving normal gait. The creation of spider plots is proposed to obtain a personalised fingerprint of each patient’s gait that could be used as a diagnostic tool. A deviation from a normal gait pattern can be used to identify human gait disorders. Moving forward, this information has potential for use in clinical applications in the diagnosis of gait-related disorders and in developing novel orthoses and prosthetics to prevent falls and ankle sprains.
... The kernel function is another important parameter that needs to be accurately chosen. Studies have shown that the Polynomial kernel function performs better than other kinds of kernel functions [38,39]. Encouragingly, the coefficient of determination of the model significantly increased from 0.7024 to 0.8928 when the polynomial kernel was used instead of the linear kernel. ...
Article
Full-text available
Backbreak in the mining industry presents a considerable challenge, impacting both safety and operational efficiency. Accurate prediction of backbreak is therefore a critical endeavour. This study rigorously evaluates four advanced machine learning (ML) techniques, namely Lagrangian Support Vector Machine (LSVM), Radial Basis Function Neural Network (RBFNN), Gaussian Process Regression (GPR), and Extreme Gradient Boosting (XGBoost), to ascertain the most effective method for backbreak prediction. The study utilises a comprehensive dataset of 60 blasting rounds from the Damang Goldfields Open Pit Mine; prior to the analysis, this dataset underwent a thorough preprocessing phase. The efficacy of each model is assessed using a suite of metrics, including correlation coefficient (r), coefficient of determination (R²), mean squared error (MSE), root mean squared error (RMSE), and mean absolute error (MAE). The performance of the models is quantitatively compared, revealing XGBoost as the superior predictor in this context, characterised by an r of 0.9788, an R² of 0.9565, an MSE of 0.1714, an RMSE of 0.4139, and an MAE of 0.2819. The findings of this study underscore the potential of XGBoost as a robust tool for backbreak prediction, offering mining companies a viable solution to enhance safety protocols and mitigate financial losses related to backbreak incidents. This research contributes significantly to the field of predictive analytics in mining, providing a comprehensive comparative analysis of various ML techniques for backbreak prediction.
... Machine learning techniques are being widely used for supervised learning where a regression model is developed to predict the output as a function of the input parameters. The supervised machine learning techniques used in this study are back propagation neural networks [9][10][11] and support vector machines [12,13]. Artificial Neural Networks have been used to simulate turbomachinery problems in many studies [14][15][16][17][18][19]. ...
... Support vector machine (SVM) is a supervised learning method suitable for handling scenarios with small sample sizes, non-linear data, or high-dimensional data. Its underlying principle revolves around finding an effective decision boundary that separates samples into distinct classes [26][27][28]. Consequently, SVM performs ... Based on the features introduced above, a total of 72 features are obtained, of which 3 were duplicates; hence, we obtained 69 filtered and normalized features. These extracted features will serve as inputs for training and prediction in our machine-learning model. ...
... Support vector machine (SVM) is a supervised learning method suitable for handling scenarios with small sample sizes, non-linear data, or high-dimensional data. Its underlying principle revolves around finding an effective decision boundary that separates samples into distinct classes [26][27][28]. Consequently, SVM performs exceptionally well in both classification and regression tasks. We utilize kernel functions that transform data points into higher-dimensional spaces for classification to address non-linearity and high-dimensional data. ...
... Bioengineering 2024, 11, 26 ...
Article
Full-text available
Extracorporeal membrane oxygenation (ECMO) is a vital emergency procedure providing respiratory and circulatory support to critically ill patients, especially those with compromised cardiopulmonary function. Its use has grown due to technological advances and clinical demand. Prolonged ECMO usage can lead to complications, necessitating the timely assessment of peripheral microcirculation for an accurate physiological evaluation. This study utilizes non-invasive near-infrared spectroscopy (NIRS) to monitor knee-level microcirculation in ECMO patients. After processing oxygenation data, machine learning distinguishes high and low disease severity in the veno-venous (VV-ECMO) and veno-arterial (VA-ECMO) groups, with two clinical parameters enhancing the model performance. Both ECMO modes show promise in the clinical severity diagnosis. The research further explores statistical correlations between the oxygenation data and disease severity in diverse physiological conditions, revealing moderate correlations with the acute physiologic and chronic health evaluation (APACHE II) scores in the VV-ECMO and VA-ECMO groups. NIRS holds the potential for assessing patient condition improvements.
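The citing excerpt for this article mentions kernel functions that transform data points into higher-dimensional spaces. As a hedged illustration of that general idea (not the article's own method), the sketch below checks that a homogeneous degree-2 polynomial kernel on 2-D inputs equals an ordinary dot product after an explicit map into 3-D. The function names `poly2_kernel`, `feature_map`, and `dot` are hypothetical.

```python
import math


def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel k(x, z) = (<x, z>) ** 2 in 2-D."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2


def feature_map(x):
    """Explicit 3-D feature map phi(x) = (x1^2, sqrt(2) * x1 * x2, x2^2)."""
    return (x[0] ** 2, math.sqrt(2.0) * x[0] * x[1], x[1] ** 2)


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


if __name__ == "__main__":
    x, z = (1.0, 2.0), (3.0, -1.0)
    # Both values agree up to floating-point rounding.
    print(poly2_kernel(x, z), dot(feature_map(x), feature_map(z)))
```

The kernel computes the higher-dimensional inner product without ever building the mapped vectors, which is why kernel methods remain practical when the implicit feature space is very large.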