Conference Paper

Extracting Support Data for a Given Task


Abstract

We report a novel possibility for extracting a small subset of a data base which contains all the information necessary to solve a given classification task: using the Support Vector Algorithm to train three different types of handwritten digit classifiers, we observed that these types of classifiers construct their decision surface from strongly overlapping small (≈ 4%) subsets of the data base. This finding opens up the possibility of compressing data bases significantly by disposing of the...
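The abstract's claim, that only a small fraction of the training set determines the decision surface, can be illustrated with a minimal sketch: a toy Pegasos-style linear soft-margin SVM trained on synthetic 2-D clusters (not the paper's handwritten-digit experiment; all data and parameter values below are invented for illustration).

```python
import random

# Toy illustration: train a linear soft-margin SVM by stochastic
# sub-gradient descent (Pegasos-style, no bias term) on two separable
# 2-D clusters, then count how many points end up on or inside the
# margin -- the "support data" in the paper's sense.
random.seed(0)

data = [((random.gauss(2, 0.3), random.gauss(2, 0.3)), +1) for _ in range(100)]
data += [((random.gauss(-2, 0.3), random.gauss(-2, 0.3)), -1) for _ in range(100)]

w = [0.0, 0.0]
lam = 0.01  # regularisation strength (illustrative value)
for t in range(1, 20001):
    (x, y) = random.choice(data)
    eta = 1.0 / (lam * t)
    if y * (w[0] * x[0] + w[1] * x[1]) < 1:
        # margin violated: hinge-loss sub-gradient step
        w = [(1 - eta * lam) * w[0] + eta * y * x[0],
             (1 - eta * lam) * w[1] + eta * y * x[1]]
    else:
        # margin satisfied: only the regulariser shrinks w
        w = [(1 - eta * lam) * w[0], (1 - eta * lam) * w[1]]

# Points with functional margin <= 1 are the ones that constrain the solution.
support = [x for (x, y) in data if y * (w[0] * x[0] + w[1] * x[1]) <= 1.0]
frac = len(support) / len(data)
acc = sum(1 for (x, y) in data
          if y * (w[0] * x[0] + w[1] * x[1]) > 0) / len(data)
print(f"support fraction: {frac:.1%}, training accuracy: {acc:.1%}")
```

On well-separated data, only the points on or inside the margin constrain the solution; the rest of the data base could be discarded without changing the classifier, which is the compression possibility the abstract describes.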


... For example, research conducted in [10] only uses random forest machine learning to classify sarcastic tweets. Another study [9] used only SVM [11] to classify sarcastic texts. The most recent one by [12] combines various word embedding models, and their experimental result shows that the combination of fastText embeddings and BiGRU classifier produced the best performance in an Indonesian Twitter dataset. ...
... Furthermore, [11] focused on sarcasm detection using machine learning algorithms, including SVM. Their study demonstrated the effectiveness of CNN in capturing contextual and semantic information for improved sarcasm detection. ...
... Naive Bayes [27] calculates the likelihood of each category based on the independence of word occurrences, using Bayes' Theorem to predict text classification. SVMs [11] classify texts by finding a hyperplane in high-dimensional space that best separates different classes with the largest margin, using linear and nonlinear mappings through the kernel trick. ...
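The Naive Bayes description above can be sketched as a minimal multinomial classifier over word counts; the toy corpus, labels, and example sentences below are invented for illustration.

```python
import math
from collections import Counter

# Minimal multinomial Naive Bayes: predict the category whose prior times
# the product of per-word likelihoods (assumed independent) is largest.
train = [
    ("great movie loved it", "pos"),
    ("wonderful acting great fun", "pos"),
    ("terrible plot hated it", "neg"),
    ("boring and terrible", "neg"),
]
word_counts = {"pos": Counter(), "neg": Counter()}
class_counts = Counter()
for text, label in train:
    class_counts[label] += 1
    word_counts[label].update(text.split())

vocab = {w for c in word_counts.values() for w in c}

def predict(text):
    scores = {}
    for label in class_counts:
        # log prior + sum of log likelihoods with add-one smoothing
        score = math.log(class_counts[label] / sum(class_counts.values()))
        total = sum(word_counts[label].values())
        for w in text.split():
            score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

print(predict("great fun"))  # "pos" on this toy corpus
```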
Article
Full-text available
Sarcasm detection in the Indonesian language poses a unique set of challenges due to the linguistic nuances and cultural specificities of the Indonesian social media landscape. Understanding the dynamics of sarcasm in this context requires a deep dive into not only language patterns but also the socio-cultural background that shapes the use of sarcasm as a form of criticism and expression. In this study, we developed the first publicly available Indonesian sarcasm detection benchmark datasets from social media texts. We extensively investigated the results of classical machine learning algorithms, pre-trained language models, and recent large language models (LLMs). Our findings show that fine-tuning pre-trained language models is still superior to other techniques, achieving F1 scores of 62.74% and 76.92% on the Reddit and Twitter subsets respectively. Further, we show that recent LLMs fail to perform zero-shot classification for sarcasm detection and that tackling data imbalance requires a more sophisticated data augmentation approach than our basic methods.
... After signal acquisition come processing steps such as pre-processing and dimensionality reduction. Dimensionality reduction methods are innovative and important tools in machine learning (Scholkopf et al., 1995). Beckmann et al. (2010) investigated the characteristics of textile electrodes for measuring ECG and made a specification comparison between different textile electrodes. ...
... Most of them are based on traces collected using accelerometers and gyroscopes. Techniques range from feed-forward back-propagation neural networks (Scholkopf et al., 1995) and discrete wavelet transforms (Xu and Lu, 2006) to Support Vector Machine (SVM) techniques and hidden Markov models. In this study, SVM classification technique is implemented to recognise different activities, due to its success in many classification problems (Zhang et al., 2009; Kamel et al., 2008). ...
... In this study, SVM classification technique is implemented to recognise different activities, due to its success in many classification problems (Zhang et al., 2009; Kamel et al., 2008). SVM is both a linear and a nonlinear classifier which has been successfully applied in many areas such as handwritten digit recognition and object recognition (Scholkopf et al., 1995). SVM has been exploited to do channel and modulation selection, signal classification and spectrum estimation (Blanz et al., 1996). ...
... The concept of SVM was introduced under the name of the generalized portrait method by Vapnik and Lerner in 1963 [130]. Subsequently, through the development of statistical learning theory, Vapnik established the current form of SVM [131][132][133][134]. In that form, SVM was first implemented for classification purposes. ...
... Vapnik & Lerner, 1963). Afterward, Vapnik introduced statistical learning theory and defined the current form of SVM for classification with Scholkopf (Scholkopf et al., 1996; Scholkopf et al., 1995; V. N. Vapnik, 1995). ...
Thesis
This Ph.D. dissertation focuses on optimizing automated decision-making processes involving critical aspects of road management tasks. Specifically, the research aims to define and implement specific strategies for supplying support to decision-makers considering two leading elements: road maintenance and road safety. We propose some novel applications based on the integrated use of high-performance Non-Destructive Techniques (NDTs) and Geographical Information Systems (GISs) in order to obtain a “fully sensed” infrastructure, creating a multi-scale database concerning structural, geometrical, functional, social, and environmental characteristics. The environmental aspect is essential since climate change phenomena and extreme natural events are increasingly linked with infrastructure damage and serviceability; nonetheless, current Pavement Management Systems (PMSs) commonly rely solely on road pavement structural characteristics and surface functional performance. The high amount of collected data serves as input for calibrating different data-driven approaches, such as Machine Learning Algorithms (MLAs) and statistical regressions. Considering the aspect of road monitoring and maintenance, such models allow identifying the environmental factors that have the most significant impact on road damage and serviceability, as well as recognizing road sites with critical health conditions that need to be restored. Moreover, the calibrated MLAs enable decision-makers to determine the road maintenance interventions with higher priority. Considering road safety, the calibrated MLAs allow identifying the sites where serious road crashes can be triggered and estimating the crash count in a specified time frame. Moreover, it is possible to recognize infrastructure-related factors that significantly impact crash likelihood. 
Road authorities may consider the outcomes of the dissertation as a novel approach for drafting appropriate guidelines and defining more objective management programs.
... SRM inductive principle is a more recognized technique, which is based on the theory that for any f ∈ F, with a probability of at least 1 − ρ, the risk satisfies R(f) ≤ R_emp(f) + Ω(F, ρ, n), where Ω is called the guaranteed risk and can be expressed in the form of the VC dimension (Vapnik and Chervonenkis 2015), Rademacher complexity (Bartlett and Mendelson 2002), etc. The learnable basis of the binary SVM (Schölkopf, Burges, and Vapnik 1995) is to reduce the risk in its VC-dimensional form by minimizing ‖w‖². ...
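The guaranteed-risk bound referred to above can be written out in its classical VC form (a standard statement of the bound, not quoted verbatim from the cited works):

```latex
R(f) \;\le\; R_{\mathrm{emp}}(f)
  + \underbrace{\sqrt{\frac{h\left(\ln\frac{2n}{h} + 1\right) - \ln\frac{\rho}{4}}{n}}}_{\Omega(\mathcal{F},\,\rho,\,n)}
```

where $h$ is the VC dimension of $\mathcal{F}$ and $n$ the sample size; the bound holds with probability at least $1 - \rho$. Minimizing $\|w\|^2$ bounds the capacity term, which is why it underpins SVM learning.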
Article
Support Vector Machine (SVM) stands out as a prominent machine learning technique widely applied in practical pattern recognition tasks. It achieves binary classification by maximizing the "margin", which represents the minimum distance between instances and the decision boundary. Although many efforts have been dedicated to expanding SVM for multi-class case through strategies such as one versus one and one versus the rest, satisfactory solutions remain to be developed. In this paper, we propose a novel method for multi-class SVM that incorporates pairwise class loss considerations and maximizes the minimum margin. Adhering to this concept, we embrace a new formulation that imparts heightened flexibility to multi-class SVM. Furthermore, the correlations between the proposed method and multiple forms of multi-class SVM are analyzed. The proposed regularizer, akin to the concept of "margin", can serve as a seamless enhancement over the softmax in deep learning, providing guidance for network parameter learning. Empirical evaluations demonstrate the effectiveness and superiority of our proposed method over existing multi-classification methods. Complete version is available at https://arxiv.org/pdf/2312.06578.pdf. Code is available at https://github.com/zz-haooo/M3SVM.
... The use of the kernel function is typically a simple way to avoid the problem of dimensionality; it is used to compute the inner product induced by the nonlinear mapping of the input space into a high-dimensional feature space, in order to solve a nonlinear problem [17]. It can be expressed as follows: ...
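The kernel trick described above can be made concrete with the degree-2 polynomial kernel, whose value agrees with an ordinary dot product after an explicit feature map (phi below is the standard expansion written out for two dimensions; the sample points are arbitrary).

```python
import math

# Explicit feature map for the degree-2 polynomial kernel in 2-D:
# k(x, z) = (x . z + 1)^2 = phi(x) . phi(z)
def phi(x):
    a, b = x
    return [a * a, b * b,
            math.sqrt(2) * a * b,
            math.sqrt(2) * a, math.sqrt(2) * b,
            1.0]

def k(x, z):
    return (x[0] * z[0] + x[1] * z[1] + 1.0) ** 2

x, z = (1.0, 2.0), (3.0, -1.0)
lhs = k(x, z)                                    # kernel in input space
rhs = sum(p * q for p, q in zip(phi(x), phi(z))) # dot product in feature space
print(lhs, rhs)  # identical up to floating point
```

The kernel evaluates the 6-dimensional inner product without ever constructing the feature vectors, which is exactly what makes high-dimensional (even infinite-dimensional, for the RBF kernel) mappings tractable.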
Article
Full-text available
Predicting the maneuvering motion of an unmanned surface vehicle (USV) plays an important role in intelligent applications. To more precisely predict this empirically, this study proposes a method based on the support vector regression with a mixed kernel function (MK-SVR) combined with the polynomial kernel (PK) function and radial basis function (RBF). A mathematical model of the maneuvering of the USV was established and subjected to a zig-zag test on the DW-uBoat USV platform to obtain the test data. Cross-validation was used to optimize the parameters of SVR and determine suitable weight coefficients in the MK function to ensure the adaptive adjustment of the proposed method. The PK-SVR, RBF-SVR, and MK-SVR methods were used to identify the dynamics of the USV and build the corresponding predictive models. A comparison of the results of the predictions with experimental data confirmed the limitations of the SVR with a single kernel function in terms of forecasting different parameters of motion of the USV while verifying the validity of the MK-SVR based on data collected from a full-scale test. The results show that the MK-SVR method combines the advantages of the local and global kernel functions to offer a better predictive performance and generalization ability than SVR based on the nuclear kernel function. The purpose of this manuscript is to propose a novel method of dynamics identification for USV, which can help us establish a more precise USV dynamic model to design and verify an excellent motion controller.
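The mixed-kernel construction in the abstract above can be sketched as a convex combination of a polynomial kernel (global) and an RBF kernel (local); the weight w and the kernel parameters below are illustrative, not the values fitted in the study.

```python
import math

# Polynomial kernel: a "global" kernel -- distant points still interact.
def poly_kernel(x, z, degree=2, c=1.0):
    return (sum(a * b for a, b in zip(x, z)) + c) ** degree

# RBF kernel: a "local" kernel -- influence decays with distance.
def rbf_kernel(x, z, gamma=0.5):
    sq = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq)

def mixed_kernel(x, z, w=0.3):
    # With 0 <= w <= 1 this stays a valid (positive semi-definite) kernel,
    # since any convex combination of valid kernels is a valid kernel.
    return w * poly_kernel(x, z) + (1 - w) * rbf_kernel(x, z)

print(mixed_kernel((1.0, 0.0), (1.0, 0.0)))
```

Admissibility of the combination is what allows the weight coefficient to be tuned (e.g. by cross-validation, as in the study) without leaving the family of legitimate SVR kernels.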
... SVMs are based on statistical learning theory, which characterizes properties of learning machines so that they can generalize to unseen data. The SVM model was largely developed at AT&T Bell Laboratories by Vapnik and co-workers [84][85][86][87][88][89] and due to this industrial context, SVM research has mainly focused on real-world applications [90]. SVMs were originally developed for pattern recognition and the three essential properties that make SVMs a successful classification model are [91]: ...
Thesis
Full-text available
High-Frequency-Oscillations (HFO) are an electro-encephalographic biomarker composed of two subgroups: ripples with oscillation frequencies between 80 and 250 Hz, and fast-ripples (FR) with oscillation frequencies between 250 and 500 Hz. HFO may be involved in either pathologic or physiologic brain functions; however, the criteria for the differentiation between pathologic and physiologic HFO remain unclear. The discernment of purely pathologic HFO could improve the identification of epileptogenic brain regions during the pre-surgical evaluation of epilepsy patients; analogously, recognizing purely physiologic HFO would help clarify their relevance for cognitive functions. This work addresses four challenges faced by HFO: (i) The automatic detection of HFO and the validation of these detections. (ii) The differentiation of HFO as putative pathologic when coinciding with interictal epileptic spikes (IES), and as putative physiologic when not coinciding with IES or when co-occurring with sleep spindles. (iii) Determining the value of pathologic HFO as biomarkers of the epileptogenic zone. (iv) Determining the involvement of physiologic HFO during spatial memory processing and consolidation. The HFO detectors developed in this thesis are based on support-vector-machines and obtained at least 21 F1-score points more than previously published algorithms at the lowest signal-to-noise-ratio. The success achieved when discerning between IES coinciding with HFO and IES occurring in isolation was comparable to that of other published algorithms. The detectors were run on 42 hours of electroencephalogram from 8 patients, the pathologic HFO amounted to 21% of all HFO and localized the epileptogenic-zone with an accuracy 8 points higher than undifferentiated HFO and 20 points higher than physiologic HFO. The detectors were also run on the electroencephalogram of 19 patients conducting a spatial-navigation task. 
The occurrence of physiologic ripples was found to decrease significantly during the spatial-navigation task, while pathologic ripples did not show any modulation. Ripples and sleep spindles were also detected during 8 hours of a night separating two spatial-navigation tasks; the rate of sleep spindle coincident ripples showed a correlation with the performance improvement on the spatial navigation task, whereas the rate of undifferentiated ripples did not show this correlation. In summary, the obtained results suggest that the proposed differentiation of HFO into putatively pathologic and physiologic subgroups based on their association with IES and sleep spindles helps to: (i) More accurately estimate the margins of the epileptogenic zone; (ii) Identify the role of HFO activity in spatial memory and memory consolidation processes. The detectors developed in this thesis are publicly available and provide a toolset to further study the interactions between HFO, IES and sleep spindles.
... The SVM was developed by Vapnik and colleagues at AT&T Bell Laboratories [46][47][48][49][50][51] and has been one of the most popular models in ML to date. The SVM can be used for both classification with SVC and regression with SVR, as illustrated in Figure 7. ...
Article
Full-text available
Eco-friendliness is an important global issue, and the maritime field is no exception. Predicting the composition of exhaust gases emitted by ship engines will be of consequence in this respect. Therefore, in this study, exhaust gas data were collected from the generator engine of a real ship along with engine-related data to predict emission characteristics. This is because installing an emission gas analyzer on a ship has substantial economic burden, and, even if it is installed, the accuracy can be increased by a virtual sensor. Furthermore, data were obtained with and without operating the SCR (often mounted on ships to reduce NOx), which is a crucial facility to satisfy environment regulation. In this study, four types of datasets were created by adding cooling and electrical-related variables to the basic engine dataset to check whether it improves model performance or not; each of these datasets consisted of 15 to 26 variables as inputs. CO2 (%), NOx (ppm), and tEx (°C) were predicted from each dataset using an artificial neural network (ANN) model and a support vector machine (SVM) model with optimal hyperparameters selected by trial and error. The results confirmed that the SVM model performed better on smaller datasets, such as the one used in this study compared to the ANN model. Moreover, the dataset type, DaCE, which had both cooling and electrical-related variables added to the basic engine dataset, yielded the best overall prediction performance. When the performance of the SVM model was measured using the test data of a DaCE on both no-SCR mode and SCR mode, the RMSE (R2) of CO2 was between 0.1137% (0.8119) and 0.0912% (0.8975), the RMSE (R2) of NOx was between 17.1088 ppm (0.9643) and 13.6775 ppm (0.9776), and the RMSE (R2) of tEx was between 4.5839 °C (0.8754) and 1.5688 °C (0.9392).
... The most commonly used binary decomposition strategies include one-vs-one (OVO) (Fürnkranz, 2002a), one-vs-all (OVA) (Schölkopf et al., 1995) and many-vs-many (MVM) (Dietterich and Bakiri, 1994). Many works on binary decomposition have been conducted on the basis of the three methods (Adnan and Islam, 2015; Al-Shargie et al., 2018; Pawara et al., 2020). ...
Article
Full-text available
Binary Decomposition can be adopted in ordered and unordered ways. Inspired by the case that label order information can be exploited to improve the ordinal classification performance through adopting an ordered decomposition strategy, this paper explores whether the effectiveness of binary decomposition in nominal classification tasks can be improved by setting a virtual label order. The essential purpose of setting a virtual order is to obtain small intra-class distances and large inter-class distances after binary decomposition, such that simpler binary classification tasks are obtained. The experimental results show that setting a virtual order results in an improvement of the classification performance as expected. However, the performance obtained by setting a virtual order does not show significant superiority in comparison with the one produced by some unordered decomposition strategies, and the reasons have been analysed in the context of the relationship between virtual orders and inter-class distances.
... There are two well-known methods for combining binary classifiers to build a multi-class classifier. The first proceeds according to the winner-takes-all principle [27], where the kth classifier distinguishes class k from all the other classes. In the second, a classifier is dedicated to each pair of classes [15], and its role is to distinguish the two classes to which it is dedicated. ...
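The two combination schemes can be sketched as follows, with hypothetical per-class decision scores and pairwise winners standing in for trained binary classifiers (the class names and score values are invented).

```python
# One-vs-rest, winner-takes-all: pick the class whose own binary
# classifier reports the largest decision score.
def ovr_predict(scores):
    return max(scores, key=scores.get)

# One-vs-one: every pair of classes has a dedicated classifier;
# the final label is chosen by majority vote over the pairwise winners.
def ovo_predict(pairwise):
    votes = {}
    for winner in pairwise.values():
        votes[winner] = votes.get(winner, 0) + 1
    return max(votes, key=votes.get)

print(ovr_predict({"A": 0.2, "B": 1.3, "C": -0.5}))                    # B
print(ovo_predict({("A", "B"): "B", ("A", "C"): "A", ("B", "C"): "B"}))  # B
```

For k classes, one-vs-rest trains k binary classifiers while one-vs-one trains k(k-1)/2 smaller ones, which is the usual trade-off between the two schemes.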
Article
Full-text available
It is still hard to deal with artifacts in magnetic resonance images (MRIs), particularly when the latter are to be segmented. This paper introduces a novel deep-based scheme for tumor segmentation in brain MRIs. According to the proposed scheme, a large set of partial sub-images is sliced from an MRI volume and then input to an ensemble of convolutional neural networks (CNNs) in order to label the voxels at the centers of the sub-images according to the classes to which they should belong. Partial sub-images, which capture local patterns around central voxels, speed up both the training and prediction steps, allowing efficient use of such a scheme for real MRI-based tumor diagnosis. Experiments were performed using the BraTS (brain tumor segmentation) database, and the obtained results show that the proposed scheme allows both fast and accurate brain tumor detection and segmentation in pathological MRIs.
... Support Vector Machine (SVM) is one of the most wellknown methods, which still provides the best classification success in many applications since its foundation in the 1960s. The algorithm, first developed by Vladimir Vapnik in 1963 [53], took its final form known today with the studies at AT&T Bell laboratory [54][55][56][57][58]. Thanks to its extraordinary success in many different problems, it has become one of the standard methods in machine learning. ...
Article
Full-text available
Finding the particle size distribution is one of the main objectives in soil science, as well as in any other area concerned with soil. There are many methods to accomplish this task, both traditional and technology-oriented. While traditional methods have many disadvantages, such as being dependent on the laboratory medium and expert knowledge and being carried out manually, technology-oriented devices, on the other hand, are quite expensive due to hardware and production costs. In order to propose an optimized solution in between, an automated soil texture analyzer, supported by a microcomputer and machine learning methods, is proposed, which can eliminate the incomplete and error-prone aspects of the traditional hydrometer method, focusing only on finding the ratios of sand, silt and clay materials in a soil sample. In the light of a previous study, the structure and working principle of an automated soil texture estimation device, which works depending on the change in ultrasound intensity passed through the soil-water mixture in a 3D-printed measurement cup, have been explained in detail. The signals obtained from 80 soil samples, the contents of which were analyzed by the traditional hydrometer method, were collected on the computer using the proposed device and pre-processed, and then feature extraction steps were applied before the data were given as input to the machine learning methods. Ratios of the sand, silt and clay fractions of the soils were predicted using Support Vector Regression (SVR), Multi-Layer Perceptron Neural Network (MLPNN) and Long Short-Term Memory (LSTM) machine learning methods. The best results were achieved using the MLPNN structure in terms of Mean Absolute Error (MAE): 54 soils for sand, 48 for silt, and 48 for clay were found to have an estimation error of less than 10%. The worst results were obtained with the LSTM method, with 38 soils for sand, 38 for silt, and 41 for clay having MAE values below 10%.
Statistical validations are presented with RMSE and Correlation Coefficient values, and the achieved results are considered moderate, especially given that many of the competing studies are carried out using laser-diffraction-based machines worth thousands of dollars. The temperature effect has also been investigated by performing tests at varying temperatures from 9 to 38 ∘C.
... SVM uses a nonlinear mapping based on a kernel function to transform an input space into a high-dimensional space, and then looks for a nonlinear relationship between inputs and outputs in that high-dimensional space. SVM not only has a rigorous theoretical foundation, but can also find globally optimal solutions for problems involving small training samples, high dimensionality, nonlinearity and local optima (Scholkopf, 1995). ...
Conference Paper
ABSTRACT In this research, blast-induced ground vibrations were estimated by adaptive network-based fuzzy inference system (ANFIS) and support vector machines (SVM) methods. In the ANFIS model, spacing to burden ratio (S/B), bench height to burden ratio (H/B) and scaled distance (SD) were considered as input parameters. The predicted output is peak particle velocity (ppv). In the support vector machines method, the 'SVM quadratic' model was selected as the most suitable model. 69 training data and 26 test data were used to create the models. All the blast data were collected from aggregate quarries located in Cendere Valley, İstanbul. The developed models were compared to the classical scaled distance equation and a multiple regression equation. It is concluded that the predictions of the ANFIS and SVM models are more successful than those of the regression equations.
1 INTRODUCTION Blasting operations carried out in open-pit mines produce various blast-induced environmental impacts, which can be examined under four headings: rock fly, dust emission, air shock, and blast-induced ground vibrations. This study addresses blast-induced ground vibrations, one of the most frequently considered environmental impacts. Ground vibration can reach greater distances than the other effects of blasting; in short, it spreads over a wider area. To take precautions against ground vibration, it must first be predicted. For this purpose, classical scaled-distance equations based on regression analysis are frequently used. Today, various soft computing methods and machine learning techniques also come to the fore in ground vibration estimation, and the use of different techniques enables the development of alternative models. The main parameters considered in ground vibration prediction are the instantaneous charge per delay and the measurement
... Support vector machines (SVMs) are supervised learning models in data science with associated learning algorithms that analyze data. The Support Vector Machine has also been used as a regression technique, retaining most of the features that characterize the algorithm (maximum margin) (Bernhard Schölkopf et al., 1995). Support Vector Regression (SVR) uses a concept similar to SVM classification, with only a few minor variations. ...
Article
Full-text available
Energy is one of the key inputs for a country's economic growth and social development. Analysis and modeling of industrial energy are currently time-intensive processes because more and more energy is consumed for economic growth in a smart factory. This study aims to present and analyse the predictive models of the data-driven system to be used by appliances and find the most significant product item. With repeated cross-validation, three statistical models were trained and tested on a test set: 1) general linear regression model (GLM), 2) support vector machine (SVM), and 3) boosting tree (BT). The performance of the prediction models was measured by R2 error, root mean squared error (RMSE), mean absolute error (MAE), and coefficient of variation (CV). The best model from the study is the support vector machine (SVM), which provided an R2 of 0.86 for the training data set and 0.85 for the testing data set with a low coefficient of variation; the most significant product of this smart factory is Skelp.
... 1) ν-Support Vector Regression: The framework for ν-SVR [32,33] has been employed using the ThunderSVM library [34]. The key hyperparameters associated with model learning through ν-SVR are: (a) ν: regulator of the model complexity, (b) γ: measure of the influence radius of support vectors, and (c) C: regularisation parameter. ...
... Owing to their impressive generalization ability, SVMs have been extensively studied to solve a number of classification tasks (see e.g. Vapnik, 1995; Boser et al., 1992; Schölkopf et al., 1995). 2) k-Nearest Neighbor (kNN) classifier is a fairly simple and straightforward supervised learning algorithm that classifies a data point based on how its neighbors are classified. ...
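The kNN description above can be written out as a minimal classifier; the toy training points, labels, and the choice of squared Euclidean distance are illustrative assumptions.

```python
from collections import Counter

# Minimal k-nearest-neighbour classifier: label a query point by
# majority vote among its k closest training points.
def knn_predict(train, query, k=3):
    dist = lambda a, b: sum((p - q) ** 2 for p, q in zip(a, b))
    nearest = sorted(train, key=lambda t: dist(t[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
print(knn_predict(train, (0.5, 0.5)))  # "a"
```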
Article
Cochlear Implant provides an electronic substitute for hearing to severely or profoundly deaf patients. However, postoperative hearing outcomes significantly depend on the proper placement of electrode array (EA) into scala tympani (ST) during cochlear implant surgery. Due to limited intra-operative methods to access array placement, the objective of the current study was to evaluate the relationship between EA complex impedance and different insertion trajectories in a plastic ST model. A prototype system was designed to measure bipolar complex impedance (magnitude and phase) and its resistive and reactive components of electrodes. A 3-DoF actuation system was used as an insertion feeder. 137 insertions were performed from 3 different directions at a speed of 0.08 mm/s. Complex impedance data of 8 electrode pairs were sequentially recorded in each experiment. Machine learning algorithms were employed to classify both the full and partial insertion lengths. Support Vector Machine (SVM) gave the highest 97.1% accuracy for full insertion. When a real-time prediction was tested, Shallow Neural Network (SNN) model performed better than other algorithms using partial insertion data. The highest accuracy was found at 86.1% when 4 time samples and 2 apical electrode pairs were used. Direction prediction using partial data has the potential of online control of the insertion feeder for better EA placement. Accessing the position of the electrode array during the insertion has the potential to optimize its intraoperative placement that will result in improved hearing outcomes.
... These kernels are taken from [SBV95], where the dimensionality d of the observation space is compensated for by a normalization factor. ...
Thesis
This thesis focuses on partially supervised Support Vector Machine (SVM) classification methods for novelty detection (One-Class SVM). These were studied with the aim of detecting abnormal audio events for the surveillance of public infrastructures, in particular in transport. In this context, the "normal ambience" hypothesis is relatively well known (even if the corresponding signals can be highly non-stationary). On the other hand, any "abnormal" signal must be detectable and, if possible, grouped with signals of the same nature. Thus, a reference system relying on a single model of the normal ambience is presented, and we then propose to use several competing One-Class SVMs. The volume of data to be processed required the study of solvers suited to these problems. Since the algorithms must run in real time, we also worked on the algorithmic side to propose solvers capable of warm starts. Through the study of these solvers, we propose a unified formulation of one- and two-class problems, with and without bias. The proposed approaches were validated on a set of real signals. In addition, a demonstrator integrating abnormal-event detection for real-time metro station surveillance was presented within the framework of the European project VANAHEIM.
... SVMs have become popular in the machine learning community. An important property of SVMs is that the determination of model parameters is a convex optimization problem, so the solution is always a global optimum; SVMs have emerged as an important learning technique for solving classification problems in various fields with excellent performance [24][25][26][27][28]. With the introduction of the ε-insensitive loss function (ε: error deviation), SVM has been extended to solve regression problems, called support vector regression (SVR) [25]. ...
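The ε-insensitive loss mentioned above ignores errors smaller than ε and penalizes larger ones linearly; a one-line sketch (the ε value and sample predictions are illustrative):

```python
# epsilon-insensitive loss: zero inside the epsilon "tube" around the
# target, linear penalty outside it.
def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    return max(0.0, abs(y_true - y_pred) - eps)

print(eps_insensitive_loss(1.0, 1.05))  # inside the tube: 0.0
print(eps_insensitive_loss(1.0, 1.30))  # outside the tube: ~0.2
```

In SVR, training points that fall strictly inside the tube contribute no loss and hence are not support vectors; only points on or outside the tube shape the regression function.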
Article
Early and accurate Parkinson’s disease (PD) diagnosis is usually complex, as clinical symptoms often set in only when there is extensive loss of dopaminergic neurons in the substantia nigra, and symptoms are atypical at the early stages of the disease. Recent brain imaging modalities such as single photon emission computed tomography (SPECT) with 123I-Ioflupane (DaTSCAN) have been shown to be better diagnostic tools for PD, even in its initial stages. Machine learning algorithms now play an important role in automating PD diagnosis and predicting its progression. In the machine learning community, support vector regression (SVR) has recently received much attention due to its ability to negotiate between fitting accuracy and model complexity in training prediction models. This work presents an optimized SVR with weights associated with each of the sample data to automate PD diagnosis and predict its progression at primary stages. The proposed algorithm (W-SVR) is trained with motor and cognitive symptom scores, in addition to striatal binding ratio (SBR) values calculated from the 123I-Ioflupane SPECT scans (taken from the Parkinson’s Progression Markers Initiative (PPMI) database), for accurate early PD prognosis. In model building, different kernels are used to check the accuracy and goodness of fit. We observed promising results obtained by W-SVR in comparison with classic support vector regression.
... and has been extensively used for a multitude of applications, including information processing (Vapnik et al., 1996), handwriting recognition (Burges and Vapnik, 1995), facial recognition (Osuna et al., 2002), financial engineering (Van et al., 2001), database analysis (Abdelhamid and Abdelmalik, 2011), and bioinformatics (Byvatov and Schneider, 2003). ...
Article
The support vector machine (SVM) model is one of the most well-known machine learning models, which is based on the structural risk minimization (SRM) principle. The SRM principle, formulated by Vapnik in a statistical learning theory framework, can be naturally expressed as an Ivanov regularization-based SVM (I-SVM). Recent advances in learning theory clearly show that I-SVM allows a more effective control of the learning hypothesis space with a better generalization ability. In this paper, we propose a new method for optimizing the I-SVM to find the optimal separation hyperplane. The proposed approach provides a parallel block minimization framework for solving the dual I-SVM problem that exploits the advantages of the randomized primal–dual coordinate (RPDC) method, and every iteration-based sub-optimization RPDC routine has a simple closed-form. We also provide an upper limit τ∗ for the space control parameter τ by solving a Morozov regularization SVM (M-SVM) problem. Experimental results confirmed the improved performance of our method for general I-SVM learning problems.
... Only the dot product of two vectors is required to solve the Lagrangian formulation and obtain f(x); thus, the dot product of two vectors in a high-dimensional space can be represented by kernel functions [51,52]. There are several kernel functions, such as the radial basis, polynomial, and sigmoid functions [53]. Therefore, researchers should consider three hyperparameters for SVR model optimization: (1) box constraint, (2) epsilon margin, and (3) kernel function. ...
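The three kernel families named above all reduce to a function of a dot product or a distance between the two input vectors; a minimal sketch (the parameters gamma, degree, coef0, and alpha are illustrative defaults, not values from the cited work):

```python
import numpy as np

def rbf_kernel(x, z, gamma=1.0):
    # radial basis function: depends only on the squared distance
    return np.exp(-gamma * np.sum((x - z) ** 2))

def poly_kernel(x, z, degree=2, coef0=1.0):
    # polynomial kernel: a polynomial in the dot product
    return (np.dot(x, z) + coef0) ** degree

def sigmoid_kernel(x, z, alpha=0.5, coef0=0.0):
    # sigmoid kernel: tanh of a scaled dot product
    return np.tanh(alpha * np.dot(x, z) + coef0)

x = np.array([1.0, 0.0])
z = np.array([1.0, 0.0])
k_rbf = rbf_kernel(x, z)    # identical points give the maximum value 1.0
k_poly = poly_kernel(x, z)  # (1 + 1)^2 = 4.0
k_sig = sigmoid_kernel(x, z)
```

Each function plays the role of the dot product in the high-dimensional feature space, which is why only kernel evaluations (never explicit feature maps) appear in the dual problem.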
Preprint
Full-text available
As the volatility of electricity demand increases owing to climate change and electrification, the importance of accurate peak load forecasting is increasing. Traditional peak load forecasting has been conducted through time series-based models; however, recently, new models based on machine or deep learning are being introduced. This study performs a comparative analysis to determine the most accurate peak load-forecasting model for Korea, by comparing the performance of time series, machine learning, and hybrid models. Seasonal autoregressive integrated moving average with exogenous variables (SARIMAX) is used for the time series model. Artificial neural network (ANN), support vector regression (SVR), and long short-term memory (LSTM) are used for the machine learning models. SARIMAX-ANN, SARIMAX-SVR, and SARIMAX-LSTM are used for the hybrid models. The results indicate that the hybrid models exhibit significant improvement over the SARIMAX model. The LSTM-based models outperformed the others; the single and hybrid LSTM models did not exhibit a significant performance difference. In the case of Korea's highest peak load in 2019, the predictive power of the LSTM model proved to be greater than that of the SARIMAX-LSTM model. The LSTM, SARIMAX-SVR, and SARIMAX-LSTM models outperformed the current time series-based forecasting model used in Korea. Thus, Korea's peak load-forecasting performance can be improved by including machine learning or hybrid models.
... To this end, the input or the output space is partitioned into hyper-planes specific to each class, to which we impose separation constraints. Similar approaches have been used in the past, e.g., by Schölkopf, Burges, and Vapnik (1995) who used the smallest sphere enclosing the data to estimate the VC-dimension for support vector classifiers, or by Wang, Neskovic, and Cooper (2005) who used separating spheres in the feature space for classification. ...
Preprint
Full-text available
While many defences against adversarial examples have been proposed, finding robust machine learning models is still an open problem. The most compelling defence to date is adversarial training, which consists of complementing the training data set with adversarial examples. Yet adversarial training severely impacts training time and depends on finding representative adversarial samples. In this paper we propose to train models on output spaces with large class separation in order to gain robustness without adversarial training. We introduce a method to partition the output space into class prototypes with large separation and train models to preserve it. Experimental results show that models trained with these prototypes -- which we call deep repulsive prototypes -- gain robustness competitive with adversarial training, while also preserving more accuracy on natural samples. Moreover, the models are more resilient to large perturbation sizes. For example, we obtained over 50% robustness for CIFAR-10, with 92% accuracy on natural samples, and over 20% robustness for CIFAR-100, with 71% accuracy on natural samples, without adversarial training. For both data sets, the models preserved robustness against large perturbations better than adversarially trained models.
... For the sake of simplicity, we have chosen binary datasets, described in Table 2.1, where the minority class is given by the column Label. IML can easily be generalized to multi-class problems by learning one metric per class in a standard one-versus-all strategy, and then applying a majority vote [Schölkopf et al., 1995, Vapnik, 1995]. ...
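The one-versus-all strategy mentioned in this snippet can be sketched with any per-class scorer; here a class-centroid similarity stands in for the per-class metric or binary classifier (a toy illustration under that assumption, not the cited IML method):

```python
import numpy as np

def fit_one_vs_all(X, y, n_classes):
    # one "model" per class: here simply the class centroid
    # (a stand-in for any binary classifier such as an SVM)
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def predict_one_vs_all(X, centroids):
    # score every sample against every class model, then pick the
    # highest-scoring class (the "vote")
    scores = X @ centroids.T
    return np.argmax(scores, axis=1)

X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
y = np.array([0, 0, 1, 1])
models = fit_one_vs_all(X, y, n_classes=2)
preds = predict_one_vs_all(X, models)  # recovers the training labels
```

The key point is that any binary learner can be lifted to multi-class this way: train one scorer per class against the rest, then aggregate the scores at prediction time.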
Thesis
Machine learning is the study and design of algorithms that build models able to handle non-trivial tasks as well as or better than humans and, if possible, at a lower cost. These models are generally trained on a dataset in which each example describes an instance of the same task and is represented by a set of features and an outcome, or label, that we usually want to predict. A necessary ingredient for the success of any machine learning algorithm is the quality of the set of features describing the data, also called the data representation. In supervised learning, the more the features describing the examples are correlated with the label, the more effective the model will be. There are three main families of features: "observable" features, "handcrafted" features, and "latent" features, which are generally learned automatically from the training data. The contributions of this thesis fall into this last category.
More precisely, we are interested in the specific setting of learning a discriminative representation when the amount of data of interest is limited. A lack of data of interest can arise in different scenarios. First, we address the problem of imbalanced learning, with a class of interest composed of few examples, by learning a metric that induces a new representation space in which the learned models do not favor the majority examples. Second, we propose to handle a scenario with few available examples by simultaneously learning a relevant data representation and a model that generalizes well, by boosting models based on kernels and random Fourier features. Finally, to handle the domain adaptation scenario, in which the target set contains no labels while the source examples are acquired under different conditions, we propose to reduce the gap between the two domains by keeping only the most similar features that optimize the solution of an optimal transport problem between the two domains.
... In the 1990s, based on statistical learning theory, a new type of machine learning method, the SVM, was proposed [25]–[27]. SVM adopts the structural risk minimization (SRM) principle to train on the data, and uses Vapnik–Chervonenkis (VC) dimension theory to measure the structural risk. ...
Article
Full-text available
Autonomous Underwater Vehicles (AUV) mostly rely on an integrated navigation system consisting of a strap-down inertial navigation system (SINS) and a Doppler Velocity Log (DVL). The integrated system provides continuous and accurate navigation information compared to a standalone SINS or DVL. However, the dependency of DVL signals on the acoustic environment means that DVL malfunctions may be caused by marine organisms or strongly wave-absorbing material. This paper introduces a novel method utilizing Dempster-Shafer (DS) theory augmented by Least Squares Support Vector Machines (LSSVM), known as DS-LSSVM. The SINS and DVL data fusion is designed by DS theory, whereas the LSSVM models the SINS error. A virtual DVL is built by the proposed DS-LSSVM method, which makes autonomous underwater vehicle navigation possible during long-term DVL outages. The test results demonstrate the effectiveness of the proposed virtual DVL signal estimation method. In addition, the positioning accuracy of the proposed method outperforms that of the LSSVM method.
... In data science, support vector machines (SVMs) are supervised learning models with associated learning algorithms that analyze data. The support vector machine can also be used as a regression tool, retaining all the main features that characterize the algorithm (maximal margin) [10]. Support Vector Regression (SVR) uses the same concept as SVM classification, with only a few minor variations. ...
... SVM was chosen as a replacement for the dense layers, as it has shown good performance when predicting data with a high-dimensional feature space [6][7]. Moreover, some studies have shown that SVM is as powerful as dense layers at recognizing patterns [22,24]. Thus, instead of using dense layers in the CNN, we use SVM as a substitute. ...
Article
Protein secondary structure prediction is one of the problems in the bioinformatics field, conducted to determine the function of proteins. Protein secondary structure prediction is done by classifying each sequence of the protein primary structure into the sequence of the protein secondary structure, which falls within sequence labelling problems and can be solved with machine learning. Convolutional Neural Networks (CNN) and Support Vector Machines (SVM) are two methods often used to solve classification problems. In this research, we propose a novel hybrid of a 1-dimensional CNN and SVM for sequence labelling, specifically to predict the secondary structure of the protein. Our hybrid model managed to outperform previous studies in terms of Q3 and Q8 accuracy on the CB513 dataset.
... In data science, support vector machines (SVMs) are supervised learning models with associated learning algorithms that analyze data. The support vector machine can also be used as a regression tool, retaining all the main features that characterize the algorithm (maximal margin) [10]. Support Vector Regression (SVR) uses the same concept as SVM classification, with only a few minor variations. ...
Article
Full-text available
Energy is one of the key inputs for a country's economic growth and social development. The analysis and modeling of industrial energy are currently time-intensive processes, because more and more energy is consumed for economic growth in industrial factories. Industrial energy consumption analysis and prediction play a very important role in improving energy utilization rates and profitability for industrial companies and factories. This study aims to present and analyze predictive models of a data-driven system for appliance energy use. The paper addresses the filtering out of non-predictive parameters and the ranking of features. With repeated cross-validation, three statistical models were trained and tested on a test set: 1) a general linear regression model (GLM), 2) a support vector machine with the radial kernel (SVM RBF), and 3) a boosting tree (BT). The performance of the prediction models was measured by R² error, root mean squared error (RMSE), mean absolute error (MAE), and coefficient of variation (CV). The best model from the study is the support vector machine (SVM), which provided an R² of 0.86 for the training data set and 0.85 for the testing data set, with a low coefficient of variation.
... The concept of SVM was introduced under the name of the generalized portrait method by Vapnik and Lerner in 1963 [130]. Subsequently, through the development of statistical learning theory, Vapnik arrived at the current form of SVM [131]–[134]. In that form, SVM was first implemented for classification purposes. ...
Article
Full-text available
This paper introduces a methodology for predicting and mapping surface motion beneath road pavement structures caused by environmental factors. Persistent Scatterer Interferometric Synthetic Aperture Radar (PS-InSAR) measurements, geospatial analyses, and Machine Learning Algorithms (MLAs) are employed for achieving the purpose. Two single learners, i.e., Regression Tree (RT) and Support Vector Machine (SVM), and two ensemble learners, i.e., Boosted Regression Trees (BRT) and Random Forest (RF) are utilized for estimating the surface motion ratio in terms of mm/year over the Province of Pistoia (Tuscany Region, central Italy, 964 km2), in which strong subsidence phenomena have occurred. The interferometric process of 210 Sentinel-1 images from 2014 to 2019 allows exploiting the average displacements of 52,257 Persistent Scatterers as output targets to predict. A set of 29 environmental-related factors are preprocessed by SAGA-GIS, version 2.3.2, and ESRI ArcGIS, version 10.5, and employed as input features. Once the dataset has been prepared, three wrapper feature selection approaches (backward, forward, and bi-directional) are used for recognizing the set of most relevant features to be used in the modeling. A random splitting of the dataset in 70% and 30% is implemented to identify the training and test set. Through a Bayesian Optimization Algorithm (BOA) and a 10-Fold Cross-Validation (CV), the algorithms are trained and validated. Therefore, the Predictive Performance of MLAs is evaluated and compared by plotting the Taylor Diagram. Outcomes show that SVM and BRT are the most suitable algorithms; in the test phase, BRT has the highest Correlation Coefficient (0.96) and the lowest Root Mean Square Error (0.44 mm/year), while the SVM has the lowest difference between the standard deviation of its predictions (2.05 mm/year) and that of the reference samples (2.09 mm/year). Finally, algorithms are used for mapping surface motion over the study area. 
We propose three case studies on critical stretches of two-lane rural roads for evaluating the reliability of the procedure. Road authorities could consider the proposed methodology for their monitoring, management, and planning activities.
... III. SHALLOW MODEL The Support Vector Machine (SVM) was proposed by Vladimir Vapnik and his co-workers based on statistical learning theory (or VC theory) [9]–[17]. The SVM has shown competitive generalization ability over many existing machine learning models in a number of fields, e.g. ...
... Since its introduction, this algorithm has gained impressive popularity due to its solid theoretical foundation. It was developed by Vapnik et al. [40], [39] to implement principles from statistical learning theory [47]. In the framework of statistical learning, learning refers to the estimation of a function based on a training dataset. ...
Book
Full-text available
This book covers in its first part the theoretical aspects of support vector machines and their functionality, and then, based on the discussed concepts, it explains how to fine-tune a support vector machine to yield highly accurate prediction results adaptable to any classification task. The introductory part is extremely beneficial to someone new to support vector machines, while the more advanced notions are useful for everyone who wants to understand the mathematics behind support vector machines and how to fine-tune them in order to generate the best predictive performance for a certain classification model.
... Since its introduction, this algorithm has gained impressive popularity due to its solid theoretical foundation. This algorithm has been developed by Vapnik et al. [90], [89] to implement principles from statistical learning theory [99]. For a two-class classification problem, the SVM algorithm is trained so that the decision function maximizes the generalization task [99], [24], i.e. the initial m-dimensional space of the problem x, called the input space, is mapped into the n-dimensional feature space z, in which the classes become linearly separable. ...
Book
Full-text available
This book presents a CRISP-DM data mining project for implementing a classification model that achieves a predictive performance very close to the ideal model, namely 99.70%. This model yields such high accuracy mainly due to the proprietary architecture of the machine learning algorithm used. We implement a support vector machine which is improved using multiple techniques from the literature. A detailed theoretical explanation is offered regarding support vector machines, learning algorithms and several optimization algorithms, and each decision taken in building the final architecture is motivated. To demonstrate the predictive performance of our classification model, we use a synthetic telecommunications dataset that contains call detail records (CDR) for 3,333 customers, with 21 independent variables and one dependent variable indicating the past behavior of these customers with respect to churn. This is a generic dataset frequently used in research as a benchmark for testing different architectures of machine learning algorithms proposed for classification. The methodology presented in this book is scalable to datasets with hundreds of thousands of instances and hundreds or thousands of variables coming from various industries such as telecommunications, finance, astronomy, biotech, marketing, healthcare, and many others, and can be applied to any real-world classification problem.
... The two small datasets (the learned normal and anomalous datasets) are combined in order to form a labelled compressed representation of the unlabelled data. The concept of this compressed dataset is slightly similar to the concept of the set of support vectors built by a support vector machine (SVM) [31]. ...
Article
Full-text available
Supervisory control and data acquisition (SCADA) systems monitor and supervise our daily infrastructure systems and industrial processes. Hence, the security of the information systems of critical infrastructures cannot be overstated. The effectiveness of unsupervised anomaly detection approaches is sensitive to parameter choices, especially when the boundaries between normal and abnormal behaviours are not clearly distinguishable. Therefore, the current approach to detecting anomalies in SCADA is based on the assumptions by which anomalies are defined; these assumptions are controlled by a parameter choice. This paper proposes an add-on anomaly threshold technique to identify the observations whose anomaly scores are extreme and significantly deviate from others; such observations are then assumed to be "abnormal". The observations whose anomaly scores are significantly distant from the "abnormal" ones are assumed to be "normal". Then, ensemble-based supervised learning is proposed to find a global and efficient anomaly threshold using the information of both "normal"/"abnormal" behaviours. The proposed technique can be used with any unsupervised anomaly detection approach to mitigate the sensitivity of such parameters and improve the performance of SCADA unsupervised anomaly detection approaches. Experimental results confirm that the proposed technique achieved a significant improvement compared to the state-of-the-art of two unsupervised anomaly detection algorithms.
... The sphere is characterized by center a and radius R > 0. We minimize the volume of the sphere by minimizing R 2 , and demand that the sphere contains all training objects x i . This is identical to the approach which is used in Schölkopf, Burges, and Vapnik (1995) to estimate the VC-dimension of a classifier (which is bounded by the diameter of the smallest sphere enclosing the data). At the end of this section we will show that this approach gives solutions similar to the hyperplane approach of Schölkopf et al. (1999b). ...
Article
Data domain description concerns the characterization of a data set. A good description covers all target data but includes no superfluous space. The boundary of a dataset can be used to detect novel data or outliers. We will present the Support Vector Data Description (SVDD) which is inspired by the Support Vector Classifier. It obtains a spherically shaped boundary around a dataset and analogous to the Support Vector Classifier it can be made flexible by using other kernel functions. The method is made robust against outliers in the training set and is capable of tightening the description by using negative examples. We show characteristics of the Support Vector Data Descriptions using artificial and real data.
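The enclosing-sphere idea behind SVDD can be illustrated with a naive version: center the sphere and take the smallest radius covering the training data, then flag points outside it as novel. The actual SVDD optimizes the center a to minimize R² and allows slack for outliers, so this is only a simplified sketch of the geometry, not the SVDD algorithm itself:

```python
import numpy as np

def fit_naive_sphere(X):
    # naive data description: mean as center, radius = distance to the
    # farthest training point (SVDD instead optimizes the center to
    # minimize R^2, with slack variables for outliers)
    center = X.mean(axis=0)
    radius = np.linalg.norm(X - center, axis=1).max()
    return center, radius

def is_novel(x, center, radius):
    # a point outside the sphere is treated as novel / outlier
    return np.linalg.norm(x - center) > radius

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
center, radius = fit_naive_sphere(X)
inside = is_novel(np.array([0.5, 0.5]), center, radius)   # False
outside = is_novel(np.array([5.0, 5.0]), center, radius)  # True
```

As with the SVDD proper, replacing the Euclidean distance with a kernel-induced distance makes the spherical boundary flexible in the input space.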
... The SVM evolved in 1995 to find the best hyperplane for classifying data. 27 SVM uses a linear model to implement nonlinear class boundaries by nonlinearly mapping the independent vector x into a high-dimensional feature space. A linear model constructed in the new space may represent a nonlinear decision boundary in the original space. ...
Article
Full-text available
Rental bike sharing is an urban mobility model that is affordable and eco-friendly. The public bike sharing model has been widely used in several cities across the world over the past decade. Because bike use is rising constantly, predicting system demand is important to boost operational readiness. This article presents a prediction model to meet user demands and support efficient operations for rental bikes using Random Forest (RF), a homogeneous ensemble method. The approach is carried out in Seoul, South Korea, to predict the hourly use of rental bikes. RF is compared with Support Vector Machine with Radial Basis Function Kernel, k-nearest neighbor, and Classification and Regression Trees to verify RF's superiority in rental bike demand prediction. A performance index measures the efficiency of RF compared to the other predictive models. Also, a variable importance analysis is performed to assess the most important characteristics in different seasons by creating a predictive model using RF for each season. The results show that the influence of variables changes depending on the season, suggesting different operating conditions. RF models trained yearly and season-wise show that bike sharing demand prediction can be further improved by considering seasonal changes.
... The Support Vector Machine (SVM) was developed in 1996 to find the best hyperplane for classifying data [18]. The data are transformed into a high-dimensional feature space, in which data can be linearly separated through kernel functions, to differentiate two groups that are not linearly separable in the original space. ...
Article
Currently, rental bikes are introduced in many urban cities to enhance mobility comfort. It is important to make rental bikes available and accessible to the public at the right time, as this lessens waiting time. Eventually, providing the city with a stable supply of rental bikes becomes a major concern. The crucial part is predicting the bike count required at each hour for a stable supply of rental bikes. Data mining techniques are employed to overcome the hurdles in predicting hourly rental bike demand. This paper discusses models for hourly rental bike demand prediction. The data used include weather information (temperature, humidity, wind speed, visibility, dew point, solar radiation, snowfall, rainfall), the number of bikes rented per hour, and date information. The paper also explores a feature-filtering approach to eliminate non-predictive parameters and ranks the features based on their prediction performance. Five statistical regression models were trained with their best hyperparameters using repeated cross-validation, and the performance is evaluated using a testing set: (a) Linear Regression, (b) Gradient Boosting Machine, (c) Support Vector Machine (Radial Basis Function Kernel), (d) Boosted Trees, and (e) Extreme Gradient Boosting Trees. When all the predictors are employed, the best model, Gradient Boosting Machine, gives the highest R² value of 0.96 in the training set and 0.92 in the test set. Furthermore, several analyses are carried out with the Gradient Boosting Machine using different combinations of predictors to identify the most significant predictors and the relationships between them.
... Thus, kernel tricks (Cristianini and Shawe-Taylor, 2000) were explored (to positive effect). The Gaussian kernel is often recommended as a good initial kernel to try as a baseline (Schölkopf et al., 1995;Joachims, 1998). It is formulated using a radial basis function (RBF), only dependent on a measure of distance. ...
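Because the Gaussian/RBF kernel mentioned above depends only on a measure of distance, the full Gram matrix can be built from pairwise squared distances alone; a small sketch (gamma is an illustrative parameter, not a value from the cited works):

```python
import numpy as np

def rbf_gram(X, gamma=1.0):
    # pairwise squared distances via the expansion
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    # clip tiny negative values caused by floating-point round-off
    return np.exp(-gamma * np.maximum(d2, 0.0))

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = rbf_gram(X)
# the diagonal is all ones (zero distance to itself) and K is symmetric
```

This distance-only dependence is one reason the Gaussian kernel is a reasonable default baseline: it makes no assumption about the direction or scale of individual features beyond the chosen gamma.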
... Among all the data collected, some are irrelevant to the targets, inter-dependent, or noisy with outliers, leading to an inefficient or even intractable training procedure and, more seriously, poor generalization capability. The Support Vector Machine (SVM) was developed at AT&T Bell Laboratories by Vladimir Vapnik and his co-workers [2], [3], [4], [5], [6], [7] based on statistical learning theory (or VC theory) [8], [9], [10]. The SVM has shown competitive generalization ability over many existing machine learning models in various fields, e.g. ...
Article
The paper investigates all (971) non-executed pending leading cases of the European Court of Human Rights (ECtHR) between 2012 and 2020 through Machine Learning (ML) techniques. Drawing on the extant scholarship, our interest in compliance has centred on state-level and case-level variables. For the identification of important variables, four databases have been used. Each country party to the European Convention on Human Rights (ECHR) received 232 distinct factors for eight years. Since we aim to estimate parameters for a high-dimensional data set, Simulated Annealing (SA) is employed as the feature selection method. In the state-level analysis, a Support Vector Regression (SVR) model has been applied, yielding the coefficients of the variables found to be important in explaining non-compliance with ECtHR decisions. For the case-level analysis, different clustering techniques have been utilized and the countries have been grouped into four different clusters. We have found that states with relatively high levels of equality before the law, protection of individual liberties, social class equality with regard to enjoying civil liberties, access to justice, and free and autonomous election management arrangements are less susceptible to non-compliance with the decisions of the ECtHR. For the case-level analysis, the type of violated rights, the existence of dissent in the decision, and dissenting votes of national judges for their appointing states affect the compliance behaviour of the states. In addition, a notable result of the research is that if a national judge casts a dissenting vote against a violation judgment of the ECtHR involving the state that appointed him/her, the judgment is likely not to be executed by the respondent state.
Article
Keratoconus is the most common primary ectasia; as treatment is not easy, its early diagnosis is essential. The main goal of this study is to develop a method for the classification of specific types of corneal shapes where 55 Zernike coefficients (angular index m = 9) are used as inputs. We describe and apply six Machine Learning (ML) classification methods, and an ensemble of them, to objectively discriminate between keratoconic and non-keratoconic corneal shapes. Earlier attempts by other authors have successfully implemented several machine learning models using different parameters (usually, indirect measurements) and have obtained positive results. Given the importance and ubiquity of Zernike polynomials in the eye care community, our proposal should be a suitable choice to incorporate into current methods and might serve as a prescreening test. In this project we work with 475 corneas, classified by experts into two groups: 50 keratoconic and 425 non-keratoconic. All six models yield highly rated results, with accuracies above 98%, precisions above 97%, and sensitivities above 93%. Also, by building an ensemble of the models, we further improve the accuracy of our classification; for example, we found an accuracy of 99.7%, a precision of 99.8% and a sensitivity of 98.3%. The model can be easily implemented in any system and is very simple to use, thus providing ophthalmologists with an effortless and powerful tool to make a first diagnosis.
Article
Full-text available
Agriculture is considered one of the sectors most vulnerable to climate change. In addition to rainfed agriculture, irrigated crops such as rice have been developed in recent decades along the Senegal River. This new crop requires reliable information and monitoring systems. Remote sensing data have proven to be very useful for mapping and monitoring rice fields. In this study, a machine learning based rice classification system to recognize and categorize rice is proposed. Physical interpretations of rice against other land cover classes, in relation to the spectral signature, should identify the optimal periods for mapping rice plots using three machine learning methods: Support Vector Machine (SVM), Random Forest (RF), and Classification and Regression Trees (CART). The database is composed of field data collected by GPS and high-spatial-resolution (10 to 30 m) satellite data acquired between January and May 2018. The analysis of the spectral signatures of the different land cover classes shows that the ability to differentiate rice from other classes depends on the level of rice development. The results show the efficiency of the three classification algorithms, with overall accuracies and Kappa coefficients of (96.2%, 94.5%) for SVM, (97.6%, 96.5%) for CART, and (98%, 97.1%) for RF, respectively. An unmixing analysis was performed to verify the classification and compare the accuracy of these three algorithms according to their performance.
Article
Full-text available
Urban expansion is generally accompanied by a series of ecological problems; therefore, it is of great significance to strengthen research on urban expansion to effectively guide and control it. In this study, we used a one-class support vector machine (OCSVM) based on Landsat image data to extract the construction land area in Xiaogan City of Hubei Province (China) in 2000, 2005, 2010, 2015, and 2020. We analyzed the characteristics of construction land expansion and explored the driving mechanisms of construction land expansion in Xiaogan City. The results show that 1) the accuracy of the construction land information extracted by OCSVM was 91.46%, 90.02%, 89.31%, 92.23%, and 89.67%, respectively, which met the expected results and could be used for the study of expansion and driving mechanisms, proving that OCSVM is suitable for remote sensing image classification when only one class of features is to be extracted; 2) Xiaogan City's overall expansion over the 20 years is of the high-speed type, constrained by the terrain, with medium-speed expansion between 2000 and 2005 and high-speed expansion between 2005 and 2020, while the expansion intensity of Xiaogan City in all four time periods from 2000 to 2020 is of the slow type; and 3) among the main factors influencing urban expansion in Xiaogan, increases in population, economic development, the development of high-tech zones, and the construction of transportation lines increase the demand for construction land; government planning decisions provide the direction and scope for the expansion of construction land and are the leading factor among the drivers of construction land expansion.
Article
The energy consumption of air conditioning systems accounts for more than half of building energy consumption. Accurate and efficient forecasting of building cooling load is therefore an important route to building energy saving and energy system management. This study proposes a short-term hybrid forecasting model based on the Mean Impact Value-Improved Gray Wolf Optimizer-Support Vector Regression (MIV-IGWO-SVR) and applies it to ice-storage air conditioning cooling load forecasting. First, the impact of historical meteorological data on cooling load was analyzed, and the Mean Impact Value (MIV) was used to filter out the main variables affecting the cooling load. Second, two new improvement strategies were proposed for the Gray Wolf Optimizer (GWO), which is used for SVR parameter optimization. Finally, the model was verified using measured data from a large public building in northern China. The results show that the MAPE and RMSE of the MIV-IGWO-SVR model are 0.4319% and 4.8511, respectively. Comparative experiments indicate that the MIV-IGWO-SVR model has higher prediction accuracy, shorter running time, and stronger robustness than neural networks on small-sample data. The proposed forecasting model therefore provides effective methodological support for building energy consumption prediction and can meet actual engineering needs.
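The SVR stage of such a hybrid model hinges on tuning `(C, epsilon, gamma)`. The sketch below substitutes a plain grid search for the paper's improved Gray Wolf Optimizer, and a synthetic sinusoid for the measured cooling-load series; both are illustrative assumptions, not the authors' method.

```python
# SVR hyperparameter tuning sketch. Grid search stands in for the paper's
# IGWO metaheuristic; the "cooling load" series is a toy sinusoid.
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
t = np.linspace(0, 6 * np.pi, 300)
load = np.sin(t) + 0.1 * rng.normal(size=t.size)  # toy "cooling load"
X = t.reshape(-1, 1)

# Any search strategy (grid, random, GWO, ...) explores the same space:
# the SVR penalty C, the epsilon-tube width, and the RBF kernel width gamma.
search = GridSearchCV(
    SVR(kernel="rbf"),
    param_grid={"C": [1, 10, 100], "epsilon": [0.01, 0.1], "gamma": [0.1, 1.0]},
    cv=3,
)
search.fit(X, load)
best_params = search.best_params_
```

Metaheuristics like GWO are typically preferred over exhaustive grids when the parameter space is continuous and each fit is expensive.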
Chapter
The autonomous city implies a global vision that incorporates artificial intelligence, deep learning, big data, decision-making, ICT, and the Internet of Things (IoT) to promote sustainable development. Population ageing is an issue to which researchers, companies, and governments should devote effort by developing smart health care and innovative technologies and applications. For a long time, conventional intelligent systems have played a critical role in health care; however, with the increased popularity and widespread use of hybrid intelligent computer systems, the healthcare sector is undergoing a significant shift. The diagnosis and detection of various diseases can be addressed with these techniques. Novel methods are being applied in biomedical engineering to diagnose diseases, and new models are being studied and compared with existing technologies. Hybrid intelligent systems can be applied to decision-making, remote monitoring, healthcare logistics, medical diagnosis, and modern information systems. The fundamental cause of this success appears to derive from different intelligent computational mechanisms, such as genetic algorithms, evolutionary computation, convolutional neural networks (CNN), long short-term memory (LSTM), autoencoders, deep generative models, and deep belief networks. Solving complex problems requires domain knowledge comprising the methodologies that provide hybrid systems with complementary reasoning and empirical data. This chapter focuses on the need for hybrid intelligent systems in the healthcare industry and their applicability to medical diagnosis.
Chapter
It is still hard to deal with artifacts in magnetic resonance images (MRIs), particularly when the latter are to be segmented. This paper introduces a novel feature, the spatial entropy of intensity, that allows a pattern-based representation which enhances MRI segmentation despite the presence of high levels of noise and intensity non-uniformity (INU) within the MRI data. Moreover, we show that ensembles of classifiers used with the proposed feature significantly enhance structured MRI segmentation. To conduct the experiments, MRIs with different artifact levels were extracted from the BrainWeb MRI database. The results reveal that the proposed feature, especially when used with ensembles of classifiers, significantly enhances overall MRI segmentation.
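One plausible reading of a "spatial entropy of intensity" feature is the Shannon entropy of the intensity histogram in a sliding window, which responds to local texture rather than absolute intensity and is therefore less sensitive to INU. The sketch below implements that reading on a synthetic image; it is not the authors' exact definition.

```python
# Illustrative local-entropy feature (assumed definition, not the paper's):
# per-pixel Shannon entropy of the intensity histogram in a small window.
import numpy as np

def local_entropy(img, window=5, bins=16):
    """Entropy of intensities in a window x window neighborhood of each pixel."""
    pad = window // 2
    padded = np.pad(img, pad, mode="reflect")
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            patch = padded[i:i + window, j:j + window]
            hist, _ = np.histogram(patch, bins=bins, range=(0, 1))
            p = hist / hist.sum()
            p = p[p > 0]
            out[i, j] = -(p * np.log2(p)).sum()
    return out

rng = np.random.default_rng(0)
flat = np.full((16, 16), 0.5)        # homogeneous region: entropy should be low
noisy = rng.uniform(size=(16, 16))   # textured/noisy region: entropy should be high
e_flat = local_entropy(flat).mean()
e_noisy = local_entropy(noisy).mean()
```

A homogeneous region concentrates all its mass in one histogram bin (entropy 0), while a textured region spreads across bins, which is the contrast a segmenter can exploit.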
Thesis
Statistical learning theory is a field of inferential statistics whose foundations were laid by Vapnik at the end of the 1960s. It is considered a subdomain of artificial intelligence. In machine learning, support vector machines (SVMs) are supervised learning models with associated learning algorithms that analyze data for classification and regression. In this thesis, we propose two new statistical learning contributions: one on the design and evaluation of a multi-class SVM extension, and another on the design of a new kernel for support vector machines. First, we introduce a new kernel machine for multi-class pattern recognition: the hyperbolic support vector machine. Geometrically, it is characterized by the fact that its decision boundaries in the feature space are defined by hyperbolic functions. We then establish its main statistical properties. Among these, we show that the classes of component functions are uniform Glivenko-Cantelli classes, by establishing an upper bound on the Rademacher complexity. Finally, we establish a guaranteed risk for our classifier. Second, we construct a new kernel based on the Fourier transform of a Gaussian mixture model. We proceed as follows: first, each class is fragmented into a number of relevant subclasses; then we consider the directions given by the vectors connecting all pairs of subclass centers of the same class, excluding those that connect two subclasses of different classes. This can also be seen as a search for translation invariance within each class. The kernel was tested successfully on several datasets in the context of multi-class support vector machine learning.
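The general construction behind a "Fourier transform of a Gaussian mixture" kernel can be sketched via Bochner's theorem: a Gaussian mixture spectral density yields a translation-invariant kernel of spectral-mixture form. The mixture parameters below are arbitrary placeholders, not the thesis's subclass-derived directions, and scikit-learn's `SVC` is used only to show the kernel plugging into an SVM.

```python
# Spectral-mixture kernel sketch (assumed form, not the thesis's exact kernel):
# k(x, y) = sum_m w_m * exp(-0.5 * sigma_m^2 * ||x - y||^2) * cos(mu_m . (x - y))
import numpy as np
from sklearn.svm import SVC

MU = np.array([[1.0, 0.0], [0.0, 2.0]])  # spectral means (arbitrary assumption)
SIGMA = np.array([0.5, 1.0])             # spectral widths (arbitrary assumption)
W = np.array([0.6, 0.4])                 # mixture weights, sum to 1

def gm_kernel(X, Y):
    """Gram matrix of the translation-invariant Gaussian-mixture kernel."""
    D = X[:, None, :] - Y[None, :, :]          # pairwise differences
    sq = (D ** 2).sum(-1)
    k = np.zeros(sq.shape)
    for w, mu, s in zip(W, MU, SIGMA):
        k += w * np.exp(-0.5 * (s ** 2) * sq) * np.cos(D @ mu)
    return k

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 2))
y = (X[:, 0] > 0).astype(int)
clf = SVC(kernel=gm_kernel).fit(X, y)          # custom callable kernel
train_acc = clf.score(X, y)
K = gm_kernel(X, X)
```

Since the kernel depends only on x - y and its spectral density is nonnegative, the Gram matrix is positive semi-definite, which is what licenses its use inside an SVM.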
Article
In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for training SV machines, covering both the quadratic (or convex) programming part and advanced methods for dealing with large datasets. Finally, we mention some modifications and extensions that have been applied to the standard SV algorithm, and discuss the aspect of regularization from a SV perspective.
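A central idea in SV function estimation is the epsilon-insensitive loss: points whose residual stays inside the epsilon tube contribute nothing to the solution, so only a sparse subset of the data becomes support vectors. A small sketch on a synthetic regression problem (the data and settings are illustrative):

```python
# Epsilon-insensitive SV regression sketch: the fitted function depends
# only on the points outside the epsilon tube (the support vectors).
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(-3, 3, size=(120, 1)), axis=0)
y = np.sinc(X.ravel()) + 0.05 * rng.normal(size=120)

# With noise std 0.05, an epsilon of 0.1 leaves most residuals inside the tube.
svr = SVR(kernel="rbf", C=10.0, epsilon=0.1).fit(X, y)
n_support = len(svr.support_)          # indices of the support vectors
sparsity = 1.0 - n_support / len(X)    # fraction of points the solution ignores
```

This sparsity is the same phenomenon the head paper observes for classification: the decision function is carried by a small subset of the database.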
Chapter
Full-text available
Among the objectives of artificial intelligence techniques are computer-aided diagnosis systems that support preventive medical check-ups and perform pattern detection, recognition, and classification. These techniques have recently emerged in different areas, particularly in medical imaging. The medical image is an important source of information and a golden tool for the diagnosis and assessment of a pathological analysis process. In this chapter, a Computer-Aided Diagnosis (CAD) system is proposed for the detection and diagnosis of breast cancer. It is mainly composed of the following steps: preprocessing the mammographic image, segmenting the suspect region using the Chan-Vese model, extracting global and local descriptors, and classifying the image as malignant or benign using a Support Vector Machine (SVM) classifier. Analysis of mammographic images with the proposed system, using a chosen subset of local descriptors after tumor segmentation, leads to a classification into malignant and benign mammograms. The proposed system achieves 92% accuracy.
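The final classification stage of such a CAD pipeline can be sketched as a scaler plus SVM applied to descriptor vectors. The descriptors below are synthetic Gaussians standing in for the chapter's global/local mammogram descriptors; the 12-dimensional size and class separation are invented for illustration.

```python
# CAD classification-stage sketch: descriptor vectors -> standardize -> SVM.
# The descriptor vectors are synthetic stand-ins, not real mammogram features.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
benign = rng.normal(0.0, 1.0, size=(60, 12))     # 12-D descriptor vectors
malignant = rng.normal(1.5, 1.0, size=(60, 12))
X = np.vstack([benign, malignant])
y = np.array([0] * 60 + [1] * 60)                # 0 = benign, 1 = malignant

# Standardizing descriptors before an RBF-SVM keeps no single descriptor
# from dominating the kernel distance.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
cv_acc = cross_val_score(clf, X, y, cv=5).mean()
```

Cross-validated accuracy, rather than a single train/test split, is the usual way to report a figure like the chapter's 92% on small medical datasets.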