Article

An efficient hybrid approach based on PSO, ACO and k-means for cluster analysis


Abstract

Clustering is a popular data analysis and data mining technique. A widely used clustering method is k-means, which partitions the data into K clusters. However, the k-means algorithm depends heavily on its initial state and tends to converge to a local optimum. This paper presents a new hybrid evolutionary algorithm to solve the nonlinear partitional clustering problem. The proposed algorithm, called FAPSO-ACO–K, combines FAPSO (fuzzy adaptive particle swarm optimization), ACO (ant colony optimization) and k-means in order to find a better cluster partition. The performance of the proposed algorithm is evaluated on several benchmark data sets. The simulation results show that, for the partitional clustering problem, the proposed algorithm performs better than other algorithms such as PSO, ACO, simulated annealing (SA), the combination of PSO and SA (PSO–SA), the combination of ACO and SA (ACO–SA), the combination of PSO and ACO (PSO–ACO), genetic algorithm (GA), Tabu search (TS), honey bee mating optimization (HBMO) and k-means.
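The dependence of k-means on its initial state can be illustrated with a small sketch. The code below is a plain Lloyd-style k-means written for illustration only; it is not the FAPSO-ACO–K algorithm, and the names `kmeans`, `sq_dist`, the `seed` parameter and the toy data are assumptions introduced here. Each seed selects different starting centers, so the returned sum of squared errors (SSE) may differ between runs.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, seed, iters=100):
    """Plain k-means; the seed controls which points become the initial centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initial state
    for _ in range(iters):
        # Assignment step: attach every point to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            groups[j].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:  # converged, possibly to a local optimum
            break
        centers = new_centers
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse

# Hypothetical toy data: two well-separated groups of three points.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

# Restarting from several seeds and keeping the lowest SSE is the simplest
# remedy; hybrid methods replace the restarts with a guided global search.
best_centers, best_sse = min((kmeans(data, 2, s) for s in range(5)),
                             key=lambda result: result[1])
```

Random restarts only sample initial states blindly, which is the gap that metaheuristic hybrids such as the one described in the abstract aim to close.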


... Due to which, the optimization algorithm suffers from local optima traps and initialization sensitivity. Clustering algorithms have been widely used [71,72,75] to assist metaheuristic algorithms in escaping from local optima traps by exploring and finding diverse features. In comparison to other metaheuristic search algorithms, WOA, as one of the newly proposed metaheuristic search algorithms, has a unique capability of searching best features through hunting. ...
... To improve the performance of the original metaheuristic algorithm-based clustering methods, various authors have modified metaheuristic search algorithms. Niknam and Amiri [71] proposed a hybrid evolutionary clustering model, FAPSO-ACO-K, by combining three traditional algorithms, namely fuzzy adaptive PSO (FAPSO), ACO, and k-means algorithm. Four artificial and six UCI data sets were used to test the proposed model. ...
Article
Full-text available
A brain-computer interface (BCI) based on an electroencephalograph (EEG) establishes a new channel of communication between the human brain and a computer. Redundant, noisy, and irrelevant channels lead to high computational costs and poor classification accuracy. Therefore, an effective feature selection technique for determining the optimal number of channels can improve BCI performance. However, existing meta-heuristic algorithms are prone to getting trapped in local optima on high-dimensional datasets. Thus, to reduce dimensionality, address inter-subject variation and choose an optimal subset of channels, a novel framework called Component Loading followed by Clustering and Classification (CLCC) is proposed in this paper. This framework is further divided into two experimental configurations: CLCC with Feature Selection (CLCC-FS) and CLCC without Feature Selection (CLCC-WFS). All these frameworks have been implemented on a motor imagery (MI) EEG dataset of 10 subjects in order to choose the best subset of channels. Further, seven different classifiers have been employed to assess the performance. Experimental outcomes show that, among the feature selection techniques compared, our proposed algorithm, CLCC-FS Opposition-Based Whale Optimization Algorithm (CLCC-FS(OBWOA)), performed substantially better than the others. We demonstrate that the proposed algorithm is able to achieve 99.6% accuracy using only a few channels and can improve the practicality of the BCI system by reducing the computation cost.
... Additionally, it outperformed a number of currently used clustering methods, including KM clustering, PSO-based clustering, and ACO-based clustering. Niknam and Amiri [39] Zhou and Li [41] developed two improved versions of FA for data clustering. The first method is named Greedy Probabilistic Firefly KM (GPFK), while the second one is so-called Probabilistic Firefly KM (PFK). ...
... 5. Macro-average F-score (Fscore_M): This performance measure is a broadly employed performance index, which is computed with reference to the macro-average of precision and recall outcomes [55], as defined in Eqs. (39), (40) and (41). ...
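Equations (39) to (41) are not reproduced in the excerpt above, so the following is only a sketch under one common convention: per-class precision and recall are averaged over classes, and the macro F-score is the harmonic mean of those two averages. The function name `macro_f_score` and the example labels are assumptions introduced here; the cited paper may define the measure differently, for example as the mean of per-class F-scores.

```python
def macro_f_score(y_true, y_pred):
    """Harmonic mean of macro-averaged precision and macro-averaged recall."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls = [], []
    for c in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        # A class that is never predicted (or never present) contributes 0.
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    p_macro = sum(precisions) / len(labels)
    r_macro = sum(recalls) / len(labels)
    if p_macro + r_macro == 0:
        return 0.0
    return 2 * p_macro * r_macro / (p_macro + r_macro)

# Hypothetical labels: one sample of class 0 is misclassified as class 1.
score = macro_f_score([0, 0, 1, 1], [0, 1, 1, 1])  # equals 15/19
```

In the example, macro precision is 5/6 and macro recall is 3/4, which gives 15/19 for the score.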
Article
Full-text available
This work presents the Hybrid Capuchin Search Algorithm (HCSA), a meta-heuristic method that addresses the local optima traps and initialization sensitivity of the K-means clustering technique. This study was inspired by the popularity and permanence of meta-heuristics in presenting convincing solutions, which sparked various efficient methods and computational tools to tackle difficult and practical real-world problems. The movement behavior of CSA is strengthened using the Chameleon Swarm algorithm to support the search agents of CSA to more effectively explore and exploit each potential region of the search space. This increases the capacity of both exploitation and exploration of the traditional CSA. Besides, the search agents of CSA utilize the rotation mechanism in CS to migrate to new spots outside the nearby regions to perform global search. This mechanism improves the search proficiency of CSA as well as the intensification and diversity abilities of the search agents. These enhancements expand the exploitation potential of CSA and broaden the range of search scopes, sizes, and directions in conducting clustering activities. A total of 16 different datasets from diverse sources, each with a different level of complexity, characteristics, and dimension, are used to assess the performance of the developed HCSA method on clustering tasks. According to the experimental results, the proposed HCSA performs statistically significantly better than the K-means clustering algorithm and eight meta-heuristic-based clustering algorithms in terms of both distance and performance metric measures.
... • Artificial dataset 1 (Art 1) has two features and four distinct classes [44]. • Artificial dataset 2 (Art 2) has three features, five classes and 250 data samples [44]. ...
Article
Full-text available
Data clustering is a technique for dividing data objects into groups based on their similarity. K-means is a simple, effective algorithm for clustering. However, k-means tends to converge to local optima and depends on the initial values of the cluster centers. To address the shortcomings of the k-means algorithm, many nature-inspired techniques have been used. This paper offers an improved version of bacterial colony optimization (BCO) based on the opposition-based learning (OBL) algorithm, called OBL + BCO, for data clustering. OBL is used to increase the convergence rate and searching ability of BCO by computing the opposite solution to the present solution. The strength of the proposed data clustering technique is evaluated using several well-known UCI benchmark datasets. Different performance measures are considered to analyze the strength of the proposed OBL + BCO, such as the Rand index, Jaccard index, Beta index, Distance index, objective values, and computational time. The experimental results demonstrated that the proposed OBL + BCO data clustering technique outperformed other data clustering techniques.
... • Artificial dataset 1 (Art 1) has two features and four distinct classes [44]. • Artificial dataset 2 (Art 2) has three features, five classes and 250 data samples [44]. ...
Article
Full-text available
Data clustering is a technique for dividing data objects into groups based on their similarity. K-means is a simple, effective algorithm for clustering. However, k-means tends to converge to local optima and depends on the initial values of the cluster centers. To address the shortcomings of the k-means algorithm, many nature-inspired techniques have been used. This paper offers an improved version of bacterial colony optimization (BCO) based on the opposition-based learning (OBL) algorithm, called OBL + BCO, for data clustering. OBL is used to increase the convergence rate and searching ability of BCO by computing the opposite solution to the present solution. The strength of the proposed data clustering technique is evaluated using several well-known UCI benchmark datasets. Different performance measures are considered to analyze the strength of the proposed OBL + BCO, such as the Rand index, Jaccard index, Beta index, Distance index, objective values, and computational time. The experimental results demonstrated that the proposed OBL + BCO data clustering technique outperformed other data clustering techniques.
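The idea of "computing the opposite solution to the present solution" can be sketched with the standard opposition-based learning rule, in which each coordinate x in the interval [lower, upper] is mirrored to lower + upper - x. This is a minimal illustration under that assumption; the abstract does not state which OBL variant is used, and the names `opposite_solution`, `keep_better` and the example fitness function are hypothetical.

```python
def opposite_solution(x, lower, upper):
    """Mirror each coordinate inside its search interval: x' = lower + upper - x."""
    return [lo + hi - xi for xi, lo, hi in zip(x, lower, upper)]

def keep_better(x, lower, upper, fitness):
    """Evaluate a candidate and its opposite; keep whichever has lower fitness."""
    x_opp = opposite_solution(x, lower, upper)
    return x if fitness(x) <= fitness(x_opp) else x_opp

# Hypothetical example: minimize the squared distance to the point (9, 6).
def example_fitness(x):
    return (x[0] - 9.0) ** 2 + (x[1] - 6.0) ** 2

candidate = [1.0, 4.0]
chosen = keep_better(candidate, [0.0, 0.0], [10.0, 10.0], example_fitness)
```

Here the opposite of (1, 4) in the box [0, 10] x [0, 10] is (9, 6), which happens to be the minimizer, so the opposite point is kept. Checking both points costs one extra fitness evaluation per candidate.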
... Recently, there have been an increasing number of studies on improving the K-means algorithm, mainly combining an optimization algorithm with the K-means clustering algorithm. Niknam and Amiri (2010) proposed a hybrid algorithm based on particle swarm optimization (PSO), ant colony optimization (ACO), and the K-means algorithm to optimize clustering. Xu and Li (2011) and Xie and Li (2014) proposed a K-means optimized clustering algorithm based on the improved PSO algorithm. ...
... The clustering accuracy rate was higher than 80% through UCI dataset verification. Because of the PSO algorithm, it could maintain its random behaviour better than the artificial bee colony (ABC) algorithm in determining the global optimum, and the result was superior to that of the ABC algorithm (Niknam and Amiri, 2010). ...
Article
Full-text available
As minerals are a non-renewable resource, sustainability must be considered in their development and utilization. Evaluation of the mineral resources carrying capacity is necessary for the sustainable development of mineral resource-based regions. Following the construction of a comprehensive evaluation index system from four aspects, namely resource endowment, socio-economic status, environmental pollution, and ecological restoration, a method combining particle swarm optimization (PSO) and the K-means algorithm (PSO-Kmeans) was used to evaluate the mineral resources carrying capacity of the Panxi region in southwest Sichuan Province, China. The evaluation method is data-driven and does not consider the classification standards of different carrying capacity levels. At the same time, it avoids the problems of local optimization and sensitivity to initial points of the K-means algorithm, thereby providing more objective evaluation results and solving the problem of subjective division of each grade volume capacity in carrying capacity evaluation. The algorithm was verified using UCI datasets and virtual samples. By superimposing a single index on the carrying capacity map for analysis, the rationality of the evaluation results was validated.
... The algorithm is iterative in nature, it assigns random value ' ' to points on a designated cluster in such a manner that sum of squared distance ' ' from center of cluster is minimum. K-means algorithm is highly random as it depends on randomly selected initial centroids [4]. So, on each run it gives different outputs. ...
... Several approaches have been suggested to overcome the limitations of the K-means algorithm, and also to calculate cluster centers and compute the optimum number of clusters. Some of them are discussed here: Niknam et al. [4] presented a combined approach of fuzzy logic and K-means, devised based on the ant colony optimization method. It outperforms the traditional K-means algorithm and provides improved clustering for large ensembles of points. ...
... The experimental results showed a better response and a quicker convergence than ordinary evolutionary methods. In addition, the authors [35] also proposed a new hybrid evolutionary algorithm that combined the fuzzy adaptive particle swarm optimization (FAPSO), ACO, and K-means algorithms, which was called FAPSO-ACO-K. The performance of this algorithm was much better than the other algorithms for the partitional clustering problem. ...
Article
Full-text available
Data clustering has attracted the interest of scholars in many fields. In recent years, using heuristic algorithms to solve data clustering problems has gradually become a trend. The black hole algorithm (BHA) is one of the popular heuristic algorithms among researchers because of its simplicity and effectiveness. In this paper, an improved self-adaptive logarithmic spiral path black hole algorithm (SLBHA) is proposed. SLBHA introduces a logarithmic spiral path and a random vector path to BHA. At the same time, a parameter is used to control the randomness, which enhances the local exploitation ability of the algorithm. Besides, SLBHA designs a replacement mechanism to improve the global exploration ability. Finally, a self-adaptive parameter is introduced to control the replacement mechanism and maintain the balance between exploration and exploitation of the algorithm. To verify the effectiveness of the proposed algorithm, comparison experiments are conducted on 13 datasets using evaluation criteria that include the Jaccard coefficient and the Fowlkes and Mallows index. The proposed methods are compared with selected algorithms such as the whale optimization algorithm (WOA), compound intensified exploration firefly algorithm (CIEFA), improved black hole algorithm (IBH), etc. The experimental results demonstrate that the proposed algorithm outperforms the compared algorithms on both external criteria and quantization error of the clustering problem.
... However, clustering faces challenges in selecting suitable data representatives, handling diverse data, and dealing with distribution complexities. It's a computationally complex task in the class of NP-complete problems, aiming to minimize dissimilarity measures for identifying clusters in varied datasets [4]. Two fundamental approaches to data clustering include hierarchical clustering, which entails a tree-like division of data, and partition clustering. ...
Chapter
Full-text available
Addressing challenges in data clustering for diverse data types, we introduce the Chaos Mountain Gazelle Optimizer (CMGO). This enhanced Mountain Gazelle Optimizer (MGO) is tailored for K-means clustering solutions. Noticing a skew in MGO’s strategy distribution, we integrated a chaotic map into the Territorial Solitary Males strategy and omitted the Migration to Search for Food strategy. This adjustment increases exploration and curtails exploitation, improving CMGO's effectiveness in clustering complex datasets. We implemented the Gower distance technique to navigate K-means clustering's limitations with categorical and binary data. Tests on numeric, binary, categorical, and mixed data underscore the clustering's versatility. We evaluated CMGO against 14 algorithms on 28 UCI and OpenML datasets using the F-Measure metric and the tied rank test for statistical significance ranking. CMGO outperforms the original MGO and other tested algorithms in clustering pure numeric and categorical data, securing first place, and third for mixed data. Thus, CMGO emerges as a robust, efficient K-means optimizing method for complex, diverse datasets.
... However, this algorithm had a higher computational time as compared to GWO (with the side effect of TS). Niknam and Amiri proposed a hybrid algorithm by combining PSO, ACO, and k-means algorithm [40] to overcome the issues of a traditional k-means algorithm. Shelokar et al. [46] proposed an ant colony-based algorithm to address clustering problems. ...
Article
Full-text available
This paper presents a hybrid meta-heuristic algorithm using Grey Wolf optimization (GWO) and the JAYA algorithm for data clustering. The idea is to use the exploitative capability of the JAYA algorithm in the explorative phase of GWO to form compact clusters. Here, instead of using one best and one worst solution for generating offspring, the three best wolves and the three worst omega wolves of the population are used. So, the best wolves and worst omega wolves assist in moving the new solutions towards the best solutions and simultaneously help in staying away from the worst solutions. This enhances the chances of reaching near-optimal solutions. The proposed method is compared with five promising algorithms, namely GWO, Sine-Cosine Algorithm (SCA), Particle Swarm Optimization (PSO), JAYA and K-means. The results obtained from Duncan's multiple range test and the Nemenyi hypothesis-based statistical test confirm the superiority and robustness of the proposed method.
... It is worth noting that clustering algorithms typically require the setting of one or more parameters to improve their effectiveness [24]. For example, K-Means requires specifying the number of clusters, whereas DBSCAN needs two parameters: the neighborhood radius and minimum density. ...
Article
Full-text available
Short-term load forecasting (STLF) plays an important role in facilitating efficient and reliable operations of power systems and optimizing energy planning in the electricity market. To improve the accuracy of power load prediction, an adaptive clustering long short-term memory network is proposed to effectively combine the clustering process and prediction process. More specifically, the clustering process adopts the maximum deviation similarity criterion clustering algorithm (MDSC) as the clustering framework. A bee-foraging learning particle swarm optimization is further applied to realize the adaptive optimization of its hyperparameters. The prediction process consists of three parts: (i) a 9-dimensional load feature vector is proposed as the classification feature of SVM to obtain the load similarity cluster of the predicted days; (ii) the same kind of data are used as the training data of long short-term memory network; (iii) the trained network is used to predict the power load curve of the predicted day. Finally, experimental results are presented to show that the proposed scheme achieves an advantage in the prediction accuracy, where the mean absolute percentage error between predicted value and real value is only 8.05% for the first day.
... Several studies were conducted in order to improve K-means' initial centroid quality. Some of them use optimization methods, such as genetic algorithm (GA) [14], particle swarm optimization (PSO) [21], firefly algorithm (FA) [19], and a hybrid GA-PSO and fuzzy system [20]. Their proposed methods are superior to conventional K-means clustering. ...
Article
Full-text available
Riau province is one of the provinces in Indonesia where forest fires occur every year. Hotspot data are geothermal points that can be utilized as an indicator of forest fires. Clustering methods can be used to analyze potential forest fires from the cluster patterns of hotspot data. In this study, a hybrid of genetic algorithm polygamy with K-means (GAP K-means) was used for hotspot data clustering. GA polygamy was used to determine the initial centroids of K-means, in order to address the sensitivity of K-means to the initial centroids and to find the optimal solution faster. The performance of GAP K-means, GA K-means, and K-means was compared experimentally on the hotspot data, two artificial datasets, and three real-life datasets. Sum of squared errors (SSE), Davies-Bouldin index (DBI), silhouette coefficient (SC) and F-measure were used to evaluate the clustering. In this experiment, GAP K-means outperformed K-means, but it still converged more slowly than GA K-means.
... They determine the motion state of a particle; Step 2: The result (Euclidean distance) of each search is the particle fitness, and then record the individual historical optimal position and the group's historical optimal position; Step 3: The historical optimal positions of an individual and the group are equivalent to two forces, which jointly affect the motion state of a particle in combination with the inertia of the particle itself. The particle swarm optimization algorithm and k-means algorithm are combined to find better clustering centers that minimize the Euclidean distance as much as possible [6,7]. ...
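Steps 2 and 3 above can be sketched as a single particle update in which each particle encodes a flattened list of cluster centers. The sketch uses the textbook PSO velocity rule; the inertia weight `w`, the acceleration coefficients `c1` and `c2`, and the function names `pso_step` and `fitness` are assumptions introduced for illustration, not values or names taken from the cited paper.

```python
import random

def fitness(position, points, dim):
    """Sum of Euclidean distances from each point to its nearest center.

    `position` is a flat list holding k centers of `dim` coordinates each.
    """
    centers = [position[i:i + dim] for i in range(0, len(position), dim)]
    return sum(
        min(sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5 for c in centers)
        for p in points
    )

def pso_step(position, velocity, personal_best, global_best, rng,
             w=0.7, c1=1.5, c2=1.5):
    """One velocity and position update for a single particle.

    The new velocity combines the particle's inertia with the pull of its own
    best position and the pull of the swarm's best position.
    """
    new_position, new_velocity = [], []
    for x, v, pb, gb in zip(position, velocity, personal_best, global_best):
        r1, r2 = rng.random(), rng.random()
        v_new = w * v + c1 * r1 * (pb - x) + c2 * r2 * (gb - x)
        new_velocity.append(v_new)
        new_position.append(x + v_new)
    return new_position, new_velocity

# Hypothetical usage: one particle encoding a single 2-D center.
rng = random.Random(0)
points = [(0.0, 0.0), (3.0, 4.0)]
pos, vel = pso_step([0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [2.0, 2.0], rng)
```

After each step, the caller would re-evaluate `fitness` and refresh the personal and global best records; in a PSO and k-means hybrid, the best position found is typically handed to k-means as its initial centers.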
Article
Full-text available
As one sort of cultural relic, glassy antiques can effectively convey the historical information of a certain era and reveal the cultural exchanges in different regions as a symbol of foreign trade. However, due to long-term weathering and corrosion when buried in soil, the shape, color, and chemical components of a glassy antique can change considerably, and hence identifying it and recognizing its category is particularly difficult. Clustering is a popular technique of data analysis and data mining. K-means is one of the most popular data mining algorithms, as it is simple, scalable, and easy to modify in different contexts and fields of application. This paper uses k-means to find the clustering centers that show the characteristics of the chemical components of different categories of glassy antiques. Then the particle swarm optimization algorithm (PSO), which offers a globalized detailed search methodology, is utilized to improve the k-means clustering. Compared with the result of traditional k-means clustering, the new result shows better classification capacity. Finally, the paper compares the results of k-means with those of PSO-k-means and analyzes the advantages and disadvantages of PSO-k-means.
... First, random particles are created and their positions (x_i) and speeds (v_i) in the j-th dimension are determined. Particle locations [49,50] are updated using Equations (10) and (11). ...
Article
Full-text available
Recently, pre-trained deep learning (DL) models have been employed to tackle and enhance the performance on many tasks, such as skin cancer detection, instead of training models from scratch. However, the existing systems are unable to attain substantial levels of accuracy. Therefore, we propose, in this paper, a robust skin cancer detection framework to improve the accuracy by extracting and learning relevant image representations using a MobileNetV3 architecture. Thereafter, the extracted features are used as input to a modified Hunger Games Search (HGS) based on Particle Swarm Optimization (PSO) and Dynamic-Opposite Learning (DOLHGS). This modification is used as a novel feature selection method to allocate the most relevant features and maximize the model's performance. To evaluate the efficiency of the developed DOLHGS, the ISIC-2016 dataset and the PH2 dataset were employed, including two and three categories, respectively. The proposed model has an accuracy of 88.19% on the ISIC-2016 dataset and 96.43% on PH2. Based on the experimental results, the proposed approach showed more accurate and efficient performance in skin cancer detection than other well-known and popular algorithms in terms of classification accuracy and optimized features.
... Taher Niknam combined fuzzy adaptive particle swarm optimization (PSO), ACO, and k-means algorithms to find a better cluster partition. ACO is used to make sure that the global best position is unique for every particle [14]. Zhang et al. used ACO to optimize the parameters of an SVM applied to machinery fault diagnosis [15]. ...
Article
Full-text available
A combination method of statistical filtering (SF) and the ant colony optimization (ACO) is proposed for automatic decision of optimum symptom parameters and frequency bands for machinery diagnosis. The noise of vibration signals is canceled by using SF. Similarity factor Ipq is defined to evaluate the filtering performance; the significance level α is optimized by genetic algorithms (GA). The optimum symptom parameters in different frequency bands, by which the states of rotating machinery can be sensitively distinguished, are automatically and sequentially selected by ACO based on the Mahalanobis distance between different machine states. Finally, the Mahalanobis distance is used to identify failure types based on the sequential diagnostic method. The new method proposed in this paper has been used to diagnose a centrifugal pump system for faults which often occur in the pump, such as impeller unbalance, shaft misalignment, and cavitation. The verification results of the condition diagnosis for a centrifugal pump show that the new method has good performance.
... In [100], six PSO variants with discrete crossover operators have been proposed, which choose the second parents and the number of crossover points in different ways. Experimental results show that two proposed PSO variants outperform the standard PSO algorithm. ...
Article
Full-text available
Survey/review study
A Survey of Algorithms, Applications and Trends for Particle Swarm Optimization
Jingzhong Fang 1, Weibo Liu 1,*, Linwei Chen 2, Stanislao Lauria 1, Alina Miron 1, and Xiaohui Liu 1
1 Department of Computer Science, Brunel University London, Uxbridge, Middlesex, UB8 3PH, United Kingdom
2 The School of Engineering, University of Warwick, Coventry CV4 7AL, United Kingdom
* Correspondence: Weibo.Liu2@brunel.ac.uk
Received: 18 October 2022 Accepted: 28 November 2022 Published:
Abstract: Particle swarm optimization (PSO) is a popular heuristic method, which is capable of effectively dealing with various optimization problems. A detailed overview of the original PSO and some PSO variant algorithms is presented in this paper. An up-to-date review is provided on the development of PSO variants, which include four types, i.e., the adjustment of control parameters, the newly-designed updating strategies, the topological structures, and the hybridization with other optimization algorithms. A general overview of some selected applications (e.g., robotics, energy systems, power systems, and data analytics) of the PSO algorithms is also given. In this paper, some possible future research topics of the PSO algorithms are also introduced.
... The studies in the "Other" category implemented Genetic Algorithm (Chan 2008), Particle Swarm Optimization (PSO) (Chiu et al. 2009) and Imperialist Competitive Algorithm (ICA) (DehghaniZadeh et al. 2018) for segmenting customers. The studies that implemented various extensions of the known clustering algorithms fall under the "Proposed" category (Tsai and Chiu 2004; Niknam and Amiri 2010; Chiu and Kuo 2010; Munusamy and Murugesan 2020; Chan et al. 2016; Dhandayudam and Krishnamurthi 2014; Wang 2009). ...
Article
Full-text available
In recent years, as digital transformation picked up steam, the volume of customer transactional data available to companies has increased. By making use of such vast amounts of transactional data and employing various data mining techniques, customer segmentation has received intensive attention from different industries; significant research effort has been devoted to this topic, and the body of literature has begun to accumulate. In this context, the aim of this paper is to provide a comprehensive review of the literature on transactional data-based customer segmentation to identify different characteristics in the field, analyze the application of data mining techniques, and highlight important points for further research. To review the existing literature in the field, three major online databases were used, and eventually, 84 relevant articles published in journals of well-known publishers were selected. The identified articles were then analyzed in full based on the diverse criteria of the stages of the CRISP-DM (CRoss Industry Standard Process for Data Mining) framework, and the results were reported. This systematic literature review can be very useful for academics and practitioners by providing a comprehensive overview of research work on customer segmentation using data mining and presenting guidelines for future research in this area.
... However, this algorithm has higher computational time compared to GWO with the side effect of TS. Niknam and Amiri proposed a hybrid algorithm by combining PSO, ACO and K-means algorithm [40] to overcome the issues of traditional K-means algorithm. Shelokar et al. [46] proposed an ant colony based algorithm to address the clustering problems. ...
Article
This paper presents a hybrid meta-heuristic algorithm using Grey Wolf optimization (GWO) and JAYA algorithm for data clustering. The idea is to use exploitative capability of JAYA algorithm in the explorative phase of GWO to form compact clusters. Here, instead of using one best and one worst solution for generating offspring, three best wolves and three worst omega wolves of the population are used. So, the best wolves and worst omega wolves assist in moving the feasible solutions towards the near-optimal solutions and it simultaneously helps in staying away from the worst solutions. This enhances the chances of reaching the near optimal solutions. The superiority of the proposed algorithm is compared with five promising algorithms, namely Sine-Cosine Algorithm (SCA), GWO, JAYA, Particle Swarm Optimization (PSO) and K-means algorithms. The performance of the proposed algorithm is evaluated for 23 benchmark mathematical problems using Friedman and Nemenyi hypothesis tests. Additionally, the superiority and robustness of our proposed algorithm is tested for 15 clustering problems using both Duncan's multiple range test and Nemenyi hypothesis test.
... Computational results showed that their proposed CPSO algorithm was very competitive and outperforms the genetic algorithm. Niknam et al. [22] considered the k-means algorithm highly depended on the initial state and converged to local optimum solution. Therefore, they presented a new hybrid evolutionary algorithm to solve nonlinear partitional clustering problem. ...
Article
In order to solve the cluster analysis problem more efficiently, we present a new approach based on the firefly algorithm (FA). First, we create the optimization model using the variance ratio criterion (VRC) as the fitness function. Second, FA is introduced to find the maximal point of the VRC. The experimental dataset contains 400 data points in 4 groups with three different degrees of overlap: non-overlapping, partially overlapping, and severely overlapping. We compare FA with the genetic algorithm (GA) and combinatorial particle swarm optimization (CPSO). Each algorithm was run 20 times. The results show that FA finds the largest VRC values among the three algorithms while taking the least time. Therefore, FA is effective and rapid for the cluster analysis problem.
... To overcome these problems, many swarm-based optimization algorithms have been developed, such as the Genetic Algorithm (Kapil et al., 2016; Maulik and Bandyopadhyay, 2000; Zeebaree et al., 2017), Ant Colony Optimization (Kao and Cheng, 2006; A. Kumar et al., 2018; Niknam and Amiri, 2010), Artificial Bee Colony (Armano and Farmani, 2014; Karaboga and Ozturk, 2011; Tran et al., 2015), Particle Swarm Optimization (H. Li, He and Wen, 2015; Sun et al., 2022; Xu, Li, Zhou, Xu and Cao, 2018), the Flower Pollination Algorithm (Kaur et al., 2020), Ant Lion Optimization (Chen, Qi, Chen, Chen and Cheng, 2020), the Firefly Algorithm (Hrosik et al., 2019; Wu, Peng, Fan, Wang and Huang, 2021; Xie et al., 2019) and Cuckoo Search (Arjmand, Meshgini, Afrouzian and Farzamnia, 2019; García, Yepes and Martí, 2020). ...
... The result shows that segmentation using Gabor filters gives a better result than segmentation without Gabor filters [6], [7]. However, segmentation using thresholding requires a long computation time [9], and K-means is constrained by the choice of the initial centroids, which can lead to a local optimum [10], [11]. Unlike the research of Erwin et al. [9] and Hemeida et al. [12], which optimizes retinal blood vessel segmentation using a thresholding approach, this research aims to segment the blood vessels using a clustering approach optimized by particle swarm optimization (PSO). ...
Article
Full-text available
The structure of the retinal blood vessels can be obtained by segmenting fundus images. A fundus image can be acquired through color fundus photography or fluorescein angiography (FA). The fundus image produced by the camera can contain noise, which reduces the quality of the image. To reduce the noise, this research uses the non-local means filter (NLMF). For texture analysis, the study uses Gabor filters because their frequency characteristics are similar to those of the human visual system. The retinal blood vessels are segmented using K-means optimized by particle swarm optimization (PSO). The proposed method obtains an accuracy of 0.9525, a precision of 0.8330, a sensitivity of 0.5817, and a specificity of 0.9880.
... K-means [5], a prominent clustering technique, is used along with its variants together with genetic algorithms and with ecology-based and population-based techniques. KABC, KACO, K-SVM, K-MCI, K-Firefly [6], FAPSO-ACO-K [7], Global k-means, etc. are among the hybrids of nature-inspired techniques that use the k-means clustering algorithm [8]. ...
Conference Paper
Nature-inspired techniques are among the most popular techniques used for optimization. Clustering is a data mining functionality that is also used in optimization. In the recent past, researchers have implemented several nature-inspired techniques using clustering. In this paper, a scientometric analysis is conducted to analyze the research trends of these techniques using data from the Web of Science and Scopus databases. The analysis covers studies of ACO, PSO, Bat, Lion, water droplets, and other nature-inspired optimization techniques that use k-means, DBSCAN, k-neighbor and other such clustering algorithms. Data is analyzed globally, and a cluster analysis of related keywords is performed along with link strength. Various other experiments are also conducted to help analyze research trends in clustering and nature-inspired techniques.
... However, because the center points are determined randomly in the first stage and by taking the average value in subsequent steps, the results are less than optimal, and the algorithm converges to a local optimum [13]. This weakness encourages merging k-means with a metaheuristic method that can overcome it. The hybridization of Particle Swarm Optimization (PSO) and the K-means algorithm combines the global search ability of PSO with the fast convergence of K-means and can avoid the drawbacks of both algorithms [7]. ...
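One common way to realize such a hybrid is to let PSO search globally for promising cluster centers and then hand the best particle to K-means for fast local refinement. The sketch below is a minimal illustration under that assumption; the function names (`sse`, `lloyd`, `pso_kmeans`), the parameter values (inertia 0.7, acceleration coefficients 1.5) and the SSE fitness are illustrative choices, not the procedure of any cited paper.

```python
import numpy as np

def sse(X, centers):
    """Sum of squared distances from each point to its nearest center."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return float(np.sum(d.min(axis=1) ** 2))

def lloyd(X, centers, n_iter=50):
    """Plain K-means refinement starting from the given centers."""
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Empty clusters keep their previous center.
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(len(centers))])
        if np.allclose(new, centers):
            break
        centers = new
    return centers

def pso_kmeans(X, k, n_particles=15, n_iter=40, seed=0):
    """PSO proposes seed centers (global search); K-means refines them."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Each particle encodes k centers, flattened into one vector.
    pos = np.stack([X[rng.choice(n, k, replace=False)].ravel()
                    for _ in range(n_particles)])
    vel = np.zeros_like(pos)
    cost = lambda p: sse(X, p.reshape(k, d))
    pbest = pos.copy()
    pbest_val = np.array([cost(p) for p in pos])
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = pos + vel
        vals = np.array([cost(p) for p in pos])
        better = vals < pbest_val
        pbest[better], pbest_val[better] = pos[better], vals[better]
        gbest = pbest[pbest_val.argmin()].copy()
    seed_centers = gbest.reshape(k, d)
    return lloyd(X, seed_centers), seed_centers
```

Because each K-means iteration cannot increase the SSE, the refined centers are never worse than the PSO seed under this fitness; the PSO stage only serves to make the starting point less dependent on a single random initialization.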
Conference Paper
Full-text available
Cooperatives are ventures that can help meet society's needs based on the spirit of mutual help. The position of cooperatives in Indonesia is very important as one of the pillars of the economy. However, in the past few years, the number of cooperatives has steadily decreased because they have not fulfilled their duties and functions. This decline in the number of cooperatives also occurred in Cianjur, where around 70% of cooperatives died in 2018. Therefore, this study aims to analyze cooperatives in Cianjur based on clusters formed from the Cianjur cooperative database. The research data were obtained from the DKUPP Cianjur database. The data obtained totaled 1528 cooperatives with 8 attributes, namely the type of cooperative, cooperative group, business sector, number of members, own capital, external capital, business volume, and remaining business results. The cooperative clusters were formed using a combination of the Particle Swarm Optimization and K-Means Clustering methods with the help of Rapid Miner. The result was three cooperative clusters in Cianjur with a Davies Bouldin index score of -1.54. The first cluster is a thriving or developing trade cooperative group with fairly good business aspects, the second cluster is an excellent financial service cooperative group with very good financial conditions, and the third cluster is an underperforming financial cooperative group with poor financial conditions.
... It used the crossover and mutation strategy of genetic algorithm for particle swarm optimization update to balance the exploitation and exploration ability of the algorithm. In Ref. [36], a new hybrid algorithm called FAPSO-ACO-K was proposed to solve the nonlinear clustering problem which mixed fuzzy adaptive particle swarm optimization (FAPSO), ant colony optimization (ACO) and clustering together. In Ref. [37], a new hybrid algorithm (PSOSCALF) was proposed by using cosine SCA and Levy in PSO algorithm, which enhanced the exploration ability of the original PSO and prevented the algorithm from falling into local optimization. ...
Article
In recent years, many improved particle swarm optimization (PSO) algorithms have been developed. These algorithms have greatly improved PSO performance, but PSO still has some shortcomings. Therefore, a particle swarm optimization algorithm with a Chebychev functional-link network model (APSOCFLN) is proposed in this paper. Firstly, in order to make up for the shortcomings of the canonical PSO update search, a novel Chebychev functional-link network (CFLN) elite guidance strategy is proposed. Two different update mechanisms are executed alternately, increasing the diversity of the algorithm population and making the algorithm more capable of escaping local optima. Secondly, in order to better balance the exploration and exploitation of the algorithm, an adaptive probability strategy is proposed. Thirdly, an adaptive weighting strategy is proposed. It is used for the CFLN elite guidance strategy and gives different weights to the two alternating update strategies, which can effectively address the premature convergence of PSO. Fourthly, APSOCFLN and the comparison algorithms are used to solve practical engineering problems, and the results show that APSOCFLN has high precision, fast convergence, and excellent performance. Finally, the performance of the algorithm is tested with the CEC2017 and CEC2022 benchmark functions. The comparison algorithms include classical improved PSO algorithms and other classical algorithms, and the experimental results show the effectiveness of the proposed strategy and the good performance of the improved algorithm.
... All data are divided into the class represented by the cluster center closest to it, and the k cluster centers are updated according to the mean of the newly generated data objects in each category. If the change of the cluster center value in the adjacent iteration times exceeds the specified threshold, all data objects will be redivided according to the new cluster center; if the change of the cluster center value in the adjacent iteration times is less than the specified threshold, then the algorithm converges and the clustering result is output [25]. ...
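The iteration described in this excerpt can be sketched as follows: assign each object to its nearest center, recompute each center as the mean of its members, and stop once the largest center shift falls below a threshold. The function name `kmeans`, the default `tol`, and the random choice of initial centers are illustrative assumptions rather than details of the cited work.

```python
import numpy as np

def kmeans(X, k, tol=1e-4, max_iter=100, seed=0):
    """Basic K-means with a center-shift stopping threshold (illustrative)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumed initialization: k distinct data points chosen at random.
    centers = X[rng.choice(X.shape[0], size=k, replace=False)]
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each center becomes the mean of its members.
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if members.size:  # an empty cluster keeps its previous center
                new_centers[j] = members.mean(axis=0)
        shift = np.linalg.norm(new_centers - centers, axis=1).max()
        centers = new_centers
        if shift < tol:  # centers moved less than the threshold: converged
            break
    # Final assignment against the converged centers.
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return centers, dists.argmin(axis=1)
```

The result depends on the initial centers, which is the sensitivity that the hybrid methods discussed on this page try to reduce.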
Article
Full-text available
The criteria-based method for determining sand and dust weather is cumbersome and time-consuming when processing a large amount of raw data, and it cannot avoid problems of repeatability and reproducibility. On the basis of a statistical analysis of automatic air monitoring data in cities affected by sand and dust, this paper proposes a k-means optimization algorithm (MDPD-k-means) based on maximum density and percentage distance, which can quickly filter the characteristic data of sand and dust and identify the days affected by sand and dust. This method effectively improves data processing efficiency, solves the problems of poor reproducibility and large human error in traditional methods, and can support the operational use of sand and dust data elimination. This paper uses the method to identify the sand and dust data of 10 cities in Shaanxi Province from 2016 to 2022, determines a total of 1107 sand and dust days, and points out that the number of days affected by sand and dust is increasing year by year. After excluding the effect of sand and dust, the urban PM10 concentration decreases by 18.42~1.41%, which provides important data for accurately evaluating the effectiveness of air pollution prevention and control.
... Each particle is updated using the two best values in every iteration. The first best value is obtained from the fitness function, and the second best value is the best one found by PSO in the population (Niknam and Amiri, 2010). • Clustering-Load Balancing Using PSO: PSO enhances the clustering; it draws inspiration from the behavior of ants in nature and from the related field of PSO to solve the problem of choosing the shortest routing strategy in communication systems. ...
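In the standard PSO formulation, the "two best values" are each particle's personal best position and the swarm's global best position. The sketch below shows one such update step and a small driver loop; the names `pso_step` and `pso_minimize` and the parameter values (`w=0.7`, `c1=c2=1.5`) are assumptions chosen for illustration, not settings from the cited paper.

```python
import numpy as np

def pso_step(pos, vel, pbest, gbest, rng, w=0.7, c1=1.5, c2=1.5):
    """One velocity and position update driven by personal and global bests."""
    r1 = rng.random(pos.shape)
    r2 = rng.random(pos.shape)
    new_vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    return pos + new_vel, new_vel

def pso_minimize(f, dim, n_particles=20, n_iter=100, bounds=(-5.0, 5.0), seed=0):
    """Minimal PSO loop that minimizes f over a box (illustrative only)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pos = rng.uniform(lo, hi, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([f(p) for p in pos])
    g = pbest_val.argmin()
    gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    for _ in range(n_iter):
        pos, vel = pso_step(pos, vel, pbest, gbest, rng)
        pos = np.clip(pos, lo, hi)  # keep particles inside the search box
        vals = np.array([f(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        g = pbest_val.argmin()
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    return gbest, float(gbest_val)
```

For clustering, `f` would typically be a partition quality measure evaluated on the cluster centers encoded by a particle.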
Article
Full-text available
The optimal CH selection for finding the shortest path among the CHs is improved by developing the hybrid K-means with Particle Swarm Optimization (PSO) based hybrid Ad-hoc On-demand Distance Vector (AODV) channelling algorithms. The alive nodes, total packet sending time, throughput, and NL are increased using this hybrid technique, whereas dead nodes and EC are minimized in the network. The proposed algorithm utilizes a rotational method of utilization of cluster head (CH) to ensure that all member nodes are utilized uniformly based on the incoming traffic. The proposed algorithm has been implemented, experimented with, and compared in performance with LEACH, DLBA and GLBA algorithms. The proposed hybrid approach outperforms the existing techniques regarding average energy consumption and load distribution.
... where PG and CG indicate the optimal population extremum/individual extremum, respectively, f denotes fitness value of a particular particle, N swarm means the population size of the particle swarm. As found in related literature [48,49], the fitness value of particle swarm is normalized as the parameter of the fuzzy system according to the following formula: ...
Article
Full-text available
With rapid economic and demographic growth, traffic conditions in medium and large cities are becoming extremely congested. Numerous metropolitan management organizations hope to promote the coordination of traffic and urban development by formulating and improving traffic development strategies. The effectiveness of these solutions depends largely on an accurate assessment of the distribution of urban hotspots (centers of traffic activity). In recent years, many scholars have employed the K-Means clustering technique to identify urban hotspots, believing it to be efficient. K-means clustering is a sort of iterative clustering analysis. When the data dimensionality is large and the sample size is enormous, the K-Means clustering algorithm is sensitive to the initial clustering centers. To mitigate the problem, a hybrid heuristic "fuzzy system-particle swarm-genetic" algorithm, named FPSO-GAK, is employed to obtain better initial clustering centers for the K-Means clustering algorithm. The clustering results are evaluated and analyzed using three-cluster evaluation indexes (SC, SP and SSE) and two-cluster similarity indexes (CI and CSI). A taxi GPS dataset and a multi-source dataset were employed to test and validate the effectiveness of the proposed algorithm in comparison to the Random Swap clustering algorithm (RS), Genetic K-means algorithm (GAK), Particle Swarm Optimization (PSO) based K-Means, PSO based constraint K-Means, PSO based Weighted K-Means, PSO-GA based K-Means and K-Means++ algorithms. The comparison findings demonstrate that the proposed algorithm can achieve better clustering results, as well as successfully acquire urban hotspots.
... Among the 10 benchmark datasets, Artificial Datasets I and II are artificial datasets selected from the literature (Niknam and Amiri, 2010), and the remaining 8 datasets are related to life and physics from UCI. Table 2 summarizes the number of attributes, clusters, and instances and the application areas of ten benchmark datasets. ...
Article
Full-text available
Clustering is an unsupervised learning technique widely used in the field of data mining and analysis. Clustering encompasses many specific methods, among which the K-means algorithm maintains the predominance of popularity with respect to its simplicity and efficiency. However, its efficiency is significantly influenced by the initial solution and it is susceptible to being stuck in a local optimum. To eliminate these deficiencies of K-means, this paper proposes a quantum-inspired moth-flame optimizer with an enhanced local search strategy (QLSMFO). Firstly, quantum double-chain encoding and quantum revolving gates are introduced in the initial phase of the algorithm, which can enrich the population diversity and efficiently improve the exploration ability. Second, an improved local search strategy on the basis of the Shuffled Frog Leaping Algorithm (SFLA) is implemented to boost the exploitation capability of the standard MFO. Finally, the poor solutions are updated using Levy flight to obtain a faster convergence rate. Ten well-known UCI benchmark test datasets dedicated to clustering are selected for testing the efficiency of QLSMFO algorithms and compared with the K-means and ten currently popular swarm intelligence algorithms. Meanwhile, the Wilcoxon rank-sum test and Friedman test are utilized to evaluate the effect of QLSMFO. The simulation experimental results demonstrate that QLSMFO significantly outperforms other algorithms with respect to precision, convergence speed, and stability.
... Kao et al. [11] proposed a hybrid method combining k-means, Nelder-Mead simplex search, and PSO (K-NM-PSO). Niknam and Amiri [12] suggested a hybrid evolutionary algorithm combining fuzzy adaptive particle swarm optimization, ant colony optimization and k-means (FAPSO-ACO-K), which can find a better cluster partition. Laszlo and Mukherjee [13] proposed a genetic algorithm (GA) approach for seeding the k-means clustering method with centers using a unique crossover operator that swaps adjacent centers. ...
Article
Full-text available
Clustering is a robust machine learning task that involves dividing data points into a set of groups with similar traits. One of the widely used methods in this regard is the k-means clustering algorithm due to its simplicity and effectiveness. However, this algorithm suffers from the problem of predicting the number and coordinates of the initial clustering centers. In this paper, a method based on the first artificial bee colony algorithm with variable-length individuals is proposed to overcome the limitations of the k-means algorithm. The proposed technique automatically predicts the number of clusters (the value of k) and determines the most suitable coordinates for the initial clustering centers instead of requiring them to be preset manually. The results were encouraging: the proposed algorithm outperforms the traditional k-means algorithm on all three real-life clustering datasets tested. Keywords: artificial bee colony algorithm; k-means algorithm; optimized k-means clustering; variable-length representation.
... Then, Zhang presented a novel FCM-ELPSO for solving the problem that PSO-based clustering methods have poor execution times [24]. Beyond that, other intelligent optimization algorithms are often used in parameter selection problems, such as Ant Colony Optimization (ACO) [25], Artificial Bee Colony (ABC) [26], and Grey Wolf Optimizer (GWO) [27]. ...
Article
Full-text available
Over the years, research on fuzzy clustering algorithms has attracted the attention of many researchers, and they have been applied to various areas, such as image segmentation and data clustering. Various fuzzy clustering algorithms have been put forward based on the initial Fuzzy C-Means clustering (FCM) with Euclidean distance. However, the existing fuzzy clustering approaches ignore two problems. Firstly, clustering algorithms based on Euclidean distance have a high error rate, and are more sensitive to noise and outliers. Secondly, the parameters of the fuzzy clustering algorithms are hard to determine. In practice, they are often determined by the user’s experience, which results in poor performance of the clustering algorithm. Therefore, considering the above deficiencies, this paper proposes a novel fuzzy clustering algorithm by combining the Gaussian kernel function and Grey Wolf Optimizer (GWO), called Kernel-based Picture Fuzzy C-Means clustering with Grey Wolf Optimizer (KPFCM-GWO). In KPFCM-GWO, the Gaussian kernel function is used as a symmetrical measure of distance between data points and cluster centers, and the GWO is utilized to determine the parameter values of PFCM. To verify the validity of KPFCM-GWO, a comparative study was conducted. The experimental results indicate that KPFCM-GWO outperforms other clustering methods, and the improvement of KPFCM-GWO is mainly attributed to the combination of the Gaussian kernel function and the parameter optimization capability of the GWO. In addition, the paper applies KPFCM-GWO to analyze the value of an airline’s customers, and five levels of customer categories are defined.
Article
Full-text available
The study of process mineralogy plays a very important role in the field of mineral processing and metallurgy, in which the measurement of mineral-embedded particle size is one of the main research areas. The manual measurement method using a microscope has many problems, such as heavy workload and low measurement accuracy. In order to solve this problem, this paper proposes a Gaussian mixture model based on an expectation maximization (EM) algorithm to measure the embedded particle sizes of minerals of polished metal sections. Experiments are here performed on the polished section images of ilmenite and pyrite, and we compared the results with a microscope. The experimental results show that the proposed method has higher precision and accuracy in measuring the embedded particle sizes of metal minerals.
Chapter
Since its inception, particle swarm optimization and its improvement has been an active area of research, and the algorithm has found its application in multifarious domains such as highly constrained engineering problems as well as artificial intelligence. The focal point of this paper is to make the reader aware of the innumerable applications of particle swarm optimization, especially in the field of bioinformatics, digital image processing, and computational linguistics. This review work is designed to serve as a comprehensive look-up guide and to navigate through the algorithm's scope and application in recent times in the aforementioned fields.
Article
This study evaluates the performance of Iran and 38 OECD countries using a DFM model with undesirable outputs. One of its objectives is to compare the changes in the rate of increase in outputs and the rate of decrease in inputs required for the efficiency of decision units in the DFM and CCR models. The research is applied in terms of purpose and descriptive in terms of implementation. Based on a review of the literature and research background, total employed labor force, total primary energy consumption and gross capital formation were considered as inputs, GDP was considered as a desirable output, and total GHG emissions excluding LULUCF were considered as an undesirable output. The results showed that, in order to reach the efficiency frontier, inefficient countries should make the largest change in the input of total primary energy consumption and the smallest change in the desirable output (GDP); therefore, energy consumption management can be effective in turning inefficient countries with poor performance in environmental sustainability, such as Iran, into efficient ones. The analysis also showed that, in order to improve sustainability in Iran, policymakers should consider reducing energy consumption, increasing labor productivity, reducing greenhouse gas emissions and improving investment productivity and GDP. In terms of method, two approaches to modeling the efficiency of countries in the field of sustainable development were compared, and the advantages of the DFM model over CCR were studied.
Article
Data clustering is a machine learning method for unsupervised learning that is popular in the two areas of data analysis and data mining. The objective is to partition a given dataset into distinct clusters, aiming to maximize the similarity among data objects within the same cluster. In this paper, an improved honey badger algorithm called DELHBA is proposed to solve the clustering problem. In DELHBA, to boost the population’s diversity and the performance of global search, the differential evolution method is incorporated into algorithm’s initial step. Secondly, the equilibrium pooling technique is included to assist the standard honey badger algorithm (HBA) break free of the local optimum. Finally, the updated honey badger population individuals are updated with Levy flight strategy to produce more potential solutions. Ten famous benchmark test datasets are utilized to evaluate the efficiency of the DELHBA algorithm and to contrast it with twelve of the current most used swarm intelligence algorithms and k-means. Additionally, DELHBA algorithm’s performance is assessed using the Wilcoxon rank sum test and Friedman’s test. The experimental results show that DELHBA has better clustering accuracy, convergence speed and stability compared with other algorithms, demonstrating its superiority in solving clustering problems.
Article
Full-text available
The 197 Public Inspection Center is a subset of the police force founded with the aim of developing public inspection of NAJA performance and public participation in managing different parts of this organization. The number and diversity of citizens' daily contacts with this center indicate the success of this system in attracting citizens' trust and increasing their feeling of responsibility toward the service offered by the police. The center's databank contains useful information concerning people's contacts with the system, which can serve as an important, suitable source for appraising improvement in the performance of NAJA. The tools suggested here include data mining methods combined with the customer relationship management approach. This article investigates the application of data mining methods to this system. First, the application of this tool to customer relationship management is reviewed, and then, on the basis of RFM theory and a clustering method, a pattern for identifying citizens' major requests in the area of police service is presented. It is expected that following this approach will lead to the discovery of patterns useful for improving NAJA performance.
Chapter
Particle Swarm Optimization (PSO) has gained its importance over last 20 years and has been proved successful in many domains and disciplines of science and technology as well as in other fields. It has shown its ability in optimizing various complex problems in a simpler way. Due to its simplicity and worldwide applications, the latest breakthroughs in PSO, as well as their applications in various fields are stated in this chapter. Its significance, algorithm and working mechanism along with the pseudo-code are presented in this chapter. The utility of PSO has been addressed and the flaws in the algorithm have been recognized. The recent advancements and modifications of PSO in terms of its parameters are also discussed. Finally, its hybridization with other illustrious algorithms and applications in multiple disciplines and domains over the last decades are discussed. The motivation for all hybrid optimization techniques is examined for real-world challenges. It has been noticed that PSO can be hybridized with other algorithms and parallel applications.
Article
The task allocation problem in heterogeneous and distributed multiprocessing computing environments is a nonlinear multi-objective NP-hard problem and, from a research perspective, it is considered a major issue. The underlying objective of any allocation mechanism that executes the specified task set on the enumerated processors is to minimize the overall cost. To solve task allocation problems, numerous meta-heuristic techniques have been tried and tested, but there is still plenty of scope for optimal strategies. In the presented article, a comprehensive task assignment model based on particle swarm optimization (PSO) is developed, which optimizes the response time, flowtime, and cost of the distributed computing system. In the present technique, the ‘n’ cluster centroids of ‘r’ tasks are updated by the PSO technique to form ‘n’ task clusters that minimize the communication costs; their allocation is then made by the newly proposed heuristic method. In the given model the objective functions conflict with each other, and the strengths of the PSO algorithm are its accuracy and speed; to come up with the most favorable solution, the authors have therefore developed a PSO-based algorithm. The PSO-based technique proposed in this article to solve a given assignment problem in a distributed computing system gives better results in terms of convergence rate, as the PSO method integrates local and global search methods in an attempt to strike a balance between exploration and exploitation. To examine the functioning of the developed technique, well-established scheduling policies based on different techniques were compared with it, and better outcomes were obtained. The developed mechanism is suitable for an arbitrary number of tasks and processors.
Article
The water hazard of the coal seam floor is a major threat to safe coal production in China. To improve the accuracy of water hazard predictions, a water inrush risk predictive model was constructed using PSO-SVM. Historical monitoring data were added to the basic database in a timely manner to narrow the difference between the monitoring data and predicted results. The optimized database was used for neural network model training. The prediction model was improved by establishing a database self-optimization and model self-learning process (SOMSP). The PSO-SVM model and the SOMSP was used to predict the inrush risk for 23 groups of floor water inrush cases from the north China mine area. The initial accuracy of the model was only 25% for the first 19 data groups, which were used as the basic training data to predict data groups 20–23. Using the SOMSP, the accuracy of the water inrush risk of the coal seam floor was increased to 100% (3/3). Thus, the accuracy of the predictions was greatly improved by the SOMSP.
Article
Business analytics refers to the application of sophisticated tools to obtain valuable information from a large dataset that is generated by a company. Among these tools, fuzzy optimisation stands out because it helps decision-makers to solve optimisation problems considering the uncertainty that commonly occurs in application domains. This paper presents a bibliometric analysis following the PRISMA statement on the Dimensions database to obtain publications related to fuzzy optimisation applied to business domains. The purpose of this analysis is to gather useful information that can help researchers in this area. A total of 2,983 publications were analysed using VOSviewer to identify the trend in the number of publications per year, relationships among terms in the titles and abstracts of these publications, the most influential publications, and relationships among journals, authors, and institutions.
Article
Full-text available
The article is devoted to determining the critical thickness of the diffusion layer of precision injector nozzle parts restored by diffusion metallization, in order to ensure their durable operation.
Article
This paper presents a new combined algorithm for the fuzzy Travelling Salesman Problem (FTSP) based on a composition of the Intelligent Water Drops (IWD) and the Electromagnetism-like (EM) algorithms. In an FTSP, the time or distance between cities i and j can be described by vague knowledge, such as a fuzzy quantity. The main goal of FTSP is to find the minimum-distance Hamilton circuit of a graph G, where a Hamilton circuit is a closed route in which every city (i.e., node) of G is visited exactly once. The proposed algorithm transfers the solutions generated by the IWD to the EM, where the best solution is selected. Importantly, the results computed by both algorithms are compared and the best is retained. In other words, in each iteration, the best result is collected by comparing the current and previous iterations until the halting condition is fulfilled. Finally, the results of the genetic algorithm (GA), IWD and EM algorithms are compared, so that the efficiency of the proposed combined IWD-EM algorithm is determined.
Conference Paper
Full-text available
4th International Science Post Graduate Conference, University Teknologi Malaysia 2016
Article
Full-text available
Clustering problems appear in a wide range of unsupervised classification applications such as pattern recognition, vector quantization, data mining and knowledge discovery. The k-means algorithm is one of the most widely used clustering techniques. Unfortunately, k-means is extremely sensitive to the initial choice of centers and a poor choice of centers may lead to a local optimum that is quite inferior to the global optimum. This paper presents an efficient hybrid evolutionary optimization algorithm based on combining Ant Colony Optimization (ACO) and Simulated Annealing (SA), called ACO-SA, for cluster analysis. The performance is evaluated through several benchmark data sets. The simulation results show that the proposed algorithm outperforms the previous approaches such as SA, ACO and k-means for partitional clustering problem.
Article
Full-text available
Cluster analysis, which is the subject of active research in several fields, such as statistics, pattern recognition, machine learning, and data mining, partitions a given set of data or objects into clusters. K-means is a popular clustering method due to its simplicity and high speed in clustering large datasets. However, K-means has two shortcomings. The first is its dependency on the initial state and its convergence to local optima. The second is that global solutions of large problems cannot be found with a reasonable amount of computational effort. Many clustering studies have tried to overcome the local optima problem. Over the last decade, modeling the behavior of social insects, such as ants and bees, for the purpose of search and problem solving has been the context of the emerging area of swarm intelligence. Honeybees are among the most closely studied social insects. Honeybee mating may also be considered as a typical swarm-based approach to optimization, in which the search algorithm is inspired by the process of mating in real honeybees. Neural network algorithms are useful for clustering analysis in data mining. This study proposes a two-stage method, which first uses a self-organizing feature map (SOM) neural network to determine the number of clusters and then uses a honeybee mating optimization algorithm based on the K-means algorithm to find the final solution. We compared the proposed algorithm with other heuristic clustering algorithms, such as GA, SA, TS, and ACO, by implementing them on several well-known datasets. Our findings show that the proposed algorithm works better than the others. To further demonstrate the capability of the proposed approach, it is applied to a real-world problem: market segmentation of an Internet bookstore based on customer loyalty.
Article
The K-means algorithm is one of the most popular techniques in clustering. Nevertheless, the performance of the K-means algorithm depends highly on initial cluster centers and converges to local minima. This paper proposes a hybrid evolutionary programming based clustering algorithm, called PSO-SA, by combining particle swarm optimization (PSO) and simulated annealing (SA). The basic idea is to search around the global solution by SA and to increase the information exchange among particles using a mutation operator to escape local optima. Three datasets, Iris, Wisconsin Breast Cancer, and Ripley’s Glass, have been considered to show the effectiveness of the proposed clustering algorithm in providing optimal clusters. The simulation results show that the PSO-SA clustering algorithm not only has a better response but also converges more quickly than the K-means, PSO, and SA algorithms.
Conference Paper
Ant-based clustering and sorting is a nature-inspired heuristic for general clustering tasks. It has been applied variously, from problems arising in commerce, to circuit design, to text-mining, all with some promise. However, although early results were broadly encouraging, there has been very limited analytical evaluation of the algorithm. Toward this end, we first propose a scheme that enables unbiased interpretation of the clustering solutions obtained, and then use this to conduct a full evaluation of the algorithm. Our analysis uses three sets each of real and artificial data, and four distinct analytical measures. These results are compared with those obtained using established clustering techniques and we find evidence that ant-based clustering is a robust and viable alternative.
Conference Paper
This paper utilizes Ant-Miner - the first Ant Colony algorithm for discovering classification rules - in the field of web content mining, and shows that it is more effective than C5.0 in two sets of BBC and Yahoo web pages used in our experiments. It also investigates the benefits and dangers of several linguistics-based text preprocessing techniques to reduce the large numbers of attributes associated with web content mining.
Article
This study presents an efficient hybrid evolutionary optimization algorithm that combines Ant Colony Optimization (ACO) and Simulated Annealing (SA), called ACO-SA, for optimally clustering N objects into K clusters. The new ACO-SA algorithm is tested on several data sets and its performance is compared with those of ACO, SA and K-means clustering. The simulation results show that the proposed evolutionary optimization algorithm is robust and suitable for data clustering.
Article
In this paper, we propose a novel hybrid genetic algorithm (GA) that finds a globally optimal partition of a given data into a specified number of clusters. GA's used earlier in clustering employ either an expensive crossover operator to generate valid child chromosomes from parent chromosomes or a costly fitness function or both. To circumvent these expensive operations, we hybridize GA with a classical gradient descent algorithm used in clustering, viz. K-means algorithm. Hence, the name genetic K-means algorithm (GKA). We define K-means operator, one-step of K-means algorithm, and use it in GKA as a search operator instead of crossover. We also define a biased mutation operator specific to clustering called distance-based-mutation. Using finite Markov chain theory, we prove that the GKA converges to the global optimum. It is observed in the simulations that GKA converges to the best known optimum corresponding to the given data in concurrence with the convergence result. It is also observed that GKA searches faster than some of the other evolutionary algorithms used for clustering.
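The "K-means operator" described above, a single step of K-means applied to a candidate partition, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `kmeans_operator`, the label-list encoding of a partition, and the handling of empty clusters are all assumptions made here.

```python
from math import dist

def kmeans_operator(points, labels, k):
    """One K-means step: recompute centroids from labels, then reassign points.

    `labels[i]` is the cluster index of `points[i]` (an assumed encoding).
    """
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, c in zip(points, labels):
        counts[c] += 1
        for j, x in enumerate(p):
            sums[c][j] += x
    # Assumption: an empty cluster gets no centroid and attracts no points.
    centroids = [
        tuple(s / counts[c] for s in sums[c]) if counts[c] else None
        for c in range(k)
    ]
    live = [c for c in range(k) if centroids[c] is not None]
    # Reassign every point to its nearest (Euclidean) centroid.
    new_labels = [min(live, key=lambda c: dist(p, centroids[c])) for p in points]
    return new_labels, centroids

points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels = [0, 1, 1, 1]  # a deliberately poor starting partition
new_labels, centroids = kmeans_operator(points, labels, k=2)
print(new_labels)
```

In a genetic search of this kind, such a step would be applied to each candidate partition in place of crossover, so every generation moves candidates toward a nearby local optimum before mutation perturbs them.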
Article
The paper proposes an algorithm for data mining called Ant-Miner (ant-colony-based data miner). The goal of Ant-Miner is to extract classification rules from data. The algorithm is inspired by both research on the behavior of real ant colonies and some data mining concepts as well as principles. We compare the performance of Ant-Miner with CN2, a well-known data mining algorithm for classification, in six public domain data sets. The results provide evidence that: 1) Ant-Miner is competitive with CN2 with respect to predictive accuracy, and 2) the rule lists discovered by Ant-Miner are considerably simpler (smaller) than those discovered by CN2
Article
This paper introduces the ant colony system (ACS), a distributed algorithm that is applied to the traveling salesman problem (TSP). In the ACS, a set of cooperating agents called ants cooperate to find good solutions to TSPs. Ants cooperate using an indirect form of communication mediated by a pheromone they deposit on the edges of the TSP graph while building solutions. We study the ACS by running experiments to understand its operation. The results show that the ACS outperforms other nature-inspired algorithms such as simulated annealing and evolutionary computation, and we conclude comparing ACS-3-opt, a version of the ACS augmented with a local search procedure, to some of the best performing algorithms for symmetric and asymmetric TSPs
Article
In this paper, a novel design method for determining the optimal proportional-integral-derivative (PID) controller parameters of an AVR system using the particle swarm optimization (PSO) algorithm is presented. This paper demonstrated in detail how to employ the PSO method to search efficiently the optimal PID controller parameters of an AVR system. The proposed approach had superior features, including easy implementation, stable convergence characteristic, and good computational efficiency. Fast tuning of optimum PID controller parameters yields high-quality solution. In order to assist estimating the performance of the proposed PSO-PID controller, a new time-domain performance criterion function was also defined. Compared with the genetic algorithm (GA), the proposed method was indeed more efficient and robust in improving the step response of an AVR system.
Article
Ant-Q is an algorithm belonging to the class of ant colony based methods, that is, of combinatorial optimization methods in which a set of simple agents, called ants, cooperate to find good solutions to combinatorial optimization problems. The main focus of this article is on the experimental study of the sensitivity of the Ant-Q algorithm to its parameters and on the investigation of synergistic effects when using more than a single ant. We conclude by comparing Ant-Q with its ancestor Ant System, and with other heuristic algorithms. Published in the Proceedings of PPSN IV, Fourth International Conference on Parallel Problem Solving From Nature, H.-M. Voigt, W. Ebeling, I. Rechenberg and H.-S. Schwefel (Eds.), Springer-Verlag, Berlin, 656-665. In this paper we study some properties of Ant-Q, a novel distributed approach to combinatorial optimization based on reinforcement learning. Ant-Q (Gambar...
Article
The paper presents the pheromone-Q-learning (Phe-Q) algorithm, a variation of Q-learning. The technique was developed to allow agents to communicate and jointly learn to solve a problem. Phe-Q learning combines the standard Q-learning technique with a synthetic pheromone that acts as a communication medium speeding up the learning process of cooperating agents. The Phe-Q update equation includes a belief factor that reflects the confidence an agent has in the pheromone (the communication medium) deposited in the environment by other agents. With the Phe-Q update equation, the speed of convergence towards an optimal solution depends on a number of parameters including the number of agents solving a problem, the amount of pheromone deposit, the diffusion into neighbouring cells and the evaporation rate. The main objective of this paper is to describe and evaluate the performance of the Phe-Q algorithm. The paper demonstrates the improved performance of cooperating Phe-Q agents over non-cooperating agents. The paper also shows how Phe-Q learning can be improved by optimizing all the parameters that control the use of the synthetic pheromone.
Article
This paper presents an ant colony optimization methodology for optimally clustering N objects into K clusters. The algorithm employs distributed agents which mimic the way real ants find a shortest path from their nest to food source and back. This algorithm has been implemented and tested on several simulated and real datasets. The performance of this algorithm is compared with other popular stochastic/heuristic methods viz. genetic algorithm, simulated annealing and tabu search. Our computational simulations reveal very encouraging results in terms of the quality of solution found, the average number of function evaluations and the processing time required.
Article
Particle swarm optimization (PSO) is a novel population-based stochastic optimization algorithm inspired by Reynolds' boid model. The boid model obeys three simple steering rules: separation, alignment and cohesion. However, to keep the update equation simple, the PSO methodology employs none of these rules. Because of this weak biological background, this paper designs a new variant of PSO, boid particle swarm optimization (BPSO), in which both the cohesion rule and the alignment rule are employed to improve performance. In BPSO, each particle has two motions: divergent motion and convergent motion. In divergent motion, each particle adjusts its moving direction according to the alignment direction and the cohesion direction, whereas in convergent motion the original update equation of the standard version of PSO is used. To switch between the motions, a threshold is introduced so that the divergent motion is employed in the first period and the convergent motion in the final stage. To test its efficiency, the variant is compared on several unconstrained benchmarks. Simulation results show that the proposed variant is more effective and efficient than two other variants of particle swarm optimization when solving multi-modal high-dimensional numerical problems.
Article
Purpose: In the classic recency-frequency-monetary value (RFV or RFM) approach to market segmentation, customers are grouped together into an arbitrary number of segments according to data on their most recent day of purchase (R), the number of buying orders placed (F) and the total monetary value of their purchases (V). The purpose of this paper is to show how to select the order in which the RFV dimensions are applied to data and choose the number of segments and the time frame used in such a way as to maximize the results of direct marketing campaigns. Design/methodology/approach: A "genetically" optimized RFV model is built from data collected from a real world direct marketing campaign. The results produced when it is used are compared with the results yielded without the use of any forecasting method at all and with the support of a widely used basic RFV model. Findings: Not only does the new model provide better results, but it is also easy to build and allows for the introduction of new dimensions that may improve its performance even further. Practical implications: The new model improves the cost-effectiveness of direct marketing campaigns by permitting more accurate identification of a company's most valuable customers and improving the quality of communication with its customers. It can thereby help them to become more competitive and profitable. This has clear implications for the gathering of marketing intelligence and planning of marketing strategies. Originality/value: Although genetic algorithms have been shown to be powerful tools for problem solving, their use in marketing has been little reported. This work is a step towards bridging that gap. The genetically optimized RFV model is a new contribution to direct and relationship marketing, generating a positive qualitative and quantitative impact on the way companies relate to their customers.
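The three RFV dimensions named above can be computed from raw order records as in the sketch below. It only derives R, F and V per customer; the ordering of dimensions, the number of segments and the time frame, which are what the paper optimizes, are not modeled. The function name `rfv_table`, the `(customer, day, amount)` record layout and the sample data are assumptions for illustration.

```python
from collections import defaultdict
from datetime import date

def rfv_table(orders, today):
    """Return {customer: (recency_in_days, frequency, monetary_value)}."""
    stats = defaultdict(lambda: {"last": None, "F": 0, "V": 0.0})
    for customer, day, amount in orders:
        s = stats[customer]
        # R is measured from the most recent purchase day.
        s["last"] = day if s["last"] is None else max(s["last"], day)
        s["F"] += 1       # number of orders placed
        s["V"] += amount  # total monetary value of purchases
    return {
        c: ((today - s["last"]).days, s["F"], s["V"])
        for c, s in stats.items()
    }

# Hypothetical order records: (customer, purchase day, amount).
orders = [
    ("ann", date(2024, 1, 10), 40.0),
    ("ann", date(2024, 3, 1), 60.0),
    ("bob", date(2024, 2, 15), 25.0),
]
table = rfv_table(orders, today=date(2024, 3, 11))
print(table)
```

A basic RFV model would then sort customers on each dimension in turn and cut them into a fixed number of segments per dimension.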
Article
An external lexicon quality measure called the L-measure is derived from the F-measure (Rijsbergen, 1979; Larsen and Aone, 1999). The typically small sample sizes available for minority languages and the evaluation of Semitic language lexicons are two main factors considered. Large-scale evaluation results for the Maltilex Corpus are presented (Rosner et al., 1999).
Conference Paper
This paper presents an approach for optimal operation of distribution networks considering distributed generators (DGs). Due to private ownership of DGs, a cost based compensation method is used to encourage DGs in active and reactive power generation. The objective function is the summation of the electrical energy generated by the DGs and the substation bus (main bus). Particle swarm optimization is used to solve the optimal operation problem. The approach is tested on an IEEE 34-bus distribution feeder.
Article
Ant algorithms are optimisation algorithms inspired by the foraging behaviour of real ants in the wild. Introduced in the early 1990s, ant algorithms aim at finding approximate solutions to optimisation problems through the use of artificial ants and their indirect communication via synthetic pheromones. The first ant algorithms and their development into the Ant Colony Optimisation (ACO) metaheuristic is described herein. An overview of past and present typical applications as well as more specialised and novel applications is given. The use of ant algorithms alongside more traditional machine learning techniques to produce robust, hybrid, optimisation algorithms is addressed, with a look towards future developments in this area of study.
Article
Economic dispatch (ED) plays an important role in power system operation. ED problem is a non-smooth and non-convex problem when valve-point effects of generation units are taken into account. This paper presents an efficient hybrid evolutionary approach for solving the ED problem considering the valve-point effect. The proposed algorithm combines a fuzzy adaptive particle swarm optimization (FAPSO) algorithm with Nelder–Mead (NM) simplex search called FAPSO-NM. In the resulting hybrid algorithm, the NM algorithm is used as a local search algorithm around the global solution found by FAPSO at each iteration. Therefore, the proposed approach improves the performance of the FAPSO algorithm significantly. The algorithm is tested on two typical systems consisting of 13 and 40 thermal units whose incremental fuel cost functions take into account the valve-point loading effects.
Article
Data clustering helps one discern the structure of and simplify the complexity of massive quantities of data. It is a common technique for statistical data analysis and is used in many fields, including machine learning, data mining, pattern recognition, image analysis, and bioinformatics, in which the distribution of information can be of any size and shape. The well-known K-means algorithm, which has been successfully applied to many practical clustering problems, suffers from several drawbacks due to its choice of initializations. A hybrid technique based on combining the K-means algorithm, Nelder–Mead simplex search, and particle swarm optimization, called K–NM–PSO, is proposed in this research. The K–NM–PSO searches for cluster centers of an arbitrary data set as does the K-means algorithm, but it can effectively and efficiently find the global optima. The new K–NM–PSO algorithm is tested on nine data sets, and its performance is compared with those of PSO, NM–PSO, K–PSO and K-means clustering. Results show that K–NM–PSO is both robust and suitable for handling data clustering.
Article
A genetic algorithm-based clustering technique, called GA-clustering, is proposed in this article. The searching capability of genetic algorithms is exploited in order to search for appropriate cluster centres in the feature space such that a similarity metric of the resulting clusters is optimized. The chromosomes, which are represented as strings of real numbers, encode the centres of a fixed number of clusters. The superiority of the GA-clustering algorithm over the commonly used K-means algorithm is extensively demonstrated for four artificial and three real-life data sets.
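The chromosome encoding described above, a string of real numbers holding the centres of a fixed number of clusters, can be illustrated with the sketch below. The abstract does not state the exact clustering metric, so the cost used here (sum of Euclidean distances from each point to its nearest centre) is an assumption, as are the names `decode` and `clustering_cost`.

```python
from math import dist

def decode(chromosome, k, dim):
    """Split a flat list of k*dim real numbers into k centre tuples."""
    return [tuple(chromosome[i * dim:(i + 1) * dim]) for i in range(k)]

def clustering_cost(chromosome, points, k, dim):
    """Assumed metric: total distance from each point to its nearest centre."""
    centres = decode(chromosome, k, dim)
    return sum(min(dist(p, c) for c in centres) for p in points)

points = [(0.0, 0.0), (0.0, 2.0), (4.0, 0.0), (4.0, 2.0)]
good = [0.0, 1.0, 4.0, 1.0]  # centres (0, 1) and (4, 1)
poor = [2.0, 1.0, 9.0, 9.0]  # centres (2, 1) and (9, 9)
print(clustering_cost(good, points, k=2, dim=2))
print(clustering_cost(poor, points, k=2, dim=2))
```

A genetic algorithm would treat a lower cost as higher fitness, so the `good` chromosome would be preferred during selection.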
Article
We present a genetic algorithm for selecting centers to seed the popular k-means method for clustering. Using a novel crossover operator that exchanges neighboring centers, our GA identifies superior partitions using both benchmark and large simulated data sets.
Article
This paper considers a clustering problem where a given data set is partitioned into a certain number of natural and homogeneous subsets such that each subset is composed of elements similar to one another but different from those of any other subset. For the clustering problem, a heuristic algorithm is exploited by combining the tabu search heuristic with two complementary functional procedures, called packing and releasing procedures. The algorithm is numerically tested for its effectiveness in comparison with reference works including the tabu search algorithm, the K-means algorithm and the simulated annealing algorithm.
Conference Paper
The ant colony optimization (ACO) algorithm has recently been applied to data mining. Focusing on Ant-Miner, a classification rule learning algorithm based on ACO, this paper presents an enhanced Ant-Miner with two main contributions. First, a rule punishing operator is employed to reduce the number of rules and the number of conditions. Second, an adaptive state transition rule and a mutation operator are applied to the algorithm to speed up convergence. Experimental results on several data sets demonstrate that the enhanced Ant-Miner can quickly discover better classification rules, which are short and have roughly competitive predictive accuracy.
Article
This paper introduces k′-means algorithm that performs correct clustering without pre-assigning the exact number of clusters. This is achieved by minimizing a suggested cost-function. The cost-function extends the mean-square-error cost-function of k-means. The algorithm consists of two separate steps. The first is a pre-processing procedure that performs initial clustering and assigns at least one seed point to each cluster. During the second step, the seed-points are adjusted to minimize the cost-function. The algorithm automatically penalizes any possible winning chances for all rival seed-points in subsequent iterations. When the cost-function reaches a global minimum, the correct number of clusters is determined and the remaining seed points are located near the centres of actual clusters. The simulated experiments described in this paper confirm good performance of the proposed algorithm.
Article
Clustering methods partition a set of objects into clusters such that objects in the same cluster are more similar to each other than objects in different clusters according to some defined criteria. The fuzzy k-means-type algorithm is best suited for implementing this clustering operation because of its effectiveness in clustering data sets. However, working only on numeric values limits its use because data sets often contain categorical values. In this paper, we present a tabu search based clustering algorithm, to extend the k-means paradigm to categorical domains, and domains with both numeric and categorical values. Using tabu search based techniques, our algorithm can explore the solution space beyond local optimality in order to aim at finding a global solution of the fuzzy clustering problem. It is found that the clustering results produced by the proposed algorithm are very high in accuracy.
Article
We introduce a novel clustering algorithm named GAKREM (Genetic Algorithm K-means Logarithmic Regression Expectation Maximization) that combines the best characteristics of the K-means and EM algorithms but avoids their weaknesses such as the need to specify a priori the number of clusters, termination in local optima, and lengthy computations. To achieve these goals, genetic algorithms for estimating parameters and initializing starting points for the EM are used first. Second, the log-likelihood of each configuration of parameters and the number of clusters resulting from the EM is used as the fitness value for each chromosome in the population. The novelty of GAKREM is that in each evolving generation it efficiently approximates the log-likelihood for each chromosome using logarithmic regression instead of running the conventional EM algorithm until its convergence. Another novelty is the use of K-means to initially assign data points to clusters. The algorithm is evaluated by comparing its performance with the conventional EM algorithm, the K-means algorithm, and the likelihood cross-validation technique on several datasets.
Conference Paper
Biological systems have often provided inspiration for the design of artificial systems. One such example of a natural system that has inspired researchers is the ant colony. In this paper an algorithm for multi-agent reinforcement learning, a modified Q-learning, is proposed. The algorithm is inspired by the natural behaviour of ants, which deposit pheromones in the environment to communicate. The benefit, besides simulating ant behaviour in a colony, is to design complex multiagent systems. Complex behaviour can emerge from relatively simple interacting agents. The proposed Q-learning update equation includes a belief factor. The belief factor reflects the confidence the agent has in the pheromone detected in its environment. Agents communicate implicitly to co-operate in learning to solve a path-planning problem. The results indicate that combining synthetic pheromone with standard Q-learning speeds up the learning process. It will be shown that the agents can be biased towards a preferred solution by adjusting the pheromone deposit and evaporation rates.
Conference Paper
A concept for the optimization of nonlinear functions using particle swarm methodology is introduced. The evolution of several paradigms is outlined, and an implementation of one of the paradigms is discussed. Benchmark testing of the paradigm is described, and applications, including nonlinear function optimization and neural network training, are proposed. The relationships between particle swarm optimization and both artificial life and genetic algorithms are described.
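For readers unfamiliar with the particle swarm idea, a minimal global-best PSO for minimizing a nonlinear function might look like the sketch below. It is not the paradigm implemented in the paper: the inertia weight `w`, the coefficient values, the search range and the fixed seed are common later conventions assumed here for illustration.

```python
import random

def pso(f, dim, n_particles=20, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize f over R^dim with a global-best particle swarm (sketch)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]          # best position seen by each particle
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def sphere(x):
    """Simple benchmark with its minimum of 0 at the origin."""
    return sum(v * v for v in x)

best, best_val = pso(sphere, dim=2)
print(best, best_val)
```

For clustering, the same loop could be reused by letting each particle hold a flat vector of cluster centres and letting `f` be a clustering cost.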
G.J. Cullinan, Picking Them by Their Batting Averages: Recency-Frequency-Monetary Method of Controlling Circulation, Direct Mail/Marketing Association, New York, NY, 1977.
L.F. Armando, E.A. Schmitz, P. Lima, F.S.P. Manso, Optimized RFV analysis, Marketing Intelligence & Planning 24 (2006) 106-118.
D.N. Cao, J.C. Krzysztof, GAKREM: a novel hybrid clustering algorithm, Information Sciences 178 (2008) 4205-4227.
T. Niknam, J. Olamaie, B. Amiri, A hybrid evolutionary algorithm based on ACO and SA for cluster analysis, Journal of Applied Science 8 (15) (2008) 2695-2702.
T. Niknam, B. Amiri, J. Olamaie, A. Arefi, An efficient hybrid evolutionary optimization algorithm based on PSO and SA for clustering, Journal of Zhejiang University Science A, 2008, doi:10.1631/jzus.A0820196.
M. Raphael, Where did you come from? Direct Marketing 62 (2002) 36-38.
A honey-bee mating approach on clustering
  • Fathian