Figure - available from: Cluster Computing
Comparison of newly designed algorithms

Source publication
Article
The traditional k-means clustering algorithm is sensitive to the choice of initial cluster centers and can converge to locally optimal results. k-means++ is a hybrid k-means clustering algorithm that specifies a procedure for initializing the cluster centers before proceeding with the standard k-means algorithm. Inspired by nature, some contemporary optimizatio...

Similar publications

Article
Inverse Weighted K-means is less sensitive to poor initialization than the traditional K-means algorithm. Therefore, this paper introduces a new hybrid algorithm that integrates the inverse weighted k-means algorithm with the bat optimization algorithm, which takes advantage of both algorithms: on one side, the quick convergence and the best global fit...

Citations

... Its core idea is as follows: first, randomly select k initial cluster centers Ci (1 ≤ i ≤ k) from the dataset [8], calculate the Euclidean distance between each remaining data object and every cluster center Ci, find the cluster center closest to the data object, and assign the data object to the corresponding cluster. Then, calculate the average value of the data objects in each cluster as the new cluster center, proceed to the next iteration, and continue until the cluster centers no longer change or the maximum number of iterations is reached [9][10]. ...
Article
Integrated development of bus stations is proposed in response to the city’s priority development policy for public transportation. This approach aims to address the challenges of limited station land and increased financial pressure on bus operations by effectively utilizing urban land resources. Therefore, the selection of integrated development methods for bus stations becomes crucial. This study examines the characteristics of 36 bus station sites in Xi’an, considering both their intrinsic features and external environmental attributes. Using diverse data collection methods and analysis techniques, including Python, ArcGIS, and the Gaode Maps Open Platform, the research focuses on four dimensions: Points of Interest (POI), location characteristics, station attributes, and station connectivity features. By identifying 24 key clustering factors through Z-score standardization, Principal Component Analysis (PCA), elbow method, and K-means++ clustering, the 36 bus stations are categorized into five types: residential community development, commercial development, office building development, integrated development, and not recommended for development. These classification results provide guidance for future research on integrated development methods for bus stations.
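The k-means procedure quoted in the citing excerpt for this entry (random initial centers, nearest-center assignment by Euclidean distance, mean update, stop on convergence or an iteration cap) can be sketched roughly as follows. This is a minimal illustration and not the code of any publication listed here; the function name `kmeans`, its parameters, and the choice to keep the old center when a cluster becomes empty are assumptions made for the example.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means on a list of equal-length numeric tuples (illustrative sketch)."""
    rng = random.Random(seed)
    # Step 1: pick k data points at random as the initial centers.
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Step 2: assign each point to its nearest center (Euclidean distance).
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Step 3: recompute each center as the mean of its assigned points.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                # Assumption: an empty cluster simply keeps its previous center.
                new_centers.append(centers[j])
        # Step 4: stop once the centers no longer change.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels
```

Calling `kmeans([(0.0,), (1.0,), (10.0,), (11.0,)], 2)` groups the two small values together and the two large values together. On less well-separated data the result depends on the random initial centers, which is the sensitivity that k-means++ seeding is meant to reduce.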
... The K-Means++ approach tends to initialize centroids at distant places, reducing the chance of empty clusters or of several clusters linked to a single centroid. Because of this initialization stage, the centroids are spread more evenly over the data space, reducing the possibility that the algorithm becomes trapped in a local minimum [77]. K-Means++ and the regular K-Means method are essentially identical, with the exception of the initial step. ...
Article
Data mining is an analytical approach that contributes to achieving a solution to many problems by extracting previously unknown, fascinating, nontrivial, and potentially valuable information from massive datasets. Clustering in data mining is used for splitting or segmenting data items/points into meaningful groups and clusters by grouping the items that are near to each other based on certain statistics. This paper covers various elements of clustering, such as algorithmic methodologies, applications, clustering assessment measurement, and researcher-proposed enhancements with their impact on data mining thorough grasp of clustering algorithms, its applications, and the advances achieved in the existing literature. This study includes a literature search for papers published between 1995 and 2023, including conference and journal publications. The study begins by outlining fundamental clustering techniques along with algorithm improvements and emphasizing their advantages and limitations in comparison to other clustering algorithms. It investigates the evolution measures for clustering algorithms with an emphasis on metrics used to gauge clustering quality, such as the F-measure and the Rand Index. This study includes a variety of clustering-related topics, such as algorithmic approaches, practical applications, metrics for clustering evaluation, and researcher-proposed improvements. It addresses numerous methodologies offered to increase the convergence speed, resilience, and accuracy of clustering, such as initialization procedures, distance measures, and optimization strategies. The work concludes by emphasizing clustering as an active research area driven by the need to identify significant patterns and structures in data, enhance knowledge acquisition, and improve decision making across different domains. 
This study aims to contribute to the broader knowledge base of data mining practitioners and researchers, facilitating informed decision making and fostering advancements in the field through a thorough analysis of algorithmic enhancements, clustering assessment metrics, and optimization strategies.
... Each cluster has a center, called the center of mass, and the k-value needs to be given. The k-means clustering algorithm proceeds as follows [51]: ...
Article
In order to improve the air pollution problem in northern China in winter, coal-to-electricity (CtE) projects are being vigorously implemented. Although the CtE project has a positive effect on alleviating air pollution and accelerating clean energy development, the economic benefits of electric heating are currently poor. In this study, a system based on vehicle-to-home (V2H) and photovoltaic power generation that can effectively improve the benefits of CtE projects is proposed. First, a V2H-based village microgrid is proposed. The winter temperature and direct radiation of the Beijing CtE project area are analyzed. Extreme operating conditions and typical operating conditions are constructed for potential analysis. After that, a bi-layer optimization model for energy management considering travel characteristics is proposed. The upper layer is a village-level microgrid energy-dispatching model considering meeting the heating load demand, and the lower layer is a multi-vehicle energy distribution model considering the battery degradation. The results show that the distribution grid expansion capacity of the electric heating system based on V2H and PV generation is reduced by 45.9%, and the residents’ electricity bills are reduced by 68.5%. The consumption of PV can be completed. This study has effectively increased the benefits of electric heating in northern China during winter. This helps the CtE project to be further promoted without leading to large subsidies from the government and the State Grid.
... According to the nature of the dataset in this paper, the size of the original prior box is no longer applicable, and it is necessary to obtain a prior box size suitable for this paper. In order to obtain an anchor value that is more suitable for target detection, the K-means [19,20] clustering algorithm and genetic algorithm [21] are used to calculate the anchor value by mutating the result of the K-means clustering. Among them, the Euclidean distance used in K-means clustering is changed in a distance-based way so that we can cluster to a more appropriate anchor value. ...
Preprint
When ground-penetrating radar is used to detect targets within concrete, the location of the targets, the identification of different shapes, properties and less obvious echoes all greatly increase the interpretation time of the staff and can easily cause misjudgment of the echo images. In this paper, the ground-penetrating radar echo images (B-scan) after processing are mean filtered to eliminate the direct waves that interfere greatly with the echoes. The RFB-s structure is added to the YOLOv3-SPP network structure, while the Anchor value is optimized and the EIOU loss function is introduced. For four types of data with different shapes and properties at random target locations, three models, YOLOv3, YOLOv3-SPP and the improved YOLOv3-SPP, are used for classification and identification, and the proposed algorithm models are comprehensively evaluated using model evaluation metrics. The experimental results show that the algorithm models proposed in this paper have good recognition effect in ground-penetrating radar echo image target detection.
... The K-means++ algorithm [6][7] is a partition-based clustering algorithm. The elbow rule is used to determine the optimal number of clusters (k) for dividing the five chemical components of high-potassium and lead-barium glass, respectively. Once k is established, it is passed to the k-means++ algorithm for cluster analysis to obtain the division of high-potassium and lead-barium glass artifacts. ...
Article
This study aims to classify and identify the types of ancient glass products according to their chemical composition. The data were selected from the proportions of chemical components that have been analyzed for ancient glass. The best number of clusters (k) was roughly determined using the elbow rule for the five chemical components of high-potassium and lead-barium glass, respectively, and passed to the k-means++ algorithm for cluster analysis; the final determination of k and the evaluation of the rationality of the clustering were then performed using the silhouette coefficient. Finally, the Fisher discriminant analysis method based on variable meritocracy, combined with eigenvalues, Wilks' lambda, and classification function coefficients, was used to identify unknown categories of glass artifacts. The model was analyzed and evaluated with good results, and it is applicable to the classification of ancient glass artifacts and the identification of the type to which they belong.
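The two quantities behind the k-selection steps in this entry can be sketched as below: the within-cluster sum of squares that the elbow rule plots against k, and a clustering quality coefficient for the final choice of k. The sketch assumes that the coefficient used in the abstract is the standard silhouette coefficient; the function names `wcss` and `silhouette` and the score of 0 for singleton clusters are choices made for this example, not details taken from the study.

```python
import math


def _group(points, labels):
    """Collect points by cluster label."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    return clusters


def wcss(points, labels):
    """Within-cluster sum of squared distances to each cluster mean (elbow curve value)."""
    total = 0.0
    for members in _group(points, labels).values():
        mean = tuple(sum(c) / len(members) for c in zip(*members))
        total += sum(math.dist(p, mean) ** 2 for p in members)
    return total


def silhouette(points, labels):
    """Mean silhouette coefficient; requires at least two clusters."""
    clusters = _group(points, labels)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # assumption: singleton clusters score 0
            continue
        # a: mean distance from p to the other points of its own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance from p to the points of any other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

In use, one would cluster the data for a range of k values, look for the bend in `wcss` as k grows, and then prefer the candidate k whose labeling gives the highest `silhouette` value.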
... The K-means++ algorithm is based on the traditional K-means algorithm and improves the selection of the initial clustering centers: assuming that n manufacturing cell centers have been selected (0 < n < K), then when selecting the (n + 1)-th manufacturing cell center, points farther from the current n manufacturing cell centers have a higher probability of being selected. Additionally, it overcomes the effect of random selection of the initial clustering centers in the traditional K-means algorithm and effectively improves the clarity and efficiency of manufacturing cell classification [27]. The specific processes of the K-means++ algorithm for classifying the manufacturing cells of complex aerospace components are summarized as follows: ...
Article
To cope with the problems of frequent mold changes, long production cycles, and serious logistics crossings in the workshops of aerospace enterprises, a manufacturing cell layout planning method based on the feature bit code domain method and K-Means++ is first proposed to realize the accurate division of manufacturing cells. Then, a multiobjective optimization method of dynamic reconstruction layout based on an improved fruit fly optimization algorithm (IFOA) is proposed to solve the reconstruction layout optimization problem of the production workshop, with logistics cost, reconstruction cost, loss cost, and cell integrated area as the optimization objectives. Finally, plant simulation software is applied to simulate the workshop layout before and after optimization. The simulation results show that the logistics cost of the workshop cell layout after optimization is reduced by 8.7%, the utilization rate of the workshop area is improved by 5.2%, and the value-added rate of products is increased by 6.6%, which verifies the effectiveness and feasibility of the proposed model and method.
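The seeding rule described in the citing excerpt for this entry, where points farther from the already chosen centers are more likely to become the next center, can be sketched as follows. The excerpt does not state the exact probability; the sketch assumes the usual k-means++ formulation, in which each point is drawn with probability proportional to its squared distance to the nearest chosen center. The function name `kmeanspp_init` and its parameters are hypothetical.

```python
import math
import random


def kmeanspp_init(points, k, seed=0):
    """Choose k initial centers by squared-distance-weighted sampling (illustrative sketch)."""
    rng = random.Random(seed)
    # The first center is drawn uniformly at random from the data.
    centers = [rng.choice(points)]
    # d2[i] is the squared distance from points[i] to its nearest chosen center.
    d2 = [math.dist(p, centers[0]) ** 2 for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen center; fall back to a uniform draw.
            nxt = rng.choice(points)
        else:
            # Farther points get proportionally larger weights, so they are
            # more likely to be picked as the next center.
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        # Update nearest-center distances to account for the new center.
        d2 = [min(old, math.dist(p, nxt) ** 2) for p, old in zip(points, d2)]
    return centers
```

The returned centers would then replace the purely random initial centers of standard k-means; the assignment and update iterations stay the same.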
... Given the huge size of the search space, in order to find the optimal cluster centers and to improve the quality of clusters, which leads to better separation of users, other combinations of the kMeans++ algorithm with meta-heuristic algorithms have been proposed in the studies [25][26][27][28], which show the improved performance of kMeans++. By extracting different characteristics from the ranking matrix and presenting new combinations [29]. ...
... The results obtained for different datasets within the field of machine learning indicated that this approach performs better than standard kMeans clustering. In [27], the kMeans++ clustering approach was combined with the meta-heuristic cuckoo, krill, and bat algorithms. Prior to following the standard kMeans, the kMeans++ algorithm used a specific procedure for initializing the initial cluster centers. ...
... A review of the previous work indicates that one of the optimization methods for the kMeans and kMeans++ algorithms is the use of meta-heuristic algorithms, amongst which ant colony [28], particle swarm [41], firefly [40], cuckoo [37], and krill [27] algorithms have been widely used to optimize clustering algorithms. In this study, firefly, cuckoo, and krill algorithms are utilized to optimize the MkMeans++ algorithm. ...
Article
To offer appropriate recommendations to customers in recommender systems, clustering users and separating those with different tastes from the rest is of significant importance. The MkMeans++ algorithm is a technique for clustering and separating users in collaborative filtering systems. This algorithm utilizes a specific procedure for selecting the initial centroids of the clusters and performs better than similar algorithms such as kMeans++. In this paper, the MkMeans++ algorithm is combined with the Firefly, Cuckoo, and Krill algorithms, and new algorithms called FireflyMkMeans++, CuckooMkMeans++, and KrillMkMeans++ are introduced in order to specify the optimal centroid of each cluster, better separate users, and avoid local optima. In the proposed hybrid clustering approach, the initial population of the firefly, cuckoo, and krill algorithms is initialized with the solutions generated by the MkMeans++ algorithm, which makes use of the benefits of MkMeans++ as well as of the firefly, cuckoo, and krill algorithms. Results and implementations on both the MovieLens and FilmTrust datasets indicate that the proposed algorithms can perform better than similar algorithms in clustering and separating users with different tastes (graysheep users), and enhance the quality of clusters and the accuracy of recommendations for users with similar tastes (white users).
... The Krill algorithm has been successful in various fields such as text clustering [35], feature selection [36], and training artificial neural networks [37]. Similarly, Cuckoo search is advantageous because of its ideal breeding behaviour and global exploration ability [38,39]. Motivated by these facts, we propose a hybrid meta-heuristic algorithm based on Krill herd and Cuckoo search. ...
Article
Wireless Sensor Networks (WSNs) have developed into a vital tool for diverse real-time applications such as environmental monitoring, health care, wide-area surveillance, and many more. Though the advantages of WSNs are plenty, the present challenge is to gain effective control over the depleting battery power and the network lifetime. Recent research has shown that energy consumption can be minimized if effective clustering mechanisms are incorporated. This paper proposes HOCK and HECK, novel energy-efficient clustering algorithms to increase the network lifetime for homogeneous and heterogeneous environments, respectively. Both algorithms are built using Krill herd and Cuckoo search: the optimal cluster centroid positions are computed using the Krill herd algorithm, and Cuckoo search is applied to select the optimal cluster heads. The performance of the HOCK algorithm is evaluated by varying base station locations and node density. To evaluate the HECK algorithm, two-level and three-level heterogeneity are considered. The simulation results show that the proposed protocol is more effective in improving the network lifetime of WSNs compared to other existing methods such as GAECH, Hybrid HSAPSO, and ESO-LEACH.
... The endeavor to integrate WSN with smart city applications in smart energy, road safety, and so on raises a slew of issues [11]. Reliable connectivity, data transfer safety, interoperability among embedded systems, information compliance, and so on are also the main issues. Networking is one of the key requirements in certain activity implementations, but it necessitates a really careful selection [12]. The appropriate choice of a given network may significantly increase the WSN's lifespan [13]. ...
Article
Due to its widespread serviceability, the sensor network receives considerable interest. Within smart cities featuring multiple, collection and transmission, the function of the wireless sensor system is generally significant. Effective evaluation architecture is an automated process for monitoring individual consumers' power usage. The required information was typically transmitted through Wireless LAN (WLAN) media. In addition to gathering and detecting, this is one of the main characteristics of an Advanced Meter Infrastructures (AMI) network. If the connection and node are in the right condition, effective communication is feasible. In most of the existing methods, network congestion and the occurrence of faults in continuous communication are the main drawbacks. To overcome these limitations, the Grid Topological Network Architecture to give an efficient route if there is a fault-tolerance networking system is presented in this work. The experiment exhibits improvement based on the energy usage and package distribution concerning the reduced and highly compressed Networking Routing Algorithm and the Ad-Hoc On-Demand Vector Method. From the results it is observed that delivery loss, mean response, and delay response are lowered by using the proposed method, and it is proven to be efficient compared to the existing methods. Delays, dependability, expandability, integration, flexibility, delivery prioritization, accessibility, ease of application, and tracking of the proposed method are compared with existing methods like RPL and AODV, and it is shown that the proposed scheme exhibits better performance.
... Nayak et al. [37] combined fuzzy c-means (FCM) with chemical reaction optimization (CRO) to achieve the global best solution. Aggarwal and Singh [38] introduced a nature-inspired algorithm for optimizing the k-means++ algorithm, aimed at overcoming the tendency to fall into local optima. Lakshmi et al. [39] mixed the crow search algorithm (CSA) with k-means, and the quality of the solutions obtained on the benchmark dataset was significantly improved. ...
Article
Clustering analysis is essential for obtaining valuable information from a predetermined dataset. However, traditional clustering methods suffer from falling into local optima and an overdependence on the quality of the initial solution. Given these defects, a novel clustering method called gradient-based elephant herding optimization for cluster analysis (GBEHO) is proposed. A well-defined set of heuristics is introduced to select the initial centroids instead of selecting random initial points. Specifically, the elephant optimization algorithm (EHO) is combined with the gradient-based algorithm GBO for assigning initial cluster centers across the search space. Second, to overcome the imbalance between the original EHO exploration and exploitation, the initialized population is improved by introducing Gaussian chaos mapping. In addition, two operators, i.e., random wandering and variation operators, are set to adjust the location update strategy of the agents. Nine datasets from synthetic and real-world datasets are adopted to evaluate the effectiveness of the proposed algorithm and the other metaheuristic algorithms. The results show that the proposed algorithm ranks first among the 10 algorithms. It is also extensively compared with state-of-the-art techniques, and four evaluation criteria of accuracy rate, specificity, detection rate, and F-measure are used. The obtained results clearly indicate the excellent performance of GBEHO, while the stability is also more prominent.