Dataset before and after normalization

Source publication

Ensemble learning technique to improve breast cancer classification model

Article

Jun 2023

Cancer is a non-contagious disease characterized by abnormal cell growth; breast cancer, for example, can affect both men and women. Breast cancer is among the more dangerous cancers and claims many lives. However, the main problem addressed in this study is that the classification method performs poorly and the resulting accuracy...

Context 1

... order for the method to be used to recognize data as input, it is necessary to normalize the data using a scale in the interval [0, 1]. Table 2 shows the world crude oil price dataset before and after normalization in the interval [0, 1]. ...
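The excerpt does not say which normalization formula was used. A common way to rescale data onto the interval [0, 1] is min-max normalization, x' = (x - min) / (max - min). The snippet below is a minimal sketch under that assumption; the function name `min_max_normalize` and the sample values are hypothetical and are not taken from the publication.

```python
def min_max_normalize(values):
    """Rescale a sequence of numbers linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map every entry to 0.0
        # instead of dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical raw feature values (illustration only, not from the paper).
raw = [10.0, 15.0, 20.0, 30.0]
normalized = min_max_normalize(raw)
print(normalized)  # [0.0, 0.25, 0.5, 1.0]
```

The smallest value maps to 0 and the largest to 1, so features measured on very different scales become comparable before they are passed to a classifier.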

Acquired machined surface texture images for varying machining lengths...

Contrast between the original grayscale image and the pre-processed...

An in-process machined surface roughness classification using an ensemble learning algorithm based on extracted automated features from real-time surface images in milling process

Article

Jan 2024

Sarat Babu Mulpur

In the realm of machining, the surface finish of the final product serves as a pivotal quality indicator, signifying the excellence of the manufactured component. Consequently, a pressing requirement exists for dependable and precise predictive models that can effectively oversee the surface finish of machined parts throughout the in-process stage....

A comparative analysis of using ensemble trees for botnet detection and classification in IoT

Article

Dec 2023

Enhancing IoT security is a cornerstone for building trust in its technology and driving its growth. The limited resources and diverse nature of IoT devices make them vulnerable to attacks. Botnet attacks compromise IoT systems and can pose significant security challenges. Numerous investigations have utilized machine learning and deep learnin...

Implementation of the Adaboost Method to Increase the Accuracy of Early Diabetes Predictions to Prevent Death Decision Tree-Based

Article

Mar 2024

Laskar Alam

Diabetes is a disease that does not show apparent, immediately visible symptoms, making patients who suffer from it unaware that they have diabetes. Therefore, diabetes is usually only discovered when it has damaged vital parts such as the kidneys, eyes, and human nerves. According to WHO, diabetes is the 9th most deadly disease in the world. Early...

A robust approach to shear strength prediction of reinforced concrete deep beams using ensemble learning with SHAP interpretability

Article

Dec 2023

The behavior of reinforced concrete (RC) deep beams is complex and difficult to predict due to factors such as compressive and shear stress and beam geometry. To address this challenge, researchers have proposed various machine learning models such as Artificial Neural Network, Decision Tree, Support Vector Machine, Adaptive Boosting, Extreme Gradi...

Fig. 2: Heat map of the relationships among 14 attributes in the UCI dataset.

Fig. 3: Architecture of CVD (Cardiovascular Disease)

Fig. 4: Features by feature importance score

A Novel Ensemble Approach with HGBDTRF for Enhanced Detection and Prediction of Heart Disease

Article

Apr 2024

V. Ramesh

Heart disease is responsible for around one-third of all deaths worldwide, as statistics show. The use of machine learning to predict cardiac illness has emerged as an important technique for both treating and preventing the disease as more research is carried out in this area. A unique method that we ar...

Improved playstore review sentiment classification accuracy with stacking ensemble

Article

Mar 2024

In today's digital era, user reviews on the Playstore platform are an invaluable source of information for developers, offering insights that are critical for service improvement. Previous research has explored the application of stacking ensemble methods, such as in the context of predicting depression among university students, to enhance prediction accuracy. However, these studies often do not explicitly detail the data acquisition process, leaving a gap in understanding the applicability of these methods to different domains. This research aims to bridge this gap by applying the stacking ensemble approach to improve the accuracy of sentiment classification in Playstore reviews, with a clear exposition of the data collection method. Utilizing Logistic Regression as the meta classifier, this methodology is executed in several stages. Initially, data was collected from user reviews of online loan applications on Google Playstore, ensuring transparency in the data acquisition process. The data is then classified using three base models: Random Forest, Naive Bayes, and SVM. The outputs of these models serve as inputs to the Logistic Regression meta model. A comparison of each base model output with the meta model was subsequently carried out. The test results on the Playstore review dataset demonstrated an increase in accuracy, precision, recall, and F1 score compared to using a single model, achieving an accuracy of 87.05%, which surpasses Random Forest (85.6%), Naive Bayes (85.55%), and SVM (86.5%). This indicates the effectiveness of the stacking ensemble method in providing deeper and more accurate insights into user sentiment, overcoming the limitations of single models and previous research by explicitly addressing data acquisition methods.

Optimizing Support Vector Machine Performance for Parkinson's Disease Diagnosis Using GridSearchCV and PCA-Based Feature Extraction

Article

Feb 2024

Background: Parkinson's disease (PD) is a critical neurodegenerative disorder affecting the central nervous system and often causing impaired movement and cognitive function in patients. In addition, its diagnosis in the early stages requires a complex and time-consuming process because all existing tests such as electroencephalography or blood examinations lack effectiveness and accuracy. Several studies explored PD prediction using sound, with a specific focus on the development of classification models to enhance accuracy. Most of these neglected crucial aspects including feature extraction and proper parameter tuning, leading to low accuracy. Objective: This study aims to optimize the performance of voice-based PD prediction through feature extraction, with the goal of reducing data dimensions and improving model computational efficiency. Additionally, appropriate parameters will be selected to enhance the model's ability to identify both PD cases and healthy individuals. Methods: The proposed model used an OpenML dataset comprising voice recordings from 31 individuals, namely 23 PD patients and 8 healthy participants. The experimental process included the initial use of the SVM algorithm, followed by implementing PCA for feature extraction to enhance machine learning accuracy. Subsequently, data balancing with SMOTE was conducted, and GridSearchCV was used to identify the best parameter combination based on the predicted model characteristics. Result: Evaluation of the proposed model showed an impressive accuracy of 97.44%, sensitivity of 100%, and specificity of 85.71%. This excellent result was achieved with a limited dataset and a 10-fold cross-validation tuning, rendering the model sensitive to the training data. Conclusion: This study successfully enhanced the prediction model accuracy through the SVM+PCA+GridSearchCV+CV method. However, future investigations should consider an appropriate number of folds for a small dataset, explore alternative cross-validation methods, and expand the dataset to enhance model generalizability.

Keywords: GridSearchCV, Parkinson Disease, SVM, PCA, SMOTE, Voice/Speech
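The accuracy, sensitivity, and specificity quoted in the abstract are all derived from a confusion matrix. The sketch below shows the standard definitions. The abstract does not give the actual confusion matrix, so the counts used here are an assumption: they are one hypothetical set of values that happens to reproduce the reported percentages, and the function name `classification_metrics` is likewise illustrative.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,   # share of all predictions that are correct
        "sensitivity": tp / (tp + fn),   # share of true PD cases that are detected
        "specificity": tn / (tn + fp),   # share of healthy cases correctly cleared
    }


# Hypothetical counts (assumption, not reported in the abstract):
# 32 true positives, 0 false negatives, 6 true negatives, 1 false positive.
metrics = classification_metrics(tp=32, fn=0, tn=6, fp=1)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With a small number of healthy test samples, one misclassification moves specificity by many percentage points, which fits the abstract's own caution about the limited dataset.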
