Dataset before and after normalization

Source publication
Article
Cancer is a non-contagious disease characterized by abnormal cell growth; breast cancer, for example, can affect both men and women. Breast cancer is classified as one of the dangerous cancers and claims many victims. However, the biggest problem in this study is that the classification method performs poorly and the resulting accuracy...

Context in source publication

Context 1
... order for the method to be used to recognize data as input, it is necessary to normalize the data using a scale in the interval [0, 1]. Table 2 shows the world crude oil price dataset before and after normalization in the interval [0, 1]. ...
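The excerpt does not say which scaling formula was used. A common way to map data into [0, 1] is min-max normalization, x' = (x - min) / (max - min). The sketch below assumes that formula; the function name `min_max_normalize` and the sample values are hypothetical and are not taken from Table 2.

```python
def min_max_normalize(values, lo=0.0, hi=1.0):
    """Linearly rescale values so the minimum maps to lo and the maximum to hi."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant column has no spread; map everything to lo
        # instead of dividing by zero.
        return [lo for _ in values]
    return [lo + (v - v_min) / (v_max - v_min) * (hi - lo) for v in values]


# Hypothetical raw values for illustration only (not from the source dataset).
raw = [20.0, 35.0, 50.0, 80.0]
normalized = min_max_normalize(raw)
print(normalized)
```

The minimum and maximum should be computed on the training split only and reused for the test data, so that no information from the test set leaks into the scaling.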

Similar publications

Article
In the realm of machining, the surface finish of the final product serves as a pivotal quality indicator, signifying the excellence of the manufactured component. Consequently, a pressing requirement exists for dependable and precise predictive models that can effectively oversee the surface finish of machined parts throughout the in-process stage....
Article
Enhancing IoT security is a cornerstone for building trust in its technology and driving its growth. Limited resources and the diverse nature of IoT devices make them vulnerable to attacks. Botnet attacks compromise IoT systems and can pose significant security challenges. Numerous investigations have utilized machine learning and deep learnin...
Article
Diabetes is a disease that does not show apparent, immediately visible symptoms, making patients who suffer from it unaware that they have diabetes. Therefore, diabetes is usually only discovered when it has damaged vital parts such as the kidneys, eyes, and human nerves. According to WHO, diabetes is the 9th most deadly disease in the world. Early...
Article
The behavior of reinforced concrete (RC) deep beams is complex and difficult to predict due to factors such as compressive and shear stress and beam geometry. To address this challenge, researchers have proposed various machine learning models such as Artificial Neural Network, Decision Tree, Support Vector Machine, Adaptive Boosting, Extreme Gradi...
Article
Heart disease is responsible for around one-third of all deaths worldwide, as shown by the statistics. The use of machine learning to predict cardiac illness has emerged as an important technique for both treating and preventing the ailment as more research is carried out in this area. A unique method that we ar...

Citations

... The Logistic Regression meta-model then takes these predictions and analyzes them to understand how each base model contributes to the overall prediction, as shown in Figure 2. Models are evaluated based on metrics such as accuracy, precision, recall, and F1-score. This evaluation was carried out through cross-validation to ensure the reliability of the results [16][17][18][19][20]. Results from each base model and the stacking ensemble model are compared to determine performance improvements, if any. ...
Article
In today's digital era, user reviews on the Playstore platform are an invaluable source of information for developers, offering insights that are critical for service improvement. Previous research has explored the application of stacking ensemble methods, such as in the context of predicting depression among university students, to enhance prediction accuracy. However, these studies often do not explicitly detail the data acquisition process, leaving a gap in understanding the applicability of these methods to different domains. This research aims to bridge this gap by applying the stacking ensemble approach to improve the accuracy of sentiment classification in Playstore reviews, with a clear exposition of the data collection method. Utilizing Logistic Regression as the meta classifier, this methodology is executed in several stages. Initially, data was collected from user reviews of online loan applications on Google Playstore, ensuring transparency in the data acquisition process. The data is then classified using three basic models: Random Forest, Naive Bayes, and SVM. The outputs of these models serve as inputs to the Logistic Regression meta model. A comparison of each base model output with the meta model was subsequently carried out. The test results on the Playstore review dataset demonstrated an increase in accuracy, precision, recall, and F1 score compared to using a single model, achieving an accuracy of 87.05%, which surpasses Random Forest (85.6%), Naive Bayes (85.55%), and SVM (86.5%). This indicates the effectiveness of the stacking ensemble method in providing deeper and more accurate insights into user sentiment, overcoming the limitations of single models and previous research by explicitly addressing data acquisition methods.
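The stacking arrangement described in this abstract (Random Forest, Naive Bayes, and SVM feeding a Logistic Regression meta-model) can be sketched with scikit-learn's `StackingClassifier`. This is only an illustration under stated assumptions: the data are synthetic numeric features from `make_classification` rather than Playstore review text, `GaussianNB` stands in for whichever Naive Bayes variant the authors used, and every hyperparameter is a placeholder, so the scores it prints say nothing about the reported 87.05%.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Assumption: synthetic stand-in data, not the review dataset from the paper.
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Three base models, mirroring the ones named in the abstract.
base_models = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("nb", GaussianNB()),
    ("svm", SVC(random_state=0)),
]

# The meta-model is trained on out-of-fold predictions of the base models
# (cv=5), which keeps it from simply memorizing their training-set outputs.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)

y_pred = stack.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="binary"
)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

To reproduce the comparison the abstract describes, each base model would also be fitted and scored on its own against the same test split.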
... One of the most popular classification algorithm models is the Support Vector Machine (SVM) which separates two classes of data with a hyperplane. SVM has been widely used in various fields due to its superior capabilities in fault diagnosis [44], disease detection [45], [46], credit fraud detection [47], [48], and financial prediction [49]. Certain investigations applied PCA feature extraction method for model optimization [50] by reducing data dimensionality and computational burden, as well as expediting the classification process. ...
Article
Background: Parkinson's disease (PD) is a critical neurodegenerative disorder affecting the central nervous system and often causing impaired movement and cognitive function in patients. In addition, its diagnosis in the early stages requires a complex and time-consuming process because all existing tests such as electroencephalography or blood examinations lack effectiveness and accuracy. Several studies explored PD prediction using sound, with a specific focus on the development of classification models to enhance accuracy. The majority of these neglected crucial aspects including feature extraction and proper parameter tuning, leading to low accuracy. Objective: This study aims to optimize performance of voice-based PD prediction through feature extraction, with the goal of reducing data dimensions and improving model computational efficiency. Additionally, appropriate parameters will be selected for enhancement of the ability of the model to identify both PD cases and healthy individuals. Methods: The proposed new model applied an OpenML dataset comprising voice recordings from 31 individuals, namely 23 PD patients and 8 healthy participants. The experimental process included the initial use of the SVM algorithm, followed by implementing PCA for feature extraction to enhance machine learning accuracy. Subsequently, data balancing with SMOTE was conducted, and GridSearchCV was used to identify the best parameter combination based on the predicted model characteristics. Result: Evaluation of the proposed model showed an impressive accuracy of 97.44%, sensitivity of 100%, and specificity of 85.71%. This excellent result was achieved with a limited dataset and a 10-fold cross-validation tuning, rendering the model sensitive to the training data. Conclusion: This study successfully enhanced the prediction model accuracy through the SVM+PCA+GridSearchCV+CV method. 
However, future investigations should consider an appropriate number of folds for a small dataset, explore alternative cross-validation methods, and expand the dataset to enhance model generalizability. Keywords: GridSearchCV, Parkinson Disease, SVM, PCA, SMOTE, Voice/Speech
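The accuracy, sensitivity, and specificity quoted in this abstract all derive from the four cells of a binary confusion matrix. The abstract does not give that matrix, so the counts below are hypothetical, chosen only because they happen to reproduce the reported percentages; the actual test-set composition may differ. The helper name `binary_metrics` is likewise an assumption.

```python
def binary_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity, and specificity from confusion-matrix counts.

    Assumes at least one positive and one negative sample, so that
    neither denominator is zero.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate (PD cases found)
        "specificity": tn / (tn + fp),  # true negative rate (healthy found)
    }


# Hypothetical counts, NOT taken from the paper: 32 PD samples all detected,
# 7 healthy samples with one misclassified as PD.
metrics = binary_metrics(tp=32, fn=0, tn=6, fp=1)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With so few negative samples in a split like this, a single misclassification moves specificity by about 14 percentage points, which illustrates why the abstract warns that the result is sensitive to the training data.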