2: Heat Map of the Dataset

Source publication
The present paper reports an optimal machine learning model for effective prediction of cardiovascular diseases using the ensemble learning technique. The work gives insight into a coherent way of combining the Naive Bayes and Random Forest algorithms through an ensemble technique. It also discusses how the present model is diffe...

Similar publications

Remarkable advances in ensemble machine learning methods, such as random forest algorithms, have led to significant progress in the analysis of large data. However, these algorithms use only the current features during learning, which imposes an upper limit on accuracy no matter how good the algorithms are. Moreover, the low classification...

Citations

... However, many features may affect the final diagnosis when analyzing medical data. Therefore, to improve the accuracy of predictive models, many researchers are employing feature selection techniques that identify the most important features [20]. Many studies have proposed enhanced ensemble learning techniques with optimum feature selection methods for the detection of cardiac diseases. ...
Cardiovascular disease (CVD) is a leading cause of death globally; therefore, early detection of CVD is crucial. Many intelligent technologies, including deep learning and machine learning (ML), are being integrated into healthcare systems for disease prediction. This paper uses a voting ensemble ML model with chi-square feature selection to detect CVD early. Our approach involved applying multiple ML classifiers, including naïve Bayes, random forest, logistic regression (LR), and k-nearest neighbor. These classifiers were evaluated using accuracy, specificity, sensitivity, F1-score, the confusion matrix, and area under the curve (AUC). We created an ensemble model by combining the predictions of the different ML classifiers through a voting mechanism and compared its performance against the individual classifiers. Furthermore, we applied the chi-square feature selection method to the Cleveland cardiac disease dataset (303 records, 13 clinical features) to identify the 5 most important features. This approach improved the overall accuracy of our ensemble model and reduced the computational load by more than 50%. Our voting ensemble model achieved an accuracy of 92.11%, representing an average improvement of 2.95% over the best single classifier (LR). These results indicate that the ensemble method is a viable and practical approach to improving the accuracy of CVD prediction.
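The two ideas named in this abstract, chi-square feature ranking and combining classifiers by voting, can be illustrated with a minimal standard-library sketch. It is not the authors' implementation. It assumes categorical features scored with a Pearson chi-square statistic on a contingency table, and hard (majority) voting, since the abstract does not say which voting scheme was used. The function names, feature names (`f1`, `f2`, `f3`), and toy data are hypothetical and unrelated to the Cleveland dataset.

```python
from collections import Counter


def chi2_statistic(feature, labels):
    """Pearson chi-square statistic for one categorical feature vs. class labels."""
    n = len(labels)
    observed = Counter(zip(feature, labels))  # contingency-table cell counts
    feature_totals = Counter(feature)         # row totals
    label_totals = Counter(labels)            # column totals
    stat = 0.0
    for f, f_count in feature_totals.items():
        for c, c_count in label_totals.items():
            expected = f_count * c_count / n  # count expected under independence
            stat += (observed[(f, c)] - expected) ** 2 / expected
    return stat


def select_k_best(columns, labels, k):
    """Return the names of the k features with the highest chi-square statistic."""
    scores = {name: chi2_statistic(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


def majority_vote(predictions):
    """Hard voting: `predictions` holds one list of predicted labels per classifier."""
    # zip(*predictions) regroups the labels per sample; the most common label wins.
    return [Counter(sample).most_common(1)[0][0] for sample in zip(*predictions)]


# Hypothetical toy data: six samples, three categorical features.
labels = [1, 1, 1, 0, 0, 0]
columns = {
    "f1": [1, 1, 1, 0, 0, 0],  # perfectly associated with the label
    "f2": [1, 0, 1, 0, 1, 0],  # weakly associated
    "f3": [0, 0, 0, 0, 0, 0],  # constant, carries no information
}
selected = select_k_best(columns, labels, k=2)

# Hypothetical outputs of three already-trained classifiers on three samples.
votes = majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]])
```

With an odd number of binary classifiers, as in the usage lines above, a tie cannot occur. With an even number, this sketch resolves ties in favor of the label that appears first among the votes, which is an arbitrary choice.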