Dr. Aaron Aw Teik Hong, Ahmad Syahiran Bin Ahmad
Description of Invention
This study explores factors that influence heart disease, using a 2022 CDC dataset covering more than 200,000 Americans. Its objectives are to identify the significant risk factors, to investigate whether gender and COVID-19 status affect heart disease risk, and to determine which machine learning model predicts heart disease most accurately. The study followed the CRISP-DM methodology and applied three algorithms: Naïve Bayes, Logistic Regression, and Decision Tree. Logistic Regression was the most accurate of the three, a result validated with K-Fold Cross Validation. The findings show that ‘Had Angina’ significantly affects heart disease risk, while gender and COVID-19 status do not.
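To illustrate how K-Fold Cross Validation can be used to compare candidate models, here is a minimal standard-library sketch. It is not the study's code: the helper names (`k_fold_indices`, `cross_val_accuracy`, `fit_majority`, `fit_single_feature`), the two stand-in "models", and the ten-row toy dataset are all hypothetical and invented for illustration. In practice each `fit` function would wrap one of the three algorithms named above.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and deal them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_val_accuracy(fit, X, y, k=5, seed=0):
    """Mean held-out accuracy of the predictor returned by fit(X_train, y_train)."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return mean(scores)


def fit_majority(X_train, y_train):
    """Hypothetical baseline: always predict the most common training label."""
    majority = max(set(y_train), key=y_train.count)
    return lambda x: majority


def fit_single_feature(X_train, y_train):
    """Hypothetical rule: predict the label directly from feature 0."""
    return lambda x: x[0]


# Invented toy data (NOT from the CDC dataset): one binary feature per row,
# standing in for a 'Had Angina' flag, and a binary heart disease label.
X = [[1], [0], [0], [1], [0], [0], [1], [0], [0], [0]]
y = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]

# Score every candidate on the same folds, then pick the highest mean accuracy.
candidates = {"majority": fit_majority, "single_feature": fit_single_feature}
results = {name: cross_val_accuracy(fit, X, y, k=5) for name, fit in candidates.items()}
best_model = max(results, key=results.get)
```

Reusing the same seed for every candidate means all models are scored on identical folds, so differences in mean accuracy reflect the models rather than the split.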