Abstract
Ensemble methods are machine-learning techniques that build several learners for a given task and combine their predictions. They aim to achieve high classification accuracy and improve overall performance. Because breast cancer prediction demands high accuracy, we use an ensemble technique that combines the predictions of several models. In this study, the proposed ensemble hard voting classifier combines five machine learning algorithms, namely Support Vector Machine (SVM), K-Nearest Neighbours (K-NN), Naive Bayes (NB), Decision Tree (DT), and Random Forest (RF), to provide a binary classification for breast cancer. The predictions of the individual classifiers are combined by hard voting, and the performance of the resulting ensemble is compared with that of each of the five individual classifiers. The results show that the ensemble voting technique performs better than the single classifiers. The Wisconsin Breast Cancer Dataset (WBCD) from the UCI machine-learning repository was used in our experiments. The proposed ensemble hard voting classifier achieved the highest accuracy, 96.49%, whereas SVM, K-NN, NB, DT, and RF achieved accuracies of 95.32%, 92.39%, 94.73%, 92.98%, and 95.32%, respectively, on the breast cancer dataset.
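Hard voting, as used above, assigns each sample the label predicted by the majority of the base classifiers. The following is a minimal sketch of that rule only, not the implementation used in the study. The helper names `hard_vote` and `accuracy`, the label encoding (0 = benign, 1 = malignant), the tie-breaking rule, and all prediction values are assumptions made for illustration; none of the numbers come from the WBCD experiments.

```python
from collections import Counter


def hard_vote(predictions):
    """Return the majority label for each sample.

    predictions: a list with one label list per classifier, all of equal length.
    Ties are broken in favour of the smallest label (an assumption of this
    sketch; with five voters and two classes no tie can occur).
    """
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must predict the same number of samples")

    voted = []
    for sample_votes in zip(*predictions):
        counts = Counter(sample_votes)
        top = max(counts.values())
        voted.append(min(label for label, c in counts.items() if c == top))
    return voted


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up labels and predictions, purely illustrative (0 = benign, 1 = malignant).
y_true = [1, 0, 1, 1, 0, 0]
svm = [1, 0, 1, 0, 0, 0]
knn = [1, 1, 1, 1, 0, 0]
nb = [0, 0, 1, 1, 0, 1]
dt = [1, 0, 0, 1, 0, 0]
rf = [1, 0, 1, 1, 1, 0]

ensemble = hard_vote([svm, knn, nb, dt, rf])
print("ensemble predictions:", ensemble)
print("ensemble accuracy:", accuracy(y_true, ensemble))
```

In this toy example each base classifier makes at least one mistake, but the mistakes fall on different samples, so the majority vote recovers every true label. This is the situation in which hard voting can outperform its individual members.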
Keywords
Ensemble Learning, Hard Voting, Machine Learning, Wisconsin Breast Cancer Dataset (WBCD)