A COMPARATIVE EVALUATION OF SUPERVISED MACHINE LEARNING MODELS FOR PREDICTING ACADEMIC PERFORMANCE FROM STRUCTURED EDUCATIONAL DATA
Abstract:
The increasing availability of structured educational data has enabled the application of supervised machine learning techniques to predict students’ academic performance. Accurate prediction models can support early intervention strategies, personalized learning, and institutional decision-making. This study presents a comparative evaluation of widely used supervised machine learning models (Logistic Regression, Decision Trees, Random Forest, Support Vector Machines, k-Nearest Neighbors, and Artificial Neural Networks) for predicting academic outcomes from structured educational datasets. The models are compared on predictive accuracy, interpretability, computational complexity, and robustness, as reported in prior empirical studies. The findings indicate that ensemble-based methods and margin-based classifiers generally achieve higher predictive accuracy than simpler linear models, while the simpler models retain advantages in interpretability and ease of implementation. The study synthesizes existing evidence to guide researchers and educational practitioners in selecting appropriate machine learning models for academic performance prediction.
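To illustrate how such a comparison might be operationalized in practice, the sketch below estimates 5-fold cross-validated accuracy for the six model families named above using Python and scikit-learn. The dataset file name, target column, and hyperparameter settings are illustrative assumptions, not artifacts of this study or of the works it reviews; features are assumed to be numeric or already encoded.

# Minimal comparison sketch (hypothetical setup): cross-validated accuracy
# for the six supervised models discussed in the abstract.
# "student_records.csv" and the target column "final_grade_pass" are placeholders.
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

data = pd.read_csv("student_records.csv")       # placeholder structured dataset
X = data.drop(columns=["final_grade_pass"])     # numeric / pre-encoded features
y = data["final_grade_pass"]                    # binary outcome (pass / fail)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "SVM": SVC(kernel="rbf"),
    "k-NN": KNeighborsClassifier(n_neighbors=5),
    "Neural Network": MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0),
}

for name, model in models.items():
    # Feature scaling mainly benefits the margin-, distance-, and gradient-based models.
    pipeline = make_pipeline(StandardScaler(), model)
    scores = cross_val_score(pipeline, X, y, cv=5, scoring="accuracy")
    print(f"{name}: mean accuracy = {scores.mean():.3f} (+/- {scores.std():.3f})")

Accuracy is only one of the criteria discussed in this study; interpretability and computational cost would typically be assessed qualitatively or with model-specific diagnostics rather than a single cross-validation score.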
