Predicting Student Academic Performance: A Machine Learning Approach and Feature Analysis

Document Type: Research Paper

Authors

1 Department of Information Technology Management, University of Tehran, Tehran, Iran

2 Department of Industrial Management, Islamic Azad University, Tehran, Iran

3 Faculty of Management, University of Tehran

10.22059/ijms.2025.362506.676053

Abstract

Predicting student academic performance is a challenging task with significant implications for educators and policymakers. Using machine learning techniques, this article explores the relationship between features in six categories (demographic factors, personality traits, skills, favorite activities, relations with others, and out-of-school activities) and academic performance measured by GPA. The data were collected through several surveys at one school in Iran over multiple years and educational levels. Following the CRISP-DM methodology, a predictive model based on the CatBoost Regressor is developed, achieving an R-squared of 0.87. Moreover, the feature importance analysis reveals that positive personality traits such as "Interest in studying," "Quality of homework," "Contentment," and "Self-regulation," together with "Logical thinking and reasoning" skills, are among the most predictive features of students' academic performance, a finding that is rooted in and supported by well-known psychological theories such as Self-Determination Theory. This study is unique in this field in the variety of features it considers and in its collection of data across different years and educational stages.
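
For readers less familiar with the R-squared metric reported above, the following is a minimal sketch of how the coefficient of determination is usually computed, as one minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` and the GPA values are hypothetical and chosen only for illustration; they are not taken from the study's data or code.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all targets are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical GPA values, for illustration only (not from the study).
gpa_actual = [12.0, 14.0, 16.0, 18.0]
gpa_predicted = [13.0, 14.0, 16.0, 17.0]
score = r_squared(gpa_actual, gpa_predicted)
print(f"R-squared: {score:.2f}")
```

Under this definition, a value of 1 means the predictions match the observed GPAs exactly, and a value of 0 means the model does no better than always predicting the mean GPA.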

Keywords

Main Subjects