A MULTICLASS MACHINE LEARNING APPROACH FOR ACADEMIC STUDENT PERFORMANCE PREDICTION
Abstract
In recent years, interest has grown in precise and reliable models for predicting student performance. Grade prediction matters to educational institutions because it helps identify students who are struggling, improve teaching methods, and target interventions with tailored support. Accurate prediction is difficult, however, because educational datasets are complex and imbalanced: some grade classes contain far fewer students than others. Recent advances in machine learning offer a way to address this challenge. This paper presents a Multiclass Student Grade Prediction System that uses SMOTE (Synthetic Minority Over-sampling Technique), which balances the dataset by generating synthetic samples for the least represented grade classes and thereby improves the accuracy of the model. The paper reviews the relevant literature, explains the methodology, presents the evaluation results, and discusses their implications for improving academic outcomes in educational institutions. By predicting students' final grades across multiple classes, the proposed model aims to give educators a comprehensive, automated tool for making data-driven decisions. Our results show that the Multiclass Student Grade Prediction System with SMOTE outperforms traditional methods in predicting student grades across multiple classes. Deploying the system in educational institutions can help teachers identify academically struggling students and, ultimately, improve academic outcomes.
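The balancing step can be illustrated with a minimal sketch of the SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest same-class neighbours. The function names `smote` and `balance`, the neighbour count `k=3`, and the choice to oversample every class up to the size of the largest class are assumptions made for illustration; they are not taken from the paper's implementation.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points from a list of same-class feature vectors.

    Hypothetical helper for illustration; requires at least two samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        base = rng.choice(minority)
        # Rank the remaining same-class samples by Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        # Choose one of the k nearest neighbours at random.
        neighbour = rng.choice(others[:k])
        # Interpolate at a random position between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, k=3, seed=0):
    """Oversample every class up to the size of the largest class (assumed policy)."""
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = list(X), list(y)
    for label, rows in by_class.items():
        if len(rows) < 2:
            continue  # a single sample has no neighbour to interpolate with
        extra = smote(rows, target - len(rows), k=k, seed=seed)
        X_out.extend(extra)
        y_out.extend([label] * len(extra))
    return X_out, y_out
```

Calling `balance(X, y)` on a list of feature vectors and grade labels returns the original samples followed by the synthetic ones. In practice, oversampling is normally applied to the training split only, so that synthetic samples do not leak into the evaluation data.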
An Ensemble of Three Classifiers (ETC) is also introduced as a method for predicting student performance. It combines three classifiers: an Adaptive Neuro Fuzzy Inference System (ANFIS), a Support Vector Machine (SVM), and a Decision Tree (DT). In the experiments, the proposed method is compared with individual classification algorithms, including DT, Artificial Neural Network (ANN), ANFIS, and SVM.
Keywords: Multiclass prediction model, SMOTE, imbalanced datasets, grade prediction, Ensemble of Three Classifiers (ETC), Adaptive Neuro Fuzzy Inference System (ANFIS), Support Vector Machine (SVM), Decision Tree (DT).
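One common way to combine three classifiers is hard majority voting, sketched below. The abstract does not state how the ensemble merges its members' outputs, so the voting rule, the tie-break (fall back to the first classifier when all three disagree), and the function name `majority_vote` are assumptions for illustration only.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one predicted label per student.

    `predictions` holds one list of labels per classifier, all of equal length.
    """
    combined = []
    for votes in zip(*predictions):
        label, count = Counter(votes).most_common(1)[0]
        if count == 1:
            # Every classifier disagrees: fall back to the first one (assumed rule).
            label = votes[0]
        combined.append(label)
    return combined


# Hypothetical predictions from the three ensemble members for four students.
anfis_pred = ["A", "B", "C", "D"]
svm_pred = ["A", "C", "C", "E"]
dt_pred = ["B", "C", "A", "F"]
final_pred = majority_vote([anfis_pred, svm_pred, dt_pred])
```

With three members, a label either receives at least two votes or all three disagree, so the tie-break only applies in the latter case.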
This work is licensed under a Creative Commons Attribution 4.0 International License.