Predicting Student Performance on Virtual Learning Environment

Alnassar, Fatema Mohammad. 2023. Predicting Student Performance on Virtual Learning Environment. Doctoral thesis, Goldsmiths, University of London [Thesis]

Text (Predicting Student Performance on Virtual Learning Environment)
COM_thesis_AlnassarF_2023.pdf - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.


Abstract or Description

Virtual learning has gained importance because of the recent pandemic. A mass shift to virtual means of education delivery has been observed over the past couple of years, forcing the community to develop efficient performance assessment tools. Prediction of students' performance from relevant information has emerged as an efficient tool that educational institutions can use to improve the curriculum and teaching methodologies. Automated analysis of educational data using state-of-the-art Machine Learning (ML) and Artificial Intelligence (AI) algorithms is an active area of research.

The research presented in this thesis addresses the problem of predicting students' performance comprehensively by applying multiple machine learning models (i.e., Multilayer Perceptron (MLP), Decision Tree (DT), Random Forest (RF), Extreme Gradient Boosting (XGBoost), CATBoost, K-Nearest Neighbour (KNN) and Support Vector Classifier (SVC)) to two benchmark VLE datasets (i.e., Open University Learning Analytics Dataset (OULAD) and Coursera). In this context, a series of experiments is performed and important insights are reported. In Experiment 1, the classification performance of the machine learning models is investigated on both the OULAD and Coursera datasets. In Experiment 2, the performance of the machine learning models is studied for each course of the Coursera dataset and a comparative analysis is performed. Experiment 1 and Experiment 2 identify class imbalance as the main factor responsible for the degraded performance of the machine learning models. Experiment 3 is therefore designed to address the class imbalance problem by making use of multiple Synthetic Minority Oversampling Technique (SMOTE) variants and generative models (i.e., Generative Adversarial Networks (GANs)). In the results, the SMOTE NN approach achieved the best classification performance among the implemented SMOTE techniques. Further, when SMOTE was combined with generative models, the SMOTENN-GAN generated Coursera dataset was the one on which the machine learning models achieved the best results, with classification accuracy of around 90%. Overall, the MLP, XGBoost and CATBoost models emerged as the best performing across the experiments carried out in this thesis.

Item Type:

Thesis (Doctoral)

Identification Number (DOI):


Keywords:

Virtual Learning Environment, Student Performance, Machine Learning, Classification, Generative Adversarial Networks (GANs), SMOTE, Artificial Intelligence.

Departments, Centres and Research Units:



30 September 2023

Item ID:


Date Deposited:

23 Oct 2023 12:28

Last Modified:

23 Oct 2023 12:36

