
Early prediction of student performance using machine learning algorithms with practical features

Abstract

Predicting students’ performance in advance can support the learning process. Despite its potential benefits, technology for predicting student performance has not been widely adopted in education because it is difficult to use and interpret, and because of practical constraints such as data collection. This study proposes two methods to address these limitations.

The first method selects practical features that reflect the opinions of educational stakeholders and predicts student performance using machine learning and explainable artificial intelligence (XAI) techniques. The researcher conducted qualitative research to ascertain the perspectives of educational stakeholders: twelve people, including educators, parents of K-12 students, and policymakers, participated in a focus group interview. To verify whether at-risk students could be distinguished using the selected features, the researcher experimented with various machine learning algorithms. In addition, information intended to help each student was presented visually using the XAI technique. Finally, educators’ perceptions of the proposed method were investigated.

The second method uses graph-based label propagation to alleviate the difficulty of collecting data in the educational field. Students’ internal factors, such as self-regulated learning (SRL), self-efficacy (SE), and learning motivation (MV), are known to be important influences on learning achievement. However, because data on these factors are obtained primarily through self-report surveys, they are more difficult to collect than online behavior data, which accumulate automatically in the learning management service. In this study, the researcher first collected SRL, SE, and MV data from 137 students and then used graph-based label propagation to infer the internal factors of students who did not participate in the survey. Various machine learning algorithms were then used to test whether the classification of at-risk students improves when the inferred internal factors are included. In tests with data from 1,125 Blackboard students and 20,000 Canvas students, classification performance was higher when the inferred internal factors were included.
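The label-propagation idea in the second method can be illustrated with a minimal sketch. This is not the thesis's implementation: the Gaussian similarity graph, the `sigma` value, the two-feature toy data, and the binary high/low labels are all assumptions made for illustration. Students who answered the survey keep their labels, and each unsurveyed student repeatedly takes the similarity-weighted average of the neighbors' label distributions.

```python
import math


def label_propagation(features, labels, sigma=0.3, iterations=100):
    """Infer missing labels by propagating known ones over a similarity graph.

    features: one feature vector per student (hypothetical behavior features).
    labels: a class index for surveyed students, None for everyone else.
    Returns one inferred class index per student.
    """
    n = len(features)
    classes = sorted({y for y in labels if y is not None})

    def weight(a, b):
        # Gaussian (RBF) similarity: close students get an edge weight near 1.
        dist_sq = sum((x - y) ** 2 for x, y in zip(a, b))
        return math.exp(-dist_sq / (2 * sigma ** 2))

    w = [[weight(features[i], features[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]

    # Start: one-hot distributions for surveyed students, uniform for the rest.
    dist = [
        [1.0 / len(classes)] * len(classes) if y is None
        else [1.0 if c == y else 0.0 for c in classes]
        for y in labels
    ]

    for _ in range(iterations):
        new_dist = []
        for i in range(n):
            total = sum(w[i])
            if labels[i] is not None or total == 0.0:
                new_dist.append(dist[i])  # clamp known labels
                continue
            new_dist.append([
                sum(w[i][j] * dist[j][k] for j in range(n)) / total
                for k in range(len(classes))
            ])
        dist = new_dist

    # Pick the most probable class for every student.
    return [classes[max(range(len(classes)), key=lambda k: d[k])] for d in dist]


# Toy data (assumed): two normalized online-behavior features per student.
features = [[0.9, 0.8], [0.85, 0.9], [0.1, 0.2],
            [0.15, 0.1], [0.8, 0.85], [0.2, 0.15]]
# Only two students answered the survey: 1 = high SRL, 0 = low SRL (assumed coding).
labels = [1, None, 0, None, None, None]

inferred = label_propagation(features, labels)
```

In this toy setting, the inferred labels could then be appended as extra features for an at-risk classifier, which is the role the inferred SRL, SE, and MV values play in the abstract.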


Table of Contents

Abstract
Contents
List of Figures
List of Tables
1. Introduction
1.1 The background of the research
1.2 Research questions
2. Related works
2.1 Learning analytics and predicting students’ academic performance
2.2 Educational factors that affect students’ performance
2.2.1 Interaction theory in distance learning
2.2.2 Self-regulated learning, self-efficacy, learning motivation as students’ internal factors of learning
2.3 Machine learning techniques
2.3.1 Ensemble learning
2.3.2 eXplainable AI (XAI)
2.3.3 Graph-based label propagation
3. Prediction of Students’ Performance Considering the Opinions of Educational Stakeholders
3.1 Methodology
3.1.1 Feature selection for predicting students’ performance
3.1.2 Early prediction of at-risk students using machine learning
3.1.3 Explanation of at-risk students
3.2 Experiment results
3.2.1 Results for early prediction of at-risk students
3.2.2 Results of granting explainability to at-risk students
3.3 Educators’ perceptions of practical early prediction methods
3.3.1 Participants
3.3.2 Instruments and data collection
3.3.3 Data analysis
3.3.4 Results
3.4 Discussion
3.4.1 Educational interventions on practical features
3.4.2 The need for local explanation
3.4.3 Educational stakeholders’ perception
4. Prediction of Students’ Performance Using Data Augmentation Methods
4.1 Methodology
4.1.1 Data collection
4.1.2 Correlation analysis
4.1.3 Students’ academic performance prediction considering SE, SRL, and MV
4.1.4 Graph-based label propagation
4.2 Experiment results
4.2.1 SRL, SE, MV and students’ academic performance
4.2.2 Feature selection result for SRL, SE, MV prediction
4.2.3 Distribution of inferred labels
4.2.4 Results for prediction of students’ academic performance
4.3 Discussion
5. Conclusion and Future work
Bibliography
