An Optimal Fuzzy Neural Network Prediction Model for Student Performance Prediction
in Online Education
Pu Jing
Li Yuke*
(School of Arts and Media, Sichuan Agricultural University, Ya'an, 625014, China)
Copyright © The Institute of Electronics and Information Engineers (IEIE)
Keywords
Achievement prediction, Subtractive clustering, Genetic algorithm, Adaptive fuzzy inference system
1. Introduction
With the advent of the Internet era, online education (OE) has developed by leaps
and bounds. As a method that can promote educational equity, OE enables every student
to enjoy fair and high-quality education, breaks through the limitations of time and
place, and provides more learning opportunities for students. However, because the
effectiveness of OE relies heavily on students' self-discipline and lacks teachers'
supervision and guidance, the probability of learning failure and ineffective learning
increases; accurate evaluation of students' learning is therefore needed to ensure
the quality of teaching and learning (Bag et al., 2022). Existing OE systems are generally management, learning, evaluation and monitoring
systems, containing student information, online learning hours and times, assignment
completion, participation in discussions, etc. The above data are generally stored
in various systems and are not effectively used to make accurate assessment of students'
learning effectiveness (Asif and Javed, 2021; Fisher et al., 2021). Therefore, many scholars have proposed the use of artificial intelligence to predict
students' learning performance in order to make timely adjustments to teaching programmes
and improve the quality of teaching. However, academic performance data are generally
redundant, contain much useless information, and lack linear relationships; at the
same time, students' final grades are affected by a variety of factors and carry
considerable uncertainty. General-purpose algorithms therefore struggle to process
such data efficiently and accurately, resulting in unsatisfactory prediction
results. The fuzzy set presents itself as a preferable solution when facing the uncertainty
problem. For instance, the intuitive trapezoid fuzzy set effectively handles the challenge
of multi-attribute group decision, arising from the interaction between individual
decisions and attributes. Also, the Dev fuzzy set appropriately deals with the uncertainty
of numerical and informative data attributes. Therefore, in order to enhance the
utilization of online-education-related information and to achieve accurate prediction
of student performance, this study presents a student online-education performance
prediction algorithm based on the adaptive neuro-fuzzy inference system (ANFIS), principal
component analysis algorithm (PCA), and genetic algorithm (GA). The algorithm initially
performs principal component analysis to reduce the dimensionality of a substantial
volume of educational data. This reduces the computational complexity of the algorithm
as a whole. The genetic algorithm is then utilized to acquire the optimal ANFIS parameters.
Subsequently, the optimized ANFIS is employed to make performance predictions. The
approach effectively avoids the issue of ANFIS readily becoming trapped in local optima,
thereby achieving high prediction accuracy.
2. Literature Review
Online education (OE) offers students fairer academic prospects, but also limits the
interaction time between teachers and students. Consequently, assessing students'
learning progress becomes a considerable challenge for educators. Baruah and Baruah (2021) addressed the problem of how to predict students' learning performance and proposed
a deep score-based competitive diversity neuro-fuzzy network. The model integrates
fractional calculus and competitive diversity through a MapReduce framework to achieve
prediction of student performance. The test results showed that the model had excellent
accuracy in predicting students' grades. Thanh (2020) and his team proposed a grade prediction model (PM) based on LSTM and CNN for the problem of how to
achieve the prediction of university students' learning in online courses. The model
preprocesses the data with the quantile-transform method, and then feeds the preprocessed
data into a deep learning model to forecast college students' grades.
The model was tested to be highly feasible for predicting college students' grades.
Yekun Ephrem et al. (2021) proposed a grade PM based on machine-learning techniques to address the problem of how
to evaluate students' performance in academic environments. The model was trained
by support vector machine, random forest, KNN and multilayer perceptron for enhancing
the prediction of the model. The experiment demonstrated that the model performed
better in terms of different assessment metrics. Trindade and Ferreira (2021) proposed a random forest algorithm based achievement PM for the problem of how to
predict student achievement by teachers' teaching styles. The model collects and processes
teachers' behavioral data to achieve prediction of achievement. The AUC of the model
was tested to be better than other models for predicting student achievement. Ahmed (2020) and his team proposed an extended KNN-based PM for the problem of how to achieve
rapid prediction of student achievement. The model combines KNN and moment descriptors
to improve prediction performance by classifying the training samples before searching.
The model was tested to reduce the search time by 75.4%-90.25% compared to the traditional
KNN model, while improving the classification accuracy. Haval et al. (2021) have used data mining technology to predict student performance and the prediction
results are well interpretable and can be used to improve teaching plans. Polinar et al (2020) used the naive Bayesian and C4.5 algorithms to predict student performance in the
teacher licensure exam. Their post-test AUG was about 0.83. Fairos et al. (2019) used supervised data mining methods to predict students' performance. This model
can predict student performance very well, but it cannot provide accurate results.
Abubakari et al. (2020) used neural networks to predict student performance, but the experimental results
showed that the accuracy of the model was only about 75%.
As a typical fuzzy neural network (FNN) model, ANFIS combines the merits of neural networks and fuzzy logic
with powerful learning ability and fuzzy knowledge representation, so ANFIS is widely
used in modeling nonlinear problems. Yilmaz (2022) and his team proposed a wear value PM for glass fiber reinforced polyester (GFRP)
composites based on ANFIS subclustering-based wear value PM. The model can predict
the wear of GFRP composites at different material concentrations and operating conditions
as a way to determine the optimal material concentration and operating conditions
for GFRPG composites. Surampudi and Shanmugam (2022) addressed the problem of how to detect and classify tumor-influenced images from
non-tumor-influenced brain magnetic resonance imaging (MRI) by proposing an ANFIS
classification model. The model filters the image noise using a vector-indexed filtering
algorithm, decomposes the images using wavelet transform, and finally classifies the
processed images using ANFIS. The experiment indicated that the model had high classification
sensitivity. Ghojghar et al. (2021) proposed a sandstorm PM based on the particle swarm algorithm (PSA) and ANFIS,
i.e. the ANFIS-PSO model, for the problem of how to accurately predict sandstorms. The tested
R and RMSE values of the model are between 0.88-0.97 and 0.1-0.19, respectively, which
can achieve accurate prediction of seasonal dust storm frequency. Wali (2021) proposed a PM based on the PSA and ANFIS for the problem of predicting the original
chaotic pattern of a double vortex coil circuit. The model achieves the prediction of
the original chaotic patterns by collecting and analyzing the observation results
and using the PSO algorithm to mine their potential relationships. The experiment
indicated that the model predicted the original chaotic patterns with high accuracy.
Sahitya and Prasad (2021) proposed a traffic network accessibility evaluation model using artificial neural
networks and ANFIS for the problem of efficiency evaluation of traffic network structures.
The model quantifies, extracts and analyzes the characteristics of urban roads through
GIS and evaluates the performance of urban road networks by combining connectivity,
traffic network development and spatial pattern. The MAPE value of the model was tested
to be only 0.287%, which can achieve an effective assessment of the performance of
the traffic network.
To sum up, in the context of OE, how teachers who have no direct contact with students
can capture students' learning situation has been a major issue, and many scholars
have conducted relevant research on it. ANFIS is widely used in various industries
owing to its powerful learning ability and its capability for handling nonlinear
problems. Therefore, this study proposes a PM for secondary school students' grades
based on the ANFIS algorithm, using PCA for data reduction and GA for parameter
optimization, so as to reduce the model's computational burden, avoid local extremes,
and achieve accurate prediction of students' grades.
3. An Optimized Fuzzy Neural Network based Performance Prediction Model for Secondary
School Students
3.1 Improved ANFIS Model Incorporating PCA Algorithm and GA
FNN integrates the advantages of neural networks and fuzzy systems: it has powerful
learning and self-adaptation abilities and can realize parallel distributed information
processing, so it is widely used in various fields. A simple ANFIS structure
based on a first-order TS model is shown in Fig. 1.
Fig. 1. The structure of the ANFIS.
In Fig. 1, $x$ and $y$ denote the input features. In the first stage, $x$ and $y$
are fuzzified by membership functions to obtain membership degrees. Multiplying the
membership degrees then yields the firing strength of each rule, and the firing
strengths are normalised to obtain each rule's proportion within the rule base.
Next, according to the consequent rules, the input features $x$ and $y$ are linearly
combined to obtain the input of the next layer, and finally the results are aggregated
and defuzzified to produce the final output. Eq. (1) displays the calculation formula for the outcomes of the first layer.
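The equation itself is not reproduced in this version; in the standard ANFIS formulation with a generalized bell membership function, the first-layer output takes the form

$$O_{1i}=\mu _{A_{i}}\left(x\right)=\frac{1}{1+\left|\frac{x-c_{i}}{a_{i}}\right|^{2b_{i}}},\quad i=1,2.$$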
In Eq. (1), $x$ indicates the input variable. $O_{1i}$ denotes the membership value
of the fuzzy set. $a_{i}$, $b_{i}$ and $c_{i}$ represent the premise parameters of
the bell-shaped membership function, which can be adjusted by the algorithm. $A_{i}$
and $B_{i}$ denote the fuzzy sets (Ünver et al., 2022; Saeed et al., 2022) of $x$ and $y$. Next, the outputs of the first layer are multiplied together, and the multiplication
formula is given in Eq. (2).
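In the standard formulation, the second layer multiplies the incoming membership degrees to give the firing strength of each rule:

$$O_{2i}=\omega _{i}=\mu _{A_{i}}\left(x\right)\,\mu _{B_{i}}\left(y\right),\quad i=1,2.$$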
In Eq. (2), $O_{2i}$ represents the second layer's output. $\omega _{i}$ represents the firing
strength. $\mu _{{A_{i}}}\left(x\right)$ and $\mu _{{B_{i}}}\left(y\right)$ represent
the membership functions of the input variables $x$ and $y$, respectively. The output
of the second layer is then normalized, and the normalization formula is as in Eq. (3).
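In the standard formulation, the normalization divides each firing strength by the sum of all firing strengths:

$$O_{3i}=\bar{\omega }_{i}=\frac{\omega _{i}}{\omega _{1}+\omega _{2}},\quad i=1,2.$$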
In Eq. (3), $O_{3i}$ represents the normalized firing strength. This output is then processed
by a function $f_{i}$, which is usually a linear function, to give the output of the
fourth layer, as counted in Eq. (4).
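In the standard first-order TS formulation, the fourth layer weights each rule's linear function by its normalized firing strength:

$$O_{4i}=\bar{\omega }_{i}f_{i}=\bar{\omega }_{i}\left(p_{i}x+q_{i}y+r_{i}\right).$$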
In Eq. (4), $O_{4i}$ is the fourth layer's output. $p_{i}$ and $q_{i}$ are function
parameters; the more inputs there are, the more such parameters. $r_{i}$
represents the rule's consequent parameter. Finally, the outputs of the fourth layer
are summed to obtain the final output. Eq. (5) showcases it.
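In the standard formulation, the fifth layer sums the weighted rule outputs:

$$O_{5}=\sum _{i}\bar{\omega }_{i}f_{i}=\frac{\sum _{i}\omega _{i}f_{i}}{\sum _{i}\omega _{i}}.$$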
In Eq. (5), $O_{5}$ represents the fifth layer's output. In ANFIS, since the input
quantities often have high-dimensional characteristics, it is essential to reduce
their dimensionality to remove redundant information from the data. A schematic of
PCA dimensionality reduction is shown in Fig. 2.
Fig. 2. Schematic diagram of the dimension reduction principle of PCA.
Fig. 2 indicates that PCA, as an unsupervised dimensionality reduction method, reduces the
computational effort of the data by mapping the data into a lower dimensional space,
while ensuring the simplicity of feature extraction from the reduced data. The dimensionality
reduction method of PCA is to linearly transform the observations of the relevant
variables by an orthogonal transformation and project them as the values of a linearly
uncorrelated variable, which is the principal component (Dongbo and Huang, 2020; Liu, 2022; Tian, 2020). Now suppose there is a matrix $X_{r\times c}$; the mean of each column is
subtracted from the elements of that column to obtain a new, centred matrix. The calculation formula is given in Eq. (6).
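The formula is not reproduced in this version; consistent with the symbols defined below, the centering step can be written as

$$X'_{r\times c}=X_{r\times c}-E_{r\times 1}\bar{N}_{1\times c}.$$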
In Eq. (6), $N_{j}$ stands for the $j$-th column vector, $r$ for the number of rows (samples)
in the matrix, $c$ for the number of columns, $E_{r\times 1}$ for the unit column
vector, and $\bar{N}_{1\times c}$ for the row vector made up of the column means.
The eigenvector matrix then maps the matrix $X_{r\times c}$ to produce the new,
reduced matrix; the mapping is shown in Eq. (7).
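Consistent with the symbols defined below, the mapping can be written as

$$Y_{r\times k}=X_{r\times c}B_{c\times k}.$$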
In Eq. (7), $Y_{r\times k}$ stands for the matrix obtained after mapping, $B_{c\times k}$ for
the matrix made up of the eigenvector basis vectors, and $k$ for the cutoff
factor. The study adopts the GA to adjust the FNN's parameters, because the
usual parameter-adjustment algorithms of FNN often push the network into
local extremes. Fig. 3 displays the flow of the GA.
Fig. 3. Flow chart of the genetic algorithm.
Fig. 3 illustrates that in the GA the population is first initialized to obtain
the first-generation population, and then the fitness value, which is the reciprocal
of the objective function, is calculated. Eq. (8) demonstrates the relevant formula.
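The formula is not reproduced here; consistent with the definitions that follow, the objective function and fitness take the standard form

$$E=\frac{1}{2}\sum _{i=1}^{n}\left(t_{i}-y_{i}\right)^{2},\qquad f=\frac{1}{E}.$$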
In Eq. (8), $t_{i}$ denotes the desired output of the fuzzy neural network; $y_{i}$ denotes
the actual output of ANFIS, whose quantitative indicator is student achievement.
$n$ is the number of individuals, and $f$ indicates the fitness function.
From Eq. (8), the objective function is $E=\left(\sum _{i=1}^{n}\left(t_{i}-y_{i}\right)^{2}\right)/2$,
and the fitness function is the reciprocal of $E$. The selection operation is then
carried out; the study uses the roulette-wheel algorithm for selection.
Eq. (9) displays the probability and cumulative probability of an individual being selected.
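Consistent with the definitions that follow, the roulette-wheel probabilities take the standard form

$$P\left(i\right)=\frac{f\left(i\right)}{\sum _{j=1}^{n}f\left(j\right)},\qquad Q\left(i\right)=\sum _{j=1}^{i}P\left(j\right).$$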
In Eq. (9), $P\left(i\right)$ indicates the probability that the $i$-th individual is selected;
$f\left(i\right)$ indicates the fitness value of the $i$-th individual; and $Q\left(i\right)$
indicates the cumulative selection probability. Note that when $i=0$, $Q\left(0\right)=0$.
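The selection step can be sketched in a few lines (an illustrative sketch only; the function and variable names are mine, not the paper's):

```python
import random

def roulette_select(fitness, rand=random.random):
    """Return an index drawn with probability proportional to fitness,
    walking the cumulative probabilities Q(i) until they exceed a random r."""
    total = sum(fitness)
    probs = [f / total for f in fitness]   # P(i) = f(i) / sum of fitness
    r = rand()
    q = 0.0
    for i, p in enumerate(probs):
        q += p                             # Q(i) = Q(i-1) + P(i)
        if r <= q:
            return i
    return len(fitness) - 1                # guard against float rounding
```

With equal fitness values every individual is equally likely to be drawn, while a fitter individual occupies a larger slice of the cumulative range and is picked more often.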
The subsequent stage entails the crossover operation whereby two intersection points
are randomly selected and the segments at these points are exchanged to obtain new
individuals. This is followed by mutation operations to form the succeeding generation
population. The mutation method selected for this task is the basic location mutation,
which involves the random selection of an individual's mutation point location and
reversing the original gene at the mutation point. Finally, the new population is
evaluated against the termination criteria; if they are met, the optimal solution
is output; otherwise, the above procedure is repeated until they are met.
3.2 PCA-GA-ANFIS-based Performance Prediction Model for Secondary School Students
In the Internet era, OE is gradually becoming prevalent. However, since teachers and
students do not have direct contact in OE, it is a great challenge to track students'
learning progress and effectiveness. By predicting students' learning performance,
teachers can fully understand students' learning effects so that they can adjust their
teaching programmes in time. Although FNN can address the complicated non-linear relationship
between students' regular grades and final grades, the traditional FNN PM runs slowly
and has low accuracy of prediction results because the FNN tends to fall into local
extremes and the samples of students' grades are often redundant and have high data
dimensionality. Therefore, based on the above problems, the study proposes a PCA-GA-ANFIS
performance PM. Before building the PCA-GA-ANFIS model, the basic FNN model needs
to be established, and the study generates the FNN structure through a subtractive
clustering algorithm. The algorithm first treats every sample as a candidate cluster
center, and then computes the density around each sample to exclude non-center
candidates; the density calculation formula is shown in Eq. (10).
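The formula is not reproduced here; in the standard subtractive clustering formulation (Chiu's method), the density of sample $x_{i}$ is computed as

$$D_{i}=\sum _{j=1}^{n}\exp \left(-\frac{\left\| x_{i}-x_{j}\right\| ^{2}}{\left(\gamma _{a}/2\right)^{2}}\right).$$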
In Eq. (10), $D_{i}$ denotes the density of sample $x_{i}$; $\gamma _{a}$ denotes the clustering
radius; $n$ denotes the number of samples; $x_{i}$ and $x_{j}$ both denote samples.
The sample with the highest density is taken as the first cluster center
(CC). The calculation of the second CC must eliminate the influence
of the first; therefore, the density used to select the second CC
is as in Eq. (11).
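In the standard subtractive clustering formulation, after selecting the first center $x_{c1}$ with density $D_{c1}$, each remaining density is revised as

$$D_{i}'=D_{i}-D_{c1}\exp \left(-\frac{\left\| x_{i}-x_{c1}\right\| ^{2}}{\left(\gamma _{b}/2\right)^{2}}\right).$$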
In Eq. (11), $D_{ck}$ denotes the density of the CC selected at the $k$-th step; $x_{ck}$ is
the $k$-th CC; $\gamma _{b}$ is the corresponding cluster radius, with
$\gamma _{b}>\gamma _{a}$ in general. The above steps are repeated until all cluster
centers are found. The formulas for $\gamma _{a}$ and $\gamma _{b}$ are given in Eq.
(12).
In Eq. (12), $\gamma _{a}$ and $\gamma _{b}$ determine the number of classes generated; the larger
$\gamma _{a}$ and $\gamma _{b}$ are, the fewer the number of classes generated, and
vice versa. Finally, the cluster centers found in the above steps are checked against
the maximum-density criterion; the decision rule is shown in Eq.
(13).
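The rule is not reproduced here; in the standard formulation, the procedure terminates when a candidate center's density falls below a preset fraction $\delta $ of the first center's density:

$$D_{ck}<\delta \,D_{c1}.$$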
In Eq. (13), $\delta $ needs to be pre-set. If Eq. (13) holds, the algorithm ends; otherwise, it returns to the second step and continues.
The larger $\delta $ is, the fewer clusters are generated; the smaller $\delta $ is,
the more clusters are generated.
The FNN structure generated by the above subtractive clustering algorithm requires
many parameters to be set and is not robust to outlier interference.
Therefore, it is further optimized by the PSA to obtain the optimal network structure.
The flow of the PSA for optimizing the FNN is shown in Fig. 4.
Fig. 4. Flow of particle swarm optimization for fuzzy neural networks.
Fig. 4 demonstrates that the PSA treats each individual as a particle, and the particle
is represented by a velocity vector and a position vector; the FNN is optimized by
calculating the output layer error and determining whether the error satisfies the
conditions. The updating formulas of the velocity vector and position vector are shown
in Eq. (14).
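Consistent with the symbols defined below, the standard PSO update equations are

$$V_{id}^{t+1}=\omega V_{id}^{t}+c_{1}r_{1}\left(Pb_{id}^{t}-X_{id}^{t}\right)+c_{2}r_{2}\left(Pg_{d}^{t}-X_{id}^{t}\right),\qquad X_{id}^{t+1}=X_{id}^{t}+V_{id}^{t+1}.$$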
In Eq. (14), $V_{id}^{t}$ is the velocity of the $i$-th particle in the $d$-th dimension after
$t$ iterations; $X_{id}^{t}$ is the position of the $i$-th particle in the $d$-th
dimension after $t$ iterations; $Pb_{id}^{t}$ is the best position found by the
$i$-th particle in the $d$-th dimension after $t$ iterations; $Pg_{d}^{t}$ is the
global best position after $t$ iterations; $c_{1}$ and $c_{2}$ are the acceleration
coefficients weighting the individual and global best positions, respectively;
$r_{1}$ and $r_{2}$ are random numbers in $\left[0,1\right]$; and $\omega $ is the
inertia weight. The update equations of the individual best position and the global
best position are shown in Eq. (15).
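Consistent with the definition of $f$ that follows, the standard best-position updates (for minimization) are

$$Pb_{i}^{t+1}=\begin{cases}X_{i}^{t+1}, & f\left(X_{i}^{t+1}\right)<f\left(Pb_{i}^{t}\right)\\ Pb_{i}^{t}, & \text{otherwise}\end{cases},\qquad Pg^{t+1}=\arg \min _{Pb_{i}^{t+1}}f\left(Pb_{i}^{t+1}\right).$$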
In Eq. (15), $f$ denotes the objective function. After the FNN structure's parameters are
optimized by the PSA, the optimal FNN model structure is obtained; the optimized
ANFIS algorithm is then fused with this network structure to obtain the PCA-GA-ANFIS
performance PM. The PCA-GA-ANFIS model flow is shown in Fig. 5.
Fig. 5. PCA-GA-ANFIS model process.
Fig. 5 shows that the PCA-GA-ANFIS model first encodes the antecedent parameters, and then
optimizes the antecedent parameters by crossover and mutation operations of the genetic
algorithm to prevent the model from falling into a local optimum. The optimal antecedent
parameters are used as parameters of the FNN, and then the student performance data
are dimensionally reduced to remove redundant data. Finally, the data are fuzzified
and normalized by the ANFIS algorithm, and the prediction results of students'
grades are output.
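The dimensionality-reduction step of the pipeline can be illustrated with NumPy; the data below are random stand-ins for the kind of marks in Table 1, and all names are mine, not the paper's:

```python
import numpy as np

def pca_reduce(X, k):
    """Mean-centre X (as in Eq. (6)) and project it onto the k leading
    eigenvectors of its covariance matrix (the Eq. (7)-style mapping)."""
    Xc = X - X.mean(axis=0)            # subtract each column's mean
    cov = np.cov(Xc, rowvar=False)     # feature covariance matrix
    vals, vecs = np.linalg.eigh(cov)   # eigh handles symmetric matrices
    order = np.argsort(vals)[::-1][:k] # indices of k largest eigenvalues
    B = vecs[:, order]                 # basis-vector matrix B
    return Xc @ B                      # Y = X B

# Ten students, nine subject marks each (random stand-ins, not Table 1 data)
rng = np.random.default_rng(0)
marks = rng.uniform(60, 95, size=(10, 9))
reduced = pca_reduce(marks, k=3)
print(reduced.shape)                   # prints (10, 3)
```

The projected columns are mutually uncorrelated by construction, which is what lets the reduced matrix feed a smaller ANFIS without redundant inputs.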
4. Fuzzy Neural Network Model Test Results Analysis
For validating the function of the PCA-GA-ANFIS model, the study conducted simulation
experiments through MATLAB software to measure the model’s performance in terms of
accuracy and misclassification rate, and compared it with ANFIS and XGBoost. The dataset
originates from the marks achieved by year 9 students in a school, encompassing 9
subjects. A sample of the students' performance data is shown in Table 1.
Table 1. Sample performance data of the students.
Number | Chinese | Math | English | Physics | Chemistry | Biology | Politics | History | Geography
1 | 67 | 85 | 79 | 72 | 83 | 87 | 71 | 73 | 82
2 | 78 | 76 | 76 | 74 | 89 | 93 | 74 | 82 | 75
3 | 84 | 73 | 85 | 77 | 90 | 86 | 76 | 76 | 82
4 | 85 | 82 | 74 | 80 | 93 | 75 | 75 | 73 | 75
5 | 76 | 81 | 90 | 73 | 76 | 79 | 78 | 84 | 90
6 | 91 | 76 | 93 | 76 | 84 | 82 | 83 | 82 | 92
7 | 59 | 83 | 88 | 82 | 87 | 84 | 82 | 76 | 75
8 | 63 | 74 | 86 | 84 | 79 | 76 | 70 | 88 | 84
9 | 77 | 79 | 80 | 85 | 82 | 73 | 69 | 89 | 76
10 | 82 | 86 | 75 | 75 | 86 | 80 | 75 | 85 | 73
... | ... | ... | ... | ... | ... | ... | ... | ... | ...
The convergence of PCA-GA-ANFIS, ANFIS, and XGBoost is shown in Fig. 6.
As can be observed from Fig. 6, ANFIS achieved convergence after approximately 500 iterations, with a loss value
of around 0.3 at this point. Meanwhile, XGBoost reached convergence after about 430
iterations, with a loss value of approximately 0.25. PCA-GA-ANFIS achieved convergence
after roughly 300 iterations, with a loss value of about 0.15. This demonstrates that
the convergence of the PCA-GA-ANFIS model is superior to that of the ANFIS and
XGBoost models. The prediction results of PCA-GA-ANFIS, ANFIS and XGBoost compared
with the actual values are shown in Fig. 7.
Fig. 6. Loss curves of PCA-GA-ANFIS, ANFIS, and XGBoost.
Fig. 7. Prediction results and actual values of the PCA-GA-ANFIS, ANFIS, and XGBoost models.
Fig. 7 indicates that the ANFIS model’s prediction curves are in poor agreement with the
actual values, with a maximum error of about 6; the prediction curves of XGBoost are
in better agreement with the actual curves than ANFIS, but the error is still large,
with a maximum error of about 5; the prediction curves of PCA-GA-ANFIS are in better
agreement with the actual curves, with a maximum error of only about 3.5. It can be
seen that the forecasting outcomes of the PCA-GA-ANFIS model are more consistent with
the actual situation. the absolute and relative errors (RE) of the forecasting outcomes
of PCA-GA-ANFIS, ANFIS and XGBoost are shown in Fig. 8.
From Fig. 8(a), the absolute error (AE) of the ANFIS model ranges from a minimum of
about 0.2 to a maximum of about 6, with an average AE of about 3.6. The minimum AE
of the XGBoost model is about 0.3, the maximum is about 5, and the average is about
3.2. The AE of the PCA-GA-ANFIS model ranges from a minimum near 0 to a maximum of
3.5, with an average of about 2.2. From Fig. 8(b), the minimum RE of the ANFIS model is about 0.1%, the maximum about 7%, and
the average about 2%. The minimum RE of the XGBoost model is about 0.15%, the maximum
about 6.7%, and the average about 1.8%. The minimum RE of the PCA-GA-ANFIS model
tends to 0%, the maximum is about 6%, and the average is about 1.6%. These results
show that the PCA-GA-ANFIS model has the smallest prediction errors. The forecasting outcomes of ANFIS,
XGBoost and PCA-GA-ANFIS are shown in Fig. 9.
Fig. 8. Absolute error and relative error of PCA-GA-ANFIS, ANFIS and XGBoost prediction results.
Fig. 9. Prediction accuracy and precision of ANFIS, XGBoost, and PCA-GA-ANFIS.
Fig. 10. Recall rate and F1 measure of the ANFIS, XGBoost, and PCA-GA-ANFIS models.
Fig. 9(a) indicates that the highest forecasting accuracy of the ANFIS model is for chemistry,
about 88.2%; the lowest is for English, about 85.4%; the average accuracy (AA) is
about 86.8%. The highest prediction accuracy of the XGBoost model is for history,
about 88.4%, and the lowest is for politics, about 85.9%. The PCA-GA-ANFIS model
has the highest prediction accuracy of about 91.8% for history and the lowest of
about 86.7% for mathematics, with an AA of about 89.4%. From Fig. 9(b), the highest precision of the ANFIS model's prediction results is about 89.5% for
physics; the lowest is about 85.8% for chemistry; the average precision is about
87.8%. The highest precision of the XGBoost model is about 90.4% for biology and
the lowest is about 87.9% for English. The PCA-GA-ANFIS model has the highest
precision of 92.4% for politics and the lowest of 87.9% for geography, with an
average of 90.3%. These outcomes illustrate that the PCA-GA-ANFIS model achieves
higher precision than the other two models. The recall and F1-measure of the ANFIS,
XGBoost and PCA-GA-ANFIS models are shown in Fig. 10.
Fig. 10(a) indicates that the ANFIS model has the highest recall for geography, about 87.2%;
the lowest is for history, about 84.8%; the average recall (AR) is about 85.9%.
The XGBoost model has the highest recall for physics, about 87.3%, and the lowest
for biology, about 85.2%. The PCA-GA-ANFIS model has the highest recall of about
90.1% for politics and the lowest of about 87.5% for physics, with an AR of about
88.8%. From Fig. 10(b), the highest F1-measure of the ANFIS model is for English, about 89.1%; the lowest
is for physics, about 86.4%; the average F1-measure is about 87.6%. The highest
F1-measure of the XGBoost model is for chemistry, about 87.6%, and the lowest is
for biology, about 85.6%. The highest F1-measure of the PCA-GA-ANFIS model is for
history, about 92.2%; the lowest is for geography, about 88.9%; the average is
about 90.6%. This shows that the overall performance of the PCA-GA-ANFIS model
exceeds that of the other models. The ROC curves and C-index curves of PCA-GA-ANFIS,
ANFIS and XGBoost are shown in Fig. 11.
Fig. 11(a) demonstrates that the PCA-GA-ANFIS model outperforms the other two algorithms, as
evidenced by its ROC curve completely enveloping theirs. In Fig. 11(b), the C-index for ANFIS is around 0.7 and for XGBoost around 0.8, whereas
the PCA-GA-ANFIS model has a C-index of approximately 0.85. This suggests that the
consistency of the PCA-GA-ANFIS model's forecasting outputs surpasses that of the
ANFIS and XGBoost models.
Fig. 11. ROC and C-index curves of PCA-GA-ANFIS, ANFIS, and XGBoost.
5. Conclusion
With the popularity of online education, to achieve accurate prediction of student
performance, this study proposed the PCA-GA-ANFIS algorithm and compared it with
the ANFIS and XGBoost models. Compared with the other algorithms, the performance
prediction algorithm proposed in this research is more accurate and better reflects
the actual online teaching situation. The specific experimental results and their
deficiencies are as follows. After testing, the PCA-GA-ANFIS model converged after
about 300 iterations, and the other models converged more slowly. The loss values
of PCA-GA-ANFIS, ANFIS and XGBoost were about 0.15, 0.3 and 0.25, respectively,
with PCA-GA-ANFIS the lowest. The average AE and RE of PCA-GA-ANFIS were 2.2 and
1.6%, respectively, both lower than those of the other two models. In addition,
the AA, AP, AR and mean F1 of the PCA-GA-ANFIS model were 89.4%, 90.3%, 88.8% and
90.6%, respectively, all higher than those of the other models. The C-index for
PCA-GA-ANFIS, ANFIS, and XGBoost was about 0.85, 0.7, and 0.8, respectively, with
PCA-GA-ANFIS the highest. These results show that the PCA-GA-ANFIS model has better
prediction accuracy, recall, consistency and overall performance than the other
models, and its prediction results are more consistent with the true values.
Since the genetic algorithm only optimizes the antecedent (premise) parameters of
ANFIS, while the consequent parameters are learned by the least-squares algorithm,
the accuracy of the results still has room for improvement. Future research will
therefore focus on how to optimize the consequent parameters as well. In addition,
because the sample data contain relatively few indicators, the accuracy of the
results needs further improvement; future studies should add more related factors
to broaden the scope of the research.