Zhang Ruixue
(Department of College English, Zhejiang Yuexiu University, Shaoxing, 312000, China; ruixue.zhang@gmx.com)
Copyright © The Institute of Electronics and Information Engineers(IEIE)
Keywords
Decision tree, Resource recommendation model, ID3 algorithm, English reading
1. Introduction
English is the lingua franca for communication in all disciplines, so English learning
is becoming increasingly important in universities (Yi, 2020). The rising teaching
requirements have not been matched by changes in teaching methods. Most educators
have only adjusted teaching arrangements inside and outside the classroom, and such
improvements are limited (Duan, 2021). Internet development has enabled various learning
resources to spread worldwide, and this vast pool of resources can provide new ideas
for teaching English (Zhou, 2021). Among these resources, the selection of reading
materials affects reading instruction: students’ learning progress will be delayed
if the selected resources do not match their characteristics. Hence, an interactive,
adaptive method for selecting reading resources can effectively improve English reading
instruction (Ma, 2021). This study used ID3, a classification algorithm that supports
such interaction, as the basis of the model and optimized its information gain formula
to keep the model from falling into a local optimum. The model before and after the
improvement and the traditional recommendation method were compared in simulations
to evaluate the performance and practicality of the model. The changes in students’
reading scores and their feedback were the performance indicators. The study also
used the model to mine recommendation schemes for different types of students to provide
ideas for improving English reading instruction.
2. Related Work
The ID3 algorithm is the most commonly used algorithm in decision trees and is used
widely as the basis for various complex systems because of its superior classification
performance. Park et al. (2018) developed an ID3 adaptive path selection model using
the fuzzy decision tree algorithm to overcome the sensitivity of decision trees in
route selection. Simulation experiments were conducted on this model. The results
showed that this improvement could improve the prediction accuracy of the model with
good adaptability. An and Zhou (2022) examined the effect of the decision tree algorithm
in rural energy construction. They used the ID3 algorithm to set thresholds for the
terrain feature selection algorithm, which promoted attribute complementarity and
filtered out irrelevant attributes. Its application to the spatial optimization of
solar energy installations in rural areas showed that the algorithm had promotion potential.
Abbas and Farooq (2019) used an improved ID3 algorithm to distinguish between skin
and non-skin pixels, aiming to improve the skin detection accuracy and exclude the
interference of skin color on the recognition results. They added three color space
data sets to the algorithm. The results showed that the system accuracy on each index
was above 99.50%. Karthi et al. (2018) used
data mining for accident prediction in the railroad sector. They used text mining
techniques to mine the data provided by the user and the railroad sector, where the
unstructured data provided by the railroad sector was analyzed using the ID3 algorithm
to predict the cause of accidents. Pratama and Saragi (2018) attempted to classify
the quality of cassava to ensure the quality of related processed products by examining
various parameters of cassava and image processing of whiteness and speckle degree
in visual parameters and then classifying them using the ID3. Maingi et al. (2019)
proposed an ID3-based decision tree for symptom burden classification against disease
outbreaks. The algorithm specifically sorts and classifies the disease burden information
gained to derive the required knowledge. The results showed that their proposed method
could support the related field.
Reading comprehension is an important segment of English language learning and one
item requiring improvement in college education. Educational researchers have been
trying to improve the quality of reading comprehension in various ways. Zaiter (2020)
suggested that reading is as indispensable as writing and, as an educator, believed
that students should be able to find motivation for both. Nevertheless, reading and
writing differ when it comes to academic writing. Zaiter analyzed the situation of
English majors in the Arab world and proposed remedial measures, supported by extensive
experimental data, to help prevent plagiarism. Wu (2021) believed that it is necessary
to improve students’ thinking when teaching English reading comprehension, which is
one of the core competencies of the subject. Chakraborty and Chowdhury (2021) reported
that reading comprehension is one of the essential English skills at all levels of
education and that its importance in obtaining a degree is becoming an issue, with
academic reading an early manifestation of this concern. Their finding is based on
a survey of students in government colleges in Bangladesh, who believed that teaching
academic reading to undergraduate students strengthens their competitiveness. Chinese
higher education policy emphasizes developing students’ intercultural competence.
Yu and Maele (2018), however, suggested that this is not the case in practice. Hence,
they conducted a curriculum study at a university while building a Baker-based model
of intercultural awareness to train participants. The results showed that reading
courses can help Chinese students build intercultural awareness. Audina
et al. (2020) reported that students who do not understand the reading content are
prone to translate word by word rather than comprehending it as a whole. In response
to this problem, they investigated the teaching strategies of English teachers and
their causes. They established the DRA strategy to guide students to understand the
text content, and the results proved that this attempt is meaningful.
The application and improvement of the ID3 algorithm by domestic and foreign scholars
have been proven effective. Moreover, the data processing models established under
ID3 are used widely in various fields, and the classification accuracy is improving.
The algorithm can combine the characteristics of users and data objects for adaptive
matching, which fits the problem that college English reading materials are not yet
recommended adaptively to students. Most educators improve English education by working
on texts and related issues inside and outside the classroom, and few integrate intelligent
technologies into English education. Thus, attempts to apply intelligent technology
have positive significance.
3. Construction of Reading Resource Recommendation Model based on ID3 Algorithm
3.1 Decision Tree Composition based on ID3 Algorithm
Data mining often requires a supervised learning algorithm to predict the attributes
and categories of unknown data. Decision trees are representative of this class of
algorithms owing to their good mechanism for generating discriminative rules, and ID3
is one of the most commonly used among them (Hong et al., 2018). The algorithm first
calculates the information gain of each attribute, and the attribute with the highest
gain is used as the basis for classification. This approach minimizes the amount of
information required for classification and follows the principle of minimum randomness
of division. The decision tree is constructed from top to bottom by dividing on the
known attributes: starting from the root node, the gain is calculated on the sample
set, which is then divided into several subsets on that basis (Tulloch et al., 2018).
The process is iterated on each subset until the construction is completed. In the
resulting structure, the non-categorical attributes become non-leaf nodes, and their
attribute values are represented as branches. Each path from the root of the tree
to a leaf node represents a complete classification rule, and the set of all rules
forms an expression, which becomes the resource recommendation expression.
The ID3 algorithm is simple and has strong learning ability. Its classification is
fast, so it is suitable as the base algorithm for processing large volumes of data.
Here, let the number of possible class labels of the sample set $X$ be $n$. The probability
distribution is expressed as Eq. (1).
At this point, the information entropy of $X$ is written as Eq. (2).
If $P_{i}=0$ in Eq. (2), the term $0\log 0$ is defined as $0$. The base of the logarithm
is $2$ because the information is encoded in binary. If two variables are considered
in the sample set $\left(X,Y\right)$, then the joint probability distribution is expressed
as Eq. (3).
The conditional entropy of $Y$ given $X$ is calculated using Eq. (4).
Here, $P_{i}$ weights the conditional entropy for each value of $X$, so Eq. (4) is
the mathematical expectation of the conditional entropy over the probability distribution
of $X$. If the information entropy of another dataset $A$ is $Entropy\left(A\right)$
and the empirical conditional entropy in this dataset is $Entropy\left(B\left| A\right.\right)$,
the information gain of the dataset $B$ can be calculated using Eq. (5).
The larger the result of Eq. (5), the greater the information gain and the higher
the purity of the resulting subsets of the sample. The decision tree selects the attribute
with the largest gain as the classification attribute and constructs the nodes. Finally,
it constructs the complete decision tree iteratively to analyze the recommendation
rules and recommend the appropriate reading resources for college students.
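The numbered equations referred to in this subsection are not reproduced in this text. As a reading aid, the standard textbook definitions that match the surrounding descriptions are sketched below. They are an assumption about what Eqs. (1)-(5) state, not a copy of the original formulas; in the last line, $B$ denotes the dataset and $A$ the conditioning attribute.

```latex
% Assumed standard forms; the original Eqs. (1)-(5) are not available in this text.
\begin{align}
P\left(X = x_{i}\right) &= P_{i}, \quad i = 1, 2, \ldots, n \tag{1}\\
Entropy\left(X\right) &= -\sum_{i=1}^{n} P_{i}\log_{2} P_{i} \tag{2}\\
P\left(X = x_{i}, Y = y_{j}\right) &= P_{ij}, \quad i = 1, \ldots, n;\; j = 1, \ldots, m \tag{3}\\
Entropy\left(Y\left| X\right.\right) &= \sum_{i=1}^{n} P_{i}\, Entropy\left(Y\left| X = x_{i}\right.\right) \tag{4}\\
Gain\left(B, A\right) &= Entropy\left(B\right) - Entropy\left(B\left| A\right.\right) \tag{5}
\end{align}
```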
The main advantage of the ID3 algorithm lies in its use of information entropy. Selecting
attributes by information gain reduces the sensitivity to abnormal training samples,
and the simple top-down search allows it to handle complex samples. The tree structure
lets the user visualize the classification rules and principles (Andrew et al., 2018).
Nevertheless, the algorithm also has some drawbacks. The relationships between attributes
are complex, and, which sets the direction of the subsequent optimization, the attribute
with the largest information gain is not necessarily the best one for splitting because
an attribute with more values also tends to have a larger gain value.
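The multi-value bias described above can be illustrated with a small sketch. The following Python code computes entropy and information gain from their standard definitions and compares an ordinary attribute with an identifier-like attribute that takes a unique value for every sample. The data, the attribute names `theme` and `resource_id`, and the helper function names are hypothetical and are not taken from the study.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy of the labels minus their conditional entropy given an attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: eight resources labelled as good (YES) or bad (NO) recommendations.
labels = ["YES", "YES", "YES", "NO", "NO", "NO", "YES", "NO"]
theme = ["social", "social", "social", "natural",
         "natural", "natural", "natural", "social"]
resource_id = list(range(8))  # one distinct value per sample

gain_theme = information_gain(theme, labels)
gain_id = information_gain(resource_id, labels)
print(f"Gain(theme)       = {gain_theme:.3f}")
print(f"Gain(resource_id) = {gain_id:.3f}")
```

In this toy case, the identifier-like attribute reaches the maximum possible gain because every subset it produces is trivially pure, even though it carries no information that generalizes to new resources.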
3.2 Optimization of ID3 Algorithm in the Resource Recommendation Model
The principle of the ID3 algorithm is to use the attribute with the greatest information
gain as the splitting attribute. However, an attribute with many values also has a
larger gain, so this multi-value bias directly affects the classification accuracy
of the algorithm (Li et al., 2018). Let $A$ be an attribute of the dataset $X$. One
value in its value domain is divided into two parts, giving the new attribute
$A'=\left(A_{1},A_{2},\ldots ,A_{n+1}\right)$. The possibility of attribute value bias
in this algorithm is then determined by comparing the gains before and after the transformation:
$Gain\left(X,A\right)$ is the gain of the attribute $A$, and $Gain\left(X,A'\right)$
is the gain of the new attribute $A'$. In this case, $Gain\left(X,A\right)$ is calculated
using Eq. (6).
$P\left(D_{i}\right)$ in Eq. (6) represents the probability of class $i$ in the dataset,
$P\left(A_{j}\right)$ is the proportion of samples for which attribute $A$ takes the
value $A_{j}$, and $P\left(D_{i}\left| A_{j}\right.\right)$ is the probability of class
$i$ among the samples for which attribute $A$ has the value $A_{j}$. Similarly, the
gain value of the new attribute, $Gain\left(X,A'\right)$, is calculated using Eq. (7).
In this case, the difference between the two gain values is calculated using Eq. (8).
The quantities $L=\frac{P\left(A'_{n}\right)}{P\left(A_{n}\right)}$, $x_{i}=P\left(X_{i}\left| A_{n}\right.\right)$,
$p_{i}=P\left(X_{i}\left| A'_{n}\right.\right)$, and $o_{i}=P\left(X_{i}\left| A'_{n+1}\right.\right)$
are introduced to simplify the calculation of the gain difference, which is then expressed
as Eq. (9).
Eq. (9) is processed and divided by $P\left(A_{n}\right)$ to obtain Eq. (10).
Set $f\left(x\right)=x\log _{2}x$, at which point Eq. (11) is obtained.
Because its second derivative is positive for $x>0$, $f\left(x\right)$ is a convex
function, and the following relationship can be obtained:
Eq. (13) can be obtained by processing each relationship.
Substituting Eq. (13) into the difference of information gain gives
$Gain\left(X,A\right)\leq Gain\left(X,A'\right)$. Because the attribute selection
mechanism of the ID3 algorithm is based on information gain, the larger gain value
of $A'$ shows that the algorithm has multi-value bias. Suppose students need reading
resources with attribute $A$ and the traditional ID3 is used to classify the candidate
resources. Resources with many attribute values will then be treated the same as those
that strongly match the single attribute $A$, and the quality of the recommendation
based on this classification will be reduced accordingly. This situation therefore
needs to be improved. This study solves the problem by introducing the correlation
coefficient between the attribute and the class variable, and the improved gain formula
is given as Eq. (14).
In Eq. (14), $\rho _{ay}$ represents the correlation coefficient between attribute $A$
and category $Y$. Introducing the correlation coefficient reduces the information
gain of attributes that have many values but little relevance to the category. This
change optimizes the gain function within the algorithmic process to solve the multi-value
bias problem. The formula must be simplified to keep the construction of the decision
tree concise. Eq. (15) expresses the final information gain formula after simplifying
the logarithmic operation.
$B$ in Eq. (15) is a subset of the original dataset divided by $n$, while the original
dataset has $m$ classes. The dataset $B$ is divided into subsets using $m$ again.
Based on the above optimization, a decision tree T is generated from the input data
set X, for which a feature set E and a threshold are also set. If all individuals
in the data set belong to the same class, a leaf with that class label is generated.
If the feature set E is empty, the class with the most individuals is used as the
label. Otherwise, the information gain of each feature on data set X is calculated
with the above formula, and the feature with the maximum value is taken as the split
node. If this maximum value is less than the threshold, the most frequent label in
the data set is used as a leaf instead of splitting. Otherwise, the data set is divided
according to the values of the selected feature, and the remaining features form the
new feature set for each subset. The above steps are repeated until the decision tree
is generated, as shown in Fig. 1 below.
Fig. 1. ID3 algorithm generation architecture diagram.
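The generation procedure described above can be sketched in Python as follows. This is a minimal illustration of threshold-based ID3 that uses the standard information gain, not the improved gain of Eqs. (14) and (15), which are not reproduced in this text. The function names, the default threshold of 0.01, and the toy data (keyed by the attribute abbreviations Th and De) are assumptions made for illustration only.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def gain(rows, labels, feature):
    """Standard information gain of one feature (not the paper's improved gain)."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    return entropy(labels) - sum(len(g) / total * entropy(g) for g in groups.values())


def build_tree(rows, labels, features, threshold=0.01):
    """Recursively build a tree; leaves are class labels, inner nodes are dicts."""
    majority = Counter(labels).most_common(1)[0][0]
    # Stop when the subset is pure or no features remain.
    if len(set(labels)) == 1 or not features:
        return majority
    best = max(features, key=lambda f: gain(rows, labels, f))
    # Stop when even the best split gains less than the threshold.
    if gain(rows, labels, best) < threshold:
        return majority
    remaining = [f for f in features if f != best]
    node = {"feature": best, "default": majority, "branches": {}}
    for value in sorted({row[best] for row in rows}):
        subset = [(r, y) for r, y in zip(rows, labels) if r[best] == value]
        sub_rows, sub_labels = zip(*subset)
        node["branches"][value] = build_tree(
            list(sub_rows), list(sub_labels), remaining, threshold)
    return node


def predict(tree, row):
    """Follow branches until a leaf; unseen values fall back to the node majority."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(row[tree["feature"]], tree["default"])
    return tree


# Hypothetical toy resources: theme (Th) and difficulty (De) with YES/NO labels.
rows = [
    {"Th": "social", "De": "high"},
    {"Th": "social", "De": "low"},
    {"Th": "natural", "De": "high"},
    {"Th": "natural", "De": "low"},
]
labels = ["YES", "NO", "YES", "NO"]
tree = build_tree(rows, labels, ["Th", "De"])
print(tree)
```

In this toy data, the label depends only on difficulty, so the sketch places De at the root and never splits on Th.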
3.3 Adaptive ID3 for Reading Resource Recommendation Model Construction
The reading recommendation model needs to be a two-way interactive model that adapts
to the situation of the person receiving the recommendations. As that person’s situation
changes, the recommended content should be updated in due time. Therefore, the resource
recommendation algorithm should understand the characteristics of the target person,
the characteristics of the reading resources, and the characteristics of the attributes
that need to be classified. In the pre-processing stage, the data are stored in two-dimensional
arrays, and the data should also be discretized. The experimental subjects were learners
preparing for CET-4. Their situation was modeled to understand the students’ styles
based on various reading ability scales and the actual learning involved, covering
reading ability, learning goals, learning efficiency, learning style, and cognitive
style.
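As a rough illustration of the pre-processing step, the sketch below stores learner records as a two-dimensional array of integer codes and maps a continuous score to discrete levels. The field names, the cut points of 60 and 80, and the sample records are hypothetical and are not taken from the study.

```python
from bisect import bisect_right


def discretize(value, cut_points):
    """Return the index of the interval (defined by cut_points) containing value."""
    return bisect_right(cut_points, value)


def encode_learners(records, cut_points=(60, 80)):
    """Encode learner records as rows of integers plus per-field code books."""
    codebooks = {"goal": {}, "cognitive_style": {}}
    table = []
    for rec in records:
        row = [rec["reading_level"]]
        for key in ("goal", "cognitive_style"):
            book = codebooks[key]
            # Assign the next unused integer code to each new category.
            row.append(book.setdefault(rec[key], len(book)))
        row.append(discretize(rec["score"], cut_points))
        table.append(row)
    return table, codebooks


# Hypothetical learner records.
records = [
    {"reading_level": 5, "goal": "vocabulary",
     "cognitive_style": "field-dependent", "score": 87.2},
    {"reading_level": 5, "goal": "language sense",
     "cognitive_style": "field-independent", "score": 80.3},
    {"reading_level": 5, "goal": "vocabulary",
     "cognitive_style": "field-dependent", "score": 75.9},
]
table, codebooks = encode_learners(records)
for row in table:
    print(row)
```

Keeping the code books alongside the array allows the integer codes in a learned tree to be translated back into readable attribute values.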
When using the ID3 algorithm for student-style classification, feature selection is
crucial for constructing decision trees and for the final classification results.
The style data are first pre-processed, which simplifies and standardizes the students'
learning situations so that their styles can be classified accurately. First, the
study defines the learning profile of each student, including reading ability, learning
objectives, learning efficiency, learning style, and cognitive style. Then, the information
entropy, conditional entropy, and information gain of each feature are calculated.
Next, the feature with the maximum information gain is selected as the root node of
the ID3 decision tree. Branches are formed from the values of this feature, and the
process is repeated until every subset contains data from a single category. This
results in a decision tree that can classify students based on their learning style
characteristics, giving a better understanding of each student's learning preferences
and needs. This has important guiding significance for educators because it can help
them design and adjust teaching strategies to meet the needs of different types of
students, improving educational effectiveness.
The reading ability (Ability, Ab) in the study was rated according to the Chinese
English Reading Ability Scale, which has nine levels, with higher levels indicating
higher ability. The study was limited to the content specified in levels 4–7 (Ma,
2021). Students with different reading abilities will select reading content to improve
a particular ability. Some students aim to increase their vocabulary; others want
to improve their sense of language. The study therefore allows them to select the
reading goal (Objective, Ob) in the student assessment model.
Cognitive style (Cs) is an element that affects the student’s learning abilities and
characteristics. Its advantage is that it visualizes the probability of students’
success and is an explicit indicator formed over time. From a reading comprehension
perspective, the two most involved cognitive styles are field-dependent and independent.
Field-dependent students prefer to read texts with human subjects, and their thinking
has a certain ability to synthesize. They prefer to study the text in detail when
reading, but they cannot easily establish an independent reading space and are easily
influenced by the outside world. Field-independent students are the opposite: they
pay more attention to the actual content conveyed behind the text and prefer the content
of natural subjects. They build their own reading field when reading and have a certain
resistance to interference. Cognitive style is an essential element of research to
analyze the situation of college students. The learning result (Lr) will be assessed
based on the students’ self-assessments and test results.
The model involves three attributes of reading comprehension resources, the main content
that needs to be classified by ID3. Theme (Th) refers to the content source of reading
resources divided into natural subjects and social sciences. The difficulty value
(FV) is the level according to the overall assessment of the resources. Category (Ca)
is a category of questions based on the CET-4 test, including completion reading for
detail, sequential reading for logical order, and narrowly defined fine reading. The
final model is constructed according to the logical order of model construction, as
shown in Fig. 2.
Fig. 2. Adaptive recommendation model.
The adaptive model in Fig. 2 has four indicators in the learner segment and three indicators in the reading resource
model. In the actual process, the recommended reading resources should be changed
adaptively by combining both situations, while the feedback from learners is the basis
for real-time updates, and the resource recommendation model built on this basis can
be used as one of the teaching tools to improve teaching quality.
4. Results and Analysis
A CET-4 training course at a training institution was used to assess the performance
of the constructed model. The necessary information was collected to build a learner
model. The ID3 algorithm was used to classify the reading resources in the resource
library, and the learner model provided the attributes for adaptive recommendation.
Seventy-five percent of the learner data was used as training data; the remaining
data served as the reference for evaluating the results. The categories generated
by the decision tree express whether a recommendation is good or bad, specifically
YES and NO. The test set was fed into the ID3 algorithm, and the output results are
shown in Fig. 3.
Fig. 3. Read Resource Recommendation decision tree.
The decision tree built from the simulated data still establishes its nodes from the
type of reading, reading difficulty, learner’s effect, and cognitive style, which
is similar to the decision tree built from the sample data, so the decision tree is
valid. Fig. 4 shows the relationship between the accuracy of this simulation and the
number of samples.
Fig. 4. Relationship between the number of learners and accuracy.
The accuracy rate in the test set was close to the reference value (> 80%). As the
number of learners increased, the curve of the test data approached the curve of the
reference set, suggesting that the accuracy rate increases with the number of learners
and that the recommendation model with ID3 as a classification tool is effective.
Although the accuracy rate obtained from the performance test did not reach 90%, increasing
the feature data of learners can improve the performance, suggesting that this error
is a limitation of the performance test caused by the limited data collected.
The model was applied to the daily teaching of an English tutorial institution in
the 1$^{\mathrm{st}}$ quarter of 2020, and the change in students’ reading performance
was used to indicate the impact of the model on teaching. The ID3 model after improvement
was used as the experimental group, and the control group was the ID3 model before
improvement and the traditional English recommendation model. The reading chapters
were recommended for students from the same resource library to evaluate the advantages
and disadvantages of the three methods.
The initial reading ability of students in the three groups was similar, all around
R5, and no students showed abnormal performance in the class (Table 1). Some differences
in the outcomes of the three groups were observed after the first training period.
The most noticeable performance improvement was in the improved ID3 recommendation
model group, which performed better than the standard ID3 model and traditional method
groups. According to the students’ feedback, the goal achievement rate of the improved
ID3 model group exceeded 90%, indicating that the recommendation model is effective.
Table 1. Changes of students in each group before and after learning.
Recommended model | Improved ID3 | Standard ID3 | Traditional method
Number of students | 177 | 172 | 169
Initial achievement | 56.13±4.01 | 53.14±3.92 | 58.53±4.09
Average reading ability | R5+ | R5 | R5+
Performance improvement | 18.01±1.07 | 13.21±1.03 | 9.02±1.01
Target achievement rate | 91.22% | 80.13% | 69.21%
The accuracy of the model was assessed by evaluating the four indicators of recall,
accuracy, precision, and F-value according to the above experimental groupings. A
random sample of three groups was fitted with the recommendation and student feedback
as criteria to obtain the four indicators. Fig. 5 presents the results of the four indicators.
Fig. 5. Comparison of the prediction performance between two models.
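For reference, the four indicators can be computed from paired recommendation outcomes as in the following sketch. It uses the standard definitions of accuracy, precision, recall, and the F-value (F1), with YES treated as the positive class. The function name and the sample labels are hypothetical and do not reproduce the study's data.

```python
def classification_metrics(y_true, y_pred, positive="YES"):
    """Compute accuracy, precision, recall, and F1 for a binary labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical student feedback (y_true) versus model recommendations (y_pred).
y_true = ["YES", "YES", "YES", "NO", "NO", "YES", "NO", "YES"]
y_pred = ["YES", "YES", "NO", "NO", "YES", "YES", "NO", "YES"]
metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```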
The overall accuracy of the improved model reached more than 95% (Fig. 5), and each
index was higher than that of the standard ID3 model for the same case. With the change
in the sampling proportion, the accuracy of the improved ID3 model did not change
significantly. In contrast, the accuracy of the standard ID3 algorithm decreased as
the sampling proportion decreased. Hence, the standard algorithm falls more easily
into a local optimum as the sample size decreases. Adding the correlation coefficient
alleviates this problem, i.e., the accuracy of the improved model remains stable as
the sample size changes.
During the use of the recommendation model, the ID3 improvement model group was given
a reading test to monitor the change in the learners in real time. The learner profiles
were first entered into the model to find students with each characteristic as a basis
for finding typical learners. Table 2 lists the output of their characteristics.
Table 2. Table of typical students.
Feature dimension | Student A | Student B | Student C
Reading ability | R5 | R5- | R5
Cognitive style | Dependence | Independence | Dependence
Self-evaluation efficiency | 80 | 81 | 70
Initial accuracy | 87.2% | 80.3% | 75.9%
Question type preference | Cloze | Reading Comprehension | Sort reading
Subject preference | Social | Natural | Social
As shown in Table 2, Student A was field-dependent, with a preference for completion-type
reading and sensitivity to humanities and social science texts. Student B was field-independent,
with a preference for reading comprehension and natural science texts. Student C was
field-dependent, with a preference for logical sequencing and humanities and social
science texts. All three students had similar initial abilities and minor differences
in their self-assessment abilities and initial test scores. Adaptive recommendations
were given to them using the improved model, and their accuracy rates were tallied,
as shown in Fig. 6.
Fig. 6. Comparison of the prediction performance of the two models.
Among the three students, student A showed the greatest improvement, but his accuracy
rate fluctuated the most (Fig. 6), indicating that the accuracy rate of the field-dependent students will be affected
by the environment. Nevertheless, the overall improvement was significant. Although
the initial ability of student B was more general, his accuracy rate also improved,
but the overall fluctuations were not significant, suggesting that the student was
more dependent on the difficulty of the reading material. His accuracy rate improved
from 30% to 50%, indicating that the effect of the recommendation model was significant.
The situation of student C was similar to B, and the improvement also proved the effectiveness
of the model.
Although the model is effective, which recommended attributes significantly affect
teaching and learning still needs to be explored. The study analyzed each typical
learner's data statistically, with each indicator rated from 1 to 4 in ascending order
from bad to good. Table 3 presents the specific results.
Table 3. Analysis of variance of the attribute and accuracy of the recommended resources.
Source | Type III sum of squares | df | Mean square | F | Significance
Corrected model | 12.864a | 14 | 0.910 | 4.047 | 0.000
Th | 6.241 | 3 | 1.421 | 7.151 | 0.000
Ca | 0.465 | 1 | 0.516 | 2.364 | 0.163
De | 0.246 | 1 | 0.246 | 1.036 | 0.221
Th×Ca | 1.145 | 4 | 0.531 | 2.468 | 0.042
Th×De | 1.984 | 1 | 0.359 | 1.634 | 0.201
De×Ca | 2.093 | 1 | 2.147 | 11.397 | 0.001
Total | 2042.000 | 732 | / | / | /
The F-value of the model was 4.047 with a significance of 0.000 (Table 3), indicating
that the model was statistically significant. According to Table 3, the theme (Th)
and the difficulty × question type interaction (De×Ca) most significantly affected
the students’ reading accuracy. In contrast, the other factors had smaller effects.
The same correlation analysis was performed on the output results of the learner’s
attributes in the data results, and then the results were clustered. The eigenvalue
transformation of the clustering results eventually yielded the radar plot of the
recommended preferences of the improved ID3 model for various classes of students,
as shown in Fig. 7.
Fig. 7. Recommended preference radar chart.
According to Fig. 7, the algorithm has a higher difficulty requirement for the resources
recommended to type A students. This indicates that these students have better reading
ability and have their own goals and initiative. The recommended resources for these
students are more in line with their requirements. Type B students prefer moderately
difficult topics; subject matter and question type can also influence their accuracy
rates. The overall situation of type C students is similar to that of type A students,
with a higher requirement for difficulty. In contrast, the type of questions and topics
have an average effect on them.
This study compared the running time of the proposed ID3 algorithm and a traditional
decision tree algorithm to verify the differences between the proposed method and
previous methods. The data volume ranged from 100 to 500. Table 4 compares the running
time of the two algorithms for constructing decision trees. The ID3 algorithm proposed
in this paper was more efficient in constructing decision trees than the traditional
decision tree algorithm. The time difference between the two algorithms increased
as the amount of data increased. When the data volume was 500, the running time of
the ID3 algorithm in this article was 118.08 ms, which was 19.64% shorter than that
of the traditional decision tree algorithm (146.94 ms). When the data volume was 100,
the running time of the ID3 algorithm was 24.64 ms, which was 16.59% shorter than
that of the traditional decision tree algorithm (29.54 ms). These data indicate the
advantage of the proposed ID3 algorithm as the data volume grows within the tested
range. The running time of both algorithms increased with the amount of data, but
that of the proposed ID3 algorithm remained lower throughout. Therefore, the ID3 algorithm
proposed in this article has practical value, especially when dealing with larger
datasets.
Table 4. Comparison of the runtime between two algorithms for constructing decision trees.
Data volume | Traditional decision tree algorithm run time (ms) | ID3 algorithm run time (ms)
100 | 29.54 | 24.64
200 | 61.56 | 50.17
300 | 88.15 | 68.91
400 | 118.68 | 95.34
500 | 146.94 | 118.08
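The relative time savings can be recomputed directly from the values in Table 4. The short sketch below does this for every data volume; the variable and function names are illustrative, and only the run times are taken from the table.

```python
# Run times in ms from Table 4: data volume -> (traditional decision tree, ID3).
runtimes_ms = {
    100: (29.54, 24.64),
    200: (61.56, 50.17),
    300: (88.15, 68.91),
    400: (118.68, 95.34),
    500: (146.94, 118.08),
}


def relative_reduction(baseline, improved):
    """Percentage by which `improved` is shorter than `baseline`."""
    return (baseline - improved) / baseline * 100.0


reductions = {
    volume: relative_reduction(traditional, id3)
    for volume, (traditional, id3) in runtimes_ms.items()
}
for volume, pct in reductions.items():
    traditional, id3 = runtimes_ms[volume]
    print(f"{volume}: saved {traditional - id3:.2f} ms ({pct:.2f}%)")
```

The absolute saving grows with the data volume, while the relative saving stays between roughly 16% and 22% across the tested range.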
5. Conclusion
Reading is a core component of English learners’ abilities. It integrates vocabulary,
linguistic knowledge, and grammar, and it is an important element in assessing
students’ English proficiency. Improving English reading proficiency should take
account of the differences among students while following the laws of education, and
recommending learning content tailored to each student helps improve reading accuracy.
The decision tree established in this study builds an adaptive recommendation model
by adjusting the algorithm to the students’ characteristics. It also optimizes
the information gain formula by introducing correlation coefficients to prevent the
model from falling into a local optimum. The algorithm was tested before and after
the improvement. The improved algorithm fitted the reference values more closely, and
the four performance indicators showed that the accuracy of the improved algorithm
was above 95%. The average satisfaction of students with the recommendation model
was 91.22%. By comparison, the accuracy of the standard ID3 model did not reach 90%,
which supports the effectiveness of the improvement path. Applying the algorithm
to students at one institution showed that the learner classification module in the
model can classify students into three categories. The adaptive recommendations for
the three types of students showed that different types of students have different
requirements for resources. Students with strong learning abilities require more challenging
recommended reading, while students with average or poor ability need the recommendation
model to focus on topics and question types.
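For readers unfamiliar with the quantity being optimized, the sketch below computes the information gain used by the standard ID3 algorithm to choose a splitting attribute. It does not include the correlation-coefficient adjustment introduced in this study, and the attribute names, values, and labels are hypothetical toy data rather than the study's dataset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(rows, labels, attribute):
    """Entropy reduction from splitting on one attribute (standard ID3)."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    # Weighted entropy of the subsets produced by the split.
    remainder = sum(
        len(subset) / total * entropy(subset) for subset in partitions.values()
    )
    return entropy(labels) - remainder


# Hypothetical toy data: attribute names and values are illustrative only.
rows = [
    {"vocabulary": "high", "reading_speed": "fast"},
    {"vocabulary": "high", "reading_speed": "slow"},
    {"vocabulary": "low", "reading_speed": "fast"},
    {"vocabulary": "low", "reading_speed": "slow"},
]
labels = ["strong", "strong", "weak", "weak"]

gains = {name: information_gain(rows, labels, name) for name in rows[0]}
best_attribute = max(gains, key=gains.get)  # attribute ID3 would split on first
```

In this toy example the `vocabulary` attribute separates the two classes perfectly, so it has the larger gain and would be selected as the root split.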
REFERENCES
Abbas A R and Farooq A O. (2019). ‘Skin Detection using Improved ID3 Algorithm’, Iraqi
Journal of Science, Vol. 60, No. 2, pp. 402-410.
An Y and Zhou H. (2022). ‘Short term effect evaluation model of rural energy construction
revitalization based on ID3 decision tree algorithm’, Energy Reports, No. 8, pp. 1004-1012.
Andrew, Russ, Gayle et al. (2018). ‘Decision tree for pretreatments for winter maintenance’,
Transportation Research Record, Vol. 2055, No. 1, pp. 106-115.
Audina Y, Zega N and Simarmata A et al. (2020). ‘An analysis of teacher’s strategies
in teaching reading comprehension’, Lectura Jurnal Pendidikan, Vol. 11, No. 1, pp.
94-105.
Chakraborty S B and Chowdhury. (2021). ‘Teaching academic reading in English to the
undergraduate students at a government college of Bangladesh -Challenges and solutions’,
IOSR Journal of Research & Method in Education (IOSRJRME), Vol. 11, No. 2, pp. 49-63.
Duan X. (2021). ‘The application of activity-based method in English reading teaching
in senior high school’, Region - Educational Research and Reviews, Vol. 3, No. 2,
pp. 60-64.
Hong H, Liu J and Bui D et al. (2018). ‘Landslide susceptibility mapping using J48
Decision Tree with AdaBoost, Bagging and Rotation Forest ensembles in the Guangchang
area (China)’, Catena, No. 163, pp. 399-413.
Karthi M, Priscilla R and Benila E. (2018). ‘The patrons for anticipating the veracity
of rail mishaps using text mining and ID3 algorithm’, International Journal of Pure
and Applied Mathematics, Vol. 119, No. 15, pp. 1753-1759.
Li S, Laima S and Li H. (2018). ‘Data-driven modeling of vortex-induced vibration of
a long-span suspension bridge using decision tree learning and support vector regression’,
Journal of Wind Engineering and Industrial Aerodynamics, No. 172, pp. 196-211.
Ma Y. (2021). ‘The application of schema theory in the teaching of English reading
in senior high schools’, Region - Educational Research and Reviews, Vol. 3, No. 3,
pp. 17-20.
Maingi N N, Lukandu I A and Mwau M. (2019). ‘Inter-county comparative analysis of
ID3 decision tree algorithms for disease symptom burden classification and diagnosis’,
International Journal of Science and Research (IJSR), Vol. 8, No. 5, pp. 83-89.
Park K, Bell M G, Kaparias I and Belzner H. (2008). ‘Soft discretization in a classification
model for modeling adaptive route choice with a fuzzy id3 algorithm’, Transportation
Research Record, Vol. 2076, No. 1, pp. 20-28.
Pratama Y and Saragi H S. (2018). ‘Cassava quality classification for tapioca flour
ingredients by using ID3 algorithm’, Indonesian Journal of Electrical Engineering
and Computer Science, Vol. 9, No. 3, pp. 799-805.
Tulloch A, Nancy A and Stephanie A G et al. (2018). ‘A decision tree for assessing
the risks and benefits of publishing biodiversity data’, Nature Ecology & Evolution,
Vol. 2, No. 8, pp. 1209-1217.
Wu J. (2021). ‘The research on the English reading teaching mode aiming at the improvement
of thinking quality’, Region - Educational Research and Reviews, Vol. 3, No. 2, pp.
40-43.
Yi H. (2020). ‘Teaching strategies of cultivating humanistic literacy in reading teaching’,
Education Study, Vol. 2, No. 3, pp. 174-183.
Yu Q and Maele J V. (2018). ‘Fostering intercultural awareness in a Chinese English
reading class’, Chinese Journal of Applied Linguistics, Vol. 41, No. 3, pp. 357-375.
Zaiter W A. (2020). ‘Reading and writing skills: The challenges of teaching at college
level’, Addaiyan Journal of Arts Humanities and Social Sciences, Vol. 1, No. 10, pp.
41-51.
Zhou Q. (2021). ‘The application of TBLT to English reading teaching in junior high
school’, Region - Educational Research and Reviews, Vol. 3, No. 2, pp. 52-55.
Author
Ruixue Zhang obtained her Master’s Degree in English Language and Literature (2009)
from Southwest University in China. She currently works as a professor in
the Department of College English, Zhejiang Yuexiu University, Shaoxing. She has published
articles in more than 10 national or international journals and conference proceedings.
Her areas of interest include English Teaching and Educational Management.