1. Introduction
In the field of transportation infrastructure, monitoring road conditions is one of the most
challenging tasks worldwide because, without proper maintenance, deteriorating roads
increase repair costs, vehicle damage, and road accidents.
Major road accidents occur because of the poor condition of the road surface [1-3].
The main purpose of maintaining good-quality roads is to support an efficient road
network and reduce traffic accidents. Nevertheless, road maintenance faces various
daily challenges, such as weather conditions, heavy manpower requirements, and heavy traffic loads.
An efficient, low-latency road surface monitoring system is needed to keep up with the
frequent repairs that a deteriorating road surface demands. Thus far, traditional
systems have relied on expensive equipment, such as LIDAR and GPR, which makes them
impractical to deploy at a very large scale [5,6].
The major types of road surface distress are as follows [4]:
· Rutting
· Patching
· Cracking (alligator, block, transverse, and longitudinal)
· Raveling
· Potholes
Therefore, this study focuses on these anomalies and analyzes them using machine
learning.
2. Methods
The goal of this study was to design an efficient, low-latency road monitoring system
that uses machine learning to classify road surface conditions from data collected
with smartphones. The results obtained from a single axis were compared with those
from all three axes, along with the performance of neural networks in classifying
road surface conditions. The proposed system was divided into the following steps
for a better evaluation of the data (Fig. 3):
- Data Collection
· GPS location system
· Gyroscope
· Accelerometer
- Data processing
· Labeling
· Filtering
· Extraction of features
- Machine learning designing and training
· Support Vector Machine
· Decision Tree
· Neural Networks
· Designed Model Evaluation
· Classification
Fig. 1. Cracking and its types.
Fig. 2. Rutting and its types.
Fig. 3. Block Diagram of Proposed System.
Fig. 4. Sample Data acquired during data collection [38].
A. Data Collection
The data collection stage covers collecting and recording data according to the
requirements of the system. Different smartphone sensors, namely the GPS sensor,
gyroscope, and accelerometer, were used.
As expected, the measurements may differ between vehicles because of their
characteristics and suspension quality. Therefore, the data were collected from two types
of vehicles, a car and motorcycles, with the smartphone mounted and stabilized [8,9].
Fig. 4 shows one of the collected data samples. The latitude and longitude of the
vehicle, recorded using GPS, were used to determine its location at every point in
time. The rotation about all three axes was measured with the gyroscope, and the
acceleration along all three axes was measured with the accelerometer.
B. Data Processing
After collection, the raw data are of poor quality because they contain too many fluctuations,
so they must be processed before the road surface quality can be analyzed and classified. The
accelerometer and gyroscope readings were taken alongside the speed and location. The
accelerometer and gyroscope on the same device report data in different frames of
reference, as shown in Fig. 5.
Fig. 5. Frames of reference of the accelerometer and gyroscope on the same Cartesian plane [38].
Fig. 6. Global frame of reference common to both the gyroscope and accelerometer [38].
The coordinate frame of the accelerometer needs to be reoriented to a global reference
frame that is shared with the redefined frame of reference of the gyroscope. A
reorientation algorithm based on Euler angles was used to reorient the accelerometer
data set.
The reference condition was a vehicle at rest on a horizontal surface, for which the
ideal acceleration values are known. The accelerometer reference frame was then
converted to the global reference frame by calculating two of the three Euler angles [30].
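As an illustration of the reorientation step, the following is a minimal sketch and not the equations of [30]. It assumes, hypothetically, that the global vertical axis is Z, so that a device at rest on a horizontal surface should ideally read (0, 0, g); the axis labels used in this study may differ. The function names are illustrative only.

```python
import math

def euler_angles_from_rest(ax, ay, az):
    """Estimate roll and pitch (radians) from the mean acceleration at rest.

    Assumption: after reorientation, gravity should lie along the global Z axis.
    """
    roll = math.atan2(ay, az)                    # rotation about X that zeroes Y
    pitch = math.atan2(-ax, math.hypot(ay, az))  # rotation about Y that zeroes X
    return roll, pitch

def reorient(sample, roll, pitch):
    """Rotate one (x, y, z) accelerometer sample into the assumed global frame."""
    x, y, z = sample
    # Rotation about the X axis by the roll angle.
    y1 = y * math.cos(roll) - z * math.sin(roll)
    z1 = y * math.sin(roll) + z * math.cos(roll)
    # Rotation about the Y axis by the pitch angle.
    x2 = x * math.cos(pitch) + z1 * math.sin(pitch)
    z2 = -x * math.sin(pitch) + z1 * math.cos(pitch)
    return (x2, y1, z2)

# Hypothetical resting reading from a tilted phone (m/s^2).
rest = (1.2, -2.5, 9.3)
roll, pitch = euler_angles_from_rest(*rest)
aligned = reorient(rest, roll, pitch)
```

With these two angles, the resting reading maps onto the vertical axis only; the third Euler angle (heading) cannot be recovered from gravity alone.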
For the supervised machine learning algorithm to work properly, the road surface
condition must be labeled segment by segment as ground truth. The conditions of the
various road pavements could be labeled using the video recorded by a specially
developed application [7,10].
Subsequently, the GPS location and speed data were spline-interpolated so that the
exact position of each road surface anomaly could be determined, because the GPS
sampling rate differs from that of the gyroscope and accelerometer. On the other hand,
driving events such as accelerating, decelerating, turning, and changing
lanes, which are unrelated to the road surface condition, must also be taken into account. An
11th-order high-pass filter with an attenuation of 80 dB and a cut-off
frequency of 3 Hz was applied to the X’ and Z’ acceleration data to remove such
disturbances. This discards the low-frequency band and preserves the high-frequency
content of the data.
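The resampling of GPS data onto the sensor timeline can be sketched as follows. This is only an illustration: it uses piecewise-linear interpolation as a simpler stand-in for the spline interpolation described above, and the function name, timestamps, and values are hypothetical.

```python
from bisect import bisect_right

def interpolate_linear(t_known, v_known, t_query):
    """Interpolate values sampled at t_known (ascending) onto t_query.

    Linear interpolation is used here as a stand-in for a spline.
    Queries outside the known range are clamped to the nearest endpoint.
    """
    result = []
    for t in t_query:
        if t <= t_known[0]:
            result.append(v_known[0])
        elif t >= t_known[-1]:
            result.append(v_known[-1])
        else:
            i = bisect_right(t_known, t)          # first known time greater than t
            t0, t1 = t_known[i - 1], t_known[i]
            weight = (t - t0) / (t1 - t0)
            result.append(v_known[i - 1] + weight * (v_known[i] - v_known[i - 1]))
    return result

# Hypothetical GPS speed at 1 Hz, resampled onto faster sensor timestamps.
gps_time = [0.0, 1.0, 2.0]
gps_speed = [10.0, 12.0, 16.0]
sensor_time = [0.0, 0.5, 1.5, 2.0]
speed_at_sensor = interpolate_linear(gps_time, gps_speed, sensor_time)
```

Each accelerometer or gyroscope sample then carries an estimated speed (and, by the same approach, a location) even though the GPS reports less often.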
· Feature extraction
While driving, many road vibrations occur, and these have various features. Nevertheless,
this study focused mainly on three broad categories of features: time, frequency, and wavelet
domain features.
✔ Time Domain Features
Following the analysis by Gandelmawla et al. of various domains in their present and
previous work, the following features were calculated from the time-domain
signals and their channels, crests, and packets: maximum, minimum, and
mean values; RMS; peak-to-peak; and ten-point average values [11].
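The time-domain features listed above can be computed per window as in the sketch below. The definition of the ten-point average used here (mean of the five largest samples minus mean of the five smallest) is an assumption borrowed from surface-roughness practice; the source does not spell it out. The function name is illustrative.

```python
import math

def time_domain_features(window):
    """Compute simple time-domain features for one window of samples."""
    n = len(window)
    ordered = sorted(window)
    k = min(5, n)  # use fewer points if the window is very short
    highest = ordered[-k:]
    lowest = ordered[:k]
    return {
        "max": ordered[-1],
        "min": ordered[0],
        "mean": sum(window) / n,
        "rms": math.sqrt(sum(x * x for x in window) / n),
        "peak_to_peak": ordered[-1] - ordered[0],
        # Assumed definition: mean of 5 highest minus mean of 5 lowest samples.
        "ten_point_average": sum(highest) / k - sum(lowest) / k,
    }

# Hypothetical window of vertical acceleration samples.
features = time_domain_features([1, -2, 3, -4, 5, -6, 7, -8, 9, -10])
```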
✔ Frequency Domain Features
For vibrational signals, the power spectral density provides information
that helps distinguish between road surface conditions [12,13]. The power spectral density was calculated for each window, and the bandwidth was divided
into small bands of 5 Hz each; the average, maximum, and RMS
values of each band were taken as the frequency domain features.
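The band-wise frequency features can be sketched as follows. This is a simplified illustration: it uses a plain periodogram computed with a direct DFT (no windowing, no one-sided scaling), and the sampling rate, function names, and test signal are assumptions rather than values from the study.

```python
import cmath
import math

def power_spectrum(window, fs):
    """Return (frequencies, power) from a plain periodogram of one window."""
    n = len(window)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(window))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / (fs * n))
    return freqs, power

def band_features(freqs, power, band_width=5.0):
    """Average, maximum, and RMS of the power in consecutive frequency bands."""
    bands = {}
    for f, p in zip(freqs, power):
        bands.setdefault(int(f // band_width), []).append(p)
    features = []
    for index in sorted(bands):
        values = bands[index]
        features.append({
            "band_hz": (index * band_width, (index + 1) * band_width),
            "mean": sum(values) / len(values),
            "max": max(values),
            "rms": math.sqrt(sum(v * v for v in values) / len(values)),
        })
    return features

# Hypothetical 2 s window sampled at 50 Hz containing a 12 Hz vibration.
fs = 50
signal = [math.sin(2 * math.pi * 12 * i / fs) for i in range(100)]
feats = band_features(*power_spectrum(signal, fs))
strongest = max(feats, key=lambda band: band["max"])
```

In this example the 10 to 15 Hz band carries almost all of the power, which is the kind of contrast between bands that a classifier can exploit.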
✔ Wavelet Domain Features
Following Griffith’s studies, the RMS values and
ten-point averages of the wavelet scales studied were used as the wavelet domain
features [14-16].
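A per-scale wavelet feature can be sketched as below. The source does not name the wavelet family or the number of scales, so the Haar wavelet and the two-level decomposition used here are assumptions chosen only because they are easy to write out in full.

```python
import math

def haar_details(signal, levels):
    """Return the Haar detail coefficients for each decomposition level.

    Assumption: Haar wavelet; an unpaired last sample at any level is dropped.
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        if len(approx) < 2:
            break
        starts = range(0, len(approx) - 1, 2)
        detail = [(approx[i] - approx[i + 1]) / math.sqrt(2) for i in starts]
        approx = [(approx[i] + approx[i + 1]) / math.sqrt(2) for i in starts]
        details.append(detail)
    return details

def rms(values):
    """Root mean square of a non-empty sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))

# Hypothetical window that alternates rapidly, so its energy sits at the finest scale.
window = [1, -1, 1, -1, 1, -1, 1, -1]
scale_rms = [rms(level) for level in haar_details(window, levels=2)]
```

The RMS of each level then serves as one feature per scale; a ten-point average per scale could be added in the same way.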
The last task in feature extraction is to extract the features from the accelerometer
data. The analysis found that the Y’ axis alone can be used to
distinguish road surface anomalies, whereas the X’ and Z’ axes are affected
by every change in the road surface and provide information that can be used to distinguish
cracks from potholes.
Therefore, 54 features were extracted per accelerometer axis, giving 162 feature values
(54 × 3 axes) for every feature vector.
C. Machine Learning Designing and Training
The system should be able to learn and improve by itself. Machine learning, a branch
of AI (Artificial Intelligence), was used so that the system improves with experience.
After training, the algorithm applies the relationships it has learned to solve
problems of the same type as those it was trained on. For implementation, it is important to understand
the workflow diagram used (Fig. 7).
Fig. 7. Workflow diagram of Machine Learning Algorithms.
Fig. 8. Decision Tree Diagram [38].
Fig. 9. Structure of the Neural Network with one hidden layer.
No standard rule exists for choosing a machine learning algorithm. Hence,
selecting classifiers is difficult. Seven classifiers from the MATLAB Statistics and
Machine Learning Toolbox were applied [17-19]:
· Naïve Bayes classification
· Discriminant analysis
· Ensemble classifier
· Decision tree induction
· Nearest neighbours
· SVM (Support Vector Machine)
· NN (Neural Networks)
Further analysis of their results identified the three most popular and reliable techniques
for the processed road surface data set:
· SVM
· Decision Trees
· Neural networks
The data set was randomized and divided in a ratio of 80:20 for training
and testing. The same data set was used for the three selected classifiers.
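The randomized 80:20 division can be sketched as follows. The study itself used MATLAB; this Python version, with its function name and fixed seed, is only an illustrative equivalent.

```python
import random

def train_test_split(samples, train_fraction=0.8, seed=0):
    """Shuffle the samples and split them into training and testing parts."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical data set of 100 feature vectors, represented by their indices.
train, test = train_test_split(range(100))
```

Changing the seed on each iteration gives a different grouping of training and testing instances while keeping the same ratio.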
I) Support Vector Machine (SVM)
SVM is a supervised machine learning method that applies pattern recognition to
input data sets for classification and regression analysis. For classification, SVM uses
the hyperplane that gives the maximum margin between
the data point clusters of the different classes [20]. Its strengths are versatility, memory efficiency, and effectiveness in high-dimensional spaces.
II) Decision Trees
Decision trees, also called classification or regression trees, predict the output from
the input values. Moving from the root node to a leaf node yields the output
based on the input values and the decisions made along the path [21]. Fig. 8 shows the decision tree algorithm, which categorizes the data using a series of arithmetical
tests.
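The root-to-leaf traversal can be illustrated with a small hand-written tree. The feature names, thresholds, and tree shape below are entirely hypothetical and are not the tree learned in this study; they only show how a series of arithmetical tests leads to a class label.

```python
def predict(node, features):
    """Walk from the root to a leaf, applying one threshold test per node."""
    while "label" not in node:
        go_left = features[node["feature"]] < node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]

# Hypothetical tree: thresholds and features are made up for illustration.
TREE = {
    "feature": "rms", "threshold": 0.5,
    "left": {"label": "smooth"},
    "right": {
        "feature": "peak_to_peak", "threshold": 3.0,
        "left": {"label": "crack"},
        "right": {"label": "pothole"},
    },
}

label = predict(TREE, {"rms": 1.0, "peak_to_peak": 4.0})
```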
III) Neural Networks
Neural networks are a set of algorithms, modeled loosely after the human brain, that
are designed to recognize patterns. They interpret sensory data through a kind of
machine perception, labeling or clustering the raw input. The patterns they recognize are numerical
and contained in vectors, into which all real-world data, whether images, sound, text, or
time series, must be translated [34-36].
A neural network connects the feature vectors (input layer) to the class labels
(output layer) through one or more intermediate layers called hidden
layers (Fig. 9). The number of
hidden layers needed depends on the complexity of the classification
problem [22-24].
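The forward pass of a network with one hidden layer can be sketched as below. The weights, layer sizes, and softmax output are illustrative assumptions; a trained model would learn the weights from the labeled feature vectors. Either Tanh or ReLU can be passed as the hidden activation.

```python
import math

def relu(value):
    """Rectified linear unit."""
    return max(0.0, value)

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def softmax(scores):
    """Convert raw scores into class probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mlp_predict(features, hidden_w, hidden_b, out_w, out_b, activation=math.tanh):
    """Input layer -> one hidden layer -> output layer with class probabilities."""
    hidden = dense(features, hidden_w, hidden_b, activation)
    scores = dense(hidden, out_w, out_b, lambda v: v)
    return softmax(scores)

# Hypothetical weights: 2 input features, 2 hidden units, 3 output classes.
probs = mlp_predict(
    [0.5, -1.0],
    hidden_w=[[1.0, 0.0], [0.0, 1.0]], hidden_b=[0.0, 0.0],
    out_w=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], out_b=[0.0, 0.0, 0.0],
    activation=relu,
)
```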
Neural networks help cluster and classify data. They can therefore
be regarded as a clustering and classification layer on top of the data that is
stored and managed. They group unlabelled data according to the similarities
among the example inputs, and they classify data once they have a labeled data set to train
on [39]. Although neural networks are extremely powerful and highly accurate, they
require a large data set for training, and the amount of data required grows as the number of hidden layers
increases.
D. Designed Model Evaluation
Different performance measures are used in machine learning to evaluate the
classifiers mentioned earlier. For the classification task, these measures are
derived from the confusion matrix, which
summarizes the behaviour of a supervised machine-learning algorithm in tabular form based
on the true negatives (TN), true positives (TP), false negatives (FN), and false positives
(FP).
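The measures derived from the confusion matrix can be written out as in the sketch below, using the standard definitions accuracy = (TP + TN) / (TP + TN + FP + FN), precision = TP / (TP + FP), and recall = TP / (TP + FN). The confusion matrix values are hypothetical and serve only to show the one-vs-rest calculation for each class.

```python
def binary_metrics(tp, tn, fp, fn):
    """Accuracy, precision, and recall from the four confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

def per_class_metrics(confusion, labels):
    """One-vs-rest metrics per class.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    results = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp
        fp = sum(row[k] for row in confusion) - tp
        tn = total - tp - fn - fp
        results[label] = binary_metrics(tp, tn, fp, fn)
    return results

# Hypothetical confusion matrix for 100 test windows.
labels = ["crack", "pothole", "smooth"]
confusion = [
    [5, 2, 3],
    [1, 6, 3],
    [2, 2, 76],
]
metrics = per_class_metrics(confusion, labels)
```

In this made-up example the smooth class scores far higher than the crack class simply because it has many more instances, which is the kind of imbalance effect that per-class precision and recall make visible.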
The performance of the simple decision tree and SVM classifiers was assessed
using the average test accuracy and the average training loss of the trained classifier.
The average test accuracy is the mean accuracy on the test data set over the n
iterations of classification, whereas the average training loss is the mean
in-sample loss of the trained classifier model on the training data set. In
addition, the average precision and average recall were recorded for the three
classes predicted by the model to determine which portion of the identifications
is correct (precision) and which portion of the actual positives is identified correctly (recall).
Table 1 compares the time required for feature extraction when all three axes are used
with that when only one axis (Y axis) is used, to evaluate the significance of the
time requirement.
Table 1. Average Time Requirements for Feature Extraction.
Parameter | All axes (ms) | Y axis (ms)
High-pass filtering | 0.158 | 0.0211
Time-domain feature extraction | 16.43 | 5.56
Frequency-domain feature extraction | 1.75 | 1.105
Wavelet-domain feature extraction | 53.41 | 19.65
Total | 71.748 | 26.336
3. Results
This study analyzed and evaluated the results of the machine learning models to assess
their capability to detect road surface anomalies. The algorithms were run on a
Lenovo IdeaPad 310 running Microsoft Windows 10 Home with an Intel Core i5-7200
processor (2.30 GHz) and 8 GB of RAM.
A. SVM – Support Vector Machine
The simple SVM was run for one hundred iterations, with each iteration
using a different grouping of instances for testing and training while
maintaining the same ratio for all classes every time. The generalized performance of
the algorithm was evaluated using the average values of the evaluation parameters. Table 2 lists the calculated training loss, test accuracy, precision, and recall
rates.
Table 2. Implemented Results of Simple SVM.
Parameter (AVG) | All axes | Y axis
Training loss | 0.0153 | 0.0582
5-fold loss | 0.079 | 0.0891
7-fold loss | 0.075 | 0.0956
10-fold loss | 0.0821 | 0.089
Leave-one-out loss | 0.071 | 0.082
An analysis of Table 2 showed that the SVM classifier trained with features from all three axes had a lower loss and performed
better than the one trained with only one axis. Table 3 lists the values calculated for a cross-validated SVM in both cases, that is, considering
all three axes and considering only the Y axis. The classifier that considers all three axes
has a lower training loss and fewer cross-validation errors.
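The k-fold and leave-one-out losses reported for the cross-validated classifiers rely on partitioning the samples into folds. The sketch below shows one way to generate such partitions; the function name is illustrative, the samples are assumed to be shuffled beforehand, and leave-one-out is treated as the special case where the number of folds equals the number of samples.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

# 5-fold partition of 10 hypothetical samples, and leave-one-out on 4 samples.
five_fold = list(k_fold_indices(10, 5))
leave_one_out = list(k_fold_indices(4, 4))
```

The cross-validated loss is then the average of the losses measured on each held-out fold after training on the remaining folds.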
Table 3. Implemented Results of Cross-Validated SVM.
Parameter (AVG) | All axes | Y axis
Training loss | 0.0312 | 0.0942
Test accuracy | 0.754 | 0.895
Parameter (AVG) | All axes: Crack | All axes: Pothole | All axes: Smooth | Y axis: Crack | Y axis: Pothole | Y axis: Smooth
Precision | 0.521 | 0.6724 | 0.892 | 0.412 | 0.668 | 0.883
Recall | 0.323 | 0.5149 | 0.8762 | 0.321 | 0.521 | 0.815
B. Decision Tree
The decision tree was built with a set of hyperparameters, such as the number of nodes
and the node thresholds, that varied widely from iteration to iteration. It is also easier and faster
to train. The procedure follows the one used for the SVM, with 500 iterations,
each using a distinct selection of training data. Table 4 lists the training loss, precision, recall, and test accuracy of the decision tree
implementation.
Table 4. Implemented Results of the Decision Tree.
Parameter (AVG) | All axes | Y axis
Training loss | 0.036 | 0.0613
Test accuracy | 0.886 | 0.816
Parameter (AVG) | All axes: Crack | All axes: Pothole | All axes: Smooth | Y axis: Crack | Y axis: Pothole | Y axis: Smooth
Precision | 0.516 | 0.642 | 0.944 | 0.379 | 0.825 | 0.851
Recall | 0.437 | 0.554 | 0.868 | 0.121 | 0.746 | 0.816
The Decision Tree classifier trained with features from all three axes had a lower loss and performed better
than the one trained with only one axis (Table 4). The values were also calculated for a cross-validated model in both cases, considering
all three axes and considering only the Y axis (Table 5). Considering all three axes in the classifier also gave a lower training loss and fewer
cross-validation errors.
Table 5. Cross-Validated Implemented Results of Decision Tree.
Parameter (AVG) | All axes | Y axis
Training loss | 0.0214 | 0.0653
5-fold loss | 0.079 | 0.0981
7-fold loss | 0.086 | 0.0841
10-fold loss | 0.0821 | 0.089
Leave-one-out loss | 0.089 | 0.081
C. Neural Networks
The preliminary phase of implementing the MLP neural network classifier
included the assessment of test precision, accuracy, and recall for the different
parameter combinations selected earlier. The results are tabulated in Tables 6 and 7.
Different numbers of hidden layers were used to compare the effect of the number
of hidden layers. The average test precision, recall rates, and accuracy
of the MLP model were relatively greater for all three axes than for one axis.
With Tanh as the activation function, the models trained on features
from one axis only gave higher recall rates and precision among the three
classes. Nevertheless, high recall and precision rates for cracks were obtained using
ReLU with all three axes. The recall rates and accuracy for potholes and smooth
roads did not change between the two activation functions [37].
Table 8 lists the average time required to classify a single window of features
from all three axes for the different ML classifiers.
Table 6. Implementation Results for the MLP with the Tanh Activation Function.
Test accuracy
Hidden layers | Features of all axes | Features of Y' axis only
7 | 0.752 | 0.821
8 | 0.891 | 0.749
9 | 0.781 | 0.814
10 | 0.823 | 0.715
Accuracy rates
Hidden layers | All axes: Crack | All axes: Pothole | All axes: Smooth | Y' axis: Crack | Y' axis: Pothole | Y' axis: Smooth
7 | 0.641 | 0.632 | 0.876 | 0.42 | 0.512 | 0.875
8 | 0.612 | 0.641 | 0.824 | 0.216 | 0.532 | 0.87
9 | 0.53 | 0.658 | 0.743 | 0.43 | 0.542 | 0.89
10 | 0.521 | 0.617 | 0.813 | 0.41 | 0.69 | 0.991
Recall rates
Hidden layers | All axes: Crack | All axes: Pothole | All axes: Smooth | Y' axis: Crack | Y' axis: Pothole | Y' axis: Smooth
7 | 0.68 | 0.419 | 0.97 | 0.312 | 0.62 | 0.85
8 | 0.59 | 0.82 | 0.98 | 0.343 | 0.68 | 0.875
9 | 0.58 | 0.69 | 0.99 | 0.39 | 0.574 | 0.83
10 | 0.52 | 0.75 | 0.94 | 0.41 | 0.661 | 0.84
Table 7. Implementation Results for the MLP with the ReLU Activation Function.
Test accuracy
Hidden layers | Features of all axes | Features of Y' axis only
7 | 0.711 | 0.8525
8 | 0.989 | 0.7865
9 | 0.823 | 0.8642
10 | 0.8023 | 0.7643
Accuracy rates
Hidden layers | All axes: Crack | All axes: Pothole | All axes: Smooth | Y' axis: Crack | Y' axis: Pothole | Y' axis: Smooth
7 | 0.532 | 0.625 | 0.876 | 0.42 | 0.772 | 0.882
8 | 0.424 | 0.698 | 0.789 | 0.415 | 0.634 | 0.89
9 | 0.51 | 0.786 | 0.943 | 0.39 | 0.692 | 0.98
10 | 0.456 | 0.721 | 0.912 | 0.39 | 0.79 | 0.912
Recall rates
Hidden layers | All axes: Crack | All axes: Pothole | All axes: Smooth | Y' axis: Crack | Y' axis: Pothole | Y' axis: Smooth
7 | 0.98 | 0.679 | 0.92 | 0.265 | 0.69 | 0.87
8 | 0.424 | 0.69 | 0.912 | 0.282 | 0.81 | 0.812
9 | 0.56 | 0.786 | 0.885 | 0.48 | 0.734 | 0.86
10 | 0.521 | 0.79 | 0.934 | 0.41 | 0.751 | 0.89
Table 8. Testing Time – Classifier Performance.
Classifier | Average time to classify one window (µs)
SVM | 30.12
Decision Trees | 5.113
MLP | 38.134
4. Discussion & Future Work
Machine learning approaches are quite effective for detecting potholes and cracks. In addition,
using features from all three axes is more accurate and precise
than using a single axis only. The novelty of this approach lies in extracting many features from
all three axes to train multiclass ML classifiers. On the other hand, there were
some limitations that will be addressed in future work:
a) Loss of accuracy and precision because of the small size of the data set.
b) Errors in the individual precision and recall rates because the instances
were distributed disproportionately among potholes, cracks, and smooth areas.
c) Possibly better results with other models, as the study considered only specific
neural network architectures.
d) Effects on the acquired data caused by the condition of the vehicles,
smartphones, and other devices.
e) Possibilities of larger-scale implementation because of the self-learning nature
of ML technologies.