
  1. Binayak Bhandari (Department of Railroad Engineering & Transport Management, Woosong University, Daejeon, Korea; binayak@sis.ac.kr)
  2. Gijun Park (Department of Railway Vehicle System, Woosong University, Daejeon, Korea; inspiration@live.wsu.ac.kr)



Keywords: Surface roughness, Artificial intelligence, Deep learning, Evaluation method, Light and shade

1. Introduction

Industry 4.0, or the Fourth Industrial Revolution, represents a paradigm shift in all manufacturing industries through the introduction of smart manufacturing systems. Industry 4.0 revolves around IoT, big data, and artificial intelligence (AI). Because of this, an intelligent system is preferred over a simple automated system in the manufacturing sector. Factories with such intelligent systems are called smart factories [1]. However, as of 2019, the extent of smart factory adoption in precision manufacturing industries shows that many companies still rely on simple automated systems.

Many advanced manufacturing countries, such as Japan, Germany, the United States, China, and Korea, have begun strategically preparing for a smooth transition to smart factories. Data, technology, processes, people, and security are the major components in establishing a successful smart factory [2-4]. The core areas of smart industry are shown in Fig. 1.

Germany has implemented Information and Communication Technology (ICT) in the manufacturing industry to control all processes of manufacturing, procurement, logistics, and service [5]. Smart factories not only support efficient production but also create an environment for innovation. For the smooth functioning of smart factories, all the assets should be able to communicate with a central control system. Such communication requires the implementation of various digital and physical technologies, such as additive manufacturing, robotics, advanced materials, and AI. Another important characteristic of a smart factory is autonomously running processes and making intelligent decisions based on various AI technologies.

There is a lack of autonomy in the quality inspection of the surface roughness of precision manufactured products, which can be attributed to factors such as the cost of introducing inspection facilities and of training inspectors [6,7]. In general, surface roughness measurements are categorized into contact and non-contact methods. Contact methods use a stylus-type probe that directly touches the surface of precision products. Such processes generally have a higher unit cost, can damage the surface through stylus pressure, and require considerable inspection time. Contactless quality inspection tools, such as coherence scanning interferometers, focus variation instruments, and chromatic confocal microscopes, are relatively free from problems such as stylus wear and damage to the measured object. However, the use of these non-contact methods is limited by inconvenient operation, high investment costs, an inflexible measurement environment, limits on the size of the measured object, and processing time.

Computer vision (CV) is a technology used for building artificially intelligent systems that obtain information from images of objects or multi-dimensional data to provide visual abilities to smart devices. Deep learning is a class of machine learning algorithms that use neural networks with many layers, and it has given a huge boost to computer vision. Backed by deep learning technologies, CV has found many new applications, such as face recognition and the classification of video, images, text, and voice [8,9]. CV backed by deep learning has various advantages, such as high accuracy, high-speed measurement and categorization, low cost, flexibility, non-contact operation, and extraction of large amounts of information. Additionally, supplementary information obtained from images through mathematical calculations enables more accurate prediction in recognition and classification.

This paper proposes a system for evaluating and classifying surface roughness using object surface shade distribution. The goal is to help the manufacturing industries make a smooth transition to smart factories by overcoming the limitations of traditional manufacturing industries. The system uses CV and deep learning technologies. Surface roughness measurement is one of the quality inspection steps for precision manufactured products. The newly developed method could possibly be deployed in production assembly lines to efficiently classify product surface quality in almost real time. This could prove economical in both time and cost.

Fig. 1. Cores of smart industry.
../../Resources/ieie/IEIESPC.2021.10.3.189/fig1.png

2. Configuration and Design of the System

2.1 Background

When light is incident on an object, the characteristics of the reflected light are affected by differences in surface reflection, such as diffusion or blocking of light, which result in shading. The details of shading are described by the Cook-Torrance reflection model [10], which represents the reflection of light from real objects in computer graphics. The Cook-Torrance reflection model is based on Physically Based Rendering (PBR) [11]. The model is a shader that represents the surface of an object as microfacets too small to be observed by the naked eye and then simulates the reflection from them.

The reflection conditions determined by the microfacets can be expressed by Eqs. (1) and (2):

(1)
$$I_{r}=I_{ia}R_{a}+\sum_{l}I_{il}\left(\mathbf{n}\cdot \mathbf{l}\right)d\omega_{il}\left(sR_{s}+dR_{d}\right)$$
(2)
$$R_{s}=\frac{D\,G\,F}{4\left(\mathbf{n}\cdot \mathbf{l}\right)\left(\mathbf{n}\cdot \mathbf{v}\right)}$$

where $I_{r}$ is the intensity of the reflected light, $I_{ia}$ is the intensity of the incident ambient light, $R_{a}$ is the ambient reflectance, and $I_{il}$ is the average intensity of the incident light from light source $l$. $\mathbf{n}$ is the unit surface normal, $\mathbf{l}$ is the unit vector in the direction of the light, and $\mathbf{v}$ is the unit vector in the direction of the viewer. $d\omega_{il}$ is the solid angle of the beam of incident light, $s$ and $d$ are the fractions of the reflectance that are specular and diffuse, respectively, $R_{s}$ is the specular bidirectional reflectance, and $R_{d}$ is the diffuse bidirectional reflectance. In Eq. (2), $D$ is the facet slope distribution function, $G$ is the geometrical attenuation factor, and $F$ is the reflectance of a perfectly smooth surface. Fig. 2 compares surface reflection as $G$ varies, and the difference is visually distinct. The image pixel standard deviations (${\sigma}$) and mean values (${\mu}$) vary accordingly, as shown in Table 1.

Table 1. Comparison of pixel standard deviation and mean value of the histograms in Fig. 2.

             Fig. 2(a)    Fig. 2(b)
Std. Dev.    7.103        26.168
Mean         158.573      174.439

Table 2. Separation of labeling classes based on Korean and ISO standards.

Class      Range of Ra Value (µm)    Machining type    Symbol    Roughness N ISO Grade
Class 1    6.3-25                    Rough             ▽         N10 - N11
Class 2    1.6-6.3                   Coarse            ▽▽        N8 - N9
Class 3    0.2-1.6                   Smooth            ▽▽▽       N5 - N7
N/A        0-0.2                     Precise           ▽▽▽▽      N1 - N4

Table 3. Sample size for each class.

Class      Number of samples
Class 1    86
Class 2    118
Class 3    101

Table 4. Specifications of the ViTiny UM12 camera.

Magnification         10X to 280X
Lens & CMOS sensor    5M pixels
Working distance      9.6 mm to 140 mm
Light source          8 white LED lights
Metal steel stand     360° rotation and height adjustment

Fig. 2. Comparison of surface reflection by G value change.
../../Resources/ieie/IEIESPC.2021.10.3.189/fig2.png
Fig. 3. Mitutoyo stylus SJ-210.
../../Resources/ieie/IEIESPC.2021.10.3.189/fig3.png
Fig. 4. Surface roughness measurement using a stylus.
../../Resources/ieie/IEIESPC.2021.10.3.189/fig4.png

2.2 Surface Roughness Measurement and Sample Classifications

Arithmetic mean deviation (Ra) values were measured using a Mitutoyo SJ-210 surface roughness tester following the ISO 4287:1997 standard [12]. Fig. 3 shows the Mitutoyo SJ-210 stylus (0.75 mN measuring force, 60$^{\circ}$ tip angle, 2 ${\mu}$m tip radius), which was used to measure the surface roughness of the paper samples.

The stylus of the roughness tester is mechanically drawn across the surface over the sample length and provides various roughness parameters, as shown in Fig. 4. Rmax is the largest difference between peaks and valleys in an individual sample length, Ra is the average roughness over the sampling length, and Rt is the distance between the highest peak and lowest valley over the sampling length. The Ra values for the sample paper surfaces ranged from 0.325 ${\mu}$m to 24.709 ${\mu}$m.

Labels were divided into three categories following the ISO 1302:2002 standard [13], as shown in Table 2. The precision machining category was excluded from this study because of insufficient samples. All the categorical features in the training data were encoded as one-hot numeric arrays using the function $\texttt{sklearn.preprocessing.OneHotEncoder()}$, as sketched below. The number of images in each class is shown in Table 3.
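As a minimal sketch (the label strings are illustrative assumptions), the one-hot encoding step might look like this:

```python
# Minimal sketch: one-hot encoding the three roughness class labels with
# scikit-learn; the label strings are illustrative assumptions.
import numpy as np
from sklearn.preprocessing import OneHotEncoder

labels = np.array([["Class 1"], ["Class 2"], ["Class 3"], ["Class 2"]])

encoder = OneHotEncoder(sparse_output=False)  # use sparse=False on older scikit-learn
one_hot = encoder.fit_transform(labels)
print(one_hot[1])  # -> [0. 1. 0.] for "Class 2"
```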

2.3 Image Sampling and Image Color Space Conversion

Images of the paper samples were taken using a ViTiny UM12 microscope camera at a magnification of 18X in a fixed-focus environment. The images were taken and stored at a resolution of 640${\times}$480 pixels. The detailed specifications of the ViTiny UM12 microscope camera are shown in Table 4.

This study uses paper as the roughness samples for deep learning, as paper samples are relatively easy to acquire and come in various surface qualities, textures, and materials. The paper samples were from Samwon Paper Ltd. and Doosung Paper Co., Ltd. A total of 305 paper samples were selected to extract adequate feature information for deep neural network (DNN) training.

To prevent interference from ambient light, a custom darkroom (420mm ${\times}$ 297mm ${\times}$ 350mm) was built for taking images from the microscopic camera, as shown in Fig. 5. To collect shading information based on the surface roughness of the sample, a light source was installed normal to the surface of the object. A digital microscope camera for photographing the sample surface was placed within the field of view (FOV) adjacent to the light source. Images taken in these environments were sent to the learning data acquisition system via OpenCV, a popular CV library [14].

The default format of the images taken by the camera was JPG. JPG images are stored in RGB color space, which is based on the sensitivity of the color detection cells in the human visual system. However, YCbCr color space is often used in digital image processing to take advantage of the human visual system's lower sensitivity to color detail than to luminance. Thus, conversion from RGB to YCbCr color space is widely used in image processing.

Each pixel of an image in RGB format varies in intensity from 0 to 255, with 0 representing black and 255 representing white. RGB color space can be converted to YCbCr using Eq. (3), which can be implemented with the function $\texttt{cv2.cvtColor(input\_image, cv2.COLOR\_BGR2YCrCb)}$ in OpenCV.

(3)
$$Y=16+\frac{65.738R}{256}+\frac{129.057G}{256}+\frac{25.064B}{256}$$
$$Cb=128-\frac{37.945R}{256}-\frac{74.494G}{256}+\frac{112.439B}{256}$$
$$Cr=128+\frac{112.439R}{256}-\frac{94.154G}{256}-\frac{18.285B}{256}$$

In YCbCr color space, the Y channel contains the luminance (intensity) component, while Cb and Cr carry the color information. As humans and neural networks are more sensitive to luminance changes, the Y channel information suffices for deep learning, and the Cb and Cr channel information is generally not required. When comparing the images to the average roughness (Ra) measured by the stylus, a distinction can be seen. In Fig. 6, the lower group has a rougher surface than the upper group, which is indicated by the data and is also visually noticeable. Histograms of these results are shown in Fig. 6(b).
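As an illustration, the Eq. (3) conversion and the Y-channel extraction can be performed in OpenCV roughly as follows (the file name is an assumption; note that OpenCV reads images in BGR channel order, and its YCrCb conversion places Y first but swaps the two chroma channels):

```python
# Minimal sketch of the Eq. (3) conversion: load an image, convert it to
# YCrCb with OpenCV, and keep only the luminance (Y) channel.
import cv2

img_bgr = cv2.imread("sample_surface.jpg")              # illustrative file name
img_ycrcb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2YCrCb)  # BGR -> Y, Cr, Cb
y_channel = img_ycrcb[:, :, 0]                          # luminance only
```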

The shading of the surface when illuminating the sample with the light source shows a significant relation to the surface roughness of the sample. Hence, quantitative indicators such as standard deviation, mean value, and histograms were extracted from the Y channel of the images. The detailed procedure is explained in the next section.

Fig. 5. A custom-built darkroom for imaging in a controlled environment.
../../Resources/ieie/IEIESPC.2021.10.3.189/fig5.png
Fig. 6. Representative sample images and histograms for training the deep learning model: (a) comparison of roughness samples in YCbCr images; (b) comparison of roughness sample histograms; (c) comparison of roughness samples in original images.
../../Resources/ieie/IEIESPC.2021.10.3.189/fig6.png
Fig. 7. Classification based on the KNN classifier: (a) original data distribution; (b) failure case of KNN cluster 1; (c) failure case of KNN cluster 2.
../../Resources/ieie/IEIESPC.2021.10.3.189/fig7.png
Fig. 8. A typical LSTM cell.
../../Resources/ieie/IEIESPC.2021.10.3.189/fig8.png

2.4 Dataset

An image is essentially an array of pixel data, and NumPy [15] was used to manipulate the images. NumPy (np) is a fundamental package for scientific computing in the Python programming environment and contains a large number of functions for mathematical operations. The Y channel, standard deviation, mean, and roughness class based on Ra values were taken into consideration while creating a deep learning model to evaluate the surface roughness of the samples.

The standard deviation and mean values were calculated using the NumPy functions np.std() and np.mean() on images composed of only Y channel information. For ease of training, the images were resized to 480${\times}$480 pixels. The LabelEncoder() and OneHotEncoder() functions from scikit-learn were used to encode the class labels, and the resulting arrays were saved in *.npy format, the standard binary file format in NumPy.
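A minimal sketch of this dataset-building step might look as follows, assuming y_channel is the luminance array from the conversion sketch above:

```python
# Minimal sketch of the dataset-building step: resize the Y channel,
# compute its statistics and histogram, and save NumPy binaries.
import cv2
import numpy as np

y_resized = cv2.resize(y_channel, (480, 480))   # resize for ease of training

std_dev = np.std(y_resized)                     # pixel standard deviation
mean_val = np.mean(y_resized)                   # pixel mean
hist = cv2.calcHist([y_resized], [0], None, [256], [0, 256])  # luminance histogram

np.save("y_image.npy", y_resized)               # file names are assumptions
np.save("features.npy", np.array([std_dev, mean_val]))
```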

2.5 Computer Software and Hardware

The deep learning model training was performed on a dedicated workstation with 64 GB of RAM, 2 ${\times}$ GeForce RTX 2080 Ti FTW3 GPUs, and an AMD Ryzen 3960X processor. The workstation ran Ubuntu 18.04 LTS and Python 3.7.

2.6 Model Design

Given measured surface roughness values, the k-nearest neighbors (KNN) algorithm [16] can be applied using $\texttt{sklearn.neighbors.KNeighborsClassifier()}$ to classify surface roughness. Similarly, histogram comparison using the OpenCV function $\texttt{cv2.compareHist()}$ can be used as a similarity check when a color histogram is provided. However, both methods have limitations. KNN needs measured roughness values for classification, which is time-consuming, expensive, and a bottleneck in categorizing surface roughness. Also, if two neighbors k+1 and k have identical roughness values but different labels, the result depends on the ordering of the training data, making it very difficult to correctly predict the class; failure cases are shown in Fig. 7. Histogram comparison supports various comparison metrics, but the values can differ under other lighting conditions.
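A hedged sketch of these two baselines follows; all data values and variable names are illustrative assumptions:

```python
# Sketch of the two baselines discussed above: KNN on measured Ra values
# and OpenCV histogram comparison; data values are illustrative stand-ins.
import cv2
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# KNN requires measured roughness values, the bottleneck noted above.
ra_values = np.array([[0.4], [2.1], [12.5]])          # illustrative Ra measurements
classes = ["Class 3", "Class 2", "Class 1"]
knn = KNeighborsClassifier(n_neighbors=1).fit(ra_values, classes)
print(knn.predict([[1.8]]))                           # -> ['Class 2']

# Histogram similarity is sensitive to lighting changes.
img_a = np.random.randint(0, 256, (480, 480), dtype=np.uint8)  # stand-in images
img_b = np.random.randint(0, 256, (480, 480), dtype=np.uint8)
h_a = cv2.calcHist([img_a], [0], None, [256], [0, 256])
h_b = cv2.calcHist([img_b], [0], None, [256], [0, 256])
score = cv2.compareHist(h_a, h_b, cv2.HISTCMP_CORREL)  # correlation metric
```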

To overcome these limitations, the Keras functional API was used to build a composite deep learning model. The core of deep learning is Artificial Neural Networks (ANNs), which are versatile, powerful, and scalable. This makes them ideal for tackling large and highly complex machine learning tasks such as image classification, speech recognition, machine translation, and recommender systems [17]. The Keras functional API supports complex models with multiple inputs and modalities, offering much more flexibility than a sequential model. This work implements a composite model combining a Convolutional Neural Network (CNN) and a bidirectional Long Short-Term Memory (LSTM) network for roughness classification. The inputs to the deep learning model are the image histogram and the standard deviation and mean values of the luminance of the image.

A CNN is a type of ANN inspired by the partially connected structures of the human visual nervous system. It consists of convolution and pooling layers rather than only fully connected layers, which distinguishes it from a conventional ANN [18]. The convolution and pooling layers extract and compress features from the input data, respectively, and this structure enhances training as data passes through the CNN layers.

Eq. (4) gives the output of a given neuron in a convolution layer:

(4)
$$ z_{i,j,k}=b_{k}+\sum_{u=0}^{f_{h}-1}\sum_{v=0}^{f_{w}-1}\sum_{k'=0}^{f_{n'}-1}x_{i',j',k'}\cdot w_{u,v,k',k}\quad \text{with}\quad \left\{\begin{array}{l} i'=i\times s_{h}+u \\ j'=j\times s_{w}+v \end{array}\right. $$

where $z_{i,j,k}$ is the output of the neuron located in row $i$ and column $j$ of feature map $k$ in the convolution layer (layer $l$). $s_{h}$ and $s_{w}$ are the vertical and horizontal strides, $f_{h}$ and $f_{w}$ are the height and width of the receptive field, and $f_{n'}$ is the number of feature maps in the previous layer ($l$-1). $x_{i',j',k'}$ is the output of the neuron located in layer $l$-1, row $i'$, column $j'$, feature map $k'$; $b_{k}$ is the bias term; and $w_{u,v,k',k}$ is the connection weight between any neuron in feature map $k$ of layer $l$ and its input at offset ($u$, $v$) in feature map $k'$.

LSTM is a model that addresses the long-term dependency problems of conventional Recurrent Neural Networks (RNNs) by introducing input, output, and forget gates, which change the training structure [19]. Each gate controls specific operations, allowing the model to identify the core of the training data and remember its content longer. Fig. 8 shows a typical LSTM cell, and Eq. (5) summarizes its output at each time step.

(5)
$$ \begin{array}{l} \mathrm{i}_{(t)}=\sigma\left(\mathrm{W}_{x i}{ }^{\mathrm{T}} \mathrm{x}_{(t)}+\mathrm{W}_{h i}{ }^{\mathrm{T}} \mathrm{h}_{(t-1)}+\mathrm{b}_{i}\right) \\ \mathrm{f}_{(t)}=\sigma\left(\mathrm{W}_{x f}{ }^{\mathrm{T}} \mathrm{x}_{(t)}+\mathrm{W}_{h f}{ }^{\mathrm{T}} \mathrm{h}_{(t-1)}+\mathrm{b}_{f}\right) \\ o_{(t)}=\sigma\left(\mathrm{W}_{x o}{ }^{\mathrm{T}} \mathrm{x}_{(t)}+\mathrm{W}_{h o}{ }^{\mathrm{T}} \mathrm{h}_{(t-1)}+\mathrm{b}_{o}\right) \\ g_{(t)}=\tanh \left(\mathrm{W}_{x g}{ }^{\mathrm{T}} \mathrm{x}_{(t)}+\mathrm{W}_{h g}{ }^{\mathrm{T}} \mathrm{h}_{(t-1)}+\mathrm{b}_{g}\right) \\ c_{(t)}=f_{(t)} \otimes c_{(t-1)}+i_{(t)} \otimes g_{(t)} \\ y_{(t)}=h_{(t)}=o_{(t)} \otimes \tanh \left(c_{(t)}\right) \end{array} $$
Table 5. Details of deep learning conditions.

Hyperparameter                     Min    Max    Step    Default
Number of conv layers (Conv2D)     2      8      -       6
Number of filters (Conv2D)         32     256    32      64
Number of layers (LSTM)            2      8      -       6
Units (LSTM)                       32     512    32      64
Optimizer                          Adam, RMSprop, SGD (default: Adam)
Max trials                         20
Executions per trial               3

Fig. 9. The detailed architecture of the composite CNN and bidirectional LSTM used in the study.
../../Resources/ieie/IEIESPC.2021.10.3.189/fig9.png

c$_{\mathrm{(t)}}$ and c$_{\mathrm{(t-1)}}$ are the long-term states at time steps t and t-1, h$_{\mathrm{(t)}}$ and h$_{\mathrm{(t-1)}}$ are the short-term states at time steps t and t-1, and x$_{\mathrm{(t)}}$ is the current input vector. Similarly, W$_{\mathrm{xi}}$, W$_{\mathrm{xf}}$, W$_{\mathrm{xo}}$, W$_{\mathrm{xg}}$, W$_{\mathrm{hi}}$, W$_{\mathrm{hf}}$, W$_{\mathrm{ho}}$, and W$_{\mathrm{hg}}$ are the weight matrices of each of the four layers for their connections to the input vector x$_{\mathrm{(t)}}$ and the previous short-term state h$_{\mathrm{(t-1)}}$, respectively. b$_{\mathrm{i}}$, b$_{\mathrm{f}}$, b$_{\mathrm{o}}$, and b$_{\mathrm{g}}$ are the bias terms for each of the four layers.

A wrapper layer, $\texttt{Bidirectional()}$, was used in the model. Bidirectional LSTMs (Bi-LSTMs) get the most out of an input sequence by stepping through its time steps in both the forward and backward directions [20]. Bi-LSTMs train two LSTMs on the input sequence instead of one.

In machine learning, the parameters that control the learning process are known as hyperparameters [21]. Although there are numerous hyperparameters, only five were optimized in the proposed deep learning model. Before performing hyperparameter tuning, the search space was defined. The details of the parameters (name, minimum, maximum, step, and default values) are presented in Table 5.

Hyperparameter tuning was carried out using the $\texttt{RandomSearch()}$ function in the Keras Tuner [22] library. The time needed to find the best hyperparameters depends on the available computing resources. In the current study, the search took about nine hours on 2 ${\times}$ GeForce RTX 2080 Ti FTW3 GPUs. The best accuracy achieved from the $\texttt{get\_best\_models()}$ function was 84.6%.
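A hedged sketch of this search over the Table 5 space follows; the hypermodel below is simplified to the histogram branch only, and all data arrays are illustrative assumptions:

```python
# Sketch of hyperparameter search with Keras Tuner's RandomSearch over the
# Table 5 space; the hypermodel and data are simplified, illustrative stand-ins.
import numpy as np
import keras_tuner as kt  # pip install keras-tuner
from tensorflow import keras
from tensorflow.keras import layers

def build_model(hp):
    inp = keras.Input(shape=(256, 1))                  # 256-bin histogram input
    x = inp
    for i in range(hp.Int("lstm_layers", 2, 8, default=6)):
        x = layers.Bidirectional(layers.LSTM(
            hp.Int(f"units_{i}", 32, 512, step=32, default=64),
            return_sequences=True))(x)
    x = layers.GlobalAveragePooling1D()(x)
    out = layers.Dense(3, activation="softmax")(x)     # three roughness classes
    model = keras.Model(inp, out)
    model.compile(optimizer=hp.Choice("optimizer", ["adam", "rmsprop", "sgd"],
                                      default="adam"),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model

tuner = kt.RandomSearch(build_model, objective="val_accuracy",
                        max_trials=20, executions_per_trial=3,
                        directory="tuning", project_name="roughness")

x = np.random.rand(32, 256, 1)                         # dummy histograms
y = keras.utils.to_categorical(np.random.randint(0, 3, 32), 3)
tuner.search(x, y, epochs=2, validation_split=0.2)
best_model = tuner.get_best_models(num_models=1)[0]
```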

The CNN portion of the model used the $\texttt{Conv2D()}$ function for convolution operations, $\texttt{MaxPooling()}$ for dimensionality reduction, and $\texttt{Dropout()}$ to prevent the model from overfitting. The LSTM portion consists of a multilayered bidirectional LSTM and a $\texttt{Dropout()}$ layer to prevent overfitting. After concatenating the CNN and LSTM branches, a $\texttt{softmax}$ activation was used in the last layer, and the $\texttt{categorical\_crossentropy}$ loss function was used because the study involves a multiclass classification task. The detailed steps of model building and a comprehensive flow chart are shown in Fig. 10.
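A minimal sketch of such a composite model with the Keras functional API is shown below; the branch shapes and layer sizes are illustrative assumptions, and the exact architecture is available in the paper's GitHub repository:

```python
# Sketch of a composite CNN + bidirectional LSTM built with the Keras
# functional API; shapes and layer sizes are illustrative assumptions.
from tensorflow import keras
from tensorflow.keras import layers

# CNN branch over the 480x480 Y-channel image.
img_in = keras.Input(shape=(480, 480, 1), name="y_image")
x = layers.Conv2D(64, 3, activation="relu")(img_in)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(64, 3, activation="relu")(x)
x = layers.MaxPooling2D()(x)
x = layers.Dropout(0.3)(x)                     # guards against overfitting
x = layers.Flatten()(x)

# Bidirectional LSTM branch over the 256-bin luminance histogram,
# treated as a sequence of length 256 with one feature per step.
hist_in = keras.Input(shape=(256, 1), name="histogram")
h = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(hist_in)
h = layers.Bidirectional(layers.LSTM(64))(h)
h = layers.Dropout(0.3)(h)

# Scalar branch for the standard deviation and mean values.
stats_in = keras.Input(shape=(2,), name="std_mean")

merged = layers.concatenate([x, h, stats_in])
out = layers.Dense(3, activation="softmax")(merged)  # three roughness classes

model = keras.Model([img_in, hist_in, stats_in], out)
```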

Although various numbers of epochs were tried, 30 epochs were used (based on the minimum validation and training error) before saving the model in .h5 format; overfitting occurred when training the model for more than 35 epochs. The batch size was 4, and the RMSprop optimizer was used with an input image size of 480${\times}$480 pixels.
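Continuing the sketch above, this training configuration might be expressed as follows (the input arrays and file name are assumptions):

```python
# Sketch of the training configuration: RMSprop, 30 epochs, batch size 4,
# then saving in HDF5 format; `model` and the arrays come from the sketches above.
model.compile(optimizer="rmsprop",
              loss="categorical_crossentropy", metrics=["accuracy"])
history = model.fit([x_img, x_hist, x_stats], y_onehot,
                    epochs=30, batch_size=4, validation_split=0.2)
model.save("roughness_model.h5")   # illustrative file name
```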

3. Results and Discussion

The code for the CNN + LSTM composite model was written in Python using TensorFlow's implementation of the Keras high-level API. The complete code, including data creation, data pre-processing, model design, and validation, was uploaded to a GitHub repository and can be found at https://github.com/thebinayak/New_Surface_Roughness_Classification_Method. Out of 305 images, 80% (244 images) were used as training data, and the remaining 20% (61 images) were used for testing the model. The training and test data were split using the $\texttt{train\_test\_split()}$ function from the scikit-learn package. An additional 27 fresh images were used as a validation dataset for an unbiased evaluation; these images were completely new and were not used to train or test the model.
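A minimal sketch of the 80/20 split (the file names and random seed are assumptions):

```python
# Sketch of the 80/20 train/test split with scikit-learn.
import numpy as np
from sklearn.model_selection import train_test_split

images = np.load("y_images.npy")   # 305 Y-channel images (assumed file)
labels = np.load("labels.npy")     # matching class labels (assumed file)

x_train, x_test, y_train, y_test = train_test_split(
    images, labels, test_size=0.2, random_state=42)  # seed is illustrative
```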

Fig. 10. Flow chart of the complete deep learning process.
../../Resources/ieie/IEIESPC.2021.10.3.189/fig10.png
Fig. 11. Accuracy of a normal model.
../../Resources/ieie/IEIESPC.2021.10.3.189/fig11.png

The model produced a validation accuracy of 85.246%, as shown in Fig. 11. In addition to the proposed composite model, several other models were also trained and validated for comparison. Simple CNN-only or LSTM-only models showed a validation accuracy of no more than 74%, significantly lower than the composite CNN + LSTM model, so they were discarded. Trials were also performed on images in RGB format, which showed poor performance compared to the YCbCr color space. This is because YCbCr color space separates luminance from chrominance more effectively than other color spaces.

The new model was built using the hyperparameter values acquired from the $\texttt{best\_model.summary()}$ function after hyperparameter tuning. The validation scores from both models were comparable. The fresh validation data (n=27) were evaluated using the $\texttt{model.predict()}$ method, which gave 85.185% accuracy, close to the validation accuracy.

Fig. 12. Precision and recall.
../../Resources/ieie/IEIESPC.2021.10.3.189/fig12.png
Fig. 13. Comparing ROC curves and AUC score.
../../Resources/ieie/IEIESPC.2021.10.3.189/fig13.png

There are several performance evaluation measures for binary classification problems. These include cross-validation, the confusion matrix, precision and recall, and the receiver operating characteristic (ROC) curve. Binary classifiers distinguish between two classes only; however, multiclass classifiers can distinguish between more than two classes. Two of the commonly used strategies to classify multiclass problems using multiple binary classifiers are the one-versus-the-rest (OvR) and the one-versus-one (OvO) strategies.

In this study, we implemented the OvR strategy for performance evaluation, as it is generally preferred over OvO [23]. Precision can be extended to multiclass classification by binarizing the output, such as ``Label Y'' vs. ``not Label Y''. Precision (P) is defined as the number of true positives (T$_{\mathrm{p}}$) over the number of true positives plus the number of false positives (F$_{\mathrm{p}}$). Similarly, recall (R) is defined as the number of true positives (T$_{\mathrm{p}}$) over the number of true positives plus the number of false negatives (F$_{\mathrm{n}}$). Precision and recall are given in Eqs. (6) and (7).

(6)
$$P=\frac{T_{p}}{T_{p}+F_{p}}$$
(7)
$$R=\frac{T_{p}}{T_{p}+F_{n}}$$

Average precision (AP) summarizes the precision-recall plot as the weighted mean of the precision achieved at each threshold, with the increase in recall from the previous threshold used as the weight:

(8)
$AP=\sum _{n}\left(R_{n}-R_{n-1}\right)P_{n}$

where P$_{\mathrm{n}}$ and R$_{\mathrm{n}}$ are the precision and recall at the nth threshold. Fig. 12 shows the precision-recall graph as well as the average precision score. It can be seen that the average precision score of class 3 is 1.00.

The receiver operating characteristic (ROC) curve is another metric used in classification. The ROC curve plots the true positive rate (TPR) against the false positive rate (FPR) and was computed using scikit-learn's $\texttt{roc\_curve()}$ function. Similarly, the area under the curve (AUC) score, computed with scikit-learn's $\texttt{roc\_auc\_score()}$ function, is used to compare classifiers. It can be seen in Fig. 13 that the class 3 curve is closest to the top-left corner and has the greatest AUC. The dashed line in the figure represents the ROC curve of a purely random classifier.
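A hedged sketch of this OvR evaluation follows; the true labels and predicted scores below are random stand-ins for the 27 validation samples:

```python
# Sketch of the OvR evaluation: per-class average precision (Eq. 8),
# ROC curve points, and AUC scores; inputs are random stand-ins.
import numpy as np
from sklearn.preprocessing import label_binarize
from sklearn.metrics import average_precision_score, roc_curve, roc_auc_score

y_test = np.random.randint(0, 3, 27)                 # placeholder true classes
y_score = np.random.rand(27, 3)                      # placeholder class scores
y_true = label_binarize(y_test, classes=[0, 1, 2])   # one column per class (OvR)

for c in range(3):
    ap = average_precision_score(y_true[:, c], y_score[:, c])
    fpr, tpr, _ = roc_curve(y_true[:, c], y_score[:, c])
    auc = roc_auc_score(y_true[:, c], y_score[:, c])
    print(f"class {c + 1}: AP={ap:.3f}, AUC={auc:.3f}")
```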

4. Conclusion

Quality control (QC) is an important part of every production process. In the Industry 4.0 era, finer surface roughness and tighter tolerances are necessary for companies to compete in selling similar products. Surface inspection is an important step after processes such as coating, printing, polishing, machining, and forming. Due to increasing labor costs and high competition, companies are looking for methods of checking surface quality that are fast and capable of analyzing large batches with little added effort.

A deep learning model was developed for classifying surface roughness using the light and shade composition of the surface. The developed method can be deployed in manufacturing industries producing precision-machined components to quickly assess the quality of precision processing. These companies can shift to a smart factory model that promotes low-cost, high-efficiency factory operations by applying IoT technology to manufacturing plants.

To train the proposed system, a dataset was constructed in which luminance components were extracted from 305 surface roughness sample images to calculate the standard deviations and mean values of the surface shading; histograms were also produced. Training was performed on an artificial neural network structure comprising the composite CNN + LSTM model. The dataset was divided at a ratio of 8:2 into training data and held-out test data. The model achieved significant generalization, with a difference of less than 1% between the accuracy on training and non-training data. The accuracy for non-training data was 81.967%, and when re-evaluation was conducted after optimization through hyperparameter tuning, the accuracy was 85.185%.

The method developed in this study could enhance the speed of the surface quality control process compared to the stylus method or contactless laser methods. This could facilitate the introduction of smart factory systems in the precision processing industry, as CV and deep learning models decrease the initial cost of equipment and relax environmental and structural limitations. The system can also be deployed in an automated process with optimal utilization of space, improved productivity, and labor cost savings compared to existing quality assessment methods, as it can be installed directly in the automation process with relatively few installation constraints.

5. Limitations and Future Study

Although the results from the proposed method are promising, the present study has some limitations. The surface roughness was classified into only three classes, which is sufficient for many industries; to match the classes to the ISO standard grades, more classes must be implemented. Because of the multi-step process of generating roughness data, the amount of training data per class is small, and higher accuracy can be expected with a bigger dataset.

This work is novel in that roughness measurement was previously done using expensive hardware or complicated methods. The work presented could assist small and medium-sized industries that cannot afford expensive high-end hardware. Deployment of the proposed method in a real production and quality control environment could give further insight into how to improve the method, and future studies will focus on this aspect.

ACKNOWLEDGMENTS

This research was supported by 2020 Woosong University Academic Research Funding.

REFERENCES

1 
Yun I. C., 2019, A Strategy of Application for Smart Factories to the Precision Machining Industry, Korea Smart Manufacturing Industry Association, Republic of Korea.
2 
Bhandari B., Lee M., 2019, Haptic identification of objects using tactile sensing and computer vision, Advances in Mechanical Engineering, Vol. 11, No. 4.
3 
Kim D. H., Kim T. J. Y., Wang X., et al., 2018, Smart Machining Process Using Machine Learning: A Review and Perspective on Machining Industry, Int. J. of Precis. Eng. and Manuf.-Green Tech., Vol. 5, pp. 555-568.
4 
Lee G., Kim M., Quan Y., et al., 2018, Machine health management in smart factory: A review, J. Mech. Sci. Technol., Vol. 32, pp. 987-1009.
5 
Rojko A., 2017, Industry 4.0 concept: background and overview, International Journal of Interactive Mobile Technologies (iJIM), Vol. 11, No. 5, pp. 77-90.
6 
Kim M. H., Jung S. H., Lee C. G., 2019, Effect of Smart Factory Adoption and Policy Implications, Korea Development Institute.
7 
Jeong Y.-S., 2019, A Model Design for Enhancing the Efficiency of Smart Factory for Small and Medium-Sized Businesses Based on Artificial Intelligence, Journal of Convergence for Information Technology, Vol. 9, No. 3, pp. 16-21.
8 
Szeliski R., 2010, Computer Vision: Algorithms and Applications, Springer Science & Business Media.
9 
Deng L., Yu D., 2014, Deep Learning: Methods and Applications, Foundations and Trends in Signal Processing, Vol. 7, No. 3-4, pp. 197-387.
10 
Cook R. L., Torrance K. E., 1982, A Reflectance Model for Computer Graphics, ACM Transactions on Graphics, Vol. 1, No. 1, pp. 7-24.
11 
Pharr M., Jakob W., Humphreys G., 2016, Physically Based Rendering: From Theory to Implementation, Morgan Kaufmann.
12 
International Organization for Standardization, 1997, Geometrical Product Specifications (GPS) - Surface texture: Profile method - Terms, definitions and surface texture parameters (ISO Standard No. 4287:1997).
13 
International Organization for Standardization, 2002, Geometrical Product Specifications (GPS) - Indication of surface texture in technical product documentation (ISO Standard No. 1302:2002).
14 
Culjak I., Abram D., Pribanic T., Dzapo H., Cifrek M., 2012, A brief introduction to OpenCV, Proceedings of the 35th International Convention MIPRO, Opatija, pp. 1725-1730.
15 
Van der Walt S., Colbert S. C., Varoquaux G., 2011, The NumPy array: a structure for efficient numerical computation, Computing in Science & Engineering, Vol. 13, No. 2, pp. 22-30.
16 
Gallego A. J., Calvo-Zaragoza J., Valero-Mas J. J., Rico-Juan J. R., 2018, Clustering-based k-nearest neighbor classification for large-scale data with neural codes representation, Pattern Recognition, Vol. 74, pp. 531-543.
17 
Saito G., 2017, Deep Learning from Scratch, O'Reilly.
18 
Bradski G., Kaehler A., 2008, Learning OpenCV: Computer Vision with the OpenCV Library, O'Reilly Media Inc.
19 
Hochreiter S., Schmidhuber J., 1997, Long short-term memory, Neural Computation, Vol. 9, No. 8, pp. 1735-1780.
20 
Schuster M., Paliwal K. K., 1997, Bidirectional recurrent neural networks, IEEE Transactions on Signal Processing, Vol. 45, No. 11, pp. 2673-2681.
21 
Bergstra J., Bengio Y., 2012, Random search for hyper-parameter optimization, Journal of Machine Learning Research, Vol. 13, pp. 281-305.
22 
Chollet F., et al., 2019, Keras Tuner, GitHub repository.
23 
Géron A., 2019, Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems, 2nd edition, O'Reilly.

Author

Binayak Bhandari
../../Resources/ieie/IEIESPC.2021.10.3.189/au1.png

Binayak Bhandari is an assistant professor and director of the Smart Structure Design Lab at the Department of Railroad Engineering and Transport Management at Woosong University, Daejeon, Korea. He received his Ph.D. in mechanical and aerospace engineering from Seoul National University, Korea, in 2014. He has research contributions in the field of renewable energy systems design, smart materials, appropriate technology, advanced machining, design and optimization of railroad systems, machine learning, and deep learning.

Gijun Park
../../Resources/ieie/IEIESPC.2021.10.3.189/au2.png

Gijun Park is a research intern at the Smart Structure Design Laboratory at Woosong University.