
  1. (Department of Electronics and Communication Engineering, SRM Institute of Science and Technology / Kattankulathur-Chennai, Tamilnadu-603203, India {de0642, vijayakp}

Casting quality inspection, CNN, Industry 4.0, Machine learning

1. Introduction

Visual inspection of castings is essential to improving productivity in today’s industry. Finding defects in castings is time-consuming, and visual assessments are performed by experienced human inspectors. However, quality control by conventional visual techniques is tiring, error-prone, inefficient, and costly [9]. Hence, the traditional inspection process needs to be replaced with an automated one in order to deliver zero-defect products that meet high quality standards to customers. Quality control is a focal point of any modern industrial assembly line. This intricate process should be completed with a high level of accuracy and meticulousness [12]. Current industry requires advanced solutions to monitor automated production and to identify defective materials. Smart monitoring is a useful capability for industrial systems and machines, and is a necessary step towards automated production [1]. Machine learning (ML) algorithms are well-known techniques for handling the complex task of visual quality inspection of castings [10,11]. More manufacturers now prefer ML and deep learning (DL) techniques to automate product quality assurance in order to reduce cost and time [2].

Data acquired with vision equipment is used to detect and report defective items, to determine the causes of deficiencies, and to allow quick and effective interventions in modern industries [3]. The convolutional neural network (CNN) is the best technique for image classification, with rapid processing speed [7]; for industrial quality control, it extracts the features of items and, after training on a dataset, classifies them as faulty or sound [5]. A CNN with X-ray images, called multi-optical image fusion (MOIF), is a promising technology for improving performance, but training a model with this technique is a complex and expensive process [4,13]. Similarly, surface-defect detection is done with three-dimensional data obtained from stereoscopic vision, laser scanners, and spotted light measurement methods, combined with a CNN [14]. For detecting internal cracks from the surface, the BoDoC methodology or AdaBoost with a support vector machine (SVM) algorithm provides good precision, but the computation time is too long [6,15]. Hence, to identify faults in casting products, a CNN provides the best solution for automation.

The approach proposed in this paper uses a CNN to improve the accuracy of defect detection in casting products. The image dataset used in this work is an open-source online dataset for defect detection in casting products. It contains training and testing data in two classes: DEF and OK. Sample images from the dataset are shown in Fig. 1.

Fig. 1. Sample Casting Images.

2. Related Work

In this section, we briefly discuss the results of various approaches used to identify defects in castings. The following subsections describe methods that are combined with a CNN to detect defects on the surface of casting products.

2.1 MVGG-19

The multipath Visual Geometry Group network (MVGG19) builds on the hierarchical design of VGG19 and connects early and late convolutional blocks by adding extra paths for feature-map processing. In this approach, the outputs of three BD blocks (batch normalization, dropout, global average pooling) are concatenated and fed to a neural network of 2500 neurons.

Fig. 2 illustrates the architecture of MVGG19. The idea behind this modification is to detach early and late extracted features from the sequential design and connect them directly to the classifier at the top of the network. In this way, the total number of extracted features increases, and each path is responsible for transferring features from the early and late stages of image processing. The BD blocks therefore add no further convolution operations; they only transfer, normalize, and reduce the features [1].

Fig. 2. Multipath VGG19 [1].

Table 1. Summary of various defect detection approaches.

| Approach | Parameters measured | Result achieved |
|---|---|---|
| Visual motif discovery | Casting plate defects | Accuracy of 97.14% |
| CNN with a Visual Geometry Group multipath network (MVGG19) | Defect detection and object recognition in various industries (castings, tools, metal surfaces, magnetic tiles, solar cells, bridge cracks) | Defect detection accuracy of 97.88% for castings |
| Photometric stereo algorithm with a custom segmentation network | Material surface defects (nickel) | Accuracy of 95.60% for Ni surface defect detection |
| | Casting products | Validated with 400 products; achieved 98% accuracy |
| EfficientNet-B0 trained in a CNN for classification of products, and a Decision Tree algorithm for the prediction process | Casting plate defects | Classification accuracy of 96.88% |
| Data augmentation with Wasserstein Generative Adversarial Nets (WGANs); feature extraction through a CNN; multimodel ensemble to show the classified result | Cast austenitic stainless steel (CASS) | Classification accuracy from MLP is 98.54% at 400°C/10 kh, which is high compared to other classification methods like KNN and the SVM |

2.2 Motif Discovery Approach

This approach is combined with a CNN using the ResNet34 architecture. Fig. 3 shows the process flow of the motif discovery approach.

This structure has two steps.

· Images are converted into a time series.

The images are converted to a time series by using a histogram. The histogram captures the intensity distribution of the image: the x-axis represents the intensity value, and the y-axis represents the number of pixels that have that intensity value.

· To determine image motifs, use a time series.

In this approach, a three-column table is built with the help of the variable Euc. It stores the row, the column, and the Euclidean distance. Scount[i] then gives the number of image time series Si whose Euclidean distance is less than r [2].
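
The two steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the implementation from [2]: the function names, the exclusion of a series from its own count, and the tiny 4-level images in the usage note are all hypothetical.

```python
import math


def histogram_series(pixels, levels=256):
    """Turn a grayscale image (rows of integer intensities) into a
    histogram: index = intensity value, entry = number of pixels."""
    counts = [0] * levels
    for row in pixels:
        for value in row:
            counts[value] += 1
    return counts


def euclidean(s, t):
    """Euclidean distance between two equally long series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, t)))


def neighbour_counts(series, r):
    """For each series i, count the other series closer than r.
    Excluding i itself is an assumption made for this sketch."""
    counts = []
    for i, s in enumerate(series):
        close = sum(
            1 for j, t in enumerate(series)
            if j != i and euclidean(s, t) < r
        )
        counts.append(close)
    return counts
```

For example, with three hypothetical 2x2 images at 4 intensity levels, `neighbour_counts([histogram_series(img, 4) for img in images], r=1.0)` counts, for each image, how many other images have a near-identical intensity distribution.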

Fig. 3. The Motif Discovery Approach[2].

2.3 EfficientNet-B0

· This approach helps to increase accuracy at minimal computational cost. It uses a scaling method for neural networks called compound scaling. This processing is done at the preprocessing stage. Then, the preprocessed images are fed into a simple convolutional neural network.

· EfficientNet-B0 is the mobile-size baseline network used to evaluate the scaling method in convolutional networks. Table 2 shows the EfficientNet-B0 architecture [3].
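
For orientation, compound scaling is commonly written as below. The symbols are not defined in this paper and are assumed here from the original EfficientNet formulation: $d$, $w$, and $r$ scale network depth, width, and input resolution, $\phi$ is a single user-chosen compound coefficient, and $\alpha$, $\beta$, $\gamma$ are constants found by a small grid search on the baseline network.

$\begin{align*} &d=\alpha^{\phi},\quad w=\beta^{\phi},\quad r=\gamma^{\phi} \\ &\text{subject to}\quad \alpha\cdot\beta^{2}\cdot\gamma^{2}\approx 2,\quad \alpha\geq 1,\ \beta\geq 1,\ \gamma\geq 1 \end{align*}$

Under this constraint, increasing $\phi$ by one roughly doubles the computational cost, which is why all three dimensions are scaled together rather than one at a time.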

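Table 2. The EfficientNet-B0 baseline network [3].

| Stage (i) | Operator (Fi) | Resolution (Hi x Wi) | Channels (Ci) | Layers (Li) |
|---|---|---|---|---|
| 1 | Conv 3x3 | 224 x 224 | 32 | 1 |
| 2 | MBConv1, k3x3 | 112 x 112 | 16 | 1 |
| 3 | MBConv6, k3x3 | 112 x 112 | 24 | 2 |
| 4 | MBConv6, k5x5 | 56 x 56 | 40 | 2 |
| 5 | MBConv6, k3x3 | 28 x 28 | 80 | 3 |
| 6 | MBConv6, k5x5 | 14 x 14 | 112 | 3 |
| 7 | MBConv6, k5x5 | 14 x 14 | 192 | 4 |
| 8 | MBConv6, k3x3 | 7 x 7 | 320 | 1 |
| 9 | Conv 1x1 & pooling & FC | 7 x 7 | 1280 | 1 |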

2.4 Photometric Stereo Approach

The photometric stereo approach first recovers surface orientations and can be combined with an integration method to compute a height or depth map. Even without a subsequent integration step, the surface orientations can be used, for example, to determine the curvature parameters of object surfaces. To obtain photometric stereo images, the object is successively illuminated by several light sources, as illustrated in Fig. 4.

To inspect a component with this approach, four light sources are placed at different angles. Five photometric images are gathered: a curvature image, a texture image, two gradient images (X and Y), and a range image. These images are then processed by the CNN for image classification. The processing time to detect defects in a product's surface was 707 ms on a CPU.

Fig. 4. Illustration of the Photometric Stereo Approach.

2.5 Research Gap Identified

A literature review showed the capacity of a CNN for automated defect detection. It also showed the importance of image quality, which depends strongly on an adequate acquisition setup chosen according to the characteristics of the surface to be inspected. In particular, for specular surfaces, topographic information is highly relevant and is therefore an important source of information for detection. Moreover, processing time is critical when inspecting parts in high-production-rate settings. The following are the findings from the existing methods.

· Refinement is required to prepare a trained model to improve accuracy.

· Optimization is needed to improve the processing time of a system.

· Precision in defect identification can be improved with real-time augmented data.

· Robustness is required to train the model to reduce the false rejection rate from the input datasets.

To provide the best solution for the findings mentioned above, we propose a CNN with a DenseNet architecture for defect detection in casting products. We demonstrate the inspection system with large input image datasets to classify defective products. Through this, manufacturing industries can ensure high quality in their products.

3. The Proposed System

The proposed system uses a CNN for model preparation, with DenseNet to classify the images and detect defects using feature extraction. The best-trained CNN model is then used to predict whether casting products are defective or non-defective.

The best-trained model was prepared with 6633 training images and 695 testing images. The CNN uses a two-stage sequential process for model creation, and the model was compiled with the Adam optimizer. The casting image datasets were imported into Python from the local directory using TensorFlow’s Keras. Each training and testing image was rescaled from 512${\times}$512 pixels to 256${\times}$256 pixels, which reduces the processing time for identifying faults in the images.

The trained model is used to predict the classes of images that were not previously included in the training and validation process. The classifier outputs a probability score between 0 and 1, and a threshold of 0.5 is used to separate the classes. A probability score greater than or equal to this threshold is labeled defective; all other cases are OK.
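
The thresholding rule described above can be written as a short helper. This is a minimal sketch: the function name and the example scores are hypothetical, and only the 0.5 threshold and the DEF/OK labels come from the text.

```python
def classify(scores, threshold=0.5):
    """Map probability scores in [0, 1] to class labels.
    A score at or above the threshold is labeled defective (DEF)."""
    return ["DEF" if score >= threshold else "OK" for score in scores]


# Hypothetical scores for three images
labels = classify([0.10, 0.50, 0.93])
```

Note that a score exactly equal to the threshold falls in the defective class, which favours rejecting a borderline part over passing it.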

3.1 System Architecture

The overall process of the system architecture for casting quality visual inspection using CNN with DenseNet is shown in Fig. 5.

The process flow of the architecture is as follows.

1. Casting image datasets are imported from the local directory using the image data generator function to preprocess the images.

2. The model is prepared with the help of a CNN architecture with two stages.

3. Then, the CNN will have a database from the best-trained model to find defects in an image object.

4. Prediction of defective and non-defective datasets can be made through the linear regression process of the system.

5. Results are produced by importing the sample image data for testing from the corresponding directory.

Fig. 5. Overall system architecture for casting quality visual inspection with the CNN.

3.2 Data Preprocessing

The first and foremost step in this work is the preparation of data to normalize pixel values (between 0 and 255) into a range between 0 and 1, aided by passing rescale arguments in the Keras ImageDataGenerator for both training and testing sets.

Of the data, 20% is held out for validation by passing the argument validation_split=0.2 to the training and testing data generator and by using the flow_from_directory() function during preprocessing. Fig. 6 shows the distribution ratio of the datasets.
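
The effect of these two preprocessing steps can be illustrated without Keras. The sketch below is a plain-Python stand-in under stated assumptions: the function names are hypothetical, pixels are divided by 255 rather than multiplied by a rescale factor, and the hold-out simply takes the last 20% of a list, which is not necessarily how the Keras generator selects its validation subset.

```python
def rescale(pixels):
    """Normalize 8-bit pixel values (0..255) into the range 0..1."""
    return [[value / 255.0 for value in row] for row in pixels]


def split_train_validation(items, validation_split=0.2):
    """Hold out a fraction of the items for validation.
    Taking the tail of the list is an assumption of this sketch."""
    n_val = int(len(items) * validation_split)
    cut = len(items) - n_val
    return items[:cut], items[cut:]


# Hypothetical file list of ten images
files = ["img_%02d.jpeg" % i for i in range(10)]
train_files, val_files = split_train_validation(files)
```

With ten hypothetical files, eight remain for training and two are held out for validation.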

Fig. 6. Distribution of the Datasets.

3.3 Preparation of the Trained Model

This section elaborates on the different processes involved in preparing the best-trained model using a CNN.

2D Convolution: This is used for filtering during image processing. Here, it is developed with a 150${\times}$150 kernel with 32 filters in one stage; in another stage, the kernel size is 75${\times}$75 with the same filter size in the 2D convolution. The following equation is used for 2D convolution to provide a feature map from preprocessed data:

$ \begin{equation} G\left[a,b\right]=\left(l\mathrm{*}m\right)\left[a,b\right]=\sum _{p}\sum _{q}m\left[p,q\right]l\left[a-p,b-q\right] \end{equation} $


$\begin{align*} &l: \text{input image};\quad m: \text{kernel} \\ &a,\ b: \text{row and column indexes of the output matrix} \\ &p,\ q: \text{indexes of the kernel in the convolution} \end{align*}$
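
As a worked illustration of Eq. (1), the double sum can be evaluated directly in plain Python. This is a sketch only: the function name is hypothetical, pixels outside the image are treated as zero (so the output is the "full" convolution), and deep learning layers such as Conv2D typically compute the closely related cross-correlation, without flipping the kernel, which is an equivalent choice when the kernel is learned.

```python
def conv2d_full(l, m):
    """Evaluate G[a, b] = sum_p sum_q m[p][q] * l[a - p][b - q],
    treating l as zero outside its bounds."""
    height, width = len(l), len(l[0])
    k_rows, k_cols = len(m), len(m[0])
    out_rows, out_cols = height + k_rows - 1, width + k_cols - 1
    g = [[0] * out_cols for _ in range(out_rows)]
    for a in range(out_rows):
        for b in range(out_cols):
            total = 0
            for p in range(k_rows):
                for q in range(k_cols):
                    i, j = a - p, b - q
                    if 0 <= i < height and 0 <= j < width:
                        total += m[p][q] * l[i][j]
            g[a][b] = total
    return g


# Hypothetical 2x2 image and 2x2 kernel
image = [[1, 2], [3, 4]]
kernel = [[1, 0], [0, -1]]
feature_map = conv2d_full(image, kernel)
# feature_map == [[1, 2, 0], [3, 3, -2], [0, -3, -4]]
```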

Max_pooling 2D: This function is used to reduce the dimensions of an image. There are two pooling layers in this architecture: the first reduces the feature map from 150${\times}$150 to 75${\times}$75, and the second reduces it from 38${\times}$38 to 19${\times}$19.

Flatten: Used to generate one-dimensional data from 2D data.

DenseNet: Used to load the pretrained data images for defect identification.

Table 3 and Fig. 7 show detailed descriptions of each layer's kernel size.

As mentioned above, the CNN comprises various layers: a series of convolutional layers (with activation), pooling layers, and one final fully connected layer that produces a set of class scores for a given image. The convolutional layers of the CNN act as feature extractors; they extract shape and color patterns from the pixels of the training images.

Fig. 7. The CNN with a DenseNet Architecture.

Table 3. The sequential model.

| Layer (type) | Output Shape | No. of Param. |
|---|---|---|
| conv2d (Conv2D) | (None, 150, 150, 32) | |
| max_pooling2d (MaxPooling2D) | (None, 75, 75, 32) | |
| conv2d_1 (Conv2D) | (None, 38, 38, 64) | |
| max_pooling2d_1 (MaxPooling2D) | (None, 19, 19, 64) | |
| flatten (Flatten) | (None, 23104) | |
| dense (Dense) | (None, 128) | |
| dense_2 (Dense) | (None, 1) | |
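
The output shapes in Table 3 can be cross-checked with simple arithmetic. The sketch below makes two assumptions that the paper does not state explicitly: that conv2d_1 uses a stride of 2 with "same" padding (inferred from the 75 to 38 step), and that both pooling layers use a 2x2 window. The function names are hypothetical.

```python
import math


def same_conv_output(size, stride):
    """Output side of a convolution with 'same' padding (assumed)."""
    return math.ceil(size / stride)


def pool_output(size, window=2):
    """Output side of a max-pooling layer with a 2x2 window (assumed)."""
    return size // window


def trace_table3():
    side = 150                            # conv2d output side, from Table 3
    pool1 = pool_output(side)             # 75
    conv2 = same_conv_output(pool1, 2)    # 38
    pool2 = pool_output(conv2)            # 19
    flat = pool2 * pool2 * 64             # 19 * 19 * 64 = 23104
    dense_params = flat * 128 + 128       # weights + biases of the 128-unit layer
    output_params = 128 * 1 + 1           # weights + bias of the 1-unit output
    return {
        "shapes": [pool1, conv2, pool2, flat],
        "dense_params": dense_params,
        "output_params": output_params,
    }


summary = trace_table3()
```

Under these assumptions the traced sides (75, 38, 19) and the flattened length (23104) agree with Table 3, and the two dense layers would hold 2,957,440 and 129 trainable parameters, respectively.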


4. Results and Discussion

This section describes the various research gaps in other approaches. Refinement of the CNN architecture with MVGG19 is needed to achieve higher accuracy in identifying casting defects [1]. The photometric stereo algorithm achieves lower accuracy due to a lack of real-time data acquisition [12]. The motif discovery approach with a CNN requires optimization to improve accuracy [2]. EfficientNet-B0 with the CNN-based model training approach can provide good accuracy (99%) for predictions from the model, but it needs more computation time to complete the process of finding defects [3]. Real-time augmentation for defect detection is simpler than multi-optical image fusion [4,8]. Modification of the CNN model creation is needed to improve system accuracy in fault identification [5]. The proposed system uses the casting product dataset available online from the Kaggle website. Fig. 8 shows the performance of our proposed system compared with other approaches.

Fig. 8. Performance of the Proposed Approach Compared with Other Approaches.

4.1 Simulation Results

Fig. 9 shows the accuracy of the model, which (for the most part) increases while the loss decreases over time. Likewise, the training and validation curves are closely aligned, indicating that the model does not overfit and may perform well when classifying images from the testing dataset.

Epoch 20 provided the best performance with the following results:

99.64% training accuracy

99.40% validation accuracy

1.65% training loss

2.56% validation loss

The datasets used in this study are publicly available and have been used by various researchers to prepare models that classify OK and DEF product images, with approaches such as a CNN with MVGG, a CNN with the Motif Discovery approach, a CNN with EfficientNet-B0, and a CNN with the photometric approach. In this work, we propose a CNN with DenseNet to classify the images. Hence, researchers can replicate this work and explore variations of it in the future.

Fig. 9. Simulation results from Accuracies Achieved.
Fig. 10. Confusion Matrix to Generate the Classification Report.

4.2 Evaluation Criteria

The confusion matrix represents the basic prediction results from the system, which produces the outcome from binary classification. The data instances are predicted as either positive or negative. The following predictions can be made through this confusion matrix.

1. True Positive (TP): Correct Positive Prediction

2. True Negative (TN): Correct Negative Prediction

3. False Positive (FP): Incorrect Positive Prediction

4. False Negative (FN): Incorrect Negative Prediction

Accuracy: The accuracy from the images used to predict defective items can be obtained from Eq. (2):

$ \begin{equation} Accuracy=\frac{TP+TN}{\left(TP+TN+FP+FN\right)} \end{equation} $

Precision: Precision is the ability of the classifier not to label a negative sample as positive, i.e., the fraction of positive predictions that are correct. It is obtained from Eq. (3):

$ \begin{equation} Precision=\frac{TP}{TP+FP} \end{equation} $

Recall: Recall is the ability of the classifier to find all positive samples, i.e., the fraction of actual positives that are predicted as positive. It is obtained from Eq. (4):

$ \begin{equation} Recall=\frac{TP}{TP+FN} \end{equation} $

F1-score: The harmonic mean of precision and recall is called the F1-score. It is obtained from Eq. (5):

$ \begin{equation} F1\text{-}score=2\times \frac{Precision\times Recall}{Precision+Recall} \end{equation} $
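
Eqs. (2) to (5) translate directly into code. The following is a minimal sketch with a hypothetical function name and hypothetical counts (they are not results from this work); zero denominators return 0.0 as a convention chosen for this example.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall and F1-score
    from the four confusion-matrix counts."""
    def safe_div(num, den):
        return num / den if den else 0.0

    accuracy = safe_div(tp + tn, tp + tn + fp + fn)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical confusion-matrix counts for illustration only
metrics = classification_metrics(tp=90, tn=95, fp=5, fn=10)
```

For these illustrative counts, accuracy is 185/200 = 0.925, precision is 90/95, recall is 0.9, and the F1-score is 180/195.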

Table 4 shows a classification report from the proposed method. The mathematical expressions used to find Precision, Recall, and F1-score are given above in Eqs. (3) to (5).

Table 4. Classification report from the proposed system.

Macro Avg.

Weighted Avg.

4.3 Performance Measures

Beyond the accuracy of the trained model, another critical point to investigate in the system is the processing time during an inspection. Measuring the time spent by the system supports continuous improvement in production. In this respect, our system inspects castings with a short processing time (around 454 ms on a CPU).
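
Processing time of this kind can be measured with a simple wall-clock timer. The sketch below is an assumption about how such a measurement might be set up, not the procedure used in this work: `predict` stands for any callable that inspects one image, and the function name and repeat count are hypothetical.

```python
import time


def mean_latency_ms(predict, images, repeats=3):
    """Average wall-clock time per image, in milliseconds,
    over several passes through the image list."""
    start = time.perf_counter()
    for _ in range(repeats):
        for image in images:
            predict(image)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / (repeats * len(images))


# Hypothetical stand-in for a model's per-image prediction call
latency = mean_latency_ms(lambda image: sum(image), [[1, 2, 3]] * 4)
```

Averaging over several passes reduces the influence of one-off delays such as the first call to a freshly loaded model.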

Table 5 compares Precision, Recall and F1-score from existing methods for casting defect detection against our system.

Table 5. Comparison of the proposed system versus existing method parameters.

| Method | Precision | Recall | F1-score |
|---|---|---|---|
| MVGG19 Approach (Apostolopoulos and Tzani, 2022) | | | |
| Motif Discovery (Bhatia et al., 2022) | | | |
| EfficientNet-B0 (Benbarrad et al., 2021) | | | |
| Photometric stereo approach (Saiz et al., 2022) | | | |
| The proposed system | | | |

5. Conclusion

This work mainly focused on replacing the traditional visual inspection method used in industry. Here, an optimized CNN architecture improves the accuracy in identifying defects in casting products. The images used for our work are in RGB format. It is essential to convert them to 2D for prediction of defective products. Then, the image datasets are preprocessed by using TensorFlow’s Keras in Python. These preprocessed datasets provide very high accuracy in image classification.

One limitation of this system is that it uses preprocessed 2D images for fault identification, but 2D images are not sufficient to detect defects in the casting surface when the defect is very small. Hence, the inspection of defective regions can be done in 3D to get high precision in identifying small defects. Depth analysis of an image also needs to be optimized in order to find the dimensions of the defects in industrial casting products. Hence, future work needs to improve small-defect detection in the manufacturing of casting products.


The authors would like to thank [] for access to its publicly available casting datasets used in this work.


[1] Ioannis D. Apostolopoulos, Mpesiana A. Tzani, 2022, Industrial object and defect recognition utilizing multilevel feature extraction from industrial scenes with Deep Learning approach, Journal of Ambient Intelligence and Humanized Computing
[2] Amanjeet Singh Bhatia, Rado Kotorov, Lianhua Chi, 2022, Casting plate defect detection using motif discovery with minimal model training and small data sets, Journal of Intelligent Manufacturing
[3] Tajeddine Benbarrad, Marouane Salhaoui, Soukaina Bakhat Kenitar, Mounir Arioua, 2021, Intelligent Machine Vision Model for Defective Product Inspection Based on Machine Learning, Journal of Sensor and Actuator Networks, pp. 1-18
[4] Lee Jong Hyuk, Kim Byeong Hak, Kim Min Young, 2021, Machine Learning-based Automatic Optical Inspection System with Multimodal Optical Image Fusion Network, International Journal of Control, Automation and Systems, Vol. 19, No. 10, pp. 3503-3510
[5] Thong Phi Nguyen, Choi Seungho, Park Sung-Jun, Park Sung Hyuk, Yoon Jonghun, 2021, Inspecting Method for Defective Casting Products with Convolutional Neural Network (CNN), International Journal of Precision Engineering and Manufacturing-Green Technology, Vol. 8, pp. 583-594
[6] Iker Pastor-López, Borja Sanz, Alberto Tellaeche, Giuseppe Psaila, José Gaviria de la Puerta, Pablo G. Bringas, 2021, Quality assessment methodology based on machine learning with small datasets: Industrial castings defects, Neurocomputing, Vol. 456, pp. 622-628
[7] Junjie Xing, Minping Jia, 2021, A convolutional neural network-based method for workpiece surface defect detection, Measurement, Vol. 176, No. 109185
[8] Kim Jin-Gyum, Jang Changheui, Kang Sung-Sik, 2021, Classification of ultrasonic signals of thermally aged cast austenitic stainless steel (CASS) using machine learning (ML) models, Nuclear Engineering and Technology
[9] Xinyi Le, Junhui Mei, Haodong Zhang, Boyu Zhou, Juntong Xi, 2020, A learning-based approach for surface defect detection using small image datasets, Neurocomputing, Vol. 408, pp. 112-120
[10] Rui Li, Mingzhou Jin, Vincent C. Paquit, 2021, Geometrical defect detection for additive manufacturing with machine learning models, Materials & Design, Vol. 206, No. 109276
[11] Peng Wang, Yiran Yang, Narges Shayesteh Moghaddam, 2022, Process modeling in laser powder bed fusion towards defect detection and quality control via machine learning: The state-of-the-art and research challenges, Journal of Manufacturing Processes, Vol. 73, pp. 961-984
[12] Fátima A. Saiz, Iñigo Barandiaran, Ander Arbelaiz, Manuel Graña, 2022, Photometric Stereo-Based Defect Detection System for Steel Components Manufacturing Using a Deep Segmentation Network, Sensors, Vol. 22, No. 882
[13] Max Ferguson, Ronay Ak, Yung-Tsun Tina Lee, Kincho H. Law, 2017, Automatic Localization of Casting Defects with Convolutional Neural Networks, IEEE International Conference on Big Data, Vol. 978-1-5386-2715-0, pp. 1726-1735
[14] Xiaoxin Fang, Qiwu Luo, Bingxing Zhou, Congcong Li, Lu Tian, 2020, Research Progress of Automated Visual Surface Defect Detection for Industrial Metal Planar Materials, Sensors, Vol. 20, No. 5136
[15] Cheng Jin, Xianguang Kong, Jiantao Chang, Han Cheng, Xiaojia Liu, 2020, Internal crack detection of castings: a study based on relief algorithm and Adaboost-SVM, The International Journal of Advanced Manufacturing Technology, Vol. 108, pp. 3313-3322


Vijayakumar Ponnusamy

Vijayakumar Ponnusamy received his Ph.D. from SRM IST in 2018. He obtained his Masters in Applied Electronics from the College of Engineering, Guindy, in 2006. In 2000, he received his B.Eng. in Electronics and Communication Engineering from Madras University. He is currently a Professor in the ECE Department, SRM IST, Chennai, Tamil Nadu, India. He is a Certified IoT Specialist and Data Scientist. He is also a recipient of the NI India Academic Award for Excellence in Research (2015). His current interests are in machine learning and deep learning, IoT-based intelligent system design, blockchain technology, and cognitive radio networks. He is a senior member of IEEE.

E. Dilliraj

E. Dilliraj received a B.Eng. in Electronics and Communication Engineering and a Master’s degree in Embedded System Technologies from Anna University. He was an Assistant Professor in Electronics and Communication Engineering at Prathyusha Engineering College. His current interests are image processing, machine learning algorithms, deep learning, artificial intelligence, computer vision, the IoT, and embedded systems. He is currently a research scholar at the SRM Institute of Science and Technology.