Mohan Laavanya, Veeramani Vijayaraghavan

(Department of Electronics and Communication Engineering, Vignan’s Foundation for Science, Technology and Research Deemed to be University / Guntur, Andhra Pradesh 522213, India
{drml_ece, drvvr_ece}@vignan.ac.in)
                        
 
            
            
            Copyright © The Institute of Electronics and Information Engineers(IEIE)
            
            
            
            
            
               
                  
Keywords
               
               Convolutional neural network, Deep learning, Gaussian noise, Image denoising
             
            
          
         
            
                  1. Introduction
               
                  
Image denoising is a significant task in computer vision because noise degrades the performance of algorithms that recognize images. The goal of denoising is to remove the noise while preserving the characteristics of the image. In general, denoising methods operate in the spatial domain or in a transform domain [1,2]. Spatial domain methods use the spatial correlation among pixels to eliminate noise and are further classified into linear and non-linear filters.
                  			
               
               
Linear approaches [3,4] are frequently used to remove additive noise from images, at the cost of blurring the original signal. Detection of Gaussian-filtered forged images has been implemented by extracting feature vectors from the Gaussian filter residual in the spatial domain, and it performs well [5]. Singular value decomposition (SVD), a non-linear spatial filter, also eliminates noise in images, but it has difficulty distinguishing significant from non-significant singular values [6,7]. The bilateral filter is another non-linear technique; it considers both the intensity similarity and the geometric closeness of pixels to perform image denoising. Transform domain methods separate the data signal from the noise signal through a sparse representation and obtain good denoising performance [8].
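The bilateral weighting described above can be illustrated with a minimal one-dimensional sketch. The function name `bilateral_filter_1d` and the parameters `sigma_s`, `sigma_r`, and `radius` are assumptions chosen for illustration; they are not taken from the cited works.

```python
import math

def bilateral_filter_1d(signal, sigma_s=2.0, sigma_r=0.1, radius=4):
    """Average each sample with neighbours weighted by spatial closeness
    (sigma_s) and by intensity similarity (sigma_r)."""
    out = []
    for p, center in enumerate(signal):
        total, norm = 0.0, 0.0
        for q in range(max(0, p - radius), min(len(signal), p + radius + 1)):
            spatial = math.exp(-((p - q) ** 2) / (2.0 * sigma_s ** 2))
            similarity = math.exp(-((center - signal[q]) ** 2) / (2.0 * sigma_r ** 2))
            weight = spatial * similarity
            total += weight * signal[q]
            norm += weight
        out.append(total / norm)
    return out

# A sharp step edge: samples across the edge get almost no weight,
# so the edge is kept instead of being blurred.
step_edge = [0.0] * 5 + [1.0] * 5
filtered = bilateral_filter_1d(step_edge)
print(filtered[4], filtered[5])
```

Because the intensity term suppresses neighbours on the other side of the step, the filtered values next to the edge stay close to 0 and 1.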
                  			
               
               
In general, conventional image processing techniques depend on prior knowledge to remove noise, and their computational complexity is high. Moreover, state-of-the-art denoising algorithms largely neglect the residual image. A residual image is the difference between a noise-affected image and its denoised version, and it has several properties that remain to be explored. Residual-image statistics such as the mean square error and structural similarity index support denoising [9]. By considering the statistical moments and the correlation of patches in a residual image, noise can be reduced while texture and contrast are preserved [10].
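As a hedged illustration of why residual statistics are informative, the sketch below computes a residual and its lag-1 autocorrelation for two synthetic cases. The signals, the helper names, and the choice of lag-1 autocorrelation as the statistic are assumptions made for this example only.

```python
import math
import random
import statistics

def residual(noisy, denoised):
    """Element-wise difference between the noisy and the denoised signal."""
    return [n - d for n, d in zip(noisy, denoised)]

def lag1_autocorrelation(values):
    """Normalized correlation between neighbouring samples."""
    mean = statistics.fmean(values)
    centered = [v - mean for v in values]
    numerator = sum(a * b for a, b in zip(centered, centered[1:]))
    denominator = sum(c * c for c in centered)
    return numerator / denominator if denominator else 0.0

rng = random.Random(0)
n = 2000
clean = [0.5 + 0.2 * math.sin(2 * math.pi * i / 200) for i in range(n)]
noisy = [c + rng.gauss(0.0, 0.1) for c in clean]

# Ideal denoiser: the residual is pure noise, so neighbours are uncorrelated.
ideal_corr = lag1_autocorrelation(residual(noisy, clean))
# Over-smoothing denoiser (returns a flat signal): structure leaks into the residual.
leaky_corr = lag1_autocorrelation(residual(noisy, [0.5] * n))
print(ideal_corr, leaky_corr)
```

A residual that is close to white noise has near-zero correlation between neighbours, while leftover image structure shows up as clearly positive correlation.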
                  			
               
               
Recently, convolutional neural networks (CNNs) based on deep learning have given good results not only in object detection and classification but also in image noise reduction. We present a Gaussian filtered residual scheme that uses a deep CNN for noise reduction in an image. The denoising CNN architecture is based on residual denoising, and its residual image retains small structural details whose effect has not been evaluated. Our work addresses this problem by applying a Gaussian residual filter to the residual image. Finally, the approach is evaluated using the peak signal-to-noise ratio (PSNR) and the structural similarity index measure (SSIM).
                  			
               
               
The rest of the paper is organized as follows. Section 2 briefly describes the background of the proposed work. Section 3 discusses the presented scheme, and Section 4 covers the results and comparisons. The final section presents the conclusion.
                  
                  			
               
             
            
                  2. Background
               
Neural networks have already been explored for signal denoising. A CNN for image denoising [11] gives good denoising performance for different noise variance levels. Recently, deep CNNs have revolutionized image classification, identification, restoration, and related tasks. CNNs can be used for despeckling synthetic aperture radar (SAR) images using a Euclidean loss and a total variation loss [12]. The restoration of an image from noise [13] removes the noise by identifying the difference between the noisy image and the denoised image; the result is a clean image, and this approach is called residual learning. A residual-based CNN has more layers, more parameters, and a higher computational cost. Residual image denoising using dilated convolutional layers reduces the computational cost and the receptive field size, as determined by mathematical calculation [14].
                  			
               
               
To prevent the problem of vanishing gradients, a CNN with 17 layers and residual learning was presented [15]. The residual network converges quickly because of a gradient skipping scheme at the learning stage and removes different types of noise at different levels [16]. Multi-scale residual learning influences the network depth as well as the number of models required for the learning process by using a dropout layer, which slows the training process [17]. In FFDNet, pre-processing and post-processing layers around the non-linear mapping increase the training speed of the network and improve the denoising performance [18].
                  			
               
               
Algorithm complexity is reduced by recognizing the pixels at a lower scale. To enrich the spatial information of the image, its structural detail can be found by using a guided filter-based CNN. The network has two sub-networks. The sub-networks extract the informative features, and these features are given as input to another network, where the inconsistent features are suppressed. This architecture is applicable to image denoising, up-sampling, and texture separation [19]. The proposed method uses a Gaussian filter residual image, which differs from the residual image methodologies presented in other papers, and we evaluate its impact on image denoising.
                  
                  			
               
             
            
                  3. The Proposed Scheme
               
The deep architecture has three basic layers and one skip layer. The first and last layers of the architecture contain only a convolutional layer and a rectified linear unit (ReLU). At the second level of operation, a convolutional layer followed by a batch normalization layer and a ReLU layer finds the feature map. The same set of operations is performed up to 28 levels.
                  			
               
               
The skip layer is used for the residual image. The convolutional layer has 30 filters with a kernel size of 3×3; hence, filters of size 3×3×3 are used to generate 30 feature maps. A feature map indicates the presence or absence of features in terms of intensity values. The convolution layer multiplies the input array element-wise with a weight matrix called the kernel and sums the products. The operation of the convolutional layer is shown in Fig. 1.
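The multiply-and-sum operation of a convolutional layer can be sketched as follows for a single channel and a single kernel. This is a simplified illustration: `conv2d_same` is a hypothetical helper, and a real layer would apply many learned kernels across all input channels.

```python
def conv2d_same(image, kernel):
    """Slide the kernel over the image with stride 1 and zero padding,
    so the output has the same height and width as the input."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    pad_h, pad_w = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    yy, xx = y + i - pad_h, x + j - pad_w
                    # Positions outside the image count as zero (zero padding).
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += image[yy][xx] * kernel[i][j]
            out[y][x] = acc
    return out

ones = [[1.0] * 4 for _ in range(4)]
box = [[1.0] * 3 for _ in range(3)]
response = conv2d_same(ones, box)
print(response[0][0], response[0][1], response[1][1])
```

With an all-ones image and an all-ones 3×3 kernel, each output value simply counts how many kernel taps fall inside the image, which makes the effect of zero padding at the borders visible.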
                  			
               
               
The feature map of the convolutional layer is given to the ReLU layer to introduce non-linearity, so that the output is not merely a weighted sum of the inputs. Otherwise, the network would not achieve its objective. The ReLU keeps the positive pixels and sets the negative pixels to zero, as shown in Fig. 2. The activation map is the output of the ReLU layer. Next, the batch normalization layer normalizes the outputs of the previous activation layer by subtracting the batch mean and dividing by the batch standard deviation. The last layer is once again a convolutional and ReLU layer. In this architecture, stochastic gradient descent is used as the optimization technique. Stochastic gradient descent de-normalizes the output by changing the mean and standard deviation.
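The two operations described above can be sketched in a few lines. The names `gamma` and `beta` for the scale and shift applied after normalization, and the small constant `eps`, are assumptions borrowed from common batch normalization practice rather than details given in the text.

```python
import math

def relu(values):
    """Keep positive values and set negative values to zero."""
    return [max(0.0, v) for v in values]

def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Subtract the batch mean, divide by the batch standard deviation,
    then rescale by gamma and shift by beta."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(variance + eps) + beta for v in values]

feature_map = [-1.5, 0.0, 2.0, 3.5]
activated = relu(feature_map)
normalized = batch_norm(activated)
rescaled = batch_norm(activated, gamma=2.0, beta=0.5)
print(activated, normalized, rescaled)
```

With the default `gamma` and `beta` the output has approximately zero mean and unit variance; changing them during training moves the mean and the standard deviation away from those values.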
                  			
               
               
The architecture shows that the filter structure plays a major role in extracting the feature map in the convolutional layer. The values of the filters are learned by the CNN itself during the training process. The feature map size is fixed by three parameters: the depth, stride, and zero padding. We used a depth of three, a stride of one, and a padding of one. The filters used in each convolutional layer are different, so specific characteristics of the images can be obtained.
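The effect of the stride and padding on the feature map size follows the standard output-size relation for convolutions. A minimal sketch, assuming a square input of side `n` and the 3×3 kernel mentioned earlier (the function name is hypothetical):

```python
def conv_output_size(n, kernel=3, stride=1, padding=1):
    """Spatial size of a convolution output: floor((n - k + 2p) / s) + 1."""
    return (n - kernel + 2 * padding) // stride + 1

# Stride 1 and padding 1 with a 3x3 kernel preserve the input size.
print(conv_output_size(256))
# Without padding the map shrinks by two pixels per layer.
print(conv_output_size(256, padding=0))
```

Preserving the size in every layer matters here because the predicted residual must be subtracted pixel by pixel from the input image.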
                  			
               
               
We handle a color image by dividing it into three planes; each plane is fed to the network separately, and the outputs are finally concatenated to form the denoised color image. The presented method’s denoising performance is better than that of a denoising CNN. This can be seen in Fig. 3, which shows the residual image obtained with the denoising CNN in Fig. 3(a) and with the proposed method in Fig. 3(b). Fig. 3(a) contains very small image features and more noise compared to our result. It also shows that even after the noisy image passes through the various layers that predict the noise, small structural details are present in the residual image, which violates the assumption of independent and identically distributed noise.
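The plane-wise handling of a color image can be sketched as below. The per-plane denoiser is passed in as a function; the `identity` function used here is only a hypothetical stand-in for the trained network.

```python
def split_planes(image):
    """image: rows of (r, g, b) tuples -> list of three single-channel planes."""
    return [[[pixel[c] for pixel in row] for row in image] for c in range(3)]

def merge_planes(planes):
    """Concatenate three planes back into rows of (r, g, b) tuples."""
    r, g, b = planes
    return [list(zip(r_row, g_row, b_row)) for r_row, g_row, b_row in zip(r, g, b)]

def denoise_color(image, denoise_plane):
    """Apply the same single-plane denoiser to each plane independently."""
    return merge_planes([denoise_plane(plane) for plane in split_planes(image)])

def identity(plane):
    # Hypothetical placeholder for the trained per-plane network.
    return [row[:] for row in plane]

image = [[(0.1, 0.2, 0.3), (0.4, 0.5, 0.6)],
         [(0.7, 0.8, 0.9), (1.0, 0.0, 0.5)]]
restored = denoise_color(image, identity)
print(restored == image)
```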
                  			
               
               
To overcome the above problem effectively by capturing the left-out structure in the residual image, Gaussian convolution is applied to the residual image. The image filtered by Gaussian convolution is represented by:

$$GC[I]_{p} = \sum_{q \in S} G_{\sigma }\left(\left\| p-q\right\| \right) I_{q}$$

Here, I denotes the image, p denotes the position of a pixel, S denotes the set of spatial locations of the image, $I_{q}$ is the image value at pixel q, $\left\| p-q\right\| $ denotes the Euclidean distance between the pixels p and q, which defines the size of the neighborhood, and finally, $G_{\sigma }$(x) is the Gaussian kernel.
                  			
               
               
In Gaussian convolution, only the spatial distance between pixels plays a role, not their values. A bright pixel therefore influences an adjacent dark pixel even though the values of the two pixels are different. As a result, edges are blurred, and the structures are captured since discontinuities are averaged together.
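The distance-only weighting of Gaussian convolution can be made concrete with a small sketch. The truncation of the neighbourhood at three standard deviations and the normalization of the weights are assumptions made here for a compact, self-contained example.

```python
import math

def gaussian_weight(distance, sigma):
    """Unnormalized Gaussian kernel evaluated at a spatial distance."""
    return math.exp(-(distance ** 2) / (2.0 * sigma ** 2))

def gaussian_convolution(image, sigma=1.0):
    """Replace each pixel p by a weighted average of its neighbours q,
    where the weight depends only on the distance ||p - q||."""
    h, w = len(image), len(image[0])
    radius = max(1, math.ceil(3 * sigma))
    out = [[0.0] * w for _ in range(h)]
    for py in range(h):
        for px in range(w):
            total, norm = 0.0, 0.0
            for qy in range(max(0, py - radius), min(h, py + radius + 1)):
                for qx in range(max(0, px - radius), min(w, px + radius + 1)):
                    weight = gaussian_weight(math.hypot(py - qy, px - qx), sigma)
                    total += weight * image[qy][qx]
                    norm += weight
            out[py][px] = total / norm
    return out

# Vertical step edge: dark left half, bright right half.
step = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
blurred = gaussian_convolution(step, sigma=1.0)
print(blurred[4][3], blurred[4][4])
```

The two pixels on either side of the edge take intermediate values, which is the blurring of discontinuities described in the text.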
                  			
               
               
The proposed method uses Gaussian convolution, which is linear and effective at smoothing images. Smoothing has a strong effect on the contours of image objects because contrast is not preserved at the edges. Therefore, the feature map of the last convolution layer is Gaussian filtered (GF), and this filtered feature map is subtracted from the last layer's feature map to obtain the Gaussian filter residual (GFR). Hence, the GFR contains much less structure of the original image and more noise. Finally, the denoised image is obtained by subtracting the GFR from the noisy input image. This procedure is applied to all three planes of the input noisy image to denoise it. The presented scheme is shown in Fig. 4.
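The order of operations in the GFR step can be sketched on a one-dimensional toy signal. Everything below is an illustrative assumption: the synthetic `feature` stands in for the last-layer feature map of a trained network (noise plus a small amount of leaked structure), and the Gaussian width `sigma=2.0` is arbitrary.

```python
import math
import random

def gaussian_smooth(signal, sigma=2.0):
    """Normalized 1D Gaussian filter, truncated at three standard deviations."""
    radius = math.ceil(3 * sigma)
    out = []
    for p in range(len(signal)):
        total, norm = 0.0, 0.0
        for q in range(max(0, p - radius), min(len(signal), p + radius + 1)):
            weight = math.exp(-((p - q) ** 2) / (2.0 * sigma ** 2))
            total += weight * signal[q]
            norm += weight
        out.append(total / norm)
    return out

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

rng = random.Random(1)
n = 400
clean = [0.5 + 0.3 * math.sin(2 * math.pi * i / 100) for i in range(n)]
noise = [rng.gauss(0.0, 0.1) for _ in range(n)]
noisy = [c + e for c, e in zip(clean, noise)]

# Hypothetical stand-in for the last-layer feature map: noise plus leaked structure.
leaked = [0.05 * math.sin(2 * math.pi * i / 100) for i in range(n)]
feature = [e + s for e, s in zip(noise, leaked)]

gf = gaussian_smooth(feature)                    # Gaussian filtered feature (GF)
gfr = [f - g for f, g in zip(feature, gf)]       # Gaussian filter residual (GFR)
denoised = [y - r for y, r in zip(noisy, gfr)]   # noisy input minus GFR
print(mse(noisy, clean), mse(denoised, clean))
```

In this toy setting the smooth leaked structure ends up mostly in GF rather than in GFR, so subtracting GFR removes mainly the rapidly varying noise component. The sketch only illustrates the sequence of steps and says nothing about the performance of the trained network.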
                  
                  			
               
               
                     Fig. 1. Operation of convolutional layer.
 
               
               
                     Fig. 3. Residual image (a) Denoising CNN method, (b) Proposed method.
 
               
                     Fig. 4. Presented network architecture for image denoising.
 
             
            
                  4. Results and Discussion
               
The efficacy of the presented scheme is assessed in MATLAB. The scheme was verified on the standard test images ``Lena,'' ``Pepper,'' and ``Baboon.'' To test the technique, the ``imnoise'' function is used to add additive white Gaussian noise to the input image at four noise levels (variances) of 0.01, 0.02, 0.03, and 0.04. A learning rate of 0.0001 and a mini-batch size of 128 were adopted for the stochastic gradient descent optimization technique.
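For zero-mean Gaussian noise of variance v on intensities scaled to [0, 1], the noisy PSNR is roughly 10 log10(1/v), that is, about 20 dB for v = 0.01. The sketch below checks this on a synthetic flat signal. It is not MATLAB's ``imnoise''; the clipping to [0, 1] and the helper names are assumptions for illustration.

```python
import math
import random

def psnr(reference, test, peak=1.0):
    """Peak signal-to-noise ratio in dB for signals scaled to [0, peak]."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

def add_gaussian_noise(pixels, variance, rng):
    """Add zero-mean Gaussian noise of the given variance and clip to [0, 1]."""
    sigma = math.sqrt(variance)
    return [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in pixels]

rng = random.Random(0)
clean = [0.5] * 20000
results = {}
for variance in (0.01, 0.02, 0.03, 0.04):
    noisy = add_gaussian_noise(clean, variance, rng)
    results[variance] = psnr(clean, noisy)
    theoretical = 10.0 * math.log10(1.0 / variance)
    print(variance, round(results[variance], 2), round(theoretical, 2))
```

On real images, clipping at 0 and 1 removes part of the noise in dark and bright regions, so measured noisy PSNR values tend to sit slightly above the theoretical figure.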
                  			
               
               
In the presented scheme, PSNR is used to quantify the signal strength of the denoised images. The presented scheme is compared with a denoising CNN in terms of the PSNR and SSIM. Tables 1 and 2 show the performance comparisons for the three test images. Table 1 shows that the presented method performs better than the denoising CNN in every case, with PSNR gains ranging from 0.03 dB to 0.40 dB (about 0.18 dB on average over all cases, and about 0.3 dB on average for Baboon). The second metric, SSIM, measures the perceptual similarity between the denoised and reference images and is reported in Table 2.
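The PSNR gains can be recomputed directly from the denoised PSNR values listed in Table 1. The short script below uses only those reported numbers; the dictionary layout is an arbitrary choice for this example.

```python
# (denoising CNN, proposed) denoised PSNR in dB for variances 0.01 to 0.04, from Table 1.
table1 = {
    "Lena":   [(30.56, 30.59), (28.79, 28.89), (27.64, 27.76), (26.68, 26.86)],
    "Pepper": [(29.92, 29.95), (28.23, 28.32), (27.01, 27.18), (25.98, 26.26)],
    "Baboon": [(24.22, 24.62), (22.80, 23.09), (22.01, 22.27), (21.45, 21.71)],
}

gains = {name: [proposed - baseline for baseline, proposed in rows]
         for name, rows in table1.items()}
mean_gain = {name: sum(values) / len(values) for name, values in gains.items()}
all_gains = [g for values in gains.values() for g in values]
overall_mean = sum(all_gains) / len(all_gains)

for name, value in mean_gain.items():
    print(f"{name}: mean gain {value:.2f} dB")
print(f"Overall: mean gain {overall_mean:.2f} dB")
```

The per-image averages show that the gain is largest for the highly textured Baboon image.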
                  			
               
               
                  				A visual comparison between our method and the denoising CNN for a noise level
                  of 0.04 is shown in Fig. 5 for the Lena image and Fig. 6 for the Monarch image. Figs. 5(c) and 6(c) indicate that our approach restores the
                  image better than the denoising CNN. Hence, the combination of a CNN and Gaussian
                  filter residual is effective for image denoising.
                  
                  			
               
               
                     Fig. 5. Lena image (a) Noise-affected image, (b) Denoised image of denoising CNN method, (c) Denoised image of proposed method.
 
               
                     Fig. 6. Monarch image (a) Noise-affected image, (b) Denoised image of denoising CNN method, (c) Denoised image of proposed method.
 
               
                     Table 1. Noisy PSNR and denoised PSNR values of images of Lena, Pepper, and Baboon.
                  
                        
                           
| Image | Noise variance (σ²) | Denoising CNN: noisy PSNR (dB) | Denoising CNN: denoised PSNR (dB) | Proposed: noisy PSNR (dB) | Proposed: denoised PSNR (dB) |
| --- | --- | --- | --- | --- | --- |
| Lena | 0.01 | 20.21 | 30.56 | 20.21 | 30.59 |
| Lena | 0.02 | 17.37 | 28.79 | 17.37 | 28.89 |
| Lena | 0.03 | 15.76 | 27.64 | 15.76 | 27.76 |
| Lena | 0.04 | 14.67 | 26.68 | 14.67 | 26.86 |
| Pepper | 0.01 | 20.32 | 29.92 | 20.32 | 29.95 |
| Pepper | 0.02 | 17.51 | 28.23 | 17.51 | 28.32 |
| Pepper | 0.03 | 15.96 | 27.01 | 15.96 | 27.18 |
| Pepper | 0.04 | 14.88 | 25.98 | 14.88 | 26.26 |
| Baboon | 0.01 | 20.17 | 24.22 | 20.17 | 24.62 |
| Baboon | 0.02 | 17.30 | 22.80 | 17.30 | 23.09 |
| Baboon | 0.03 | 15.70 | 22.01 | 15.70 | 22.27 |
| Baboon | 0.04 | 14.60 | 21.45 | 14.60 | 21.71 |
                     
                  
                
               
                     Table 2. Noisy SSIM and denoised SSIM values of images of Lena, Pepper, and Baboon.
                  
                        
                           
| Image | Noise variance (σ²) | Denoising CNN: noisy SSIM | Denoising CNN: denoised SSIM | Proposed: noisy SSIM | Proposed: denoised SSIM |
| --- | --- | --- | --- | --- | --- |
| Lena | 0.01 | 0.8035 | 0.9703 | 0.8035 | 0.9786 |
| Lena | 0.02 | 0.6856 | 0.9621 | 0.6856 | 0.9678 |
| Lena | 0.03 | 0.6025 | 0.9579 | 0.6025 | 0.9601 |
| Lena | 0.04 | 0.5419 | 0.9491 | 0.5419 | 0.9515 |
| Pepper | 0.01 | 0.8200 | 0.9701 | 0.8200 | 0.9713 |
| Pepper | 0.02 | 0.7140 | 0.9529 | 0.7140 | 0.9598 |
| Pepper | 0.03 | 0.6416 | 0.9487 | 0.6416 | 0.9506 |
| Pepper | 0.04 | 0.5862 | 0.9381 | 0.5862 | 0.9418 |
| Baboon | 0.01 | 0.7584 | 0.8477 | 0.7584 | 0.8596 |
| Baboon | 0.02 | 0.6376 | 0.7932 | 0.6376 | 0.8307 |
| Baboon | 0.03 | 0.5529 | 0.7590 | 0.5529 | 0.7670 |
| Baboon | 0.04 | 0.5033 | 0.7304 | 0.5033 | 0.7413 |
                     
                  
                
             
            
                  5. Conclusion 
               
We presented a Gaussian residual learning CNN to remove noise from an image affected by Gaussian noise. The network is built by stacking a CNN with a Gaussian filter to identify a Gaussian filtered residual, which helps in obtaining a clean image. Moreover, we showed that the proposed scheme removes noise better than a denoising CNN in terms of the performance metrics PSNR and SSIM. The experimental results of the visual comparison also show that our proposed method performs better.
                  			
               
             
          
         
            
                  
                     REFERENCES
                  
                     
                        
[1] Talebi H., Zhu X., Milanfar P., 2013, How to SAIF-ly boost denoising performance, IEEE Trans. Image Process., Vol. 22, No. 4, pp. 1470-1485

[2] Varghese G., Wang Z., 2010, Video denoising based on a spatiotemporal Gaussian scale mixture model, IEEE Trans. Circuits Syst. Video Technol., Vol. 20, No. 7, pp. 1032-1040

[3] Ozkan M. K., Erdem A. T., Sezan M. I., Tekalp A. M., 1992, Efficient multiframe Wiener restoration of blurred and noisy image sequences, IEEE Trans. Image Process., Vol. 1, pp. 453-476

[4] Rosenfeld A., Kak A. C., 1982, Digital picture processing, Second edition, Academic, New York, USA

[5] Hwang J. J., Rhee K. H., Gaussian filtering detection based on features of residuals in image forensics, in Proc. of IEEE Int. Conf. Computing & Communication Technologies, Research, Innovation, and Vision for the Future, Hanoi, Vietnam, pp. 153-157

[6] Andrews H. C., Patterson C. L., Singular value decompositions and digital image processing, IEEE Trans. Acoust., Speech, Signal Process., Vol. 24, pp. 26-53

[7] Lee H-C., Lee H-J., Kwon H., Liang J., 1991, Digital image noise suppression method using SVD block transform, U.S. Patent 5 010 504

[8] Chang S. G., Yu B., Vetterli M., 2000, Adaptive wavelet thresholding for image denoising and compression, IEEE Trans. Image Process., Vol. 9, No. 9, pp. 1532-1546

[9] Brunet D., Vrscay E. R., Wang Z., July 2009, The use of residuals in image denoising, in Proc. of 6th Int. Conf. Image Analysis and Recognition, Halifax, Canada, pp. 1-12

[10] Riot P., Almansa A., Gousseau Y., Tupin F., Sept. 2016, Penalizing local correlations in the residual improves image denoising performance, in Proc. of 24th European Conf. Signal Processing, Budapest, Hungary, pp. 1867-1871

[11] Koziarski M., Cyganek B. L., 2016, Deep neural image denoising, in Proc. of Int. Conf. Computer Vision and Graphics, pp. 163-173

[12] Wang P., Zhang H., Patel V. M., 2017, SAR image despeckling using a convolutional neural network, IEEE Signal Process. Lett., Vol. 24, No. 12, pp. 1763-1767

[13] Zhang K., Zuo W., Chen Y., Meng D., Zhang L., 2017, Beyond a Gaussian denoiser: residual learning of deep CNN for image denoising, IEEE Trans. Image Process., Vol. 26, No. 7, pp. 3142-3155

[14] Wang T., Sun M., Hu K., Dilated deep residual network for image denoising, in Proc. of IEEE 29th Int. Conf. Tools with Artificial Intelligence, Boston, MA, USA, pp. 1272-1279

[15] Tian C., Xu Y., Fei L., Wang J., Wen J., Luo N., 2019, Enhanced CNN for image denoising, IET-CAAI Trans. Intelligence Technology, Vol. 4, No. 1, pp. 17-23

[16] Zhang F., Cai N., Wu J., Cen G., Wang H., Chen X., 2018, Image denoising method based on a deep convolution neural network, IET Image Process., Vol. 12, No. 4, pp. 485-493

[17] Chen C., Xu Z., 2018, Aerial-image denoising based on convolutional neural network with multi-scale residual learning approach, Information Journal, Vol. 9, No. 7, pp. 169-186

[18] Tassano M., Delon J., Veit T., 2019, An analysis and implementation of the FFDNet image denoising method, Image Processing On Line Journal, Vol. 9, pp. 1-25

[19] Li Y., Huang J-B., Ahuja N., Yang M. H., 2019, Joint image filtering with deep convolutional networks, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 41, No. 8, pp. 1-14

 
                   
                
             
            Author
            
            
M. Laavanya currently serves as an Associate Professor in the Department of Electronics and Communication Engineering, Vignan’s Foundation for Science, Technology and Research University, Guntur, Andhra Pradesh. She received her B.E. degree in Electronics and Communication from Madurai Kamaraj University, Madurai, in 2003 and her M.E. degree in Applied Electronics from Anna University, Chennai, in 2005. She carried out her Ph.D. research in the domain of image denoising and was awarded the Ph.D. degree in 2019 by Anna University, Chennai. Her areas of research include signal, image, and video processing and deep learning. She is a life member of the Indian Society for Technical Education and an overseas member of The Institute of Electronics, Information and Communication Engineers.
               		
            
            
            
V. Vijayaraghavan currently serves as an Associate Professor in the Department of Electronics and Communication Engineering, Vignan’s Foundation for Science, Technology and Research University, Guntur, Andhra Pradesh. He received his B.E. degree in Electronics and Communication from Madurai Kamaraj University, Madurai, in 2003 and his M.E. degree in Computer Science and Engineering from Anna University, Tiruchirappalli, in 2010. He carried out his Ph.D. research in the domain of image denoising and was awarded the Ph.D. degree in 2019 by Anna University, Chennai. His areas of research include image processing, deep learning, embedded systems, wireless networks, and network security. He is a life member of the Indian Society for Technical Education.