1. (School of Information, Communications and Electronics Engineering, The Catholic University of Korea / Pucheon-City, Korea)

Image denoising, Deep learning, U-net, New structure, Improved U-net

1. Introduction

Various image denoising methods have been studied for images degraded by, for example, Gaussian noise, impulse noise, and speckle noise [1 - 9]. The nonlocal means (NLM) technique and the block-matching and 3D filtering (BM3D) technique, which remove noise by computing a weighted sum over similar patches found across the entire image, show very good denoising performance [3, 4]. In recent years, deep learning methods, which perform well in many image processing fields, have been applied to image denoising and have outperformed conventional denoising techniques [5 - 9].

In this paper, we propose an efficient deep neural network structure that improves image denoising performance by modifying U-net, which is widely used for image restoration. The proposed structure adds pre-processing and post-processing to the conventional U-net structure, and also adds a convolution layer with a shortcut to each stage of U-net. Since the proposed structure improves the convergence performance of the deep neural network when generating the target image, it can be used not only for denoising but also for various image restoration applications. By training the proposed structure using images with various noise intensities, noise at various intensities can be removed with a single set of trained parameters. Extensive computer simulations show that the proposed method yields superior denoising performance compared to BM3D and other deep learning methods.

2. Image Denoising Method

Starting with the median filter, various denoising methods that exploit the frequency-band characteristics of the image and the noise have been studied to remove Gaussian noise, impulse noise, and speckle noise [1, 2]. A denoising technique that relies on the high-frequency nature of noise has the drawback that high-frequency components of the original image are also lost. The NLM method shows very good denoising performance by computing, for each patch, a weighted sum over the entire image in which the weights come from patch similarity [3]. In particular, the BM3D technique, which groups similar image patches into a 3D structure and computes the weights more precisely, was regarded as the state of the art in denoising before deep learning techniques were applied [4].
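For reference, the weighted sum used by NLM is commonly written as below. This is a sketch of the standard formulation associated with [3], not an equation taken from this paper. The symbols are assumptions introduced here: $v$ is the noisy image, $\hat{u}$ the estimate, $N_i$ the patch centered at pixel $i$, $h$ a filtering parameter, and $Z(i)$ a normalizing constant. The patch distance may be a plain or a Gaussian-weighted Euclidean distance.

$$
\hat{u}(i) = \sum_{j} w(i,j)\, v(j), \qquad
w(i,j) = \frac{1}{Z(i)} \exp\!\left(-\frac{\lVert v(N_i) - v(N_j) \rVert_2^2}{h^2}\right), \qquad
Z(i) = \sum_{j} \exp\!\left(-\frac{\lVert v(N_i) - v(N_j) \rVert_2^2}{h^2}\right)
$$

Pixels whose surrounding patches resemble the patch around $i$ receive large weights, so averaging suppresses noise while repeated structures are preserved.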

Since deep learning methods have shown excellent performance in various image processing fields, much research has been conducted into applying deep learning to image denoising [5 - 9]. Zhang et al. showed that a deep convolutional neural network (CNN) can be applied to image denoising to achieve excellent denoising performance [5]. A CNN using a variable splitting technique was proposed to reduce the number of computations for image denoising without degrading performance [6], and FFDNet was proposed, which handles a wide range of noise levels and improves convergence speed by taking a noise level map as input [7]. Tian et al. proposed ADNet, which uses an attention-guided CNN [8]. These CNN methods have been reported to show denoising performance superior to that of the BM3D technique [5 - 9].

3. Improving U-net for Image Denoising

Because deep learning has shown excellent performance in various fields of image processing, it is being studied from many angles, including the structures and training methods of deep neural networks. Among the various structures of deep neural networks, U-net, shown in Fig. 1, was proposed for medical image processing, but it has since been used in various image processing fields, including image restoration [10, 11]. U-net improves convergence performance by adding skip connections to the autoencoder structure. The U-net encoder consists of a contractive path that extracts feature vectors from the input image, and the decoder consists of an expansive path that restores the image from the extracted feature vectors. During training, the contractive path learns to extract feature vectors that are as close as possible to those of the target image, and the expansive path learns to restore an image as close as possible to the target image from the extracted feature vectors. Image characteristics that may be lost when the size of the feature vectors is reduced in the contractive path are transferred to the expansive path through the skip connections and are used in the image restoration process, thus improving convergence performance compared to the autoencoder.
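The role of the skip connection can be illustrated with a minimal NumPy sketch of a single contracting and expanding level. It is not the network of [10]: the function names, the use of average pooling and nearest-neighbour upsampling, and the identity stand-ins for the learned convolution stages are all assumptions made only to show how features are passed across and concatenated.

```python
import numpy as np

def downsample(x):
    """2x2 average pooling on an (H, W, C) array; H and W must be even."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def upsample(x):
    """Nearest-neighbour 2x upsampling on an (H, W, C) array."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def one_level_unet(x, encode, bottleneck, decode):
    """One contracting/expanding level with a skip connection.

    encode, bottleneck and decode are hypothetical placeholders for the
    learned convolution stages; any shape-preserving callables will do.
    """
    skip = encode(x)                               # full-resolution features
    low = bottleneck(downsample(skip))             # contractive path
    up = upsample(low)                             # expansive path
    merged = np.concatenate([up, skip], axis=-1)   # skip connection
    return decode(merged)

x = np.random.rand(64, 64, 3)
identity = lambda t: t
y = one_level_unet(x, identity, identity, identity)
print(y.shape)  # (64, 64, 6): upsampled features plus skipped features
```

The last three channels of `y` are the untouched full-resolution features. This is the detail that would otherwise be lost by downsampling and that the decoder can draw on during restoration.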

In this paper, we propose improved structures for U-net, and we show that they achieve denoising performance superior to that of conventional deep neural networks. First, we propose the deep neural networks shown in Figs. 2 and 3. Convergence performance is enhanced by further processing the U-net input and output through pre-processing and post-processing, respectively. The data from the pre-processing unit are transferred to the post-processing unit through an additional skip connection. After concatenation with the data processed in the expansive path, the image is restored through the final post-processing step. As shown in Fig. 2, pre-processing, the additional skip connection, and post-processing together form a single module, and convergence performance can be further improved through cascaded connections of the modules. Also, as shown in Fig. 3, each stage of U-net can be modified by applying the so-called ResBlock structure, which adds a convolution layer with a shortcut to each U-net stage. This structure can be used together with the pre-processing and post-processing structures described above in order to maximize the overall performance. As is shown in Section 4, the convergence and denoising performance of the proposed structure are improved compared to the conventional U-net. Since the proposed structure improves the overall convergence performance of a deep neural network that minimizes the difference between the target image and the degraded input image, it can be used in various image restoration fields as well as for image denoising.
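The data flow described above can be sketched in NumPy as follows. The exact layer counts, kernel sizes, and channel widths are given only in Figs. 2 and 3, so everything concrete here is an assumption: the 3x3 kernels, the ReLU activations, the two-convolution ResBlock, the choice to skip the pre-processing output (rather than its input), and the helper names `conv3x3`, `res_block`, and `pre_post_module`. A single ResBlock stands in for the full U-net.

```python
import numpy as np

def conv3x3(x, w, b):
    """Zero-padded 3x3 convolution.
    x: (H, W, Cin), w: (3, 3, Cin, Cout), b: (Cout,)."""
    h, wd, _ = x.shape
    xp = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros((h, wd, w.shape[-1]))
    for dy in range(3):
        for dx in range(3):
            out += xp[dy:dy + h, dx:dx + wd, :] @ w[dy, dx]
    return out + b

def relu(x):
    return np.maximum(x, 0.0)

def res_block(x, w1, b1, w2, b2):
    """Convolution layers plus an identity shortcut (assumed layout)."""
    y = relu(conv3x3(x, w1, b1))
    y = conv3x3(y, w2, b2)
    return x + y                                  # shortcut

def pre_post_module(x, core, w_pre, b_pre, w_post, b_post):
    """Pre-processing, core network, extra skip, post-processing."""
    pre = relu(conv3x3(x, w_pre, b_pre))          # pre-processing
    body = core(pre)                              # U-net would go here
    merged = np.concatenate([body, pre], axis=-1) # additional skip connection
    return conv3x3(merged, w_post, b_post)        # post-processing

rng = np.random.default_rng(0)
C, F = 3, 8                                       # image and feature channels
x = rng.random((16, 16, C))
w_pre, b_pre = rng.normal(0, 0.1, (3, 3, C, F)), np.zeros(F)
w_post, b_post = rng.normal(0, 0.1, (3, 3, 2 * F, C)), np.zeros(C)
w1, b1 = rng.normal(0, 0.1, (3, 3, F, F)), np.zeros(F)
w2, b2 = rng.normal(0, 0.1, (3, 3, F, F)), np.zeros(F)

core = lambda t: res_block(t, w1, b1, w2, b2)     # stand-in for the U-net
y = pre_post_module(x, core, w_pre, b_pre, w_post, b_post)
print(y.shape)  # (16, 16, 3)
```

Because the sketched module maps a `C`-channel image to a `C`-channel image, several modules could in principle be chained. How the cascade is actually wired in the proposed networks is defined by Fig. 2 and is not reproduced here.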

Fig. 1. U-net.

Fig. 2. Improved U-net (ImpUnet1 & ImpUnet2).

Fig. 3. Improved U-net (ImpUnet3).

Table 1. Average PSNR and SSIM Results (Kodak images).

| Method | PSNR (dB), σ = 10 | PSNR (dB), σ = 30 | PSNR (dB), σ = 50 | SSIM, σ = 10 | SSIM, σ = 30 | SSIM, σ = 50 |
|---|---|---|---|---|---|---|
| Noisy | 28.21 | 18.85 | 14.78 | 0.6595 | 0.2744 | 0.1551 |
| BM3D [4] | 36.57 | 30.88 | 28.62 | 0.9435 | 0.8472 | 0.7788 |
| DnCNN [5] | 36.58 | 31.28 | 28.95 | 0.9447 | 0.8580 | 0.7917 |
| IRCNN [6] | 36.70 | 31.25 | 28.94 | 0.9448 | 0.8584 | 0.7943 |
| FFDNet [7] | 36.81 | 31.40 | 29.11 | 0.9462 | 0.8597 | 0.7952 |
| ADNet [8] | 36.73 | 31.28 | 28.93 | 0.9452 | 0.8576 | 0.7887 |
| Unet [10] | 36.19 | 31.29 | 28.98 | 0.9430 | 0.8622 | 0.7957 |
| ImpUnet1 | 36.61 | 31.46 | 29.16 | 0.9461 | 0.8647 | 0.8025 |
| ImpUnet2 | 36.72 | 31.56 | 29.27 | 0.9466 | 0.8677 | 0.8056 |
| ImpUnet3 | 36.52 | 31.45 | 29.18 | 0.9452 | 0.8640 | 0.8027 |
| ImpUnet4 | 36.88 | 31.63 | 29.30 | 0.9478 | 0.8688 | 0.8079 |

Table 2. Average PSNR and SSIM Results (BSD68 images).

| Method | PSNR (dB), σ = 10 | PSNR (dB), σ = 30 | PSNR (dB), σ = 50 | SSIM, σ = 10 | SSIM, σ = 30 | SSIM, σ = 50 |
|---|---|---|---|---|---|---|
| Noisy | 28.30 | 19.03 | 14.99 | 0.7069 | 0.3299 | 0.1944 |
| BM3D [4] | 36.18 | 30.25 | 27.80 | 0.9541 | 0.8541 | 0.7776 |
| DnCNN [5] | 36.44 | 30.67 | 28.25 | 0.9562 | 0.8687 | 0.7987 |
| IRCNN [6] | 36.37 | 30.57 | 28.19 | 0.9557 | 0.8675 | 0.7985 |
| FFDNet [7] | 36.50 | 30.70 | 28.31 | 0.9567 | 0.8682 | 0.7984 |
| ADNet [8] | 36.38 | 30.56 | 28.13 | 0.9555 | 0.8660 | 0.7931 |
| Unet [10] | 35.84 | 30.56 | 28.22 | 0.9527 | 0.8690 | 0.8001 |
| ImpUnet1 | 36.20 | 30.71 | 28.33 | 0.9557 | 0.8721 | 0.8050 |
| ImpUnet2 | 36.30 | 30.75 | 28.39 | 0.9560 | 0.8741 | 0.8064 |
| ImpUnet3 | 36.15 | 30.70 | 28.32 | 0.9549 | 0.8721 | 0.8043 |
| ImpUnet4 | 36.39 | 30.79 | 28.38 | 0.9570 | 0.8749 | 0.8078 |

4. Performance Evaluation

To evaluate the performance of the proposed method, extensive simulations were performed using a program based on TensorLayer [12]. Training images were generated using the DIV2K image database [13]. BSD68 images and Kodak images, which are the most widely used standard test images [14, 15], were used to measure performance. Image patches of size 64$\times$64 were extracted from the training images, and training was performed to minimize the mean square error (MSE) loss over a total of 20,000 epochs using the Adam optimizer [16]. The step size started at $10^{-4}$ and was halved every 4,000 epochs. Additive white Gaussian noise with a standard deviation that varied between 5 and 50 was added to the input training images, so that the deep neural network was trained to operate regardless of the noise level.
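The step-size schedule and the noise generation described above can be sketched with the Python standard library alone. This is an illustration rather than the training program used in the simulations: the function names are hypothetical, and the 0-255 pixel scale, the uniform draw of the standard deviation, and the choice of one standard deviation per patch are assumptions, since the text only states that it "varied between 5 and 50".

```python
import random

def step_size(epoch, base=1e-4, halve_every=4000):
    """Step size that starts at `base` and is halved every `halve_every` epochs."""
    return base * 0.5 ** (epoch // halve_every)

def add_awgn(patch, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma` to a
    patch given as a list of rows of pixel values (assumed 0-255 scale)."""
    return [[p + rng.gauss(0.0, sigma) for p in row] for row in patch]

rng = random.Random(0)
patch = [[128.0] * 64 for _ in range(64)]   # flat 64x64 stand-in patch
sigma = rng.uniform(5.0, 50.0)              # assumed: one sigma per patch
noisy = add_awgn(patch, sigma, rng)

for epoch in (0, 3999, 4000, 19999):
    print(epoch, step_size(epoch))
```

With these settings the step size is halved four times over 20,000 epochs, so the final 4,000 epochs run at one sixteenth of the initial value.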

Performance comparisons of the deep neural networks are presented in Figs. 4-9 and Tables 1 and 2, where ImpUnet1 to ImpUnet4 denote the variants of the improved U-net proposed in this paper. ImpUnet1 improves U-net by using only one pre-processing and post-processing unit, while ImpUnet2 improves U-net by using three pre-processing and post-processing units. ImpUnet3 improves U-net by using ResBlock, and ImpUnet4 improves U-net by using three pre-processing and post-processing units together with ResBlock. First, to analyze the convergence performance of the deep neural networks, the MSE convergence curves are presented in Fig. 4. The convergence performance of the proposed structures is better than that of the conventional U-net, and it is best when pre-processing, post-processing, and ResBlock are used together. Tables 1 and 2 show the average peak signal-to-noise ratio (PSNR) and the average structural similarity index measure (SSIM) [17] for the 24 Kodak images and the 68 BSD68 test images, respectively. For comparison with the proposed method, the denoising performance of BM3D and of existing deep neural networks that perform well in image denoising was measured for various noise standard deviations, σ. The proposed deep neural networks show PSNR and SSIM gains over both BM3D and the existing deep neural networks. ImpUnet4 outperforms the conventional U-net by up to about 0.7 dB in PSNR (Kodak images, σ = 10). On the Kodak images, ImpUnet4 gives the highest PSNR and SSIM at every noise level. On the BSD68 images, the proposed networks give the highest SSIM at every noise level and the highest PSNR at σ = 30 and σ = 50, while FFDNet and DnCNN have slightly higher PSNR at σ = 10. As shown in Figs. 5-9, the noise reduction performance of the proposed deep neural network is superior to that of the BM3D technique and the existing deep neural networks, and detailed characteristics of the image are restored well.
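As a quick sanity check on the PSNR scale used in Tables 1 and 2, the following sketch computes the PSNR expected for unclipped Gaussian noise, where the MSE equals σ². It assumes the usual definition PSNR = 10 log10(peak² / MSE) with an 8-bit peak value of 255, which is not stated explicitly in the text. The function name is hypothetical.

```python
import math

def psnr_from_mse(mse, peak=255.0):
    """PSNR in dB for a given mean square error (assumed 8-bit peak of 255)."""
    return 10.0 * math.log10(peak ** 2 / mse)

# For unclipped additive white Gaussian noise the MSE is sigma squared.
for sigma in (10, 30, 50):
    print(sigma, round(psnr_from_mse(sigma ** 2), 2))
# 10 28.13
# 30 18.59
# 50 14.15
```

The "Noisy" rows of Tables 1 and 2 report slightly higher values (for example 28.21 dB and 28.30 dB at σ = 10). That would be consistent with the noisy images being clipped to the valid pixel range, although the text does not say whether clipping was applied.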

Fig. 4. MSE convergence (a) MSE for all 2,000 epochs, (b) MSE for the last 500 epochs.

Fig. 5. Test images for subjective comparison of denoising results (a) Kodak image 7, (b) BSD68 image 47, (c) Kodak image 1, (d) BSD68 image 18.

Fig. 6. Comparison of denoising results (Kodak image 7, σ=50).

Fig. 7. Comparison of denoising results (BSD68 image 47, σ=50).

Fig. 8. Comparison of denoising results (Kodak image 1, σ=30).

Fig. 9. Comparison of denoising results (BSD68 image 18, σ=30).

5. Conclusion

In this paper, a deep learning–based image denoising method using an improved U-net was proposed. The convergence and denoising performance of the proposed deep neural network is improved by adding pre-processing and post-processing to the conventional U-net. The performance is further enhanced by adding a convolution layer together with a shortcut in each stage of U-net. In particular, pre-processing and post-processing have a modular structure, and performance can be further improved through adopting a cascaded connection between modules. Extensive simulations confirmed that the proposed method has superior denoising performance compared to BM3D and existing deep learning methods. Since the proposed structure improves the overall convergence performance of U-net, it can be used not only for image denoising but also for various image restoration applications.

ACKNOWLEDGMENTS

This study was supported by Research Fund 2020 of The Catholic University of Korea and by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (No. 2017R1D1A1B03030585).

REFERENCES

[1] Mafi M., Tabarestani S., Cabrerizo M., Barreto A., Adjouadi M., 2018, Denoising of ultrasound images affected by combined speckle and Gaussian noise, IET Image Processing, Vol. 12, No. 12, pp. 2346-2351
[2] Dong Y., Xu S., 2007, A new directional weighted median filter for removal of random-valued impulse noise, IEEE Signal Processing Letters, Vol. 14, No. 3, pp. 193-196
[3] Buades A., Coll B., Morel J.-M., June 2005, A non-local algorithm for image denoising, in Proc. of Computer Vision and Pattern Recognition 2005 (CVPR 2005), pp. 60-65
[4] Dabov K., Foi A., Katkovnik V., Egiazarian K., Aug. 2007, Image denoising by sparse 3-D transform domain collaborative filtering, IEEE Transactions on Image Processing, Vol. 16, No. 8, pp. 2080-2095
[5] Zhang K., Zuo W., Chen Y., Meng D., Zhang L., 2017, Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising, IEEE Transactions on Image Processing, Vol. 26, No. 7, pp. 3142-3155
[6] Zhang K., Zuo W., Gu S., Zhang L., 2017, Learning deep CNN denoiser prior for image restoration, in CVPR 2017
[7] Zhang K., Zuo W., Zhang L., 2018, FFDNet: Toward a fast and flexible solution for CNN-based image denoising, IEEE Transactions on Image Processing, Vol. 27, No. 9, pp. 4608-4622
[8] Tian C., Xu Y., Li Z., Zuo W., Fei L., Liu H., April 2020, Attention-guided CNN for image denoising, Neural Networks, Vol. 124, pp. 117-129
[9] Tian C., Fei L., Zheng W., Xu Y., Zuo W., Lin C.-W., Nov. 2020, Deep learning on image denoising: An overview, Neural Networks, Vol. 131, pp. 251-275
[10] Ronneberger O., Fischer P., Brox T., 2015, U-Net: Convolutional networks for biomedical image segmentation, MICCAI 2015: Medical Image Computing and Computer-Assisted Intervention 2015, pp. 234-241
[11] Kim Y. J., Lee C. W., August 2020, Deep Learning Method for Extending Image Intensity Using Hybrid Log-Gamma, IEIE Transactions on Smart Processing and Computing, Vol. 9, No. 4, pp. 312-316
[12] Dong H., Supratak A., Mai L., Liu F., Oehmichen A., Yu S., Guo Y., 2017, TensorLayer: A versatile library for efficient deep learning development, in Proc. ACM-MM 2017, pp. 1201-1204
[13] Agustsson E., Timofte R., 2017, NTIRE 2017 challenge on single image super-resolution: Dataset and study, in CVPRW 2017
[14] Franzen R., 1999, Kodak lossless true color image suite, source: http://r0k.us/graphics/kodak, Vol. 4
[15] Martin D., Fowlkes C., Tal D., Malik J., 2001, A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics, in ICCV 2001
[16] Kingma D., Ba J., 2015, Adam: A method for stochastic optimization, International Conference on Learning Representations
[17] Horé A., Ziou D., 2010, Image quality metrics: PSNR vs. SSIM, 20th International Conference on Pattern Recognition

Author

Jaewook Han

Jaewook Han is a student at the School of Information, Communications and Electronics Engineering, the Catholic University of Korea. His current interests lie in the area of image processing and deep learning.

Jinwon Choi

Jinwon Choi is a student at the School of Information, Communications and Electronics Engineering, the Catholic University of Korea. His current interests lie in the area of image processing and deep learning.

Changwoo Lee

Changwoo Lee received a BSc and an MSc in control and instrumentation engineering from Seoul National University. After receiving a PhD in image processing from Seoul National University in 1996, he worked as a Senior Researcher with Samsung Electronics. He is currently a Professor at the School of Information, Communications and Electronics Engineering, the Catholic University of Korea. His current interests lie in the area of image processing and deep learning.