Yejin Kim (Intelligent Image Processing Laboratory, Konkuk University, Seoul, Korea; jinye96@konkuk.ac.kr)
Changhoon Yim (Intelligent Image Processing Laboratory, Konkuk University, Seoul, Korea; cyim@konkuk.ac.kr)

Copyright © The Institute of Electronics and Information Engineers (IEIE)
         
            
                  1. Introduction
               
Haze is a phenomenon in which particles in the air scatter light and obscure an
image [1]. As a result, outdoor images may have limited visibility. Haze-induced issues can
be critical, especially in traffic situations that require high visibility in real time [2];
poor visibility can contribute to car accidents, the suspension of aircraft operations, and difficulties in the docking of ships.
Various studies have been conducted to remove haze effects from images to ensure
consistent visibility regardless of weather conditions. In addition, during disasters
such as fires, it is essential to secure visibility when there is heavy smoke
in the air. Removing haze effects from images is therefore essential for observing such situations
through video equipment such as CCTV in order to cope with disaster situations.
                  			
               
               
Consistently visible images have many advantages. Images enhanced by haze removal
can be used as data for various deep learning applications, such as object recognition
and tracking. In deep learning, the ability to recognize objects in images
can degrade when the illumination varies [3]. Hence, dehazed images are highly desirable in various fields.
                  			
               
               
In robot vision, cameras are critical because they provide the visual
input. Obtaining clear images from the camera regardless of the weather conditions
can determine the performance of robot vision and mobility [4]. Dehazing can be used to obtain haze-free real-time images in driving environments
and can provide improved visibility during driving and parking [5].
                  			
               
               
                  				Consistently available high visibility can also reduce the probability of accidents
                  in autonomous driving. Additionally, image dehazing can be applied to crime prevention
                  by assisting in identifying the face of a perpetrator in a hazy image. Hence, image
                  dehazing is important in the field of image processing and image enhancement. Because
                  image dehazing is an ill-posed problem, it is necessary to test various approaches
                  based on an atmospheric scattering model to solve it.
                  			
               
               
Many studies on image dehazing have been performed using convolutional neural
networks (CNNs) [6-8]. The DehazeNet method [6] estimates a transmission map directly from a hazy image. The AOD-Net method
[7] obtains a dehazed image from a hazy image using CNNs. Recently, methods have
been proposed that estimate feature maps from hazy images to produce haze-free
images [8].
                  			
               
               
                  				One method [9] uses CNNs to estimate the depth, and another method [10] applies transfer learning using a simple encoder-decoder network structure with skip
                  connections. In this paper, we propose a way to estimate the transmission map indirectly
                  from depth estimation using CNNs to generate dehazed images based on an atmospheric
                  scattering model.
                  
                  			
               
             
            
                  2. Related Works
               
In the past, haze removal was performed using basic image processing techniques
such as contrast enhancement. However, most haze removal methods are based on
hypotheses or empirical evidence. For example, the dark channel prior (DCP) method
[12] is based on the hypothesis that, in haze-free color images, at least one of the three RGB
channels has a very low value in most local regions. With advancements in deep learning techniques,
several methods using CNNs were developed to remove haze within an image. This led to the
development of the DehazeNet method [6], which is based on an atmospheric scattering model and estimates the transmission
map from a hazy image through a CNN. An illustration of the atmospheric scattering
model is shown in Fig. 1.
                  			
               
               
The atmospheric scattering model [11] can be represented as:

$$ I\left(x\right)=J\left(x\right)t\left(x\right)+\alpha \left(1-t\left(x\right)\right) \tag{1} $$
From Eq. (1), the clean (haze-free) image $J\left(x\right)$ can be expressed as:

$$ J\left(x\right)=\frac{I\left(x\right)-\alpha }{t\left(x\right)}+\alpha \tag{2} $$
In Eq. (1), $I\left(x\right)$ is the hazy image, $J\left(x\right)$ is the clean image, $t\left(x\right)$ is
the transmission map, and $\alpha$ is the global atmospheric light. The transmission
map $t\left(x\right)$ [1,6] can be expressed as:

$$ t\left(x\right)=e^{-\beta d\left(x\right)} \tag{3} $$
In Eq. (3), $\beta$ is the scattering coefficient, and $d\left(x\right)$ is the depth (distance).
Eq. (3) shows that depth information affects the transmission values $t\left(x\right)$.
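The relationships in Eqs. (1)-(3) can be written directly in code. The following NumPy sketch is illustrative only; the clipping of the transmission value is a common safeguard and not something specified in the paper.

```python
import numpy as np

def transmission_from_depth(depth, beta=1.0):
    """Eq. (3): t(x) = exp(-beta * d(x))."""
    return np.exp(-beta * depth)

def dehaze(hazy, t, alpha, t_min=0.1):
    """Eq. (2): J(x) = (I(x) - alpha) / t(x) + alpha.

    The transmission is clipped from below to avoid amplifying noise
    where t(x) is close to zero (a common safeguard, not stated in the paper).
    """
    t = np.clip(t, t_min, 1.0)[..., np.newaxis]  # broadcast over the RGB channels
    return np.clip((hazy - alpha) / t + alpha, 0.0, 1.0)
```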
                  			
               
               
                  				Typical image dehazing methods estimate the transmission map. In some previous
                  works, the depth information was used for a similar problem of fog removal. Fog effects
                  have been removed using depth estimation, which is based on the assumption that the
                  difference between brightness and saturation becomes larger as the depth becomes larger
                  [19]. Depth values have been estimated from the degree of blur for single-image fog removal
                  [20]. Unlike previous studies, we propose the application of depth information with deep
                  learning for the estimation of transmission maps. In the proposed method, the depth
                  is estimated using deep learning methods, which give more accurate depth values.
                  			
               
               
                  				DehazeNet [6] is a typical haze removal method that uses CNNs to estimate the transmission map
                  and obtain a dehazed image from it based on an atmospheric scattering model. It requires
                  guided image filtering as a post-processing procedure to refine the transmission map.
                  An advantage of this method is that CNNs are used for deep learning to solve the image
                  dehazing problem.
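For illustration, such transmission-map refinement can be performed with the guided filter available in OpenCV's ximgproc module (opencv-contrib-python); the radius and eps values below are illustrative choices, not the settings used in DehazeNet.

```python
import cv2
import numpy as np

def refine_transmission(hazy_bgr, t_coarse, radius=40, eps=1e-3):
    """Refine a coarse transmission map with guided image filtering,
    using the grayscale hazy image as the guide."""
    guide = cv2.cvtColor(hazy_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32) / 255.0
    return cv2.ximgproc.guidedFilter(guide, t_coarse.astype(np.float32), radius, eps)
```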
                  			
               
               
Unlike the DehazeNet method [6], the AOD-Net method [7] combines the transmission map and the global atmospheric light into a
single parameter function $K\left(x\right)$, which can be learned through deep learning
networks. An advantage of AOD-Net is that it is an end-to-end deep learning network.
The densely connected pyramid dehazing network (DCPDN) [13] is a GAN-based method that can produce images close to haze-free images. In this
method, separate networks are used to learn the transmission map and the global atmospheric
light, and a joint discriminator is used to generate a haze-free image. A recent
method called FFA-Net [8] learns channel attention and pixel attention maps on a block-by-block basis. It removes
the haze by learning feature maps through residual connections and combining them with the hazy input image.
               
               
Deep learning networks have been used for depth estimation as well as haze removal.
The Monodepth method [14] learns from stereo images and can predict depth information from a single image,
whereas the Densedepth method [10] estimates depth information using transfer learning. In the Monodepth method [14], KITTI stereo image pairs [15] are used for training: a disparity map is estimated from the left image
and required to be consistent with the right image. The right image and the disparity map
estimated from the left image are used to calculate the reconstruction error. As this process repeats, the network learns
to synthesize the view in the opposite direction, which allows stereo images and depth
information to be obtained from a single image.
                  			
               
               
Monodepth2 [9] is a follow-up to the Monodepth method [14]. It exploits the characteristics of the KITTI dataset [15], which was constructed from consecutive images captured by a moving car. In Monodepth2,
the results can be corrected through reprojection, which leads to fewer errors
in the synthesized stereo images. The Densedepth method [10] uses an encoder-decoder structure whose layers are interconnected
with skip connections. In this method, the KITTI data [15] and NYU Depth V2 data [16] can be used to estimate depth information for both indoor and outdoor images.
                  
                  			
               
               
Table 1. Parameters of depth estimation networks.

| Parameter | Monodepth2 | Densedepth |
| Training dataset | KITTI dataset | NYU2 depth dataset, KITTI dataset |
| Batch size | 12 | 4 |
| Epochs | 20 | 20 |
| Learning rate | 0.0001 | 0.0001 |
| Min depth | 0.1 | 10 |
| Max depth | 100.0 | 1000 |
| Optimizer | Adam | Adam |
                
               
                     Fig. 1. Atmospheric scattering model.
 
               
                     Fig. 2. Sequence diagram of the proposed method.
 
               
                     Fig. 3. Network structure of the Monodepth2 method for depth estimation.
 
             
            
                  3. The Proposed Method
               
The proposed method generates a transmission map indirectly from a depth map,
which is produced by existing deep-learning-based depth estimation networks.
Then, we obtain a dehazed image from the transmission map based on an atmospheric scattering
model. Fig. 2 shows a sequence diagram of the proposed method.
                  			
               
               
                  				As shown in Fig. 2, the depth map is estimated before the estimation of the transmission map. For the
                  depth map, a training process is performed using depth estimation networks based on
                  deep learning. After the training process is complete, a depth estimation model is
                  obtained. Once the depth estimation model is obtained, image dehazing can be performed
                  on a hazy image. For a hazy input image, we estimate the depth map using the depth
                  estimation model. The transmission map is estimated from the depth map using the relationship
                  described in Eq. (3). Finally, we obtain the dehazed image using the atmospheric scattering model described
                  in Eq. (2).
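
The flow in Fig. 2 can be summarized in a short sketch. Here, estimate_depth stands for any trained depth estimation model (Monodepth2 or Densedepth), and the helper names, depth normalization, and default parameter values are illustrative placeholders rather than the authors' implementation.

```python
import numpy as np

def dehaze_pipeline(hazy_rgb, estimate_depth, alpha=0.8, beta=1.0, t_min=0.1):
    """Hazy image -> depth map -> transmission map -> dehazed image.

    hazy_rgb: float image in [0, 1], shape (H, W, 3).
    estimate_depth: callable returning a depth map of shape (H, W),
                    e.g. a trained Monodepth2 or Densedepth model.
    alpha, beta: global atmospheric light and scattering coefficient
                 (the values here are illustrative).
    """
    depth = estimate_depth(hazy_rgb)
    depth = depth / depth.max()                   # normalize relative depth (assumption)
    t = np.exp(-beta * depth)                     # Eq. (3)
    t = np.clip(t, t_min, 1.0)[..., np.newaxis]   # avoid division by near-zero transmission
    dehazed = (hazy_rgb - alpha) / t + alpha      # Eq. (2)
    return np.clip(dehazed, 0.0, 1.0)
```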
                  			
               
               
Two methods were tested for the depth estimation model. The first is Monodepth2
[9], which corrects the training loss by supplying additional pose information
to the depth estimation network. The second is Densedepth [10], which uses transfer learning with both indoor and outdoor image data.
                  			
               
               
The network structure of the Monodepth2 method is shown in Fig. 3. The depth network is based on the U-Net structure, which enables the prediction
of the overall depth information. The pose network assists in predicting the depth
information from the movement of objects between the front and rear image frames.
Using the information from the pose network, the networks adjust their parameters to
generate a depth map. For this method, training is conducted using the KITTI dataset
[15] in three configurations: mono images only, stereo images only, and both mono and stereo images.
                  			
               
               
Fig. 4 presents the detailed network structure of the Densedepth method [10]. The encoder network was originally developed for image classification, and the encoder-decoder
structure is used to estimate the depth. This method is trained with
the KITTI data and NYU2 depth data; the NYU2 depth data contain indoor scenes, and the KITTI
data contain outdoor scenes.
                  
                  
                  			
               
             
            
                  4. Experimental Results
               
                  				Table 1 presents the parameters used for training Monodepth2 and Densedepth in the experiments.
                  			
               
               
                     4.1 Results of Depth Estimation Networks
                  
                     					Fig. 5 shows the experimental results using Monodepth2. Fig. 5(a) shows a test image of the Berkeley dataset [17] as the input. Figs. 5(b)-(d) show the resulting depth maps using mono images as the training data. Figs. 5(b) and (c) show the results of training with image sizes of 640 ${\times}$ 192 and 1024 ${\times}$
                     320, respectively.
                     				
                  
                  
Fig. 5(d) shows the result of training with an image size of 640 ${\times}$ 192, the same as in
Fig. 5(b), but without applying the pose network. Figs. 5(e)-(g) show the resulting depth map images using stereo images as the training data.
Figs. 5(e) and (f) show the results of training with image sizes of 640 ${\times}$ 192 and 1024 ${\times}$
320, respectively. Fig. 5(g) shows the resulting depth map without applying the pose network with a size of 640
${\times}$ 192, which is the same as the size in Fig. 5(e).
                     				
                  
                  
Figs. 5(h)-(j) show the resulting depth maps using both mono and stereo images as the training data.
Figs. 5(h) and (i) show the results of training with sizes of 640 ${\times}$ 192 and 1024 ${\times}$
320, respectively. Fig. 5(j) shows the result of training without applying the pose network with a size of 640${\times}$192,
which is the same as the size in Fig. 5(h). There is a tendency for small objects to be perceived at farther distances and for
large objects to be perceived at nearer distances. If the pose network is not applied,
the overall depth outlines are blurred, and the depth estimation values become less
accurate.
                     				
                  
                  
Fig. 6 shows the experimental results obtained using Densedepth. The encoder part of the Densedepth
network was set as DenseNet-169 [21] for the experiments. We compared the results of depth estimation for indoor and outdoor
images using the NYU2 depth dataset and KITTI dataset.
                     				
                  
                  
                     
                     					Fig. 6(a) shows the indoor image data [16] used as the input for Figs. 6(b) and (c). Fig. 6(d) shows the outdoor image data [19] used as the input for Figs. 6(e) and (f). Figs. 6(b) and (e) show the depth map images obtained by training using the NYU2 depth dataset. Figs. 6(c) and (f) show the depth map images obtained by training using the KITTI dataset.
                     				
                  
                  
                     					The NYU2 depth dataset provides indoor image data, and the KITTI dataset provides
                     outdoor image data, so the resulting depth maps are different. With the NYU2 depth
                     dataset, the results preserve more edges of objects. The results obtained with the
                     NYU2 depth dataset show more detailed depth results than those obtained with the KITTI
                     dataset for indoor images. For the outdoor images, the results obtained with the NYU2
                     depth dataset cannot predict the overall depth map, while the results with the KITTI
                     dataset can predict the depth map more evenly.
                     
                     				
                  
                  
                        Fig. 4. Detailed network structure of the Densedepth method (a) Encoder network, (b) Decoder network (AV: average pooling, CC: concatenate, CV: convolution, DB: dense block, GAP: global average pooling, MP: max pooling, SM: softmax, US: up-sampling).
 
                  
                        Fig. 5. Results of depth estimation by training using the various configurations of data (a) Input image, (b) Result with mono image (640${\times}$192), (c) Result with mono image (1024 ${\times}$ 320), (d) Result with mono image (640 ${\times}$ 192) without the pose network, (e) Result with stereo images (640 ${\times}$ 192), (f) Result with stereo images (1024 ${\times}$ 320), (g) Result with stereo images (640 ${\times}$ 192) without the pose network, (h) Result with mono and stereo images (640 ${\times}$ 192), (i) Result with mono and stereo images (1024 ${\times}$ 320), (j) Result with mono and stereo images (640 ${\times}$ 192) without the pose network.
 
                  
                        Fig. 6. Results of depth estimation by training using the various configurations of the dataset (a) Indoor input image, (b) Result depth map by [10] with NYU2 depth training data, (c) Result depth map by [10] with KITTI training data, (d) Outdoor input image, (e) Result depth map by [10] with NYU2 depth training data, (f) Result depth map by [10] with KITTI training data.
 
                
               
                     4.2 Transmission Map and Dehazed Image Obtained using the Proposed Method
                  
For depth estimation, we used the previously described depth estimation networks
[9,10]. Both networks were implemented in TensorFlow. For the test, unannotated
real-world hazy images were used [18]. Haze removal experiments were carried out by converting the depth map into the transmission
map using the relationship described in Eq. (3). Figs. 7 and 8 show the process of image dehazing using the proposed method.
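
The experiments also require a value for the global atmospheric light α, which the paper treats as a given parameter. One common heuristic, borrowed from dark-channel-based dehazing and used here purely as an illustrative stand-in, is to average the brightest pixels of the hazy image:

```python
import numpy as np

def estimate_atmospheric_light(hazy, top_fraction=0.001):
    """Estimate the global atmospheric light as the mean color of the
    brightest pixels in the hazy image (a common heuristic, shown here
    only as an illustrative stand-in for a chosen alpha value)."""
    h, w, _ = hazy.shape
    brightness = hazy.min(axis=2)                  # per-pixel minimum over RGB channels
    n = max(1, int(h * w * top_fraction))
    idx = np.argsort(brightness.ravel())[-n:]      # indices of the haziest pixels
    return hazy.reshape(-1, 3)[idx].mean(axis=0)
```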
                     				
                  
                  
Figs. 7(a) and 8(a) show the input hazy images [18]. Figs. 7(b) and 8(b) show the depth maps estimated from the input images using the
depth estimation model. Figs. 7(c) and 8(c) show the visualized transmission maps obtained from the depth maps. Figs. 7(d) and 8(d) show the dehazed images after haze removal is carried out from the transmission
map using the atmospheric scattering model described in Eq. (2). In these results, the depth value becomes lower for nearby objects, and the transmission
value becomes higher. In addition, objects at farther distances change more after dehazing.
                     				
                  
                  
                        Fig. 7. Process of image dehazing using the proposed method (a) Input image, (b) Depth information, (c) Transmission map (visualized), (d) Dehazed image obtained using the proposed method.
 
                  
                        Fig. 8. Process of image dehazing using the proposed method (a) Input image, (b) Depth map, (c) Transmission map (visualized), (d) Dehazed image obtained using the proposed method.
 
                
               
                     4.3 Result Comparison of Proposed Method and DehazeNet
                  
DehazeNet [6] directly generates a transmission map from a hazy image using CNNs. Comparisons
of the results for hazy natural outdoor images for the proposed method and the DehazeNet
method are shown in Figs. 9-11. Figs. 9(a)-(c) show the hazy images obtained from the Bdd100k dataset [17]. Figs. 10(a)-(c) show the dehazed images obtained using the DehazeNet method. Figs. 11(a)-(c) show the dehazed images obtained using the proposed method.
                     				
                  
                  
In the results of the DehazeNet method, the haze effects are sufficiently removed, but the
changes in the intensity levels are large, and the roads are excessively darkened
because the bright parts of the road are recognized as haze. In the results obtained using the proposed method,
the haze is removed more evenly, and the road parts are dehazed correctly while preserving
the intensity levels without any darkening effects.
                     				
                  
                  
We also performed experiments using the Synthetic Objective Testing Set (SOTS)
[18] to compare the image dehazing results of DehazeNet and the proposed method
with Monodepth2 and Densedepth. Figs. 12 and 13 show the results with the PSNR values,
which were calculated using the ground-truth images of SOTS. The figures show that
the proposed method gives better PSNR results for image dehazing than the DehazeNet
method. The dehazed images from DehazeNet are darker than those from the proposed
method.
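
For reference, the reported PSNR values can be computed against the SOTS ground-truth images in the standard way; a minimal sketch assuming 8-bit images of identical size:

```python
import numpy as np

def psnr(dehazed, ground_truth, max_val=255.0):
    """Peak signal-to-noise ratio between a dehazed image and its ground truth."""
    mse = np.mean((dehazed.astype(np.float64) - ground_truth.astype(np.float64)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)
```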
                     
                     				
                  
                  
Fig. 9. Hazy images obtained using the Bdd100k dataset (a) Hazy image of a street, (b) Hazy image of a road with heavy traffic, (c) Hazy image of roadside trees.
 
                  
                        Fig. 10. Dehazed images obtained using the DehazeNet method (a) Dehazed image of the street, (b) Dehazed image of a road with heavy traffic, (c) Dehazed image of roadside trees.
 
                  
Fig. 11. Dehazed images obtained using the proposed method (a) Dehazed image of the street, (b) Dehazed image of a road with heavy traffic, (c) Dehazed image of roadside trees.
 
                  
                        Fig. 12. Image dehazing results with PSNR values obtained by the proposed method and DehazeNet (a) Hazy image of cityscapes, (b) Dehazed image using the proposed method with Monodepth2 (PSNR: 28.30), (c) Dehazed image using the proposed method with Densedepth (PSNR: 29.35), (d) Dehazed image using DehazeNet (PSNR: 27.60), (e) Groundtruth image.
 
                  
                        Fig. 13. Image dehazing results with PSNR values obtained by the proposed method and DehazeNet (a) Hazy image of roadways, (b) Dehazed image using the proposed method with Monodepth2 (PSNR: 29.15), (c) Dehazed image using the proposed method with Densedepth (PSNR: 28.50), (d) Dehazed image using DehazeNet (PSNR: 27.99), (e) Groundtruth image.
 
                  
                        Fig. 14. Results of a highly-hazed road image with various scattering coefficient values (a) Input image, (b) β = 0.2, (c) β = 0.4, (d) β = 0.6, (e) β = 0.8, (f) β = 1.0, (g) β = 1.2.
 
                  
                        Fig. 15. Results of an airport image with various scattering coefficient values (a) Input image, (b) β = 0.2, (c) β = 0.4, (d) β = 0.6, (e) β = 0.8, (f) β = 1.0, (g) β = 1.2.
 
                  
                        Fig. 16. Results of a parking lot image with various scattering coefficient values (a) Input image, (b) β = 0.2, (c) β = 0.4, (d) β = 0.6, (e) β = 0.8, (f) β = 1.0, (g) β = 1.2.
 
                  
Fig. 17. Results of a creek between buildings with various scattering coefficient values (a) Input image, (b) β = 0.2, (c) β = 0.4, (d) β = 0.6, (e) β = 0.8, (f) β = 1.0, (g) β = 1.2.
 
                  
Fig. 18. Results of a park image with various scattering coefficient values (a) Input image, (b) β = 0.2, (c) β = 0.4, (d) β = 0.6, (e) β = 0.8, (f) β = 1.0, (g) β = 1.2.
 
                
               
                     4.4 Comparison of the Results with Respect to β 
                  
The degree of change in the transmission map with respect to the depth value can be adjusted using
the scattering coefficient β in Eq. (3). Figs. 14-18 show the results obtained with β values of 0.2, 0.4, 0.6, 0.8,
1.0, and 1.2. As β increases, the transmission map changes more strongly with depth,
and the dehazed image differs more from the input hazy image. Conversely, as β decreases,
the transmission map changes less with depth.
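
A sweep like the one behind Figs. 14-18 can be scripted on top of the pipeline sketch from Section 3; the function below is an illustrative helper, not the authors' code.

```python
# dehaze_pipeline and estimate_depth are the illustrative placeholders from Section 3;
# hazy_rgb is a float RGB image in [0, 1].
def beta_sweep(hazy_rgb, estimate_depth, betas=(0.2, 0.4, 0.6, 0.8, 1.0, 1.2), alpha=0.8):
    """Dehaze one image with several scattering coefficients for comparison."""
    # Larger beta -> t(x) decays faster with depth -> stronger changes in distant regions.
    return {beta: dehaze_pipeline(hazy_rgb, estimate_depth, alpha=alpha, beta=beta)
            for beta in betas}
```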
                     				
                  
                  
In these results, a β value of 1.0 produces the most appropriate dehazed images.
Depending on the characteristics of the original hazy image, a high β value can create
excessively dark areas in high-contrast regions such as shadows. If the estimated depth
information does not match the original image well, a high β value may result in artifacts
due to errors in the depth information. If the depth information matches the original image
reasonably well, a higher β value provides better dehazing effects. When applying the depth
estimation network trained with the KITTI dataset, the dehazing process was performed relatively
well for road environments. The dehazing effects were relatively low when there
were buildings on both sides without any vehicles, as shown in Figs. 15 and 16.
                     
                     				
                  
                
             
            
                  5. Conclusion 
               
In this paper, we proposed a novel technique for image dehazing that indirectly
creates a transmission map through the estimation of a depth map, as opposed to
the direct estimation of the transmission map in previous image dehazing methods. The
dehazing results of the proposed method were superior to those of previous methods
that generate the transmission map directly from the input image with post-processing.
However, the proposed method has limitations. First, because the training datasets mainly
cover road environments in daylight, depth map estimation could produce incorrect results
for other types of scenes. Second, the atmospheric light value needs to be set adaptively,
as each test set requires an appropriate atmospheric light value to be estimated for image dehazing.
Future research should be directed toward resolving these issues.
                  			
               
             
          
         
            
                  ACKNOWLEDGMENTS
               
This research was supported by the MSIT (Ministry of Science and ICT), Korea, under
the ITRC (Information Technology Research Center) support program (IITP-2020-2016-0-00465)
and supervised by the IITP (Institute for Information & communications Technology
Planning & Evaluation). This work was also supported by the National Research Foundation of
Korea (NRF) grant funded by the Korean government (MSIT) (NRF-2019R1H1A2079873).
                  			
               
             
            
                  
                     REFERENCES
                  
                     
                        
                        Narasimhan S. G., Nayar S. K., Jul. 2002, Vision and the atmosphere, Int. J. Comput.
                           Vision, Vol. 48, No. 3, pp. 233-254

 
                     
                        
                        Jingkun Z., Sep. 2015, Analysis of causes and hazards of China’s frequent hazy weather,
                           The Open Cybernetics & Systemics Journal, Vol. 9, pp. 1311-1314

 
                     
                        
Yan Z., Zhang H., Wang B., Paris S., Yu Y., 2016, Automatic photo adjustment using
   deep learning, ACM Trans. Graphics, Vol. 35, No. 2, pp. 11

 
                     
                        
                        Cowan C. K., Kovesi P. D., May. 1988, Automatic sensor placement from vision task
                           requirements, IEEE Trans. Pattern Analysis Machine Intelligence, Vol. 10, No. 3, pp.
                           407-416

 
                     
                        
                        Lee S., Maik V., Jang J., Shin J., Paik J., May. 2005, Noise-adaptive spatio-temporal
                           filter for real-time noise removal in low light level images, IEEE Trans. Consumer
                           Electronics, Vol. 51, No. 2, pp. 648-653

 
                     
                        
                        Cai B., Xu X., Jia K., Qing C., Tao D., Jan. 2016, DehazeNet: an end-to-end system
                           for single image haze removal, IEEE Trans. Image Processing, Vol. 25, No. 11, pp.
                           5187-5198

 
                     
                        
                        Li B., Peng X., Wang Z., Xu J-Z., Feng D., 2017, AOD-Net: all-in-one dehazing Network,
                           IEEE Int. Conf. Computer Vision, pp. 4770-4778

 
                     
                        
Qin X., Wang Z., Bai Y., Xie X., Jia H., 2019, FFA-Net: Feature
   fusion attention network for single image dehazing, arXiv preprint arXiv:1911.07559

 
                     
                        
Godard C., Aodha O. M., Brostow G. J., 2019, Digging into self-supervised monocular
                           depth estimation, IEEE Int. Conf. Computer Vision

 
                     
                        
Alhashim I., Wonka P., 2018, High quality monocular depth estimation via transfer
                           learning, arXiv e-prints, abs/1812.11941

 
                     
                        
                        McCartney E. J., 1976, Optics of the atmosphere: Scattering by molecules and particles,
                           New York, NY, USA: Wiley

 
                     
                        
                        He K., Sun J., Tang X., Dec. 2011, Single image haze removal using dark channel prior,
                           IEEE Trans. Pattern Analysis Machine Intelligence, Vol. 33, No. 12, pp. 2341-2353

 
                     
                        
                        Zhang H., Patel V. M., 2018, Densely connected pyramid dehazing network, IEEE Int.
                           Conf. Computer Vision Pattern Recognition, pp. 3194-3203

 
                     
                        
                        Godard C., Aodha O. M., Brostow G. J., 2017, Unsupervised monocular depth estimation
                           with left-right consistency, IEEE Int. Conf. Computer Vision Pattern Recognition,
                           pp. 270-279

 
                     
                        
                        Geiger A., Lenz P., Stiller C., Urtasun R., Sep. 2013, Vision meets robotics: The
                           KITTI dataset, International Journal of Robotics Research, Vol. 32

 
                     
                        
                        Silberman N., Hoiem D., Kohli P., Fergus R., 2012, Indoor segmentation and support
                           inference from rgbd images, European Conf. Computer Vision

 
                     
                        
                        Yu F., Chen H., Wang X., Xian W., Chen Y., Liu F., Madhavan V., Darrell T., 2020,
                           Bdd100k: a diverse driving dataset for heterogeneous multitask learning, IEEE Conf.
                           Computer Vision Pattern Recognition

 
                     
                        
Li B., Ren W., Fu D., Tao D., Feng D., Zeng W., Wang Z., Aug. 2019, Benchmarking
                           single image dehazing and beyond, IEEE Transactions on Image Processing, Vol. 28,
                           No. 1, pp. 492-505

 
                     
                        
                        Pal D., Arora A., 2018, Removal of fog effect from highly foggy images using depth
                           estimation and fuzzy contrast enhancement method, International Conference on Computing
                           Communication and Automation, pp. 1-6

 
                     
                        
                        Jiwani M. A., Dandare S. N., Jun 2013, Single image fog removal using depth estimation
                           based on blur estimation, International Journal of Scientific and Research Publications,
                           Vol. 3, No. 6, pp. 1-6

 
                     
                        
                        Huang G., Liu Z., Maaten L., Weinberger K. Q., 2017, Densely connected convolutional
                           networks, IEEE Conference on Computer Vision and Pattern Recognition, pp. 2261-2269

 
                   
                
             
            Author
             
             
             
            
            
               			Yejin Kim received a BSc in Software from Konkuk University, Korea, in 2018. Currently,
               she is a graduate student at the Department of Software at Konkuk University and a
               researcher in the Intelligent Image Processing Laboratory. Her research interests
               include image dehazing via deep learning.
               		
            
            
            
               			Changhoon Yim received a BSc from the Department of Control and Instrumentation
               Engineering, Seoul National University, Korea, in 1986, an MSc in Electrical and Electronics
               Engineering from the Korea Advanced Institute of Science and Technology in 1988, and
               a PhD in Electrical and Computer Engineering from the University of Texas at Austin
               in 1996. He worked as a research engineer at the Korean Broadcasting System from 1988
               to 1991. From 1996 to 1999, he was a member of the technical staff in the HDTV and
               Multimedia Division, Sarnoff Corporation, New Jersey, USA. From 1999 to 2000, he worked
               at Bell Labs, Lucent Technologies, New Jersey, USA. From 2000 to 2002, he was a Software
               Engineer at KLA-Tencor Corporation, California, USA. From 2002 to 2003, he was a Principal
               Engineer at Samsung Electronics, Suwon, Korea. Since 2003, he has been a faculty member
               and is currently a professor in the Department of Computer Science and Engineering,
               Konkuk University, Seoul, Korea. His research interests include digital image processing,
               video processing,  multimedia communication, and deep learning.