
1. School of Electrical, Electronics & Communication Engineering, Korea University of Technology and Education (KOREATECH), Cheonan 31253, Korea ({shine1606, dk970610, ks.choi}@koreatech.ac.kr)



Keywords: Color consistency, Color distribution, Point set registration, Image stitching, Superpixels

1. Introduction

Multi-view images acquired from different types of cameras exhibit color discrepancies because the colors of digital images vary considerably depending on the characteristics of the image sensors. Even images acquired with the same sensor can have color inconsistency due to various internal and external factors, such as shading depending on the lighting direction, the base region for white balance, and the exposure time. These color discrepancies cause performance degradation in consumer electronics applications that use multi-view images, such as video stitching for multiple camera-based surveillance systems [1,2], three-dimensional (3D) reconstruction [3,4], and image stitching for panoramic images on mobile phones [5]. For example, in the case of 3D reconstruction, correspondence matching between color-inconsistent images is prone to failure. As a result, the point cloud reconstructed from images with color inconsistency becomes very sparse and noisy.

Thanks to high-resolution cameras and powerful 3D graphics processors, panoramic images can be created instantly on mobile phones after taking multiple images [1]. However, color inconsistency among the multiple images results in very annoying visual artifacts within the panoramic image, especially along the boundaries of the overlapping areas. To mitigate the color inconsistency among multi-view images, various color correction methods have been proposed that impose the color characteristics of a target image on source images [6-11].

The most efficient method [6] transfers color characteristics by adjusting the mean and standard deviation of each color channel of the source image according to those of the target image. In another method [7], gamma correction is performed on the luminance component, while simple linear correction is used for the chrominance components. Color consistency enhancement methods that model the color remapping function as a spline curve have also been proposed [8-11]. However, these per-channel remapping approaches usually result in a color cast (white balance) problem.

By regarding each pixel color as a 3D point, a color image can be represented as a 3D point cloud in the RGB color space. In this sense, color transfer can be viewed as a transformation of this 3D point cloud, and improving the color consistency between two images can be cast as registering and transforming point clouds that are distributed differently in the RGB color space. In this paper, we propose a novel color correction method to enhance the color consistency of multi-view images. In contrast to conventional methods that determine transfer functions explicitly, the proposed method non-rigidly transforms the color point cloud of the source image into that of the target image by using a 3D point set registration algorithm.

Specifically, non-rigid registration between the target and source point clouds provides color correspondences between the two images. Although correspondence matching is not performed explicitly, the color change of corresponding pixels in the two images can be inferred. However, determining the color correspondences of all pixels incurs a huge computational burden. In order to significantly reduce the computation required for the 3D point cloud registration, we propose a point cloud simplification method. For each image, superpixel segmentation followed by k-means clustering is employed to determine representative color values. The point cloud of the representative colors effectively approximates the point cloud of the image, even though it contains significantly fewer points. The experimental results confirm that the proposed method outperforms conventional methods.

This work extends our previous results [12] in several important respects: the proposed method is explicated in more detail, recently presented methods are additionally implemented and compared, the number of datasets used is increased, and the execution times of the methods, which are critical for consumer electronic devices, are presented. The rest of the paper is organized as follows. In the following section, an overview of related work is given. In Section 3, the proposed color correction approach is explicated in detail. In Section 4, the proposed method is evaluated with objective and subjective comparisons. Finally, conclusions are presented in Section 5.

2. Related Work

2.1 Multi-view Image Color Correction

Color consistency among several images can be improved by fitting the color distribution of a source image I$_{s}$ to that of a target image I$_{t}$. The concept of color transfer was first proposed by Reinhard et al. [6], who aimed to propagate the color characteristics of I$_{t}$ to I$_{s}$. In order to decorrelate the color channels, the RGB signals of both I$_{t}$ and I$_{s}$ are converted to the decorrelated lαβ color space. Then, the distribution of each channel of I$_{s}$ is modified to have the same mean and standard deviation as the corresponding channel of I$_{t}$.
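As a rough illustration of this per-channel statistic matching, a minimal Python sketch is given below. The function name is ours, and CIELab (via scikit-image) is used here as a readily available stand-in for the lαβ space of [6]:

```python
import numpy as np
from skimage import color

def reinhard_transfer(src_rgb, tgt_rgb):
    """Per-channel statistic matching in the spirit of [6]. CIELab is used
    here as an available decorrelated space; [6] uses the l-alpha-beta space."""
    src = color.rgb2lab(src_rgb)
    tgt = color.rgb2lab(tgt_rgb)
    out = np.empty_like(src)
    for c in range(3):
        s, t = src[..., c], tgt[..., c]
        # Shift to the target mean and rescale to the target std. deviation.
        out[..., c] = (s - s.mean()) * (t.std() / (s.std() + 1e-12)) + t.mean()
    return color.lab2rgb(out)
```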

In one study [7], the overlapping area between adjacent linearized images is extracted, and the logarithmic mean of the luminance component and the means of the chrominance components are computed over the overlapping area of each image. The gamma values are estimated to minimize the differences between the logarithmic means of every adjacent pair of images. Similarly, the coefficients applied to the chrominance components are estimated by minimizing the differences between the chroma-means of every adjacent pair.
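The following minimal sketch illustrates the estimation idea for the two-image special case, where the pairwise minimization reduces to matching the log-means of the overlap in closed form; the helper names and the luminance normalization are our assumptions, not details of [7]:

```python
import numpy as np

def estimate_gamma(lum_src, lum_tgt, eps=1e-6):
    """Two-image special case: since mean(log L^gamma) = gamma * mean(log L),
    matching the log-means of the overlap gives gamma in closed form.
    Luminance values are assumed to be normalized to (0, 1]."""
    m_s = np.log(np.clip(lum_src, eps, 1.0)).mean()
    m_t = np.log(np.clip(lum_tgt, eps, 1.0)).mean()
    return m_t / m_s

def estimate_chroma_gain(ch_src, ch_tgt):
    """Linear coefficient matching the chroma means of the overlap,
    assuming offset-free chroma channels."""
    return ch_tgt.mean() / (ch_src.mean() + 1e-12)
```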

To achieve more accurate color mapping, Hwang et al. [11] proposed correcting each pixel's color with an independent affine model, which is the solution of a probabilistic moving least squares problem based on feature color correspondences. In the color correction algorithm of [8], a fifth-degree polynomial is determined for each color channel to transfer colors. Before determining the polynomial, the authors use the scale-invariant feature transform (SIFT) algorithm [13] to match correspondences between the source image I$_{s}$ and the target image I$_{t}$. Then, the fifth-degree polynomial is determined using least squares regression on the correspondences. Since extreme pixel values in very dark or bright regions tend not to be selected as features, the regressed polynomial cannot deal with such pixel values. To cope with this problem, the small number of correspondence samples below the 10th percentile and above the 90th percentile are regressed with one-dimensional linear models.

However, if the number of correspondences is small in the middle range, the fifth-degree polynomial will overfit and result in a distorted representation. In addition, this method requires the extraction of a sufficient number of features in the background and foreground for effective color correction. Otherwise, the color correction quality is degraded.
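A hedged sketch of this piecewise regression is given below; the function names are ours, and a sufficient number of correspondence samples in each segment is assumed:

```python
import numpy as np

def fit_pr_mapping(src_vals, tgt_vals):
    """Sketch of the per-channel remapping of [8]: a 5th-degree polynomial
    fitted to feature correspondences, with 1-D linear models for the tails
    (below the 10th / above the 90th percentile), where features are scarce.
    Assumes each tail contains at least two correspondence samples."""
    lo, hi = np.percentile(src_vals, [10, 90])
    mid = (src_vals >= lo) & (src_vals <= hi)
    poly = np.polyfit(src_vals[mid], tgt_vals[mid], deg=5)
    lin_lo = np.polyfit(src_vals[src_vals < lo], tgt_vals[src_vals < lo], deg=1)
    lin_hi = np.polyfit(src_vals[src_vals > hi], tgt_vals[src_vals > hi], deg=1)

    def remap(x):
        """Apply the piecewise mapping to an array of channel values."""
        y = np.polyval(poly, x)
        y = np.where(x < lo, np.polyval(lin_lo, x), y)
        y = np.where(x > hi, np.polyval(lin_hi, x), y)
        return np.clip(y, 0.0, 255.0)

    return remap
```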

In another study [10], instead of using feature matching, the 3D geometrical relation between the source and target images is utilized. This method can find more accurate correspondences by projecting 3D points onto both I$_{t}$ and I$_{s}$. However, the geometrical relation (the relative pose of the cameras) is usually unavailable or must be determined in advance through a calibration process, so this method is generally hard to apply. Furthermore, inaccurate regression due to a lack of correspondences can also occur, as in the method described earlier [8].

The cost function of the optimization-based method [9] consists of color and quality terms. The color term fits a quadratic spline that explains the color differences obtained through cumulative distribution function (CDF) matching. The quality term imposes constraints such as gradient preservation and stretching of the dynamic range. Colors are corrected as in the polynomial regression-based method, while image quality is preserved by the constraints.
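The CDF matching at the heart of the color term can be sketched as follows; this minimal version builds the monotone remapping directly by rank matching and omits the quadratic spline fit and the quality constraints of [9]:

```python
import numpy as np

def cdf_match(src_ch, tgt_ch, n_levels=256):
    """Monotone remapping that matches the empirical CDF of a source channel
    to that of the target channel (the data behind the color term of [9])."""
    q = np.linspace(0.0, 1.0, n_levels)
    s_q = np.quantile(src_ch, q)   # source values at equally spaced ranks
    t_q = np.quantile(tgt_ch, q)   # target values at the same ranks
    # Map each source value to the target value of equal rank.
    return np.interp(src_ch, s_q, t_q)
```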

2.2 Non-rigid 3D Point Set Registration

The iterative closest point (ICP) method is the most popular method for rigid registration of 3D point clouds due to its simplicity and low computational complexity [14-17]. ICP iteratively improves the pose of a point cloud with respect to the reference (target) point cloud by minimizing the overall distances between correspondences. However, if the initial guess of the pose is not close to the optimal pose, the registration easily fails.

The coherent point drift (CPD) algorithm [25] assigns correspondences between two point sets and recovers the transformation that maps one point set to the other using a probabilistic density estimation approach. One point cloud $\mathbf{Y}_{M\times D}=\left(\mathbf{y}_{1},\cdots ,\mathbf{y}_{M}\right)^{T}$, represented by Gaussian mixture model (GMM) centroids, is fitted to another point cloud $\mathbf{X}_{N\times D}=\left(\mathbf{x}_{1},\cdots ,\mathbf{x}_{N}\right)^{T}$ by maximizing the likelihood. In the CPD algorithm, both the correspondence matching and the transformation estimation are achieved through expectation-maximization (EM) optimization. The GMM centroids are parameterized by the transformation parameters $\theta $ and the variance of the Gaussians $\sigma ^{2}$. The GMM probability density function of the CPD method can be written as:

(1)
$ p\left(\mathbf{x}\right)=\frac{1}{M}\sum _{m=1}^{M}p\left(\mathbf{x}|\mathbf{y}_{m}\right), $

where $p\left(\mathbf{x}|\mathbf{y}_{m}\right)=\frac{1}{\left(2\pi \sigma ^{2}\right)^{D/2}}\exp \left(-\frac{\left\| \mathbf{x}-\mathbf{y}_{m}\right\| ^{2}}{2\sigma ^{2}}\right).$

In the E-step, the posterior probabilities of the mixture components, $P^{old}\left(\mathbf{y}_{m}|\mathbf{x}_{n}\right)$, are computed using the previous (``old'') parameter values:

(2)
$ P^{old}\left(\mathbf{y}_{m}|\mathbf{x}_{n}\right)=\frac{\exp \left(-\frac{1}{2}\left\| \frac{\mathbf{x}_{n}-T\left(\mathbf{y}_{m},\theta ^{old}\right)}{\sigma ^{old}}\right\| ^{2}\right)}{\sum _{k=1}^{M}\exp \left(-\frac{1}{2}\left\| \frac{\mathbf{x}_{n}-T\left(\mathbf{y}_{k},\theta ^{old}\right)}{\sigma ^{old}}\right\| ^{2}\right)} $

where $T\left(\mathbf{y}_{m},\theta \right)$ is a transformation applied to $\mathbf{Y}$.

Then, the parameter values are updated by minimizing the expectation of a negative log-likelihood function in the M-step. The objective function can be written as:

(3)
$Q\left(\theta ,\sigma ^{2}\right)=\frac{1}{2\sigma ^{2}}\sum _{n=1}^{N}\sum _{m=1}^{M}P^{old}\left(\mathbf{y}_{m}|\mathbf{x}_{n}\right)\left\| \mathbf{x}_{n}-T\left(\mathbf{y}_{m},\theta \right)\right\| ^{2}$.

The EM algorithm proceeds by alternating between E- and M-steps until convergence.

The GMM centroids move coherently as a group to preserve the topological structure of the point clouds. By properly imposing this coherence constraint through regularization, CPD can achieve both rigid and non-rigid registration. However, despite the existence of a fast algorithm for CPD, its computational complexity grows rapidly with the size of the point clouds because the probability estimation involving the Gaussian modeling in the E-step must be performed for every pair of points.
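To make the E-step of Eq. (2) concrete, a direct NumPy sketch is given below. It omits the uniform outlier component of the full CPD model, and the M-step, which also estimates the coherence-regularized transformation, is left to complete implementations (e.g., the open-source pycpd package):

```python
import numpy as np

def cpd_e_step(X, TY, sigma2):
    """Posterior responsibilities of Eq. (2). X: (N, D) data points,
    TY: (M, D) transformed GMM centroids T(y_m, theta_old), sigma2: variance.
    Returns P of shape (N, M) whose rows sum to one."""
    # Pairwise squared distances between data points and centroids.
    d2 = ((X[:, None, :] - TY[None, :, :]) ** 2).sum(axis=2)
    p = np.exp(-d2 / (2.0 * sigma2))
    return p / (p.sum(axis=1, keepdims=True) + 1e-300)
```

The (N, M) distance matrix computed here makes the per-pair cost explicit, which is exactly what motivates the point cloud simplification proposed in Section 3.1.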

3. The Proposed Method

Fig. 1 illustrates an overview of the proposed method. Given N multi-view images $\left\{\mathbf{I}_{i}\right\},i=1,\cdots ,N$, a target (reference) image I$_{t}$ is chosen from them, and each source image $\mathbf{I}_{s}\in \left\{\mathbf{I}_{i}\right\}\setminus \left\{\mathbf{I}_{t}\right\}$ is color-corrected according to I$_{t}$. To perform the point set registration in the color transfer module efficiently, the representative colors of I$_{t}$ and I$_{s}$ (referred to as the color point sets C$_{t}$ and C$_{s}$, respectively) are first obtained through the representative color point cloud approximation (RCPCA) module. Each set contains far fewer color points than the number of pixels in the corresponding image. Nevertheless, it is noteworthy that each point set effectively approximates the color distribution of its image, as shown in the three point cloud plots at the bottom of Fig. 1.

The CPD algorithm is employed to robustly obtain the transformation that registers C$_{s}$ to C$_{t}$. Because C$_{t}$ and C$_{s}$ do not contain all the color values of I$_{t}$ and I$_{s}$, the actual color change of each pixel in I$_{s}$ is determined by propagating the color change of its associated representative color to the pixel. An image with enhanced color consistency, I$_{s'}$, is produced through this color change propagation.

Fig. 1. The overview of the proposed method. The representative color point cloud approximation module approximates the color distribution of each image I with a much smaller number of representative color points C, which are obtained by clustering the average colors of the superpixels. The 3D point set registration determines non-rigid transformations V for matching correspondences between the reduced color point sets. Each pixel color of I$_{s}$ changes according to the color transfer of its associated cluster.

3.1 Representative Color Point Cloud Approximation

In the proposed method, the CPD method is employed to transform the point cloud representing the color distribution of a source image into the point cloud of the color distribution of the target image. However, because the iterative estimation of per-point probabilities requires a large amount of computation, applying CPD directly to the point cloud of an entire image is impractical for consumer electronic devices. Therefore, in the proposed method, the color distribution of each image is simplified through a faithful yet efficient approximation before performing CPD. That is, by obtaining far fewer representative color points, the number of points to be processed in the subsequent 3D point set registration is sufficiently reduced.

In order to obtain faithful representatives, the color values are deliberately determined through the following two processes in the RCPCA module. First, both I$_{t}$ and I$_{s}$ are over-segmented into homogeneous regions S$_{t}$ and S$_{s}$, respectively, using the simple linear iterative clustering (SLIC) algorithm [18-20], which has been widely used for preprocessing in various computer vision tasks. Notably, an average color value and a central position are also determined for each superpixel $S_{i}^{j}\in \mathbf{S}_{i}$ during the SLIC process.

Conventional k-means clustering iteratively computes the distances from each pixel to all the segment centroids to determine the closest one, to which the pixel is assigned. This strategy increases the computational complexity linearly with the number of segments. In contrast, in the SLIC algorithm, regardless of the number of segment centroids, only a few centroids $S_{i}^{j}$ near each pixel p are compared in terms of the distance, which is given as:

(4)
$ D\left(p,S_{i}^{j}\right)=\sqrt{D_{c}\left(p,S_{i}^{j}\right)^{2}+\lambda ^{2}\cdot D_{s}\left(p,S_{i}^{j}\right)^{2}}, $

where $S_{i}^{j}$ represents the j-th superpixel, $j=1,\cdots ,N_{S}$. $D_{c}\left(\cdot ,\cdot \right)$ and $D_{s}\left(\cdot ,\cdot \right)$ indicate the CIELab color distance and the spatial distance between $p$ and $S_{i}^{j}$, respectively, and $\lambda $ controls the compactness of the superpixels. The reduction in distance computation makes SLIC very fast while maintaining excellent boundary adherence.
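A sketch of this over-segmentation step using scikit-image's SLIC is shown below; the helper name is ours, $\lambda = 20$ from Table 1 is reused as the compactness parameter for illustration, and the accelerated variants [19,20] would be drop-in replacements:

```python
import numpy as np
from skimage.segmentation import slic

def superpixel_mean_colors(img, n_segments=2400, compactness=20):
    """SLIC over-segmentation and per-superpixel average colors.
    img: float RGB image of shape (H, W, 3) in [0, 1]."""
    labels = slic(img, n_segments=n_segments, compactness=compactness,
                  start_label=0)
    n = labels.max() + 1
    flat = labels.ravel()
    counts = np.bincount(flat, minlength=n).astype(float)
    # Per-channel color sums over each superpixel, normalized by size.
    sums = np.stack([np.bincount(flat, weights=img[..., c].ravel(),
                                 minlength=n) for c in range(3)], axis=1)
    return labels, sums / counts[:, None]
```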

Then, the representative color values C$_{t}$ and C$_{s}$ are determined by k-means clustering of the color values of the superpixels S$_{t}$ and S$_{s}$, respectively, where each representative color set C$_{i}$ has N$_{C}$ color values denoted by $C_{i}^{k}$. The number of representative color values is much smaller than the number of superpixels of each image $\left(N_{C}\ll N_{S}\right)$. Notably, almost identical representative color values can be obtained by applying k-means clustering [21] directly to I$_{t}$ and I$_{s}$ without over-segmentation. In this case, however, the computational complexity of the k-means clustering increases drastically, as will be shown in the experiments.

If N$_{S}$ is too small, the superpixel regions become large and inhomogeneous, resulting in inaccurate representative color values. This is why we first obtain a sufficient number of representative color points using superpixel segmentation and then further reduce the number of points with k-means clustering. Since $N_{S}$ is much smaller than the number of pixels, the k-means clustering is also performed quickly. As shown in Fig. 1, the point cloud of the representative color values faithfully approximates the color distribution of the original image. In addition, the association mapping $M$ between each pixel and the representative color $C_{i}^{k}$ to which the pixel belongs can also be obtained during the RCPCA process. This mapping is effectively exploited in the subsequent color change propagation process.
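Continuing the sketch, the representative colors C$_{i}$ and the association mapping M can be obtained with scikit-learn's k-means; N$_{C}$ = 128 follows Table 1, and the variable names are ours:

```python
from sklearn.cluster import KMeans

def representative_colors(sp_colors, labels, n_colors=128):
    """k-means over the superpixel mean colors. Returns the N_C
    representative colors C and the per-pixel association mapping M
    (each pixel inherits the cluster of its superpixel)."""
    km = KMeans(n_clusters=n_colors, n_init=4, random_state=0).fit(sp_colors)
    C = km.cluster_centers_        # (N_C, 3) representative colors
    M = km.labels_[labels]         # (H, W) representative-color index map
    return C, M
```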

3.2 Non-rigid 3D Point Set Registration

The CPD algorithm is performed to assign correspondences between the two representative color point clouds C$_{t}$ and C$_{s}$ and to obtain a transformed point cloud C$_{s'}$ by estimating the non-rigid transformation for each correspondence pair. Since C$_{t}$ and C$_{s}$ have the same number of color values, correspondences can be inferred through point proximity. Let $v^{k}$ denote the color change of $C_{s}^{k}$:

(5)
$v^{k}=C_{s'}^{k}-C_{s}^{k}$.

All the pixels associated with $C_{s}^{k}$ according to M are transferred by applying $v^{k}$ to them; i.e., by simply adding $v^{k}$ to their color values.

Fig. 2 shows a representative color point cloud C$_{s}$ and its transformed point cloud C$_{s'}$ together with the color change of a correspondence pair in the RGB color space. Because $C_{s}^{k}$ is obtained from and associated with superpixels, the pixels within each superpixel are transferred using the same $v^{k}$. Since superpixels are well aligned with object boundaries, this approach enhances the color consistency naturally without the visually annoying artifacts that can occur when adjacent pixels are transferred differently.
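A sketch of the registration and propagation steps is given below. It uses the open-source pycpd package as one available deformable CPD implementation (the paper does not prescribe a particular implementation), and the helper names are ours:

```python
import numpy as np
from pycpd import DeformableRegistration  # open-source CPD implementation

def transfer_colors(img_s, C_s, C_t, M):
    """Register the source representative colors C_s to the target colors
    C_t with non-rigid CPD, then propagate each color change v^k of Eq. (5)
    to the pixels associated with C_s^k via the mapping M."""
    reg = DeformableRegistration(X=C_t, Y=C_s)
    C_s_new, _ = reg.register()    # transformed point cloud C_s'
    V = C_s_new - C_s              # per-representative color changes v^k
    out = img_s + V[M]             # add v^k to every associated pixel color
    return np.clip(out, 0.0, 1.0)
```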

Fig. 2. Color transfer using non-rigid 3D point set registration. A representative color $C_{s}^{k}$ is transformed to $C_{s'}^{k}$, and the color change $v^{k}$ is obtained. The color values of the pixels associated with $C_{s}^{k}$ are transferred by $v^{k}$.

4. Experimental Results

In order to evaluate the performance of the proposed algorithm, the color transfer (CT) method [6], the gamma correction-based (GC) method [7], the polynomial regression-based (PR) method [8], and the optimization-based (OPT) method [9] were implemented and compared. We used 23 pairs from Middlebury [22], 1024 pairs from InStereo2K [23], 120 pairs from IVYLAB [24], and 17 pairs from our own multi-view dataset.

In each dataset, the left image was used as the target image. Since our dataset has a wider baseline than the other datasets and each image was taken with a different exposure time, large color discrepancies exist between its images.

The parameters of the proposed method are summarized in Table 1. $\lambda $ was set experimentally to control the effect of the spatial distance term of the SLIC algorithm so that the superpixels represent the image well. $N_{S}$, $N_{C}$, and the maximum number of iterations for CPD were set to suit the computational power of various consumer electronics. Fig. 3 shows the correction results for multi-view color discrepancy caused by auto-exposure in widely used consumer electronics, such as mobile phones and digital cameras. The target image is darker than the source image due to a short exposure time. The PR, OPT, and proposed methods produce tones more similar to the target image than the CT and GC methods. However, the PR and OPT methods, which rely on feature matching, failed to effectively transfer the colors of the yellow and cyan chairs due to a lack of features on the textureless objects. As shown in Fig. 3(f), the proposed method, which is based on color distribution matching, successfully corrected the colors of all objects and the background.

To evaluate the methods objectively, we compared structural similarity by averaging the structural similarity index measure (SSIM) [26] over the three color channels as follows:

(6)
$ SSIM\left(x,y\right)=l\left(x,y\right)\cdot c\left(x,y\right), $
Table 1. Parameter Settings for the Proposed Method.

Parameter                                 Value
$\lambda $                                20
$N_{S}$                                   2400
$N_{C}$                                   128
Maximum number of iterations for CPD      100

where x and y indicate the windows being compared. The luminance term $l(x,y)$ and the contrast term $c(x,y)$ are calculated as follows:

(7)
$ l(x,y)=\frac{2\mu _{x}\mu _{y}+c_{1}}{\mu _{x}^{2}+\mu _{y}^{2}+c_{1}}, $
(8)
$ c(x,y)=\frac{2\sigma _{x}\sigma _{y}+c_{2}}{\sigma _{x}^{2}+\sigma _{y}^{2}+c_{2}}, $

where $\mu _{x}$ and $\mu _{y}$ represent the average colors, and $\sigma _{x}$ and $\sigma _{y}$ represent the standard deviations of the colors of windows x and y, respectively. $c_{1}$ and $c_{2}$ are stabilization terms that prevent division by a weak denominator. Table 2 shows an objective comparison of the methods in terms of SSIM for the four multi-view image datasets. The best and second-best values are presented in bold and red, respectively. The CT method scored the highest values for the three datasets that were taken with a small baseline and have small color differences. However, it achieved only a slight improvement on our dataset, which has a large baseline and large color differences. The GC and PR methods even degraded the color consistency for the small-baseline datasets.

The OPT and proposed methods improved the color consistency for all the datasets, and the proposed method achieved the highest value for our dataset and on average. The proposed method without over-segmentation also improved all the datasets, but its k-means clustering did not converge sufficiently due to the large number of pixels. As a result, inaccurate representative colors were obtained, and its performance was lower than that of the proposed method with over-segmentation.

As an additional objective evaluation, the methods were compared in terms of the peak signal-to-noise ratio (PSNR). PSNR measures the numerical similarity of corresponding pixels and is calculated as follows:

(9)
$ PSNR=10\log _{10}\left(\frac{R^{2}}{MSE}\right), $

where $R$ represents the maximum possible pixel value of the image. The mean squared error (MSE) is calculated as follows:

(10)
$ MSE\left(I_{1},I_{2}\right)=\frac{\sum _{M,N}\left[I_{1}\left(m,n\right)-I_{2}\left(m,n\right)\right]^{2}}{MN}, $

where $M$ and $N$ represent the numbers of rows and columns of the images.
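For reference, a minimal sketch of both measures as defined in Eqs. (6)-(10) is given below. The SSIM terms are computed over whole channels rather than local windows for brevity, and the stabilization constants are the common SSIM defaults, which the paper does not specify:

```python
import numpy as np

def ssim_two_term(x, y, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """Two-term SSIM of Eqs. (6)-(8): luminance and contrast terms only."""
    mu_x, mu_y = x.mean(), y.mean()
    sd_x, sd_y = x.std(), y.std()
    lum = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)   # Eq. (7)
    con = (2 * sd_x * sd_y + c2) / (sd_x ** 2 + sd_y ** 2 + c2)   # Eq. (8)
    return lum * con

def psnr(i1, i2, r=255.0):
    """PSNR of Eqs. (9)-(10) for 8-bit images."""
    mse = np.mean((i1.astype(float) - i2.astype(float)) ** 2)
    return 10.0 * np.log10(r ** 2 / mse)
```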

Fig. 3. Subjective comparison of various color correction methods: (a) color-inconsistent source image I$_{s}$; (b) image obtained using CT [6]; (c) image obtained using GC [7]; (d) image obtained using PR [8]; (e) image obtained using OPT [9]; (f) image obtained using the proposed method.
Table 2. Comparison of Structural Similarity (SSIM).

Dataset       Input image   CT [6]    GC [7]    PR [8]    OPT [9]   Proposed w/o over-seg.   Proposed
InStereo2K    0.9979        0.9989    0.9972    0.9962    0.9984    0.9982                   0.9984
IVYLAB        0.9988        0.9994    0.9986    0.9954    0.9990    0.9990                   0.9994
Middlebury    0.9991        0.9993    0.9990    0.9987    0.9993    0.9991                   0.9992
Ours          0.9371        0.9545    0.9525    0.9704    0.9863    0.9916                   0.9985
Avg.          0.9832        0.9880    0.9868    0.9902    0.9958    0.9970                   0.9989

Table 3. Comparison of Peak Signal-to-Noise Ratio (PSNR).

Dataset       Input image   CT [6]    GC [7]    PR [8]    OPT [9]   Proposed w/o over-seg.   Proposed
InStereo2K    17.399        17.428    17.354    17.421    17.281    17.443                   17.432
IVYLAB        16.689        16.798    16.677    16.976    16.620    16.801                   16.774
Middlebury    14.013        13.995    14.009    14.137    13.899    14.051                   14.046
Ours          16.674        16.776    17.476    18.153    18.121    18.206                   18.453
Avg.          16.194        16.249    16.379    16.672    16.480    16.625                   16.676

Table 4. Comparison of Execution Time (ms).

Resolution    CT [6]   GC [7]   PR [8]   OPT [9]   Proposed w/o over-seg.   Proposed
2880x1988     4.19     12.37    7.08     8.35      266.95                   6.31
1920x1080     1.66     6.00     3.21     3.44      88.33                    2.44
320x480       0.15     0.77     0.05     0.94      5.50                     0.41
Avg.          2.00     6.38     3.45     4.24      120.26                   3.05

Table 3 shows a comparison in terms of PSNR. The CT and GC methods showed low average values, similar to the structural similarity results. The PR method showed the highest values for the IVYLAB and Middlebury datasets, whereas it degraded the SSIM scores. The OPT method degraded the color consistency for all the datasets except ours.

The proposed method improved the color consistency for all the datasets and achieved the highest values for InStereo2K, our dataset, and on average. The proposed method without over-segmentation did slightly better than the proposed method on the InStereo2K, IVYLAB, and Middlebury datasets, since superpixels obtained using over-segmentation sometimes cross object boundaries in the highly complex textures of these datasets, resulting in slightly lower values. However, it is notable that only the proposed method improved color consistency for all the datasets and achieved the highest average value in both subjective and objective evaluations.

Fig. 4 shows how color consistency enhancement can improve the 3D reconstruction results obtained using a structure-from-motion (SfM) algorithm [4]. Since the background and the objects in our datasets 1 and 2 have little texture and only a few images were used, the 3D scenes were partially reconstructed. The SfM algorithm easily failed with the color-inconsistent images and thus produced very sparse 3D reconstructions. The 3D reconstructions from the color-corrected images produced by the CT and GC methods were also unsatisfactory.

More 3D points were produced from the images corrected by the OPT method, as shown in Fig. 4(e). However, textureless parts, including the wall, the floor, and the chairs, were still sparse. In contrast, the 3D reconstructions after color correction by the PR and proposed methods were significantly improved.

The proposed method produced better results than the PR method by generating more 3D points for the textureless objects. It is notable that a similar number of feature matches was initially found by the SfM algorithm regardless of the color correction method. However, the less consistent the colors were, the more correspondences were removed as outliers. Consequently, the largest number of 3D points was generated when using the images corrected by the proposed method.

Table 4 summarizes the execution times of the various methods for multi-view images on a PC platform (2.4-GHz CPU and 32 GB of RAM). The best and second-best values are presented in bold and red, respectively. The CT method took the least time overall thanks to the small amount of computation required to calculate the channel statistics. The PR method took about twice as long as the CT method, except for the smallest image, for which its feature extraction and matching were performed very quickly. Since the GC method performs optimization to determine the gamma and linear coefficients, it ran about three times slower than the CT method. The OPT method also requires iterative optimization and was about two times slower than the CT method.

The proposed method was the second fastest and achieved 13% higher efficiency than the PR method. However, when the representative colors were obtained by directly clustering the original color distribution without the over-segmentation process, the execution time increased about 39-fold. This confirms that the over-segmentation step successfully enhances the computational efficiency of the proposed method.

In Fig. 5, the color correction methods are compared when applied to image stitching. Fig. 5(a) shows an image obtained by stitching images with different exposures. The CT method did not effectively correct the image colors, and the boundaries of the overlapping areas remained very noticeable. The GC and OPT methods changed the color tones, but the visually annoying boundaries still remained. The PR method and the proposed method provided better, albeit different, results.

In Fig. 5(d), a color difference is perceptible along the boundaries of images 1 and 2 produced by the PR method, especially on the wall of the building. The proposed method produced more color-consistent images, which were stitched seamlessly, as shown in Fig. 5(f).

Fig. 4. 3D reconstruction results using: (a) color-inconsistent images; (b) images obtained by CT [6]; (c) images obtained by GC [7]; (d) images obtained by PR [8]; (e) images obtained by OPT [9]; (f) images obtained by the proposed method.
Fig. 5. Image stitching results using: (a) color-inconsistent input images; (b) images obtained by CT [6]; (c) images obtained by GC [7]; (d) images obtained by PR [8]; (e) images obtained by OPT [9]; (f) images obtained by the proposed method.

5. Conclusion

In this paper, an efficient color consistency enhancement method for multiple color-inconsistent images was proposed. In contrast to conventional methods that determine explicit color transfer functions, the proposed method reformulates color consistency enhancement as a color distribution matching problem, in which the color distribution of the source image is transformed into that of the target image using 3D point set registration. Through faithful simplification of the color distributions, the 3D non-rigid matching is performed accurately and efficiently, making the method suitable for consumer electronic devices. The experiments confirmed that the proposed method achieves excellent color consistency enhancement both subjectively and objectively, and that the proposed representative color point cloud approximation successfully enhances the computational efficiency.

ACKNOWLEDGMENTS

This work was supported by the BK-21 FOUR program through the National Research Foundation of Korea (NRF) under the Ministry of Education.

REFERENCES

1. Kim B.-S., Choi K.-A., Park W.-J., Kim S.-W., Ko S.-J., May 2017, Content-preserving video stitching method for multi-camera systems, IEEE Trans. Consum. Electron., Vol. 63, No. 2, pp. 109-116.
2. Xu W., Mulligan J., 2010, Performance evaluation of color correction approaches for automatic multi-view image and video stitching, in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 263-270.
3. Hartley R., Zisserman A., 2003, Multiple View Geometry in Computer Vision, NY, USA: Cambridge University Press.
4. Schönberger J. L., Frahm J., 2016, Structure-from-Motion Revisited, in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 4104-4113.
5. Brown M., Lowe D. G., Aug. 2007, Automatic panoramic image stitching using invariant features, Int. J. Comput. Vision, Vol. 74, pp. 59-73.
6. Reinhard E., Adhikhmin M., Gooch B., Shirley P., Jul. 2001, Color transfer between images, IEEE Comput. Graph. Appl., Vol. 21, No. 5, pp. 34-41.
7. Xiong Y., Pulli K., Nov. 2010, Color matching for high-quality panoramic images on mobile phones, IEEE Trans. Consum. Electron., Vol. 56, No. 4, pp. 2592-2600.
8. Jung J.-I., Ho Y.-S., 2013, Improved polynomial model for multi-view image color correction, J. Korea Info. Com. Society, Vol. 38c, pp. 881-886.
9. Xia M., Yao J., Gao Z., Nov. 2019, A closed-form solution for multi-view color correction with gradient preservation, ISPRS J. Photogram. Remote Sens., Vol. 157, pp. 188-200.
10. Shin D., Ho Y.-S., 2015, Color correction using 3D multi-view geometry, in Proc. SPIE Color Imaging XX, Vol. 9395.
11. Hwang Y., Lee J., Kweon I. S., Kim S. J., 2014, Color transfer using probabilistic moving least squares, in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 3342-3349.
12. Jeong H., Yoon B., Jeong H., Choi K.-S., Sep. 2021, Multi-view image color correction using 3D point set registration, in Proc. IEEE Int. Conf. Image Process. (ICIP), pp. 1744-1748.
13. Lowe D. G., 1999, Object recognition from local scale-invariant features, in Proc. Int. Conf. Comput. Vis. (ICCV), pp. 1150-1157.
14. Yang J., Li H., Jia Y., Dec. 2013, GO-ICP: Solving 3D registration efficiently and globally optimally, in Proc. Int. Conf. Comput. Vis. (ICCV), pp. 1457-1464.
15. Campbell D., Petersson L., Dec. 2015, An adaptive data representation for robust point-set registration and merging, in Proc. Int. Conf. Comput. Vis. (ICCV).
16. Serafin J., Grisetti G., Oct. 2015, NICP: Dense normal based point cloud registration, in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS), pp. 742-749.
17. Pomerleau F., Magnenat S., Colas F., Liu M., Siegwart R., Dec. 2011, Tracking a depth camera: Parameter exploration for fast ICP, in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS), pp. 3824-3829.
18. Achanta R., Shaji A., Smith K., Lucchi A., Fua P., Susstrunk S., May 2012, SLIC superpixels compared to state-of-the-art superpixel methods, IEEE Trans. Pattern Anal. Mach. Intell., Vol. 34, No. 11, pp. 2274-2282.
19. Choi K.-S., Oh K.-W., May 2016, Subsampling-based acceleration of simple linear iterative clustering for superpixel segmentation, Comput. Vis. Image Understanding, Vol. 146, pp. 1-8.
20. Oh K.-W., Choi K.-S., 2019, Acceleration of simple linear iterative clustering using early candidate cluster exclusion, J. Real-Time Image Process., Vol. 16, pp. 945-956.
21. Arthur D., Vassilvitskii S., 2007, k-means++: The advantages of careful seeding, in Proc. ACM-SIAM Symp. Discrete Algorithms, pp. 1027-1035.
22. Scharstein D., Hirschmüller H., Kitajima Y., Krathwohl G., Nesic N., Wang X., Sep. 2014, High-resolution stereo datasets with subpixel-accurate ground truth, in German Conf. Pattern Recognit., Germany, pp. 31-42.
23. Bao W., Wang W., Xu Y., Guo Y., Hong S., Zhang X., 2020, InStereo2K: A large real dataset for stereo matching in indoor scenes, Sci. China Info. Sci., Vol. 63, No. 11.
24. Jung Y. J., Sohn H., Lee S., Park H. W., Ro Y. M., Dec. 2013, Predicting visual discomfort of stereoscopic images using human attention model, IEEE Trans. Circuits Syst. Video Technol., Vol. 23, No. 12, pp. 2077-2082.
25. Myronenko A., Song X., Dec. 2010, Point set registration: Coherent point drift, IEEE Trans. Pattern Anal. Mach. Intell., Vol. 32, No. 12, pp. 2262-2275.
26. Horé A., Ziou D., 2010, Image quality metrics: PSNR vs. SSIM, in Proc. Int. Conf. Pattern Recognit., pp. 2366-2369.

Author

Hyeonwoo Jeong

Hyeonwoo Jeong received an M.S. degree in the Interdisciplinary Program in Creative Engineering at KOREATECH, South Korea, in 2022. His research interests include SLAM, image segmentation, and 3D data processing.

Dongkeun Kim

Dongkeun Kim is currently pursuing a B.S. degree in electronics engineering at Korea University of Technology and Education (KOREATECH).

Kang-Sun Choi

Kang-Sun Choi received B.S., M.S., and Ph.D. degrees in electronic engineering from Korea University in 1997, 1999, and 2003, respectively, with his Ph.D. research on nonlinear filter design. In 2011, he joined the School of Electrical, Electronics & Communication Engineering at Korea University of Technology and Education, where he is currently a professor. In 2017, he was a visiting scholar at the University of California, Los Angeles. From 2008 to 2010, he was a research professor in the Department of Electronic Engineering at Korea University. From 2005 to 2008, he worked at Samsung Electronics, Korea, as a senior software engineer. From 2003 to 2005, he was a visiting scholar at the University of Southern California. His research interests are in the areas of deep learning-based semantic segmentation, multimodal sensor calibration, human-robot interaction, and culture technology. He is a recipient of an IEEE International Conference on Consumer Electronics Special Merit Award (2012).