
  1. (School of Electrical Engineering, Korea University / 145 Anam-ro, Seongbuk-gu, 02841, Korea {jwon9339, jokim}@korea.ac.kr )



Color constancy, Review, Spatial, Temporal, Statistics-based, Alternating current

1. Introduction

Color constancy is the ability to recognize the inherent color of objects regardless of the illuminant. Although the human visual system has this ability, images captured with a digital camera are affected by the illuminant. Therefore, estimating and eliminating the illuminant is essential in the image signal processing pipeline. This process affects not only the visual quality but also the performance of computer vision tasks, such as classification and segmentation [1]. Because white balance is one of the important processes in the image signal processing pipeline, it has been studied for a long time. Fig. 1 presents the concept of color constancy. The goal of color constancy is to correct an image captured under a colored illuminant so that it appears as if captured under white light. This paper provides a comprehensive review of color constancy methods. Table 1 lists the color constancy methods introduced in this paper.

Conventional color constancy methods can be classified as statistics-, physics-, and learning-based. Statistics-based algorithms formulate a hypothesis that holds for natural scenes and exploit it for illuminant estimation. Although they have been used widely in commercial devices, they have limitations for uniformly colored regions with narrow color distributions. Physics-based methods use physical models of the surface reflection of light. However, accurately determining the model parameters of the ill-posed problem formulated by the physics model is highly challenging. With the development of deep learning, learned features have also been exploited for illuminant estimation and have contributed to performance improvements.

With the development of imaging technologies, high-speed cameras have been equipped in consumer devices. A high-speed camera can capture rapid changes and variations of a scene that are imperceptible to the human eye. Given this utility, high-speed cameras are expected to become common in consumer devices, such as smartphones.

Most color constancy studies have exploited a single image, while several works use the temporal features of multiple frames. Recently, some studies have exploited the temporal features of AC light sources for illuminant estimation. Fig. 2 shows the scenario of illuminant estimation under AC light sources. The intensity of light sources varies with time because they are supplied by alternating current (AC) power; the light flickers at twice the AC frequency. Although human eyes cannot capture this fast variation, it can be observed with a high-speed camera whose capturing speed is faster than the fluctuation. This temporal fluctuation can be a powerful prior for illuminant estimation. Several studies have exploited AC fluctuations for color constancy [14-18]. [14] first proposed exploiting the AC fluctuation for illuminant estimation: it selects AC pixels that follow sinusoidal curves and exploits them for dichromatic plane estimation. [16] proposed a temporal gradient map generated from the intensity difference between adjacent frames, assuming that more strongly illuminated regions show higher variations. In [15], all the parameters of the dichromatic model were estimated with temporal priors in which the reflection components fluctuate sinusoidally. [17,18] extracted the temporal correlation of high-speed video with a non-local neural network, and the temporal correlation map was used for temporal feature extraction. With the temporal fluctuation of AC light bulbs, [14-18] achieved more accurate illuminant estimation than the spatial-based methods.

There are various color constancy datasets for single-image methods, such as the Gehler-Shi Color Checker [20] and NUS [21] datasets. On the other hand, the temporal color constancy methods assume AC variation of the illuminant, so conventional datasets are unsuitable for their evaluation. Therefore, the high-speed video dataset [15] was proposed and used for experiments. The high-speed video dataset consists of various objects and illuminant conditions, including practical indoor scenes and laboratory environments.

This paper introduces various color constancy methods that exploit spatial and temporal features. The remainder of the paper is organized as follows. Section 2 describes the color constancy methods that exploit only spatial features. Section 3 summarizes the temporal color constancy methods. Experimental results are shown in Section 4. Finally, Section 5 concludes the paper.

Fig. 1. Concept of color constancy.
../../Resources/ieie/IEIESPC.2023.12.5.390/fig1.png
Table 1. Summary of color constancy methods.

Category             | Methods                                                                                                            | Input
Statistics           | Gray world [2], Max-RGB [3], Shades of gray [4], 1st order grey edge [5], 2nd order grey edge [5], Grey pixels [6] | Single image
Physics (spatial)    | IIC [7], ICC [8]                                                                                                   | Single image
Physics (temporal)   | Prinet et al. [13], Yoo et al. [14]                                                                                | High-speed video
Learning (spatial)   | Bianco et al. [9], FFCC [19], FC4 [10], ReWU [11]                                                                  | Single image
Learning (temporal)  | Ha et al. [16], DDME [15], Yoo et al. [17], Yoo et al. [18]                                                        | High-speed video

2. Spatial Color Constancy

The image value $f$ depends on the spectral curve of light source $e\left(\lambda \right)$, surface reflectance $s\left(\lambda \right)$, and camera sensitivity function $c\left(\lambda \right)$:

$ \begin{equation*} f=\left(R,G,B\right)^{T}=\int _{w}e\left(\lambda \right)s\left(\lambda \right)c\left(\lambda \right)d\lambda \end{equation*} $

where $\lambda $ and $w$ denote the wavelength and the visible spectrum, respectively. It is assumed that the color of the illuminant is uniform over the scene. The illuminant to be estimated in color constancy can be expressed as follows:

$ \begin{equation*} e=\left(R_{e},G_{e},B_{e}\right)^{T}=\int _{w}e\left(\lambda \right)c\left(\lambda \right)d\lambda \end{equation*} $

To correct the color-biased image with the estimated illuminant, the normalized light source color is used. The normalized illuminant vector $\hat{e}$ is exploited because the goal of correction is to change the color of the scene, not its brightness. The corrected pixel can be obtained as follows:

$ \begin{equation*} \left(R',G',B'\right)=\left(\frac{R}{\hat{e}_{R}},\frac{G}{\hat{e}_{G}},\frac{B}{\hat{e}_{B}}\right) \end{equation*} $
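In practice, this correction amounts to a per-channel gain. The following is a minimal sketch, assuming a linear RGB image stored as a NumPy array; the function name and the brightness-preserving scaling convention are illustrative and not taken from the reviewed papers.

```python
import numpy as np

def white_balance(image, illuminant):
    """Divide each channel by the normalized illuminant, as in the equation above.

    image:      float array of shape (H, W, 3), linear RGB.
    illuminant: estimated (R_e, G_e, B_e); only its direction is used.
    """
    e_hat = np.asarray(illuminant, dtype=float)
    e_hat = e_hat / np.linalg.norm(e_hat)        # unit-norm illuminant direction
    gains = (1.0 / np.sqrt(3.0)) / e_hat         # neutral light (1,1,1)/sqrt(3) -> unit gains
    return image * gains                         # broadcasts over the H x W pixels
```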

The following describes spatial color constancy methods, which are divided into statistics-, physics-, and learning-based.

2.1 Statistics-based Methods

The statistics-based methods exploit the statistical properties of a scene. The best-known assumption of the statistics-based approach is the gray world [2]. It assumes that the average reflectance of a scene under white illuminant is achromatic. The illuminant color is estimated with the average of the image.

$ \begin{equation*} \int f_{c}\left(x\right)dx=ke_{c},~ c\in \left\{R,G,B\right\} \end{equation*} $

where $k$ is a multiplicative constant and $e$ is a normalized illuminant vector.

Another assumption [3] is that there is a patch with perfect reflectance in a natural image. Because this patch shows high reflectance in the scene, the max value of each channel represents the illuminant.

$ \begin{equation*} \max f_{c}\left(x\right)=ke_{c} \end{equation*} $

[4] introduced the gray world and max-RGB algorithms as special cases of a more general color constancy algorithm based on the Minkowski norm:

$ \begin{equation*} \left(\frac{\int \left(f\left(x\right)\right)^{p}dx}{\int dx}\right)^{1/p}=ke \end{equation*} $

The gray world and max-RGB correspond to $p=1$ and $p=\infty$, respectively. [4] named this approach shades of gray and found that the best results are obtained with $p=6$.
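Since gray world, shades of gray, and max-RGB differ only in the Minkowski order $p$, they can share one implementation. A minimal sketch, assuming a linear RGB image as a NumPy array (function name and interface are illustrative):

```python
import numpy as np

def minkowski_illuminant(image, p=6):
    """Minkowski-norm family of estimators:
    p = 1 -> gray world, p = 6 -> shades of gray, p = inf -> max-RGB.

    image: float array of shape (H, W, 3).  Returns a unit-norm estimate.
    """
    pixels = image.reshape(-1, 3)
    if np.isinf(p):
        e = pixels.max(axis=0)                                    # max-RGB
    else:
        e = np.power(np.mean(np.power(pixels, p), axis=0), 1.0 / p)
    return e / np.linalg.norm(e)
```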

[5] assumed that the derivative of an image is achromatic, which is called the gray edge assumption. The illuminant can be estimated with the 1$^{\mathrm{st}}$ and 2$^{\mathrm{nd}}$ derivatives of an image. [5] generalized gray world, max-RGB, and shades of gray and proposed the gray edge algorithm as follows:

$ \begin{equation*} \left(\int \left| \frac{\partial ^{n}f^{\sigma }\left(x\right)}{\partial x^{n}}\right| ^{p}dx\right)^{1/p}=ke \end{equation*} $
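A sketch of the gray-edge estimator under the same assumptions, using Gaussian-derivative filters for the $n$-th order derivative; the choice of SciPy filters and the smoothing scale $\sigma$ are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gray_edge_illuminant(image, n=1, p=6, sigma=2.0):
    """Gray edge: Minkowski norm of the n-th order Gaussian derivative per channel.

    image: float array of shape (H, W, 3).  n is the derivative order (1 or 2).
    """
    e = np.zeros(3)
    for c in range(3):
        dx = gaussian_filter(image[..., c], sigma, order=(0, n))   # d^n/dx^n
        dy = gaussian_filter(image[..., c], sigma, order=(n, 0))   # d^n/dy^n
        mag = np.hypot(dx, dy)                                     # derivative magnitude
        e[c] = np.mean(mag ** p) ** (1.0 / p)
    return e / np.linalg.norm(e)
```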

Because the average reflectance of achromatic objects under colored illuminant is identical to illuminant chromaticity, [6] assumed that gray pixels exist in natural scenes. [6] defined an illuminant invariant measure to find gray pixels in a color-biased image. The illuminant can be estimated with the average of selected pixels that are close to gray.

2.2 Physics-based Methods

The physics-based methods exploit the physical interaction between the object and the light source. They are based on the dichromatic model:

$ \begin{equation*} I_{c}=m_{d}\Lambda _{c}+m_{s}\Gamma _{c},~ ~ c\in \left\{r,g,b\right\} \end{equation*} $

where$~ \Lambda _{c}$ and $\Gamma _{c}$ are diffuse and specular chromaticity, respectively, and $m_{d}$ and $m_{s}$ are their respective weights. Specular chromaticity is identical to the illuminant color, while diffuse reflection represents the intrinsic color of objects.
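For concreteness, a pixel under the dichromatic model is simply a weighted mixture of the diffuse and illuminant chromaticities. The toy sketch below (all numeric values hypothetical) makes the two-term structure explicit:

```python
import numpy as np

def dichromatic_pixel(m_d, m_s, diffuse_chroma, illum_chroma):
    """Forward dichromatic model: I_c = m_d * Lambda_c + m_s * Gamma_c."""
    return m_d * np.asarray(diffuse_chroma, float) + m_s * np.asarray(illum_chroma, float)

# Hypothetical example: a reddish surface under a bluish light.  Increasing m_s
# pulls the observed color toward the illuminant chromaticity (the specular term).
I = dichromatic_pixel(0.8, 0.3, diffuse_chroma=(0.6, 0.3, 0.1), illum_chroma=(0.3, 0.3, 0.4))
```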

In the inverse intensity chromaticity space [7], the linear relationship between RGB chromaticity ($\sigma _{c}=\frac{I_{c}}{\sum I_{i}}$) and inverse intensity was studied and is expressed as follows:

$ \begin{equation*} \sigma _{c}=p_{l}\frac{1}{\sum _{i}I_{i}}+\Gamma _{c} \end{equation*} $

where $p_{l}=m_{d}\left(\Lambda _{c}-\Gamma _{c}\right)$. Based on [7], Woo et al. [8] exploited selected specular pixels for accurate specular chromaticity estimation. [8] studied the relationship between the length of the dichromatic line and specularity. A reliable dichromatic line is obtained using pixels that are uniformly colored, as bright as possible, and produce a long line segment in chromaticity space.
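A simplified sketch of the inverse-intensity chromaticity idea: for specular pixels, plotting $\sigma_c$ against the inverse intensity gives a line whose intercept is the illuminant chromaticity $\Gamma_c$. The plain least-squares line fit below is only a stand-in for the voting and pixel-selection schemes actually used in [7,8]:

```python
import numpy as np

def iic_illuminant(pixels):
    """Illuminant chromaticity as the intercept in inverse-intensity chromaticity space.

    pixels: (N, 3) array, assumed to contain specular pixels from a
            (roughly) uniformly colored surface.
    """
    total = np.maximum(pixels.sum(axis=1), 1e-8)    # R + G + B per pixel
    inv_intensity = 1.0 / total                     # x-axis of IIC space
    gamma = np.zeros(3)
    for c in range(3):
        sigma_c = pixels[:, c] / total              # image chromaticity
        slope, intercept = np.polyfit(inv_intensity, sigma_c, 1)
        gamma[c] = intercept                        # Gamma_c as 1/sum(I) -> 0
    return gamma / gamma.sum()                      # normalize to a chromaticity
```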

2.3 Learning-based Methods

The learning-based models estimate the illuminant by learning features from a training dataset. [9] first introduced a CNN for color constancy: it extracts features with a CNN, feeds the reshaped features into fully connected layers, and estimates a three-dimensional illuminant vector. However, the network accepts only a patch of a given image, and if a patch with little illuminant information is selected, performance may suffer. To alleviate this problem, FC4 [10] estimates the illuminant from the full image. It generates a four-channel output: a local illuminant map (three channels) and a confidence map (one channel). The local illuminant map represents the illuminant of each local region, and the confidence represents the estimated accuracy of the local illuminant at the corresponding region. The global illuminant is finally estimated as the confidence-weighted sum of the local illuminants.
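The confidence-weighted pooling step can be summarized in a few lines. The sketch below assumes the local illuminant map and confidence map are already available as arrays (the function name is illustrative, and this covers only the pooling, not the full FC4 network):

```python
import numpy as np

def confidence_weighted_pooling(local_illum, confidence):
    """FC4-style pooling: global illuminant as a confidence-weighted sum.

    local_illum: (H, W, 3) local illuminant map.
    confidence:  (H, W) non-negative confidence map.
    """
    w = confidence / (confidence.sum() + 1e-8)               # normalized weights
    e = (local_illum * w[..., None]).sum(axis=(0, 1))        # weighted sum over regions
    return e / (np.linalg.norm(e) + 1e-8)
```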

Instead of extracting deep features with a large CNN, [11] reported that color constancy can be solved with a more lightweight model. Conventional algorithms treat some pixels as more important for illuminant estimation and detect and analyze those pixels. Following this idea, [11] proposed a feature map reweight unit that focuses on significant pixels.

3. Temporal Color Constancy

Some studies exploit the temporal features of image sequences. Yang et al. [12] assumed fixed objects and a moving camera under constant illuminant chromaticity. Two images (I and J) captured from different viewpoints are used. Let a pixel $p$ in I correspond to $\overline{p}$ in J. Then, the pixel values $I\left(p\right)$ and $J\left(\overline{p}\right)$ share the diffuse component, and their difference comes from the specular component. The normalized difference of matching pixels indicates the illuminant chromaticity:

$ \begin{equation*} \Gamma _{c}=\frac{I_{c}\left(t+\Delta t\right)-I_{c}\left(t\right)}{m_{s}\left(t+\Delta t\right)-m_{s}\left(t\right)}=\frac{\Delta I_{c}\left(t\right)}{\Delta m_{s}\left(t\right)} \end{equation*} $

Prinet et al. [13] enhanced Yang et al. [12] by formulating it in a probabilistic manner and achieved a robust illuminant estimation performance.
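A minimal sketch of the underlying difference-based estimate under ideal dichromatic conditions: matched pixel pairs are assumed given, and the threshold and median aggregation below are illustrative simplifications of the robust and probabilistic schemes used in [12,13].

```python
import numpy as np

def illuminant_from_pixel_pairs(I_t, I_t2, min_change=1e-3):
    """Illuminant chromaticity from matched pixel differences.

    I_t, I_t2: (N, 3) arrays of corresponding pixel values from two frames.
    The diffuse term cancels in the difference, so each difference vector is
    proportional to the illuminant chromaticity Gamma.
    """
    diff = np.abs(I_t2 - I_t)                     # |Delta m_s| * Gamma per pair
    mags = diff.sum(axis=1)
    keep = mags > min_change                      # require enough specular change
    chroma = diff[keep] / mags[keep, None]        # per-pair chromaticity estimate
    gamma = np.median(chroma, axis=0)             # robust aggregation over pairs
    return gamma / gamma.sum()
```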

Several studies assumed high-speed video captured under AC light sources. Yoo et al. [14] proposed using the AC variation of light sources for illuminant estimation. Previous methods [12,13] need to find corresponding pixels between frames, whereas in the AC light-based method, finding corresponding pixels in adjacent frames is easier because of the high-speed capture. The intensity of light sources supplied by alternating current fluctuates sinusoidally over time. This temporal fluctuation is captured with a high-speed camera and exploited to select AC pixels: pixels whose intensity varies sinusoidally due to the AC variation of the light sources. The mean intensity of the RGB channels ($I_{m}=\left(I_{R}+I_{G}+I_{B}\right)/3$) can be modeled as a sinusoidal curve:

$ \begin{equation*} I_{m}\left(t\right)\approx f\left(t,\Theta \right)=A_{m}\sin \left(4\pi f_{ac}t/f_{cam}+\phi \right)+off \end{equation*} $

$A_{m}$ is the amplitude; $f_{ac}$ is the AC mains frequency (typically 50 or 60 Hz); $f_{cam}$ is the frame rate; $off$ is the offset. The parameter set $\Theta =\left(A_{m},~ \phi ,~ off\right)$ is estimated iteratively using the Gauss-Newton method. Fig. 3 shows the modeling result of the intensity variation of the boxed regions with a sinusoidal curve. The illuminant is estimated from the intersection of the dichromatic planes of multiple AC pixels. Because of the short exposure time of high-speed imaging, the frames contain inherent low-light noise, which makes dichromatic model estimation challenging. By selecting AC pixels, noisy pixels are removed and an accurate dichromatic model can be estimated.
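A sketch of the sinusoidal fit of a mean-intensity trace. SciPy's least-squares curve fitting is used here as a stand-in for the Gauss-Newton iteration described in [14], and the default frequencies (60 Hz mains, 150 FPS) are assumptions taken from the dataset description later in this paper:

```python
import numpy as np
from scipy.optimize import curve_fit

def fit_ac_model(I_m, f_ac=60.0, f_cam=150.0):
    """Fit I_m(t) ~ A*sin(4*pi*f_ac*t/f_cam + phi) + off to a mean-intensity trace.

    I_m:   1-D array, mean RGB intensity of a pixel (or region) per frame.
    f_ac:  mains frequency in Hz (50 or 60); the light flickers at 2*f_ac.
    f_cam: camera frame rate in frames per second.
    """
    I_m = np.asarray(I_m, dtype=float)
    t = np.arange(len(I_m), dtype=float)                    # frame index
    omega = 4.0 * np.pi * f_ac / f_cam                      # known angular step per frame

    def model(t, A, phi, off):
        return A * np.sin(omega * t + phi) + off

    p0 = (0.5 * (I_m.max() - I_m.min()), 0.0, I_m.mean())   # initial guess
    (A, phi, off), _ = curve_fit(model, t, I_m, p0=p0)
    residual = np.mean((model(t, A, phi, off) - I_m) ** 2)  # fit error, e.g. to reject non-AC pixels
    return (A, phi, off), residual
```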

DDME [15] estimates all parameters ($m_{d},~ \Lambda ,~ m_{s}$, and $\Gamma $) of the dichromatic model with temporal features of high-speed video. The image pixels can be factorized into a chromaticity dictionary matrix $D$ and a coefficient matrix $C$ as follows:
$ \begin{equation*} I=DC \end{equation*} $

$D$ contains the illuminant and diffuse chromaticity dictionary, and $C$ represents the weight coefficients $m_{d}$ and $m_{s}$. The proposed network consists of two branches, and the chromaticity dictionary and the coefficients are estimated with each branch. Although the estimation of the dichromatic model is a highly ill-posed problem, [15] successfully estimated the parameters with temporal priors of a high-speed video. The chromaticity of diffuse reflection is approximately identical between adjacent frames because the camera and object are static over such a short interval. As the intensity of the AC light source varies, the intensities of the reflection components (diffuse and specular) also vary sinusoidally. Therefore, $m_{d}$ and $m_{s}$ are regularized to follow a sinusoidal variation. The mean coefficient weights are fitted with the Gauss-Newton method, as in [14], and the regression error is used as a loss function. With these temporal constraints, the network of [15] accurately learns the parameters of the dichromatic model. Color constancy and highlight removal can then be performed using the estimated dichromatic model.
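The sinusoidal prior on the coefficients can be expressed as a regression-error penalty. The sketch below reuses fit_ac_model() from the previous sketch and is only a simplified stand-in for the temporal loss actually used in [15]:

```python
import numpy as np

def sinusoidal_prior_loss(coeff_trace, f_ac=60.0, f_cam=150.0):
    """Temporal regularization term: penalize reflection coefficients (mean m_d
    or m_s per frame) that do not follow the expected AC flicker.
    Reuses fit_ac_model() from the earlier sketch.
    """
    _, residual = fit_ac_model(np.asarray(coeff_trace, dtype=float), f_ac, f_cam)
    return residual     # added to the network's training loss as a temporal prior
```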

Ha et al. [16] proposed using the temporal gradient of high-speed video for illuminant chromaticity estimation. The proposed network comprises two subnets that estimate a local illuminant map and a confidence map, which represent the local illuminant of each region and the estimated accuracy of the corresponding region, respectively. [16] assumed that strongly illuminated regions show high intensity variation. The temporal gradient of the video is calculated as the intensity difference between two adjacent gray frames, and the maximum is taken over the temporal gradient frames. Fig. 4 shows the generation process of the maximum gradient map, which is used to estimate the confidence map. The confidence map can detect the strongly illuminated regions that are useful for illuminant estimation, and accurate estimation is achieved by estimating the illuminant from these regions.
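A minimal sketch of the maximum temporal gradient map; converting to gray by channel averaging is an assumption here, the essential steps being the adjacent-frame difference and the per-pixel maximum over time described above:

```python
import numpy as np

def max_gradient_map(frames):
    """Maximum temporal gradient map over a short high-speed clip.

    frames: array of shape (T, H, W, 3).  Returns an (H, W) map in which
    strongly flickering (i.e., strongly illuminated) regions have large values.
    """
    gray = frames.mean(axis=3)               # (T, H, W) gray frames (channel average)
    grads = np.abs(np.diff(gray, axis=0))    # |I_{t+1} - I_t| per pixel
    return grads.max(axis=0)                 # per-pixel maximum over time
```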

[17,18] estimated the illuminant by extracting a temporal correlation map of high-speed video with a non-local block. The network of [17] consists of temporal and spatial branches. The temporal branch takes the input frames weighted by the temporal correlation map as input. Fig. 5 presents the extraction of temporal features with a non-local neural network [22]. Temporally attentive regions can be detected using the temporal features, and by weighting the input frames with the temporal correlation map, an illuminant map that contains AC variation information is obtained. As reported in [14], detecting AC-illuminated regions contributes to illuminant estimation. The spatial branch learns spatial features of a single image, as in the spatial color constancy networks. [17] improved the illuminant estimation performance by adding a spatial branch to the previous study [18]. [17] claimed that the temporal features contribute mainly to illuminant estimation in complex indoor scenes, while the spatial features help in simple laboratory scenes.
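A greatly simplified sketch of the temporal correlation computation: the non-local block of [22] operates on spatio-temporal feature maps, whereas the version below only correlates one pooled feature vector per frame to show the dot-product-plus-softmax structure used to weight the frames.

```python
import numpy as np

def temporal_correlation_weights(frame_features):
    """Dot-product correlation between per-frame features, softmax-normalized.

    frame_features: (T, C) array, one pooled feature vector per frame.
    Returns a (T, T) correlation map used to re-weight the input frames.
    """
    sim = frame_features @ frame_features.T                 # pairwise similarity
    sim = sim / np.sqrt(frame_features.shape[1])            # scale by feature dimension
    sim = sim - sim.max(axis=1, keepdims=True)              # numerical stability
    w = np.exp(sim)
    return w / w.sum(axis=1, keepdims=True)                 # softmax over frames
```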

Fig. 3. AC variation modeling of high-speed video.
../../Resources/ieie/IEIESPC.2023.12.5.390/fig3.png
Fig. 4. Generation of maximum gradient map. Reprinted from [16]. Copyright by IEEE.
../../Resources/ieie/IEIESPC.2023.12.5.390/fig4.png
Fig. 5. Temporal feature extraction based on the non-local neural method. Reprinted from [17]. Copyright by IEEE.
../../Resources/ieie/IEIESPC.2023.12.5.390/fig5.png

4. Experimental Results

4.1 Dataset

There are several public datasets for single-image color constancy, such as the Gehler-Shi Color Checker [20] and NUS [21] datasets. On the other hand, these datasets are unsuitable for temporal color constancy methods that require multiple frames as input. High-speed video with temporal AC variation was first used in [14], where a high-speed dataset was constructed. It contains 80 high-speed raw videos captured with a Sentech STC-MCS43U3V high-speed vision camera and includes various objects, such as plastic, rubber, metal, stones, and fruit. The raw frames were normalized and demosaicked for illuminant estimation and white balancing. A color checker was captured to obtain the ground-truth illuminant. The dataset was captured in a laboratory setting that blocks external lights.

[15] extended the high-speed dataset of [14] to contain more diverse scenes. It comprises 225 raw high-speed videos captured with the Sentech STC-MCS43U3V high-speed vision camera, set to a frame rate of 150 FPS and an exposure time of 1/300 s. The 225 scenes were divided into 150 training and 75 test scenes. Whereas the previous dataset [14] only contained closed scenes, [15] also contains open indoor scenes for practical performance evaluation; closed scenes account for 33.3% of the dataset and open scenes for 66.7%. Fig. 6 shows sample images of the dataset. Two types of light sources (incandescent and fluorescent) were used to capture the closed scenes. The open scenes were captured in indoor public places, such as cafés, libraries, schools, and hotels. These scenes may include ambient light or sunlight, reflecting a practical real-world environment. A color checker was captured to obtain the ground-truth illuminant.

Fig. 6. Sample images of the high-speed dataset [15]. It consists of closed (top row) and open (bottom row) environments.
../../Resources/ieie/IEIESPC.2023.12.5.390/fig6.png

4.2 Experimental Results

The angular error, a common quality measure for color constancy, was used for the performance evaluation. The direction of the illuminant vector indicates the illuminant color, and its intensity is not considered. The angular error between the ground truth ($\Gamma _{gt}$) and the estimated illuminant ($\Gamma _{est}$) is calculated as follows:

$ \begin{equation*} AE=\arccos \left(\frac{\Gamma _{est}\cdot \Gamma _{gt}}{\parallel \Gamma _{est}\parallel ~ \parallel \Gamma _{gt}\parallel }\right)~ \end{equation*} $

A smaller angular error indicates a more accurate estimation.
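A short sketch of this metric, reported in degrees (the function name is illustrative):

```python
import numpy as np

def angular_error_deg(est, gt):
    """Angular error (degrees) between estimated and ground-truth illuminant vectors."""
    est, gt = np.asarray(est, float), np.asarray(gt, float)
    cos = np.dot(est, gt) / (np.linalg.norm(est) * np.linalg.norm(gt))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))   # clip guards against rounding error
```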

Table 2 lists the angular errors of various color constancy methods, categorized as statistics-, physics-, and learning-based. In addition, the methods are divided into spatial and temporal approaches according to whether they use temporal features. The high-speed dataset [15] is used to evaluate the performance of the temporal methods, and all the learning-based methods are trained on this dataset. For a fair comparison, the angular errors of the single-image methods are averaged over five frames.

As shown in Table 2, the temporal methods performed better than the spatial ones. To examine robustness, the average angular errors for closed and ambient scenes are also shown in Table 2. [14] achieved the best performance for closed scenes among the physics-based methods. On the other hand, it failed to find AC pixels under complex illuminant environments or weak temporal variation and gave the worst estimation for ambient scenes among the physics-based methods. With deep learning, the illuminant estimation performance improved considerably. Fig. 7 presents the white-balanced results using the illuminants estimated by the learning-based methods. The temporal methods achieved better reconstruction than the spatial methods and also showed performance improvements for ambient scenes. This indicates that the temporal methods are robust to the illuminant environment; with the temporal features of AC light sources, robust performance can be achieved.

Fig. 7. (a) Input image, and white-balanced images with (b) the ground-truth illuminant and the illuminants estimated by (c) Bianco et al. [9]; (d) FFCC [19]; (e) FC4 [10]; (f) ReWU [11]; (g) Ha et al. [16]; (h) DDME [15]; (i) Yoo et al. [17].
../../Resources/ieie/IEIESPC.2023.12.5.390/fig7.png
Table 2. Angular Error Comparisons.

Category             | Method                    | Mean  | Median | Best-25% | Worst-25% | Closed | Ambient
Statistics           | Gray world [2]            | 4.08  | 3.42   | 1.27     | 8.34      | 6.28   | 2.97
Statistics           | Max-RGB [3]               | 13.22 | 13.62  | 5.92     | 20.65     | 14.85  | 12.40
Statistics           | Shades of gray [4]        | 5.56  | 4.40   | 1.40     | 12.61     | 5.65   | 5.12
Statistics           | 1st order grey edge [5]   | 10.34 | 9.94   | 2.69     | 18.84     | 7.72   | 11.65
Statistics           | 2nd order grey edge [5]   | 12.39 | 12.12  | 3.65     | 22.06     | 9.59   | 13.79
Statistics           | Grey pixels [6]           | 8.17  | 6.41   | 1.79     | 17.88     | 7.19   | 8.65
Physics (spatial)    | IIC [7]                   | 4.25  | 3.28   | 1.03     | 9.08      | 3.94   | 4.41
Physics (spatial)    | ICC [8]                   | 3.47  | 2.72   | 1.10     | 7.53      | 5.17   | 2.61
Physics (temporal)   | Prinet et al. [13]        | 5.30  | 3.95   | 0.92     | 11.70     | 7.83   | 4.03
Physics (temporal)   | Yoo et al. [14]           | 4.60  | 3.75   | 1.59     | 8.91      | 3.65   | 5.07
Learning (spatial)   | Bianco et al. [9]         | 1.79  | 1.12   | 0.36     | 4.42      | 1.44   | 1.97
Learning (spatial)   | FFCC [19]                 | 1.42  | 0.68   | 0.12     | 4.18      | 0.19   | 2.04
Learning (spatial)   | FC4 [10]                  | 2.26  | 2.05   | 0.76     | 4.17      | 2.30   | 2.25
Learning (spatial)   | ReWU [11]                 | 1.26  | 0.55   | 0.30     | 3.30      | 0.87   | 1.46
Learning (temporal)  | Ha et al. [16]            | 0.95  | 0.24   | 0.12     | 2.84      | 0.35   | 1.25
Learning (temporal)  | DDME [15]                 | 1.16  | 0.85   | 0.26     | 2.75      | 0.90   | 1.28
Learning (temporal)  | Yoo et al. [18]           | 1.15  | 0.37   | 0.26     | 4.53      | 0.90   | 1.27
Learning (temporal)  | Yoo et al. [17]           | 1.00  | 0.43   | 0.26     | 2.57      | 0.56   | 1.22

5. Conclusion

This paper reviewed spatial and temporal color constancy methods, focusing on the temporal ones. Color constancy methods can be divided into statistics-, physics-, and learning-based approaches. While most color constancy methods exploit spatial features of a single image, several studies proposed using temporal features of high-speed video. The intensity of light sources supplied by alternating current varies with time, which can be captured in high-speed video, and these methods use the temporal fluctuation as a prior for illuminant estimation. The periodic variation can be modeled as a sinusoidal curve; this property is used to find AC pixels for accurate dichromatic plane estimation and as a temporal loss for training the network that estimates the dichromatic parameters. The intensity difference between frames can be used to detect strongly illuminated regions. The temporal correlation of high-speed video, extracted with a non-local neural network, also contributes to illuminant estimation. The experimental results confirmed that temporal features lead to better color constancy.

ACKNOWLEDGMENTS

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (No. 2020R1A4A4079705).

REFERENCES

[1] M. Afifi and M. S. Brown, "What else can fool deep learning? Addressing color constancy errors on deep neural network performance," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 243-252.
[2] G. Buchsbaum, "A spatial processor model for object colour perception," Journal of the Franklin Institute, vol. 310, no. 1, pp. 1-26, 1980.
[3] E. H. Land, "The retinex theory of color vision," Scientific American, vol. 237, no. 6, pp. 108-129, 1977.
[4] G. D. Finlayson and E. Trezzi, "Shades of gray and colour constancy," in Color and Imaging Conference, vol. 2004, no. 1, Society for Imaging Science and Technology, 2004, pp. 37-41.
[5] J. van de Weijer, T. Gevers, and A. Gijsenij, "Edge-based color constancy," IEEE Transactions on Image Processing, vol. 16, no. 9, pp. 2207-2214, 2007.
[6] K.-F. Yang, S.-B. Gao, and Y.-J. Li, "Efficient illuminant estimation for color constancy using grey pixels," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 2254-2263.
[7] R. T. Tan, K. Ikeuchi, and K. Nishino, "Color constancy through inverse-intensity chromaticity space," in Digitally Archiving Cultural Objects, Springer, 2008, pp. 323-351.
[8] S.-M. Woo, S.-H. Lee, J.-S. Yoo, and J.-O. Kim, "Improving color constancy in an ambient light environment using the Phong reflection model," IEEE Transactions on Image Processing, vol. 27, no. 4, pp. 1862-1877, 2018.
[9] S. Bianco, C. Cusano, and R. Schettini, "Color constancy using CNNs," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 81-89.
[10] Y. Hu, B. Wang, and S. Lin, "FC4: Fully convolutional color constancy with confidence-weighted pooling," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 4085-4094.
[11] J. Qiu, H. Xu, and Z. Ye, "Color constancy by reweighting image feature maps," IEEE Transactions on Image Processing, vol. 29, pp. 5711-5721, 2020.
[12] Q. Yang, S. Wang, N. Ahuja, and R. Yang, "A uniform framework for estimating illumination chromaticity, correspondence, and specular reflection," IEEE Transactions on Image Processing, vol. 20, no. 1, pp. 53-63, 2010.
[13] V. Prinet, D. Lischinski, and M. Werman, "Illuminant chromaticity from image sequences," in Proceedings of the IEEE International Conference on Computer Vision, 2013, pp. 3320-3327.
[14] J.-S. Yoo and J.-O. Kim, "Dichromatic model based temporal color constancy for AC light sources," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 12329-12338.
[15] J.-S. Yoo, C.-H. Lee, and J.-O. Kim, "Deep dichromatic model estimation under AC light sources," IEEE Transactions on Image Processing, vol. 30, pp. 7064-7073, 2021.
[16] J.-W. Ha, J.-S. Yoo, and J.-O. Kim, "Deep color constancy using temporal gradient under AC light sources," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021, pp. 2355-2359.
[17] J.-S. Yoo, K.-K. Lee, C.-H. Lee, J.-M. Seo, and J.-O. Kim, "Deep spatio-temporal illuminant estimation under time-varying AC lights," IEEE Access, vol. 10, pp. 15528-15538, 2022.
[18] J.-S. Yoo, C.-H. Lee, and J.-O. Kim, "Deep temporal color constancy for AC light sources," in IEEE International Conference on Visual Communications and Image Processing (VCIP), 2020, pp. 217-221.
[19] J. T. Barron and Y.-T. Tsai, "Fast Fourier color constancy," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 886-894.
[20] P. V. Gehler, C. Rother, A. Blake, T. Minka, and T. Sharp, "Bayesian color constancy revisited," in IEEE Conference on Computer Vision and Pattern Recognition, 2008, pp. 1-8.
[21] D. Cheng, D. K. Prasad, and M. S. Brown, "Illuminant estimation for color constancy: Why spatial-domain methods work and the role of the color distribution," JOSA A, vol. 31, no. 5, pp. 1049-1058, 2014.
[22] X. Wang, R. Girshick, A. Gupta, and K. He, "Non-local neural networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 7794-7803.

Author

Jeong-Won Ha
../../Resources/ieie/IEIESPC.2023.12.5.390/au1.png

Jeong-Won Ha received her B.S. in electrical engineering from Korea University, Seoul, Korea, in 2021. She is pursuing her M.S. in electrical engineering at Korea University, Seoul, Korea. Her research interests include color constancy, the dichromatic model, and intrinsic image decomposition.

Jong-Ok Kim
../../Resources/ieie/IEIESPC.2023.12.5.390/au2.png

Jong-Ok Kim received his B.S. and M.S. degrees in electronic engineering from Korea University, Seoul, South Korea, in 1994 and 2000, respectively, and Ph.D. in information networking from Osaka University, Osaka, Japan, in 2006. From 1995 to 1998, he served as an Officer with the Korean Air Force. From 2000 to 2003, he was with the SK Telecom Research and Development Center and Mcubeworks Inc., South Korea, where he was involved in research and development on mobile multimedia systems. From 2006 to 2009, he was a Researcher with the Advanced Telecommunication Research Institute International (ATR), Kyoto, Japan. He joined Korea University, Seoul, in 2009, where he is currently a Professor. His current research interests include image processing, computer vision, and intelligent media systems. He was a recipient of the Japanese Government Scholarship, from 2003 to 2006.