
  1. Department of Smart Manufacturing Engineering, Changwon National University, Changwon-si, Gyeongsangnam-do, 51140, Korea
  2. Department of Information and Communication Engineering, Changwon National University, Changwon-si, Gyeongsangnam-do, 51140, Korea

Keywords: Display-to-camera communication, Complementary color barcode-based optical camera communications, Deep neural network

1. Introduction

The limited spectrum available for radio frequency (RF) communications will not be able to meet the exponentially increasing demand for wireless Internet access in the near future, leading to a major spectrum crisis. Given this situation, introducing new spectrum that offers better energy efficiency, greater connection bandwidth, and lower usage costs becomes very important for 6$^{\mathrm{th}}$ generation (6G) communications and future Internet connectivity. Optical wireless communications (OWC) [1,2] has recently been explored as a complementary alternative to RF links for the development of heterogeneous networks that enable peer-to-peer connectivity in a cost-effective and reliable way. OWC establishes communication links over a very wide optical spectrum spanning the infrared, visible light, and ultraviolet domains [2].

Recently, a variety of research into display-to-camera (D2C) communications has been conducted. Wang et al. [7,8] proposed InFrame and InFrame++, which provide full-frame communication with imperceptible video artifacts; for this, complementary frame composition and a hierarchical frame structure were designed. HiLight [9] uses the orthogonal transparency (alpha) channel to transmit data without the need for coded images. DisCo [10], based on a rolling-shutter camera, decodes by translating a temporal sequence into a spatial pattern. Secure barcode-based visible light communication (SBVLC) [11] considers physical security issues and the design principles of 2D barcodes to add security features. RainBar [12] and RainBar+ [13] were designed with a high-capacity barcode layout to enable flexible frame synchronization and code extraction in VLC systems. In [14] and [15], the SoftLight scheme, presented over screen-camera links, uses color modulation schemes with channel coding and a $\textit{soft hint}$ for data decoding through the barcode layout. In [16], complementary color barcode-based optical camera communications (CCB-OCC) was proposed, where symbols are sent as carefully designed complementary color pairs that the human eye perceives as a white bar but whose color pattern remains detectable by a camera. Although the CCB-OCC method was first presented in [16], its color barcode detection performance must be improved for practical use.

In this paper, we propose a new CCB-OCC system that improves data rate performance using deep neural network (DNN)-based barcode detection and adaptive color-value extraction. The contributions of this study are twofold. First, a novel DNN is designed for robust and seamless detection of barcode image regions in D2C links. Second, value extraction from color histograms is refined to mitigate the effects of D2C noise and synchronization jitter. The conventional CCB-OCC technique used an image-processing-based approach to detect the color barcode region, which could result in barcode detection failure or loss of information from the barcode. In addition, the fixed-peak-position signal extraction used in the conventional technique cannot handle the synchronization jitter between display and camera. In contrast, the proposed method combines a DNN-based model for real-time barcode detection with adaptive color-value extraction, providing a robust D2C link for data transmission. Experimental results validated the proposed CCB-OCC system’s significant improvement in data rate performance, and the proposed scheme can be regarded as a potential candidate for next-generation short-distance machine-to-machine (M2M) communications.

2. System Model of CCB-OCC

In the CCB-OCC scheme, the transmitting side encodes data into color barcode sequences and displays consecutive packets of color barcodes on its electronic display. When the refresh rate of the display is 120 Hz or higher, these barcode images are perceived by the human eye as a white bar. When captured with a camera, however, they appear as a color barcode owing to the rolling shutter mechanism of the CMOS sensor. The receiving side uses a camera to capture the display and acquires consecutive images that include the color barcodes. The visible spatial pattern in the received image represents the data. After detecting a color barcode, packet synchronization is performed by checking the location of the pilot symbols. Channel estimation then obtains the D2C channel based on the color space, and the transmitted data are decoded using the obtained channel information. The CCB-OCC system model is presented in Fig. 1.

Fig. 1. The system model for complementary color barcode-based OCC.

For the D2C communications scenario, the main objective of the transmitter is to encode the transmit data into color barcodes in complementary pairs and to display them on the monitor without noticeable artifacts visible to the human eye. For this, the display monitor’s refresh rate should be 120 Hz or higher. A bit-to-color-mapping process converts the input bit stream into specific colors, where binary bits are mapped to colors according to a well-designed symbol constellation. Each color-mapped symbol consists of a pair of complementary colors, forming the pilot and data symbols. Images captured by the camera of the receiving device are sequentially stored in a memory buffer to detect continuous bit signals. To extract the signal from a received image, the pure color barcode area must be detected, excluding the background. Packets are composed of pilot and data symbols in red-green-blue (RGB) colors, and the pilot symbols are retrieved through color barcode pixel-value extraction for packet synchronization and channel estimation. The wireless optical channel between the electronic display and the camera is estimated by obtaining each component through histogram analysis of the RGB channels, and the remaining data symbols are decoded using the estimated channel information.

CCB-OCC uses a unique color barcode design that makes the continuous information transmitted from the display invisible to the human eye but detectable by devices equipped with a rolling-shutter-based camera. This is done by sequentially presenting on the display device the complementary colors opposite each other in the hue circle structure [16]. The packet structure of the CCB-OCC scheme is defined with pilot color symbols and data symbols with complementary color pairs. These transmit symbols are encoded within the color barcode area of each image frame that appears through the display device. As shown in Fig. 2, the pilot symbol consists of six consecutive frames in the RR$^{\prime}$GG$^{\prime}$BB$^{\prime}$ color pattern sequence. The remaining data symbols are transmitted as packets consisting of complementary color pairs of data. The image frames containing this encoded data are displayed on the screen based on the refresh rate, and the data can be decoded by capturing successive images with the camera or receiving device. When a pair of complementary colors appearing in a continuous image is rapidly scanned, the human eye does not recognize the original color but perceives it as a white bar representing the sum of the complementary colors. Using these complementary colors, the proposed color barcode pattern is visually unobtrusive and does not interfere with the overall quality of the display content. To minimize the effect of bit errors when converting from decoded symbols to binary numbers, we use a gray-coding-based code constellation.

Fig. 2. Packet structure of CCB-OCC.
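To make the complementary-pair idea concrete, the sketch below pairs each gray-coded symbol with a hue and its 180-degree opposite on the hue circle, so that every channel of the pair sums to white. The 2-bit constellation and the saturation/value settings are illustrative assumptions for this sketch, not the exact constellation used in the paper.

```python
import colorsys

# Illustrative 2-bit gray-coded hue constellation (adjacent symbols differ
# by one bit); the hue assignments are an assumption for this sketch.
GRAY_2BIT = {0b00: 0.0, 0b01: 90.0, 0b11: 180.0, 0b10: 270.0}  # bits -> hue (deg)

def symbol_to_pair(bits):
    """Map a gray-coded symbol to a (color, complement) RGB pair.

    The complement sits 180 degrees opposite on the hue circle, so the
    pair fuses to white when flashed faster than the eye can resolve.
    """
    hue = GRAY_2BIT[bits]
    color = colorsys.hsv_to_rgb(hue / 360.0, 1.0, 1.0)
    comp = colorsys.hsv_to_rgb(((hue + 180.0) % 360.0) / 360.0, 1.0, 1.0)
    return color, comp

color, comp = symbol_to_pair(0b01)
# Each RGB channel of color + complement sums to 1.0, i.e. the pair is white.
fused = tuple(c + k for c, k in zip(color, comp))  # -> (1.0, 1.0, 1.0)
```

Displaying `color` and `comp` on successive frames at 120 Hz or higher produces the white-bar appearance described above, while a rolling-shutter camera still resolves the individual hues.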

Fig. 3 shows the complementary color pairs in a series of images from the display. When a color barcode on a monitor with a refresh rate of 60 Hz is received by a CMOS camera sensor at 30 fps, at least two colors appear in every frame. As seen in the figure, the RR$^{\prime}$GG$^{\prime}$BB$^{\prime}$ color pattern appears, in which R, G, and B indicate the primary red, green, and blue, and R$^{\prime}$, G$^{\prime}$, and B$^{\prime}$ indicate their complementary colors. Theoretically, the number of colors observed in the color barcode area of one received image is the ratio of the display refresh rate to the camera capture rate. As shown in Fig. 3, the six RR$^{\prime}$GG$^{\prime}$BB$^{\prime}$ colors are obtained across three consecutive image frames received by the camera, since the camera’s capture rate is half the display’s transmission rate. However, if the camera capture rate and the refresh rate are out of sync, three colors may appear simultaneously in a captured image. Both of these cases must be considered when estimating the channel and detecting the signal.

Fig. 3. An example of received images when the display’s refresh rate is 60 Hz and the camera’s capture rate is 30 fps.
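The color counts above follow directly from the rate ratio and can be checked with a line of arithmetic: at 60 Hz display refresh and 30 fps capture, each captured frame holds two transmitted colors, so one six-color RR'GG'BB' pilot sequence spans three captured frames.

```python
# Colors per captured frame = display refresh rate / camera capture rate.
refresh_rate_hz = 60
capture_rate_fps = 30
colors_per_frame = refresh_rate_hz // capture_rate_fps  # -> 2

# A six-color RR'GG'BB' pilot sequence therefore spans three captured frames.
frames_per_pilot = 6 // colors_per_frame                # -> 3
```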

3. DL-based Color Barcode Detection

D2C communications requires accurate barcode detection to ensure successful decoding from consecutive captured images. Since multimedia information other than the color barcode is included in the captured image, along with the external part of the display device, it is important to accurately detect only the barcode area for accurate signal detection. In this section, we present an approach that can reliably detect barcode regions using deep learning technology.

3.1 Barcode Detection

There are many methods available to detect objects with bounding boxes. Among them, the faster region-based convolutional neural network (Faster R-CNN) [17] and You Only Look Once (YOLO) [18] are the most popular for real-time applications in various research fields. Faster R-CNN extracts a feature map from the entire input image with a learned CNN and pools features from candidate regions on that map. Instead of the selective search technique, a separate region proposal network generates the candidate regions from the extracted feature map, after which objects are recognized.

YOLO is an algorithm that predicts the objects in an image, and their locations, by looking at the image only once. Instead of classifying separately proposed candidate regions, it approaches detection as a single regression problem that directly predicts bounding boxes and class probabilities. The CNN divides the input image into a grid, and objects are recognized by generating bounding boxes and class probabilities for each grid cell. Because YOLO does not apply a separate network for extracting candidate regions, its processing-time performance is superior to that of Faster R-CNN. Based on this, we used a YOLO v3 model to detect color barcode regions in received images. To train the YOLO model, a dataset with labels for the color barcode area was built using software called YOLO Mark, as shown in Fig. 4.

Fig. 4. Color barcode labeling in the dataset used to train the YOLO model.
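As a hedged illustration of how a single barcode bounding box might be picked from YOLO-style output, the helper below assumes each detection row has the layout [cx, cy, w, h, objectness, class score] in normalized image coordinates with a single "barcode" class; this layout and the confidence threshold are assumptions for the sketch, not the exact configuration used in the experiments.

```python
import numpy as np

def select_barcode_box(detections, img_w, img_h, conf_thresh=0.5):
    """Return the pixel bounding box (x, y, w, h) of the most confident
    detection above conf_thresh, or None if nothing qualifies."""
    det = np.asarray(detections, dtype=float)
    scores = det[:, 4] * det[:, 5]          # objectness * class score
    keep = scores >= conf_thresh
    if not keep.any():
        return None
    best = det[keep][np.argmax(scores[keep])]
    cx, cy, w, h = best[:4]
    # Convert center-size normalized coordinates to a pixel box.
    return (round((cx - w / 2) * img_w), round((cy - h / 2) * img_h),
            round(w * img_w), round(h * img_h))

# Two candidate boxes; the second is more confident and is selected.
dets = [[0.5, 0.5, 0.2, 0.1, 0.4, 0.90],
        [0.5, 0.6, 0.4, 0.2, 0.9, 0.95]]
box = select_barcode_box(dets, img_w=1920, img_h=1080)  # -> (576, 540, 768, 216)
```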

To extract the signal contained in the color barcode detected by YOLO, an additional post-processing step is required. First, an image-difference operation must be performed to isolate the pure barcode area. The area within the bounding box detected by YOLO contains not only the color barcode but also background, including the display’s external parts. If this background is not removed, it introduces noise when color values are later extracted through histogram analysis of each color channel. If only the image-difference operation is performed, large difference values also appear outside the color barcode due to jitter in the camera capture process or relative movement between the display and the camera, so the difference image is thresholded into a binary image. Then, using the binary image as a mask filter, only the color channel values within the barcode image area are obtained from the original image and analyzed as a histogram. This series of processes is shown in Fig. 5.

Fig. 5. Processing detected color barcodes into histograms of color channels.
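The difference-mask-histogram sequence above can be sketched with NumPy alone. The threshold value and the toy frames are illustrative assumptions; the point is that the per-channel histograms are computed only over pixels that change between consecutive frames, which suppresses the static background.

```python
import numpy as np

def masked_channel_histograms(frame_a, frame_b, thresh=30):
    """Histogram each RGB channel of frame_a over pixels where the two
    consecutive frames differ, suppressing the static background."""
    diff = np.abs(frame_a.astype(int) - frame_b.astype(int)).max(axis=2)
    mask = diff > thresh                      # binary mask of the barcode area
    hists = [np.bincount(frame_a[..., c][mask], minlength=256)
             for c in range(3)]
    return mask, hists

# Toy frames: the left half changes color between frames (the "barcode"),
# the right half is a static background.
a = np.zeros((4, 8, 3), dtype=np.uint8); a[:, :4, 0] = 200   # red bar
b = np.zeros((4, 8, 3), dtype=np.uint8); b[:, :4, 1] = 200   # green bar
mask, hists = masked_channel_histograms(a, b)
# The mask covers only the changing left half (16 pixels), and the
# red-channel histogram of frame_a peaks at intensity 200 within the mask.
```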
Fig. 6. Varying the peak position when obtaining a representative value of the color barcode.

3.2 Color-value Extraction

In the D2C communications link, a rolling shutter effect appears in the received image due to the difference between the refresh rate of the display device and the capture rate of the camera. As a result, multiple colors appear consecutively along the horizontal direction of the captured image, producing a multi-color barcode within a single line-scan area. Exploiting this phenomenon, the proposed technique obtains a representative value of each color barcode through histogram analysis of the continuous color change. After checking the histogram information of both the color region and the complementary-color region in the barcode area, we take the larger of the two components showing the maximum value in the RGB-channel histograms and select it as the representative color value of the corresponding barcode. In this way, even if a single color barcode contains multiple color patterns, we can find the representative value that is closest to the transmitted signal.

In addition, when extracting color barcode values using histogram information, the peak position used to obtain the representative value of the color barcode is updated gradually. In the conventional CCB-OCC scheme [16], the peak position is fixed within an individual packet. Although, in theory, the refresh rate of the display device should be an integer multiple of the camera frame rate in order to obtain the rolling shutter effect, this exact relationship does not hold in practice. As shown in Fig. 6, the peak position for pilot signal detection was 192 in the n-th packet, but a peak position of 231 may appear in the next packet. Since synchronization jitter occurs between two consecutive packets, the peak position from which the representative color barcode value is extracted is allowed to change over the period in which multiple symbols are received. This makes the histogram-based extraction of color barcode values more accurate and ultimately improves signal detection accuracy.
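A minimal sketch of this adaptive peak update, assuming the color profile is available as a 1-D array and using an illustrative search window of ±60 samples around the previous packet's peak (the window size is an assumption, not a value from the paper):

```python
import numpy as np

def track_peak(profile, prev_peak, window=60):
    """Locate the peak of a 1-D color profile within +/-window samples of
    the previous packet's peak, so the extraction point follows the
    display-camera synchronization jitter instead of staying fixed."""
    lo = max(0, prev_peak - window)
    hi = min(len(profile), prev_peak + window + 1)
    return lo + int(np.argmax(profile[lo:hi]))

# Mirror of the Fig. 6 example: the pilot peak was at 192 in the n-th
# packet and drifts to 231 in the next packet.
profile = np.zeros(300)
profile[231] = 50     # jittered pilot peak in the next packet
profile[10] = 40      # unrelated peak far outside the search window
peak = track_peak(profile, prev_peak=192)   # -> 231, not 10
```

A fixed-position extractor would keep reading position 192 and miss the drifted peak; the windowed search recovers it while still ignoring spurious peaks far from the pilot.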

3.3 Optical Channel Estimation and Data Decoding

The start position of the packet can be determined by finding the RR$^{\prime}$GG$^{\prime}$BB$^{\prime}$ combination in successive frames stored in the camera receiver buffer, and the pilot and data symbols within the packet can be obtained according to the determined packet length. Unlike the other frames, the color barcode area of the pilot frame is dominated by pure red, green, and blue. Therefore, the RGB component values in the barcode area of the pilot symbol packet are determined through histogram analysis and used as the pilot symbol values for channel estimation.

In terms of transmitting information, different electronic display types have different color appearance and brightness distributions. Receivers with different lenses and camera sensors can have different structures for processing color features. The reason that the pilot frame is used in all packets is to obtain channel information between the display and the camera by understanding the relationship between the transmitted RGB value and the received RGB value. A D2C channel matrix can be obtained from an inverse matrix operation by transmitting a predefined signal color through a pilot frame. Using the estimated channel obtained from the received color barcode area and the pixel values obtained from the data packet frame, the red, green, and blue values affected by the wireless optical link between the display device and the camera are corrected, and we then move on to the data decoding procedure.

By analyzing the histogram of the RGB components in a data frame of the packet, the peak position of each RGB component can be observed. Multiplying the observed color vector by the inverse of the channel matrix H yields an estimate of the color symbol the display device intended to send. The resulting barcode value, composed of RGB components, is mapped onto the bit-color constellation, and data decoding is performed by finding the minimum Euclidean distance to a symbol predefined in the constellation. In this way, consecutive data symbols within the packet can be decoded using the estimated channel matrix.
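The pilot-based channel estimation and minimum-distance decoding can be sketched as follows. The 3x3 channel values and the six-point constellation are illustrative assumptions; the structure (solve H from known pilot colors, equalize with the inverse of H, snap to the nearest constellation symbol) follows the description above.

```python
import numpy as np

TX_PILOT = np.eye(3)                          # pure R, G, B pilot colors
# Illustrative constellation: primaries and their complements.
CONSTELLATION = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
                          [0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)

def estimate_channel(rx_pilot):
    """Solve H from rx_pilot = H @ TX_PILOT (pilot colors as columns)."""
    return rx_pilot @ np.linalg.inv(TX_PILOT)

def decode_symbol(rx_color, H):
    """Equalize a received color with H^-1 and return the index of the
    nearest constellation symbol (minimum Euclidean distance)."""
    x = np.linalg.inv(H) @ rx_color
    return int(np.argmin(np.linalg.norm(CONSTELLATION - x, axis=1)))

# Assumed D2C channel with mild color cross-talk (illustrative values).
H_true = np.array([[0.8, 0.1, 0.0],
                   [0.1, 0.7, 0.1],
                   [0.0, 0.1, 0.9]])
H = estimate_channel(H_true @ TX_PILOT)       # noiseless pilots recover H
sym = decode_symbol(H_true @ np.array([0.0, 1.0, 1.0]), H)  # -> 3 (cyan)
```

With noisy pilots the same structure applies; the estimate of H simply absorbs the measurement error, and the minimum-distance rule still picks the closest constellation point.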

4. Experimental Results

In this section, we discuss extensive experiments with YOLO-based barcode detection. The representative-value position-correction technique proposed in this study detects color barcodes in captured images more accurately. Note that a distance of more than 110 cm results in distortion and blur, making it more difficult to extract the precise points for channel estimation and decoding from the RGB histograms.

As seen in Fig. 7, we investigated the impact of the view angle on the achievable data rate (ADR) of the proposed CCB-OCC scheme. Here, the distance was set to 90 cm. Table 1 presents the ADR as a function of the yaw angle. When the yaw angle between the display and the camera lies between -20 and 20 degrees, the proposed scheme shows no significant difference in data rate. This contrasts with the existing technique, whose performance varies with the angle. Because the conventional technique extracts the color barcode using image processing, a large yaw angle makes the barcode image region too thin, causing color information loss. The proposed YOLO-based technique, in contrast, reliably detects the color barcode region through bounding-box detection even at a large yaw angle, so it extracts representative color values more accurately. Therefore, the performance of the proposed method varies little with the view angle while delivering a better data rate than the existing method.

Fig. 7. Experimental environment with varying distances and angle values.
Fig. 8. Performance from the achievable data rate according to D2C distance.

We evaluated the performance of the CCB-OCC scheme using YOLO-based barcode detection. The receiving device was a Samsung Galaxy S9, a common Android smartphone equipped with a standard camera capturing 1920$\times$1080 video at 30 fps. At the transmitter, the resolution of the electronic display was 1920$\times$1080 at a 60 Hz refresh rate, and experiments were conducted indoors under normal lighting conditions. In the experimental environment shown in Fig. 7, performance was measured while varying the distance between the display and the camera (from 90 cm to 110 cm) and the view angle (from -20 to 20 degrees). To verify the performance of the proposed method, it was compared with results from the conventional CCB-OCC scheme [16].

Fig. 8 shows the achievable data rate of the proposed CCB-OCC scheme as a function of the display-to-camera distance. As the distance increases, the resolution of the captured color barcode image decreases, the noise effect grows, and channel estimation and data decoding accuracy decline. As can be seen, the proposed scheme outperformed the conventional scheme at distances between 90 cm and 110 cm. By automatically extracting the color barcode area with the YOLO model and image post-processing, pilot and data signals can be acquired more precisely than with the manual barcode detection used in the existing technique. The proposed scheme provided a maximum data rate of 79.7 bps and a minimum of 76.2 bps over the 90-110 cm range.

In Table 1, we can see that the proposed CCB-OCC scheme achieved better data rates than the conventional scheme at various angles. Even when the yaw angle between the camera and the display was as large as 20 degrees, the proposed scheme provided a data rate of 80.7 bps or more. The YOLO model, which supports real-time object detection, guarantees robust data rate performance by stably detecting color barcode images captured from various angles.

Table 1. Achievable data rate based on yaw angle.

Yaw angle | ADR of conv. | ADR of prop.
5. Conclusion

In this paper, we designed and implemented a new CCB-OCC system for data rate improvement using DNN-based barcode detection and adaptive color extraction. We introduced a DNN-based barcode detection concept where YOLO v3 was used to detect a color barcode without losing information from within the barcode region. In addition, a color-value extraction scheme obtained symbol values from color histograms in an adaptive manner to compensate for synchronization jitter. The experimental results proved that the proposed CCB-OCC scheme outperforms the existing CCB-OCC scheme for various distances and angles in D2C communications links.


Acknowledgment

This research was supported by a National Research Foundation of Korea (NRF) grant funded by the Korean government (NRF-2022R1A2B5B01001543).


References

[1] Rahaim M., Little T. D. C., Sept. 2017, Interference in IM/DD Optical Wireless Communication Networks, IEEE/OSA Journal of Optical Communications and Networking, Vol. 9, pp. D51-D63.
[2] Al-Kinani A., Wang C., Zhou L., Zhang W., Third Quarter 2018, Optical Wireless Communication Channel Measurements and Models, IEEE Communications Surveys & Tutorials, Vol. 20, No. 3, pp. 1939-1962.
[3] Chen C., Zhong W., Yang H., Du P., Feb. 2018, On the Performance of MIMO-NOMA-Based Visible Light Communication Systems, IEEE Photonics Technology Letters, Vol. 30, No. 4, pp. 307-310.
[4] Memedi A., Dressler F., First Quarter 2021, Vehicular Visible Light Communications: A Survey, IEEE Communications Surveys & Tutorials, Vol. 23, No. 1, pp. 161-181.
[5] Luo P. et al., Oct. 2015, Experimental Demonstration of RGB LED-Based Optical Camera Communications, IEEE Photonics Journal, Vol. 7, No. 5, pp. 1-12.
[6] Kim S. J., Lee J. W., Kwon D.-H., Han S.-K., Oct. 2018, Gamma Function Based Signal Compensation for Transmission Distance Tolerant Multi-level Modulation in Optical Camera Communication, IEEE Photonics Journal, Vol. 10, No. 5, pp. 1-7.
[7] Wang A., Peng C., Zhang O., Shen G., Zeng B., 2014, InFrame: Multiplexing Full-frame Visible Communication Channel for Humans and Devices, in Proceedings of the 13th ACM Workshop on Hot Topics in Networks, Los Angeles, USA.
[8] Wang A., Li Z., Peng C., Shen G., Fang G., Zeng B., 2015, InFrame++: Achieve Simultaneous Screen-Human Viewing and Hidden Screen-Camera Communication, in Proceedings of the 13th International Conference on Mobile Systems, Applications and Services, Florence, Italy.
[9] Li T., An C., Campbell A. T., Zhou X., 2014, HiLight: Hiding Bits in Pixel Translucency Changes, ACM Workshop on Visible Light Communication Systems, Maui, Hawaii, USA.
[10] Jo K., Gupta M., Nayar S. K., 2016, DisCo: Display-Camera Communication Using Rolling Shutter Sensors, ACM Transactions on Graphics, Vol. 35, No. 5, pp. 1-13.
[11] Zhang B., Ren K., Xing G., Fu X., Wang C., 2016, SBVLC: Secure Barcode-Based Visible Light Communication for Smartphones, IEEE Transactions on Mobile Computing, Vol. 15, No. 2, pp. 432-446.
[12] Wang Q., Zhou M., Ren K., Lei T., Li J., Wang Z., 2015, RainBar: Robust Application-Driven Visual Communication Using Color Barcodes, in Proceedings of the IEEE 35th International Conference on Distributed Computing Systems, Columbus, OH, USA, pp. 537-546.
[13] Zhou M., Wang Q., Lei T., Wang Z., 2018, Enabling Online Robust Barcode-Based Visible Light Communication with Realtime Feedback, IEEE Transactions on Wireless Communications, Vol. 17, No. 12, pp. 8063-8076.
[14] Du W., Liando J. C., Li M., 2016, SoftLight: Adaptive Visible Light Communication over Screen-Camera Links, in IEEE INFOCOM, San Francisco, CA, USA, pp. 1620-1628.
[15] Du W., Liando J. C., Li M., 2017, Soft Hint Enabled Adaptive Visible Light Communication over Screen-Camera Links, IEEE Transactions on Mobile Computing, Vol. 16, No. 2, pp. 527-537.
[16] Jung S.-Y., Lee J.-H., Nam W., Kim B. W., 2020, Complementary Color Barcode-Based Optical Camera Communications, Wireless Communications and Mobile Computing, Vol. 2020, Article ID 3898427.
[17] Yang Y., Gong H., Wang X., Sun P., 2017, Aerial Target Tracking Algorithm Based on Faster RCNN Combined with Frame Differencing, Aerospace, Vol. 4, No. 32.
[18] Redmon J., Divvala S., Girshick R., Farhadi A., 2016, You Only Look Once: Unified, Real-Time Object Detection, IEEE Conference on Computer Vision and Pattern Recognition, pp. 779-788.


Min Tae Kim

Min Tae Kim received the B.S. degree from the Department of Information and Communication Engineering, Changwon National University, Changwon, South Korea, in 2021, and is currently pursuing the M.S. degree with the Department of Smart Manufacturing Engineering, Changwon National University. His research interests include visible light communications, machine learning, and deep learning.

Byung Wook Kim

Byung Wook Kim received a B.S. from the School of Electrical Engineering, Pusan National University, Pusan, South Korea, in 2005, and an M.S. and a Ph.D. from the Department of Electrical Engineering, KAIST, Daejeon, South Korea, in 2007 and 2012, respectively. He was a Senior Engineer with KERI, Changwon-si, South Korea, from 2012 to 2013. He was an Assistant Professor with the School of Electrical Engineering, Kyungil University, Gyeongsan-si, South Korea, from 2013 to 2016. He was an Assistant Professor with the Department of ICT Automotive Engineering, Hoseo University, from 2016 to 2019. He is currently an Assistant Professor with the Department of Information and Communication Engineering, Changwon National University, Changwon-si, South Korea. His research interests include visible light communications, machine learning, and deep learning.