Yunchul Chung1, Youngmin Kim2,*
1 School of Electronic and Electrical Engineering, Hongik University, Seoul, Korea (ytony357@naver.com)
2 School of Electronic and Electrical Engineering, Hongik University, Seoul, Korea (youngmin@hongik.ac.kr)
Copyright © The Institute of Electronics and Information Engineers(IEIE)
Keywords
Approximate computing, Approximate adder, Sobel edge detection, Error metrics
1. Introduction
Recently, energy-efficient approximate computing has been attracting attention
as many data-intensive and error-tolerant designs emerge [1]. Approximate computing achieves energy efficiency by gaining advantages in area, power,
and delay in exchange for giving up some accuracy. This is useful in image and video processing,
machine learning, and similar applications, which process large amounts of data but do not require high accuracy.
Such processing requires only an acceptable level of accuracy and is error-tolerant.
Approximate computing with this characteristic is suitable for edge detection, which
finds pixels corresponding to the edges of an image.
An edge-detection algorithm extracts boundaries from an image to obtain meaningful
data. This is useful in image processing because it preserves the important structural
features of the image and filters out less important data [2]. Among the various edge detection methods, Sobel edge detection is an efficient algorithm
with low arithmetic complexity [3] and is widely used with FPGAs [4].
We applied approximate computing to the adders of an edge detector that uses a Sobel
filter and compared the output with the exact result. Approximate computing was
implemented and simulated by applying the logic of AMA, AXA, and InXA approximate
adders to the five 11-bit adders of a Sobel filter. Based on the simulation results, each
approximate adder is characterized by its logic utilization, Average Error
Distance (AED), and Error Rate (ER).
AMA4 shows good AED and ER values, InXA3 does not function as a filter when the
approximate adder is applied to 8 bits or more, and AXA1 shows moderately good logic utilization,
AED, and ER at 5 - 7 bits. InXA1 does not provide good AED and ER values, but its
logic utilization is relatively good at 65 - 88%. Overall, AMA4 shows the best
results across the metrics. This shows that when a Sobel filter
with approximate computing is implemented in an FPGA, results with acceptable errors are
obtained. In addition, various tradeoffs can be expected
when approximate computing is applied to other edge detection algorithms implemented
in an FPGA.
Fig. 1. Sobel filter on hardware [3].
In Section 2, we review Sobel edge detection, approximate adders, and error
metrics. Section 3 describes the simulation process of a Sobel edge detection filter
with an approximate adder applied. The simulation was conducted in Quartus using Verilog
HDL, and the logic utilization obtained from the synthesized filters, the AED, and
the ER were compared through figures. Section 4 presents the simulation results, which
were checked using the output image, AED, and ER for each
approximate adder. Finally, Section 5 presents the conclusion.
2. Related Work
2.1 Sobel Edge Detection
A Sobel operator is a 3-by-3 kernel edge detection filter that detects edges in the
vertical and horizontal directions. The horizontal and vertical kernels are
$G_{\mathrm{x}} = \left[\begin{array}{ccc}
-1 & 0 & 1\\
-2 & 0 & 2\\
-1 & 0 & 1
\end{array}\right]$ and $G_{\mathrm{y}} = \left[\begin{array}{ccc}
1 & 2 & 1\\
0 & 0 & 0\\
-1 & -2 & -1
\end{array}\right]$.
Edge detection operates on grayscale pixels: each pixel of a 3-by-3 grayscale window
is multiplied by the kernel value at the corresponding position to find the amount
of change at the center pixel [5].
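As a minimal sketch of the kernel operation described above, the following Python snippet applies the two kernels to one 3-by-3 window by multiplying and summing. The function name `sobel_gradients` and the sample window are illustrative assumptions, not part of the original design.

```python
# Sobel kernels as given in the text (horizontal Gx, vertical Gy).
GX = [[-1, 0, 1],
      [-2, 0, 2],
      [-1, 0, 1]]
GY = [[ 1,  2,  1],
      [ 0,  0,  0],
      [-1, -2, -1]]

def sobel_gradients(window):
    """Return (gx, gy) for a 3x3 window of 8-bit gray values (hypothetical helper)."""
    gx = sum(GX[r][c] * window[r][c] for r in range(3) for c in range(3))
    gy = sum(GY[r][c] * window[r][c] for r in range(3) for c in range(3))
    return gx, gy

# A vertical step edge: dark on the left, bright on the right.
window = [[0, 0, 255],
          [0, 0, 255],
          [0, 0, 255]]
print(sobel_gradients(window))  # (1020, 0)
```

The largest possible magnitude of each gradient is 4 × 255 = 1020, which fits within the 11-bit adders described in the text.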
A Sobel filter is implemented in hardware as shown in Fig. 1. It receives the filter values and 8-bit grayscale pixels and performs the calculations.
Depending on the values of the horizontal and vertical kernels, the filter detects
an edge through adders, shift registers, etc. We replaced the five adders in gray boxes
in Fig. 1 with approximate adders and compared the result with the precise Sobel filter. The filter
performs addition with five 11-bit Ripple Carry Adders (RCAs), and the logic of the
approximate adder is applied to the low part of each RCA.
2.2 Approximate Adder
Approximate adders are full adders that use approximate computing; they usually
simplify the transistor-level design to create a tradeoff among area, power, delay, and accuracy.
Approximate adders include the Approximate Mirror Adder (AMA) [6], Approximate XOR/XNOR-based Adder (AXA) [7], and Inexact Adder (InXA) [8]. AMA is an approximate adder that simplifies the complexity by reducing the number
of transistors and load capacitance in conventional mirror add cells. It was designed
to prevent short circuits or open circuits from occurring in the simplified scheme
and to ensure that the full adder has minimal errors in the truth table. AMA gives
minimal loss to output quality and provides benefits of power, area, and delay through
a tradeoff.
AXA is derived from the accurate XOR/XNOR-based adder, which is implemented using 10
transistors, by removing two or four transistors. This design shows
better performance than conventional accurate adders, such as lower propagation delay,
lower power consumption, and smaller area, due to reduced logic complexity and node capacitance.
InXA simplifies the logic of a precise adder using a very small number of transistors
(6 or 8 transistors only). The small number of transistors results in delay and power
reduction because of the smaller area and reduced capacitance. In addition, it provides
fewer erroneous outputs compared with AMA or AXA.
As mentioned earlier, the logic of approximate adders is applied to the low-part
operation of the 11-bit RCA of a Sobel filter. The truth-table outputs of the 10
approximate 1-bit full adders used in this study are summarized in Table 1. As shown, each adder has 2 - 4 erroneous cases out of the 8 input combinations. AXA1 and AXA2 result in the
maximum of 4 errors. AMA2 and InXA3 have the same error results. In the simulation,
the approximate logic is first applied to the lowest 5 bits of the RCA, and the approximate
part is then extended one bit at a time until it covers all 11 bits.
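The error counts quoted above can be checked with a small behavioral model. The sketch below encodes three of the adders as deviations from the exact full adder, transcribed from Table 1; the dictionary layout and function names are assumptions made for illustration and do not reflect the transistor-level designs.

```python
from itertools import product

def exact_full_adder(a, b, cin):
    """Exact 1-bit full adder; returns (Cout, S)."""
    total = a + b + cin
    return total >> 1, total & 1

# Deviations from the exact full adder, transcribed from Table 1.
# Keys are (A, B, Cin); values are the approximate (Cout, S).
DEVIATIONS = {
    "AMA1": {(0, 1, 0): (1, 0), (1, 0, 0): (0, 0)},
    "AXA1": {(0, 1, 0): (1, 0), (0, 1, 1): (0, 1),
             (1, 0, 0): (1, 0), (1, 0, 1): (0, 1)},
    "InXA1": {(0, 0, 1): (1, 1), (1, 1, 0): (0, 0)},
}

def approx_full_adder(name, a, b, cin):
    """Behavioral model: use the tabulated deviation if any, else the exact output."""
    return DEVIATIONS[name].get((a, b, cin), exact_full_adder(a, b, cin))

def count_errors(name):
    """Number of the 8 input combinations where Cout or S is wrong."""
    return sum(approx_full_adder(name, a, b, c) != exact_full_adder(a, b, c)
               for a, b, c in product((0, 1), repeat=3))

for name in DEVIATIONS:
    print(name, count_errors(name))  # AMA1 2, AXA1 4, InXA1 2
```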
2.3 Error Metrics
Approximate computing, unlike precise computing, has reliability and accuracy
issues, so new metrics are needed to understand and evaluate the behavior of
designs with these different tradeoffs and to assess their efficiency. Such metrics
may also offer new perspectives for approximate computing design
[9].
As mentioned earlier, appropriate error metrics are used for the analysis of Sobel
filters with approximate computing. The difference between the exact result ($\textit{R}$)
and the approximate result ($\bar{R}$) is called the Error Distance (ED),
$\textit{ED} = |R-\bar{R}|$. ED and Average ED (AED) [9] are used to determine the reliability of the adder output. Error Rate (ER) is the
percentage of all outputs that contain an error, which helps to understand the degradation
caused by approximate computing [10]. AED and ER were used as the error metrics in this study.
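A minimal sketch of these metrics in Python follows. It assumes that AED is the arithmetic mean of ED over all outputs and that ER counts every output whose approximate value differs from the exact one; the function names and sample values are illustrative only.

```python
def error_distance(exact, approx):
    """ED = |R - R_bar| for a single output."""
    return abs(exact - approx)

def average_error_distance(exact_values, approx_values):
    """AED: mean ED over all outputs (assumed definition)."""
    pairs = list(zip(exact_values, approx_values))
    return sum(error_distance(r, a) for r, a in pairs) / len(pairs)

def error_rate(exact_values, approx_values):
    """ER: percentage of outputs that differ from the exact result."""
    pairs = list(zip(exact_values, approx_values))
    return 100.0 * sum(r != a for r, a in pairs) / len(pairs)

exact = [10, 200, 37, 0]
approx = [12, 200, 30, 1]
print(average_error_distance(exact, approx))  # (2 + 0 + 7 + 1) / 4 = 2.5
print(error_rate(exact, approx))              # 3 of 4 outputs differ: 75.0
```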
Table 1. Truth table for approximate adder. Each entry lists C$_{\mathrm{out}}$ S; ‘-’ means correct results both in C$_{\mathrm{out}}$
and S, ‘o’ means correct, and ‘x’ means wrong result.
| ABCin | AMA1 | AMA2 | AMA3 | AMA4 | AXA1 | AXA2 | AXA3 | InXA1 | InXA2 | InXA3 |
|---|---|---|---|---|---|---|---|---|---|---|
| 000 | - | 01(ox) | 01(ox) | - | - | 01(ox) | - | - | - | 01(ox) |
| 001 | - | - | - | - | - | - | - | 11(xo) | - | - |
| 010 | 10(xx) | - | 10(xx) | 00(ox) | 10(xx) | 00(ox) | 00(ox) | - | - | - |
| 011 | - | - | - | 01(xx) | 01(xx) | - | - | - | 11(ox) | - |
| 100 | 00(ox) | - | - | 10(xx) | 10(xx) | 00(ox) | 00(ox) | - | - | - |
| 101 | - | - | - | - | 01(xx) | - | - | - | 11(ox) | - |
| 110 | - | - | - | - | - | 11(ox) | - | 00(xo) | - | - |
| 111 | - | 10(ox) | 10(ox) | - | - | - | - | - | - | 10(ox) |
3. Sobel Edge Detection with Approximate Adder
In the simulation, output images of Sobel filters with precise adders and with
approximate adders were obtained using Verilog. The logic utilization of each Sobel
filter synthesized with Quartus and the error metrics AED and ER were analyzed
and compared. In each simulation, a 256-by-256 RGB bmp file, ``Lena,'' was converted
to grayscale. The Sobel filter was then applied to obtain one output image from the
precise filter and one from each filter with approximate computing applied. We then
compared the two images and computed the AED and ER.
The simulation used the Sobel filter shown in Fig. 1, which takes 8 gray pixels as input and outputs one edge-detected pixel value.
The 256-by-256 bmp image used as input has a 54-byte bitmap header, and
the rest of the file contains the RGB pixel data of the image. The RGB pixel data
obtained from the bmp file are converted to grayscale and input to the Sobel filter
based on the pixel location information found in the header of the file.
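The input stage described above could be sketched as follows. This is an illustration under stated assumptions, not the authors' testbench: it assumes an uncompressed 24-bit BMP, uses the ITU-R BT.601 luma weights for the grayscale conversion (the text does not say which conversion was used), and the function name `read_bmp_gray` is hypothetical.

```python
import struct

def read_bmp_gray(data):
    """Parse an uncompressed 24-bit BMP (as bytes) into rows of gray values."""
    if data[:2] != b"BM":
        raise ValueError("not a BMP file")
    # Offset of the pixel array; 54 when only the standard headers are present.
    pixel_offset = struct.unpack_from("<I", data, 10)[0]
    width, height = struct.unpack_from("<ii", data, 18)
    bits_per_pixel = struct.unpack_from("<H", data, 28)[0]
    if bits_per_pixel != 24:
        raise ValueError("this sketch handles 24-bit RGB only")
    row_size = (width * 3 + 3) & ~3  # each row is padded to a multiple of 4 bytes
    rows = []
    for y in range(abs(height)):
        start = pixel_offset + y * row_size
        row = []
        for x in range(width):
            b, g, r = data[start + 3 * x: start + 3 * x + 3]  # stored as B, G, R
            # Assumed grayscale conversion (BT.601 weights, integer arithmetic).
            row.append((299 * r + 587 * g + 114 * b) // 1000)
        rows.append(row)
    if height > 0:  # positive height means rows are stored bottom-up
        rows.reverse()
    return rows

# Hypothetical usage with a file on disk:
# gray = read_bmp_gray(open("lena.bmp", "rb").read())
```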
One filter operation requires the gray pixels at the positions covered by the 3-by-3
mask. The filter operation of the structure in Fig. 1 is performed on the 8 pixels excluding the center, and the resulting pixel is placed at the
center position of the 3-by-3 mask. This operation is performed 64,516 times (254
x 254), once at every position except the border of the 256-by-256 image. The values of
the output image other than the border were obtained through this operation. The border pixels were
padded with a value of 255.
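The scan over the image can be sketched in Python as below. The gradients are written with subtractions and shifts to mirror the hardware description; combining them as |gx| + |gy| saturated to 8 bits is an assumption, since the text does not spell out the final magnitude step, and the function names are illustrative.

```python
def sobel_pixel(p):
    """p is a 3x3 window; returns an 8-bit edge value using subtract, shift, add."""
    gx = (p[0][2] - p[0][0]) + ((p[1][2] - p[1][0]) << 1) + (p[2][2] - p[2][0])
    gy = (p[0][0] - p[2][0]) + ((p[0][1] - p[2][1]) << 1) + (p[0][2] - p[2][2])
    # Assumption: magnitude approximated as |gx| + |gy| and saturated to 255.
    return min(abs(gx) + abs(gy), 255)

def sobel_image(gray, border=255):
    """Filter every interior pixel; border pixels keep the padding value."""
    height, width = len(gray), len(gray[0])
    out = [[border] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [row[x - 1:x + 2] for row in gray[y - 1:y + 2]]
            out[y][x] = sobel_pixel(window)
    return out

# A flat 256x256 image has no edges, so every filtered pixel is 0.
flat = [[100] * 256 for _ in range(256)]
edges = sobel_image(flat)
print(sum(row.count(0) for row in edges))  # 64516 filtered positions (254 x 254)
print(edges[0][0])                         # 255, the border padding value
```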
The 8-bit gray pixels input to the Sobel filter are subtracted and shifted according
to the horizontal and vertical masks and are added together through the 11-bit RCAs
in the precise Sobel filter. In the Sobel filter with approximate adder logic applied,
the low part of each 11-bit RCA is composed of approximate 1-bit adders rather than precise 1-bit full
adders. As shown in Fig. 1, approximate computing is applied to the five 11-bit adders of the filter.
The approximate part starts at the lowest 5 bits and is extended one bit at a time
until it covers all 11 bits of the adder. The results are
summarized in Tables 2-4. Entries for which no output image was obtained at
all are shown in gray.
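The low-part approximation can be illustrated with a behavioral model of one 11-bit RCA. The sketch below uses the AMA1 truth table from Table 1 for the approximate cells; dropping the final carry-out and sweeping a synthetic grid of operands are assumptions, so the printed averages are not comparable to the AED values in Table 3.

```python
def exact_cell(a, b, cin):
    """Exact 1-bit full adder; returns (Cout, S)."""
    total = a + b + cin
    return total >> 1, total & 1

# AMA1 deviations from the exact cell, transcribed from Table 1.
AMA1 = {(0, 1, 0): (1, 0), (1, 0, 0): (0, 0)}

def ama1_cell(a, b, cin):
    return AMA1.get((a, b, cin), exact_cell(a, b, cin))

def ripple_add(x, y, approx_bits, width=11):
    """Ripple-carry add; the lowest approx_bits cells use the approximate logic."""
    carry, result = 0, 0
    for i in range(width):
        cell = ama1_cell if i < approx_bits else exact_cell
        carry, s = cell((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result  # assumption: the final carry-out is discarded

# Sweep the approximate part from 5 bits up to all 11 bits.
pairs = [(x, y) for x in range(0, 1024, 37) for y in range(0, 1024, 41)]
for k in range(5, 12):
    distances = [abs(ripple_add(x, y, k) - ((x + y) & 0x7FF)) for x, y in pairs]
    print(k, sum(distances) / len(distances))
```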
4. Results and Discussion
In the simulation, when an approximate adder was applied to the low 5 - 7 bits
of the RCA, all output images showed the result of edge detection. However, when it was applied
to 8 - 11 bits, edge detection failed completely for some types of approximate
adder. We compared the logic utilization after synthesizing each filter with Quartus,
using the 135% logic utilization of the precise filter as the baseline. The area was reduced
for most filters but increased for some. When the filter function was not performed
properly, the logic utilization was reported as 1%.
Logic utilization is a measure of how full a device is in Quartus. It is an index
based on the number of half-adaptive logic modules (half-ALMs) used in our design.
A logic utilization of 1% indicates that the half-ALMs are not implemented properly because
the logic module is not properly configured according to the design. This means that
when a certain approximate adder logic is applied excessively to the RCA, the correct
module is not synthesized in Quartus and the filter does not function properly. The filters to
which the approximate adders were applied were classified into four categories based on logic
utilization, AED, and ER.
Table 2. Logic utilization (%).
| bits | Precise | AMA1 | AMA2 | AMA3 | AMA4 | AXA1 | AXA2 | AXA3 | InXA1 | InXA2 | InXA3 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 135 | 135 | 82 | 88 | 93 | 119 | 132 | 134 | 88 | 137 | 82 |
| 6 | 135 | 136 | 88 | 81 | 89 | 112 | 72 | 134 | 77 | 138 | 88 |
| 7 | 135 | 135 | 83 | 87 | 74 | 112 | 130 | 136 | 74 | 140 | 83 |
| 8 | 135 | 142 | 82 | 79 | 79 | 112 | 130 | 136 | 66 | 142 | 82 |
| 9 | 135 | 145 | 1 | 1 | 72 | 108 | 126 | 139 | 65 | 148 | 1 |
| 10 | 135 | 143 | 1 | 1 | 73 | 104 | 116 | 136 | 65 | 147 | 1 |
| 11 | 135 | 115 | 1 | 1 | 74 | 86 | 65 | 127 | 65 | 139 | 1 |
Table 3. Average Error Distance (AED).
| bits | AMA1 | AMA2 | AMA3 | AMA4 | AXA1 | AXA2 | AXA3 | InXA1 | InXA2 | InXA3 |
|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 12.19 | 18.68 | 17.96 | 19.07 | 14.48 | 24.90 | 27.39 | 34.67 | 24.18 | 18.68 |
| 6 | 21.04 | 56.16 | 46.47 | 39.45 | 24.64 | 57.54 | 53.67 | 75.17 | 54.59 | 56.16 |
| 7 | 35.39 | 132.06 | 110.17 | 74.91 | 57.15 | 117.57 | 94.57 | 129.83 | 99.88 | 132.06 |
| 8 | 57.15 | 174.85 | 173.67 | 99.75 | 102.17 | 157.88 | 132.71 | 133.96 | 110.30 | 174.85 |
| 9 | 59.44 | 175.24 | 175.24 | 100.27 | 138.94 | 160.30 | 137.76 | 133.82 | 110.56 | 175.24 |
| 10 | 59.18 | 175.24 | 175.24 | 75.43 | 142.34 | 142.54 | 117.64 | 46.14 | 95.75 | 175.24 |
| 11 | 30.34 | 175.24 | 175.24 | 33.75 | 142.34 | 175.24 | 68.55 | 46.14 | 30.61 | 175.24 |
Table 4. Error Rate (ER).
| bits | AMA1 | AMA2 | AMA3 | AMA4 | AXA1 | AXA2 | AXA3 | InXA1 | InXA2 | InXA3 |
|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 87.4% | 89.0% | 89.4% | 89.1% | 88.4% | 92.3% | 90.3% | 91.2% | 85.4% | 89.0% |
| 6 | 89.0% | 91.6% | 91.5% | 90.4% | 90.4% | 92.6% | 91.8% | 92.3% | 87.2% | 91.6% |
| 7 | 89.4% | 92.2% | 92.0% | 91.5% | 91.5% | 92.8% | 92.6% | 92.9% | 88.0% | 92.2% |
| 8 | 89.4% | 91.8% | 91.9% | 91.4% | 92.4% | 93.1% | 92.9% | 92.9% | 88.3% | 91.8% |
| 9 | 90.7% | 91.8% | 91.8% | 91.6% | 92.1% | 92.0% | 94.9% | 92.9% | 88.2% | 91.8% |
| 10 | 90.7% | 91.8% | 91.8% | 91.4% | 92.0% | 93.1% | 96.3% | 96.4% | 88.1% | 91.8% |
| 11 | 90.5% | 91.8% | 91.8% | 90.8% | 92.0% | 91.8% | 96.8% | 96.4% | 86.3% | 91.8% |
4.1 AMA1 and AMA4
AMA1 has logic utilization equal to or higher than that of the precise adder at 5 - 10
bits, so its area is not superior to that of other approximate adders. However, with the AMA1 approximate adders, the AED results
are the best, and ER is the second best after InXA2. When AMA1 is applied to all 11 bits,
the logic utilization is only 115%, which is about 15% lower than the precise adder's 135%.
The AED value at 11 bits is also the best among the approximate adders, and the image comes out
clearer than with 6 - 10 bits. The filter's performance is good, but its area is disappointing.
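The area comparison here is relative to the precise filter rather than a difference in percentage points. A minimal sketch of that arithmetic, using values from Table 2 (the helper name `relative_change` is an assumption for illustration):

```python
PRECISE = 135  # logic utilization (%) of the precise filter, from Table 2

def relative_change(value, baseline=PRECISE):
    """Relative change in logic utilization versus the baseline, in percent."""
    return 100.0 * (value - baseline) / baseline

print(round(relative_change(115), 1))  # AMA1 at 11 bits: -14.8
print(round(relative_change(72), 1))   # AMA4 at 9 bits: -46.7
```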
Similar to AMA1, AMA4 applied to all 11 bits has better error metrics than when it is
applied to 5 - 10 bits, and a clean image is output. It also shows good error metrics
at 5 - 10 bits. Moreover, AMA4's logic utilization is significantly lower than
the baseline at 72 - 93%. This means that the approximate adders with AMA4 logic take
up a small area and have good performance.
AMA1 and AMA4 both show similar patterns of AED and ER, and both perform well
in terms of error metrics. However, in logic utilization, they show different aspects.
While AMA1 shows poorer logic utilization than precise adders, AMA4 shows better logic
utilization than AMA1, precise adders, and all approximate adders used in the simulation.
A comparison of the output image and various performance metrics of AMA4 is shown
in Figs. 2 and 3, respectively.
4.2 AMA2 (InXA3) and AMA3
AMA2 and InXA3 have different transistor-level schematics, but their truth tables
are the same, so their simulation results are the same. They do not perform normal
edge detection at 8 bits or more, and their logic utilization is 1% at 9 bits or more, so they
cannot function as normal filters. At 5 - 7 bits, with proper edge detection, logic
utilization is as low as 82 - 88%, and AED and ER also show good values. AMA3 is numerically
similar to AMA2, and likewise, when AMA3 is applied to 8 bits or more, it cannot
function as a normal filter.
AMA2, InXA3, and AMA3 have moderately decent AED and ER at 5 and 6 bits, and
their logic utilization is low, showing usable performance and features. However,
at 7 bits, AED and ER increase sharply, resulting in lower performance. At 8 bits
or more, not only do the metrics fail to reach appropriate values, but the filter also does not function
properly. In summary, they show some filter performance at 5 and 6 bits, but they
cannot be used elsewhere. A comparison of the output image and various performance
metrics of AMA2 is shown in Figs. 4 and 5, respectively.
Fig. 2. Output image of (a) precise and AMA4 based approximation with lower-part (b)
5, (c) 6, (d) 7, (e) 8, (f) 9, (g) 10, (h) 11 bits.
Fig. 3. AMA4’s Logic utilization, AED and ER.
Fig. 4. Output image of (a) precise and AMA2 based approximation with lower-part (b)
5, (c) 6, (d) 7, (e) 8, (f) 9, (g) 10, (h) 11 bits.
Fig. 5. AMA2’s Logic utilization, AED and ER.
4.3 AXA1 and AXA2
AXA1 and AXA2 show good AED and ER at low bit counts, and they mostly perform
edge detection well. However, at 8 bits or higher, their error metrics are considerably worse
than those of AMA1 and AMA4 and are among the worst of all approximate
adders. Their overall logic utilization is also poor: AXA1's is around 110%, and AXA2's
is around 130%, which is no significant improvement
over the 135% of a precise adder. The output images also confirm that
the edges are hardly detected at 8 bits or more and that the images are not clean. They perform
well enough to be used at 5 - 7 bits, but in general, they do not reduce the area compared
with the precise filter. A comparison of the output image and various performance metrics
of AXA1 is shown in Figs. 6 and 7, respectively.
4.4 AXA3, InXA1, and InXA2
InXA1 has the best logic utilization among the adders at 65 - 88%, but its AED and ER
are not good. Uniquely, its AED increases from 5 to 9 bits and then decreases rapidly
after that. At 10 and 11 bits, it shows very low logic
utilization and a low AED, but unfortunately the ER is quite high. Applying
InXA1's logic to the filter does not look like a good choice.
AXA3 and InXA2 also have poor AED among the approximate adders; it increases from
5 to 9 bits and then decreases after that. However, their logic utilization is considerably
higher than InXA1's and even higher than the precise filter's. AXA3's ER is similar
to InXA1's, but InXA2 shows the best ER among the approximate adders. A comparison
of the output image and various performance metrics of InXA1 is shown in Figs. 8 and
9, respectively.
As a result, among the approximate adders in the simulation, the Sobel edge detection
filter using AMA4 shows the best performance based on the three metrics.
The approximate adders show a variety of area and error figures depending on
their type and can be selected for different purposes.
Fig. 6. Output image of (a) precise and AXA1 based approximation with lower-part (b)
5, (c) 6, (d) 7, (e) 8, (f) 9, (g) 10, (h) 11 bits.
Fig. 7. AXA1’s Logic utilization, AED and ER.
Fig. 8. Output image of (a) precise and InXA1 based approximation with lower-part
(b) 5, (c) 6, (d) 7, (e) 8, (f) 9, (g) 10, (h) 11 bits.
Fig. 9. InXA1’s Logic utilization, AED and ER.
5. Conclusion
This study used Verilog HDL and Quartus to model a precise Sobel filter and Sobel
filters with 10 types of approximate adder logic applied, and compared their output images
using appropriate error metrics. Each design was compared through AED and ER, which
are error metrics for approximate computing, as well as logic utilization based on
the number of used half-ALMs. The simulation used the logic of the approximate adders
AMA, AXA, and InXA, which perform approximate computing and make a tradeoff between
performance and accuracy.
Approximate filters behave differently depending on their logic: some cannot
perform the filter function when an approximate adder is applied to 8 bits
or more (AMA2, AMA3, and InXA3), and some have good error metrics (AMA1 and AMA4).
Among the 10 approximate adder logics simulated here, AMA4 took up a small area due
to its low usage of logic modules and had good error metrics. It also output the
best-quality edge-detected images. The simulation showed that the approximate adder logic
yielded tolerable error and acceptable quality when applied to a Sobel edge detection
filter. We confirmed that approximate computing can be used in various ways for edge
detection and that filters synthesized through Verilog HDL yield results with different
tradeoff characteristics from approximate computing applied at the transistor level.
ACKNOWLEDGMENTS
This research was supported by the Basic Science Research Program through the
National Research Foundation of Korea (NRF) and funded by the Ministry of Education
(NRF-2020R1F1A1055251). The EDA tool was supported by IC Design Education Center (IDEC),
Korea.
REFERENCES
[1] Han J., Orshansky M., May 2013, Approximate computing: an emerging paradigm for energy-efficient
design, in Proc. ETS, pp. 1-6
[2] Juneja M., Sandhu P. S., 2009, Performance evaluation of edge detection techniques
for images in spatial domain, International Journal of Computer Theory and Engineering,
Vol. 1, No. 5, pp. 614
[3] Sobel I., 1990, An Isotropic 3×3 Gradient Operator, Machine Vision for Three-Dimensional
Scenes, Freeman H. (ed.), Academic Press, NY, pp. 376-379
[4] Chaple G., Daruwala R. D., 2014, Design of Sobel operator based image edge detection
algorithm on FPGA, 2014 International Conference on Communication and Signal Processing,
Melmaruvathur, pp. 788-792
[5] Soares L. B., da Rosa M. M. A., Diniz C. M., da Costa E. A. C., Bampi S., 2018, Exploring
power-performance-quality tradeoff of approximate adders for energy efficient Sobel
filtering, 2018 IEEE 9th Latin American Symposium on Circuits & Systems (LASCAS),
Puerto Vallarta, pp. 1-4
[6] Gupta V., et al., 2013, Low-Power Digital Signal Processing Using Approximate Adders,
IEEE TCAD, Vol. 32, No. 1, pp. 124-137
[7] Yang Z., et al., 2013, Approximate XOR/XNOR-based adders for inexact computing, in
Proc. IEEE Conf. on Nanotechnology
[8] Almurib H. A., et al., 2016, Inexact Designs for Approximate Low Power Addition by Cell
Replacement, DATE, pp. 660
[9] Liang J., Han J., Lombardi F., 2013, New metrics for the reliability of approximate
and probabilistic adders, IEEE Transactions on Computers, Vol. 62, No. 9, pp. 1760-1771
[10] Breuer M. A., 2004, Intelligible test techniques to support error-tolerance, in Proc.
IEEE Asian Test Symposium, pp. 386-393
Author
Yunchul Chung is in the bachelor's degree program in Electronic and Electrical
Engineering at Hongik University, Seoul, Korea. His research interests are system
and circuit design and approximate computing.
Youngmin Kim received a BSc in electrical engineering from Yonsei University, Seoul,
Korea, in 1999, and an MSc and a PhD in electrical engineering from the University
of Michigan, Ann Arbor, in 2003 and 2007, respectively. He held a senior engineering
position at Qualcomm in San Diego, CA. He is currently an associate professor at Hongik
University, Seoul, South Korea. Prior to joining Hongik University, he was with the
School of Computer and Information Engineering at Kwangwoon University, Seoul, South
Korea, and the School of Electrical and Computer Engineering at the Ulsan National
Institute of Science and Technology (UNIST), Ulsan, South Korea. His research interests
include embedded systems, variability-aware design methodologies, design for manufacturability,
design and technology co-optimization methodologies, and low-power and 3D IC designs.