
  1. (School of Electronic and Electrical Engineering, Hongik University, Seoul, Korea ytony357@naver.com )
  2. (School of Electronic and Electrical Engineering, Hongik University, Seoul, Korea youngmin@hongik.ac.kr)



Approximate computing, Approximate adder, Sobel edge detection, Error metrics

1. Introduction

Recently, energy-efficient approximate computing has been attracting attention as many data-intensive and error-tolerant designs emerge [1]. Approximate computing achieves energy efficiency by gaining advantages in area, power, and delay in exchange for giving up some accuracy. This is useful in image and video processing, machine learning, and similar applications, which process large amounts of data but do not require high accuracy. Such processing requires only an acceptable level of accuracy and is error-tolerant. Approximate computing is therefore suitable for edge detection, which finds the pixels corresponding to the edges of an image.

An edge-detection algorithm extracts boundaries from an image to obtain meaningful data. This is useful in image processing because it leaves important structural feature data of the image and filters out less important data [2]. Among the various edge detection methods, Sobel edge detection is an efficient algorithm with little arithmetic complexity [3] and is widely used with FPGAs [4].

We applied approximate computing to the adders of an edge detector using a Sobel filter and compared the errors against the correct result. Approximate computing was implemented and simulated by applying the logic of AMA, AXA, and InXA approximate adders to the five 11-bit adders of a Sobel filter. Based on the simulation results, each approximate adder can be characterized by its logic utilization, Average Error Distance (AED), and Error Rate (ER).

AMA4 shows good values in AED and ER, InXA3 does not function as a filter when the approximate adder is applied to 8 bits or more, and AXA1 shows moderately good logic utilization, AED, and ER at 5 - 7 bits. InXA1 does not provide good AED and ER values, but its logic utilization is relatively acceptable at 65 - 88%. Overall, AMA4 shows the best results when compared using the various metrics. This shows that when a Sobel filter with approximate computing is implemented in an FPGA, results with acceptable errors are obtained. In addition, results with various tradeoffs can be expected when approximate computing is applied to other edge detection algorithms implemented in an FPGA.

Fig. 1. Sobel filter on hardware [3].

../../Resources/ieie/IEIESPC.2021.10.4.355/fig1.png

In Section 2, we look at Sobel edge detection, approximate adders, and error metrics. In Section 3, we describe the simulation process of a Sobel edge detection filter with an approximate adder applied. The simulation was conducted in Quartus using Verilog HDL, and the logic utilization obtained from the synthesized filters, the AED, and the ER were compared through figures. Section 4 shows the simulation results, which were checked using the output image, AED, and ER for each approximate adder. Finally, Section 5 presents the conclusion.

2. Related Work

2.1 Sobel Edge Detection

A Sobel operator is a 3-by-3 kernel edge detection filter that performs an algorithm to detect edges in vertical and horizontal directions. The horizontal and vertical kernels are $G_{\mathrm{x}} = \left[\begin{array}{rrr} -1 & 0 & 1\\ -2 & 0 & 2\\ -1 & 0 & 1 \end{array}\right]$ and $G_{\mathrm{y}} = \left[\begin{array}{rrr} 1 & 2 & 1\\ 0 & 0 & 0\\ -1 & -2 & -1 \end{array}\right]$. Edge detection operates on gray pixels: each pixel of a 3-by-3 gray window is multiplied by the kernel value at the corresponding position to find the amount of change at the center pixel [5].
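The kernel arithmetic above can be illustrated with a minimal Python sketch. It is not the hardware design of the paper; in particular, combining the two gradients as |gx| + |gy| clamped to 255 is an assumption made here for illustration, and the name `sobel_at` is hypothetical.

```python
# Sobel kernels as given in the text (horizontal Gx, vertical Gy).
GX = [[-1, 0, 1],
      [-2, 0, 2],
      [-1, 0, 1]]
GY = [[ 1,  2,  1],
      [ 0,  0,  0],
      [-1, -2, -1]]

def sobel_at(patch):
    """Return (gx, gy, magnitude) for a 3x3 patch of 8-bit gray pixels.

    Assumption: the magnitude is |gx| + |gy| clamped to 255; the text does
    not state how the two gradients are combined.
    """
    gx = sum(GX[r][c] * patch[r][c] for r in range(3) for c in range(3))
    gy = sum(GY[r][c] * patch[r][c] for r in range(3) for c in range(3))
    return gx, gy, min(255, abs(gx) + abs(gy))

# A vertical edge: two dark columns on the left, one bright column on the right.
patch = [[10, 10, 200],
         [10, 10, 200],
         [10, 10, 200]]
gx, gy, mag = sobel_at(patch)
```

Because the positive kernel weights sum to 4, each gradient of 8-bit pixels is bounded in magnitude by 4 × 255 = 1020, which fits in an 11-bit two's complement word.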

A Sobel filter can be implemented in hardware, as shown in Fig. 1. It receives filter values and 8-bit grayscale pixels and performs calculations. Depending on the values of the horizontal and vertical kernels, the filter detects an edge through adders, shift registers, and other components. We replaced the five gray-boxed adders shown in Fig. 1 with approximate adders and compared the result with the precise Sobel filter. The filter performs addition with five 11-bit Ripple Carry Adders (RCAs), and the logic of the approximate adder is applied to the lower part of each RCA addition.

2.2 Approximate Adder

Approximate adders are full adders with approximate computing applied, which usually simplify the transistor-level design to create a tradeoff among area, power, delay, and accuracy. Approximate adders include the Approximate Mirror Adder (AMA) [6], Approximate XOR/XNOR-based Adder (AXA) [7], and Inexact Adder (InXA) [8]. AMA is an approximate adder that reduces complexity by reducing the number of transistors and the load capacitance of conventional mirror adder cells. It was designed to prevent short circuits or open circuits from occurring in the simplified schematic and to ensure that the full adder has minimal errors in its truth table. AMA causes minimal loss of output quality and provides benefits in power, area, and delay through this tradeoff.

AXA is based on the accurate XOR/XNOR-based adder, which is implemented using 10 transistors, and reduces its transistor count by two or four. This design shows better performance than conventional accurate adders, with lower propagation delay, lower power consumption, and smaller area due to reduced logic complexity and node capacitance.

InXA simplifies the logic of a precise adder using a very small number of transistors (6 or 8 transistors only). The small number of transistors results in delay and power reduction because of the smaller area and reduced capacitance. In addition, it provides fewer erroneous outputs compared with AMA or AXA.

As mentioned earlier, the logic of the approximate adders is applied to the lower part of the 11-bit RCAs of a Sobel filter. The 1-bit full adder outputs of the 10 approximate adders used in this study are summarized in Table 1. As shown, each adder has 2 - 4 errors out of 8 total input cases. AXA1 and AXA2 have the maximum of 4 errors. AMA2 and InXA3 have the same error results. In the simulation, the approximate logic is first applied to the lower 5 bits of the RCA, and the approximated part is then extended one bit at a time until it covers all 11 bits.
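As an illustration of how the truth tables in Table 1 can be read, the following Python sketch models a 1-bit approximate full adder as an exact full adder plus a small set of overridden input cases. Only four of the ten adders are transcribed, and the names `OVERRIDES`, `approx_fa`, and `error_cases` are hypothetical helpers introduced for this sketch.

```python
def exact_fa(a, b, cin):
    """Exact 1-bit full adder: returns (cout, s)."""
    total = a + b + cin
    return total >> 1, total & 1

# Deviations from the exact adder, transcribed from Table 1.
# Keys are (A, B, Cin); values are the approximate (Cout, S).
OVERRIDES = {
    "AMA1": {(0, 1, 0): (1, 0), (1, 0, 0): (0, 0)},
    "AMA2": {(0, 0, 0): (0, 1), (1, 1, 1): (1, 0)},
    "AXA1": {(0, 1, 0): (1, 0), (0, 1, 1): (0, 1),
             (1, 0, 0): (1, 0), (1, 0, 1): (0, 1)},
    "InXA1": {(0, 0, 1): (1, 1), (1, 1, 0): (0, 0)},
}

def approx_fa(name, a, b, cin):
    """Approximate 1-bit adder: the exact result unless Table 1 overrides it."""
    return OVERRIDES[name].get((a, b, cin), exact_fa(a, b, cin))

def error_cases(name):
    """Count the input combinations whose (Cout, S) differs from the exact adder."""
    return sum(approx_fa(name, a, b, c) != exact_fa(a, b, c)
               for a in (0, 1) for b in (0, 1) for c in (0, 1))

counts = {name: error_cases(name) for name in OVERRIDES}
```

Counting the erroneous rows this way reproduces the 2 - 4 errors per adder stated in the text for the adders transcribed here.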

2.3 Error Metrics

Approximate computing, unlike precise computing, has reliability and accuracy issues, so new metrics are needed to better understand and evaluate the behavior of computing with these different tradeoffs. A different and appropriate indicator is needed to evaluate the efficiency of a design with approximate computing. Metrics for approximate computing may offer new perspectives for approximate computing design [9].

As mentioned earlier, appropriate error metrics are used to analyze Sobel filters with approximate computing. The difference between the exact result ($R$) and the approximate result ($\bar{R}$) is called the Error Distance (ED), $ED = |R-\bar{R}|$. ED and Average ED (AED) [9] are used to determine the reliability of the adder output. Error Rate (ER) is the percentage of all outputs that contain an error, which helps to understand the degradation caused by approximate computing [10]. AED and ER were used as the error metrics in this study.
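The definitions above translate directly into a few lines of Python. This is a minimal sketch; the function names and the four-pixel sample data are illustrative assumptions, not values from the paper.

```python
def error_distance(exact, approx):
    """ED = |R - R_bar| for a single output value."""
    return abs(exact - approx)

def average_error_distance(exact_vals, approx_vals):
    """AED: mean of the error distances over all outputs."""
    eds = [error_distance(r, ra) for r, ra in zip(exact_vals, approx_vals)]
    return sum(eds) / len(eds)

def error_rate(exact_vals, approx_vals):
    """ER: percentage of outputs that differ from the exact result."""
    wrong = sum(r != ra for r, ra in zip(exact_vals, approx_vals))
    return 100.0 * wrong / len(exact_vals)

# Illustrative data only: four exact pixels and their approximate counterparts.
exact_pixels = [0, 255, 120, 30]
approx_pixels = [0, 250, 128, 30]
aed = average_error_distance(exact_pixels, approx_pixels)
er = error_rate(exact_pixels, approx_pixels)
```

Note that ER counts an output as erroneous regardless of how large the deviation is, so a design can have a high ER and still a small AED.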

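Table 1. Truth table for approximate adders. Each entry lists C$_{\mathrm{out}}$ S; ‘-’ means correct results in both C$_{\mathrm{out}}$ and S, ‘o’ means correct, and ‘x’ means a wrong result.

| A B C$_{\mathrm{in}}$ | AMA1 | AMA2 | AMA3 | AMA4 | AXA1 | AXA2 | AXA3 | InXA1 | InXA2 | InXA3 |
|---|---|---|---|---|---|---|---|---|---|---|
| 000 | - | 01(ox) | 01(ox) | - | - | 01(ox) | - | - | - | 01(ox) |
| 001 | - | - | - | - | - | - | - | 11(xo) | - | - |
| 010 | 10(xx) | - | 10(xx) | 00(ox) | 10(xx) | 00(ox) | 00(ox) | - | - | - |
| 011 | - | - | - | 01(xx) | 01(xx) | - | - | - | 11(ox) | - |
| 100 | 00(ox) | - | - | 10(xx) | 10(xx) | 00(ox) | 00(ox) | - | - | - |
| 101 | - | - | - | - | 01(xx) | - | - | - | 11(ox) | - |
| 110 | - | - | - | - | - | 11(ox) | - | 00(xo) | - | - |
| 111 | - | 10(ox) | 10(ox) | - | - | - | - | - | - | 10(ox) |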

3. Sobel Edge Detection with Approximate Adder

In the simulation, output images of Sobel filters with a precise adder and with an approximate adder were obtained using Verilog. The logic utilization of the Sobel filter synthesized with Quartus and the error metrics AED and ER were analyzed and compared. In each simulation, a 256-by-256 RGB bmp file, ``Lena,'' was converted to grayscale. A Sobel filter was applied to obtain output images for both the accurate design and the design with approximate computing applied. We then compared the two images along with the resulting AED and ER values.

The simulation used the Sobel filter shown in Fig. 1, which takes 8 gray pixels as input and outputs one edge-detected pixel value. The 256-by-256 bmp image used as input has a 54-byte bitmap header, and the rest of the file contains the RGB pixel data of the image. The RGB pixel data obtained from the bmp file is converted to grayscale and input to the Sobel filter based on the pixel location information found in the header of the file.
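The input stage described above can be sketched in Python as follows. The paper performs this step in its Verilog simulation and does not state the grayscale formula, so the integer luma weights (77, 150, 29)/256 used here are an assumption, as is the function name `parse_bmp_gray`. The sketch relies on standard properties of uncompressed 24-bit BMP files: a 14-byte file header followed by a 40-byte info header (54 bytes in total), pixels stored as B, G, R, rows stored bottom-up, and each row padded to a multiple of 4 bytes.

```python
import struct

def parse_bmp_gray(data):
    """Decode an uncompressed 24-bit BMP (bytes) into rows of 8-bit gray pixels."""
    # 14-byte file header: magic, file size, two reserved fields, pixel data offset.
    magic, _size, _r1, _r2, offset = struct.unpack_from("<2sIHHI", data, 0)
    # First fields of the 40-byte info header: header size, width, height, planes, bpp.
    _hdr, width, height, _planes, bpp = struct.unpack_from("<IiiHH", data, 14)
    if magic != b"BM" or bpp != 24 or height <= 0:
        raise ValueError("this sketch handles bottom-up 24-bit BMP files only")
    stride = (width * 3 + 3) // 4 * 4          # each row is padded to 4 bytes
    rows = []
    for y in range(height):
        start = offset + (height - 1 - y) * stride   # the file stores the bottom row first
        row = []
        for x in range(width):
            b, g, r = data[start + 3 * x: start + 3 * x + 3]
            # Assumed grayscale conversion: integer approximation of luma weights.
            row.append((77 * r + 150 * g + 29 * b) >> 8)
        rows.append(row)
    return rows
```

For a 256-by-256 image, each row occupies 768 bytes, which is already a multiple of 4, so no row padding is present and the pixel data starts right after the 54-byte header.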

In one filter operation, the gray pixels corresponding to the positions of the 3-by-3 mask are required. The filter operation of the structure in Fig. 1 is performed on the 8 pixels excluding the center, and the resulting pixel is placed at the center position of the 3-by-3 mask. This operation is performed 64,516 times (254 × 254), once for each position excluding the border of the 256-by-256 image. The values of the output image other than the border were obtained through this operation, and the border pixels were padded with a value of 255.
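The traversal described above can be sketched in Python as follows. The sketch uses exact arithmetic and, as an assumption not stated in the text, combines the gradients as |gx| + |gy| clamped to 255; the name `sobel_image` is hypothetical. Only the 8 neighbors of the center pixel contribute, since the center weight is zero in both kernels.

```python
def sobel_image(gray):
    """Apply the Sobel filter to a 2-D list of 8-bit gray pixels.

    Returns the output image and the number of filter operations performed.
    Border pixels are set to 255, as described in the text.
    """
    h, w = len(gray), len(gray[0])
    out = [[255] * w for _ in range(h)]
    ops = 0
    p = gray
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal kernel: right column minus left column.
            gx = (p[y-1][x+1] + 2 * p[y][x+1] + p[y+1][x+1]) \
               - (p[y-1][x-1] + 2 * p[y][x-1] + p[y+1][x-1])
            # Vertical kernel: top row minus bottom row.
            gy = (p[y-1][x-1] + 2 * p[y-1][x] + p[y-1][x+1]) \
               - (p[y+1][x-1] + 2 * p[y+1][x] + p[y+1][x+1])
            out[y][x] = min(255, abs(gx) + abs(gy))  # assumed magnitude
            ops += 1
    return out, ops

flat = [[100] * 256 for _ in range(256)]
edges, ops = sobel_image(flat)
```

For a 256-by-256 input, the loop visits 254 × 254 interior positions, which matches the operation count given in the text.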

The 8-bit gray pixels input to the Sobel filter are subtracted and shifted according to the horizontal and vertical masks and are added together through the 11-bit RCAs in the precise Sobel filter. In the Sobel filter with approximate adder logic applied, each 11-bit RCA is partly composed of approximate 1-bit adders rather than precise 1-bit full adders. As shown in Fig. 1, approximate computing is applied to the five 11-bit adders of the filter, and the approximate adder logic is applied to the lower part of each 11-bit RCA. The approximated part starts at the lower 5 bits and is widened one bit at a time until it covers all 11 bits of the adder. The results are summarized in Tables 2-4. The numbers where the output images are not obtained at all are shown in gray.
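A behavioral sketch of one such adder is given below in Python rather than Verilog. It models an 11-bit RCA whose lowest k bit positions use the AMA1 cell from Table 1 and sweeps k from 5 to 11. The unsigned modulo-2048 interpretation, the sample grid, and the names `rca_add` and `aed_sweep` are assumptions for illustration; the paper's results come from the full five-adder filter, not from a single adder.

```python
def exact_fa(a, b, cin):
    """Exact 1-bit full adder: returns (cout, s)."""
    t = a + b + cin
    return t >> 1, t & 1

# AMA1 deviations transcribed from Table 1: (A, B, Cin) -> (Cout, S).
AMA1 = {(0, 1, 0): (1, 0), (1, 0, 0): (0, 0)}

def rca_add(x, y, k, overrides=AMA1, width=11):
    """Ripple-carry add; the lowest k bit positions use the approximate cell."""
    carry, result = 0, 0
    for i in range(width):
        a, b = (x >> i) & 1, (y >> i) & 1
        cout, s = exact_fa(a, b, carry)
        if i < k:
            cout, s = overrides.get((a, b, carry), (cout, s))
        result |= s << i
        carry = cout
    return result  # the final carry-out is dropped (result is modulo 2**width)

def aed_sweep(samples, widths=range(5, 12)):
    """AED of the approximate adder over `samples` for each approximated width k."""
    out = {}
    for k in widths:
        eds = [abs(((x + y) % 2048) - rca_add(x, y, k)) for x, y in samples]
        out[k] = sum(eds) / len(eds)
    return out

# Deterministic sample grid (an arbitrary choice for this sketch).
samples = [(x, y) for x in range(0, 1024, 37) for y in range(0, 1024, 41)]
sweep = aed_sweep(samples)
```

With k = 0 the model reduces to an exact adder, which gives a simple sanity check before any approximate width is evaluated.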

4. Results and Discussion

In the simulation, when an approximate adder was applied to the lower 5 - 7 bits of the RCA, all output images showed edge detection results. When applied to 8 - 11 bits, however, some types of approximate adder did not perform edge detection at all. After synthesizing each filter with Quartus, we compared its logic utilization against the 135% logic utilization of the precise filter. Most filters reduced the area, but some increased it. When the filter function was not performed properly, the logic utilization was reported as 1%.

Logic utilization is a measure of how full a device is in Quartus. It is an index based on the number of half-adaptive logic modules (half-ALMs) used in our design. Logic utilization of 1% indicates that half-ALMs are not implemented properly because the logic module is not properly configured according to the design. This means that when a certain approximate adder logic is applied excessively to the RCA, the correct module in Quartus is not synthesized and does not function properly. The filter to which the approximate adder is applied was classified into 4 categories based on logic utilization, AED, and ER.
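Since every design is judged against the 135% logic utilization of the precise filter, it can help to express each value as a relative change. The short Python sketch below does this; the helper names are hypothetical, and the value 115 is the AMA1 utilization at 11 bits reported in this section.

```python
PRECISE = 135  # logic utilization (%) of the precise filter, from the text

def relative_change(util, baseline=PRECISE):
    """Percent change in logic utilization relative to the precise filter."""
    return 100.0 * (util - baseline) / baseline

def synthesis_failed(util):
    """The text reports a filter that does not function as 1% utilization."""
    return util == 1

ama1_11bit = relative_change(115)
```

Here a utilization of 115% corresponds to a reduction of roughly 15% relative to the precise filter, while the absolute difference is 20 percentage points.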

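Table 2. Logic utilization (%).

| bits | Precise | AMA1 | AMA2 | AMA3 | AMA4 | AXA1 | AXA2 | AXA3 | InXA1 | InXA2 | InXA3 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 135 | 135 | 82 | 88 | 93 | 119 | 132 | 134 | 88 | 137 | 82 |
| 6 | 135 | 136 | 88 | 81 | 89 | 112 | 72 | 134 | 77 | 138 | 88 |
| 7 | 135 | 135 | 83 | 87 | 74 | 112 | 130 | 136 | 74 | 140 | 83 |
| 8 | 135 | 142 | 82 | 79 | 79 | 112 | 130 | 136 | 66 | 142 | 82 |
| 9 | 135 | 145 | 1 | 1 | 72 | 108 | 126 | 139 | 65 | 148 | 1 |
| 10 | 135 | 143 | 1 | 1 | 73 | 104 | 116 | 136 | 65 | 147 | 1 |
| 11 | 135 | 115 | 1 | 1 | 74 | 86 | 65 | 127 | 65 | 139 | 1 |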

Table 3. Average Error Distance (AED).

| bits | AMA1 | AMA2 | AMA3 | AMA4 | AXA1 | AXA2 | AXA3 | InXA1 | InXA2 | InXA3 |
|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 12.19 | 18.68 | 17.96 | 19.07 | 14.48 | 24.90 | 27.39 | 34.67 | 24.18 | 18.68 |
| 6 | 21.04 | 56.16 | 46.47 | 39.45 | 24.64 | 57.54 | 53.67 | 75.17 | 54.59 | 56.16 |
| 7 | 35.39 | 132.06 | 110.17 | 74.91 | 57.15 | 117.57 | 94.57 | 129.83 | 99.88 | 132.06 |
| 8 | 57.15 | 174.85 | 173.67 | 99.75 | 102.17 | 157.88 | 132.71 | 133.96 | 110.30 | 174.85 |
| 9 | 59.44 | 175.24 | 175.24 | 100.27 | 138.94 | 160.30 | 137.76 | 133.82 | 110.56 | 175.24 |
| 10 | 59.18 | 175.24 | 175.24 | 75.43 | 142.34 | 142.54 | 117.64 | 46.14 | 95.75 | 175.24 |
| 11 | 30.34 | 175.24 | 175.24 | 33.75 | 142.34 | 175.24 | 68.55 | 46.14 | 30.61 | 175.24 |

Table 4. Error Rate (ER).

| bits | AMA1 | AMA2 | AMA3 | AMA4 | AXA1 | AXA2 | AXA3 | InXA1 | InXA2 | InXA3 |
|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 87.4% | 89.0% | 89.4% | 89.1% | 88.4% | 92.3% | 90.3% | 91.2% | 85.4% | 89.0% |
| 6 | 89.0% | 91.6% | 91.5% | 90.4% | 90.4% | 92.6% | 91.8% | 92.3% | 87.2% | 91.6% |
| 7 | 89.4% | 92.2% | 92.0% | 91.5% | 91.5% | 92.8% | 92.6% | 92.9% | 88.0% | 92.2% |
| 8 | 89.4% | 91.8% | 91.9% | 91.4% | 92.4% | 93.1% | 92.9% | 92.9% | 88.3% | 91.8% |
| 9 | 90.7% | 91.8% | 91.8% | 91.6% | 92.1% | 92.0% | 94.9% | 92.9% | 88.2% | 91.8% |
| 10 | 90.7% | 91.8% | 91.8% | 91.4% | 92.0% | 93.1% | 96.3% | 96.4% | 88.1% | 91.8% |
| 11 | 90.5% | 91.8% | 91.8% | 90.8% | 92.0% | 91.8% | 96.8% | 96.4% | 86.3% | 91.8% |

4.1 AMA1 and AMA4

AMA1 has the same logic utilization as the precise adder or higher at 5 - 10 bits, so it is not superior to other approximate adders in area. However, AMA1 gives the best AED results, and its ER is the second best after InXA2. When all 11 bits use AMA1, the logic utilization is 115%, which is about 15% lower than that of the precise adder (135%). The AED is also the best among the adders at this width, and the image comes out more clearly than at 6 - 10 bits. The filter's performance is good, but its area is disappointing.

Similar to AMA1, AMA4 has better error metrics at 11 bits than at most narrower widths (for example, its AED at 11 bits is lower than at 6 - 10 bits), and a clean image is output. It also shows good error metrics at 5 - 10 bits. Moreover, AMA4's logic utilization of 72 - 93% is significantly lower than the 135% of the precise filter. This means that approximate adders with AMA4 logic take up a small area and have good performance.

AMA1 and AMA4 show similar patterns of AED and ER, and both perform well in terms of error metrics. However, they differ in logic utilization. While AMA1 shows the same or poorer logic utilization than the precise adder at most widths, AMA4 shows better logic utilization than AMA1, the precise adder, and most of the other approximate adders used in the simulation. A comparison of the output image and various performance metrics of AMA4 is shown in Figs. 2 and 3, respectively.

4.2 AMA2 (InXA3) and AMA3

AMA2 and InXA3 have different transistor-level schematics, but their truth tables are the same, so their simulation results are the same. They do not perform normal edge detection at 8 bits or more, and their logic utilization is 1% at 9 bits or more, so they cannot function as normal filters. At 5 - 7 bits, where edge detection works properly, logic utilization is as low as 82 - 88%, and AED and ER also show relatively good values. AMA3 is numerically similar to AMA2, and likewise it cannot function as a normal filter when applied to 8 bits or more.

AMA2, InXA3, and AMA3 have moderately decent AED and ER at 5 and 6 bits, and their logic utilization is low, giving usable performance. However, at 7 bits, AED and ER increase sharply, resulting in lower performance. At 8 bits or more, not only do the metrics fail to reach appropriate values, but the filter also does not function properly. In summary, they show some filter performance at 5 and 6 bits but cannot be used at other widths. A comparison of the output image and various performance metrics of AMA2 is shown in Figs. 4 and 5, respectively.

Fig. 2. Output image of (a) precise and AMA4 based approximation with lower-part (b) 5, (c) 6, (d) 7, (e) 8, (f) 9, (g) 10, (h) 11 bits.

../../Resources/ieie/IEIESPC.2021.10.4.355/fig2.png

Fig. 3. AMA4’s Logic utilization, AED and ER.

../../Resources/ieie/IEIESPC.2021.10.4.355/fig3.png

Fig. 4. Output image of (a) precise and AMA2 based approximation with lower-part (b) 5, (c) 6, (d) 7, (e) 8, (f) 9, (g) 10, (h) 11 bits.

../../Resources/ieie/IEIESPC.2021.10.4.355/fig4.png

Fig. 5. AMA2’s Logic utilization, AED and ER.

../../Resources/ieie/IEIESPC.2021.10.4.355/fig5.png

4.3 AXA1 and AXA2

AXA1 and AXA2 show good AED and ER at low bit widths, and they mostly perform edge detection well. However, at 8 bits or more, their error metrics are considerably worse than those of AMA1 and AMA4 and are among the worst of all the approximate adders. The overall logic utilization is also poor: AXA1 is around 110%, and AXA2 is around 130%, which is no significant improvement over the 135% of the precise adder. The output images also confirm that the edges are hardly detected at 8 bits or more and that the result is not clean. They perform well enough to be used at 5 - 7 bits, but in general they do not reduce the area much compared to the precise filter. A comparison of the output image and various performance metrics of AXA1 is shown in Figs. 6 and 7, respectively.

4.4 AXA3, InXA1, and InXA2

InXA1 has the best logic utilization among the adders at 65 - 88%, but its AED and ER are not good. Uniquely, its AED increases from 5 to 9 bits and then decreases rapidly. At 10 and 11 bits, InXA1 combines a relatively low AED with very low logic utilization, but its ER is quite high. Applying InXA1's logic to the filter therefore does not look attractive.

AXA3 and InXA2 also have poor AED among the approximate adders; it increases from 5 to 9 bits and then decreases. However, their logic utilization is considerably higher than InXA1's and often even higher than the precise filter's. AXA3's ER is similar to InXA1's, while InXA2 shows the best ER among the approximate adders. A comparison of the output image and various performance metrics of InXA1 is shown in Figs. 8 and 9, respectively.

As a result, among the approximate adders in the simulation, the Sobel edge detection filter with AMA4 shows the best performance based on the three metrics used. The approximate adders show a variety of area and error figures depending on their type and can be selected for different purposes.

Fig. 6. Output image of (a) precise and AXA1 based approximation with lower-part (b) 5, (c) 6, (d) 7, (e) 8, (f) 9, (g) 10, (h) 11 bits.

../../Resources/ieie/IEIESPC.2021.10.4.355/fig6.png

Fig. 7. AXA1’s Logic utilization, AED and ER.

../../Resources/ieie/IEIESPC.2021.10.4.355/fig7.png

Fig. 8. Output image of (a) precise and InXA1 based approximation with lower-part (b) 5, (c) 6, (d) 7, (e) 8, (f) 9, (g) 10, (h) 11 bits.

../../Resources/ieie/IEIESPC.2021.10.4.355/fig8.png

Fig. 9. InXA1’s Logic utilization, AED and ER.

../../Resources/ieie/IEIESPC.2021.10.4.355/fig9.png

5. Conclusion

This study used Verilog HDL to model Sobel filters with 10 types of approximate adder logic as well as a precise Sobel filter, synthesized them with Quartus, and compared the output images using appropriate error metrics. Each design was compared through AED and ER, which are error metrics for approximate computing, as well as logic utilization based on the number of half-ALMs used. The simulation used the logic of the approximate adders AMA, AXA, and InXA, which perform approximate computing and make a tradeoff between performance and accuracy.

Approximate filters fall into various types depending on their logic: some cannot perform the filter function when the approximate adder is applied to 8 bits or more (AMA2, AMA3, and InXA3), and some have good error metrics (AMA1 and AMA4). Among the 10 approximate adder logics simulated here, AMA4 took up a small area due to its low usage of logic modules and had good error metrics. It also output the best-quality edge-detected images. The simulation showed that the approximate adder logic yielded tolerable error and acceptable quality when applied to a Sobel edge detection filter. We confirmed that approximate computing can be used in various ways for edge detection and that filters synthesized from Verilog HDL yield results with tradeoff characteristics different from those of approximate computing applied at the transistor level.

ACKNOWLEDGMENTS

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) and funded by the Ministry of Education (NRF-2020R1F1A1055251). The EDA tool was supported by IC Design Education Center (IDEC), Korea.

REFERENCES

[1] Han J., Orshansky M., May 2013, Approximate computing: an emerging paradigm for energy-efficient design, in Proc. ETS, pp. 1-6
[2] Juneja M., Sandhu P. S., 2009, Performance evaluation of edge detection techniques for images in spatial domain, International Journal of Computer Theory and Engineering, Vol. 1, No. 5, pp. 614
[3] Sobel I., 1990, An Isotropic 3×3 Gradient Operator, Machine Vision for Three-Dimensional Scenes, Freeman H. (Ed.), Academic Press, NY, pp. 376-379
[4] Chaple G., Daruwala R. D., 2014, Design of Sobel operator based image edge detection algorithm on FPGA, 2014 International Conference on Communication and Signal Processing, Melmaruvathur, pp. 788-792
[5] Soares L. B., da Rosa M. M. A., Diniz C. M., da Costa E. A. C., Bampi S., Exploring power-performance-quality tradeoff of approximate adders for energy efficient Sobel filtering, 2018 IEEE 9th Latin American Symposium on Circuits & Systems (LASCAS), Puerto Vallarta, pp. 1-4
[6] Gupta V., et al., 2013, Low-Power Digital Signal Processing Using Approximate Adders, IEEE TCAD, Vol. 32, No. 1, pp. 124-137
[7] Yang Z., et al., 2013, Approximate XOR/XNOR-based adders for inexact computing, in Proc. IEEE Conf. on Nanotechnology
[8] Almurib H. A., et al., 2016, Inexact Designs for Approximate Low Power Addition by Cell Replacement, in Proc. DATE, pp. 660
[9] Liang J., Han J., Lombardi F., 2013, New metrics for the reliability of approximate and probabilistic adders, IEEE Transactions on Computers, Vol. 62, No. 9, pp. 1760-1771
[10] Breuer M. A., 2004, Intelligible test techniques to support error-tolerance, in Proc. IEEE Asian Test Symposium, pp. 386-393

Author

Yunchul Chung
../../Resources/ieie/IEIESPC.2021.10.4.355/au1.png

Yunchul Chung is in the bachelor's degree program in Electronic and Electrical Engineering at Hongik University, Seoul, Korea. His research interests are system and circuit design and approximate computing.

Youngmin Kim
../../Resources/ieie/IEIESPC.2021.10.4.355/au2.png

Youngmin Kim received a BSc in electrical engineering from Yonsei University, Seoul, Korea, in 1999, and an MSc and a PhD in electrical engineering from the University of Michigan, Ann Arbor, in 2003 and 2007, respectively. He held a senior engineering position at Qualcomm in San Diego, CA. He is currently an associate professor at Hongik University, Seoul, South Korea. Prior to joining Hongik University, he was with the School of Computer and Information Engineering at Kwangwoon University, Seoul, South Korea, and the School of Electrical and Computer Engineering at the Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea. His research interests include embedded systems, variability-aware design methodologies, design for manufacturability, design and technology co-optimization methodologies, and low-power and 3D IC designs.