
REFERENCES

[1] A. Dosovitskiy et al., “An image is worth 16x16 words: Transformers for image recognition at scale,” arXiv preprint arXiv:2010.11929, 2020.
[2] Z. Dai, H. Liu, Q. V. Le, and M. Tan, “CoAtNet: Marrying convolution and attention for all data sizes,” in Proc. Advances in Neural Information Processing Systems, 2021, vol. 34, pp. 3965-3977.
[3] H. Touvron, M. Cord, M. Douze, F. Massa, A. Sablayrolles, and H. Jégou, “Training data-efficient image transformers & distillation through attention,” in Proc. Int. Conf. Machine Learning, 2021, pp. 10347-10357.
[4] K. He, X. Chen, S. Xie, Y. Li, P. Dollár, and R. Girshick, “Masked autoencoders are scalable vision learners,” arXiv preprint arXiv:2111.06377, 2021.
[5] Z. Liu et al., “Swin Transformer: Hierarchical vision transformer using shifted windows,” in Proc. IEEE/CVF Int. Conf. Computer Vision, 2021, pp. 10012-10022.
[6] Z. Xia, X. Pan, S. Song, L. E. Li, and G. Huang, “Vision transformer with deformable attention,” arXiv preprint arXiv:2201.00520, 2022.
[7] K. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask R-CNN,” in Proc. IEEE Int. Conf. Computer Vision, 2017, pp. 2961-2969.
[8] A. G. Howard et al., “MobileNets: Efficient convolutional neural networks for mobile vision applications,” arXiv preprint arXiv:1704.04861, 2017.
[9] Z. Liu et al., “Swin Transformer V2: Scaling up capacity and resolution,” arXiv preprint arXiv:2111.09883, 2021.
[10] N. Ahmed, T. Natarajan, and K. R. Rao, “Discrete cosine transform,” IEEE Transactions on Computers, vol. C-23, no. 1, pp. 90-93, 1974.
[11] J. Shin and H. Kim, “RL-SPIHT: Reinforcement learning based adaptive selection of compression ratio for 1-D SPIHT algorithm,” IEEE Access, vol. 9, pp. 82485-82496, 2021.
[12] H. Kim, A. No, and H.-J. Lee, “SPIHT algorithm with adaptive selection of compression ratio depending on DWT coefficients,” IEEE Transactions on Multimedia, vol. 20, no. 12, pp. 3200-3211, Dec. 2018.
[13] Y. Rao, W. Zhao, Z. Zhu, J. Lu, and J. Zhou, “Global filter networks for image classification,” in Proc. Advances in Neural Information Processing Systems, 2021, vol. 34.
[14] K. Xu, M. Qin, F. Sun, Y. Wang, Y.-K. Chen, and F. Ren, “Learning in the frequency domain,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Jun. 2020, pp. 1740-1749.
[15] X. Shen et al., “DCT-Mask: Discrete cosine transform mask representation for instance segmentation,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Jun. 2021, pp. 8720-8729.
[16] C. Scribano, G. Franchini, M. Prato, and M. Bertogna, “DCT-Former: Efficient self-attention with discrete cosine transform,” arXiv preprint arXiv:2203.01178, 2022.
[17] A. Krizhevsky and G. Hinton, “Learning multiple layers of features from tiny images,” Technical Report, University of Toronto, 2009.
[18] Y. Le and X. S. Yang, “Tiny ImageNet visual recognition challenge,” 2015.
[19] J. Choi, D. Chun, H. Kim, and H.-J. Lee, “Gaussian YOLOv3: An accurate and fast object detector using localization uncertainty for autonomous driving,” in Proc. IEEE/CVF Int. Conf. Computer Vision, 2019, pp. 502-511.
[20] G. K. Wallace, “The JPEG still picture compression standard,” IEEE Transactions on Consumer Electronics, vol. 38, no. 1, pp. xviii-xxxiv, 1992.
[21] A. Vaswani et al., “Attention is all you need,” in Proc. Advances in Neural Information Processing Systems, 2017, vol. 30.
[22] S. Khan, M. Naseer, M. Hayat, S. W. Zamir, F. S. Khan, and M. Shah, “Transformers in vision: A survey,” ACM Computing Surveys (CSUR), 2021.
[23] NVIDIA, P. Vingelmann, and F. H. P. Fitzek, CUDA, release 10.2.89, 2020. [Online].
[24] R. Wightman, PyTorch Image Models, GitHub, 2019, doi: 10.5281/zenodo.4414861.
[25] M. Ehrlich, L. Davis, S.-N. Lim, and A. Shrivastava, “Quantization guided JPEG artifact correction,” in Proc. European Conference on Computer Vision, 2020.
[26] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2016, pp. 770-778.
[27] M. Tan and Q. Le, “EfficientNet: Rethinking model scaling for convolutional neural networks,” in Proc. Int. Conf. Machine Learning, PMLR, 2019, pp. 6105-6114.
[28] M. Tan and Q. Le, “EfficientNetV2: Smaller models and faster training,” in Proc. Int. Conf. Machine Learning, PMLR, 2021, pp. 10096-10106.
[29] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards real-time object detection with region proposal networks,” in Proc. Advances in Neural Information Processing Systems, 2015, vol. 28.
[30] J. Dai, H. Qi, Y. Xiong, Y. Li, G. Zhang, H. Hu, and Y. Wei, “Deformable convolutional networks,” in Proc. IEEE/CVF Int. Conf. Computer Vision, 2017, pp. 764-773.
[31] I. Loshchilov and F. Hutter, “Decoupled weight decay regularization,” arXiv preprint arXiv:1711.05101, 2017.