Acceptance Ratio (2024): 21%

Image Recognition Processing Technology Based on Virtual Reality Technology and Adaptive Feature Fusion

https://doi.org/10.5573/IEIESPC.2025.14.6.715

(Daogui Li)

Image recognition technology has gradually improved with advances in deep learning algorithms. Nonetheless, target detection algorithms still suffer from low positioning accuracy, missed detections, and false detections in multi-target, large-scale, and complex environments. To improve image recognition, an Adaptive Weighted Fusion YOLOv4 object detection network is designed and applied to object detection in virtual reality scenarios. The network comprises a backbone network, a neck network, and a prediction network. An adaptive feature fusion module is incorporated into the backbone network to enhance the learning capacity of the residual blocks. A broad-scale cross-stage fusion network is implemented in the neck network to minimize information loss during feature fusion and to let the neck network use feature information more effectively. In addition, the SIoU loss function is used to improve convergence speed. When applied to natural scenery, urban roads, office environments, and plateau deserts, the Adaptive Weighted Fusion YOLOv4 algorithm achieved mAP scores of 0.9056, 0.9143, 0.9106, and 0.9812, respectively. Compared to other methods, it exhibited the least fluctuation amplitude in the image signal, with maximum amplitude reduction rates of 14.25%, 17.36%, and 22.36%, respectively. The method enhances network fusion and feature extraction, resulting in superior detection accuracy in image recognition.
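The idea of adaptively weighted feature fusion can be illustrated with a minimal sketch. The abstract does not say how the fusion weights are computed, so the softmax normalisation, the function names, and the use of flat lists in place of feature maps are all assumptions made for illustration; this is not the authors' module.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_weighted_fusion(feature_maps, scores):
    """Fuse same-length feature vectors as a weighted sum.

    feature_maps: list of equally long lists of floats (stand-ins for
                  feature maps from different branches; an assumption).
    scores:       one raw score per branch; in a real network these would
                  be learned parameters (hypothetical here).
    """
    if len(feature_maps) != len(scores):
        raise ValueError("need exactly one score per feature map")
    length = len(feature_maps[0])
    if any(len(f) != length for f in feature_maps):
        raise ValueError("feature maps must have the same length")
    weights = softmax(scores)
    return [
        sum(w * f[i] for w, f in zip(weights, feature_maps))
        for i in range(length)
    ]
```

With equal scores the fusion reduces to a plain average of the branches; raising one branch's score shifts the fused output toward that branch.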

Advancement in Visual SLAM: Feature, Object Detection and 3D Scene Understanding

https://doi.org/10.5573/IEIESPC.2025.14.6.728

(Yungu Won) ; (Sung Soo Hwang)

Recent advancements in SLAM systems have made significant progress in terms of performance, accuracy, and efficiency. Particularly, Visual SLAM, a type of SLAM that utilizes cameras to perform simultaneous localization and mapping, offers advantages such as cost reduction in hardware and the ability to leverage various visual information. However, Visual SLAM still faces challenges, such as lighting variations, dynamic objects, rapid camera movements, and environments with limited texture or complex structures. In this paper, we introduce efforts aimed at addressing these challenges. We present feature-based methods that utilize various features for feature extraction. Object-based methods are discussed, focusing on identifying dynamic objects and static environments to enhance accuracy. We explore research from the perspective of 3D scene understanding and representation, which involves analyzing images to comprehend 3D space.

Flat Color Design Integrating Image Color Reconstruction Algorithm and Fabric Color Extraction

https://doi.org/10.5573/IEIESPC.2025.14.6.741

(Jifeng Zhong) ; (Yongguang Wei)

The development of multimedia technology has increased the demand for diversified graphic color design. An image color reconstruction algorithm for fabric color extraction is proposed that incorporates fabric texture features. The study first extracts fabric features, then generates a matching color table, and finally reconstructs the image colors from that table. In tests of the proposed algorithm, the density peak clustering gradually converged to 0.36 on the test set after multiple iterations. The density peak clustering algorithm achieved an average pixel accuracy of 94.9% and a mean intersection over union of 91.1%, significantly higher than other methods. Compared with other methods, the proposed reconstruction algorithm had the lowest mean absolute error and mean square error, at 8.915 and 7.224, respectively, indicating a good regression effect and small error. It also had the highest structural similarity and peak signal-to-noise ratio, at 0.875 and 28.733, respectively. The algorithm can accurately reconstruct image colors for subsequent flat color design.
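For readers unfamiliar with the error metrics quoted above, the following is a minimal sketch of mean absolute error, mean square error, and peak signal-to-noise ratio. It assumes images are flattened to equal-length lists of pixel values with a peak value of 255; the function names are illustrative, and structural similarity is omitted because it needs windowed statistics.

```python
import math


def mae(reference, reconstructed):
    """Mean absolute error between two equal-length pixel sequences."""
    return sum(abs(r - x) for r, x in zip(reference, reconstructed)) / len(reference)


def mse(reference, reconstructed):
    """Mean square error between two equal-length pixel sequences."""
    return sum((r - x) ** 2 for r, x in zip(reference, reconstructed)) / len(reference)


def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    error = mse(reference, reconstructed)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / error)
```

Lower MAE and MSE and higher PSNR all indicate a reconstruction closer to the reference image.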

Visual Art Image Restoration Based on Regularized Low-rank Matrix Restoration Algorithm

https://doi.org/10.5573/IEIESPC.2025.14.6.753

(Shuping Lin)

In the digital age, visual art images serve as important carriers of information and aesthetic expression, so their integrity and quality are crucial. To repair damaged or degraded art images, a regularized low-rank matrix restoration algorithm is designed. A low-rank matrix recovery method based on regularized singular values is proposed by incorporating regularization strategies and singular value entropy functions. The algorithm is applied to visual art images of different types and styles, and its restoration effects are evaluated. In the experiments, the low-rank matrix restoration algorithm based on the regularized singular value function had a relative error of 0.001, a running time of 28.54 seconds, and an F1 value of 92.51. The algorithm had a relatively high peak signal-to-noise ratio on different images, with an average of 0.93. The results indicate that the restored images are of good quality and differ little from the originals. The regularized low-rank matrix restoration algorithm can effectively repair visual art images and improve their quality and observability. The research provides theoretical support for image restoration, guidance for algorithm design and improvement, and a reference for other related fields.

Generated Image Classification Model for Deep Learning-based Inpainting Model

https://doi.org/10.5573/IEIESPC.2025.14.6.764

(Han-gyul Baek) ; (Dong-shin Lim) ; (Hojun Song) ; (Vani Priyanka Gali) ; (Sang-hyo Park)

Thanks to image generation models (e.g., DALL-E 2) with high generation performance, generated image data that looks natural to humans has been widely used in computer vision research. In this paper, we start from the assumption that generated images may be unstable from the perspective of deep learning models. In particular, in the inpainting task of seamlessly restoring objects or areas in an image, an inpainting model may not perform as well on generated images. Through experiments, we demonstrate the vulnerability of the inpainting model to generated images and present a framework for classifying real and generated images to support future research on seamless inpainting.

Deep Learning Models for Automatic Animation Generation and Active Learning

https://doi.org/10.5573/IEIESPC.2025.14.6.776

(Xuelian Gao)

To address the problem of automatic animation generation, this study designs a deep learning network structure that avoids losing the original temporal information in animations, together with a multi-channel feature fusion mechanism for temporal enhancement that makes more effective use of time-frequency domain characteristics in mixed models. In any acoustic task, the spectrogram of a sound signal is visually discriminative. Some existing methods convert the spectrogram directly into an image and apply image processing techniques to the converted image, rather than learning from and analyzing the original spectrogram matrix data. Experiments show that a model learned from the converted image performs worse than a model learned directly from the spectrogram matrix data, because converting the matrix data into an image loses frequency information. The CLDNN network structure improves the word error rate by 4% compared to LSTM-based network structures, and using multi-scale features also improves the word error rate by 5%. The team behind CLDNN later also achieved a 45% performance improvement on a 2000-hour large-scale speech search task, indicating that the CLDNN architecture has good learning ability and robustness across data scales and environments. An innovative deep learning model is proposed that improves the classification accuracy of automatic animation generation tasks.
For speech animation in animation signals, animation recognition models can be divided into three categories: frame-based models, in which individual frames contain too little animation information and frames near connected animations are too similar; animation-based models, which require additional information on animation start and end times; and models that use convolutional networks to learn powerful representations of animations, together with a weight-sharing multi-objective classifier and its loss function.

Research on the Implementation and Performance Optimization of Supply Chain Finance System based on NSGA-III Parallel Algorithm

https://doi.org/10.5573/IEIESPC.2025.14.6.790

(Chang Liu) ; (Xiao Chen) ; (Haijing Liu)

Traditional optimization algorithms struggle with diversity and convergence speed, limiting their effectiveness in complex problem-solving. The combined computation and time costs of optimization and simulation further impede the ability to address intricate engineering structures. This paper enhances the NSGA-III algorithm and establishes a parallel joint simulation optimization platform aimed at high-dimensional, multi-objective challenges in engineering. An improved NSGA-III algorithm and a joint simulation optimization method are proposed to optimize car body structures. A comparative analysis of NSGA-II and NSGA-III reveals the enhanced algorithm’s computational advantages. The joint platform facilitates parallel solver calculations, leading to nearly doubled efficiency in optimizing a car’s body side wall compared to NSGA-II. The improved NSGA-III not only provides better optimization solutions for car body design but also addresses challenges in integrating NSGA technology with supply chain finance by introducing a new service model. By refining mixed operators, optimizing variation rates, and preserving population diversity, the improved algorithm significantly enhances performance in high-dimensional optimization, resulting in increased bending and torsion stiffness, reduced maximum stress, and minimized side wall mass. Performance analysis shows superior outcomes in 69% of test problems.
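NSGA-II and NSGA-III both rank candidate solutions by Pareto dominance. The sketch below shows only that shared building block for minimisation objectives; it does not reproduce the improved operators, the reference-point selection of NSGA-III, or the parallel simulation platform described above, and the function names are illustrative.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimised).

    a dominates b when a is no worse in every objective and strictly
    better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated_front(points):
    """Return the points that no other point dominates (the first front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For example, with two objectives such as side wall mass and maximum stress (both minimised), a design is kept in the first front only if no other design is at least as good on both and better on one.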

Research on the Design and Implementation of Preschool Education Book Recommendation System Based on Data Mining Algorithm

https://doi.org/10.5573/IEIESPC.2025.14.6.803

(Zongli Xin)

This study improves frequent item set mining efficiency by introducing UFIM, a modified FP-Growth algorithm using the UFP-tree. The one-way frequent pattern tree uses a non-recursive method to check if an endpoint’s support count meets the minimum threshold. If not, the constrained subtree yields no frequent item sets; otherwise, the set includes nodes excluding the root. Experimental results show UFIM processes faster than similar algorithms, with a peak signal-to-noise ratio (PSNR) of 27.0 to 27.6 and structural similarity fluctuating between 0.86 and 0.92. To enhance UFIM’s performance in big data environments, a parallelization strategy was implemented on the Spark platform. Frequent 1-item sets are identified in parallel, and data for subtrees are distributed across multiple nodes. Each node mines item sets independently, aggregating local results into a global frequent set. The parallelized UFIM algorithm, demonstrated through a book recommendation system, efficiently analyzes user purchase history to suggest accurate book recommendations.
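The first parallel step described above, finding frequent 1-item sets by merging per-partition counts, can be sketched without Spark. This is a minimal illustration using plain Python lists as stand-ins for data partitions; the function names and the absolute-count support threshold are assumptions, and the UFP-tree mining itself is not shown.

```python
from collections import Counter


def local_counts(partition):
    """Count, within one partition, how many transactions contain each item."""
    counts = Counter()
    for transaction in partition:
        counts.update(set(transaction))  # count an item once per transaction
    return counts


def frequent_one_itemsets(partitions, min_support):
    """Merge per-partition counts and keep items meeting the minimum support.

    partitions:  list of partitions, each a list of transactions (lists of items).
    min_support: minimum number of transactions an item must appear in.
    """
    total = Counter()
    for partition in partitions:
        total += local_counts(partition)  # in Spark this would be a reduce step
    return {item: n for item, n in total.items() if n >= min_support}
```

Each partition can be counted independently, which is what makes this step straightforward to distribute across worker nodes.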

Teaching Vertical Network Learning Resource Recommendation Based on LSTM and Collaborative Filtering

https://doi.org/10.5573/IEIESPC.2025.14.6.815

(Xiaoying Zhu) ; (Xiaojing Guo) ; (Xue Zhang)

To address the negative impact of information overload, this study analyzes learning resource recommendation methods for teaching vertical networks. First, the gating mechanism of long short-term memory networks is used to reflect learners' learning and forgetting, and an attention mechanism is added to capture the impact of different difficulty levels on knowledge tracking. On this basis, a knowledge tracking model using a long short-term memory network is constructed. Then, by combining model-based collaborative filtering algorithms with attention mechanisms, a collaborative filtering-based learning resource recommendation model is constructed. The proposed knowledge tracking model performed better in precision, recall, and F1 score, with scores of 92.43%, 91.37%, and 92.16%, respectively. On the ASSISTments2012 and RAIEd2020 datasets, the proposed model's area under the curve was 82.37% and 81.54%, respectively, the best among the compared variants. The model without the attention mechanism had the smallest area under the curve, at 72.14% and 70.46%, respectively. The proposed learning resource recommendation model performed best in recommendation precision, recall, and F1 score, and the diversity of its recommendation lists was good. These results contribute to improving learning outcomes and promoting the sustainable development of teaching vertical networks.
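The area under the curve quoted above can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison; it assumes binary labels (1 for a correct response, 0 otherwise), is quadratic in the number of examples, and uses illustrative names not taken from the paper.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: list of 0/1 ground-truth values.
    scores: list of predicted scores, same length as labels.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect separation of positives from negatives.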

Optimizing Hardware Resources for Low-Power Binary Neural Networks Using Approximate Bitwise Operation

https://doi.org/10.5573/IEIESPC.2025.14.6.825

(Dongchan Lee) ; (Youngmin Kim)

Artificial neural networks have recently been widely used in image classification, object detection, and character recognition. However, the amount of learning and computation required for a model to achieve high accuracy has increased rapidly, intensifying bottlenecks. Research on approaches such as reducing model weights and optimizing calculations is being conducted to solve this problem. Binary neural networks are receiving significant attention for field-programmable gate array (FPGA)-based designs owing to their high computational efficiency and low power consumption. In this paper, we propose a binary neural network with an FPGA-based low-power accumulator. Based on the hardware resource consumption of each layer, operations are optimized by targeting the more hardware-intensive layers. In addition, we propose a new method of operating the accumulator that adds up existing operation results during the learning process of the neural network. As a result, the binary neural network using the optimized accumulator reduces power by up to 55% compared to the previous network, and other hardware usage is reduced by 27%. The delay time nevertheless remains constant, and the accuracy remains at 90%.
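The computational efficiency of binary neural networks comes from replacing multiply-accumulate with bitwise operations. The sketch below shows the standard exact XNOR-popcount dot product for vectors with entries in {+1, -1}; it does not reproduce the approximate accumulator proposed in the paper, and the bit encoding (+1 as 1, -1 as 0) and function names are assumptions for illustration.

```python
def pack_bits(values):
    """Pack a sequence of +1/-1 values into an integer (+1 -> bit 1, -1 -> bit 0)."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
        elif v != -1:
            raise ValueError("values must be +1 or -1")
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n.

    XNOR marks positions where the signs agree; each agreement contributes
    +1 and each disagreement -1, so dot = 2 * matches - n.
    """
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ b_bits) & mask
    matches = bin(xnor).count("1")  # popcount
    return 2 * matches - n
```

In hardware, the XNOR and popcount map to simple logic, and the running sum of popcounts across inputs is what an accumulator of the kind discussed above would hold.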

Transmission Line Fault Diagnosis System Integrating Fuzzy Theory and RCNN Algorithm

https://doi.org/10.5573/IEIESPC.2025.14.6.837

(Fuchun Zhang) ; (Wulue Zheng) ; (Wenjun Yuan) ; (Xin Zhang) ; (Weixin Liang) ; (Zhufen Weng)

In response to the insufficient accuracy and diagnostic efficiency of traditional transmission line fault diagnosis models, this study proposes an improved region-based convolutional neural network based on a fuzzy logic algorithm and uses it to construct a transmission line fault diagnosis algorithm, on which a transmission line fault diagnosis model is then built. The accuracy, precision, and F1 value of the proposed algorithm were 98.7%, 87.2%, and 0.81, respectively, higher than those of the comparative algorithms. The study also validated the fault diagnosis model integrating fuzzy theory: its accuracy was 0.94, its loss function value was 0.06, its precision was 96.2%, its mean absolute error was 1.64 × 10⁻², and its root mean square error was 1.78 × 10⁻², all better than the comparative models. In summary, the transmission line fault diagnosis model that integrates fuzzy theory and the improved region-based convolutional neural network has better diagnostic accuracy and efficiency than traditional models. The proposed model can quickly and accurately diagnose faults in transmission lines, prompting maintenance personnel to take timely measures that reduce the impact of faults on the power system and improve system stability and reliability.

HW/SW Co-design Method for Data Acquisition System on Xilinx RFSoC

https://doi.org/10.5573/IEIESPC.2025.14.6.850

(Soyeon Choi) ; (Yunjin Noh) ; (Heehun Yang) ; (Eunsang Kwon) ; (Giyoung Kim) ; (Hoyoung Yoo)

When constructing a data acquisition system for satellite systems, various functions are required, including data acquisition, signal processing, and communication with other devices such as PCs. It is advantageous to implement the digital hardware and control system that follow the ADC on a single chip, so the system needs to be implemented on an SoC. In addition, as signal frequency bands have recently increased, the need has emerged for a data acquisition system capable of receiving and processing signals in the RF band. This paper presents a hardware and software co-design method for implementing an SoC-based data acquisition system using an RF SoC FPGA. To verify the data acquisition system, we used the ZCU208 evaluation board, which is equipped with a Gen 3 FPGA among RF SoC FPGAs. The verification results showed that signals at frequencies from 10 MHz to 500 MHz can be received when sampling at 1.96608 GSPS. Therefore, a system designed with the method in this paper can digitize analog signals in the RF band and perform signal processing on a single chip, enabling efficient design of various systems such as satellite signal reception and image processing.
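The sampling figures above can be sanity-checked with a little arithmetic: at 1.96608 GSPS the Nyquist frequency is 983.04 MHz, so the verified 10 MHz to 500 MHz range lies inside the first Nyquist zone. The sketch below assumes ideal sampling of a real-valued signal and ignores the analog front end; the function names are illustrative and not from the paper.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency representable without aliasing for real sampling."""
    return sample_rate_hz / 2.0


def apparent_frequency_hz(signal_hz, sample_rate_hz):
    """Frequency at which a tone appears after sampling (folded into 0..fs/2)."""
    folded = signal_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


SAMPLE_RATE_HZ = 1.96608e9  # sampling rate stated in the abstract

# Tones at the edges of the verified range appear at their true frequency.
low_edge = apparent_frequency_hz(10e6, SAMPLE_RATE_HZ)
high_edge = apparent_frequency_hz(500e6, SAMPLE_RATE_HZ)
```

A tone above the Nyquist frequency, for example 1.5 GHz, would instead fold back to about 466.08 MHz under these assumptions.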