Reject Ratio (2025): 81.5%

Aesthetic Assessment Method of Advertising Design Images Based on Data Aggregation and Convolutional Neural Networks

https://doi.org/10.5573/IEIESPC.2026.15.2.163

(Lina Yu) ; (Li Xu)

Traditional print advertisements are gradually receding from the historical stage of the advertising field because of their excessively homogeneous content and monotonous presentation. A method for aesthetically evaluating advertising design images based on convolutional neural networks is proposed in the study. This method utilizes two-way sub-networks that aim to extract and model the features of the images. One of the sub-networks in the path is the region of interest sub-network, which extracts features of the most appealing regions. The other sub-network is a multi-scale information sub-network engineered to provide a diverse range of global descriptive features. Additionally, this paper presents a training approach that utilizes data aggregation to establish an optimization direction for the model, focusing on the learning of concise samples while enhancing the model’s generalization ability through the integration of properly distributed sparse samples. Empirical analysis yields a classification accuracy of 85.79% and a mean aesthetic scoring error of 0.521. The model yielded aesthetic assessment results that closely matched those of the AVA dataset, with a goodness-of-fit score of 0.907. Consequently, it effectively evaluates advertisement design images and offers a reference point for designers, thus stimulating new ideas in the advertisement design industry.

Research on Tracking Athletes and Trajectory Analysis in Skiing with YOLO Algorithm and Kalman Filter

https://doi.org/10.5573/IEIESPC.2026.15.2.176

(Xiaoguo Chang) ; (Wei Gao)

U-shaped (halfpipe) snowboarding is one of the most popular events in the Winter Olympics. To help optimize athletes’ key technical movements and skills in competition, this paper presents a method based on the YOLO algorithm and a Kalman filter that detects and tracks athletes and predicts and draws their movement curves, so that their posture and movement can be analyzed and a model established to improve their performance and results. In a laboratory replica of the U-shaped ski field, with a ball standing in for the athlete, the method tracks the target in real time and uses the Kalman filter to recognize the target and predict its trajectory. The experimental results show that the method tracks the athletes’ competition scene with high accuracy and speed and accurately draws the movement trajectory in the replica model, which basically meets the needs of analyzing athletes’ movements and skills. The proposed algorithm was also verified on a dataset, where its detection accuracy is 83.4%, 5.3 percentage points higher than the benchmark algorithm YOLOv4 (78.1%), at a detection speed of 30.6 FPS.
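The detect-then-filter idea in this abstract can be illustrated with a minimal sketch. It assumes a detector (such as YOLO) already supplies one bounding-box center per frame, and it uses a constant-velocity Kalman filter per image axis. The class name AxisKalman, the function track, and the noise values q and r are hypothetical choices for illustration, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class AxisKalman:
    """Constant-velocity Kalman filter for one axis (state: position, velocity)."""
    pos: float = 0.0
    vel: float = 0.0
    # Symmetric 2x2 covariance [[p00, p01], [p01, p11]]
    p00: float = 100.0
    p01: float = 0.0
    p11: float = 100.0
    q: float = 0.01  # process noise added to each diagonal term (assumed value)
    r: float = 1.0   # measurement noise variance (assumed value)

    def predict(self, dt: float = 1.0) -> float:
        """Propagate state and covariance with F = [[1, dt], [0, 1]]."""
        self.pos += dt * self.vel
        p00 = self.p00 + dt * (2 * self.p01 + dt * self.p11) + self.q
        p01 = self.p01 + dt * self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p11 = p00, p01, p11
        return self.pos

    def update(self, z: float) -> float:
        """Correct the state with a position measurement z (H = [1, 0])."""
        s = self.p00 + self.r                  # innovation variance
        k0, k1 = self.p00 / s, self.p01 / s    # Kalman gain
        residual = z - self.pos
        self.pos += k0 * residual
        self.vel += k1 * residual
        p00 = (1 - k0) * self.p00
        p01 = (1 - k0) * self.p01
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p11 = p00, p01, p11
        return self.pos


def track(centers, dt=1.0):
    """Filter a list of (x, y) detection centers; return the smoothed path
    and the predicted center for the next frame."""
    (x0, y0), rest = centers[0], centers[1:]
    kx, ky = AxisKalman(pos=x0), AxisKalman(pos=y0)
    smoothed = [(x0, y0)]
    for cx, cy in rest:
        kx.predict(dt)
        ky.predict(dt)
        smoothed.append((kx.update(cx), ky.update(cy)))
    return smoothed, (kx.predict(dt), ky.predict(dt))


# Synthetic detections: a target moving 2 px/frame in x and -1 px/frame in y.
detections = [(2.0 * t, 100.0 - 1.0 * t) for t in range(1, 21)]
path, next_center = track(detections)
```

The predicted next center is what a tracker would use to associate the following frame's detection with the same target, or to bridge a frame in which the detector misses it.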

Optimization of Speaker Recognition Technology Based on the SACNN-self-attentive Model

https://doi.org/10.5573/IEIESPC.2026.15.2.188

(Guoqiang Lu) ; (Yanmin Bai)

The demand for accurate identity recognition and verification is increasing, presenting new research directions and challenges for enterprises and academic researchers in related fields. The accuracy of voiceprint recognition in practical applications still needs improvement, and its results are easily affected by environmental factors and noise, which hinders its widespread adoption and application. Traditional voiceprint recognition models often rely on fixed feature extraction and classification methods, which fail to fully use the complex spatiotemporal information in speech signals. In addition, existing models do not perform well when dealing with noise and variable acoustic environments, resulting in reduced recognition accuracy. To address this, this paper uses the SACNN-Self-attentive model to optimize voiceprint recognition technology and uses a wavelet algorithm to pre-process the speech data for noise reduction. In experiments, the SACNN model converges well in voiceprint recognition, and its accuracy on the two data sets is 1.12% and 1.24% higher than that of Deep Speaker. The experimental results show that the model exhibits higher accuracy and stability in various test environments, demonstrating its potential in voiceprint recognition technology.
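The abstract does not say which wavelet or thresholding rule is used for noise reduction. As a minimal sketch of the general idea, the following assumes a one-level Haar transform with soft thresholding of the detail coefficients; the function names and the threshold value are illustrative assumptions.

```python
import math

_S = 1 / math.sqrt(2)


def haar_forward(signal):
    """One-level Haar transform: returns (approximation, detail) coefficients."""
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) * _S for a, b in pairs]
    detail = [(a - b) * _S for a, b in pairs]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * _S, (a - d) * _S])
    return out


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; small ones become exactly zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def denoise(signal, threshold):
    """Suppress small high-frequency detail, keep the low-frequency content."""
    if len(signal) % 2:
        raise ValueError("signal length must be even for a one-level Haar transform")
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, threshold))


# A constant signal with small alternating noise is restored to the constant.
noisy = [1.1, 0.9, 1.1, 0.9]
clean = denoise(noisy, threshold=0.2)
```

A production pipeline would typically use several decomposition levels and a data-driven threshold; this sketch only shows the transform, shrink, and reconstruct steps.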

Recreating Intangible Culture Animation Scenes Based on Neural Radiance Fields Technology

https://doi.org/10.5573/IEIESPC.2026.15.2.203

(Jiexiao Tang) ; (Lei Xu)

Intangible cultural heritage (ICH), as a treasure cultivated by diverse cultures throughout history, embodies national sentiments and the spiritual essence of human civilization. However, globalization and modernization pose significant challenges to the transmission and preservation of ICH. Digital technology offers innovative solutions for ICH preservation, with Neural Radiance Fields (NeRF) technology emerging as a promising 3D reconstruction method. This paper introduces an algorithm to recreate animated ICH scenes using NeRF technology. Experimental results demonstrate that our fast viewpoint synthesis algorithm significantly surpasses traditional methods in generating high-quality views. On the ScanNet dataset, our method achieves superior results in quantitative metrics, including PSNR, SSIM, and LPIPS. Specifically, our method reaches a PSNR of 31.68, an SSIM of 0.956, and an LPIPS of 0.194, outperforming conventional NeRF approaches. We propose a comprehensive ICH reconstruction system that includes data uploading and preprocessing, visualization training, and interactive rendering modules. Utilizing NeRF technology, our system efficiently generates high-quality 3D models and real-time animated scenes, offering interactive experiences. In summary, the proposed system provides innovative methods for the digital preservation and dissemination of ICH, achieving high-quality 3D reconstruction and interactive display through NeRF technology, with significant theoretical and practical implications.
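Of the three metrics quoted above, PSNR is simple enough to show directly. The sketch below uses the standard definition, 10·log10(MAX² / MSE), on flattened pixel lists. The function name psnr and the choice of max_value are assumptions for illustration, since the abstract does not state the pixel range used.

```python
import math


def psnr(reference, rendered, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if not reference or len(reference) != len(rendered):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


# Pixels in [0, 1] with a uniform error of 0.1 give an MSE of 0.01.
example_db = psnr([0.5, 0.5, 0.5, 0.5], [0.6, 0.6, 0.6, 0.6], max_value=1.0)
```

Higher PSNR and SSIM indicate a closer match to the reference view, whereas LPIPS is a distance, so lower is better.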

Advancements in Deep Learning for Medical Image Analysis: Enhancing Diagnostic Accuracy and Disease Characterization

https://doi.org/10.5573/IEIESPC.2026.15.2.215

(Indu P. K.) ; (G. Beni) ; (D. Rene Dev)

With the expanding development of Deep Learning (DL) techniques, Medical Image Analysis (MIA) has become an active field of research. MIA typically refers to the utilization of various image modalities and techniques to obtain images of the human body, which can be used by medical experts for diagnosis and treatment. In this study, numerous advancements in MIA utilizing DL approaches for diverse pattern recognition tasks, such as segmentation, registration, categorization, detection/localization, and classification, are thoroughly surveyed. We discuss several recent research papers related to these tasks, covering applications such as liver lesion classification and segmentation, lung nodule detection and classification, brain tumor classification and detection, and breast cancer detection. A comparative description of these papers is provided in terms of organ, modality, dataset, model used, and limitations or needed improvements. This survey also describes several medical imaging modalities used in MIA and evaluates various challenges encountered in this domain. Finally, we discuss current trends for new researchers and medical instrument experts, encouraging them to leverage Deep Learning techniques for future advancements.

Performance Analysis of Deep Learning Based CNN Architectures for Stone Fruits Disease Detection

https://doi.org/10.5573/IEIESPC.2026.15.2.227

(Manju Bagga) ; (Sonali Goyal)

Crop production is a major contribution of agriculture to the economy of every nation. Diagnosing plant diseases is among the most important components of sustaining a nation with a sophisticated agricultural economy. AI allows for the automatic identification of plant diseases from raw images through the use of DL-based CNN models. The problem we have tackled is therefore a multi-class classification problem that seeks to identify and categorize leaf diseases of stone fruits (mango, olive, and peach). To achieve this, we have used a dataset of 36,600 images of healthy and diseased leaves gathered from three distinct sources: PlantVillage, MangoLeafBD, and GitHub. To identify diseases or their absence, four pre-trained CNN models (MobileNetV2, DenseNet201, InceptionV3, and ResNet50) have been trained. ResNet50 performs better than the other three models, with an accuracy of 93.11% on the images from the various datasets.

A Study on Text-to-image Model-based Dataset for Image Classification

https://doi.org/10.5573/IEIESPC.2026.15.2.236

(Dabin Kang) ; (Chae-yeong Song) ; (Dong-hun Lee) ; (Dong-shin Lim) ; (Sang-hyo Park)

Recent advances in image generation technology have led to the active development and remarkable performance of large-scale text-to-image models. With the development of image generation models, research on applying generated images to deep learning models has also evolved. However, the majority of research has focused on the differences between generated and real images, with minimal exploration of their potential as alternatives to image classification datasets. This paper suggests a novel framework that generates an image dataset using text-to-image models with LLM and COCO2017 captions, and refines the images for classification tasks by ranking them with the CLIP Score. Two text-to-image models are employed to create datasets of generated images, and their accuracy is assessed in object classification. The images were generated from multiple perspectives by varying the types of generative models and the composition of prompts, and the dataset was refined using both quantitative and qualitative methods. The results show that DALLE-3, while effectively generating images from LLM prompts, poses challenges for image classification. Deblurring generally worsens image quality, indicating a need for specialized resolution enhancement methods. The study suggests that the approach to constructing generated datasets could be broadly applied, with potential extensions from classification to segmentation tasks.
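The refinement step, ranking generated images by CLIP Score and keeping the best, can be sketched as follows. The sketch assumes the score is the cosine similarity between a caption embedding and each image embedding, clipped at zero. The embeddings here are tiny placeholder vectors, and the names rank_by_clip_score and keep are hypothetical; in practice the vectors would come from a CLIP text and image encoder.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_by_clip_score(caption_embedding, image_embeddings, keep):
    """Return the names of the `keep` images most similar to the caption."""
    scored = [
        (max(cosine(caption_embedding, emb), 0.0), name)
        for name, emb in image_embeddings.items()
    ]
    scored.sort(reverse=True)  # highest score first
    return [name for _, name in scored[:keep]]


# Placeholder 2-D embeddings standing in for real CLIP outputs.
caption = [1.0, 0.0]
images = {"a.png": [1.0, 0.0], "b.png": [0.0, 1.0], "c.png": [1.0, 1.0]}
selected = rank_by_clip_score(caption, images, keep=2)
```

Keeping only the top-ranked images per caption trades dataset size for closer agreement between each image and its intended class.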

Optimization of Efficient English Phrase Translation System Based on BERT-BiLSTM and Genetic Algorithm

https://doi.org/10.5573/IEIESPC.2026.15.2.245

(Jingyun Huang) ; (Bing Wang)

In the era of deepening globalization, the demand for cross-language communication is increasing daily. With the development of deep learning, neural machine translation has become the mainstream approach. However, problems remain, such as low translation accuracy and limited ability to deal with complex syntactic structures. Given this, this paper proposes an English phrase translation system based on BERT-BiLSTM and a genetic algorithm, which aims to solve the problems of existing translation systems and achieve efficient, high-quality translation. In this system, BERT is used for word vectorization, BiLSTM is used for sequence modeling, and a genetic algorithm is used to optimize the model parameters to obtain a better translation effect. In experiments, the BLEU score reached 28.93 on the translation test set, about 6% higher than that of the unoptimized traditional system. In addition, the system’s operating efficiency has also improved, with FLOPs reduced by about 7%. While maintaining high translation quality, the system can also effectively save computing resources and improve translation efficiency.
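The abstract does not specify which parameters the genetic algorithm tunes or how fitness is measured. The following is a generic sketch of genetic-algorithm parameter search with elitism, uniform crossover, and Gaussian mutation. The two parameters (a learning-rate exponent and a dropout rate) and the toy fitness function, which stands in for a validation metric such as BLEU, are assumptions made purely for illustration.

```python
import random


def genetic_search(fitness, bounds, pop_size=20, generations=30,
                   mutation_rate=0.2, seed=0):
    """Maximize `fitness` over real-valued parameters within `bounds`."""
    rng = random.Random(seed)

    def random_individual():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def crossover(a, b):
        # Uniform crossover: each gene comes from either parent.
        return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

    def mutate(individual):
        # Gaussian mutation, clipped so genes stay inside their bounds.
        return [
            min(hi, max(lo, g + rng.gauss(0, 0.1 * (hi - lo))))
            if rng.random() < mutation_rate else g
            for g, (lo, hi) in zip(individual, bounds)
        ]

    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elite = population[: pop_size // 2]  # survivors carried over unchanged
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite)))
            for _ in range(pop_size - len(elite))
        ]
        population = elite + children
    return max(population, key=fitness)


def toy_fitness(params):
    """Hypothetical stand-in for a validation score; peaks at (-3, 0.3)."""
    lr_exponent, dropout = params
    return -((lr_exponent + 3.0) ** 2) - (dropout - 0.3) ** 2


search_bounds = [(-5.0, -1.0), (0.0, 0.6)]
best = genetic_search(toy_fitness, search_bounds)
```

In a real system each fitness evaluation would train or fine-tune the translation model and score it on held-out data, so the population size and generation count are the main cost drivers.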

Research on Security Evaluation Technology of Active Defense Network Driven by Big Data

https://doi.org/10.5573/IEIESPC.2026.15.2.258

(Yunhong Guo) ; (Shihao Zhang)

In the context of the rapid development of big data technology, the network security environment has become increasingly complex, making it difficult for traditional passive defense strategies to meet the needs of modern network security. Active defense network security assessment technology has therefore become a focus of research. This article examines the challenges confronting network security amid the proliferation of big data and emphasizes the importance of adopting proactive defense strategies. It explores the foundations of proactive defense network security evaluation technology, including its fundamental principles, classification, established standards, and indicator system, which together form a framework for assessing and enhancing cybersecurity posture. Data comparison shows that active defense technology has a significant effect on network security. Out of 10 system paralysis events, organizations using passive defense technology experienced 8 paralysis incidents, an attack success rate of up to 80%. For data leakage, passive defense technology had 65 leaks out of 100 incidents, a success rate of 65%, while active defense technology reduced this rate to 30%, a decrease of about 45%. After adopting proactive defense technology, the average response time for security incidents was reduced from 3 hours to 1 hour, and the average response time for serious security incidents was reduced from 5 hours to 2 hours. The malware detection rate also increased from 70% to 95%. Through case analysis and practical experience, this article can help enterprises develop and implement effective security defense strategies.

SSL Encryption Traffic Attack Behavior Recognition Method Based on Traffic Behavior Characteristics

https://doi.org/10.5573/IEIESPC.2026.15.2.272

(Weijie Song) ; (Zufeng Hou) ; (Sixiao Guo) ; (Zhige Liao) ; (Jiadong Yan)

In recent years, cyberattackers have increasingly exploited SSL/TLS encrypted traffic to hide their attacks, including but not limited to distributed denial of service (DDoS) attacks, malware propagation, data theft, and botnet control. Traditional content-based security detection methods are ineffective against encrypted traffic, as they cannot directly analyze the content, posing a serious threat to the safe and stable operation of network systems. To address this challenge, we propose a method to identify SSL encryption traffic attack behavior based on traffic behavior characteristics. Our approach introduces advanced statistical features, such as autocorrelation functions and sliding window statistics, to capture the dynamic behavior patterns of encrypted traffic. In the feature optimization and selection phase, we use information gain and mutual information to select the most effective feature set through recursive reduction, wrapping, and embedding strategies. For model fusion, we discuss ensemble learning methods, detailing the weight assignment and result fusion processes, and establish an adaptive learning mechanism by combining online learning and feedback adjustment. We evaluate the prediction performance, resource consumption, and processing speed of our model using a comprehensive performance evaluation framework. The experimental part of this study uses a comprehensive encrypted traffic dataset, covering a wide range of normal network activities and encrypted malicious behavior examples. Experimental results show that single models such as GBT, CNN, XGBoost, LightGBM, and ResNet perform well in terms of accuracy, recall, and F1 score. The performance of the weighted average fusion model with multiple weight configurations is further improved, demonstrating the impact of different weight configurations on model performance. Additionally, the Boosting model performance improves with increasing iteration numbers, highlighting the effect of iteration numbers on model performance. Our findings provide a robust and efficient solution for detecting and mitigating SSL/TLS encrypted traffic attacks, enhancing the overall security and stability of network systems. This research is significant because it addresses a critical gap in current cybersecurity practices and offers a practical approach to securing encrypted traffic. Experimental results show that the proposed method performs well in terms of precision, recall and F1 score, outperforming single models. Our research provides network administrators with powerful tools to effectively detect and block malicious activities in encrypted traffic without sacrificing privacy.
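The weighted average fusion step can be sketched in a few lines. This assumes each base model outputs, per traffic flow, a probability that the flow is malicious, and that the fused score is the weight-normalized average of those probabilities. The model names echo the abstract, but the probabilities, the weights, and the function name weighted_fusion are made up for illustration.

```python
def weighted_fusion(probabilities, weights):
    """Fuse per-model probability lists into one list by weighted averaging.

    probabilities: {model_name: [p_flow0, p_flow1, ...]}
    weights:       {model_name: non-negative weight}
    """
    total = sum(weights[name] for name in probabilities)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_flows = len(next(iter(probabilities.values())))
    return [
        sum(weights[name] * probs[i] for name, probs in probabilities.items()) / total
        for i in range(n_flows)
    ]


# Hypothetical outputs of two base models for two traffic flows.
model_outputs = {"XGBoost": [0.9, 0.2], "CNN": [0.5, 0.4]}
model_weights = {"XGBoost": 3.0, "CNN": 1.0}
fused = weighted_fusion(model_outputs, model_weights)
labels = [p >= 0.5 for p in fused]  # True means the flow is flagged as an attack
```

Changing the weights shifts how much each model influences the final decision, which is the effect of weight configuration that the abstract reports.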

Research on Detection of Digital Video Forgery from a Legal Perspective

https://doi.org/10.5573/IEIESPC.2026.15.2.284

(Feng Wang) ; (Yong Zhong Cuo Mu)

With the increasing proliferation of digital video forgery, detecting such forgery has become increasingly important due to the inadequacy of existing laws. This paper briefly analyzed the harm caused by digital video forgery and the current legal regulations. Then, EfficientNet was introduced as a method for detecting digital video forgery. EfficientNet-V2 was optimized through the integration of the convolutional block attention module and the Mish function. Experiments were performed on the improved EfficientNet-V2 using existing datasets of forged digital videos, and a significant improvement in accuracy was observed. The accuracies for LQ and HQ in FF++ were 83.45% and 95.21%, respectively, while the accuracies for DFDC and Celeb-DF were 97.66% and 99.17%, respectively. These results outperformed existing detection methods such as MesoNet. The improved EfficientNet-V2 also showed good generalization ability in cross-domain experiments. The findings validate the effectiveness of the proposed method for detecting forged digital videos, making it suitable for practical application and promotion.
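The Mish function mentioned above is defined as mish(x) = x · tanh(softplus(x)), where softplus(x) = ln(1 + e^x). A minimal scalar sketch follows; the numerically stable form of softplus is an implementation choice, and nothing here reflects how the authors wired the activation into EfficientNet-V2.

```python
import math


def softplus(x):
    """ln(1 + e^x), written so that large |x| cannot overflow."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish activation: smooth, non-monotonic, close to identity for large x."""
    return x * math.tanh(softplus(x))


samples = [mish(v) for v in (-2.0, -1.0, 0.0, 1.0, 2.0)]
```

Unlike ReLU, Mish lets small negative values through instead of zeroing them, and it approaches the identity for large positive inputs.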

Multi-sensor Fusion Vision Algorithm for Robot Autonomous Mobility Enhancement

https://doi.org/10.5573/IEIESPC.2026.15.2.293

(Qin Dong)

A common approach to improving the autonomous mobility of wheeled robots is to use simultaneous localization and mapping (SLAM) algorithms. This approach suffers from accuracy degradation in situations such as low-texture scenes and robot steering. To solve this problem and enhance the autonomous mobility of wheeled robots, an improved SLAM algorithm based on multi-feature fusion of points, lines, and surfaces is designed. To address the shortcomings of this multi-feature fusion algorithm during robot steering, a further algorithm based on multi-sensor fusion is designed. The study found that the improved SLAM algorithm based on points, lines, and surfaces had a maximum and minimum root mean square error of 0.058 m and 0.015 m, respectively, and experienced no tracking loss across the different data packets. In contrast, the original algorithm experienced three instances of tracking loss. The running time of the improved multi-sensor fusion-based algorithm on the indoor_general_quad packet was 97.8 ms with closed-loop detection and 73.2 ms without it. Both improved algorithms designed in the study perform well and can provide technical support for improving the autonomous mobility of wheeled robots.
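The root mean square error quoted above is a trajectory-level measure. The abstract does not describe how it was computed, so the sketch below assumes the common convention: estimated and ground-truth positions are already time-associated and expressed in the same frame, and the error is the RMSE of the per-pose Euclidean distance. The function name and the sample coordinates are illustrative.

```python
import math


def trajectory_rmse(estimated, ground_truth):
    """RMSE (in the input unit) of position error over matched 3-D poses."""
    if not estimated or len(estimated) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(est_pose, gt_pose))
        for est_pose, gt_pose in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Two poses with position errors of 0.03 m and 0.04 m.
estimate = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.03), (1.0, 0.04, 0.0)]
rmse_m = trajectory_rmse(estimate, reference)
```

Because the errors are squared before averaging, a few large deviations (for example during sharp steering) raise the RMSE more than many small ones.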