
Intelligent Video Surveillance System with Abnormal Behavior Recognition and Metadata Retrieval

https://doi.org/10.5573/IEIESPC.2024.13.6.541

(Hyungtae Kim) ; (Joongchol Shin) ; (Seokmok Park) ; (Joonki Paik)

Large-scale video surveillance systems have become essential in crime prevention and situation recording. Traditional surveillance systems relied on human monitoring of video streams, which often led to errors and difficulties in understanding events. Furthermore, locating specific scenes within recorded videos required extensive human investigation. To overcome this inefficiency, inconvenience, and potential risk, we propose an intelligent analysis scheme that uses abnormal behavior recognition and metadata retrieval algorithms to replace human monitoring. The proposed method consists of three stages: i) basic metadata generation through object detection and tracking, ii) abnormal behavior recognition for event metadata, and iii) SQL-based metadata retrieval. By incorporating specific information such as object color and aspect ratio, our technique enhances retrieval usability. Moreover, our module for recognizing abnormal behavior demonstrates robust classification capabilities for activities such as pushing, violence, falling, and crossing barriers. Because harsh scenarios can strain the limited computation of edge devices, we choose the best algorithm that each edge device can run. In addition, the analysis server complements the detection results of the edge cameras. As a result, the proposed method can robustly generate metadata without omitting any object and can search for a specific object by query. The proposed method can therefore be deployed on both edge cameras and analysis servers, making it adaptable to various surveillance setups. This approach changes the traditional surveillance paradigm, enabling more efficient, reliable, and secure video monitoring and analysis.
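The third stage, SQL-based metadata retrieval, can be illustrated with a minimal sketch. The table name `object_metadata`, its columns, and the sample rows below are hypothetical and chosen only for illustration; the abstract does not specify the actual schema. The sketch assumes that each tracked object is stored with its class, color, aspect ratio, and an optional event label produced by the abnormal behavior recognition stage.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical metadata table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE object_metadata (
               track_id INTEGER, camera_id TEXT, frame INTEGER,
               object_class TEXT, color TEXT, aspect_ratio REAL, event TEXT)"""
    )
    rows = [  # illustrative rows only, not data from the paper
        (1, "cam01", 120, "person", "red", 0.42, None),
        (2, "cam01", 135, "person", "blue", 0.45, "falling"),
        (3, "cam02", 210, "car", "white", 1.80, None),
        (4, "cam02", 260, "person", "red", 0.40, "violence"),
    ]
    conn.executemany(
        "INSERT INTO object_metadata VALUES (?, ?, ?, ?, ?, ?, ?)", rows
    )
    return conn


def find_objects(conn, object_class, color, max_aspect_ratio):
    """Retrieve tracked objects by class, color, and an aspect-ratio bound."""
    query = """SELECT track_id, camera_id, frame, event
               FROM object_metadata
               WHERE object_class = ? AND color = ? AND aspect_ratio <= ?
               ORDER BY frame"""
    return conn.execute(query, (object_class, color, max_aspect_ratio)).fetchall()


conn = build_demo_db()
matches = find_objects(conn, "person", "red", 0.5)
```

Parameterized queries (the `?` placeholders) keep operator-supplied search terms out of the SQL string itself, which is a reasonable default for a retrieval front end.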

Basketball Trajectory Capture Method based on Neural Network Under the Background of Sports Teaching

https://doi.org/10.5573/IEIESPC.2024.13.6.553

(Hong Liu)

Traditional basketball training methods suffer from low efficiency and low training specificity, and their effectiveness is difficult to guarantee. Basketball robots can help players practice in daily training, saving training costs and improving training efficiency. Basketball robot research draws on many areas of knowledge, such as deep learning, robot kinematics, and robot control, and therefore has high research value and significance. The high speed of the ball makes this research difficult. To meet the real-time and accuracy requirements of a basketball robot vision system, and to address the shortcomings of traditional methods, this paper studies basketball target detection and rotating-ball trajectory prediction based on deep learning. By stacking neural networks, the trajectory prediction task is achieved while meeting real-time requirements with a certain level of accuracy. The experimental results show that, compared with the traditional physical model, the proposed network has higher anti-interference ability and accuracy. The proposed network has a significant real-time advantage in prediction time, and its predicted points are closer to the true values. Compared with traditional algorithms, the accuracy is greatly improved. The proposed system can therefore be used for auxiliary training in subsequent basketball practice, saving training costs and improving training efficiency.

Recent Advancements of Computer Vision in Healthcare: A Systematic Review

https://doi.org/10.5573/IEIESPC.2024.13.6.562

(Mahmudul Islam) ; (Nasim Mahmud Nayan) ; (Ashraful Islam) ; (Sankar Sikder) ; (Masud Rana Rashel) ; (Md Zahangir Alam)

Advancements in Computer Vision (CV) techniques have demonstrated significant promise in healthcare applications, and the adoption of CV within the healthcare sector has drawn considerable attention. The aim of this study is to systematically review and synthesize the recent advancements and applications of CV techniques in various domains of healthcare. Initially, 125 papers were selected; after gradual filtering, 20 papers were retained for the final study. We identified five medical domains where CV applications are being successfully implemented: disease detection, drug discovery, surgical procedures, human identity decoding, and remote patient monitoring. Among these domains, the use of CV in surgical assistance is notable because it capitalizes on the precision and efficiency offered by CV. Deep Learning (DL) models show adaptability and accuracy in medical imaging. The combination of CV and sensors enhances real-time surgical skills assessment. This study revealed that CV applications can be utilized for predictive analytics and personalized patient treatments. Standardized performance metrics, ethics, and data governance are crucial for responsible CV deployment in healthcare.

Automated Detection of COVID-19 in Chest Radiographs: Leveraging Machine Learning Approaches

https://doi.org/10.5573/IEIESPC.2024.13.6.572

(Raheela Batool) ; (Ghulam Musa Raza) ; (Usman Khalid) ; (Byung-Seo Kim)

The World Health Organization (WHO) has designated the COVID-19 pandemic a global health emergency, prompting responses all over the world. The fatality rate is between 2% and 5%, and millions of people around the world have been infected. While the WHO recommends tests, resource-intensive testing has motivated the development of CNN technology for automated identification. Research employing machine learning models shows high accuracy in classifying X-ray and CT images for COVID-19 detection. These models include DenseNet201, ResNet50V2, InceptionV3, MobileNet, and custom CNNs. The interpretation of chest X-rays has come a long way, yet there are still obstacles to overcome. In this paper, we present a method that uses a machine learning model to categorize chest X-ray images into normal, COVID-19, viral pneumonia, and lung opacity classes, demonstrating the model's efficacy in assisting medical diagnosis, especially in time-sensitive situations like COVID-19.

Optimization of Radiation Pattern for Circular Antenna Array using Genetic Algorithm and Particle Swarm Optimization with Combined Objective Function

https://doi.org/10.5573/IEIESPC.2024.13.6.579

(Nguyen Dinh Tinh)

This paper proposes a method for determining the amplitude distribution that reduces the side-lobe level (SLL) below a required value and increases the directivity of a circular antenna array (CAA), using a genetic algorithm and particle swarm optimization with a combined objective function. With the proposed solution, the radiation pattern of the CAA is optimized when the distance between two consecutive elements on the circle is more than half a wavelength and the radiation pattern of each element is taken into account. The proposed solution is applicable to both uniform and non-uniform circular antenna arrays. Simulation results demonstrate the validity and effectiveness of the proposed solution against particular SLL and directivity requirements, compared with conventional solutions.
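For orientation, the quantity being optimized can be written in the standard textbook form for an N-element circular array of radius a lying in the x-y plane. The symbols below are assumptions introduced here, not the paper's notation: I_n is the amplitude of element n (the optimization variable), alpha_n its excitation phase, phi_n its angular position on the circle, f_n the element pattern, and k the wavenumber. The combined objective in the second line is only one plausible shape for such a function, with hypothetical weights w_1 and w_2; the abstract does not give the actual expression.

```latex
% Far-field pattern of a circular array with per-element patterns f_n
E(\theta,\phi) = \sum_{n=1}^{N} I_n \, f_n(\theta,\phi)\,
  e^{\,j\left[ k a \sin\theta \cos(\phi-\phi_n) + \alpha_n \right]},
\qquad k = \frac{2\pi}{\lambda}

% Hypothetical combined objective: penalize SLL above the required value,
% reward directivity D (to be minimized over the amplitudes I_1,\dots,I_N)
F(I_1,\dots,I_N) = w_1 \max\!\bigl(0,\ \mathrm{SLL} - \mathrm{SLL}_{\mathrm{req}}\bigr) - w_2\, D
```

For a uniform circular array the element positions are phi_n = 2 pi n / N; in the non-uniform case the phi_n are arbitrary.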

Vision-based Multi-task Hybrid Model for Teacher-Student Behavior Recognition in Classroom Environment

https://doi.org/10.5573/IEIESPC.2024.13.6.587

(Huan Zhou) ; (Wenrui Zhu)

Teacher-student concentration in the teaching process is an essential indicator for evaluating teaching quality. Many studies assess students' learning interests by identifying their classroom behaviors but ignore the influence of teachers' behavior on students' behavior. Therefore, we collect classroom video data from both teacher and student perspectives to analyse the interplay between their behaviors. Considering the particularity of data collection in classroom environments, we design a vision-based multi-task hybrid model for multi-modal data (RGB, optical flow, and skeleton data). The model structure is divided into two parts. In the first part, the RGB and optical flow data are input into a spatio-temporal dual-stream framework for real-time action localization of the teacher. This dual-stream framework includes a 2D-CNN branch to extract spatial information and a Vision Transformer (ViT) branch to extract temporal information. In the second part, skeleton data is obtained through a pose estimation method, and we propose a multi-level stacked spatio-temporal graph convolutional network (MSSTGCN) for skeleton-based student behavior recognition. This network can process the multi-order semantic information of the skeleton data and fuse the features at different scales through a non-local block.

Basketball Foul Tracking Method based on Video Detail Enhancement

https://doi.org/10.5573/IEIESPC.2024.13.6.598

(Chen Liu)

In basketball matches, fouls are judged by the referees, but manual judgments are often disputed. This study therefore proposes a foul action tracking method for basketball games based on video detail enhancement, which consists of two parts: object detection and object tracking. The object detection part integrates the wavelet transform, the three-frame difference method, and the background subtraction method to detect moving targets. The object tracking part uses an improved CamShift tracking algorithm and a hidden Markov model (HMM) to track and recognize target actions. Simulation experiments show that the F-measure of the tracking model is 0.89, the accuracy of foul action recognition reaches 99.76%, and the average error is only about 0.003. The proposed method therefore tracks fouls well, which can support accurate penalty decisions and help ensure the fairness of the game.
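Of the detection components named above, the three-frame difference method is the simplest to sketch. The version below is the generic textbook form, not the paper's integrated detector: a pixel is marked as moving only if it changed between the previous and current frames and also between the current and next frames. Frames are plain nested lists of grayscale values, and the threshold value is an assumption.

```python
def three_frame_difference(prev, curr, nxt, threshold=25):
    """Return a binary motion mask for the current frame.

    A pixel is set to 1 only when its intensity changed by more than
    `threshold` in both consecutive frame pairs (prev->curr and curr->nxt).
    """
    mask = []
    for row_p, row_c, row_n in zip(prev, curr, nxt):
        mask_row = []
        for p, c, n in zip(row_p, row_c, row_n):
            changed_before = abs(c - p) > threshold
            changed_after = abs(n - c) > threshold
            mask_row.append(1 if (changed_before and changed_after) else 0)
        mask.append(mask_row)
    return mask


# A bright one-pixel object moving one position to the right per frame.
prev_frame = [[200, 0, 0, 0, 0]]
curr_frame = [[0, 200, 0, 0, 0]]
next_frame = [[0, 0, 200, 0, 0]]
motion_mask = three_frame_difference(prev_frame, curr_frame, next_frame)
```

Taking the logical AND of the two difference images suppresses the trailing and leading positions that a plain two-frame difference would also flag, leaving only the object's position in the current frame.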

Enhanced Control of Human Motion Generation using Action-conditioned Transformer VAE with Low-rank Factorization

https://doi.org/10.5573/IEIESPC.2024.13.6.609

(Hyunsung Kim) ; (Kyeongbo Kong) ; (Joseph Kihoon Kim) ; (James Lee) ; (Geonho Cha) ; (Ho-Deok Jang) ; (Dognyoon Wee) ; (Suk-Ju Kang)

This paper presents an action-conditioned transformer variational autoencoder (VAE) designed to generate realistic and diverse human motion sequences. The model enables control of specific body parts of the generated human motions, thereby achieving more degrees of freedom and diversity in human actions. To achieve control of the body parts, this paper acquires attribute vectors through low-rank factorization and null space projection. We employ scheduling schemes for the KL term and data augmentation to address posterior collapse and promote motion diversity. Evaluations on the UESTC and HumanAct12 datasets demonstrate the effectiveness of the proposed model and methods, showing plausible and humanlike actions. In addition, we show the application of control to actions generated in unconditional settings, thus revealing the potential for future research. To the best of our knowledge, this is a pioneering work on directly controlling motions in the latent space without using other modalities.
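Scheduling the KL term is a common remedy for posterior collapse in VAEs. The sketch below shows two widely used schedules, a linear warm-up and a cyclical ramp; which scheme the paper uses is not stated in the abstract, so both functions, the weight name `beta`, and all parameter values are assumptions for illustration.

```python
def linear_beta(step, warmup_steps, beta_max=1.0):
    """Linearly increase the KL weight from 0 to beta_max over warmup_steps."""
    if warmup_steps <= 0:
        return beta_max
    return beta_max * min(1.0, step / warmup_steps)


def cyclical_beta(step, cycle_length, ramp_fraction=0.5, beta_max=1.0):
    """Restart the ramp every cycle_length steps.

    Within each cycle the weight rises linearly during the first
    ramp_fraction of the cycle and then stays at beta_max.
    """
    position = (step % cycle_length) / cycle_length
    return beta_max * min(1.0, position / ramp_fraction)


def vae_loss(reconstruction_loss, kl_divergence, beta):
    """Weighted VAE objective: reconstruction term plus beta times the KL term."""
    return reconstruction_loss + beta * kl_divergence
```

Keeping the KL weight small early in training lets the decoder learn to use the latent code before the regularizer pulls the posterior toward the prior.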

Deep Network Learning based on TF-IDF Text Features for Electric Power Speech Text Pre-disposal Method

https://doi.org/10.5573/IEIESPC.2024.13.6.622

(Xin Zhao) ; (Changda Huang)

To address the lack of effective use of massive power operation text data, this paper proposes a graph convolutional neural network method for analyzing power speech text data. After pre-processing the electric power speech text, the term frequency-inverse document frequency (TF-IDF) algorithm is used to extract feature items from the electric power operation text. A power operation information model based on text data feature recognition is then designed. The recognition and classification results for power speech text data are verified through experiments on power text datasets. The experimental results show that the text classification accuracy of the topic model based on the TF graph convolutional neural network is 76.4%, with a recall of 75.2% and an F1 value of 75.8%. These values are 3%, 3.4%, and 3.2% higher, respectively, than the accuracy, recall, and F1 value of the graph convolutional neural network text classification method, and 3.2% higher than the Labeled-LDA model text classification method. The feature extraction method improves the text classification accuracy by 3.5%, recall by 1%, and F1 value by 2.3%.
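The TF-IDF weighting used for feature extraction can be sketched in a few lines. This is the plain textbook variant (relative term frequency times the natural log of inverse document frequency, without smoothing); the paper may use a different variant, and the sample tokens are invented for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Compute TF-IDF weights for a list of tokenized documents.

    Returns one dict per document mapping each term to
    (count / document length) * ln(number of documents / document frequency).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document

    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical pre-processed power operation texts.
docs = [
    ["transformer", "overload", "alarm"],
    ["transformer", "maintenance", "schedule"],
    ["line", "fault", "alarm"],
]
doc_weights = tf_idf(docs)
```

In the first document, "overload" receives a higher weight than "transformer" because it appears in only one of the three documents, which is the behavior that makes TF-IDF useful for picking out distinctive feature items.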

Node Sequencing and Visualization Model Construction of Information Propagation Temporal Network based on Interlayer Coupling Intensity Attenuation

https://doi.org/10.5573/IEIESPC.2024.13.6.632

(Li Cai)

Online social networks have become a significant medium for disseminating and acquiring information. This paper proposes a modeling method to mine important nodes in social networks using a supra-adjacency matrix temporal network, based on the weakening of interactions between layers, together with an influence maximization algorithm for the temporal network. Eigenvector centrality was introduced to assess the importance of nodes, and the intensity of interlayer coupling was described using an attenuation factor. In addition, a method for calculating the propagation probability between nodes was defined. The maximum connectivity components of the proposed model on the Enron dataset were 0.744 and 0.7412 under different circumstances, and the maximum network performance changes were 0.229 and 0.02998. The maximum running times of the influence maximization algorithm under different conditions were 25.656 s and 58.302 s. The research results have practical significance for accurate advertising and information dissemination.
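The node-ranking idea can be sketched as follows. Each time layer contributes a diagonal block to a super (supra-) adjacency matrix, copies of the same node in different layers are linked with a weight that decays with layer distance, and eigenvector centrality is computed by power iteration. The specific decay rule `coupling * attenuation ** (|s - t| - 1)` and the default values are assumptions made here; the abstract does not give the paper's exact formula.

```python
def supra_adjacency(layers, coupling=1.0, attenuation=0.5):
    """Stack n x n layer matrices into one (T*n) x (T*n) block matrix.

    Node i in layer s is tied to its own copy in layer t with weight
    coupling * attenuation ** (|s - t| - 1)  (assumed decay rule).
    """
    n_layers, n = len(layers), len(layers[0])
    size = n_layers * n
    supra = [[0.0] * size for _ in range(size)]
    for s in range(n_layers):
        for i in range(n):
            for j in range(n):
                supra[s * n + i][s * n + j] = float(layers[s][i][j])
            for t in range(n_layers):
                if t != s:
                    weight = coupling * attenuation ** (abs(s - t) - 1)
                    supra[s * n + i][t * n + i] = weight
    return supra


def eigenvector_centrality(matrix, iterations=200, tol=1e-10):
    """Power iteration on (A + I); the shift avoids oscillation on bipartite graphs."""
    size = len(matrix)
    vec = [1.0 / size] * size
    for _ in range(iterations):
        new = [vec[i] + sum(matrix[i][j] * vec[j] for j in range(size))
               for i in range(size)]
        norm = sum(new)
        new = [x / norm for x in new]
        if max(abs(a - b) for a, b in zip(new, vec)) < tol:
            return new
        vec = new
    return vec


def node_scores(centrality, n_nodes):
    """Sum each node's centrality over all of its layer copies."""
    return [sum(centrality[k] for k in range(i, len(centrality), n_nodes))
            for i in range(n_nodes)]


# Two snapshots of a 3-node network: a star centred on node 0, then one edge 0-1.
layer_a = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
layer_b = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
scores = node_scores(
    eigenvector_centrality(supra_adjacency([layer_a, layer_b])), 3
)
```

In this toy example node 0 ranks first and node 1 ranks above node 2, since node 1 stays connected in the second snapshot while node 2 does not.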

Student Emotion Analysis by Integrating Attention Mechanism Algorithm and Neural Network Algorithm

https://doi.org/10.5573/IEIESPC.2024.13.6.642

(Jing Xiao)

This study proposes an emotion analysis model for the education system, aiming to improve teaching quality and effectiveness. The model integrates multiple attention mechanisms and neural network algorithms to construct a sentiment analysis model. By dynamically adjusting the allocation of global and local attention weight and enhancing semantic information through a gating unit, hidden information in the text is fully mined. Model validation indicates that the fused multi-attention mechanism model outperforms the single attention model by increasing the F1 value by 5.76% on average. Compared to other models, the integrated model shows higher accuracy and a 6.91% average increase in F1 value. The proposed model also demonstrates superior classification accuracy compared to a convolutional neural network model. Overall, the study concludes that the integrated student sentiment analysis model effectively considers hidden text information, leading to improved text classification and sentiment analysis results.

Energy Efficient Routing Algorithm using Zone-based Hybrid Cluster Chain Approach in Wireless Sensor Networks

https://doi.org/10.5573/IEIESPC.2024.13.6.654

(Annie Sujith) ; (Laya T.) ; (Srinidhi Kulkarni V.) ; (Ananthanagu U.) ; (Sowmya N.) ; (Barnali C.)

Wireless Sensor Networks (WSNs) have scarce resources, such as energy and processing power, which makes energy conservation critical for enhancing network lifetime. The proposed method is an enhanced protocol that utilizes a hybrid approach combining zone-based clustering and a chain approach to increase the WSNs' lifetime. It builds upon an existing routing algorithm, EEARZC [1], which finds the optimal route to the sink using a Fuzzy Inference System to select the best Cluster Head (CH) for relaying from a pool of willing candidate CHs. The proposed approach retains these routes as chains in the network, reducing the need for frequent next-hop determination during data transfers to the sink, thereby enhancing energy conservation in nodes. Also, when a CH that is part of a chain stops relaying data to the sink, it is dynamically replaced with another CH, eliminating the need to reconstruct the entire chain. This process effectively mitigates packet losses, supporting a smooth and uninterrupted data transmission flow throughout the network. The benefits include efficient energy utilization, increased data delivery, and an extended network lifetime. The performance of the proposed method was compared with other methods in three different network scenarios. The results demonstrate that it outperforms other approaches.
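The chain-repair step described above, replacing a single CH rather than rebuilding the route, can be sketched as a small list operation. Everything in this sketch is hypothetical: the function name, the node identifiers, and the use of a numeric score per candidate as a stand-in for the output of the Fuzzy Inference System, whose actual inputs and rules are not described in the abstract.

```python
def repair_chain(chain, failed_ch, candidates):
    """Swap a failed cluster head for the best-scoring candidate.

    chain      -- ordered list of relays ending at the sink
    failed_ch  -- the cluster head that stopped relaying
    candidates -- dict mapping willing candidate CHs to a suitability score
                  (stand-in for a fuzzy inference output)
    The rest of the chain is left untouched.
    """
    if failed_ch not in chain:
        return list(chain)
    usable = {ch: score for ch, score in candidates.items() if ch not in chain}
    if not usable:
        raise ValueError("no replacement cluster head available")
    replacement = max(usable, key=usable.get)
    return [replacement if ch == failed_ch else ch for ch in chain]


route = ["CH3", "CH7", "CH9", "sink"]
repaired = repair_chain(route, "CH7", {"CH5": 0.62, "CH8": 0.81})
```

Only the failed hop changes, so upstream and downstream nodes keep their stored next-hop entries apart from the neighbors of the replaced CH.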