IEIESPC(IEIE Transactions on Smart Processing and Computing)

IEIESPC Vol. 14, No. 02, p.229-241

ISSN (online) :

2287-5255

Received: 31 March 2024; Revised: 30 April 2024; Accepted: 4 June 2024

DOI :

https://doi.org/10.5573/IEIESPC.2025.14.2.229

Regular Paper

Application of Posture Estimation Algorithm Based on Extended Kalman Filter in Sports Action Recognition

Fan Zhehua¹, Ma Kun²

(Physical Education College, Putian University, Putian 351100, China)
(Ministry of Sports, Xiamen Institute of Technology, Xiamen 361021, China)

^*Corresponding Author: Kun Ma, Kun_Makm@outlook.com

License :

This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.(www.theieie.org).

Abstract

In response to the shortcomings of low recognition accuracy and long recognition time in current sports action recognition models, this study combines extended Kalman filtering and microelectromechanical system sensors to build a new sports action recognition model. Firstly, the full angle pose calculation method is introduced to optimize the extended Kalman filtering algorithm. Then, the optimized pose estimation algorithm is combined with microelectromechanical system sensors to build the final sports motion recognition model. The research results indicated that the estimation error of the optimized attitude estimation algorithm was as low as 0.01. The motion recognition model constructed had high accuracy rates of 0.99, 0.98, and 0.98 for recognizing serve, drop, and spike movements in badminton, with a time consumption of 2.01 s, 1.88 s, and 1.96 s, respectively, demonstrating good recognition performance. The above results indicate that the attitude estimation algorithm and recognition model designed in this study have good performance and practical application effects, and can provide new reference methods for intelligent recognition of sports.

Keywords

EKF, Attitude estimation, Action recognition, Sports, MEMS

1. Introduction

In the rapid development of modern sports technology, precise identification and analysis of athlete movements have become increasingly important. This not only helps to improve athletes' performance, but is also important for the prevention and rehabilitation of sports injuries ^[1,2]. With improvements in sensor technology and computing power, sensor-based athlete attitude estimation and motion recognition (AE-MR) is becoming popular ^[3]. In sports competition and sports science, the development of motion recognition technology is of great significance for improving athlete performance and sports safety. Traditional motion recognition methods often rely on visual recording and post-analysis, which have obvious limitations in real-time application and accuracy. To overcome these limitations, researchers have begun to explore more efficient algorithms and techniques to improve the real-time performance and accuracy of motion recognition. The extended Kalman filter (EKF) is a nonlinear estimation technique that can effectively process noisy dynamic data through continuous prediction and update of the system state ^[4,5]. In aerospace and robotic navigation, EKF has been widely used in real-time path planning and navigation systems. In sports action recognition, EKF has shown great potential for processing fast, dynamic information, but given the complexity, diversity, and specialized nature of sports actions, its application effect in this field requires further research. Against this background, this study explores the application of EKF-based posture estimation algorithms in sports action recognition, with the aim of contributing a practical and innovative AE-MR solution to the field of sports technology.
The innovation and contribution of the research are mainly reflected in the following aspects. Firstly, the EKF algorithm is optimized for specific sports movements, improving the accuracy and real-time performance of posture estimation. Secondly, by combining sensor data with EKF, this study proposes a new action recognition model that can more effectively handle the posture changes of athletes in different sports environments. The remainder of this paper is divided into four parts. Section 2 reviews related research. Section 3 introduces the research methods. Section 4 tests the performance of the algorithm and the recognition model. Section 5 summarizes the study.

2. Related Work

An attitude estimation algorithm (AEA) is a technique used to detect human pose and actions from images or videos. There are already many studies related to AEA. To determine the attitude and inertia parameters of non-cooperative targets in on-orbit servicing, Meng Q et al. proposed a model-free method. This method effectively solved the uncertainty problem in the registration process by combining an enhanced Kalman filter with attitude map optimization, and reduced the impact of measurement noise and drift error. Experiments verified that this method has significant advantages over existing techniques in accurately identifying inertial parameters ^[6]. Liu S et al. proposed a lightweight pose estimation network based on a polarized self-attention mechanism. This network reduced the parameter count of the feature extraction network through ghost convolution and introduced a polarized self-attention module to optimize pixel-level regression tasks and improve the accuracy of key point regression. This method significantly reduced model parameters while ensuring minimal accuracy loss ^[7]. For the complex background of construction sites, Gong F et al. proposed a posture estimation-based elbow bending behavior recognition (EBBR) method to improve the recognition accuracy of smoking and phone-use behavior among construction site personnel. AlphaPose was retrained on the key points of the upper body to localize human targets and detect their key points. Subsequently, a key point-based EBBR model was constructed, which reduced the interference of complex backgrounds. The detection accuracy of the constructed recognition model was improved by 5.6%, and the false detection rate was reduced by 13% ^[8].

Sports recognition technology is commonly used to automatically recognize and analyze actions and behaviors in sports, and many experts have studied it. Sun C et al. proposed a dynamic template mechanism to address target recognition errors caused by uneven lighting and sudden changes in sports competitions. The degree of correlation between data feature changes was fully taken into account, and time control factors were introduced when applying a support vector machine. Meanwhile, unsupervised clustering methods were used to design classification strategies that support rapid target discrimination and improve recognition accuracy when environmental lighting changes. This method could detect athletes with multiple gestures and partial occlusion in complex contexts, and provides an effective means for detecting motion features in sports competitions ^[9]. Geng X used an accelerometer as a medium to convert athlete actions into machine-recognizable action units through human-computer interaction. Based on the actual situation, a human motion model was constructed, and corresponding computer hardware and software platforms were built. This approach could classify the collected sports data and was useful for analyzing athlete actions ^[10]. Hao Z proposed a new athlete action recognition algorithm to address the low efficiency of current athlete detection and recognition algorithms. It first used the Otsu method for grayscale processing, and then combined the Harris corner algorithm to achieve multi-target tracking, effectively tracking different parts of athletes. In addition, a sequential algorithm was used to label connected components. This algorithm performed well in both accuracy and recognition efficiency, and had reference value for accurate identification of athletes in practice ^[11].

In summary, many experts have used various neural networks to build attitude estimation models and have achieved certain research results. In addition, many experts have proposed various mechanisms to optimize sports recognition technology. Although existing research has provided a variety of effective pose estimation and motion recognition techniques, most of them are aimed at static and simple sports actions. With the rise of the national fitness craze, sports have become increasingly rich and diverse. For high-speed, complex sports and scenes with multi-athlete interaction, a new recognition method is still needed to improve action recognition. This study optimizes EKF and combines it with various sensors to address this deficiency and to better meet the special recognition needs of the sports field.

3. The application of AEA in sports action recognition

In sports action recognition systems, the posture of athletes is constantly changing. In order to achieve accurate estimation of various postures of athletes and determine their position and action information, this study first designed an optimized EKF algorithm. On this basis, a motion recognition model was constructed by combining various sensors in Micro Electro Mechanical Systems (MEMS).

3.1 Optimization design of full-angle AEA based on EKF

In sports action recognition, accurately tracking and estimating an athlete's dynamic posture is a challenging task, especially when fast and complex motion sequences are involved. The key to action recognition lies in extracting useful dynamic information from real-time data accurately and in a timely manner. Traditional motion recognition technology, such as video analysis or simple sensor data processing, is often limited by data processing delay and noise interference, and struggles to meet the requirements of high precision and real-time performance. Therefore, the extended Kalman filter is used to design a new motion recognition algorithm. The extended Kalman filter can update state estimates at each time step based on new observed data, thus providing continuous and accurate prediction of the dynamic system state. This feature makes it particularly suitable for processing real-time multi-dimensional dynamic data collected by MEMS sensors during sports movements. By continuously predicting and updating the state of the system, EKF can not only effectively suppress the observation noise, but also adapt to rapid changes during motion ^[12]. In the application scenario of this study, numerous nonlinear functions are involved, so EKF was used to process these nonlinear discrete systems. The attitude estimation framework under the EKF algorithm is shown in Fig. 1.

Fig. 1. Frame diagram of attitude estimation under EKF algorithm.

In Fig. 1, EKF serves as a nonlinear form of Kalman filtering, with the attitude quaternion of the vehicle as the state variable, the gyroscope reading as the prediction step, and the attitude angle output from the accelerometer and magnetometer as the observation update value. The various updated values obtained are processed using EKF, and then the attitude estimation results are output.
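The gyroscope-driven prediction step described above can be illustrated with the standard quaternion kinematic relation $\dot{q}=\frac{1}{2}\Omega(\omega)q$. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the function name `propagate_quaternion`, the scalar-first quaternion ordering $[w,x,y,z]$, body-frame angular rates, and first-order integration are all assumptions introduced for illustration.

```python
import numpy as np

def propagate_quaternion(q, omega, dt):
    """One first-order integration step of q_dot = 0.5 * Omega(omega) @ q.

    q     : unit attitude quaternion [w, x, y, z] (scalar-first, assumed)
    omega : gyroscope reading [wx, wy, wz] in rad/s (body frame, assumed)
    dt    : sample period in seconds
    """
    wx, wy, wz = omega
    # Matrix form of the quaternion product q * [0, omega]
    Omega = np.array([
        [0.0, -wx, -wy, -wz],
        [wx,  0.0,  wz, -wy],
        [wy, -wz,  0.0,  wx],
        [wz,  wy, -wx,  0.0],
    ])
    q_next = np.asarray(q, dtype=float) + 0.5 * dt * (Omega @ q)
    # Re-normalize so the result remains a unit quaternion
    return q_next / np.linalg.norm(q_next)

# Usage: rotate about the Z axis at pi/2 rad/s for one second (1000 steps)
q = np.array([1.0, 0.0, 0.0, 0.0])
for _ in range(1000):
    q = propagate_quaternion(q, [0.0, 0.0, np.pi / 2], 0.001)
```

After one second the quaternion in this usage example is close to $[\sqrt{2}/2,0,0,\sqrt{2}/2]$, that is, a 90° rotation about the Z axis. In a full filter, this prediction would then be corrected by the accelerometer and magnetometer observations.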

In the EKF algorithm, assuming the state of a nonlinear discrete system is $X$, the state equation at time $k$ is obtained as shown in Eq. (1).

(1)

$ X(k)=f\left(X\left(k-1\right),U\left(k-1\right),W\left(k-1\right)\right). $

In Eq. (1), $X(k)$ represents the state variable at time $k$. $U$ represents the control input variable. $W$ represents the sampled process noise. $f\left(\cdot \right)$ represents the state function.

Assuming that the output variable of a nonlinear discrete system is $Z$, the output variable equation at time $k$ is obtained as shown in Eq. (2).

(2)

$ Z(k)=h\left(X(k),V(k)\right). $

In Eq. (2), $Z(k)$ represents the output variable at time $k$. $V$ represents the sampled measurement noise. $h\left(\cdot \right)$ represents the output function.

The statistical characteristics of process noise and measurement noise can be expressed using Eq. (3).

(3)

$ \left\{\begin{aligned} & W(k)\sim N\left(0,Q\right),\\ & V(k)\sim N\left(0,R\right). \end{aligned}\right. $

In Eq. (3), $W(k)$ and $V(k)$ represent the process noise and measurement noise values at time $k$, respectively. $W(k)\sim N\left(0,Q\right)$ means that $W(k)$ follows a multivariate normal distribution with a mean of 0 and a covariance matrix of $Q$. $V(k)\sim N\left(0,R\right)$ means that $V(k)$ also follows a multivariate normal distribution with a mean of 0, but with a covariance matrix of $R$.

Considering that both $W$ and $V$ belong to zero mean white noise sequences and are not correlated with each other, first-order linearization can be applied to Eqs. (1) and (2) to obtain the processed first-order state equation and output equation as shown in Eq. (4).

(4)

$ \left\{\begin{aligned} & X(k)\approx \tilde{X}(k)+A\left(X\left(k-1\right)-\hat{X}\left(k-1\right)\right)\\ &\hskip 2.5pc +\Gamma W\left(k-1\right), \\ & Z(k)\approx \tilde{Z}(k)+H\left(X(k)-\hat{X}(k)\right)+\wedge V(k). \end{aligned}\right. $

In Eq. (4), $A$ and $\Gamma $ are the partial derivatives (PD) of the state function with respect to $X$ and $W$, respectively. $H$ and $\wedge $ represent the PD of the output function with respect to $X$ and $V$, respectively. $\tilde{X}(k)$ and $\tilde{Z}(k)$ are the approximate values of $X(k)$ and $Z(k)$ at time $k$, respectively. $\hat{X}$ represents a posterior estimate of $X$.
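When analytic derivatives of the state or output function are inconvenient, the PD matrices used in the linearization can be approximated numerically. The sketch below uses central differences. The helper name `numerical_jacobian` and the toy state function `toy_f` are hypothetical and serve only to illustrate the idea, they are not taken from this study.

```python
import numpy as np

def numerical_jacobian(func, x, eps=1e-6):
    """Central-difference approximation of d func / d x, evaluated at x."""
    x = np.asarray(x, dtype=float)
    f0 = np.asarray(func(x), dtype=float)
    J = np.zeros((f0.size, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        f_plus = np.asarray(func(x + step), dtype=float)
        f_minus = np.asarray(func(x - step), dtype=float)
        J[:, j] = (f_plus - f_minus) / (2.0 * eps)
    return J

def toy_f(x):
    # Hypothetical nonlinear state function, for illustration only
    return np.array([x[0] + 0.1 * np.sin(x[1]), 0.9 * x[1]])

# A plays the role of the PD of the state function with respect to X
A = numerical_jacobian(toy_f, [0.3, -0.2])
```

For this toy function the exact Jacobian is $\left[\begin{smallmatrix}1 & 0.1\cos x_{2}\\ 0 & 0.9\end{smallmatrix}\right]$, which the finite-difference result matches to within the step-size error.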

The Kalman filtering algorithm is applied to the first-order state equation and first-order output equation in Eq. (4), and the formula for calculating the filter gain is Eq. (5) ^[13].

(5)

$ K_{g} (k)=\bar{P}(k)H^{T} (k)\left[H(k)\bar{P}(k)H^{T} (k)+\wedge (k)R(k)\wedge ^{T} (k)\right]^{-1}. $

In Eq. (5), $K_{g} (k)$ represents the filter gain at time $k$. $\bar{P}(k)$ is the covariance matrix of the estimation error at time $k$. $H^{T} $ and $\wedge ^{T} $ represent the transposes of $H$ and $\wedge $, respectively. $R(k)$ represents the covariance matrix of measurement noise at time $k$. $H(k)$ and $\wedge (k)$ represent the PD of the output function with respect to $X$ and $V$ at time $k$.

According to Eqs. (1) to (5), the time and state update equation of the nonlinear dynamic model under the EKF algorithm can be obtained. The update equation for time is Eq. (6).

(6)

$ \left\{\begin{aligned} & \bar{X}(k)=f\left(\hat{X}\left(k-1\right),U\left(k-1\right),0\right),\\ & \bar{P}(k)=A(k)\hat{P}\left(k-1\right)A^{T} (k)+\Gamma (k)Q\left(k-1\right)\Gamma ^{T} (k). \end{aligned}\right. $

In Eq. (6), $\bar{X}(k)$ represents the a priori (predicted) state estimate at time $k$. $A(k)$ represents the PD of the state function with respect to $X$ at time $k$. $\hat{P}$ represents the posterior estimate of $P$. $A^{T} $ and $\Gamma ^{T} $ represent the transposes of $A$ and $\Gamma $, respectively. The update equation for the state is Eq. (7).

(7)

$ \left\{\begin{aligned} & \hat{X}(k)=\bar{X}(k)+K_{g} (k)\left[Z(k)-h\left(\bar{X}(k),0\right)\right],\\ & \hat{P}(k)=\left(I-K_{g} (k)H(k)\right)\bar{P}(k). \end{aligned}\right. $

In Eq. (7), $I$ represents the identity matrix.

According to Eqs. (6) and (7), the predicted and updated state values can be obtained in real time at each step. These values are filtered, and the athlete's pose estimation results are then output.
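One predict/update cycle built from Eqs. (5) to (7) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation used in this study: the function name `ekf_step` and its argument names are hypothetical, the noise PD matrices $\Gamma$ and $\wedge$ are taken as identity, and the measurement prediction is evaluated at the predicted state of the current step.

```python
import numpy as np

def ekf_step(x_post, P_post, u, z, f, h, A, H, Q, R):
    """One EKF cycle. Assumes Gamma = I and the measurement-noise PD = I.

    x_post, P_post : posterior state and covariance from the previous step
    u, z           : control input and current measurement
    f, h           : state function f(x, u) and output function h(x)
    A, H           : PD (Jacobian) matrices of f and h with respect to the state
    Q, R           : process and measurement noise covariance matrices
    """
    # Time update, following Eq. (6)
    x_prior = f(x_post, u)
    P_prior = A @ P_post @ A.T + Q

    # Filter gain, following Eq. (5)
    S = H @ P_prior @ H.T + R
    K = P_prior @ H.T @ np.linalg.inv(S)

    # State update, following Eq. (7); np.eye is the identity matrix
    x_new = x_prior + K @ (z - h(x_prior))
    P_new = (np.eye(x_new.size) - K @ H) @ P_prior
    return x_new, P_new

# Usage with a scalar constant-state model (toy values, not from the paper)
x, P = ekf_step(
    x_post=np.array([0.0]), P_post=np.eye(1), u=None, z=np.array([2.0]),
    f=lambda x, u: x, h=lambda x: x,
    A=np.eye(1), H=np.eye(1), Q=np.zeros((1, 1)), R=np.eye(1),
)
```

In the toy usage, the prior covariance and the measurement covariance are equal, so the gain is 0.5 and the updated state lies halfway between the prediction and the measurement.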

For accurate estimation of motion attitude, MEMS sensors are widely used in dynamic data acquisition because of their small size and portability. However, during fast or complex motion capture these sensors are susceptible to factors such as mechanical vibration and temperature changes, which increase errors and noise in data acquisition. In particular, the angular velocity and acceleration data output by MEMS sensors often contain large errors and deviations due to nonlinear characteristics and limited measurement range, which affects the accuracy and reliability of attitude estimation. Therefore, this study introduces a full angle attitude calculation method (FAACM) to optimize the EKF algorithm. By considering all possible attitude angle variations, this method makes fuller use of the output data of MEMS sensors. In addition, this optimization allows nonlinear and highly dynamic changes in MEMS sensor data to be handled more effectively, supporting the robustness and precision of the algorithm in practical applications.

In order to overcome the problem of excessive pitch angle leading to increased error in roll angle calculation, this study first performs a rotation operation on the coordinate system, and then calculates quaternions in that rotation coordinate system. Next, the coordinate system is rotated back to its initial state, and after this process, the attitude quaternion is obtained. This FAACM can effectively calculate the attitude angle in a wide range of pitch angles and avoid increasing the error in roll angle calculation.

Before introducing the rotational coordinate system, the roll angle measured by the MEMS accelerometer is Eq. (8).

(8)

$ \theta =\arcsin \left(\frac{y_{g} }{g} \right) . $

In Eq. (8), $\theta $ represents the initial measured roll angle. $g$ represents gravitational acceleration. $y_{g} $ represents the projection of gravitational acceleration on the Y axis. At this point, the formula for calculating the attitude angle is Eq. (9).

(9)

$ \phi =\arctan \left(-\frac{x_{g} }{z_{g} } \right) . $

In Eq. (9), $\phi $ represents the initial measured attitude angle. $x_{g} $ and $z_{g} $ represent the projection of gravitational acceleration on the X and Z axes, respectively. Introducing a new rotating coordinate system, the roll angle and attitude angle under the FAACM are obtained as shown in Eq. (10).

(10)

$ \left\{\begin{aligned} & \theta '=\arcsin \left(\frac{y_{g} {}^{{'} } }{g} \right), \\ & \phi '=\arctan \left(-\frac{x_{g} {}^{{'} } }{z_{g} {}^{{'} } } \right). \end{aligned}\right. $

In Eq. (10), $\theta^{ '}$ and $\phi^{ '}$ represent the roll angle and attitude angle in the rotating coordinate system, respectively. $y_{g} {}^{{'} } $, $x_{g} {}^{{'} } $ and $z_{g} {}^{{'} } $ represent the projections of gravitational acceleration on the Y-axis, X-axis, and Z-axis of the rotating coordinate system, respectively. Based on Eq. (10), the azimuth is then calculated as shown in Eq. (11).

(11)

$ \Psi =\arctan \left(-\frac{y^{'}_{2T}}{x^{'}_{2T}} \right) . $

In Eq. (11), $\Psi $ represents the azimuth in the rotating coordinate system. $y^{'}_{2T}$ and $x^{'}_{2T}$ represent the projections of the geomagnetic field vector on the $Y$-axis and $X$-axis of the rotating coordinate system, respectively. The quaternion describing the rotation to the rotating coordinate system is denoted as $C$, and its expression is Eq. (12).

(12)

$ C=\left[\frac{\sqrt{2} }{2} ,0,0,\frac{\sqrt{2} }{2} \right]. $

Combining Eqs. (8) to (12), the formula for calculating the final attitude angle is Eq. (13).

(13)

$ \left(\theta ,\Psi ,\phi \right)=C^{-1} \cdot \left(\theta ',\Psi ,\phi '\right)\cdot C . $
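The ingredients of Eqs. (8) to (13) can be sketched numerically. The code below is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the quaternion $C$ is read in scalar-first order $[w,x,y,z]$ (which makes it a 90° rotation about the Z axis), `arctan2` is used as a quadrant-aware variant of the arctangent in Eq. (9), and the frame change is implemented as $C^{-1}\cdot v\cdot C$.

```python
import numpy as np

def tilt_angles(acc):
    """Angles from a gravity measurement [x_g, y_g, z_g], per Eqs. (8)-(9)."""
    x_g, y_g, z_g = acc
    g = np.linalg.norm(acc)                         # magnitude of gravity
    theta = np.arcsin(np.clip(y_g / g, -1.0, 1.0))  # Eq. (8)
    phi = np.arctan2(-x_g, z_g)                     # Eq. (9), quadrant-aware
    return theta, phi

def quat_mul(a, b):
    """Hamilton product of two scalar-first quaternions."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return np.array([
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    ])

def quat_conj(q):
    """Conjugate, which equals the inverse for a unit quaternion."""
    return np.array([q[0], -q[1], -q[2], -q[3]])

def to_rotated_frame(q, v):
    """Express vector v in the frame rotated by unit quaternion q."""
    p = np.concatenate(([0.0], np.asarray(v, dtype=float)))
    return quat_mul(quat_mul(quat_conj(q), p), q)[1:]

# C from Eq. (12), read as scalar-first (an assumption)
C = np.array([np.sqrt(2) / 2, 0.0, 0.0, np.sqrt(2) / 2])

acc = np.array([0.0, 9.81 * np.sin(np.radians(30)), 9.81 * np.cos(np.radians(30))])
theta, phi = tilt_angles(acc)              # angles in the original frame
acc_rot = to_rotated_frame(C, acc)         # gravity seen in the rotated frame
theta_r, phi_r = tilt_angles(acc_rot)      # primed angles, as in Eq. (10)
acc_back = to_rotated_frame(quat_conj(C), acc_rot)  # rotate back
```

Rotating into the auxiliary frame and back recovers the original measurement, which is the property the FAACM relies on when it evaluates the angles in whichever frame keeps the calculation well conditioned.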

By combining the FAACM based on the rotational coordinate system with EKF, the optimized algorithm is denoted as the Extended Kalman Filter Full Angle Attitude Estimation Algorithm (EKF-FAAEA). The operational flowchart of EKF-FAAEA is shown in Fig. 2.

Fig. 2 shows the operation flow of the EKF-FAAEA algorithm, where $\alpha $ and $\alpha _{0} $ represent the calculated pitch angle and the pitch angle threshold, respectively. In Fig. 2, the accelerometer and magnetometer data in the MEMS sensor system are first obtained, and the pitch angle is calculated using the EKF algorithm. The calculated pitch angle is then compared with the threshold. If the calculated pitch angle does not exceed the threshold ($\alpha \leqslant \alpha _{0} $), EKF is used directly to calculate the roll angle and azimuth angle, the quaternion of the attitude angle is obtained, and the result is output. If the calculated pitch angle is greater than the threshold, the attitude angle is calculated in the rotation coordinate system and the quaternion in the rotation coordinate system is output.

Fig. 2. Flowchart of the operation of EKF-FAAEA.
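The branch in the flowchart can be expressed as a small dispatch function. This is a hedged sketch only: the function name `estimate_attitude`, the two solver callables, and the numerical threshold in the usage lines are hypothetical, since the threshold value $\alpha_{0}$ is not stated here.

```python
def estimate_attitude(pitch, threshold, direct_solver, rotated_solver):
    """Choose the calculation path by comparing pitch with the threshold.

    pitch          : calculated pitch angle (alpha)
    threshold      : pitch angle threshold (alpha_0)
    direct_solver  : callable used when alpha <= alpha_0
    rotated_solver : callable used when alpha > alpha_0 (rotated frame)
    """
    if pitch <= threshold:
        return direct_solver()
    return rotated_solver()

# Usage with placeholder solvers and an illustrative threshold of 60 degrees
low = estimate_attitude(20.0, 60.0, lambda: "direct", lambda: "rotated")
high = estimate_attitude(75.0, 60.0, lambda: "direct", lambda: "rotated")
```

In practice, the two callables would wrap the direct EKF angle calculation and the rotation-coordinate-system calculation, respectively.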

3.2 Construction of a sports action recognition model integrating FAAEA and MEMS

After completing the optimization design of the EKF-FAAEA algorithm, this study further analyzed the performance indicators of various MEMS sensors and built a sports action recognition model combining AEA and various MEMS sensors. The aim is to collect athlete movement information and perform AE-MR using this model. The action recognition model is referred to as EKF-FAAEA-MEMS, and its framework structure is shown in Fig. 3.

In Fig. 3, the entire EKF-FAAEA-MEMS framework mainly consists of three parts: attitude measurement unit (AMU), AEA, and data processing and communication (DPC). AMU is composed of various types of MEMS sensors, which athletes wear on various parts of their bodies to obtain initial motion data ^[14]. AEA calculates the posture information of athletes based on the data collected by AMU. The values calculated by AEA can be stored in the DPC unit, which mainly collects, stores, and sends real-time data to the computer to assist operators in completing final action recognition based on attitude estimation results. The hardware structure design of EKF-FAAEA-MEMS is shown in Fig. 4.

In Fig. 4, the hardware design of the entire model is mainly focused on the AMU and DPC units ^[15]. AMU mainly consists of MEMS sensors, such as accelerometers, gyroscopes, and magnetometers, together with a microprocessor. Considering the different requirements for sensor size, wearing method, and accuracy in practical applications, this study designed two types of AMUs: a miniature AMU and a multifunctional AMU. The DPC unit includes an attitude data storage unit (ADSU) and a Bluetooth communication unit (BCU). ADSU stores the motion information collected by AMU on a storage card when the data volume is too large or real-time calculation is not required, so that the data can be processed offline. BCU is mainly responsible for sending the collected motion information to data relay nodes in real time, enabling real-time collection, calculation, and analysis of human motion information. The specific composition structure of AMU is shown in Fig. 5.

Fig. 5 shows the structural design of miniature AMUs and multifunctional AMUs, where SPI represents the Serial Peripheral Interface. For the miniature AMU, due to strict requirements for volume and weight, the nine-axis sensor MPU9250, which integrates a three-axis accelerometer, gyroscope, and magnetometer, was chosen. Considering the insufficient accuracy of the built-in magnetometer in MPU9250, a higher precision three-axis MEMS magnetometer HMC5983 was used as a replacement, and STM32F401CEU6 was selected as the microprocessor chip. For the multifunctional AMU, volume is less constrained, so the higher precision MEMS sensors ADXL355, ADXRS453, and AK09970N, as well as the more powerful STM32F407VGT6 microprocessor, were selected to meet higher measurement accuracy and expansion requirements. To enhance the stability of the multifunctional AMU, the 9-axis motion sensor ICM20948 is used as a backup for the ADXL355 accelerometer, ADXRS453 gyroscope, and AK09970N magnetometer to provide redundancy. The extended functional structure diagram of the multifunctional AMU is shown in Fig. 6.

In Fig. 6, the expanded multifunctional AMU can add a global positioning system or ultra wideband chip according to actual needs, thereby achieving indoor and outdoor motion positioning. Due to its low volume requirements, this unit can expand more human motion information collection functions and is suitable for installation on the waist, back, or sports equipment of the human body. In addition to global positioning systems and ultra wideband chips, pressure sensors and airspeed sensors can also be integrated to measure the height and speed of movement. The expanded multifunctional AMU is more conducive to posture measurement during winter sports.

Fig. 3. EKF-FAAEA-MEMS framework diagram.

Fig. 4. EKF-FAAEA-MEMS hardware structure design diagram.

Fig. 5. Design of the two components of the AMU.

Fig. 6. Extended structure of the multifunctional AMU.

4. Performance testing of FAAEA and analysis of the application effect of the sports action recognition model

To demonstrate the effectiveness of the EKF-FAAEA attitude estimation algorithm and the EKF-FAAEA-MEMS action recognition model, this study used SportsPose as the experimental dataset. Performance testing and application analysis were conducted using indicators such as fitness value iteration, pose estimation error fluctuations, and recognition accuracy.

4.1 Performance testing of FAAEA

The publicly available SportsPose dataset is used in the experiments. SportsPose is a 3D human pose dataset designed specifically for sports action recognition, which contains over 176,000 3D poses. These poses were collected from 24 different subjects while engaging in 5 different sports activities: running, basketball, badminton, swimming, and high jump. The 176,000 3D poses were divided into training and testing sets in an 8:2 ratio. To avoid experimental errors, all experiments were conducted on the same equipment. Table 1 shows the specific parameter settings.

Table 1 provides the parameter settings for this experimental environment. EKF, Mask-Region-Based Convolutional Neural Network (Mask R-CNN), and Pose Network (PoseNet) are used as comparison algorithms. The iterative fitness values of EKF, Mask R-CNN, PoseNet, and EKF-FAAEA under the same experimental environment were obtained, as shown in Fig. 7.

Fig. 7 shows the iteration of fitness values for EKF, Mask R-CNN, PoseNet, and EKF-FAAEA. As the number of iterations increases, the optimal fitness values of the four pose estimation algorithms decrease continuously. EKF, Mask R-CNN, PoseNet, and EKF-FAAEA begin to reach a stable state at 58, 56, 45, and 37 iterations, respectively, with optimal fitness values of 0.23, 0.20, 0.26, and 0.18, respectively. Comparing the four algorithms, EKF-FAAEA declines fastest and reaches the lowest final value, indicating that it performs better when dealing with nonlinear and dynamic sports action data. The reason is that the full angle attitude calculation method and nonlinear processing are introduced into the EKF-FAAEA algorithm, so that it can track and predict the dynamic attitude of athletes more accurately.

Table 1. Experimental environment parameter table.

Experimental equipment	Value
CPU	Intel Core i9-10900K
GPU	NVIDIA RTX 3080
Memory	32GB
Graphics Memory	10GB GDDR6X
Development Environment	Windows 10, Python 3.8
Programming Tools	PyTorch 1.7

Fig. 7. Iteration of fitness values for different pose estimation algorithms.

Fig. 8. Pose estimation error fluctuation results for different pose estimation algorithms.

Figs. 8(a) and 8(b) show the fluctuation of pose estimation errors for EKF, Mask R-CNN, PoseNet, and EKF-FAAEA in the training and testing datasets, respectively. In Fig. 8(a), the fluctuation ranges of the pose estimation error for EKF, Mask R-CNN, PoseNet, and EKF-FAAEA in the training set are $-0.2 \sim 0.3$, $-0.2 \sim 0.2$, $-0.1 \sim 0.2$, and $-0.1 \sim 0.1$, respectively. In Fig. 8(b), the corresponding ranges in the test set are $-0.2 \sim 0.3$, $-0.2 \sim 0.3$, $-0.1 \sim 0.2$, and $-0.1 \sim 0.1$, respectively. Overall, the error fluctuation range of EKF-FAAEA is the smallest, indicating that this algorithm estimates posture most accurately.

Table 2 shows the measured values of each roll angle under different algorithms. When the actual roll angles are $-30^\circ$, 0$^\circ$, 30$^\circ$, and 60$^\circ$, the measured values of EKF are $-31.24^\circ$, 1.05$^\circ$, 31.08$^\circ$, and 61.32$^\circ$, respectively. The measured values of Mask R-CNN are $-30.38^\circ$, 0.52$^\circ$, 30.47$^\circ$, and 60.63$^\circ$, respectively. The measured values of PoseNet are $-30.18^\circ$, 0.09$^\circ$, 30.14$^\circ$, and 60.31$^\circ$, respectively. The measured values of EKF-FAAEA are $-30.01^\circ$, 0.02$^\circ$, 30.03$^\circ$, and 60.01$^\circ$, respectively. EKF-FAAEA has a very small error for each roll angle, at most 0.03$^\circ$ from the actual value, showing high accuracy. In contrast, Mask R-CNN and PoseNet, while also reasonably accurate, show larger deviations of up to 0.63$^\circ$ and 0.31$^\circ$, respectively. The high precision of the EKF-FAAEA algorithm is attributed to its MEMS sensor optimization processing and data fusion, which reduce the impact of sensor error and environmental noise and improve the algorithm's adaptability to dynamic changes. Overall, the measured values of EKF-FAAEA are closest to the actual values, so this algorithm can estimate the specific pose angle more accurately.

Table 2. Measurements of roll angle with different pose estimation algorithms.

Algorithm	Actual roll angle	Measured roll angle
EKF	-30°	-31.24°
	0°	1.05°
	30°	31.08°
	60°	61.32°
Mask R-CNN	-30°	-30.38°
	0°	0.52°
	30°	30.47°
	60°	60.63°
PoseNet	-30°	-30.18°
	0°	0.09°
	30°	30.14°
	60°	60.31°
EKF-FAAEA	-30°	-30.01°
	0°	0.02°
	30°	30.03°
	60°	60.01°
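As a worked check on Table 2, the mean absolute error of each algorithm over the four roll angles can be computed directly from the tabulated values. The snippet is an illustrative calculation added for clarity. The variable and function names are arbitrary.

```python
# Roll angles from Table 2, in degrees
actual = [-30.0, 0.0, 30.0, 60.0]
measured = {
    "EKF": [-31.24, 1.05, 31.08, 61.32],
    "Mask R-CNN": [-30.38, 0.52, 30.47, 60.63],
    "PoseNet": [-30.18, 0.09, 30.14, 60.31],
    "EKF-FAAEA": [-30.01, 0.02, 30.03, 60.01],
}

def mean_abs_error(measured_vals, actual_vals):
    """Average of |measured - actual| over all test angles."""
    total = sum(abs(m - a) for m, a in zip(measured_vals, actual_vals))
    return total / len(actual_vals)

mae = {name: mean_abs_error(vals, actual) for name, vals in measured.items()}
for name, value in mae.items():
    print(f"{name}: {value:.4f} deg")
```

The resulting mean absolute errors are about 1.17° for EKF, 0.50° for Mask R-CNN, 0.18° for PoseNet, and 0.02° for EKF-FAAEA, consistent with the ranking described in the text.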

4.2 Analysis of the application effect of sports action recognition model

To verify the practicality of the designed EKF-FAAEA-MEMS model, this study selected object keypoint similarity (OKS), recognition time, and recognition accuracy as evaluation indicators. The experiment compared the recognition performance of three models, EKF-FAAEA-MEMS, 3D CNNs, and long short-term memory networks (LSTM), in actual badminton sports. Fig. 9 shows the variation of OKS values for EKF-FAAEA-MEMS, 3D CNNs, and LSTM in different datasets.

Figs. 9(a) and 9(b) show the OKS values of EKF-FAAEA-MEMS, 3D CNNs, and LSTM in the training and testing datasets, respectively. In Fig. 9(a), as the number of training samples increases from 0 to 100, the OKS values of all three models show a trend of first increasing and then stabilizing. Finally, the optimal OKS values for EKF-FAAEA-MEMS, 3D CNNs, and LSTM were 0.98, 0.88, and 0.75. In Fig. 9(b), the optimal OKS values for EKF-FAAEA-MEMS, 3D CNNs, and LSTM in the test dataset are 0.97, 0.86, and 0.77. The OKS value is usually within 0 to 1. The closer the value is to 1, the closer the predicted key points are to the actual key points, indicating better performance in motion recognition. In summary, the EKF-FAAEA-MEMS model has better action recognition performance.
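The OKS formula is not written out in this section. For reference, the sketch below follows the commonly used COCO keypoint-evaluation definition, $\mathrm{OKS}=\sum_{i}\exp\left(-d_{i}^{2}/(2s^{2}k_{i}^{2})\right)\delta(v_{i}>0)/\sum_{i}\delta(v_{i}>0)$, where $d_{i}$ is the distance between predicted and actual key point $i$, $s^{2}$ is the object area, and $k_{i}$ is a per-keypoint constant. Whether this study used exactly this definition is an assumption, and the function name `oks` and all numerical values in the usage lines are hypothetical.

```python
import numpy as np

def oks(pred, gt, visible, area, k):
    """Object keypoint similarity in the COCO style (assumed definition).

    pred, gt : (N, 2) arrays of predicted and ground-truth key points
    visible  : (N,) visibility flags, key points with flag > 0 are scored
    area     : object area, used as the squared scale s**2
    k        : (N,) per-keypoint constants controlling the falloff
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    visible = np.asarray(visible)
    k = np.asarray(k, dtype=float)
    d2 = np.sum((pred - gt) ** 2, axis=1)          # squared distances
    sim = np.exp(-d2 / (2.0 * area * k ** 2))      # per-keypoint similarity
    mask = visible > 0
    return float(sim[mask].mean()) if mask.any() else 0.0

# Toy example with three key points (values are illustrative only)
gt = np.array([[10.0, 10.0], [20.0, 15.0], [30.0, 40.0]])
k = np.array([0.1, 0.1, 0.1])
perfect = oks(gt, gt, [1, 1, 1], area=400.0, k=k)
shifted = oks(gt + 2.0, gt, [1, 1, 1], area=400.0, k=k)
```

A perfect prediction yields an OKS of 1, and the value decays toward 0 as predicted key points move away from the actual ones.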

Figs. 10(a)-10(c) show the accuracy and time consumption of EKF-FAAEA-MEMS, 3D CNNs, and LSTM, respectively, in identifying three different badminton movements. In Fig. 10, numbers 1, 2, and 3 correspond to the serve, drop, and spike movements. EKF-FAAEA-MEMS has the highest accuracy in identifying serve, drop, and spike movements, with values of 0.99, 0.98, and 0.98, respectively. It also takes the shortest time, at 2.01 seconds, 1.88 seconds, and 1.96 seconds, respectively. The accuracy of 3D CNNs in identifying serve, drop, and spike movements is 0.87, 0.92, and 0.85, respectively, with a time consumption of 6.86 seconds, 4.82 seconds, and 4.98 seconds. The accuracy of LSTM in identifying serve, drop, and spike movements is 0.76, 0.84, and 0.78, respectively, with a time consumption of 7.21 seconds, 10.96 seconds, and 8.64 seconds.
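
To make the gap between the models easier to compare, the per-movement figures above can be averaged. The sketch below is illustrative only: it uses the accuracy and time values reported for Fig. 10, and the function name `summarize` and the dictionary layout are hypothetical choices rather than part of the original experiment.

```python
from statistics import mean

# Accuracy and recognition time (seconds) for serve, drop, and spike,
# copied from the values reported for Fig. 10.
RESULTS = {
    "EKF-FAAEA-MEMS": {"accuracy": [0.99, 0.98, 0.98], "time_s": [2.01, 1.88, 1.96]},
    "3D CNNs": {"accuracy": [0.87, 0.92, 0.85], "time_s": [6.86, 4.82, 4.98]},
    "LSTM": {"accuracy": [0.76, 0.84, 0.78], "time_s": [7.21, 10.96, 8.64]},
}


def summarize(results, baseline="EKF-FAAEA-MEMS"):
    """Mean accuracy, mean time, and time ratio relative to the baseline model."""
    base_time = mean(results[baseline]["time_s"])
    summary = {}
    for name, r in results.items():
        mean_time = mean(r["time_s"])
        summary[name] = {
            "mean_accuracy": mean(r["accuracy"]),
            "mean_time_s": mean_time,
            "time_ratio": mean_time / base_time,  # > 1 means slower than baseline
        }
    return summary


if __name__ == "__main__":
    for name, s in summarize(RESULTS).items():
        print(f"{name:<15} acc = {s['mean_accuracy']:.3f}, "
              f"time = {s['mean_time_s']:.2f} s, ratio = {s['time_ratio']:.2f}x")
```

On these figures, EKF-FAAEA-MEMS averages about 0.98 accuracy in 1.95 seconds, while 3D CNNs and LSTM average 0.88 and about 0.79 accuracy and take roughly 2.8 and 4.6 times as long, respectively.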

In Fig. 11, the serve, drop, and spike movements are selected for recognition. Each set of images shows, from left to right, the original image and the recognition results of EKF-FAAEA-MEMS, 3D CNNs, and LSTM. Figs. 11(a)-11(c) indicate that the recognition box of EKF-FAAEA-MEMS encloses the athlete's full posture, which enables better recognition of the movement. In contrast, the recognition boxes of 3D CNNs and LSTM do not accurately frame the full posture, so these models recognize the actions less reliably.

Fig. 9. OKS values for different action recognition models.

Fig. 10. Recognition accuracy and recognition time for different action recognition models.

Fig. 11. Actual effect of different action recognition models in recognizing badminton actions.

5. Conclusion

In response to the low accuracy of traditional sports motion recognition models, this study combined the EKF-FAAEA algorithm and MEMS sensors to build a sports motion recognition model and tested its performance. The results indicated that EKF-FAAEA outperformed standard EKF, Mask R-CNN, and PoseNet in fitness value iteration, pose estimation error fluctuation, and roll angle measurement. EKF-FAAEA required only 37 iterations to reach a stable fitness value of 0.18. In addition, the pose estimation error of EKF-FAAEA fluctuated within $-0.1 \sim 0.1$ on both the training and testing sets, a much smaller range than that of EKF, Mask R-CNN, and PoseNet. Finally, four different roll angles were selected for detection, and the measured values of EKF-FAAEA were closest to the actual values. Further testing of the EKF-FAAEA-MEMS action recognition model showed that its OKS value was close to 1, with a maximum of 0.98. In addition, the accuracy of the EKF-FAAEA-MEMS model in identifying serve, drop, and spike movements was as high as 0.99, 0.98, and 0.98, respectively, with time consumption as low as 2.01 seconds, 1.88 seconds, and 1.96 seconds. In summary, the attitude estimation algorithm and the motion recognition model designed in this study show good performance and good application effects, respectively. However, EKF-FAAEA and EKF-FAAEA-MEMS have so far been evaluated only on badminton recognition, so their applicability and effectiveness in other competitive or high-speed sports still need to be validated in subsequent research.

Funding

This research is supported by the Fujian Province Higher Education Teaching Research Project "Innovation and Practice of Cultivating Applied Talents in Physical Education Based on OBE Concept" (No. FBJY20230213).

REFERENCES

S. Barbon Jr., A. Pinto, J. V. Barroso, F. G. Caetano, F. A. Moura, S. A. Cunha, and R. D. S. Torres, ``Sport action mining: Dribbling recognition in soccer,'' Multimedia Tools and Applications, vol. 81, no. 3, pp. 4341-4364, 2022.

S. Sowmyayani and P. A. J. Rani, ``STHARNet: Spatio-temporal human action recognition network in content based video retrieval,'' Multimedia Tools and Applications, vol. 82, no. 24, pp. 38051-38066, 2023.

C. Zhang, M. Wang, and L. Zhou, ``Recognition method of basketball players' throwing action based on image segmentation,'' International Journal of Biometrics, vol. 15, no. 2, pp. 121-133, 2023.

Y. Egi, ``Basketball self training shooting posture recognition and trajectory estimation using computer vision and Kalman filter,'' Journal of Electrical Engineering, vol. 73, no. 1, pp. 19-27, 2022.

N. Wang, Q. Li, and Y. Li, ``Study on athlete's facial emotion recognition based on Kalman filter,'' International Journal of Reasoning-based Intelligent Systems, vol. 14, no. 4, pp. 221-226, 2022.

Q. Meng, D. Han, and Z. Wang, ``A model-free method for attitude estimation and inertial parameter identification of a noncooperative target,'' Advances in Space Research, vol. 71, no. 3, pp. 1735-1751, 2023.

S. Liu, N. He, C. Wang, H. Yu, and W. Han, ``Lightweight human pose estimation algorithm based on polarized self-attention,'' Multimedia Systems, vol. 29, no. 1, pp. 197-210, 2023.

F. Gong, Y. Li, X. Yuan, X. Liu, and Y. Gao, ``Human elbow flexion behaviour recognition based on posture estimation in complex scenes,'' IET Image Processing, vol. 17, no. 1, pp. 178-192, 2023.

C. Sun and D. Ma, ``SVM-based global vision system of sports competition and action recognition,'' Journal of Intelligent & Fuzzy Systems, vol. 40, no. 2, pp. 2265-2276, 2021.

X. Geng, ``Research on athlete's action recognition based on acceleration sensor and deep learning,'' Journal of Intelligent & Fuzzy Systems, vol. 40, no. 2, pp. 2229-2240, 2021.

Z. Hao, X. Wang, and S. Zheng, ``Recognition of basketball players' action detection based on visual image and Harris corner extraction algorithm,'' Journal of Intelligent & Fuzzy Systems, vol. 40, no. 4, pp. 7589-7599, 2021.

T. Li, K. S. Severson, F. Wang, and T. W. Dunn, ``Improved 3D markerless mouse pose estimation using temporal semi-supervision,'' International Journal of Computer Vision, vol. 131, no. 6, pp. 1389-1405, 2023.

H. Chen, R. Feng, S. Wu, H. Xu, F. Zhou, and F. Liu, ``2D human pose estimation: A survey,'' Multimedia Systems, vol. 29, no. 5, pp. 3115-3138, 2023.

H. Mokayed, T. Z. Quan, L. Alkhaled, and V. Sivakumar, ``Real-time human detection and counting system using deep learning computer vision techniques,'' Artificial Intelligence and Applications, vol. 1, no. 4, pp. 221-229, 2023.

W. An, S. Yu, Y. Makihara, X. Wu, C. Xu, Y. Yu, and Y. Yagi, ``Performance evaluation of model-based gait on multi-view very large population database with pose sequences,'' IEEE Transactions on Biometrics, Behavior, and Identity Science, vol. 2, no. 4, pp. 421-430, 2020.

Authors

Zhehua Fan

Zhehua Fan obtained his bachelor's degree in sports biology from Beijing Sport University in 1997 and a master's degree in sports education and training from Fujian Normal University in 2007. His research areas include sports human science, sports education and training, and sports rehabilitation. From August 1997 to December 1998, he participated in social practice at the Land Management Institute of Zhuangbian Town, Hanjiang District, Putian City. From January 1999 to March 2002, he taught in the Department of Physical Education at Putian Vocational College. From April 2002 to July 2011, he taught at the School of Physical Education of Putian University, including a period from September 2005 to December 2007 during which he was also an in-service graduate student at Fujian Normal University. Since August 2011, he has served as an associate professor at the School of Physical Education of Putian University, where he was also part-time research secretary of the school from 2012 to 2015 and a part-time staff member of the Office of the Sports Committee of Putian University from 2017 to 2019. He has published over 10 papers, including 11 independently published academic papers; participated in 4 provincial and ministerial level projects and 10 municipal and ministerial level projects, including leading 1 Fujian Provincial Social Science Fund project; and participated in the writing of two textbooks and independently completed one monograph.

Kun Ma

Kun Ma, male and of Han nationality, was born in September 1982 and is from Kaifeng City, Henan Province. He received a bachelor's degree in physical education from Anyang Normal University in 2007 and a master's degree in physical education training from Xi'an Physical Education University in 2010. He is currently an associate professor in the Physical Education Department of Xiamen Institute of Technology. His research field is physical education training.