3.1 Construction of the 3D Motion Capture System
After optimizing the 3D human motion capture algorithm, the 3D motion capture system
had to be constructed. The purpose of the research was to have the participants in the
online physical education course wear inertial sensors so that the human body model
could follow the students' movements in real time [15]. In human motion capture, the number of inertial sensors and their wearing positions
significantly impact the capture results. To capture human motion better, the number
of inertial sensors was set to 15 by referring to the tree structure of the human body and considering
the accuracy of the motion capture model. Fig. 1 shows their wearing positions on the human body.
As shown in Fig. 1, inertial sensors were worn on the pelvis, spine, head, left upper arm, left forearm,
left hand, right upper arm, right forearm, right hand, left thigh, left calf, left foot,
right thigh, right calf, and right foot of the human body, respectively. Human motion
capture can be achieved by collecting information on these 15 positions. The data
generation rate of the sensor was 100 Hz. The 3D motion capture system constructed
by the research was divided into two parts: the upper computer and the lower computer.
The lower computer was composed of inertial sensor nodes. Each node included an inertial
sensor chip and a wireless module. The upper computer was a software platform. The
data from the sensor nodes were sent to the software platform for analysis and calculation,
and the calculation results then drove the model so that the human body model followed
the movements of the real person. The specific structural
design of the 3D motion capture system is shown in Fig. 2.
Fig. 1. Wearing position of the human inertial sensor.
Fig. 2. Overall design of the motion capture system.
Fig. 3. Schematic diagram of the algorithm structure.
As shown in Fig. 2, the 3D motion capture system was divided into four modules: a data acquisition module,
a data transmission module, an action reconstruction module, and a data processing
module. As the information input terminal of the system, the primary function of the
data acquisition module was to collect the information from the inertial sensors worn
on the human body, namely the motion data from the gyroscope, accelerometer, and magnetometer
in each sensor. All motion information was collected, organized, packed, and sent to the
software through the wireless module. The data transmission module is a crucial part of the
motion capture system, as it must ensure the integrity and speed of the transmitted information
as well as the comfort and mobility of the body-worn sensors. Therefore, wireless transmission
was adopted: the collected data were transmitted to the upper computer software through the
wireless module in each sensor node, preserving information integrity and speed while improving
wearing comfort. In addition to the above two modules, the data processing module
of the motion capture system mainly includes three parts: data preprocessing,
foot posture calculation, and node posture calculation. Data preprocessing mainly corrects
the errors that may exist in the accelerometers, gyroscopes, and magnetometers. The foot
posture calculation and node posture calculation then determine the posture of the foot
and of each node of the human body, and the resulting limb motion poses are used for
model reconstruction. As the last
module of the motion capture system, motion reconstruction is one of the indicators
to evaluate the quality of the motion capture system. In this module, the method of
unconstrained nodes was used to reconstruct the human body posture, and the proposed
human body model was used to test the accuracy of the reconstruction method. Finally,
the 3D human body model was loaded on the software platform to follow the real human
body in real time. Various algorithms were used in the capture system, and the specific
algorithm structure is shown in Fig. 3.
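To make the data flow from the acquisition module to the transmission module concrete, the following Python sketch shows one possible layout for a single sensor-node sample and its packed wireless frame. The node ordering, field names, and byte layout are illustrative assumptions introduced here and are not the actual packet format used by the system.

```python
import struct
from dataclasses import dataclass

# Illustrative list of the 15 wearing positions (assumed ordering, not the system's actual IDs).
NODE_NAMES = [
    "pelvis", "spine", "head",
    "left_upper_arm", "left_forearm", "left_hand",
    "right_upper_arm", "right_forearm", "right_hand",
    "left_thigh", "left_calf", "left_foot",
    "right_thigh", "right_calf", "right_foot",
]
SAMPLE_RATE_HZ = 100  # data generation rate stated above


@dataclass
class NodeSample:
    """One sample from a single inertial sensor node."""
    node_id: int       # index into NODE_NAMES
    timestamp_ms: int  # time of measurement
    gyro: tuple        # angular rate (x, y, z), rad/s
    accel: tuple       # specific force (x, y, z), m/s^2
    mag: tuple         # magnetic field (x, y, z), arbitrary units

    def pack(self) -> bytes:
        """Pack the sample into a fixed-size binary frame for wireless transfer."""
        return struct.pack("<BI9d", self.node_id, self.timestamp_ms,
                           *self.gyro, *self.accel, *self.mag)

    @classmethod
    def unpack(cls, frame: bytes) -> "NodeSample":
        node_id, t, *v = struct.unpack("<BI9d", frame)
        return cls(node_id, t, tuple(v[0:3]), tuple(v[3:6]), tuple(v[6:9]))


if __name__ == "__main__":
    sample = NodeSample(0, 10, (0.01, 0.0, -0.02), (0.1, 0.0, 9.81), (22.0, 5.0, -41.0))
    assert NodeSample.unpack(sample.pack()) == sample  # round-trip check
```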
The algorithm used in the capture system is divided into six parts: a sensor error correction
module, an inertial navigation solution module, a gait detection module, a zero-speed correction
module, a data fusion algorithm module, and a 3D human body reconstruction module. Because the
sensor has unavoidable errors in its operation, and these errors have a significant
influence on the results, the data transmitted by the sensor must be corrected.
The sensor error correction module mainly corrects the received raw data;
obtaining more accurate data makes the overall model more accurate. The primary function
of the inertial navigation calculation module is to calculate foot displacement. It
updates the human posture, velocity, and position equations by working with the zero-speed
correction algorithm and error correction, and it solves the position of the human foot
in real time. The gait detection module distinguishes the support phase from the swing
phase to facilitate the subsequent zero-speed correction calculation. The zero-speed
correction module suppresses the navigation error in the support phase and improves
the accuracy of the foot displacement result. The core algorithm in the motion capture
system is the data fusion algorithm module. In this module, the data fusion algorithm
processes the collected gyroscope, accelerometer, and magnetometer data to obtain the
key posture data of the human body. After the key posture data are obtained, the human
body movement posture is tracked in the 3D human body reconstruction module, and the
position of the root node is estimated from the spatial position of the foot so that
the model follows the movement of the human body. In online physical education, the
motion capture system can capture the teaching actions of the teacher and the learning
actions of the students in class, and the students' actions can then be compared with
the teacher's to evaluate the quality of the students' learning. Moreover, it can quickly
find students' problems and help them improve their movements.
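The division of labor among these six parts can be summarized as a per-frame processing pipeline. The Python skeleton below is a hedged sketch of that flow; the function names, signatures, and stub bodies are placeholders introduced for illustration rather than the system's actual implementation.

```python
# Hypothetical per-frame pipeline mirroring the six algorithm parts described above.
# Each stage is a stub; a real implementation would replace the placeholder bodies.

def correct_sensor_errors(raw_frame):
    """Sensor error correction: remove bias and scale errors from raw gyro/accel/mag data."""
    return raw_frame  # stub: assume the data are already calibrated

def detect_gait_phase(foot_data):
    """Gait detection: label the foot as being in the support (stance) or swing phase."""
    return "support"  # stub decision

def inertial_navigation_update(state, foot_data, phase):
    """Inertial navigation solution: integrate attitude, velocity, and foot position."""
    return state  # stub: no integration performed

def zero_velocity_correction(state, phase):
    """Zero-speed correction: suppress navigation drift while the foot is in the support phase."""
    if phase == "support":
        state["foot_velocity"] = (0.0, 0.0, 0.0)
    return state

def fuse_attitudes(frame):
    """Data fusion: combine gyro/accel/mag readings into a posture quaternion per node."""
    return {node: (1.0, 0.0, 0.0, 0.0) for node in frame}  # stub: identity quaternions

def reconstruct_body(attitudes, foot_position):
    """3D human body reconstruction: drive the tree-structured model from the foot upward."""
    return {"root": foot_position, "attitudes": attitudes}

def process_frame(raw_frame, state):
    """One pass of the capture pipeline for a single 100 Hz frame of all sensor nodes."""
    frame = correct_sensor_errors(raw_frame)
    phase = detect_gait_phase(frame.get("left_foot", {}))
    state = inertial_navigation_update(state, frame.get("left_foot", {}), phase)
    state = zero_velocity_correction(state, phase)
    attitudes = fuse_attitudes(frame)
    pose = reconstruct_body(attitudes, state.get("foot_position", (0.0, 0.0, 0.0)))
    return pose, state
```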
3.2 3D Action Body Capture based on Data Fusion
The human body model plays a key role in human motion capture. The model changes with
posture, and its posture update takes one node of its tree structure as the root node.
An attitude algorithm is used to update the pose of the human model, and a good attitude
algorithm greatly impacts the accuracy of human motion capture. Because the gradient
descent method can find the optimal solution of an objective function, it helps the human
body model reach the optimal solution during the pose calculation and thus improves the
accuracy of human motion capture. The gradient descent method has a simple structure and
produces stable and reliable results [16]. Therefore, this paper proposes an attitude calculation method based on the gradient
descent method for capturing human motion. In this method, the attitude error is represented
by an error function. When the method is used for 3D human motion capture, the error is
generated mainly by the accelerometer and magnetometer.
In the process of human posture recognition, the human posture is expressed by a quaternion,
and its expression is $q=q_{0}+q_{1}i+q_{2}j+q_{3}k$, where $q_{0}$, $q_{1}$, $q_{2}$,
and $q_{3}$ are real numbers. The error caused by the acceleration in the proposed
method is $F_{f}(q,f)$, and its expression is shown in Eq. (1).
In Eq. (1), $f_{x}^{b}$, $f_{y}^{b}$, and $f_{z}^{b}$ are the measured values of the accelerometer
in three directions in the carrier coordinate system. The carrier coordinate system
is fixed to the carrier, with its origin at the center of the carrier, and is referred
to as the b system. In Eq. (1), $b$ represents the carrier coordinate system, $q$ represents the quaternion of human
posture, and $f$ represents the measured value. The estimation vector of the magnetometer
needs to be expressed first in order to calculate the magnetometer error.
The navigation-coordinate system is the reference coordinate system used to determine
the navigation parameters of the carrier, referred to as the n-system. In the navigation-coordinate
system, the expression of the estimated vector of the magnetometer is shown in Eq.
(2).
where $h_{x}$, $h_{y}$, and $h_{z}$ are the measured values of the magnetometer in three
directions in the navigation-coordinate system, $m^{b}$ is the observation vector
in the carrier coordinate system, and $n$ represents the navigation-coordinate system.
Here, $h_{x}=\sqrt{(m_{x}^{n})^{2}+(m_{y}^{n})^{2}}$, $h_{y}=0$, and $h_{z}=m_{z}^{n}$.
The vector $h$ is then transformed into the estimated vector $h'$ in the carrier coordinate
system, whose expression is shown in Eq. (3).
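The construction described by Eqs. (2) and (3) can be sketched in Python as follows. The quaternion-rotation helper and the convention that $R(q)$ maps carrier-frame vectors into the navigation frame are assumptions made for illustration; the paper's equations themselves are not reproduced.

```python
import numpy as np

def quat_rotation_matrix(q):
    """Rotation matrix of the unit quaternion q = (q0, q1, q2, q3), scalar first.
    Convention assumed here: R(q) maps carrier-frame (b) vectors into the navigation frame (n)."""
    q0, q1, q2, q3 = q
    return np.array([
        [1 - 2*(q2**2 + q3**2), 2*(q1*q2 - q0*q3),     2*(q1*q3 + q0*q2)],
        [2*(q1*q2 + q0*q3),     1 - 2*(q1**2 + q3**2), 2*(q2*q3 - q0*q1)],
        [2*(q1*q3 - q0*q2),     2*(q2*q3 + q0*q1),     1 - 2*(q1**2 + q2**2)],
    ])

def magnetometer_reference(q, m_b):
    """Build the magnetometer estimation vectors in the sense of Eqs. (2)-(3).

    m_b: normalized magnetometer observation vector in the carrier (b) frame.
    Returns (h, h_prime): the reference field in the navigation (n) frame with its
    horizontal component collapsed onto the x axis, and that reference expressed
    back in the carrier frame.
    """
    R = quat_rotation_matrix(q)
    m_n = R @ np.asarray(m_b, dtype=float)   # observation expressed in the n frame
    h = np.array([np.hypot(m_n[0], m_n[1]),  # h_x = sqrt((m_x^n)^2 + (m_y^n)^2)
                  0.0,                       # h_y = 0
                  m_n[2]])                   # h_z = m_z^n
    h_prime = R.T @ h                        # transformed back into the carrier frame
    return h, h_prime
```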
The expression of magnetometer error $F_{m}(q,m)$ is shown in Eq. (4).
In Eq. (4), $\frac{\partial F_{mx}}{\partial q}$, $\frac{\partial F_{my}}{\partial q}$, and
$\frac{\partial F_{mz}}{\partial q}$ represent the errors of the magnetometer in the three
directions of the carrier coordinate system. The expression of the total error can be obtained
by combining the errors of the magnetometer and the accelerometer, as shown in Eq. (5).
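A hedged sketch of these error terms, in the spirit of Eqs. (1), (4), and (5) and of the widely used gradient-descent (Madgwick-style) formulation, is given below: the accelerometer error compares the gravity direction predicted from the quaternion with the normalized accelerometer reading, and the magnetometer error does the same with the field reference $h$ from Eq. (2). The gravity reference $[0,0,1]$ and the frame-mapping convention are assumptions, and the exact closed forms of the paper's equations are not reproduced.

```python
import numpy as np

def quat_rotation_matrix(q):
    """Same helper as in the previous sketch (repeated so this block runs on its own)."""
    q0, q1, q2, q3 = q
    return np.array([
        [1 - 2*(q2**2 + q3**2), 2*(q1*q2 - q0*q3),     2*(q1*q3 + q0*q2)],
        [2*(q1*q2 + q0*q3),     1 - 2*(q1**2 + q3**2), 2*(q2*q3 - q0*q1)],
        [2*(q1*q3 - q0*q2),     2*(q2*q3 + q0*q1),     1 - 2*(q1**2 + q2**2)],
    ])

def accel_error(q, f_b):
    """F_f(q, f): gravity direction predicted from q minus the normalized accelerometer
    reading f_b in the carrier frame (gravity reference [0, 0, 1] is an assumption)."""
    g_n = np.array([0.0, 0.0, 1.0])
    predicted = quat_rotation_matrix(q).T @ g_n  # gravity mapped into the b frame
    return predicted - np.asarray(f_b, dtype=float) / np.linalg.norm(f_b)

def mag_error(q, m_b, h):
    """F_m(q, m): field reference h (from Eq. (2)) mapped into the carrier frame minus
    the normalized magnetometer reading m_b."""
    predicted = quat_rotation_matrix(q).T @ np.asarray(h, dtype=float)
    return predicted - np.asarray(m_b, dtype=float) / np.linalg.norm(m_b)

def total_error(q, f_b, m_b, h):
    """Stacked error in the sense of Eq. (5): accelerometer and magnetometer terms combined."""
    return np.concatenate([accel_error(q, f_b), mag_error(q, m_b, h)])
```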
The partial derivative of Eq. (5) was calculated to obtain the expression shown in Eq. (6).
To minimize the total error, Eq. (6) was solved using the Gauss–Newton
method, and the calculation target $q_{\nabla ,t}$ is expressed as Eq. (7).
Here, $\mu $ is the step size, $\nabla $ is the differential operator, and $\nabla F_{f,m}(q,f,m)$
is expressed as Eq. (8).
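Eq. (7) describes a correction of step size $\mu$ taken against the normalized error gradient. The sketch below implements one such step numerically; the finite-difference Jacobian is a stand-in for the analytic Jacobian of Eq. (8), and the `error_fn` interface (for example, `total_error` from the previous sketch with one set of measurements bound to it) is an assumption for illustration.

```python
import numpy as np

def numerical_gradient(error_fn, q, eps=1e-6):
    """Approximate J^T F, the gradient of the stacked error, at quaternion q.
    This finite-difference Jacobian stands in for the analytic Jacobian of Eq. (8)."""
    q = np.asarray(q, dtype=float)
    F = error_fn(q)
    J = np.zeros((F.size, 4))
    for k in range(4):
        dq = np.zeros(4)
        dq[k] = eps
        J[:, k] = (error_fn(q + dq) - F) / eps
    return J.T @ F

def gradient_descent_step(q_prev, error_fn, mu):
    """One correction step in the form of Eq. (7): move against the normalized gradient
    with step size mu, then renormalize the quaternion."""
    grad = numerical_gradient(error_fn, q_prev)
    q_new = np.asarray(q_prev, dtype=float) - mu * grad / (np.linalg.norm(grad) + 1e-12)
    return q_new / np.linalg.norm(q_new)
```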
The gradient-descent data-fusion algorithm is used to solve the attitude; the algorithm
flow chart is shown in Fig. 4.
The algorithm flow chart in Fig. 4 shows that the errors obtained from the magnetometer and accelerometer are passed
through the Jacobian matrix of the error function to obtain the gyroscope correction
value. The differential equation is then solved using the gyroscope output to obtain
the final attitude quaternion, whose expression is shown in Eq. (9).
where $q_{t}$ represents the final attitude quaternion; $\alpha $ is the weight of
the corrected quaternion; $q_{\omega ,t}$ is the quaternion obtained by solving the
differential equation. If $\beta $ is the gyroscope error and $t_{s}$ the sampling
period, Eq. (10) can be used.
where $\alpha $ is the weight, and $\mu $ is the step size. In this case, the expression
of $q_{t}$ can be changed to Eq. (11).
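The fusion described around Eqs. (9)-(11) can be sketched as follows: the gyroscope propagates the previous quaternion through the quaternion differential equation over one sampling period $t_s$, and the result is blended with the gradient-corrected quaternion using the weight $\alpha$. Whether the blended quaternion is renormalized is not stated in the text, so the renormalization here is an assumption.

```python
import numpy as np

def quat_multiply(a, b):
    """Hamilton product of two scalar-first quaternions."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return np.array([
        a0*b0 - a1*b1 - a2*b2 - a3*b3,
        a0*b1 + a1*b0 + a2*b3 - a3*b2,
        a0*b2 - a1*b3 + a2*b0 + a3*b1,
        a0*b3 + a1*b2 - a2*b1 + a3*b0,
    ])

def gyro_propagate(q_prev, omega, t_s):
    """q_{omega,t}: integrate the quaternion differential equation over one sampling
    period t_s using the gyroscope angular rate omega (rad/s)."""
    q_dot = 0.5 * quat_multiply(q_prev, np.array([0.0, *omega]))
    q = np.asarray(q_prev, dtype=float) + q_dot * t_s
    return q / np.linalg.norm(q)

def fuse(q_grad, q_omega, alpha):
    """Blend in the sense of Eq. (9): weight the gradient-corrected quaternion against
    the gyroscope-propagated one; the renormalization at the end is an assumption."""
    q = alpha * np.asarray(q_grad, dtype=float) + (1.0 - alpha) * np.asarray(q_omega, dtype=float)
    return q / np.linalg.norm(q)
```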
Initialization and calibration are very important processes in human motion capture.
They solve the rotation quaternion from the carrier coordinate system to the sensor
coordinate system. The content of the motion capture is the motion posture of the human
body; without initialization and calibration, the captured content would be inaccurate.
When the initial calibration is performed, the carrier coordinate system coincides with
the geographic coordinate system. The geographic coordinate system takes the position of the carrier as the
origin of the coordinate system. Generally, the attitude description of the carrier
is roll angle, pitch angle, and yaw angle. At this time, the expression is shown in
Eq. (12).
where $t_{i}$ is the initial state. The sensor coordinate system of the inertial sensor
takes the three sensitive axes of the gyroscope as its coordinate axes, forming a
right-handed rectangular coordinate system into which the coordinate axes of the other
inertial measuring elements are unified; it is referred to as the s system. The transformation
relationship between the sensor coordinate system and the carrier coordinate system is shown in Eq. (13).
where $s$ is the sensor coordinate system and $b$ is the carrier coordinate system.
The purpose of this system is to make the human body model follow the movement of
the real human body. Converting the posture quaternion of each moving limb in the
navigation-coordinate system into the rotation quaternion between two connected limbs,
and solving the three-dimensional coordinates of the joint points in the human body model,
are the focus of the research. Taking the lower limbs of the human body as an example,
the solution process is explained below. Fig. 5 shows the structure diagram of the lower limbs of the human body.
Fig. 4. Flow chart of the gradient descent algorithm.
Fig. 5. Structural diagram of the human lower limbs.
Fig. 6. Node traversal update sequence.
As shown in Fig. 5, the rotation quaternion between two connected moving limbs is represented
by $q_{i}^{i+1}$, where $i$ denotes the $i$-th limb segment and $i+1$ denotes the $(i+1)$-th
limb segment. In addition, ${ }_s^n q_i$ is the rotation quaternion of the sensor coordinate
system of the $i$-th limb relative to the navigation-coordinate system, and ${ }_b^s q_i$
is the rotation quaternion of the $i$-th limb relative to its wearable sensor coordinate
system. The quaternion ${ }_b^n q_i$ from the navigation-coordinate system to the carrier
coordinate system can be obtained from ${ }_s^n q_i$ and ${ }_b^s q_i$; its expression is shown in Eq. (14).
A similar expression for ${ }_b^n q_{i+1}$ is shown in Eq. (15).
Here, ${ }_b^n q_{i+1}$ is the corresponding rotation quaternion for the $(i+1)$-th limb.
Eq. (16) can be obtained using Eqs. (14) and (15).
where ${ }^b q_i^{i+1}$ represents the rotation quaternion between the two moving limbs.
In space, a coordinate transformation can be accomplished through rotation and translation,
and the transformation equation is shown in Eq. (17),
where $\left[x_{0},y_{0},z_{0}\right]$ are the original coordinates, $\left[x,y,z\right]$
are the converted coordinates, $C$ is a left-multiplying rotation matrix, and $T$ is a
translation vector. The human body tree model can represent the
real human body in the motion capture system. The positions of its joints determine
the spatial position of the human body model. Therefore, if the spatial position of
a certain limb is known, the posture and position of the human body model following
the movement of the human body can be inferred. For the motion capture system, the
most likely cause of error is its drift problem. Selecting the pelvis position as the
root node can mitigate the drift problem, but this method ignores the vertical movement
of the human body in space, so actions such as jumping and squatting cannot be tracked
effectively [17]. The research adopted the unconstrained-root method to solve the problem that the ups
and downs of the human body cannot be tracked correctly when the pelvis is selected as
the root node, and used the spatial position of the foot to solve
the drift problem [18]. The foot space position was calculated using the zero-speed update algorithm. The
calculation result was accurate, and the drift error was small. Fig. 6 presents the traversal sequence of the human body nodes.
As shown in Fig. 6, the traversal calculation takes the foot as the starting point, obtains
the position of the pelvic root node from the foot through the lower-limb traversal,
and then recursively derives the spatial positions of the other nodes through the tree
structure model of the human body to realize action reconstruction. When the zero-speed update algorithm
calculates the foot displacement, the foot state is divided into support and swing
phases. When the foot is in the swing phase, the inertial navigation algorithm can
be used to calculate the foot state. When the foot is in the support phase, the foot
error can be estimated and corrected, and the accurate foot space position can be
obtained by combining the two methods. After obtaining the accurate foot space position,
the pose of the human body can be reconstructed using the root-free method. Similar
algorithms for human motion capture include the human motion real-time tracking strategy
proposed by Pengzhan et al. in 2016. This method captures the posture of human motion
through complementary and Kalman filters with high accuracy [19]. Compared with that tracking strategy, the method proposed in this study adds the
unconstrained-root approach, which allows the traversal to start from the foot. The
proposed method has higher accuracy for capturing motion postures that include a support
phase and a short swing time, and it is therefore more suitable for online physical education.
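As a concrete illustration of the reconstruction flow described above, the sketch below combines a simple zero-velocity-update foot tracker (integrate acceleration during the swing phase, clamp velocity during the support phase) with a traversal that starts at the foot and walks up the lower-limb chain to the pelvis; the remaining nodes of the tree model would then be placed recursively from the pelvis. The stance-detection thresholds, segment lengths, and per-segment rotations are illustrative assumptions, not values from the study.

```python
import numpy as np

GRAVITY = np.array([0.0, 0.0, -9.81])  # m/s^2, navigation frame (assumed z-up)

def detect_stance(accel_n, gyro, accel_thresh=0.4, gyro_thresh=0.6):
    """Simple gait detection: the foot is treated as being in the support (stance)
    phase when the specific force is close to gravity and the angular rate is small."""
    return (abs(np.linalg.norm(accel_n) - 9.81) < accel_thresh
            and np.linalg.norm(gyro) < gyro_thresh)

def zupt_foot_track(accel_n_seq, gyro_seq, t_s=0.01):
    """Zero-velocity-update foot tracking: integrate acceleration during the swing phase,
    reset velocity whenever stance is detected, and accumulate the foot position."""
    v = np.zeros(3)
    p = np.zeros(3)
    positions = []
    for a_n, w in zip(accel_n_seq, gyro_seq):
        if detect_stance(a_n, w):
            v = np.zeros(3)                               # zero-speed correction in stance
        else:
            v = v + (np.asarray(a_n) + GRAVITY) * t_s     # remove gravity, integrate velocity
            p = p + v * t_s                               # integrate position
        positions.append(p.copy())
    return positions

def reconstruct_from_foot(foot_pos, chain):
    """Root-unconstrained reconstruction: starting from the foot position, walk up
    the lower-limb chain (foot -> calf -> thigh -> pelvis) by adding each segment's
    rotated length vector; other nodes can then be placed from the pelvis recursively.

    chain: list of (segment_name, rotation_matrix, local_offset) ordered from the foot upward.
    """
    positions = {"foot": np.asarray(foot_pos, dtype=float)}
    current = positions["foot"]
    for name, R, offset in chain:
        current = current + R @ np.asarray(offset, dtype=float)
        positions[name] = current
    return positions

if __name__ == "__main__":
    # Illustrative chain: identity rotations and made-up segment lengths (metres).
    chain = [
        ("calf",   np.eye(3), [0.0, 0.0, 0.40]),
        ("thigh",  np.eye(3), [0.0, 0.0, 0.45]),
        ("pelvis", np.eye(3), [0.0, 0.10, 0.10]),
    ]
    print(reconstruct_from_foot([0.0, 0.0, 0.0], chain))
```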