2.2. ANN Method Concept
Characterizing and identifying systems are fundamental tasks in systems theory, where
the former involves establishing the mathematical representation of a system. The main
challenge in SI is therefore to determine, from the experimental data, a model structure
that adequately represents the real system. A model that captures the dynamic behavior
of the system can be built from an understanding of the system's essential elements.
When the essential elements or parameters of the physical system are unknown, however,
the model cannot be written directly from first principles and physics. In this
situation, a black-box approach can be used. As shown in Fig. 2, the black-box SI
approach constructs a suitable identification model that is subjected to the same input
$u(t)$ as the plant and produces an output $\hat{y}(t)$ that approximates the plant
output $y(t)$ in some desired sense; $e(t)$ is the error between the plant output and
the approximated output.
Fig. 2. General SI scheme.
The formulation of the SI model can be categorized as linear or nonlinear. In nonlinear
SI, the relationship between the input $u(t)$ and the output $y(t)$ of the system
is expressed using nonlinear equations. The general form of a nonlinear SI model can
be written as
$$y(t) = f(u(t), \theta) + e(t) \tag{1}$$
where $y(t)$ is the system output at time $t$, $u(t)$ denotes the input to the system
at time $t$, $\theta$ represents the vector of unknown parameters that need to be
estimated, $f$ is a known set of functions that relate the input and parameters to
the output, and $e(t)$ represents the model error at time $t$.
This study deals with an EPS that exhibits nonlinear behavior and therefore proposes a
nonlinear method for the SI of its dynamics. The objective is to achieve a realization
or approximation of the underlying dynamics $f$ by using an ANN. To accomplish this, we
introduce the network input vector $v(t)$, which consists of a finite number of past
outputs and inputs ($n$ past outputs and $m$ past inputs), as
$$v(t) = [y(t-1), \ldots, y(t-n), u(t-1), \ldots, u(t-m)]^T \tag{2}$$
In practical applications, the choice of these past values is often guided by the prior
information, or training set, available for the plant under identification,
which can be written as
Based on the data, we deduce the relationship
where $f$ is a function deduced from Eq. (2). Even when physical parameters of the system are unknown, we can still examine
the relationship between $y$ and $v$.
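As a minimal sketch of how such a training set could be assembled from recorded data, the snippet below stacks lagged outputs and inputs into one row per time step. The helper name `build_regressors`, the ordering of the entries in $v(t)$ (past outputs first, then past inputs), and the sample values are assumptions made for illustration only.

```python
import numpy as np

def build_regressors(u, y, n, m):
    """Stack past outputs and inputs into regression vectors v(t).

    Hypothetical helper: each row holds
    [y(t-1), ..., y(t-n), u(t-1), ..., u(t-m)] (assumed ordering),
    and the matching target is y(t).
    """
    u = np.asarray(u, dtype=float)
    y = np.asarray(y, dtype=float)
    start = max(n, m)  # first t for which every lagged value exists
    rows = []
    for t in range(start, len(y)):
        past_y = [y[t - i] for i in range(1, n + 1)]
        past_u = [u[t - j] for j in range(1, m + 1)]
        rows.append(past_y + past_u)
    return np.array(rows), y[start:]

# Made-up records, used only to show the shapes involved.
u = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([10.0, 11.0, 12.0, 13.0, 14.0])
V, targets = build_regressors(u, y, n=2, m=1)
print(V)        # one regression vector per row
print(targets)  # the output y(t) paired with each row
```

Each pair (row of `V`, entry of `targets`) is one training example from which a network could learn the relationship between $v$ and $y$.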
The representation in Eq. (1) is a simplification of the general nonlinear NARMAX model,
$$y(t) = f(y(t-1), \ldots, y(t-n), u(t-1), \ldots, u(t-m), e(t-1), \ldots, e(t-p)) + e(t) \tag{5}$$
where $y(t-1), y(t-2), \ldots, y(t-n)$ denote the past output values, $u(t-1), u(t-2),
\ldots, u(t-m)$ represent the past input values, and $e(t-1), e(t-2), \ldots, e(t-p)$
denote the past error or residual values. In Eq. (1), the output $y(t)$ is a function
of the current input $u(t)$, the unknown parameters $\theta$, and an error term $e(t)$.
This equation suggests a direct relationship between the current input and output,
potentially with some error or noise. Eq. (5), on the other hand, represents a more
complex nonlinear model in which the output $y(t)$ depends not only on the input and
the unknown parameters $\theta$ but also on the past outputs $y(t-1), y(t-2), \ldots,
y(t-n)$ and on the past inputs and errors. This equation incorporates a memory or
feedback mechanism, which allows the model to take the system's history into account
and to capture dynamic or temporal dependencies.
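The effect of this memory can be seen in a small simulation. The sketch below uses a toy first-order model, $y(t) = 0.5\,y(t-1) + \tanh(u(t-1)) + e(t)$, whose structure and coefficients are arbitrary assumptions for illustration; it is not the EPS model, and it omits the moving-average terms in past errors.

```python
import numpy as np

def simulate_toy_system(u, noise=None):
    """Simulate y(t) = 0.5*y(t-1) + tanh(u(t-1)) + e(t).

    Toy model with assumed coefficients, used only to show how past
    outputs and inputs shape the current output.
    """
    u = np.asarray(u, dtype=float)
    e = np.zeros_like(u) if noise is None else np.asarray(noise, dtype=float)
    y = np.zeros_like(u)
    for t in range(1, len(u)):
        y[t] = 0.5 * y[t - 1] + np.tanh(u[t - 1]) + e[t]
    return y

# A single input pulse at t = 0, with no noise.
u = np.zeros(6)
u[0] = 1.0
y = simulate_toy_system(u)
print(y)
```

Although the input is nonzero only at the first step, the output stays nonzero afterwards and halves at every step. A model that maps only the current input to the current output cannot reproduce this behavior.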
In the context of SI, an ANN is used to approximate the underlying dynamics or relationship
between input and output variables of a system based on observed data. It serves as
a flexible and adaptive model that can capture nonlinearities and complex interactions
in the system’s behavior. The learning process of an ANN refers to the mechanisms by which
the network adapts and improves its performance based on the input data and desired outputs.
Fig. 3 illustrates the process of learning and error correction in an ANN.
Fig. 3. Illustration of the learning process and error correction: (a) an ANN with one
output, (b) signal-flow graph of the output neuron, and (c) flow of forward and backward
propagation.
Assume that the network depicted in Fig. 3(a) comprises a single hidden layer, with inputs represented by the vector $x(z)$, and
an output layer containing a neuron labeled $k$. This neuron in the output layer is
driven by a signal vector $h(z)$ generated by the previous layer, and $y_k(z)$ corresponds
to the predicted output associated with this configuration. In this context, the variable
$z$ signifies a discrete time step, representing the iterative stages involved in
adjusting the interconnection weights of the neurons within the network.
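The forward path of such a network can be sketched as follows. The tanh hidden activation, the linear output neuron, the layer sizes, and the names `W_hidden`, `b_hidden`, `w_out`, and `b_out` are assumptions for illustration; the text does not specify them.

```python
import numpy as np

def forward(x, W_hidden, b_hidden, w_out, b_out):
    """Forward pass: inputs x -> hidden signal vector h -> scalar output y_k."""
    h = np.tanh(W_hidden @ x + b_hidden)  # hidden-layer signal vector h(z) (assumed tanh)
    y_k = float(w_out @ h + b_out)        # output neuron k (assumed linear)
    return h, y_k

rng = np.random.default_rng(0)
x = np.array([0.2, -0.1, 0.4])                 # example input vector x(z)
W_hidden = rng.normal(scale=0.5, size=(4, 3))  # 4 hidden neurons, 3 inputs
b_hidden = np.zeros(4)
w_out = rng.normal(scale=0.5, size=4)          # weights from hidden layer to neuron k
b_out = 0.0

h, y_k = forward(x, W_hidden, b_hidden, w_out, b_out)
print(h, y_k)
```

The vector `h` plays the role of $h(z)$ driving the output neuron, and `y_k` is the predicted output $y_k(z)$ for one step $z$.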
Now consider a simpler scenario in which neuron $k$ in the output layer is the only
computational neuron, driven by the signal vector $h(z)$,
as shown in detail in Fig. 3(b). After the signals have undergone forward propagation, the output of the ANN is compared
with a target output $g_k(z)$. This comparison yields an error signal $e_k(z)$ which,
by definition, quantifies the difference between the desired target output and the
ANN's output. Consequently, the error signal at output neuron $k$ during the $z$th
iteration or time step is formally expressed as
$$e_k(z) = g_k(z) - y_k(z) \tag{6}$$
To ensure that errors in the positive and negative directions are treated equally,
the error is squared, which gives
The learning process entails iteratively modifying the weights of the network’s connections.
An iteration is defined as a complete cycle of calculations (forward and backward
passes), as illustrated in Fig. 3(c). The objective is to minimize the error criterion, which is expressed as a cost
function. Normalizing the sum of squared errors with respect to the set size $N$, we
get
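The error signal, its squared form, and the normalized cost can be checked numerically with a few lines of code. This sketch assumes the error is taken as target minus output and uses the conventional factor of 1/2 in the squared error; both conventions, along with the function names and sample values, are assumptions and may differ from the authors' exact definitions.

```python
import numpy as np

def error_signal(g_k, y_k):
    """Error e_k = target - output (assumed sign convention)."""
    return g_k - y_k

def instantaneous_error(e_k):
    """Squared error with the conventional 1/2 factor (an assumption)."""
    return 0.5 * e_k ** 2

def average_cost(targets, outputs):
    """Sum of squared errors normalized by the set size N."""
    e = error_signal(np.asarray(targets, dtype=float),
                     np.asarray(outputs, dtype=float))
    return float(np.mean(instantaneous_error(e)))

# Made-up targets and network outputs for N = 3 samples.
targets = [1.0, 0.0, 2.0]
outputs = [0.5, 0.5, 2.0]
cost = average_cost(targets, outputs)
print(cost)
```

Squaring makes the errors +0.5 and -0.5 contribute equally to the cost, which is the property the squared error is meant to provide.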
According to the BP algorithm, a correction $\Delta W_{kN}(z)$ is applied to
the weight $W_{kN}(z)$ that is proportional to the partial derivative of $E_k(z)$ with
respect to $W_{kN}(z)$, which gives
Here $W_{kN}(z)$ is the weight of the connection between the hidden layer and the output
layer (see Fig. 3(b)), and $\Delta W_{kN}(z)$ is the adjustment of the weight $W_{kN}$ at time step $z$. The
correction $\Delta W_{kN}(z)$ applied to $W_{kN}(z)$ is defined by
$$\Delta W_{kN}(z) = -\eta \frac{\partial E_k(z)}{\partial W_{kN}(z)} \tag{10}$$
where $\eta$ is the learning rate. Having computed the weight adjustment $\Delta W_{kN}(z)$,
the updated value of the weight $W_{kN}(z)$ is defined by
$$W_{kN}(z+1) = W_{kN}(z) + \Delta W_{kN}(z) \tag{11}$$
where $W_{kN}(z+1)$ denotes the new weight.
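The weight correction and update can be sketched for the simplest case of a single output neuron. The code assumes a linear output neuron, a squared error with a factor of 1/2 (so that the gradient with respect to the weights is $-e_k(z)\,h(z)$), and sample-by-sample updates; the function name, learning rate, and synthetic data are hypothetical.

```python
import numpy as np

def train_output_neuron(H, g, eta=0.1, epochs=200):
    """Gradient-descent training of one linear output neuron.

    H: (N, d) array of hidden signal vectors h(z); g: (N,) targets.
    Assumes E = 0.5 * e^2, so dE/dw = -e * h.
    """
    H = np.asarray(H, dtype=float)
    g = np.asarray(g, dtype=float)
    w = np.zeros(H.shape[1])
    history = []
    for _ in range(epochs):
        for h, target in zip(H, g):
            y_k = w @ h              # forward pass through neuron k
            e_k = target - y_k       # error signal
            grad = -e_k * h          # partial derivative of E w.r.t. w
            delta_w = -eta * grad    # correction proportional to the gradient
            w = w + delta_w          # updated weight
        history.append(float(np.mean(0.5 * (g - H @ w) ** 2)))
    return w, history

# Synthetic, noise-free data generated from known weights.
rng = np.random.default_rng(1)
H = rng.uniform(-1.0, 1.0, size=(50, 3))
w_true = np.array([0.5, -1.0, 2.0])
g = H @ w_true

w, history = train_output_neuron(H, g)
print(w, history[0], history[-1])
```

With noise-free data the learned weights approach `w_true` and the average cost falls over the epochs. In a multilayer network the same rule is applied to the hidden-layer weights, with the gradient obtained through the chain rule during the backward pass.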