2.1. Quaternion Algebra
Quaternions are 4-D hypercomplex numbers. A quaternion q ∈ H is defined as:
where w, x, y, z are all real numbers, and the imaginary units (i, j, k) have the following properties:
Quaternions form a noncommutative normed division algebra H, for q0
, q1
∈ H the multiplication q0q1
≠ q1q0
, and is given by:
Let Q ∈ Hm×n be a quaternion matrix of rank r, the singular value decomposition (SVD) of Q is defined as follows [22]:
where Σr = diag(σ1, ..., σr) is a real diagonal matrix and has r non-null entries in descending order on its diagonal denoted the singular values
of Q. U ∈ Hn×n, and V ∈ Hm×m are quaternion unitary matrices, contain respectively the left and right quaternion
singular vectors of Q, VH
is the conjugate transpose matrix of V. Multichannel EEG signals can be represented in quaternion domain, denoted as q(t), as follow:
where Ch1(t), Ch2(t), Ch3(t), and Ch4(t) are real value measured at sampling point t of the four channels of EEG signal being selected for analysis. Therefore, the four
real-valued EEG channel time series data are converted to a more compact quaternion-valued
time-series data.
2.2. Enhanced QPCA Algorithm
Unlike other methods such as those in the time domain, frequency domain, and time-frequency
domain, the QPCA is a prevalent signal decomposition method. It generates features
that are highly representative with minimal redundancy of information and can handle
signals that are non-stationary and non-linear in nature [33]. Originally formulated for color image recognition, QPCA restricts the quaternion
representation to pure quaternions due to the limitation of having only three color
channels [31]. However, when applied to multichannel EEG processing, the traditional QPCA algorithm
is enhanced with a full quaternion representation to enrich the mutual relationship
information across channels. Meanwhile, to reduce computational load, the proposed
algorithm is applied to the features of segmented signals, rather than to the raw
data. In addition, to enhance the generalization power for classification, the features
for classification are selected based on the principal components of the singular
matrix. The enhanced QPCA algorithm for multichannel EEG signal processing is outlined
in Algorithm 1 as follows:
Algorithm 1: QPCA for multichannel EEG signal processing.
1) For each subject l, four EEG channels, denoted as Ch1
, Ch2
, Ch3
, and Ch4
, are selected. Ch1(t), Ch2(t), Ch3(t), and Ch4(t) are real value measured at sampling point t of the EEG signal under analysis. Let Tw
denote the length of whole time duration of the EEG signals under analysis, and fw
is the sampling frequency. Then the total number of sampling points is Nw
= Tw
× fw
.
2) To reduce the overall computational load, the EEG signals are segmented into short
intervals (duration denoted as Ts
) and the relative band power in these segmented signals are computed. Let Ns
= Tw
/Ts
be the number of segments. The relative band power of the delta band, theta band,
alpha band, and beta band are denoted as Rσ
, Rθ
, Rα
, and Rβ
. They are defined as the spectral power of the specific frequency band (BPσ
, BPθ
, BPα
, and BPβ
) divided by the total spectral power (BP) of full frequency spectrum (1-30 Hz) as shown in Eq. (6). Hence, the EEG signal is transformed into a vector of relative band power, each
with length of Ns
per channel:
3) To analyze the characteristic of alpha band, the relative band power of alpha band,
Rα
Chi
, i ∈ (1, 2, 3, 4) of these four EEG channels are used to construct a quaternion vectors
of length Ns
as in Eq. (7), where ql
∈ H1×Ns
and n ∈ (1, ..., Ns
). Similar procedure is used in for the analysis of other frequency bands.
4) Let the training set contains m subjects. Therefore m quaternion vectors in alpha band ql
∈ H1×Ns
, l = 1, ..., m, are constructed. The mean vector is defined as:
The difference between each quaternion vector and the mean vector is denoted as:
Then, a quaternion matrix $\tilde{Q} = (\tilde{q}_1^T, ..., \tilde{q}_m^T)^T \in \mathbb{H}^{m
\times N_s}$ is formed by concatenating all the centered quaternion vectors.
5) Compute the covariance matrix of $\tilde{Q}$ as:
6) By SVD, the covariance matrix is decomposed as:
U ∈ HNs×Ns
is the singular matrix, whose column vectors are eigenvectors of the covariance matrix
C. The principle components are defined as the first p eigenvectors in which the accumulated sum of corresponding eigenvalues exceeds a
preset threshold. The eigenvectors space is denoted by Up
∈ HNs×p.
7) The training feature is calculated by:
Each row of Y ∈ Hm×p is the training feature corresponding to each subject.
8) For testing each subject, follow Steps 1 to 3 to represent the 4 selected channels
EEG signals by quaternion vectors zl
∈ H1×Ns
and deducting the mean vectors z̃l
= zl
− E(ql). The testing feature F can then be expressed as:
9) As the training feature and testing feature are quaternion vectors, projection
scheme is needed to transform the feature vector from quaternion domain into real
domain for SVM classification. Let q, denoted in Eq. (1), as one of the quaternion value in the feature vector. The mean projection method
is employed here as q̄ = (x+y+w+z)/4. Hence, the quaternion feature matrix is transformed into real valued feature
matrix FReal
and fitted to SVM classifier for detection.
2.3. Simulations
To assess the performance of the proposed QPCA algorithm in multichannel EEG processing,
a series of simulations have been designed and implemented. A public EEG dataset for
AD detection has been utilized. For the purpose of real-time analysis, the data dimension
is significantly reduced through band power feature extraction. The enhanced QPCA
algorithm is then applied, generating classification features for AD detection via
an SVM classifier. General performance metrics such as accuracy, sensitivity, and
specificity are employed for evaluation. Extensive simulations are conducted for all
permutations of any 4-channel combinations of EEG signals. Based on these simulation
results, the properties of QPCA as a measure for brain connectivity analysis are presented.
Meanwhile, the most prominent channels and regions of interest in the brain for AD
classification are identified and analyzed, referencing existing track records. Finally,
the selection of several critical parameters in the enhanced QPCA algorithm, including
the segmentation interval, the projection method, and the number of principal components,
are further studied to optimize the performance. All the simulations were performed
on a computer with the following specifications: Intel Core i7-10700K CPU @ 2.90GHz,
16 GB DDR4 RAM, and Windows 10 Pro 64-bit operating system.
2.3.1 Dataset
The dataset, sourced from Ziaeian Hospital in Tehran, Iran [34], was used for the simulation. This dataset contains EEG signals from 11 participants.
The records of two other participants were excluded due to incomplete information.
Among the selected subjects, five had memory complaints. They were diagnosed through
neurological examination as either normal aging (non-AD) or suffering from mild AD.
The EEG data were recorded using 19 mono-polar channels in the standard 10/20 system
[35], as shown in Fig. 1. The data was preprocessed using EEGLAB [36] following Makoto’s pipeline, which includes several noise filtering steps: (i) high-pass
filtering at 1 Hz to remove slow drifts and low-frequency noise, (ii) notch filtering
to eliminate line noise at 50/60 Hz, (iii) artifact subspace reconstruction (ASR)
to remove transient artifacts, (iv) independent component analysis (ICA) to identify
and remove components related to eye blinks, muscle activity, and other non-neural
artifacts, and (v) low-pass filtering at 40 Hz to remove high-frequency noise. For
each participant, EEG data were recorded during the presentation of an auditory stimulant.
Each session consisted of six trials of 40-second stimuli, interleaved by 20-second
rest intervals in silence. The entire task resulted in 340 seconds of EEG signal.
Fig. 1. Standard 10/20 EEG setup that consists of 19 channels.
2.3.2 Simulation Procedures
From the EEG dataset, the six sessions during auditory stimulation for each participant
are used for classification. The first five sessions are grouped as training samples,
and the remaining one is used as the testing sample. This results in a total of 55
training samples and 11 testing samples. Each sample includes EEG recordings from
19 channels. As the stimuli last for 40 seconds at a 250 Hz sampling frequency, there
are 10,000 sampling points for each channel. In primary experiments for performance
evaluation, the length of each segment is set to one second, and the signal is divided
into 40 segments. This choice is based on a systematic review by Cassani et al. of
EEG studies for AD diagnosis [41], which indicated that the most common EEG epoch duration used in such studies was
1-2 seconds. However, to optimize the performance of our classification model, we
conducted an additional experiment to compare the performance for different segment
lengths. The spectral power of the data is estimated in power spectral method via
Matlab BANDPOWER function. The frequency range is defined as 1Hz to 30 Hz, which contains
the four major frequency bands: the alpha (1 ∼ 4 Hz), theta (4 ∼ 8 Hz), beta (8 ∼
13 Hz), and delta (13 ∼ 30 Hz) bands. As a result, the segmented signal for each channel
is transformed into a data vector with only 40 components of relative band power.
For each permutation of the 4-channel combinations, the enhanced QPCA algorithm is
applied to the training dataset. Initially, a full quaternion representation of the
selected four-channel data vector is constructed for each training sample. Then, a
quaternion data matrix is formed by concatenating all the centered quaternion vectors,
and its dimension equals to 55 (rows of training samples) × 40 (columns of relative
band power). Using the Matlab QTFM toolbox, developed by S. Sangwine and N. L. Bihan
[37], the covariance of the data matrix is computed and decomposed by quaternion SVD.
The leading column vectors of the singular matrix act as the principal components.
For evaluation purpose, the leading 20 principal components will be simulated separately
to search for the highest accuracy. The MEAN projection methods are applied, converting
the principal components from quaternion value to real value for binary classification
of AD by the SVM classifier. The SVM classifier is implemented using the Matlab FITCSVM
function, and the kernel function is set to the LINEAR function. Finally, accuracy,
sensitivity, and specificity are used as statistical measures to evaluate the performance
of the classification. The mathematical expressions of these statistical measures
are given by:
where true positive (TP) is a test result that correctly indicates the presence of
Alzheimer’s disease (AD). In other words, TP is the number of AD patients who are
correctly classified as having AD. True negative (TN) is a test result that correctly
indicates the absence of a condition or characteristic. False positive (FP) is a test
result which wrongly indicates that a particular condition or attribute is present.
False negative (FN) is a test result which wrongly indicates that a particular condition
or attribute is absent. The overall operation flow of the experiment is shown in Fig. 2. Furthermore, a 10-fold cross-validation on the existing dataset is conducted to
validate the generalization of the QPCA algorithm. The cross-validation method is
implemented using the Matlab CROSSVAL function with k=10. The experiments will be
repeated 1000 times to ensure the robustness and reliability of the results.
Fig. 2. Flowchart of the simulation procedures.
2.3.3 Brain connectivity analysis
Investigation of the relationship between EEG channels and early AD remains a hot
topic. Some researchers have proposed selecting channels based on specific ranking
methods. For instance, Puri et al. employed a subband-based energy to entropy ratio
for channel ranking [38], while Perez-Valero et al. recently selected channels based on the statistical analysis
of relative power features and complexity features [39]. However, the performance of these methods is data-dependent and lacks generalization.
On the other hand, Cicalese et al. reported that an increasing number of researchers
are determining the EEG channels in AD-related regions based on medical evidence [40]. As shown in Fig. 3, the right prefrontal area and left parietal area have been identified as being associated
with AD-linked cognitive decline. These regions are implicated in multiple cognitive
processes, such as working memory, attention, and executive function, and their dysfunction
may cause the cognitive deficits observed in AD.
Fig. 3. Diagrammatic representation of the major parts of the brain (lateral view).
In the simulation, all permutations of any 4-channel combinations are implemented
to conduct a comprehensive search for the most critical channels or brain regions
in AD classification. Given that there are 19 channels in the EEG dataset, there are
C4
19 = 3876 trials in QPCA analysis. However, due to the noncommutative property in the
quaternion domain, the permutation of these four channels should be considered as
different inputs. Therefore, the complete search consists of 93024 trials. Compared
with traditional functional connectivity measures, which only have 171 combinations
of pairwise channels for 19 channels, the quaternion representation of brain connectivity
can be used to explore high-dimensional correlations among channels and to generate
a more comprehensive description of brain dynamics. In traditional functional connectivity
measures, a 2D connectivity matrix is constructed to illustrate the measure values
for different pairs of channels. However, it is challenging to present the dynamic
functional connectivity in QPCA, which is a 4D matrix with leading principal components
as the measure value. To illustrate such high-dimensional connectivity relations,
the data dimension should be reduced, and only three channels are used to construct
the quaternion. The same simulation procedure is executed, and only the first principal
component is used to compute the measure value. Consequently, a 3D brain connectivity
matrix is formed to illustrate the brain dynamics under different frequency bands
of EEG. To compute the discrimination power of the QPCA measure value between the
normal subjects (NS) and AD patients, the interclass distance (Dist) is calculated
as follows:
where PCfb
NS
and PCfb
AD
are the measure value in different frequency bands based on principal component of
NS and AD. NNS
and NAD
are the number of samples of NS and AD.
2.3.4 Selection of QPCA Parameters
In the QPCA algorithm, there are three critical parameters: segmentation interval,
projection method, and number of principal components. The selection of these parameters
can significantly affect the discrimination power of features and the overall performance
in AD classification. Therefore, their settings need to be further rationalized and
optimized for better performance. To reduce the computational load, the raw EEG signals
are segmented into non-overlapping short intervals. The duration of these intervals
can affect the temporal resolution. If the duration is too short, real-time implementation
becomes difficult. However, if the duration is too long, there can be a loss of temporal
information. Therefore, selecting a suitable interval duration is critical. In this
study, twelve different durations of segmentation intervals, as shown in Table 1, are tested and evaluated based on classification performance.
Table 1. Segmentation interval settings.
|
|
Simulation settings
|
|
Case No.
|
1
|
2
|
3
|
4
|
5
|
6
|
|
Segment Interval (sec)
|
0.1
|
0.2
|
0.4
|
0.5
|
0.8
|
1
|
|
Length of Data Vector
|
400
|
200
|
100
|
80
|
50
|
40
|
|
Case No.
|
7
|
8
|
9
|
10
|
11
|
12
|
|
Segment Interval (sec)
|
1.25
|
2
|
2.5
|
4
|
5
|
10
|
|
Length of Data Vector
|
32
|
20
|
16
|
10
|
8
|
4
|
The projection method in the QPCA algorithm is used to convert the feature value from
the quaternion domain into the real domain for SVM classification. This conversion
is necessary because the SVM classifier operates in the real domain. The advantages
of this approach include ensuring compatibility with the SVM classifier, preserving
the essential information captured by QPCA, potentially enhancing classification performance,
and facilitating standardization and comparison with other studies. Apart from the
mean projection method, several other projection methods are implemented for comparison.
The absolute projection method calculates the mean of the absolute value of each quaternion
component. The norm projection method computes the square root of the sum of the squares
of each quaternion component. The phase projection method extracts the phase angle
of the quaternion. These methods are summarized in Table 2. As only a few leading principal components are employed for feature extraction,
the selection of the number of principal components is very important. Simulations
based on different numbers of principal components (from 1 to 40) will be conducted
to investigate their performance in AD classification. Additionally, the comparison
results will be analyzed to develop a better selection scheme. All of the above simulations
are conducted on the same channel set, which has the highest accuracy in the alpha
band. The performance metric is solely dependent on classification accuracy.
Table 2. Summary of projection methods for simulations.
|
|
Projection method
|
Implementation $q = w+xi +yj +zk$
|
|
1
|
Mean
|
$\bar{q} = (w+x+y+z)/4$
|
|
2
|
Absolute
|
$|\bar{q}| = (|w|+|x|+|y|+|z|)/4$
|
|
3
|
Norm
|
$\|q\| = \sqrt{w^2 + x^2 + y^2 + z^2}$
|
|
4
|
Phase
|
$\theta = \tan^{-1} \sqrt{x^2 + y^2 + z^2}/w$
|