
1. Department of Computer Engineering, Kwangwoon University, Seoul, Korea (jw03070@naver.com, zhsjzhsj@gmail.com, byeng3@kw.ac.kr, hjn040281@gmail.com, parkcheolsoo@kw.ac.kr)



Keywords: Sleep stages, Automatic classification, EEG, 1D-CNN, bi-LSTM

1. Introduction

Although sleep is a very important factor in our lives [1], many people suffer from sleep disorders such as insomnia, narcolepsy, and sleep apnea [2]. They therefore visit hospitals or sleep centers to have their sleep quality tested and evaluated. To diagnose sleep disorders, sleep stages must be classified and analyzed so that sleep quality can be estimated [3].

According to the American Academy of Sleep Medicine (AASM) standard [4], there are five stages: the wakefulness stage (W), the REM stage (REM), and the non-REM sleep stages (N1-N3). These sleep stages are determined from measured polysomnography (PSG) signals, including electroencephalography (EEG), electrocardiography (ECG), electrooculography (EOG), and electromyography (EMG). PSG signals are divided into 30-second segments, called epochs, and a sleep stage is assigned to each epoch.

So far, sleep experts have manually classified the five sleep stages from PSG signals based on the AASM standard. This manual approach takes a long time and is labor-intensive. In addition, it produces unstable and inconsistent results because it depends on the subjective judgment of the sleep expert [5]. To solve these problems, automatic sleep staging models using machine learning and deep learning approaches have been continuously proposed in several studies [6-10]. In this study, we propose a state-of-the-art model with higher performance than previous studies. We employed an end-to-end model using deep neural networks, namely a one-dimensional convolutional neural network (1D-CNN) with an InceptionTime [11] module and a bidirectional long short-term memory (bi-LSTM) [12].

2. The Proposed Method

In this study, we propose a model without preprocessing to reduce learning time and generalize the model. Fig. 1 illustrates the proposed model. An EEG epoch's features are extracted using inception module layers inspired by InceptionTime [11] together with an ensemble method; ensemble learning produces high classification performance and decreases overfitting [13]. Additionally, we used a bidirectional LSTM to teach the model the stage-transition rules that sleep experts rely on in manual sleep scoring.

We performed a two-step training process with different numbers of learning epochs, inspired by DeepSleepNet [8]. Since our CNN layers for feature extraction have higher complexity than the bidirectional LSTM layers, they do not need many learning epochs. The two-step training process therefore prevents overfitting of the CNN part of our model.

Fig. 1. Model architecture consisting of two-step training process.

2.1 Epoch Feature Extraction

In this study, we used a 1D-CNN-based InceptionTime module to extract each epoch's features. InceptionTime is a state-of-the-art model for time series classification proposed by Ismail Fawaz et al. [11], based on the Inception Network [14] designed for image classification tasks. It adapts the receptive field concept to time series data to tune the filter sizes of the CNN [15]. A large receptive field detects large patterns better, and a small receptive field detects smaller patterns better. Therefore, instead of using the same filter size for all CNN layers, various filter sizes were used.
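To make the filter-size argument concrete, the following sketch applies the standard receptive-field arithmetic for stacked 1D convolutions (a textbook formula, not code from the paper): each layer widens the receptive field by its kernel size minus one, scaled by the product of the preceding strides.

```python
def receptive_field(layers):
    """Receptive field (in input samples) of stacked 1D convolutions.

    layers: list of (kernel_size, stride) pairs, in order of application.
    """
    rf, jump = 1, 1  # jump = spacing of adjacent outputs, in input samples
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf

# With stride 1 throughout, a larger kernel directly widens the window
# each output sees over the 100-Hz EEG signal.
print(receptive_field([(10, 1)]))           # narrow window for small patterns
print(receptive_field([(40, 1), (40, 1)]))  # wide window for large patterns
```

Stacking two 40-sample kernels already covers 79 input samples, which is why mixing kernel sizes lets one module detect patterns at several scales at once.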

The convolutional layers in the Inception module each have 32 filters and a stride of 1. Features passed through the bottleneck layer enter convolutional layers with filter sizes of 10, 20, and 40. In addition, a feature that has passed through a max pooling layer enters a convolutional layer with a filter size of 1, and the resulting four feature maps are concatenated.
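As a shape-level sketch of this module (plain NumPy with random, untrained weights; the pool size of 3 is an assumption, not stated in the paper), the four branches and their concatenation look like:

```python
import numpy as np

def conv1d_same(x, n_filters, kernel_size, rng):
    """Stride-1, 'same'-padded 1D convolution; x has shape (length, channels)."""
    length, channels = x.shape
    w = rng.standard_normal((kernel_size, channels, n_filters)) * 0.01
    pad_left = kernel_size // 2
    xp = np.pad(x, ((pad_left, kernel_size - 1 - pad_left), (0, 0)))
    return np.stack([(xp[i:i + kernel_size, :, None] * w).sum(axis=(0, 1))
                     for i in range(length)])

def inception_module(x, rng, n_filters=32):
    """Bottleneck -> parallel convs (sizes 10/20/40) plus a max-pool branch
    with a filter-size-1 conv; the four outputs are concatenated."""
    bottleneck = conv1d_same(x, n_filters, 1, rng)
    branches = [conv1d_same(bottleneck, n_filters, k, rng) for k in (10, 20, 40)]
    # max-pool branch: pool size 3 (assumed), stride 1, 'same' padding
    xp = np.pad(x, ((1, 1), (0, 0)), constant_values=-np.inf)
    pooled = np.stack([xp[i:i + 3].max(axis=0) for i in range(x.shape[0])])
    branches.append(conv1d_same(pooled, n_filters, 1, rng))
    return np.concatenate(branches, axis=1)  # 4 branches -> 4 * n_filters channels

rng = np.random.default_rng(0)
epoch = rng.standard_normal((3000, 1))  # one 30-second EEG epoch at 100 Hz
out = inception_module(epoch, rng)
print(out.shape)  # (3000, 128)
```

The stride-1, 'same'-padded convolutions preserve the 3000-sample epoch length, so the module changes only the channel dimension (1 in, 4 x 32 = 128 out).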

2.2 Epoch Sequence Learning

We used three bidirectional LSTM layers to learn the correlation between features of epochs. According to the AASM standard [4], if sleep spindles or a K-complex appear in a record that otherwise meets the requirements of the N1 stage (low amplitude and mixed frequency), both the preceding and following epochs are scored as N2. To learn this rule, the proposed model treats 10 epochs as one sentence when classifying the N1 and N2 stages. A bidirectional LSTM merges two LSTMs that process the sequence forward and backward, so both past and future information can be learned.
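To make the bidirectional mechanism concrete, here is a minimal NumPy sketch (random weights, illustrative only, not the authors' implementation): a forward and a backward LSTM run over one 10-epoch "sentence" of feature vectors, and their hidden states are concatenated so every position carries both past and future context.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm(xs, Wx, Wh, b):
    """Single LSTM over xs of shape (T, d_in); returns hidden states (T, d_h)."""
    d_h = Wh.shape[0]
    h, c, hs = np.zeros(d_h), np.zeros(d_h), []
    for x in xs:
        z = x @ Wx + h @ Wh + b          # all four gates in one affine map
        i, f, g, o = np.split(z, 4)
        i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
        c = f * c + i * g                # cell state carries long-range memory
        h = o * np.tanh(c)
        hs.append(h)
    return np.stack(hs)

def bi_lstm(xs, fw_params, bw_params):
    h_fw = lstm(xs, *fw_params)                   # left-to-right: past context
    h_bw = lstm(xs[::-1], *bw_params)[::-1]       # right-to-left, re-aligned: future context
    return np.concatenate([h_fw, h_bw], axis=1)

rng = np.random.default_rng(0)
d_in, d_h, T = 128, 16, 10   # one "sentence" of 10 epoch-feature vectors
make_params = lambda: (rng.standard_normal((d_in, 4 * d_h)) * 0.1,
                       rng.standard_normal((d_h, 4 * d_h)) * 0.1,
                       np.zeros(4 * d_h))
out = bi_lstm(rng.standard_normal((T, d_in)), make_params(), make_params())
print(out.shape)  # (10, 32)
```

Because the backward pass is reversed again before concatenation, row t of the output combines everything before epoch t (forward state) with everything after it (backward state), which is exactly what the N2 scoring rule requires.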

2.3 Model Hyperparameters

There are several hyperparameters in this study, as shown in Table 1. We selected them manually by trial and error. The number of learning epochs in the first step is smaller than in the second step because the complexity of the feature extraction part is higher than that of the epoch sequence learning part. The batch size is 80; however, to connect epoch feature extraction with epoch sequence learning, the batch is reshaped into (8, 10) tuples instead of 80 scalar values. Thus, the first and second axes of the input shape are batch axes, the third axis is the data length, and the fourth axis is the channel. Four random seeds are used for ensemble learning.

Table 1. Model hyperparameters.

Hyperparameter          Value
----------------------  -------------------
Optimizer               Adam
Learning rate           0.0001
Batch size              80
First learning epoch    10
Second learning epoch   50
Weight initializer      GlorotUniform
Random seeds            0, 777, 1234, 1479
Input shape             (8, 10, 3000, 1)
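A minimal sketch of the batch reshaping described above, producing the (8, 10, 3000, 1) input shape from Table 1 (the data here is random, for illustration only):

```python
import numpy as np

# 80 thirty-second epochs sampled at 100 Hz (3000 points), one EEG channel
batch = np.random.default_rng(0).standard_normal((80, 3000, 1))

# Group consecutive epochs into 8 "sentences" of 10 epochs each, so the
# bidirectional LSTM can learn transitions between neighboring epochs.
sequences = batch.reshape(8, 10, 3000, 1)
print(sequences.shape)  # (8, 10, 3000, 1)
```

NumPy's row-major reshape keeps the epoch order intact, so each sentence contains 10 consecutive epochs from the original batch.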

3. Performance Evaluation

3.1 Dataset

We employed the Sleep-EDF dataset [16], which has two PSG records per participant and is divided into two participant groups. The SC group consists of participants who did not take sleep-related drugs, and the ST group consists of participants who took temazepam to study its effects as a treatment for insomnia. Each PSG record contains EEG, EOG, and chin EMG signals, among which the EEG signals have Fpz-Cz and Pz-Oz channels.

The EOG and EEG signals in PSG are each sampled at 100 Hz, and each 30-second epoch has a sleep stage label. These records were classified and labeled into six stages by sleep experts based on the R&K standard [17]. We merged the N3 and N4 stages into a single N3 stage to meet the five-stage AASM standard. To use signals from healthy participants, we used only the PSG records of the SC group and the Fpz-Cz channel of the EEG signals. We used 20 participants from the SC group; since participant 13 has only one record due to disk loss, we used a total of 39 PSG records.

3.2 Performance Evaluation

To evaluate the generalization performance, we used k-fold cross validation [18]. We performed leave-one-patient-out cross validation so that PSG records of the same subject are never mixed between the training dataset and the test dataset. This method evaluates whether the model is practical in a real-world application.
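A plain-Python sketch of this leave-one-subject-out splitting (record names below are placeholders; the grouping by subject is the point):

```python
def leave_one_subject_out(records):
    """records: list of (subject_id, record) pairs.

    Yields one (held_out_subject, train, test) split per subject, so that no
    subject's records appear in both the training and the test set.
    """
    for held_out in sorted({sid for sid, _ in records}):
        train = [rec for sid, rec in records if sid != held_out]
        test = [rec for sid, rec in records if sid == held_out]
        yield held_out, train, test

# 20 SC subjects with 2 records each, except subject 13 with one -> 39 records
records = [(sid, f"SC-subject{sid:02d}-night{night}")
           for sid in range(20) for night in (1, 2)
           if not (sid == 13 and night == 2)]
splits = list(leave_one_subject_out(records))
print(len(records), len(splits))  # 39 20
```

With 20 subjects this yields the 20 folds referenced in Section 3.2, and each fold's test set holds the one or two nights of a single subject.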

We evaluated our model performance using five metrics: overall accuracy (ACC), per-class recall (RE), per-class precision (PRE), per-class F1 score (F1), and macro-averaged F1 score (MF1). Eqs. (1)-(5) show the calculation of each metric:

(1)
$ACC=\frac{TP+TN}{TP+FP+TN+FN}$
(2)
$RE=\frac{TP}{TP+FN}$
(3)
$PRE=\frac{TP}{TP+FP}$
(4)
$F1=2\times \frac{PRE\times RE}{PRE+RE}$
(5)
$MF1=\frac{1}{C}\sum_{c=1}^{C}F1_{c}$

where C is the number of classes, and TP, TN, FN, and FP are true positives, true negatives, false negatives, and false positives, respectively. Table 2 shows the confusion matrix over all sleep stages, and Table 3 shows the performance metrics for each sleep stage obtained through 20-fold cross validation. We averaged the F1 scores over the sleep stages and calculated the overall accuracy: the macro-averaged F1 score was 79.05%, and the accuracy was 85.05%.
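These metrics can be reproduced from the confusion matrix in Table 2 (rows are the true stages, columns the predictions); a short NumPy check:

```python
import numpy as np

# Table 2: rows = true stage (W, N1, N2, N3, REM), columns = predicted stage
cm = np.array([[7346,  507,   123,   25,  156],
               [ 524, 1158,   641,   12,  469],
               [ 437,  384, 15686,  690,  602],
               [  36,    3,   556, 5108,    0],
               [ 215,  201,   721,    6, 6574]])

acc = np.trace(cm) / cm.sum()                        # overall accuracy, Eq. (1)
recall = np.diag(cm) / cm.sum(axis=1)                # Eq. (2), per class
precision = np.diag(cm) / cm.sum(axis=0)             # Eq. (3), per class
f1 = 2 * precision * recall / (precision + recall)   # Eq. (4)
mf1 = f1.mean()                                      # Eq. (5)

print(f"ACC = {100 * acc:.2f}%")  # 85.05%
print(f"MF1 = {100 * mf1:.2f}%")
```

The recomputed accuracy and per-class recalls match the reported values; the macro-F1 differs from the reported 79.05% only at the second decimal, consistent with rounding of the per-class scores.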

Table 2. Confusion matrix for the five sleep stages (rows: true stage, columns: predicted stage).

True \ Predicted      W     N1     N2     N3    REM
W                  7346    507    123     25    156
N1                  524   1158    641     12    469
N2                  437    384  15686    690    602
N3                   36      3    556   5108      0
REM                 215    201    721      6   6574

Table 3. Per-class performance: recall, precision, and F1 score.

            Wake     N1     N2     N3    REM
RE (%)     90.06  41.30  88.13  89.87  85.19
PRE (%)    85.83  51.40  88.49  87.45  84.27
F1 (%)     87.90  45.80  88.85  88.50  84.73

3.3 Benchmark Test

Table 4 shows the performance metrics of other studies using the same dataset and a single EEG channel. Compared with these, our proposed model achieves the highest overall accuracy and macro-averaged F1 score, indicating better generalization performance.

Table 4. Comparison of the proposed model with other approaches.

                       Overall metrics     Per-class F1 score (%)
Study                  ACC (%)  MF1 (%)     W    N1    N2    N3   REM
Tsinalis et al. [19]      78.9     73.7  71.6  47.0  84.6  84.0  81.4
IITNet [20]               84.0     77.7  87.9  44.7  88.0  85.7  82.1
DeepSleepNet [8]          82.0     76.9  84.7  46.6  85.9  84.8  82.4
Zhu et al. [10]           82.8     77.8  90.3  47.1  86.0  82.1  83.2
Eldele et al. [21]        84.4     78.1  89.7  42.6  88.8  90.2  79.0
Proposed                  85.1     79.1  87.9  45.8  88.9  88.5  84.7

4. Conclusion

Our model for automatic sleep staging was designed using an ensemble method and two-step learning based on a 1D-CNN and a bidirectional LSTM. These techniques yielded the highest classification performance among the compared single-channel models. Since our model requires no preprocessing, it is lighter and more generalizable than other deep learning sleep stage classification models. However, due to the small number of N1 epochs, the N1 classification performance is still weak. We will therefore improve our algorithm to address the sleep stage imbalance problem in future work.

ACKNOWLEDGMENTS

This research was supported by the MSIT (Ministry of Science and ICT) under the National Program for Excellence in SW (2017-0-00096), supervised by the IITP (Institute for Information & communications Technology Promotion).

REFERENCES

[1] Mukherjee S., Patel S. R., Kales S. N., Ayas N. T., Strohl K. P., Gozal D., Malhotra A., 2015, An official American Thoracic Society statement: the importance of healthy sleep. Recommendations and future priorities, American Journal of Respiratory and Critical Care Medicine, Vol. 191, No. 12.
[2] Chokroverty S., 2010, Overview of sleep & sleep disorders, Indian J. Med. Res., Vol. 131, No. 2, pp. 126-140.
[3] Krystal A. D., Edinger J. D., 2008, Measuring sleep quality, Sleep Med., Vol. 9, Suppl. 1, pp. 10-17.
[4] Berry R. B., et al., 2017, AASM scoring manual updates for 2017 (version 2.4), J. Clin. Sleep Med., Vol. 13, No. 5, pp. 665-666.
[5] Whitney C. W., et al., 1998, Reliability of scoring respiratory disturbance indices and sleep staging, Sleep, Vol. 21, No. 7, pp. 749-757.
[6] Fraiwan L., Lweesy K., Khasawneh N., Wenz H., Dickhaus H., 2012, Automated sleep stage identification system based on time-frequency analysis of a single EEG channel and random forest classifier, Comput. Methods Programs Biomed., Vol. 108, No. 1, pp. 10-19.
[7] Shen X., Fan Y., 2012, Sleep stage classification based on EEG signals by using improved Hilbert-Huang transform, Appl. Mech. Mater., Vol. 138-139, pp. 1096-1101.
[8] Supratak A., Dong H., Wu C., Guo Y., 2017, DeepSleepNet: A model for automatic sleep stage scoring based on raw single-channel EEG, IEEE Trans. Neural Syst. Rehabil. Eng., Vol. 25, No. 11, pp. 1998-2008.
[9] Phan H., Andreotti F., Cooray N., Chen O. Y., De Vos M., 2018, DNN filter bank improves 1-max pooling CNN for single-channel EEG automatic sleep stage classification, Proc. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. (EMBC), pp. 453-456.
[10] Zhu T., Luo W., Yu F., 2020, Convolution- and attention-based neural network for automated sleep stage classification, Int. J. Environ. Res. Public Health, Vol. 17, No. 11, pp. 1-13.
[11] Ismail Fawaz H., et al., 2020, InceptionTime: Finding AlexNet for time series classification, Data Min. Knowl. Discov., Vol. 34, No. 6, pp. 1936-1962.
[12] Graves A., Mohamed A., Hinton G., 2013, Speech recognition with deep recurrent neural networks, Proc. IEEE Int. Conf. Acoust. Speech Signal Process. (ICASSP), pp. 6645-6649.
[13] Dietterich T. G., 2002, Ensemble learning, The Handbook of Brain Theory and Neural Networks, 2nd ed., pp. 110-125.
[14] Szegedy C., et al., 2015, Going deeper with convolutions, Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 1-9.
[15] Luo W., Li Y., Urtasun R., Zemel R., 2016, Understanding the effective receptive field in deep convolutional neural networks, Adv. Neural Inf. Process. Syst., pp. 4905-4913.
[16] Goldberger A. L., Amaral L. A., Glass L., Hausdorff J. M., Ivanov P. C., Mark R. G., Stanley H. E., 2000, PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals, Circulation, Vol. 101, No. 23.
[17] Hori T., et al., 2001, Proposed supplements and amendments to 'A Manual of Standardized Terminology, Techniques and Scoring System for Sleep Stages of Human Subjects', the Rechtschaffen & Kales (1968) standard, Psychiatry Clin. Neurosci., Vol. 55, No. 3, pp. 305-310.
[18] Stone M., 1974, Cross-validatory choice and assessment of statistical predictions, J. R. Stat. Soc. Ser. B, Vol. 36, No. 2, pp. 111-133.
[19] Tsinalis O., Matthews P. M., Guo Y., 2016, Automatic sleep stage scoring using time-frequency analysis and stacked sparse autoencoders, Ann. Biomed. Eng., Vol. 44, No. 5, pp. 1587-1597.
[20] Seo H., Back S., Lee S., Park D., Kim T., Lee K., 2020, Intra- and inter-epoch temporal context network (IITNet) using sub-epoch features for automatic sleep scoring on raw single-channel EEG, Biomed. Signal Process. Control, Vol. 61.
[21] Eldele E., et al., 2021, An attention-based deep learning approach for sleep stage classification with single-channel EEG, IEEE Trans. Neural Syst. Rehabil. Eng., Vol. 29, pp. 809-818.

Author

JaeWoo Baek

JaeWoo Baek received his B.S. degree in computer engineering from Kwangwoon University in Seoul, South Korea. His research interests include biological signal processing, machine learning, deep learning, and reinforcement learning.

SuHwan Baek

SuHwan Baek received his B.S. degree in computer engineering from Kwangwoon University in Seoul, South Korea. His research interests include overall Medical AI and Auto ML (ENAS). He is also attracted to reinforcement learning and generative models.

Hyunsu Yu

Hyunsu Yu received his BS degree in robotics engineering from Kwangwoon University in Seoul, South Korea. His research interests include experimental setting, signal processing, machine learning, and artificial intelligence.

Junghwan Lee

Junghwan Lee is in the MSc Program at the Bio Computing & Machine Learning Laboratory (BCML) in the Department of Computer Engineering at Kwangwoon University, Seoul, Republic of Korea. His research interests include machine learning and deep learning algorithms.

Cheolsoo Park

Cheolsoo Park is an associate professor in the Computer Engineering Department at Kwangwoon University, Seoul, South Korea. He received a B.Eng. in Electrical Engineering from Sogang University, Seoul, and an MSc from the Biomedical Engineering Department at Seoul National University, South Korea. In 2012, he received his PhD in Adaptive Nonlinear Signal Processing from Imperial College London, U.K., and worked as a postdoctoral researcher in the Bioengineering Department at the University of California, San Diego, U.S.A. His research interests are mainly in the areas of machine learning and adaptive and statistical signal processing, with applications in brain-computer interfaces, computational neuroscience, and wearable technology.