Junghwan Lee1, Heesang Eom2, Yuli Sun Hariyani2, Cheonjung Kim3, Yongkyoung Yoo4, Jeonghoon Lee3*, and Cheolsoo Park2*

1 Department of Information Convergence, Kwangwoon University, Seoul, Korea (hjn040281@gmail.com)
2 Department of Computer Engineering, Kwangwoon University, Seoul, Korea (9200heesang@gmail.com, yulisun@telkomuniversity.ac.id)
3 Department of Electrical Engineering, Kwangwoon University, Seoul, Korea (kimrja123@naver.com)
4 Department of Electronic Engineering, Catholic Kwandong University, 24, Beomil-ro 579beongil, Gangneung-si, Gangwon-do 25601, Korea (yongkyoung0108@gmail.com)
Copyright © The Institute of Electronics and Information Engineers(IEIE)
1. Introduction
Influenza is one of the main causes of respiratory disease worldwide [1]. Although the influenza virus may cause only a minor infection in healthy adults, it can cause severe acute respiratory disease in high-risk groups, such as chronic patients, infants, and the elderly, posing a life-threatening risk. In the United States, an estimated 30,000 people died during an influenza pandemic [2]. Hence, early diagnosis of influenza is crucial.
Early examination of influenza has been attempted in various ways. The present study aimed to detect the influenza virus efficiently before symptoms develop. Influenza produces more severe symptoms than the common cold and increases the frequency of dangerous complications that require proper rest and treatment [3]. Furthermore, early diagnosis is very important because appropriate management differs slightly depending on whether the virus is type A or B [4]. Influenza diagnosis therefore requires a detailed classification rather than a binary decision, and this study demonstrates the possibility of such a classification using deep learning.
In this study, influenza was classified by processing detection kit images with deep neural networks. A two-dimensional convolutional neural network (2D CNN) was designed for the classification task, and its structure was fine-tuned for the influenza kit images using an optimization tool, Bayesian optimization hyperband (BOHB) [5]. The performance of the designed model was assessed by calculating the accuracy, precision, and recall. The remainder of this paper is organized as follows. Section 2 reviews the background, and Section 3 presents the experiment. The results are reported in Section 4.
2. Background
This section describes the proposed CNN Model and BOHB method.
2.1 CNN Model
LeCun et al. proposed the fundamental structure of the convolutional neural network [6], an algorithm used widely to classify images, sounds, text, and videos. The CNN has improved performance in recognizing image objects and finding patterns because it extracts features directly from the data [7].
Unlike a general neural network, the CNN assumes that the input data is an image and can therefore encode its characteristics (horizontal and vertical structure and color channels) with learnable weights and biases [8].
Each neuron performs an internal linear operation followed by a chosen non-linear operation. The network as a whole has a single score function and a loss function on the last layer, so the same techniques used for general neural network training can be applied.
The convolutional (CONV) layer is the key component of a CNN. Each neuron of the CONV layer is connected to a local region of the input image, a weighted sum over that region is computed, and the output of the CONV layer is interpreted as neurons arranged in three dimensions.
ReLU (Rectified Linear Unit) is an activation layer that applies the following function to the output of the CONV layer:

f(x) = max(0, x)    (1)

ReLU is applied as the activation function because it mitigates the vanishing-gradient problem better than the sigmoid function.
The pooling (POOL) layer reduces the spatial size, and thus the number of parameters and computations in the network, and is inserted periodically between CONV layers in the CNN structure. This layer downsamples each depth slice independently and helps control overfitting.
The neurons in the fully connected (FC) layer are linked to all the activations in the previous layer. Hence, the FC layer computes a matrix multiplication of its input activations and adds a bias.
Fig. 1 presents the CNN model.
Fig. 1. Unilateral CNN Model Structure.
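The CONV, ReLU, POOL, and FC layers described above can be sketched as a small network for 128 × 128 × 3 kit images. The following PyTorch fragment is a minimal illustration, not the authors' exact architecture; the channel counts and layer depth are assumptions.

```python
import torch
import torch.nn as nn

class KitCNN(nn.Module):
    """Minimal CONV -> ReLU -> POOL -> FC stack for 128x128 RGB images."""
    def __init__(self, num_classes: int = 7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # CONV: local weighted sums
            nn.ReLU(),                                   # non-linearity f(x) = max(0, x)
            nn.MaxPool2d(2),                             # POOL: downsample 128 -> 64
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 64 -> 32
        )
        self.classifier = nn.Linear(32 * 32 * 32, num_classes)  # FC: one score per class

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = KitCNN()
scores = model(torch.randn(1, 3, 128, 128))
print(scores.shape)  # torch.Size([1, 7])
```

The final FC layer produces one score per class, to which a cross-entropy loss is applied during training, as in a general neural network.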
2.2 Bayesian Optimization and Hyperband
BOHB is a combination of the Bayesian optimization method and hyperband. The algorithm produces candidate hyperparameter sets with Bayesian optimization, selects among them, and runs hyperband to find the best hyperparameters.
The performance of a deep learning algorithm depends greatly on the configuration of its hyperparameters. Bayesian optimization is used to reduce unnecessary repetitive exploration of the hyperparameter space and thus find the optimal hyperparameters faster.
The hyperband algorithm first samples arbitrary configurations within a given search range. The predictive performance after training with each configuration is compared, the low-performing configurations are removed, and the remaining ones are trained further and compared again. This process of evaluation and elimination is repeated until the last configuration remains, which is taken as the optimum.
Hyperparameter estimation is faster with BOHB than with Bayesian optimization alone because BOHB combines the hyperband and Bayesian optimization methods using stochastic estimation information [9].
In this experiment, three hyperparameters were explored with BOHB: the learning rate, the number of epochs, and the weight decay. Such a method is preferable for estimating the hyperparameters of a deep learning algorithm because there are more parameters to decide than in traditional machine learning algorithms.
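The elimination procedure described above can be sketched as a successive-halving loop, the core of hyperband. This toy version scores random learning-rate candidates with a stand-in objective; in the real BOHB, candidates are proposed by a Bayesian model and the score is validation performance after training with the given budget.

```python
import random

def objective(lr: float, budget: int) -> float:
    """Stand-in for 'train for `budget` epochs and return validation score'.
    Pretends lr = 0.005 is optimal; higher return value is better."""
    return -abs(lr - 0.005) * (1 + 1.0 / budget)

def successive_halving(n_candidates: int = 8, min_budget: int = 1) -> float:
    # 1) sample arbitrary configurations within the search range (Table 4 range)
    candidates = [random.uniform(0.001, 0.01) for _ in range(n_candidates)]
    budget = min_budget
    # 2) evaluate, drop the worst half, double the budget, and repeat
    while len(candidates) > 1:
        scores = {lr: objective(lr, budget) for lr in candidates}
        candidates = sorted(candidates, key=scores.get, reverse=True)[: len(candidates) // 2]
        budget *= 2
    return candidates[0]  # the surviving configuration is taken as the optimum

random.seed(0)
print(successive_halving())
```

The key resource trade-off is visible here: many configurations get a small budget, and only survivors earn larger budgets.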
2.3 Influenza Kit
Influenza A and B antigens can be detected qualitatively and rapidly in nasal
swab specimens using an ulti med Influenza A/B Antigen Test, a chromatographic immunoassay.
The test can provide a rapid differential diagnosis of influenza A and B viral infections
[10].
Influenza (commonly known as ‘flu’) is a highly contagious, acute viral infection
of the respiratory tract that can be transmitted easily by coughing and the sneezing
of aerosolized droplets containing live virus. Influenza outbreaks occur each year,
generally during the fall and winter months. Type A viruses are typically more prevalent than type B viruses and are associated with the most serious influenza epidemics, whereas type B infections are usually milder. The Influenza A/B Antigen Test
detects the presence of the Influenza A or B antigen or both qualitatively in nasal
swab specimens, providing results within 15 minutes. The test uses antibodies specific
to Influenza A and B to detect the Influenza A and B antigen selectively in nasal
swab specimens [10].
The Influenza A/B Antigen Test is a qualitative, lateral-flow immunoassay that detects Influenza A and B nucleoproteins in nasal swab specimens. In this test, antibodies specific to the Influenza A and B nucleoproteins are coated separately on the test-line regions of the test cassette. During testing, the extracted specimen migrates along the membrane and reacts with the coated antibodies to Influenza A or B or both. A positive outcome is indicated by one or two colored lines in the test regions. As a procedural control, a colored line always appears in the control region to show that the test has been performed properly [11].
3. Materials and Methods
This section describes the data acquisition, augmentation, and preprocessing methods, as well as the influenza image classification. Fig. 2 presents the experimental procedure.
Fig. 2. Experimental procedure.
3.1 Data Acquisition
In this experiment, the designed deep learning model trains the patterns of the
influenza detection kit images. One hundred images of six class influenza types were
used, which included ‘None’ and low, medium, and high concentrations of ‘Type A’ and
‘Type B’ influenza.
Types A and B influenza image data were the sample data produced by diluting
them with a mixture of water, borate-based tween, and bovine serum albumin at the
same rate, as listed in Table 2. The data were extracted from an influenza detection kit, and the deep learning model
was trained using the corresponding images.
Table 1. Influenza image dataset.

Class name (Influenza type)   Number of data
NC (None type)                100
A type (Low)                   15
A type (Mid)                   15
A type (High)                  15
B type (Low)                   15
B type (Mid)                   15
B type (High)                  15
Table 2. Influenza dilution ratio.

Class name (Influenza type)   Dilution
A type (Low)                  1/625,000
A type (Mid)                  1/75,000
A type (High)                 1/25,000
B type (Low)                  1/160,000
B type (Mid)                  1/20,000
B type (High)                 1/5,000
3.2 Data Augmentation
The performance of the designed CNN model was affected strongly by the quality
of the image data. Even if the number of image data is sufficient, overfitting can
occur when a model is taught with a set of data with a high or low percentage of data
in a particular label, i.e., an imbalanced dataset [11].
The dataset in the present study was also unbalanced between the number of ‘None’
and other classes. Hence, data were generated using image aggregation in the remaining
classes except for the ‘None’ type class [5].
Geometric variations were used for data augmentation [12]. To set a balance between the images of ‘None’ and the other types, the data were
generated by randomly adjusting the image rotation (between 0$^{\circ}$ and 30$^{\circ}$)
and brightness level to the remaining data. Data aggregation was carried out by randomly
setting the rotation and bright levels of the images, as shown in Table 3.
Table 3. Augmentation image dataset.

Class name (Influenza type)   Number of data
NC (None type)                100
A type (Low)                   90
A type (Mid)                   90
A type (High)                  90
B type (Low)                   90
B type (Mid)                   90
B type (High)                  90
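The random rotation (0° to 30°) and brightness adjustment used for augmentation can be sketched with Pillow as below. The brightness range here is an assumption, since the paper does not state one, and `expand_class` simply shows how 15 source images can be grown to the 90 per class listed in Table 3.

```python
import random
from PIL import Image, ImageEnhance

def augment(img: Image.Image) -> Image.Image:
    """Apply one random rotation (0-30 degrees) and brightness change."""
    angle = random.uniform(0, 30)
    out = img.rotate(angle, expand=False)   # geometric variation, size preserved
    factor = random.uniform(0.8, 1.2)       # assumed brightness range
    return ImageEnhance.Brightness(out).enhance(factor)

def expand_class(images, target: int = 90):
    """Grow a class (e.g., 15 images) to `target` images by repeated augmentation."""
    out = list(images)
    while len(out) < target:
        out.append(augment(random.choice(images)))
    return out

base = [Image.new("RGB", (128, 128), "white") for _ in range(15)]
print(len(expand_class(base)))  # 90
```

Only the minority classes are expanded, which matches the balancing strategy described above.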
3.3 Image Classification
Image classification of the influenza detection kits was conducted using an optimized
deep learning algorithm, which is explained in Section 2. The model used in this experiment
was constructed, as shown in Fig. 4.
Approximately 70% of the total data (427) were used as training data, with the
remaining 30% (213) used as test data. The input images were resized to 128${\times}$128.
The RGB values were separated, and the convolution operation was conducted. The SGD
optimizer was used to update the weights of the networks. The decay rate prevents
overfitting. As mentioned earlier, the deep learning model attempted to find better
hyperparameters using the BOHB. The BOHB algorithm was used to search for the optimal
parameters (learning rate, number of epochs, and weight decay) to fine-tune the image
classification model, as shown in Table 4.
Fig. 3. Influenza image data example.
Fig. 4. Architecture of the 2D CNN.
Table 4. Model parameters.

Hyperparameter        Value
Optimization
  Learning rate       Optimal value (0.001 ~ 0.01)
  Number of epochs    Optimal value (100 ~ 1000)
  Batch size          32
  Optimizer           SGD
  Weight decay        Optimal value (0.001 ~ 0.01)
Model
  Input shape         128 × 128 × 3
  Number of classes   7
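The training setup of Table 4 (SGD with weight decay, batch size 32) can be sketched as follows. The model here is a placeholder linear classifier rather than the paper's 2D CNN, the data are random stand-ins, and the learning rate and weight decay are simply values inside the BOHB search ranges.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# stand-in data: one batch of 32 flattened 128x128 RGB images, 7 classes
x = torch.randn(32, 128 * 128 * 3)
y = torch.randint(0, 7, (32,))

model = nn.Linear(128 * 128 * 3, 7)   # placeholder for the paper's 2D CNN
loss_fn = nn.CrossEntropyLoss()
# SGD with weight decay as in Table 4; values chosen inside the search ranges
optimizer = torch.optim.SGD(model.parameters(), lr=0.005, weight_decay=0.003)

losses = []
for epoch in range(5):                # the real search used 100-1000 epochs
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

In PyTorch, passing `weight_decay` to the optimizer adds an L2 penalty to every update, which is the overfitting control described in Section 3.3.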
4. Results
The classification performance was measured using three metrics: average accuracy
(ACC), recall (TPR), and precision (PPV) for each class. Each metric was calculated
using Eqs. (2)-(4).
TP and TN are the true positives and true negatives. Similarly, the FP and FN
are the false positives and false negatives. Table 5 lists the three best models in this experiment among the models tested. Table 6 presents the results of the three cases. Table 7 lists the confusion matrices of the three cases. Overall, the Case 1 Model showed
the highest performance, as shown in Tables 6 and 7.
Table 5. Three best models.

Model     Learning rate   Weight decay   Epochs
Case 1    0.00597168      0.00303567     832
Case 2    0.00852943      0.00101431     340
Case 3    0.00227625      0.00316404     500
|
Table 6. Details of the experiment results.

Model     ACC (%)   TPR (%)   PPV (%)
Case 1    90.14     90.71     90.00
Case 2    88.73     89.14     88.57
Case 3    82.40     87.57     87.28
|
Table 7. The confusion matrix for each case (rows: true label; columns: predicted label).

Case 1
         1    2    3    4    5    6    7
   1    33    0    0    0    0    0    0
   2     0   20    2    0    2    0    0
   3     0    6   28    1    0    0    0
   4     0    0    0   29    0    0    0
   5     0    4    0    0   28    5    0
   6     0    0    0    0    0   25    1
   7     0    0    0    0    0    0   29

Case 2
         1    2    3    4    5    6    7
   1    33    0    0    0    0    0    0
   2     0   20    2    0    5    0    0
   3     0    6   28    1    0    0    0
   4     0    0    0   29    0    0    0
   5     0    4    0    0   25    5    0
   6     0    0    0    0    0   25    1
   7     0    0    0    0    0    0   29

Case 3
         1    2    3    4    5    6    7
   1    33    2    0    0    0    0    0
   2     0    9    2    0    6    0    0
   3     0    6   27    0    0    0    0
   4     0    0    1   30    0    0    0
   5     0   13    0    0   24    9    0
   6     0    0    0    0    0   21    0
   7     0    0    0    0    0    0   30

(None: 1, Type A low: 2, Type A mid: 3, Type A high: 4, Type B low: 5, Type B mid: 6, Type B high: 7)
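The per-class metrics reported in Section 4 (ACC, TPR, PPV) can be computed directly from a confusion matrix with true labels in rows and predicted labels in columns, as in Table 7. The sketch below applies the standard definitions to a small toy matrix, not the paper's data.

```python
def per_class_metrics(cm, k):
    """TPR (recall) and PPV (precision) for class k of matrix cm[true][pred]."""
    n = len(cm)
    tp = cm[k][k]
    fn = sum(cm[k][j] for j in range(n) if j != k)  # true k, predicted otherwise
    fp = sum(cm[i][k] for i in range(n) if i != k)  # predicted k, true otherwise
    tpr = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    return tpr, ppv

def accuracy(cm):
    """Overall accuracy: correct predictions over all predictions."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

# toy 3-class confusion matrix (illustration only)
cm = [[10, 0, 0],
      [1, 8, 1],
      [0, 2, 8]]
print(accuracy(cm))              # 26/30
print(per_class_metrics(cm, 1))  # (0.8, 0.8)
```

Averaging the per-class TPR and PPV over all seven classes gives the macro-averaged figures of the kind reported in Table 6.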
5. Conclusion
Influenza infections can have serious consequences for humans. This study examined the classification of influenza detection kit images. Classification using a 2D CNN was attempted, and the hyperparameters optimized using the BOHB algorithm yielded the best performance. In future research, various neural architecture search algorithms will be assessed to improve the classification performance further.
ACKNOWLEDGMENTS
This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the National Program for Excellence in SW (2017-0-00096), supervised by the IITP (Institute for Information & Communications Technology Promotion).
REFERENCES
KONG Byoungjae, KIM Yuna, MOON Seokoh, JUNG Younghun, YU Seok-Hyeon, SHIN Jonghyeok,
PARK Joonbum, KIM Hye Rin, KWEON. Dae-Hyuk, 2018, Perforation of membrane envelope
of influenza A virus., The Korean Society for Biotechnology and Bioengineering, pp.
100-100
Kim Ju-sung, Park Geun-yong, Lee Min-koo, Kim Young-sun, Ahn Jung-bae, Yoon Hyun-ju.,
1995, Rapid and Type-Specific Detection of Human Influenza Viruses using Reverse Transcription-Polymerase
Chain Reaction, Journal of the Korean Microbiological Society, Vol. 30, No. 2, pp.
233-243
Kim Sang-tae, Kim Young-gyun, Kim Jang-soo., 2011, Early Diagnostic Method of Avian
Influenza Virus Subtype Using Ultra Real-Time PCR., Journal of the Microbiological
Society, Vol. 47, No. 1, pp. 30-37
Park Yoon-hyung, Woo Young-dae, Kim Seung-gon, Bae Hyung-joon, Park Sang-wook, 2001,
Simultaneous detection of the cell fusion respiratory virus, influenza virus A (H3N2HH1N1)
and B using multiple reversible polymerase chain reactions in a single laboratory.,
JOURNAL OF BACTERIOLOGY AND VIROLOGY, Vol. 31, No. 3, pp. 269-274
Cho Hyung-hun, Lee Won-jong., 2019, Meta-learning and superparametric optimization
for enhancing learning speed and performance of enhanced learning., Journal of Information
Science, Vol. 37, No. 11, pp. 25-33
Park Sung-hyun, Lim Byung-yeon, Jung Hoe-kyung, 2020, CNN-based toxic plant discriminating
system., Journal of the Korean Society of Information and Communication, Vol. 24,
No. 8, pp. 993-998
Kim Jung-jin, Cho Sung-wook, Ji Young-min, 2017, Image Classification Using CNN.,
Journal of Comprehensive Academic Presentation of the Korean Society of Information
Technology, pp. 452-453
Falkner Stefan, Klein Aaron, Hutter Frank, 2018, BOHB: Robust and efficient hyperparameter optimization at scale, arXiv preprint arXiv:1807.01774
https://www.ultimed.org/produkte/influenza-ab-antigen-test/
Na Seong-Won, Bae Hyo-Churl, Yoon Kyoungro, 2017, A study of efficient learning methods
of CNN for small dataset., The Korean Society Of Broad Engineers, pp. 243-244
Kim Geo-sik, Lee Moon-sup, Son Dong-hoon, Kim Jong-un, Min Ki-hyun, Kim Kye-eun, Kang Hyun-seo, 2020, Performance analysis of CNN-based dermal disease image classifier using data balancing algorithm, Journal of the Society of Electronics Engineers, Vol. 57, No. 7, pp. 76-8
Author
Junghwan Lee is in the MSc Program at the Bio Computing & Machine Learning Laboratory
(BCML) in the Department of Computer Engineering at Kwangwoon University, Seoul, Republic
of Korea. His research interests include machine learning, and deep learning algorithms.
Heesang Eom is in the MSc Program at the Bio Computing & Machine Learning Laboratory
(BCML) in the Department of Computer Engineering at Kwangwoon University, Seoul, Republic
of Korea. He received a BSc from the Department of Software Engineering, Korea Polytechnic
University, Gyeonggi, Republic of Korea, in 2018. His research interests include computer
vision, and deep learning algorithms.
Yuli Sun Hariyani received the B.S. degree in telecommunication engineering and the M.S. degree in electrical engineering from Telkom University, Bandung, Indonesia, in 2010 and 2013, respectively. She is currently pursuing the Ph.D. degree with the Computer Engineering Department, Kwangwoon University, Seoul, South Korea. Since 2014, she has been a Lecturer with Telkom University, Indonesia. Her research interests include pattern recognition, medical image processing, and biomedical signal processing.
Cheonjung Kim received the B.S. degree in Electronic Engineering from Kwangwoon University, South Korea, in 2015. He is currently studying Electrical Engineering at Kwangwoon University. His research interests include bioelectronics for biomolecule detection and preconcentration.
Yong Kyoung Yoo is an assistant professor of Electronic Engineering at Catholic Kwandong University, South Korea. He received the B.S., M.S., and Ph.D. degrees from Kwangwoon University, Seoul, Korea, in 2011, 2013, and 2017. He was a researcher in the Department of Clinical Pharmacology and Therapeutics, College of Medicine, Kyung Hee University, Seoul, Republic of Korea, in 2017. He then joined the Department of Electrical Engineering, Kwangwoon University, as a post-doctoral researcher in 2018. His current research is focused on biosensors and bioelectronics.
Jeong Hoon Lee is a professor of Electrical Engineering at Kwangwoon University, South Korea. He received the B.S. degree from the Department of Ceramic Engineering at Yonsei University, Seoul, South Korea, in 1997, and the Ph.D. from the same department in 2004. He specialized in MEMS/nanomechanics from 1999 to 2005 at the Korea Institute of Science and Technology (KIST) in Seoul, South Korea. Before joining Kwangwoon University in September 2008, he was a Postdoctoral Associate at RLE and EECS, Massachusetts Institute of Technology (MIT), USA (2005 to 2008). His current main research is the development of simple and powerful POCT and diagnostic systems based on the integration of electronics and fluidics.
Cheolsoo Park is an associate professor in the Computer Engineering Department at Kwangwoon University, Seoul, South Korea. He received a B.Eng. in Electrical Engineering from Sogang University, Seoul, and an MSc from the Biomedical Engineering Department at Seoul National University, South Korea. In 2012, he received his PhD in Adaptive Nonlinear Signal Processing from Imperial College London, London, U.K., and worked as a postdoctoral researcher in the Bioengineering Department at the University of California, San Diego, U.S.A. His research interests are mainly in the areas of machine learning and adaptive and statistical signal processing, with applications in brain-computer interfaces, computational neuroscience, and wearable technology.