Ghulam Musa Raza1
Byung-Seo Kim1
(Department of Software and Communications Engineering, Hongik University, Sejong, Korea; ghulammusaraza96@gmail.com, jsnbs@hongik.ac.kr)
Copyright © The Institute of Electronics and Information Engineers(IEIE)
Keywords
COVID-19, Predictive social analysis, Polymerase chain reaction (PCR), Natural language processing (NLP)
1. Introduction
In recent years, because of the COVID-19 pandemic, the whole world has been suffering
from the virus and its effects. Both developed and developing countries have had deadly
effects from the disease. Everyday life is being severely affected by COVID-19. Private
and government institutions across many countries have been closed for a certain period
now. While appreciating the government’s initiative to control the pandemic, it is
important to know what the public opinion about COVID-19 was and whether standard
operating procedures issued by the government are being taken seriously or not. A
person’s opinion can be used to gauge how serious that person is about a particular
thing.
While living in this pandemic, people became more active on social media. People used
different platforms to express themselves and their mindsets. In the same way, we
can find out what people thought about COVID-19 and the result of PCR tests for it.
We used the most popular social media platform, Twitter, to find people's opinions.
Twitter is known for its role in the information flow of the most talked-about
trends [1]. Governments and their spokespersons also use Twitter for official announcements [2]. It is also popular across generations. It even allows children to share their
thoughts via tweets: children can use Twitter with their parents' or guardians' consent.
This is a new feature of Twitter to make it useful for children.
COVID-19-related tweets were used to find the sentiments of people about COVID-19.
Twitter is an established source for forecasting many big events, such as the opening of
films and general elections. Sentiment analysis helps to evaluate the expressive orientation
of user feedback [3]. In this regard, machine learning and artificial intelligence have played an important
role in relevant areas, and there has been growing interest in using deep learning
techniques to accomplish sentiment analysis and emotional intelligence.
Because lockdowns reduced physical interaction between people, much research
has been done on different aspects of COVID-19 and on how to deal with epidemics. People have
worked on many features of the problems caused by COVID-19. However, the opinion of
people has been the most neglected factor when planning precautionary strategies in
research. Robust planning for saving people from a future pandemic is still a big
issue. Our focus is to determine people's seriousness toward COVID-19 in regions at
different times by analyzing PCR results and COVID-19-related tweets. For this purpose,
we used 36 days of Twitter data from 10 different countries: the United States of
America, Pakistan, India, China, Australia, South Africa, the Philippines, the United Kingdom,
Switzerland, and Ireland. The dataset does not relate to a single date; it
covers 36 consecutive days.
We implemented NLP-based tasks including data preprocessing, text processing, feature
extraction, and conversion of words into a form that a machine understands. Then, we applied five
AI classifiers to the NLP-processed data: support vector machine (SVM), Bernoulli naive Bayes (BNB),
logistic regression (LR), single-layer perceptron (SLP), and multi-layer perceptron (MLP). The aim was to
predict emotions with high accuracy and low elapsed time. PCR results were taken for
comparison with social behavior for the same regions. The following are the main contributions
of this work:
1. High accuracy and low elapsed time are achieved in the prediction of social response
to COVID-19.
2. Region-wise sentiment analysis was performed and provides insights into the seriousness
of people toward COVID-19.
3. Finally, we compared the region-wise sentiment analysis of these regions with the
PCR results of the same regions to figure out the difference in thinking and actual
behavior. Results can lead to making a real-time international measure to detect these
regions’ behavior for any future pandemic. If we know how a region thinks about an
upcoming pandemic, then we can predict the region’s real-time behavior.
The related literature is discussed in Section 2. The proposed predictive analysis
model is elaborated in Section 3. Results and analysis are discussed in Section 4,
and recommended practices during the COVID-19 situation are detailed in Section 5.
Section 6 concludes the study.
2. Literature Review
In this section, we briefly review the existing methods for sentiment analysis, including
single modal and multimodal approaches. Several classifiers, including Support Vector
Machines (SVMs), Logistic Regression (LR), Random Forest (RF), and Multinomial Naive
Bayes (MNB) for sentiment analysis of tweets, are available for classification tasks.
These are used to represent tweets and match feature hashing using a bag of words,
which was obtained by combining lexicons, feature hashing, and emoticons. To produce
new data, the authors of [5] used easy data augmentation (EDA) techniques, such as synonym replacement, random
swap, random insertion, and random deletion. The developers of one approach [6] took the tweets and converted them into a series of terms, which they then interpreted
using a sentiment evaluation module based on the SentiWordNet (SWN) lexicon. Then, for rule inference,
each of the represented tweets was evaluated using algorithms based on rough
set theory (RST).
Contextual augmentation was suggested by Kobayashi [7], who stochastically replaced words with other words predicted by a bidirectional
language model at the word positions. The most commonly used steps to clean and
prepare Twitter message data are tokenization, stemming, and stop-word removal, along with removing
punctuation and symbols. The authors of [8,9] performed well while using various data processing techniques. They compared 15 pre-processing
techniques that are widely used in an experiment. An interactive multitask learning
system [10] for sentiment classification of Chinese text was suggested. It adopts the ERNIE
model as the main learning model of textual representation. Then, the text sentiment
classification task was accomplished to the highest stage, and the related level
In experiment-based research [11], the authors implemented a long short-term memory (LSTM) model for sentiment analysis.
Results show that the proposed model achieved higher accuracy and greater performance
compared to CMOS and FPGA-based methods. Other work [12] implemented different machine learning techniques on a Twitter dataset for emotion classification
and compared the obtained results. The Weka library was used [13] with 40 different features to train a model. Different classifiers were also used
to compare the results obtained by the proposed model.
Predicting ordinal regression [14] is always interesting in sentiment analysis. A publicly available Twitter dataset
from NLTK corpora resources was used. Experimental findings revealed that the proposed
approach can detect ordinal regression using machine learning methods with good accuracy.
The combination of a bigram, information gain, and object-oriented words was used
in the method for extracting new feature sets, and each term was represented using
binary occurrences in the training of two classifiers: Naive Bayes and Support Vector
Machine (SVM) [15]. One study [16] aimed to gather and preprocess data to convert the Sentiment140 dataset to a proper
format. The authors performed feature selection on the dataset and trained, analyzed,
and compared various machine-learning models using it. They then used the classifier
that achieved the highest accuracy along with vectorization through a pipeline to
determine the sentiment of a new tweet that was provided as an input.
3. Proposed Methodology
In this section, we elaborate on the details of the proposed scheme. The overall architecture
of the proposed model is shown in Fig. 1.
Fig. 1. Flow diagram of proposed methodology.
3.1 Data Collection
We used two datasets to provide a measure: one for the COVID-19 tweets from a particular
region and a second for the COVID-19 PCR results from that same region. Numerous
datasets have been developed for supervised classification of Twitter
data. These datasets are made up of tweets that have been manually tagged into a single
opinion group by human experts. Although some datasets have numerical labels
related to the strength of emotions, positive and negative are the most common labels. We
used a dataset from an open-source platform [17] which presents Twitter sentiment ranking datasets. This dataset contains more than
100,000 tweets from around the world. The COVID-19 PCR results were
obtained from [18], a nonprofit open-source platform. This data contains the number of PCR test
samples, their dates, times, and results (negative or positive). This platform has
extensive statistics related to COVID-19.
3.2 NLP Transformation
First, we cleaned the data by removing the unimportant columns from the
Twitter dataset. Then we employed the following NLP-based pre-processing techniques.
· Number and Punctuation Removal: We removed numbers and punctuation from the text as they do not
convey any emotion.
· Lowercase: We converted all capital letters to lowercase. As a result, many terms
are merged, and the dimensionality of the problem is reduced.
· Mention and URL Removal: Tweets often contain
a mention of a user and a URL. These do not carry sentiment, so
we removed them during pre-processing.
· Stop Word Removal: Stop words appear frequently
in all sentences. There is no need to analyze them because they
do not provide valuable material. The list of these terms is not completely predefined
and depends on the application, but these words can simply be deleted.
· Contraction Replacement: We expanded contractions, so that a phrase like "I don't
want" is converted to "I do not want" after contraction replacement. Interpreting
a body of text to explain the opinion it conveys is a central feature of sentiment
analysis.
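The pre-processing steps listed above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the function name `preprocess`, the tiny `CONTRACTIONS` map, and the short `STOP_WORDS` set are assumptions made for the example, and a real pipeline would use fuller lists.

```python
import re

# Hypothetical, deliberately small lookup tables (assumptions for this sketch).
CONTRACTIONS = {"don't": "do not", "can't": "cannot", "isn't": "is not"}
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "i"}


def preprocess(tweet: str) -> str:
    """Apply the cleaning steps from Section 3.2 to a single tweet."""
    text = tweet.lower()                                  # lowercase
    text = text.replace("\u2019", "'")                    # normalize curly apostrophes
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)    # remove URLs
    text = re.sub(r"@\w+", " ", text)                     # remove user mentions
    for short, full in CONTRACTIONS.items():              # expand contractions
        text = text.replace(short, full)
    text = re.sub(r"[^a-z\s]", " ", text)                 # drop numbers and punctuation
    tokens = [t for t in text.split() if t not in STOP_WORDS]  # remove stop words
    return " ".join(tokens)


cleaned = preprocess("@user I don't want COVID-19! See https://t.co/abc")
print(cleaned)
```

Contractions are expanded before punctuation is stripped so that the apostrophe is still available for matching; negation words such as "not" are kept out of the stop-word set here because they can change the sentiment of a tweet.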
3.3 Text Classification
Here, text classification refers to the classification of tweets. This subsection briefly describes
how to classify text based on sentiment and how to convert the text into a machine-understandable
form. A sentiment library assigns a positive or negative value to the emotion in a text, known
as polarity. The sign of the polarity score is often used to infer whether the general mood
is positive, neutral, or negative.
We used an NLP library named TextBlob, which exposes a 'sentiment' property with a
'polarity' field. The value of polarity is a float that lies between -1 and
1, where -1 refers to negative and 1 refers to positive sentiment. Some of the tweets
and their respective sentiments according to their polarity scores are shown
in Table 1.
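The mapping from a polarity score to a sentiment label can be sketched as follows. The threshold at zero is an assumption for illustration (the text does not state the exact cut-off used), and both helper names are hypothetical. The TextBlob call is wrapped in its own function with a lazy import so that the labelling helper can be used without the library installed.

```python
def label_from_polarity(score: float) -> str:
    """Map a polarity score in [-1, 1] to a sentiment label.

    Assumption: scores above 0 are positive, below 0 negative, exactly 0 neutral.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError("polarity must lie in [-1, 1]")
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


def textblob_polarity(text: str) -> float:
    """Return TextBlob's polarity for a text (requires the textblob package)."""
    from textblob import TextBlob  # imported lazily; only needed when called
    return TextBlob(text).sentiment.polarity


# Polarity scores taken from Table 1.
for score in (0.03, 0.22, -0.12):
    print(score, label_from_polarity(score))
```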
So far, we are done with data preprocessing and sentiment score. However, we need
to reduce the dimensionality of the data. To do this, we employed a stemming procedure.
Stemming is the process of removing a suffix from a word and reducing it to its basic
form. Stemming is a natural language-processing normalization strategy that reduces
the amount of computation involved.
After stemming, a bag-of-words model, or BOW for short, is used as a method of extracting
features from the text for use in modeling, such as machine learning algorithms. The
method is simple and adaptable and can be used to extract features from documents in
a variety of ways. It is a representation of text that describes how often words
appear in a document. It is called a "bag" of words because all the details about
the word order or the composition of the document are discarded. The model only cares
whether recognized terms appear in the text, not where they appear. The assumption
is that documents with similar content are similar.
At this point, all major preparation steps have been applied to the data. All we need now
is to convert the words to vector form, since machines only work with numeric vectors. Vectorization
is a mechanism by which text data is translated into a machine-readable format. To
achieve this, we used a technique called count vectorization. A count vectorizer helps
us both to create the bag of words and to convert it later into vector form. The result
is an encoded vector with the length of the entire vocabulary and an integer count of how
many times each word appears in the text.
The vectors returned by the transform() function are sparse vectors, and we can
use the toarray() function to convert them to NumPy arrays to investigate and
better understand them. For this purpose, we used the CountVectorizer tool from
the sklearn library. Since the vocabulary is so large, it is important to
keep the size of the feature vectors to a minimum. The 700 most common terms (features)
are used in this work.
It is also worth noting that we set min_df = 2 and ngram_range = (1, 3). Setting min_df
= 2 means that a term must appear in at least two texts to be
included in the vocabulary. The term ngram_range refers to the range of n-gram sizes used to
split a sentence. Let us say we have a sentence, "I am a child." If we split the sentence
into bigrams (n = 2), the sentence will be split as ["I am", "am a", "a child"].
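To make the effect of the n-gram range, the minimum document frequency, and the feature cap concrete, the following pure-Python sketch builds a small vocabulary and count vectors. It only illustrates the idea and is not the scikit-learn implementation; the function names `ngrams`, `build_vocabulary`, and `vectorize`, the whitespace tokenizer, and the two toy documents are assumptions for the example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_terms(doc, ngram_range):
    """Split a document on whitespace and collect every n-gram in the range."""
    tokens = doc.lower().split()
    low, high = ngram_range
    return [g for n in range(low, high + 1) for g in ngrams(tokens, n)]


def build_vocabulary(docs, ngram_range=(1, 3), min_df=2, max_features=700):
    """Keep terms seen in at least min_df documents, capped at the most frequent."""
    doc_freq, total = Counter(), Counter()
    for doc in docs:
        terms = extract_terms(doc, ngram_range)
        total.update(terms)          # overall term counts
        doc_freq.update(set(terms))  # number of documents containing each term
    kept = [t for t in total if doc_freq[t] >= min_df]
    kept.sort(key=lambda t: (-total[t], t))  # most frequent first, ties by name
    return sorted(kept[:max_features])


def vectorize(doc, vocabulary, ngram_range=(1, 3)):
    """Return the count of each vocabulary term in the document."""
    counts = Counter(extract_terms(doc, ngram_range))
    return [counts[t] for t in vocabulary]


print(ngrams("I am a child".split(), 2))
vocab = build_vocabulary(["i am a child", "i am a parent"])
print(vocab)
print(vectorize("i am a child", vocab))
```

With these two toy documents, terms such as "child" and "a parent" are dropped because each occurs in only one document, which is the behavior that min_df = 2 is meant to produce.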
Table 1. Polarity Table.
Text | Polarity Score | Sentiment
Risk dying statistics related. | 0.03 | Positive
Hospital blood component need Plasma b ve COVID-19 recover | 0.22 | Positive
Coronavirus COVID-19 deaths continue rise Smelled scent hand sanitizers | -0.12 | Negative
3.4 Feature Extraction
To extract suitable features and build a prediction model, various machine learning
algorithms were implemented together with natural language processing tasks to predict
the sentiments. Those AI classifiers are listed below, and their hyperparameters
are given in Table 2.
In the integration of AI classifiers and NLP feature extraction, a total of
700 features were selected to build the models. In NLP, features are words,
so there can be many of them, and it is impractical to look at
all 700 coefficients at the same time in one figure. The bar chart in Fig. 2 therefore shows only the 10 largest and 10 smallest coefficients in the linear SVM model, and
the bars indicate the size of each coefficient. Red bars indicate negative
words (lowest coefficients) like "bad" and "worst", while blue bars indicate
positive words (highest coefficients) like "health" and "true".
Fig. 2. The 10 largest and 10 smallest feature coefficients of the linear SVM model.
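Selecting the largest and smallest coefficients for such a chart amounts to sorting the (coefficient, feature) pairs. The sketch below shows one way to do it; the function name `extreme_coefficients` is hypothetical, and the coefficient values are made up for illustration (only the example words come from the text above).

```python
def extreme_coefficients(feature_names, coefficients, k=10):
    """Return the k largest and k smallest (coefficient, feature) pairs."""
    if len(feature_names) != len(coefficients):
        raise ValueError("feature_names and coefficients must have equal length")
    ranked = sorted(zip(coefficients, feature_names))   # ascending by coefficient
    k = max(0, min(k, len(ranked)))
    smallest = ranked[:k]                               # most negative first
    largest = ranked[len(ranked) - k:][::-1]            # most positive first
    return largest, smallest


# Made-up coefficients; a trained linear model would supply the real values.
names = ["bad", "worst", "health", "true", "today"]
coefs = [-0.8, -0.9, 0.7, 0.6, 0.05]
largest, smallest = extreme_coefficients(names, coefs, k=2)
print("largest:", largest)
print("smallest:", smallest)
```

For a fitted linear model in scikit-learn, the coefficients would typically come from the model's `coef_` attribute and the names from the vectorizer's vocabulary.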
Table 2. AI Classifiers with Hyperparameters.
AI Classifier | Hyperparameters
SVM | param_grid = lr2_param, verbose = 1, cv = kfold, n_jobs = -1, scoring = roc_auc; lr2_param = (dual = True, C = 0.05, class_weight = balanced, loss = hinge)
Naïve Bayes | mlp_param_grid = [alpha = 0.03, binarize = 0.001], cv = kfold, scoring = roc_auc, n_jobs = 1, verbose = 1
SLP | hidden_layer_sizes = 1, activation = logistic, solver = sgd, alpha = 0.1, learning_rate = constant, max_iter = 1000
LR | param_grid = [lr2_param], cv = kfold, scoring = roc_auc, n_jobs = 1, verbose = 1; lr2_param = (penalty = l2, dual = False, C = 0.05, class_weight = balanced)
MLP | mlp_param_grid = [hidden_layer_sizes = 5, activation = relu, solver = adam, alpha = 0.3, learning_rate = constant, max_iter = 1000], param_grid = mlp_param_grid, cv = kfold, scoring = roc_auc, n_jobs = -1, verbose = 1
4. Results
The results section is divided into three subsections. We first discuss accuracy and
elapsed time for sentiment assignment, then PCR observations, and finally
time and region-wise Twitter-based PCR analysis.
4.1 Sentiment Prediction
After applying all AI classifiers and feature extraction to the dataset with
their best hyperparameters, we found that the SVM classifier achieved the
highest accuracy rate of 93.78% with an elapsed time of 1.2 s. In our case, prediction
entails providing the optimal label for a text. We measure this using the
accuracy rate and the elapsed time. For a given dataset, the accuracy rate is the
proportion of accurate predictions. Statistically speaking, with an accuracy rate of 93% we
should expect about 93 accurate predictions for every 100 made using the proposed
SVM. Elapsed time is the amount of "wall clock" time
from the start of model training to the end of model testing. The results
of all implemented classifiers are shown in Table 3.
We also compare our results with some of the existing work in Table 4.
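The two metrics defined above can be computed as in the following sketch. The helper names `accuracy_rate` and `timed` are assumptions for illustration, and the labels in the usage lines are toy values, not results from the study.

```python
import time


def accuracy_rate(y_true, y_pred):
    """Percentage of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Toy labels: three of the four predictions are correct.
acc, elapsed = timed(accuracy_rate, [1, 0, 1, 1], [1, 0, 0, 1])
print(f"accuracy = {acc:.2f}%, elapsed = {elapsed:.6f} s")
```

In practice, the timed callable would wrap both training and testing of a classifier, matching the definition of elapsed time given above.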
Table 3. Accuracy Rates and Elapsed Time.
AI Classifier | Accuracy Rate (%) | Elapsed Time
SVM | 93.78 | 1.2s
Naïve Bayes | 90 | 1.0s
Single Layer Perceptron | 54 | 12.9s
Logistic Regression | 93.21 | 2.4s
Multi-Layer Perceptron | 93.73 | 60s
Table 4. Comparison of Accuracy Rates.
Related Work | Implemented Classifier | Accuracy Rate (%)
19 | CNN | 89
17 | LSTM | 84.3
21 | Decision Trees | 91.81
16 | BiLSTM + attention + CRF | 85.94
4 | Naive Bayes | 75
4 | SVM | 78
4 | Multinomial Naive Bayes | 86
Our Model | SVM | 93
4.2 Regional Sentimental Analysis
This work also includes a regional factor for assigning emotions. In this regard,
we analyzed COVID-19-related tweets from 10 countries
and found the percentage of positive and negative tweets in these regions.
Here, positive tweets are those showing that people take COVID-19 seriously,
and negative tweets are those showing disregard for COVID-19, such as claims that COVID-19 is
not harmful. The percentages of positive
and negative tweets for the regions are shown in Figs. 3 and 4.
In Figs. 3 and 4, the x-axis shows the percentage of positive and negative opinions, and the
y-axis shows the regions. Different regions expressed different
opinions about COVID-19. Australia had the highest share of negative tweets about
COVID-19 at 32%. The Philippines and the United Kingdom followed with
31%. Ireland was fourth with 25% negative opinion. India was fifth
with 24%. The USA had a rate of 23%, China had 22%, South Africa had 21%, and Pakistan
and Switzerland had 18%.
Fig. 3. Opinion behaviors of the first five regions in terms of positive and negative opinions about COVID-19.
Fig. 4. Opinion behaviors of the remaining five regions in terms of positive and negative opinions about COVID-19.
4.3 Regional PCR Analysis
We analyzed PCR results from an open-source, nonprofit platform.
This platform keeps daily records for each country: how
many PCR test samples were collected in which country, on what day, at what time,
and what results were obtained. We used the records from the 25th of July to the 29th of August.
Over this interval of 36 days, the USA had a total of 88,906 PCR tests per
million, of which 5,457.2 were positive, so the USA had a 6.13% ratio
of positive cases. Over the same interval,
India had a total of 16,899 PCR tests per million,
of which 1,519.48 were positive, so India had an 8.99% ratio of positive cases.
The 36-day PCR results for the USA are shown in Table 5 and those for India in Table 6.
In the same way, we extracted the positive ratios for all regions.
The PCR results of all regions are shown in Table 7.
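The positive ratio is the number of positive results divided by the total number of tests, expressed as a percentage. The short sketch below reproduces the arithmetic for the USA and India totals quoted above; the function name `positive_ratio` is an assumption for illustration. Evaluated to two decimals, the USA figure comes out at about 6.14, against the 6.13 quoted in the text, which suggests the reported value was truncated rather than rounded.

```python
def positive_ratio(total_tests_per_million: float, positive_per_million: float) -> float:
    """Positive PCR results as a percentage of all PCR tests."""
    if total_tests_per_million <= 0:
        raise ValueError("total tests must be positive")
    return 100.0 * positive_per_million / total_tests_per_million


# 36-day totals quoted in the text.
usa = positive_ratio(88906, 5457.2)
india = positive_ratio(16899, 1519.48)
print(f"USA: {usa:.2f}%  India: {india:.2f}%")
```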
We can see that South Africa had the highest positive PCR ratio at 21.94%. After
South Africa, the Philippines had the second highest positive PCR ratio at 10.5%, followed
by India at 8.99%. The USA was at number four
with a positive PCR ratio of 6.13% in the available dataset of 10 countries. Pakistan
was at number five with 2.89%, and Switzerland
was at number six with 2.25%. Ireland had 1.13%, and the United Kingdom had 0.90%. China was second to last
with a low positive PCR ratio of 0.57%. Australia had the lowest positive
PCR ratio at 0.56%.
Now that we have both the opinions and the PCR results of the regions for COVID-19, we can compare them to
build a measure for a future pandemic. That is, we can estimate the contradiction or similarity
between a region's opinion about COVID-19 and the COVID-19 PCR results of that
region. Fig. 5 shows the trend of the Twitter-based PCR ratio for the USA, India, China, Australia,
and Ireland. Fig. 6 shows the trend of the Twitter-based PCR ratio in Switzerland, South Africa, Pakistan,
the United Kingdom, and the Philippines. In Figs. 5 and 6, the cornflower blue line is for negative opinion, the turquoise
line is for positive PCR results, and the y-axis is for regions.
As an example, in the USA, the contradiction rate between negative opinions and positive
PCR was 18% because 24% of the USA region gave negative comments about COVID-19 (for example,
not believing in COVID-19), and 6% of the region was COVID-19 positive. If the
administration knows a region's contradiction rate, it can take the necessary steps
accordingly. For example, if another pandemic comes in the future and there are
no kits to determine whether people are positive or negative for the disease related
to that pandemic, then the only thing available would be people's opinion about
it. At that time, one can relate this research experiment to the coming pandemic.
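The contradiction rate used in the example above can be read as the absolute difference between the percentage of negative opinion and the percentage of positive PCR results. That reading is an assumption made for this sketch, as is the function name `contradiction_rate`; the inputs are the rounded figures quoted in the text.

```python
def contradiction_rate(negative_opinion_pct: float, positive_pcr_pct: float) -> float:
    """Absolute gap between negative-opinion share and positive-PCR share."""
    return abs(negative_opinion_pct - positive_pcr_pct)


# Rounded (negative opinion %, positive PCR %) pairs as quoted in the text.
regions = {"USA": (24, 6), "South Africa": (21, 21)}
for name, (negative, positive) in regions.items():
    print(name, contradiction_rate(negative, positive))
```

A region with a rate near zero, such as South Africa here, is one where expressed opinion and measured outcome agree; a large rate indicates that opinion alone would be a poor predictor for that region.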
Fig. 5. Time & region-wise Twitter-based PCR analysis of the first five regions.
Fig. 6. Time & region-wise Twitter-based PCR analysis of the remaining five regions.
Table 5. USA Total PCR Tests with Positive PCR Results of 36 days.
Date | Total Tests per Million | Positive Results
25/07/2020 | 2,905 | 197.98
25/07/2020 | 2,856 | 195.68
.. | .. | ..
.. | .. | ..
28/08/2020 | 2,549 | 125.91
29/08/2020 | 2,549 | 125.74
Table 6. India Total PCR Tests with Positive PCR Results of 36 days.
Date | Total Tests per Million | Positive Results
25/07/2020 | 250 | 31.87
25/07/2020 | 259 | 32.86
.. | .. | ..
.. | .. | ..
28/08/2020 | 622 | 50.55
29/08/2020 | 612 | 51.53
Table 7. Positive PCR Results of Required Regions.
Country | Positive PCR Ratio (%)
USA | 6.13
India | 8.99
China | 0.57
Australia | 0.56
Ireland | 1.13
Switzerland | 2.25
South Africa | 21.94
Pakistan | 2.89
United Kingdom | 0.90
Philippines | 10.50
5. Discussion & Recommendations
This research can help administrations make many decisions effectively. In
our view, government decisions are easier to justify when governments
know what a region thinks. By knowing how people think,
the administration can take the necessary decisions. If we take the example of the South African region,
there is 0% contradiction between people's opinions and positive PCR results.
South Africa has 21% negative opinion of the total count, and the same 21%
positive PCR, as shown in Fig. 7.
Fig. 7 also makes clear that for Australia there is a large contradiction between the ratio of people's
negativity about COVID-19 and the PCR-positive ratio. The reason is that the government
of Australia took strict action to control the virus. For the future, the Australian administration
has statistics on the region. They know that their people showed carelessness towards
COVID-19 in their opinions, but the result was the opposite. So, they can repeat their
policies and strategies for controlling COVID-19, while still knowing that opinion in this
region did not take COVID-19 seriously.
So, we can say that in a future pandemic, if the SA region has negative thoughts
about the pandemic, then the region will be highly affected by it. In this
case, SA will require more attention and priority than other areas such as China, which
has a difference of 21% between opinion and PCR results. The following recommended steps
can be taken to avoid and control future pandemics, according to regional behavior.
· Efficient resource allocation
· Efficient regional monitoring
· Region wise health care steps
· Urgent lockdowns
· Region-specific awareness campaign
· Apprehend people spreading propaganda
· Region-specific relief fund
· Traveling restrictions
· Social protection
· Avoid gathering
Fig. 7. Contradiction ratio of SA and Australia.
6. Conclusion
In this work, we took the COVID-19 tweets and PCR results of 10 regions for a specific
time interval. The priority was to predict the sentiments of the regions with high
accuracy and low elapsed time. To accomplish this goal, different AI classifiers
and NLP techniques were implemented. We achieved a 93.78% accuracy rate
and an elapsed time of 1.2 s with the SVM model. The obtained results were better than
those of the existing work we compared against. In addition, we collected the PCR test results for COVID-19 in
the same regions and compared them with the sentiment rates. We found notable
differences between these regions. South Africa had almost a 0% difference between negative
opinion and positive PCR results, and the USA had a difference of 18% between thoughts
and practical behavior. This work is beneficial for any future pandemic. If we
know the sentiments of regions about a pandemic, then we can predict the actual behavior
in that pandemic with the help of the proposed study. It helps in tracking a region's
behavior towards future pandemics and provides a platform to take the necessary measures.
ACKNOWLEDGMENTS
This work was supported by the National Research Foundation (NRF), Korea, under
project BK21 FOUR (F21YY8102068).
REFERENCES
[1] S. Sangwan et al., "Social Media Sentiment Analysis- A Relative Study on Twitter Dataset," in Proc. of 2022 6th International Conference on Trends in Electronics and Informatics (ICOEI), pp. 436-441, Apr. 2022.
[2] V. Israel-Turim et al., "Who Did Spanish Politicians Start Following on Twitter? Homophilic Tendencies among the Political Elite," Social Sciences, vol. 11, no. 7, pp. 292, Jul. 2022.
[3] I. Deutscher et al., "Sentiments and Acts," Berlin, Boston: De Gruyter, Dec. 2021.
[4] B. Heredia, T. M. Khoshgoftaar, J. Prusa and M. Crawford, "Cross-Domain Sentiment Analysis: An Empirical Investigation," in Proc. of 2016 IEEE 17th International Conference on Information Reuse and Integration (IRI), pp. 160-165, Jul. 2016.
[5] J. Wei and K. Zou, "EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks," arXiv, Aug. 2019.
[6] H. Keshavarz and M. Abadeh, "ALGA: Adaptive lexicon learning using genetic algorithm for sentiment analysis of microblogs," Knowledge-Based Systems, pp. 1-16, Apr. 2017.
[7] S. Kobayashi, "Contextual augmentation: Data augmentation by words with paradigmatic relations," Computation and Language, arXiv, May 2018.
[8] J. S. Vimali and S. Murugan, "A Text Based Sentiment Analysis Model using Bi-directional LSTM Networks," 2021 6th International Conference on Communication and Electronics Systems (ICCES), 2021, pp. 1652-1658.
[9] S. Kobayashi, "Contextual Augmentation: Data Augmentation by Words with Paradigmatic Relations," Computation and Language, arXiv, May 2018.
[10] H. Zhang, S. Sun, Y. Hu, J. Liu and Y. Guo, "Sentiment Classification for Chinese Text Based on Interactive Multitask Learning," IEEE Access, vol. 8, pp. 129626-12963, Jul. 2020.
[11] S. Wen et al., "Memristive LSTM Network for Sentiment Analysis," IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 51, no. 3, pp. 1794-1804, Mar. 2021.
[12] S. Thota, S. P. Hanish and Y. Raju, "Opinion Mining of Twitter Data Using Machine Learning," in EBSCO, International Journal of Advanced Research Computer Science, vol. 11, pp. 92-95, May 2020.
[13] A. M. Alharbi and E. Doncker, "Twitter sentiment analysis with a deep neural network: An enhanced approach using user behavioral information," Cognitive Systems Research, pp. 50-61, May 2019.
[14] S. E. Saad and J. Yang, "Twitter Sentiment Analysis Based on Ordinal Regression," IEEE Access, vol. 7, pp. 163677-163685, Nov. 2019.
[15] D. Malik and G. Munjal, "Reviewing Classification Methods on Health Care," Intelligent Healthcare. EAI/Springer Innovations in Communication and Computing, Jul. 2021.
[16] A. Ikram, M. Kumar and G. Munjal, "Twitter Sentiment Analysis using Machine Learning," in Proc. of 12th International Conference on Cloud Computing, Data Science & Engineering (Confluence), pp. 629-634, Mar. 2022.
[17] G. Preda, "covid-19 Tweets," on Kaggle, Dec. 2020, Available on:
[18] Ourworldindata, Oct. 2022.
Author
Ghulam Musa Raza received his BS degree in Computer Sciences from Comsats University
Islamabad in 2019. His major in BS was Intelligent Robotics. He received his MS degree
in Computer Sciences from SEECS, NUST Islamabad in 2021. His research interest during his
master's studies was Natural Language Processing (Artificial Intelligence). From 2017 to 2019,
he worked as a Software Engineer at Snaky Solutions Pvt Limited. He served as a
machine learning research assistant in the TUKL lab, NUST Islamabad, at the start
of 2021. He served as a Lecturer at Alhamd Islamic University, Islamabad, from 2021 to
2022. His major interests are in the fields of Natural Language Processing, Internet
of Things (IoT), Information Centric Networking, and Named Data Networking. He is currently
pursuing a Ph.D. degree with the Department of Software and Communications Engineering
in the Graduate School, Hongik University, South Korea.
Byung-Seo Kim received his B.S. degree in electrical engineering from In-Ha University,
In-Chon, Korea, in 1998 and his M.S. and Ph.D. degrees in electrical and computer
engineering from the University of Florida in 2001 and 2004, respectively. His Ph.D.
study was supervised by Dr. Yuguang Fang. Between 1997 and 1999, he worked for Motorola
Korea Ltd., PaJu, Korea, as a computer integrated manufacturing (CIM) engineer in
advanced technology research and development (ATR&D), and he was the chairman of
the Department of Software and Communications Engineering, Hongik University, South
Korea, where he is currently a professor. He served as the General Chair for 3rd IWWCN
2017 and as a TPC member for the IEEE VTC 2014-Spring and the EAI FUTURE2016 and ICGHIC
2016 2019 conferences. He served as a guest editor of special issues of the International
Journal of Distributed Sensor Networks (SAGE), IEEE Access, and the Journal of the Institute
of Electronics and Information Engineers. His work has appeared in around 167 publications
and 22 patents. He is an IEEE Senior Member and an Associate Editor of IEEE Access.
His research interests include the design and development of efficient wireless/wired
networks, including link-adaptable/cross-layer-based protocols, multi-protocol structures,
wireless CCNs/NDNs, mobile edge computing, physical layer design for broadband PLC,
and resource allocation algorithms for wireless networks.