IEIESPC(IEIE Transactions on Smart Processing and Computing)

IEIESPC Vol. 12, No. 01, p.72-79

ISSN (online) :

2287-5255

Received : 5 October 2022, Revised : 20 December 2022, Accepted : 31 December 2022

DOI :

https://doi.org/10.5573/IEIESPC.2023.12.1.72

Regular Paper

Tweets and PCR Test-based Analysis and Prediction of Social Response to a Future Pandemic. A Case Study

Ghulam Musa Raza¹ and Byung-Seo Kim¹

(Department of Software and Communications Engineering, Hongik University / Sejong, Korea ghulammusaraza96@gmail.com, jsnbs@hongik.ac.kr )

* Corresponding Author: Byung-Seo Kim

License :

This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.(www.theieie.org).

Abstract

The COVID-19 pandemic has severely affected our society. It has been a subject of discussion since 2019, amplified by the prevalence and extensive use of social media, and it has been a source of tension, fear, and disappointment for people all over the world. In this research, we took COVID-19 tweets from 10 different regions posted from July 25, 2020, to August 29, 2020. Using the well-known count-vectorizer technique for representing text as vectors, we experimented with different machine learning classifiers, including neural networks, to improve the accuracy of predicted opinions with a low elapsed time. In addition, we collected PCR results from these regions for the same time interval. We compared the opinions, expressed as positive or negative responses, with the results of the PCR tests per million people. Based on the results, we propose a real-time international measure of these regions’ behavior for any future pandemic. If we know how a region thinks about an upcoming pandemic, then we can predict the region’s real-time behavior in that pandemic, provided that past case studies such as this one are available for comparison.

Keywords

COVID-19, Predictive social analysis, Polymerase chain reaction (PCR), Natural language processing (NLP)

1. Introduction

In recent years, the whole world has been suffering from the COVID-19 virus and its effects. Both developed and developing countries have experienced deadly effects from the disease, and everyday life has been severely affected. Private and government institutions across many countries were closed for extended periods. While appreciating the governments’ initiatives to control the pandemic, it is important to know what public opinion about COVID-19 was and whether the standard operating procedures issued by governments were being taken seriously. A person’s opinion can be used to gauge how serious that person is about a particular issue.

During the pandemic, people became more active on social media and used different platforms to express themselves and their mindsets. In the same way, we can find out what people thought about COVID-19 and compare it with the results of PCR tests. We used one of the most popular social media platforms, Twitter, to find people’s opinions. Twitter is known for its role in the information flow of the most talked-about trends ^[1]. Governments and their spokespersons also use Twitter for official announcements ^[2]. It is popular across generations: children can also share their thoughts via tweets with their parents’ or guardians’ consent, which is a new feature intended to make Twitter useful for children.

COVID-19-related tweets were used to find people’s sentiments about COVID-19. Twitter is an established source for forecasting many big events, such as film openings and general elections. Sentiment analysis helps to evaluate the expressive orientation of user feedback ^[3]. In this regard, machine learning and artificial intelligence have played an important role in relevant areas, and there has been growing interest in using deep learning techniques to accomplish sentiment analysis and emotional intelligence.

During lockdowns, when physical interaction between people was limited, much research was done on different aspects of COVID-19 to help deal with future epidemics. People have worked on many features of the problems caused by COVID-19. However, people’s opinions have been largely neglected in research on planning precautionary strategies. Robust planning for protecting people from a future pandemic is still a major issue. Our focus is to determine people’s seriousness toward COVID-19 in different regions over time by analyzing PCR results and COVID-19-related tweets. For this purpose, we used 36 days of Twitter data from 10 different countries: the United States of America, Pakistan, India, China, Australia, South Africa, the Philippines, the United Kingdom, Switzerland, and Ireland. The dataset does not cover a single date; it spans 36 consecutive days.

We implemented NLP-based tasks including data preprocessing, text processing, feature extraction, and converting words into a form that a machine understands. Then, we applied five AI classifiers to the NLP-processed data, namely a support vector machine (SVM), Bernoulli naive Bayes (BNB), logistic regression (LR), a single-layer perceptron (SLP), and a multi-layer perceptron (MLP), to predict emotions with high accuracy and low elapsed time. PCR results were taken for comparison with social behavior for the same regions. The following are the main contributions of this work:

1. High accuracy and low elapsed time are achieved in the prediction of social response to COVID-19.

2. Region-wise sentiment analysis was performed, providing insights into the seriousness of people toward COVID-19.

3. Finally, we compared the region-wise sentiment analysis of these regions with the PCR results of the same regions to determine the difference between thinking and actual behavior. The results can lead to a real-time international measure of these regions’ behavior for any future pandemic. If we know how a region thinks about an upcoming pandemic, then we can predict the region’s real-time behavior.

The related literature is discussed in Section 2. The proposed predictive analysis model is elaborated in Section 3. Results and analysis are discussed in Section 4, and recommended practices during the COVID-19 situation are detailed in Section 5. Section 6 concludes the study.

2. Literature Review

In this section, we briefly review the existing methods for sentiment analysis, including single-modal and multimodal approaches. Several classifiers are available for sentiment analysis of tweets, including Support Vector Machines (SVMs), Logistic Regression (LR), Random Forest (RF), and Multinomial Naive Bayes (MNB). These have been used with tweet representations built from a bag of words combined with lexicons, feature hashing, and emoticons. To produce new data, the authors of ^[5] used easy data augmentation (EDA) techniques such as synonym replacement, random swap, random insertion, and random deletion. The developers of one approach ^[6] converted tweets into a series of terms, which they then interpreted using a sentiment evaluation module based on the SWN lexicon. Then, for rule inference, each of the represented tweets was evaluated using algorithms based on rough set theory (RST).

Contextual augmentation was suggested by Kobayashi ^[7], who stochastically replaced words with other words predicted by a bidirectional language model at the same word positions. The most commonly used techniques to clean and prepare Twitter message data are tokenization, stemming, and the removal of noise words, punctuation, and symbols. The authors of ^[8,9] compared 15 widely used pre-processing techniques in an experiment and reported good performance with various data processing techniques. An interactive multitask learning system ^[10] for sentiment classification of Chinese text was suggested. It adopts the ERNIE model as the main learner of the textual representation. Then, the text sentiment classification task was accomplished to the highest stage, and the related level

In experiment-based research ^[11], the authors implemented a long short-term memory (LSTM) model for sentiment analysis. Results show that the proposed model achieved higher accuracy and greater performance compared to CMOS and FPGA-based methods. Other work ^[12] implemented different artificial intelligence techniques on a Twitter dataset for emotion classification and compared the obtained results. The Weka library was used ^[13] with 40 different features to train a model. Different classifiers were also used to compare the results obtained by the proposed model.

Predicting ordinal regression ^[14] is always interesting in sentiment analysis. A publicly available Twitter dataset from NLTK corpora resources was used. Experimental findings revealed that the proposed approach can detect ordinal regression using machine learning methods with good accuracy. The combination of a bigram, information gain, and object-oriented words was used in the method for extracting new feature sets, and each term was represented using binary occurrences in the training of two classifiers: Naive Bayes and Support Vector Machine (SVM) ^[15]. One study ^[16] aimed to gather and preprocess data to convert the Sentiment140 dataset to a proper format. The authors performed feature selection on the dataset and trained, analyzed, and compared various machine-learning models using it. They then used the classifier that achieved the highest accuracy along with vectorization through a pipeline to determine the sentiment of a new tweet that was provided as an input.

3. Proposed Methodology

In this section, we elaborate on the details of the proposed scheme. The overall architecture of the proposed model is shown in Fig. 1.

Fig. 1. Flow diagram of proposed methodology.

3.1 Data Collection

We used two datasets to build the measure: one for the COVID-19 tweets from a particular region and the second for the COVID-19 PCR results from that same region. Numerous datasets have been developed for supervised classification of Twitter data. These datasets are made up of tweets that have been manually tagged with a single opinion group by human experts. Although some datasets have numerical labels related to the strength of emotions, positive and negative labels are the most common. We used a dataset from an open-source platform ^[17] that provides Twitter sentiment datasets. This dataset contains more than 100,000 tweets from around the world. The COVID-19 PCR results were obtained from ^[18], a nonprofit open-source platform. This data contains the number of PCR test samples, their dates, times, and results (negative or positive). The platform provides extensive statistics related to COVID-19.

3.2 NLP Transformation

First, we cleaned the data by removing the unimportant columns from the Twitter dataset. Then, we employed the following NLP-based pre-processing techniques.

· Number and Punctuation Removal: We removed numbers and punctuation from the text, as they do not convey any emotion.

· Lowercase: We converted all uppercase letters to lowercase. As a result, many terms are merged, and the dimensionality of the problem is reduced.

· Mention and URL Removal: Tweets often contain user mentions and URLs. They do not carry sentiment, so we removed them during pre-processing.

· Stop Word Removal: Stop words appear frequently in almost all sentences. There is no need to analyze them because they do not provide valuable information. The list of these terms is not completely predefined and depends on the application, but such words can simply be deleted.

· Contraction Replacement: We expanded contractions, so that a phrase like “I don’t want” is converted to “I do not want”. Interpreting a body of text to explain the opinion it conveys is a central feature of sentiment analysis.
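The pre-processing steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `preprocess`, the tiny `CONTRACTIONS` and `STOP_WORDS` lists, and the order of the steps (URLs and mentions are removed before punctuation so that they are still recognizable) are hypothetical choices, not details taken from the paper.

```python
import re

# Hypothetical, deliberately tiny lists; a real run would use fuller ones.
CONTRACTIONS = {"don't": "do not", "can't": "cannot", "isn't": "is not", "it's": "it is"}
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "i", "it"}


def preprocess(tweet: str) -> str:
    """Apply the cleaning steps from the list above to one tweet."""
    text = tweet.lower().replace("’", "'")              # lowercase, normalize apostrophes
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # drop URLs
    text = re.sub(r"@\w+", " ", text)                   # drop user mentions
    for short, full in CONTRACTIONS.items():            # expand contractions
        text = text.replace(short, full)
    text = re.sub(r"[^a-z\s]", " ", text)               # drop numbers and punctuation
    tokens = [t for t in text.split() if t not in STOP_WORDS]  # drop stop words
    return " ".join(tokens)


print(preprocess("@user I don't want COVID-19!! https://t.co/abc"))  # do not want covid
```

Expanding contractions before stripping punctuation matters here: otherwise the apostrophe would be removed first and “don't” could no longer be matched.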

3.3 Text Classification

Here, text classification refers to the classification of tweets. This subsection briefly describes how to classify text based on sentiment and how to convert the text into a machine-readable form. A sentiment library typically assigns a positive or negative value to the text, known as polarity. The sign of the polarity score is often used to infer whether the general mood is positive, neutral, or negative.

We used an NLP library named TextBlob, whose ‘sentiment’ property exposes a ‘polarity’ value. The polarity is a float that lies between -1 and 1, where -1 refers to negative and 1 refers to positive sentiment. Some tweets and their respective sentiments according to their polarity scores are shown in Table 1.
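As a minimal sketch of how a polarity score can be turned into a sentiment label, the function below maps the sign of the score to a class. The name `label_from_polarity` and the treatment of an exact zero as neutral are assumptions made for illustration; the paper does not state its thresholds. The block avoids a hard dependency on TextBlob and shows in a comment where the score would come from.

```python
def label_from_polarity(polarity: float) -> str:
    """Map a polarity score in [-1, 1] to a sentiment label."""
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity must lie in [-1, 1]")
    if polarity > 0:
        return "Positive"
    if polarity < 0:
        return "Negative"
    return "Neutral"  # assumed handling of an exact zero score


# With TextBlob installed, the score would come from:
#   from textblob import TextBlob
#   polarity = TextBlob(text).sentiment.polarity
for score in (0.03, 0.22, -0.12):  # scores reported in Table 1
    print(score, label_from_polarity(score))
```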

So far, we have completed data preprocessing and sentiment scoring. However, we need to reduce the dimensionality of the data. To do this, we employed a stemming procedure. Stemming is the process of removing a suffix from a word and reducing it to its base form. Stemming is a natural language processing normalization strategy that reduces the amount of computation involved.

After stemming, a bag-of-words model, or BOW for short, is used as a method of extracting features from the text for use in modeling, such as with machine learning algorithms. The method is simple and adaptable and can be used to extract features from documents in a variety of ways. It is a representation of text that describes how often words appear in a document. It is called a “bag” of words because all the details about the word order or the structure of the document are discarded. The model only cares whether known terms appear in the text, not where they appear. The assumption is that documents with similar content are similar.

Until now, all major preparation steps were performed on the data. All we need now is to convert the words to vector form, because machines only understand vectors. Vectorization is a mechanism by which text data is translated into a machine-readable format. To achieve this, we used a technique called count vectorization. A count vectorizer helps us both to create bags of words and to convert them later into vector form. The result is an encoded vector with the length of the entire vocabulary and an integer count of how many times each word appears in the text.

The vectors returned by the transform() function are sparse vectors, and we can use the toarray() function to convert them to NumPy arrays to investigate and better understand them. For this purpose, we used the CountVectorizer tool from the sklearn library. Since the vocabulary is so large, it is important to keep the size of the feature vectors to a minimum. The 700 most common terms (features) are used in this work.

It is also worth noting that we set min_df = 2 and ngram_range = (1, 3). Setting min_df = 2 indicates that a term must appear in at least two texts to be included in the vocabulary. The term ngram_range refers to the range of n-gram sizes used to cut a sentence. Let us say we have a sentence, “I am a child.” If we cut the sentence into bigrams (n = 2), the sentence will be cut as [“I am”, “am a”, “a child”].
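To make the n-gram and document-frequency settings concrete, here is a standard-library sketch of what a count vectorizer does. It is a simplified stand-in, not scikit-learn's implementation: tokenization is plain whitespace splitting, and the helper names `ngrams` and `count_vectorize` are hypothetical. The keyword arguments mirror the `min_df`, `ngram_range`, and `max_features` parameters of scikit-learn's `CountVectorizer`.

```python
from collections import Counter


def ngrams(tokens, n_min=1, n_max=3):
    """Return all word n-grams with n_min <= n <= n_max, in order."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(n_min, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def count_vectorize(docs, min_df=2, ngram_range=(1, 3), max_features=700):
    """Build a vocabulary and one count vector per document."""
    per_doc = [Counter(ngrams(d.lower().split(), *ngram_range)) for d in docs]
    doc_freq = Counter(term for counts in per_doc for term in counts)
    total = Counter()
    for counts in per_doc:
        total.update(counts)
    # Keep terms seen in at least min_df documents, then the most frequent ones.
    kept = [t for t in total if doc_freq[t] >= min_df]
    kept = sorted(kept, key=lambda t: (-total[t], t))[:max_features]
    vocab = sorted(kept)
    vectors = [[counts[t] for t in vocab] for counts in per_doc]
    return vocab, vectors


print(ngrams("I am a child".split(), 2, 2))  # ['I am', 'am a', 'a child']
vocab, vectors = count_vectorize(["stay home stay safe", "stay safe"])
print(vocab)    # ['safe', 'stay', 'stay safe']
print(vectors)  # [[1, 2, 1], [1, 1, 1]]
```

In the second example, terms such as “home” are dropped because they occur in only one of the two documents, which is the effect of `min_df = 2`.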

Table 1. Polarity Table.

Text	Polarity Score	Sentiment
Risk dying statistics related.	0.03	Positive
Hospital blood component need Plasma b ve COVID-19 recover	0.22	Positive
Coronavirus COVID-19 deaths continue rise Smelled scent hand sanitizers	-0.12	Negative

3.4 Feature Extraction

To extract suitable features and build a prediction model, various machine learning algorithms were combined with natural language processing tasks to predict the sentiments. Those AI classifiers and their hyperparameters are listed in Table 2.

In the integration of the AI classifiers and NLP feature extraction, a total of 700 features were selected to build the models. In NLP, features are words, so there can be many of them, and it is impractical to look at all 700 coefficients at the same time in one figure. The bar chart in Fig. 2 therefore shows only the 10 largest and 10 smallest coefficients in the linear SVM model, and the bars indicate the size of each coefficient. Red bars indicate negative words (lowest coefficients) like “bad” and “worst”, while blue bars show positive words (highest coefficients) like “health” and “true”.

Fig. 2. The 10 largest and 10 smallest feature coefficients.
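Selecting the extreme coefficients for a chart like Fig. 2 can be sketched as follows. The function name `extreme_coefficients` and the sample weights are hypothetical and are not values from the paper. With a fitted scikit-learn linear model, the weights would typically come from the model's `coef_` attribute and the names from the vectorizer's `get_feature_names_out()`, which is an assumption about the workflow rather than something the paper states.

```python
def extreme_coefficients(feature_names, coefficients, k=10):
    """Return the k most negative and k most positive (feature, weight) pairs."""
    if len(feature_names) != len(coefficients):
        raise ValueError("one coefficient is required per feature")
    if k < 1:
        raise ValueError("k must be at least 1")
    ranked = sorted(zip(feature_names, coefficients), key=lambda pair: pair[1])
    return ranked[:k], ranked[-k:][::-1]


# Made-up weights for illustration only; they are not values from the paper.
names = ["bad", "worst", "health", "true", "mask"]
weights = [-0.9, -1.2, 0.8, 0.6, 0.1]
negative, positive = extreme_coefficients(names, weights, k=2)
print(negative)  # [('worst', -1.2), ('bad', -0.9)]
print(positive)  # [('health', 0.8), ('true', 0.6)]
```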

Table 2. AI Classifiers with Hyperparameters.

AI Classifiers	Hyperparameters
SVM	param_grid = lr2_param, verbose = 1, cv = kfold, n_jobs = -1, scoring = roc_auc; lr2_param = (dual = True, C = 0.05, class_weight = balanced, loss = hinge)
Naïve Bayes	mlp_param_grid = [alpha = 0.03, binarize = 0.001], cv = kfold, scoring = roc_auc, n_jobs = 1, verbose = 1
SLP	hidden_layer_sizes = 1, activation = logistic, solver = sgd, alpha = 0.1, learning_rate = constant, max_iter = 1000
LR	param_grid = [lr2_param], cv = kfold, scoring = roc_auc, n_jobs = 1, verbose = 1; lr2_param = (penalty = l2, dual = False, C = 0.05, class_weight = balanced)
MLP	mlp_param_grid = [hidden_layer_sizes = 5, activation = relu, solver = adam, alpha = 0.3, learning_rate = constant, max_iter = 1000], param_grid = mlp_param_grid, cv = kfold, scoring = roc_auc, n_jobs = -1, verbose = 1

4. Results

The results section is divided into three subsections. We first discuss accuracy and elapsed time for sentiment prediction, then the regional sentiment analysis, and finally the regional PCR analysis together with the time- and region-wise Twitter-based PCR comparison.

4.1 Sentiment Prediction

After applying all AI classifiers and feature extraction to the dataset with their best hyperparameters, we found that the SVM classifier achieved the highest accuracy rate of 93.78% with an elapsed time of 1.2 s. In our case, prediction entails providing the optimal label for a text. We monitor this capability using the accuracy rate metric and the elapsed time. For a given dataset, the accuracy rate is the proportion of accurate predictions. Statistically speaking, with an accuracy rate of about 93%, we should expect roughly 93 accurate predictions for every 100 made by the proposed SVM. Elapsed time is the amount of “wall clock” time from the start of model training to the end of model testing. The results of all implemented classifiers are shown in Table 3.
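The two metrics described above can be computed with a short sketch like the one below. The helper names `accuracy_rate` and `timed` and the toy labels are illustrative assumptions; in the paper's setting, the timed call would wrap model training and testing rather than the metric itself.

```python
import time


def accuracy_rate(y_true, y_pred):
    """Percentage of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


def timed(fn, *args, **kwargs):
    """Run fn and return (result, wall-clock seconds elapsed)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Toy labels for illustration only.
y_true = ["pos", "neg", "pos", "pos"]
y_pred = ["pos", "neg", "neg", "pos"]
score, seconds = timed(accuracy_rate, y_true, y_pred)
print(f"accuracy = {score:.2f}%, elapsed = {seconds:.6f}s")
```

`time.perf_counter()` is used because it measures wall-clock time with high resolution, which matches the definition of elapsed time given in the text.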

We also compare our results with some of the existing work in Table 4.

Table 3. Accuracy Rates and Elapsed Time.

AI Classifiers	Accuracy Rate (%)	Elapsed Time (s)
SVM	93.78	1.2
Naïve Bayes	90	1.0
Single Layer Perceptron	54	12.9
Logistic Regression	93.21	2.4
Multi-Layer Perceptron	93.73	60

Table 4. Comparison of Accuracy Rates.

Related Work	Implemented Classifier	Accuracy Rate (%)
19	CNN	89
17	LSTM	84.3
21	Decision Trees	91.81
16	BiLSTM + attention + CRF	85.94
4	Naive Bayes	75
4	SVM	78
4	Multinomial Naive Bayes	86
Our Model	SVM	93

4.2 Regional Sentiment Analysis

This work also includes a regional factor for assigning emotions. In this regard, the 10 countries have different tweets related to COVID-19. We thoroughly analyzed the texts and found the percentage of positive and negative tweets in these regions. Here, positive tweets are those indicating that people are serious about COVID-19, and negative tweets are those indicating disregard for COVID-19, such as claims that COVID-19 is not harmful. The percentages of positive and negative tweets for the regions are shown in Figs. 3 and 4.

In Figs. 3 and 4, the x-axis shows the percentage of positive and negative opinions, and the y-axis shows the regions. Different regions showed different opinion behavior about COVID-19. Australia had the highest share of negative tweets about COVID-19 at 32%, followed by the Philippines and the United Kingdom with 31% each. Ireland was fourth with 25% negative opinion, and India was fifth with 24%. The USA had a rate of 23%, China had 22%, South Africa had 21%, and Pakistan and Switzerland had 18% each.

Fig. 3. Opinion behaviors of the first five regions in terms of positive and negative opinions about COVID-19.

Fig. 4. Opinion behaviors of the remaining five regions in terms of positive and negative opinions about COVID-19.

4.3 Regional PCR Analysis

We analyzed PCR results from a nonprofit open-source platform. This platform keeps daily records for each country, such as how many PCR test samples were collected in which country, on what day, at what time, and what results were obtained, here from July 25 to August 29, 2020. Over this 36-day interval, the USA had a total of 88,906 PCR tests per million, of which 5,457.2 were positive, so the USA had a 6.13% ratio of positive cases. Over the same interval, India had a total of 16,899 PCR tests per million, of which 1,519.48 were positive, so India had an 8.99% ratio of positive cases. The 36-day PCR results for the USA are shown in Table 5 and those for India in Table 6.
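The positive ratio quoted above is simply the share of positive results among all tests. A minimal sketch, using the 36-day totals given in the text (the function name `positive_ratio` is an assumption), reproduces the reported percentages to within rounding:

```python
def positive_ratio(total_tests_per_million, positive_per_million):
    """Share of PCR tests that came back positive, in percent."""
    if total_tests_per_million <= 0:
        raise ValueError("total tests must be positive")
    return 100.0 * positive_per_million / total_tests_per_million


# 36-day totals quoted in the text above.
usa = positive_ratio(88906, 5457.2)
india = positive_ratio(16899, 1519.48)
print(f"USA: {usa:.3f}%  India: {india:.3f}%")
```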

In the same way, we extracted the positive results for all required regions. The PCR results of all required regions are shown in Table 7.

We can see that South Africa had the highest positive PCR ratio with 21.94%. The Philippines had the second-highest ratio with 10.5%, followed by India with 8.99%. The USA was fourth with 6.13% among the 10 countries in the dataset. Pakistan was fifth with 2.89%, and Switzerland was sixth with 2.25%. Ireland had 1.13%, and the United Kingdom had 0.90%. China was second to last with a low positive PCR ratio of 0.57%, and Australia had the lowest positive PCR ratio with 0.56%.

Now that we have both the opinions and the PCR results of the regions, we can compare them to build a measure for a future pandemic and estimate the contradiction or similarity between a region’s opinion about COVID-19 and its PCR results. Fig. 5 shows the trend of the Twitter-based PCR ratio for the USA, India, China, Australia, and Ireland. Fig. 6 shows the trend of the Twitter-based PCR ratio in Switzerland, South Africa, Pakistan, the United Kingdom, and the Philippines. In Figs. 5 and 6, the cornflower blue line along the x-axis is for negative opinion, the turquoise line is for positive PCR results, and the y-axis is for regions.

As an example, in the USA, the contradiction rate between negative opinions and positive PCR results was 18% because 24% of the opinions from the USA region were negative about COVID-19 (for example, expressing disbelief in COVID-19), and 6% of the PCR results in the region were positive. If the administration knows a region’s contradiction rate, it can take the necessary steps accordingly. For example, if another pandemic arrives in the future and there are no test kits to determine whether people are positive or negative for the related disease, then the only thing available would be people’s opinions about it. At that time, one can correlate the findings of this experiment with the coming pandemic.
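The contradiction rate used in this comparison can be written as a one-line calculation. The sketch below assumes it is the absolute gap, in percentage points, between the negative-opinion share and the positive PCR ratio; the paper describes it as a difference without giving a formula, and the function name `contradiction_rate` is hypothetical.

```python
def contradiction_rate(negative_opinion_pct, positive_pcr_pct):
    """Absolute gap, in percentage points, between opinion and PCR results."""
    return abs(negative_opinion_pct - positive_pcr_pct)


# Rounded figures as quoted in the text (South Africa's appear in Section 5).
print(contradiction_rate(24, 6))   # USA: 18
print(contradiction_rate(21, 21))  # South Africa: 0
```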

Fig. 5. Time & region-wise Twitter-based PCR analysis of the first five regions.

Fig. 6. Time & region-wise Twitter-based PCR analysis of the remaining five regions.

Table 5. USA Total PCR Tests with Positive PCR Results of 36 days.

Date	Total Test Per Million	Positive Results
25/07/2020	2,905	197.98
25/07/2020	2,856	195.68
..	..	..
..	..	..
28/08/2020	2,549	125.91
29/08/2020	2,549	125.74

Table 6. INDIA Total PCR Tests with Positive PCR Results of 36 days.

Date	Total Test Per Million	Positive Results
25/07/2020	250	31.87
25/07/2020	259	32.86
..	..	..
..	..	..
28/08/2020	622	50.55
29/08/2020	612	51.53

Table 7. Positive PCR Results of Required Regions.

Country	Positive PCR Ratio (%)
USA	6.13
India	8.99
China	0.57
Australia	0.56
Ireland	1.13
Switzerland	2.25
South Africa	21.94
Pakistan	2.89
United Kingdom	0.90
Philippines	10.50

5. Discussion & Recommendations

This research can help administrations make many decisions effectively. According to our findings, government decisions would be better justified if governments knew how the people of a region think. Knowing this, the administration can take the necessary decisions. If we take the example of the South African region, we see that there is 0% contradiction between people’s opinions and positive PCR results. South Africa had 21% negative opinion of the total count, and the same 21% positive PCR ratio, as shown in Fig. 7.

In Fig. 7, it is also clear that for Australia there is a large contradiction between the ratio of people’s negativity about COVID-19 and the PCR-positive ratio. The reason is that the government of Australia took strict action to control the virus. For the future, the Australian administration has statistics on the region. It knows that people showed carelessness toward COVID-19 in their opinions, but the result was the opposite. So, it can repeat its policies and strategies for controlling COVID-19, while knowing that this region had no positive thoughts about COVID-19.

So, we can say that in a future pandemic, if the SA region has negative thoughts about the pandemic, then the region will likely be highly affected by it. In this case, SA will require more attention and priority than other areas such as China, which has a difference of 21% between opinion and PCR results. The following recommended steps can be taken to avoid and control future pandemics, according to regional behavior.

· Efficient resource allocation

· Efficient regional monitoring

· Region-wise health care steps

· Urgent lockdowns

· Region-specific awareness campaign

· Apprehend people spreading propaganda

· Region-specific relief fund

· Traveling restrictions

· Social protection

· Avoiding gatherings

Fig. 7. Contradiction ratio of SA and Australia.

6. Conclusion

In this work, we took the COVID-19 tweets and PCR results of 10 regions for a specific time interval. The priority was to predict the sentiments of the regions with high accuracy and low elapsed time. To accomplish this goal, different AI classifiers and NLP techniques were implemented. We achieved a 93.78% accuracy rate and an elapsed time of 1.2 s with the SVM model, which is better than the existing work we compared against. In addition, we collected the PCR test results for COVID-19 in the same regions and compared them with the sentiment rates. We found remarkable differences among these regions. South Africa had almost a 0% difference between negative opinion and positive PCR results, and the USA had a difference of 18% between thoughts and practical behavior. This work is beneficial for any future pandemic. If we know the sentiments of regions about a pandemic, then we can predict the actual behavior in that pandemic with the help of the proposed study. It helps in tracking a region’s behavior toward future pandemics and provides a platform for taking the necessary measures.

ACKNOWLEDGMENTS

This work was supported by the National Research Foundation (NRF), Korea, under project BK21 FOUR (F21YY8102068).

REFERENCES

[1] S. Sangwan et al., “Social Media Sentiment Analysis- A Relative Study on Twitter Dataset,” in Proc. of 2022 6th International Conference on Trends in Electronics and Informatics (ICOEI), pp. 436-441, Apr. 2022.

[2] V. Israel-Turim et al., “Who Did Spanish Politicians Start Following on Twitter? Homophilic Tendencies among the Political Elite,” Social Sciences, vol. 11, no. 7, pp. 292, Jul. 2022.

[3] I. Deutscher et al., “Sentiments and Acts,” Berlin, Boston: De Gruyter, Dec. 2021.

[4] B. Heredia, T. M. Khoshgoftaar, J. Prusa and M. Crawford, “Cross-Domain Sentiment Analysis: An Empirical Investigation,” in Proc. of 2016 IEEE 17th International Conference on Information Reuse and Integration (IRI), pp. 160-165, Jul. 2016.

[5] J. Wei and K. Zou, “EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks,” arXiv, Aug. 2019.

[6] H. Keshavarz and M. Abadeh, “ALGA: Adaptive lexicon learning using genetic algorithm for sentiment analysis of microblogs,” Knowledge-Based Systems, pp. 1-16, Apr. 2017.

[7] S. Kobayashi, “Contextual Augmentation: Data Augmentation by Words with Paradigmatic Relations,” Computation and Language, arXiv, May 2018.

[8] J. S. Vimali and S. Murugan, “A Text Based Sentiment Analysis Model using Bi-directional LSTM Networks,” 2021 6th International Conference on Communication and Electronics Systems (ICCES), pp. 1652-1658, 2021.

[9] S. Kobayashi, “Contextual Augmentation: Data Augmentation by Words with Paradigmatic Relations,” Computation and Language, arXiv, May 2018.

[10] H. Zhang, S. Sun, Y. Hu, J. Liu and Y. Guo, “Sentiment Classification for Chinese Text Based on Interactive Multitask Learning,” IEEE Access, vol. 8, pp. 129626-12963, Jul. 2020.

[11] S. Wen et al., “Memristive LSTM Network for Sentiment Analysis,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 51, no. 3, pp. 1794-1804, Mar. 2021.

[12] S. Thota, S. P. Hanish and Y. Raju, “Opinion Mining of Twitter Data Using Machine Learning,” in EBSCO, International Journal of Advanced Research Computer Science, vol. 11, pp. 92-95, May 2020.

[13] A. M. Alharbi and E. Doncker, “Twitter sentiment analysis with a deep neural network: An enhanced approach using user behavioral information,” Cognitive Systems Research, pp. 50-61, May 2019.

[14] S. E. Saad and J. Yang, “Twitter Sentiment Analysis Based on Ordinal Regression,” IEEE Access, vol. 7, pp. 163677-163685, Nov. 2019.

[15] D. Malik and G. Munjal, “Reviewing Classification Methods on Health Care,” Intelligent Healthcare. EAI/Springer Innovations in Communication and Computing, Jul. 2021.

[16] A. Ikram, M. Kumar and G. Munjal, “Twitter Sentiment Analysis using Machine Learning,” in Proc. of 12th International Conference on Cloud Computing, Data Science & Engineering (Confluence), pp. 629-634, Mar. 2022.

[17] G. Preda, “COVID-19 Tweets,” on Kaggle, Dec. 2020. Available on:

[18] Ourworldindata, Oct. 2022.

Author

Ghulam Musa Raza

Ghulam Musa Raza received his BS degree in Computer Sciences from COMSATS University Islamabad in 2019. His major in BS was Intelligent Robotics. He received his MS degree in Computer Sciences from SEECS, NUST Islamabad in 2021. His research interest during his Master's was Natural Language Processing (Artificial Intelligence). From 2017 to 2019, he worked as a Software Engineer at Snaky Solutions Pvt Limited. He served as a Machine Learning-based Research Assistant in the TUKL lab, NUST Islamabad at the start of 2021. He served as a Lecturer at Alhamd Islamic University, Islamabad from 2021 to 2022. His major interests are in the fields of Natural Language Processing, Internet of Things (IoT), Information Centric Networking, and Named Data Networking. He is currently pursuing the Ph.D. degree with the Department of Communication and Software Engineering in Graduate School, Hongik University, South Korea.

Byung-Seo Kim

Byung-Seo Kim received his B.S. degree in electrical engineering from In-Ha University, In-Chon, Korea, in 1998 and his M.S. and Ph.D. degrees in electrical and computer engineering from the University of Florida in 2001 and 2004, respectively. His Ph.D. study was supervised by Dr. Yuguang Fang. Between 1997 and 1999, he worked for Motorola Korea Ltd., PaJu, Korea, as a computer integrated manufacturing (CIM) engineer in advanced technology research and development (ATR&D), and he was the chairman with the Department of Software and Communications Engineering, Hongik University, South Korea, where he is currently a professor. He served as the General Chair for 3rd IWWCN 2017 and the TPC member for the IEEE VTC 2014-Spring and the EAI FUTURE2016 and ICGHIC 2016 2019 conferences. He served as a guest editor of special issues of the International Journal of Distributed Sensor Networks (SAGE), IEEE Access, and Journal of the Institute of Electronics and Information Engineers. His work has appeared in around 167 publications and 22 patents. He is an IEEE Senior Member and Associate Editor of IEEE Access. His research interests include the design and development of efficient wireless/wired networks, including link-adaptable/cross-layer-based protocols, multi-protocol structures, wireless CCNs/NDNs, mobile edge computing, physical layer design for broadband PLC, and resource allocation algorithms for wireless networks.
