Subhash Mondal 1,2
Mithun Karmakar 1
Amitava Nag 1
1 Computer Science and Engineering, Central Institute of Technology Kokrajhar, Kokrajhar, Assam, India {ph22cse1001, m.karmakar, amitava.nag}@cit.ac.in
2 Computer Science and Engineering (AI & ML), Dayananda Sagar University, Bengaluru, Karnataka, India mywork.subhash@gmail.com
Copyright © The Institute of Electronics and Information Engineers (IEIE)
Keywords
Obesity, Machine learning, Classification, Obesity features, SelectKBest, Feature Subset (FSS)
1. Introduction
With the global growth of obesity cases in the past few decades, there has been an
exponential rise in morbidity and death rates. According to the World Health Organization
(WHO), obesity is a condition where excess fat accumulates in the human body, which
results in severe health issues (Obesity and Overweight, 2021). Obesity accounts for
approximately 2.8 million deaths worldwide (Global health risks: mortality and burden
of disease attributable to selected major risks, 2009), making it one of the most prevalent
risk factors for mortality. Individuals diagnosed with this condition generally tend to
suffer from other health ailments such as stroke, diabetes, and sometimes various types
of cancer [3], which reduces their life expectancy.
Various types of obesity are prevalent (Miklishanskaya, Solomasova, & Mazur, 2021).
The driving forces for obesity are unhealthy food habits, hereditary factors, physical
inactivity, and lack of awareness regarding health issues. Obesity has caused a widespread
effect on individuals of all age groups, and the number of children affected by the
disease is increasing at an alarming rate worldwide.
There has been a surge in the percentage of obese children aged 6-7 years (Obesity,
2021). According to reports, around 39 million children under 5 years old were affected
by this disease in 2020 (Obesity and Overweight, 2021). Children suffering from obesity
generally tend to become obese adults (Whitaker, Wright, Pepe, Seidel, & Dietz, 1997),
and they become susceptible to the adverse effects of adult obesity (Khan, et al., 2014),
(Mollard, et al., 2014), (Wing, et al., 2013), (Wadden, et al., 2011). Thus,
there is huge interest among researchers in formulating an effective artificial intelligence
(AI)-based solution for monitoring obesity. In recent years, numerous AI solutions
have been proposed for obesity classification. Besides diagnosing the disease (Devi,
Bai, & Nagarajan, 2020), identifying the relevant features contributing to obesity
can also help in the early monitoring of the disease. The features and factors contributing
to obesity [12] have recently attracted much attention, both globally and regionally (Filho,
Gonçalves, de Lira, Ribeiro, Eickmann, & Lima, 2022).
In this study, the following two research questions are addressed:
· Research Question 1 (RQ1): With what degree of accuracy can obesity be classified
using the complete feature set (FS) of 16 features?
· Research Question 2 (RQ2): Can a smaller feature subset ($\mathrm{FSS}$) classify
the disease while retaining over 90% of the accuracy obtained in RQ1?
RQ1 examines the viability of classifying obesity using the complete set of 16 features
and determines the degree of accuracy that can be attained. RQ2 then investigates whether
a smaller subset of features (FSS) can deliver comparable classification accuracy while
maintaining the 90% threshold.
To answer RQ1 and RQ2, three machine learning (ML) algorithms were studied based on
their performance: logistic regression (LR), support vector machine (SVM), and XGBoost
(XGB). A publicly available dataset was used for the study, and a detailed comparative
analysis of the performance of each feature subset was carried out. We believe this study
will serve as a benchmark for understanding the correlation between the different features
leading to obesity.
1.1 Study Design and Setting
The study design and workflow for RQ1 comprise the following steps.
· Dataset acquisition
· Selection of two best-performing machine learning models for classification
· The addition of XGB as a third ML algorithm for the study
· Calculating performance-metric values based on all the features using the three
models on the complete dataset.
The following are the steps for the study design and the workflow for RQ2.
· Creating feature subsets (each subset contains four features)
· Classification of obesity using the three algorithms and feature subsets
· Determining the best feature subset out of the complete feature set based on the
performance-metric values
1.2 Dataset Acquisition
We performed the study using the publicly available Obesity Levels dataset from the UCI
Machine Learning Repository (Estimation of obesity levels based on eating habits and
physical condition Data Set, 2019). The dataset consists of 2111 instances with 17
attributes (columns); the 16 features used in this study are shown in Table 1.
Some of the attributes in the dataset are weight, age, family history of overweight,
transportation used, etc. A complete description of each attribute is presented in
Table 1.
Table 1 lists only 16 features because the serial number was omitted. The target variable
had the following labels: about 287 instances were Normal Weight, and the remaining 1824
were distributed among Insufficient Weight, Overweight Level I, Overweight Level II,
Obesity Type I, Obesity Type II, and Obesity Type III. The dataset page states that 77%
of the data were synthetically generated using the Weka tool and the SMOTE filter, and
23% were collected directly from users through a web platform. Since this is a preliminary
study to judge whether the research questions can be addressed, this dataset establishes
the basis for further exploration with more clinical, user-defined datasets.
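For illustration, a minimal sketch of how this dataset can be loaded for the experiments is given below, assuming the CSV file has been downloaded locally from the UCI repository; the file name used here is an assumption.

```python
# Illustrative sketch: loading the UCI "Estimation of Obesity Levels" dataset.
# The local CSV file name below is an assumption.
import pandas as pd

df = pd.read_csv("ObesityDataSet_raw_and_data_sinthetic.csv")

print(df.shape)                          # expected: (2111, 17)
print(df["NObeyesdad"].value_counts())   # distribution of the seven obesity levels

X = df.drop(columns=["NObeyesdad"])      # 16 candidate features
y = df["NObeyesdad"]                     # target variable
```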
Table 1. 16 features extracted from datasets.
Feature Set (FS) | Description
Gender | Gender of the individual
Age | Age of the individual
Height | Height of the individual
Family history with overweight | Has any family history of overweight
FAVC | Frequent consumption of high-caloric food
FCVC | Frequency of consumption of vegetables
NCP | Number of main meals
CAEC | Consumption of food between meals
SMOKE | Does the individual smoke or not
CH2O | Consumption of water daily
SCC | Calorie consumption monitoring
FAF | Physical activity frequency
TUE | Time using technology devices
CALC | Consumption of alcohol
MTRANS | Transportation used
NObeyesdad | Target variable (obese or not)
2. Related Study
The performance of related AI methods is presented based on different evaluation metrics
in Table 2. The performance metrics used for the analysis were accuracy (Ac), precision (Pr),
recall (Re), F1 score (Fs), and Area under the Receiver Operating Characteristic Curve
(ROC-AUC) score (Ra). The performance results in Table 2 were analyzed for model selection
for RQ1. For example, Table 2 includes results from proposals in the literature that use
SVM (Celik, Guney, & Dengiz, 2021). For LR, it can be observed that the accuracy reported
for the classification of obesity in one study (Ferdowsy, Rahi, Jabiullah, & Habib, 2021)
is better than the accuracies in other studies (Pang, Forrest, Lê-Scherban, & Masino, 2021),
(Zheng & Ruggiero, 2017), (Thamrin, Arsyad, Kuswanto, Lawi, & Nasir, 2021). It can also be
observed that, across all the performance evaluation parameters for the proposals using LR,
that method presented the best results (Ferdowsy, Rahi, Jabiullah, & Habib, 2021).
For DT, one method (Ferdowsy, Rahi, Jabiullah, & Habib, 2021) reflects the best results.
For RF, one method (Dugan, Mukhopadhyay, Carroll, & Downs, 2015) exhibits the best accuracy,
while another (Singh & Tawfik, 2020) presents the best Pr and Re values. One study
(Zheng & Ruggiero, 2017) claimed the best accuracy among the proposals using KNN algorithms,
and for the classification of obesity using NN, one proposal (Celik, Guney, & Dengiz, 2021)
claims the best accuracy.
XGB has previously been used to classify obesity, which motivated us to implement XGB in
this study. Although that method (Pang, Forrest, Lê-Scherban, & Masino, 2021) does not
exhibit very impressive accuracy or other performance metrics, its XGB implementation
presents better results than the other models it compares. From the analysis in Table 2,
the best accuracies were produced by SVM (97.8%) (Celik, Guney, & Dengiz, 2021) and by an
implementation of LR (97.09%) (Ferdowsy, Rahi, Jabiullah, & Habib, 2021). Hence, we used
these two algorithms along with the XGB model for this study.
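As an illustration of how the metrics compiled in Table 2 (Ac, Pr, Re, Fs, Ra) can be computed for a multi-class obesity classifier, a short scikit-learn sketch follows; y_test, y_pred, and y_proba are placeholder variables for held-out labels, predicted labels, and predicted class probabilities.

```python
# Illustrative sketch: computing the metrics reported in Table 2 for a
# multi-class classifier; y_test, y_pred, and y_proba are placeholders.
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

ac = accuracy_score(y_test, y_pred)
pr = precision_score(y_test, y_pred, average="weighted")
re = recall_score(y_test, y_pred, average="weighted")
fs = f1_score(y_test, y_pred, average="weighted")
# ROC-AUC for multi-class problems needs class probabilities and an averaging scheme.
ra = roc_auc_score(y_test, y_proba, multi_class="ovr", average="weighted")

print(f"Ac={ac:.4f}  Pr={pr:.4f}  Re={re:.4f}  Fs={fs:.4f}  Ra={ra:.4f}")
```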
Table 2. Performance metric statistics for obesity classification of different models.
Models | # Ref | Ac (%) | Pr (%) | Re (%) | Fs (%) | Ra (%)
Support Vector Machine (SVM) | [17] | 64.79 | 29.99 | - | 43.63 | -
| [16] | 66.02 | 53.00 | 66.00 | - | -
| [15] | 97.8 | 97.7 | - | 56.00 | -
| [21] | - | 85.00 | 78.00 | - | -
| [23] | - | - | - | - | 90.54
Logistic Regression (LR) | [17] | 64.81 | 30.00 | - | 43.65 | -
| [16] | 97.09 | 97.00 | 97.00 | 97.00 | -
| [18] | 56.02 | - | - | - | -
| [19] | 72.24 | 69.55 | - | 71.49 | 79.80
Decision Tree (DT) | [17] | 64.61 | 29.70 | - | 43.28 | -
| [16] | 70.30 | 57.00 | 70.00 | 61.00 | -
Random Forest (RF) | [16] | 72.30 | 57.00 | 72.00 | 63.00 | -
| [21] | - | 84.00 | 82.00 | - | -
| [23] | - | - | - | - | 87.98
| [24] | 81.01 | - | - | - | -
| [20] | 84.00 | - | - | - | -
K-Nearest Neighbour (KNN) | [16] | 77.50 | 79.00 | 77.00 | 77.00 | -
| [21] | - | 83.00 | 79.00 | - | -
| [23] | - | - | - | - | 88.62
| [18] | 88.82 | - | - | - | -
Neural Network (NN) | [17] | 63.67 | 29.61 | - | 43.05 | -
| [15] | 96.50 | - | - | - | -
eXtreme Gradient Boosting (XGB) | [17] | 66.14 | 30.90 | - | 44.60 | -
Naïve Bayes (NB) | [16] | 86.04 | 86.00 | 86.00 | 86.00 | -
| [20] | 65.00 | - | - | - | -
| [19] | 71.47 | 69.00 | - | 70.53 | 78.48
| [24] | 70.52 | - | - | - | -
Gaussian Naïve Bayes (GNB) | [17] | 63.23 | 29.00 | - | 42.59 | -
Bernoulli Naïve Bayes (BNB) | [17] | 61.76 | 28.06 | - | 41.47 | -
3. RQ1
The detection framework for RQ1 is shown in Fig. 1. Pre-processing was carried out to
improve the accuracy and performance of the models; it involved cleaning and organizing
the data.
Fig. 1. Obesity detection framework used in both FS and FSS.
The correlation among the numerical features of the raw dataset is shown in Fig. 2. Attributes with missing values were filled using median imputation. Categorical
attributes with two or fewer categories were encoded with a label encoder, and those
with more than two categories were one-hot encoded. Outliers in the dataset were treated
using the three-standard-deviation rule.
Fig. 2. Graphical representation of correlation of features using heat map.
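A minimal sketch of the pre-processing steps described above is shown below, assuming the dataset has been loaded into a pandas DataFrame df; the exact encoder and outlier-handling choices of the study may differ.

```python
# Illustrative pre-processing sketch: median imputation, encoding, and
# three-standard-deviation outlier treatment (details may differ from the study).
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

num_cols = df.select_dtypes(include=np.number).columns
cat_cols = df.select_dtypes(exclude=np.number).columns.drop("NObeyesdad")

# Median imputation for numerical attributes with missing values
df[num_cols] = df[num_cols].fillna(df[num_cols].median())

# Label-encode binary categorical attributes, one-hot encode the rest
binary_cols = [c for c in cat_cols if df[c].nunique() <= 2]
multi_cols = [c for c in cat_cols if df[c].nunique() > 2]
for c in binary_cols:
    df[c] = LabelEncoder().fit_transform(df[c])
df = pd.get_dummies(df, columns=multi_cols)

# Clip numerical outliers to mean +/- 3 standard deviations
for c in num_cols:
    mu, sigma = df[c].mean(), df[c].std()
    df[c] = df[c].clip(lower=mu - 3 * sigma, upper=mu + 3 * sigma)

# Encode the target labels
df["NObeyesdad"] = LabelEncoder().fit_transform(df["NObeyesdad"])
```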
After applying the data pre-processing techniques and before training the ML models, we
split the dataset into training and testing sets to improve the reliability of the trained
models. The splitting strategies considered included k-fold cross-validation, leave-p-out
cross-validation, leave-one-out cross-validation, holdout validation, and nested
cross-validation, which ensured that training and testing were performed on independent
portions of the dataset for better model performance on unseen data. The XGB, SVC, and LR
algorithms were trained on the complete feature set using the scikit-learn library
(Pedregosa, et al., 2011).
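The following sketch illustrates one of the splitting strategies mentioned above (stratified k-fold cross-validation) applied to the three classifiers on the complete feature set; the model settings shown are defaults or assumptions, not the tuned configurations reported later.

```python
# Illustrative sketch: k-fold cross-validation of the three classifiers on the
# complete feature set (one of the splitting strategies mentioned above).
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from xgboost import XGBClassifier

X = df.drop(columns=["NObeyesdad"])
y = df["NObeyesdad"]

models = {
    "XGB": XGBClassifier(eval_metric="mlogloss"),
    "SVC": SVC(probability=True),
    "LR": LogisticRegression(max_iter=1000),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    print(f"{name}: mean accuracy = {scores.mean():.4f} (+/- {scores.std():.4f})")
```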
4. RQ2
In RQ2, we selected an $\mathrm{FSS}$ from the 16 original features ($\mathrm{FS}$). The
$\mathrm{FSS}$ selection was done by applying the SelectKBest algorithm to the FS and
sorting the resulting rankings. Table 3 presents the score of each feature obtained with
the SelectKBest algorithm; this score was used to evaluate the influence of each feature
on obesity individually. The complete workflow is presented in Fig. 3.
Fig. 3. High-level diagram for RQ2.
After sorting the 16 features by their importance values, the eight highest-ranking
features were selected for further processing. The feature identifier (FI) assigns an
``id'' to each feature for ease of analysis. The four FSSs were selected from Table 3
using Eq. (1):
$FSS_{1}=\{FI_{id}\mid 1\leq id\leq 4\}$, $FSS_{2}=\{FI_{id}\mid 5\leq id\leq 8\}$, $FSS_{3}=\{FI_{id}\mid id\in\{1,3,5,7\}\}$, $FSS_{4}=\{FI_{id}\mid id\in\{2,4,6,8\}\}$, (1)
where $FI_{id}$ denotes the feature with identifier $id$ in Table 3. Note that Eq. (1)
uses $FI_{id}$ with $1\leq id\leq 8$, as discussed earlier, because only the eight
highest-ranking features are used. The four $FSS$s generated are listed in Table 4 with
their respective features.
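A possible implementation of this ranking and subset construction is sketched below; the SelectKBest scoring function (here f_classif, the ANOVA F-test) is an assumption, since the study does not state which score function was used.

```python
# Illustrative sketch: ranking features with SelectKBest and forming the four
# feature subsets of Eq. (1). The scoring function (f_classif) is an assumption.
from sklearn.feature_selection import SelectKBest, f_classif
import pandas as pd

selector = SelectKBest(score_func=f_classif, k="all").fit(X, y)
ranking = (pd.Series(selector.scores_, index=X.columns)
             .sort_values(ascending=False))
print(ranking)  # compare with the scores reported in Table 3

top8 = list(ranking.index[:8])  # FI ids 1..8 in ranked order
fss = {
    "FSS1": top8[0:4],                        # ids 1-4
    "FSS2": top8[4:8],                        # ids 5-8
    "FSS3": [top8[i] for i in (0, 2, 4, 6)],  # ids 1, 3, 5, 7
    "FSS4": [top8[i] for i in (1, 3, 5, 7)],  # ids 2, 4, 6, 8
}
```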
Table 3. Ranking of different features based on SelectKBest algorithm.
Feature Identifier (FI) | Feature Set | Performance Score
1 | Weight | 14186.71
2 | Age | 635.6424
3 | Gender | 324.9784
4 | SCC | 117.4293
5 | Family history with overweight | 113.4354
6 | MTRANS | 102.7809
7 | FAF | 71.76225
8 | FCVC | 60.32221
9 | NCP | 33.78086
10 | SMOKE | 31.46798
11 | FAVC | 27.0813
12 | TUE | 26.12619
13 | CAEC | 21.81961
14 | CALC | 21.81961
15 | CH2O | 17.40355
16 | Height | 1.066227
Table 4. Features corresponding to four different FSSs (feature subsets).
Fss1 | Fss2 | Fss3 | Fss4
Weight | Family history with overweight | Weight | Age
Age | MTRANS | Gender | SCC
Gender | FAF | Family history with overweight | MTRANS
SCC | FCVC | FAF | FCVC
5. Experimental Results
Table 5 presents the performance analysis of the shortlisted ML algorithms in terms of
their standard implementations, their hyper-parameter-tuned implementations, and a
comparison between the two. It can be observed that XGB achieves the best accuracy among
the three algorithms. In the related studies, SVC and LR exhibited the best results while
the reported XGB results were not very impressive; in our implementation, however, XGB
produces better results than the other two ML algorithms. Moreover, hyper-parameter tuning
generally improves the performance metrics further, including for XGB. We also report the
increase or decrease in performance of the tuned versions relative to the standard
implementations of the ML algorithms.
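A sketch of the hyper-parameter tuning step is given below using grid search with cross-validation; the parameter grids and the X_train/y_train variables are illustrative assumptions rather than the exact search spaces used in the study.

```python
# Illustrative sketch of hyper-parameter tuning with grid search; the parameter
# grids below are assumptions, not the exact search spaces used in the study.
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
from xgboost import XGBClassifier

param_grids = {
    "SVC": (SVC(), {"C": [0.1, 1, 10, 100], "kernel": ["rbf", "linear"],
                    "gamma": ["scale", "auto"]}),
    "XGB": (XGBClassifier(eval_metric="mlogloss"),
            {"n_estimators": [100, 300], "max_depth": [3, 6, 9],
             "learning_rate": [0.05, 0.1, 0.3]}),
}

for name, (estimator, grid) in param_grids.items():
    search = GridSearchCV(estimator, grid, cv=5, scoring="accuracy", n_jobs=-1)
    search.fit(X_train, y_train)   # X_train/y_train: placeholder training split
    print(name, search.best_params_, f"CV accuracy = {search.best_score_:.4f}")
```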
Table 5. Performance analysis of selected ML algorithms on a complete feature set containing 16 features.
Model | Ac (%) | Pr | Re | Fs | Cs | Ra
Normal Model
XGB | 96.44 | 0.96 | 0.97 | 0.96 | 0.96 | 0.99
SVC | 51.42 | 0.53 | 0.53 | 0.51 | 0.44 | 0.89
LR | 64.21 | 0.63 | 0.64 | 0.63 | 0.58 | 0.91
Tuned Model
XGB | 96.68 | 0.97 | 0.97 | 0.97 | 0.96 | 0.99
SVC | 90.04 | 0.89 | 0.90 | 0.90 | 0.88 | 0.90
LR | 63.03 | 0.61 | 0.63 | 0.62 | 0.57 | 0.92
Comparison Analysis
XGB | +0.24 | +0.01 | 0 | +0.01 | 0 | 0
SVC | +38.62 | +0.36 | +0.37 | +0.39 | +0.44 | +0.01
LR | -1.09 | -0.02 | -0.01 | +0.01 | -0.01 | +0.01
5.1 Discussion
The four feature subsets obtained were trained and tested separately to build ML models
for detecting obesity. Table 6 presents the performance analysis, tuned performance
analysis, and comparative analysis of the ML algorithms on $FSS_{1}$ through $FSS_{4}$.
We observed differences in the individual metric values across the algorithms. The
comparative analysis was conducted between the reduced and complete feature vectors: each
reduced feature vector comprises four features, while the complete feature vector comprises
16 features. This study mainly focused on identifying the most relevant features that
contribute to developing obesity.
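The per-subset evaluation described above can be sketched as follows; the fss dictionary and models mapping from the earlier sketches are assumed, and the split settings are illustrative.

```python
# Illustrative sketch: training and evaluating each model on every feature
# subset (FSS1-FSS4), mirroring the comparison reported in Table 6.
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

results = {}
for subset_name, features in fss.items():        # fss built earlier from Eq. (1)
    X_sub = X[features]
    X_tr, X_te, y_tr, y_te = train_test_split(
        X_sub, y, test_size=0.2, stratify=y, random_state=42)
    for model_name, model in models.items():     # models defined earlier
        model.fit(X_tr, y_tr)
        acc = accuracy_score(y_te, model.predict(X_te))
        results[(subset_name, model_name)] = acc
        print(f"{subset_name} / {model_name}: accuracy = {acc:.4f}")
```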
Table 6. Performance analysis of selected ML algorithms on FSS.
Model | Ac (%) | Pr | Re | Fs | Cs | Ra
Fss1: Performance Analysis
XGB | 86.49 | 0.86 | 0.85 | 0.85 | 0.84 | 0.98
SVC | 61.37 | 0.63 | 0.63 | 0.62 | 0.55 | 0.89
LR | 51.42 | 0.46 | 0.51 | 0.46 | 0.43 | 0.85
Fss1: Tuned Performance Analysis
XGB | 88.38 | 0.88 | 0.88 | 0.88 | 0.86 | 0.98
SVC | 85.87 | 0.85 | 0.84 | 0.85 | 0.83 | 0.90
LR | 53.08 | 0.49 | 0.52 | 0.48 | 0.45 | 0.86
Fss1: Comparison Analysis between Normal and Tuned ML algorithms
XGB | +1.89 | +0.02 | +0.03 | +0.03 | +0.02 | 0
SVC | +24.50 | +0.22 | +0.21 | +0.22 | +0.28 | +0.01
LR | +1.66 | +0.03 | +0.01 | +0.02 | +0.02 | +0.01
Fss2: Performance Analysis
XGB | 55.68 | 0.55 | 0.54 | 0.51 | 0.48 | 0.83
SVC | 40.04 | 0.37 | 0.39 | 0.33 | 0.29 | 0.79
LR | 39.81 | 0.37 | 0.38 | 0.33 | 0.28 | 0.74
Fss2: Tuned Performance Analysis
XGB | 54.50 | 0.55 | 0.54 | 0.54 | 0.47 | 0.84
SVC | 42.18 | 0.34 | 0.40 | 0.34 | 0.31 | 0.81
LR | 39.09 | 0.38 | 0.38 | 0.32 | 0.28 | 0.74
Fss2: Comparison Analysis between Normal and Tuned ML algorithms
XGB | -1.18 | 0 | 0 | +0.03 | -0.01 | +0.01
SVC | +2.14 | -0.03 | +0.01 | +0.01 | +0.02 | +0.02
LR | -0.72 | +0.01 | 0 | -0.01 | 0 | 0
Fss3: Performance Analysis
XGB | 81.51 | 0.82 | 0.82 | 0.82 | 0.78 | 0.97
SVC | 58.29 | 0.61 | 0.59 | 0.59 | 0.51 | 0.89
LR | 61.84 | 0.58 | 0.62 | 0.58 | 0.55 | 0.91
Fss3: Tuned Performance Analysis
XGB | 80.56 | 0.81 | 0.81 | 0.81 | 0.77 | 0.97
SVC | 76.06 | 0.76 | 0.76 | 0.76 | 0.72 | 0.90
LR | 58.53 | 0.58 | 0.59 | 0.57 | 0.51 | 0.91
Fss3: Comparison Analysis between Normal and Tuned ML algorithms
XGB | -0.95 | -0.01 | -0.01 | -0.01 | -0.01 | 0
SVC | +17.77 | +0.15 | +0.17 | +0.17 | +0.22 | +0.01
LR | -3.31 | 0 | -0.03 | -0.01 | -0.04 | 0
Fss4: Performance Analysis
XGB | 56.87 | 0.55 | 0.55 | 0.54 | 0.49 | 0.87
SVC | 34.59 | 0.25 | 0.32 | 0.26 | 0.51 | 0.84
LR | 40.28 | 0.39 | 0.37 | 0.33 | 0.23 | 0.77
Fss4: Tuned Performance Analysis
XGB | 60.18 | 0.58 | 0.59 | 0.58 | 0.53 | 0.87
SVC | 46.91 | 0.47 | 0.45 | 0.43 | 0.37 | 0.85
LR | 39.57 | 0.40 | 0.36 | 0.33 | 0.28 | 0.77
Fss4: Comparison Analysis between Normal and Tuned ML algorithms
XGB | +3.31 | +0.03 | +0.04 | +0.04 | +0.04 | 0
SVC | +12.32 | +0.22 | +0.13 | +0.17 | -0.14 | +0.01
LR | -0.71 | +0.01 | -0.01 | 0 | +0.05 | 0
6. Conclusion
In this study, we addressed two research questions concerning the effect of the complete
feature vector generated from the parameters influencing obesity and the effect of a much
smaller feature vector. Of all the parameters associated with obesity, concentrating on a
smaller parameter set yields about 91% of the classification performance obtained when all
the parameters are taken into consideration.
The study's objective was to investigate whether a smaller set of parameters can
characterize obesity to nearly the same extent as the complete set. The experimental
results show that a smaller subset of only four parameters, namely weight, age, gender,
and SCC, classifies obesity levels with almost 91% of the accuracy achieved with the
complete set of 16 parameters. The study targets individuals who feel inert when asked to
address all the factors related to obesity control and who may be more comfortable
addressing only a few factors to some extent.
REFERENCES
"Obesity and Overweight," World Health Organization, 9 June 2021. [Online]. Available:
Article (CrossRef Link) [Accessed 23 December 2021].
Global health risks: mortality and burden of disease attributable to selected major
risks, World Health Organization, 2009.
L. N. Borrell and L. Samuel, "Body Mass Index Categories and Mortality Risk in US
Adults: The Effect of Overweight and Obesity on Advancing Death," American journal
of public health, vol. 104, no. 3, pp. 512-519, March 2014.
S. V. Miklishanskaya, L. V. Solomasova and N. A. Mazur, "Types of obesity and their
prognostic value," Obesity Medicine, p. 100350, 2021.
"Obesity," Centers for Disease Control and Prevention, 21 September 2021. [Online].
Available: Article (CrossRef Link) [Accessed 23 December 2021].
R. C. Whitaker, J. A. Wright, M. S. Pepe, K. D. Seidel and W. H. Dietz, "Predicting
obesity in young adulthood from childhood and parental obesity," New England journal
of medicine, vol. 337, no. 13, pp. 869-873, 25 September 1997.
N. A. Khan, L. B. Raine, E. S. Drollette, M. R. Scudder, M. B. Pontifex, D. M. Castelli,
S. M. Donovan, E. M. Evans and C. H. Hillman, "Impact of the FITKids Physical Activity
Intervention on Adiposity in Prepubertal Children," vol. 133, no. 4, pp. 875-883,
1 April 2014.
R. C. Mollard, M. Senechal, A. C. Maclntosh, J. Hay, B. A. Wicklow, K. D. M. Wittmeier,
E. A. C. Sellers, H. J. Dean, L. Ryner, L. Berard and J. M. McGavock, "Dietary determinants
of hepatic steatosis and visceral adiposity in overweight and obese youth at risk
of type 2 diabetes," American Journal of Clinical Nutrition, vol. 99, no. 4, pp. 804-812,
April 2014.
R. R. Wing, P. Bolin, F. L. Brancati, G. A. Bray, J. M. Clark, M. Coday, R. S. Crow,
J. M. Curtis, C. M. Egan, M. A. Espeland, M. Evans, J. P. Foreyt, S. Ghazarian, E.
W. Gregg, B. Harrison, H. P. Hazuda, J. O. Hill, E. S. Horton, V. S. Hubbard, J. M.
Hubbard, R. W. Jeffery, K. C. Johnson, S. E. Kahn, A. E. Kitabchi, W. C. Knowler,
C. E. Lewis, B. J. Maschak-Carey, M. G. Montez, A. Murillo, D. M. Nathan, J. Patricio,
A. Peters, X. Pi-Sunyer, H. Pownall, D. Reboussin, J. G. Regensteiner, A. D. Rickman,
D. H. Ryan, M. Safford, T. A. Wadden, L. E. Wagenknecht, D. S. West, D. F. Williamson
and S. Z. Yanovski, "Cardiovascular effects of intensive lifestyle intervention in
type 2 diabetes," The New England Journal of Medicine, vol. 369, no. 2, pp. 145-154,
11 July 2013.
T. A. Wadden, S. Volger, D. B. Sarwer, M. L. Vetter, A. G. Tsai, R. I. Berkowitz,
S. Kumanyika, K. H. Schmitz, L. K. Diewald, R. Barg, J. Chittams and R. H. Moore,
"A Two-Year Randomized Trial of Obesity Treatment in Primary Care Practice," New England
Journal of Medicine, vol. 365, no. 21, pp. 1969-1979, 24 November 2011.
R. D. H. Devi, A. Bai and N. Nagarajan, "A novel hybrid approach for diagnosing diabetes
mellitus using farthest first and support vector machine algorithms," Obesity Medicine,
p. 100152, 2020.
H. A. Afolabi, Z. B. Zakaria, M. N. M. Hashim, C. R. Vinayak and A. B. A. Shokri,
"Body Mass Index and predisposition of patients to knee osteoarthritis," Obesity Medicine,
p. 100143, 2019.
S. L. V. N. Filho, F. C. S. d. P. Gonçalves, P. I. C. de Lira, A. M. Ribeiro, S. H.
Eickmann and M. d. C. Lima, "Birth weight and postnatal weight gain as predictors
of abdominal adiposity in childhood and adolescence: A cohort study in northeast Brazil,"
Obesity Medicine, p. 100379, 2022.
"Estimation of obesity levels based on eating habits and physical condition Data Set,"
27 August 2019. [Online]. Available: Article (CrossRef Link) [Accessed 23 December
2021].
Y. Celik, S. Guney and B. Dengiz, "Obesity Level Estimation based on Machine Learning
Methods and Artificial Neural Networks," in 2021 44th International Conference on
Telecommunications and Signal Processing (TSP), Brno, Czech Republic, 2021.
F. Ferdowsy, K. S. A. Rahi, M. I. Jabiullah and M. T. Habib, "A machine learning approach
for obesity risk prediction," Current Research in Behavioral Sciences, vol. 2, November 2021.
X. Pang, C. B. Forrest, F. Lê-Scherban and A. J. Masino, "Prediction of early childhood
obesity with machine learning and electronic health record data," International Journal
of Medical Informatics, vol. 150, June 2021.
Z. Zheng and K. Ruggiero, "Using machine learning to predict obesity in high school
students," in 2017 IEEE International Conference on Bioinformatics and Biomedicine
(BIBM), Kansas City, MO, USA, 2017.
S. A. Thamrin, D. S. Arsyad, H. Kuswanto, A. Lawi and S. Nasir, "Predicting Obesity
in Adults Using Machine Learning Techniques: An Analysis of Indonesian Basic Health
Research 2018," vol. 8, 21 June 2021.
T. M. Dugan, S. Mukhopadhyay , A. Carroll and S. Downs, "Machine Learning Techniques
for Prediction of Early Childhood Obesity," Applied Clinical Informatics, vol. 6,
no. 3, pp. 506-520, 12 August 2015.
B. Singh and H. Tawfik, "Machine Learning Approach for the Early Prediction of the
Risk of Overweight and Obesity in Young People," in Computational Science – ICCS 2020,
2020.
F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel,
P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M.
Brucher, M. Perrot and E. Duchesnay, "Scikit-learn: Machine Learning in Python," Journal
of Machine Learning Research, vol. 12, pp. 2825-2830, 2011.
C. A. C. Montañez, P. Fergus, A. Hussain, D. Al-Jumeily, B. Abdulaimma, J. Hind and
N. Radi, "Machine learning approaches for the prediction of obesity using publicly
available genetic profiles," in 2017 International Joint Conference on Neural Networks
(IJCNN), Anchorage, AK, USA, 2017.
K. Rajput, G. Chetty and R. Davey, "Obesity and Co-Morbidity Detection in Clinical
Text Using Deep Learning and Machine Learning Techniques," in 2018 5th Asia-Pacific
World Congress on Computer Science and Engineering (APWC on CSE), Nadi, Fiji, 2018.
Author
Subhash Mondal (Member, IEEE) received B.Tech. and M.Tech. degrees in Computer
Science & Engineering from the University of Calcutta, Kolkata, India, in 2005 and
2007, respectively. He is pursuing a Ph.D. in CSE at the Central Institute of Technology
Kokrajhar, Kokrajhar, Assam, India. He works as an Assistant Professor in CSE (AI
& ML) at Dayananda Sagar University, Bengaluru, Karnataka, India. He has more than
17 years of teaching experience. His research interest is in the fields of machine
learning, deep learning, and natural language processing. He has published more than
30 research publications. He is a professional member of IEEE and ACM.
Mithun Karmakar is currently employed as an Assistant Professor at the Central
Institute of Technology Kokrajhar (CITK), Kokrajhar, Assam, India. He received his
M.Tech. degree in Information Technology from Tezpur University, Assam, in 2009. He
is pursuing his Ph.D. in Computer Science and Engineering at CITK, Assam, India.
He has more than 13 years of teaching experience and five research publications. His
research interests include artificial intelligence, deep learning, and image processing.
Amitava Nag (Senior Member, IEEE) is currently working as a Professor of Computer
Science and Engineering at the Central Institute of Technology Kokrajhar, Kokrajhar,
Assam, India. He has more than 70 research publications in various international journals
and conference proceedings. His research interests include IoT, information security,
machine learning, and deep learning. He is also a fellow of the Institution of Engineers
India (IEI).