Subhash Mondal 1,2
Mithun Karmakar 1
Amitava Nag 1
1 Computer Science and Engineering, Central Institute of Technology Kokrajhar, Kokrajhar, Assam, India {ph22cse1001, m.karmakar, amitava.nag}@cit.ac.in
2 Computer Science and Engineering (AI & ML), Dayananda Sagar University, Bengaluru, Karnataka, India mywork.subhash@gmail.com
Copyright © The Institute of Electronics and Information Engineers (IEIE)
Keywords
Obesity, Machine learning, Classification, Obesity features, SelectKBest, Feature Subset (FSS)
1. Introduction
With the global growth of obesity cases in the past few decades, there has been an
exponential rise in morbidity and death rates. According to the World Health Organization
(WHO), obesity is a condition where excess fat accumulates in the human body, which
results in severe health issues (Obesity and Overweight, 2021). Obesity accounts for
approximately 2.8 million deaths worldwide (Global health risks: mortality and burden
of disease attributable to selected major risks, 2009), making it one of the most prevalent
risk factors for mortality. Individuals diagnosed with this condition generally tend to
suffer from other health ailments such as stroke, diabetes, and sometimes various types
of cancer [3], which reduces their life expectancy.
Various types of obesity are prevalent (Miklishanskaya, Solomasova, & Mazur, 2021).
The driving forces for obesity are unhealthy food habits, hereditary factors, physical
inactivity, and lack of awareness regarding health issues. Obesity has caused a widespread
effect on individuals of all age groups, and the number of children affected by the
disease is increasing at an alarming rate worldwide.
There has been a surge in the percentage of obese children aged 6-7 years (Obesity,
2021). According to reports, around 39 million children under 5 years old were affected
by this disease in 2020 (Obesity and Overweight, 2021). Children suffering from obesity
generally tend to become obese adults (Whitaker, Wright, Pepe, Seidel, & Dietz, 1997),
and they become susceptible to the adverse effects of adult obesity (Khan, et al., 2014),
(Mollard, et al., 2014), (Wing, et al., 2013), (Wadden, et al., 2011). Thus,
there is huge interest among researchers in formulating an effective artificial intelligence
(AI)-based solution for monitoring obesity. In recent years, numerous AI solutions
have been proposed for obesity classification. Besides diagnosing the disease (Devi,
Bai, & Nagarajan, 2020), identifying the relevant features contributing to obesity
can also help in the early monitoring of the disease. The features and factors contributing
to obesity [12] have recently attracted much attention, both globally and regionally (Filho,
Gonçalves, de Lira, Ribeiro, Eickmann, & Lima, 2022).
In this study, the following two research questions are addressed:
· Research Question 1 (RQ1): With what degree of accuracy can obesity be classified
using the complete feature set (FS) of 16 features?
· Research Question 2 (RQ2): Can a smaller feature subset ($\mathrm{FSS}$) classify
the disease while retaining over 90% of the accuracy obtained in RQ1?
RQ1 examines the viability of classifying obesity using the complete set of 16 features
and determines the degree of accuracy that can be attained. RQ2 then investigates whether
a smaller subset of features (FSS) can deliver comparable classification accuracy while
maintaining the 90% threshold.
To answer RQ1 and RQ2, three machine learning (ML) algorithms were studied based on
their performance: logistic regression (LR), support vector machine (SVM), and XGBoost
(XGB). A publicly available dataset was used for the study, and a detailed comparative
analysis of the performance of each feature subset was carried out. We believe this study
will serve as a benchmark for understanding the correlation between the different features
leading to obesity.
1.1 Study Design and Setting
The study design and workflow for RQ1 comprise the following steps.
· Dataset acquisition
· Selection of two best-performing machine learning models for classification
· The addition of XGB as a third ML algorithm for the study
· Calculating performance-metric values based on all the features using the three
models on the complete dataset.
The following are the steps for the study design and the workflow for RQ2.
· Creating feature subsets (each subset contains four features)
· Classification of obesity using the three algorithms and feature subsets
· Determining the best feature subset out of the complete feature set based on the
performance-metric values
1.2 Dataset Acquisition
We performed the study using the publicly available Obesity Levels dataset from the UCI
Machine Learning Repository (Estimation of obesity levels based on eating habits and
physical condition Data Set, 2019). The dataset consists of 2111 instances with 17
attributes (columns); the 16 features used in this study are shown in Table 1.
Some of the attributes in the dataset are weight, age, family history of overweight,
transportation used, etc. A complete description of each attribute is presented in
Table 1.
Table 1 lists only 16 features because the serial number was omitted. The target variable
had the following labels: about 287 instances were Normal Weight, and the remaining 1824
were distributed among Insufficient Weight, Overweight Level I, Overweight Level II,
Obesity Type I, Obesity Type II, and Obesity Type III. The dataset page states that 77%
of the data were synthetically generated using the Weka tool and the SMOTE filter, and
23% were collected directly from users through a web platform. Since this is a preliminary
study to judge whether the research questions can be addressed, this dataset establishes
the basis for further exploration with more clinical, user-defined datasets.
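For illustration, a minimal sketch of how this dataset can be loaded for the experiments is given below, assuming the CSV file has been downloaded locally from the UCI repository; the file name used here is an assumption.

```python
# Illustrative sketch: loading the UCI "Estimation of Obesity Levels" dataset.
# The local CSV file name below is an assumption.
import pandas as pd

df = pd.read_csv("ObesityDataSet_raw_and_data_sinthetic.csv")

print(df.shape)                          # expected: (2111, 17)
print(df["NObeyesdad"].value_counts())   # distribution of the seven obesity levels

X = df.drop(columns=["NObeyesdad"])      # 16 candidate features
y = df["NObeyesdad"]                     # target variable
```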
Table 1. 16 features extracted from datasets.
Feature Set (FS) | Description
Gender | Gender of the individual
Age | Age of the individual
Height | Height of the individual
Family history with overweight | Has any family history of overweight
FAVC | Frequent consumption of high-caloric food
FCVC | Frequency of consumption of vegetables
NCP | Number of main meals
CAEC | Consumption of food between meals
SMOKE | Does the individual smoke or not
CH2O | Consumption of water daily
SCC | Calorie consumption monitoring
FAF | Physical activity frequency
TUE | Time using technology devices
CALC | Consumption of alcohol
MTRANS | Transportation used
NObeyesdad | Target variable (obese or not)
2. Related Study
The performance of related AI methods is presented based on different evaluation metrics
in Table 2. The performance metrics used for the analysis were accuracy (Ac), precision (Pr),
recall (Re), F1 score (Fs), and Area under the Receiver Operating Characteristic Curve
(ROC-AUC) score (Ra). The performance results in Table 2 were analyzed for model selection
for RQ1. For example, Table 2 includes results from proposals in the literature that use
SVM (Celik, Guney, & Dengiz, 2021). For LR, it can be observed that the accuracy reported
for the classification of obesity in one study (Ferdowsy, Rahi, Jabiullah, & Habib, 2021)
is better than the accuracies in other studies (Pang, Forrest, Lê-Scherban, & Masino, 2021),
(Zheng & Ruggiero, 2017), (Thamrin, Arsyad, Kuswanto, Lawi, & Nasir, 2021). It can also be
observed that, across all the performance evaluation parameters for the proposals using LR,
that method presented the best results (Ferdowsy, Rahi, Jabiullah, & Habib, 2021).
For DT, one method (Ferdowsy, Rahi, Jabiullah, & Habib, 2021) reflects the best results.
For RF, one method (Dugan, Mukhopadhyay, Carroll, & Downs, 2015) exhibits the best accuracy,
while another (Singh & Tawfik, 2020) presents the best Pr and Re values. One study
(Zheng & Ruggiero, 2017) claimed the best accuracy among the proposals using KNN algorithms,
and for the classification of obesity using NN, one proposal (Celik, Guney, & Dengiz, 2021)
claims the best accuracy.
XGB has previously been used to classify obesity, which motivated us to implement XGB in
this study. Although that method (Pang, Forrest, Lê-Scherban, & Masino, 2021) does not
exhibit very impressive accuracy or other performance metrics, its XGB implementation
presents better results than the other models it compares. From the analysis in Table 2,
the best accuracies were produced by SVM (97.8%) (Celik, Guney, & Dengiz, 2021) and by an
implementation of LR (97.09%) (Ferdowsy, Rahi, Jabiullah, & Habib, 2021). Hence, we used
these two algorithms along with the XGB model for this study.
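As an illustration of how the metrics compiled in Table 2 (Ac, Pr, Re, Fs, Ra) can be computed for a multi-class obesity classifier, a short scikit-learn sketch follows; y_test, y_pred, and y_proba are placeholder variables for held-out labels, predicted labels, and predicted class probabilities.

```python
# Illustrative sketch: computing the metrics reported in Table 2 for a
# multi-class classifier; y_test, y_pred, and y_proba are placeholders.
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

ac = accuracy_score(y_test, y_pred)
pr = precision_score(y_test, y_pred, average="weighted")
re = recall_score(y_test, y_pred, average="weighted")
fs = f1_score(y_test, y_pred, average="weighted")
# ROC-AUC for multi-class problems needs class probabilities and an averaging scheme.
ra = roc_auc_score(y_test, y_proba, multi_class="ovr", average="weighted")

print(f"Ac={ac:.4f}  Pr={pr:.4f}  Re={re:.4f}  Fs={fs:.4f}  Ra={ra:.4f}")
```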
Table 2. Performance metric statistics for obesity classification of different models.
Models | # Ref | Ac (%) | Pr (%) | Re (%) | Fs (%) | Ra (%)
Support Vector Machine (SVM) | [17] | 64.79 | 29.99 | - | 43.63 | -
| [16] | 66.02 | 53.00 | 66.00 | - | -
| [15] | 97.8 | 97.7 | - | 56.00 | -
| [21] | - | 85.00 | 78.00 | - | -
| [23] | - | - | - | - | 90.54
Logistic Regression (LR) | [17] | 64.81 | 30.00 | - | 43.65 | -
| [16] | 97.09 | 97.00 | 97.00 | 97.00 | -
| [18] | 56.02 | - | - | - | -
| [19] | 72.24 | 69.55 | - | 71.49 | 79.80
Decision Tree (DT) | [17] | 64.61 | 29.70 | - | 43.28 | -
| [16] | 70.30 | 57.00 | 70.00 | 61.00 | -
Random Forest (RF) | [16] | 72.30 | 57.00 | 72.00 | 63.00 | -
| [21] | - | 84.00 | 82.00 | - | -
| [23] | - | - | - | - | 87.98
| [24] | 81.01 | - | - | - | -
| [20] | 84.00 | - | - | - | -
K-Nearest Neighbour (KNN) | [16] | 77.50 | 79.00 | 77.00 | 77.00 | -
| [21] | - | 83.00 | 79.00 | - | -
| [23] | - | - | - | - | 88.62
| [18] | 88.82 | - | - | - | -
Neural Network (NN) | [17] | 63.67 | 29.61 | - | 43.05 | -
| [15] | 96.50 | - | - | - | -
eXtreme Gradient Boosting (XGB) | [17] | 66.14 | 30.90 | - | 44.60 | -
Naïve Bayes (NB) | [16] | 86.04 | 86.00 | 86.00 | 86.00 | -
| [20] | 65.00 | - | - | - | -
| [19] | 71.47 | 69.00 | - | 70.53 | 78.48
| [24] | 70.52 | - | - | - | -
Gaussian Naïve Bayes (GNB) | [17] | 63.23 | 29.00 | - | 42.59 | -
Bernoulli Naïve Bayes (BNB) | [17] | 61.76 | 28.06 | - | 41.47 | -
3. RQ1
The detection framework for RQ1 is shown in Fig. 1. Pre-processing was carried out to
improve the accuracy and performance of the models; it involved cleaning and organizing
the data.
Fig. 1. Obesity detection framework used in both FS and FSS.
The correlation among the numerical features of the raw dataset is shown in Fig. 2. Attributes with missing values were filled using median imputation. Categorical
attributes with two or fewer categories were encoded with a label encoder, and those
with more than two categories were one-hot encoded. Outliers in the dataset were treated
using the three-standard-deviation rule.
Fig. 2. Graphical representation of correlation of features using heat map.
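A minimal sketch of the pre-processing steps described above is shown below, assuming the dataset has been loaded into a pandas DataFrame df; the exact encoder and outlier-handling choices of the study may differ.

```python
# Illustrative pre-processing sketch: median imputation, encoding, and
# three-standard-deviation outlier treatment (details may differ from the study).
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

num_cols = df.select_dtypes(include=np.number).columns
cat_cols = df.select_dtypes(exclude=np.number).columns.drop("NObeyesdad")

# Median imputation for numerical attributes with missing values
df[num_cols] = df[num_cols].fillna(df[num_cols].median())

# Label-encode binary categorical attributes, one-hot encode the rest
binary_cols = [c for c in cat_cols if df[c].nunique() <= 2]
multi_cols = [c for c in cat_cols if df[c].nunique() > 2]
for c in binary_cols:
    df[c] = LabelEncoder().fit_transform(df[c])
df = pd.get_dummies(df, columns=multi_cols)

# Clip numerical outliers to mean +/- 3 standard deviations
for c in num_cols:
    mu, sigma = df[c].mean(), df[c].std()
    df[c] = df[c].clip(lower=mu - 3 * sigma, upper=mu + 3 * sigma)

# Encode the target labels
df["NObeyesdad"] = LabelEncoder().fit_transform(df["NObeyesdad"])
```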
After applying the data pre-processing techniques and before training the ML models, we
split the dataset into training and testing sets to improve the reliability of the trained
models. The splitting strategies considered included k-fold cross-validation, leave-p-out
cross-validation, leave-one-out cross-validation, holdout validation, and nested
cross-validation, which ensured that training and testing were performed on independent
portions of the dataset for better model performance on unseen data. The XGB, SVC, and LR
algorithms were trained on the complete feature set using the scikit-learn library
(Pedregosa, et al., 2011).
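The following sketch illustrates one of the splitting strategies mentioned above (stratified k-fold cross-validation) applied to the three classifiers on the complete feature set; the model settings shown are defaults or assumptions, not the tuned configurations reported later.

```python
# Illustrative sketch: k-fold cross-validation of the three classifiers on the
# complete feature set (one of the splitting strategies mentioned above).
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from xgboost import XGBClassifier

X = df.drop(columns=["NObeyesdad"])
y = df["NObeyesdad"]

models = {
    "XGB": XGBClassifier(eval_metric="mlogloss"),
    "SVC": SVC(probability=True),
    "LR": LogisticRegression(max_iter=1000),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    print(f"{name}: mean accuracy = {scores.mean():.4f} (+/- {scores.std():.4f})")
```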
4. RQ2
In RQ2, we selected an $\mathrm{FSS}$ from the 16 original features ($\mathrm{FS}$). The
$\mathrm{FSS}$ selection was done by applying the SelectKBest algorithm to the FS and
sorting the resulting rankings. Table 3 presents the score of each feature obtained with
the SelectKBest algorithm; this score was used to evaluate the influence of each feature
on obesity individually. The complete workflow is presented in Fig. 3.
Fig. 3. High-level diagram for RQ2.
After sorting the 16 features by their importance values, the eight highest-ranking
features were selected for further processing. The feature identifier (FI) assigns an
``id'' to each feature for ease of analysis. The four FSSs were selected from Table 3
using Eq. (1):
$FSS_{1}=\{FI_{id}\mid 1\leq id\leq 4\}$, $FSS_{2}=\{FI_{id}\mid 5\leq id\leq 8\}$, $FSS_{3}=\{FI_{id}\mid id\in\{1,3,5,7\}\}$, $FSS_{4}=\{FI_{id}\mid id\in\{2,4,6,8\}\}$, (1)
where $FI_{id}$ denotes the feature with identifier $id$ in Table 3. Note that Eq. (1)
uses $FI_{id}$ with $1\leq id\leq 8$, as discussed earlier, because only the eight
highest-ranking features are used. The four $FSS$s generated are listed in Table 4 with
their respective features.
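A possible implementation of this ranking and subset construction is sketched below; the SelectKBest scoring function (here f_classif, the ANOVA F-test) is an assumption, since the study does not state which score function was used.

```python
# Illustrative sketch: ranking features with SelectKBest and forming the four
# feature subsets of Eq. (1). The scoring function (f_classif) is an assumption.
from sklearn.feature_selection import SelectKBest, f_classif
import pandas as pd

selector = SelectKBest(score_func=f_classif, k="all").fit(X, y)
ranking = (pd.Series(selector.scores_, index=X.columns)
             .sort_values(ascending=False))
print(ranking)  # compare with the scores reported in Table 3

top8 = list(ranking.index[:8])  # FI ids 1..8 in ranked order
fss = {
    "FSS1": top8[0:4],                        # ids 1-4
    "FSS2": top8[4:8],                        # ids 5-8
    "FSS3": [top8[i] for i in (0, 2, 4, 6)],  # ids 1, 3, 5, 7
    "FSS4": [top8[i] for i in (1, 3, 5, 7)],  # ids 2, 4, 6, 8
}
```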
Table 3. Ranking of different features based on SelectKBest algorithm.
Feature Identifier (FI) | Feature Set | Performance Score
1 | Weight | 14186.71
2 | Age | 635.6424
3 | Gender | 324.9784
4 | SCC | 117.4293
5 | Family history with overweight | 113.4354
6 | MTRANS | 102.7809
7 | FAF | 71.76225
8 | FCVC | 60.32221
9 | NCP | 33.78086
10 | SMOKE | 31.46798
11 | FAVC | 27.0813
12 | TUE | 26.12619
13 | CAEC | 21.81961
14 | CALC | 21.81961
15 | CH2O | 17.40355
16 | Height | 1.066227
Table 4. Features corresponding to four different FSSs (feature subsets).
Fss1 | Fss2 | Fss3 | Fss4
Weight | Family history with overweight | Weight | Age
Age | MTRANS | Gender | SCC
Gender | FAF | Family history with overweight | MTRANS
SCC | FCVC | FAF | FCVC
5. Experimental Results
Table 5 presents the performance analysis of the shortlisted ML algorithms in terms of
their standard implementations, their hyper-parameter-tuned implementations, and a
comparison between the two. It can be observed that XGB achieves the best accuracy among
the three algorithms. In the related studies, SVC and LR exhibited the best results while
the reported XGB results were not very impressive; in our implementation, however, XGB
produces better results than the other two ML algorithms. Moreover, hyper-parameter tuning
generally improves the performance metrics further, including for XGB. We also report the
increase or decrease in performance of the tuned versions relative to the standard
implementations of the ML algorithms.
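A sketch of the hyper-parameter tuning step is given below using grid search with cross-validation; the parameter grids and the X_train/y_train variables are illustrative assumptions rather than the exact search spaces used in the study.

```python
# Illustrative sketch of hyper-parameter tuning with grid search; the parameter
# grids below are assumptions, not the exact search spaces used in the study.
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
from xgboost import XGBClassifier

param_grids = {
    "SVC": (SVC(), {"C": [0.1, 1, 10, 100], "kernel": ["rbf", "linear"],
                    "gamma": ["scale", "auto"]}),
    "XGB": (XGBClassifier(eval_metric="mlogloss"),
            {"n_estimators": [100, 300], "max_depth": [3, 6, 9],
             "learning_rate": [0.05, 0.1, 0.3]}),
}

for name, (estimator, grid) in param_grids.items():
    search = GridSearchCV(estimator, grid, cv=5, scoring="accuracy", n_jobs=-1)
    search.fit(X_train, y_train)   # X_train/y_train: placeholder training split
    print(name, search.best_params_, f"CV accuracy = {search.best_score_:.4f}")
```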
Table 5. Performance analysis of selected ML algorithms on a complete feature set containing 16 features.
Model | Ac (%) | Pr | Re | Fs | Cs | Ra
Normal Model
XGB | 96.44 | 0.96 | 0.97 | 0.96 | 0.96 | 0.99
SVC | 51.42 | 0.53 | 0.53 | 0.51 | 0.44 | 0.89
LR | 64.21 | 0.63 | 0.64 | 0.63 | 0.58 | 0.91
Tuned Model
XGB | 96.68 | 0.97 | 0.97 | 0.97 | 0.96 | 0.99
SVC | 90.04 | 0.89 | 0.90 | 0.90 | 0.88 | 0.90
LR | 63.03 | 0.61 | 0.63 | 0.62 | 0.57 | 0.92
Comparison Analysis
XGB | +0.24 | +0.01 | 0 | +0.01 | 0 | 0
SVC | +38.62 | +0.36 | +0.37 | +0.39 | +0.44 | +0.01
LR | -1.09 | -0.02 | -0.01 | +0.01 | -0.01 | +0.01
5.1 Discussion
The four feature subsets obtained were trained and tested separately to build ML models
for detecting obesity. Table 6 presents the performance analysis, tuned performance
analysis, and comparative analysis of the ML algorithms on $FSS_{1}$ through $FSS_{4}$.
We observed differences in the individual metric values across the algorithms. The
comparative analysis was conducted between the reduced and complete feature vectors: each
reduced feature vector comprises four features, while the complete feature vector comprises
16 features. This study mainly focused on identifying the most relevant features that
contribute to developing obesity.
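The per-subset evaluation described above can be sketched as follows; the fss dictionary and models mapping from the earlier sketches are assumed, and the split settings are illustrative.

```python
# Illustrative sketch: training and evaluating each model on every feature
# subset (FSS1-FSS4), mirroring the comparison reported in Table 6.
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

results = {}
for subset_name, features in fss.items():        # fss built earlier from Eq. (1)
    X_sub = X[features]
    X_tr, X_te, y_tr, y_te = train_test_split(
        X_sub, y, test_size=0.2, stratify=y, random_state=42)
    for model_name, model in models.items():     # models defined earlier
        model.fit(X_tr, y_tr)
        acc = accuracy_score(y_te, model.predict(X_te))
        results[(subset_name, model_name)] = acc
        print(f"{subset_name} / {model_name}: accuracy = {acc:.4f}")
```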
Table 6. Performance analysis of selected ML algorithms on FSS.
Model | Ac (%) | Pr | Re | Fs | Cs | Ra
Fss1: Performance Analysis
XGB | 86.49 | 0.86 | 0.85 | 0.85 | 0.84 | 0.98
SVC | 61.37 | 0.63 | 0.63 | 0.62 | 0.55 | 0.89
LR | 51.42 | 0.46 | 0.51 | 0.46 | 0.43 | 0.85
Fss1: Tuned Performance Analysis
XGB | 88.38 | 0.88 | 0.88 | 0.88 | 0.86 | 0.98
SVC | 85.87 | 0.85 | 0.84 | 0.85 | 0.83 | 0.90
LR | 53.08 | 0.49 | 0.52 | 0.48 | 0.45 | 0.86
Fss1: Comparison Analysis between Normal and Tuned ML algorithms
XGB | +1.89 | +0.02 | +0.03 | +0.03 | +0.02 | 0
SVC | +24.50 | +0.22 | +0.21 | +0.22 | +0.28 | +0.01
LR | +1.66 | +0.03 | +0.01 | +0.02 | +0.02 | +0.01
Fss2: Performance Analysis
XGB | 55.68 | 0.55 | 0.54 | 0.51 | 0.48 | 0.83
SVC | 40.04 | 0.37 | 0.39 | 0.33 | 0.29 | 0.79
LR | 39.81 | 0.37 | 0.38 | 0.33 | 0.28 | 0.74
Fss2: Tuned Performance Analysis
XGB | 54.50 | 0.55 | 0.54 | 0.54 | 0.47 | 0.84
SVC | 42.18 | 0.34 | 0.40 | 0.34 | 0.31 | 0.81
LR | 39.09 | 0.38 | 0.38 | 0.32 | 0.28 | 0.74
Fss2: Comparison Analysis between Normal and Tuned ML algorithms
XGB | -1.18 | 0 | 0 | +0.03 | -0.01 | +0.01
SVC | +2.14 | -0.03 | +0.01 | +0.01 | +0.02 | +0.02
LR | -0.72 | +0.01 | 0 | -0.01 | 0 | 0
Fss3: Performance Analysis
XGB | 81.51 | 0.82 | 0.82 | 0.82 | 0.78 | 0.97
SVC | 58.29 | 0.61 | 0.59 | 0.59 | 0.51 | 0.89
LR | 61.84 | 0.58 | 0.62 | 0.58 | 0.55 | 0.91
Fss3: Tuned Performance Analysis
XGB | 80.56 | 0.81 | 0.81 | 0.81 | 0.77 | 0.97
SVC | 76.06 | 0.76 | 0.76 | 0.76 | 0.72 | 0.90
LR | 58.53 | 0.58 | 0.59 | 0.57 | 0.51 | 0.91
Fss3: Comparison Analysis between Normal and Tuned ML algorithms
XGB | -0.95 | -0.01 | -0.01 | -0.01 | -0.01 | 0
SVC | +17.77 | +0.15 | +0.17 | +0.17 | +0.22 | +0.01
LR | -3.31 | 0 | -0.03 | -0.01 | -0.04 | 0
Fss4: Performance Analysis
XGB | 56.87 | 0.55 | 0.55 | 0.54 | 0.49 | 0.87
SVC | 34.59 | 0.25 | 0.32 | 0.26 | 0.51 | 0.84
LR | 40.28 | 0.39 | 0.37 | 0.33 | 0.23 | 0.77
Fss4: Tuned Performance Analysis
XGB | 60.18 | 0.58 | 0.59 | 0.58 | 0.53 | 0.87
SVC | 46.91 | 0.47 | 0.45 | 0.43 | 0.37 | 0.85
LR | 39.57 | 0.40 | 0.36 | 0.33 | 0.28 | 0.77
Fss4: Comparison Analysis between Normal and Tuned ML algorithms
XGB | +3.31 | +0.03 | +0.04 | +0.04 | +0.04 | 0
SVC | +12.32 | +0.22 | +0.13 | +0.17 | -0.14 | +0.01
LR | -0.71 | +0.01 | -0.01 | 0 | +0.05 | 0
6. Conclusion
In this study, we addressed two research questions concerning the effect of the complete
feature vector generated from the parameters influencing obesity and the effect of a much
smaller feature vector. Of all the parameters associated with obesity, concentrating on a
smaller parameter set yields about 91% of the classification performance obtained when all
the parameters are taken into consideration.
The study's objective was to investigate whether a smaller set of parameters can
characterize obesity to nearly the same extent as the complete set. The experimental
results show that a smaller subset of only four parameters, namely weight, age, gender,
and SCC, classifies obesity levels with almost 91% of the accuracy achieved with the
complete set of 16 parameters. The study targets individuals who feel inert when asked to
address all the factors related to obesity control and who may be more comfortable
addressing only a few factors to some extent.
REFERENCES
"Obesity and Overweight," World Health Organization, 9 June 2021. [Online]. Available:
Article (CrossRef Link) [Accessed 23 December 2021].
Global health risks: mortality and burden of disease attributable to selected major
risks, World Health Organization, 2009.
L. N. Borrell and L. Samuel, "Body Mass Index Categories and Mortality Risk in US
Adults: The Effect of Overweight and Obesity on Advancing Death," American journal
of public health, vol. 104, no. 3, pp. 512-519, March 2014.
S. V. Miklishanskaya, L. V. Solomasova and N. A. Mazur, "Types of obesity and their
prognostic value," Obesity Medicine, p. 100350, 2021.
"Obesity," Centers for Disease Control and Prevention, 21 September 2021. [Online].
Available: Article (CrossRef Link) [Accessed 23 December 2021].
R. C. Whitaker, J. A. Wright, M. S. Pepe, K. D. Seidel and W. H. Dietz, "Predicting
obesity in young adulthood from childhood and parental obesity," New England journal
of medicine, vol. 337, no. 13, pp. 869-873, 25 September 1997.
N. A. Khan, L. B. Raine, E. S. Drollette, M. R. Scudder, M. B. Pontifex, D. M. Castelli,
S. M. Donovan, E. M. Evans and C. H. Hillman, "Impact of the FITKids Physical Activity
Intervention on Adiposity in Prepubertal Children," vol. 133, no. 4, pp. 875-883,
1 April 2014.
R. C. Mollard, M. Senechal, A. C. Maclntosh, J. Hay, B. A. Wicklow, K. D. M. Wittmeier,
E. A. C. Sellers, H. J. Dean, L. Ryner, L. Berard and J. M. McGavock, "Dietary determinants
of hepatic steatosis and visceral adiposity in overweight and obese youth at risk
of type 2 diabetes," American Journal of Clinical Nutrition, vol. 99, no. 4, pp. 804-812,
April 2014.
R. R. Wing, P. Bolin, F. L. Brancati, G. A. Bray, J. M. Clark, M. Coday, R. S. Crow,
J. M. Curtis, C. M. Egan, M. A. Espeland, M. Evans, J. P. Foreyt, S. Ghazarian, E.
W. Gregg, B. Harrison, H. P. Hazuda, J. O. Hill, E. S. Horton, V. S. Hubbard, J. M.
Hubbard, R. W. Jeffery, K. C. Johnson, S. E. Kahn, A. E. Kitabchi, W. C. Knowler,
C. E. Lewis, B. J. Maschak-Carey, M. G. Montez, A. Murillo, D. M. Nathan, J. Patricio,
A. Peters, X. Pi-Sunyer, H. Pownall, D. Reboussin, J. G. Regensteiner, A. D. Rickman,
D. H. Ryan, M. Safford, T. A. Wadden, L. E. Wagenknecht, D. S. West, D. F. Williamson
and S. Z. Yanovski, "Cardiovascular effects of intensive lifestyle intervention in
type 2 diabetes," The New England Journal of Medicine, vol. 369, no. 2, pp. 145-154,
11 July 2013.
T. A. Wadden, S. Volger, D. B. Sarwer, M. L. Vetter, A. G. Tsai, R. I. Berkowitz,
S. Kumanyika, K. H. Schmitz, L. K. Diewald, R. Barg, J. Chittams and R. H. Moore,
"A Two-Year Randomized Trial of Obesity Treatment in Primary Care Practice," New England
Journal of Medicine, vol. 365, no. 21, pp. 1969-1979, 24 November 2011.
R. D. H. Devi, A. Bai and N. Nagarajan, "A novel hybrid approach for diagnosing diabetes
mellitus using farthest first and support vector machine algorithms," Obesity Medicine,
p. 100152, 2020.
H. A. Afolabi, Z. B. Zakaria, M. N. M. Hashim, C. R. Vinayak and A. B. A. Shokri,
"Body Mass Index and predisposition of patients to knee osteoarthritis," Obesity Medicine,
p. 100143, 2019.
S. L. V. N. Filho, F. C. S. d. P. Gonçalves, P. I. C. de Lira, A. M. Ribeiro, S. H.
Eickmann and M. d. C. Lima, "Birth weight and postnatal weight gain as predictors
of abdominal adiposity in childhood and adolescence: A cohort study in northeast Brazil,"
Obesity Medicine, p. 100379, 2022.
"Estimation of obesity levels based on eating habits and physical condition Data Set,"
27 August 2019. [Online]. Available: Article (CrossRef Link) [Accessed 23 December
2021].
Y. Celik, S. Guney and B. Dengiz, "Obesity Level Estimation based on Machine Learning
Methods and Artificial Neural Networks," in 2021 44th International Conference on
Telecommunications and Signal Processing (TSP), Brno, Czech Republic, 2021.
F. Ferdowsy, K. S. A. Rahi, M. I. Jabiullah and M. T. Habib, "A machine learning approach
for obesity risk prediction," Current Research in Behavioral Sciences, vol. 2, November 2021.
X. Pang, C. B. Forrest, F. Lê-Scherban and A. J. Masino, "Prediction of early childhood
obesity with machine learning and electronic health record data," International Journal
of Medical Informatics, vol. 150, June 2021.
Z. Zheng and K. Ruggiero, "Using machine learning to predict obesity in high school
students," in 2017 IEEE International Conference on Bioinformatics and Biomedicine
(BIBM), Kansas City, MO, USA, 2017.
S. A. Thamrin, D. S. Arsyad, H. Kuswanto, A. Lawi and S. Nasir, "Predicting Obesity
in Adults Using Machine Learning Techniques: An Analysis of Indonesian Basic Health
Research 2018," vol. 8, 21 June 2021.
T. M. Dugan, S. Mukhopadhyay , A. Carroll and S. Downs, "Machine Learning Techniques
for Prediction of Early Childhood Obesity," Applied Clinical Informatics, vol. 6,
no. 3, pp. 506-520, 12 August 2015.
B. Singh and H. Tawfik, "Machine Learning Approach for the Early Prediction of the
Risk of Overweight and Obesity in Young People," in Computational Science – ICCS 2020,
2020.
F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel,
P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M.
Brucher, M. Perrot and E. Duchesnay, "Scikit-learn: Machine Learning in Python," Journal
of Machine Learning Research, vol. 12, pp. 2825-2830, 2011.
C. A. C. Montañez, P. Fergus, A. Hussain, D. Al-Jumeily, B. Abdulaimma, J. Hind and
N. Radi, "Machine learning approaches for the prediction of obesity using publicly
available genetic profiles," in 2017 International Joint Conference on Neural Networks
(IJCNN), Anchorage, AK, USA, 2017.
K. Rajput, G. Chetty and R. Davey, "Obesity and Co-Morbidity Detection in Clinical
Text Using Deep Learning and Machine Learning Techniques," in 2018 5th Asia-Pacific
World Congress on Computer Science and Engineering (APWC on CSE), Nadi, Fiji, 2018.
Author
Subhash Mondal (Member, IEEE) received B.Tech. and M.Tech. degrees in Computer
Science & Engineering from the University of Calcutta, Kolkata, India, in 2005 and
2007, respectively. He is pursuing a Ph.D. in CSE at the Central Institute of Technology
Kokrajhar, Kokrajhar, Assam, India. He works as an Assistant Professor in CSE (AI
& ML) at Dayananda Sagar University, Bengaluru, Karnataka, India. He has more than
17 years of teaching experience. His research interest is in the fields of machine
learning, deep learning, and natural language processing. He has published more than
30 research publications. He is a professional member of IEEE and ACM.
Mithun Karmakar is currently employed as an Assistant Professor at the Central
Institute of Technology Kokrajhar (CITK), Kokrajhar, Assam, India. He received his
M.Tech. degree in Information Technology from Tezpur University, Assam, in 2009. He
is pursuing his Ph.D. in Computer Science and Engineering at CITK, Assam, India.
He has more than 13 years of teaching experience and five research publications. His
research interests include artificial intelligence, deep learning, and image processing.
Amitava Nag (Senior Member, IEEE) is currently working as a Professor of Computer
Science and Engineering at the Central Institute of Technology Kokrajhar, Kokrajhar,
Assam, India. He has more than 70 research publications in various international journals
and conference proceedings. His research interests include IoT, information security,
machine learning, and deep learning. He is also a fellow of the Institution of Engineers
India (IEI).