Jaeseo Choi (1), Changuk Choi (2), Heera Ha (3), Junghwan Kim (4*), Kyeongbo Kong (5), Jaewon Royce Choi (6)
(1) College of Media and Communication, Korea University / Seoul, South Korea
(2) Department of Media and Communication, Pukyong National University / Busan, South Korea
(3) Department of Media and Communication, Pukyong National University / Busan, South Korea
(4) College of Media and Communication, Korea University / Seoul, South Korea
(5) Department of Electronics Engineering, Pusan National University / Busan, South Korea
(6) Manship School of Mass Communication, Louisiana State University / Louisiana, United States
Copyright © The Institute of Electronics and Information Engineers(IEIE)
Keywords
Computer vision, Research trend, Semantic network analysis (SNA)
1. Introduction
Computer vision is the field of artificial intelligence (AI) in which computers and
systems can derive meaningful information from digital images, video, and other inputs.
Intelligent information technologies, particularly those focused on AI and big data,
influence industries broadly. In media services and content production, AI technologies,
especially computer vision, are driving digital transformation and improving services
in practical and diverse ways [1]. Among machine learning techniques, computer vision can play a role in extracting
critical information from images. Devices that capture images can extract information
about both the external features and internal structures of target objects [2]. Building on these characteristics, computer vision has contributed to detection
systems, optical character recognition, robotics, and suspect identification [3].
The potential of computer vision has led to numerous studies and a wealth of published
research. Accordingly, various computer vision-related service technologies have been
explored, creating new markets and revitalizing industries. Several studies have been
conducted to identify the research trends in machine learning [4-6]. However, despite the widespread use of computer vision technology and its scope,
the analysis of research trends has been insufficient. Research and technical discourse
on computer vision are expanding, and the technology is being applied across various
domains, significantly impacting the video content industry. Thus, examining this
technological trend is expected to have implications for both industry and academia
in the media field. Therefore, this study aims to understand the chronological trends
in international research on computer vision and, based on this, suggest directions
for future research development.
2. Related Work
2.1. Computer Vision in the Media Industry
Processing photos or videos through various operations for specific purposes is referred
to as digital image processing, which serves as a foundation for computer vision.
Digital image processing digitizes images and represents them as functions, enabling
various operations for machine learning and deep learning [1]. Based on these characteristics, computer vision technology dramatically influences
the media industry, especially the overall value chain of video content production.
Computer vision and digital platforms are intricately linked in today’s innovative
business environment. Incorporating computer vision technology into the media industry
yields use cases across platform services, video content production, and functionalities
embedded within smart devices. From a platform service perspective, computer vision
technology has applications in image retrieval, personalized recommendation systems,
and content generation support. Leading portal platforms such as Naver and Google
illustrate this integration through services such as Naver Smart Lens and Google Lens,
which let users initiate searches based on the images they upload.
This functionality extends beyond simple image search and is widely applied in e-commerce.
Notably, OMNIOUS.AI stands out by delivering a proprietary AI-driven commerce solution
encompassing OMNIOUS tagger, lens, and search components. This innovative solution
caters to specialized fashion malls, such as KakaoStyle (ZIGZAG), ABLY, Hyundai Home
Shopping, and Lotte ON. The technology effectively supports product searches and provides
recommendations for similar products, significantly enhancing user experience and
engagement. Additionally, computer vision technology extends to the webtoon industry,
demonstrating its role in identifying and mitigating unauthorized distribution of
creative work. Automating coloring processes through computer vision tools has significantly
elevated the efficiency and productivity of webtoon creators, augmenting their artistic
output and streamlining the creative workflow [7].
Computer vision is also applied in video content production. Disney employs factorized
variational autoencoders for emotion analysis via facial recognition and is developing
technology that predicts movie success based on this analysis [8]. Additionally, tvN, a
leading content provider, used face editing technology, an AI application of computer vision,
in the drama Navillera to produce scenes that were difficult for the actors to perform in
person [9]. Recently, object-erasing technology that combines image segmentation with
generative adversarial networks (GANs) has been introduced in films, dramas, and entertainment
shows to erase logos and institutional names [10].
Finally, in smart devices, computer vision is integrated into core functions, offering
diverse capabilities that have already become part of everyday life. For instance,
functionalities include extracting text from images, applying correction
filters and composition features during photo capture, and compensating for problems such
as hand shake. Notably, in iOS 16, Apple introduced a novel
feature that allows users to extract specific objects from videos and photos, enabling
their insertion into messages, notes, emails, and more. Similarly, Samsung’s recent
Galaxy series incorporated the photo remaster feature, which enhances low-quality
photos with minimal degradation, and the AI eraser feature, allowing users to seamlessly
remove unwanted objects from their images [11].
Beyond these examples, computer vision has diverse applications across the media industry,
often synergizing with deep learning technologies to create novel forms of digital
content. Consequently, as discussions and explorations of computer vision expand within
the media industry, its influence extends across multiple sectors, including media
and video content production. This demonstrates both the practical and academic influence
of discourse on computer vision trends, highlighting the technology’s significance
and future trajectory. Therefore, this study aims to provide insight into the media
industry by comprehending the global research trends associated with computer vision.
2.2. Research Trends from Literature Analysis
Several studies have used research articles to examine trends in a specific field.
In engineering, article titles, abstracts, and keywords have been used to identify
research trends and suggest implications for future research [4,
12,
13].
Kang et al. conducted a systematic literature analysis using academic papers to examine
research trends in machine learning applications for production lines, focusing on
the digitization of the entire production process [12]. Semeraro et al. reviewed the literature on machine learning techniques in the context
of human-robot collaboration [13]. Hoekstra et al. analyzed literature on training text-event prediction algorithms
with natural language processing in healthcare, using research data from the healthcare
domain [4].
As noted, prior studies have sought to identify research trends and suggest future
directions for machine learning applications. However, the analysis methods differ
depending on the characteristics of each research area. For example,
Zhang et al. identified research trends using keyword network analysis with virtual
reality application-related data in the architecture and engineering fields [14]. Avsar and Mowla identified emerging technologies with high potential for future
applications in smart agriculture through a research trend analysis of IoT-based agricultural
systems [15].
The preceding research endeavors have investigated trends and anticipated future directions
in fields directly influenced by AI. In traditional bibliometrics, methods involving
keyword analysis of documents, such as papers and reports, have been used to comprehend
trends within specific domains [16]. More recently, with the advancement of data science and data analysis software,
semantic network analysis (SNA) using big data analysis methods has emerged to uncover
research trends in particular disciplines [17]. In addition, SNA automatically extracts words from text-based documents and analyzes
their inherent meanings based on co-occurrence relationships between words [18]. Given its capability to reveal latent semantic structures and contexts embedded
in text, SNA is a highly advantageous methodology for analyzing tendencies and patterns
across various domains [19].
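To make the co-occurrence idea concrete, the following is a minimal Python sketch. It is not the pipeline used in this study, which relied on VOSviewer; the stop-word list, the sample titles, and the function names are hypothetical and serve only as illustration. For every pair of terms, it counts the number of titles in which both terms appear.

```python
from collections import Counter
from itertools import combinations

# Hypothetical stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"a", "an", "the", "for", "of", "in", "on", "with", "and", "via", "to"}


def tokenize(title):
    """Lowercase a title and return its unique non-stop-word terms."""
    return {word for word in title.lower().split() if word not in STOP_WORDS}


def cooccurrence(titles):
    """Count the titles in which each unordered pair of terms appears together."""
    pairs = Counter()
    for title in titles:
        terms = sorted(tokenize(title))  # sorting gives each pair one canonical order
        pairs.update(combinations(terms, 2))
    return pairs


# Hypothetical titles, not taken from the collected data.
titles = [
    "Video object detection with transformer",
    "Transformer for video segmentation",
    "Object detection in the wild",
]
links = cooccurrence(titles)
print(links.most_common(3))
```

Pairs with higher counts correspond to stronger links in the resulting semantic network.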
Prior studies have employed SNA to grasp research trends in big data; by analyzing
keyword centrality, they identified key research areas and themes within that field.
Building on this literature, this study aims to identify overall research trends,
categorize research areas, and determine trending keywords in computer vision research
by year using keyword network analysis.
3. Study Method
3.1. Data Collection
This study collected articles published between 2017 and 2022 at the Conference on Computer
Vision and Pattern Recognition (CVPR), the European Conference on Computer Vision
(ECCV), and the International Conference on Computer Vision (ICCV). These three venues are
widely recognized as the foremost international conferences in computer vision. They serve
as premier platforms for researchers and practitioners in computer vision and related
domains to present, discuss, and advance cutting-edge research findings, and they rank
among Google Scholar’s top publications.
We used the Web of Science database as the primary source of conference articles. Web of
Science was chosen for its reliability and because researchers widely use it to obtain
titles, countries, and affiliated organizations; most of the collected records contained
this information. Conference papers not covered by the Web of Science were collected from
the “Accepted Paper List” provided on each conference’s website for each year. In total,
15,328 papers were gathered from the CVPR, ECCV, and ICCV conferences.
3.2. Data Analysis Methodology
3.2.1 Semantic network analysis (SNA)
Before analyzing the collected studies, we preprocessed the article titles by merging
keywords with overlapping meanings. After preprocessing, a keyword network analysis
was conducted using VOSviewer software (v1.6.19), which is widely used to visualize
research data as semantic networks and to examine research trends.
The visualization is distance-based; that is, a shorter distance between two nodes
indicates a closer relationship [18].
To examine research trends and international collaborations, we conducted an SNA on
the titles and coauthorship data of the conference articles. Through the SNA of the
article titles, we identified the frequently researched keywords within the computer
vision field. Additionally, by analyzing the national collaborations in coauthored
papers, we gained insight into the interconnected relationships between countries
engaged in collaborative research.
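The country-level collaboration measure can be sketched in Python as follows. This is an illustrative approximation rather than VOSviewer's implementation: it assumes full counting, in which each paper adds one unit of link strength to every pair of countries among its coauthors, and it takes a country's total link strength to be the sum of its link strengths. The sample records and function name are hypothetical.

```python
from collections import Counter
from itertools import combinations


def total_link_strength(papers):
    """Return (pairwise link strengths, total link strength per country).

    `papers` is a list of country lists, one per paper. Assumes full counting:
    every country pair on a paper gains one unit of link strength.
    """
    links = Counter()
    for countries in papers:
        for a, b in combinations(sorted(set(countries)), 2):
            links[(a, b)] += 1

    strength = Counter()
    for (a, b), weight in links.items():
        strength[a] += weight
        strength[b] += weight
    return links, strength


# Hypothetical coauthor-country records, for illustration only.
papers = [
    ["USA", "China"],
    ["USA", "China", "Australia"],
    ["USA"],  # single-country paper: contributes no links
    ["South Korea", "USA"],
]
links, strength = total_link_strength(papers)
print(strength.most_common())
```

Single-country papers add to a country's document count but not to its links, which is why document counts and total links can rank countries differently.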
3.2.2 Annual research trends
We analyzed the distribution of institutions engaged in computer vision research using
author affiliations. Research presented at international conferences in computer vision
tends to originate from universities, corporations, and government-affiliated research
centers. In this study, organizations were categorized into four groups: universities
(U), corporations (C), government-affiliated research centers (G), and others (E).
This categorization enabled an investigation of trends in the number of papers by
year and by type of research organization.
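A minimal Python sketch of this categorization is given below. It assumes that each paper has already been annotated with the set of organization types of its authors, and that a paper falls into the "others" group (E) only when no university, corporation, or government-affiliated center is involved; both assumptions, along with the sample records and names, are illustrative.

```python
from collections import Counter

# Fixed order so that combined labels always read "U, C, G".
ORDER = ("U", "C", "G")


def affiliation_label(org_types):
    """Map a set of organization types to a label such as "U", "U, C", or "E"."""
    present = [t for t in ORDER if t in org_types]
    return ", ".join(present) if present else "E"


# Hypothetical (year, organization types) records, for illustration only.
papers = [
    (2017, {"U"}),
    (2017, {"C", "U"}),
    (2022, {"U", "C", "G"}),
    (2022, {"E"}),
]
per_year = Counter((year, affiliation_label(types)) for year, types in papers)
print(per_year)
```

Counting the resulting labels per year yields a table with the same shape as the annual affiliation counts reported in the results.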
Following prior studies on development trends in computer vision [20] and session categorizations at major conferences, we classified research article
titles into four categories–study field, task, learning method, and learning model–to
better understand research trends in computer vision. The field categories were divided
into image, 3D, video, and language modalities. The tasks were divided into classification,
detection, and segmentation. Recognition and categorization were recoded as classification.
The learning methods were divided into representation, deep, shot, unsupervised, semi-supervised,
supervised, and reinforcement learning. Finally, the learning models were divided
into networks, such as the GAN, convolutional neural network (CNN), deep neural network
(DNN), graph neural network (GNN), recurrent neural network (RNN), and transformer.
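The title-based coding can be illustrated with a short Python sketch for the task category. The regular expressions, sample titles, and function name are hypothetical; the only rule taken from the text is that recognition and categorization are recoded as classification. The other categories (field, learning method, learning model) could be handled with analogous pattern tables.

```python
import re
from collections import Counter

# One pattern per task; "recognition" and "categorization" are recoded as classification.
TASK_PATTERNS = {
    "classification": re.compile(r"\b(classification|recognition|categorization)\b"),
    "detection": re.compile(r"\bdetection\b"),
    "segmentation": re.compile(r"\bsegmentation\b"),
}


def tasks_in_title(title):
    """Return the task categories whose keywords occur in a title."""
    lowered = title.lower()
    return [task for task, pattern in TASK_PATTERNS.items() if pattern.search(lowered)]


# Hypothetical (year, title) records, for illustration only.
titles = [
    (2017, "Fine-Grained Image Categorization with Deep Features"),
    (2019, "Joint Object Detection and Instance Segmentation"),
    (2019, "Action Recognition in Untrimmed Video"),
]
yearly = Counter(
    (year, task) for year, title in titles for task in tasks_in_title(title)
)
print(yearly)
```

A title may match more than one category, so per-category counts need not sum to the number of papers.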
4. Study Results
4.1. Annual Publications
Fig. 1. Annual publications.
The number of papers presented at major international conferences in
computer vision from 2017 to 2022 is illustrated in Fig. 1. Research output in computer
vision has risen consistently every year since 2017.
4.2. Semantic Network Analysis
4.2.1 Title
After refining the collected paper titles using VOSviewer, 28,198 keywords were generated.
The most frequently appearing keyword was video (258 times), followed by image (199),
end (177), object detection (166), wild (141), and semantic segmentation (136). One
keyword was used more than 200 times, nine were used between 100 and 200 times,
and 22 were used between 50 and 100 times. Keywords denoting the study fields and tasks
common in computer vision research, as well as the learning methods and models used to
train computer vision systems, were prominent (Fig. 2). Table 1 summarizes keywords used
more than 50 times with their frequencies.
Table 1. Keywords Frequency Analysis
| Frequency | Keywords |
|---|---|
| 200 times and above ($n = 1$) | video (258) |
| 100-200 times ($n = 9$) | image (199), end (177), object detection (166), wild (141), semantic segmentation (136), GAN (116), transformer (113), person re-identification (113), human pose estimation (101) |
| 50 times and above ($n = 22$) | CNN (99), deep neural network (84), single image (80), object (77), motion (76), point clouds (73), unsupervised domain adaptation (68), depth (65), action recognition (65), translation (61), shot (61), dataset (60), segmentation (58), reconstruction (55), neural network (55), fast (55), shape (54), network (54), model (54), efficient (52), representation (51), autonomous (51) |
Fig. 2. Semantic network analysis results of the article titles.
4.2.2 Coauthor Countries Analysis
An SNA based on the coauthors’ countries was conducted (Fig. 3) to identify the collaborative research dynamics in computer vision among nations.
The analysis of collaborative research at international computer vision conferences
reveals that the US has the highest number of papers, citations, and total links (3,732).
Table 2 ranks the top 10 countries by total links: the US (3,732), People’s
Republic of China (3,557), Australia (1,049), England (932), Switzerland (909), Germany
(891), Singapore (746), Canada (600), South Korea (464), and France (395).
Table 2. Top 10 Countries based on the Total Link Strength
| Country | Documents | Citations | Total links |
|---|---|---|---|
| USA | 5958 | 443151 | 3732 |
| People’s Republic of China | 5779 | 336909 | 3557 |
| Australia | 908 | 56942 | 1049 |
| England | 882 | 63978 | 932 |
| Switzerland | 673 | 30074 | 909 |
| Germany | 975 | 52870 | 891 |
| Singapore | 677 | 32416 | 746 |
| Canada | 627 | 29921 | 600 |
| South Korea | 806 | 33228 | 464 |
| France | 428 | 22271 | 395 |
4.3. Affiliated Organization Analysis
An analysis was conducted on the affiliations of papers presented at international
computer vision conferences from 2017 to 2022 (Fig. 4). The analysis revealed that university-affiliated institutions accounted for the
majority (51.29%) of the papers, indicating that research conducted solely by universities
was most prevalent. Furthermore, collaborations between universities and corporations
ranked second in frequency, followed by joint research efforts between universities
and government-affiliated research institutions.
The year-by-year trends in collaborative computer vision research across affiliations
revealed notable patterns (Tables 3 and 4). Overall, collaborations between universities and corporations displayed a substantial
increase. Additionally, joint research between universities, corporations, and government-affiliated
institutions generally grew faster than research conducted by each institution independently.
The result suggests a trend toward increased collaborative research endeavors rather
than independent research efforts by these institutions.
Fig. 3. Semantic network analysis results of the coauthor countries.
Table 3. Annual changes in affiliated organizations
| Affiliation | 2017 | 2018 | 2019 | 2020 | 2021 | 2022 | Total |
|---|---|---|---|---|---|---|---|
| U | 841 | 1047 | 1271 | 1234 | 1835 | 1634 | 7862 |
| C | 55 | 77 | 140 | 176 | 202 | 196 | 846 |
| G | 43 | 31 | 28 | 35 | 30 | 46 | 213 |
| U, C | 229 | 373 | 583 | 926 | 648 | 1399 | 4158 |
| U, G | 188 | 168 | 270 | 265 | 405 | 229 | 1525 |
| C, G | 6 | 10 | 8 | 12 | 5 | 27 | 68 |
| U, C, G | 37 | 42 | 56 | 142 | 132 | 150 | 559 |
| E | 5 | 7 | 11 | 34 | 11 | 29 | 97 |
Table 4. Year-over-year growth rate (%) for affiliated organizations.
| Affiliation | 2018 | 2019 | 2020 | 2021 | 2022 | Total (2017-2022) |
|---|---|---|---|---|---|---|
| U | 24.5 | 21.4 | -2.9 | 48.7 | -11.0 | 94.3 |
| C | 40.0 | 81.8 | 25.7 | 14.8 | -3.0 | 256.4 |
| G | -27.9 | -9.7 | 25.0 | -14.3 | 53.3 | 7.0 |
| U, C | 62.9 | 56.3 | 58.8 | -30.0 | 115.9 | 510.9 |
| U, G | -10.6 | 60.7 | -1.9 | 52.8 | -43.5 | 21.8 |
| C, G | 66.7 | -20.0 | 50.0 | -58.3 | 440.0 | 350.0 |
| U, C, G | 13.5 | 33.3 | 153.6 | -7.0 | 13.6 | 305.4 |
| E | 40.0 | 57.1 | 209.1 | -67.6 | 163.6 | 480.0 |
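The growth rates in Table 4 can be reproduced from the counts in Table 3. The sketch below assumes the usual definition of year-over-year growth, (current - previous) / previous × 100, and of overall growth as the change from 2017 to 2022 relative to 2017; these definitions are inferred from the reported values rather than stated in the text, and the function names are illustrative. Only three of the eight affiliation groups are included for brevity.

```python
# Annual paper counts for 2017-2022, copied from Table 3.
counts = {
    "U": [841, 1047, 1271, 1234, 1835, 1634],
    "C": [55, 77, 140, 176, 202, 196],
    "U, C": [229, 373, 583, 926, 648, 1399],
}


def yoy_growth(series):
    """Year-over-year growth in percent, rounded to one decimal place."""
    return [
        round((cur - prev) / prev * 100, 1)
        for prev, cur in zip(series, series[1:])
    ]


def overall_growth(series):
    """Growth from the first to the last year in percent, rounded to one decimal."""
    return round((series[-1] - series[0]) / series[0] * 100, 1)


for label, series in counts.items():
    print(label, yoy_growth(series), overall_growth(series))

# Share of university-only papers among all 15,328 collected papers.
university_share = round(sum(counts["U"]) / 15328 * 100, 2)
print(university_share)
```

Under these definitions, the university-corporation group yields an overall growth of 510.9%, and university-only papers account for 51.29% of the total, matching the figures reported in this section.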
4.4. Annual Keyword Frequency Analysis
4.4.1 Research field
Based on the categorized keywords, research related to images, videos, 3D, and multimodal
language steadily increased (Table 5 and Fig. 5). During the period, there were 2,273 articles related to images, 1,380 to videos,
1,350 to 3D, and 335 to multimodal language. The number of research articles with
these keywords more than doubled from 2017 to 2022. In particular, research on images
grew rapidly after 2018, while other computer vision fields increased more gradually.
Research on 3D followed a similar upward trend but stagnated in 2020. However, with
the increase in research on videos, research on 3D again increased after 2021. Moreover,
research on multimodal language, considered a distinct field, has increased since
2021.
Fig. 4. Affiliated organization analysis results; university (U), corporation (C),
government-affiliated research centers (G).
Table 5. Annual changes in research field.
| Keyword | 2017 | 2018 | 2019 | 2020 | 2021 | 2022 | Total |
|---|---|---|---|---|---|---|---|
| image | 195 | 265 | 394 | 415 | 446 | 558 | 2273 |
| video | 131 | 169 | 193 | 228 | 304 | 355 | 1380 |
| 3D | 91 | 137 | 182 | 252 | 298 | 360 | 1350 |
| language (multimodal) | 29 | 24 | 32 | 52 | 67 | 131 | 335 |
Fig. 5. Annual changes in research fields.
4.4.2 Task
Detection was the most frequently studied task, with 1,313 papers, followed by
segmentation (1,108) and classification (1,046). In 2017, the first
year analyzed in this study, classification was the most researched task, and it remained
dominant through 2018. After 2018, however, research on detection and segmentation increased
rapidly, overtaking classification (Table 6 and Fig. 6).
4.4.3 Learning method
Significant changes were observed in learning methods, including representation learning,
deep learning, shot learning, unsupervised learning, semi-supervised learning, supervised
learning, and reinforcement learning. The analysis highlights dynamic shifts in learning
methods over time. Deep learning dominated in the early stages, but its prevalence
shifted over time. Since deep learning is a broad term, more specific learning methods
are now emphasized. Research on representation and unsupervised learning methods increased
significantly, whereas deep and reinforcement learning research declined. Specifically,
supervised, semi-supervised, and unsupervised learning increased until 2018, after
which growth slowed. Unsupervised learning research increased rapidly after semi-supervised
learning gained attention in 2019. This suggests that growth in semi-supervised learning
boosted research on unsupervised learning (Table 7 and Fig. 7).
Table 6. Annual changes in tasks.
| Keyword | 2017 | 2018 | 2019 | 2020 | 2021 | 2022 | Total |
|---|---|---|---|---|---|---|---|
| classification | 124 | 139 | 157 | 211 | 194 | 211 | 1046 |
| detection | 89 | 125 | 211 | 255 | 283 | 350 | 1313 |
| segmentation | 73 | 104 | 174 | 204 | 263 | 290 | 1108 |
Fig. 6. Annual changes in tasks.
Table 7. Annual changes in learning method.
| Keyword | 2017 | 2018 | 2019 | 2020 | 2021 | 2022 | Total |
|---|---|---|---|---|---|---|---|
| representation learning | 18 | 18 | 50 | 73 | 111 | 118 | 388 |
| deep learning | 61 | 78 | 77 | 63 | 53 | 39 | 371 |
| shot learning | 14 | 18 | 49 | 46 | 51 | 64 | 242 |
| unsupervised learning | 22 | 27 | 45 | 55 | 101 | 102 | 352 |
| semi-supervised learning | 2 | 8 | 12 | 25 | 29 | 32 | 108 |
| supervised learning | 8 | 16 | 13 | 19 | 18 | 15 | 89 |
| reinforcement learning | 9 | 21 | 9 | 22 | 7 | 6 | 74 |
Fig. 7. Annual changes in learning methods.
4.4.4 Learning model
Analysis of learning models revealed that research on transformers has increased dramatically
since 2020. Meanwhile, studies on the CNN and RNN decreased significantly
as the transformer rose. In contrast, research on the GAN and DNN was conducted
steadily until 2021 (Table 8 and Fig. 8).
Table 8. Annual changes in learning model.
| Keyword | 2017 | 2018 | 2019 | 2020 | 2021 | 2022 | Total |
|---|---|---|---|---|---|---|---|
| convolutional neural network (CNN) | 65 | 63 | 87 | 69 | 29 | 33 | 346 |
| deep neural network (DNN) | 39 | 62 | 65 | 71 | 64 | 54 | 355 |
| graph neural network (GNN) | 2 | 2 | 3 | 10 | 15 | 11 | 43 |
| recurrent neural network (RNN) | 7 | 7 | 4 | 3 | 1 | 1 | 23 |
| generative adversarial network (GAN) | 21 | 61 | 89 | 91 | 86 | 98 | 446 |
| transformer | 4 | 6 | 5 | 23 | 135 | 346 | 519 |
Fig. 8. Annual changes in learning models.
5. Conclusion
This study examined academic research trends in computer vision through analyses of
semantic networks, international collaborations, institutional affiliations, and keyword
frequencies across years and research domains. The study collected conference papers
from major international computer vision conferences (i.e., CVPR, ECCV, and ICCV)
from 2017 to 2022, resulting in 15,328 paper titles for analysis. In addition, SNA
was conducted using VOSviewer.
Keyword frequency analysis showed that “video” emerged as the predominant keyword,
consistent with the nature of computer vision research. Subsequent critical terms
included “image,” “end,” “object detection,” “wild,” “semantic segmentation,” “GAN,”
“transformer,” “person re-identification,” and “human pose estimation.” Next, four
distinct clusters were identified through a cluster analysis derived from semantic
network connections. Among them, one cluster consists of keywords significantly influencing
the entire spectrum of computer vision research, whereas another cluster reveals a
concentration of keywords associated with specific research domains.
Based on the study findings, research in the computer vision field is predominantly
conducted in the US and China. These results align with the trends observed in previous
research that aimed to understand tendencies in computer vision research [20]. Analysis of international collaborations revealed that partnerships in the US mainly
involve corporations and universities. In contrast, universities and governmental
institutions in China have a relatively higher involvement in collaborative research
endeavors. In particular, the US stands out for its vibrant computer vision research
activities facilitated by global tech giants, such as Google, Microsoft, Amazon, and
Facebook, which have established headquarters and research centers dedicated to the
field. In contrast, China’s research landscape displays a different pattern, with
research activities concentrated in institutions, such as the Chinese Academy of Sciences,
a prominent national research institute.
According to the study findings, the research landscape in computer vision is characterized
by similarities and differences between the US and China. Although the number of research
papers and extent of collaborative research appear comparable, citation influence
significantly favors the US over China. The Center for Security and Emerging Technology
reported that China’s AI research output has increased in quantity, yet its quality
and influence still lag behind those of the US [21].
In the US, the National Science Foundation, an independent federal agency, primarily
supports long-term research and development projects at universities and research
institutions. The 2016 National AI R&D Strategic Plan emphasizes federal government
data, computing resources open to the public, and international cooperation. The 2019
Public-Private Partnership initiative promotes collaboration between industry, academia,
and government agencies to drive technological advancement [22].
In contrast, China has embarked on establishing the hardware infrastructure for AI
development, exemplified by the construction of the exascale supercomputer called
Tianhe-3 under the leadership of the National University of Defense Technology. China
is also fostering its own open ecosystem in the realm of AI frameworks. The Chinese government’s
plan to build the National Open Platform for Next-Generation Artificial Intelligence
indicates a strategy to create an independent open-source ecosystem, diverging from
US-led open-source initiatives [22].
The study results elucidate the different AI technology promotion policies in computer
vision between the US and China. The study provides insight into the national-level
concentration and dynamics of research in computer vision and serves as a meaningful
indicator of shifts in AI power dynamics.
Based on the study findings, collaborative efforts between academia and industry increasingly
shape the research landscape in computer vision [23]. Notably, the joint research endeavors between universities and companies experienced
a remarkable 510.9% growth from 2017 to 2022, aligning with previous trends in computer
vision research collaborations. The orientation of collaborative projects in computer
vision entails variations based on the predominant stakeholders in the research process
[24,
25]. Research conducted by companies often stems from immediate business goals and specific
product functionalities, providing strong motivational incentives. Conversely, academic
research tends to pursue diverse objectives and innovative ideas that may not have
been explored in previous studies, potentially leading to more imaginative outcomes
than corporate research. Given these distinctive attributes of institutional research,
collaborative ventures between companies and universities are expected to achieve
a harmonious balance between corporate objectives and academic creativity. This anticipation
underscores the establishment of collaborative frameworks that can foster synergistic
contributions, cultivating a thriving ecosystem within computer vision research.
According to this study, the frequent appearance of the keyword transformer in the
learning model domain since 2020 can be attributed to the innovative performance of
the transformer model introduced by Google. Although initially successful in natural
language processing, the transformer’s adaptable structure and performance have led
to active research and applications in image processing and computer vision [26]. Experts have predicted that new learning models would shape the paradigm of AI, and
the findings of this paper support that notion. Moreover, these results are consistent with
the pattern seen when a CNN won the ImageNet challenge in 2012 and again when
the transformer model showed impressive performance in natural language
processing; each breakthrough triggered a surge in related research. This implies that computer
vision research relies heavily on emerging state-of-the-art models for specific
tasks, which in turn shape the field’s advancement.
This study focused on major international conferences in the computer vision field
to analyze research topics, countries, institutions, and trends over the years using
SNA and a year-by-year trend analysis. However, a limitation of this study is that
it did not cover recent years, including the latest trends in generative AI, which
has become a vigorously researched area. Future research should consider the publication
timelines of papers presented at internationally prominent computer vision conferences,
while also comprehensively examining policy reports supporting AI development in different
countries. This approach will help identify the latest trends in computer vision and
suggest collaborative directions for technological and industrial advancement. In
conclusion, to establish a robust ecosystem for AI and computer vision research, future
studies should encompass the latest trends, including analyses of critical topics
across nations and institutions. Such an approach is expected to provide
deeper insight into the research trends pursued by various stakeholders.
Acknowledgement
This research was supported by the Global Joint Research Program funded by the Pukyong
National University (202301000001, 30%), the Ministry of Education of the Republic
of Korea and the National Research Foundation of Korea (NRF-2023S1A5C2A03095169, 30%),
and the Institute of Information & Communications Technology Planning & Evaluation
(IITP)-ITRC (Information Technology Research Center) grant funded by the Korea government
(Ministry of Science and ICT) (IITP-2026-RS-2020-II201749, 40%).
References
P. Daemin, A study on the applicability of media videos of deep learning models
related to computer vision, Communication Theories, Vol. 18, No. 1, pp. 111-154, 2022

M. Modzelewska-Kapituła, S. Jun, The application of computer vision systems
in meat science and industry: A review, Meat Science, Vol. 192, 2022

A. A. Khan, A. A. Laghari, S. A. Awan, Machine learning in computer vision:
A review, EAI Endorsed Transactions on Scalable Information Systems, Vol. 8, No. 32,
pp. e4, 2021

O. Hoekstra, W. Hurst, J. Tummers, Healthcare related event prediction from
textual data with machine learning: A systematic literature review, Healthcare Analytics,
Vol. 2, 2022

S. Filom, A. M. Amiri, S. Razavi, Applications of machine learning methods
in port operations: A systematic literature review, Transportation Research Part E:
Logistics and Transportation Review, Vol. 161, 2022

A. Heidari, N. J. Navimipour, M. Unal, Applications of ML/DL in the management
of smart cities and societies based on new trends in information technologies: A systematic
literature review, Sustainable Cities and Society, Vol. 85, 2022

S. Jung, AI Lee Hyun-se is coming out... Even after the death of a cartoonist,
a new work is possible, 2022

M. Gianluca, Disney is using facial recognition to predict how you’ll react to
movies, Mashable, 2017

H. K. Shin, CJ OliveNetworks applies ‘AI face synthesis’ technology to tvN drama
‘Navillera’, New Daily Economy, 2021

C. Changuk, J. Yumi, K. Junghwan, Discourse analysis on deepfake technology
using text mining, Journal of Korean Institute of Communications and Information Sciences,
Vol. 47, No. 6, pp. 870-881, 2022

S. Woo, Large capacity battery and AI eraser... Galaxy A53·33 5G, which costs
590,000 won, will be released, 2022

Z. Kang, C. Catal, B. Tekinerdogan, Machine learning applications in production
lines: A systematic literature review, Computers & Industrial Engineering, Vol. 149,
2020

F. Semeraro, A. Griffiths, A. Cangelosi, Human-robot collaboration and machine
learning: A systematic review of recent research, Robotics and Computer-Integrated
Manufacturing, Vol. 79, 2022

Y. Zhang, H. Liu, S. C. Kang, M. Al-Hussein, Virtual reality applications
for the built environment: Research trends and opportunities, Automation in Construction,
Vol. 118, 2020

E. Avsar, M. N. Mowla, Wireless communication protocols in smart agriculture:
A review on applications, challenges and future trends, Ad Hoc Networks, Vol. 136,
2022

M. Callon, J. P. Courtial, F. Laville, Co-word analysis as a tool for describing
the network of interactions between basic and technological research: The case of
polymer chemistry, Scientometrics, Vol. 22, pp. 155-205, 1991

P. Jang, Analysis of international research trends in Metaverse: Focusing on the
publications in Web of Science indexed journals, Journal of Korea Society of Computer
and Information, Vol. 27, No. 10, pp. 155-162, 2022

Y. J. Lee, J. Y. Park, Emerging gender issues in Korean online media: A temporal
semantic network analysis approach, Journal of Contemporary Eastern Asia, Vol. 18,
No. 2, pp. 118-141, 2019

S. Hwang, M. Kim, An analysis of artificial intelligence (A.I.) related studies’
trends in Korea focused on topic modeling and semantic network analysis, Journal of
Digital Contents Society, Vol. 20, No. 9, pp. 1847-1855, 2019

Y. Ci, F. Jiao, W. Tu, J. Fang, Analysis on development tendency of computer
vision and graphics based on bibliometrics, Journal of Physics: Conference Series,
Vol. 1883, No. 1, 2021

A. Acharya, B. Dunn, Comparing US and Chinese contributions to high-impact
AI research, CSET Data Brief, 2022

K. Park, J. Kim, Points to watch for the US-China AI hegemony competition in
the post-COVID era, 2020

I. Kotseruba, M. Papagelis, J. K. Tsotsos, Industry and academic research
in computer vision, arXiv preprint, 2021

R. Rothe, Bringing machine learning research to product commercialization, 2018

A. Sahuguet, Personal views on the future of artificial intelligence, 2016

A. Parvaiz, M. A. Khalid, R. Zafar, H. Ameer, M. Ali, M. M. Fraz,
Vision transformers in medical computer vision-A contemplative retrospection, Engineering
Applications of Artificial Intelligence, Vol. 122, 2023

Jaeseo Choi received his B.A. degree in mass communication from Pukyong National University
and his M.A. degree in media and communication. He is currently a Ph.D. candidate
in the College of Media and Communication at Korea University. His research interests
include media technology, the entertainment industry, and media business.
Changuk Choi received his B.A. degree in media and communication from Pukyong National
University, followed by a master’s degree from the Department of
Media and Communication. His research interests include media artificial intelligence,
platform services, and small and medium-sized enterprises.
Heera Ha received her B.A. degree in media & communication from Pukyong National University,
followed by a master’s degree from the Department of Media & Communication.
Her research interests include the content industry.
Junghwan Kim received his B.A., M.A., and Ph.D. degrees in media and communication
from Korea University, Seoul, Korea. Dr. Kim worked as a research fellow for NAVER
from 2014 to 2020 and was a visiting scholar at the Institute for Communication Research
in The Media School at Indiana University in the United States. He is currently an
associate professor in the College of Media and Communication at Korea University. His
research interests include technology and entertainment, entrepreneurship, digital
platform ecosystems, and emerging media.
Kyeongbo Kong received his B.S. degree in electronics engineering from Sogang University,
Seoul, South Korea, in 2015, and his M.S. and Ph.D. degrees in electrical engineering
from the Pohang University of Science and Technology (POSTECH), Pohang, South Korea,
in 2017 and 2020, respectively. From 2020 to 2021, he worked as a postdoctoral fellow
with the Department of Electrical Engineering, POSTECH, Pohang, South Korea. From
2021 to 2023, he was an assistant professor in the Media School at Pukyong National University,
Busan. He is currently an assistant professor of Electronics Engineering at Pusan
National University. His current research interests include image processing, computer
vision, machine learning, and deep learning.
Jaewon Royce Choi is an assistant professor in digital advertising at LSU’s Manship School of Mass Communication.
After earning his Ph.D. from UT Austin, he served as a postdoctoral research fellow
at the Spiegel Research Center within Northwestern University’s Medill School
for two years, focusing on marketing communications and audience engagement. Choi’s
research explores the dynamic interplay between digital information technologies and
consumer engagement, examining the effects of consumer engagement and the broader
social implications of technology on advertising. His methodological expertise spans
a range of quantitative approaches, including survey research, statistical modeling,
and computational social science techniques.