Title A Study on Text-to-image Model-based Dataset for Image Classification
Authors Dabin Kang; Chae-yeong Song; Dong-hun Lee; Dong-shin Lim; Sang-hyo Park
DOI https://doi.org/10.5573/IEIESPC.2026.15.2.236
Page pp.236-244
ISSN 2287-5255
Keywords Image classification; Computer vision; Generative model; Object detection
Abstract Recent advances in image generation have driven the active development of large-scale text-to-image models with remarkable performance. Alongside these models, research on applying generated images to deep learning has also progressed. However, most of this research has focused on the differences between generated and real images, with little exploration of generated images as an alternative to real image classification datasets. This paper proposes a novel framework that generates an image dataset using text-to-image models with prompts drawn from an LLM and from COCO2017 captions, and refines the images for classification tasks by ranking them with the CLIP Score. Two text-to-image models are used to create datasets of generated images, and the accuracy obtained with these datasets is assessed on object classification. The images were generated under multiple conditions by varying the generative model and the composition of the prompts, and the dataset was refined using both quantitative and qualitative methods. The results show that DALL-E 3, while effective at generating images from LLM prompts, poses challenges for image classification. Deblurring generally worsens image quality, indicating a need for specialized resolution enhancement methods. The study suggests that this approach to constructing generated datasets could be applied broadly, with potential extensions from classification to segmentation tasks.
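The ranking step described in the abstract can be illustrated with a minimal sketch. It assumes that image and caption embeddings have already been computed by a CLIP model and are available as plain lists of floats. The function names, the toy embeddings, the scaling weight of 100, and the top-k selection rule are all illustrative assumptions, not the authors' implementation; the abstract does not state how the ranked images are filtered.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def clip_style_score(image_emb, text_emb, weight=100.0):
    """Scaled cosine similarity clipped at zero.

    The weight is an assumed scaling constant, not a value from the paper.
    """
    return weight * max(cosine(image_emb, text_emb), 0.0)


def top_k_by_score(image_embs, text_emb, k):
    """Rank images by score against one caption and keep the best k."""
    scored = [
        (name, clip_style_score(emb, text_emb))
        for name, emb in image_embs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-D embeddings standing in for real CLIP outputs (hypothetical values).
text_emb = [1.0, 0.0]
image_embs = {
    "img_a": [1.0, 0.0],   # perfectly aligned with the caption
    "img_b": [0.0, 1.0],   # orthogonal to the caption
    "img_c": [1.0, 1.0],   # partially aligned
    "img_d": [-1.0, 0.0],  # opposite direction, clipped to zero
}

kept = top_k_by_score(image_embs, text_emb, k=2)
print([name for name, _ in kept])
```

In this toy setup the two images most aligned with the caption are kept and the rest are discarded. A real pipeline would replace the toy vectors with embeddings from a CLIP image encoder and text encoder, and might use a score threshold rather than a fixed k.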