||Method and Steps for Diagnosing the Possibility of Corporate Bankruptcy Using Massive News Articles
|| Corporate bankruptcy; Unstructured data; Machine learning; Text mining; Big data; News articles; Labeling; Prediction
||technology have been used in various fields in South Korea, and these techniques are now being applied to, and complemented by, service fields where they had not previously been introduced. In particular, to secure credit stability for companies that borrow from financial institutions and to respond preemptively to risk, major domestic banks are actively attempting to forecast the possibility of bankruptcy from online news articles and social networking site data and to adopt those forecasts in actual business. In this study, we propose an additional step for the established machine learning analysis process for opinion mining on massive amounts of text, and we provide an essential tool for it. We also design an algorithm for diagnosing corporate bankruptcy, which has not previously been proposed in the related research area. Through in-depth exploratory data analysis and domain-specific data analysis, we implement a rule-based automatic tagger for labeling massive amounts of unlabeled news articles and devise a novel prediction algorithm. As a result, we achieved 92% accuracy with an area under the curve of 0.96, and a hit ratio of 50% among the 26 candidates on a list predicted by financial data analysis. We therefore conclude that our approach sufficiently complements financial data analysis in performance and value, although several limitations remain, such as data coverage, reliability, and incompleteness due to linguistic characteristics.
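To make the idea of a rule-based automatic tagger and the reported metrics concrete, the following is a minimal sketch, not the tagger or the prediction algorithm used in the study. The keyword patterns, the function names `tag_article`, `accuracy`, and `auc`, and the use of English keywords are all assumptions for illustration; the actual rules come from the exploratory and domain-specific analysis and are not listed in the abstract.

```python
import re

# Hypothetical keyword rules (assumption): the study's real rules are not given here.
BANKRUPTCY_PATTERNS = [
    r"\bbankruptcy\b",
    r"\bdefault(ed|s)?\b",
    r"\bcourt receivership\b",
    r"\bdelist(ed|ing)\b",
]
_RULES = [re.compile(p, re.IGNORECASE) for p in BANKRUPTCY_PATTERNS]


def tag_article(text, min_hits=1):
    """Label an article 1 (bankruptcy-related) if at least min_hits rules match, else 0."""
    hits = sum(1 for rule in _RULES if rule.search(text))
    return 1 if hits >= min_hits else 0


def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a random
    positive example is scored above a random negative one (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In a pipeline of this kind, labels produced by `tag_article` would serve as training targets for a downstream classifier, and `accuracy` and `auc` would be evaluated on held-out articles with trusted labels. The pairwise `auc` is quadratic in the number of examples, so it suits small checks only.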