Discussion Paper Series 2026-E-7

Forecasting Recessions Using Machine Learning on Text Data and Mixed-Frequency Predictors

Yusuke Oh, Mototsugu Shintani

We forecast Japanese recessions by integrating machine learning methods, mixed-frequency data, and text-based indicators within an unrestricted mixed data sampling (U-MIDAS) framework. The model combines monthly macroeconomic variables with weekly financial indicators and newspaper-based text indicators. A pseudo-real-time forecasting exercise over three decades shows that machine learning models consistently outperform traditional logit benchmarks. The model confidence set (MCS) suggests horizon dependence: Text indicators are more informative at short horizons, while financial variables are more informative at longer horizons. To improve interpretability, we apply sparse principal component analysis (Sparse PCA) to the text indicators and identify three economic narratives: 'Corporate Distress,' 'Financial Distress,' and 'Deflationary Pressure.' Furthermore, SHAP (SHapley Additive exPlanations) analysis indicates that different recession episodes are associated with different combinations of these narratives, underscoring the heterogeneous nature of economic downturns.

Keywords: business cycles; mixed data sampling; model confidence set; text analysis; recession forecasting
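
As an illustration of the unrestricted MIDAS idea mentioned in the abstract, the sketch below stacks a weekly series into separate monthly regressors, one column per within-month week, so that each weekly lag can receive its own unrestricted coefficient in a downstream classifier. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `umidas_stack` and `build_design`, the fixed four weeks per month, and the toy numbers are all hypothetical, and the abstract does not state how the paper aligns weeks with calendar months.

```python
from typing import List, Sequence


def umidas_stack(weekly: Sequence[float], weeks_per_month: int = 4) -> List[List[float]]:
    """Reshape a weekly series into one row per month.

    Column 0 holds the most recent week of the month, column 1 the week
    before it, and so on. ASSUMPTION: every month has exactly
    `weeks_per_month` weeks, which is a simplification of real calendars.
    """
    if weeks_per_month <= 0:
        raise ValueError("weeks_per_month must be positive")
    n_months = len(weekly) // weeks_per_month
    # Drop the oldest observations that do not fill a complete month.
    offset = len(weekly) - n_months * weeks_per_month
    rows = []
    for m in range(n_months):
        start = offset + m * weeks_per_month
        block = list(weekly[start:start + weeks_per_month])
        rows.append(list(reversed(block)))  # most recent week first
    return rows


def build_design(monthly: Sequence[float], weekly: Sequence[float],
                 weeks_per_month: int = 4) -> List[List[float]]:
    """Combine an intercept, one monthly predictor, and the stacked weekly lags."""
    stacked = umidas_stack(weekly, weeks_per_month)
    if len(stacked) != len(monthly):
        raise ValueError("monthly and stacked weekly data must cover the same months")
    return [[1.0, x_m] + lags for x_m, lags in zip(monthly, stacked)]


# Toy data (hypothetical): two months of a monthly indicator, eight weeks of a weekly one.
design = build_design(monthly=[10, 20], weekly=[1, 2, 3, 4, 5, 6, 7, 8])
```

Each row of `design` could then be passed, together with a binary recession label, to a logit or machine learning classifier. Because every weekly lag gets its own column, no parametric lag polynomial is imposed, which is what distinguishes the unrestricted variant from standard MIDAS weighting schemes.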

Views expressed in the paper are those of the authors and do not necessarily reflect those of the Bank of Japan or Institute for Monetary and Economic Studies.
