Research | (Albert) Bo ZHAO 赵博

My research explores how institutions shape economic and financial outcomes, and how advances in machine learning can be used to study complex social and economic systems. More broadly, I am interested in methodological questions concerning scientific discovery, prediction, and the generation of knowledge.

Publications

JLawEcon

Does Securities Regulation Matter? Mandatory Disclosure, Excess Stock Volatility and the U.S. 1934 Securities Exchange Act

Albert Bo Zhao, Sheng Li, and Chenggang Xu

Journal of Law and Economics 2026

We examine whether the US Securities Exchange Act of 1934 significantly stabilized the market by introducing mandatory disclosure of information. We argue that mandatory information disclosure can curb stock manipulation by enhancing transparency, thereby reducing excess stock volatility. After a comprehensive assessment of the voluntary disclosure practices of companies listed on the New York Stock Exchange before 1934, we find that those with poor disclosure practices experienced a significantly greater reduction in volatility after the implementation of the act compared with those with good disclosure practices. Further analysis reveals that the liquidity of these companies with poor disclosure practices also improved significantly more than that of companies with better disclosure, and the improvement in liquidity was linked to the decrease in their volatility. Given that one key purpose of the act’s legislators was to reduce excess market volatility, our findings provide empirical support for considering this legislative aim successful.

Can regulation make markets fundamentally more stable, or does it merely impose additional compliance costs? Our research tends to support the former view.

JEF

Is Machine Learning a Necessity? A Regression-based Approach for Stock Return Prediction

Tingting Cheng, Shan Jiang, Albert Bo Zhao, and Junyi Zhao

Journal of Empirical Finance 2025

We propose a simple, linear-regression-based method for time-series prediction of stock returns. The method achieves out-of-sample performance comparable to that of machine learning methods at negligible computational cost. Its key component is the integration of a straightforward cross-market factor screening step into the iterated combination method proposed by Lin et al. (2018). Our empirical results on the U.S. stock market show that the method outperforms many state-of-the-art machine learning methods in certain periods. The method also delivers greater utility gains and investment profits in most periods after accounting for transaction costs.

I have some thoughts on the predictability of the time series of stock returns. They are laid out in the paper's Introduction.

FRL

Complete Subset Averaging Methods in Corporate Bond Return Prediction

Tingting Cheng, Shan Jiang, Albert Bo Zhao, and Zhimin Jia

Finance Research Letters 2023

We investigate the performances of two methods of complete subset averaging—complete subset linear averaging (CSLA) and complete subset quantile averaging (CSQA)—on the problem of corporate bond return prediction. We find that the two methods are overwhelmingly better than univariate linear regression and simple forecast combination. Meanwhile, CSQA is better than CSLA in most cases. For practical implementation, we also provide discussions on the selection of the hyperparameter k when applying these complete subset averaging methods.
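As a rough illustration of the idea behind complete subset averaging, the sketch below fits an OLS regression on every subset of exactly k predictors and averages the resulting forecasts. It is not the paper's code: the function name `complete_subset_forecast` and its arguments are hypothetical, only the linear (least-squares) variant is shown, and the quantile variant (CSQA) would replace the OLS step with a quantile regression.

```python
from itertools import combinations

import numpy as np


def complete_subset_forecast(X, y, x_new, k):
    """Average the OLS forecasts from every k-predictor subset.

    X     : (T, K) predictor matrix, row t aligned with target y[t]
    y     : (T,) target series (e.g. bond returns)
    x_new : (K,) latest predictor values used to form the forecast
    k     : subset size, the hyperparameter discussed in the abstract
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_new = np.asarray(x_new, dtype=float)
    T, K = X.shape
    if not 1 <= k <= K:
        raise ValueError("k must be between 1 and the number of predictors")

    forecasts = []
    for subset in combinations(range(K), k):
        cols = list(subset)
        # OLS with an intercept, using only the predictors in this subset
        design = np.column_stack([np.ones(T), X[:, cols]])
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        forecasts.append(beta[0] + x_new[cols] @ beta[1:])

    # Equal-weight average across all C(K, k) sub-model forecasts
    return float(np.mean(forecasts))
```

With k = 1 this reduces to an equal-weight combination of univariate regressions, and with k = K it reduces to a single regression on all predictors, which is why the choice of k matters in practice.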

JEF

Stock Return Prediction: Stacking a Variety of Models

Albert Bo Zhao and Tingting Cheng

Journal of Empirical Finance 2022

We employ an ensemble learning approach, “stacking”, to refine and combine a variety of linear and nonlinear individual stock return prediction models. In an application of forecasting U.S. market excess return, stacking with a simple structure can outperform the traditional historical mean benchmark, Mallows model averaging, simple combination forecast, complete subset regression, combination elastic net forecast, and several other models in terms of both in- and out-of-sample performance measures on a consistent basis. More importantly, we find that the out-of-sample gains of stacking are especially evident during extreme downside market movements. Overall, stacking can generate substantive improvements in market excess return predictability.
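To make the two-level structure of stacking concrete, here is a minimal sketch under strong simplifying assumptions. The base learners are univariate OLS regressions (one per predictor) and the meta-learner is an OLS regression of realized returns on the base models' expanding-window out-of-sample forecasts. The paper combines a wider variety of linear and nonlinear models; the function names and the `min_train` parameter below are hypothetical.

```python
import numpy as np


def ols_fit(X, y):
    """Least-squares coefficients (intercept first)."""
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


def ols_predict(beta, X):
    """Predictions for one or more rows of X."""
    X = np.atleast_2d(X)
    return beta[0] + X @ beta[1:]


def stacked_forecast(X, y, x_new, min_train=20):
    """Two-level stacking with univariate OLS base models.

    Level 0: each predictor yields expanding-window out-of-sample
             forecasts of y[min_train:], using only past data.
    Level 1: OLS of realized y on those forecasts gives the weights
             used to combine the base forecasts for x_new.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_new = np.asarray(x_new, dtype=float)
    T, K = X.shape

    # Level 0: forecasts for period t use data up to t - 1 only
    oos = np.empty((T - min_train, K))
    for t in range(min_train, T):
        for j in range(K):
            beta = ols_fit(X[:t, [j]], y[:t])
            oos[t - min_train, j] = ols_predict(beta, X[t, [j]])[0]

    # Level 1: learn how to combine the base forecasts
    meta = ols_fit(oos, y[min_train:])

    # Refit base models on the full sample and combine their forecasts
    base_now = np.array(
        [ols_predict(ols_fit(X[:, [j]], y), x_new[[j]])[0] for j in range(K)]
    )
    return float(ols_predict(meta, base_now)[0])
```

The key design choice is that the meta-learner only ever sees forecasts made without look-ahead, so the combination weights reflect genuine out-of-sample skill rather than in-sample fit.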

I was expecting an even better outcome. It may still be improved, though.

PBFJ

The Impact of COVID-19 Pandemic on the Volatility Connectedness Network of Global Stock Market

Tingting Cheng, Junli Liu, Wenying Yao, and Albert Bo Zhao

Pacific-Basin Finance Journal 2022

This paper investigates how the COVID-19 pandemic affects the connectedness network of stock market volatility in 19 economies around the world. Our method builds on the Diebold-Yilmaz volatility network model to construct the volatility spillover index, and uses lag sparse group LASSO to accommodate the high-dimensional system. We find that the outbreak of the COVID-19 pandemic strengthens the overall volatility connectedness, and the global connectedness level remains high throughout 2020. In particular, connections across different continents have become stronger during this period. However, China is shown to be disconnected from the global volatility connectedness network until late November 2020. We find evidence that China is not the main source of volatility spillover during the COVID-19 pandemic.
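For readers unfamiliar with the Diebold-Yilmaz approach, the sketch below computes a connectedness table and the total spillover index from a generalized forecast-error variance decomposition. It assumes a VAR(1) whose coefficient matrix `phi` and innovation covariance `sigma` have already been estimated; the paper's lag sparse group LASSO estimation of the high-dimensional system is not reproduced here, and the function name and default horizon are illustrative.

```python
import numpy as np


def total_connectedness(phi, sigma, horizon=10):
    """Generalized FEVD connectedness for a VAR(1): x_t = phi x_{t-1} + e_t.

    Returns the row-normalized connectedness table and the total
    spillover index in percent.
    """
    phi = np.asarray(phi, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    n = phi.shape[0]

    num = np.zeros((n, n))   # sum over h of (e_i' A_h Sigma e_j)^2
    den = np.zeros(n)        # sum over h of e_i' A_h Sigma A_h' e_i
    a_h = np.eye(n)          # moving-average coefficient A_0 = I
    for _ in range(horizon):
        impact = a_h @ sigma
        num += impact ** 2
        den += np.diag(impact @ a_h.T)
        a_h = phi @ a_h      # A_{h+1} = phi A_h for a VAR(1)

    # Scale column j by 1 / sigma_jj and row i by its forecast-error variance
    theta = num / np.diag(sigma)[None, :] / den[:, None]
    # Generalized shares need not sum to one, so normalize each row
    theta /= theta.sum(axis=1, keepdims=True)

    # Share of forecast-error variance coming from other markets
    total = 100.0 * (theta.sum() - np.trace(theta)) / n
    return theta, total
```

Entry (i, j) of the table is the share of market i's forecast-error variance attributed to shocks in market j, so the off-diagonal mass measures spillovers.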

Working papers

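On “Over-Modeling” in Economic Research: Behavioral Postulates, Institutional Frameworks, and Empirical Calibration (in Chinese; original title: 经济学研究“过度模型化”之辨：行为公设、制度框架与经验校准)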

Albert Bo Zhao

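Methodological questions in economic research have recently drawn renewed attention in the Chinese-language economics community. Professor Lu Ming (陆铭) has characterized one phenomenon in economic research as “over-modeling”: model building fails to incorporate real-world context, so that “reality is ignored to make the model hold”; and methodological complexity and precision are pursued to excess, so that “the question is sacrificed to make the method hold” (Lu Ming, 2026). This paper agrees with the observation but argues that it points not to a phenomenon peculiar to recent Chinese economics but to a basic problem that economics has long faced. The crux, the paper argues, lies not in “modeling” itself but in how to judge what counts as “excessive”. To that end, it proposes a three-part framework of behavioral postulates, institutional framework, and empirical calibration: an economic explanation needs to specify how actors form their goals, beliefs, and modes of choice; what rules and payoff consequences they face; and how large the relevant mechanisms are empirically and where their boundaries of applicability lie. “Excess” can then be understood as an imbalance or rupture among the three. Taking as its threads Ricardian abstraction and Schumpeter's critique of the “Ricardian Vice”, the dispute between Menger and the Historical School, and the disagreement between Mises and Friedman over a priori deduction and empirical testing, the paper shows that debates over the relationship among models, institutional context, and empirical evidence run through the history of economic methodology. It then discusses the controversies that arose when modern causal identification entered traditional historical research, showing that empirical tools applied to established propositions still have to answer questions of mechanism, institutional context, and incremental knowledge. Finally, using auction theory and option pricing as examples, it shows that model complexity by itself does not necessarily lead to “excess”.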

A full account of this question needs more space. I will do my best to lay out my (possibly mistaken) ideas clearly, one step at a time.

Predicting the Predictable: Decomposing and Forecasting Stock Returns in a Data-rich Environment

Tingting Cheng, Xuanbin Yang, and Albert Bo Zhao

Resubmitted to Management Science

We propose a new method for forecasting stock market returns in a data-rich environment: the factor-augmented sum-of-the-parts (FA-SOP) approach. Rather than predicting returns directly, FA-SOP decomposes them into three components (the dividend-price ratio, earnings growth, and price-earnings ratio growth, denoted gm) and models each separately. We emphasize that gm is a more promising target for forecasting due to its stronger connection with macroeconomic conditions and greater variability over time. FA-SOP forecasts gm using latent macroeconomic factors extracted from high-dimensional data, capturing the underlying state of the economy while avoiding overfitting. Applied to S&P 500 returns from 1960 to 2022, FA-SOP outperforms predictive regressions, factor-augmented regressions, and traditional decomposition approaches, yielding robust out-of-sample gains in both statistical and economic terms. Simulations based on a present-value model further show that FA-SOP’s advantage stems from its ability to track the true data-generating process more closely. Our results highlight the value of decomposing returns and focusing on components that are more predictably linked to economic fundamentals.
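The decomposition itself can be checked numerically. The sketch below uses the standard sum-of-the-parts accounting identity, in which the log total return equals growth in the price-earnings ratio (gm) plus earnings growth (ge) plus a dividend-price term (dp). It assumes the paper follows this standard identity; the function name is hypothetical, and the factor-augmented forecasting step is not shown.

```python
import numpy as np


def sum_of_parts(price, dividend, earnings):
    """Split log total returns into gm + ge + dp.

    price, dividend, earnings : equal-length arrays of index levels,
    per-period dividends, and per-period earnings.
    """
    price = np.asarray(price, dtype=float)
    dividend = np.asarray(dividend, dtype=float)
    earnings = np.asarray(earnings, dtype=float)

    multiple = price / earnings                # price-earnings ratio M_t
    gm = np.diff(np.log(multiple))             # log(M_{t+1} / M_t)
    ge = np.diff(np.log(earnings))             # log(E_{t+1} / E_t)
    dp = np.log1p(dividend[1:] / price[1:])    # log(1 + D_{t+1} / P_{t+1})

    # Log total return: log((P_{t+1} + D_{t+1}) / P_t)
    log_return = np.log((price[1:] + dividend[1:]) / price[:-1])
    return gm, ge, dp, log_return
```

Because price equals the price-earnings ratio times earnings, gm + ge recovers the log price change, and adding dp recovers the log total return exactly. Forecasting each piece separately therefore loses nothing relative to forecasting the return directly.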

Rather than treating return predictability as an all-or-nothing property, we argue that stock returns consist of both more predictable and less predictable components. Disentangling these components and analyzing them separately provides a clearer understanding of where and how predictability arises. For components with inherently low signal-to-noise ratios, a passive approach, such as using the historical mean, is sufficient. But for components with clearer economic meaning or stronger signals, more active and structured modeling could uncover substantial predictive value. This targeted approach allows us to focus modeling effort where it matters most, and avoid overfitting where little can be gained.

Combination Forecast of Corporate Bond Return: An Ensemble Learning Approach

Tingting Cheng, Shan Jiang, Hai Lin, Chunchi Wu, and Albert Bo Zhao

Revise & Resubmit at Journal of Banking and Finance

This study employs an ensemble machine learning method, known as “Stacking,” to forecast corporate bond returns. We find that the Stacking method introduces new features into combination forecasts, increasing the predictive model’s power and generating higher statistical and economic gains across bond ratings and maturities. Moreover, the method is efficient for tackling high dimensionality and achieves the best result when using predictors from corporate bond, Treasury, and stock markets jointly. While the overall performance of different Stacking models is satisfactory, simpler Stacking models appear to outperform others and generate optimal forecasts.

Combination is still what we need in time-series prediction of returns.

Direction is More Important than Speed: A Comparison of Discrete and Continuous Modeling of Stock Excess Returns

Albert Bo Zhao and Tingting Cheng

We contrast continuous magnitude estimation with discrete directional classification in equity premium prediction. Through a symmetric, single-pass out-of-sample evaluation of traditional econometric and machine learning algorithms, we document a distinct divergence in predictive performance. While continuous models struggle with low signal-to-noise ratios and generally fail to outperform the historical mean, discrete classifiers consistently reveal statistically significant predictability and generate substantial economic utility. We demonstrate that this divergence stems from the inherent sensitivity of continuous models to magnitude noise. Under continuous estimation, algorithms face a structural dilemma: they either overreact to unforecastable extreme shocks or resort to excessive shrinkage, both of which severely limit their market-timing ability during downturns. Conversely, discarding magnitude estimation acts as a form of structural regularization. This transformation frees nonlinear algorithms from extrapolating non-stationary macroeconomic trends, allowing them to utilize persistent high-frequency risk signals to execute timely market exits. Ultimately, our findings suggest that the fundamental choice of the predictive paradigm exerts a first-order impact that outweighs specific algorithmic sophistication.
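A toy example shows how the two paradigms can be scored differently. The sketch below computes the out-of-sample R² against a benchmark forecast and the directional hit rate for the same set of forecasts. The numbers are made up purely for illustration and the function names are hypothetical; this is not the paper's evaluation protocol.

```python
import numpy as np


def oos_r2(actual, forecast, benchmark):
    """Out-of-sample R^2 of a forecast relative to a benchmark forecast."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    benchmark = np.asarray(benchmark, dtype=float)
    sse_model = np.sum((actual - forecast) ** 2)
    sse_bench = np.sum((actual - benchmark) ** 2)
    return 1.0 - sse_model / sse_bench


def hit_rate(actual, forecast):
    """Share of periods in which the forecast sign matches the realized sign."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.sign(actual) == np.sign(forecast)))


# Made-up excess returns and a forecast with the right sign but the wrong size
actual = [0.01, -0.02, 0.03, -0.01]
forecast = [0.05, -0.05, 0.05, -0.05]
benchmark = [0.0, 0.0, 0.0, 0.0]   # stand-in for a historical-mean forecast

r2 = oos_r2(actual, forecast, benchmark)
hits = hit_rate(actual, forecast)
```

In this toy case the forecast gets every direction right (hit rate of 1.0) yet has a negative out-of-sample R², because squared-error scoring penalizes the magnitude errors. That is the sense in which magnitude noise can mask directional skill.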

Predicting the value of future stock returns seems to be the orthodox practice in asset pricing. The elegant theory of factor models carries both explanatory and predictive implications for statistical exercises, and it has inspired a voluminous empirical literature. Yet the first question a layperson tends to ask, which is by no means naive, is: “Do you think the market will go up (bullish) or down (bearish) in the near future?” In this paper, we systematically compare these two predictive paradigms.

Revisiting Incentive Issues in China’s Central-Local Top-Down Hierarchy

Chenggang Xu, Albert Bo Zhao, and Ziao Zhao

Warnings hanging over my head:

The cost of computing has dropped exponentially, but the cost of thinking is what it always was. That is why we see so many articles with so many regressions and so little thought.

– Zvi Griliches

If you torture the data long enough, it will confess.

– Ronald Coase