Sunday, December 10, 2017

More on the Problem with Bayesian Model Averaging

I blogged earlier on a problem with Bayesian model averaging (BMA) and gave some links to new work that chips away at it. The interesting thing about that new work is that it stays very close to traditional BMA while acknowledging that all models are misspecified.

But there are also other Bayesian approaches to combining density forecasts, such as prediction pools formed to optimize a predictive score. (See, e.g., Amisano and Geweke, 2017, and the references therein. Ungated final draft, and code, here.)

Another relevant strand of new work, less familiar to econometricians, is "Bayesian predictive synthesis" (BPS), which builds on the expert opinion analysis literature. The framework, which traces to Lindley et al. (1979), concerns a Bayesian faced with multiple priors coming from multiple experts, and explores how to get a posterior distribution that uses all of the available information. Earlier work by Genest and Schervish (1985) and West and Crosse (1992) develops the basic theory, and new work (McAlinn and West, 2017) extends it to density forecast combination.

Thanks to Ken McAlinn for reminding me about BPS. Mike West gave a nice presentation at the FRBSL forecasting meeting. [Parts of this post are adapted from private correspondence with Ken.]

Sunday, December 3, 2017

The Problem With Bayesian Model Averaging...

The problem is that one of the models considered is traditionally assumed true (explicitly or implicitly), since the prior model probabilities sum to one. Hence, asymptotically, all posterior weight gets placed on a single model -- just what you don't want when constructing a portfolio of surely-misspecified models. The earliest paper I know that makes and explores this point is one of mine, here. Recent and ongoing research is starting to address it much more thoroughly, for example here and here. (Thanks to Veronika Rockova for sending.)
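To see the asymptotic weight concentration in action, here's a minimal simulation sketch (my toy illustration, not from any of the papers above): two deliberately misspecified Gaussian "models" confront fat-tailed data, and the recursively updated posterior model probabilities pile all weight onto the model closest to the truth in the KL sense.

import numpy as np
from scipy import stats

# True DGP is fat-tailed Student's t; both candidate models are misspecified Gaussians.
y = stats.t.rvs(df=3, size=20000, random_state=np.random.default_rng(0))
models = [stats.norm(0.0, 1.0), stats.norm(0.0, 2.0)]    # neither is true
ll = np.column_stack([m.logpdf(y) for m in models])      # per-observation log likelihoods

log_w = np.log(np.array([0.5, 0.5]))                     # equal prior model probabilities
for t in range(len(y)):
    log_w = log_w + ll[t]
    log_w = log_w - np.logaddexp(*log_w)                 # renormalize
    if t + 1 in (100, 1000, 20000):
        print(t + 1, np.exp(log_w))  # all weight goes to the KL-closest model, not a "true" one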

Sunday, November 26, 2017

Modeling With Mixed-Frequency Data

Here's a bit more related to the FRB St. Louis conference.

The fully-correct approach to mixed-frequency time-series modeling is: (1) write out the state-space system at the highest available data frequency or higher (e.g., even if your highest frequency is weekly, you might want to write the system daily to account for different numbers of days in different months), (2) appropriately treat most of the lower-frequency data as missing and handle it optimally using the appropriate filter (e.g., the Kalman filter in the linear-Gaussian case).  My favorite example (no surprise) is here.  
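In the linear-Gaussian case, the missing-data Kalman filter that implements step (2) is now standard in software. Here's a minimal sketch (my toy example, not the system in the paper linked above) using Python's statsmodels, which treats NaNs as missing and handles them exactly: a monthly local level model for a series observed only in quarter-end months.

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)

# Simulate a monthly random walk, then pretend it's observed only quarterly.
T = 240
level = np.cumsum(rng.normal(size=T))
dates = pd.date_range("1998-01-01", periods=T, freq="MS")
y = pd.Series(level + rng.normal(scale=0.5, size=T), index=dates)
y_obs = y.where(y.index.month % 3 == 0)    # NaN except in quarter-end months

# Local level model at the monthly frequency; the Kalman filter simply
# skips the measurement update wherever the observation is missing.
model = sm.tsa.UnobservedComponents(y_obs, level="local level")
res = model.fit(disp=False)
monthly_level = res.smoothed_state[0]      # optimal monthly estimates of the latent level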

Until recently, however, the prescription above was limited in practice to low-dimensional linear-Gaussian environments, and even there it can be tedious to implement if one insists on MLE.  Hence the well-deserved popularity of the MIDAS approach to approximating the prescription, recently also in high-dimensional environments.     
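For readers who haven't seen it, the heart of MIDAS is a tightly parameterized distributed lag of the high-frequency data, estimable by nonlinear least squares. A minimal sketch (purely illustrative; the simulated data and the standard two-parameter exponential Almon weighting are my choices):

import numpy as np
from scipy.optimize import least_squares

def almon_weights(theta1, theta2, K):
    # Two-parameter exponential Almon lag weights over K lags (normalized to sum to one).
    i = np.arange(1, K + 1)
    w = np.exp(theta1 * i + theta2 * i**2)
    return w / w.sum()

def midas_resid(params, y, X):
    # y: (T,) low-frequency target; X: (T, K) high-frequency lags per low-frequency period.
    b0, b1, t1, t2 = params
    return y - (b0 + b1 * (X @ almon_weights(t1, t2, X.shape[1])))

# Toy data: 200 quarters, 12 monthly lags of the predictor per quarter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))
y = 0.5 + 2.0 * (X @ almon_weights(0.1, -0.1, 12)) + 0.3 * rng.normal(size=200)

fit = least_squares(midas_resid, x0=[0.0, 1.0, 0.0, -0.05], args=(y, X))
print(fit.x)   # recovers roughly (0.5, 2.0, 0.1, -0.1)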

But now the sands are shifting.  Recent work enables exact posterior mixed-frequency analysis even in high-dimensional structural models.  I've known Schorfheide-Song (2015, JBES; 2013 working paper version here) for a long time, but I never fully appreciated the breakthrough that it represents -- that is, how straightforward exact mixed-frequency estimation is becoming --  until I saw the stimulating Justiniano presentation at FRBSL (older 2016 version here).  And now it's working its way into important substantive applications, as in Schorfheide-Song-Yaron (2017, forthcoming in Econometrica). 

Monday, November 20, 2017

More on Path Forecasts

I blogged on path forecasts yesterday.  A reader just forwarded this interesting paper, of which I was unaware.  Lots of ideas and up-to-date references.

Thursday, November 16, 2017

Forecasting Path Averages

Consider two standard types of \(h\)-step forecast:

(a).  \(h\)-step forecast, \(y_{t+h,t}\), of \(y_{t+h}\)

(b).  \(h\)-step path forecast, \(p_{t+h,t}\), of \(p_{t+h} = \{ y_{t+1}, y_{t+2}, \ldots, y_{t+h} \}\).

Clive Granger used to emphasize the distinction between (a) and (b).

As regards path forecasts, lately there's been some focus not on forecasting the entire path \(p_{t+h}\), but rather on forecasting the path average:

(c).  \(h\)-step path average forecast, \(a_{t+h,t}\), of \(a_{t+h} = \frac{1}{h} \left( y_{t+1} + y_{t+2} + \cdots + y_{t+h} \right)\).
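For concreteness, here's the textbook case (my illustration): under a zero-mean AR(1), \(y_t = \phi y_{t-1} + \varepsilon_t\) with \(|\phi| < 1\), the optimal point forecasts are \(y_{t+i,t} = \phi^i y_t\), so the path average forecast collapses to a simple scaling of the current observation:

\[ a_{t+h,t} = \frac{1}{h} \sum_{i=1}^{h} \phi^i y_t = \frac{\phi (1 - \phi^h)}{h (1 - \phi)} \, y_t. \]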

The leading case is forecasting "average growth", as in Mueller and Watson (2016).

Forecasting path averages (c) never fully resonated with me.  After all, (b) is sufficient for (c), but not conversely -- the average is just one aspect of the path, and additional aspects (overall shape, etc.) might be of interest.

Then, listening to Ken West's FRBSL talk, my eyes opened.  Of course the path average is insufficient for the whole path, but it's surely the most important aspect of the path -- if you could know just one thing about the path, you'd almost surely ask for the average.  Moreover -- and this is important -- it might be much easier to provide credible point, interval, and density forecasts of \(a_{t+h}\) than of \(p_{t+h}\).

So I still prefer full path forecasts when feasible/credible, but I'm now much more appreciative of path averages.

Wednesday, November 15, 2017

FRB St. Louis Forecasting Conference

Got back a couple days ago.  Great lineup.  Wonderful to see such sharp focus.  Many thanks to FRBSL and the organizers (Domenico Giannone, George Kapetanios, and Mike McCracken).  I'll hopefully blog on one or two of the papers shortly.  Meanwhile, the program is here.

Wednesday, November 8, 2017

Artificial Intelligence, Machine Learning, and Productivity

As Bob Solow famously quipped, "You can see the computer age everywhere but in the productivity statistics".  That was in 1987.  The new "Artificial Intelligence and the Modern Productivity Paradox: A Clash of Expectations and Statistics," NBER w.p. 24001, by Brynjolfsson, Rock, and Syverson, brings us up to 2017.  Still a puzzle.  Fascinating.  Ungated version here.

Sunday, November 5, 2017

Regression on Term Structures

An important insight regarding the use of dynamic Nelson-Siegel (DNS) and related term-structure modeling strategies (see here and here) is that they facilitate regression on an entire term structure.  Regressing something on a curve might initially sound strange, or ill-posed.  The insight, of course, is that DNS distills curves into level, slope, and curvature factors; hence if you know the factors, you know the whole curve.  And those factors can be estimated and included in regressions, effectively enabling regression on a curve.
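To make the two-step idea concrete, here's a minimal sketch of the fixed-\(\lambda\) cross-sectional OLS approach of Diebold-Li (the simulated yield panel, maturities, and \(\lambda = 0.0609\) are illustrative choices, not anyone's actual data): extract the three factors period by period, then use the factor series as ordinary regressors.

import numpy as np

def dns_loadings(tau, lam=0.0609):
    # Nelson-Siegel factor loadings at maturities tau (in months).
    x = lam * np.asarray(tau, dtype=float)
    slope = (1.0 - np.exp(-x)) / x
    return np.column_stack([np.ones_like(x), slope, slope - np.exp(-x)])

def dns_factors(yields, tau, lam=0.0609):
    # Cross-sectional OLS each period; yields is (T, len(tau)).
    L = dns_loadings(tau, lam)
    beta, *_ = np.linalg.lstsq(L, yields.T, rcond=None)
    return beta.T   # (T, 3): level, slope, and curvature series

# Toy panel: latent factors evolve over time; yields = loadings x factors + noise.
rng = np.random.default_rng(2)
tau = np.array([3, 12, 36, 60, 120])
T = 100
f = np.column_stack([5 + np.cumsum(rng.normal(0, 0.1, T)),   # level
                     rng.normal(0, 1, T),                    # slope
                     rng.normal(0, 1, T)])                   # curvature
yields = f @ dns_loadings(tau).T + rng.normal(0, 0.05, (T, len(tau)))

factors = dns_factors(yields, tau)   # each row summarizes that period's entire curve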

In a stimulating new paper, “The Time-Varying Effects of Conventional and Unconventional Monetary Policy: Results from a New Identification Procedure”, Atsushi Inoue and Barbara Rossi put that insight to very good use. They use DNS yield curve factors to explore the effects of monetary policy during the Great Recession.  That monetary policy is often dubbed "unconventional" insofar as it involved the entire yield curve, not just a very short "policy rate".

I recently saw Atsushi present it at NBER-NSF and Barbara present it at Penn's econometrics seminar.  It was posted today, here.

Sunday, October 29, 2017

What's up With "Fintech"?

It's been a while, so it's time for a rant (in this case gentle, with no names named).

Discussion of financial technology ("fintech", as it's called) seems to be everywhere these days, from business school fintech course offerings to high-end academic fintech research conferences. I definitely get the business school thing -- tech is cool with students now, and finance is cool with students now, and there are lots of high-paying jobs.

But I'm not sure I get the academic research thing. We can talk about "X-tech" for almost unlimited X: shopping, travel, learning, medicine, construction, sailing, ..., and yes, finance. It's all interesting, but is there something extra interesting about X=finance that elevates fintech to a higher level? Or elevates it to a serious and separate new research area? If there is, I don't know what it is, notwithstanding the cute name and all the recent publicity.

(Some earlier rants appear to the right, under Browse by Topic / Rants.)

Sunday, October 22, 2017

Pockets of Predictability

The possibility of localized "pockets of predictability", particularly in financial markets, is obviously intriguing.  Recently I'm noticing a similarly intriguing pocket of research on pockets of predictability.

The following paper, for example, was presented at the 2017 NBER-NSF Time Series Conference at Northwestern University, even though it is evidently not yet circulating:
"Pockets of Predictability", by Leland Farmer (UCSD), Lawrence Schmidt (Chicago), and Allan Timmermann (UCSD).  Abstract:  We show that return predictability in the U.S. stock market is a localized phenomenon, in which short periods, “pockets,” with significant predictability are interspersed with long periods with little or no evidence of return predictability. We explore possible explanations of this finding, including time-varying risk premia, and find that they are inconsistent with a general class of affine asset pricing models which allow for stochastic volatility and compound Poisson jumps. We find that pockets of return predictability can, however, be explained by a model of incomplete learning in which the underlying cash flow process is subject to change and investors update their priors about the current state. Simulations from the model demonstrate that investors’ learning about the underlying cash flow process can induce patterns that look, ex-post, like local return predictability, even in a model in which ex-ante expected returns are constant.

And this one just appeared as an NBER w.p.: "Sparse Signals in the Cross-Section of Returns", by Alexander M. Chinco, Adam D. Clark-Joseph, and Mao Ye, NBER w.p. 23933, October 2017.
http://papers.nber.org/papers/w23933
Abstract: This paper applies the Least Absolute Shrinkage and Selection Operator (LASSO) to make rolling 1-minute-ahead return forecasts using the entire cross section of lagged returns as candidate predictors. The LASSO increases both out-of-sample fit and forecast-implied Sharpe ratios. And, this out-of-sample success comes from identifying predictors that are  unexpected, short-lived, and sparse. Although the LASSO uses a statistical rule rather than economic intuition to identify predictors, the predictors it identifies are nevertheless associated with economically meaningful events: the LASSO tends to identify as predictors stocks with news about fundamentals.
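This isn't the authors' code, but the basic recipe is easy to sketch with scikit-learn (the data shapes, window length, and cross-validated penalty choice below are all illustrative assumptions): each minute, regress next-minute returns on the full lagged cross section with an L1 penalty, forecast, and read the selected predictors off the sparse coefficient vector.

import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(3)
T, N = 500, 100                            # minutes, stocks (toy dimensions)
R = rng.normal(scale=1e-3, size=(T, N))    # panel of one-minute returns

X, y = R[:-1], R[1:, 0]                    # lagged cross section -> stock 0's next return
window = 300                               # rolling estimation window

forecasts = []
for t in range(window, len(y)):
    m = LassoCV(cv=5).fit(X[t - window:t], y[t - window:t])
    forecasts.append(m.predict(X[t:t + 1])[0])
    # m.coef_ is sparse: its few nonzero entries are the selected predictors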

Here's some associated work in dynamical systems theory:  "A Mechanism for Pockets of Predictability in Complex Adaptive Systems", by Jorgen Vitting Andersen, Didier Sornette, Europhysics Letters, 2005.  https://arxiv.org/abs/cond-mat/0410762
Abstract: We document a mechanism operating in complex adaptive systems leading to dynamical pockets of predictability ("prediction days"), in which agents collectively take predetermined courses of action, transiently decoupled from past history. We demonstrate and test it out-of-sample on synthetic minority and majority games as well as on real financial time series. The surprisingly large frequency of these prediction days implies a collective organization of agents and of their strategies which condense into transitional herding regimes.

There's even an ETH Zürich master's thesis:  "In Search Of Pockets Of Predictability", by A. T. Morera, 2008.
https://www.ethz.ch/content/dam/ethz/special-interest/mtec/chair-of-entrepreneurial-risks-dam/documents/dissertation/master%20thesis/Master_Thesis_Alan_Taxonera_Sept08.pdf

Finally, related ideas have appeared recently in the forecast evaluation literature, such as this paper and many of the references therein:  "Testing for State-Dependent Predictive Ability", by Sebastian Fossati, University of Alberta, September 2017.
 https://sites.ualberta.ca/~econwps/2017/wp2017-09.pdf
Abstract: This paper proposes a new test for comparing the out-of-sample forecasting performance of two competing models for situations in which the predictive content may be state-dependent (for example, expansion and recession states or low and high volatility states). To apply this test the econometrician is not required to observe when the underlying states shift. The test is simple to implement and accommodates several different cases of interest. An out-of-sample forecasting exercise for US output growth using real-time data illustrates the improvement of this test over previous approaches to perform forecast comparison.