
Why did the Bank of England need a review of its forecasting record?

Ben Bernanke, economics Nobel laureate and former chair of America’s central bank, the Federal Reserve, was invited to evaluate the Bank of England’s forecasting for monetary policy-making and communication. While his review makes a number of recommendations, some important issues are missing.

The Bank of England seriously underestimated the impact that the energy crisis following Russia’s invasion of Ukraine would have on UK inflation. The Bank initially assumed that the spike in inflation was temporary or ‘transitory’. This was probably a result of placing too much faith in the role of ‘anchored expectations’ – the idea that businesses and households would continue to believe that inflation would remain at the low and stable level at which it had stood for many years.

The Bank failed to forecast the subsequent rapid rise in the consumer prices index including owner-occupiers’ housing costs (CPIH) to more than 9%. As a result, monetary policy was well behind the curve (see Figure 1). Nevertheless, a 1970s-style inflation debacle did not happen, as we discuss below.

Figure 1: The Bank of England’s UK CPI price inflation forecasts with outcomes

Source: Bank of England Inflation fan chart, Monetary Policy Report February 2022, Figure 1.7 and CPI inflation outturns from the Office for National Statistics (ONS).

Why was a review of the Bank of England’s forecasting commissioned?

UK inflation actually rose to 11.1% on the consumer prices index (CPI) rate for all items, a measure that excludes owner-occupied housing and council tax. Although inflation peaks and the associated forecast errors by central banks were even higher in a range of European countries, these mistakes helped to prompt the commissioning of a report to consider ‘the appropriate approach to forecasting and analysis in support of decision-making and communications in times of high uncertainty from big shocks and structural change’.

The subsequent review undertaken by Ben Bernanke – a former chair of the US Federal Reserve and a co-recipient of the 2022 Nobel Prize in Economic Sciences for his work on banks and financial crises – could and should lead to major changes at the Bank of England.

His review exposes important, worrying and often surprising failings. These lead to 12 recommendations (denoted R1-R12 below) to improve and maintain the institution’s forecasting infrastructure, data management, software and economic models; to learn from past forecast errors to deal with uncertainty and structural change; and to improve communication about the risks and uncertainties surrounding its outlook.

The response from the Bank of England claims that the review’s recommendations will be implemented – time will tell. In what follows, we briefly summarise Bernanke’s 12 recommendations. More details and extensive commentary can be found in the Bank Watchers book, a collection of responses to the review by leading economists.

How difficult was it to forecast UK inflation?

Figure 2 records the behaviour of UK annual CPI price inflation, and changes in unemployment, GDP and wages. The rapid upsurge in inflation exceeded any rate previously experienced in the 21st century, albeit far less than the peak of around 25% during the 1970s oil crises.

In contrast, the swings in GDP caused by the Covid-19 lockdowns, from -28% to +25%, were unprecedented since 1860.

Because of the job furlough scheme, unemployment and wages did not respond greatly to such huge changes. Thus, during 2020, historical relationships between the key macroeconomic variables broke down.

Figure 2: UK annual price inflation, changes in unemployment, GDP and wages (12 month changes in unemployment and in logs of CPI, GDP and average weekly earnings, all percentages)

Source: Office for National Statistics (ONS)

Could better forecasts have been produced?

An agile Bank of England – using appropriate econometric technology for dealing with outliers and shifts to reveal the underlying economic relations – could have done more to circumvent these difficulties, at least within its models.

No one can forecast unpredictable events, and mis-forecasts over 2022 like those produced by the Bank of England were also common at other central banks. Nevertheless, the first two Bernanke recommendations (R1 and R2) reveal that the Bank of England even needed basic advice on data management and model development.

  • R1: ‘The ongoing updating and modernisation of software to manage and manipulate data should be continued with high priority and as rapidly as feasible’.
  • R2: ‘Model maintenance and development should be an ongoing priority, supported by a significant increase in dedicated staff time and adequate resources, including specialised software as needed’.

The Bank is in charge of monetary policy, which obviously requires up-to-date information that can be incorporated rapidly into its forecasts when big changes occur.

Why was near hyperinflation, as in the 1970s, avoided?

When recently challenged to test its forecasting model on data from the 1970s, the Bank threw up a smokescreen of ‘that is not relevant’. Until the Bernanke review, we had thought that was to avoid the embarrassment of seeing its models fall apart, but perhaps it hid a lack of modelling capacity. As Table 1 shows, a period with war, an energy crisis and a fuel price explosion followed by high inflation was potentially relevant.

Table 1 summarises evidence from models of UK price and wage inflation, productivity and unemployment over a century and a half of data. In 1974, for the GDP deflator, price inflation was 16% when commodity prices rose by 130%, whereas in 2022, price inflation was 9% when energy prices (an average of oil and gas prices) rose by 170% even though interest rates were far lower.

Table 1: Annual UK inflation contributions from its main components

Source: 1970s figures are from Hendry, 2001 and the 2020s figures from Castle et al, 2023, which also provides data definitions and sources.

The model does well in capturing the 2022 inflation surge. The over-prediction for 2023 came from assuming an 80% pass-through of 2022 inflation into unit labour costs, against an outcome of 60%; using the lower figure would imply inflation about 2 percentage points lower.

The forecasts also assumed too large a fall in energy use per unit of capital in 2023 in response to higher energy prices. The smaller actual fall acted like an energy supply constraint, which would have raised the inflation forecast by about 1 percentage point.
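
As a rough back-of-the-envelope illustration of the orders of magnitude involved (our own simplification, assuming 2022 inflation of roughly 10% and that the difference in pass-through feeds one-for-one into the 2023 forecast):

$$
\Delta\pi_{2023} \approx (\phi_{\text{assumed}} - \phi_{\text{actual}})\times\pi_{2022} \approx (0.8 - 0.6)\times 10\% = 2 \text{ percentage points},
$$

which matches the roughly two-point over-prediction attributed to the pass-through assumption.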

Figure 3 helps to explain why 1970s near hyperinflation was avoided (see Haskel et al, 2024 for alternative explanations).

Figure 3: UK annual energy usage and wage-price spiral

Source: Castle et al, 2023

Panel A of Figure 3 shows that the role of energy in the UK economy was 50% higher in 1973. Coal has since been essentially eliminated, and both oil and natural gas use are far lower now, so the magnitude of the energy inflation shock’s impact was smaller.

Next, Panel B shows how inflation affected wages – specifically, the percentage pass-through from price inflation to wages, plotted by date (see R4(c) below). In the chart, zero corresponds to 100% erosion of real wages by the inflation rate shown (which occurs when inflation is very low, so ‘inattention’ is not costly), whereas 100% denotes that wages are fully compensated for price inflation, so there is complete pass-through to costs.

Inflation began at a much higher level in 1973, when pass-through was close to 100%, whereas inflation was low in 2021. The forecasts in Table 1 assumed that pass-through would be 80% in both years, whereas it reached 90% in 2022 but fell back to 60% in 2023.

How bad were the Bank of England’s models?

The third recommendation in the Bernanke review states that ‘Over the longer term, the Bank should undertake a thorough review and updating of its forecasting framework, including replacing or, at a minimum, thoroughly revamping COMPASS’ (one of its forecasting models).

When the Bank of England’s earlier quarterly model (BEQM, a dynamic stochastic general equilibrium – DSGE – system) failed badly in the global financial crisis of 2007-09, its replacement (COMPASS, introduced in 2011) was developed. It was claimed that BEQM was too cumbersome and that COMPASS would be easier to use.

But a more fundamental problem is that the formulation on which COMPASS was also based (the New Keynesian DSGE formulation) is mathematically flawed. Further, research at the Bank in 2015 had already revealed that COMPASS was pointing in the wrong direction, and like BEQM, performed badly during the global financial crisis. This was not a surprise to us (although in practice the Bank may use a mix of forecasting models for inflation).

Existing research demonstrates the invalidity of the mathematical basis central to DSGEs (Hendry and Mizon, 2014). Specifically, conditional expectations are not always minimum mean-squared error predictors (contrary to what elementary statistics textbooks teach): that property, like the law of iterated expectations used to derive DSGE equations, requires the absence of shifts in the distributions of the variables.

When the means of variables change, the equations of DSGEs will shift, so the existing formulation will suffer systematic forecast failure until appropriately revised. (This serious problem was summarised less technically in ‘Why DSGEs crash during crises’, and reiterated in ‘Macro theory and models at the Bank of England’, which also highlights a number of more detailed flaws in COMPASS.)
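
To see the point in the simplest possible setting (a stylised illustration in our notation, not the review’s), suppose the variable being forecast has a mean that shifts between periods:

$$
y_{t+1} = \mu_{t+1} + \varepsilon_{t+1}, \qquad \varepsilon_{t+1} \sim \text{IID}(0,\sigma^2), \qquad \mu_{t+1} = \mu_t + \delta .
$$

An agent (or model) forming the conditional expectation at time $t$ from the period-$t$ distribution predicts $\mathrm{E}_{t}[y_{t+1}] = \mu_t$, so the forecast error is $\delta + \varepsilon_{t+1}$: the ‘rational’ conditional expectation is biased by $\delta$ and is no longer the minimum mean-squared error predictor. Once the shift has occurred, even a naive device such as $\hat{y}_{t+1} = y_t$ avoids the systematic bias. For the same reason, the law of iterated expectations linking expectations formed at $t$ and $t+1$ only holds when the distributions at the two dates coincide, which is exactly what fails when means shift.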

Key among these issues is the misleading account that COMPASS gives of how monetary transmission works, omitting the credit channel, the cash-flow channel and the asset price channel. Research shows the relevance of these channels, how they differ across economies, and why they would be especially important in the UK, given its housing and mortgage markets – see R4(a) below (Muellbauer, 2022).

A plethora of new thinking about the micro-foundations of macroeconomics in the context of heterogeneous agents and evidence-based research has fatally undermined the credibility of New Keynesian DSGEs (see Muellbauer, 2022, pp. 237-238, for a bird’s-eye view).

Is a new forecasting framework really essential?

The Bernanke review sets out requirements for a revamped forecasting framework in recommendation four (R4), based on the lessons of recent years. This recommendation has five parts:

  • R4(a): Include rich and institutionally realistic representations of the monetary transmission mechanism, allowing for alternative channels of transmission.
  • R4(b): Empirically based modelling of inflation expectations.
  • R4(c): Models of wage-price determination that allow gradual adjustment and causation from prices to wages as well as from wages to prices.
  • R4(d): Detailed models of the key sectors of the UK economy (including finance, housing and energy).
  • R4(e): Analyses of inflation should consider supply-side factors as well as the state of aggregate demand.

The detailed recommendations on a revamped forecasting framework – in parts (a), (d) and (e) – are all important, but why were they needed? Surely all central banks need realistic representations of the monetary transmission mechanism and detailed models of key sectors of the economy.

Part (b) is no surprise given Ben Bernanke’s own recent publication; but (c) and (e) again reflect the Bank of England’s unwillingness to adapt its models. A wage-price spiral was rigorously established in earlier research using a real-wage model that is constant over 160 years (Castle and Hendry, 2009; see the less technical summary). It has also been shown to matter considerably for the persistence of inflation (Castle et al, 2023).

Recommendation five refers to incrementalism (basing new forecasts on those that have gone before, with marginal adjustments) and the use of ad hoc judgements that could ‘obscure deeper problems with the underlying forecasting framework or unrecognised changes in the structure of the economy’.

It states that errors and misses should be highlighted and evaluated, and that staff should ‘routinely meet with Monetary Policy Committee (MPC) members to consider whether structural change, mis-specification of models, or faulty judgements warrant discrete changes to the key assumptions or modelling approaches used in forecasting’.

As with R6, it is worrying that the Bank needs advice to re-evaluate its empirical models regularly and to determine how structural change and model mis-specification may have contributed to forecast failure. Surely central banks must do so.

Should the Bank of England ditch fan charts and use scenarios?

Recommendation seven on the use of scenarios should only follow a fundamental revamping of the Bank’s modelling framework. Researchers have discussed what can go wrong in scenarios and what is required to validate them (Hendry and Pretis, 2022).

A scenario based on a bad model that would shift when the scenario is implemented is worse than useless. It is crucial (and relatively easy) to test the invariance of the relevant equations of the model to past changes in policy variables.
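
The review does not prescribe how such a check should be done, but one simplified version (in the spirit of invariance and super exogeneity tests) is sketched below. The data, break dates and variable names are illustrative assumptions of ours, not the Bank’s procedure: the idea is simply to add impulse indicators for dates of known past policy changes to the relevant equation and test whether they are jointly significant. Failure to reject supports invariance of that equation to those interventions.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Illustrative quarterly data: inflation explained by its own lag and a policy rate.
n = 120
index = pd.period_range("1995Q1", periods=n, freq="Q")
policy_rate = 2 + np.cumsum(rng.normal(0, 0.2, n))
inflation = np.empty(n)
inflation[0] = 2.0
for t in range(1, n):
    inflation[t] = 0.5 + 0.7 * inflation[t - 1] - 0.2 * policy_rate[t] + rng.normal(0, 0.3)
df = pd.DataFrame({"infl": inflation, "rate": policy_rate}, index=index)

# Hypothetical dates of large past policy interventions (assumptions for illustration).
break_dates = ["2008Q4", "2020Q2"]
for d in break_dates:
    df[f"I_{d}"] = (df.index == pd.Period(d, freq="Q")).astype(float)

# Conditional (policy-relevant) equation augmented with the break indicators.
regressors = pd.concat(
    [df["infl"].shift(1).rename("infl_lag1"), df["rate"],
     df[[f"I_{d}" for d in break_dates]]],
    axis=1,
)
X = sm.add_constant(regressors).dropna()
y = df["infl"].loc[X.index]
fit = sm.OLS(y, X).fit()

# Joint F-test that the break indicators add nothing to the equation: failure to
# reject is consistent with invariance of this relation to those past policy changes.
print(fit.f_test("I_2008Q4 = 0, I_2020Q2 = 0"))
```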

Most scenarios must be wrong, as, at best, only one can reflect the resulting reality. As a result, it is hard to see how, with a bad model, scenarios will help communicate the policy rationale.

A single properly tested scenario (or at most a very small number), in which the model in use is invariant to the change that the Bank of England intends to implement, published alongside a viable forecast (discussed below), might help to illustrate the uncertainty and the likely impacts of a policy change.

As the review indicates, ideally the selection of scenarios ‘would help the public better understand the reasons for the policy choice, including risk management considerations’. Including a simulation of the monetary policy response could help markets to understand the reaction function of the MPC.

How should the Bank of England communicate its policy choices?

The eighth recommendation in the review refers to the publication of scenarios. The comments on R7 apply: whether or not this would help transparency depends on the invariance of the model to the relevant policy changes.

The central forecast should reflect the MPC’s views of likely policies (R9). Much of the discussion behind recommendations seven, eight and nine regarding the conditioning path of future interest rates is highly dependent on the validity of the resulting scenarios.

Bernanke also calls for the replacement of ‘detailed quantitative discussion of economic conditions in the Monetary Policy Summary in favour of a shorter and more qualitative description’ (R10). Replacing the market-based path for the policy rate with the MPC’s own forecasts of bank rate seems sensible and avoids contradictions when there is a policy change. Adopting qualitative explanations in the policy statement (that is, foredictions) would not preclude evaluating them.

The review also calls for the elimination of the use of fan charts as they ‘have weak conceptual foundations, convey little useful information over and above what could be communicated in other, more direct ways, and receive little attention from the public’ (R11).

Fan charts can have a viable basis and can communicate the range of forecast uncertainty if the model is robust and has accurate variance estimates. The record of past forecast errors may do little to improve trust in the Bank of England, and would ideally need to distinguish between the impacts of large unexpected events that cannot be forecast (such as pandemics, wars and crises), badly mis-measured initial flash data (as occurred during the global financial crisis), flaws in the models, and judgemental mistakes from MPC interventions. Such a separation could provide a measure of the fundamental uncertainty faced by decision-makers.

Recommendation 12 calls for a phased implementation of any radical changes, which is indeed advisable.

Was the Bernanke review comprehensive?

No. The recommendations can be summarised as: more data, better models with more flexibility and an improved financial sector. The review also calls for more scenario analyses, so a semi-structural model seems set to be the way forward.

Balancing theory and empirical evidence, keeping theory as the null hypothesis, is easily undertaken with robust software that also tests for the outliers and shifts that would otherwise distort parameter estimates.
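
One widely used approach to detecting such outliers and shifts is impulse-indicator saturation. Purely as an illustration, here is a simplified split-half version in Python, a sketch under our own assumptions about names and the retention threshold, not any particular package’s implementation: saturate one half of the sample with impulse dummies, keep the significant ones, repeat for the other half, then test the retained indicators together.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def iis_split_half(y: pd.Series, X: pd.DataFrame, t_crit: float = 2.58) -> list:
    """Split-half impulse-indicator saturation: return names of retained dummies."""
    n = len(y)
    names = [f"I_{t}" for t in range(n)]
    dummies = pd.DataFrame(np.eye(n), index=y.index, columns=names)
    retained = []
    for block in (names[: n // 2], names[n // 2:]):
        # Add an impulse dummy for every observation in this half and keep the
        # ones with |t| above the retention threshold.
        fit = sm.OLS(y, pd.concat([X, dummies[block]], axis=1)).fit()
        retained += [c for c in block if abs(fit.tvalues[c]) > t_crit]
    if retained:
        # Final pass: re-estimate with the union of retained indicators.
        fit = sm.OLS(y, pd.concat([X, dummies[retained]], axis=1)).fit()
        retained = [c for c in retained if abs(fit.tvalues[c]) > t_crit]
    return retained

# Illustrative use: a constant-mean series with one injected outlier.
rng = np.random.default_rng(1)
y = pd.Series(rng.normal(0, 1, 100))
y.iloc[50] += 8.0                          # inject a large outlier at t = 50
X = pd.DataFrame({"const": np.ones(100)})
print(iis_split_half(y, X))                # expected to flag 'I_50'
```

Indicators retained by such a procedure flag the observations that would otherwise distort the estimated parameters of the economic relation.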

Nevertheless, six important issues are missing from the review:

  • A critique of the pernicious role of equilibrium-correction mechanisms (EqCMs) when forecasting after shifts.
  • The invalid derivation of DSGEs facing breaks (discussed above).
  • Forecasts that are robust after shifts.
  • Testing the invariance of policy models to interventions.
  • The usefulness of the modelling framework for developing macro-prudential policy.
  • Climate change.

EqCMs characterising dynamic adjustment around an invariant long-run solution can be disastrous when forecasting after shifts, as they revert to the previous equilibrium mean. A startling example is the decade-long series of forecasts for UK productivity by the Office for Budget Responsibility (OBR), which were seriously wrong, systematically over-predicting productivity growth five years ahead, as shown in Panel A of Figure 4.

This problem with EqCMs (which includes DSGEs) was highlighted by the forecast-error taxonomies in Clements and Hendry (1998), and by Hendry and Mizon (2012) for open models conditioned on assumptions about the paths of variables outside the system.

But since a given model can be used in different ways to produce forecasts, transforms of EqCMs that are more robust after location or trend shifts can avoid, or at least reduce, systematic forecast failure. Dramatic reductions are possible – for example, using a smoothed robust version of a model like the OBR’s reduces the root mean-squared forecast error by 75% when forecasting the decade after the first quarter of 2010, as shown in Panel B of Figure 4. Its forecasts assume no future breaks or policy interventions, so they would fail badly in 2020.
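
A stylised example (our notation, not the OBR’s or the Bank’s actual equations) shows the mechanism. Consider an equilibrium-correction model

$$
\Delta y_t = \gamma\,(y_{t-1} - \mu) + \varepsilon_t, \qquad -1 < \gamma < 0 .
$$

If the long-run mean shifts to $\mu^{*} = \mu + \delta$ but the model retains $\mu$, every forecast keeps pulling $y$ back towards the old equilibrium, so the one-step forecast errors contain a systematic component $-\gamma\delta$ period after period. Differencing the same equation gives a robust device,

$$
\widehat{\Delta y}_t = (1+\gamma)\,\Delta y_{t-1},
$$

which no longer involves the (now wrong) mean $\mu$: once the shift is in the data, the systematic bias disappears after a couple of periods, at the cost of somewhat noisier forecasts.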

Figure 4: The OBR’s UK productivity forecasts and forecasts from a smoothed robust version

Source: Martinez et al, 2022

The lack of emphasis on the need for policy models and their forecasts or scenarios to be invariant to policy changes is a major omission, although we welcome the discussion behind R5 on model re-evaluation. Implicitly, this is a serious criticism of the failure, in recent decades, to test the Bank of England’s models and to modify them where necessary. This is something that the review now strongly recommends.

After the global financial crisis, all central banks have focused far more than before on financial stability, and they have invested heavily in the appropriate governance and design of macro-prudential policy.

Stress tests of the financial system, and the design of interventions such as the Financial Policy Committee’s 2014 decisions to impose flow limits on high loan-to-income mortgages and an affordability test, need to be simulated in a system-wide context. A policy model without a banking sector, a credit channel and the transmission of policy through the housing market is not useful for this purpose.

Finally, although checking for the impact of extreme events is discussed, the need for the Bank to be prepared for those deriving from climate change is not. With UK commercial banks having made huge loans to fossil fuel industries, sudden legislation to offset the worst effects of accelerating climate change could precipitate another financial crisis.

Should the Bank of England abandon forecasting?

Definitely not. Many large shocks are essentially unpredictable and will lead to forecast failure. Nevertheless ‘even murky glasses are better than a blindfold’ (letter with Jennifer Castle and Jurgen Doornik to the Financial Times published on 11 April 2020 and selected that December as the newspaper’s Letter of the Year).

Forecasting in a policy context requires additional care. A successful forecast of a problem that policy action then prevents from occurring is not a failure: the aim of a policy agency should not just be ex ante forecast accuracy, but forecasts that are useful for their purpose.

Will the Bank implement the Bernanke review’s main recommendations?

Important changes are needed, but these should not be rushed, especially on the software front where the UK has an unbeaten record of failed major software schemes.

Coordination with the Office for National Statistics (ONS) on adapting the Bank’s data acquisition and preparation system seems imperative. By comparison, the Fed’s FRED database has a convenient application programming interface (API) for retrieving data for further modelling, whereas that of the ONS is rather cumbersome at present. This is also a gap that the Economics Observatory’s API and data hub are seeking to fill.

One caveat concerns data revision and updating: replicating model results requires keeping the data fixed for a period, so running models directly on live data seems risky. Instead, it appears necessary to create data vintages that can be queried, albeit requiring more work. The failure to maintain a long historical record of data at the ONS is a serious handicap.
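
To illustrate what vintage-aware retrieval can look like, here is a minimal sketch using FRED’s public observations endpoint. The series identifier, dates and API key are placeholders of ours, and any ONS equivalent would need a different interface; the point is that pinning the real-time window returns the data as it stood on a given date rather than today’s revised figures.

```python
# Minimal sketch: fetch a series from FRED pinned to a past data vintage by
# setting the API's real-time window (ALFRED-style). The series ID, date and
# API key below are illustrative placeholders.
import requests

FRED_URL = "https://api.stlouisfed.org/fred/series/observations"

def fetch_vintage(series_id: str, vintage_date: str, api_key: str) -> list:
    """Return observations for `series_id` as they appeared on `vintage_date`."""
    params = {
        "series_id": series_id,
        "api_key": api_key,
        "file_type": "json",
        # Pinning both real-time bounds to the same date requests the data
        # vintage available on that date, rather than the latest revisions.
        "realtime_start": vintage_date,
        "realtime_end": vintage_date,
    }
    response = requests.get(FRED_URL, params=params, timeout=30)
    response.raise_for_status()
    return response.json()["observations"]

# Example call (hypothetical series ID and key): a UK CPI series as known at end-2022.
# obs = fetch_vintage("GBRCPIALLMINMEI", "2022-12-31", api_key="YOUR_KEY")
```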

Splicing national accounts data together to produce fairly consistent series (for example, when accounting concepts change, or when the weights of wholesale price indices change) has not been a priority at the ONS. The Bank of England is to be commended for putting resources into ‘A millennium of macroeconomic data’.

The Bank has been adept at ducking aspects of past reports that it does not like. For example, the Pagan report’s recommendation of developing an automatically selected small macro model was ignored, while those about DSGEs were adopted.

In the meantime, the author of that report advised the Reserve Bank of Australia (RBA) in 2018 to adapt its (non-DSGE) ‘suite of models’ into the common framework of a semi-structural econometric policy model that satisfies the various national accounting and sectoral data adding-up constraints.

The Bank of England is now in a small minority of central banks lacking this type of model, which has become widely adopted, especially since the global financial crisis.

Conclusion

Most of the recommendations in the Bernanke review of the Bank of England’s modelling and forecasting for monetary policy-making and its communication ought to be implemented, as we have discussed. Many were unexpected as they concerned what any central bank in charge of monetary policy should have been doing. But some important issues were missing and they need to be added to the list of actions facing the Bank.

Where can I find out more?

Who are experts on this question?

  • David Hendry
  • John Muellbauer
  • Jagjit Chadha
  • David Aikman
Authors: David F Hendry and John NJ Muellbauer, Nuffield College, Oxford University


