Taking Long-Term Stock Returns Seriously

Credit Suisse was founded in 1856 and was shut down earlier this month by Swiss bank regulators, who forced the sale of the firm to UBS. Thus, there is some irony and even poignancy in looking at the just-published 2023 yearbook of the Credit Suisse Research Institute, titled “Credit Suisse Global Investment Returns: Leading perspectives to navigate the future,” written by Elroy Dimson, Paul Marsh, and Mike Staunton. A summary edition of the report is freely available online.

Each year, a main emphasis of the report is on long-term returns going back to about 1900. Here’s a graph showing nominal and inflation-adjusted returns for US stocks, bonds and “bills” (short-term government debt). Yes, investing $1 in a diversified portfolio of US stocks back in 1900 and reinvesting all dividends would have led to a real gain by a factor of more than 2,000 since then. (Notice that the vertical axis is logarithmic, rising by factors of 10.)
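To put that factor of 2,000 in annual terms, here is a minimal sketch of the compound-growth arithmetic. It assumes a growth factor of exactly 2,000 (the text says “more than”) and a span of 123 years (start of 1900 through end of 2022). Both inputs and the function name are illustrative assumptions, not figures taken from the report.

```python
def annualized_return(growth_factor: float, years: float) -> float:
    """Constant annual rate that compounds to growth_factor over the given years."""
    return growth_factor ** (1.0 / years) - 1.0


# Assumed inputs: $1 grows to $2,000 in real terms over 123 years.
real_rate = annualized_return(2000.0, 123)
print(f"Implied real return: {real_rate:.2%} per year")
```

Under these assumptions the implied real return works out to a bit over 6% per year, which shows how a seemingly modest annual rate compounds into a very large multiple over more than a century.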

In addition, this long-term view–using annual data–puts some prominent events into perspective. The authors write:

The chart shows that US equities totally dominated bonds and bills. There were severe setbacks of course, most notably during World War I; the Wall Street Crash and its aftermath, including the Great Depression; the OPEC oil shock of the 1970s after the 1973 October War in the Middle East; and four bear markets so far during the 21st century. Each shock was severe at the time. At the depths of the Wall Street Crash, US equities had fallen by 80% in real terms. Many investors were ruined, especially those who bought stocks with borrowed money. The crash lived on in the memories of investors for at least a generation, and many subsequently chose to shun equities.

The top two panels of Figure 10 set the Wall Street Crash in its long-run context by showing that equities eventually recovered and gained new highs. Other dramatic episodes, such as the October 1987 crash, hardly register; the COVID-19 crisis does not register at all since the plot is of annual data, and the market recovered and hit new highs by year-end; the bursting of the technology bubble in 2000, the Global Financial Crisis of 2007–09 and the 2022 bear market show on the chart but are barely perceptible. The chart sets the bear markets of the past in perspective. Events that were traumatic at the time now just appear as setbacks within a longer-term secular rise.

But it’s also worth remembering that the US investment experience is extraordinary. As the authors put it, it would be “unwise for investors around the world to base future projections solely on US evidence.” Here’s a figure with international comparisons. Although two tiny stock markets, South Africa (ZAF) and Australia (AUS), have outperformed the US stock market over time, the US market has dominated world returns. For the record, most of the abbreviations here are for countries, but WLD is the index for the entire world, WXU is the world leaving out the US, EUR is Europe, DEV is developed markets, and EMG is emerging markets.

The extraordinary growth in the US stock market since 1900 means that, when it comes to global equity markets, the US stock markets dominate the world. Here are the sizes of stock markets around the world in 1900 and in 2022.

Poll: Less Total Government Spending, but Not In Any Category

The American public is in favor of less total government spending, but it would prefer to avoid reducing spending in almost every category. Here are two figures showing results from an AP/NORC poll (March 29, 2023).

The first figure shows results for whether people believe the government is overspending as a whole. Overall, 60% of the public thinks government spends “too much” and 16% says “too little,” with 22% in the “about right” category.

But when you ask about specific categories, the public wants to see expanded spending in most areas. The only area which has a clear majority for spending “too much” is assistance to other countries. Other surveys have typically found that the public vastly overestimates the amount spent in this category, usually thinking that it covers about 25-30% of total federal spending–when it actually is only about 1% of all federal spending.

The conflict between these results–which are from the same survey!–suggests that the public wants politicians who advocate both for less total spending and also more spending in many individual categories. On this subject, in other words, the public is wide-open to embracing demagoguery.

Interview with Yasheng Huang on the Development of the Chinese State

Many of us comment on China by reading the second-hand literature published in English. Yasheng Huang is Professor of Global Economics and Management at the MIT Sloan School of Management, who came from China to the United States in the 1980s and has thus had the freedom and a front-row seat to study China’s evolution since then. Tyler Cowen has one of his “Conversations with Tyler” with Huang “on the development of the Chinese state” (March 8, 2023, audio and transcript available). Much of the discussion is about how the tradition of China’s civil service examinations evolved over the centuries, and the effects on literacy, creativity, and commerce. Here, I’ll focus on some of Huang’s comments more related to current events:

What is a big misconception about China’s economy?

[O]ne of them is that they look at the Chinese R&D spending, and they look at, for example, some of the impressive technological progress the country has made, and then they drew the conclusion that the Chinese economy is driven by productivity and innovations. In fact, studies show that the total productivity contributions to the GDP have been declining in the last decade and even more. As China has begun to invest more in R&D, the economic contributions coming from technology, coming from productivity have been actually declining. In the economic sense, it’s not a productivity-driven economy. It is an overwhelmingly investment-driven economy.

I think that’s one of the biggest misunderstandings of Chinese economy. It entails implications about the future prospects of the country, whether or not you can sustain this level of economic growth purely on the basis of massive investments.

Huang also offers some thoughts on the nature of political protest in China and how the Communist Party shapes the form of protests in a way that helps the Party hold on to power.

There’s a difference between a civil society consisting of isolated individual actions and a civil society that consists of organized activities that have a program, that have financial support, that have the capability to operate independently. By the second criterion, China has none of that.

If you look at the recent protests against Zero-COVID controls, let’s keep one number in perspective. By various estimates, in 2022 there were probably 400 million people under some sort of long-term quarantine. And let me just concretize that word quarantine. That means you’re essentially locked up in your home, sometimes for weeks, and in some cases, for two months. That’s the level of the suffering, and sometimes you can’t get food. Sometimes you cannot get patients into the emergency room because the hospitals also shut down, refused to take in patients who are tested positive or who cannot show a negative test on COVID. Some people have died. There are suicides, there are fires, and all these collateral damages from the Zero-COVID control.

Relative to that, China experienced a wave of protests — by one estimate, in 17 cities. I don’t really have a good idea how many people were involved, but we are not talking about millions of people. We’re talking about maybe 10,000 people, or tens of thousands of people.

Contrast that with Iran. In the case of Iran, one woman died in the hands of the moral police. There were other grievances, but that was the trigger. The protests are still going on. Millions of the people took to the street. … If you look at the color revolution in Tunisia, it started with a peddler whose assets were confiscated by the government official, and then he committed suicide. That sparked the color revolution.

Those kinds of brutalities toward small peddlers happen almost on a daily basis in China. It’s very important to specify, relative to the grievances and the level of the misery . . . We’re not talking about large-scale social movements here. These are individual actions. …

If you look at what the CCP has been doing, it is actually quite clever. It’s not the case that they don’t take input from the society. They create portals, they create websites, and they create phone numbers for the citizens to call in. They also do surveys. What they want to do is, they want to solicit opinions and information from the citizens without creating conditions for the citizens to get organized. If you think about all these opinions expressed to the government through the government control portals, you are doing it as an individual. You’re not doing it as a member of a larger group. The CCP has no problem with that, and sometimes those opinions can be quite negative. The CCP has no problem with that. …

Yes, China has had a lot of protests, but those protests tend to happen in rural areas, in less urban settings, in isolated situations, and on single issues. Usually, in the 1990s, it was about the land that the government took away. And then it was about the salary, that employers were late in paying my salary, so there were protests about that — very single-issue, very focused.

This time around, you’re talking about people demanding the CCP to step down, demanding Xi Jinping to step down. That’s just something entirely different from what we saw before. …

The reason for that is, I think — although it’s a little bit difficult to generalize because we don’t really have many data points — one reason is the charisma power of individual leaders, Mao and Xiaoping. These were founding fathers of the PRC, of the CCP, and they had the prestige and — using Max Weber’s term — charisma, that they could do whatever they wanted while being able to contain the spillover effects of their mistakes. The big uncertain issue now is whether Xi Jinping has that kind of charisma to contain future spillover effects of succession failure.

This is a remarkable statistic: Since 1976, there have been six leaders of the CCP. Of these six leaders, five of them were managed either by Mao or by Deng Xiaoping. Essentially, the vast majority of the successions were handled by these two giants who had oversized charisma, oversized prestige, and unshakeable political capital.

Now we have one leader who doesn’t really have that. He relies mostly on formal power, and that’s why he has accumulated so many titles, whereas he’s making similar succession errors as the previous two leaders. Obviously, we don’t know — because he hasn’t chosen a successor — we don’t really know what will happen if he chooses a successor. But my bet is that the ability to contain the spillover effect is going to be less, rather than more, down the road, because Xi Jinping does not match, even in a remote sense, the charisma and the prestige of Mao Zedong and Deng Xiaoping. There’s no match there.

I always gain some additional useful perspective from reading Huang’s work. Back in 2012, he wrote “How Did China Take Off?” for the Journal of Economic Perspectives, where I work as Managing Editor. Most of us tend to see China’s economic takeoff as a matter of foreign trade and exports. However, he made a persuasive case that the early stages of China’s burst of economic growth, through the 1980s and 1990s, were actually led by rural industry in the form of “township and village enterprises” that were run by private entrepreneurs in the context of a high degree of financial liberalization. At this time, China’s economic growth was driven primarily by a rise in China’s domestic consumption, not by export sales. But in the early to mid-1990s, Huang argues, China’s leadership switched from a rural to an urban focus, took over the financial sector, and essentially drove the rural-based township and village enterprises out of business in favor of expanding state-financed and -controlled urban enterprises.

An American Industrial Policy Experiment Begins

Sometimes “industrial policy” is defined very broadly, as when people say: “Every country has an industrial policy–even not having an industrial policy is a kind of industrial policy.” But in a more specific meaning of the term, “industrial policy” doesn’t include, say, support of K-12 education or university research and development or a well-regulated banking system. Instead, it refers to when government targets the growth of specific industries with subsidies or trade protection, in the belief that these industries will repay the near-term government support by leading to stronger growth that benefits the broader economy in the future.

In the more limited use of the term, the United States is embarking on a major experiment in industrial policy. The Creating Helpful Incentives to Produce Semiconductors (CHIPS) and Science Act focuses $280 billion over the next decade on building a domestic semiconductor manufacturing industry. The Inflation Reduction Act (IRA) commits $579 billion over the next 10 years, with a heavy focus on promoting electricity production from non-carbon sources (mainly solar and wind) and supporting energy users in switching to the use of such energy (including subsidies for electric cars). The Infrastructure Investment and Jobs Act (IIJA) commits $1.2 trillion over the next decade to standard infrastructure like roads, bridges, rail, and transit, which doesn’t fit into a narrow definition of industrial policy, but also includes less-discussed infrastructure like broadband, electrical power, and support for non-gasoline infrastructure for cars.

In an article from Deloitte Insights, William D. Eggers, John O’Leary and Kevin Pollari discuss “Executing on the $2 trillion investment to boost American competitiveness” (March 16, 2023). They emphasize that the new laws involve large amounts of money, with many different funding streams, all with different compliance standards, that need to be run and coordinated across a large number of federal agencies. In addition, the chances of success will often depend on interactions between these programs, not on the sum of the individual programs.
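The headline figure of roughly $2 trillion appears consistent with the sum of the three commitments cited above. As a quick check of the arithmetic, here is a small sketch. The even spread over ten years is a simplifying assumption for illustration; actual outlays will vary by year and program.

```python
# Ten-year commitments in billions of dollars, as cited in the text above.
commitments_bn = {
    "CHIPS and Science Act": 280,
    "Inflation Reduction Act": 579,
    "Infrastructure Investment and Jobs Act": 1200,
}

YEARS = 10  # assumption: spending is spread evenly over a decade

total_bn = sum(commitments_bn.values())
per_year_bn = total_bn / YEARS

print(f"Total: ${total_bn:,} billion over {YEARS} years")
print(f"Average: about ${per_year_bn:,.0f} billion per year")
```

On these numbers the total is a little over $2 trillion, or a bit more than $200 billion per year on average.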

It stands to reason that industrial policy isn’t simple. If industrial policy were as simple as tossing a log on the fire and getting the desired heat, then every nation would be able to do it. Having a boom in manufacturing jobs, or union jobs, or strong industries related to semiconductors, green energy jobs, steel, or cars, would just be a matter of passing the legislation. But it’s obviously not that simple and easy for industrial policy to work, not in the United States and not in other countries either.

There are many examples of these complexities and constraints; I’ll give just a couple here. The Deloitte authors write about the infrastructure act:

Under IIJA alone, more than 45 federal bureaus and 16 federal agencies and commissions are allocated funding for 369 new and existing programs. Grants fund more than 200 programs and represent 78% of the total funding. … These three new laws establish more than 160 entirely new programs. IIJA alone has created 129 new programs with more than $226 billion in funding. Seven existing programs worth $275 billion have been substantially revised or expanded. In the IRA, out of the total $228 billion appropriated across 18 federal agencies, more than $80 billion was appropriated for 34 new programs.

Thus, the basic workability of the new laws depends on an ability to administer the money across these bureaus and agencies and commissions and grants–ranging across federal, state, and local government actors as well as universities and the private sector–and to do so in a way that actually boosts competitiveness and isn’t just a money trough for the politically connected.

As another example, “The CHIPS and Science Act, for example, has earmarked $10 billion for the Department of Commerce to create 20 regional technology hubs across the United States in partnership with universities and private businesses.” I’m a supporter of funding for regional technology hubs, but I’m not fool enough to think that organizing them is easy.

It’s not just bureaucratic constraints, either. The Deloitte authors note an estimate that “the country will need one million additional electricians for the clean-energy transition.” Maybe that estimate is overstated? Maybe we only need several hundred thousand more electricians. But all the plans for installing new public and home charging stations, as well as building new electricity charging facilities and transmission lines, are going to fall flat if there aren’t plenty of electricians to do the work. In turn, the electricians won’t be able to do their work without getting necessary permits, which in turn will have to pass zoning, land-use, and environmental regulations and lawsuits.

Moreover, all of this needs to happen in a way that is accountable and, if not fraud-proof, at least fraud-resistant. The authors call this the “thieving squirrel problem”:

The “thieving squirrel” problem: You put seeds into the birdfeeder, but clever, agile, and highly motivated squirrels manage to eat a big share. The only answer is a birdfeeder designed to limit access and frustrate raiders. With funding levels this large, the problem of waste, fraud, and abuse is real. Especially for agencies that are disbursing sizable grants for the first time, controls baked in up front will be critical. Governments need to ensure proper compliance, reporting, and transparency—or risk rewarding the squirrels and undermining overall trust in the process.

Around the world and over time, “industrial policy” narrowly understood doesn’t have a great reputation. There are a few successes, and a distressingly large pile of failures. It’s worth remembering that the resources committed to industrial policy–including money, capital, and human talent–could have been spent on other uses. For example, a big chunk of the $200 billion per year or so being spent on these programs could have gone into supporting pregnant mothers and infant children, or rebuilding public schools, or training a few hundred thousand electricians. Perhaps setting up a steady increase in taxes related to pollution and carbon emissions over time, and then letting the incentive effects of such taxes percolate through the economy, would be more effective–but spending more money is always a more popular way of seeking change.

It’s impossible to prove that industrial policy can’t ever work, for the same reasons that it’s often very difficult to prove a negative. Thus, even if this particular US industrial policy experiment fails, I expect that its supporters will just explain that with more money or commitment or vision or energy or an improved structure, it could easily have succeeded. So this post is just laying down a marker: When these pieces of industrial policy legislation were passed, the comments of supporters often suggested that this iteration of industrial policy was nearly as simple as tossing a log on the fire–virtually certain to succeed. If the programs are only a mild success, or a considerable failure, the supporters should have to eat their words.

I hope the supporters are correct. I would prefer to see public money well-spent. In a few years, we can evaluate the results. But the administrative, political and economic conditions for success of industrial policy are a difficult set of obstacles to cross. The Deloitte authors do not predict success or failure, but they do say: “Once a law is passed, there is a temptation to assume that desired results will follow. But much will depend on how government actually executes its strategy.”

A Long-Run Perspective on US Economic Growth

For those not familiar with the Economic Report of the President, it is published each year by the White House Council of Economic Advisers. In turn, the CEA is led by academics, who are appointed by the president but typically plan to head back to their ivory towers in a few years. Thus, they are clearly a partisan and pro-administration group, but they also have reason to care about their own reputation for expertise and for relatively dispassionate analysis. This tension plays itself out each year in the report.

The parts of the ERP each year that offer a partisan defense of the president aren’t that interesting to me, no matter who the president is, because of how one-sided and perfunctory such defenses tend to be. Of course, if you are looking for talking points to support the economic policies of the Biden administration or if you want to take target practice against those policies, you may be attracted to those parts of the report. But each year, the report also includes facts and nuggets about the US economy and its trends and patterns that have emerged from discussion from thoughtful academics, and some of these can be worth passing on. Here’s one from the first chapter of the 2023 Economic Report of the President, showing the long-term average for US economic growth going back to 1790.

The figure breaks down overall economic growth into three chunks: growth of population (which means more workers and consumers), changes in labor force participation (the share of the adult population that either has a job or is looking for a job), and output per worker, which changes according to improvements in human capital (education and skills), physical capital available to workers, and “total factor productivity,” which is econo-speak for productivity improvements. Here are a few reactions to the figure:

1) The slowdown in overall economic growth in the 2000s is readily apparent. But in a broad historical perspective, it’s also apparent that a lot of this slowdown is due to a slower rate of population growth (shorter dark blue bars) and also a decline in labor force participation due in part to the aging and retirement of the “baby boom” generation born in the 15 years or so after the end of World War II (the light blue bars in negative territory on the graph). At least in the last decade, output per worker hasn’t been rising at an especially slow rate.

2) For political scientists and those interested in global politics, the sheer size of the US economy matters–the total height of these bars. But for economists, what matters more is a gradually rising standard of living for the average person, which is roughly captured over time by the gain in output-per-worker.

3) The future of US economic growth isn’t likely to come from population growth; instead, it will need to be generated by higher output per worker. The US economy had a mass expansion into high school education from about 1910 to 1940, and a mass expansion of higher education after World War II, but no mass expansions of education since then. US capital investment seems OK, but a lot of one’s thinking around that issue revolves around placing an economic value on information technology and internet access, which isn’t easy to do. Productivity gains are calculated as the residual of what is left over, unexplained, by forces like labor force growth, human capital and physical capital–and by that measure, the US economy isn’t doing especially well since the early 2000s.

4) The 1870s appear on the graph as a time of rapid growth. I confess that I don’t understand this. The report emphasizes that the 1870s are a time of expanding the railroad and the telegraph, along with new inventions. However, the standard dating of US business cycles suggests that the US economy was in the “Long Depression” from October 1873 to March 1879. Maybe it was just a really extraordinary economic boom in the early 1870s in the aftermath of the Civil War?

5) From the perspective of decade averages, the Great Depression of the 1930s looks less “great,” in the sense that overall growth during the decade of the 1930s was similar to that of the 1910s and 1920s. In part, this is probably because we tend to understate the multiple deep recessions of these earlier decades, including three recessions in the 1910s and another three in the 1920s, as well as understating how the US economy recovered from the Great Depression in the later part of the 1930s (albeit with a recession in 1937-38). It may seem odd that labor force participation doesn’t fall noticeably in the 1930s, given the ultra-high unemployment rates of the time. However, the unemployed are counted as “participating” in the labor market–to be counted as outside the labor force, you need to be not looking for work (say, retired or working in the home by preference).

6) In the 1970s and 1980s, you can see that a noticeable chunk of overall economic growth was the rise in labor force participation, mainly due to growing participation of women in the (paid) labor force.

There’s a remarkable economic story behind every bar and line in this graph.
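The three-way breakdown in the figure rests on an accounting identity: output equals population, times the share of the population in the labor force, times output per worker. Here is a minimal sketch of that decomposition in log growth rates, where the three parts add up exactly to total growth. All input numbers are hypothetical, the function name is my own, and for simplicity the sketch treats everyone in the labor force as a worker; none of this is taken from the report.

```python
import math


def decompose_growth(pop0, pop1, labor0, labor1, output0, output1, years):
    """Split average annual output growth (in log points) into three parts.

    Uses the identity:
        output = population * (labor / population) * (output / labor)
    """
    def annual_log_growth(start, end):
        return math.log(end / start) / years

    return {
        "population": annual_log_growth(pop0, pop1),
        "participation": annual_log_growth(labor0 / pop0, labor1 / pop1),
        "output_per_worker": annual_log_growth(output0 / labor0, output1 / labor1),
        "total": annual_log_growth(output0, output1),
    }


# Hypothetical decade: population up 10%, labor force up about 8%, output up 30%.
parts = decompose_growth(
    pop0=100.0, pop1=110.0,
    labor0=60.0, labor1=65.0,
    output0=1000.0, output1=1300.0,
    years=10,
)
for name, value in parts.items():
    print(f"{name}: {value:.2%} per year")
```

In this made-up example the labor force grows more slowly than the population, so the participation term is negative, much like the light blue bars below zero in recent decades.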

Catastrophes and Costs: Some Trendlines

Each year, the Swiss Re Institute has been publishing an annual report on the natural catastrophes of the previous year, of particular interest to specialists, and some long-run trendlines, which are more interesting to me. This year’s report is “Natural catastrophes and inflation in 2022: a perfect storm” (Sigma, 2023, No. 1).

Here’s the number of catastrophes around the world in the last half-century. Natural catastrophes include “floods, storms, earthquakes, droughts/forest fires/heat waves, cold waves/frost, hail, tsunamis, and other natural catastrophes.” Man-made catastrophes do not include war, but instead are divided into the categories of “major fires and explosions, aviation and space disasters, shipping disasters, rail disasters, mining accidents, collapse of buildings/bridges, and miscellaneous (including terrorism).”

It’s wise to interpret these numbers with care. After all, the global population has more than doubled since 1970, which more-or-less fits the rise up to the early 2000s, although it doesn’t explain the spike in the mid-2000s or the more recent decline (which precedes the pandemic). Also, if a natural event that would be a catastrophe if it happens in a heavily populated area occurs instead in a lightly populated area, does it count as a catastrophe of the same scale? The world’s apparatus for discovering and reporting natural catastrophes was clearly less developed a half-century ago. That said, it appears that man-made disasters have been declining in the last decade or so, while natural catastrophes have been trending up over time.

What about the costs? Let’s look first at lives lost. Notice that the vertical scale is logarithmic, not arithmetic (that is, it rises by factors of ten). The report notes: “Worldwide, 35 157 people are believed to have died or gone missing in disaster events in 2022. Natural catastrophes claimed over 32 600 victims, and man-made disasters over 2500.”

Here, the general pattern seems to be that deaths from man-made disasters are both an order of magnitude lower than those from natural disasters, and also declining. There doesn’t seem to be any particular trend to deaths from natural disasters, just a pattern of big spikes when an especially awful disaster occurs somewhere.

Finally, here’s an estimate of total financial costs of these disasters, and because Swiss Re is a reinsurance company, the share of those costs covered by insurance. These costs are adjusted for inflation. The sharp rise is because of economic growth and rising property values: for example, a hurricane hitting Florida now will have a higher financial cost than the same hurricane hitting in 1970.

It’s also worth remembering that financial costs will depend on local prices, such that financial costs will be higher in high-income, high-price countries. The report notes that the gap between insured financial losses and total financial losses, measured as a share of total losses, has narrowed somewhat: it was 53% in 2022, down from 59% over the average of the last ten years.

Interview with Annamaria Lusardi on Financial Literacy

David A. Price interviews Annamaria Lusardi “on financial literacy, seniors versus scammers, and learning from the mistakes of NFL players” (Econ Focus: Federal Reserve Bank of Richmond, First Quarter 2023, pp. 24-28). Lusardi notes:

[W]e have witnessed a highly important change in the United States and around the world, which is that more and more, we have shifted the responsibility to save for retirement from the employer to workers. I am talking about the shift from defined benefit pensions to defined contribution pensions, such as individual retirement accounts and 401(k) plans. In the past, it was the employer who had to manage the pension of the employees; the wealth was managed by a CFO or by other financial experts. Now we ask individuals to make these decisions about their wealth. So even more than when I was an assistant professor, there’s the question of whether people have the skill to manage their money.

Lusardi and Olivia Mitchell have designed a 28-question test to measure financial literacy, which has become a widely used research tool. For a flavor of the kinds of questions, consider what they call the “Big Three.” In their early work, an advantage of having just three questions was that it was practical to piggyback them on preexisting surveys.


1. Suppose you had $100 in a savings account and the interest rate was 2% per year. After 5 years, how much do you think you would have in the account if you left the money to grow?
  • More than $102
  • Exactly $102
  • Less than $102
  • Do not know/Refuse to answer

2. Imagine that the interest rate on your savings account was 1% per year and inflation was 2% per year. After 1 year, how much would you be able to buy with the money in this account?
  • More than today
  • Exactly the same
  • Less than today
  • Do not know/Refuse to answer

3. Please tell me whether this statement is true or false. “Buying a single company’s stock usually provides a safer return than a stock mutual fund.”
  • True
  • False
  • Do not know/Refuse to answer

NOTE: Correct answers are (1) “More than $102,” (2) “Less than today,” and (3) False.
SOURCE: Annamaria Lusardi and Olivia S. Mitchell
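The arithmetic behind the first two questions can be sketched in a few lines. This is only an illustration of why the listed answers are correct, assuming annual compounding; the function names are my own, and the third question (about diversification) is not something a simple formula settles.

```python
def future_value(principal: float, annual_rate: float, years: int) -> float:
    """Balance after compounding once per year at annual_rate."""
    return principal * (1.0 + annual_rate) ** years


def real_purchasing_power(interest_rate: float, inflation_rate: float, years: int = 1) -> float:
    """Purchasing power of the balance relative to today (1.0 means unchanged)."""
    return ((1.0 + interest_rate) / (1.0 + inflation_rate)) ** years


# Question 1: $100 at 2% per year for 5 years.
balance = future_value(100.0, 0.02, 5)

# Question 2: 1% interest against 2% inflation for 1 year.
power = real_purchasing_power(0.01, 0.02)

print(f"Question 1: ${balance:.2f} after 5 years (more than $102)")
print(f"Question 2: {power:.4f} of today's purchasing power (less than 1)")
```

Even simple interest without compounding would give $110 in the first question, so “More than $102” holds either way; in the second, prices rise faster than the balance, so the money buys slightly less than it does today.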

The ongoing research in this area suggests both that financial literacy is low and that financial education can help. Lusardi says:

Together with a team at the World Bank, I eventually designed questions similar to the big three that were applied to a sample of more than 140 countries. I would say there are several interesting findings. One is that even though the U.S. is the country with the most advanced financial markets, it actually doesn’t score very high in terms of financial literacy. And this has been true in other surveys, as well. The second thing is that overall financial literacy is not high in other countries, either. Overall, the level of financial literacy globally is really low; only one-third of people around the world are financially literate. …

[W]hat we did recently — and it took us a good many years to do this project — is a meta-analysis of financial education programs. … Because the literature was so extensive, we then decided to concentrate on the most rigorous evaluations. So we looked at only the randomized control trials. … So you expose a group to financial education; you don’t expose the other, similar group; and then you compare what happened to the group you treated. What we found, looking at the evidence in as many as 33 countries, is that financial education works and works well — meaning it does translate into higher knowledge and also better behavior in savings and managing credit and in other areas, including insurance and money transfers. And we also found that it is cost effective. This is due to the fact that many educational programs do not cost very much.

This work has implications reaching in a number of directions. Prominent examples include professional athletes who, even in a relatively short career, might earn as much as the average college-educated person will earn in a lifetime. But these athletes are young adults, their financial literacy is no greater than average, and they are easy targets for financial “advisers” and “planners” who charge high fees for high-risk options. A less prominent but much larger group are the elderly near retirement age, at a stage when they probably have the highest level of assets for their lifetime, but again, their financial literacy is no greater than average, and they can often find themselves to be targets for high-fee and high-risk “advisers” and “planners.” Indeed, Lusardi’s work has found that the current elderly are often taking more debt into their retirement than previous generations.

Mervyn King: “Our Ambition at the Bank of England is To Be Boring”

Back in 2000, the Deputy Governor of the Bank of England, Mervyn King, gave a speech about monetary policy that has often been quoted by central bankers around the world (“Balancing the Economic See-Saw,” April 14, 2000). He said:

[O]ur ambition at the Bank of England is to be boring. Not, I hasten to add, at events like this. But in our management of the economy where our belief is that boring is best. Macroeconomic policy has, for most of our lifetime, been rather too exciting for comfort. … Our aim is to maintain economic stability. A reputation for being boring is an advantage – credibility of the policy framework helps to dampen the movement of the see-saw. If love is never having to say sorry, then stability is never having to be exciting.

King’s reference to the “see-saw” is pointing out that monetary policy involves movements between looser and tighter monetary policy. Some such movement is inevitable. But of course, the goal is to have the see-saw of macroeconomic policy involve small adjustments, rather than big swings. King argued that central banks should be willing to take small actions sooner, because otherwise they are likely to need to take bigger actions later. He said:

The longer the correction is left, the sharper the required adjustment will be. The higher one end of the see-saw, the greater the subsequent lurch will be. … In one of the most influential contributions to monetary policy in the post-war period, Milton Friedman wrote that the characteristic of most central banks was that “too late and too much has been the general practice”.

The often-heard recent complaint about the Federal Reserve and inflation was that it waited too long after inflation started in 2021, and then had to act more aggressively to raise interest rates starting in 2022 than would otherwise have been needed. The current concern about monetary policy is that perhaps the Fed has already taken sufficient actions to bring down inflation, but it takes some time for the past hike in interest rates to work through the macro-economy. By not waiting to see what happens from its past actions, the Fed runs a risk of overreacting. The best description of this phenomenon that I know comes from Alan Blinder, who was vice-chair of the Fed in the mid-1990s. He wrote in a 1997 article in the Journal of Economic Perspectives:

[H]uman beings have a hard time doing what homo economicus does so easily: waiting patiently for the lagged effects of past actions to be felt. I have often illustrated this problem with the parable of the thermostat. The following has probably happened to each of you; it has certainly happened to me. You check in to a hotel where you are unfamiliar with the room thermostat. The room is much too hot, so you turn down the thermostat and take a shower. Emerging 15 minutes later, you find the room still too hot. So you turn the thermostat down another notch, remove the wool blanket, and go to sleep. At about 3 a.m., you awake shivering in a room that is freezing cold …
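Blinder's parable can be made concrete with a toy simulation. What follows is only a sketch under invented assumptions, not a model of the economy or of any real thermostat: the room closes a fixed fraction of the gap between its temperature and the dial setting each period, and the function name, the Celsius figures, the notch size, and the lag parameter are all hypothetical. A patient guest sets the dial to the comfortable temperature once and waits; an impatient guest keeps turning it down while the room still feels hot, then goes to sleep.

```python
def simulate_room(start_temp, target, steps, impatient,
                  notch=2.0, lag=0.1, awake_steps=5):
    """Return (final thermostat setting, list of room temperatures).

    All numbers are illustrative assumptions, not taken from Blinder's article.
    """
    temp = start_temp
    setting = target          # first move: set the dial to the comfortable target
    history = []
    for step in range(steps):
        # The impatient guest re-checks while still awake and turns the dial
        # down another notch whenever the room still feels too hot.
        if impatient and 0 < step <= awake_steps and temp > target + 0.5:
            setting -= notch
        # The room closes only a fraction `lag` of the gap each period,
        # so the effect of any dial change shows up slowly.
        temp += lag * (setting - temp)
        history.append(temp)
    return setting, history


patient_setting, patient_path = simulate_room(30.0, 21.0, 200, impatient=False)
impatient_setting, impatient_path = simulate_room(30.0, 21.0, 200, impatient=True)

print(f"patient:   dial {patient_setting:.1f}, room ends near {patient_path[-1]:.1f}")
print(f"impatient: dial {impatient_setting:.1f}, room ends near {impatient_path[-1]:.1f}")
```

In this sketch the patient guest ends up at the comfortable temperature, while the impatient guest, who reacted to a room that had not yet had time to respond, wakes up in a room far colder than intended.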

But setting aside the questions of whether the Fed waited too long to act (I think it did) or whether it is currently in danger of overreacting (I think not yet), perhaps the bigger complaint is that when one thinks back over the last 20 years or so of monetary policy, what Mervyn King called the “see-saw” of policy has shown dramatic shifts. It’s not just the rise and fall of the federal funds interest rate–the policy interest rate targeted by the Fed.

It’s also the policies of quantitative easing, due to which the Fed now holds about $8 trillion in Treasury bonds and mortgage-backed securities. It’s the shift toward the use of “forward guidance,” in which the Fed seeks to shift interest rates and financial conditions in the present by making announcements about the likely course of future Fed policy.

It’s the fact that the Fed has fundamentally shifted its tools of monetary policy. A couple of generations of economics students were taught about the three tools of monetary policy: open market operations, reserve requirements, and the discount rate. But the Fed abolished reserve requirements in 2020, and open market operations only worked because banks wanted to avoid holding too few reserves. Instead of relying on the discount rate, the Fed now creates funds for short-term liquidity, which spring up during episodes like the Great Recession or the pandemic recession to reassure markets, and then vanish again. The Fed now seeks to control the federal funds interest rate through payment of interest on bank reserves held at the Fed, which didn’t start until 2008, and through overnight reverse repurchase agreements, which didn’t start until 2013.

One can add to this some of the recent debates over whether the goals of Federal Reserve policy should reach beyond the standard see-saw of balancing risks of unemployment and inflation, and also try to take into account possible effects of monetary and banking regulation policy on issues like inequality and climate change.

Of course, the most recent example of the Fed policy see-saw is the meltdown at Silicon Valley Bank and its aftermath. Apparently, neither the Federal Reserve’s monetary policy arm nor its bank regulation arm managed to notice the elementary fact that a monetary policy decision to raise interest rates would affect the value of fixed-interest-rate bonds held by banks (as well as the value of similar assets held by the Fed itself). As a result, the Fed ended up making a rather sudden decision to guarantee all bank deposits, even those above the previous limit of $250,000, at certain “strategic” banks–a guarantee which in practice seems likely to apply to just about any bank in trouble.

It remains true in 2023, as Mervyn King said back in 2000: “Macroeconomic policy has, for most of our lifetime, been rather too exciting for comfort.” I have a reasonably good understanding of the reasons and rationales for the various Fed policy changes in the last 20 years. But it’s worth remembering King’s other ambition as well: The Federal Reserve, along with other central banks around the world, needs to be more boring.

Some Economics of Pandemic Vaccination

The Oxford Review of Economic Policy has published a 10-paper symposium on the “Economics of Pandemic Vaccination.” Here, I’ll focus on the first overview essay, by Scott Duke Kominers and Alex Tabarrok, titled “Vaccines and the Covid-19 pandemic: lessons from failure and success” (Winter 2022, 38:4, pp. 719-741). They write: “The expected costs of a future pandemic easily exceed a trillion dollars, thus justifying large expenditures and thought. Our knowledge of pandemics and their costs has been hard-won. What have we learned from global failures and successes in combating Covid-19?”

The makers of vaccines capture only a very small share of the benefits of the vaccines. Thus, their incentive to produce such vaccines is much less than the social benefit of the vaccines.

The vaccine industry, however, can only capture a tiny fraction of the gains from successful vaccines. Calculations from Susan Athey, Juan Camilo Castillo, Esha Chaudhuri, Michael Kremer, Alexandre Simoes Gomes, and Christopher M. Snyder (2022, this issue) and Castillo et al. (2021) suggest that vaccines had a value on the order of $5,800 per course when the price was running at just $6–$40 per course. As a result of the ‘enormous gulf between social and commercial incentives’ (Nordhaus, 2004; Ahuja et al., 2021), there was an important role for government investment in vaccine research and development, as well as in production capacity.
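To get a feel for the size of that gulf, here is a back-of-the-envelope calculation using only the figures quoted above. It is a rough sketch, and it assumes, generously, that the vaccine maker captures the entire price per course as its gain, ignoring production costs. The function and variable names are made up for illustration.

```python
def captured_share(price_per_course, social_value_per_course):
    """Fraction of the social value that shows up in the price paid."""
    return price_per_course / social_value_per_course


SOCIAL_VALUE = 5800.0                # dollars per course, as quoted above
PRICE_LOW, PRICE_HIGH = 6.0, 40.0    # dollars per course, as quoted above

low = captured_share(PRICE_LOW, SOCIAL_VALUE)
high = captured_share(PRICE_HIGH, SOCIAL_VALUE)

# Prints the share of social value captured at the low and high prices.
print(f"Share of social value captured: {low:.2%} to {high:.2%}")
```

Even at the high end of the price range, the price amounts to less than 1% of the estimated social value per course.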

The government incentives for vaccine production need to include speed, not just volume. Imagine that the government wants a company to produce 100 million doses of a vaccine. The government would prefer to have all 100 million produced as soon as possible, which means the company would need to ramp up production very fast–and then production would drop off. The company, however, would prefer to spread out production over time, so it can make longer-lasting use of smaller production facilities. Thus, appropriate government policies involve both “push” policies to create a vaccine and “pull” policies to encourage rapid and large-scale production.

Every country underinvested in vaccines–in fact, the US did more to encourage vaccines than most.

It is important to note that underinvestment was not simply a US problem—every country underinvested in vaccines (Ahuja et al., 2021). In fact, Operation Warp Speed was by far the largest vaccine investment programme globally, so whatever problems reduced the effectiveness and scale of the US response may have been far larger elsewhere. Global underinvestment in vaccination may in part have been a result of human psychology—voters tend to reward politicians for dealing with emergencies, but not for avoiding them (Healy and Malhotra, 2009), and in the case of Covid-19, the scales in question may have been especially hard to contemplate. Human psychology may also help to explain why it appears to have been harder to spend trillions on a war against a virus than on wars against other people.

The role of the Defense Production Act in requiring firms to produce inputs for the vaccines was mixed, and deserves additional study.

Operation Warp Speed was about more than just spending. The US Department of Defense and the Defense Production Act also played key roles. The Defense Production Act is a 1950 law that gives the US president significant authority to direct civilian production to uses deemed necessary for national defence. This is sometimes mistakenly thought of as a type of command and control—an order to produce—but in practice, the Defense Production Act was mostly used to supplement the market process due to some of the limitations of investments in capacity. …

The Covid-19 pandemic necessitated vaccination on a larger scale than ever before, and thus many inputs were in net under-supply. Yet instead of allowing the price of vaccines to rise and feed into input prices, governments held prices low, and subsidized some stages of production such as clinical trials. … But governments are unlikely to have the requisite knowledge to see and coordinate the entire supply chain and its multiple substitutes, complements, and opportunity costs (Hayek, 1945). Operation Warp Speed used the Defense Production Act to imperfectly and temporarily substitute for the signalling and incentive role of prices. Instead of prices being transmitted along the supply chain, the Defense Production Act authorized priority ratings to be transmitted along the supply chain—thus, a firm given a priority rating of DO (critical to national defence) could (indeed, it had to) pass that priority rating on to its input suppliers, who in turn would pass the rating on to their suppliers, and so forth.

The Defense Production Act may have been useful in this regard, but it was not without cost. Bown (2022) notes that because the Defense Production Act forbade firms from raising prices, it likely reduced incentives to invest in new capacity, muting the long-run supply response—and, as Covid-19 wore on, the ‘long run’ quickly became the present. The Defense Production Act also allowed less substitution across inputs than might have been possible using market-clearing prices. For future pandemics, it will be important to figure out the ideal balance between using market forces and government management to drive vaccine supply.

There is a problem in a pandemic of how to decide when a certain possible treatment should be pursued further, or set aside. The problem is that there is a limit on the number of treatments that can be evaluated. If you give up on a possible treatment too soon, that’s obviously bad. But if you keep pursuing a treatment that doesn’t work, the cost is not pursuing an alternative that might have worked. How do you know when to stop?

When the pandemic began, there were no known treatments for Covid-19. Thus, flexible trial designs that dropped inefficacious treatments and added potentially efficacious treatments in real-time were critical. These sorts of ‘adaptive’ trials—including the British RECOVERY trial—were instrumental in quickly discovering useful treatments, but they also raise very complex statistical issues which are best analysed in advance. For example: how much evidence should one require to ‘prove’ that a treatment is inefficacious? Setting a high standard could mean testing an inefficacious treatment for too long, with deleterious consequences for patients and for treatment discovery. Clinical trial resources are limited, so the opportunity cost of testing an inefficacious treatment is testing a potentially efficacious one. 

Shouldn’t we be preparing to be able to vaccinate for the next pandemic now?

[A]s the pandemic has dragged on, funding for ongoing vaccination and therapeutic efforts, as well as continued R&D, has been lacking (see, for example, US White House (2022)). Next-wave vaccines—most crucially, nasal vaccines (which may be more effective at reducing transmission) and pan-coronavirus vaccines (which would provide immunity against many variants at once)—are in development (Topol and Iwasaki, 2022) but, at least so far, we have seen nothing like an Operation Warp Speed-level push. With Covid-19 continuing to cause significant morbidity and mortality worldwide, not to mention ongoing economic disruption, estimates such as those of Castillo et al. (2021) suggest that an Operation Warp Speed 2.0 to combat the ongoing threat of Covid-19 would be highly cost-effective.

There are a variety of other questions. When distributing vaccines within a country, for example, does it make sense to prioritize those who are most likely to die of the illness, or those who are most likely to be infected, or those who are most likely to spread the infection? These groups overlap, but are not the same. Also, should the main focus of vaccine distribution be decentralized–say, through pharmacies–or centralized in government distribution centers? When distributing vaccines internationally, is it better to spread the limited supply of available vaccines across countries, with a low proportion of the population getting the vaccine, or to focus the first wave of vaccines on a subgroup of countries, where a higher proportion of the population within those countries can get the vaccine? And what kinds of contracts make sense for international distribution?

For me, one of the oddest things about the US pandemic experience was that, according to the Global Health Security (GHS) Index produced by the Johns Hopkins Center for Health Security, the US was in 2019 the country in the world most ready for a pandemic. But it turns out that this measure of pandemic preparedness was essentially useless.

Robert Tucker Omberg and Alex Tabarrok (2022, this issue) examine the Global Health Security (GHS) Index, a comprehensive measure of pandemic preparedness that was produced by the highly regarded Johns Hopkins Center for Health Security before the pandemic. The upshot of their paper is that countries that were ranked highly in pandemic preparedness did not perform better during the pandemic, whether looking at infections, confirmed deaths, excess deaths, the country-by-country time series of excess deaths, or other outputs. Indeed, almost no aspect of the GHS Index helps to predict pandemic outcomes, even after controlling for a variety of demographic factors.

Surely (he says in a pathetic and pleading tone), we can do better in being prepared when (not if) the next pandemic comes?

The Failure of Free Community College in Oregon

Economists have a knee-jerk negative reaction to proposals that are phrased in terms of “free,” whether it’s “free” school lunch, or “free” health care, or “free” housing, or “free” tuition. In each case, the issue isn’t whether the program is a good idea or not. It might be. The issue is that just because something is provided at no cost to a user doesn’t mean it is actually “free,” only that the costs are being paid in some other way. Thus, “free” programs need to be evaluated, like any others, based on who receives benefits and pays costs, not on a pretense that saying “free” solves the problem.

Back in 2015, Oregon passed legislation for the “Oregon Promise,” which said the state would pay the average cost of Oregon community college tuition for Oregon high school graduates. The same legislation required that a state agency called the Higher Education Coordinating Commission produce a report evaluating the Oregon Promise every other year, and the 2022 report is now out.

The hope of the Oregon Promise, of course, was that it would encourage greater college attendance, especially among those from families with lower incomes or from backgrounds traditionally under-represented in colleges. Here’s the summary of the findings:

We found that in the first two years of the program, enrollment rates rose, but declined in the last four years, especially due to the impact of the pandemic on college enrollment. The initial implementation of the Oregon Promise was associated with a clear increase in enrollment at the colleges, and early enthusiasm and attention to the program seemed to realize the program’s goals of opening the door to postsecondary education and training wider. After six years, these early increases have not been sustained, as community college and statewide college-going rates are lower than prior to the Oregon Promise program.

We did not find evidence to suggest the Oregon Promise is associated with reducing equity gaps in college-going rates. Racial/ethnic gaps in college-going rates were similar before and after the implementation of the program, at least until the pandemic. Although college-going rose for Black/African American and for Latino/a/x/Hispanic graduates increased, this increase started before the Oregon Promise and therefore is not attributable to the Oregon Promise program. Gaps in college-going rates by geography and gender widened since the program began. …

We found that Oregon Promise recipients are generally representative of their high school graduating class, though they are somewhat more likely to be women and Latino/a/x/Hispanic. Additionally, students with the Oregon Promise are more likely to be from low-income backgrounds and from urban areas. We also noted that because of the last dollar structure of the program, the vast majority of Oregon Promise dollars go to students from middle- and upper-income backgrounds, even though nearly half of the students are from low-income backgrounds. …

By design, the program maximizes federal financial aid coming into Oregon for those who receive the Oregon Promise grant. … For students, the program slightly reduces the percentage of students facing unaffordable costs. Nonetheless, nearly two out every five students receiving the Oregon Promise still cannot meet the expected cost of attendance at their college even with the grant, and almost two-thirds of students from low-income backgrounds cannot meet the cost of attendance even with the grant. …

We found no association between the Oregon Promise program and the number of terms enrolled or credits earned among recent high school graduates. In addition, we found no lasting increases in completion rates coinciding with implementation of the Oregon Promise program, though the number of cohorts and years available to assess this question are still limited.

One can of course raise a number of possibilities here. Maybe it was the pandemic that hurt the program–although the little bump in community college attendance after the passage of the program was already over by 2020. Indeed, it looks in some of the more detailed data as if the bump in community college attendance came from a drop in attendance at Oregon four-year colleges; in other words, the program caused a few students to shift from four-year to the now “free” two-year institutions.

Maybe a bigger Oregon Promise is needed, going beyond tuition and also covering books, living expenses, and other costs? It’s notable that the program had a “top-up” design: that is, you first qualify for federal assistance for low-income families via programs like Pell Grants, and then the Oregon Promise tops up the rest. But the result of this approach was that most of the funding for the Oregon Promise ended up flowing to middle- and high-income families, who weren’t eligible for low-income aid. Maybe a Promise that gave more generous aid to those with low incomes, and didn’t also subsidize those from middle- and high-income families, would make more sense?
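The arithmetic of a “top-up” (or “last dollar”) design is easy to sketch. The example below is a simplified illustration with invented dollar amounts, not the actual Oregon Promise formula: it ignores any minimum awards, copays, or eligibility rules the real program may have, and the function names are hypothetical. For comparison, it also shows an award that is not reduced by other grant aid (sometimes called “first dollar”).

```python
def last_dollar_award(tuition, other_grant_aid):
    """Top-up design: the state pays only the tuition left after other grants."""
    return max(0.0, tuition - other_grant_aid)


def first_dollar_award(tuition, other_grant_aid):
    """Alternative design: the award is not reduced by other grant aid."""
    return tuition


TUITION = 4000.0  # hypothetical annual community college tuition

# Hypothetical students: federal grant aid each one already receives.
students = {
    "low-income (large federal grant)": 5000.0,
    "middle-income (no federal grant)": 0.0,
}

for label, grant in students.items():
    last = last_dollar_award(TUITION, grant)
    first = first_dollar_award(TUITION, grant)
    print(f"{label}: last-dollar ${last:,.0f}, first-dollar ${first:,.0f}")
```

Under these made-up numbers, the low-income student whose federal grant already covers tuition receives nothing from the top-up design, while the middle-income student receives the full tuition amount, which is the pattern of state dollars flowing up the income scale described above.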

Or it may be that a high level of mentoring and advising that continues into the community college years is as important as, or more important than, the financial aid itself for nontraditional or low-income students. After all, getting admitted to community college but then not having a sense of what classes to take, how to do the work, and where to go when you are having troubles may not help much. Or maybe the Oregon Promise should have requirements that high school students take certain classes or have a certain GPA or test scores to be eligible?

The details of program design matter. There are now a variety of “Promise”-style programs: the Kalamazoo Promise, the Pittsburgh Promise, and similar programs across 24 states. Laura W. Perna, Jeremy Wright-Kim, and Elaine W. Leigh look at some differences in design in “Is a College Promise Program an Effective Use of Resources? Understanding the Implications of Program Design and Resource Investments for Equity and Efficiency” (AERA Open, 6: 4, October-December 2020). They write:

Estimates of net benefits for the Kalamazoo Promise and Pittsburgh Promise are likely not transferable, as these programs differ from others in ways that may influence program outcomes, characteristics of recipients, and costs. The Kalamazoo Promise requires students to attend district schools from kindergarten through high school graduation to be eligible for the maximum financial award, does not reduce the award by other grant aid (i.e., “first dollar”), and allows students to use the award at public 4-year institutions across the state (Bartik et al., 2016). Comparatively, the Tennessee Promise, for example, requires students to apply as high school seniors, provides a financial award that is reduced by other grant aid (i.e., “last dollar”), and limits use of the award to attendance at community and technical colleges (House & Dell, 2020; Meehan et al., 2019).

The Oregon Higher Education Coordinating Commission obviously isn’t the final word here, but this is how they sum up the existing research on Promise-style programs:

Whether and how College Promise programs affect access to and success in college has been of national interest. Across the country, these programs cover tuition, but they differ in both scope and design. Regarding scope, some apply to a specific college, others apply only to high school graduates in a specific school district, and still others to multiple public institutions for high school graduates statewide. Regarding eligibility, requirements vary around student residency, high school grade averages, application materials and fees, enrollment levels, and income limits.

In the initial years of these various programs, evaluations found increased college enrollment associated with both local and statewide College Promise programs. However, more recent research has found that College Promise programs do not consistently sustain these increases in enrollment, citing differences in eligibility requirements. Programs that have eligibility requirements consistent with students who are most likely to go to college have not produced long lasting enrollment increases.

Though the relative newness of College Promise programs limits the research on college completion, recent studies suggest that student supports (e.g., advising, mentoring, and other educational supports) are an important intervening factor. Programs with more minimal eligibility requirements in particular did not demonstrate increases in postsecondary credential attainment without additional support resources. Prior research has shown limited impacts of College Promise on equity in college access and success. Programs with eligibility requirements that are consistent with the characteristics of those already likely to attend college maintain existing inequities. Those structured as last-dollar programs show minimal to no improvements in equity.

There’s an old saying that states are the “laboratories of democracy,” the place where you can try things out and see what works. After all, you can learn from your failed experiments, too.