In Hume’s spirit, I will attempt to serve as an ambassador from my world of economics, and help in “finding topics of conversation fit for the entertainment of rational creatures.”
The Russian invasion of Ukraine has led to me reading more about foreign affairs and international relations than usual, which in turn reminded me of Arthur M. Schlesinger’s comments about the quality of writing in the US State Department. Schlesinger was a Harvard history professor who became a “special assistant” to President John F. Kennedy. In 1965, he published a memoir titled A Thousand Days: John F. Kennedy in the White House. Here’s his discussion of the internal battles over the quality of writing emanating from the US Department of State (pp. 418-419). I’m especially fond of a few of his comments:
“The writing of lucid and forceful English is not too arcane an art.”
“At the very least, each message should be (a) in English, (b) clear and trenchant in its style, (c) logical in its structure and (d) devoid of gobbledygook. The State Department draft on the Academy failed each one of these tests (including, in my view, the first).”
Here’s the fuller passage:
After the Bay of Pigs, the State Department sent over a document entitled “The Communist Totalitarian Government of Cuba as a Source of International Tension in the Americas,” which it had approved for distribution to NATO, CENTO, SEATO, the OAS and the free governments of Latin America and eventually for public release. In addition to the usual defects of Foggy Bottom prose, the paper was filled with bad spelling and grammar. Moreover, the narrative, which mysteriously stopped at the beginning of April 1961, contained a self-righteous condemnation of Castro’s interventionist activities in the Caribbean that an unfriendly critic, alas! could have applied, without changing a word, to more recent actions by the United States. I responded on behalf of the White House:
It is our feeling here that the paper should not be disseminated in its present form. …
Presumably the document is designed to impress, not an audience which is already passionately anti–Castro, but an audience which has not yet finally made up its mind on the gravity of the problem. Such an audience is going to be persuaded, not by rhetoric, but by evidence. Every effort to heighten the evidence by rhetoric only impairs the persuasive power of the document. Observe the title: ‘The Communist Totalitarian Government of Cuba’… This title presupposes the conclusion which the paper seeks to establish. Why not call it ‘The Castro Regime in Cuba’ and let the reader draw his own conclusions from the evidence? And why call it both ‘Communist’ and ‘totalitarian’? All Communist governments are totalitarian. The paper, in our view, should be understated rather than overstated; it should eschew cold war jargon; the argument should be carried by facts, not exhortations. The writing is below the level we would hope for in papers for dissemination to other countries. The writing of lucid and forceful English is not too arcane an art.
The President himself, with his sensitive ear for style, led the fight for literacy in the Department; and he had the vigorous support of some State Department officials, notably George Ball, Harriman and William R. Tyler. But the effort to liberate the State Department from automatic writing had little success. As late as 1963, the Department could submit as a draft of a presidential message on the National Academy of Foreign Affairs a text which provoked this resigned White House comment:
This is only the latest and worst of a long number of drafts sent here for Presidential signature. Most of the time it does not matter, I suppose, if the prose is tired, the thought banal and the syntax bureaucratic; and, occasionally when it does matter, State’s drafts are very good. But sometimes, as in this case, they are not.
A message to Congress is a fairly important form of Presidential communication. The President does not send so many — nor of those he does send, does State draft so many — that each one cannot receive due care and attention. My own old-fashioned belief is that every Presidential message should be a model of grace, lucidity and taste in expression. At the very least, each message should be (a) in English, (b) clear and trenchant in its style, (c) logical in its structure and (d) devoid of gobbledygook. The State Department draft on the Academy failed each one of these tests (including, in my view, the first).
Would it not be possible for someone in the Department with at least minimal sensibility to take a look at pieces of paper designed for Presidential signature before they are sent to the White House?
It was a vain fight; the plague of gobbledygook was hard to shake off. I note words like “minimal” (at least not “optimal”) and “pieces of paper” in my own lament.
The policy challenge of climate change is not likely to have a single magic bullet answer, but rather will require a cluster of answers. In addition, it seems extremely unlikely to me that world energy production is going to decline any time soon, especially given that only about 16% of the global population lives in “high-income” countries as defined by the World Bank, and the other 84% are strongly desirous of a standard of living that will require higher energy consumption. Thus, the question is how to produce the quantity of energy that will be demanded in the future in a way that is cost-competitive and also imposes lower environmental costs (by which I mean both the costs of conventional air pollution from burning fossil fuels and the issues of carbon emissions and climate change). It is fundamentally a question to be addressed by the cluster of technologies that affect energy production, storage, and consumption.
How responsive is innovation to climate policy? … We will see that innovation is key to ensure economic growth while preserving the environment, but innovation is not a silver bullet. First, there is not a single innovation that will reduce our dependence on fossil fuels; instead, we will need many (sometimes incremental) innovations in energy-saving technologies and in clean energy. Second, innovation is not manna from heaven, and it is not even necessarily clean. Instead, the direction of innovation responds to incentives and to policies, which means that policies should be designed taking their induced effect on clean (or dirty) innovation into account.
Hémous points out that per capita emissions of carbon dioxide have been shrinking for the world as a whole, and also for many advanced economies. However, substantial differences in per capita carbon emissions remain even among high-income countries:
“[T]he correlation between income and emissions is far from perfect. Switzerland is below the world average, France is close to it (and below it once we take changes in land use into account), but clearly both countries are relatively rich. Such cross-country differences reflect differences in technologies (how electric power is produced, how well buildings are insulated, etc.) and consumption choices (what type of cars are popular, how far people live from work, etc.).”
The key driving force here is that the energy needed to produce a given amount of GDP–commonly known as “energy intensity”–has been falling all around the world. A substantial part of the reason is that as economies evolve, they shift from energy-hungry production of goods to less energy-intensive production of services and knowledge. The growth of their GDP is less about “more” and instead focused on “better.”
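In symbols (my notation, not Hémous's), energy intensity is just energy use per unit of output, and its decline is what drives a wedge between GDP growth and energy growth:

$$I \equiv \frac{E}{Y}, \qquad \frac{\Delta E}{E} \approx \frac{\Delta Y}{Y} + \frac{\Delta I}{I},$$

so as long as intensity falls quickly enough, total energy use can grow more slowly than GDP, or even decline while output keeps rising.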
Hémous describes his own research focused on automotive engines. He looks at “clean” innovations that offer alternatives to fossil fuel engines; “grey” innovations that make fossil fuel engines more efficient–but for that reason can also lead to a “rebound effect” of additional driving; and “dirty” innovations that tend to lead to higher emissions. The study covers “3,412 international firms over the period 1986–2005. We find that a 10% increase in fuel prices leads to 8.5% more clean innovations and 8.3% less purely dirty innovations 2 years later, with no statistically significant effect on grey innovations.”
The lesson here is a general one: the direction of technological change is not fully predetermined by the discoveries of scientists and engineers, but instead responds to economic incentives. Development of new pharmaceuticals, to choose another example, responds to the size of the perceived market–which may be quite different from the potential health benefits. Some technologies may be better at replacing labor and displacing jobs; other technologies may be better at complementing labor and raising wages. When it comes to energy, some technologies will be greener than others, and the incentives for a greener path can come in the form of price signals, regulations, and support for research and development.
Most people don’t have a clear intuitive grip on numbers that are either pretty small or pretty large. Lotteries, which combine an unthinkably low chance of winning with an unthinkably large prize, are a vivid example, but there are many others. In a big and diverse world, this very human inability to get an intuitive grip on the very small and the very large means that we misperceive the world around us. Here’s an illustration from the British public opinion and data company YouGov (by Taylor Orth, “From millionaires to Muslims, small subgroups of the population seem much larger to many Americans,” March 15, 2022).
This survey asked Americans to estimate the size of various groups in the population. The blue dots show the survey estimates; the orange dots show the actual value based on data from the US Census Bureau, the Bureau of Labor Statistics, and similar sources. For example, the first row shows that the actual share of Americans who had over $1 million in income last year is close to 0% (actually, about 0.1%). But according to the survey, Americans think that about 20% of Americans had over $1 million in income last year. Conversely, the bottom row says that Americans believe about 65% of the adult population has a high school degree, when the actual share is more like 89%.
The short article with the survey mentions some of the hypotheses here. For example, one lesson is that the framing of a survey will shape the results. This survey doesn’t require percentages to add up to 100% for any given category. For example, the survey responses suggest that 30% of the US population is in New York City, another 30% is in Texas, and an additional 32% is in California–which suggests that the entire rest of the US has only 8% of the total population. Similarly, the survey suggests that 27% of the population is Muslim, 30% of the population is Jewish, 33% are atheists, and 58% are Christian–which adds up to way more than 100%. If the survey were structured in a way that required categories like place of residence or religious identification to add up to 100%, the answers would presumably look different.
One possible explanation for these patterns is that some groups are highly salient for survey respondents: perhaps people have a few vivid examples of people in a certain category, or a strong emotional charge about people in a certain category, and thus are likely to inflate the numbers for that group. I suspect this answer has some truth to it. But it’s worth noting that similar divergences hold for categories like being left-handed, which wouldn’t seem to have the same issues of vividness and emotional charge.
The article suggests that what is happening here is “uncertainty-based rescaling,” which basically means that when people are uncertain, they tend to answer surveys by making the small percentages larger and the large percentages smaller.
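To make that mechanism concrete, here is a stylized sketch in Python. To be clear, this is my own toy rendering of the idea, not YouGov's model: an uncertain respondent reports an answer pulled part of the way from the true share toward an uninformative midpoint of 50%, and the blending weight is an arbitrary assumption.

```python
# A stylized sketch (my own, not YouGov's model) of "uncertainty-based
# rescaling": an uncertain respondent reports an answer pulled part of the
# way from the true share toward an uninformative midpoint of 50%.
# The weight `uncertainty` is an assumed parameter, not an estimate.
def rescaled_answer(true_share: float, uncertainty: float = 0.4) -> float:
    """Blend the true share with 50%, in proportion to uncertainty."""
    return (1 - uncertainty) * true_share + uncertainty * 0.5

for true_share in [0.001, 0.01, 0.10, 0.65, 0.89]:
    print(f"true {true_share:5.1%} -> reported {rescaled_answer(true_share):5.1%}")
```

Even this crude rule reproduces the qualitative pattern in the survey: a group that is truly 0.1% of the population gets reported at around 20%, while a group that is truly 89% gets reported well below that.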
It’s also possible to ask whether any of this matters. There are social science studies where some groups are given information before being asked their opinions, while other groups are not, and providing information often doesn’t have much effect on the opinions people express. In the Summer 2020 issue of the Journal of Economic Perspectives, Brendan Nyhan wrote “Facts and Myths about Misperceptions,” which digs into these issues. For many people, how they feel about, say, transgender rights will not change if they are informed that transgender people are 1% rather than 21% of the population.
But as an old-fashioned 20th century kind of person, I do think these wide divergences between reality and perception matter. When beliefs about actual real-world quantities become unmoored from reality, it gets harder even to distinguish the bigger issues from the smaller ones, much less to think about the tradeoffs of alternative policy choices.
When inflation first started kicking up its heels last summer, there was a dispute over whether it was likely to be temporary or permanent. The argument for “temporary” went along the following lines: the underlying causes of the inflation were a mixture of factors like supply chain disruptions, the fact that people shifted to buying goods rather than services during the pandemic recession, and federal government overspending earlier in 2021. Inflation has been stuck at low levels for several decades now, and as these underlying factors fade, this temporary blip won’t alter people’s long-run expectations about future inflation. The argument for “permanent” sounded like this: the supply chain and spending pressures would fade only gradually. In the meantime, rising inflation could become embedded in the expectations of firms as they set prices and workers as they looked for wage increases. In this way, the surge in inflation could become self-sustaining.
Of course, that argument happened before Russia decided to invade Ukraine. The supply chain disruptions that we were concerned about in summer 2021 were epitomized by lines of cargo ships waiting to be unloaded in west coast ports. But the current supply chain disruptions are about the waves of sanctions being imposed on Russia, the loss of agricultural output, spikes in energy prices, and a COVID outbreak in China that threatens supply chains there. So what’s the current thinking? The stable of macroeconomists at the Peterson Institute for International Economics has been posting a series of working papers and essays on the prospects for future inflation. The full range of views is represented, just in time for the Federal Open Market Committee meeting happening today and tomorrow (March 15-16).
Reifschneider and Wilcox spell out an overall macroeconomic model, and along the way offer an interesting calculation about the connection between short-term inflation changes and what happens over the longer term. On the horizontal axis of this graph, each point represents a 20-year period. The question is: During this 20-year period, if there was a one-time jump in inflation, did it portend a higher inflation rate over the longer term? Back in the 1960s and 1970s, a short-term movement in inflation pretty much always translated one-to-one into a longer-term move. But in the last 20 years, short-term shocks to inflation have typically faded away quickly, having little or no longer-term persistence.
As you look at this figure, an obvious question is whether 2021 looks more like the late 1960s–that is, a run-up to lasting and higher inflation–or whether it looks more like an unclassifiable pandemic blip. I should add that the authors offer an important caveat:
The statistical analysis in this Policy Brief was conducted before Russia invaded Ukraine. As a result of the war, the inflation situation will probably get worse during the next few months before it gets better, and could do so in dramatic manner if Russian energy exports are banned altogether. Nonetheless, if the key considerations identified in this Policy Brief remain in place, and if monetary policymakers respond to evolving circumstances in a sensible manner, the inflation picture should look considerably better in the next one to three years.
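For readers who want to see the flavor of this kind of rolling-window calculation, here is a rough sketch in Python. It is emphatically not Reifschneider and Wilcox's code or model; it just estimates a simple first-order persistence coefficient within rolling 20-year windows, and it uses made-up data in place of an actual inflation series.

```python
# A rough sketch (not Reifschneider and Wilcox's code) of a rolling-window
# persistence calculation: within each 20-year window of annual inflation
# data, estimate how much a one-year jump in inflation carries over into the
# next year. A simple AR(1) slope stands in for their fuller model, and the
# inflation series here is simulated rather than actual data.
import numpy as np
import pandas as pd

def rolling_persistence(inflation: pd.Series, window: int = 20) -> pd.Series:
    """AR(1) slope of inflation on its own lag within each rolling window."""
    slopes = {}
    for end in range(window, len(inflation) + 1):
        chunk = inflation.iloc[end - window:end]
        x, y = chunk.values[:-1], chunk.values[1:]
        slopes[chunk.index[-1]] = np.polyfit(x, y, 1)[0]  # pass-through slope
    return pd.Series(slopes)

# Illustrative use with made-up data: highly persistent inflation in the
# first 40 "years," shocks that fade quickly in the next 40.
rng = np.random.default_rng(0)
early = [5.0]
for _ in range(39):
    early.append(0.9 * early[-1] + rng.normal(0, 1.0))      # high persistence
late = [2.0]
for _ in range(39):
    late.append(0.2 * late[-1] + 1.6 + rng.normal(0, 0.5))  # shocks fade fast
series = pd.Series(early + late, index=range(1947, 2027))
print(rolling_persistence(series).round(2).iloc[::10])
```

On data like these, the rolling slope is close to one in the early windows and close to zero in the later ones, which is the qualitative pattern the authors report for US inflation.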
For a gloomier view, Olivier Blanchard responded in “Why I worry about inflation, interest rates, and unemployment” (March 14). Blanchard points out that when inflation rises, the Fed typically raises interest rates. This graph shows the inflation rate in red. The blue line shows the real policy interest rate–that is, the federal funds interest rate set by the Federal Reserve minus the rate of inflation. Because it’s a real interest rate, the spikes in inflation in the 1970s and more recently pushed the real interest rate down into negative territory. You can see back in the late 1970s and early 1980s that as the real policy interest rate rose, inflation came down–albeit at the cost of a severe double-dip recession in 1980 and again in 1981-82.
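In my shorthand, the blue line is simply

$$r_{\text{real}} = i_{\text{fed funds}} - \pi,$$

so any jump in inflation that the Fed does not match with a higher nominal rate shows up directly as a more negative real policy rate.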
In short, Blanchard argues that the Federal Reserve is “behind the curve” as it was in February 1975 when inflation had already hit double-digits and would do so again in the later part of that stagflationary decade.
Of course, historical comparisons always require some interpretation. Given that short-term inflationary blips have tended to fade away for a few decades now, why won’t it happen this time? Blanchard makes a case that this time is different:
The issue, however, is how much the past few decades, characterized by stable inflation and nothing like COVID-19 or war shocks, are a reliable guide to the future. There are good reasons to doubt it. What I believe is central here is salience: When movements in prices are limited, when nominal wages rarely lag substantially behind prices, people may not focus on catching up and may not take variations in inflation into account. But when inflation is suddenly much higher, both issues become salient, and workers and firms start paying attention and caring. I find the notion that workers will want to be compensated for the loss of real wages last year, and may be able to obtain such wage increases in a very tight labor market, highly plausible, and I read some of the movement in wages as reflecting such catchup.
Will the statistical relationships of the 25 years leading up to the pandemic reassert themselves in 2022? They may. But it would not be my central case: Short-run inflation expectations and wage- and price-setting behavior indicate a degree of inflation inertia that is closer to what was experienced in the 1960s–1980s than the low inflation of recent decades. Moreover, counting on inflation to fall could lead to policy errors that could actually prevent it from happening. … Over the last two years, productivity rose at an annual rate of 2.3 percent, only slightly above trend, while compensation per hour rose at 7.0 percent per year, far above the trend rate … As a result, unit labor costs are up 4.7 percent per year, a rate that is consistent with a similar rate of inflation if the labor share is unchanged. … Given that wages can be sticky and are adjusted only periodically, it is likely that much of the big price increases will show up in wages going forward. It is likely that nominal wage growth over the next year or two will be at least 5.5 percent, given the combination of catch-up for past price increases, staggered wage setting, and tight labor markets. A reasonable forecast is that annual productivity growth will be about 1.5 percent. …
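Spelling out the arithmetic implicit in that passage (my rendering of the standard identity, not a quotation): growth in unit labor costs is roughly growth in compensation minus growth in productivity,

$$7.0\% - 2.3\% = 4.7\% \text{ per year over the last two years,}$$

and looking forward, $5.5\% - 1.5\% = 4.0\%$, which is roughly the inflation rate implied if the labor share of income stays unchanged.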
Jason Furman also suggests that it may be time for the Fed to change its inflation target from 2% to 3%.
Inflation coming under control need not be strictly defined as 2 percent inflation or 2 percent average inflation. Stabilizing inflation at 3 percent would itself be an accomplishment. If inflation does settle there, it would be very painful to bring it down much more: With a flat Phillips curve, doing so would likely require a recession. The ideal outcome could be inflation settling at 3 percent—and the target resetting in the Fed’s next framework review—a measure that could improve macroeconomic stability by giving the Fed more scope to cut nominal interest rates to combat future recessions.
How high will the Fed need to raise the federal funds interest rate, the “policy” interest rate that it targets? The current target rate for this interest rate is near-zero: specifically, in the range of 0-0.25%. Karen Dynan (who was on the staff of the Fed for 17 years) discusses “What is needed to tame US inflation?” (March 10, 2022). She argues that the Fed needs to raise its policy interest rate dramatically and to do so soon, because it’s better to risk a recession now than to risk an even bigger recession from not acting quickly enough. She writes:
To keep inflation expectations anchored (or reanchor them) and restore slack, the Fed will need to tighten policy considerably, moving from its very accommodative current stance to a neutral stance and perhaps beyond. Doing so will entail both reductions in the size of its balance sheet and significant increases in the federal funds rate. If the equilibrium real funds rate is 0.5 percent, as currently implied by Fed projections, and expected inflation is just 2 percent, the funds rate would need to reach 2.5 percent to achieve a neutral stance. Because relevant inflation expectations are probably higher and a tighter-than-neutral stance may be needed, the Fed should move toward a federal funds rate of 3 percent or higher over the coming year. Such an increase would create a material risk of a sharp slowdown in economic activity—but not tightening policy significantly now would increase the chance that inflation stays high, which would require even tighter policy later.
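The neutral-rate arithmetic in that passage is a Fisher-equation calculation; in my shorthand,

$$i_{\text{neutral}} = r^{*} + \pi^{e} = 0.5\% + 2\% = 2.5\%,$$

and with relevant inflation expectations probably higher than 2%, plus the possible need for a tighter-than-neutral stance, the implied target moves to 3% or above.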
In response to the 2020 pandemic-induced recession, the Fed quickly adopted an ultra-loose policy stance, dropping short-term interest rates to zero and buying bonds to pull down long-term rates. In retrospect, it should have begun returning to a more neutral policy stance after the passage of the American Rescue Plan, in March 2021. But neither the Fed nor most private forecasters began to predict an overheating economy until much later in the year. As the magnitude and persistence of the rise in inflation became apparent in late 2021, the Fed appropriately signaled a tightening of policy, beginning with a tapering of its bond purchases. These purchases will have ended by March 16, when the Fed is expected to announce the first increase in its short-term policy rates. The Fed should project a steady rise in policy rates to a neutral level, just over 2 percent, by January 2023. It should also announce that it will soon start to allow some of the bonds it purchased to run off its balance sheet as they mature. These runoffs should increase over the summer and reach a peak of $100 billion per month by the fall. That would be twice as fast as the reduction in bond holdings after the Fed’s last bout of bond buying, and it would hasten a return to neutral conditions for longer-term interest rates. …
If PCE inflation settles in much above 2 percent by the end of 2023, the big question will be whether the Fed needs to slow the economy further, risking a recession, to get all the way back to 2 percent. As some economists have argued, the evidence of the past 25 years shows that an inflation rate of 2 percent is too low for many reasons, all of which lead to higher unemployment rates than necessary. It would be a mistake to cause or even risk a recession to get inflation down to a level that is too low. … The Fed should take this opportunity to raise its inflation target to 3 percent.
On Considering the Model and the Estimation Method Separately
You did mention the difference-in-difference work, so let me focus on what I’ve actually written about. I think this is generally a good lesson that I showed you can use traditional regression methods. In particular, you can expand the usual two-way fixed effects estimator. Actually, it’s not expanding the estimator, but expanding the model. My interpretation of the recent criticism of two-way fixed effects is that it’s not a criticism of the estimator but a criticism of the model. The model assumes a single treatment effect, regardless of how many cohorts there are and how long the time period is for the intervention. And I simply noted that if you set things up properly, you can apply regression methods to estimate a much more flexible model, and this largely overcomes the criticisms of the simple two-way fixed effects analysis.
So, what I tried to emphasize with my students is that it’s very important to keep separate the notion of a model and an estimation method. And I sometimes forget myself. I will say things like OLS [ordinary least squares] model, but OLS is not a model. It’s an estimation method which we can apply to various kinds of models. It’s up to us to be creative and use the tools that we have so that we apply those methods to models that don’t make strong assumptions. I hope that this idea bridges again a lot of my research, which is pretty simple. It’s trying to find simpler ways to do more flexible analysis, at the point that it gets really hard.
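To make the distinction between a model and an estimation method concrete, here is a minimal sketch. It is my own illustration, not Wooldridge's code: the same OLS estimator is applied first to the standard two-way fixed effects model with a single treatment effect, and then to a more flexible model with cohort-by-period treatment effects. The simulated data and all variable names are purely illustrative.

```python
# A minimal sketch (not Wooldridge's code) of the point above: the usual
# two-way fixed effects (TWFE) regression imposes a single treatment effect,
# but the same OLS estimator can fit a more flexible model with
# cohort-by-period treatment effects. The simulated data are illustrative.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_units, n_periods = 200, 8
units = np.repeat(np.arange(n_units), n_periods)
periods = np.tile(np.arange(n_periods), n_units)

# Staggered adoption: cohorts first treated in period 4 or 6; cohort 0 never treated.
cohort = np.repeat(rng.choice([0, 4, 6], size=n_units), n_periods)
treated = ((cohort > 0) & (periods >= cohort)).astype(int)

# Outcome with unit and time effects plus a treatment effect that differs by
# cohort and grows with time since treatment (so a single effect is misspecified).
unit_fe = np.repeat(rng.normal(0, 1, n_units), n_periods)
effect = treated * (1.0 + 0.5 * (cohort == 6) + 0.3 * (periods - cohort))
y = unit_fe + 0.2 * periods + effect + rng.normal(0, 0.5, n_units * n_periods)

df = pd.DataFrame({"y": y, "unit": units, "period": periods,
                   "cohort": cohort, "treated": treated})

# (a) The standard TWFE model: one coefficient on "treated".
simple = smf.ols("y ~ treated + C(unit) + C(period)", data=df).fit()

# (b) A more flexible model: separate treatment effects by cohort and period.
# (Cells that never occur, such as the never-treated cohort, are identically
# zero columns and contribute nothing.)
flexible = smf.ols(
    "y ~ treated:C(cohort):C(period) + C(unit) + C(period)", data=df
).fit()

print("Single TWFE effect:", round(simple.params["treated"], 3))
print("Cohort-by-period effect terms:",
      sum("treated" in name for name in flexible.params.index))
```

The point of the exercise is the one Wooldridge makes: nothing about OLS forces the single-effect specification; the restriction comes from the model, and relaxing it is just a matter of writing down a richer regression.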
On the Temptations of Simulations and Machine Learning
I was in the middle of doing some simulations for some recent nonlinear difference-in-differences methods that I’ve been working on. But then I was thinking, as I was doing the simulations and changing the parameters of the simulations, am I doing this to learn about how this estimator compares with other estimators, or am I trying to rig it so that my estimator looks the best? So, I was really just making a statement. Like you know, it’s human nature to want yours to be the best, right? One uses the machine to learn about that, and I’m partly making a statement. I’m trying to be as objective as I can by showing cases where the methods I’m proposing work better but also being upfront about cases where other methods will work better. …
When we publish papers, the best way to get your work published is to show that it works better than existing methods. Since the people writing the theory and deriving the methods are the same ones doing the simulations, it will probably be better if there’s some disconnection there. … I’ve always thought that we should have more competitions, such as blind competitions where people who participate don’t know what the truth is. They apply their favorite method across a bunch of different scenarios, so we can evaluate how the different methods do. I’m guessing that machine learning will come out pretty well with that, but that’s an impression. I’m not convinced that somebody using basic methods who has good intuition and is creative can’t do as well. …
I think the work on applying machine learning methods to causal inference has guaranteed that it will have a long history in econometrics and other fields that use data analysis. When I took visits to campuses, Amazon, Google, they’re using machine learning methods quite a bit. That’s no secret. These companies are in the business of earning profits, and they’re not going to employ methods that somehow aren’t working for them. So, I think the market is certainly speaking on that. For prediction purposes, they seem to work very well.
On Simplicity and Credibility of Methods
It’s interesting that if you look at the literature on intervention analysis and difference-in-difference, in some ways we’re trying to go back to simpler things. So, if you were to compare today with twenty years ago and see what econometrics people are doing, it seems to me that structural methods may be more out of favor now than they were fifteen years ago with this re-emergence of difference-in-difference. It seems that we are always looking for natural experiments and interventions to learn things about policy. … So, I wonder if our reaction to these complications in the real world is leading us to simplify the econometrics. Or, at least we are going to only believe analyses that have some clear way to identify the causal effect of an intervention rather than our relatively simple economic models.
For gains in computing power, it’s of course well-known that productivity growth has taken off in the last 60 years or so in what is often referred to as “Moore’s law,” the roughly accurate empirical prediction made back in the 1960s that the number of components packed on a computer chip would double about every two years, implying a sharp fall in computing costs and a correspondingly sharp rise in the uses of this technology.
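To see the force of that compounding (my back-of-the-envelope arithmetic, not a figure from Nordhaus): doubling every two years for roughly sixty years implies

$$2^{60/2} = 2^{30} \approx 1.1 \times 10^{9},$$

that is, about a billion-fold increase in the number of components per chip from the compounding alone.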
The first calculator to enjoy large sales was the “arithmometer,” designed and built by Thomas de Colmar, patented in 1820. This device used levers rather than keys to enter numbers, slowing data entry. It could perform all four arithmetic operations, although the techniques are today somewhat mysterious. The device was as big as an upright piano, unwieldy, and used largely for number crunching by insurance companies and scientists. Contemporaneous records indicate that 500 were produced by 1865, so although it is often called a “commercial success,” it was probably unprofitable.
According to the calculations from Nordhaus, “there has been a phenomenal increase in computer power over the twentieth century. Depending upon the standard used, computer performance has improved since manual computing by a factor between 1.7 trillion and 76 trillion.”
Nordhaus writes: “This finding implies that the growth in the frontier volume of lighting has been underestimated by a factor of between nine hundred and sixteen hundred since the beginning of the industrial age.”
Back in 1997, one might have assumed that the rise of the compact fluorescent bulb was the apotheosis of gains in lighting technology. But LED lighting was already on the way. Indeed, Roland Haitz proposed what has come to be called “Haitz’s Law” back in 2000, “which predicts that for every 10 years, the cost per lumen falls by a factor of 10 and the amount of light generated per LED package increases by a factor of 20.” Since then, the gains in cost and quality of LED lighting have largely driven the coil-shaped compact fluorescent light bulbs out of the market, and the efficiency gains in lighting that can be customized and programmed for desired uses have continued to march ahead.
The productivity gains in production of nails are not nearly as large as in computing or lighting, but from a certain perspective they are just as remarkable. After all, the methods of producing computing power or lighting would look like magic two centuries ago, but a modern nail would be readily recognizable to those using nails 200-300 years ago. If the product remains essentially the same, how much room can there be for productivity gains?
The changes can be real and substantial. As Sichel explains, hand-forged nails were common from Roman times up to the 1820s. There was then a shift in the 19th century to cut nails, “made by a bladed machine that cuts nails from thin strips of iron or steel,” which were produced first with water power, then steam power, and then electricity. By the 1880s these nails had shifted from iron to steel. At about this time, there was a shift to wire nails, “made by cutting each nail from a coil of drawn wire, sharpening a tip, and adding a head,” which were much lighter and thus changed the cost-effectiveness of shipping nails over longer distances.
Sichel collects a wide range of data on nails, and on the transitions between different kinds of nails over time, and suggests that the real price of nails didn’t change much during the 1700s, but then began a substantial decline, falling by roughly a factor of 10 from about 1800 up through the 1930s.
After that, the rise in the price of US nails reflects a different story: imported nails took over the low-price end of the US nail market starting back in the 1950s, while US nail producers focused instead on higher-priced nails for specialized uses–which led to the higher prices for US-produced nails in the figure.
The change here is dramatic. Nails used to be precious. In the 1700s, abandoned buildings were sometimes burned down to facilitate recovering the nails that had been used in their construction. Circa 1810, according to Sichel’s calculations, nails were about 0.4% of US GDP: “To put this share into perspective, in 2019 household purchases of personal computers and peripheral equipment amounted to roughly 0.3 percent of GDP and household purchases of air travel amounted to about 0.5 percent. That is, back in the 1700s and early 1800s, nails were about as important in the economy as computers or air travel purchased by consumers are today.”
Of course, the changes with nails did not happen in a vacuum, but instead were closely related to other technology-related changes in materials, energy sources, machines used in manufacturing, the skills of workers, and so on. This interdependence also holds true for the productivity gains and cost decreases in computing and lighting. Indeed, Sichel points out that even though the price of nails themselves stopped declining, the price of an installed nail dropped dramatically in recent decades with the invention of the pneumatic nail gun.
When it comes to public policy affecting children, two issues always arise. One is that children don’t have a vote, while other groups like the elderly vote in large numbers. The other is that when we start talking about the situation of children, the discussion often slides over to the effects on the parents of children–for example, incentives for the adults to work or marry or to have additional children. Concerns about incentives for parents are of course legitimate–but the situation of the children themselves matters, too.
The working group proposes, in short, rewriting the generational contract. In 2019, the share of the federal budget spent on children was 9.2 percent and the share spent on the adult portions of Social Security, Medicare, and Medicaid was 45 percent. … This allocation is a statement of national priorities—priorities that the working group agrees need to change.
The underlying issues have been the same for some decades now. About one child in seven in the United States lives in a household that’s below the poverty line. The share of US children being raised in a household with two parents present has been falling over time, and the report reviews a considerable body of evidence that children who grow up in stable two-parent households have (on average, and with exceptions of course) better educational and health outcomes over time.
We know that children growing up in low-income households have (on average, and of course with exceptions) worse outcomes on a variety of measures: education, health, crime, and others. Here’s one of many measures: using a child’s eligibility for a free or reduced-price school lunch as a marker of poverty, the figure shows the shares of students scoring above proficient in reading and math in 4th grade. Again, I recognize that some children make great leaps after 4th grade. But if only one-quarter or one-fifth of a group is proficient in fourth grade, that group is going to have a harder time moving forward.
What’s perhaps most interesting to me in the report is the accumulation of research evidence on how programs benefit children. For example, the Earned Income Tax Credit is a federal program that gives a “refundable tax credit” to working low-income families.
The EITC also has a large antipoverty impact, having raised 5.6 million people, including about 3 million children, out of poverty in 2018. The EITC has been expanded several times since it was introduced in 1975, and researchers have been able to study the impact of these expansions to estimate its impact. Because the EITC is available only to families with positive earned income, it leads to increases in employment, which further raises family incomes (Hoynes and Patel 2018; Schanzenbach and Strain 2020). Studies of the EITC therefore measure the combined effects of both increased income as well as changes in parental employment—likely positive to the extent that employment brings additional income to the family, but potentially negative to the extent that children attend a low-quality childcare program or receive a smaller investment of time from their parents. The EITC has been shown to improve a wide range of children’s outcomes. Infant health is improved—both increasing average birth weight (Baker 2008; Strully, Rehkopf, and Xuan 2010) and decreasing the share of low-birth-weight newborns (less than 5.5 pounds) (Hoynes, Miller, and Simon 2015). The EITC also improves educational outcomes, from test scores to high school graduation and college enrollment (Bastian and Michelmore 2018; Chetty, Friedman, and Rockoff 2011; Dahl and Lochner 2012, 2017).
Another example is the “food stamps” program, which some years ago was rechristened the Supplemental Nutrition Assistance Program (SNAP):
SNAP provides food vouchers to low-income families to use at the grocery store and reaches a large number of families. In 2019 10.9 percent of the population participated in SNAP, with average monthly benefits of $258 per household, or about $130 per person. SNAP is estimated to have lifted 3.3 million children out of poverty in 2016. Unlike the EITC, SNAP is not conditioned on work. Access to SNAP has also been shown to improve infants’ health at birth, increasing birth weights and reducing the incidence of low-birth-weight newborns (Almond, Hoynes, and Schanzenbach 2011; East 2018). SNAP availability for children under age 5 has also been shown to improve their parent-reported health in adolescence, potentially through reduced school absences, doctor visits, and hospitalizations (East 2020). Furthermore, children with access to SNAP had better health in adulthood, as measured by lower obesity rates, healthier body mass index, and fewer chronic conditions, such as diabetes and high blood pressure. Similarly, access to SNAP during childhood improves later education and economic outcomes, such as increasing high school graduation rates by 18 percentage points. SNAP during childhood also leads to improved outcomes for women, including higher earnings, higher family income, better educational attainment, and increased rates of employment (Hoynes, Schanzenbach, and Almond 2016).
Here’s another example of a program with targeted support for food for households with pregnant and postpartum women:
Another nutrition assistance program, the Special Supplemental Nutrition Program for Women, Infants, and Children (WIC), provides targeted support for pregnant and postpartum women and for those with young children to purchase certain food items. WIC has been shown to increase birth weight for infants born to mothers who receive WIC benefits (Hoynes, Page, and Stevens 2011; Rossin-Slater 2013). Prenatal WIC participation also leads to reductions in subsequent diagnoses for attention-deficit/hyperactivity disorder (ADHD) and other childhood mental health conditions and reduces the chances of a child repeating a grade in school (Chorniy, Currie, and Sonchak 2019).
The programs that I have mentioned here aren’t perfect; for example, these programs could often be redesigned so that mothers in low-income households don’t face a disincentive to marry because marriage would lead to a cutoff of benefits. These are also not the only programs relevant to children. For example, it remains an important goal to get pregnant women from low-income households into prenatal care and then into early-infant care and parental support programs. Steps that improve jobs and wages for low-income households, or that improve schools and neighborhoods in low-income communities, will help children who live there, too.
My main point here is that there can be a tendency to think of the EITC as just a program that provides work incentives and additional income, or to think of SNAP as just a program that helps the poor buy food. Thinking about anti-poverty programs mainly in terms of their effect in bringing down the poverty rate isn’t wrong, but it is limited. It ignores perhaps the most important benefit of these programs, which is their demonstrated ability to provide long-term benefits to the health and education of children from low-income families, which in turn have long-lasting consequences for the future of these children as workers and citizens.
A lot of the policy steps taken during the pandemic were about replacing household income or keeping businesses afloat. However, children from low-income families have been disproportionately affected by the pandemic, including in particular the effects of K-12 schooling going online. A pro-child policy agenda was already overdue before the pandemic; since then, many children have spent two years being hindered in their normal social and educational development.
Why are COVID death rates so much lower in low-income countries?
EF: In work with Tristan Reed, you have found that COVID-19 deaths per capita were actually much lower in poorer countries than in richer ones. This seems surprising. What happened?
Goldberg: Tristan and I presented this research at a Brookings conference in June 2020 with great trepidation, because that was near the beginning of the pandemic. Most people’s reaction was that this result was just because poor countries are not connected, so COVID-19 had not arrived there yet. But there was anecdotal evidence that COVID-19 had indeed arrived there. Most capitals of low-income countries are not as isolated as people think; many of these cities are global cities. They are connected to the rest of the world. So it was surprising that the deaths were so low.
Another reaction was that this was all measurement error. … But the differences in deaths are huge — orders of magnitude apart. Just to give you one striking example, in the United States right now, the deaths per million are around 2,500. In Nigeria, the number is 14; in India, it’s 340. And it’s not easy to hide deaths. Yes, there is measurement error — probably deaths and hospitalizations are much higher in low-income countries than the statistics show — but still, there is a big difference between low-income countries and richer ones.
I think there are three reasons at work. We pointed out two of them in this initial working paper. First, everyone agrees that two of the risk factors for a serious reaction to COVID-19 leading to hospitalization and death are age and obesity. The age distribution in many low-income countries is very different from that in the United States. To mention a striking case, in Niger, the median age is 15; there, COVID-19 would probably not have very severe health effects on the population. On top of that, in low-income settings, obesity is much lower. These two factors alone could explain a lot of the difference.
In addition, many epidemiologists talk about what they call “trained immunity” for low-income countries. The idea is that people in those countries are exposed to disease all the time, so their immune systems have learned how to cope. An alternative interpretation is that there has been selection; the ones who have managed to survive the various diseases they’ve been exposed to have very strong immune systems.
It seems that all these factors have contributed. It’s still the case that the poorer the country, the lower the per capita COVID-19 deaths so far. We’ll see whether this holds in the future.
What have been the effects of the Trump administration tariffs on China imposed in 2018?
EF: In research that was published in 2020 in the Quarterly Journal of Economics, you looked at the effects of the 2018 Trump tariffs. You found that between those tariffs and the retaliatory tariffs of other countries, such as China, there was a substantial redistribution from U.S. buyers of foreign goods in favor of U.S. producers and to the government. Is this what you expected to see?
Goldberg: To a certain extent, what we didn’t expect to see is that U.S. buyers would be hurt. This is because the United States is a powerful country; to a certain extent, everyone thought that China would eat some of the tariff. What our work showed, and others’ as well, is that the tariffs were completely paid by the U.S. importing side. The other effect that some people didn’t expect is that the part of the economy that was hurt the most by the tariffs was people in Republican counties, and this is because of the retaliation by China; they targeted mainly agricultural commodities.
We have a follow-up paper where we look at how third countries were affected by the tariffs. What we show is that many countries benefited from the tariffs; trade seems to have been reallocated from the United States and China toward other countries. What did not happen is reshoring of economic activity back to the United States. …
[In] my follow-up work on the U.S.-China trade war … we focus in our new paper on bystander countries or third countries. One interesting finding of this work is that we find that the trade war didn’t simply reallocate the exports of these countries toward the United States and China, as you might expect. It also increased global exports. So, to a certain extent, it led to net trade creation, which is surprising. We don’t expect a trade war to actually lead to more trade. But it seems that happened in this case, maybe because countries decided to invest more in trade capacity, or perhaps because there are scale economies. We think it’s an interesting pattern.
Occasional widely publicized controversies have led to the perception that growth statistics from developing countries are not to be trusted. Based on the comparison of several data sources and analysis of novel IMF audit data, we find no support for the view that growth is on average measured less accurately or manipulated more in developing than in developed countries. While developing countries face many challenges in measuring growth, so do higher-income countries, especially those with complex and sometimes rapidly changing economic structures. However, we find consistently higher dispersion of growth estimates from developing countries, lending support to the view that classical measurement error is more problematic in poorer countries and that a few outliers may have had a disproportionate effect on (mis)measurement perceptions.
Thomas Schelling won the Nobel prize in economics (2005) “for having enhanced our understanding of conflict and cooperation through game-theory analysis.” Watching events unfold in Ukraine reminds me of one of his lesser-known metaphors about fighting in a canoe.
For those of you who have not experienced the pleasure of gliding across a northwoods lake or river in a canoe, I’ll just note that a canoe has a point at both ends, which makes it maneuverable but also potentially tippy. In contrast, a rowboat has a point at one end but is flat on the other end, which makes it more stable. Is a fight more likely to dump you into the water in a canoe or a rowboat? Once the fight starts, a canoe is tippier. But if neither party wants to end up in the water (in this case, a metaphor for a much broader war or a nuclear exchange), then they might be less likely to start a fight in a canoe than in a rowboat in the first place. From this standpoint, are small conflicts between great powers “better” in some sense than larger ones? Yes. But if there is too great a willingness to engage in many smaller conflicts, then the chance that one of them will escalate, in the tippy canoe, into a larger conflict is worrisome.
Engaging in well-isolated small wars or comparatively safe forms of harassment ought to be less unattractive than wrestling on the brink of a big war. But the reason why most contests, military or not, will be contests of nerve is simply that brinkmanship is unavoidable and potent. It would be hard to design a war, involving the forces of East and West on any scale, in which the risk of its getting out of control were not of commensurate importance with the other costs and dangers involved. Limited war, as remarked earlier, is like fighting in a canoe. A blow hard enough to hurt is in some danger of overturning the canoe. One may stand up to strike a better blow, but if the other yields it may not have been the harder blow that worried him. …
Stability, of course, is not the only thing a country seeks in its military forces. In fact a case can be made that some instability can induce prudence in military affairs. If there were no danger of crises getting out of hand, or of small wars blowing up into large ones, the inhibition on small wars and other disruptive events might be less. The fear of “accidental war”—of an unpremeditated war, one that arises out of aggravated misunderstandings, false alarms, menacing alert postures, and a recognized urgency of striking quickly in the event of war—may tend to police the world against overt disturbances and adventures. A canoe can be safer than a rowboat if it induces more caution in the passengers, particularly if they are otherwise inclined to squabble and fight among themselves. Still, the danger is almost bound to be too little stability, not too much of it; and we can hope for technological developments that make the military environment more stable, not less …
Here’s one more comment from Schelling, about the importance of each party in a great power confrontation having clear expectations of how the other party will react–and about reacting, even in small confrontations, in a way that sustains the other side’s belief in one’s ultimate firmness about likely actions and reactions. Schelling wrote:
It might be hard to persuade the Soviets, if the United States yielded on Cuba and then on Puerto Rico, that it would go to war over Key West. No service is done to the other side by behaving in a way that undermines its belief in one’s ultimate firmness. It may be safer in a long run to hew to the center of the road than to yield six inches on successive nights, if one really intends to stop yielding before he is pushed onto the shoulder. It may save both parties a collision.
It is often argued that “face” is a frivolous asset to preserve, and that it is a sign of immaturity that a government can’t swallow its pride and lose face. It is undoubtedly true that false pride often tempts a government’s officials to take irrational risks or to do undignified things—to bully some small country that insults them, for example. But there is also the more serious kind of “face,” the kind that in modern jargon is known as a country’s “image,” consisting of other countries’ beliefs (their leaders’ beliefs, that is) about how the country can be expected to behave. It relates not to a country’s “worth” or “status” or even “honor,” but to its reputation for action. If the question is raised whether this kind of “face” is worth fighting over, the answer is that this kind of face is one of the few things worth fighting over. Few parts of the world are intrinsically worth the risk of serious war by themselves, especially when taken slice by slice, but defending them or running risks to protect them may preserve one’s commitments to action in other parts of the world and at later times.
“Face” is merely the interdependence of a country’s commitments; it is a country’s reputation for action, the expectations other countries have about its behavior. We lost thirty thousand dead in Korea to save face for the United States and the United Nations, not to save South Korea for the South Koreans, and it was undoubtedly worth it. Soviet expectations about the behavior of the United States are one of the most valuable assets we possess in world affairs.
Still, the value of “face” is not absolute. That preserving face—maintaining others’ expectations about one’s own behavior—can be worth some cost and risk does not mean that in every instance it is worth the cost or risk of that occasion. In particular, “face” should not be allowed to attach itself to an unworthy enterprise if a clash is inevitable. Like any threat, the commitment of face is costly when it fails. Equally important is to help to decouple an adversary’s prestige and reputation from a dispute; if we cannot afford to back down we must hope that he can and, if necessary, help him.
In the present context, several thoughts flow from these lines of thinking.
When it comes to the United States and Russia, both armed with nuclear weapons, we are still fighting in a canoe. We do not wish to be dumped into the waters of a nuclear exchange, or even a head-to-head military confrontation. Thus, we stick to responses that are limited, like economic and diplomatic sanctions and tacitly encouraging others to send weapons and supplies to Ukraine. However, we do not send in air forces or ground troops.
The unity and fierceness of the global response to Russia’s invasion of Ukraine is useful for global security not primarily for the people of Ukraine–any more than US participation in the Korean war was primarily about the people of Korea–but because it lays down a marker for those governments thinking about crossing national boundaries in the future.
There are too many imponderables that could be affecting Vladimir Putin’s decision process to make any definite claims, but one wonders if his decision to invade Ukraine might have been affected by earlier western actions. For example, what if there had been a stronger western reaction when Russian troops essentially levelled the city of Grozny in Chechnya about 20 years ago? What if the countries of western Europe had been more willing to keep their promises to commit 2% of GDP to military spending over the last two decades? What if Germany had not been so extraordinarily eager to become dependent on inflows of Russian-exported oil and gas? What if various assassinations that appeared to be engineered by Russia had been met with greater pushback? What if the Winter Olympics in 2014 had not been held in Sochi? What if the Russia-Ukraine conflict of 2014, which resulted in Russia annexing Crimea and occupying other areas, had received greater pushback when Joe Biden was vice-president? What if the American pullout from Afghanistan last summer had been better-managed? What if there had been a greater effort in the last decade or so to build soft connections from Ukraine to the EU and the United States–travel, cultural exchanges, students and faculty, and so on? Perhaps none of these would have mattered. Or perhaps the highly undesirable outcome now occurred in substantial part because of decisions made and not made in the last 20 years.
When in a conflict, the temptation is always to push harder, but pushing harder can be counterproductive. Russia is not going to surrender to Ukraine, and Russia is not likely to leave Ukraine (at least in the near-term and not without tremendous destruction) without having something it can brandish as a “victory.” If the goal is to get Russia out of Ukraine sooner rather than later, it is worth thinking about what face-saving “victory” Russia can claim. As Schelling wrote: “[I]f we cannot afford to back down we must hope that he can and, if necessary, help him.” I don’t have any deep insight here, except for remembering the old adage that when a conflict seems irresolvable, it can be useful to “expand the pie” by widening the range of topics under negotiation. Negotiations over more topics could offer Russia more options for conceding on the key point–in this case, getting Russian troops out of Ukraine–while being able to claim an overall success. Personally, I’d be happy to offer US support for holding the 2030 Winter Olympics in Sochi, along with some similar gestures, as a tradeoff for the withdrawal of Russian troops.
Each year, the Credit Suisse Research Institute publishes a “Global Investment Returns Yearbook,” with a summary edition available online (February 2022). This year’s version is authored by Elroy Dimson, Paul Marsh, and Mike Staunton, and includes a special topics chapter focused on “Diversification.” As the authors point out, “Diversification allows us to either reduce risk for the same level of expected return or increase expected returns for the same level of risk.” They discuss the evidence and arguments for diversifying across stocks, across countries, and across asset classes. Here are a few points that caught my eye from a much more in-depth discussion.
For diversifying across stocks, let’s start with a figure that, as the authors say, appears in “[a]lmost every textbook in investments or corporate finance.” (All figures in this blog post are reproduced with explicit permission of the authors.) In the first panel, the dark blue line shows the market risk for all stocks traded on the New York Stock Exchange from 2011 to 2020: that is, the standard deviation of the return in a given year is plus or minus about 20%. The lighter blue line looks at the risk of portfolios that include one stock, two stocks, three stocks, and so on up to 25 stocks–with these portfolios chosen at random. Because diversification means that random winners and losers will tend to balance each other out, a portfolio with more stocks will tend to have less risk, gradually approaching the risk of the market portfolio.
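The logic of that figure is easy to reproduce with a stylized simulation. The sketch below is my own illustration with made-up parameters, not the Dimson-Marsh-Staunton data: each stock is modeled as the market return plus independent firm-specific noise, and the volatility of equally weighted random portfolios is computed as stocks are added.

```python
# A stylized simulation (not the Dimson-Marsh-Staunton data) of the
# diversification effect described above: as randomly chosen stocks are added
# to an equally weighted portfolio, its volatility falls toward the volatility
# of the overall market. All parameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
n_stocks, n_years, n_trials = 500, 50, 2000
market = rng.normal(0.08, 0.20, size=n_years)        # market return, ~20% volatility
idiosyncratic_sd = 0.30                               # extra firm-specific risk

# Each stock's annual return = market return + independent firm-specific noise.
returns = market[None, :] + rng.normal(0, idiosyncratic_sd, size=(n_stocks, n_years))

for n in [1, 2, 5, 10, 25, 100]:
    vols = []
    for _ in range(n_trials):
        picks = rng.choice(n_stocks, size=n, replace=False)
        portfolio = returns[picks].mean(axis=0)       # equally weighted portfolio
        vols.append(portfolio.std())
    print(f"{n:>3} stocks: average volatility {np.mean(vols):.3f}")
print(f"market volatility: {market.std():.3f}")
```

With these made-up parameters, the one-stock portfolio is far riskier than the market, most of the gain from diversification comes in the first handful of stocks, and even a 25-stock portfolio still carries noticeable firm-specific risk, which is the same shape as the textbook chart the authors describe.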
The authors write:
Conventional wisdom is that a small number of stocks – say 10 to 20 – is sufficient to provide market-mimicking returns. That interpretation is misleading … Many more stocks are needed to create a well-diversified portfolio. It would be more helpful if the standard diversification chart was presented as on the right-hand side of Figure 66, which shows the fall in unsystematic risk as the number of stocks is increased. It shows that even with 100 stocks, the tracking error is still 3.3% per annum. …
Despite the longstanding and widespread advice to hold well-diversified portfolios, many studies find that most investors hold very concentrated portfolios. Goetzmann and Kumar (2008), for example, analyzed more than 60,000 investors at a large US discount brokerage house. Their average holding was four stocks (the median was three). Only 5% held fewer than [TT note: Should probably read “more than”] ten stocks. The level of underdiversification was greater among younger, low-income, less educated and less sophisticated investors.
There are large costs to being underdiversified. Bessembinder (2018) shows that the majority of US stocks (57.4%) have had lifetime buy-and-hold returns below that on Treasury bills. Since 1926, the best-performing 4% of companies explain the net gain for the entire US stock market. This is caused by the strong positive skewness in individual stock returns. The positive premium over bills that we observe for overall stock markets is driven by very large returns for the relatively few stocks. Bessembinder et al. (2021) examined some 64,000 stocks from 42 countries and showed that the same pattern held for non-US stocks. The average individual with a concentrated portfolio is thus likely to receive less than the return on the overall market.
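Bessembinder’s skewness point is easy to see in a toy simulation. The sketch below is my own illustration, using a made-up, positively skewed (lognormal) distribution of lifetime returns rather than his CRSP data: when returns are skewed this way, most individual stocks, and most small concentrated portfolios, earn less than the equal-weighted market average.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative only: lifetime total returns drawn from a positively skewed
# (lognormal) distribution -- not Bessembinder's actual data.
n_stocks = 10_000
lifetime_returns = rng.lognormal(mean=0.0, sigma=1.0, size=n_stocks) - 1.0

market_return = lifetime_returns.mean()          # equal-weighted "market"
share_below = (lifetime_returns < market_return).mean()
print(f"'Market' (equal-weighted mean) return: {market_return:.2f}")
print(f"Share of stocks below the market return: {share_below:.1%}")

# Median outcome of a concentrated 4-stock portfolio (the average holding
# in Goetzmann and Kumar's sample) versus the market.
portfolio_returns = [
    lifetime_returns[rng.choice(n_stocks, size=4, replace=False)].mean()
    for _ in range(10_000)
]
print(f"Median 4-stock portfolio return: {np.median(portfolio_returns):.2f}")
```

By construction, the average stock matches the market here, but the median stock and the median four-stock portfolio both lag it, which is the sense in which underdiversified investors are “likely to receive less than the return on the overall market.”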
Does the same lesson of diversification apply across countries? Perhaps not for US investors, at least not in the last half-century or so. Here’s a figure showing the share of different countries in global equity markets in 1899 and at the start of 2022. In 1899, the US accounted for 15% of all global equity markets; by the start of 2022, the US was about 60% of all global equity markets. Some of the major equity markets of the early 20th century, like the UK, Germany, and France, are now much less important. Other equity markets that seemed important at the end of 1899, like Russia and the Austro-Hungarian Empire, essentially disappeared and provided zero value to their investors.
The dramatically different results for US equity markets imply that if you were outside the US economy, investing in US stocks was an excellent way to diversify. But if you were inside the US and thinking about investing outside the country, the potential gains from diversifying into other countries were much smaller, or even negative. Dimson, Marsh, and Staunton write:
From 1980 onward, US investors made increasingly large investments in overseas equities. However, in risk-return terms, they would have been better off staying at home. … Moreover, this is before taking account of the higher costs of investing internationally in the earlier part of this period.
For a US investor, domestic investment beat global investment over these … periods for two reasons. First, US equities performed exceptionally well. Over the 48 years from 1974 to 2021, US stocks beat non-US stocks by 1.9% per year. Over the 32 years since 1990, the outperformance was even greater at 4.6% per annum. Dimson, Marsh and Staunton (2021) have documented this continuing outperformance of US equities and describe it as a case of “American exceptionalism.”
Second, over this period, global diversification failed to lower volatility for US investors. The US equity market was among the world’s least volatile as its size, scope and breadth ensured that it was highly diversified. Over the 1974–2021 period, the equally weighted average SD [standard deviation] of non-US countries in the world index was almost double that of the US market. US investors had less to gain from risk reduction than their foreign counterparts.
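As a back-of-envelope check on how much those per-annum gaps add up to, here is a quick compounding calculation, treating the 1.9% and 4.6% figures as annualized (geometric) return differences, which is my assumption rather than something stated in the report:

```python
# Back-of-envelope compounding of the outperformance figures quoted above,
# treating them as annualized (geometric) return gaps -- an assumption.
for gap, years in [(0.019, 48), (0.046, 32)]:
    cumulative = (1 + gap) ** years - 1
    print(f"{gap:.1%} per year over {years} years -> "
          f"cumulative relative gap of about {cumulative:.0%}")
```

On that rough reading, a 1.9% annual edge over 48 years compounds to a relative gap of nearly 150%, and a 4.6% edge over 32 years to more than 300%.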
In addition, a number of large US firms have had much more involvement in global markets in recent decades: thus, by investing in the stock of those US firms, an investor was in effect, if indirectly, diversifying across countries. Conversely, buying stock in foreign markets directly involves various transaction costs as well as exposure to currency fluctuations, regulatory changes, and political risks.
Finally, what about diversifying across asset classes? In particular, the authors focus on the two major classes: stocks and bonds. They point out that the correlation between returns on stocks and bonds has shifted substantially in the last 20 years or so. The darker blue line shows the correlation between returns on stocks and bonds for the US; the lighter blue line, for the UK. The key point is that for most of the 20th century, returns on stocks and bonds were positively correlated, but since around 2000 they have been negatively correlated, especially in the US.
From a standpoint of diversification, a negative correlation between stocks and bonds is very helpful: gains in one will tend to offset losses in the other, and vice versa. But why has the pattern shifted? And will the change be lasting? The authors list some possible factors, but argue that this remains a puzzle. They write:
The stock-bond correlation plays an important role in institutional portfolio construction. It is central to forming optimal portfolios, designing hedging strategies and assessing risk. Stock-bond correlations have now been mostly negative in major world markets for some 20 years. This negative correlation means that stocks and bonds have served as a hedge for each other, enabling investors to increase stock allocations while still satisfying a portfolio risk budget. …
In recent years, much research has focused on why the sign of the stock-bond correlation flipped in the late 1990s. What was different about the period before and afterwards? From the late 1990s on, there were more frequent crises, including three major bear markets, much lower/falling real and nominal interest rates, far lower and more stable inflation, somewhat slower economic growth, a more accommodative monetary policy (especially from 2008), somewhat more volatile and lower real equity returns, and less volatile and higher real bond returns.
The period since the late 1990s is relatively short, making it hard to establish statistically significant results. … Despite the volume of research, neither theory nor empirical studies point to a single or clear explanation for the negative stock-bond correlation. Those who claim to have found explanations largely replace one puzzle with another. For example, why have crises been more frequent or why has the correlation between the output gap and inflation changed signs?
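To see concretely why the sign of the stock-bond correlation matters for a “portfolio risk budget,” here is a minimal two-asset sketch with made-up volatilities (these are not the Yearbook’s numbers): the same 60/40 stock/bond portfolio is noticeably less volatile when the correlation is negative, which is what lets an investor hold more stocks within a fixed risk limit.

```python
import numpy as np

# Minimal two-asset illustration with made-up numbers (not the Yearbook's data):
# volatility of a 60/40 stock/bond portfolio under different correlations.
w_stock, w_bond = 0.60, 0.40
sd_stock, sd_bond = 0.17, 0.07   # assumed annual standard deviations

def portfolio_sd(rho):
    # Standard two-asset portfolio variance formula
    var = (w_stock**2 * sd_stock**2
           + w_bond**2 * sd_bond**2
           + 2 * w_stock * w_bond * rho * sd_stock * sd_bond)
    return np.sqrt(var)

for rho in (+0.3, 0.0, -0.3):
    print(f"correlation {rho:+.1f}: portfolio SD ~ {portfolio_sd(rho):.1%}")
```

With these illustrative numbers, flipping the correlation from +0.3 to -0.3 trims the portfolio’s standard deviation by roughly a percentage point and a half, so an investor targeting the original risk level could shift the mix further toward stocks.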