Do Randomized Control Trials Support Real-World Policy Reforms?

Megan T. Stevenson is an active researcher in the criminal-justice-and-economics literature. She has also noted a disconcerting fact: When you look at the published studies that use randomized control trial methods to evaluate ways of reducing crime, most of the studies don’t show a meaningful effect, and of those that do show a meaningful effect, the effect often isn’t replicated in follow-up studies. She mulls over this finding in “Cause, Effect, and the Structure of the Social World” (forthcoming in the Boston University Law Review when they get around to finalizing the later issues of 2023, pp. 2001-2027, but already available at the Review’s website).

(For those not familiar with the idea of a “randomized control trial,” the basic idea is that a group of people is randomly divided. Some get access to the program or the intervention or are treated in a certain way, while others do not. Because the group was randomly divided–and you can check in various ways whether it appears to be random–a researcher can then compare the outcomes between the treated and untreated groups. This method is of course similar to drug trials, in which you randomly divide up a group and some get the medication while others get a placebo. This approach is sometimes called a “gold standard” methodology, because it’s straightforward and persuasive. But of course, no method is infallible. One can always ask questions like: “Was it really random?” “Was some charismatic person involved in the treatment in a way that won’t carry over to future projects?” “Was the sample size big enough to draw a reliable result?” “Did the researcher study a bunch of treatments, on a number of groups, but then only publish the few results that looked statistically significant?”)
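To make the mechanics concrete, here is a minimal simulation sketch. The sample size, baseline risks, and treatment effect are all made up for illustration; nothing here comes from the studies discussed below. The point is just that random assignment lets a simple difference in group means recover the causal effect:

```python
import random
import statistics

random.seed(42)

# Hypothetical population of 1,000 people whose baseline reoffending
# risk varies from person to person.
n = 1000
baseline = [random.gauss(0.5, 0.1) for _ in range(n)]

# Random assignment: shuffle indices and treat the first half.
indices = list(range(n))
random.shuffle(indices)
treated = set(indices[: n // 2])

# Assume (purely hypothetically) the intervention lowers risk by 0.05.
true_effect = -0.05
outcomes = [b + (true_effect if i in treated else 0.0)
            for i, b in enumerate(baseline)]

# Because assignment was random, the simple difference in group means
# is an unbiased estimate of the causal effect.
treat_mean = statistics.mean(outcomes[i] for i in treated)
ctrl_mean = statistics.mean(outcomes[i] for i in range(n) if i not in treated)
estimate = treat_mean - ctrl_mean
print(f"Estimated effect: {estimate:.3f}")
```

With real data, of course, the researcher only sees one outcome per person and never knows the true effect; the questions listed above (sample size, publication bias, and so on) are about whether the estimate can be trusted.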

As one example of the evidence on interventions to reduce crime, Stevenson writes (footnotes omitted):

In 2006, two criminologists published a survey article of every RCT over the previous fifty years in which: (1) there were at least 100 participants, (2) the study included a measure of offending as an outcome, and (3) the study was written in English. The authors uncovered 122 studies, evaluating interventions such as:

  • Counseling/therapy programs;
  • Criminal legal supervision, including intensive probation;
  • Scared-straight programs;
  • Work/job-training programs;
  • Drug testing, substance abuse counseling, and drug court;
  • Juvenile diversion;
  • Policing “hot spots”; and
  • Boot camps.

Note that these interventions include those associated with a tough-on-crime framework (e.g., scared-straight programs and boot camps) as well as those that provide support and resources (e.g., work/job training programs and counseling). Note further that inclusion in this analysis required that the study was written up and disseminated so it could be discovered by the survey authors—a filter that is likely to have eliminated many of the nonstatistically significant results already. Nonetheless, only 29 of the 122 studies (24%) found statistically significant impacts in the desired direction.

Stevenson reviews a number of more recent studies as well. But the likelihood of successful results remains low, and worse, the chances that a successful result is not replicated by a future study seem high.

As Stevenson points out, this finding is reminiscent of what Peter Rossi several decades ago called “The Iron Law of Evaluation: The expected value of any net impact assessment of any large scale social program is zero.” Here, I don’t want to quarrel over whether there might be a few strong counterexamples to Stevenson’s pessimistic evaluation. Instead, what does Stevenson suggest should be learned from this discouraging pattern of findings? I’d paraphrase her arguments this way.

While it’s an attractive idea that a relatively small treatment will fundamentally alter an unpleasant outcome like crime (say, a job-training program or “hot-spot” policing), there are often underlying reasons why people make the decisions they do. Stevenson writes: “That doesn’t mean that human actions never have an impact, but rather that the type of discrete, limited scope interventions that are the primary domain of empirical causal inference research generally have limited or nonreplicable impact.”

The positive effects of some policies may be so obvious that they don’t get studied by a randomized trial. For example, feeding the hungry accomplishes a goal of feeding the hungry. One might study other possible effects of such a policy on crime or labor force participation or family dynamics, and that’s where the randomized control trial doesn’t reliably find positive effects. But the hungry did get fed. Stevenson writes:

There is an old cliché that if you give a man a fish, he will eat for a day; if you teach him how to fish, he will eat for a lifetime. Such sentiments form the basis of many of the interventions discussed in this study. These interventions, designed to give people the resources to thrive on their own, rarely have large or lasting impact. The cliché is wrong, at least when it comes to the limited-scope, systems-conserving interventions. However, there remains a straightforward and obvious way to ameliorate harm: simply give people what they need. If they are hungry, give them food. If they need shelter, give them a home. If they need work, give them a job.

The effects of certain policy choices may never get studied by a randomized control trial, because the policies are so sweeping. Perhaps changing people’s lives requires a group of policies sustained over a long period of time, and then evaluated after an even longer period. When people call for “systemic” change, they presumably have in mind a set of changes that can’t be captured by dividing up a group at random and treating one part of the group in a specific but limited way. But of course, systemic change can be very hard to evaluate in advance, and can have either good or bad outcomes.

Finally, Stevenson asks whether the social science research community is overemphasizing the “gold standard” method of randomized control trials, rather than seeking out evidence from real-world experience. Her sense is that researchers may tend to follow the randomized control trial methodology because they think it is more likely to result in published papers, rather than because it’s the best way to get a persuasive answer. To put it another way, persuasive evidence for a policy can come from a variety of methods, and randomized control trials are only one of those methods.

Stevenson’s paper made me think of a recent wave of research on some of the social programs implemented several decades ago. For example, the food stamp program was rolled out, county-by-county, over the period from 1961 to 1974. The order in which counties were selected was determined by practical and political considerations, and for practical purposes can be viewed as largely random (that is, no particular group was systematically overrepresented in being covered earlier by the food stamp program). This is sometimes called a “quasi-experiment,” referring to the idea that some families were randomly eligible for food stamps and others were not, but that pattern wasn’t designed by anyone as an experiment. However, a researcher can come along later and take advantage of the randomization. In this case, it turns out that children under the age of five who were in counties that got food stamps earlier experienced positive long-term effects as adults: better health, higher earnings, and lower crime rates, among other outcomes.
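The quasi-experimental logic can be sketched in a few lines. Everything below is made up for illustration (the counties, rollout dates, earnings levels, and the 5 percent effect are hypothetical, not from the actual food stamp studies): children in early-rollout counties are compared with children in late-rollout counties.

```python
import random
import statistics

random.seed(0)

# 200 hypothetical counties; each adopts the program in a year drawn at
# random from 1961-1974 (standing in for a rollout order that was, for
# practical purposes, unrelated to county characteristics).
rollout_years = [random.randint(1961, 1974) for _ in range(200)]

# Hypothetical cohort born in 1964: "exposed" if the program arrived
# before age five (by 1969). Assume, purely for illustration, that
# exposure raises adult earnings by 5 percent.
def adult_earnings(rollout_year):
    exposed = rollout_year <= 1969
    base = random.gauss(40_000, 5_000)
    return (base * 1.05 if exposed else base), exposed

results = [adult_earnings(y) for y in rollout_years]
exposed_mean = statistics.mean(e for e, x in results if x)
unexposed_mean = statistics.mean(e for e, x in results if not x)
print(f"Early-rollout counties: {exposed_mean:,.0f}")
print(f"Late-rollout counties:  {unexposed_mean:,.0f}")
```

The comparison works only because the rollout timing was effectively random; if richer counties had systematically gotten the program first, the difference in means would confound the program with county wealth.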

Converting Office Buildings to Housing: Limits of the Possible

There are cities with high vacancy rates in their office buildings, as work-from-home has become more common. There are also cities that would like to expand their supply of housing. Thus, the idea of converting office buildings to housing offers the possibility of hitting two targets with one shot. Arpit Gupta, Candy Martinez, and Stijn Van Nieuwerburgh discuss the possibilities in “Converting brown offices to green apartments” (Hamilton Project at Brookings, November 2023).

The short answer is that the possibilities are both real and limited. The authors write:

Not every office property, however, is suitable for conversion [to apartments]. Three conditions must be met for a conversion to take place: (i) the building has to be physically suitable for conversion, (ii) the zoning and building codes have to permit and facilitate such a conversion, and (iii) the financial return of the conversion has to properly compensate the developer for the risk they are taking.

The authors have data on commercial office markets in core urban areas across 105 cities. (They acknowledge that some office buildings outside the urban core might also be suitable for conversion, but it’s not their focus here.) Using that data, they consider what kinds of office buildings are most suitable for conversion. They write:

We believe that buildings built before 1990 are the most viable conversion candidates. Many historic buildings tend to be less expensive, have smaller floor plates, and have more character, all of which increases their conversion appeal. … The size of the building cannot be too big or too small, so we exclude buildings with a total size less than 25,000 square feet as well as large buildings with deep floor plates. Smaller buildings could be convertible, but they are less likely to attract institutional capital and federal grants. Deep floor plates have existing floor plans that start the building at a disadvantage: too little interior light and air, too little plumbing, and too many elevators. Structural changes to remedy these buildings for residential use are likely cost prohibitive. … We narrow our sample of candidates further by selecting buildings with no (or few) major long-term leases left.

After carrying out this exercise, what’s left? They identify 2,431 properties out of a total of 22,215 office buildings in these 105 cities–with about one-fifth of all the possible properties being in New York City. They estimate:

At 875 square feet per apartment unit, and after incorporating a 30 percent loss factor, these conversions could create 158,654 additional housing units. Scaling up for incomplete data coverage results in 367,750 apartment units. For comparison, about 260,000 apartment units were created in the U.S. in a typical year between 2001 and 2022.
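The arithmetic here can be checked directly. Note that the total convertible floor area below is inferred from the reported figures; it is not a number the authors state:

```python
# Back out the convertible floor area implied by the reported numbers
# (the authors report units, unit size, and loss factor; the total
# floor area below is an inference, not a figure from the paper).
units_reported = 158_654
unit_size_sqft = 875
loss_factor = 0.30   # share of floor area lost in conversion

implied_sqft = units_reported * unit_size_sqft / (1 - loss_factor)
print(f"Implied convertible floor area: {implied_sqft:,.0f} sq ft")

def apartments_from_offices(sqft, unit_size=875, loss=0.30):
    """Apartment units obtainable from a given office floor area."""
    return sqft * (1 - loss) / unit_size

# Round-trip check against the reported unit count.
assert round(apartments_from_offices(implied_sqft)) == units_reported

# Scaling factor the authors apply for incomplete data coverage.
print(f"Coverage scaling: {367_750 / units_reported:.2f}x")
```

The implied floor area comes to roughly 200 million square feet across the 2,431 candidate buildings, and the coverage adjustment scales the unit count up by a factor of about 2.3.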

Of course, not all of these potential conversions will be financially viable, either–a determination that would vary across cities and properties. But in a very broad sense, it seems reasonable to say that office-to-residential conversions might yield about a year’s worth of standard apartment construction for the nation as a whole. Thus, on one side, every city should be looking over its inventory of office buildings and figuring out which ones might be suitable for conversion. On the other side, such conversions are likely to make only modest progress on the goal of additional housing.

Unions: Facts and Fluff

According to the US Bureau of Labor Statistics, the share of US workers belonging to a union fell slightly, from 10.1% of all workers in 2022 to 10.0% in 2023. In response to this decline, the US Secretary of Labor Julie Su issued a statement that started this way:

The Bureau of Labor Statistics reported an increase in union membership, with 139,000 more union members in 2023 than in 2022, meaning this country has 400,000 more union workers than we had in 2021. The gains under the Biden-Harris administration underscore President Biden’s commitment to being the most pro-worker, pro-union president in history.  We have seen large private sector increases in unionization among health care workers, transportation and warehousing workers, and in educational services. These are workers who recognize that they have power and are organizing to use that power. Workers in health care, auto manufacturing, transportation, entertainment and more have delivered big wins at the bargaining table in the past year.

Again, these sentiments are expressed while BLS data shows a slight decline in the rate of union membership from 2022 to 2023. For the record, the union membership rate was 10.3% in 2021, and 10.8% in 2020, before “the most pro-worker, pro-union president in history” took office. But even as the BLS reported that, from 2022 to 2023, the share of private-sector workers in unions stayed flat at just 6.0% and the share of public-sector workers in unions declined from 33.1% to 32.5%, supporters of unions were weirdly triumphalist. For an example chosen more-or-less at random, here’s some commentary from the National Partnership for Women and Families:

2023 was a banner year for labor actions and unions. “Hot Strike Summer” morphed into “The Year of the Union” as a strong job market, more than a decade of intensive labor organizing kicked off by the “Fight for 15” and a growing recognition of the need for worker protections through the ongoing pandemic helped drive major wins for workers. Striking United Auto Workers, writers, actors, UPS workers and Kaiser Permanente health care workers secured strong contracts that include benefits like increased wages, health care access and job protections, while U.S. rail workers secured paid sick leave for a large segment of their workforce. Workers at Starbucks continued to gain momentum towards their first contract in defiance of blatant union-busting tactics, building on years of organizing work and hundreds of successful store votes. In part because of these high-profile strikes, unions saw near record highs of public support in 2023, with two-thirds of people approving of unions, more than 6 in 10 saying unions help the U.S. economy and one-third of people predicting unions will be stronger in the future. Workers’ efforts were coupled with the Biden administration’s success in reinvigorating the National Labor Relations Board, the federal agency dedicated to protecting employees’ rights to organize and addressing unfair labor practices.

Remember that during this “banner year for labor actions and unions,” the share of US workers who actually belong to a union was shrinking–and has been shrinking for decades, including last year and in fact during the entire presidency. It’s true that in general, public attitudes seem more supportive of unions. But some of the union successes in the last few years, like the first success in unionizing an Amazon warehouse (on Staten Island), have since become mired in controversy and seem in danger of failing.

Maybe we’ll all look back on 2023 as the year union membership in the US bottomed out, and the beginning of a great union resurgence, but I doubt it. A couple of years ago, Suresh Naidu wrote “Is There Any Future for a US Labor Movement?” in the Fall 2022 issue of the Journal of Economic Perspectives, where I work as Managing Editor. Naidu is sympathetic toward unions, but also clear-eyed. For example, he points out that old-style union organizing outside a large physical facility is not going to work well in an economy where many people are working from home, or doing gig jobs. He points out that a decline in unionization may also reflect a broader decline in “social capital” of people acting together in a variety of contexts. He discusses a range of organizations that try to speak for worker interests in a systematic way (like the movement to raise the minimum wage to $15 per hour) without actually being unions.

But Naidu also points to an even more fundamental issue that US unions face: they need to organize one company at a time. In a dynamic US economy, where some companies are always shrinking or going out of business, this means that unions are running on a treadmill: they need to keep organizing new unionized companies just to offset the typical year-to-year loss of previously organized companies. Naidu writes:

In the United States and other establishment-level bargaining systems, a basic constraint on union density is that it is hard to organize new firms fast enough to keep pace with the exit of already unionized firms. Even if unionization were an order of magnitude easier, the costly trench warfare of establishment-by-establishment organizing in the face of structural change and natural business dynamism makes keeping union density constant, let alone expanding it, an uphill battle.

Naidu cites a study from a couple of decades ago, when about 13% of American workers belonged to a union, which calculated that just to keep the rate of unionized workers stable, it “would require that the unions organize each year new members equal to 7.5 percent of their current membership.” That would require increasing the then-existing union organizing successes by six-fold, just to keep the unionization rate stable. The calculation would be a little different today, but the basic lesson remains: Americans say nice things about unions in surveys, but when it comes to organizing and supporting a union in their own workplace, most of them aren’t interested.
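The treadmill logic is easy to put in arithmetic form. This is a stylized sketch of my own (a static workforce and a flat attrition rate are simplifying assumptions, not Naidu’s model), but it shows why organizing at one-sixth of the break-even pace produces steady decline:

```python
# Stylized treadmill arithmetic: a static workforce where unionized
# jobs disappear at a fixed attrition rate each year. (These
# simplifications are illustrative, not Naidu's actual model.)
def density_next_year(density, attrition, organizing_rate):
    """Union density after one year: surviving members plus newly
    organized members (organizing expressed as a share of current
    membership)."""
    return density * (1 - attrition) + density * organizing_rate

# If unions organize 7.5% of current membership per year while losing
# 7.5% to firm exit, density holds exactly steady:
d = 0.13
assert abs(density_next_year(d, 0.075, 0.075) - d) < 1e-12

# Organizing at one-sixth of that pace (roughly the then-actual rate,
# per the cited calculation) implies steady decline:
for _ in range(20):
    d = density_next_year(d, 0.075, 0.075 / 6)
print(f"Density after 20 years: {d:.1%}")
```

Starting from 13%, two decades of organizing at one-sixth of the break-even rate leaves density at a small fraction of its starting level, which is roughly the trajectory US unions have in fact traced.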

Better Writing is Perceived as Better Thinking

Before your writing can persuade those who actually read it of the merits of your thinking, the writing needs to persuade the reader–from the start–that it’s worth reading in the first place. I work as an editor, so I am predisposed to believe that editing matters. But Jan Feld, Corinna Lines, and Libby Ross provide some evidence on the point in “Writing matters” (Journal of Economic Behavior and Organization, January 2024, pp. 378-397). Their study is of interest both for the methodology and findings, and also for how they define “good writing.” I’ll say a bit about both.

The study started with the authors contacting PhD students in economics and asking if they would send a paper, in exchange for free editing help. The authors got 30 papers. All 30 of the papers were then edited by professional editors, albeit professionals who didn’t usually work on academic writing. Thus, the 30 papers all had an original and an edited version. The editing took about six hours per paper.

The authors then recruited a set of 30 professors of economics and a different group of 18 writing/editing experts who worked in jobs like copywriter, technical writer, and communications manager. Each of these evaluators was sent a group of 10 papers. The economists evaluated the prospects for publication; the writing experts evaluated the quality of the writing. However, although the evaluators did not know it, they each received a different mixture of original and already-edited articles.

In addition, both groups were asked to do a quick evaluation, spending less than 8 minutes per paper. This may seem harsh. But consider the situation of an academic who is evaluating a large batch of papers that might be included in a conference, or a journal editor looking at a large batch of papers and considering which ones to desk-reject and which ones to send to referees. Quick evaluations of papers are a reality of academic life. Again, before you can impress a reader with the details of your thinking, you need to get over the hump of that first five-minute read.

Feld, Lines, and Ross can then compare the evaluations of the original and the edited papers. For the writing experts, the edited papers were scored 1.22 points higher on an 11-point scale for being “better written overall” (0.6 of a standard deviation). For the economics readers, “Economists judge the overall paper quality [of the edited papers] 0.20 SD better (0.4 points on the 11-point scale). They are also 8.4 percentage points more likely to accept edited papers for a conference, and are 4.1 percentage points more likely to believe that edited papers will get published in an economics journal that is classified as A* or A on the ABDC [Australian Business Deans Council] journal ranking.” 
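A quick consistency check on these numbers: the standard deviations themselves are not stated in the passage, and the two groups scored different questions, so agreement is not guaranteed. Still, both groups used the same 11-point scale, and the implied points-per-standard-deviation can be backed out:

```python
# Back out the implied "points per standard deviation" for each group
# of evaluators (both used an 11-point scale; the SDs are inferred
# from the reported point and SD effect sizes, not stated directly).
writing_points, writing_sds = 1.22, 0.6   # writing experts
econ_points, econ_sds = 0.4, 0.20         # economists

writing_scale = writing_points / writing_sds
econ_scale = econ_points / econ_sds
print(f"Writing experts: {writing_scale:.2f} points per SD")
print(f"Economists:      {econ_scale:.2f} points per SD")
```

Both come out at roughly two points per standard deviation, so the two sets of reported effect sizes hang together.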

In short, six hours’ worth of work on the writing–done by an outside nontechnical editor who probably didn’t fully understand the technical economics–led economist-readers to believe that the paper was of higher quality. To put it another way, if you can’t or don’t do a good high-level edit of your own research papers, your chances of impressing readers are lower than they would otherwise be.

What is involved in taking a day to do a good high-level edit? In Appendix C, the authors spell out in a few pages what they asked the editors to do. In addition, the editors seem to have used the software program StyleWriter to help the process along. I won’t reproduce the instructions to the outside editors in full, but here’s a short version:

Appendix C. Language Editing Guidelines for Experiment

An outline of the general approach the language editors will take

The goal is to edit the paper so that an expert who has 10 min to evaluate the paper will understand it more easily. We’ll focus on improving the title, abstract, and introduction using these guidelines. For the rest of the paper, we’ll focus on making the paper easier to skim read. …

Heavy edit of the title, abstract, and introduction only …

We’ll start by making sure the structure is clear.

The title should explain what the paper is about

We’ll make sure the title is clear. …

We’ll check the abstract is one paragraph that contains:

  • the research question
  • an explanation of how this question is answered
  • the main findings. …

We’ll edit the introduction so that it has one paragraph for each of the following parts:

  • the motivation for the research
  • what the paper does (this paragraph often starts with “In this paper,…”)
  • results
  • the related literature (unless there is a separate literature section)
  • contribution to the literature.

Avoid roadmap paragraphs

A good structure and informative section titles will do the trick in most cases …

Signposting for the reader

We’ll make sure the information flows well and is clear for the reader.

Remove roadmap phrases used to connect paragraphs

Make sure paragraphs are focused and only discuss one idea. For example, have separate paragraphs for describing the results and for discussing the related literature. …

The secret to a clear and readable style is in the first five or six words of every sentence. At the beginning of every sentence, locate the reader in familiar territory. The writing needs to have a clear flow of logic that is easy for the reader to follow — don’t frame information in a way that breaks the flow. …

Find the actor of the sentence and the actions they perform. If the actors are not the subjects and the actions are not verbs, we’ll revise so that they are.

Keep a short distance between nouns and their accompanying verb …

Use simple, familiar words

Use “use” instead of “utilize”. Use “people” instead of “individuals”.

Delete unnecessary words or clauses …

For example, in “We are the first to introduce a novel method”, there is no need to mention both “first” and “novel”.

Example: “In Section 2, we explain to the reader how our results are estimated.”

Many introductory clauses that end with “that” can be deleted. Everything before the “that” should be deleted from a sentence.

Example: “It is usually the case that most good writers find that …”

Example: “It should be noted that writing is an art and a science.” …

Avoid abbreviations and acronyms

Use them only if they help the reader and choose the ones that sound good. OECD is fine, but use Facebook instead of FB. Write New Zealand, not NZ. …

We’ll remove any hedging statements that seem unnecessary. Writers don’t need to always say “all else equal”, “fairly”, “I would argue”. However, sometimes they need to qualify the statements to avoid people getting it wrong.

Avoid naked this or that in the beginning of the sentence

We’ll add more information where needed, such as “this regression shows” instead of “this shows”. …

We’ll add information to section titles and keep them short and concise. For example, “The credit market in New Zealand” is better than “Background”. …

Use self-explanatory titles of tables and figures

“The effect of peer gender on educational outcomes” is better than “Main Results”.

Most of these changes are already covered in detail in the above sections.

  • Fix problems with long sentences (with StyleWriter)
  • Fix problems with passive voice (with StyleWriter)
  • Fix problems with nominalizations (with StyleWriter)
  • Untangle noun strings
  • Delete unnecessary words and clauses
  • Use more personal pronouns (like “we” and “our”) where possible

A lot of this advice, along with the comments about appropriate fonts, spacing, and formatting, may seem obvious. So why not take a few hours of extra time to do it?

The 30% Solution: Gains from Trade

In 2022, total US exports of goods and services were $3 trillion, while imports were $3.9 trillion. This overall trade provides benefits to the economy: sellers gain access to global markets (ask a farmer!), buyers gain greater access to products with the price/quality mix they prefer, and domestic producers face heightened competitive pressure to do better. However, the gains are not distributed evenly, and in some cases–say, workers in an industry facing especially strong competition from imports–people can experience outright losses.

How big are the overall gains? How should we think about the losses?

Gary Clyde Hufbauer and Megan Hogan of the Peterson Institute for International Economics offer some insights in “America’s payoff from engaging in world markets since 1950 was almost $2.6 trillion in 2022” (Policy Briefs 23-17, December 2023).

To summarize economic gains from trade, Hufbauer and Hogan first point to pre-2017 research on more than a dozen studies of international trade, which “calculated an average ‘dollar ratio’ of 0.24.” They write:

Simply put, the dollar ratio is the dollar increase in GDP divided by the dollar increase in two-way trade. In language familiar to economists, the dollar ratio is the elasticity of income (GDP) with respect to trade. Expressed another way, the calculation indicates that a 1 percent increase in trade yields a 0.24 percent increase in GDP—i.e., a $1 billion increase in two-way trade increases GDP by $240 million.

They redo and update these calculations based on a newer wave of studies and suggest an updated “dollar ratio” of 0.30–that is, a $1 billion increase in two-way trade increases GDP by $300 million.
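The dollar ratio works as a simple linear rule of thumb. The function below is just an illustration of that rule, using the authors’ updated 0.30 estimate:

```python
# The "dollar ratio" as a linear rule of thumb (0.30 is Hufbauer and
# Hogan's updated estimate; the function itself is illustrative).
DOLLAR_RATIO = 0.30

def gdp_gain(trade_increase, ratio=DOLLAR_RATIO):
    """GDP increase implied by a given increase in two-way trade."""
    return ratio * trade_increase

# A $1 billion increase in two-way trade:
print(f"${gdp_gain(1e9):,.0f}")
```

A $1 billion increase in two-way trade maps to a $300 million increase in GDP; the earlier 0.24 estimate would have given $240 million.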

What about those who experience losses from trade? Hufbauer and Hogan offer essentially two responses. One response is to put the number of those who suffer from trade in the context of job churn in the larger US economy. After all, there are also workers who lose jobs because their employer is losing market share to domestic competitors, perhaps because of a shift in consumer preferences, or shifts in the skill level of workers that US employers are looking for, or because of below-average management. Other US workers may lose jobs because of automation and new technology. Many more workers switch jobs, looking for higher pay or better opportunities, especially if they sense that their current employer may be in trouble. The second response is to argue that all US workers who are laid off from a job deserve government support through unemployment insurance and other programs, while they move to a new job. They write:

Compared with our estimate of US workers who lost or changed jobs because of increased imports (242,000 annually between 2019 and 2022), roughly 50 million American workers change their jobs each year. A small fraction of these workers is “displaced,” meaning laid off, including a smaller fraction displaced by imports. Displacement reduces earnings over the long term. Throughout the vast US economy over the past two decades, annual displacement ranged from under 1 percent to over 3.5 percent of the labor force (1 million to 5 million workers). All displaced workers, including those displaced by trade, deserve better public safety nets.

Of course, if you find these estimates of the gains from trade to be implausible, and instead you would prefer to see a substantial decline in international trade–take heart! This is your time! US trade in goods and services as a share of GDP has been declining since the Trump administration. However, international flows of data, information, and foreign investment have continued to rise. The forms of globalization have shifted, but the technologies driving greater global connectedness continue to develop.

Aging and Long-Term Care: An International View

The world population is aging. In the next few decades, a much larger number of people are going to need long-term care. The United States, like most countries, doesn’t really have even a preliminary set of guidelines for how this might best happen. Here’s some background information from Health at a Glance 2023: OECD Indicators (November 2023), specifically from “Chapter 10: Aging and Long-Term Care.”

As a starting point, here’s a figure showing the share of the population that is 80 or older, with actual data for 2021 and then projected to 2050. These projections should be viewed as fairly solid: after all, anyone who is going to be 80 or older in 2050 was already born in 1970 or earlier, and projecting life expectancy for the group of people who are already in their 50s or older is fairly straightforward.

For the OECD countries as a whole, the projection is roughly a doubling in the share of the over-80 age group, from about 5% at present to about 10% in 2050. For countries with very low birthrates, like Korea, Japan, and Italy at the top of the figure, the over-80 share of population will be much larger, reaching or exceeding 15% of the total population. The US will also experience roughly a doubling of the over-80 share of the population, but from a lower base than the average country listed here.

The current models that countries have for long-term care differ quite a bit. To illustrate the point, consider current spending on long-term care as a share of GDP. In the Netherlands, Norway, Sweden, and Denmark, total spending on long-term care is already more than 3% of GDP.

Here’s how the OECD explains these differences:

This variation partly mirrors differences in the population structure, but mostly reflects the stage of development of formal LTC systems, as opposed to more informal arrangements based mainly on care provided by unpaid family members. … Across OECD countries, four out of five dollars spent on LTC come from public sources. Across OECD countries, around half of health and social LTC spending in 2021 occurred in nursing homes. … In most OECD countries, these providers account for the majority of LTC spending. On average, around one-fifth of all LTC spending was used for professional (health) care provision at home. Other LTC providers include hospitals, households – if a care allowance exists that remunerates the informal provision of such services – and LTC providers with a clear social focus. These service providers each account for around one-tenth of total LTC spending across OECD countries. …Without public financial support, the total costs of LTC would be higher than median incomes among older people in most OECD countries. On average across OECD countries, institutional care for severe needs would cost more than twice the median income among older people …

When a certain share of the elderly comes to need long-term care, a rough measure of the capacity of a nation’s long-term care system is the number of beds. The figure shows the number of long-term care beds per 1,000 people over age 65. Some countries, like Japan and Korea, have a large share of long-term care beds inside hospitals. But for most countries shown here, including the United States, most long-term care beds are not in hospitals. The US is substantially below the international average in the number of beds.

Compared with these other countries, the US both spends less on long-term care as a share of GDP and has a lower number of long-term care beds per capita, because a much smaller share of Americans aged 65 or older ends up in long-term care institutions: the average across the 28 OECD countries is 11.5%, while the US share is 1.7%.

These differences seem too large to reflect underlying differences in health. Instead, they reflect a mixture of social expectations and the design of government programs to support the elderly. The US, along with Canada, Japan, and a few others, has so far managed to have only a small proportion of the elderly in long-term care institutions.

Of course, many older people would prefer to live at home as long as possible before moving to a long-term care institution, and many countries have policies to support this option. In practice, the live-at-home option also ends up relying on whether a family member can serve as a regular caregiver, whether weekly or even daily. With lower birthrates in the last few decades, and a higher share of women in the workforce, relying on care from a family member is likely to be harder in the future. As the over-65 and especially the over-80 population rises in the US in the next few decades, the existing low use of long-term care institutions in the US is likely to come under severe stress.

Economics is for the Birds

There used to be a recognized academic field of “economic ornithology,” which emphasized the economic benefits of birds to agriculture, in their role reducing bugs and weeds. But with the advent of pesticides, economic ornithology had become obsolete by the 1940s. Robert Francis tells the story at his “Bird History” substack: “Economic Ornithology: Before pesticides, birds were a farmer’s best defense against bugs. And the government’s economic ornithologists could tell you exactly how much each bird was worth” (January 10, 2024).

Francis points out:

[The] US Department of Agriculture established the Section of Economic Ornithology in 1885. The following year it became the Division of Biological Survey, and was upgraded to the Bureau of Biological Survey in 1905. … In 1903, the Saturday Evening Post, for example, published a request that “every person in the United States who kills a bird is requested by the United States Government, not in a mandatory way, but as a matter of courtesy, to send the stomach and its contents to Washington.” By 1916, the Bureau of Biological Survey had collected and analyzed the contents from more than 60,000 bird stomachs, which they used to determine whether each of the 400 species they studied was, on balance, helpful or harmful to man. Researchers divided the stomach contents into “good,” “bad,” and “neutral” categories, based on whether the partially-digested bug and plant matter was beneficial or harmful to farmers. …

According to the Bureau of Biological Survey, native sparrows, who are “specially efficient destroyers of weed seeds,” saved farmers $35 million in 1906 by eating ragweed and crabgrass seeds. And during Nebraska’s 1874 Rocky Mountain Locust infestation, a single Marsh Wren was calculated to have fed her brood of chicks enough grasshoppers to save $1,743.97 worth of crops. The Bureau of Biological Survey even helped rehabilitate the reputation of some birds that were historically seen as enemies to the farmer. By examining over a thousand crow stomachs, the Bureau found that while crows did in fact pull up sprouting corn and nibble corn on the stalk, they ate more “noxious insects and mice,” meaning that “the verdict was therefore rendered in favor of the crow, since, on the whole, the bird seemed to do more good than harm.” Owls, which were long considered poultry thieves, were proven to eat enough mice to earn back “the small commission they collect” by nabbing the occasional chicken.

This kind of information was distributed not just by the US Department of Agriculture, but also through groups like the Audubon Society and the League of American Sportsmen. For those who would like more history of economic ornithology, Theodore S. Palmer of the USDA provides an overview of the development of the field from the 1850s up through the end of the 19th century in his 1899 monograph: “A Review of Economic Ornithology in the United States.” H.J. Taylor (no relation) provided pocket autobiographies of five “Pioneers in Economic Ornithology” (The Wilson Bulletin, September 1931).

It wasn’t just pesticides that killed off economic ornithology. A deeper issue was that it wasn’t clear that adding birds to an agricultural area actually reduced the number of insects and weeds, at least not in a reliable way. And yet, some occasional modern studies suggest that certain birds in certain settings do have considerable economic value.

My favorite recent example is “The Social Costs of Keystone Species Collapse: Evidence From The Decline of Vultures in India,” by Eyal G. Frank and Anant Sudarshan (Becker Friedman Institute Research Brief, February 2, 2023). They tell the story of how a painkiller called diclofenac went off-patent, and as its price declined sharply, veterinarians in India began to give the drug to sick cattle. Although the drug was fine for cattle, it is severely toxic to vultures. Thus, when some of these cattle died and their carcasses were eaten by vultures, the vultures in affected areas became almost extinct. The authors write:

Vultures are efficient scavengers and feed only on carrion. In India, a country with over 500 million livestock, these birds provided an important public health service by removing livestock carcasses from the environment. In the mid-1990s, vultures experienced the fastest population collapse of a bird species in recorded history. The cause of death was unknown until 2004 when it was identified as poisoning from consuming carcasses containing traces of a common painkiller, diclofenac. The expiration of a patent led to a dramatic fall in the price of medical diclofenac, the development of generic variants, and entry into the veterinary market in 1994. We exploit this event to study the costs of losing vultures. Using habitat range maps for affected species, we compare high- to low-vulture suitability districts before and after the veterinary use of diclofenac. We find that, on average, all-cause human death rates increased by more than 4% in vulture-suitable districts after these birds nearly went extinct. … As vultures died out, the scavenging services they provided disappeared too, and carrion were left out in the open for long periods of time. Ecologists have argued that this may have led to an increase in the population of rats and feral dogs, which are a major source of rabies in India. Rotting carcasses can also transmit pathogens and diseases such as anthrax, to other scavengers. In addition, these pathogens can enter water sources either when people dump carcasses in rivers or because of erosion by surface runoff …

More generally, there continues to be a modest literature in environmental economics that carries on the “economic ornithology” tradition of looking at birds as providers of ecosystem services. Christopher J. Whelan, Çağan H. Şekercioğlu, and Daniel G. Wenny provide an overview in “Why birds matter: from economic ornithology to ecosystem services” (Journal of Ornithology, 2015, 156: 227-238). They point to a few specific studies:

For instance, Mols and Visser (2002) investigated effects of bird control of herbivorous insects in Dutch apple orchards, and reported that increasing bird density through deployment of nest boxes led to a 50 % reduction in apple damage and an increase of about 60 % in total apple crop yield. Koh (2008) attributed bird pest control to prevention of 9–26 % fruit loss in oil palm (Elaeis guineensis). Johnson et al. (2009) found birds significantly reduced damage by coffee berry-borer beetles (Hypothenemus hampei), with higher coffee yields resulting in increased income from US$44 to US$310/ha.

The authors also point to birds as providing pollination and seed dispersal services, controlling populations of mice and rats, and supplying other services. But the overall tone of the article is that there is still a lot of research to be done, not in the dissection of bird stomachs, but in understanding the role of birds within ecosystems–especially as bird populations rise or fall and ecosystems adjust accordingly. The authors write:

Yet the economic relevance of birds is not widely appreciated and the economic relevance to human society of birds’ ecological roles is even less understood. Quantifying the services provided by birds is crucial to understand their importance for ecosystems and for the people that benefit from them. In this paper, we briefly review the rise and fall of economic ornithology and call for a new economic ornithology with heightened standards and a holistic focus within the ecosystem services approach. Birds’ ecological roles, and therefore, ecosystem services, are critical to the health of many ecosystems and to human well-being.

For my bird-watcher friends, no, I’m not suggesting that all birds should be reduced to quantifiable factors of production. But when it comes to protecting and restoring bird habitat, having some dollars and cents on your side of the argument doesn’t hurt.

Some Economics for Martin Luther King Jr. Day

On November 2, 1983, President Ronald Reagan signed a law establishing a federal holiday for the birthday of Martin Luther King Jr., to be celebrated each year on the third Monday in January. As the legislation that passed Congress said: “[S]uch holiday should serve as a time for Americans to reflect on the principles of racial equality and nonviolent social change espoused by Martin Luther King, Jr.” Of course, the case for racial equality stands fundamentally upon principles of justice, with economics playing only a supporting role. But here are a few economics-related thoughts for the day, clipped from posts in the previous year at this blog, with more detail and commentary at the links.

1. “Changes in the Distribution of Black and White Wealth since the US Civil War,” by Ellora Derenoncourt, Chi Hyun Kim, Moritz Kuhn, and Moritz Schularick, Journal of Economic Perspectives, Fall 2023. From the abstract:

The difference in the average wealth of Black and white Americans narrowed in the first century after the Civil War, but remained large and even widened again after 1980. Given high levels of wealth concentration both historically and today, dynamics at the average may not capture important heterogeneity in racial wealth gaps across the distribution. This paper looks into the historical evolution of the Black and white wealth distributions since Emancipation. The picture that emerges is an even starker one than racial wealth inequality at the mean. Tracing, for the first time, the evolution of wealth of the median Black household and the gap between the typical Black and white household over time, we estimate that the majority of Black households only began to dispose of measurable wealth around World War II. While the civil rights era brought substantial wealth gains for the median Black household, the gap between Black and white wealth at the median has not changed much since the 1970s. The top and the bottom of the wealth distribution show even greater persistence, with Black households consistently over-represented in the bottom half of the wealth distribution and under-represented in the top-10 percent over the past seven decades.

2. “HBCUs: The Evolving Challenge” (September 25, 2023)

This post draws on two essays: one by Gregory N. Price and Angelino C. G. Viceisza in the Summer 2023 issue of the Journal of Economic Perspectives, “What Can Historically Black Colleges and Universities Teach about Improving Higher Education Outcomes for Black Students?”; and the other from Gizelle George-Joseph and Devesh Kodnani of Goldman Sachs, “Historically Black, Historically Underfunded: Investing in HBCUs” (Goldman Sachs Research, June 13, 2023).

Both essays emphasize the evolution of historically black colleges and universities (HBCUs), and the differences across these institutions. Both note that back in, say, 1967, about 80% of all black college students attended these institutions, while now it’s about 9%. Thus, the role of these institutions has evolved. However, they continue as a group to provide an outsized share of black college graduates, especially in the sciences. In addition, after adjusting for factors like household income and institutional resources, black students attending HBCUs have a greater likelihood of graduating. At a time when US higher education as a whole is trying to reach out to traditionally underrepresented groups, it seems as if there are some lessons to be learned here.

3. “The Decarceration Trend for Black Americans” (July 27, 2023).

It’s quite possible that US incarceration rates are too high, but it’s also just a fact that they have been declining in recent years. Here’s an overall figure.

For black Americans, the change is especially noticeable.  Jason P. Robey, Michael Massoglia, and Michael T. Light describe the change in “A Generational Shift: Race and the Declining Lifetime Risk of Imprisonment” (Demography, published online July 12, 2023). From their abstract:

This study makes three primary contributions to a fuller understanding of the contemporary landscape of incarceration in the United States. First, we assess the scope of decarceration. Between 1999 and 2019, the Black male incarceration rate dropped by 44%, and notable declines in Black male imprisonment were evident in all 50 states. Second, our life table analysis demonstrates marked declines in the lifetime risks of incarceration. For Black men, the lifetime risk of incarceration declined by nearly half from 1999 to 2019. We estimate that less than 1 in 5 Black men born in 2001 will be imprisoned, compared with 1 in 3 for the 1981 birth cohort. Third, decarceration has shifted the institutional experiences of young adulthood. In 2009, young Black men were much more likely to experience imprisonment than college graduation. Ten years later, this trend had reversed, with Black men more likely to graduate college than go to prison.

4. “The IRS Audit Algorithm and Racial Effects” (May 17, 2023)

Algorithms may in some settings be more fair than human decision-making (which is not necessarily a high bar!), but they can also lead to unexpected and undesired results. Hadi Elzayn, Evelyn Smith, Thomas Hertz, Arun Ramesh, Robin Fisher, Daniel E. Ho, and Jacob Goldin dig into the evidence in “Measuring and Mitigating Racial Disparities in Tax Audits” (Stanford Institute for Economic Policy Research, January 2023). They write: “Despite race-blind audit selection, we find that Black taxpayers are audited at 2.9 to 4.7 times the rate of non-Black taxpayers.” The research result has gotten considerable press coverage, like the recent “I.R.S. Acknowledges Black Americans Face More Audit Scrutiny” in the New York Times (May 15, 2023).

It turns out that when you dig into this data, pretty much all of the difference is because black working poor who are claiming the Earned Income Tax Credit are audited at a much higher rate than non-black working poor who are claiming the EITC, and that this “disparity cannot be fully explained by racial differences in income, family size, or household structure.” Instead, the gap seems to trace back to details built into the IRS algorithm. For example, the algorithm tends to single out for audits the cases that are most likely to lead to collecting additional taxes. This may sound reasonable at first, but imagine two tax returns: In one, there is a 95% chance that the audit will collect an extra $500, and in the other there is a 50% chance that the audit will collect an extra $10,000. If the algorithm prioritizes the probability of making a collection, rather than the expected amount–the probability multiplied by the sum that could be collected–it will often focus on the working poor rather than on middle- and upper-income taxpayers who might owe more.
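The difference between those two selection rules can be made concrete in a few lines of code. This is only a toy sketch using the hypothetical numbers from the example above; the two returns and the ranking rules are illustrative, not a description of the actual IRS audit-selection algorithm.

```python
# Two hypothetical tax returns, from the example in the text:
# probability that an audit recovers extra tax, and the amount if it does.
audits = [
    {"name": "EITC return", "p": 0.95, "amount": 500},
    {"name": "high-income return", "p": 0.50, "amount": 10_000},
]

# Rule 1: pick the audit most likely to collect *something*.
by_probability = max(audits, key=lambda a: a["p"])

# Rule 2: pick the audit with the highest expected revenue
# (probability multiplied by the amount at stake).
by_expected_value = max(audits, key=lambda a: a["p"] * a["amount"])

print(by_probability["name"])     # EITC return (95% chance, but only $500)
print(by_expected_value["name"])  # high-income return
```

Rule 1 targets the working-poor return even though the expected revenue from the high-income return is more than ten times higher ($5,000 versus $475).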

Professional Sports and the Lack of Local Economic Payoffs

I’m a sports fan, which in this case may represent a conflict of interest, because it means I’m conflicted about public subsidies going to sports stadiums. The economic evidence on this point is pretty clear: such subsidies can shift how people spend their entertainment dollars from one area of a city to another, but the net gain to an urban area is probably negative. John Charles Bradbury, Dennis Coates, and Brad R. Humphreys review the evidence in “The impact of professional sports franchises and venues on local economies: A comprehensive survey” (Journal of Economic Surveys, September 2023, pp. 1389-1431). The authors write:

Between 1970 and 2020, state and local governments devoted $33 billion in public funds to construct major-league sports venues in the United States and Canada, with the median public contribution covering 73% of venue construction costs. The prevalence of subsidized sports stadiums and arenas spawned an active economics literature evaluating their efficacy at stimulating economic activity. This literature contains near-universal consensus evidence that sports venues do not generate large positive effects on local economies. … However, this literature expanded considerably since the last comprehensive literature survey. We survey the extensive academic literature on the economic impacts of sports teams and venues on local communities, which includes more than 130 articles and spans more than 30 years, most published in the past decade. We document the presence of a clear consensus in the results reported in this literature.

Many of us sports fans know that when we attend a game, nearby restaurants, bars, and parking lots are often doing a good business–of course along with economic activity in the venue itself. How do we reconcile this evidence of our own eyes with the economic studies? As Bradbury, Coates, and Humphreys write:

Robust empirical findings documenting the impotence of professional sports in local economies likely reflect a simple theoretical explanation: consumer spending on sports represents a transfer from other local consumer spending, not net-new spending. Although sports games attract some nonlocals to spend money in the area, these visitors also crowd out other tourists attracted to other consumption amenities common to major US cities. Even with the presence of outside visitors attracted by sports events, most consumer spending in and around pro sports venues derives from local residents; therefore, the opportunity cost of local sports consumption falls primarily on other competing local businesses, such as movie theaters, restaurants, and retail shopping. Most spending on game tickets, concessions, and associated hospitality near a sports venue would have occurred in other parts of the host jurisdiction without the presence of a pro sports team. Sports-related spending largely reflects a redistribution of existing spending by residents rather than increased local spending.

Any added spending from visitors attending games tends to be concentrated in certain sectors in the local economy and in locations that may not bear the full tax burden generated by subsidies. In addition, the influx of consumers also generates local nuisance or congestion externalities in the form of traffic, crowds, noise, litter, and crime, which may mitigate any positive economic effects. Furthermore, there is no obvious reason to expect income or employment multipliers from sports spending to be greater than those for other types of local consumption spending that are crowded out; thus, the consistent empirical findings of insubstantial tangible economic impacts from professional sports teams and venues conform to theoretical expectations.

When the economic evidence is against you, you (in this case, me) argue about noneconomic benefits instead. Economists sometimes refer to “nonuse benefits.” Even if I haven’t attended a game in a few months (and a combination of limited time and high ticket prices means that I don’t see a lot of games in person), I still enjoy reading and hearing about the games. The local newspaper probably devotes more space to sports coverage than to international news. During my commute, I often listen to local sports-talk radio stations. I sometimes watch games on television. Talking about weather and sports is often an easy and noncontroversial conversation opener.

Some economists have tried to estimate these kinds of “nonuse benefits” using sophisticated survey data: a common finding is that the social benefits are about 15% of the facility construction costs–not nearly enough to justify the level of public subsidies.
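A quick back-of-the-envelope calculation shows why that 15% figure falls short. The sketch below assumes a hypothetical $1 billion venue and combines the 15% nonuse-benefit estimate with the 73% median public contribution quoted earlier from Bradbury, Coates, and Humphreys; the dollar figure is purely illustrative.

```python
# Hypothetical $1 billion venue; the 73% and 15% shares are the figures
# quoted in this post, not estimates for any particular stadium.
construction_cost = 1_000_000_000

public_subsidy = 0.73 * construction_cost   # median public share of costs
nonuse_benefit = 0.15 * construction_cost   # survey-based social benefit

shortfall = public_subsidy - nonuse_benefit
print(f"subsidy:        ${public_subsidy:,.0f}")
print(f"nonuse benefit: ${nonuse_benefit:,.0f}")
print(f"shortfall:      ${shortfall:,.0f}")
```

On these assumptions, nonuse benefits cover only about a fifth of the typical public contribution, leaving a shortfall of roughly $580 million.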

Another argument involves whether a new stadium increases property values in the area around the stadium. The evidence here is not clear-cut, but a rough summary would be that in suburban areas, a new stadium often decreases local property values (households and firms don’t necessarily want to be near the stadium), while a new stadium in an urban area can sometimes increase local property values. In interpreting these kinds of results, it’s important to remember that big events also tend to bring traffic jams, noise, and even a rise in crime, so if you’re not a fan, you have no advantages to balance against the disadvantages.

Of course, all of this raises a paradox: If public subsidies for stadiums don’t pay off, why do they keep happening? There are two possible answers here. One is that stadium subsidies arise from an unholy mixture of loudly represented special interests, empire-building local officials, and the threat that a team can move away. The result is a kind of arms race, where cities know they would be better off if they were all to limit these subsidies, but few individual cities are willing to do so on their own. It’s a dynamic that’s similar to colleges and universities all building certain facilities or having certain kinds of offices because everyone else is doing it. It’s also similar to the dynamic where places offer unsuitably large tax breaks or subsidies to a big company that promises to move to a certain area.

The other possible answer is that the economic studies aren’t capturing something important about the role of sports teams in the portfolio of entertainment activities in a metro area. For example, maybe certain employers and their employees want to be in the kind of city where stuff happens. After all, stadiums are often used for nonsports events: concerts, trade shows, monster trucks, whoever-on-ice, and others. If your metro area didn’t have a football stadium, you were not going to get a visit from Taylor Swift.

From this perspective, the insight that subsidies for sports stadiums are often too high doesn’t necessarily imply that no subsidies at all are justifiable. Perhaps some of the answer is for at least some urban areas to negotiate harder for lower subsidies–and thus to help set a precedent of lower subsidies that can be followed by others.

Thoughts about US Steel

In mid-December, the Japanese firm Nippon Steel announced that it was buying U.S. Steel for $14.9 billion. The news was unsettling to politicians of both parties, who have often argued over the years that steel is a vital domestic industry, along with being an important source of jobs.

For me, the real shock was the announced purchase price of $14.9 billion. When US Steel was formed back in 1901 by merging together a number of smaller competitors, it was the largest firm in the world. By 1960, it was still in the top 10 US firms in the Fortune 500 listings. By 1991, US Steel was no longer included in the 30 large firms that make up the Dow Jones Industrial Index. In 2014, US Steel fell out of Standard & Poor’s index of the top 500 US firms. Indeed, US Steel is no longer even the largest US steel firm–that would be Nucor. US Steel now makes about 12% of American steel.

The purchase of US Steel would not be the biggest deal of 2023. For example, Kroger’s bought Albertson’s last year, in a merger of grocery store chains, for $24.6 billion. The biotech firm Amgen paid $26 billion for Horizon Therapeutics. Prologis, a firm that owns and manages industrial space, paid $23 billion for Duke Realty. Broadcom, which designs and makes a range of software infrastructure and semiconductor products, bought VMWare, which makes software that allows you to “run any app on any cloud on any device” for $61 billion–call it four times the value of US Steel.

Indeed, there are now several US professional sports teams valued at $7 billion or more, including the (football) Dallas Cowboys, the (baseball) New York Yankees, and the (basketball) Golden State Warriors. Once-mighty US Steel is now worth about two professional sports franchises.

The diminishing importance of US Steel is part of an overall shift of the global steel industry. For a sense of the global steel market, and the place of US steel-makers in that market, consider this figure by Nicolo Conte at the Elements website. On the bottom left of the figure, you can see that back in 1967, China had 3% of the global steel market, but now it has 57%. Japan and the US together make less than one-fifth as much steel as China–and both the US and Japan lag behind India as a steel producer.

One part of the rise of steel production in China and India is the dramatic expansion of their economies. The World Steel Association reports the main uses of steel from a global perspective in this way:

Of course, the production of buildings, transportation equipment, and machinery in China has skyrocketed in the last 40 years or so. Thus, the local market for steel producers in China has skyrocketed, too.

But the other issue is that the US steel industry in general–and US Steel in particular–has historically been well behind the cutting edge of advances in steel technology. As Brian Potter points out in “‘No inventions; no innovations’: A History of US Steel” (Construction Physics, December 29, 2023), US Steel was considerably behind the technology curve in the post-World War II era, including: the shift from open hearth furnaces to the Basic Oxygen Furnace; the pursuit of economies of scale through very large furnaces; the rise of the “mini-mill” that makes steel by melting scrap steel, rather than processing iron ore; and others.

The US steel industry overall and US Steel in particular have been forced to trim down considerably in the last few decades, but the US economy still makes most of its own steel. According to the US Geological Survey annual volume on Mineral Commodity Summaries 2023, 14% of US finished steel consumption was imported in 2022. The main sources of these imports were Canada at 21%, Brazil at 15% and Mexico at 14%.

However, the US has a long history of blocking imported steel from other countries; for example, when President Trump decided to ramp up trade protectionism in 2018, steel was one of the first industries to gain additional tariff protection. As a result, steel-using US industries like construction, cars and transportation equipment, and machinery pay more than steel-users in other countries. The SteelBenchmarker website reports that at present, US steel-users pay $1,142 per metric tonne of hot-rolled band steel: for comparison, the comparable price in western Europe is $790; the price in world export markets is $606; and the price in mainland China is $484. Thus, every US industry relying on US steel production is at a competitive disadvantage in global markets compared to firms elsewhere.
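To see the size of that wedge, one can compute the US price premium over each benchmark. A minimal sketch, using the SteelBenchmarker prices quoted above:

```python
# Hot-rolled band steel prices quoted in the text (US$ per metric tonne).
benchmark_prices = {
    "western Europe": 790,
    "world export markets": 606,
    "mainland China": 484,
}
us_price = 1142

# Premium = how much more US steel-users pay, as a percent of each benchmark.
for region, price in benchmark_prices.items():
    premium = (us_price - price) / price * 100
    print(f"US price is {premium:.0f}% above {region}")
```

The premiums work out to roughly 45% over western Europe, 88% over world export markets, and 136% over mainland China–a rough measure of the cost handicap facing US steel-using industries.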

At about this point in the argument, it’s usual for someone to say, accusingly, “So, you just don’t care about the jobs of steelworkers, and you just don’t care if the US steel industry vanishes.” Actually, I do care. But the historical pattern over the last half-century is that while the US government keeps protecting the US steel industry from international competition, the US steel industry has not used that protection to catch up technologically.

Looking ahead, the US and Japan combined are only a small slice of global steel markets. The steel industry in both countries needs greater scale and continuous technological improvement. In comparison to those problems, the question of whether a certain Japanese steelmaker should be allowed to pay $14.9 billion for the #2 US steelmaker is a diversion from the real issues.