The Unfairness of Money Bail

About 40 years ago, when I was a junior on the high school debate team, we argued for the abolition of the money bail system. Like many positions taken by high school juniors in debate tournaments, our arguments were sweeping and simplistic. But we were correct in recognizing that there are real problems with money bail.

As one example, 14 elected prosecutors wrote a joint letter to New York state lawmakers on March 6, 2019.  The prosecutors  wrote:

We support ending money bail because safety, not wealth, should be the defining feature of the pretrial justice system. Three out of every four people in New York cannot afford to pay the bail amount that the judge sets at their arraignment. That means many people are jailed simply because they are too poor to purchase their freedom. … The only people who should be detained pretrial are those who a judge finds pose a specific, clear and credible threat to the physical safety of the community, or who are a risk of intentionally evading their court dates. Jails across New York frequently are over-capacity, and they are filled with people who do not need to be there. … Research shows that people who spend even a short period in jail, as opposed to being released pretrial, are more likely to commit a future crime. This makes sense. Jail is traumatizing. Jobs are lost. Families can’t pay rent. For reasons big and small, people who are away from their family, their job, and their community become more vulnerable and less stable.

Patrick Liu, Ryan Nunn, and Jay Shambaugh provide a useful backgrounder on this subject in “The Economics of Bail and Pretrial Detention,” written for the Hamilton Project at the Brookings Institution (December 2018). Will Dobbie and Crystal Yang have now offered “Proposals for Improving the U.S. Pretrial System,” written as a Hamilton Project Policy (March 2019). Here’s an overview comment from the conclusion of the Liu, Nunn, and Shambaugh paper:

“Bail has been a growing part of the criminal justice system. Nonfinancial release has been shrinking, and more and more defendants are using commercial bonds as a way to secure their release while awaiting trial. Bail can make it more likely that defendants will reappear in court, and as such reduce costs for the criminal justice system. There are, however, extensive costs. Beyond the direct costs of posting the bail, either from paying a fee or having to liquidate assets, widespread use of bail has meant that many people are incarcerated because they are unable to post bail.

“Nearly half a million people are in jail at any given time without having been convicted of a crime. The overwhelming majority of these people are eligible to be released—that is, a judge has deemed that they are safe to be released—but are unable to raise the funds for their release. The impact of monetary bail falls disproportionately on those who are low-income, cannot post bail out of liquid assets, and thus often remain in jail for extended periods. Furthermore, as a growing body of literature has shown, the assignment of financial bail increases the likelihood of conviction due to guilty pleas, and the costs—to both individuals and society—from extra convictions can be quite high.”

Let’s spell some of this out more explicitly.

Nearly half a million people are incarcerated on any given day without having been convicted of a crime. Add it all up, and over 10 million people during a given year are locked up without being convicted of anything. Roughly one-quarter of all inmates in state and local jails have not been convicted. Here’s a figure from Liu, Nunn, and Shambaugh:

In the last few decades, the use of money bail has been rising. As Dobbie and Yang write (figures and references omitted):

The high rate of pretrial detention in the United States in recent years is largely due to the increasing use of monetary or cash bail—release conditional on a financial payment—and the corresponding decreasing use of release on recognizance (ROR), a form of release conditional only on one’s promise to return to the court. The share of defendants assigned monetary bail exceeded 40 percent in 2009 in the set of 40 populous U.S. counties where detailed data are available, an 11 percentage point–increase from 1990. The fraction of defendants released on their own recognizance decreased by about 13 percentage points over the same period in these counties, with only 14 percent of defendants being released with no conditions in 2009. The widespread use of monetary bail directly leads to high pretrial detention rates in most jurisdictions because many defendants are unable or unwilling to pay even relatively small monetary bail amounts. In New York City, for example, an estimated 46 percent of all misdemeanor defendants and 30 percent of all felony defendants were detained prior to trial in 2013 because they were unable or unwilling to post bail set at $500 or less.

The time that accused people spend in pretrial detention can be significant. Liu, Nunn, and Shambaugh write:

[T]he amount of time that a person is detained if they are unable to afford bail is substantial, ranging from 50 to 200 days, depending on the felony offense. The pretrial detention period is also growing … From 1990 to 2009, the median duration of pretrial detention increased for every offense, ranging from an increase of 34 percent for burglary to 104 percent for rape. … Even for durations that are relatively short—for example, 54 days for those accused of a driving-related felony—pretrial detention represents a nearly two-month period during which individuals are separated from their families and financial hardships are exacerbated. Moreover, the typical wait until trial is much longer in some places than others (e.g., 200 days in one sample of Pennsylvania counties).

Dobbie and Yang also point out that when it comes to international comparisons, the US locks up accused people before trial at a much higher rate than other countries. This figure shows the number of people detained pre-trial per 100,000 population. The US is the tallest bar on the far right.

Sorting out the costs and benefits of different levels of pretrial detention isn’t easy. The direct costs of holding people in jails and prisons are straightforward. But how many of those accused people would not have appeared before the court? If they did not appear for court, how would the cost of finding them have compared to the cost of locking them up for days–and in some cases for weeks or even months? What are the additional costs of being locked up in terms of loss of employment opportunities, or stresses on families? How many of those would have committed crimes if not detained? (And how comfortable are we as a society with locking people up not because they have been convicted of a crime, but because we suspect they might commit a crime in the future?) When thinking about conditions of pretrial release, judges are supposed to take all of this into account: for example, the presumption that an accused person is innocent, the risk of the person not showing up for trial and the costs of finding them, the risk of the person committing another crime if they are not detained, the person’s social ties to the community, and the person’s economic ability to put up a monetary bond.

To figure out the effects of different methods of pretrial detention, a social scientist might ideally like to take a large pool of people accused of crimes and conduct a randomized experiment, in which some randomly get offered differing levels of money bail, some are released on their own recognizance, and we see what happens. While it would be grossly inappropriate for the justice system to plan to operate in this way, it turns out that this randomized experiment is being conducted by reality.

Decisions about whether to offer bail, or at what level, are not made consistently across the judicial system. When there are multiple judges in a given court, some will tend to be tougher in granting bail and some will be easier, so whether defendants like it or not, they are living in a randomized experiment depending on the judge to whom they are randomly assigned. In addition, the evidence shows that even the same judge will not always treat accused people with seemingly identical characteristics in the same way, which adds another element of randomness. Thus, research in this area can start by setting aside those who are essentially always granted bail or essentially never granted bail, and instead focus on those with seemingly identical characteristics who are more-or-less randomly granted bail in some cases but not in others.

Dobbie and Yang have been among the leading researchers in this area, and they describe the results of this research in their paper.  For example:

Those who are detained pretrial are more likely to be found guilty, mainly because those who are detained pretrial are more likely to take a plea bargain–which may include credit for time already served. Pretrial detention clearly reduces the risk of pretrial flight and pretrial crime, but at least in some studies, greater exposure to jail time before the trial is associated with a rise in posttrial crime. Defendants who, because of the randomness in the system, are released pretrial rather than being held pretrial are more likely to have income and to be employed 2-4 years later. In some jurisdictions, the randomness in the process of granting bail takes the form of racial disparities.

For those of us who aren’t ready to take the plunge and eliminate the money bail system altogether, what might we do to move in the direction of reducing the use of money bail and rationalizing the system? Dobbie and Yang offer some proposals based on the existing research.

Some are pretty simple. When defendants are released on their own recognizance before trial, set up a system of text messages or emails to remind them of their court date. For low-risk crimes, make greater use of writing citations, rather than arresting people, and when people are arrested, lean toward releasing them on their own recognizance. For those defendants where a higher degree of monitoring seems appropriate, make greater use of electronic or personal monitoring.

Some more complex proposals involve machine learning. It’s now possible to plug the data on the characteristics of those who get bail, or are released on their own recognizance, into a computer algorithm, which can look for patterns in those who are more or less likely to flee before trial, or more or less likely to commit crimes. The feedback based on such studies can then be turned over to judges, so they have a systematic sense of how they ruled in past similar cases, and how they compare with how other judges have ruled in similar cases. It’s easy to feel queasy about this approach. Are we going to let the result of computer number-crunching play a substantial role in whether people are granted bail? But computer number-crunching may have greater clarity and consistency in its decisions than at least some judges, and could help produce results that let more people out before trial while also leading to lower pretrial flight risks and crime. Pilot tests along these lines in jurisdictions willing to give it a try seem warranted.

Federal Employee Pay: A Trial Balloon

“The Federal Government is the Nation’s largest employer, and its footprint is global. The total workforce comprises approximately 2.1 million non-postal civilian workers and 1.4 million active duty military, as well as approximately one million military reserve personnel serving throughout the country and the world. The postal workforce includes an additional 500,000 employees. Approximately 85 percent of the Federal workforce, or 1.7 million people, live outside of the Washington, D.C., metropolitan area. Notably, an even larger “indirect” workforce carries out much of the work paid for by Federal funds. This includes Federal contractors and State, local, and nonprofit employees whose jobs are funded by Federal contracts, grants and transfer payments.”

This reminder is from Chapter 5 of the Analytical Perspectives volume published with the proposed FY 2020 budget from the Trump Administration. In any given year, a lot of what is in the Analytical Perspectives volume is just an update of the previous year. But the topic of federal employee pay gets more than a quick update; it’s an announcement that the Trump administration plans to push on the topic of federal employee pay in the next couple of years.

Here are some background figures from the budget documents. The first two figures compare education levels for federal workers and private-sector workers, and how they have evolved over time. The first figure shows that the share of federal workers with at least a master’s degree has roughly doubled from 15% to 30% since 1990. In the private sector, the share of workers with a master’s degree is less than half this level, although also rising over time.

The reverse pattern holds for workers with a high school degree or less. This group was 30% of the federal workforce in 1990, but is now about half that level. For all firms in the private sector, 50% of workers had a high school degree or less in 1990, and it’s now about 40%.

The patterns suggest a real disjunction between federal and private-sector workers. For at least some readers, it may come as a surprise to recognize that 40% of private-sector workers in the US have a high school degree or less. What seems like a real-world solution, or a useful process of paperwork and forms, might look rather different to members of these two workforces.

Federal workers tend to be significantly older, too.

Comparing the compensation of federal and private-sector workers isn’t straightforward. A full analysis would need to take into account take-home pay, benefits, differences in skill levels, and likelihood of being fired or being able to stay on the job as long as you want. The budget document points to a study by the Congressional Budget Office:

A Congressional Budget Office (CBO) report issued in April 2017 found that, based on observable characteristics, Federal employees on average received a combined 17 percent higher wage and benefits package than the private sector average over the 2011-2015 period. The difference is overwhelmingly on the benefits side. CBO found that Federal employees receive on average 47 percent higher benefits and 3 percent higher wages than counterparts in the private sector. In CBO’s analysis, these differences reflect higher Federal compensation paid to individuals with a bachelor’s degree or less, with Federal employees with professional degrees undercompensated relative to private sector peers.

This general pattern that wages for federal employees are similar to the private sector, given education level, but benefits are higher for government workers, goes back a few years: for example, I laid out the pattern in a blog post in 2012. The budget reproduces a chart from the 2017 CBO study. It separates the workforce into five groups by education level. For each group, the left-hand bar shows wages and benefits for federal workers, while the right-hand bar shows wages and benefits for private-sector workers. Again, wages are fairly similar, but retirement and health benefits are clearly better for the federal workers.

The budget document discusses a variety of changes to federal pay: having employees contribute more to their retirement benefits; fewer days off for federal employees, but more flexibility in which days can be taken off; fewer across-the-board pay increases, and more merit increases; greater hiring of “term” federal employees who spend a few years in the government before heading back to the private sector; and more. These kinds of proposals for adjusting federal employee pay are fairly common, but they often tend to end up on that long list of perhaps-useful-but-not-necessarily-right-now topics that never quite make it into law.

Market Shares for Browsers and Platforms

Teachers of intro economics, as well as industrial organization classes, are often on the lookout for recent examples of market shares that can be used for talking about the extent to which certain markets are concentrated or competitive. The W3 Counter offers a monthly breakdown of market shares for browsers and platforms. For February 2019, here’s a figure for internet browser market share:

Here’s a figure showing patterns over time, and thus showing the substantial rise of Chrome and the mild rise of Firefox, and the corresponding falls of Internet Explorer/Edge and Safari. Depending on whether you are a glass half-full or half-empty person, you will have a tendency to see this either as proof that Google’s Chrome has a worrisome level of market dominance, or as proof that even seemingly dominant browser market shares can fall in a fairly short time.

Finally, here’s a table with the market share of various platforms in February 2019, but it needs to be read with care, since it lists multiple versions of Windows, Android, and iOS/Mac.

Some US Social Indicators Since 1960

The Office of Management and Budget released President Trump’s proposed budget for fiscal year 2020 a few weeks ago. I confess that when the budget comes out I don’t pay much attention to the spending numbers for this year or the five-year projections. Those numbers are often built on sand and political wishfulness, and there’s plenty of time to dig into them later, if necessary. Instead, I head for the “Analytical Perspectives” and “Historical Tables” volumes that always accompany the budget. For example, Chapter 5 of the “Analytical Perspectives” is about “Social Indicators”:

The social indicators presented in this chapter illustrate in broad terms how the Nation is faring in selected areas. Indicators are drawn from six domains: economic, demographic and civic, socioeconomic, health, security and safety, and environment and energy. … These indicators are only a subset of the vast array of available data on conditions in the United States. In choosing indicators for these tables, priority was given to measures that are broadly relevant to Americans and consistently available over an extended period. Such indicators provide a current snapshot while also making it easier to draw comparisons and establish 

This chapter includes a long table, stretching over parts of three pages, that shows many statistics at ten-year intervals since 1960, along with the last few years. For me, tables like this offer a grounding in basic facts and patterns. Here, I’ll offer some comparisons drawn from the table over the last half-century or so, from 1960 or 1970 up to the most recent data.

Economic

  • Real GDP per person has more than tripled since 1960, rising from $18,036 in 1960 to $55,373 in 2017 (as measured in constant 2012 dollars).
  • Inflation has reduced the buying power of the dollar over time such that $1 in 2016 had about the same buying power as 12.3 cents back in 1960, according to the Consumer Price Index.
  • The employment/population ratio rose from 56.1% in 1960 to 64.4% by 2000, then dropped to 58.5% in 2012, before rebounding a bit to 62.9% in 2018.
  • The share of the population receiving Social Security disabled worker benefits was 0.9% in 1960 and 5.5% in 2018. 
  • The net national savings rate was 10.9% of GDP in 1960, 7.1% in 1980, and 6.0% in 2000. It actually was slightly negative at -0.5% in 2010, but was back to 2.9% in 2017.
  • Research and development spending has barely budged over time: it was 2.52% of GDP in 1960 and 2.78% of GDP in 2017, and hasn’t varied much in between.
Demographic
  • The foreign-born population of the US was 9.6 million out of a total of 204 million in 1970, and was 44.5 million out of a total of 325.7 million in 2017.
  • In 1960, 78% of the over-15 population had ever been married; in 2018, it was 67.7%.
  • Average family size was 3.7 people in 1960, and 3.1 people in 2018.
  • Single parent households were 4.4% of households in 1960, and 9.1% of all households in 2010, but slightly down to 8.3% of all households in 2018.
Socioeconomic
  • The share of 25-34 year-olds who are high school graduates was 58.1% in 1960, 84.2% in 1980, and 90.9% in 2018.
  • The share of 25-34 year-olds who are college graduates was 11% in 1960, 27.5% in 2000, and 35.6% in 2017.
  • The average math achievement score for a 17 year-old on the National Assessment of Educational Progress was 304 in 1970, and 306 in 2010.
  • The average reading achievement score for a 17 year-old was 285 in 1970 and 286 in 2010.
Health
  • Life expectancy at birth was 69.7 years in 1960, and 78.7 years in 2010, and 78.6 years in 2017.
  • Infant mortality was 26 per 1,000 births in 1960, and 5.8 per 1,000 births in 2017.
  • In 1960, 13.4% of the population age 20-74 was obese (as measured by having a Body Mass Index above 30). In 2016, 40% of the population was obese.
  • In 1970, 37.1% of those age 18 and older were cigarette smokers. By 2017, this had fallen to 14.1%.
  • Total national health expenditures were 5.0% of GDP in 1960, and 17.9% of GDP in 2017.
Security and Safety
  • The murder rate was 5.1 per 100,000 people in 1960, rose to 10.2 per 100,000 by 1980, but had fallen back to 4.9 per 100,000 in 2015, before nudging up to 5.3 per 100,000 in 2017.
  • The prison incarceration rate in federal and state institutions was 118 per 100,000 in 1960, 144 per 100,000 in 1980, 519 per 100,000 by 2010, and then down to 464 per 100,000 in 2016.
  • Highway fatalities rose from 37,000 in 1960 to 51,000 in 1980, and then fell to 33,000 in 2010, before nudging up to 37,000 in 2017.
Energy

  • Energy consumption per capita was 250 million BTUs in 1960, rose to 350 million BTUs per person in 2000, but since then has fallen to 300 million BTUs per person in 2017.
  • Energy consumption per dollar of real GDP (measured in constant 2009 dollars) was 14,500 BTUs in 1960 vs. 5,700 in 2017.
  • Electricity net generation on a per person basis was 4,202 kWh in 1960, had more than tripled to 13,475 kWh by 2000, but since then has declined to 12,326 kWh in 2017.
  • The share of electricity generation from renewable sources was 19.7% of the total in 1960, fell to 8.8% by 2005, and since then rose to 17.1% of the total in 2017.

Numbers and comparisons like these are a substantial part of how a head-in-the-clouds academic like me perceives economic and social reality. If you like this kind of stuff, you would probably also enjoy my post from a few years back, “The Life of US Workers 100 Years Ago” (February 5, 2016).
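For readers who like to check this kind of arithmetic, here is a minimal sketch in Python of how a few of the comparisons above can be turned into ratios and average annual growth rates. The function names are my own illustrative choices, and the annualized figure assumes smooth compound growth between the two endpoint years, which is a simplification rather than anything stated in the budget tables.

```python
def ratio(later: float, earlier: float) -> float:
    """How many times larger the later value is than the earlier one."""
    return later / earlier


def annualized_growth(later: float, earlier: float, years: int) -> float:
    """Average compound growth rate per year (assumes smooth growth)."""
    return (later / earlier) ** (1 / years) - 1


# Real GDP per person, constant 2012 dollars: 1960 vs. 2017
gdp_ratio = ratio(55_373, 18_036)
gdp_growth = annualized_growth(55_373, 18_036, 2017 - 1960)

# $1 in 2016 bought what 12.3 cents bought in 1960,
# so the price level rose by a factor of 1 / 0.123
price_level_factor = 1 / 0.123

# Energy use per dollar of real GDP, in BTUs: 1960 vs. 2017
energy_intensity_decline = 1 - ratio(5_700, 14_500)

print(f"GDP per person ratio: {gdp_ratio:.2f}")
print(f"GDP per person growth per year: {gdp_growth:.2%}")
print(f"Price level factor, 1960 to 2016: {price_level_factor:.1f}")
print(f"Decline in energy use per dollar of GDP: {energy_intensity_decline:.0%}")
```

On these numbers, "more than tripled" over 57 years works out to growth of roughly 2 percent per year, a reminder of how modest annual rates compound over half a century.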

Interview With Greg Mankiw at the Dallas Fed

In the latest installment of its “Global Perspectives” series of conversations, Robert S. Kaplan of the Dallas Fed discussed national and global economic issues with Greg Mankiw on March 7, 2019. The full 50 minutes of video is available here.

For a quick sample, here’s what Mankiw had to say on what economists don’t understand about politicians, and vice versa:

I don’t think economists fully understand the set of constraints that politicians operate under, probably because we have tenure, so we can say whatever we want. The politicians don’t. They constantly have to get approval by the voters, and the voters have different views of economic issues than economists do. So the politicians are sort of stuck between the voters they have to appeal to and the economists who are giving them advice. I think understanding the difficult constraints that politicians operate under would be useful.

In terms of what politicians don’t understand about economists, I think they often turn to (economists) for the wrong set of questions. My mentor, [Princeton University economist] Alan Blinder, coined what he calls Murphy’s Law of economic policy, which says that economists have the most influence where they know the least, and they have the least influence where they know the most.

Politicians are constantly asking us, ‘What’s going to happen next year?’ But we are really bad at forecasting. I understand why people need forecasting, as part of the policy process, but we’re really bad at it, and we’re probably not going to be good any time soon. On the other hand, there are certain problems where we kind of understand the answer. We understand that rent control is not a particularly good way to run a housing market. We understand that if you want to deal with climate change, you probably want to put a price on carbon. If you have a city that suffers from congestion, we can solve that with congestion pricing.

Can Undergraduates Be Taught to Think Like Economists?

A common goal for principles of economics courses is to teach students to “think like economists.” I’ve always been a little skeptical of that high-sounding goal. It seems like a lot to accomplish in a semester or two. I’m reminded of an essay written by Deirdre McCloskey back in 1992, which argued that while undergraduates can be taught about economics, thinking like an economist is a much larger step that will only in very rare cases happen in the principles class. Here’s McCloskey (“Other Things Equal: The Natural,” Eastern Economic Journal, Spring 1992):

“Bower thinks that we can teach economics to undergraduates. I disagree. I have concluded reluctantly, after ruminating on it for a long time, that we can’t. We can teach about economics, which is a good thing. The undergraduate program in English literature teaches about literature, not how to do it. No one complains, or should. The undergraduate program in art history teaches about painting, not how to do it. I claim the case of economics is similar. Majoring in economics can teach about economics, but not how to do it….

“As an empirical scientist I have to conclude from this and other experiences that thinking like an economist is too difficult to be a realistic goal for teaching. I have taught economics, man and boy, for nearly a century, and I tell you that it is the rare, gifted graduate student who learns to think like an economist while still in one of our courses, and it takes a genius undergraduate (Sandy Grossman, say, who was an undergraduate when I came to Chicago in 1968). Most of the economists who catch on do so long after graduate school, while teaching classes or advising governments: that’s when I learned to think like an economist, and I wonder if your experience is not the same.

“Let me sharpen the thought. I think economics, like philosophy, cannot be taught to nineteen-year olds. It is an old man’s field. Nineteen-year olds are, most of them, romantics, capable of memorizing and emoting, but not capable of thinking coldly in the cost-and-benefit way. Look for example at how irrational they are a few years later when getting advice on post-graduate study. A nineteen-year old has intimations of immortality, comes directly from a socialized economy (called a family), and has no feel on his pulse for those tragedies of adult life that economists call scarcity and choice. You can teach a nineteen-year old all the math he can grasp, all the history he can read, all the Latin he can stand. But you cannot teach him a philosophical subject. For that he has to be, say twenty-five, or better, forty-five. …”

In practical terms, the standard principles of economics course is a long march through a bunch of conceptual ideas: opportunity cost, supply and demand, perfect and imperfect competition, comparative advantage and international trade, externalities and public goods, unemployment and inflation, monetary and fiscal policy, and more. The immediate concern of most students is to master those immediate tools–what McCloskey calls learning “about” economics. But I do think that in the process of learning “about,” many principles students get a meaningful feeling for the broader subject and mindset. In the introduction to my own principles textbook, I write:

There’s an old joke that economics is the science of taking what is obvious about human behavior and making it incomprehensible. Actually, in my experience, the process works in the other direction. Many students spend the opening weeks of an introductory economics course feeling as if the material is difficult, even impossible, but by the middle and the end of the class, what seemed so difficult early in the term has become obvious and straightforward. As a course in introductory economics focuses on one lesson after another and one chapter after another, it’s easy to get tunnel vision. But when you raise your eyes at the end of class, it can be quite astonishing to look back and see how far you have come. As students apply the terms and models they have learned to a series of real and hypothetical examples, they often find to their surprise that they have also imbibed a considerable amount about economic thinking and the real-world economy. Learning always has an aspect of the miraculous.

Thus, I agree with McCloskey that truly “thinking like an economist” is a very rare outcome in a principles course, and unless you are comfortable as a teacher with setting a goal that involves near-universal failure, it’s not a useful goal for instructors. But it also seems true to me that the series of topics in a conventional principles of economics course, and how they build on each other, does for many students combine to form a comprehensible narrative by the end of the class. The students are not thinking like economists. But they have some respect and understanding for how economists think.

Child Care and Working Mothers

During the 1990s, a social and legal expectation arose in the United States that single mothers would usually be in the workforce, even when their children were young. In turn, this immediately raised a question of how child care would be provided. The 2019 Economic Report of the President, from the White House Council of Economic Advisers, offers some useful graphs and analysis of this subject.

Here are some patterns in the labor force for “prime-age” women between the ages of 25 and 54, broken down by single and married, and children or not. Back in the early 1980s, for example, single women with no children (dark blue line) were far more likely to be in the labor force than other women in this age group, and less than half of the married women with children under the age of six (green line) were in the labor force.

But by about 2000, the share of single prime-age women with no children in the labor force had declined, and had roughly converged with the labor force participation rates of the other groups shown–except for the labor force participation rates of married women with children under six, which rose but remained noticeably lower. The report notes: “These married mothers of young children who are out of the labor force are evenly distributed across the educational spectrum, although on average they have somewhat less education than married mothers of young children as a whole.”

The two especially big jumps in the figure are for labor force participation of single women with children, with the red line referring to single women with children under the age of six and the gray line referring to single women with children over the age of six. Back around 1990, single and married women with children over the age of six were in the labor force at about the same rate, and single and married women with children under the age of six were in the labor force at about the same rate. But after President Clinton signed the welfare reform legislation in 1996 (formally, the Personal Responsibility and Work Opportunity Reconciliation Act of 1996), work requirements increased for single mothers receiving government assistance.

If single mothers with lower levels of income are expected to work–especially mothers with children of pre-school age–then child care becomes of obvious importance. But as the next figure shows, the cost of child care is often a sizeable percentage of the median hourly wage in a given state. And of course, by definition, half of those paid with an hourly wage earn less than the median.

Mothers who are mainly working to cover child care costs face some obvious disincentives. The report cites various pieces of research finding that lower child care costs tend to increase the labor force participation of women. For example, a 2017 "review of the literature on the effects of child care costs on maternal labor supply … concludes that a 10 percent decrease in costs increases employment among mothers by about 0.5 to 2.5 percent."
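To put the quoted range in more familiar terms, here is a small sketch that converts it into an implied elasticity of maternal employment with respect to child care costs. The function name is made up for illustration, and the calculation simply divides one percentage change by the other; it is not a method taken from the report.

```python
def implied_elasticity(pct_change_employment, pct_change_cost):
    """Percent change in employment per one percent change in child care cost."""
    return pct_change_employment / pct_change_cost

# A 10 percent DECREASE in cost is a cost change of -10 percent.
low_end = implied_elasticity(0.5, -10)    # -0.05
high_end = implied_elasticity(2.5, -10)   # -0.25

print(f"Implied elasticity range: {low_end:.2f} to {high_end:.2f}")
```

An elasticity between -0.05 and -0.25 suggests that employment does respond to child care costs, but far less than proportionally.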

So what steps might government take to make child care more accessible to households with low incomes? Logically, the two possibilities are finding ways to reduce the costs, or providing additional buying power to those households.

When it comes to reducing costs, one place to look is at the variation in state-level requirements for child care facilities. It\’s often politically easy to ramp up the strictness of such requirements; after all, passing requirements for child care facilities doesn\’t make the state spend any money, and who can object to keeping children safe? But when state regulations raise the costs of providing a service, the buyers of that service end up paying the higher costs. The report points out some variations across states by staffing requirements.

\”For 11-month-old children, minimum staff-to-child ratios ranged from 1:3 in Kansas to 1:6 in Arkansas, Georgia, Louisiana, Nevada, and New Mexico in 2014. For 35-month-old children, they ranged from 1:4 in the District of Columbia to 1:12 in Louisiana. For 59-month-old children, they ranged from 1:7 in New York and North Dakota to 1:15 in Florida, Georgia, North Carolina, and Texas. Assuming an average hourly wage of $15 for staff members (inclusive of benefits and payroll taxes paid by the employer), the minimum cost for staff per child per hour would range from $2.50 in the most lenient State to $5 in the most stringent State for 11-month-old children, from $1.25 to $3.75 for 35-month old-children, and from $1.00 to $2.14 for 59 month-old children.\”
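The arithmetic behind those dollar figures is just the hourly wage divided by the number of children each staff member is allowed to supervise. Here is a short sketch that reproduces the report's ranges; the $15 wage and the ratios come from the quoted passage, while the variable and function names are my own.

```python
HOURLY_WAGE = 15.0  # the report's assumed wage, inclusive of benefits and payroll taxes

# Children allowed per staff member: (most stringent state, most lenient state)
CHILDREN_PER_STAFF = {
    "11-month-old": (3, 6),
    "35-month-old": (4, 12),
    "59-month-old": (7, 15),
}

def staff_cost_per_child(children_per_staff, wage=HOURLY_WAGE):
    """Minimum hourly staffing cost per child under a given ratio."""
    return wage / children_per_staff

def cost_range(age_group):
    """Return (lowest, highest) hourly staff cost per child, rounded to cents."""
    stringent, lenient = CHILDREN_PER_STAFF[age_group]
    return (round(staff_cost_per_child(lenient), 2),
            round(staff_cost_per_child(stringent), 2))

for age_group in CHILDREN_PER_STAFF:
    low, high = cost_range(age_group)
    print(f"{age_group}: ${low:.2f} to ${high:.2f} per child per hour")
```

For infants, the gap between the most stringent and most lenient rules works out to $2.50 per child per hour in staffing cost alone.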

Here\’s a figure illustrating the theme.

Staffing requirements aren't the only rules causing variation in child care costs across states, of course. The report notes:

Wages are based on the local labor market demand for the employees’ skills and qualifications, as well as the availability of workers in the field. Regulations that require higher-level degrees or other qualifications drive up the wages required to hire and retain staff, increasing the cost of child care. Though recognizing that some facilities are exempt from these requirements, all States set requirements for minimum ages and qualifications of staff, including some that require a bachelor’s degree for lead child care teachers. Other staff-related regulations that can drive up costs include required background checks and training requirements. In addition to standards regarding staff, many States set minimum requirements for buildings and facilities, including regulating the types and frequency of environmental inspections and the availability of indoor and outdoor space.

The report looks at some studies of the effects of these rules. One study estimates \”that decreasing the maximum number of infants per staff member by one (thereby increasing the minimum staff-to-child ratio) decreases the number of center-based care establishments by about 10 percent. Also, each additional year of education required of center directors decreases the supply of care centers by about 3.5 percent.\” The point, of course, is not that states should all move unquestioningly to lower staffing levels. It\’s that states should question their rules, and look at practices elsewhere, bearing in mind that the costs of rules hit harder for those with lower incomes.

The other approach to making child care more available is to increase the buying power of low-income households with children, which can be done in a variety of ways.  The Economic Report of the President always brags about the current administration, but it was nonetheless interesting to me that it chose to brag about additional support for child care costs of low-income families:

The Trump Administration has mitigated these work disincentives by substantially bolstering child care programs for low-income families. In 2018, the CCDBG [Child Care and Development Block Grant] was increased by $2.4 billion, and this increase was sustained in 2019. The Child Care and Development Fund, which includes CCDBG and other funds, distributed a total of $8.1 billion to States to offer child care subsidies to low-income families who require child care in order to work, go to school, or enroll in training programs. In addition, Federal child care assistance is offered through TANF, Head Start, and other programs.

The report also mentions how programs like the Supplemental Nutrition Assistance Program (SNAP, or "food stamps") and the Earned Income Tax Credit can help to make child care more affordable. It also notes that the Child Tax Credit was increased in the 2017 tax legislation, including "the refundable component of the CTC for those with earnings but no Federal income tax liability." There's also a child and dependent care tax credit.

When it comes to the incentives and opportunities for low-income women to work, child care is of course just one part of the puzzle, and often not the largest part. But it remains a real and difficult hurdle for a lot of households, especially for lower-income women. An additional issue is that some households will prefer formal child care, and thus will benefit more from policies aimed directly at formal child care, while others will rely more on informal networks of family and friends, and will benefit more from policies that increase income that can be used for any purpose.

Geoengineering: The Governance Problem

Solar geoengineering refers to putting stuff in the atmosphere that would have the effect of counteracting greenhouse gases. Yes, there would be risks in undertaking geoengineering. However, those who argue that substantial dangers of climate change are fairly near-term must be willing to consider potentially unpleasant answers. Even if the risks of geoengineering are too substantial right now, given the present state of climate change, if the world as a whole doesn\’t move forward with steps to hold down emissions of greenhouse gases, then perhaps the risks of geoengineering will look more acceptable in a decade or two? 

Thus, the Harvard Project on Climate Agreements and Harvard’s Solar Geoengineering Research Program have combined to publish Governance of the Deployment of Solar Geoengineering, an introduction followed by 26 short essays on the subject. The emphasis on governance seems appropriate to me, because there's not a huge mystery over how to do solar geoengineering. "The method most commonly discussed as technically plausible and potentially effective involves adding aerosols to the lower stratosphere, where they would reflect some (~1%) incoming sunlight back to space." However, one can also imagine more localized versions of solar geoengineering, like making a comprehensive effort so that all manmade structures–including buildings and roads–would be more likely to reflect sunlight.

The question of governance is about who decides when and what would be done. As David Keith and Peter Irvine write in their essay: "[T]he hardest and most important problems raised by solar geoengineering are non-technical."

For example, what if one country or a group of countries decided to deploy geoengineering in the atmosphere? Perhaps that area is experiencing particularly severe weather and the public is demanding that its politicians take action. Perhaps it's the "Greenfinger" scenario, in which a very wealthy person decides that it's up to them to save the planet. Other countries could presumably respond with some mixture of complaints, trade and financial sanctions, counter-geoengineering to reverse the effects of what the first country was doing, or even military force. Thus, it's important to think about governance of institutions that would consider deploying geoengineering, but it may be even more important to think about governance of institutions that would decide how to respond when someone else undertakes geoengineering.

Perhaps we can learn from other international agreements, like those affecting nuclear nonproliferation, cybersecurity, even international monetary policy. But countries and regions are likely to be affected differently by climate change, and thus are likely to weigh the costs and benefits of geoengineering differently. Agreement won't be easy. But as with the nuclear test-ban treaty, one can imagine rules that allow nations to monitor other nations, to see if they are undertaking geoengineering efforts.

A common argument among these authors is that geoengineering will happen. For example, Lucas Stanczyk writes: "Looking at the limited range of options available to mitigate the coming climate crisis, it is difficult to escape the conclusion that some form of solar geoengineering will be deployed on a global scale this century." It's just a question of when and where, and of trying to arrange an institutional set-up where the benefits are more likely to outweigh the costs.

Richard J. Zeckhauser and Gernot Wagner make this point in more detail. They write:
  • Both unchecked climate change and any potential deployment of solar geoengineering (SG) are governed by processes that are currently unknowable; i.e., either is afflicted with ignorance. 
  • Risk, uncertainty, and ignorance are often greeted with the precautionary principle: “do not proceed.” Such inertia helps politicians and bureaucrats avoid blame. However, the future of the planet is too important a consequence to leave to knee-jerk caution and strategic blame avoidance. Rational decision requires the equal weighting of errors of commission and omission. 
  • Significant temperature increase, at least to the 2°C level, is almost certainly in our planet’s future. This makes research on SG a prudent priority, with experimentation to follow, barring red-light findings. …

Consider the decision of whether to enroll in a high-risk medical trial. Faced with a bad case of cancer, the standard treatment is high-dose chemotherapy. Now consider as an alternative treatment an experimental bone-marrow transplant. The additional treatment mortality of the trial, of say 4 percentage points, is surely an important aspect of the decision – but so should be the gain in long-run survival probability. If that estimated gain is greater than 4 percentage points, say 10 or even “only” 6 percentage points, a decision maker with the rational goal of maximizing the likelihood of survival should opt for the experimental treatment.

All too often, however, psychology intervenes, including that of doctors. Errors of commission get weighted more heavily; expected lives are sacrificed. The Hippocratic Oath bans the intention of harm, not its possibility. Its common misinterpretation of “first do no harm” enshrines the bias of overweighing errors of commission. To be sure, errors of commission incur greater blame or self-blame than those of omission when something bad happens, a major source of their greater weight. But blame is surely small potatoes relative to survival, whether of a patient or of the Earth. Hence, we assert once again, italics and all: Where climate change and solar geoengineering are concerned, errors of commission and omission should be weighted equally. 

That also implies that the dangers of SG [solar geoengineering] – and they are real – should be weighed objectively and dispassionately on an equal basis against the dangers of an unmitigated climate path for planet Earth. The precautionary principle, however tempting to invoke, makes little sense in this context. It would be akin to suffering chronic kidney disease, and being on the path to renal failure, yet refusing a new treatment that has had short-run success, because it could have long-term serious side effects that tests to date have been unable to discover. Failure to assiduously research geoengineering and, positing no red-light findings, to experiment with it would be to allow rising temperatures to go unchecked, despite great uncertainties about their destinations and dangers. That is hardly a path of caution.

For an earlier post on this topic, see \”Geoengineering: Forced Upon Us?\” (May 11, 2015).

Time to Abolish "Statistical Significance"?

The idea of \”statistical significance\” has been a basic concept in introductory statistics courses for decades. If you spend any time looking at quantitative research, you will often see in tables of results that certain numbers are marked with an asterisk or some other symbol to show that they are \”statistically significant.\”

For the uninitiated, \”statistical significance\” is a way of summarizing whether a certain statistical result is likely to have happened by chance, or not. For example, if I flip a coin 10 times and get six heads and four tails, this could easily happen by chance even with a fair and evenly balanced coin. But if I flip a coin 10 times and get 10 heads, this is extremely unlikely to happen by chance. Or if I flip a coin 10,000 times, with a result of 6,000 heads and 4,000 tails (essentially, repeating the 10-flip coin experiment 1,000 times), I can be quite confident that the coin is not a fair one. A common rule of thumb has been that if the probability of an outcome occurring by chance is 5% or less–in the jargon, has a p-value of 5% or less–then the result is statistically significant. However, it\’s also pretty common to see studies that report a range of other p-values like 1% or 10%.
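For readers who want to check the coin-flip intuition, here is a minimal sketch that computes the exact probability, for a fair coin, of landing at least as far from an even split as the observed result. The choice of a two-sided test is my assumption (the discussion above does not specify one-sided or two-sided), and the function name is made up for illustration.

```python
from math import comb

def two_sided_p_value(heads, flips):
    """Exact chance that a fair coin lands at least this far from a 50/50 split."""
    distance = abs(heads - flips / 2)
    # Count every sequence whose head count is as extreme as, or more extreme
    # than, the observed one, in either direction.
    extreme_outcomes = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - flips / 2) >= distance
    )
    return extreme_outcomes / 2 ** flips

for heads, flips in [(6, 10), (10, 10), (6000, 10000)]:
    p = two_sided_p_value(heads, flips)
    print(f"{heads} heads in {flips} flips: p = {p:.3g}")
```

Six heads in 10 flips gives a p-value of about 0.75, nowhere near the 5% threshold. Ten heads in 10 flips gives about 0.002, well under it. And 6,000 heads in 10,000 flips gives a p-value so small that it is effectively zero.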

Given the omnipresence of "statistical significance" in pedagogy and the research literature, it was interesting in 2016 when the American Statistical Association issued an official "ASA Statement on Statistical Significance and P-Values," which includes comments like: "Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold. … A p-value, or statistical significance, does not measure the size of an effect or the importance of a result. … By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis."

Now, the ASA has followed up with a special supplemental issue of its journal The American Statistician on the theme "Statistical Inference in the 21st Century: A World Beyond p < 0.05" (January 2019). The issue has a useful overview essay, "Moving to a World Beyond 'p < 0.05,'" by Ronald L. Wasserstein, Allen L. Schirm, and Nicole A. Lazar. They write:

We conclude, based on our review of the articles in this special issue and the broader literature, that it is time to stop using the term “statistically significant” entirely. Nor should variants such as “significantly different,” “p < 0.05,” and “nonsignificant” survive, whether expressed in words, by asterisks in a table, or in some other way. Regardless of whether it was ever useful, a declaration of “statistical significance” has today become meaningless. … In sum, `statistically significant'—don’t say it and don’t use it.

The special issue is then packed with 43 essays from a wide array of experts and fields on the general theme of  \”if we eliminate the language of statistical significance, what comes next?\”

To understand the arguments here, it\’s perhaps useful to have a brief and partial review of some main reasons why the emphasis on \”statistical significance\” can be so misleading: namely, it can lead one to dismiss useful and true connections; it can lead one to draw false implications; and it can cause researchers to play around with their results. A few words on each of these.

The question of whether a result is "statistically significant" is related to the size of the sample. As noted above, 6 out of 10 heads can easily happen by chance, but 6,000 out of 10,000 heads is extraordinarily unlikely to happen by chance. So say that you do a study which finds an effect which is fairly large in size, but where the sample size isn't large enough for it to be statistically significant by a standard test. In practical terms, it seems foolish to ignore this large result; instead, you should presumably start trying to find ways to run the test with a much larger sample size. But in academic terms, the study you just did with its small sample size may be unpublishable: after all, a lot of journals will tend to decide against publishing a study that doesn't find a statistically significant effect–because it feels as if such a study isn't pointing out any new connection or insight.

Knowing that journals are looking to publish "statistically significant" results, researchers will be tempted to look for ways to jigger their results. Studies in economics, for example, aren't about simple probability examples like flipping coins. Instead, one might be looking at Census data on households that can be divided up in roughly a jillion ways: not just the basic categories like age, income, wealth, education, health, occupation, ethnicity, geography, urban/rural, during recession or not, and others, but also various interactions of these factors looking at two or three or more at a time. Then, researchers make choices about whether the connections between these variables should be thought of as linear relationships, curved relationships (curving up or down), relationships that are U-shaped or inverted-U, and others. Now add in all the different time periods and events and places and before-and-after legislation that can be considered. For this fairly basic data, one is quickly looking at thousands or tens of thousands of possible relationships.

Remember that the idea of statistical significance relates to whether something has a 5% probability or less of happening by chance. To put that another way, it's whether something would have happened only one time out of 20 by chance. So if a researcher takes the same basic data and looks at thousands of possible equations, there will be dozens of equations that look as if they had a 5% probability or less of happening by chance, even if nothing real is going on. When there are thousands of researchers acting in this way, there will be a steady stream of hundreds of results every month that appear to be "statistically significant," but are just a result of the general situation that if you look at a very large number of equations one at a time, some of them will seem to mean something. It's a little like flipping a coin 10,000 times, but then focusing only on the few stretches where the coin came up heads five times in a row–and drawing conclusions based on that one small portion of the overall results.
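A small simulation can make this concrete. The sketch below is purely illustrative: it invents 2,000 "specifications," each comparing two groups drawn from the same distribution (so there is truly no effect), and counts how many clear the conventional 5% bar anyway. The sample sizes, the seed, and the use of a simple z-test with known variance are all assumptions of mine, not anything drawn from the research discussed here.

```python
import random
from statistics import NormalDist, fmean

def count_spurious_findings(n_specs=2000, n_obs=50, alpha=0.05, seed=0):
    """Count 'significant' differences between pairs of pure-noise samples."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha is 0.05
    std_err = (2 / n_obs) ** 0.5  # std. error of a difference in means, unit variance
    hits = 0
    for _ in range(n_specs):
        group_a = [rng.gauss(0, 1) for _ in range(n_obs)]
        group_b = [rng.gauss(0, 1) for _ in range(n_obs)]
        z = (fmean(group_a) - fmean(group_b)) / std_err
        if abs(z) > z_crit:
            hits += 1  # looks "statistically significant" but is only noise
    return hits

spurious = count_spurious_findings()
print(f"{spurious} of 2000 no-effect comparisons passed the 5% threshold")
```

With a 5% threshold, roughly 100 of the 2,000 comparisons should pass by luck alone, and a researcher who reported only those would appear to have a long list of findings.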

A classic statement of this issue arises in Edward Leamer\’s 1983 article, \”Taking the Con out of Econometrics\” (American Economic Review, March 1983, pp. 31-43). Leamer wrote:

The econometric art as it is practiced at the computer terminal involves fitting many, perhaps thousands, of statistical models. One or several that the researcher finds pleasing are selected for reporting purposes. This searching for a model is often well intentioned, but there can be no doubt that such a specification search invalidates the traditional theories of inference. … [I]n fact, all the concepts of traditional theory, utterly lose their meaning by the time an applied researcher pulls from the bramble of computer output the one thorn of a model he likes best, the one he chooses to portray as a rose. The consuming public is hardly fooled by this chicanery. The econometrician's shabby art is humorously and disparagingly labelled "data mining," "fishing," "grubbing," "number crunching." A joke evokes the Inquisition: "If you torture the data long enough, Nature will confess" … This is a sad and decidedly unscientific state of affairs we find ourselves in. Hardly anyone takes data analyses seriously. Or perhaps more accurately, hardly anyone takes anyone else's data analyses seriously.

Economists and other social scientists have become much more aware of these issues over the decades, but Leamer was still writing in 2010 (\”Tantalus on the Road to Asymptopia,\” Journal of Economic Perspectives, 24: 2, pp. 31-46):

Since I wrote my “con in econometrics” challenge much progress has been made in economic theory and in econometric theory and in experimental design, but there has been little progress technically or procedurally on this subject of sensitivity analyses in econometrics. Most authors still support their conclusions with the results implied by several models, and they leave the rest of us wondering how hard they had to work to find their favorite outcomes … It’s like a court of law in which we hear only the experts on the plaintiff’s side, but are wise enough to know that there are abundant for the defense. 

Taken together, these issues suggest that a lot of the findings in social science research shouldn't be believed with too much firmness. The results might be true. Or they might be a result of a researcher pulling out "from the bramble of computer output the one thorn of a model he likes best, the one he chooses to portray as a rose." And given the realities of real-world research, it seems goofy to say that a result with, say, only a 4.8% probability of happening by chance is "significant," while if the result had a 5.2% probability of happening by chance it is "not significant." Uncertainty is a continuum, not a black-and-white difference.

So let's accept that the "statistical significance" label has some severe problems, as Wasserstein, Schirm, and Lazar write:

[A] label of statistical significance does not mean or imply that an association or effect is highly probable, real, true, or important. Nor does a label of statistical nonsignificance lead to the association or effect being improbable, absent, false, or unimportant. Yet the dichotomization into “significant” and “not significant” is taken as an imprimatur of authority on these characteristics. In a world without bright lines, on the other hand, it becomes untenable to assert dramatic differences in interpretation from inconsequential differences in estimates. As Gelman and Stern (2006) famously observed, the difference between “significant” and “not significant” is not itself statistically significant.

But as they recognize, criticizing is the easy part. What is to be done instead? And here, the argument fragments substantially. Did I mention that there were 43 different responses in this issue of the American Statistician?

Some of the recommendations are more a matter of temperament than of specific statistical tests. As Wasserstein, Schirm, and Lazar emphasize, many of the authors offer advice that can be summarized in about seven words: \”Accept uncertainty. Be thoughtful, open, and modest.” This is good advice! But a researcher struggling to get a paper published might be forgiven for feeling that it lacks specificity.

Other recommendations focus on the editorial process used by academic journals, which establish some of the incentives here. One interesting suggestion is that when a research journal is deciding whether to publish a paper, the reviewer should only see a description of what the researcher did–without seeing the actual empirical findings. After all, if the study was worth doing, then it\’s worthy of being published, right? Such an approach would mean that authors had no incentive to tweak their results. A method already used by some journals is \”pre-publication registration,\” where the researcher lays out beforehand, in a published paper, exactly what is going to be done. Then afterwards, no one can accuse that researcher of tweaking the methods to obtain specific results.

Other authors agree with turning away from "statistical significance," but in favor of their own preferred tools for analysis: Bayesian approaches, "second-generation p-values," "false positive risk," "statistical decision theory," "confidence index," and many more. With many alternative examples along these lines, the researcher trying to figure out how to proceed can again be forgiven for desiring a little more definitive guidance.

Wasserstein, Schirm, and Lazar also asked some of the authors whether there might be specific situations where a p-value threshold made sense. They write:

\”Authors identified four general instances. Some allowed that, while p-value thresholds should not be used for inference, they might still be useful for applications such as industrial quality control, in which a highly automated decision rule is needed and the costs of erroneous decisions can be carefully weighed when specifying the threshold. Other authors suggested that such dichotomized use of p-values was acceptable in model-fitting and variable selection strategies, again as automated tools, this time for sorting through large numbers of potential models or variables. Still others pointed out that p-values with very low thresholds are used in fields such as physics, genomics, and imaging as a filter for massive numbers of tests. The fourth instance can be described as “confirmatory setting[s] where the study design and statistical analysis plan are specified prior to data collection, and then adhered to during and after it” …  Wellek (2017) says at present it is essential in these settings. “[B]inary decision making is indispensable in medicine and related fields,” he says. “[A] radical rejection of the classical principles of statistical inference…is of virtually no help as long as no conclusively substantiated alternative can be offered.”

The deeper point here is that there are situations where a researcher or a policy-maker or an economic actor needs to make a yes-or-no decision. When doing quality control, is it meeting the standard or not? When the Food and Drug Administration is evaluating a new drug, does it approve the drug or not? When a researcher in genetics is dealing with a database that has thousands of genes, there's a need to focus on a subset of those genes, which means making yes-or-no decisions on which genes to include in a certain analysis.

Yes, the scientific spirit should \”Accept uncertainty. Be thoughtful, open, and modest.” But real life isn\’t a philosophy contest. Sometimes, decisions need to be made. If you don\’t have a statistical rule, then the alternative decision rule becomes human judgment–which has plenty of cognitive, group-based, and political biases of its own.

My own sense is that "statistical significance" would be a very poor master, but that doesn't mean it's a useless servant. Yes, it would be foolish and potentially counterproductive to give excessive weight to "statistical significance." But the clarity of conventions and rules, when their limitations are recognized and acknowledged, can still be useful. I was struck by a comment in the essay by Steven N. Goodman:

P-values are part of a rule-based structure that serves as a bulwark against claims of expertise untethered from empirical support. It can be changed, but we must respect the reason why the statistical procedures are there in the first place … So what is it that we really want? The ASA statement says it; we want good scientific practice. We want to measure not just the signal properly but its uncertainty, the twin goals of statistics. We want to make knowledge claims that match the strength of the evidence. Will we get that by getting rid of P−values? Will eliminating P−values improve experimental design? Would it improve measurement? Would it help align the scientific question with those analyses? Will it eliminate bright line thinking? If we were able to get rid of P-values, are we sure that unintended consequences wouldn’t make things worse? In my idealized world, the answer is yes, and many statisticians believe that. But in the real world, I am less sure.

What Did Gutenberg\’s Printing Press Actually Change?

There\’s an old slogan for journalists: \”If your mother says she loves you, check it out.\” The point is not to be  too quick to accept what you think you already know.

In a similar spirit, I of course know that the introduction of a printing press with moveable type to Europe in 1439 by Johannes Gutenberg is often called one of the most important inventions in world history. However, I'm grateful that Jeremiah Dittmar and Skipper Seabold have been checking it out. They have written "Gutenberg’s moving type propelled Europe towards the scientific revolution," for the LSE Business Review (March 19, 2019). It's a nice accessible version of the main findings from their research paper, "New Media and Competition: Printing and Europe's Transformation after Gutenberg" (Centre for Economic Performance Discussion Paper No. 1600, January 2019). They write:

"Printing was not only a new technology: it also introduced new forms of competition into European society. Most directly, printing was one of the first industries in which production was organised by for-profit capitalist firms. These firms incurred large fixed costs and competed in highly concentrated local markets. Equally fundamentally – and reflecting this industrial organisation – printing transformed competition in the ‘market for ideas’. Famously, printing was at the heart of the Protestant Reformation, which breached the religious monopoly of the Catholic Church. But printing’s influence on competition among ideas and producers of ideas also propelled Europe towards the scientific revolution. While Gutenberg’s press is widely believed to be one of the most important technologies in history, there is very little evidence on how printing influenced the price of books, labour markets and the production of knowledge – and no research has considered how the economics of printing influenced the use of the technology."

Dittmar and Seabold aim to provide some of this evidence. For example, here's their data on how the price of 200 pages changed over time, measured in terms of daily wages. (Notice that the left-hand axis uses a logarithmic scale.) The price of a book went from weeks of wages to much less than one day's wages.

They write: \”Following the introduction of printing, book prices fell steadily. The raw price of books fell by 2.4 per cent a year for over a hundred years after Gutenberg. Taking account of differences in content and the physical characteristics of books, such as formatting, illustrations and the use of multiple ink colours, prices fell by 1.7 per cent a year. … [I]n places where there was an increase in competition among printers, prices fell swiftly and dramatically. We find that when an additional printing firm entered a given city market, book prices there fell by 25%. The price declines associated with shifting from monopoly to having multiple firms in a market was even larger. Price competition drove printers to compete on non-price dimensions, notably on product differentiation. This had implications for the spread of ideas.\”
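Annual declines of 2.4 or 1.7 percent may sound modest, but they compound. As a rough sketch, assuming the decline ran at a constant rate for exactly 100 years (the authors say "over a hundred years," so this is my simplification, and the function name is my own), the share of the original price that remains can be computed as follows:

```python
def remaining_price_share(annual_decline, years):
    """Fraction of the starting price left after a constant yearly decline."""
    return (1 - annual_decline) ** years

raw_share = remaining_price_share(0.024, 100)       # raw book prices
adjusted_share = remaining_price_share(0.017, 100)  # adjusted for content and format

print(f"Raw price after 100 years: {raw_share:.1%} of the starting price")
print(f"Quality-adjusted price after 100 years: {adjusted_share:.1%} of the starting price")
```

Under those assumptions, raw book prices end the century at roughly 9 percent of their starting level, and quality-adjusted prices at roughly 18 percent.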

Another part of this change was that books were produced for ordinary people in the language they spoke, not just in Latin. Another part was that wages for professors at universities rose relative to the average worker, and the curriculum of universities shifted toward the scientific subjects of the time like \”anatomy, astronomy, medicine and natural philosophy,\” rather than theology and law.
The ability to print books affected religious debates as well, like the spread of Protestant ideas after Martin Luther circulated his 95 theses criticizing the Catholic Church in 1517.

Printing also affected the spread of technology and business.

Previous economic research has studied the extensive margin of technology diffusion, comparing the development of cities that did and did not have printing in the late 1400s …  Printing provided a new channel for the diffusion of knowledge about business practices. The first mathematics texts printed in Europe were ‘commercial arithmetics’, which provided instruction for merchants. With printing, a business education literature emerged that lowered the costs of knowledge for merchants. The key innovations involved applied mathematics, accounting techniques and cashless payments systems.
The evidence on printing suggests that, indeed, these ideas were associated with significant differences in local economic dynamism and reflected the industrial structure of printing itself. Where competition in the specialist business education press increased, these books became suddenly more widely available and in the historical record, we observe more people making notable achievements in broadly bourgeois careers.

It is impossible to avoid wondering if economic historians in 50 or 100 years will be looking back on the spread of internet technology, and how it affected patterns of technology diffusion, human capital, and social beliefs–and how differing levels of competition in the market may affect these outcomes.