Bill Kerr interviews Erica Groshen, who ran the Bureau of Labor Statistics from 2013 to 2017, about how to improve labor market statistics in particular and government statistics overall, in a Harvard Business School podcast titled “Infrastructure: Upgrading the US labor statistics system” (June 30, 2021, audio and a transcript available).

A lot of economic data has traditionally been collected by household surveys: that is, asking people what they earn, how many hours they work per week, how they spend their money, what government benefits they receive, and so on. These surveys are carefully designed and conducted, but at the end of the day, they have an irreducible amount of measurement error, because they rely on people to remember and report accurately. Thus, there has been a shift in recent decades to “administrative” data–that is, data collected by government agencies for other purposes like taxes or Social Security records–which seems likely to be more accurate than household surveys. In this spirit, Groshen points out that the unemployment insurance system could be expanded and converted into a way of gathering much more accurate data on labor markets and jobs. She notes:

[S]tate unemployment insurance agencies that, as part of running the program, collect worker wage records every quarter from every employer that lists the wages of workers for every month during that quarter. They also collect claims records from people who apply for claims. And these data are generally not available to BLS to augment or replace its current data collections. And that’s basically a shame, because it would be quite useful for statistical purposes. And employers, of course, have to report the same or slightly different data to a number of different government agencies. Our economics statistics are also not as good as they could be as a consequence of this. UI [unemployment insurance] wage records include who the person’s employer is and their earnings—that’s what’s in there. They should have job title, because that is closely associated with the person’s occupation. … And this would enable us to track workforce supply and demand much more closely, make better projections about the future of work. You also would want the number of hours worked for the wages that are being reported so that you know if someone is full time or part time, so you can get hourly rates, and really follow that dimension on which wages vary. Another thing you want is the actual work location of the people … And then, the last thing, particularly in these times of understanding demographic inequities—racial inequities, in particular, but also gender inequities, things like that—you want to have demographics so that you can track social justice issues and advances and understand how the world of work is affecting demographic outcomes. These data should also, of course, be curated—by which I mean, they have to clean them up so that you can really analyze them and made accessible to the statistical agencies, for particular with the BLS, so that they can create better statistics. 
You could get better, cheaper, and more-frequent program-policy evaluations so that policy makers could make better decisions.

What are some of the payoffs of this approach? One set of gains is that improved data can improve public policy. As a recent example from the pandemic, Groshen points to the unemployment insurance initial claims releases:

I’ll give you an example of one of the things that happens when you don’t have the attention of national statisticians to administrative data. You can think about the unemployment insurance initial claims releases. There’s been a lot of attention to that during the course of the pandemic, because it’s some of the most-timely data and very closely associated with what was going on in the labor market. But those are administrative totals. They are not constructed to be economic indicators. And most of the people paying attention to them were looking for an economic indicator. The solution is clear, which is to have BLS partner with the unemployment insurance system and take over production of the creation of economic indicators from this inputted information, a new program that takes advantage of the skills of a national statistical agency to input that data and create an economic indicator that wouldn’t require all of the journalists and all of the economists everywhere else to say, “Well, let me make this adjustment; maybe that’ll tell me what’s really going on.”

Another payoff, further in the future, is that workers with access to their own personal records could have a work history authenticated by the records of their past employers.

They want to come up with a mechanism to provide workers with portable, authoritative job records that they could tap into for applications, for jobs, for educations, for UI benefits, and other public programs as well. Anytime when they say, “What’s your work history?” the worker would be able to plug in their ID and their password or something like that, and they’d have an authenticated work history with skills information and duration and other information on it. 

As another example of a gap in labor market data that I’ve become aware of in my own reading, we know very little about workplace skills and the extent of on-the-job training. As Groshen says:

We know how to count years of education, and we have some ideas on quality. And we know test scores and things like that. But we actually don’t track workplace skills very well—either on the micro or the macro level—for individuals or for the country or groups as a whole. A very large component of skills training is done by employers. And we have no good measures of that. We don’t know who gets what kind of training. We don’t know what our skill gaps are. And so how can we be making the right decisions if we don’t have that information? The last authoritative source of employer-provided training was by the BLS, and it was done in 1995. Congress hasn’t funded it since then. 

Given that government statistics are produced as part of a bureaucracy, one way to produce better statistics is to improve the status of statisticians in the hierarchy. Groshen suggests:

There are many steps we can take to strengthen the independence of statistical agencies. And one of them would be to put them in their own agency outside of the control of a member of the cabinet, have it be headed up by the national statistician of the U.S.—and that’s a job that exists, but right now, the national statistician of the U.S. has an office of about seven people in OMB [the Office of Management and Budget]. So the alternative is put the national statistician actually in charge of the statistical agencies and move them away from reporting to the people who are in charge of policy directly.

There’s more in the interview about the productivity slowdown, the current churning in US labor markets and other topics.