The Referee Arrives: Pre-Registering the Discontinuity Thesis Against Stanford’s AI Indicators

The Stanford Digital Economy Lab is building something the AI labour debate has never had: a referee.

The AI Economic Indicators project, currently in pre-launch, will publish regularly updated measures of how AI is changing hiring patterns, productivity, and consumer surplus, alongside measures of AI usage itself. The project builds on the Canaries in the Coal Mine study, which found a 16 percent relative employment decline among 22-to-25-year-olds in AI-exposed occupations, and on the Lab’s collaboration with ADP, which gives it something no pundit has: payroll microdata covering tens of millions of workers, updated continuously, analysed by the most credentialed labour-and-technology team in the world.

This matters because the entire AI labour discourse currently runs on vibes. Optimists cite adoption curves. Pessimists cite layoff announcements. Everyone cites anecdotes. Every framework is compatible with every headline, because no framework has ever been forced to say in advance what data would prove it wrong.

That era can end the day this platform launches. Whether it ends depends on whether anyone has the nerve to file predictions before the data arrives.

So I will go first.

A prediction filed after the dashboard goes live is an interpretation. A prediction filed before is a test.

The AI discourse is diseased with post-hoc fitting. Every quarterly data release gets absorbed by every theory within 48 hours. Strong jobs report: the optimists claim vindication and the pessimists say the effects haven’t arrived yet. Weak entry-level numbers: the pessimists claim vindication and the optimists blame interest rates. Both moves are always available, which means neither side’s framework is doing any work.

The only escape is pre-registration: state the direction, the series, the threshold, and the window, in public, with a date on it, before the numbers exist. Then let the instrument decide.
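What a pre-registered prediction has to contain can be written down as a small record. The Python sketch below is illustrative only: the class, its field names, the series label, and the filing and launch dates are assumptions made for the example, not anything the Stanford platform or the thesis defines. It shows the two checks that matter: the prediction must be filed before the data exists, and it is scored only once its window has closed.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Prediction:
    """One pre-registered claim. All names here are hypothetical."""
    series: str        # which published series the claim is about
    direction: str     # "up" or "down"
    threshold: float   # minimum move in that direction that counts as confirmation
    window_end: date   # date by which the move must be visible
    filed_on: date     # date the prediction was made public

    def __post_init__(self) -> None:
        if self.direction not in ("up", "down"):
            raise ValueError("direction must be 'up' or 'down'")

    def is_preregistered(self, first_data_release: date) -> bool:
        # A claim is a test only if it was filed before the data existed.
        return self.filed_on < first_data_release

    def evaluate(self, observed_change: float, observed_on: date) -> str:
        # No verdict before the window closes.
        if observed_on < self.window_end:
            return "pending"
        # Flip the sign so that "a move in the predicted direction" is positive.
        signed = observed_change if self.direction == "up" else -observed_change
        return "confirmed" if signed >= self.threshold else "refuted"


# Placeholder values: the filing date and the threshold are invented for the example.
example = Prediction(
    series="relative_entry_level_employment",
    direction="down",
    threshold=0.01,
    window_end=date(2027, 12, 31),
    filed_on=date(2025, 1, 1),
)
```

The record is frozen on purpose: once filed, a prediction should not be editable, which is the whole point of putting a date on it.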

The Discontinuity Thesis has published its refutation conditions since version 3.3. This essay attaches them to a specific external instrument, run by a lab with no stake in my framework and every institutional inclination toward the optimist reading. The Stanford Digital Economy Lab is Erik Brynjolfsson’s shop. He coined the Turing Trap, co-wrote Race Against the Machine, and recently declared in the FT that the AI productivity take-off is finally visible. If the data ends up supporting a thesis about the death of the wage-demand circuit, it will not be because the referee was friendly.

The thesis holds that AI plus thin human verification has crossed the cost-quality threshold for a large and growing class of cognitive work, that competitive pressure makes adoption compulsory at every level, and that displacement arrives not as mass firing but as non-absorption: the quiet failure to hire the next cohort. From those mechanisms, the following predictions. Each one names a direction, and where the data allows, a magnitude and a window.

Prediction 1: The entry-level gap widens. Relative employment of workers aged 22 to 25 in AI-exposed occupations continues to decline against less-exposed occupations and against older workers in the same occupations. The Canaries baseline was a 16 percent relative decline from late 2022 to mid 2025. I predict the gap does not close, and widens further by end of 2027. This is the thesis’s flagship series.

Prediction 2: The ladder freezes from the bottom. Junior-to-senior composition in exposed cognitive sectors (software, legal services, accounting, marketing, customer operations) compresses. New hiring shifts toward experienced workers; the share of total hires going to early-career workers in these sectors falls year over year. Incumbents hold their seats while entrances narrow.

Prediction 3: Headline employment stays boring. Aggregate unemployment remains unremarkable throughout. This is the prediction most people will misread, so let me be plain: the Discontinuity Thesis predicts calm headline numbers. Displacement by non-absorption does not show up as unemployment spikes. It shows up as the gap between a fine-looking aggregate and a collapsing entry tier. If you are waiting for the unemployment rate to confirm or refute anything, you are watching the wrong dial, and so is every journalist who writes “AI jobs apocalypse fails to materialise” over a 4-point-something headline rate.

Prediction 4: The decoupling widens. Measured productivity in AI-exposed sectors accelerates, as Brynjolfsson is already reporting, while median wage growth in those same sectors lags it. The productivity-pay gap, the post-1973 chart everyone knows, steepens rather than closes. Gains accrue to capital and to a thin senior tier; they do not propagate down the wage structure, because the bargaining mechanism that once propagated them required labour to be scarce.

Prediction 5: Consumer surplus shows enormous gains. The Lab’s GDP-B style measures will show consumers capturing massive uncompensated value from AI tools. I am filing this prediction specifically so that nobody can use it against the thesis later. Consumption gains are not a refutation of the Discontinuity Thesis. The thesis distinguishes consumption continuity from system continuity: people eating well while becoming economically unnecessary is not the system surviving, it is the successor system arriving. When the headlines say AI is making everyone better off, check which column the betterment is in. If it is in the consumer-surplus column while the wage-participation columns decay, that is the thesis confirmed, not contradicted.

Prediction 6: Usage broadens from chat to agents. AI usage metrics shift from conversational assistance toward agentic and workflow execution: tools that operate software, complete multi-step tasks, and produce finished work product. This is the interface-collapse mechanism becoming visible in usage data. Watch the ratio of “AI helped me draft” to “AI did the task.”

Prediction 7: The rate-cut test. This is the sharpest one, and it is a genuine hostage to fortune. The standard rebuttal to the Canaries finding is that high interest rates, not AI, suppressed junior hiring. Fine. Rates will eventually fall. When they do, cyclically depressed hiring should recover broadly. The thesis predicts a two-speed recovery: hiring rebounds in less-exposed occupations and for experienced workers, while entry-level hiring in AI-exposed occupations recovers only partially or not at all. If the rate cycle turns and the exposed juniors are left behind in the recovery, the interest-rate explanation is dead. If exposed entry-level hiring snaps back fully with the cycle, the thesis takes serious damage. Stanford’s own follow-up, Canaries, Interest Rates, and Timing, is already fighting this decomposition battle. The recovery will settle it.
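The rate-cut test in Prediction 7 reduces to comparing two recovery ratios. The Python sketch below is one possible way to score it, under stated assumptions: the function names, the 0.9 cut-off for "full" recovery, and the idea of measuring recovery as the share of the peak-to-trough drop regained are all choices made for the example, not definitions taken from the Stanford data or from the thesis.

```python
def recovery_ratio(baseline_hires: float, trough_hires: float,
                   current_hires: float) -> float:
    """Share of the baseline-to-trough hiring drop that has been regained.

    1.0 means hiring is back at its pre-downturn baseline; 0.0 means it is
    still at the trough.
    """
    lost = baseline_hires - trough_hires
    if lost <= 0:
        raise ValueError("baseline must exceed trough")
    return (current_hires - trough_hires) / lost


def classify_recovery(exposed_junior: float, comparison: float,
                      full: float = 0.9) -> str:
    """Score the rate-cut test from two recovery ratios.

    `exposed_junior` is the ratio for entry-level hiring in AI-exposed
    occupations; `comparison` is the ratio for less-exposed occupations and
    experienced workers. The `full` cut-off is a placeholder assumption.
    """
    if comparison < full:
        # The cycle has not turned broadly yet, so the test says nothing.
        return "not informative"
    if exposed_junior >= full:
        # Exposed juniors recovered with everyone else.
        return "full snap-back"
    # Broad recovery, exposed juniors left behind (partially or entirely).
    return "two-speed"
```

For example, `classify_recovery(recovery_ratio(100, 70, 85), recovery_ratio(100, 80, 100))` compares a group that has regained half its lost hiring with one that has regained all of it. The first branch matters as much as the other two: until the comparison group has recovered, neither side gets to claim the result.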

Symmetry, in writing, in advance:

If the relative entry-level employment gap between AI-exposed and less-exposed occupations closes and holds for two consecutive quarters, the thesis takes structural damage.

If junior-to-senior hiring ratios in exposed cognitive sectors stabilise or recover through 2027 and 2028, the thesis takes structural damage.

If a rate-cut cycle produces full recovery of exposed entry-level hiring, the displacement mechanism is substantially weakened and I will say so.

If wage growth in exposed sectors re-couples with sector productivity, the bargaining-collapse claim fails.

If review and verification roles scale into mass career ladders, growing in proportion to AI-mediated output and feeding workers upward into seniority, the verifier-compression argument fails.

Any of these, sustained, and I concede ground publicly, on this Substack, under this title. The thesis already maintains published refutation conditions and a standing prize for structural refutation. This essay pegs those conditions to an external instrument so that nobody, including me, gets to adjudicate their own framework.
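The first condition above, "closes and holds for two consecutive quarters," is mechanical enough to write as a check. The Python sketch below is an illustration under assumptions: the function name, the sign convention (a positive gap means exposed juniors are behind), and the `tolerance` parameter for what counts as "closed" are all introduced for the example, and the quarterly readings in the usage lines are invented.

```python
from typing import Sequence


def gap_closed_and_held(quarterly_gap: Sequence[float],
                        tolerance: float = 0.0,
                        quarters: int = 2) -> bool:
    """True if the gap sits at or below `tolerance` for `quarters` readings in a row.

    `quarterly_gap` is the relative entry-level employment gap per quarter,
    oldest first, with positive values meaning exposed juniors are behind.
    """
    run = 0
    for gap in quarterly_gap:
        # Extend the streak on a closed reading, reset it otherwise.
        run = run + 1 if gap <= tolerance else 0
        if run >= quarters:
            return True
    return False


# Invented readings, starting from a 16 percent gap.
held = gap_closed_and_held([0.16, 0.10, 0.0, 0.05, 0.0, 0.0])
blip = gap_closed_and_held([0.16, 0.0, 0.12, 0.0, 0.09])
```

The streak requirement is what separates a closure from a noisy quarter: a single reading at zero followed by a reopening does not trigger the concession, and a sustained one does.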

Three blind spots in the instrument, named now so they cannot be deployed as dodges later. ADP covers US private formal payroll. It does not see contractors and gig substitution, which is one of the thesis’s own predicted displacement channels and will make payroll data understate the effect. It does not see offshore substitution. And occupation-level exposure classifications are contestable at the margins. I commit to arguing within the instrument’s frame anyway, supplementing rather than escaping it.

Here is the actual purpose of this essay.

Every labour-market framework currently in circulation should file its predictions against this platform before it launches. Not takes. Predictions: series, direction, threshold, window, and the readings that would falsify the framework. If your theory cannot produce that list, you do not have a theory. You have a vibe with citations.

To the new-task school (Autor and descendants): your framework says technology destroys tasks and creates new ones, and that the new ones have historically absorbed displaced labour at scale. Name the series. Where, in the Stanford data, will new mass-employment task categories show up, by when, and at what wage tier? What absorption rate, sustained for how long, would falsify the claim that the ladder regrows? Eighty years of census data earned you the right to be taken seriously. It did not earn an exemption from forecasting.

To the augmentation school (co-intelligence, centaurs, humans-in-the-loop): your claim is that human-AI complementarity is a destination rather than a corridor. Name the series. If augmentation is stable, the wage premium for AI-using workers should persist and propagate, junior workers should be absorbed as augmented juniors, and the verifier tier should grow into careers. State the numbers at which you would concede that augmentation was the transition phase of replacement, not its alternative.

To the normal-technology school (slow diffusion, organisational friction, decades not years): your claim is fundamentally about slope. Name the slope. What rate of change in the entry-level gap, the junior ratio, or sectoral wage share is consistent with “normal,” and what rate would force you to admit this is not normal? A theory of gradualism that never specifies the gradient is unfalsifiable by construction.

To the steering school (direction-of-technology, pro-worker AI by choice): your claim is that policy and institutional choices can direct AI toward complementing labour. Name the intervention, the jurisdiction, and the series that will move if steering works. If five years of steering advocacy produce no divergence between steered and unsteered labour markets, what follows?

And to the optimists who believe AI will rebuild the middle class: the platform will track exactly the variables your scenario requires. Wage compression between median and top earners. Expertise democratisation showing up as earnings gains for less-credentialed workers. File it. Dates and thresholds.

The Stanford Indicators are about to make this discourse falsifiable for the first time. Most participants will respond by continuing to interpret. The ones worth reading will respond by predicting. You will be able to tell who is who within a week of launch, because interpretation can be filed any time, and prediction can only be filed now.

I expect to be accused of confidence. Guilty. The mechanisms have been public for over a year: unit cost dominance, compulsory adoption under competitive pressure, displacement through non-absorption, calm aggregates over a collapsing entry tier. The Canaries study found the first predicted signal in the first place the thesis said to look. The productivity take-off Brynjolfsson reports is the other blade of the same scissors: output up, juniors out, wages flat. Every series this platform will publish has a predicted direction in the framework, written down before the platform existed.

That is either a theory about to be confirmed in public, on a Stanford dashboard, in quarterly instalments, or a theory about to be dismantled by the best labour data ever assembled. I am content with either outcome, because each is worth more than another year of vibes.

The referee arrives shortly. The predictions are filed. The window for joining me closes at launch.

Your move.

https://digitaleconomy.stanford.edu/project/indicators/