Cost of Work and the Dashboard Ceiling

Some CEOs are starting to measure per-employee token consumption as a productivity signal. The intent is right. The instrument is wrong. There is a better metric — and it is the one that finally puts

Jun 29, 2026

Goldman Sachs Research projected this spring that global AI token consumption will reach 120 quadrillion per month by 2030, a 24x increase driven primarily by autonomous enterprise workflows. The number is large enough to be abstract. Spread across the planet, it works out to roughly 14 million tokens per person per month — more than 100 full-length novels’ worth of text, generated every thirty days, mostly by agents that no one has yet learned to govern.
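The per-person figure can be checked with a few lines of arithmetic. The sketch below assumes a 2030 world population of about 8.5 billion, which is an outside assumption and not part of the projection as quoted here; the variable names are illustrative.

```python
# Back-of-envelope check of the per-person figure.
# ASSUMPTION: ~8.5 billion people in 2030 (not taken from the quoted projection).
TOKENS_PER_MONTH_2030 = 120e15   # 120 quadrillion, as quoted above
GROWTH_MULTIPLE = 24             # "24x increase", as quoted above
WORLD_POPULATION_2030 = 8.5e9    # assumed

tokens_per_person = TOKENS_PER_MONTH_2030 / WORLD_POPULATION_2030
implied_baseline = TOKENS_PER_MONTH_2030 / GROWTH_MULTIPLE

print(f"Tokens per person per month: {tokens_per_person / 1e6:.1f} million")
print(f"Implied current volume: {implied_baseline / 1e15:.0f} quadrillion per month")
```

Under that population assumption the result lands close to the 14 million quoted, and the 24x multiple implies a current baseline of about 5 quadrillion tokens per month.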

The number is real. The reaction in some executive teams is the part worth examining.

A pattern has emerged in the last few months that some are calling Tokenmaxxing. The CEO asks how many tokens each employee is consuming. The CIO produces a dashboard. The teams using more tokens are rewarded as “AI-fluent.” The teams using fewer are flagged as “behind on adoption.” Token consumption becomes a productivity proxy, and within a quarter, the organization has built an entire performance signal around a metric that measures activity, not value.

The instinct underneath Tokenmaxxing is correct. There is a new resource being consumed, and it does need to be managed. The problem is the resolution. Counting tokens per employee is the equivalent of measuring developer productivity by lines of code, or analyst productivity by hours logged. The metric is easy. The metric is also wrong.

There is a sharper version of the same instinct emerging in some of the more disciplined practices. Rather than counting tokens, they ask whether the value each token returns outpaces what was spent to generate it. This is sometimes called token yield. Token yield is closer to right. At the per-token level, it is the right metric — it asks the right question about the cost of generating intelligence as a utility.

But token yield is still measured at the wrong layer. The token is not the unit of business value. The unit of business value is a piece of work completed. A reconciliation closed. A decision rendered. A customer issue resolved. An invoice posted. Token yield optimizes the cost of producing the intermediate output (the tokens). It does not measure the cost of producing the business outcome the tokens were generated in service of.

That is where the actual metric lives. And it is the metric this edition is about.

Cost of Work

The Cost of Work is the unit economics of getting a specific business outcome delivered. Not per token. Not per query. Not per employee. Per outcome.

Concretely: if your accounts payable function reconciles vendor exceptions at a fully-loaded cost of $47 per exception when humans do it, and the same exception costs $3 to reconcile when the workflow is AI-led, the Cost of Work for that workflow has moved from $47 to $3. That is the number. It is denominated in dollars per unit of work, it is comparable across human and AI delivery, and it is comparable across time as the AI asset matures.
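As a minimal sketch of that calculation, Cost of Work is total fully-loaded cost divided by outcomes delivered. The monthly volumes and cost totals below are hypothetical, chosen only to reproduce the $47 and $3 figures; the function name is illustrative.

```python
def cost_of_work(total_fully_loaded_cost: float, outcomes_delivered: int) -> float:
    """Dollars per completed business outcome."""
    if outcomes_delivered <= 0:
        raise ValueError("outcomes_delivered must be positive")
    return total_fully_loaded_cost / outcomes_delivered

# HYPOTHETICAL monthly figures for vendor-exception reconciliation.
human_led = cost_of_work(total_fully_loaded_cost=47_000, outcomes_delivered=1_000)
ai_led = cost_of_work(total_fully_loaded_cost=3_000, outcomes_delivered=1_000)
reduction = (human_led - ai_led) / human_led

print(f"Human-led: ${human_led:.2f} per exception")
print(f"AI-led:    ${ai_led:.2f} per exception")
print(f"Reduction: {reduction:.1%}")
```

The denominator is completed outcomes, not queries or tokens, which is what keeps the number comparable across human and AI delivery.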

This is the metric that does what Tokenmaxxing was trying to do and what token yield only approximates. It measures whether the AI investment is producing economic value, in a unit of measure the CFO already uses for every other operating decision.

The reason Cost of Work matters specifically — beyond being a better metric than the alternatives — is that it operationalizes the reframe from the second edition. AI is capital, not software, only if the capital actually delivers a return. Capital is measured by return, and the return of an AI capital base is the work it produces at a lower marginal cost than the alternatives. Cost of Work is the return calculation. Without it, the capital framing remains rhetorical. With it, the capital framing becomes governable.

There is a specific named cost that Cost of Work surfaces and the activity-layer metrics cannot see. It is the cost I introduced in the last capital-framing edition: Service Debt. The hidden labor that scales with volume — the analyst reconciling exceptions, the coordinator moving data between systems, the reviewer checking output before it ships. Service Debt is invisible in tool-cost accounting because it lives in headcount, not in the AI’s line item. It is fully visible in Cost of Work, because Cost of Work measures the total cost of producing the outcome — software, infrastructure, and the human work that surrounds them.

When the Cost of Work for a workflow falls from $47 to $3, what has actually happened is that the AI has retired the Service Debt associated with that workflow. The tool cost has gone up; the labor cost has gone down by a much larger amount; the unit economics has improved. The CFO who looks only at the tool’s line item sees the cost going up. The CFO who looks at the Cost of Work sees the asset compounding. Same workflow, opposite read.
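The two reads can be put side by side. The sketch below splits each per-unit cost into a tool component and a labor component. The split is hypothetical (only the $47 and $3 totals come from the text), and the class and field names are illustrative.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UnitCost:
    tool: float   # software and infrastructure, dollars per outcome
    labor: float  # surrounding human work (Service Debt), dollars per outcome

    @property
    def total(self) -> float:
        return self.tool + self.labor

# HYPOTHETICAL breakdown; only the totals ($47 and $3) come from the text.
before = UnitCost(tool=0.50, labor=46.50)
after = UnitCost(tool=2.00, labor=1.00)

tool_line_change = after.tool - before.tool        # what the tool line item shows
cost_of_work_change = after.total - before.total   # what Cost of Work shows

print(f"Tool line item: {tool_line_change:+.2f} per outcome")
print(f"Cost of Work:   {cost_of_work_change:+.2f} per outcome")
```

The first number is positive and the second is negative: the same workflow, read through two different line items.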

Autonomy Ratio

Cost of Work is necessary but not sufficient on its own. A workflow can show good Cost of Work numbers in a pilot and still fail to scale, because the pilot ran on a narrow slice of cases and only some percentage of the workflow actually executes without human intervention. The rest still requires a human in the loop, and the human-in-the-loop work is what eats the economics in production.

The companion metric is Autonomy Ratio: the percentage of cases in the workflow that run lights-out, completed end-to-end without human intervention, including exceptions and edge cases.
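One way to compute this is sketched below, under the assumption that each completed case is logged with a count of human touches. That record shape is hypothetical, not something this edition specifies.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CaseRecord:
    case_id: str
    human_touches: int  # HYPOTHETICAL field: reviews, corrections, handoffs

def autonomy_ratio(cases: list[CaseRecord]) -> float:
    """Share of cases completed end-to-end with zero human intervention."""
    if not cases:
        raise ValueError("no cases to measure")
    lights_out = sum(1 for c in cases if c.human_touches == 0)
    return lights_out / len(cases)

# Illustrative sample: 3 of 4 cases needed no human touch.
sample = [
    CaseRecord("A-1", 0),
    CaseRecord("A-2", 0),
    CaseRecord("A-3", 2),
    CaseRecord("A-4", 0),
]
print(f"Autonomy Ratio: {autonomy_ratio(sample):.0%}")
```

The denominator should be every case the workflow receives, exceptions and edge cases included, not only the cases a pilot was scoped to handle.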

A workflow with 95% Autonomy Ratio at $3 per unit is a real AI program. A workflow with 25% Autonomy Ratio at $3 per unit is a pilot that worked on the easy cases and stops working when the harder ones arrive. The Cost of Work number alone cannot tell you the difference. The pair can.

Together, Cost of Work and Autonomy Ratio form the measurement pair that puts AI on the balance sheet. Cost of Work measures the unit economics. Autonomy Ratio measures the durability of those economics under real operating conditions. A CAIO who can quote both numbers for any workflow in the AI portfolio has a program. A CAIO who cannot quote either has an experiment.

This is not a small distinction. Most AI programs today are running on the implicit assumption that the pilot’s Cost of Work will hold at scale. It almost never does. The Autonomy Ratio at pilot stage is usually well above what it will be in production, because the pilot was selected for tractable cases. As the workflow widens to include harder cases, the Autonomy Ratio falls, and the Cost of Work rises with it. The economics that justified the program quietly erode. Six months later, the program is still running and still being reported on. The dashboards are still green. The CFO is still being told the value is coming.
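The erosion can be illustrated with a toy blended-cost model. This is an assumption, not a formula from this edition: it treats every lights-out case as costing $3 and every case that needs a human as costing the original human-led rate of $47. Real workflows will have messier cost structures.

```python
def blended_cost_of_work(autonomy_ratio: float,
                         lights_out_cost: float = 3.0,
                         human_handled_cost: float = 47.0) -> float:
    """ASSUMED model: weighted average of lights-out and human-handled cost."""
    if not 0.0 <= autonomy_ratio <= 1.0:
        raise ValueError("autonomy_ratio must be between 0 and 1")
    return (autonomy_ratio * lights_out_cost
            + (1.0 - autonomy_ratio) * human_handled_cost)

# Watch the per-outcome cost climb as the workflow widens to harder cases.
for ratio in (0.95, 0.75, 0.50, 0.25):
    print(f"Autonomy {ratio:.0%}: ${blended_cost_of_work(ratio):.2f} per outcome")
```

Under these assumptions, most of the per-outcome saving is gone by the time the Autonomy Ratio reaches 25%, even though the cost of each lights-out case never changed.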

It is not coming. It is being eaten by the cases the pilot did not test.

The Dashboard Ceiling

There is a pattern that explains why most enterprises hit a wall here, and the pattern has a name. I have been calling it the Dashboard Ceiling.

For fifteen years, enterprises have been told that the way to measure technology investment is through dashboards — adoption rates, query volumes, user counts, license utilization, system uptime. These metrics worked well when the technology was reporting infrastructure consumed by humans. They are the wrong metrics for capital that produces work.

The Dashboard Ceiling is the wall enterprises hit when they continue measuring AI through usage dashboards instead of through unit economics. Investment continues. Adoption metrics climb. The dashboards show green. And the financial return refuses to materialize, because nothing on the dashboard is denominated in the unit the CFO actually cares about. Usage is not return. Adoption is not value. License utilization is not capital appreciation.

A program that hits the Dashboard Ceiling can stay there indefinitely. The metrics are healthy. The reporting is mature. The executive reviews proceed on schedule. The only signal that something is wrong is the absence of P&L movement that the original business case promised — and that absence is easy to explain away one quarter at a time.

Cost of Work and Autonomy Ratio are the metrics that break the ceiling. Not because they are sophisticated — they are not. Because they are denominated correctly. A CFO who is shown that the company’s accounts payable reconciliation has moved from $47 per exception to $3 per exception at 89% Autonomy Ratio, and that the same shift is now being engineered across vendor onboarding and contract review, understands immediately what is happening. That CFO will fund the next phase. The same CFO, shown a dashboard of weekly query volumes and seat counts, will not.

This is the language the CFO is waiting to hear. Most CAIOs are not yet speaking it.

What a CAIO should be able to quote

Before the next board review, the next investment case, or the next executive update, the test is whether the following can be answered for the company’s three most prominent AI workflows.

What is the Cost of Work for this workflow today, as a fully-loaded dollar figure per unit of business outcome?

What was it before the AI program began, on the same definition?

What is the Autonomy Ratio of this workflow today — the percentage of cases that complete end-to-end without human intervention?

What is the trend line on both metrics over the last two quarters?

If those numbers are not available, the program is not yet on the balance sheet. It is on the dashboard. The reporting may be healthy. The economics is unproven. That gap is what the next quarter’s work should close.

The companies that are making AI count are not the companies with the best models. They are the companies with the best measurement. That is not an accident, and it is not because measurement is glamorous. It is because measurement is what turns activity into value, and what turns value into capital.

Tokenmaxxing is the wrong question because it asks about the wrong layer. Token yield is the right question at the wrong layer. Cost of Work and Autonomy Ratio are the right questions at the right layer. The CAIO who builds the practice around those two metrics, and the CFO who learns to read them, are the pair that will turn the AI program into an appreciating asset rather than a recurring expense.

That is the work.

The executive decision

Pull the three most prominent AI workflows in the company. For each, produce the Cost of Work today, the Cost of Work before AI, and the current Autonomy Ratio. If any of those nine numbers is unavailable, the gap is the measurement system, not the technology. Close that gap before the next investment case is approved.
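The nine-number check can be sketched as a small readiness test. The workflow names reuse examples from this edition and the accounts payable figures are the ones quoted above; the other two workflows are left unmeasured rather than guessed. The class and function names are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Optional

@dataclass(frozen=True)
class WorkflowMeasures:
    name: str
    cost_of_work_today: Optional[float] = None      # dollars per outcome
    cost_of_work_before_ai: Optional[float] = None  # same definition, pre-AI
    autonomy_ratio: Optional[float] = None          # 0.0 to 1.0

def missing_numbers(workflows: list[WorkflowMeasures]) -> list[str]:
    """List every required measure that has not been produced yet."""
    gaps = []
    for wf in workflows:
        for f in fields(wf):
            if f.name != "name" and getattr(wf, f.name) is None:
                gaps.append(f"{wf.name}: {f.name}")
    return gaps

portfolio = [
    # Figures quoted earlier in this edition.
    WorkflowMeasures("accounts payable reconciliation", 3.0, 47.0, 0.89),
    # Not yet measured: left empty rather than guessed.
    WorkflowMeasures("vendor onboarding"),
    WorkflowMeasures("contract review"),
]

gaps = missing_numbers(portfolio)
required = 3 * len(portfolio)
print(f"{required - len(gaps)} of {required} numbers available")
for gap in gaps:
    print("missing:", gap)
```

Any line printed as missing points at the measurement system, not the technology.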

Board line

If you cannot quote your Cost of Work and Autonomy Ratio for an AI workflow, you do not have an AI program. You have an AI experiment.

Closing question

If the CFO asked tomorrow for the Cost of Work and Autonomy Ratio on your three most prominent AI workflows — would the numbers exist? And if they exist, are they real, or are they pilot numbers that have not been pressure-tested against the cases the pilot did not see?

Onward,

Raja

Raja Pabba is the founder of CloudMetrics and writes The CAIO Review on enterprise AI operating discipline. Subscribe at thecaioreview.com.

For readers who want to score their own data substrate against the six-level maturity model from the first edition, the simplified ADRMM Scorecard is available at thecaioreview.com/scorecard. The diagnostic takes about five minutes. There is no CTA at the end.

