Token Accounting and the Flow of Machine Cognition


For decades, organizations have tracked the consumption of resources.

We measure money through accounting systems. We measure labour through timesheets. We measure inventory, materials, assets, energy, and equipment utilization.

The reason is simple.

What gets measured can be managed.

Artificial intelligence introduces a new resource category that many organizations are consuming but very few are managing:

Machine cognition.

Not consciousness. Not sentience. Not intelligence in the human sense.

Cognitive work.

Reasoning. Summarization. Classification. Planning. Retrieval. Analysis. Code generation. Decision support.

Every time an AI system performs one of these activities, it consumes computational resources, and that consumption is typically metered in tokens.

Today, most organizations view tokens as a billing mechanism.

That is a mistake.

The invoice tells you what machine cognition cost.

It does not tell you where it went, why it was consumed, whether it created value, or whether the underlying system is behaving as intended.

This is why I believe we need a new discipline:

Token accounting.

Financial accounting tracks the flow of money.

Token accounting tracks the flow of machine cognition.

Looking Beyond the Invoice

Most AI dashboards look something like this:

Tokens consumed
Requests processed
Monthly cost
Provider breakdown

Useful information, certainly.

But imagine running a business using only a bank statement.

You would know money was spent.

You would not know whether it was spent wisely.

The same principle applies to AI.

Suppose an organization spends $10,000 per month on AI services.

That number alone tells us almost nothing.

Was the cognition used for customer support? Research? Software development? Knowledge retrieval? Agent workflows? Executive reporting?

Without context, the number is merely an expense.

What organizations actually need is visibility into how machine cognition is being allocated throughout the business.

Machine Cognition as an Organizational Resource

Organizations already perform labour accounting.

They know where employee time is being spent.

Engineering. Sales. Marketing. Operations. Support.

The same approach should be applied to AI.

Imagine a token accounting report that shows:

Knowledge retrieval: 12 million tokens
Software development: 8 million tokens
Customer support: 3 million tokens
Research: 2 million tokens
Executive analysis: 1 million tokens

Suddenly the conversation changes.

Instead of asking:

How much did we spend?

We can ask:

Why did we spend it?

More importantly:

What value did it create?
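A report like the one above could be produced from a simple ledger in which every AI request is tagged with the activity it served. The following is a minimal sketch under that assumption. The `LedgerEntry` type, its field names, and the `tokens_by_activity` helper are hypothetical names chosen for illustration, and the sample figures are invented.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LedgerEntry:
    """One AI request, tagged with the business activity it served (hypothetical schema)."""
    activity: str       # assumed label, e.g. "Research" or "Customer support"
    input_tokens: int   # tokens sent to the model
    output_tokens: int  # tokens returned by the model

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens


def tokens_by_activity(entries):
    """Sum total tokens per activity, largest consumer first."""
    totals = defaultdict(int)
    for entry in entries:
        totals[entry.activity] += entry.total_tokens
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


# Invented sample data, for illustration only.
entries = [
    LedgerEntry("Knowledge retrieval", 9_000, 1_000),
    LedgerEntry("Software development", 5_000, 3_000),
    LedgerEntry("Knowledge retrieval", 1_500, 500),
]

report = tokens_by_activity(entries)
for activity, tokens in report.items():
    print(f"{activity}: {tokens:,} tokens")
```

The important design choice is the tag, not the arithmetic: the activity label has to be attached when the request is made, because it cannot be recovered from the invoice afterwards.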

Token Consumption as a Drift Indicator

This is where things become particularly interesting.

Organizations are already familiar with operational drift.

Processes drift. Documentation drifts. Standards drift. Software drifts. Agentic systems drift.

The same is true for machine cognition.

Suppose a research agent consumed 200,000 tokens in January. Then 400,000 in February. Then 900,000 in March. Then 2.1 million in April.

That increase may not represent growth.

It may represent drift.

Perhaps a prompt changed. Perhaps a workflow became recursive. Perhaps retrieval quality degraded and the system began compensating by packing more material into each request's context. Perhaps users started relying on the agent for tasks it was never designed to perform.

The token increase is not the problem.

The token increase is the symptom.

The underlying behavioural change is the problem.

Viewed through this lens, token consumption becomes an observability metric rather than merely a billing metric.
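One simple way to treat token consumption as an observability signal is to watch the month-over-month growth ratio and flag any month that exceeds a threshold. The sketch below applies that idea to the research agent figures above. The `flag_drift` function and the 1.5x threshold are assumptions made for illustration, and a flag only says that something changed, not what changed.

```python
def flag_drift(months, monthly_tokens, max_ratio=1.5):
    """Return (month, growth ratio) for each month whose usage grew by more than max_ratio.

    max_ratio is an assumed threshold; a real system would tune it per workflow.
    """
    flagged = []
    for i in range(1, len(monthly_tokens)):
        previous, current = monthly_tokens[i - 1], monthly_tokens[i]
        # Skip months with no baseline to compare against.
        if previous > 0 and current / previous > max_ratio:
            flagged.append((months[i], round(current / previous, 2)))
    return flagged


months = ["January", "February", "March", "April"]
research_agent_tokens = [200_000, 400_000, 900_000, 2_100_000]

flagged = flag_drift(months, research_agent_tokens)
for month, ratio in flagged:
    print(f"{month}: usage grew {ratio}x over the previous month")
```

Every month after January is flagged here, because usage at least doubles each time. Whether that reflects growth or drift still requires looking at the prompts, workflows, and retrieval behaviour behind the numbers.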

The Coming Pareto Discovery

I suspect many organizations will eventually discover that token consumption follows the same patterns seen elsewhere in business.

A small number of workflows will generate the majority of token usage.

A small number of users will consume the majority of machine cognition.

A small number of architectural decisions will drive the majority of costs.

This should not surprise anyone.

The Pareto Principle appears almost everywhere.

The interesting question is not whether the pattern exists.

The interesting question is what is causing it.

When organizations begin performing token accounting, they may discover that much of their machine cognition is being spent on activities that do not require frontier AI models at all.
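Whether usage is concentrated in this way can be checked directly once a token ledger exists. The sketch below finds the smallest group of top consumers that accounts for a given share of total tokens. The `top_consumers` function and the 80% default are illustrative assumptions, and the input reuses the figures from the sample report earlier in this article.

```python
def top_consumers(usage, share=0.8):
    """Return the largest consumers needed to reach `share` of total tokens,
    together with the fraction of the total they actually cover."""
    total = sum(usage.values())
    if total == 0:
        return [], 0.0
    selected, running = [], 0
    ranked = sorted(usage.items(), key=lambda item: item[1], reverse=True)
    for name, tokens in ranked:
        if running >= share * total:
            break
        selected.append(name)
        running += tokens
    return selected, running / total


# Figures taken from the sample token accounting report above.
usage = {
    "Knowledge retrieval": 12_000_000,
    "Software development": 8_000_000,
    "Customer support": 3_000_000,
    "Research": 2_000_000,
    "Executive analysis": 1_000_000,
}

selected, covered = top_consumers(usage)
print(f"{len(selected)} of {len(usage)} activities cover {covered:.0%} of tokens")
```

The same function works for users or workflows instead of activities, provided the ledger records those dimensions.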

Why Local-First AI Matters

This brings us to what I believe is the next stage of AI architecture.

Many organizations currently route everything to a cloud provider.

Every request. Every summary. Every classification. Every search. Every workflow. Every agent.

The result is predictable.

Costs increase. Dependencies increase. Governance becomes difficult. Control becomes externalized.

A better approach is local-first AI.

Local-first does not mean local-only.

It means keeping the control plane inside the organization.

Simple tasks are handled locally. Knowledge retrieval is handled locally. Classification is handled locally. Summarization is handled locally. Internal workflows are handled locally.

Only when additional capability is required does the system escalate to a cloud model.

And only when truly necessary should it escalate to a frontier model.
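The escalation path described above can be sketched as a small routing function that records why each tier was chosen. This is an illustration under stated assumptions, not a reference design: the `Tier` names, the `LOCAL_TASKS` list, and the two "insufficient" flags are hypothetical, and how a system judges that a model's answer was insufficient is left to the caller.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    LOCAL = "local"
    CLOUD = "cloud"
    FRONTIER = "frontier"


@dataclass(frozen=True)
class RoutingDecision:
    tier: Tier
    reason: str  # recorded so each escalation can be audited later


# Assumed task labels, mirroring the tasks the article says can stay local.
LOCAL_TASKS = {"simple", "retrieval", "classification", "summarization", "internal_workflow"}


def route(task_type, local_insufficient=False, cloud_insufficient=False):
    """Pick the lowest tier that can serve the request, escalating only when needed."""
    if cloud_insufficient:
        return RoutingDecision(Tier.FRONTIER, "cloud model judged insufficient")
    if task_type not in LOCAL_TASKS:
        return RoutingDecision(Tier.CLOUD, "task type not on the local list")
    if local_insufficient:
        return RoutingDecision(Tier.CLOUD, "local model judged insufficient")
    return RoutingDecision(Tier.LOCAL, "task type on the local list")


for decision in (
    route("classification"),
    route("classification", local_insufficient=True),
    route("planning", cloud_insufficient=True),
):
    print(f"{decision.tier.value}: {decision.reason}")
```

Because the routing logic and its reasons live inside the organization, the control plane stays local even when the model that serves the request does not.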

This changes the economics entirely.

Instead of asking:

How do we reduce token costs?

Organizations begin asking:

Why are we consuming machine cognition in the first place, and what is the most appropriate source of that cognition?

That is a governance question.

Not a billing question.

The Future of AI Governance

Today's organizations manage financial capital.

Human capital.

Intellectual capital.

Increasingly, they will also need to manage machine cognitive capital.

The token stream generated by AI systems is the observable exhaust of that capability.

Every token tells part of a story.

Every spike signals a change in behaviour.

Every trend reveals usage patterns.

Every escalation reflects a governance decision.

Token accounting is therefore not about counting tokens.

It is about understanding how machine cognition flows through an organization.

And once that flow becomes visible, it becomes manageable.

The organizations that master AI over the next decade will not necessarily be those with access to the largest models.

They will be the organizations that understand where machine cognition is being consumed, why it is being consumed, and when it should remain local.

Because drift happens.

And increasingly, the first place we may see it is in the token ledger.
