The Machine Readable Organization

March 2026

Enterprise agents need more than prompts and permissions. They need a machine readable organization with runtime controls grounded in business state.

After spending years at Blue Onion building a subledger and navigating the messy reality of ERPs, I’ve come to think the enterprise agent problem is not especially new. Getting agents safely and usefully into a business depends on more than prompting or model quality. It depends on the machine readable organization.

In a good subledger, the job is never just to collect data. The harder work is deciding which version of reality the business is allowed to use. Which source wins when systems disagree. When a payment is final enough to count. Whether a refund is routine or suspicious. How an exception should be handled when the official workflow and operational reality diverge. How to leave behind a trail someone else can actually defend later.

As soon as agents move from assistant work into real workflows, they run into the same questions. What action is valid right now? What needs review? What should proceed automatically, and what should stop the line? What can be trusted? What needs to be reconciled? The closer agents get to systems of record, money movement, customer outcomes, approvals, and operational truth, the more the problem looks like a control problem.

People as the Policy Engine

This is where the machine readable organization starts to matter. A machine readable organization has made enough of its operating logic explicit for software to work inside it without constantly leaning on a human interpreter. Not the version from the system diagram, but the real one with the exceptions, approvals, side agreements, and unwritten rules that determine how the business actually runs.

Most companies are much less explicit than they seem. There is an ERP, a CRM, a warehouse, roles and access controls, process docs, approval flows, dashboards. Up close, a surprising amount of the real logic lives somewhere else. It lives in habits, exception lists, side letters, Slack threads, and the judgment of a few operators who know when the documented process is about to do something dumb.

That arrangement works because humans are good at carrying ambiguity. Someone in finance knows a certain customer gets special handling because of how credits flow through their contract. Someone in operations knows a vendor update should pause if banking details changed last week, even though the system does not enforce it. Someone in support can tell that a ticket technically qualifies for an automated refund, but the surrounding context makes that a bad idea. In a lot of enterprises, people are the real policy engine. They are just not described that way.

This is manageable while AI stays in assistant mode. A model can summarize the ticket, draft the reply, suggest the next step, and tee things up for a person. Once an agent starts taking action inside the workflow, the fuzziness gets expensive. A system that can post entries, trigger refunds, update records, or communicate with customers needs clearer boundaries than “someone in finance usually catches that.”

Machine Readability Forces Organizational Honesty

When a company tries to make itself legible enough for agents to operate safely, it has to confront how much of its own decision making is informal, contradictory, or dependent on translators. It has to decide which rules are real and which ones are folklore. It has to admit when the source of truth is actually a negotiation between systems. It has to make peace with the fact that some approvals are policy and others are social ritual. It has to decide which exceptions are legitimate and which ones are just institutional scar tissue.

Most companies have never had to do that work cleanly because human operators absorb the contradictions. They translate between systems, patch over ambiguity, and apply judgment case by case. Software is much less forgiving. It forces the company to become more explicit about itself.

That is why the machine readable organization matters more than the agent itself. If a company cannot say, in a form software can use, what is real, what is allowed, and what needs review, the model’s capabilities do not matter much. You can wrap the workflow in better prompts, cleaner tools, and more elegant UX. Underneath it, the operating environment is still fuzzy.

IAM Is Not Enough

Even when a company is ready to formalize its rules, the industry will probably reach first for an incomplete technical solution. The obvious move is to extend IAM. Give the agent an identity. Assign it scopes. Route its actions through the same access layer that governs human users and service accounts. Necessary but not sufficient.

IAM was built to answer whether a principal can touch a resource. Agents raise a different question. Is a specific action valid given what the business knows right now? IAM can tell you whether an agent is allowed to post a journal entry. It cannot tell you whether this particular entry still makes sense given that the counterparty was already credited through another channel two days ago. It does not know that a vendor payment should pause because banking details changed last week. It does not know that a contract side letter changes how credits are supposed to flow. Those are questions about business state. They have to be answered at runtime, in context.
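To make the contrast concrete, here is a minimal sketch in Python. The scope names and state fields are invented for illustration, not drawn from any real IAM product; the point is only that the first check is static and the second depends on live business state.

```python
from dataclasses import dataclass

# IAM answers a static question: can this principal touch this resource?
def iam_allows(agent_scopes: set[str], required_scope: str) -> bool:
    return required_scope in agent_scopes

# Contextual validity asks a runtime question: does this action make sense
# given what the business knows right now? (All fields are hypothetical.)
@dataclass
class BusinessState:
    counterparty_already_credited: bool   # e.g. credited via another channel
    banking_details_changed_recently: bool  # e.g. changed in the last week

def action_is_valid(state: BusinessState) -> bool:
    if state.counterparty_already_credited:
        return False  # duplicate-concession risk
    if state.banking_details_changed_recently:
        return False  # pause until the change is verified
    return True
```

The same agent can pass the first check and fail the second, which is exactly the gap the identity layer cannot see.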

Imagine an agent wants to issue a $12,000 customer credit. The identity layer says the agent is allowed to create credits below $25,000. That matters, but it is only the first check. A runtime control layer would ask whether this exact action is executable, reviewable, or forbidden given the current operational state, unresolved exceptions, prior events, contractual constraints, and risk thresholds. It might check the customer’s contract, payment status, recent support history, prior concessions, open disputes, and the reason for the credit. If the customer already received a manual concession two days ago through another channel, the system should route the case to review instead of blindly executing.
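A toy version of that runtime check might look like the following. Every name, field, and threshold here is hypothetical; a real system would consolidate this state from upstream systems rather than receive it as an argument.

```python
from dataclasses import dataclass, field
from enum import Enum

class Verdict(Enum):
    EXECUTE = "execute"
    REVIEW = "review"
    FORBID = "forbid"

@dataclass
class CreditRequest:
    customer_id: str
    amount: float
    reason: str

@dataclass
class CustomerState:
    open_disputes: int = 0
    recent_manual_concessions: list[str] = field(default_factory=list)

IAM_CREDIT_LIMIT = 25_000  # the scope the identity layer granted the agent

def evaluate_credit(req: CreditRequest, state: CustomerState) -> tuple[Verdict, list[str]]:
    """Return a verdict plus the facts that support it."""
    if req.amount >= IAM_CREDIT_LIMIT:
        return Verdict.FORBID, [f"amount {req.amount} exceeds scope {IAM_CREDIT_LIMIT}"]
    # The IAM check passed; now evaluate against live business state.
    reasons: list[str] = []
    if state.recent_manual_concessions:
        reasons.append(f"recent manual concession(s): {state.recent_manual_concessions}")
    if state.open_disputes:
        reasons.append(f"{state.open_disputes} open dispute(s)")
    if reasons:
        return Verdict.REVIEW, reasons  # route to a human, with evidence
    return Verdict.EXECUTE, ["no conflicting events found"]
```

In the $12,000 example, a prior manual concession on the account would return REVIEW rather than EXECUTE, even though the identity layer said yes.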

The review should not be a vague escalation. It should show why the action was stopped, which facts supported that decision, which events conflicted with the proposed action, and which threshold or policy was involved. Whether the credit is approved or denied, the system should record the decision, the evidence, and the state of the business at the time.
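As a sketch of what such a record could capture, with invented field names standing in for whatever a real system would store:

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

@dataclass(frozen=True)
class DecisionRecord:
    action: str                    # what the agent proposed
    verdict: str                   # execute / review / forbid
    evidence: list[str]            # facts that supported the verdict
    conflicting_events: list[str]  # events that clashed with the action
    policy: str                    # the threshold or rule that fired
    business_state_snapshot: dict  # state of the business at decision time
    decided_at: str                # UTC timestamp

def record_decision(action: str, verdict: str, evidence: list[str],
                    conflicts: list[str], policy: str, snapshot: dict) -> str:
    rec = DecisionRecord(
        action=action, verdict=verdict, evidence=evidence,
        conflicting_events=conflicts, policy=policy,
        business_state_snapshot=snapshot,
        decided_at=datetime.now(timezone.utc).isoformat(),
    )
    # In practice this would be appended to an immutable log.
    return json.dumps(asdict(rec))
```

The value is less in the format than in the discipline: whoever reviews the case later sees the same evidence the system saw.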

Contextual Validity

The missing layer evaluates actions against the live operational state of the business before they execute. In practice, that probably means some combination of runtime policy evaluation, state consolidated from upstream systems, event history, and explicit review queues for ambiguous or risky cases. The result is something closer to a control plane, shaped more like a subledger than a standard IAM product.

It needs to look more like a subledger because it has to reconcile competing sources, preserve truth as of a given time, maintain explicit exception queues, support restatement when reality changes, and leave behind evidence for every decision. The existing categories each own part of this, but none owns the whole problem. Okta and Auth0 understand identity and access, but they generally do not model the business state that determines whether an action is sensible. NetSuite and Workday hold more of that state, but storing and updating business facts is different from runtime enforcement. OpenAI, Anthropic, Microsoft, and Google own pieces of the agent runtime, but the model runtime is not the same thing as the business control plane.

There is still a useful analogy to cloud security, but only up to a point. Cloud security matured by making identity, least privilege, logging, and policy much more explicit. Agent governance will move in the same general direction. But cloud security mostly asks who can touch what. Agent governance has to ask whether a specific action is valid in context. That is a more operational problem. It sits closer to how a business actually works.

Reliability Is the Measure

This is why I think reliability is the ultimate measure. I mean whether the system can survive contact with a real company. Can it stay within its authority? Can someone reconstruct why it acted? Does it fail in a way the business can live with? Or does it require a human shadowing it all day like an expensive babysitter, which is usually a sign that the autonomy is mostly theater?

Working on Blue Onion’s subledger made this feel obvious to me before I had the language for it. What I want from an agent governance layer is what a good subledger provides: a system grounded tightly enough in operational reality to know when something can proceed, when it needs review, and when the underlying state does not support the action.

The agent may be what users see, but the valuable system underneath is the one that knows when the agent is allowed to move.