Back
Autonomous Systems

Autonomy Is the Fuel. Control Is the Product.

May 30, 2026
10 min read

A nuclear reactor and an atomic bomb run on the same reaction. The only thing between them is control.

That is the situation we are now in with AI agents. The reaction, raw autonomy, is getting cheap fast. A model that can plan, call tools, and act in a loop is close to a commodity. The scarce part is the control system wrapped around it, which is what decides whether an agent can run in production at all.

The reactor is the right mental model because the hard part is the opposite of what you'd guess. The chain reaction was never the hard part of nuclear power. Neutrons multiply on their own. Decades of engineering went into the control system instead. Software is going the same way. So the useful question is not what your agent can do. It is what you have built around it to keep it critical: productive, bounded, and one button from off.

One curve, five reactors

The move toward autonomy looks the same everywhere because it is. One curve running through five reactors, each at a different power level.

  • Cars. Driving was solved for the easy cases years ago. What gets a stranger into the back seat is everything wrapped around the driving: the operational design domain, the remote operators, the disengagement logic, the geofence. Waymo's real product is that containment.
  • Workflows. They already chain dozens of steps. What stops them before production is the lack of gates, budgets, and rollback, not the lack of capability.
  • The SDLC. Agents write, test, and deploy code now. The teams shipping this safely did not get better models. They built the control surfaces: sandboxed execution, evals, instant rollback, hard limits on what an agent can touch.
  • Companies. The autonomous org, swarms of agents running operations, arrives the same way, bottlenecked on governance, observability, and budget control long before the agents run out of things they can do.
  • Research. Self-improving systems are the highest-energy reactor of all. The entire field of alignment and oversight is, in reactor terms, control-rod design for a core we do not fully understand yet.

Same reaction, five times over: the autonomy slides toward commodity and the control system becomes the product. So what is that control system, concretely?

What control looks like in an agent

Here is the reactor's control system translated into agent terms. Five surfaces, each a concrete thing you build, none of them a metaphor.

Instrumentation: trace every step. A reactor has thousands of sensors. An agent should emit a structured trace of every tool call: the inputs, the raw output, tokens spent, latency, and the model's stated reason for the call. If you cannot replay exactly what the agent did and why, you are running a reactor with the gauges painted on. This is the most skipped surface, and it is why teams say they "don't trust the agent": they have no read on the core.

Control rods: budgets and gates that throttle the reaction. Three concrete throttles. A step budget, a hard cap on loop iterations so a confused agent cannot spin forever. A token and dollar budget that halts the run at a ceiling. And tiered approval gates: classify actions by blast radius and reversibility, let the low-risk ones (read a file, query a staging replica) run automatically, and pause the high-risk ones (deploy, send the email, issue the refund, drop the table) for a human.

Containment: scope the blast radius before the run, not after. A vessel bounds the damage when everything else fails. For an agent that means least-privilege credentials (a token scoped to exactly the resources the task needs, never your prod admin role), an allowlist of tools instead of open shell access, ephemeral sandboxes for execution, and dry-run modes that emit a plan you can diff before anything executes. You decide how bad the worst case can be before you press start.

Negative feedback: verify against ground truth after every action. Run the test, read the real error, hit the actual database before the next step trusts the last one.

SCRAM: a kill switch wired to rollback. One control that halts the run mid-flight and reverts. For a code agent that is the throwaway branch and the one-command deploy rollback. For an ops agent it is the circuit breaker that freezes all actions when an anomaly fires. Build it on day one. You will not get the chance during the incident that makes you want it.

Stacked up, the loop is small enough to write down:

steps, spend = 0, 0
while not done and steps < MAX_STEPS and spend < BUDGET:
    action = agent.next_action(context)      # the reaction: cheap, abundant
    if blast_radius(action) == "high":
        await human_approval(action)         # control rod: gate the dangerous moves
    result = sandbox.run(action)             # containment: scoped creds, no prod
    spend += result.cost
    verdict = verify(result, ground_truth)   # negative feedback: check the work
    if verdict.fatal:
        scram(); rollback(); break           # SCRAM: halt and revert
    context += observe(result, verdict)      # an error makes the next step more cautious
    steps += 1

Notice that none of these surfaces makes the agent more capable. Every one is about bounding a capability you already have.

The harness is a controller

Forget the reactor for a second. There is an everyday machine that does exactly what a good agent harness does: cruise control.

Cruise control has one job. You set a target speed and it holds the car there without you touching the pedal. It watches the speedometer and corrects: drifting below the target, add throttle; over it, ease off. Measure, compare to the goal, correct, repeat.

An agent harness is that same loop wrapped around an LLM. You set a goal, the model acts, and the harness checks the result against something real (a test, the actual database, a type checker) and feeds the gap back into the next step. The model is the engine. The harness is what keeps it pointed at the goal. Control engineers call this a feedback controller; the name matters less than the loop.

Cruise controlAgent harness
Target speedThe goal or spec
The engineThe LLM and its tools
SpeedometerWhat you can actually verify is true
Gap from the targetHow far the result is from the goal
Throttle correctionTighten constraints, lower autonomy, re-plan

A raw model looping on its own output has no speedometer. It never measures the gap, so nothing pulls it back. The harness is what closes the loop.

There are two ways to correct.

The obvious one is to correct in proportion to how far off you are. Big miss, big correction: a failed test means stop and re-plan, not nudge and continue.

The subtle one is what saves long runs. Picture cruise control on a long, gentle hill. If it only reacted to the gap right now, it would sit just under your target speed the whole climb and never quite catch up. So it also tracks how long the gap has persisted and keeps adding throttle until it closes. Agents fail the same way: the agent drifts a little off-goal every step, no single step alarming, and thirty steps later it is confidently building on a wrong foundation. A harness that checks each step in isolation sleeps right through it. One that also watches the accumulated drift catches the slow leak before it compounds.

drift = 0
for step in run:
    result = act()
    gap = compare(result, ground_truth)   # how far off, right now
    drift += gap                           # how far off it has been, over time
    if too_big(gap) or too_big(drift):
        escalate_or_halt()                 # catch the spike and the slow leak

The reactor tells you why control is the product. Cruise control tells you how the loop is wired.

Engineer the feedback, not just the gate

A gate is a control rod: it works, but a human has to push it, so it does not scale. The reactors that run safely for decades do not depend on operators catching every excursion. They are designed so the physics damps a runaway on its own. As the core heats up, the reaction slows. That negative feedback is built into the fuel, not bolted on by a person.

Agents have the same fork. The default agent loop has positive feedback: the model's output becomes its own next input. One wrong assumption gets written into the context, the next step builds on it, and the error compounds across a long run until the agent is confidently, elaborately wrong. People call this context poisoning. Errors amplify instead of dying out.

Negative feedback means putting a check against reality between steps. The agent does not get to believe its own output. It runs the test and reads the actual failure. It queries the real database instead of trusting a value it computed three steps ago. It re-reads the file it just edited rather than assuming the edit applied. Each action is verified against ground truth before the next one builds on it, so an error makes the next step more cautious instead of more confident.

This is why the agents that work in production usually are not the ones with the best model but the ones wired to the most ground truth: a test suite, a type checker, a real API that returns a real error, a database that rejects a bad write. The verification is what damps the errors. Without it, nothing corrects the agent on its own, and it is safe only as long as a human is standing there holding the rod.

Two ways to fail: meltdown and cold shutdown

The obvious failure is the runaway: the agent that drops the database, drains the budget, emails the customer the wrong thing. That risk is real, and the containment surfaces above are how you bound it.

But the quieter failure runs the other way. Out of fear of the meltdown, you gate everything. Every action needs sign-off, the agent cannot move without a human in the loop, and you are left with a slower, more expensive way of doing the work by hand. A reactor that produces no power, and you called it safety.

The target is neither. Critical is the agent running productively inside bounds it cannot exceed, with humans gating only the handful of actions that are both high-blast-radius and hard to reverse, and everything else flowing automatically under budget and verification. Drawing that line, which actions auto-run and which stop for a human, is the engineering. Gate too much and you are subcritical, doing the work by hand. Gate too little and you are supercritical, one bad loop from the headline.

Hold it critical

The uncomfortable part is that the engineering that matters most here is not the autonomy. The reaction is commoditizing. Everyone gets the fuel. What stays scarce is the control system around it.

So the question is not how autonomous can we make it. It is how much power can you safely hold critical. A reactor you cannot control is not an asset. It is just a faster way to reach the same pile of waste.

aiautonomous-systemsagentscontrol-theoryfuture-of-engineering