What Enterprise Agents Can Learn from Claude Code
Most enterprise AI agents are built like waterfall software. A product manager writes a spec. An architect draws boxes and arrows. Engineers implement a fixed flow: intake, classify, route, execute, respond. The agent follows a script. When the script breaks, the agent breaks.
Claude Code doesn't work this way.
1. The Loop, Not the Plan
Claude Code runs on a simple loop: think, act, observe, repeat. No predetermined plan. The agent looks at the current state, picks a tool, uses it, reads the result, and decides what to do next.
You ask Claude Code to "fix the failing test in auth_service.py." It runs:
- Read `auth_service.py`.
- Run `pytest auth_service.py`. Sees the failure.
- Read the error trace. Missing mock for the new OAuth provider.
- Search the codebase for how other tests mock OAuth. Finds a pattern.
- Edit the test.
- Run `pytest` again. Pass.
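The steps above are the loop in miniature. A hedged sketch of that think-act-observe cycle, where `Action`, `pick_action`, and the toy tools are illustrative scaffolding rather than Claude Code's actual internals:

```python
from dataclasses import dataclass, field

@dataclass
class Action:
    name: str                                  # which tool to call, or "done"
    args: dict = field(default_factory=dict)   # arguments for that tool
    summary: str = ""                          # final answer when name == "done"

def agent_loop(goal, tools, pick_action, max_steps=20):
    """Think-act-observe: no predetermined plan, just state in, next action out."""
    history = []  # working memory: every tool call and its observed result
    for _ in range(max_steps):
        action = pick_action(goal, history)            # think
        if action.name == "done":
            return action.summary
        observation = tools[action.name](**action.args)  # act
        history.append((action.name, observation))       # observe
    return "gave up after max_steps"
```

The important property: the only fixed structure is the loop itself. What happens at each step is decided from the accumulated history, not from a flowchart.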

Compare this to a typical enterprise support agent. "Customer says they were double-charged."
- Classifies intent: billing dispute.
- Routes to billing workflow.
- Calls `get_recent_charges(customer_id)`.
- Finds two charges. Calls `initiate_refund(charge_id)`.
- Sends template response.
Works for the happy path. But what if the double charge happened because the customer upgraded mid-cycle and the proration logic has a bug? The agent can't investigate. It can't read the proration code. It can't check the billing event log. It's locked into the flow someone designed.
The fix isn't a better flowchart. It's replacing the flowchart with a loop.
2. Small Tools, Loosely Joined
Claude Code has about a dozen tools: Read, Edit, Grep, Glob, Bash, Write, WebFetch, Agent. None encode business logic. The intelligence is in how the agent composes them.

Most enterprise architectures build the opposite: massive tools that do everything in one call.
A customer asks: "Can I switch from annual to monthly without losing my credits?"
Fat tools: The agent needs `switch_plan_with_credit_preservation(customer_id, new_plan, preserve_credits=true)`. If that tool doesn't exist, the agent says "please contact support."
Thin tools: The agent has primitives: `get_subscription(id)`, `get_credit_balance(id)`, `get_plan_details(plan)`, `update_subscription(id, changes)`, `create_credit_adjustment(id, amount, reason)`. It reads the subscription, checks credits, looks up pricing, calculates the difference, updates, and adjusts. No one anticipated this scenario. The agent composed the answer from parts.
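Composed in code, the thin-tool answer might look like the sketch below. The `api` object and every method on it are the hypothetical primitives named above, not a real billing SDK:

```python
def switch_plan_preserving_credits(customer_id, new_plan, api):
    """Compose thin primitives to handle a scenario nobody built a tool for.

    `api` stands in for whatever client exposes the primitives from the text;
    all signatures and field names here are illustrative assumptions.
    """
    sub = api.get_subscription(customer_id)
    credits = api.get_credit_balance(customer_id)
    old_price = api.get_plan_details(sub["plan"])["monthly_price"]
    new_price = api.get_plan_details(new_plan)["monthly_price"]
    api.update_subscription(customer_id, {"plan": new_plan})
    if credits > 0:
        # Carry the balance across so the customer loses nothing in the switch.
        api.create_credit_adjustment(customer_id, credits, reason="plan switch")
    return {"old_price": old_price, "new_price": new_price, "credits_kept": credits}
```

Each primitive stays dumb; the sequencing logic lives in the agent, which is exactly where it can adapt when the next unanticipated question arrives.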
3. MCP: The Missing Interface Layer
Model Context Protocol standardizes how tools describe themselves to agents. Each tool declares its name, what it does, what it needs, what it returns. The agent discovers capabilities at runtime.

Adding CRM capabilities to an enterprise agent:
Without MCP:
- Read CRM API docs
- Write adapter code in the agent
- Add error handling, auth, retries
- Test, deploy the updated agent
- Repeat for every new system
With MCP:
- Deploy a CRM MCP server
- Point the agent at it
- Agent discovers `search_contacts`, `get_deal`, `update_stage` automatically
- Starts using them. No agent code changed.
Add ERP next week? Deploy an ERP MCP server. The agent discovers `get_invoice`, `create_purchase_order`, `check_inventory` and starts composing CRM and ERP tools together for cross-system problems no one anticipated.
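Under the MCP specification, a server describes each tool with a name, a description, and a JSON Schema for its inputs, and the agent retrieves these declarations at runtime via a `tools/list` request. A sketch of what a declaration for the hypothetical `search_contacts` tool could look like:

```python
# What an MCP server advertises for one tool: enough metadata for an agent
# to discover and call it with no hand-written adapter code. The tool itself
# and its parameters are invented for illustration.
search_contacts_decl = {
    "name": "search_contacts",
    "description": "Search CRM contacts by name, email, or company.",
    "inputSchema": {  # standard JSON Schema, per the MCP specification
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Free-text search terms"},
            "limit": {"type": "integer", "default": 10},
        },
        "required": ["query"],
    },
}
```

Because the contract is schema-in, schema-out, the CRM team can ship new tools without touching the agent, and the agent can validate its own calls before making them.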
4. Context Is the Product
Claude Code maintains context across an entire session. Every file read, every command run, every error encountered becomes working memory.
Claude Code's CLAUDE.md files take this further. Project-level knowledge that persists across sessions: conventions, architectural decisions, known gotchas. The agent reads them at session start and adjusts behavior. No code changes needed.
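A CLAUDE.md file is ordinary markdown; the entries below are invented examples of the kind of persistent project knowledge it can carry:

```markdown
# CLAUDE.md

## Conventions
- Run tests with `pytest -q`; never commit while any test fails.
- Billing amounts are integers in cents, never floats.

## Known gotchas
- The proration logic miscounts mid-cycle upgrades; verify charges
  by hand before touching anything billing-related.
```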

Most enterprise agents have three context problems:
Amnesia. Customer calls Monday about billing. Agent investigates, partially resolves, says wait 48 hours. Customer calls Wednesday. Agent has forgotten everything.
Flat context. The agent has conversation history but no structured knowledge. It knows "I was double-charged" but doesn't know this customer has been double-charged three times in six months, all from the same proration bug.
No organizational memory. Agent A discovers the proration bug affects mid-cycle upgrades. Agent B handles a similar case an hour later and starts from zero. Knowledge doesn't compound.
The fix is a knowledge graph linking artifacts together: tickets, code changes, incidents, customer interactions. The agent sees the full history. Not "this customer called before" but "this matches a pattern we've seen 47 times this month, traced to commit abc123."
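A minimal sketch of such an artifact graph, using plain dicts; the node kinds, IDs, and the `traced_to` relation are all illustrative, and a production system would use a graph database rather than in-memory structures:

```python
# Nodes are artifacts (tickets, commits, incidents); edges record why
# they are linked. The linking discipline, not the storage, is the point.
graph = {"nodes": {}, "edges": []}

def add_node(kind, node_id, **attrs):
    graph["nodes"][node_id] = {"kind": kind, **attrs}

def link(src, dst, relation):
    graph["edges"].append((src, dst, relation))

def related(node_id):
    """Everything one hop away from an artifact, in either direction."""
    return [(s, d, r) for s, d, r in graph["edges"] if node_id in (s, d)]
```

When three double-charge tickets all link to the same commit, one lookup on that commit surfaces the pattern; no agent has to rediscover it from zero.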
5. Human-in-the-Loop as Trust Architecture
Claude Code asks for permission before destructive operations. This isn't a limitation. It's what makes the agent trustworthy enough to have a wide action space.

Most enterprise agents try to be fully autonomous, which forces a narrow action space. Refund under $50? Automatic. Over $50? "Please contact support." The agent handles easy cases and punts on interesting ones.
Claude Code can edit any file, run any command, delete anything. You trust it because it shows you the plan and waits. Permission isn't a constraint on capability. It enables capability.
For enterprise agents, design explicit trust boundaries:
Tier 1: Act freely. Read, search, summarize. No permission needed.
Tier 2: Act with notification. Update records, apply standard discounts. Do it, log it.
Tier 3: Act with approval. Large refunds, contract changes. Propose, wait for human.
Tier 4: Never act. Delete data, change security. Recommend only.
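The four tiers can be encoded as a small policy gate. The action names and the `POLICY` table below are hypothetical; in practice the boundaries would live in configuration, not code:

```python
from enum import IntEnum

class Tier(IntEnum):
    FREE = 1     # act freely
    NOTIFY = 2   # act, then log
    APPROVE = 3  # propose, wait for a human
    NEVER = 4    # recommend only

# Illustrative policy table mapping actions to trust tiers.
POLICY = {
    "read_record": Tier.FREE,
    "update_record": Tier.NOTIFY,
    "issue_refund": Tier.APPROVE,
    "delete_data": Tier.NEVER,
}

def gate(action, execute, notify=print, request_approval=None):
    """Run `execute` only if the action's tier permits it."""
    tier = POLICY.get(action, Tier.APPROVE)  # unknown actions default to approval
    if tier == Tier.NEVER:
        return "recommend-only: refusing to execute"
    if tier == Tier.APPROVE and not (request_approval and request_approval(action)):
        return "waiting for human approval"
    result = execute()
    if tier == Tier.NOTIFY:
        notify(f"executed {action}")
    return result
```

Defaulting unknown actions to tier 3 keeps the failure mode safe: when the agent meets an action nobody classified, it proposes rather than acts.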
The mistake is treating everything as tier 1 (too risky) or tier 3 (too slow). Claude Code's boundaries are nuanced and configurable.
The Uncomfortable Implication
Most enterprise agent frameworks are overengineered in the wrong places. Too much effort on workflow orchestration and decision trees. Not enough on the core loop, composable primitives, context, and trust boundaries.
The agent that wins is not the one with the most sophisticated planning module. It's the one with the tightest loop, the simplest tools, the richest context, and the clearest trust boundaries.
The question is whether enterprise teams are willing to throw away their flowcharts and trust the loop.