May 8, 2026

From Swagger Overload to a Single Capable Agent

My first multi-agent architecture looked clean on paper.

I had specialized agents, each responsible for a specific domain and equipped with the full Swagger specification for the APIs it owned. An orchestrator sat on top, routing questions between agents and coordinating responses.

It seemed logical. It didn’t scale.

What Went Wrong

Swagger specifications are massive.

A single service can expose 40+ endpoints, most of which an agent will never use. Feeding the full spec into an LLM’s context created several problems at once:

  • Massive token consumption
  • Ambiguity between similar endpoints
  • Increased reasoning complexity
  • Operational guidance buried under schema noise

The architecture was technically sophisticated, but operationally fragile.

The Capability Registry

We replaced Swagger injection with a capability registry.

Instead of giving agents entire API specs, we indexed individual callable capabilities in a vector store. Each capability represented one executable action:

  • an HTTP endpoint
  • a SQL query
  • a native tool
  • or another callable operation

When the agent needed to act, it performed a semantic lookup and retrieved only the most relevant capability.
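A minimal sketch of that lookup, for illustration only. A real registry would use an embedding model and a vector store; here a bag-of-words cosine similarity stands in so the example runs without dependencies. The capability names, descriptions, and the find_capability helper are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical registry entries: capability name -> natural-language description.
CAPABILITIES = {
    "get_order_status": "Look up the current status of a customer order by order id",
    "refund_payment": "Issue a refund for a captured payment",
    "list_open_invoices": "List unpaid invoices for a customer account",
}


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def find_capability(query: str) -> str:
    """Return the name of the capability whose description best matches the query."""
    query_vec = embed(query)
    return max(
        CAPABILITIES,
        key=lambda name: cosine(query_vec, embed(CAPABILITIES[name])),
    )


if __name__ == "__main__":
    print(find_capability("what is the status of order 1234"))
```

The point of the sketch is the shape of the call: the agent hands over a description of what it wants to do and gets back one capability, not a whole specification.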

Each result contained just enough information to execute:

  • tool_name
  • base_url
  • path
  • parameters
  • agent_notes

The agent_notes field became the operational brain of the system: guidance on when to use a capability, which edge cases to watch for, and which assumptions to avoid.
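One way such a record could be modeled is sketched below. The five field names come from the list above; the types, the example values, and the to_prompt helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Capability:
    """One executable action as returned by a registry lookup."""

    tool_name: str
    base_url: str
    path: str
    # Assumed shape: parameter name -> type label.
    parameters: dict[str, str] = field(default_factory=dict)
    agent_notes: str = ""

    def to_prompt(self) -> str:
        """Render the record as the compact text the LLM would see."""
        params = ", ".join(
            f"{name} ({kind})" for name, kind in self.parameters.items()
        ) or "none"
        return (
            f"tool: {self.tool_name}\n"
            f"call: {self.base_url}{self.path}\n"
            f"parameters: {params}\n"
            f"notes: {self.agent_notes}"
        )


# Hypothetical entry; the URL and the notes are placeholders.
example = Capability(
    tool_name="get_order_status",
    base_url="https://api.example.com",
    path="/orders/{order_id}",
    parameters={"order_id": "integer"},
    agent_notes="Use only when the user supplies an order id; do not guess one.",
)

if __name__ == "__main__":
    print(example.to_prompt())
```

A record like this renders to four short lines of context, with the operational guidance given the same weight as the call details.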

The LLM no longer had to reason across dozens of irrelevant endpoints. It saw only the capability it needed.

Any Backend, One Interface

Because the registry abstracted invocation types, every backend looked identical during discovery. Only the invocation details in the result differed by type.

HTTP capabilities returned:

  • base_url
  • path

SQL capabilities returned:

  • database location
  • query template

Native tools returned:

  • tool_name

One discovery pattern. One execution model.
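A sketch of how a single execution path might dispatch on invocation type. The "kind" discriminator, the dictionary keys, and the handler names are assumptions, not the registry's actual schema. The handlers only build a description of the call, so the example runs without a network or database.

```python
from typing import Any, Callable, Dict

Plan = Dict[str, Any]


def plan_http(cap: Dict[str, Any], args: Dict[str, Any]) -> Plan:
    # Fill path placeholders such as {order_id} from the supplied arguments.
    return {"action": "http_request", "url": cap["base_url"] + cap["path"].format(**args)}


def plan_sql(cap: Dict[str, Any], args: Dict[str, Any]) -> Plan:
    # Keep arguments separate from the template so the driver can bind them safely.
    return {
        "action": "sql_query",
        "database": cap["database"],
        "query": cap["query_template"],
        "params": args,
    }


def plan_native(cap: Dict[str, Any], args: Dict[str, Any]) -> Plan:
    return {"action": "native_call", "tool": cap["tool_name"], "args": args}


HANDLERS: Dict[str, Callable[[Dict[str, Any], Dict[str, Any]], Plan]] = {
    "http": plan_http,
    "sql": plan_sql,
    "native": plan_native,
}


def plan_execution(cap: Dict[str, Any], args: Dict[str, Any]) -> Plan:
    """Turn any discovered capability into a concrete call description."""
    handler = HANDLERS.get(cap.get("kind"))
    if handler is None:
        raise ValueError(f"unsupported capability kind: {cap.get('kind')!r}")
    return handler(cap, args)


if __name__ == "__main__":
    http_cap = {"kind": "http", "base_url": "https://api.example.com", "path": "/orders/{order_id}"}
    print(plan_execution(http_cap, {"order_id": 42}))
```

Adding a new backend type then means registering one more handler; the agent's discovery step does not change.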

From Many Agents to One

Once capability discovery became dynamic, the need for multiple specialized agents started disappearing.

Previously, specialization existed because each agent required carefully curated prompts containing only its domain knowledge.

With runtime capability lookup, the registry became the knowledge layer.

We collapsed the system into a single agent with access to the full registry.

The result:

  • no routing errors
  • lower latency
  • simpler debugging
  • one deployment
  • one trace stream

The Trade-Off

The capability registry is not free.

It must stay synchronized with live systems. Capabilities need curation. Writing high-quality agent_notes takes discipline.

But that cost already existed — previously paid through wasted tokens, routing failures, and difficult debugging sessions.
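Part of the synchronization work can be automated. The sketch below assumes the live API publishes an OpenAPI (Swagger) document, whose top-level "paths" object is keyed by path, and flags HTTP capabilities whose path no longer appears there. The registry entry shape and the function name are hypothetical.

```python
from typing import Any, Dict, List


def find_stale_capabilities(
    registry: List[Dict[str, Any]], spec: Dict[str, Any]
) -> List[str]:
    """Return tool names whose path is missing from the live spec's paths."""
    live_paths = set(spec.get("paths", {}))
    return sorted(
        cap["tool_name"] for cap in registry if cap["path"] not in live_paths
    )


if __name__ == "__main__":
    # Hypothetical registry entries and a trimmed-down spec document.
    registry = [
        {"tool_name": "get_order_status", "path": "/orders/{order_id}"},
        {"tool_name": "cancel_order", "path": "/orders/{order_id}/cancel"},
    ]
    spec = {"paths": {"/orders/{order_id}": {"get": {}}}}
    print(find_stale_capabilities(registry, spec))
```

A check like this catches removed or renamed endpoints only; whether the agent_notes still describe the behavior correctly remains a matter for human review.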

The Bigger Lesson

One well-configured agent with the right capability registry turned out to be far more powerful — and dramatically simpler to operate — than a fleet of narrowly scoped agents.

The breakthrough wasn’t adding more agents.

It was reducing what the agent needed to know at any given moment.
