When debugging or understanding LLM agents like Strands, tracing is critical. Recently, I ran a simple prompt — “greet rahul using tool” — and captured the trace emitted by Strands Agent using OpenTelemetry. Even though the use case was simple, the trace revealed the elegance of how the agent plans, executes, and finalizes its response in structured steps.
Here’s a breakdown of the key spans and why they exist, with a focus on the two chat spans and two execute_event_loop_cycle spans, and how they tie together.
What Happened?
The user instruction was: "greet rahul using tool"
Strands Agent, powered by Claude 3 Sonnet, interpreted this and:
- Planned a toolUse of the greet tool with input "rahul".
- Executed the tool.
- Used the result to send back a human-friendly message.
This simple interaction resulted in a two-turn conversation, captured in two event loop cycles.
Trace Summary (at a glance)
| Span Name | Duration (s) | Key Events | Role | 
|---|---|---|---|
| execute_event_loop_cycle #1 | 2.28 | toolUse,toolResult | Plans and executes tool | 
| chat #1 | 2.15 | LLM decides: use greet on "rahul" | Claude generates plan | 
| execute_tool greet | 0.13 | Input: "rahul"→ Output:"Hello, rahul" | Tool executes | 
| execute_event_loop_cycle #2 | 3.06 | Final reply from model | Uses tool result to complete | 
| chat #2 | 3.06 | LLM returns final message | Claude wraps up | 
| invoke_agent | 5.33 | All agent activity spans | Wraps both cycles | 
| agent.run | 5.33 | Full request lifecycle | Top-level root span | 
agent.run (INTERNAL)
└── invoke_agent "Strands Agents" (CLIENT)
├── execute_event_loop_cycle (INTERNAL)
│ ├── chat (LLM planning) (CLIENT)
│ └── execute_tool greet (INTERNAL)
└── execute_event_loop_cycle (INTERNAL)
└── chat (LLM response) (CLIENT)
Why Two execute_event_loop_cycle?
Strands Agent works in planning cycles — each one wraps:
- Observation of the current context (user input, tool results)
- A model call (chat)
- Optional tool execution (execute_tool)
- A choice of whether to continue, end, or act again
Cycle 1:
- Interprets "greet rahul using tool"
- Decides to call greet({ name: "rahul" })
- Tool responds with "Hello, rahul"
Cycle 2:
- Receives the tool result
- Generates a final assistant message:
Why Two chat Spans?
Each chat span is a model call. Here’s how they differ:
chat #1:
- Input:
- Raw user message
- Full conversation history: Previous user inputs, assistant replies, tool calls (toolUse) and tool results (toolResult) are automatically replayed in the context window so the model has memory of the ongoing session.
- Tool schemas: Strands injects all available tools (from gen_ai.agent.tools) into the prompt as structured JSON function specs. Even if only one tool is used, the model sees them all so it can reason about the best fit.
{
  "name": "greet",
  "description": "Greets a person by name.",
  "parameters": {
    "type": "object",
    "properties": {
      "name": { "type": "string", "description": "The name of the person to greet." }
    },
    "required": ["name"]
  }
}
   
  [
  {
    "text": "I'll greet Rahul using the greet tool right away."
  },
  {
    "toolUse": {
      "toolUseId": "tooluse_jN8iuxwKRC6eji43nu64OQ",
      "name": "greet",
      "input": {
        "name": "rahul"
      }
    }
  }
]
 
chat #2:
- Input: Includes the toolResult as context
- Output: Final message to the user
This separation is deliberate: tool planning and result-based reply generation are split for clarity, control, and extensibility.
Final Thoughts
This trace, from a seemingly trivial agent command, illustrates the powerful architecture of Strands Agent:
- Planning is explicit and looped (via execute_event_loop_cycle)
- Decisions and actions are clearly separated (chat, execute_tool)
- Full observability is built-in with OpenTelemetry
If you're designing agents or trying to debug them, traces like this are your best lens into what’s really going on — and how LLMs, planning logic, and tools interact.
 
 
No comments:
Post a Comment