There's a common trap when building AI-powered pipelines: reaching for an agentic framework because the problem feels “intelligent,” even when the solution is fundamentally deterministic. This post walks through a document ingestion system where that mistake shows up—and what the right mental model looks like.
The System: Ingesting Documents at Scale
The pipeline processes documents at scale—loading files from storage, extracting structured metadata via an LLM, enriching that metadata against external systems, and indexing everything into a vector store and document store for downstream retrieval.
The flow looks like this:
Object storage / local filesystem
↓
list_documents
↓
[per document]
load → classify → chunk → embed → extract_metadata → enrich → store → archive
Simple enough on paper. The complexity comes from two questions:
- How do you orchestrate deterministic steps cleanly?
- Where does the LLM fit in—and how?
The system uses two patterns to answer these: a graph-based workflow engine for orchestration and agent-based execution for LLM-driven tasks. Understanding when to use each is key.
LangGraph: When the Path Is Known
LangGraph is a workflow engine built on top of LangChain. Its core primitive is a directed graph where nodes are Python functions and edges define allowed transitions. State flows through the graph as a typed dictionary.
Here’s a simplified version of the ingestion graph:
from typing import TypedDict

from langgraph.graph import END, StateGraph

class IngestState(TypedDict, total=False):
    # Illustrative state fields; the real schema tracks whatever
    # each node reads and writes.
    path: str
    content: str
    chunks: list
    metadata: dict

workflow = StateGraph(IngestState)

workflow.add_node("load_document", load_document)
workflow.add_node("classify_document", classify_document)
workflow.add_node("chunk_document", chunk_document)
workflow.add_node("embed_chunks", embed_chunks_node)
workflow.add_node("extract_metadata", extract_metadata_node)
workflow.add_node("enrich_metadata", enrich_metadata_node)
workflow.add_node("store_embeddings", store_embeddings_node)
workflow.add_node("store_summary", store_summary_node)
workflow.add_node("archive_document", archive_document)
workflow.add_node("skip_document", skip_document)

workflow.set_entry_point("load_document")
workflow.add_edge("load_document", "classify_document")
workflow.add_conditional_edges(
    "classify_document",
    should_process,
    {"process": "chunk_document", "skip": "skip_document"},
)
workflow.add_edge("chunk_document", "embed_chunks")
workflow.add_edge("embed_chunks", "extract_metadata")
workflow.add_edge("extract_metadata", "enrich_metadata")
workflow.add_edge("enrich_metadata", "store_embeddings")
workflow.add_edge("store_embeddings", "store_summary")
workflow.add_edge("store_summary", "archive_document")
workflow.add_edge("archive_document", END)
workflow.add_edge("skip_document", END)

graph = workflow.compile()
What this gives you:
- Explicit control flow: Every transition is defined in code.
- Typed state management: Each node declares inputs and outputs.
- Deterministic branching: Conditions are pure Python—no LLM needed.
- Composability: Easy to wrap per-document flows into batch processing.
Mental model: Use LangGraph when you know the answer to “what happens next?”
If the pipeline topology is fixed, a deterministic DAG is the right tool.
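The composability point is worth making concrete. A compiled LangGraph graph exposes an `invoke` method, so wrapping the per-document flow into batch processing can be a plain loop with per-document error isolation. This is a sketch; `process_batch` and the `path` state key are illustrative names, not part of the original pipeline:

```python
def process_batch(graph, document_paths):
    """Run the per-document graph over a batch, isolating failures
    so one bad document doesn't abort the whole run."""
    results, failures = [], []
    for path in document_paths:
        try:
            # Each document gets a fresh state dict; the graph owns the flow.
            state = graph.invoke({"path": path})
            results.append(state)
        except Exception as exc:
            failures.append((path, exc))
    return results, failures
```

Failures come back as data rather than exceptions, which makes retry and dead-letter handling a separate, deterministic concern.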
Agent Frameworks: When the LLM Decides the Path
Agent frameworks introduce a different execution model: the LLM drives control flow by choosing tools, interpreting results, and deciding what to do next.
The Right Use: Orchestrator with Tools
At query time, an orchestrator agent can route user questions to specialized downstream components, each exposed as a tool.
Example pattern:
def build_tools():
    return [
        make_tool("query_domain_a"),
        make_tool("query_domain_b"),
        make_tool("synthesize_results"),
    ]
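The `make_tool` helper is elided above. One minimal way to sketch it (hypothetical, not the source's implementation) is as a registry of named callables wrapped with the name and description the LLM sees:

```python
from typing import Callable

# Hypothetical registry of domain query functions the tools wrap.
HANDLERS: dict[str, Callable[[str], str]] = {
    "query_domain_a": lambda q: f"domain A results for: {q}",
    "query_domain_b": lambda q: f"domain B results for: {q}",
    "synthesize_results": lambda q: f"synthesis of: {q}",
}

def make_tool(name: str) -> dict:
    """Wrap a handler as a tool spec: the LLM selects by name and
    description, the runtime dispatches to the callable."""
    return {
        "name": name,
        "description": f"Call {name} with a natural-language query.",
        "func": HANDLERS[name],
    }
```

The key property is that the set of tools is fixed in code, while which tools run, and in what order, is left to the model.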
At runtime, the LLM decides:
- Should it call one tool or multiple?
- Does it need to combine results?
- Does it need to resolve entities first?
This kind of routing depends on semantic understanding, not deterministic rules.
No static DAG can reliably express this.
Mental model: Use an agent when the path depends on meaning the LLM must interpret.
A Valid Use Case: Enrichment with Tool Interaction
In the enrichment step, an agent can call external systems (e.g., registries or APIs), interpret responses, and resolve ambiguity.
agent = Agent(
    model=model,
    system_prompt=prompt,
    tools=tools,
)
response = agent(prompt)
This is justified when:
- Tool results may be ambiguous
- Multiple calls may be needed
- The LLM must reason about correctness
However, it’s worth monitoring: if every run collapses to a single tool call, a simpler pattern may be the better fit.
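One lightweight way to monitor for that degeneration, assuming your agent runtime can report how many tool calls each run made (the function and threshold here are illustrative):

```python
from collections import Counter

def flag_degenerate_agent(tool_call_counts: list[int],
                          threshold: float = 0.95) -> bool:
    """Return True if nearly every run made exactly one tool call,
    suggesting the agent could be replaced by a direct structured call."""
    if not tool_call_counts:
        return False
    dist = Counter(tool_call_counts)
    return dist[1] / len(tool_call_counts) >= threshold
```

If this flag trips in production, the agent is paying the reasoning-loop tax for a decision it never actually makes.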
The Anti-Pattern: Agent as a Thin Wrapper
A common mistake is using an agent for simple, single-step tasks:
agent = Agent(
    model=model,
    system_prompt=prompt,
)
response = agent(chunk)
parsed = parse_json(response)
No tools. No iteration. No decision-making.
This is just a prompt → JSON call with unnecessary overhead.
Problems:
- Added latency from agent loop setup
- Repeated overhead for each chunk
- Fragile parsing logic
- No strong structure guarantees
The Better Approach: Structured LLM Calls
Use direct structured output instead:
from langchain_core.messages import HumanMessage, SystemMessage
from pydantic import BaseModel

class MySchema(BaseModel):
    # Illustrative fields; use whatever metadata the pipeline extracts.
    title: str
    summary: str

llm = SomeLLM(model="...", temperature=0.2)
chain = llm.with_structured_output(MySchema)
result = chain.invoke([
    SystemMessage(content=system_prompt),
    HumanMessage(content=chunk),
])
Benefits:
- Strong typing via schema validation
- No manual parsing
- Lower latency
- Simpler execution model
The Decision Framework
Does control flow depend on meaning the LLM must interpret?
├─ NO → Use LangGraph (or plain code)
│        Fixed steps, deterministic branching
│        Examples: ETL, document pipelines
└─ YES → Does the LLM need tools or iteration?
    ├─ NO → Use direct structured LLM call
    │        Prompt → structured output
    │        Examples: extraction, classification
    └─ YES → Use an agent
             Tool selection + reasoning loop
             Examples: routing, research, disambiguation
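The tree reduces to two boolean questions, so it can be restated as plain code (an illustrative restatement, not part of the pipeline):

```python
def choose_pattern(semantic_control_flow: bool,
                   needs_tools_or_iteration: bool) -> str:
    """Map the two decision-framework questions to a pattern."""
    if not semantic_control_flow:
        return "langgraph"             # fixed steps, deterministic branching
    if not needs_tools_or_iteration:
        return "structured_llm_call"   # prompt -> structured output
    return "agent"                     # tool selection + reasoning loop
```

Note the ordering: the second question only matters once the first answer is yes, which is exactly why the agent is the last resort, not the default.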
When each layer does only its job, the system becomes simpler, faster, and easier to reason about.