There's a common trap when building AI-powered pipelines: reaching for an agentic framework because the problem feels “intelligent,” even when the solution is fundamentally deterministic. This post walks through a document ingestion system where that mistake shows up—and what the right mental model looks like.
The System: Ingesting Documents at Scale
The pipeline processes documents at scale—loading files from storage, extracting structured metadata via an LLM, enriching that metadata against external systems, and indexing everything into a vector store and document store for downstream retrieval.
The flow looks like this:
Object storage / local filesystem
↓
list_documents
↓
[per document]
load → classify → chunk → embed → extract_metadata → enrich → store → archive
Simple enough on paper. The complexity comes from two questions:
- How do you orchestrate deterministic steps cleanly?
- Where does the LLM fit in—and how?
The system uses two patterns to answer these: a graph-based workflow engine for orchestration and agent-based execution for LLM-driven tasks. Understanding when to use each is key.
LangGraph: When the Path Is Known
LangGraph is a workflow engine built on top of LangChain. Its core primitive is a directed graph where nodes are Python functions and edges define allowed transitions. State flows through the graph as a typed dictionary.
Here’s a simplified version of the ingestion graph:
from typing import TypedDict

from langgraph.graph import END, StateGraph

class IngestState(TypedDict, total=False):
    # Illustrative state fields; the real schema tracks whatever
    # each node reads and writes.
    path: str
    content: str
    chunks: list
    metadata: dict

workflow = StateGraph(IngestState)

workflow.add_node("load_document", load_document)
workflow.add_node("classify_document", classify_document)
workflow.add_node("chunk_document", chunk_document)
workflow.add_node("embed_chunks", embed_chunks_node)
workflow.add_node("extract_metadata", extract_metadata_node)
workflow.add_node("enrich_metadata", enrich_metadata_node)
workflow.add_node("store_embeddings", store_embeddings_node)
workflow.add_node("store_summary", store_summary_node)
workflow.add_node("archive_document", archive_document)
workflow.add_node("skip_document", skip_document)

workflow.set_entry_point("load_document")
workflow.add_edge("load_document", "classify_document")
workflow.add_conditional_edges(
    "classify_document",
    should_process,
    {"process": "chunk_document", "skip": "skip_document"},
)
workflow.add_edge("chunk_document", "embed_chunks")
workflow.add_edge("embed_chunks", "extract_metadata")
workflow.add_edge("extract_metadata", "enrich_metadata")
workflow.add_edge("enrich_metadata", "store_embeddings")
workflow.add_edge("store_embeddings", "store_summary")
workflow.add_edge("store_summary", "archive_document")
workflow.add_edge("archive_document", END)
workflow.add_edge("skip_document", END)

graph = workflow.compile()
What this gives you:
- Explicit control flow: Every transition is defined in code.
- Typed state management: Each node declares inputs and outputs.
- Deterministic branching: Conditions are pure Python—no LLM needed.
- Composability: Easy to wrap per-document flows into batch processing.
Mental model: Use LangGraph when you know the answer to “what happens next?”
If the pipeline topology is fixed, a deterministic DAG is the right tool.
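The composability point is worth making concrete. A compiled LangGraph graph exposes an `invoke` method, so wrapping the per-document flow into batch processing can be a plain loop with per-document error isolation. This is a sketch; `process_batch` and the `path` state key are illustrative names, not part of the original pipeline:

```python
def process_batch(graph, document_paths):
    """Run the per-document graph over a batch, isolating failures
    so one bad document doesn't abort the whole run."""
    results, failures = [], []
    for path in document_paths:
        try:
            # Each document gets a fresh state dict; the graph owns the flow.
            state = graph.invoke({"path": path})
            results.append(state)
        except Exception as exc:
            failures.append((path, exc))
    return results, failures
```

Failures come back as data rather than exceptions, which makes retry and dead-letter handling a separate, deterministic concern.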
Agent Frameworks: When the LLM Decides the Path
Agent frameworks introduce a different execution model: the LLM drives control flow by choosing tools, interpreting results, and deciding what to do next.
The Right Use: Orchestrator with Tools
At query time, an orchestrator agent can route user questions to specialized downstream components, each exposed as a tool.
Example pattern:
def build_tools():
    return [
        make_tool("query_domain_a"),
        make_tool("query_domain_b"),
        make_tool("synthesize_results"),
    ]
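The `make_tool` helper is elided above. One minimal way to sketch it (hypothetical, not the source's implementation) is as a registry of named callables wrapped with the name and description the LLM sees:

```python
from typing import Callable

# Hypothetical registry of domain query functions the tools wrap.
HANDLERS: dict[str, Callable[[str], str]] = {
    "query_domain_a": lambda q: f"domain A results for: {q}",
    "query_domain_b": lambda q: f"domain B results for: {q}",
    "synthesize_results": lambda q: f"synthesis of: {q}",
}

def make_tool(name: str) -> dict:
    """Wrap a handler as a tool spec: the LLM selects by name and
    description, the runtime dispatches to the callable."""
    return {
        "name": name,
        "description": f"Call {name} with a natural-language query.",
        "func": HANDLERS[name],
    }
```

The key property is that the set of tools is fixed in code, while which tools run, and in what order, is left to the model.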
At runtime, the LLM decides:
- Should it call one tool or multiple?
- Does it need to combine results?
- Does it need to resolve entities first?
This kind of routing depends on semantic understanding, not deterministic rules.
No static DAG can reliably express this.
Mental model: Use an agent when the path depends on meaning the LLM must interpret.
A Valid Use Case: Enrichment with Tool Interaction
In the enrichment step, an agent can call external systems (e.g., registries or APIs), interpret responses, and resolve ambiguity.
agent = Agent(
    model=model,
    system_prompt=prompt,
    tools=tools,
)
response = agent(prompt)
This is justified when:
- Tool results may be ambiguous
- Multiple calls may be needed
- The LLM must reason about correctness
However, it’s worth monitoring: if every run collapses to a single tool call, a simpler pattern may be the better fit.
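One lightweight way to monitor for that degeneration, assuming your agent runtime can report how many tool calls each run made (the function and threshold here are illustrative):

```python
from collections import Counter

def flag_degenerate_agent(tool_call_counts: list[int],
                          threshold: float = 0.95) -> bool:
    """Return True if nearly every run made exactly one tool call,
    suggesting the agent could be replaced by a direct structured call."""
    if not tool_call_counts:
        return False
    dist = Counter(tool_call_counts)
    return dist[1] / len(tool_call_counts) >= threshold
```

If this flag trips in production, the agent is paying the reasoning-loop tax for a decision it never actually makes.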
The Anti-Pattern: Agent as a Thin Wrapper
A common mistake is using an agent for simple, single-step tasks:
agent = Agent(
    model=model,
    system_prompt=prompt,
)
response = agent(chunk)
parsed = parse_json(response)
No tools. No iteration. No decision-making.
This is just a prompt → JSON call with unnecessary overhead.
Problems:
- Added latency from agent loop setup
- Repeated overhead for each chunk
- Fragile parsing logic
- No strong structure guarantees
The Better Approach: Structured LLM Calls
Use direct structured output instead:
from langchain_core.messages import HumanMessage, SystemMessage
from pydantic import BaseModel

class MySchema(BaseModel):
    # Illustrative fields; use whatever metadata the pipeline extracts.
    title: str
    summary: str

llm = SomeLLM(model="...", temperature=0.2)
chain = llm.with_structured_output(MySchema)
result = chain.invoke([
    SystemMessage(content=system_prompt),
    HumanMessage(content=chunk),
])
Benefits:
- Strong typing via schema validation
- No manual parsing
- Lower latency
- Simpler execution model
The Decision Framework
Does control flow depend on meaning the LLM must interpret?
├─ NO → Use LangGraph (or plain code)
│        Fixed steps, deterministic branching
│        Examples: ETL, document pipelines
└─ YES → Does the LLM need tools or iteration?
    ├─ NO → Use direct structured LLM call
    │        Prompt → structured output
    │        Examples: extraction, classification
    └─ YES → Use an agent
             Tool selection + reasoning loop
             Examples: routing, research, disambiguation
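The tree reduces to two boolean questions, so it can be restated as plain code (an illustrative restatement, not part of the pipeline):

```python
def choose_pattern(semantic_control_flow: bool,
                   needs_tools_or_iteration: bool) -> str:
    """Map the two decision-framework questions to a pattern."""
    if not semantic_control_flow:
        return "langgraph"             # fixed steps, deterministic branching
    if not needs_tools_or_iteration:
        return "structured_llm_call"   # prompt -> structured output
    return "agent"                     # tool selection + reasoning loop
```

Note the ordering: the second question only matters once the first answer is yes, which is exactly why the agent is the last resort, not the default.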
When each layer does only its job, the system becomes simpler, faster, and easier to reason about.