Prompt Chaining

Decompose a task into a sequence of LLM calls with gate checks between steps for validation and control.

●●○○○ Complexity

Overview

Prompt chaining decomposes a task into a sequence of discrete LLM calls, where each step processes the output of the previous one. Rather than asking a single model call to handle an entire complex task, you break the work into smaller, well-defined stages connected in a pipeline. Between each stage, you can insert programmatic gate checks — validation logic that verifies the intermediate output before passing it to the next step. This makes the workflow easier to debug, test, and refine, because each link in the chain has a clear input and output contract.

How It Works

  1. Define the stages. Identify the logical sub-tasks within your overall task. Each stage should have a single, focused responsibility (e.g., “generate a draft,” “extract key entities,” “translate to Spanish”).
  2. Write a prompt for each stage. Each prompt is tailored to its specific sub-task, which allows you to optimize instructions, examples, and output format independently.
  3. Connect the stages sequentially. The output of stage N becomes part of the input (or the entire input) for stage N+1.
  4. Insert gate checks between stages. After each LLM call, run programmatic validation — check for required fields, verify format constraints, confirm the output passes a quality heuristic. If a gate check fails, you can retry the stage, branch to an error-handling path, or halt the pipeline.
  5. Collect the final output. The result of the last stage is the overall output of the chain.
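The five steps above can be sketched as a generic chain runner. This is a minimal sketch, not a library API — `run_chain`, its signature, and the retry-then-halt policy for failed gate checks are all illustrative:

```python
from typing import Callable, Optional

Stage = Callable[[str], str]   # an LLM call wrapped as text-in, text-out
Gate = Callable[[str], bool]   # programmatic validation of a stage's output

def run_chain(stages: list[tuple[Stage, Optional[Gate]]],
              initial_input: str, max_retries: int = 2) -> str:
    """Run stages sequentially; retry a stage whose gate check fails."""
    data = initial_input
    for step, gate in stages:
        for _attempt in range(max_retries + 1):
            output = step(data)
            if gate is None or gate(output):
                break  # gate passed (or no gate) -- accept this output
        else:
            # All retries exhausted without passing the gate: halt the pipeline.
            raise RuntimeError(f"Gate check failed after {max_retries + 1} attempts")
        data = output  # output of stage N becomes the input to stage N+1
    return data
```

A retry is only one of the options from step 4; the `else` branch could instead route to an error-handling path.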

When to Use

  • The task naturally decomposes into sequential sub-tasks with clear boundaries.
  • You want to trade a small amount of latency for higher accuracy and easier debugging.
  • Each stage benefits from a different prompt style, system message, or model configuration.
  • You need verifiable intermediate results (e.g., confirming extracted data is valid before summarizing it).
  • The output of each step is relatively predictable in structure.

When Not to Use

  • The task is simple enough that a single well-crafted prompt handles it reliably.
  • Latency is critical and you cannot afford multiple sequential round-trips.
  • The sub-tasks are independent of each other — consider Parallelization instead.
  • The number or nature of sub-tasks is not known in advance — consider Orchestrator-Worker instead.

Example

# Prompt Chaining: Generate a marketing email, then translate it.
# Assumes `llm` is a pre-configured LLM client exposing a call(system, prompt) method.

def generate_email(product_description: str) -> str:
    """Stage 1: Draft a marketing email in English."""
    response = llm.call(
        system="You are an expert marketing copywriter.",
        prompt=f"Write a short marketing email for: {product_description}"
    )
    return response.text

def gate_check_email(email: str) -> bool:
    """Gate: Verify the email contains a subject line and body."""
    return "Subject:" in email and len(email) > 100

def translate_email(email: str, target_language: str) -> str:
    """Stage 2: Translate the email to the target language."""
    response = llm.call(
        system="You are a professional translator.",
        prompt=f"Translate the following email to {target_language}:\n\n{email}"
    )
    return response.text

# Run the chain
draft = generate_email("Wireless noise-canceling headphones, 40h battery")
if not gate_check_email(draft):
    raise ValueError("Draft email failed gate check -- missing structure.")
final = translate_email(draft, target_language="Spanish")
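Step 4 above also mentions checking required fields and format constraints; when a stage emits structured output, the gate can parse and validate it instead of matching substrings. A minimal sketch, assuming a stage that returns JSON with `subject` and `body` fields (the field names are illustrative):

```python
import json

def gate_check_json(raw: str) -> bool:
    """Gate: parse the stage output as JSON and verify required fields exist."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False  # stage produced malformed JSON -- fail the gate
    return isinstance(data, dict) and "subject" in data and "body" in data
```

A failed parse here could trigger a retry of the producing stage, just as with the substring gate above.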

Related Patterns

  • Routing — Can be used before a chain to select which chain variant to execute based on input type.
  • Parallelization — When stages are independent, run them in parallel instead of sequentially.
  • Evaluator-Optimizer — Adds an evaluation loop around a generation step, useful when a single pass through the chain may not meet quality requirements.