Conditional Logic Patterns for Agentic AI Workflows

Branching diagram showing conditional patterns inside an agentic AI workflow

The four conditional logic patterns that hold up in real agentic workflows are: deterministic routing by rules, LLM-as-router by classification, retrieval-then-route by semantic match, and reflection loops where the agent grades its own output and decides whether to retry. Use the simplest one that works. Reach for the next only when the simpler one demonstrably fails.

Agentic, defined plainly

An agentic workflow is one where the LLM decides what step to take next, instead of you wiring every step in advance. The conditional logic is the part where the LLM, or your code on behalf of the LLM, picks a branch.

The agent that kept calling the same tool forever

In September 2024, I built an agent that triaged inbound product feedback. It had three tools: tag, summarise, and route-to-engineer. On the second day of production, a piece of feedback came in that the model could not classify. The agent called tag, then summarise, then tag again, then summarise again, then tag, in a loop. By the time my budget alert fired, it had spent $14.20 on a single ticket and produced nothing useful. The fix was a hard cap on tool calls per task and a fallback branch when the cap hit. Twenty lines of code. I now write that cap before I write the agent. Lesson learned at the price of a decent lunch.

The opinion that gets me in arguments

Most workflows that people build as agents should be plain switch statements. The LLM is the most expensive way to choose between three options when those three options are knowable in advance. Use a rule-based router for the 80% of cases that fit a pattern, and reserve LLM routing for the 20% that genuinely need judgement. The mechanism: every LLM call adds latency, cost, and one more place for things to go non-deterministic. The cost of being wrong is a workflow that is slower, more expensive, and harder to debug than the if-else it replaced. Hold this loosely once your routing rules grow past a couple dozen branches; that is the point where an LLM classifier earns its keep.

Flowchart showing four branches from a central decision node with one branch looping back through a self-grading step

Pattern one: deterministic routing

If the input has a field you can test, test it. In n8n, this is the Switch node. In code, it is an if-else or a lookup table. Examples that should never touch an LLM router: email domain to team mapping, file extension to handler, payment provider webhook event type. Latency: microseconds. Cost: zero. Debuggability: total.
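A minimal sketch of the lookup-table version of this pattern, using the email-domain example. The domains, team names, and the function name are hypothetical placeholders, not anything from a real system.

```python
# Hypothetical mapping; domains and team names are illustrative only.
DOMAIN_TO_TEAM = {
    "acme.com": "enterprise-support",
    "example.org": "nonprofit-support",
}
DEFAULT_TEAM = "general-support"


def route_by_email_domain(email: str) -> str:
    """Return the team for an email address using a plain dictionary lookup."""
    # Take everything after the last "@" and normalise case before the lookup.
    domain = email.rsplit("@", 1)[-1].strip().lower()
    # Unknown domains fall through to the default team instead of raising.
    return DOMAIN_TO_TEAM.get(domain, DEFAULT_TEAM)
```

Adding a route is a one-line change to the table, and every decision can be reproduced from the input alone.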

Pattern two: LLM as router

When the branch depends on meaning rather than a field, give the LLM the input and a short list of named categories, ask it to return one category by name, and route on that. Keep the prompt brutally short. Set temperature to 0. Cap output to roughly the longest category name. Validate the response against the enum and fall back to a default branch if the model returns something unexpected. This last step is where most people fail; the model will eventually return something off-list and your switch needs to not explode.
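The validate-and-fall-back step can be sketched like this. The category names are made up for illustration, and `classify` is a hypothetical callable standing in for whatever model client you use (with temperature 0 and a small output cap set on that side); nothing here is a real library API.

```python
from typing import Callable

# Hypothetical category list; replace with your own named branches.
CATEGORIES = {"bug", "feature_request", "billing", "other"}
FALLBACK = "other"


def route_with_llm(text: str, classify: Callable[[str], str]) -> str:
    """Ask the model for one category name and validate it against the enum."""
    prompt = (
        "Classify the message into exactly one of: "
        + ", ".join(sorted(CATEGORIES))
        + ". Reply with the category name only.\n\nMessage: "
        + text
    )
    raw = classify(prompt)  # assumed to return the model's text reply
    # Normalise whitespace, quotes, trailing full stops, and case.
    label = raw.strip().strip(".\"'").lower()
    # Anything off-list goes to the default branch instead of crashing the switch.
    return label if label in CATEGORIES else FALLBACK
```

Passing the model call in as a function keeps the router testable with a stub, which is how you check that the off-list case really lands on the fallback.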

Pattern three: retrieval-then-route

When the categories themselves are too many or too dynamic for a prompt, embed both the input and the candidate routes, and pick the route with the highest cosine similarity above a threshold. Below the threshold, route to a human or a fallback. This is the right pattern for things like routing a support ticket to one of 200 article topics: a prompt listing 200 options is wasteful; an embedding lookup is fast and cheap.
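A rough sketch of the threshold logic, assuming the embeddings have already been computed elsewhere and arrive as plain lists of floats. The function names and the 0.75 threshold are illustrative assumptions; the right threshold depends on your embedding model and should be tuned on real data.

```python
import math
from typing import Mapping, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def route_by_embedding(
    query_vec: Sequence[float],
    route_vecs: Mapping[str, Sequence[float]],
    threshold: float = 0.75,  # assumed value, tune per embedding model
) -> Optional[str]:
    """Return the best-matching route name, or None to signal the fallback."""
    best_route, best_score = None, -1.0
    for name, vec in route_vecs.items():
        score = cosine_similarity(query_vec, vec)
        if score > best_score:
            best_route, best_score = name, score
    # Below the threshold, hand off to a human or a default branch.
    return best_route if best_score >= threshold else None
```

A linear scan like this is fine for a few hundred routes; a vector index only becomes worth the extra moving parts well beyond that.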

Pattern four: reflection loops

After the agent takes an action, a second LLM call grades the result against the original goal and returns continue, retry-with-feedback, or stop. Cap the loop at 2 or 3 iterations. Always cap. The grader prompt must differ from the actor prompt: a grader that shares the actor's prompt tends to endorse the actor's answer, which gives you no signal. This pattern lifts quality on open-ended tasks like drafting and code generation, and adds nothing useful on closed tasks like classification.
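The capped loop might look something like the sketch below. `act` and `grade` are hypothetical callables wrapping the actor and grader LLM calls (each with its own prompt), and the "capped" status is an assumption of this sketch for the case where the retries run out.

```python
from typing import Callable, Tuple

# act(goal, feedback) -> draft; grade(goal, draft) -> (verdict, feedback).
# Both are hypothetical wrappers around separate LLM calls.
Actor = Callable[[str, str], str]
Grader = Callable[[str, str], Tuple[str, str]]


def run_with_reflection(
    goal: str, act: Actor, grade: Grader, max_iterations: int = 3
) -> Tuple[str, str]:
    """Run act/grade up to max_iterations times; return (draft, status)."""
    feedback = ""
    draft = ""
    for _ in range(max_iterations):
        draft = act(goal, feedback)
        verdict, feedback = grade(goal, draft)
        if verdict == "continue":
            return draft, "continue"
        if verdict != "retry":
            # "stop" and any unexpected verdict both end the loop.
            return draft, "stop"
    # The cap was hit while the grader still wanted a retry.
    return draft, "capped"
```

Returning the status alongside the draft lets the caller send capped or stopped results to a fallback branch rather than treating them as successes.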

The three rules that keep agents sane

  • Hard cap tool calls per task. I default to 8. If the model needs more, the task is decomposed wrong.
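  • Hard cap total tokens per task. Set a budget. Per OpenAI's API docs as of 2025, you can cap output tokens on each call (max_tokens, which the Chat Completions reference now marks as deprecated in favour of max_completion_tokens). That limit applies per call, so keep a running total of reported token usage in your orchestration code and stop when it passes the budget.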
  • Log every decision. Tool called, arguments, result, model's stated reason. You will read these logs the first time something goes weird, and you will be glad you wrote them.
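The three rules can be combined in one small guard object, sketched below. The class and method names are hypothetical, the 20,000-token budget is an arbitrary placeholder, and `tokens_used` is assumed to come from whatever usage figure your model client reports.

```python
import json
import logging
from dataclasses import dataclass

logger = logging.getLogger("agent")


class BudgetExceeded(Exception):
    """Raised when a task has used up its tool-call or token budget."""


@dataclass
class TaskBudget:
    max_tool_calls: int = 8
    max_total_tokens: int = 20_000  # placeholder value, set your own
    tool_calls: int = 0
    total_tokens: int = 0

    def ensure_within_limits(self) -> None:
        """Call before each tool call; raises once either cap is reached."""
        if self.tool_calls >= self.max_tool_calls:
            raise BudgetExceeded("tool-call cap reached")
        if self.total_tokens >= self.max_total_tokens:
            raise BudgetExceeded("token budget reached")

    def record(self, tool: str, arguments: dict, result: str,
               reason: str, tokens_used: int) -> None:
        """Call after each tool call; updates totals and logs the decision."""
        self.tool_calls += 1
        self.total_tokens += tokens_used
        logger.info(json.dumps({
            "tool": tool,
            "arguments": arguments,
            "result": result,
            "reason": reason,  # the model's stated reason for the call
            "tool_calls": self.tool_calls,
            "total_tokens": self.total_tokens,
        }, default=str))
```

In the orchestration loop, call `ensure_within_limits()` before each step and catch `BudgetExceeded` to jump to the fallback branch, which is the shape of the fix described in the looping-agent story above.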

Where these four patterns sit in the wider pattern catalogue

Microsoft's Azure Architecture Center groups agentic orchestration into five named patterns: sequential, concurrent, group chat, handoff, and magentic. My four patterns above are routing decisions inside whichever orchestration shape you pick. Separately, the reasoning literature uses three other names you should recognise: ReAct (the model interleaves thought and action), Reflexion (the model critiques and revises its own output), and Tree of Thoughts (the model explores multiple branches before committing). Machine Learning Mastery's roadmap and the servicesground.com 2026 overview both walk through the trade-offs. The useful framing is this: orchestration patterns describe how agents talk to each other, reasoning patterns describe how one agent thinks, and routing patterns describe how a single step picks its next move. You will use all three layers in any non-trivial workflow.

Frequently asked questions

Do I need LangGraph for agentic workflows?

Not necessarily. LangGraph gives you nice primitives for state machines and is genuinely useful past a certain complexity. For two or three nodes with a loop, plain Python or n8n is enough and easier to debug at 2am.

How do I evaluate an agent?

Build a small eval set of real inputs with expected outcomes, even just 50 examples. Run the agent against the set after every prompt or model change. Measure success rate, average tool calls, average cost. Without an eval set you are tuning a thing you cannot see.
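A bare-bones harness for those three metrics could look like this. `AgentRun` and the `agent` callable are hypothetical shapes chosen for the sketch; adapt them to whatever your agent actually returns.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple


@dataclass
class AgentRun:
    """What the (hypothetical) agent reports back for one input."""
    outcome: str
    tool_calls: int
    cost_usd: float


def evaluate(
    agent: Callable[[str], AgentRun],
    eval_set: Sequence[Tuple[str, str]],  # (input, expected outcome) pairs
) -> Dict[str, float]:
    """Run the agent over the eval set and return the three headline metrics."""
    if not eval_set:
        raise ValueError("eval_set is empty")
    runs = [(expected, agent(text)) for text, expected in eval_set]
    n = len(runs)
    return {
        "success_rate": sum(run.outcome == expected for expected, run in runs) / n,
        "avg_tool_calls": sum(run.tool_calls for _, run in runs) / n,
        "avg_cost_usd": sum(run.cost_usd for _, run in runs) / n,
    }
```

Storing the returned dictionary alongside the prompt and model version makes it easy to compare runs after each change.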

What about multi-agent setups?

Multi-agent is often a worse loop with extra steps. Two agents arguing is rarely better than one agent with a good prompt and a grader. There are cases where role separation helps (a researcher and a writer with different prompts), but the bar is higher than the marketing suggests.

Should the router and the actor be the same model?

Often the router can be a much smaller, cheaper model. Routing is classification; classification is easy. Save the expensive model for the actual work. This single split usually cuts cost by 40 to 60% with no quality loss.

Start with the switch statement. Add an LLM only where you can name the case the switch cannot handle. The agent is the last resort, not the first move.