AI Tool Pipelines — Automate Your Workflows

How to Build Conditional Logic with AI: 5 Patterns That Ship

6 min read · Updated Jun 4, 2026

[Figure: flowchart showing AI-powered conditional branching in a workflow]

Conditional logic in AI workflows is the bit where you stop trusting if/else on a substring and start letting a model decide. The trick is knowing when each is right: deterministic IF for shape, model-driven branching for intent. Get that line right and your workflow stays understandable. Get it wrong and you either ship brittle keyword logic or you let a 7B model decide your refund policy.

Key takeaways

  • Use deterministic IF when the rule is about shape (status code, amount > X, field present).
  • Use an AI classifier when the rule is about intent (complaint vs. question, urgency, sentiment).
  • Always return structured JSON ({ category, confidence }) from the model, never free text.
  • Treat anything below ~0.7 confidence as "send to human" — false confidence is worse than no answer.
  • Workflow-level IFs give you an audit trail. In-prompt conditions give you nuance. Combine them, don’t pick one.

Deterministic IF vs AI classifier: the decision rule

Which kind of branch to use when.
Decision is about… | Use | Why
--- | --- | ---
HTTP status / field presence | Deterministic IF | Cheap, reliable, auditable
Numeric thresholds ($, count, age) | Deterministic IF | No model needed
Customer intent (complaint vs question) | AI classifier | Beyond keyword reach
Urgency or sentiment | AI classifier | Tone and context matter
Document type from PDF | AI classifier | Layout + content blend
Compliance / refund eligibility | Deterministic IF (and only) | Never let a model decide policy
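
One way to read the table is as an ordering: run the cheap shape checks first and only pay for a model call when the question is really about intent. Here is a minimal sketch of that idea. The field names on `event`, the branch names, and the 1000 threshold are all made up for illustration, and the classifier is passed in as a plain function so the example runs without any API.

```python
from typing import Callable


def choose_branch(event: dict, classify_intent: Callable[[str], str]) -> str:
    """Deterministic checks on shape first, then the model for intent."""
    # Shape: HTTP status and field presence never need a model.
    if event.get("status_code", 200) >= 500:
        return "retry_later"
    if not event.get("message"):
        return "reject_empty"
    # Shape: a numeric threshold (hypothetical value).
    if event.get("amount", 0) > 1000:
        return "manual_review"
    # Intent: only now pay for a classification call.
    return "intent:" + classify_intent(event["message"])


def fake_classifier(message: str) -> str:
    """Stub standing in for the real model call so the sketch runs offline."""
    return "question" if message.rstrip().endswith("?") else "complaint"


print(choose_branch({"message": "How do I export a report?"}, fake_classifier))
```

Every early `return` here is a branch you can audit without reading a prompt, which is the point of keeping shape decisions out of the model.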

The classification prompt that actually holds

A reliable classifier prompt does four things: (1) defines categories by intent, not keywords, (2) returns strict JSON, (3) emits a confidence score, (4) has a default "uncertain" category. Skip step 4 and the model has no way to say "I don't know", so it forces every ambiguous message into one of the real categories, usually your busiest one. Below is the template I have shipped four times since 2024, with tweaks of under 20 lines each time.

text
You are a classifier. Read the user message and return ONE JSON object
matching this schema (no prose, no markdown):

{
  "category": "complaint" | "question" | "praise" | "refund_request" | "spam" | "uncertain",
  "confidence": 0.0 to 1.0,
  "reason": "one sentence, max 20 words"
}

Definitions (by intent, NOT keywords):
- complaint: user expresses dissatisfaction with the product, service, or experience.
- question: user is asking how to do something or asking for information.
- praise: user expresses genuine satisfaction or thanks (not sarcastic).
- refund_request: user is explicitly asking for money back.
- spam: promotional content, off-topic, automated.
- uncertain: you cannot confidently place it in any of the above.

Rules:
- If confidence < 0.7, you MUST return "uncertain".
- Sarcasm in a positive form is a complaint, not praise.
- "How do I cancel" is a question, NOT a refund_request unless money is mentioned.

Message:
"""
{{ $input.message }}
"""

Wiring it into n8n

The production shape: a single OpenAI/Claude node returns the JSON, an n8n Switch node reads the category field, each output route goes to the matching downstream branch, and a fallback route catches uncertain (plus any category that has no rule of its own) and ships it to a #triage Slack channel. The Switch node is your audit trail: every execution shows which route fired and why.

json
{
  "nodes": [
    { "name": "Webhook", "type": "n8n-nodes-base.webhook" },
    {
      "name": "Classify",
      "type": "@n8n/n8n-nodes-langchain.openAi",
      "parameters": {
        "model": "gpt-4o-mini",
        "options": { "response_format": "json_object", "temperature": 0 }
      }
    },
    {
      "name": "Route",
      "type": "n8n-nodes-base.switch",
      "parameters": {
        "rules": [
          { "value": "={{ $json.category }}", "operation": "equals", "value2": "complaint" },
          { "value": "={{ $json.category }}", "operation": "equals", "value2": "question" },
          { "value": "={{ $json.category }}", "operation": "equals", "value2": "refund_request" }
        ],
        "fallbackOutput": 3
      }
    }
  ]
}
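
If the Switch semantics are easier to reason about as code, the routing above behaves roughly like the function below. This is a sketch of the logic only, not n8n's implementation. The names `RULES` and `switch_output` are mine. Note what it makes visible: with only three rules listed, praise and spam land on the fallback output alongside uncertain.

```python
# Mirrors the three Switch rules above; index 3 plays the role of fallbackOutput.
RULES = ["complaint", "question", "refund_request"]
FALLBACK_OUTPUT = len(RULES)  # 3


def switch_output(item: dict) -> int:
    """Return the index of the first matching rule, or the fallback index."""
    category = item.get("category")
    for index, expected in enumerate(RULES):
        if category == expected:
            return index
    return FALLBACK_OUTPUT


for category in ["complaint", "refund_request", "praise", "uncertain"]:
    print(category, "->", switch_output({"category": category}))
```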

The story behind every "categorise by INTENT not keywords" tip

May 2024, Tuesday morning. An 8-person SaaS support inbox getting ~140 messages/day. We built an early version of the classifier above with five categories, one of them "billing", and a 0.7 threshold. Week one accuracy: 82%. Looked great. Then I checked the failures: 70% of the misclassifications were landing in the "billing" branch. The reason was embarrassingly simple. The prompt defined billing by keywords ("invoice", "charge", "subscription", "$") and a lot of normal product questions happened to mention pricing. "Hey what is the difference between the $19 plan and the $49 plan?" routed straight to billing. We rewrote the prompt to define each category by intent (billing = user is trying to change their payment, get a refund, or dispute a charge), gave the model two examples per category, and accuracy jumped to 94% in a single deploy. The lesson stuck: never define a classifier category by the words it contains; define it by what the user is trying to accomplish.
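
The failure is easy to reproduce without a model at all. The sketch below is not the prompt we shipped; it is a plain keyword matcher using the same four keywords, which is roughly the behaviour a keyword-defined category invites. The function name and the second and third sample messages are invented for the demo.

```python
BILLING_KEYWORDS = ("invoice", "charge", "subscription", "$")


def keyword_is_billing(message: str) -> bool:
    """True if any billing keyword appears anywhere in the message."""
    text = message.lower()
    return any(keyword in text for keyword in BILLING_KEYWORDS)


pricing_question = "Hey what is the difference between the $19 plan and the $49 plan?"
real_billing = "I was charged twice this month, please fix my invoice."
product_question = "How do I export a report?"

# The pricing question and the genuine billing issue are indistinguishable
# to a keyword rule: both simply contain a billing keyword.
for message in (pricing_question, real_billing, product_question):
    print(keyword_is_billing(message), "|", message)
```

An intent definition separates the first two messages because only one of them is trying to change or dispute a payment.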

Confidence gating: the cheapest insurance you can buy
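
The gate itself is a few lines that sit between the classifier and the router. Below is a minimal sketch using the 0.7 default and the 0.85 bar for irreversible actions mentioned elsewhere in this article. Which categories count as irreversible is an assumption on my part, as are the names `gate` and `human_triage`.

```python
DEFAULT_THRESHOLD = 0.7
IRREVERSIBLE_THRESHOLD = 0.85
# Assumption: categories whose downstream action is hard to undo.
IRREVERSIBLE = {"refund_request"}


def gate(result: dict) -> str:
    """Return the branch to run, or 'human_triage' when confidence is too low."""
    category = result.get("category", "uncertain")
    confidence = result.get("confidence", 0.0)
    threshold = IRREVERSIBLE_THRESHOLD if category in IRREVERSIBLE else DEFAULT_THRESHOLD
    if category == "uncertain" or confidence < threshold:
        return "human_triage"
    return category


print(gate({"category": "complaint", "confidence": 0.72}))
print(gate({"category": "refund_request", "confidence": 0.80}))
```

The gate repeats in code the "below 0.7 means uncertain" rule that the prompt already states. That is deliberate: the prompt asks the model to behave, and the gate enforces the rule even when it doesn't.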

The opinion I will defend

Pitfalls that bite within the first month

  • Free-text output: the model returns "I think this is a complaint" instead of JSON, the Switch node throws, the workflow dies silently. Force response_format: json_object and validate.
  • Temperature > 0 on a classifier: non-deterministic routing for the same input. Always set temperature: 0 for classifiers.
  • Letting the LLM decide policy: "is this user eligible for a refund?" must be a deterministic IF on order date + amount, not a model judgement. Models will be reasonable 95% of the time and quietly catastrophic 5%.
  • No fallback branch: if you Switch on category and the model returns one you didn’t list, the workflow stops. Always wire fallbackOutput to a #triage channel.
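
For the policy pitfall, the deterministic version is short enough that there is little excuse to skip it. The sketch below assumes a 30-day window and a 200.00 cap purely for illustration. Your actual policy values, and the function name `refund_eligible`, are placeholders.

```python
from datetime import date, timedelta

REFUND_WINDOW = timedelta(days=30)  # hypothetical policy value
MAX_AUTO_REFUND = 200.00            # hypothetical policy value


def refund_eligible(order_date: date, amount: float, today: date) -> bool:
    """Pure rule on order date + amount; no model involved."""
    within_window = timedelta(0) <= today - order_date <= REFUND_WINDOW
    return within_window and 0 < amount <= MAX_AUTO_REFUND


print(refund_eligible(date(2024, 5, 1), 49.0, today=date(2024, 5, 20)))
print(refund_eligible(date(2024, 3, 1), 49.0, today=date(2024, 5, 20)))
```

The model can still route a message to the refund branch. This function is what decides whether money moves, and it gives the same answer every time for the same order.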

Cost reality check

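GPT-4o-mini at OpenAI 2024 pricing ($0.15/M input, $0.60/M output) makes the classifier itself close to free. Budgeting ~50 input + 30 output tokens per call gives 50 × $0.15/M + 30 × $0.60/M ≈ $0.000026, which rounds up to roughly $0.00003 per classification, or $30 per million. Treat that as a floor: the classification prompt and longer messages (a 200-token message, say) add input tokens, though even several hundred input tokens still cost a small fraction of a cent per call. A team handling 5,000 support messages/day (about 150,000/month) pays about $4.50/month at the $0.00003 rate. The Switch node and the workflow are free. The expensive thing is the engineer-hour you spend on the prompt; budget 3–5 hours for the first one and ~30 minutes to add a new category later.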
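
The arithmetic is worth keeping as a small function so you can plug in your own token counts. This sketch uses the prices and the ~50 input + 30 output token estimate quoted above. The function names and the 30-day month are my assumptions.

```python
INPUT_PRICE_PER_M = 0.15   # USD per million input tokens (GPT-4o-mini, 2024)
OUTPUT_PRICE_PER_M = 0.60  # USD per million output tokens


def cost_per_call(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one classification call."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


def monthly_cost(messages_per_day: int, per_call: float, days: int = 30) -> float:
    """Cost in USD of classifying every message for `days` days."""
    return messages_per_day * days * per_call


per_call = cost_per_call(50, 30)
monthly = monthly_cost(5_000, per_call)
print(f"per call: ${per_call:.7f}, per month: ${monthly:.2f}")
```

With the unrounded per-call figure the month comes out a little under the $4.50 quoted above, which uses the rounded $0.00003 rate. Either way the conclusion holds: the classifier is not where the money goes.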

Where to go next

For the full classifier playbook with confusion-matrix tuning, see AI conditional logic in automation workflows. For dynamic tool selection (where the AI picks which downstream API to call, not just which branch), see conditional routing for AI pipelines. For the durable-execution side of branching, AI pipeline automation tools for developers covers Inngest and Temporal.

“Models are excellent at judging intent and terrible at enforcing policy. Let them route. Don’t let them decide.”

Frequently asked questions

What is AI conditional logic in a workflow?

It’s a workflow branch where the decision — which path runs next — is made by a language model classifying the input, rather than a hand-coded if/else on a keyword or threshold. Typical use: classify a support message by intent and route it to the matching team.

When should I use a model instead of a regular IF?

Use a model when the decision is about intent, sentiment, document type, or anything else where the same outcome can be expressed in many different phrases. Use a regular IF when the decision is about shape (HTTP status, amount > X, field present). Never use a model for policy enforcement (refund eligibility, compliance gates).

What’s the best model for classification in 2026?

GPT-4o-mini is the price/quality sweet spot for short messages ($0.15/M input). Claude Haiku is a close second. For local/private setups, Llama 3 8B with a careful prompt hits ~90% of cloud quality on classification tasks at zero per-call cost.

How do I prevent the model from returning free-text instead of JSON?

Use the provider’s structured-output mode: OpenAI’s response_format: { type: "json_object" } or function calling, Anthropic’s tool-use schema. Then validate the parsed JSON against a schema (Zod, ajv) and route any parse failure to a triage branch.
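
Zod and ajv are JavaScript libraries. As a language-neutral illustration, here is a minimal sketch of the same validate-or-triage step using only Python's standard library. The category list comes from the prompt template earlier in this article. The function name and the fallback object are assumptions of mine.

```python
import json

ALLOWED = {"complaint", "question", "praise", "refund_request", "spam", "uncertain"}
FALLBACK = {"category": "uncertain", "confidence": 0.0, "reason": "invalid model output"}


def parse_classification(raw: str) -> dict:
    """Parse the model's reply; anything malformed becomes 'uncertain'."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return dict(FALLBACK)
    if not isinstance(data, dict):
        return dict(FALLBACK)
    category = data.get("category")
    confidence = data.get("confidence")
    if not isinstance(category, str) or category not in ALLOWED:
        return dict(FALLBACK)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        return dict(FALLBACK)
    if not 0.0 <= confidence <= 1.0:
        return dict(FALLBACK)
    return {
        "category": category,
        "confidence": float(confidence),
        "reason": str(data.get("reason", "")),
    }


print(parse_classification("I think this is a complaint"))
```

Because every failure path returns the uncertain object, a free-text reply ends up in the triage branch instead of crashing the router.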

What confidence threshold should I use?

0.7 is the safe default for support routing. Lower (0.6) if your "uncertain" branch is cheap (just a human glance). Higher (0.85) if the downstream action is irreversible (sending a refund, escalating to a senior agent). Always measure on a labelled sample before tuning.
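
"Measure on a labelled sample" can be as simple as the sketch below: for each candidate threshold, compute how many messages would be auto-routed (coverage) and how many of those were routed correctly (accuracy). The five rows are toy data invented for the example, not real results, and the function name is my own.

```python
def evaluate_threshold(samples, threshold):
    """samples: list of (predicted, confidence, true_label) tuples.

    Returns (coverage, accuracy): the share of messages auto-routed at this
    threshold, and the share of those auto-routed messages that were correct.
    """
    auto = [(pred, truth) for pred, conf, truth in samples if conf >= threshold]
    coverage = len(auto) / len(samples) if samples else 0.0
    accuracy = sum(pred == truth for pred, truth in auto) / len(auto) if auto else 0.0
    return coverage, accuracy


# Toy labelled sample (made up for illustration).
LABELLED = [
    ("complaint", 0.95, "complaint"),
    ("question", 0.88, "question"),
    ("complaint", 0.74, "question"),
    ("praise", 0.65, "complaint"),
    ("question", 0.62, "question"),
]

for threshold in (0.6, 0.7, 0.85):
    coverage, accuracy = evaluate_threshold(LABELLED, threshold)
    print(f"threshold={threshold}: coverage={coverage:.2f}, accuracy={accuracy:.2f}")
```

Raising the threshold trades coverage for accuracy. The right point depends on how expensive a human glance is compared with a wrong route.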

How much does AI classification cost at scale?

At OpenAI 2024 pricing for GPT-4o-mini, classifying a 200-token message costs about $0.00003. A team handling 5,000 messages/day pays roughly $4.50/month for classification. Engineer time on the prompt is the bigger line item — budget 3–5 hours for the initial setup.