Question 1

Should I use OpenAI, Anthropic, Google, or an open-source model?

Accepted Answer

Start with OpenAI behind a feature flag, ship in a week, measure for two. Add Anthropic when you need 200K-token context windows. Add Google Gemini for research with sourced citations. Switch to open-source (Llama, DeepSeek, Mistral) for tasks where data must stay on your network or per-call cost dominates.

Question 2

What is the cheapest way to integrate an LLM into my app?

Accepted Answer

GPT-4o-mini at $0.15 / 1M input tokens covers most classification, summarisation, and extraction tasks. For everything cheaper, run Llama 3 8B on a $400 mini-PC with a 3060 GPU — marginal cost per call drops to electricity.

Question 3

How do I stop my LLM from hallucinating?

Accepted Answer

You cannot eliminate it. You can reduce it with: retrieval-augmented generation (give the model your source documents in-context), schema-constrained outputs (function calling), self-critique loops (have a second LLM call grade the first), and confidence-gated fallbacks (route low-confidence answers to a human).

Question 4

Is fine-tuning worth it, or is prompting enough?

Accepted Answer

Prompting + few-shot examples + RAG covers 95% of use cases as of 2026. Fine-tune only when (a) you have 1,000+ high-quality training examples, (b) the task is narrow and stable, and (c) latency or cost matters more than flexibility. Most "we should fine-tune" instincts are premature.

Question 5

How do I handle LLM API failures gracefully?

Accepted Answer

Exponential backoff with jitter, capped at 2–3 retries, then a fallback model (e.g. Claude if OpenAI is down). Surface a "service degraded" message to users rather than waiting silently. Log every failure to a queue you can replay when the provider recovers.

LLM integration

Key takeaways

Frequently asked questions about this category

Should I use OpenAI, Anthropic, Google, or an open-source model?

What is the cheapest way to integrate an LLM into my app?

How do I stop my LLM from hallucinating?

Is fine-tuning worth it, or is prompting enough?

How do I handle LLM API failures gracefully?

How to Build a RAG Pipeline: Practical Guide for 2026

Running DeepSeek Locally for Free, Secure Data Extraction

Build Multi-Agent Workflows in n8n with DeepSeek and Ollama