How to Connect Local LLMs to n8n for Automated Data Extraction
The rise of open-weight large language models has transformed what is possible for solo developers and small teams. Models like Llama 3, Mistral, and Phi-3 can now run on consumer hardware, delivering performance that rivals hosted models on many tasks without sending a single byte of data to an external API. In this tutorial, we will walk through connecting a locally hosted LLM to n8n so you can extract structured data from emails, PDFs, and web pages automatically.
Why Use a Local LLM with n8n?
Cloud-based LLM APIs like OpenAI and Anthropic are powerful, but they come with per-token costs that scale quickly. For repetitive extraction tasks — parsing invoices, categorizing support tickets, summarizing meeting notes — a local LLM eliminates ongoing API fees entirely. Combined with n8n, you get a visual drag-and-drop interface to orchestrate these tasks without writing boilerplate code.
- Zero API costs — run unlimited inference on your own GPU or CPU
- Full data privacy — sensitive documents never leave your network
- Low latency — no network round-trips to cloud endpoints, though throughput depends on your hardware
- Customizable models — fine-tune or swap models without changing your workflow
Setting Up Ollama as Your Local LLM Server
Ollama is one of the easiest ways to get started with local LLMs. Install it on macOS, Linux, or Windows, then pull a model with a single command: ollama pull llama3. Once running, Ollama serves a REST API at http://localhost:11434 (with an OpenAI-compatible endpoint under /v1) that n8n can call directly using the HTTP Request node or the built-in AI Agent node.
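To see what the HTTP Request node actually sends, here is a minimal Python sketch of the request body for Ollama's /api/generate endpoint. The function names and the example prompt are illustrative; the "format": "json" field is Ollama's built-in option for constraining output to valid JSON.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def build_payload(prompt: str, model: str = "llama3") -> dict:
    """Build the JSON body that the n8n HTTP Request node would POST to Ollama."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,   # return one complete response instead of a token stream
        "format": "json",  # ask Ollama to constrain the reply to valid JSON
    }

def generate(prompt: str, model: str = "llama3") -> str:
    """POST the prompt to a running Ollama instance and return the completion text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(prompt, model)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Inspect the payload without needing a live server:
payload = build_payload("Extract the invoice total from: Total due: $431.50")
```

In n8n itself you would put the same fields into the HTTP Request node's JSON body, with the prompt filled in from the incoming webhook data via an expression.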
Building the n8n Extraction Workflow
In n8n, create a new workflow with a Webhook trigger. Add an HTTP Request node pointing to your Ollama endpoint, passing the incoming data as a prompt. Use a structured output prompt template to ensure the LLM returns JSON. Finally, route the parsed output to Google Sheets, Airtable, or your database of choice. With a small model on a GPU, the pipeline can process a document in a couple of seconds, though actual speed depends on model size and hardware.
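The structured output step is the part most likely to break, since models sometimes wrap their JSON in extra prose. A sketch of a prompt template and a tolerant parser, as you might run it in an n8n Code node (the field names "vendor", "date", and "total" are a hypothetical invoice schema; adjust them to your documents):

```python
import json

# Hypothetical extraction schema for invoice parsing.
PROMPT_TEMPLATE = """Extract the following fields from the document below and
reply with ONLY a JSON object containing the keys "vendor", "date", and "total".

Document:
{document}
"""

def build_prompt(document: str) -> str:
    return PROMPT_TEMPLATE.format(document=document)

def parse_llm_output(raw: str) -> dict:
    """Parse the model's reply, tolerating stray text around the JSON object."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in model output")
    return json.loads(raw[start : end + 1])

# Simulated model reply with extra prose around the JSON:
reply = 'Here is the data: {"vendor": "Acme Corp", "date": "2024-05-01", "total": 431.50}'
record = parse_llm_output(reply)
```

The extracted dict maps directly onto columns in a Google Sheets or Airtable node downstream.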
Once your workflow is live, you can extend it with error handling, retry logic, and conditional branching — all within the n8n visual editor. This approach scales beautifully: add more triggers, swap models, or chain multiple LLM calls for multi-step extraction tasks.
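n8n nodes have built-in retry settings, but if you prefer to handle transient Ollama failures yourself inside a Code node, a retry wrapper with exponential backoff might look like this (function names are illustrative, not part of n8n's API):

```python
import time

def with_retries(fn, attempts: int = 3, base_delay: float = 1.0):
    """Call fn(), retrying with exponential backoff on any exception."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: let the workflow's error branch handle it
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...

# Demonstration with a call that fails twice before succeeding:
calls = {"n": 0}

def flaky_llm_call():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

result = with_retries(flaky_llm_call, attempts=3, base_delay=0.01)
```

Routing the final exception to an n8n error workflow keeps failed documents visible instead of silently dropped.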