The Best Self-Hosted Automation Platform for Running Local LLM Workflows
6 min read

For most teams running local LLM workflows, the best self-hosted automation platform is n8n paired with Ollama on the same Docker network. It self-hosts under a fair-code licence, ships native LLM and HTTP nodes, and, crucially, still handles the unglamorous glue around the model: Postgres writes, queues, cron schedules, and webhooks. The LLM-native builders (Flowise, Langflow, Dify) look better in a demo, but they fall behind the moment your workflow has to touch a database and a Slack channel in the same run.
Key takeaways
- n8n + Ollama is the default pick: native LLM nodes, fair-code self-hosting, and a real node ecosystem for the glue around the model.
- Activepieces is the cleanest MIT-licensed alternative if you want a simpler UI and do not need n8n’s node breadth.
- Windmill wins if your team is code-first and wants TypeScript or Python scripts over a visual canvas.
- Flowise and Langflow are LLM-native but narrow. Great for a chatbot, weak the moment you need cron, queues, and database writes.
- The deciding factor is rarely the LLM features. It is whether the platform handles the unglamorous plumbing around the model.
What "best" means for local LLM work
A local LLM workflow is rarely just the model. It is a trigger, some data fetched from somewhere, a call to a model running on your own hardware, validation of what came back, and then an action: write a row, send a message, update a ticket. The model is one node out of eight. So the platform that wins is not the one with the prettiest prompt box. It is the one that does the model call well and the other seven nodes without fighting you.
Ollama, the tool that runs open-weight models locally, exposes an OpenAI-compatible API on port 11434 by default (Ollama docs). Any platform that can make an HTTP request can talk to it, so raw connectivity is not the differentiator. The differentiator is everything around that call.

The contenders, head to head
| Platform | Licence | LLM-native | Best for |
|---|---|---|---|
| n8n | Fair-code (Sustainable Use) | Yes, native nodes | General automation with a local model in the loop |
| Activepieces | MIT | Yes, AI pieces | Teams wanting a simpler, fully open UI |
| Windmill | AGPL / commercial | Via scripts | Code-first teams who prefer TypeScript or Python |
| Flowise | Apache 2.0 | Yes, LLM-first | Standalone chatbots and RAG demos |
| Node-RED | Apache 2.0 | Via HTTP nodes | IoT and event glue, not LLM-shaped |
n8n leads because the breadth is already there: 400-plus integrations, a Code node for the cases the integrations miss, and an Ollama Chat Model node so you are not hand-rolling the request. Activepieces is the one I recommend when a team finds n8n’s licence or density off-putting; it is MIT-licensed and the UI is gentler. Windmill is the pick for engineers who would rather write a script than drag a node. Flowise and Langflow are excellent at exactly one thing and frustrating outside it.
The Friday I moved a team off the pretty tool
January 2025, a Friday morning, a 5-person legal-research shop wanted contracts read and summarised locally for privacy reasons, about 90 documents a day. Dana, their one technical person, had built the first version in Langflow because the RAG demo looked perfect. It worked until the workflow needed three more things in the same run: write the summary to Postgres, post a link in Slack, and re-run any document that failed parsing on a nightly cron. None of that was the model. All of it was a fight in a tool built around the model. We rebuilt it in n8n with Ollama running Llama 3.1 8B in about a day and a half. The model node was the easy part. The win was that cron, the Postgres node, and the Slack node were already in the box. Three months later it was still running on the same machine, untouched.
The hardware story is forgiving too. Llama 3.1 8B fits in roughly 8GB of VRAM in a 4-bit quantisation (Meta Llama 3.1 release, 2024), which means a single mid-range GPU, or even a recent Mac, runs this class of workflow without a server budget. The platform choice matters more than the silicon.
A minimal n8n plus Ollama stack
Put both on the same Docker network and n8n reaches Ollama by service name, no tunnels, no exposed ports. This is the whole base.
services:
ollama:
image: ollama/ollama:latest
volumes:
- ollama:/root/.ollama
# GPU passthrough optional; CPU works for small models
n8n:
image: n8nio/n8n:latest
ports:
- '5678:5678'
environment:
# n8n reaches the model by service name, not localhost
- OLLAMA_HOST=http://ollama:11434
depends_on:
- ollama
volumes:
ollama:From there, point the Ollama Chat Model node at http://ollama:11434 and build the rest of the workflow with the normal nodes. If you are starting from zero, the local LLM to n8n tutorial walks the first pipeline end to end, and local data extraction with DeepSeek covers the model side in more depth.
So which one should you pick
My opinion, held firmly but not blindly: start with n8n plus Ollama unless you have a specific reason not to. Pick Activepieces if the licence or the density bothers you, Windmill if your team writes code happily, and Flowise only if the entire job really is a chatbot and nothing else. Where this changes: if you are a solo builder with no appetite for ops, the maintenance tax on any self-hosted stack can outweigh the privacy and cost wins, and a managed option is the grown-up choice. For teams that already run their own infra, self-hosting local LLM workflows is one of the highest-leverage moves available, and n8n is the shortest path to it.
“The best platform for local LLM work is the one still running, unattended, three months after you built the thing. Demos are easy. Tuesdays are the test.”
Frequently asked questions
Frequently asked questions
Is n8n really better than Flowise for local LLM workflows?
For standalone chatbots, Flowise is excellent and often faster to build. For workflows that also need cron, queues, database writes, and integrations, n8n wins because that plumbing is built in rather than bolted on.
Do I need a GPU to self-host local LLM workflows?
Not for small models. An 8B model in 4-bit quant fits in about 8GB of VRAM and runs acceptably on a recent Mac or a single mid-range GPU. CPU-only works for low-volume batch jobs, just slower.
What licence does n8n use, and can I self-host it commercially?
n8n uses a fair-code Sustainable Use Licence. You can self-host and use it internally for free, including commercially. Reselling it as a hosted product is where the restrictions apply, so read the licence if that is your plan.
How does the platform connect to Ollama?
Over Ollama’s HTTP API on port 11434. Put the automation platform and Ollama on the same Docker network and connect by service name so you never expose the port publicly.
Is self-hosting cheaper than a managed automation platform?
On infrastructure, almost always. On total cost, only if someone already owns the ops. Once you price your own time for updates, backups, and incidents, a solo builder may find a managed plan is the cheaper honest answer.