Ollama (local LLM)

Ollama is a small server that runs open-weight LLMs locally — no cloud calls, no usage limits, your data never leaves your hardware. TravStats uses it for two things:

  1. Email parsing — primary, recommended. When Ollama is configured (the default in the bundled docker-compose), it parses booking confirmation emails before the built-in regex templates kick in. This is the strongly recommended path because Ollama handles multi-flight bookings (round trips, connections, multi-leg itineraries) reliably, which is the regex layer’s biggest weakness. Regex templates exist as the fallback when Ollama is unavailable.
  2. Boarding-pass vision — the first parser tried when you upload a boarding-pass image. It pulls the flight number, date, seat, and class from the image.

Both are technically optional — TravStats works without Ollama — but you’ll feel it on emails: regex-only parsing handles the eight built-in carriers’ single-leg bookings well, but falters on multi-flight emails or any unfamiliar airline. If you fly multi-leg itineraries or non-European carriers regularly, run Ollama.

The default docker-compose.prod.yml includes an Ollama container on the same private network as the app:

```yaml
ollama:
  image: ollama/ollama:latest
  # ... resource limits: 4 CPUs, 8 GB RAM, 1 GB reservation
```

That’s the bundled path — easy to start with, costs ~5 GB of disk and 4 GB RAM idle (the model lives in memory while the Ollama process is running).
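
To see what the pulled models actually cost on disk, you can check Ollama’s model store inside the container (the path below is the image’s default location, so adjust it if you’ve remapped the volume):

```sh
docker exec travstats-ollama du -sh /root/.ollama/models
```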

The external path means you run Ollama somewhere else (Mac mini, gaming PC, separate Linux box) and point TravStats at it via the OLLAMA_URL env var. This is the right choice if:

  • You’re running TravStats on a Pi 4 or NAS where 4 GB RAM is too much to spare
  • You already have an Ollama running for other purposes (LibreChat, OpenWebUI, …)
  • You want a beefier model than the host TravStats lives on can run

TravStats defaults to gemma3:12b for text parsing. You set the model in Admin → Settings → Ollama Model. The dropdown shows whatever models you’ve pulled into Ollama — TravStats doesn’t pull them for you.
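
The dropdown is presumably backed by Ollama’s standard model-list endpoint; you can query it yourself to see exactly what TravStats will see (assuming Ollama is on the default port):

```sh
# returns JSON listing every model Ollama has locally
curl http://localhost:11434/api/tags
```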

Recommended models for text (email parsing)

| Model | Size | RAM | Speed (M2 Mac mini) | Accuracy on TravStats parser test set |
| --- | --- | --- | --- | --- |
| gemma3:12b (default) | 8 GB | ~10 GB | ~3 s / email | 100% |
| llama3.1:8b | 5 GB | ~7 GB | ~2 s / email | 96% |
| qwen2.5:7b | 5 GB | ~7 GB | ~2 s / email | 95% |
| mistral:7b | 4 GB | ~6 GB | ~1.5 s / email | 92% |
| gemma3:4b | 3 GB | ~5 GB | ~1 s / email | 87% |

gemma3:12b is the default because TravStats benchmarked it at 100% on the parser test set (recorded in backend/src/services/parsers/__tests__/). It does need ~10 GB of RAM, though, so on smaller boxes use gemma3:4b or llama3.1:8b and accept slightly lower accuracy.

Recommended models for vision (boarding pass scan)

Vision capability is a separate model class. TravStats expects you to pull a vision model and select it under Admin → Settings → Boarding Pass Parser:

| Model | Size | Notes |
| --- | --- | --- |
| llama3.2-vision:11b | 8 GB | Best quality on barcoded passes |
| llava:13b | 8 GB | Older but reliable |
| bakllava:7b | 4 GB | Smallest viable; misses some hand-printed passes |

If you don’t pull a vision model, TravStats skips Ollama vision and goes straight to the next parser in the cascade (OpenAI / Claude / Tesseract — see Vision parsers).
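
For reference, a vision request to Ollama carries the image as base64 in the `images` array. A minimal sketch (the prompt and placeholder are illustrative, not TravStats’s actual call):

```sh
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2-vision:11b",
  "prompt": "Read the flight number, date, and seat from this boarding pass.",
  "images": ["<base64-encoded image>"],
  "stream": false
}'
```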

Pull models from inside the Ollama container:

```sh
docker exec travstats-ollama ollama pull gemma3:12b
docker exec travstats-ollama ollama pull llama3.2-vision:11b
```

The first pull is large (3–8 GB) and slow on a residential connection. Subsequent updates are diffs.
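
Once a pull finishes, a one-line smoke test confirms the model actually loads and answers:

```sh
docker exec travstats-ollama ollama run gemma3:12b "Reply with OK"
```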

List what’s installed:

```sh
docker exec travstats-ollama ollama list
```

In the TravStats admin UI, the Refresh button next to the model dropdown re-queries Ollama to pick up newly pulled models.

Rough numbers from the field:

| Host | Workable models | Notes |
| --- | --- | --- |
| Raspberry Pi 4 (4 GB) | None practical | Stick to email-template parsing only; disable Ollama |
| Raspberry Pi 5 (8 GB) | gemma3:4b, mistral:7b (slow) | ~10 s per email parse — works but you’ll feel it |
| NAS with x86 CPU (16 GB+) | gemma3:12b, llama3.1:8b | Fine for personal use |
| Apple Silicon (M1/M2 Mac mini, 16 GB) | All listed | Metal acceleration ≈ 3× CPU |
| Modern x86 CPU + 32 GB RAM | All listed, plus larger 70B-class models | Overkill for TravStats; useful if Ollama serves other apps too |
| GPU box (NVIDIA / AMD) | Whatever fits in VRAM | TravStats benefits from but doesn’t require a GPU |

The Apple Silicon path is the homelab sweet-spot — a $500 used Mac mini runs Ollama natively on macOS with Metal acceleration, while the TravStats stack stays on your Linux box and points at it over the LAN.
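
If you go that route, note that Ollama binds to 127.0.0.1 by default, so the TravStats host can’t reach it until you tell it to listen on the LAN. A minimal sketch using Homebrew (one install option among several):

```sh
brew install ollama
# bind to all interfaces so other machines on the LAN can reach it
OLLAMA_HOST=0.0.0.0 ollama serve
```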

In .env next to your compose file:

```sh
# Mac mini at 192.168.1.50, default Ollama port 11434
OLLAMA_URL=http://192.168.1.50:11434
```
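
Before restarting anything, confirm the app host can actually reach that address; Ollama answers on /api/version:

```sh
curl http://192.168.1.50:11434/api/version
# a JSON version string means Ollama is reachable
```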

Then in the compose file you can drop the ollama service entirely:

```yaml
services:
  app:
    # ... existing config
    # depends_on no longer needs ollama
  db:
    # ... existing config
  # ollama: removed entirely
```

Restart the stack. TravStats will hit the external Ollama; the local container is gone, freeing 5 GB of disk and 4 GB of RAM.
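
One way to restart that also cleans up the now-removed service’s leftover container:

```sh
docker compose -f docker-compose.prod.yml up -d --remove-orphans
```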

You can also override at runtime in the admin UI: Admin → Settings → Ollama URL takes precedence over the env var. Useful for testing.

If you only fly the eight airlines TravStats has built-in templates for — Lufthansa, Swiss, Austrian, Brussels, Ryanair, easyJet, Eurowings, Wizz Air — you can run without LLM parsing entirely.

Stop and remove the container:

```sh
docker compose -f docker-compose.prod.yml stop ollama
docker compose -f docker-compose.prod.yml rm -f ollama
```

Then clear the field in Admin → Settings → Ollama URL. Email parsing for non-templated airlines will return a “no match” message, and you can either record a user template or enter the flight manually. Built-in templates still work, and boarding-pass scanning falls back to the OpenAI/Claude/Tesseract cascade.

What Ollama sees: the full body of the email, or the base64-encoded image. Nothing leaves your network if your Ollama runs on your own hardware (the bundled container, a Mac mini on your LAN, etc.) — Ollama itself makes no outbound calls.
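
For the curious, a text-parsing request to a local Ollama looks roughly like this (the prompt is illustrative; TravStats’s actual prompt isn’t shown here):

```sh
curl http://localhost:11434/api/generate -d '{
  "model": "gemma3:12b",
  "prompt": "Extract flight number, date, and route from this email: ...",
  "stream": false
}'
```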

If you point OLLAMA_URL at a hosted Ollama service (some exist), that hosted provider sees everything. Don’t do that with passenger names and PNRs you’d rather keep private.

The Ollama container has a memory cap configured in the compose:

```yaml
deploy:
  resources:
    limits:
      cpus: "4.0"
      memory: 8G
```

If you’re running a larger model and the container gets OOM-killed, raise the limits. For an 8B model, 6 GB is enough; a 12B wants 10 GB; a 70B needs 48 GB+ and a GPU.
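
To confirm that a crash was actually an OOM kill rather than something else:

```sh
docker inspect travstats-ollama --format '{{.State.OOMKilled}}'
# "true" means the memory limit is too low for the model
```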

The first request after a cold start takes longer because Ollama loads the model weights into RAM. Subsequent calls are fast. TravStats keeps the model warm by sending periodic health-check calls — adjustable in Admin → Settings → Ollama Keep-alive.
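
If you’d rather pin the model in memory at the Ollama level, independent of TravStats’s health checks, Ollama’s own OLLAMA_KEEP_ALIVE setting does the same job. A sketch against the bundled compose service:

```yaml
ollama:
  environment:
    # keep the loaded model resident for an hour after the last request
    - OLLAMA_KEEP_ALIVE=1h
```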