Ollama (local LLM)

Ollama is a small server that runs open-weight LLMs locally — no cloud calls, no usage limits, your data never leaves your hardware. TravStats uses it for two things:

  1. Email parsing — primary, recommended. When Ollama is configured (the default in the bundled docker-compose), it parses booking confirmation emails before the built-in regex templates kick in. This is the strongly recommended path because Ollama handles multi-flight bookings (round trips, connections, multi-leg itineraries) reliably, which is the regex layer’s biggest weakness. Regex templates exist as the fallback when Ollama is unavailable.
  2. Boarding-pass vision — the first parser tried when you upload a boarding-pass image. It pulls the flight number, date, seat, and class from the image.

Both are technically optional — TravStats works without Ollama — but you’ll feel it on emails: regex-only parsing handles the eight built-in carriers’ single-leg bookings well, but falters on multi-flight emails or any unfamiliar airline. If you fly multi-leg itineraries or non-European carriers regularly, run Ollama.

The default docker-compose.prod.yml includes an Ollama container on the same private network as the app:

```yaml
ollama:
  image: ollama/ollama:latest
  # ... resource limits: 4 CPUs, 8 GB RAM, 1 GB reservation
```

That’s the bundled path — easy to start with, costs ~5 GB of disk and 4 GB RAM idle (the model lives in memory while the Ollama process is running).
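
To see what the pulled models actually cost on disk, you can check Ollama’s model store inside the container (the path below is the image’s default location, so adjust it if you’ve remapped the volume):

```sh
docker exec travstats-ollama du -sh /root/.ollama/models
```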

The external path means you run Ollama somewhere else (Mac mini, gaming PC, separate Linux box) and point TravStats at it via the OLLAMA_URL env var. This is the right choice if:

  • You’re running TravStats on a Pi 4 or NAS where 4 GB RAM is too much to spare
  • You already have an Ollama running for other purposes (LibreChat, OpenWebUI, …)
  • You want a beefier model than the host TravStats lives on can run

TravStats defaults to gemma3:12b for text parsing. You set the model in Admin → Settings → Ollama Model. The dropdown shows whatever models you’ve pulled into Ollama — TravStats doesn’t pull them for you.
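
The dropdown is presumably backed by Ollama’s standard model-list endpoint; you can query it yourself to see exactly what TravStats will see (assuming Ollama is on the default port):

```sh
# returns JSON listing every model Ollama has locally
curl http://localhost:11434/api/tags
```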

Recommended models for text (email parsing)

| Model | Size | RAM | Speed (M2 Mac mini) | Accuracy on TravStats parser test set |
| --- | --- | --- | --- | --- |
| gemma3:12b (default) | 8 GB | ~10 GB | ~3 s / email | 100% |
| llama3.1:8b | 5 GB | ~7 GB | ~2 s / email | 96% |
| qwen2.5:7b | 5 GB | ~7 GB | ~2 s / email | 95% |
| mistral:7b | 4 GB | ~6 GB | ~1.5 s / email | 92% |
| gemma3:4b | 3 GB | ~5 GB | ~1 s / email | 87% |

gemma3:12b is the default because TravStats benchmarked it at 100% on the parser test set (recorded in backend/src/services/parsers/__tests__/). It does need ~10 GB of RAM, though, so on smaller boxes use gemma3:4b or llama3.1:8b and accept slightly lower accuracy.

Recommended models for vision (boarding pass scan)

Vision capability is a separate model class. TravStats expects you to pull a vision model and select it under Admin → Settings → Boarding Pass Parser:

| Model | Size | Notes |
| --- | --- | --- |
| llama3.2-vision:11b | 8 GB | Best quality on barcoded passes |
| llava:13b | 8 GB | Older but reliable |
| bakllava:7b | 4 GB | Smallest viable; misses some hand-printed passes |

If you don’t pull a vision model, TravStats skips Ollama vision and goes straight to the next parser in the cascade (OpenAI / Claude / Tesseract — see Vision parsers).
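
For reference, a vision request to Ollama carries the image as base64 in the `images` array. A minimal sketch (the prompt and placeholder are illustrative, not TravStats’s actual call):

```sh
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2-vision:11b",
  "prompt": "Read the flight number, date, and seat from this boarding pass.",
  "images": ["<base64-encoded image>"],
  "stream": false
}'
```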

Pull models from inside the Ollama container:

```sh
docker exec travstats-ollama ollama pull gemma3:12b
docker exec travstats-ollama ollama pull llama3.2-vision:11b
```

The first pull is large (3–8 GB) and slow on a residential connection. Subsequent updates are diffs.
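
Once a pull finishes, a one-line smoke test confirms the model actually loads and answers:

```sh
docker exec travstats-ollama ollama run gemma3:12b "Reply with OK"
```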

List what’s installed:

```sh
docker exec travstats-ollama ollama list
```

In the TravStats admin UI, the Refresh button next to the model dropdown re-queries Ollama to pick up newly pulled models.

Rough numbers from the field:

| Host | Workable models | Notes |
| --- | --- | --- |
| Raspberry Pi 4 (4 GB) | None practical | Stick to email-template parsing only; disable Ollama |
| Raspberry Pi 5 (8 GB) | gemma3:4b, mistral:7b (slow) | ~10 s per email parse — works but you’ll feel it |
| NAS with x86 CPU (16 GB+) | gemma3:12b, llama3.1:8b | Fine for personal use |
| Apple Silicon (M1/M2 Mac mini, 16 GB) | All listed | Metal acceleration ≈ 3× CPU |
| Modern x86 CPU + 32 GB RAM | All listed, plus larger 70B-class models | Overkill for TravStats; useful if Ollama serves other apps too |
| GPU box (NVIDIA / AMD) | Whatever fits in VRAM | TravStats benefits from but doesn’t require a GPU |

The Apple Silicon path is the homelab sweet-spot — a $500 used Mac mini runs Ollama natively on macOS with Metal acceleration, while the TravStats stack stays on your Linux box and points at it over the LAN.
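
If you go that route, note that Ollama binds to 127.0.0.1 by default, so the TravStats host can’t reach it until you tell it to listen on the LAN. A minimal sketch using Homebrew (one install option among several):

```sh
brew install ollama
# bind to all interfaces so other machines on the LAN can reach it
OLLAMA_HOST=0.0.0.0 ollama serve
```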

In .env next to your compose file:

```sh
# Mac mini at 192.168.1.50, default Ollama port 11434
OLLAMA_URL=http://192.168.1.50:11434
```
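
Before restarting anything, confirm the app host can actually reach that address; Ollama answers on /api/version:

```sh
curl http://192.168.1.50:11434/api/version
# a JSON version string means Ollama is reachable
```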

Then in the compose file you can drop the ollama service entirely:

```yaml
services:
  app:
    # ... existing config
    # depends_on no longer needs ollama
  db:
    # ... existing config
  # ollama: removed entirely
```

Restart the stack. TravStats will hit the external Ollama; the local container is gone, freeing 5 GB of disk and 4 GB of RAM.
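
One way to restart that also cleans up the now-removed service’s leftover container:

```sh
docker compose -f docker-compose.prod.yml up -d --remove-orphans
```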

You can also override at runtime in the admin UI: Admin → Settings → Ollama URL takes precedence over the env var. Useful for testing.

If you only fly the eight airlines TravStats has built-in templates for — Lufthansa, Swiss, Austrian, Brussels, Ryanair, easyJet, Eurowings, Wizz Air — you can run without LLM parsing entirely.

Stop and remove the container:

```sh
docker compose -f docker-compose.prod.yml stop ollama
docker compose -f docker-compose.prod.yml rm -f ollama
```

Then clear the field in Admin → Settings → Ollama URL. Email parsing for non-templated airlines will return a “no match” message, and you can either record a user template or enter the flight manually. Built-in templates still work, and boarding-pass scanning falls back to the OpenAI/Claude/Tesseract cascade.

What Ollama sees: the full body of the email, or the base64-encoded image. Nothing leaves your network if your Ollama runs on your own hardware (the bundled container, a Mac mini on your LAN, etc.) — Ollama itself makes no outbound calls.
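
For the curious, a text-parsing request to a local Ollama looks roughly like this (the prompt is illustrative; TravStats’s actual prompt isn’t shown here):

```sh
curl http://localhost:11434/api/generate -d '{
  "model": "gemma3:12b",
  "prompt": "Extract flight number, date, and route from this email: ...",
  "stream": false
}'
```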

If you point OLLAMA_URL at a hosted Ollama service (some exist), that hosted provider sees everything. Don’t do that with passenger names and PNRs you’d rather keep private.

The Ollama container has a memory cap configured in the compose:

```yaml
deploy:
  resources:
    limits:
      cpus: "4.0"
      memory: 8G
```

If you’re running a larger model and the container gets OOM-killed, raise the limits. For an 8B model, 6 GB is enough; a 12B wants 10 GB; a 70B needs 48 GB+ and a GPU.
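
To confirm that a crash was actually an OOM kill rather than something else:

```sh
docker inspect travstats-ollama --format '{{.State.OOMKilled}}'
# "true" means the memory limit is too low for the model
```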

The first request after a cold start takes longer because Ollama loads the model weights into RAM. Subsequent calls are fast. TravStats keeps the model warm by sending periodic health-check calls — adjustable in Admin → Settings → Ollama Keep-alive.
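
If you’d rather pin the model in memory at the Ollama level, independent of TravStats’s health checks, Ollama’s own OLLAMA_KEEP_ALIVE setting does the same job. A sketch against the bundled compose service:

```yaml
ollama:
  environment:
    # keep the loaded model resident for an hour after the last request
    - OLLAMA_KEEP_ALIVE=1h
```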