
Vision parsers (OpenAI / Anthropic / Tesseract)

Boarding-pass scanning works through a cascade: TravStats tries parsers in order until one returns a usable flight number. Each parser has a different cost, accuracy, and privacy profile, so the cascade lets you mix and match.

1. Ollama vision — local, free, default, ~3 s on M2
2. OpenAI Vision — cloud, paid, ~1 s, very accurate
3. Anthropic Claude — cloud, paid, ~1 s, very accurate
4. Tesseract OCR — local, free, slowest, less accurate
5. Manual entry — last resort, pre-filled with whatever OCR could read

The cascade is per-image: if Ollama returns “no flight number detected” on a particular pass, TravStats tries OpenAI on that one image, and so on. Subsequent passes start from the top of the cascade again.
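The fallthrough behaviour can be sketched in a few lines. This is a simplified illustration only — the parser names, the `parse` callable interface, and the field names are placeholders, not TravStats' actual code:

```python
# Illustrative sketch of the per-image parser cascade, not TravStats' real code.
# Each parser callable returns a dict of extracted fields, or None on failure.

def run_cascade(image_bytes, parsers):
    """Try each enabled parser in priority order; first usable result wins."""
    partial = {}  # best-effort fields, used to pre-fill manual entry
    for name, parse in parsers:
        try:
            result = parse(image_bytes)
        except Exception:
            continue  # provider error: fall through to the next tier
        if result and result.get("flight_number"):
            return name, result          # usable result: stop the cascade here
        if result:
            partial.update(result)       # keep partial reads for the form
    return "manual", partial             # last resort: pre-filled manual entry


# Toy usage with stub parsers:
parsers = [
    ("ollama", lambda img: None),                        # no flight number found
    ("openai", lambda img: {"flight_number": "BA123"}),  # succeeds
]
print(run_cascade(b"fake-image", parsers))  # → ('openai', {'flight_number': 'BA123'})
```

The key property is that each image restarts the loop from the top, so a single hard-to-read pass never demotes the whole cascade.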

You configure the cascade in Admin → Settings → Boarding Pass Parser. Each tier has an enable/disable toggle and a priority rank. The defaults are sensible; most users only ever change the default vision provider.

Ollama vision (default)

Covered in detail on the Ollama page.

In short: pull a vision model into the bundled Ollama container, select it as the default parser, and you’re done. Free, private, ~3 s per pass on an M2 Mac mini.

Recommended models: llama3.2-vision:11b (best quality) or bakllava:7b (smaller, lower RAM).

OpenAI Vision

Provider: platform.openai.com · Cost: ~$0.001 per boarding pass at the time of writing (gpt-4o-mini, ~500 output tokens per image).

Why choose it:

  • Faster than Ollama on most home hardware (~1 s vs ~3 s).
  • More accurate on hand-stamped or low-contrast passes — OpenAI’s vision models are trained on a broader dataset than the open-weight alternatives.
  • Useful when the bundled Ollama container is too heavy for your host (Pi, low-spec NAS) but you still want vision-based scanning.

Setup:

  1. Sign in at platform.openai.com, then API → Create new secret key. Copy it immediately; OpenAI shows it only once.
  2. Top up the account with a small balance (anything ≥ $5 covers thousands of scans).
  3. Admin → Settings → External APIs → OpenAI API key → paste, save.
  4. Admin → Settings → Boarding Pass Parser → set OpenAI as default (or move it ahead of Ollama in the cascade).
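For the curious, the request body that ends up at OpenAI looks roughly like this. The message shape follows OpenAI's documented vision format for the Chat Completions API; the prompt text and the helper name are illustrative, not taken from TravStats:

```python
import base64
import json

# Sketch of a Chat Completions request carrying a boarding-pass image.
# The "image_url" content part with a base64 data URL is OpenAI's documented
# way to attach images; the prompt wording here is just an example.

def build_openai_request(image_bytes: bytes, model: str = "gpt-4o-mini") -> dict:
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Extract flight number, date, and origin/destination "
                         "IATA codes from this boarding pass as JSON."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    }

req = build_openai_request(b"\xff\xd8fake-jpeg-bytes")
print(req["model"], len(json.dumps(req)))
```

Only the image and the prompt travel in this payload — none of your other TravStats data is included.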

The image is sent to OpenAI’s servers. OpenAI’s usage policies say API content isn’t used for training (as of 2026), but the image does cross the public internet. Don’t use this parser for boarding passes you’d rather keep entirely private.

A single boarding pass is ~$0.001. A user logging 100 flights / year spends ~$0.10 / year on OpenAI. The bundled rate limits in TravStats keep you from accidentally blowing through credits with a runaway script — see Personal Access Tokens.
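The arithmetic behind that estimate, using the numbers quoted above:

```python
# Back-of-envelope annual cost for a typical user.
cost_per_scan = 0.001        # ~$0.001 per boarding pass (gpt-4o-mini)
flights_per_year = 100
annual_cost = cost_per_scan * flights_per_year
print(f"${annual_cost:.2f} / year")  # → $0.10 / year
```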

Anthropic Claude

Provider: console.anthropic.com · Cost: comparable to OpenAI (~$0.001 per pass, depending on which Claude model you select).

Same trade-offs as OpenAI:

  • Pro: Cloud-fast, very accurate, high-quality on edge cases.
  • Con: Image leaves your network.
Setup:

  1. Anthropic Console → Settings → API Keys → create.
  2. Add a small balance (~$5 covers thousands of scans).
  3. Admin → Settings → External APIs → Anthropic API key → paste, save.
  4. Admin → Settings → Boarding Pass Parser → Add Claude to the cascade.

If you already have an Anthropic account for other things, use it and skip the extra signup. The quality difference between Claude and OpenAI vision on boarding passes is negligible — both are excellent. Pick whichever you have credit on.

You can configure both at once and order them in the cascade — TravStats tries one, falls back to the other if the first errors.

Tesseract OCR

Bundled. No setup, no API key; it runs inside the app container.

Tesseract is open-source OCR — it doesn’t “understand” images, just reads pixels into text. TravStats then runs heuristics on the output (regex for flight numbers, dates, IATA codes) to extract fields.
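A toy version of those heuristics might look like this. The patterns and field names are illustrative only — TravStats' actual rules live in the backend and are more thorough:

```python
import re

# Illustrative field-extraction heuristics over raw OCR text.
# Flight designator: 2-char airline code (letters or letter+digit) + 1-4 digits,
# e.g. "BA 117" or "U21234". The IATA pattern is deliberately naive and would
# need an airport-code whitelist in practice.
FLIGHT_RE = re.compile(r"\b([A-Z]{2}|[A-Z]\d|\d[A-Z])\s?(\d{1,4})\b")
IATA_RE = re.compile(r"\b[A-Z]{3}\b")

def extract_fields(ocr_text: str) -> dict:
    fields = {}
    m = FLIGHT_RE.search(ocr_text)
    if m:
        fields["flight_number"] = f"{m.group(1)}{m.group(2)}"
    codes = IATA_RE.findall(ocr_text)
    if len(codes) >= 2:
        fields["origin"], fields["destination"] = codes[0], codes[1]
    return fields

print(extract_fields("LHR -> JFK  BA 117  SEAT 23A"))
```

Regexes like these are exactly why clean, machine-printed text matters: one misread character in "BA 117" and the pattern no longer matches.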

Works well on:

  • High-contrast, machine-printed passes
  • Modern e-tickets with crisp digital rendering
  • Photos taken straight on with good lighting

Struggles with:

  • Hand-stamped boarding passes (regional carriers, older airports)
  • Photos at an angle or with glare
  • Passes printed on coloured backgrounds

When Tesseract fails or returns garbage, TravStats falls through to manual entry — pre-filling the form with whatever Tesseract did manage to read. Even a partial OCR pass usually saves you typing the flight number, so the parser is rarely worthless.

Tesseract speed and accuracy are tuned in backend/src/services/parsers/vision/. The defaults work for most use cases; advanced users can tweak language packs and PSM modes if they need to.

Choosing a cascade

The right cascade depends on what you optimise for:

“Privacy first, never send my data anywhere”

Cascade: Ollama vision → Tesseract OCR → manual

Free, fully local, slower on hand-stamped passes. The compromise is that some passes will need manual entry.

“Speed and accuracy first, cost is fine”

Cascade: OpenAI Vision → Tesseract OCR → manual

(Or Claude in place of OpenAI.) Per-scan cost is rounding-error small; quality is best-in-class.

“Local first, cloud fallback”

Cascade: Ollama vision → OpenAI Vision → Tesseract OCR → manual

Tries local first (free, private). Falls back to OpenAI for the ~5–10% of passes Ollama struggles on (hand-stamped, low-contrast, unusual layouts). Total cost stays in the cents-per-year range for a typical user.

“Low-spec host, no local model”

Cascade: OpenAI Vision → Tesseract OCR → manual

Skip Ollama entirely: no 5 GB model download, no 4 GB RAM requirement. Pi-friendly. Pay a couple of cents per year in API costs.

What the parsers see

Every parser receives the raw boarding-pass image — base64 encoded, ~50 KB to ~2 MB depending on resolution.
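Base64 is also why the uploaded payload is a little larger than the photo itself — the encoding expands data by a third:

```python
import base64

# Base64 emits 4 output bytes for every 3 input bytes, so a photo grows
# by ~33% on the wire (plus up to 2 padding bytes at the end).
image = bytes(300_000)              # stand-in for a ~300 KB JPEG
payload = base64.b64encode(image)
print(len(payload))                 # 400000 — exactly 4/3 of the input
print(len(payload) / len(image))    # ≈ 1.33
```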

Nothing else leaves your network for that scan. Your name, your other flights, your TravStats credentials — none of that is visible to the parser. After extraction, you see the result on the review screen and confirm before anything saves.