
Vision parsers (OpenAI / Anthropic / Tesseract)

Boarding-pass scanning works through a cascade: TravStats tries parsers in order until one returns a usable flight number. Each parser has a different cost, accuracy, and privacy profile, so the cascade lets you mix and match.

1. Ollama vision — local, free, default, ~3 s on M2
2. OpenAI Vision — cloud, paid, ~1 s, very accurate
3. Anthropic Claude — cloud, paid, ~1 s, very accurate
4. Tesseract OCR — local, free, slowest, less accurate
5. Manual entry — last resort, pre-filled with whatever OCR could read

The cascade is per-image: if Ollama returns “no flight number detected” on a particular pass, TravStats tries OpenAI on that one image, and so on. Subsequent passes start from the top of the cascade again.
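The fallthrough behaviour can be sketched in a few lines. This is a simplified illustration only — the parser names, the `parse` callable interface, and the field names are placeholders, not TravStats' actual code:

```python
# Illustrative sketch of the per-image parser cascade, not TravStats' real code.
# Each parser callable returns a dict of extracted fields, or None on failure.

def run_cascade(image_bytes, parsers):
    """Try each enabled parser in priority order; first usable result wins."""
    partial = {}  # best-effort fields, used to pre-fill manual entry
    for name, parse in parsers:
        try:
            result = parse(image_bytes)
        except Exception:
            continue  # provider error: fall through to the next tier
        if result and result.get("flight_number"):
            return name, result          # usable result: stop the cascade here
        if result:
            partial.update(result)       # keep partial reads for the form
    return "manual", partial             # last resort: pre-filled manual entry


# Toy usage with stub parsers:
parsers = [
    ("ollama", lambda img: None),                        # no flight number found
    ("openai", lambda img: {"flight_number": "BA123"}),  # succeeds
]
print(run_cascade(b"fake-image", parsers))  # → ('openai', {'flight_number': 'BA123'})
```

The key property is that each image restarts the loop from the top, so a single hard-to-read pass never demotes the whole cascade.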

You configure the cascade in Admin → Settings → Boarding Pass Parser. Each tier has an enable/disable toggle and a priority rank. The defaults are sensible; most users only ever change the default vision provider.

Ollama vision (default)

Covered in detail on the Ollama page.

In short: pull a vision model into the bundled Ollama container, select it as the default parser, and you’re done. Free, private, ~3 s per pass on an M2 Mac mini.

Recommended models: llama3.2-vision:11b (best quality) or bakllava:7b (smaller, lower RAM).

OpenAI Vision

Provider: platform.openai.com · Cost: ~$0.001 per boarding pass at the time of writing (gpt-4o-mini, ~500 output tokens per image).

Why choose it:

  • Faster than Ollama on most home hardware (~1 s vs ~3 s).
  • More accurate on hand-stamped or low-contrast passes — OpenAI’s vision models are trained on a broader dataset than the open-weight alternatives.
  • Useful when the bundled Ollama container is too heavy for your host (Pi, low-spec NAS) but you still want vision-based scanning.

Setup:

  1. Sign in at platform.openai.com, then API → Create new secret key. Copy it immediately; OpenAI shows it only once.
  2. Top up the account with a small balance (anything ≥ $5 covers thousands of scans).
  3. Admin → Settings → External APIs → OpenAI API key → paste, save.
  4. Admin → Settings → Boarding Pass Parser → set OpenAI as default (or move it ahead of Ollama in the cascade).
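For the curious, the request body that ends up at OpenAI looks roughly like this. The message shape follows OpenAI's documented vision format for the Chat Completions API; the prompt text and the helper name are illustrative, not taken from TravStats:

```python
import base64
import json

# Sketch of a Chat Completions request carrying a boarding-pass image.
# The "image_url" content part with a base64 data URL is OpenAI's documented
# way to attach images; the prompt wording here is just an example.

def build_openai_request(image_bytes: bytes, model: str = "gpt-4o-mini") -> dict:
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Extract flight number, date, and origin/destination "
                         "IATA codes from this boarding pass as JSON."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    }

req = build_openai_request(b"\xff\xd8fake-jpeg-bytes")
print(req["model"], len(json.dumps(req)))
```

Only the image and the prompt travel in this payload — none of your other TravStats data is included.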

The image is sent to OpenAI’s servers. OpenAI’s usage policies say API content isn’t used for training (as of 2026), but the image does cross the public internet. Don’t use this parser for boarding passes you’d rather keep entirely private.

A single boarding pass is ~$0.001. A user logging 100 flights / year spends ~$0.10 / year on OpenAI. The bundled rate limits in TravStats keep you from accidentally blowing through credits with a runaway script — see Personal Access Tokens.
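The arithmetic behind that estimate, using the numbers quoted above:

```python
# Back-of-envelope annual cost for a typical user.
cost_per_scan = 0.001        # ~$0.001 per boarding pass (gpt-4o-mini)
flights_per_year = 100
annual_cost = cost_per_scan * flights_per_year
print(f"${annual_cost:.2f} / year")  # → $0.10 / year
```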

Anthropic Claude

Provider: console.anthropic.com · Cost: comparable to OpenAI (~$0.001 per pass, depending on which Claude model you select).

Same trade-offs as OpenAI:

  • Pro: Cloud-fast, very accurate, high-quality on edge cases.
  • Con: Image leaves your network.
Setup:

  1. Anthropic Console → Settings → API Keys → create.
  2. Add a small balance (~$5 covers thousands of scans).
  3. Admin → Settings → External APIs → Anthropic API key → paste, save.
  4. Admin → Settings → Boarding Pass Parser → Add Claude to the cascade.

If you already have an Anthropic account for other things, use it and skip the extra signup. The quality difference between Claude and OpenAI vision on boarding passes is negligible — both are excellent. Pick whichever you have credit on.

You can configure both at once and order them in the cascade — TravStats tries one, falls back to the other if the first errors.

Tesseract OCR

Bundled. No setup, no API key; it runs inside the app container.

Tesseract is open-source OCR — it doesn’t “understand” images, just reads pixels into text. TravStats then runs heuristics on the output (regex for flight numbers, dates, IATA codes) to extract fields.
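A toy version of those heuristics might look like this. The patterns and field names are illustrative only — TravStats' actual rules live in the backend and are more thorough:

```python
import re

# Illustrative field-extraction heuristics over raw OCR text.
# Flight designator: 2-char airline code (letters or letter+digit) + 1-4 digits,
# e.g. "BA 117" or "U21234". The IATA pattern is deliberately naive and would
# need an airport-code whitelist in practice.
FLIGHT_RE = re.compile(r"\b([A-Z]{2}|[A-Z]\d|\d[A-Z])\s?(\d{1,4})\b")
IATA_RE = re.compile(r"\b[A-Z]{3}\b")

def extract_fields(ocr_text: str) -> dict:
    fields = {}
    m = FLIGHT_RE.search(ocr_text)
    if m:
        fields["flight_number"] = f"{m.group(1)}{m.group(2)}"
    codes = IATA_RE.findall(ocr_text)
    if len(codes) >= 2:
        fields["origin"], fields["destination"] = codes[0], codes[1]
    return fields

print(extract_fields("LHR -> JFK  BA 117  SEAT 23A"))
```

Regexes like these are exactly why clean, machine-printed text matters: one misread character in "BA 117" and the pattern no longer matches.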

Works well on:

  • High-contrast, machine-printed passes
  • Modern e-tickets with crisp digital rendering
  • Photos taken straight on with good lighting

Struggles with:

  • Hand-stamped boarding passes (regional carriers, older airports)
  • Photos at an angle or with glare
  • Passes printed on coloured backgrounds

When Tesseract fails or returns garbage, TravStats falls through to manual entry — pre-filling the form with whatever Tesseract did manage to read. Even a partial OCR pass usually saves you typing the flight number, so the parser is rarely worthless.

Tesseract speed and accuracy are tuned in backend/src/services/parsers/vision/. The defaults work for most use cases; advanced users can tweak language packs and PSM modes if they need to.

Choosing a cascade

The right cascade depends on what you optimise for:

“Privacy first, never send my data anywhere”

Cascade: Ollama vision → Tesseract OCR → manual

Free, fully local, slower on hand-stamped passes. The compromise is that some passes will need manual entry.

“Speed and accuracy first, cost is fine”

Cascade: OpenAI Vision → Tesseract OCR → manual

(Or Claude in place of OpenAI.) Per-scan cost is rounding-error small; quality is best-in-class.

“Local first, cloud fallback”

Cascade: Ollama vision → OpenAI Vision → Tesseract OCR → manual

Tries local first (free, private). Falls back to OpenAI for the ~5–10% of passes Ollama struggles on (hand-stamped, low-contrast, unusual layouts). Total cost stays in the cents-per-year range for a typical user.

“Low-spec host, no local model”

Cascade: OpenAI Vision → Tesseract OCR → manual

Skip Ollama entirely: no 5 GB model download, no 4 GB RAM requirement. Pi-friendly. Pay a couple of cents per year in API costs.

What the parsers see

Every parser receives the raw boarding-pass image — base64 encoded, ~50 KB to ~2 MB depending on resolution.
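Base64 is also why the uploaded payload is a little larger than the photo itself — the encoding expands data by a third:

```python
import base64

# Base64 emits 4 output bytes for every 3 input bytes, so a photo grows
# by ~33% on the wire (plus up to 2 padding bytes at the end).
image = bytes(300_000)              # stand-in for a ~300 KB JPEG
payload = base64.b64encode(image)
print(len(payload))                 # 400000 — exactly 4/3 of the input
print(len(payload) / len(image))    # ≈ 1.33
```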

Nothing else leaves your network for that scan. Your name, your other flights, your TravStats credentials — none of that is visible to the parser. After extraction, you see the result on the review screen and confirm before anything saves.