A local-first macOS app (Electron + React, better-sqlite3) reads the master tour spreadsheet (ten tours with real departures, prices, policies, FAQ, templates, and rentals) into SQLite. A two-stage pipeline runs per message: Gemini Flash-Lite triages the message and extracts client facts, then a GPT mini model retrieves the relevant tours and policies and writes a draft that cites its sources. The draft answers only the client's latest message and uses the rest of the thread as context, so settled points aren't re-answered.
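The two-stage flow above can be sketched roughly as follows. This is an illustration only, not the app's code: `Message`, `Triage`, `Draft`, `splitThread`, and `runPipeline` are hypothetical names, and the model calls are passed in as plain functions so the sketch makes no claim about any provider's API.

```typescript
// Hypothetical shapes for illustration; none of these names come from the app.
interface Message { from: "client" | "owner"; body: string; }
interface Triage { intent: string; facts: Record<string, string>; }
interface Draft { reply: string; sources: string[]; }

// Model calls are injected, so the sketch runs without any network access.
type TriageFn = (latest: string, context: string[]) => Promise<Triage>;
type DraftFn = (latest: string, context: string[], triage: Triage) => Promise<Draft>;

// Separate the client's latest message (the thing to answer) from the
// earlier messages, which serve as context only.
function splitThread(thread: Message[]): { latest: string; context: string[] } | null {
  for (let i = thread.length - 1; i >= 0; i--) {
    const m = thread[i];
    if (m && m.from === "client") {
      return { latest: m.body, context: thread.slice(0, i).map((x) => x.body) };
    }
  }
  return null; // no client message, so nothing to answer
}

// Stage 1 triages and extracts facts; stage 2 writes the cited draft.
async function runPipeline(
  thread: Message[],
  triage: TriageFn,
  draft: DraftFn
): Promise<Draft | null> {
  const split = splitThread(thread);
  if (split === null) return null;
  const facts = await triage(split.latest, split.context);
  return draft(split.latest, split.context, facts);
}
```

Keeping the latest message separate from the context at the type level is one way to make "answer only the newest message" hard to get wrong in the prompts that follow.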
Safety is two independent layers: every draft passes a model QA review and a model-independent guardrail scan that flags any price or term not found in the source data. Sixteen forbidden-claims rules are injected into every prompt, and values marked needs_owner_review are never quoted. The app degrades gracefully offline: with no cloud model available it falls back to owner-approved templates, and without Google access it falls back to the local .xlsx. An optional on-device Ollama model (qwen3) keeps the whole thing local. Gmail sync pulls threads and sends approved replies in-thread.
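A minimal sketch of what a model-independent price scan might look like, assuming prices appear in drafts as dollar amounts. The `SourceValue` shape, the `needsOwnerReview` field name, and the regular expression are all assumptions made for illustration; the real scan also covers terms, which this sketch does not.

```typescript
// Hypothetical record for a value loaded from the spreadsheet.
interface SourceValue { value: string; needsOwnerReview: boolean; }

// Normalize "$1,250", "1250", and "1250.00" to the same key.
function normalizePrice(raw: string): string {
  return Number(raw.replace(/[^0-9.]/g, "")).toFixed(2);
}

// Return every dollar amount in the draft that is not backed by a
// quotable source value. Values awaiting owner review are not quotable.
function scanDraft(draft: string, source: SourceValue[]): string[] {
  const quotable = new Set(
    source.filter((s) => !s.needsOwnerReview).map((s) => normalizePrice(s.value))
  );
  const priceRe = /\$\s?\d[\d,]*(?:\.\d{1,2})?/g; // assumed price format
  const flags: string[] = [];
  for (const match of draft.match(priceRe) ?? []) {
    if (!quotable.has(normalizePrice(match))) flags.push(match);
  }
  return flags;
}

// Usage: the tour price is backed by the source data, while the rental
// price is still marked for owner review, so only the latter is flagged.
const flagged = scanDraft("The tour is $1,250 and the rental is $90.", [
  { value: "1250", needsOwnerReview: false },
  { value: "90", needsOwnerReview: true },
]);
console.log(flagged);
```

Because the scan is plain string matching against the loaded data, it behaves the same whichever model produced the draft, which is the point of keeping it independent of the model QA review.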