AI FinOps for a Travel Platform: Finding €180K in Wasted Agent Spend
Context and challenge
A mid-sized travel platform—selling flights, hotels, and packaged trips across multiple markets—had steadily expanded its AI footprint over 18 months. What started as a single customer support assistant became 14 AI agents operating across:
- Customer support (triage, refunds, itinerary changes, disruption handling)
- Pricing and revenue (rate monitoring, competitive scanning, promo recommendations)
- Operations (supplier communications, internal ticket routing, policy Q&A)
The agents were owned by different teams, built on different frameworks, and billed through a mix of per-token usage, per-seat tools, and cloud infrastructure. Individually, each agent looked inexpensive. Collectively, AI spend had become one of the fastest-growing line items in the operating budget.
The leadership team’s concern wasn’t just cost—it was uncertainty:
- Which agents were truly driving measurable outcomes?
- Where was spend growing fastest, and why?
- Were multiple agents solving the same problem?
- How could the company scale AI responsibly without stifling innovation?
The platform initiated an AI FinOps audit—a structured approach to managing AI costs with the same rigor as cloud FinOps—aimed at improving ROI, governance, and accountability.
Approach and solution
The audit was designed to avoid a common failure mode: focusing only on invoices. Instead, it connected spend to usage, usage to output, and output to business value.
1) Build an AI spend map (end-to-end)
The first step was assembling a comprehensive view of all AI-related costs, not just model API bills. The audit captured:
- Model usage (tokens, requests, context window patterns)
- Orchestration costs (agent frameworks, workflow tools)
- Vector search and storage (embeddings, retrieval infrastructure)
- Human-in-the-loop costs (review queues, escalations)
- Hidden duplications (multiple tools doing similar tasks across teams)
Every agent received a unique cost center and owner. That owner was accountable for both spend and a clear outcome metric.
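The roll-up described above can be sketched as a small aggregation. The cost records, category names, agent names, and owners below are hypothetical placeholders, not data from the audit; real inputs would come from billing exports and tooling invoices.

```python
from collections import defaultdict

# Hypothetical cost records; categories mirror the audit's spend map.
COST_RECORDS = [
    {"agent": "support-triage", "category": "model_usage",   "eur": 4200.0},
    {"agent": "support-triage", "category": "vector_search", "eur": 900.0},
    {"agent": "ops-summary",    "category": "model_usage",   "eur": 11300.0},
    {"agent": "ops-summary",    "category": "orchestration", "eur": 2100.0},
]

# One accountable owner per agent, per the governance rule.
AGENT_OWNERS = {"support-triage": "cx-team", "ops-summary": "ops-team"}

def build_spend_map(records, owners):
    """Roll up spend per agent and per category, tagging each agent's owner."""
    spend = defaultdict(lambda: {"total_eur": 0.0,
                                 "by_category": defaultdict(float)})
    for r in records:
        entry = spend[r["agent"]]
        entry["total_eur"] += r["eur"]
        entry["by_category"][r["category"]] += r["eur"]
    # Agents without an owner surface immediately as "UNASSIGNED".
    return {agent: {"owner": owners.get(agent, "UNASSIGNED"), **data}
            for agent, data in spend.items()}

spend_map = build_spend_map(COST_RECORDS, AGENT_OWNERS)
```

The point of the structure is that every euro lands under exactly one agent and one category, so unowned or category-less spend is visible rather than lost in a blended invoice.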
2) Define “measurable output” and ROI per agent
Many teams described success in vague terms: “faster,” “smarter,” “less manual work.” The audit introduced a lightweight measurement framework:
- Primary metric (e.g., deflected tickets, time-to-resolution, price updates issued, supplier responses processed)
- Quality metric (e.g., customer satisfaction for support, accuracy for pricing actions, error rate for ops)
- Economic value proxy (e.g., cost per ticket, cost per manual action, revenue lift assumptions)
Where exact causal measurement wasn’t possible, the audit used approximate value ranges and required teams to document the assumptions. The goal wasn’t perfect attribution—it was decision-grade clarity.
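A minimal sketch of the "approximate value range" idea: compute ROI as a low/high band rather than a point estimate. The ticket volume, per-ticket value, and cost figures below are illustrative assumptions, not audit numbers.

```python
def roi_range(monthly_cost_eur, value_low_eur, value_high_eur):
    """Return a (low, high) ROI band; decision-grade, not exact attribution."""
    return ((value_low_eur - monthly_cost_eur) / monthly_cost_eur,
            (value_high_eur - monthly_cost_eur) / monthly_cost_eur)

# Hypothetical support agent: 1,200 deflected tickets/month, each assumed
# worth 4-6 EUR in avoided handling cost (a documented assumption).
monthly_cost = 3000.0
low, high = roi_range(monthly_cost, 1200 * 4.0, 1200 * 6.0)
```

Even a wide band answers the leadership question: an agent whose entire range sits below zero is a candidate for re-scoping or retirement, regardless of measurement precision.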
3) Identify redundancy and overlapping scope
The audit then examined intent and function, not just technical implementation. It asked:
- Does another agent already produce the same outcome?
- Are two agents using different prompts to solve identical tickets?
- Are separate teams maintaining parallel knowledge bases?
This revealed two types of redundancy:
- Functional overlap: agents performing the same task with similar outputs
- Operational overlap: separate agents requiring separate retrieval pipelines, monitoring, and triage—even when the underlying work was identical
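One lightweight way to surface functional overlap, assuming each team can list its agent's declared intents, is a simple set-overlap score. The intent names below are hypothetical examples.

```python
def overlap_ratio(intents_a, intents_b):
    """Jaccard overlap between two agents' declared intent sets."""
    a, b = set(intents_a), set(intents_b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical intent inventories from two departments.
support_agent = ["route_ticket", "policy_qa", "refund_status"]
ops_agent = ["route_ticket", "policy_qa", "supplier_followup"]

score = overlap_ratio(support_agent, ops_agent)
flag_for_review = score >= 0.5  # threshold is a judgment call, not a standard
```

A score like this is only a screening signal; the audit's actual judgment about whether two agents solve the same workflow still requires reading the prompts and outputs.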
4) Pinpoint “cost hotspots” and waste patterns
The team reviewed usage telemetry and found several recurring patterns:
- Long prompts and oversized context windows used by default, even for simple tasks
- Always-on agents polling systems continuously instead of event-driven triggers
- Low-confidence outputs that still consumed expensive multi-step reasoning workflows
- No guardrails on when the agent should stop and escalate to a human
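The last two waste patterns, expensive reasoning on low-confidence outputs and missing escalation guardrails, can be addressed with a small gating function. This is a sketch under assumed thresholds; the confidence scale, step cap, and cutoff values are placeholders a team would tune.

```python
def route_task(confidence, steps_used, max_steps=3, min_confidence=0.7):
    """Stop-and-escalate guardrail: cap multi-step reasoning and hand
    low-confidence work to a human instead of burning more tokens."""
    if confidence >= min_confidence:
        return "respond"            # good enough; ship the answer
    if steps_used < max_steps:
        return "retry"              # allow one more bounded reasoning step
    return "escalate_to_human"      # step budget exhausted; stop spending
```

The key property is the hard ceiling: without `max_steps`, a low-confidence task can loop through costly reasoning indefinitely, which is exactly the pattern the telemetry review exposed.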
One agent in particular stood out: it consumed 34% of total AI spend yet produced no measurable output. It had been created to “optimize operations,” but it lacked defined KPIs, and its outputs were not connected to any workflow where actions were taken. In practice, it generated summaries and recommendations that were rarely read, never tracked, and not used to make decisions.
5) Implement controls without blocking experimentation
Rather than shutting down AI innovation, the audit introduced a set of FinOps-style controls:
- Budget thresholds and alerts by agent (weekly and monthly)
- Standardized KPI templates required for production agents
- Tiered model routing (cheap model first; escalate only when needed)
- Event-driven execution replacing always-on polling where possible
- Sunset criteria for agents without adoption or measurable impact
Crucially, the company created a simple governance rule: No agent remains in production without a defined owner, metric, and review cadence.
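The tiered-routing control above can be sketched as a cheapest-first loop that escalates only on a failed quality check. The tier names, prices, and stub functions below are hypothetical; in practice `run_model` would call the provider API and `quality_check` would be a task-specific validator.

```python
# Hypothetical model tiers, cheapest first (name, price per 1K tokens in EUR).
TIERS = [("small", 0.0005), ("medium", 0.003), ("large", 0.015)]

def answer_with_tiered_routing(task, quality_check, run_model):
    """Try cheap models first; escalate a tier only when the check fails."""
    for model, _price in TIERS:
        draft = run_model(model, task)
        if quality_check(draft):
            return model, draft
    return "large", draft  # top tier still failed; flag for human review

# Stub usage: pretend only the medium tier produces an acceptable draft.
def fake_run(model, task):
    return f"{model}:{task}"

def ok_on_medium(draft):
    return draft.startswith("medium")

model_used, answer = answer_with_tiered_routing("refund?", ok_on_medium, fake_run)
```

The design choice matters for spend: most requests never touch the expensive tier, so average cost per request tracks the cheap model's price rather than the large model's.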
Results
Within the audit window, the travel platform identified €180K in recoverable annual waste from its AI agent program. The waste fell into three buckets:
1) Three agents with negative ROI
These agents incurred ongoing spend but failed to generate sufficient value when measured against operational cost proxies. Common reasons included:
- High escalation rates leading to limited deflection
- Inconsistent accuracy causing rework
- Outputs not integrated into systems where actions were taken
The remediation plan varied by agent:
- One was retired, and its small set of useful prompts was absorbed into a broader support agent.
- One was re-scoped to a narrower domain with tighter retrieval and clear guardrails.
- One was downgraded to a lower-cost model with stricter usage triggers.
2) Two agents with redundant function
Two agents were effectively solving the same workflow in different departments—one in customer support operations and one in general operations. They used different tooling and prompts but produced overlapping outputs (internal ticket routing and policy Q&A).
The platform consolidated them into a single agent with:
- A shared knowledge base
- A common set of intents and routing logic
- Department-specific reporting views
This reduced duplicated infrastructure and simplified monitoring and maintenance.
3) One agent consuming 34% of spend with no measurable output
This was the most material finding. The agent’s cost profile was driven by:
- Frequent execution cycles
- Complex multi-step prompting
- Large retrieval contexts
- No clear gating criteria for when it should run
Because the agent’s output wasn’t tied to any workflow KPI, its “impact” was essentially invisible. The audit team paused it, then worked with operations leadership to decide whether it should be rebuilt with measurable objectives or permanently retired.
The resolution: the agent was taken out of production, and a smaller, event-triggered reporting function was created that generated only the specific summaries teams actually used—tracked via internal views and downstream actions.
Key takeaways
- AI spend isn’t just model tokens. Agent costs hide in orchestration, retrieval, monitoring, and human review queues. A true AI FinOps view must include the full stack.
- Unmeasured agents become cost sinks. If an agent doesn’t have a defined output metric and an action pathway, it can consume budget indefinitely without anyone noticing.
- Redundancy happens naturally in multi-team environments. When departments build independently, overlap is inevitable. Consolidation reduces not just spend but operational complexity.
- Negative ROI agents are often “almost good” agents. The fix is frequently tighter scope, better gating, or cheaper model routing—not necessarily a full shutdown.
- Governance doesn’t need to be heavy. A few simple rules—owner, budget, KPI, review cadence—can prevent runaway costs while preserving experimentation.
- The biggest wins come from tying AI to workflows. Agents that don’t trigger actions, update systems, or reduce time-to-resolution are at high risk of being “AI theater.”
By treating AI agents as financial assets—measured, reviewed, and optimized—the travel platform was able to recover €180K annually in waste while improving clarity on where AI genuinely moved the business forward. The audit didn’t slow down innovation; it gave it structure, making future deployments easier to justify, scale, and trust.