Customer Story

Air France / KLM hit 96.2% accuracy, beating its 60% target, by treating policy documents as a Visual Intelligence problem.

Tables, diagrams, decision trees, and software screenshots had broken every general-purpose AI system the airline tested. Built on Valantor's Visual Intelligence platform and powered by GroundX, the customer service assistant clears human-level accuracy on questions that take new agents nine months of training to answer.

Customer: Air France / KLM
Industry: Travel
Outcome: Customer Service Assistant
Result: 96.2% accuracy (60% target)

"The Valantor platform delivered truly impressive results, and the bot's ability to improve over such a short period was amazing," said Karin Oskam, Knowledge Management Manager at Air France / KLM.

Background

Airline customer service is hard. Agents need to understand thousands of complicated policies across hundreds of locations. Basic training takes nine months or more, and frustrated travelers endure long wait times. Air France / KLM identified this as a prime use case for generative AI, but only if accuracy could reach human levels or better.

The goal: an AI assistant trained on thousands of policy documents that could help customer service agents answer traveler questions in seconds.

The challenge

The airline's knowledge base is filled with documents that break general-purpose AI. The documents are visually complex: tables, diagrams, decision trees, and software screenshots layered with the kind of conditional logic that defines real-world policy. The information is as complicated as the formats it's locked in: a question as seemingly simple as "How much will I pay for luggage?" involves nearly a dozen rules spanning class, fare, location, and bag weight.

Both problems, the visual complexity and the policy logic, cause hallucinations in gen-AI systems. The airline set a modest target of 60% accuracy for its first proof of concept. This was the data comprehension gap in microcosm: enterprise-critical information locked in visual formats that no LLM could reliably read, reason over, or act on.

Methodology

Valantor ran a three-month engagement using its enterprise virtual assistant. PDF documents covering KLM and Air France services were ingested into GroundX, Valantor's foundational intelligence layer, which converted tables, diagrams, decision trees, and screenshots into trusted, structured, model-ready intelligence. Document Plug-ins from GroundX Studio were tuned to the airline's specific document archetypes: fare tables, baggage rules, refund decision trees, and annotated software screenshots.

Evaluation

Air France / KLM experts evaluated the assistant's performance. They created questions, reviewed responses, and scored accuracy based on how closely each answer matched the source documents. Human-in-the-loop validation wasn't an afterthought; it was the mechanism that drove confidence in every output, exactly the kind of accountability that regulated, customer-facing verticals require.
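To make the scoring step concrete, here is a minimal sketch of how expert reviews could be rolled up into a single accuracy figure. The story does not describe the actual scoring rubric, so everything here is an assumption for illustration: the `Review` record, the 0-to-1 score scale, the simple average, and the sample scores are all hypothetical and are not the airline's data or Valantor's method.

```python
from dataclasses import dataclass


@dataclass
class Review:
    """One expert-reviewed answer (hypothetical record shape)."""
    question: str
    score: float  # assumed scale: 0.0 = no match with source, 1.0 = full match


def accuracy_pct(reviews: list[Review]) -> float:
    """Average the expert scores and express the result as a percentage."""
    if not reviews:
        raise ValueError("no reviews to score")
    for review in reviews:
        if not 0.0 <= review.score <= 1.0:
            raise ValueError(f"score out of range for: {review.question}")
    return 100.0 * sum(r.score for r in reviews) / len(reviews)


def meets_target(reviews: list[Review], target_pct: float = 60.0) -> bool:
    """Compare the averaged accuracy with a target such as the 60% goal."""
    return accuracy_pct(reviews) >= target_pct


# Illustrative scores only, not results from the engagement.
reviews = [
    Review("Placeholder question A", 1.0),
    Review("Placeholder question B", 0.5),
    Review("Placeholder question C", 1.0),
    Review("Placeholder question D", 1.0),
]
print(accuracy_pct(reviews), meets_target(reviews))
```

A graded score per answer, rather than a pass/fail flag, is one way to reflect "how closely" an answer matched its source; a binary rubric would work with the same averaging.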

Optimization

Guided by feedback from Air France / KLM test participants, the assistant's responses were customized to fit the style, formatting, and tone of the airline. Function Plug-ins composed extract, classify, and validate operations into workflows that matched how the customer service team actually communicates with travelers. Each pass through the optimization loop pushed scores higher.
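The idea of composing extract, classify, and validate operations into one workflow can be sketched in a few lines. This is a generic illustration, not the GroundX or Function Plug-in API: the `compose` helper, the three step functions, the record fields, and the keyword rule are all hypothetical stand-ins.

```python
from typing import Callable

# A step takes a record (plain dict) and returns an updated copy.
Step = Callable[[dict], dict]


def compose(*steps: Step) -> Step:
    """Chain steps so each one receives the previous step's output."""
    def run(record: dict) -> dict:
        for step in steps:
            record = step(record)
        return record
    return run


def extract(record: dict) -> dict:
    """Hypothetical extraction: use the first line of the text as a title."""
    lines = record["text"].splitlines()
    title = lines[0].strip() if lines else ""
    return {**record, "title": title}


def classify(record: dict) -> dict:
    """Hypothetical classification: a keyword rule on the extracted title."""
    label = "baggage" if "baggage" in record["title"].lower() else "other"
    return {**record, "label": label}


def validate(record: dict) -> dict:
    """Fail loudly if an earlier step left a required field empty."""
    missing = [key for key in ("title", "label") if not record.get(key)]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return {**record, "valid": True}


workflow = compose(extract, classify, validate)
result = workflow({"text": "Baggage rules\nPlaceholder body text."})
print(result["title"], result["label"], result["valid"])
```

Keeping each operation as a small function that returns a new record makes it easy to reorder steps or swap one out between optimization passes.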

The result

The Valantor-powered assistant hit 96.2% accuracy, far exceeding the 60% target. Air France / KLM's customer service team concluded that Valantor dramatically outperformed both their expectations and the other solutions they evaluated.

What started as a "modest" target for a regulated, document-heavy vertical became a benchmark for what's possible when Visual Intelligence, not general-purpose AI, meets enterprise-critical data.

96.2%: accuracy on visually complex policy documents
+36.2 pts: over the original 60% accuracy target
9 mo → sec: answers that take new agents nine months of training, delivered in seconds at the desk
3 mo: end-to-end engagement, ingest through optimization