Inside the Canotera Pipeline: From Case File to Forecast

Processing multi-gigabyte case files requires strict architectural boundaries. We separate the language models that read medicals from the mathematical models that calculate settlement ranges.

By Tal Knafo
CTO
20 December 2025

TL;DR: Generative AI should never predict claim outcomes. Restricting language models to extraction and using geometric machine learning for forecasting gives claims teams traceable data to set day-one reserves and detect escalation.

A new claim lands on an adjuster's desk as a massive PDF. Half of the file consists of boilerplate pleadings and procedural motions. A quarter is duplicated medical records from a hospital visit three years ago. Scattered somewhere in the remaining pages are the actual drivers of the claim's financial value. If you want to predict the trajectory of this litigation, you cannot rely on a human reading every page. You also cannot dump the entire file into a standard large language model and ask for a settlement number. Both approaches fail in production. Humans miss critical details buried on page two thousand, and language models invent facts when asked to perform complex mathematical reasoning over long contexts. We built the Canotera pipeline to solve this specific engineering problem. We separate the reading from the math.

Structuring the Unstructured

The ingestion layer is where most predictive systems choke. Case files arrive as massive, unstructured blobs of text, images, and scanned faxes. Our first task is parsing. We process these documents through a secure, isolated ingestion pipeline, standardizing the text before it reaches the models. This is where generative AI enters the architecture. We use large language models strictly as extraction engines. They never act as forecasting oracles. Their job is to read pleadings, medical reports, and correspondence, and structure that text into a rigid, predefined schema. They extract injury types, plaintiff demographics, jurisdiction details, and specific legal allegations.
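To make the idea of a rigid, predefined schema concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not our production schema: the field names, the `InjuryType` values, and the `parse_case_facts` helper are all hypothetical. The point it shows is that model output is parsed and rejected unless it matches the schema exactly.

```python
import json
from dataclasses import dataclass
from enum import Enum


class InjuryType(Enum):
    # Hypothetical categories, chosen only for illustration.
    SPINAL = "spinal"
    SOFT_TISSUE = "soft_tissue"
    FRACTURE = "fracture"


@dataclass(frozen=True)
class CaseFacts:
    # Hypothetical subset of the fields the article mentions.
    jurisdiction: str
    plaintiff_age: int
    injuries: tuple[InjuryType, ...]
    allegations: tuple[str, ...]


EXPECTED_FIELDS = {"jurisdiction", "plaintiff_age", "injuries", "allegations"}


def parse_case_facts(raw_json: str) -> CaseFacts:
    """Parse extraction output and reject anything that is off-schema."""
    data = json.loads(raw_json)
    if not isinstance(data, dict) or set(data) != EXPECTED_FIELDS:
        raise ValueError("output does not match the expected field set")
    age = data["plaintiff_age"]
    if isinstance(age, bool) or not isinstance(age, int):
        raise ValueError("plaintiff_age must be an integer")
    if not isinstance(data["injuries"], list) or not isinstance(data["allegations"], list):
        raise ValueError("injuries and allegations must be lists")
    return CaseFacts(
        jurisdiction=str(data["jurisdiction"]),
        plaintiff_age=age,
        # InjuryType(...) raises ValueError for any value outside the enum.
        injuries=tuple(InjuryType(v) for v in data["injuries"]),
        allegations=tuple(str(a) for a in data["allegations"]),
    )
```

A closed enum is a deliberate choice in this sketch: an injury label the schema does not know about fails loudly instead of flowing downstream as free text.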

We impose strict constraints on this phase to control output. Hallucinations are a known failure mode for generative AI. We mitigate this by requiring the extraction layer to bind every structured data point to its exact source location in the original document. When the system identifies a spinal surgery, it records the exact page and paragraph of the operative report. This provides traceability by design. An adjuster or defense attorney can click a driver and see the underlying text immediately. We do not predict anything here. We only build a high-fidelity, machine-readable representation of the case facts.
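The source-binding requirement can be sketched as a simple grounding check. This is a hedged illustration, not the actual implementation: `SourceRef`, `ExtractedFact`, and `is_grounded` are hypothetical names, and the sketch assumes the parsed document is available as a list of pages, each a list of paragraph strings. A fact is kept only if its quoted text really appears at the page and paragraph it cites.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceRef:
    page: int       # 1-based page number in the original file
    paragraph: int  # 1-based paragraph index on that page
    quote: str      # verbatim text the value was extracted from


@dataclass(frozen=True)
class ExtractedFact:
    name: str
    value: str
    source: SourceRef


def _normalize(text: str) -> str:
    # Collapse whitespace and case so OCR line breaks do not cause false misses.
    return " ".join(text.split()).lower()


def is_grounded(fact: ExtractedFact, pages: list[list[str]]) -> bool:
    """Return True only if the quoted text appears at the cited location."""
    ref = fact.source
    if not 1 <= ref.page <= len(pages):
        return False
    paragraphs = pages[ref.page - 1]
    if not 1 <= ref.paragraph <= len(paragraphs):
        return False
    quote = _normalize(ref.quote)
    return bool(quote) and quote in _normalize(paragraphs[ref.paragraph - 1])
```

In a setup like this, a fact that fails the check would be dropped or routed for review rather than passed on, which is one way to keep an invented value from reaching the forecast.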

Security dictates our infrastructure choices at this stage. Claims files contain protected health information and highly sensitive litigation strategy. We do not send this data to public API endpoints. The extraction models run in isolated, tenant-specific environments. One carrier's data never mixes with another carrier's during processing. We maintain strict access controls and encrypt data at rest and in transit. Once the text is structured and validated against our schema, the generative models hand off the structured payload to the prediction engine and spin down.

The Prediction Engine

Forecasting litigation outcomes requires an entirely different class of mathematics. We rely on geometric machine-learning models trained exclusively on large datasets of resolved cases with known outcomes. These models operate solely on the structured schema produced by the ingestion layer. They map the exact geometry of the current claim against historical precedents. We use architectures that handle tabular data exceptionally well and provide clear feature importance metrics.

A point estimate is useless in litigation. A model that predicts a case will settle for a single, specific dollar amount provides false precision and encourages bad reserving habits. Our prediction engine outputs calibrated settlement ranges. It calculates the probability distribution of potential outcomes based on the specific drivers extracted from the file. It evaluates how a specific venue historically impacts similar injury profiles. It calculates an escalation probability, flagging cases likely to breach current reserve limits or attract third-party litigation funding.
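The difference between a point estimate and a range is easy to show in code. The sketch below assumes the forecast is available as a list of sampled outcome amounts, which is an assumption made for illustration; it does not describe how the engine produces or calibrates its distribution. It also uses a deliberately narrow definition of escalation, the share of outcomes above the current reserve. The function names and percentile defaults are hypothetical.

```python
from statistics import quantiles


def settlement_range(samples: list[float], lower_pct: int = 10, upper_pct: int = 90):
    """Return (low, median, high) percentiles of the sampled outcomes."""
    if not 1 <= lower_pct < upper_pct <= 99:
        raise ValueError("percentiles must satisfy 1 <= lower < upper <= 99")
    # n=100 yields 99 cut points; cuts[k - 1] is the k-th percentile.
    cuts = quantiles(samples, n=100, method="inclusive")
    return cuts[lower_pct - 1], cuts[49], cuts[upper_pct - 1]


def escalation_probability(samples: list[float], reserve: float) -> float:
    """Share of sampled outcomes that exceed the current reserve."""
    if not samples:
        raise ValueError("samples must not be empty")
    return sum(s > reserve for s in samples) / len(samples)


# Hypothetical sampled outcomes, in dollars, for a single claim.
samples = [50_000, 80_000, 100_000, 120_000, 150_000, 200_000, 400_000, 900_000]
low, median, high = settlement_range(samples)
risk = escalation_probability(samples, reserve=250_000)
```

Reporting `low`, `median`, and `high` together with `risk` keeps the long right tail visible, which a single number would hide.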

This calculation requires mapping the current claim to comparable resolved cases. The engine identifies the historical cases that most closely share the mathematical profile of the active file and surfaces these comparables to the user. This turns a black-box prediction into a defensible baseline. When an adjuster needs to set a realistic reserve on day one, they have the distribution curve, the historical precedents, and the specific case facts driving the math. They can see the reserve delta versus their current working number. They allocate defense spend based on actual risk exposure. They negotiate from data rather than relying on gut instinct.
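One simple way to picture comparable-case retrieval is a nearest-neighbor search over feature vectors. To be clear about the swap: the sketch uses plain Euclidean distance as a stand-in for whatever similarity the engine actually applies, and it assumes features are already scaled to comparable units. `ResolvedCase`, `nearest_comparables`, and `reserve_delta` are hypothetical names.

```python
import heapq
import math
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class ResolvedCase:
    case_id: str
    features: tuple[float, ...]  # assumed pre-scaled numeric features
    settlement: float            # known outcome, in dollars


def nearest_comparables(active: tuple[float, ...], history: list[ResolvedCase], k: int = 3):
    """Return the k resolved cases closest to the active claim, nearest first."""
    return heapq.nsmallest(k, history, key=lambda c: math.dist(active, c.features))


def reserve_delta(comparables: list[ResolvedCase], working_reserve: float) -> float:
    """Median comparable settlement minus the adjuster's working reserve."""
    return median([c.settlement for c in comparables]) - working_reserve


# Hypothetical history and active claim.
history = [
    ResolvedCase("A", (0.0, 0.0), 100_000.0),
    ResolvedCase("B", (1.0, 1.0), 200_000.0),
    ResolvedCase("C", (5.0, 5.0), 900_000.0),
]
comparables = nearest_comparables((0.9, 1.1), history, k=2)
delta = reserve_delta(comparables, working_reserve=120_000.0)
```

A positive `delta` in this sketch means the comparables settled above the working reserve, which is the signal an adjuster would want surfaced on day one.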

Latency, Tradeoffs, and Delivery

Building this pipeline required explicit engineering tradeoffs. Processing thousands of pages through an extraction layer is computationally expensive. We optimized for accuracy and traceability over sub-second latency. A system designed to allocate millions in defense spend or spot nuclear verdicts early must be right, not instantaneous. Our pipeline processes complex, multi-gigabyte case files in minutes. This latency profile fits the reality of claims workflows. Adjusters do not need real-time streaming inference. They need a reliable, fully processed forecast ready when they open the file in the morning.

Delivery happens via our API. We designed the interface to integrate directly into existing claims management systems. The API accepts document payloads and returns the structured schema, the settlement range, the escalation probability, and the traceability links. We expose the exact drivers behind the math. If a specific jurisdiction or a combination of injuries pushes the settlement range higher, the API payload explicitly flags those variables. We provide endpoints for monitoring model drift, allowing engineering teams to verify that our predictions remain calibrated as new data enters the system. Onboarding new carriers involves mapping their historical claims data to our schema to ensure the prediction models understand their specific operational baselines. We map the data, configure the secure tenant environments, and point the API at their document stores.
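For teams wondering what a calibration check on their side might look like, here is a minimal sketch. It does not describe what the drift endpoints compute; it assumes you have logged predicted ranges alongside the amounts the cases later resolved for, and that the ranges were meant to cover a nominal 80% of outcomes. The function names, the 80% figure, and the tolerance are all assumptions.

```python
def interval_coverage(ranges: list[tuple[float, float]], outcomes: list[float]) -> float:
    """Fraction of realized outcomes that fell inside their predicted range."""
    if not ranges or len(ranges) != len(outcomes):
        raise ValueError("ranges and outcomes must be non-empty and the same length")
    hits = sum(low <= actual <= high for (low, high), actual in zip(ranges, outcomes))
    return hits / len(outcomes)


def is_drifting(coverage: float, nominal: float = 0.80, tolerance: float = 0.05) -> bool:
    """Flag when observed coverage strays from the nominal level by more than tolerance."""
    return abs(coverage - nominal) > tolerance


# Hypothetical logged predictions and realized settlements, in dollars.
ranges = [(100_000.0, 200_000.0)] * 5
outcomes = [150_000.0, 150_000.0, 150_000.0, 150_000.0, 500_000.0]
coverage = interval_coverage(ranges, outcomes)
alert = is_drifting(coverage)
```

Coverage well below the nominal level suggests the ranges have become too narrow for recent cases; coverage well above it suggests they are wider than they need to be.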

This architecture reflects the reality of modern litigation. Social inflation and reserve volatility are destroying traditional actuarial models. Those older models rely on structured data entered manually by overburdened adjusters, which often lags months behind the actual ground truth of the case. By automating the extraction of facts from the raw documents and applying rigorous, calibrated math to those facts, we close the gap between the document payload and the final settlement check.

You cannot out-negotiate an opponent if you are still trying to read the file.
