Video-First Agentic AI for Insurance Subrogation at Scale

Overview

Scaled Tesla Insurance subrogation by introducing a video-first AI co-pilot, increasing recovered revenue by 2.5× and improving margins by ~35%.

Subrogation is the resolution of accident liability between insurance providers. Analyst teams process large volumes of multimodal material—videos, images, documents, and internal artifacts—across the phases of Assessment, Preparation, and Resolution. The core bottleneck was not analyst effort, but analyst judgment: deciding which cases were worth pursuing and why.

This system shifts that judgment earlier in the workflow. It analyzes incident video, maps observations to traffic law, and produces evidence-backed liability hypotheses that analysts can validate and act on. Assessment is automated end-to-end; Preparation efficiency improves by roughly 50%.

In simple terms, the system identifies which cases are likely to recover revenue and explains why, with traceable evidence.

Situation & Stakes

Subrogation analysts perform complex legal and factual reasoning over large, multimodal data sets
Analysts are expensive to hire, train, and retain
Revenue scale was capped by analyst judgment capacity more than by analyst throughput
Tens of thousands of potential cases existed, but only a fraction could be evaluated
Incremental AI (text parsing, document normalization) delivered marginal gains
The vendor relationship was approaching a revenue plateau
Competitive pressure was increasing as insurers explored AI-driven subrogation

Failure carried real risk:

For the vendor: loss of account credibility and likely replacement
For the client: missed growth opportunity and strategic disadvantage

This was a growth-critical initiative, not an experimental one.

Observations & Decisions

Optimize for revenue recovery, not analyst efficiency

The primary failure mode was missed recoveries, not slow processing. The system was explicitly optimized to maximize revenue potential, even at the cost of false positives. This trade-off accepted wasted analyst effort to avoid missing revenue—false negatives bury recoverable revenue; false positives only cost time.
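
One way to read this trade-off is as an expected-value rule: surface a case whenever the estimated recovery outweighs the analyst time spent checking it. The sketch below is illustrative only. The class, field names, and dollar figures are hypothetical and do not come from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseEstimate:
    """Hypothetical per-case estimate; all fields are illustrative assumptions."""
    case_id: str
    recovery_probability: float  # model estimate in [0, 1]
    recoverable_amount: float    # amount recovered if the claim succeeds
    review_cost: float           # analyst cost to validate the proposal


def should_surface(case: CaseEstimate) -> bool:
    """Surface the case when expected recovery exceeds the review cost."""
    return case.recovery_probability * case.recoverable_amount > case.review_cost


def break_even_probability(case: CaseEstimate) -> float:
    """Lowest recovery probability at which surfacing still pays off."""
    return case.review_cost / case.recoverable_amount


cases = [
    CaseEstimate("A", 0.10, 8000.0, 200.0),  # expected recovery 800 > cost 200
    CaseEstimate("B", 0.02, 5000.0, 200.0),  # expected recovery 100 < cost 200
]
surfaced = [c.case_id for c in cases if should_surface(c)]
```

Under these made-up numbers the break-even probability for case A is only 2.5%, which shows why a low-confidence case can still be worth an analyst's time when a false positive costs far less than a missed recovery.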

Use a co-pilot model to generate learning signals

Fully automated judgment would be legally and operationally risky. The system retained human validation at the cost of slower automation. That trade-off enabled continual learning rather than static automation: analyst feedback generated structured training data that improved the system over time.
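
As a rough illustration of what "structured training data" from analyst feedback could look like, the sketch below turns one review decision into a labeled example. The schema, the `Decision` values, and the `to_training_example` helper are assumptions made for this example, not the project's actual data model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(str, Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    MODIFY = "modify"


@dataclass
class AnalystFeedback:
    """Hypothetical record of one analyst review."""
    case_id: str
    proposed_liability: str
    decision: Decision
    corrected_liability: Optional[str] = None  # set only when decision is MODIFY
    rationale: str = ""


def to_training_example(fb: AnalystFeedback) -> dict:
    """Convert one review decision into a labeled example."""
    if fb.decision is Decision.MODIFY and fb.corrected_liability is None:
        raise ValueError("a modified proposal needs the analyst's correction")
    # For MODIFY the analyst's correction is the target; otherwise keep the proposal.
    if fb.decision is Decision.MODIFY:
        target = fb.corrected_liability
    else:
        target = fb.proposed_liability
    return {
        "case_id": fb.case_id,
        "proposal": fb.proposed_liability,
        "target": target,
        "label": 0 if fb.decision is Decision.REJECT else 1,  # 0 marks a rejected proposal
        "rationale": fb.rationale,
    }


rejected = AnalystFeedback(
    "c1", "other driver at fault", Decision.REJECT, rationale="signal was green"
)
example = to_training_example(rejected)
```

Keeping the rationale next to the label is a deliberate choice in this sketch: a rejection with a stated reason carries more signal than a bare negative.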

Go video-first (and video-only) for the MVP

This was the pivotal decision. Text-based AI systems already existed and provided marginal gains. The real signal—what actually happened during an accident—lived in Tesla’s multi-camera video feeds. Video could be decomposed into representative frames and reasoned over using multimodal models. This approach accepted technical risk to unlock a zero-to-one capability: video-first reasoning enabled case triage and liability assessment that text alone could not approximate. I derisked this decision by building prototypes that demonstrated end-to-end reasoning: observation → applicable law → liability conclusion, across multiple real incidents.
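
The frame decomposition step can be sketched as change-based sampling: keep a frame only when it differs enough from the last frame kept. This is one common approach, not necessarily the one used in the project. The sketch assumes each frame has already been reduced to a small numeric feature vector by some upstream decoder, and the threshold value is arbitrary.

```python
from typing import List, Sequence


def mean_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_representative_frames(
    frames: List[Sequence[float]], threshold: float
) -> List[int]:
    """Return indices of frames that differ from the last kept frame by more than threshold."""
    if not frames:
        return []
    kept = [0]  # always keep the first frame as the initial reference
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i], frames[kept[-1]]) > threshold:
            kept.append(i)
    return kept


# Toy feature vectors standing in for decoded video frames (illustrative values).
frames = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.1], [9.0, 9.0]]
kept = select_representative_frames(frames, threshold=1.0)  # -> [0, 2, 4]
```

Comparing against the last kept frame rather than the immediately preceding one avoids missing slow, cumulative scene changes.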

Enforce evidence-backed reasoning as a hard constraint

Unsubstantiated AI judgment is dangerous in legal workflows. The system was explicitly barred from presenting hypotheses without evidence: every liability claim had to be traceable to observable facts and applicable law. This constraint preserved analyst trust, legal defensibility, and brand safety.
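
A hard constraint of this kind can be enforced as a gate that refuses to pass along any hypothesis lacking evidence or a legal reference. The sketch below is a minimal illustration. The types, the error class, and the placeholder statute identifier are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Evidence:
    frame_index: int   # where the observation appears in the video
    observation: str   # what was observed, in plain language


@dataclass
class LiabilityHypothesis:
    claim: str
    evidence: List[Evidence] = field(default_factory=list)
    legal_references: List[str] = field(default_factory=list)


class UnsupportedHypothesisError(ValueError):
    """Raised when a hypothesis lacks observable evidence or a legal basis."""


def require_traceable(h: LiabilityHypothesis) -> LiabilityHypothesis:
    """Return the hypothesis unchanged, or raise if it is not traceable."""
    if not h.evidence:
        raise UnsupportedHypothesisError(f"no observed evidence for: {h.claim}")
    if not h.legal_references:
        raise UnsupportedHypothesisError(f"no legal reference for: {h.claim}")
    return h


supported = LiabilityHypothesis(
    claim="other vehicle failed to yield",
    evidence=[Evidence(frame_index=42, observation="vehicle enters intersection on red")],
    legal_references=["EXAMPLE-STATUTE-1"],  # placeholder, not a real citation
)
checked = require_traceable(supported)
```

Raising an error rather than returning a flag means that, in this sketch, an unsupported hypothesis cannot reach an analyst by accident.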

System Design Overview

Product & solution design

The system was designed as a review-gated judgment amplification system for revenue recovery, operating under legal and evidentiary constraints. The core design invariant was that the system may propose liability hypotheses but may not assert conclusions without traceable evidence. Rather than optimizing analyst throughput directly, the product optimizes case selection quality by aggressively surfacing revenue-positive opportunities, accepting false positives while minimizing false negatives. Human analysts retain full ownership of final determinations and submissions, while the system is explicitly constrained from generating final legal artifacts or compliance assertions. Learning is driven by analyst accept/reject decisions and correction traces, not by autonomous feedback loops.

System flow

Incident data (video, images, sensor data, documents) is ingested at case creation
Video is decomposed into representative frames and incident segments
Perception outputs are translated into structured incident narratives
Applicable traffic law and prior case precedents are retrieved and contextualized
The system generates a liability proposal package: hypothesis, supporting evidence, counter-arguments, and confidence qualifiers
Analyst reviews, validates, rejects, or modifies the proposal
Accepted cases proceed to preparation; rejected cases are logged with rationale
Analyst decisions feed back into training and guardrail refinement pipelines
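
The last three steps of the flow above (review, routing, and feedback capture) can be sketched as a small router. All names here are hypothetical, and the upstream stages (ingestion, frame extraction, retrieval, proposal generation) are assumed to have already produced the `Proposal`.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Proposal:
    """Liability proposal package as described in the flow (illustrative fields)."""
    case_id: str
    hypothesis: str
    evidence: List[str]
    counter_arguments: List[str]
    confidence: str


@dataclass
class ReviewOutcome:
    accepted: bool
    rationale: str = ""


@dataclass
class ReviewRouter:
    preparation_queue: List[str] = field(default_factory=list)
    rejection_log: List[Dict[str, str]] = field(default_factory=list)
    feedback: List[Dict[str, object]] = field(default_factory=list)

    def route(self, proposal: Proposal, outcome: ReviewOutcome) -> None:
        """Send accepted cases to preparation, log rejections, and record feedback."""
        if outcome.accepted:
            self.preparation_queue.append(proposal.case_id)
        else:
            self.rejection_log.append(
                {"case_id": proposal.case_id, "rationale": outcome.rationale}
            )
        # Every decision, accepted or not, is kept as a learning signal.
        self.feedback.append({"case_id": proposal.case_id, "accepted": outcome.accepted})


router = ReviewRouter()
p1 = Proposal("c1", "other driver at fault", ["frame 42"], ["glare possible"], "medium")
p2 = Proposal("c2", "shared fault", ["frame 7"], [], "low")
router.route(p1, ReviewOutcome(accepted=True))
router.route(p2, ReviewOutcome(accepted=False, rationale="insufficient view of signal"))
```

Note that feedback is recorded on both branches, so rejected cases contribute to learning just as accepted ones do.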

Key components

Evidence-first reasoning engine: liability proposals are forbidden without explicit links to observed events and legal references.
Adversarial judgment modeling: internal “prosecution vs defense” reasoning cycles surface counter-arguments before analyst review.
Learning from rejection: false positives and rejected cases are retained as high-value negative signals, preventing optimism bias.
Domain adaptation pipeline: continuous DAPT improves interpretation of traffic law, incident semantics, and insurer-specific language.
Governance boundary: final legal posture and external communication remain human-owned, preserving accountability and regulatory safety.
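
The "prosecution vs defense" cycle in the components above could take the shape of alternating argument rounds, with the defense turns collected as counter-arguments for the analyst. This is a sketch under assumptions: the two callables are stand-ins for whatever model calls the real system makes, and the round count is arbitrary.

```python
from typing import Callable, List, Tuple

# An arguer takes the hypothesis and the prior arguments, and returns a new argument.
Arguer = Callable[[str, List[str]], str]


def adversarial_rounds(
    hypothesis: str, prosecute: Arguer, defend: Arguer, rounds: int = 2
) -> List[Tuple[str, str]]:
    """Alternate prosecution and defense turns; each side sees all prior arguments."""
    transcript: List[Tuple[str, str]] = []
    for _ in range(rounds):
        history = [text for _, text in transcript]
        transcript.append(("prosecution", prosecute(hypothesis, history)))
        history = [text for _, text in transcript]
        transcript.append(("defense", defend(hypothesis, history)))
    return transcript


def counter_arguments(transcript: List[Tuple[str, str]]) -> List[str]:
    """Collect the defense turns to show alongside the proposal."""
    return [text for role, text in transcript if role == "defense"]


# Stub arguers for illustration; real ones would wrap model calls (assumption).
def stub_prosecute(hypothesis: str, history: List[str]) -> str:
    return f"support #{len(history) // 2 + 1} for: {hypothesis}"


def stub_defend(hypothesis: str, history: List[str]) -> str:
    return f"objection to: {history[-1]}"


transcript = adversarial_rounds("other driver at fault", stub_prosecute, stub_defend)
objections = counter_arguments(transcript)
```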

Impact & Outcomes

Direct impact

2.5× increase in recovered revenue through aggressive, evidence-backed case triage
~35% margin improvement as analysts shifted from full analysis to validation

Second-order effects

Near real-time assessment enabled “instant payout” business models
Analyst confidence in AI increased materially
Vendor and client momentum accelerated, leading to follow-on AI initiatives

Reflection

This project clarified the co-pilot paradigm as a durable system design pattern: keeping humans in the loop not only preserves legal and compliance accountability, but creates the conditions for continuous learning as users move up the cognitive stack and the system absorbs their corrections. Subro also demonstrated that, for many judgment-heavy scenarios, converting video into structured visual evidence and then into text is not a compromise but an advantage—language becomes the medium for reasoning, explanation, and knowledge capture, even if it is slower than end-to-end reactive systems. Together, these insights point to a broader principle: systems that prioritize interpretability, evidence discipline, and human-guided learning can compound value over time, whereas automation-first approaches tend to stall once they exhaust their initial data or trust envelope.

Role & Scope

Role: Vendor-side CTO, Solution and Product Lead
Responsibility: Executive owner for solution design, prototyping, and end-to-end delivery; led pre-sales strategy, architecture, and customer validation
Authority: Full control over technical approach and delivery sequencing; influenced customer-side adoption and risk posture without formal authority
Team: Small senior team (2 forward-deployed engineers, 1 quality-focused engineer), scaled post-MVP
Primary interfaces: Client executive leadership, subrogation analysts, internal engineering and delivery teams