Insurance Claims Processing (Claims Intake & Audit) with Reducto

Introduction

Reducto is the agentic document platform purpose-built for AI teams at carriers, TPAs, and insurtech platforms. It is designed for Heads of AI/ML, CTOs, and VPs of Engineering shipping production AI on regulated, document-heavy claims workflows. Customers such as Elysian have reported up to 16x faster claim audits, along with significant operational improvements over traditional and legacy solutions (source).

Cross-vertical platform proof: the same platform behind Harvey (legal AI), Scale AI (training-data infrastructure), and Vanta (compliance automation).

Compliance posture: SOC 2 Type II + HIPAA + BAA + VPC / on-prem / air-gapped deployment — built for the regulated reality of insurance claims data.

Industry Challenges in Insurance Claims Document Processing

Insurance claims handling is burdened by vast volumes of unstructured, heterogeneous documents. Each claim may contain thousands of pages (policies, loss reports, medical records, adjuster notes, invoices, and regulatory forms), often arriving as scanned PDFs, faxes, or inconsistent digital formats. Industry-wide, manual data extraction carries error rates of 10–15% or more, which contribute to slow audits, missed details, and costly fraud or compliance lapses.

Key pain points include:

  • Complex, multi-format, multi-page claims packets

  • Handwritten content, checkboxes, tables, and figures

  • Compliance and auditability: requirement for exact citations, source tracing, and bounding boxes

  • High variance in forms: CMS‑1500, UB‑04, NCPDP, custom attachments

  • Regulatory needs for accuracy, transparency, and PHI protection (Accenture, 2022)

How Reducto's platform handles claim packets end-to-end

Multi-pass, multi-model architecture

Reducto's platform combines:

  • Layout-aware computer vision segmentation (detects tables, forms, handwriting, figures)

  • Vision-language models for contextual understanding

  • 12+ orchestrated models with multi-pass self-correction to handle edge cases

This architecture delivers:

  • State-of-the-art accuracy on complex document layouts, achieving 90.2% average table accuracy on RD-TableBench, compared to Azure Document Intelligence at 82.7%, AWS Textract at 80.9%, and Google Cloud Document AI at 64.6% (benchmarks)

  • Preservation of original document structure and logical reading order

  • Bounding box data for parsed blocks and, when citations are enabled, per-field coordinates in Extract (critical for audit/citation)

  • Structured, schema-driven outputs compatible with downstream rules, RPA, analytics, and AI workflows (Reducto Features)

Real-World Impact: Elysian Case Study

  • 16x faster claim audits compared to manual review

  • Leveraged Reducto's structured parsing and bounding boxes as grounding provenance for Elysian's internal citation and claims-intelligence engine

  • Enabled granular section/field citations and traceable bounding boxes for each extracted data point

  • Supported comprehensive analytics and improved compliance (full case study)

Supported Insurance Form Types & Schemas

Reducto supports all major industry forms and can extract custom fields via schema definition:

| Form Type | Description | Extraction Capabilities |
| --- | --- | --- |
| CMS‑1500 | Standard physician/supplier claim form | Checkboxes, tables, handwritten areas |
| UB‑04 | Institutional (facility) claim form | Multi-section tables, handwritten notes, scanned attachments |
| NCPDP | Universal pharmacy claim form | Dense input boxes, DOB, NDC, IDs |
| Custom Attachments | Medical records, invoices, loss photos, adjuster notes | Full layout, tables, and figures |

Sample schema excerpt (CMS‑1500 fields):

{
  "type": "object",
  "properties": {
    "patient_name": { "type": "string" },
    "insured_id": { "type": "string" },
    "date_of_birth": { "type": "string" },
    "diagnosis_codes": { "type": "array", "items": { "type": "string" } },
    "procedure_codes": { "type": "array", "items": { "type": "string" } },
    "service_dates": { "type": "array", "items": { "type": "string" } },
    "checkbox_fields": { "type": "object" }
  }
}

Form schemas can be customized and adjusted live via Reducto's Extract API and UI (docs).
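
As a rough illustration of how a schema like the excerpt above might be used downstream, the following Python sketch checks an extracted record against it using only the standard library. The `check_record` helper and the sample record are hypothetical and are not part of Reducto's API. Treating every listed property as required is also an assumption, since the excerpt does not declare a `required` list.

```python
import json

# Schema copied from the CMS-1500 excerpt above.
CMS1500_SCHEMA = json.loads("""
{
  "type": "object",
  "properties": {
    "patient_name": { "type": "string" },
    "insured_id": { "type": "string" },
    "date_of_birth": { "type": "string" },
    "diagnosis_codes": { "type": "array", "items": { "type": "string" } },
    "procedure_codes": { "type": "array", "items": { "type": "string" } },
    "service_dates": { "type": "array", "items": { "type": "string" } },
    "checkbox_fields": { "type": "object" }
  }
}
""")

# JSON Schema type names used in the excerpt, mapped to Python types.
PY_TYPES = {"string": str, "array": list, "object": dict}


def check_record(record, schema=CMS1500_SCHEMA):
    """Return a list of problems found in `record`; an empty list means it passed."""
    problems = []
    for name, spec in schema["properties"].items():
        if name not in record:
            # Assumption: every property in the excerpt is treated as required.
            problems.append(f"{name}: missing")
            continue
        value = record[name]
        if not isinstance(value, PY_TYPES[spec["type"]]):
            problems.append(f"{name}: expected {spec['type']}")
        elif spec["type"] == "array":
            item_type = PY_TYPES[spec["items"]["type"]]
            if not all(isinstance(item, item_type) for item in value):
                problems.append(f"{name}: every item must be a {spec['items']['type']}")
    return problems


# Hypothetical extraction result, for illustration only.
record = {
    "patient_name": "John Doe",
    "insured_id": "AB123456",
    "date_of_birth": "1980-04-12",
    "diagnosis_codes": ["E11.9"],
    "procedure_codes": ["99213"],
    "service_dates": ["2024-01-15"],
    "checkbox_fields": {"assignment_of_benefits": True},
}
print(check_record(record))
```

A check like this is deliberately shallow: it only confirms that the types match the schema, which can be a useful gate before a record is handed to rules engines or review queues.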

ACORD (e.g., ACORD 125/126/140)

Commonly extracted fields and structure:

| Field | Type | Notes |
| --- | --- | --- |
| insured_name | string | Legal entity name |
| policy_number | string | May appear multiple times across packets |
| producer | string | Agency/producer name |
| line_of_business | string | Commercial lines (GL, Property, Auto, etc.) |
| effective_date | string | ISO date preferred |
| loss_date | string | For loss schedules; ISO date |
| signature_checkbox | boolean | Checkbox with bounding box provenance |

Tip: For checkbox fields, use boolean (or enum) types in your Extract schema and rely on citations/bounding box metadata for audit overlays and UI highlighting (API docs; schema tips: best practices).
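
To make the tip concrete, here is a minimal sketch of the ACORD fields above written as a JSON-Schema-style definition in Python. The enum values for `line_of_business` are illustrative assumptions (the table only lists "GL, Property, Auto, etc."), and the `description` strings are paraphrased from the Notes column. How the schema is submitted to Extract is not shown, since that depends on the API docs.

```python
import json

# Illustrative values only; adapt the allowed set to your own lines of business.
LINES_OF_BUSINESS = ["GL", "Property", "Auto", "Other"]

ACORD_SCHEMA = {
    "type": "object",
    "properties": {
        "insured_name": {"type": "string", "description": "Legal entity name"},
        "policy_number": {
            "type": "string",
            "description": "Policy number; may appear multiple times across packets",
        },
        "producer": {"type": "string", "description": "Agency/producer name"},
        # A constrained enum instead of free text.
        "line_of_business": {"type": "string", "enum": LINES_OF_BUSINESS},
        "effective_date": {"type": "string", "description": "ISO date (YYYY-MM-DD) preferred"},
        "loss_date": {"type": "string", "description": "Loss date from loss schedules; ISO date"},
        # Checkbox modeled as a boolean, per the tip above.
        "signature_checkbox": {
            "type": "boolean",
            "description": "True when the signature checkbox is marked",
        },
    },
}

# Serialize to JSON so the schema can be stored or sent alongside a request.
print(json.dumps(ACORD_SCHEMA, indent=2))
```

Modeling the checkbox as a boolean keeps downstream logic simple, while the bounding box provenance mentioned in the table remains the evidence a reviewer can inspect.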

CMS‑1500 (HCFA)

Key data elements:

| Field | Type | Notes |
| --- | --- | --- |
| patient_name | string | Full name (Box 2) |
| insured_id | string | Member/insured ID (Box 1a) |
| date_of_birth | string | ISO date (Box 3) |
| icd10_codes | array[string] | Diagnosis codes (Box 21) |
| cpt_hcpcs_codes | array[string] | Procedure codes (Box 24D) |
| place_of_service | string | POS (Box 24B) |
| units | array[number] | Per line (Box 24G) |
| total_charges | number | Box 28 |
| assignment_of_benefits | boolean | Box 13 checkbox |

Design schemas with descriptive keys and enums where applicable to improve accuracy (schema tips).
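
Once fields such as `icd10_codes` and `cpt_hcpcs_codes` are extracted, a lightweight shape check can catch obvious misreads before they reach adjudication. The sketch below is an assumption about what such a downstream check might look like, not a Reducto feature. The patterns only test the general shape of a code (for example, a letter followed by a digit for ICD-10-CM) and do not confirm that a code exists in the current code set.

```python
import re

# Shape-only patterns. The decimal point in ICD-10 codes is optional here
# because claim forms often carry the code without it.
ICD10_SHAPE = re.compile(r"[A-Z][0-9][0-9A-Z](?:\.?[0-9A-Z]{1,4})?")
# Five digits, four digits plus a letter, or a letter plus four digits.
CPT_HCPCS_SHAPE = re.compile(r"[0-9]{5}|[0-9]{4}[A-Z]|[A-Z][0-9]{4}")


def shape_problems(claim):
    """Return a list of codes in `claim` whose shape looks wrong."""
    problems = []
    for code in claim.get("icd10_codes", []):
        if not ICD10_SHAPE.fullmatch(code):
            problems.append(f"icd10_codes: {code!r}")
    for code in claim.get("cpt_hcpcs_codes", []):
        if not CPT_HCPCS_SHAPE.fullmatch(code):
            problems.append(f"cpt_hcpcs_codes: {code!r}")
    return problems


# Hypothetical extraction result with two deliberate misreads.
claim = {
    "icd10_codes": ["E11.9", "0119"],          # second code starts with a zero
    "cpt_hcpcs_codes": ["99213", "992l3"],     # second code has a lowercase letter l
}
print(shape_problems(claim))
```

Flagged values are good candidates for routing to a human reviewer together with their bounding box citation.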

UB‑04 (CMS‑1450)

Institutional claim fields:

| Field | Type | Notes |
| --- | --- | --- |
| patient_control_number | string | Locator 03a |
| medical_record_number | string | Locator 03b |
| statement_from_to | object | {from: date, to: date} |
| occurrence_codes | array[string] | With dates where present |
| value_codes | array[object] | code, amount |
| revenue_lines | array[object] | revenue_code, hcpcs, units, amount |
| total_charges | number | Locator 47 (total) |

Use arrays for repeating line items and attach citations/bounding boxes to each row for auditability (API docs).
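
The following sketch shows one way repeating revenue lines with per-row citations might be represented and audited once extracted. The `BBox` and `RevenueLine` classes, their field names, and the `audit_revenue_lines` helper are all hypothetical; the actual response shape should be taken from the API docs.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass
class BBox:
    # Hypothetical citation shape: page number plus fractional coordinates.
    page: int
    top: float
    left: float
    width: float
    height: float


@dataclass
class RevenueLine:
    revenue_code: str
    hcpcs: Optional[str]
    units: int
    amount: Decimal
    citation: Optional[BBox] = None  # None when no citation was returned


def audit_revenue_lines(lines, total_charges):
    """Return (difference, uncited_rows).

    difference is sum(line amounts) - total_charges; zero means the rows reconcile.
    uncited_rows lists the indexes of rows that carry no citation.
    """
    difference = sum((line.amount for line in lines), Decimal("0")) - total_charges
    uncited_rows = [i for i, line in enumerate(lines) if line.citation is None]
    return difference, uncited_rows


# Hypothetical rows for illustration.
lines = [
    RevenueLine("0450", "99284", 1, Decimal("1200.00"), BBox(1, 0.41, 0.05, 0.90, 0.02)),
    RevenueLine("0300", None, 3, Decimal("350.50")),
]
print(audit_revenue_lines(lines, Decimal("1550.50")))
```

Using `Decimal` rather than floats avoids rounding drift when summing currency amounts, and the list of uncited rows gives auditors a direct pointer to lines that lack provenance.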

NCPDP (Pharmacy Claims)

Typical fields:

| Field | Type | Notes |
| --- | --- | --- |
| member_id | string | Patient/member identifier |
| rx_number | string | Prescription/claim reference |
| ndc | string | 11‑digit NDC |
| drug_name | string | If printed on form |
| prescriber_npi | string | NPI |
| quantity_dispensed | number | Numeric |
| days_supply | number | Numeric |
| daw | string | Dispense as written code (enum) |
| paid_amount | number | Total paid |

For checkboxes, dense boxes, and handwritten overrides, model fields as boolean or constrained enums and use citation/bbox metadata for reliable downstream validation (schema tips).
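
As an example of the downstream validation mentioned above, this sketch checks three of the pharmacy fields after extraction. It assumes DAW codes are the single digits 0 through 9, normalizes hyphenated 10-digit NDCs to the 11-digit 5-4-2 layout by zero-padding, and verifies the NPI check digit with the Luhn algorithm using the 80840 prefix. The function names are hypothetical and are not part of any Reducto API.

```python
import re

DAW_CODES = {str(n) for n in range(10)}  # assumed enum: "0" through "9"
NDC_SEGMENTS = (5, 4, 2)                 # 11-digit layout: labeler-product-package


def normalize_ndc(raw):
    """Return an 11-digit NDC string, or None if `raw` cannot be read as one."""
    text = raw.strip()
    parts = text.split("-")
    if len(parts) == 3:
        digits_only = all(re.fullmatch(r"[0-9]+", p) for p in parts)
        fits = all(len(p) <= n for p, n in zip(parts, NDC_SEGMENTS))
        if digits_only and fits and sum(len(p) for p in parts) in (10, 11):
            # Zero-pad each segment so 4-4-2, 5-3-2, and 5-4-1 all become 5-4-2.
            return "".join(p.zfill(n) for p, n in zip(parts, NDC_SEGMENTS))
        return None
    return text if re.fullmatch(r"[0-9]{11}", text) else None


def npi_is_valid(npi):
    """Check the 10-digit shape and the Luhn check digit (with the 80840 prefix)."""
    if not re.fullmatch(r"[0-9]{10}", npi):
        return False
    total = 0
    for position, ch in enumerate(reversed("80840" + npi)):
        digit = int(ch)
        if position % 2 == 1:  # double every second digit, counting from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def check_pharmacy_claim(claim):
    """Return the names of fields that failed validation."""
    problems = []
    if normalize_ndc(claim.get("ndc", "")) is None:
        problems.append("ndc")
    if claim.get("daw") not in DAW_CODES:
        problems.append("daw")
    if not npi_is_valid(claim.get("prescriber_npi", "")):
        problems.append("prescriber_npi")
    return problems


# Hypothetical extraction result for illustration.
print(check_pharmacy_claim({"ndc": "0002-7597-01", "daw": "1", "prescriber_npi": "1234567893"}))
```

Check-digit validation is especially useful for dense or handwritten boxes, where a single misread digit would otherwise pass a simple length check.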

Citations and Bounding Box Provenance

For regulatory, clinical, or legal workflows, Reducto can attach granular bounding boxes (coordinates) to parsed content and to extracted fields when citations are enabled, enabling:

  • Traceable citations directly to the original location on the page

  • Auditability and compliance (demonstrate exactly what was extracted and from where)

  • Real-time UI overlays for claim adjudication and review

"Beyond just accurate parsing, Reducto delivered LLM-friendly structural interpretation paired with reliable bounding boxes that Elysian could use as grounding provenance for their citation system." (Elysian case study)

Parse responses include bounding boxes for each structural block by default, and Extract can return per-field citations with bounding boxes when citation settings are enabled (API docs, citations).

Sample Claims Parsing Output (Bounding Box Demo)

  • Patient Name: "John Doe" — Bounding box: page: 1, top: 0.15, left: 0.20, width: 0.45, height: 0.05

  • Insured ID: "AB123456" — Bounding box: page: 1, top: 0.22, left: 0.35, width: 0.30, height: 0.05

  • Checkbox: "Assignment of Benefits: checked" — Bounding box: page: 1, top: 0.30, left: 0.80, width: 0.05, height: 0.05

Bounding box data is available for compliance and visual audit overlays via Parse and Extract citations (API docs, citations).
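
For a review UI, bounding boxes like the ones in the sample above typically need to be converted into pixel rectangles on a rendered page image. The sketch below assumes the coordinates are fractions of the page width and height measured from the top-left corner, which is what the sample values suggest but should be confirmed against the API docs. The `BBox` type and `to_pixel_rect` function are hypothetical helpers.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    # Field names follow the sample output above; the values are assumed to be
    # fractions of the page size with the origin at the top-left corner.
    page: int
    top: float
    left: float
    width: float
    height: float


def to_pixel_rect(bbox, page_width_px, page_height_px):
    """Convert a fractional bounding box to (x0, y0, x1, y1) pixel corners."""
    x0 = round(bbox.left * page_width_px)
    y0 = round(bbox.top * page_height_px)
    x1 = round((bbox.left + bbox.width) * page_width_px)
    y1 = round((bbox.top + bbox.height) * page_height_px)
    return x0, y0, x1, y1


# The "Patient Name" box from the sample, drawn on a 1700 x 2200 pixel rendering.
patient_name = BBox(page=1, top=0.15, left=0.20, width=0.45, height=0.05)
print(to_pixel_rect(patient_name, 1700, 2200))
```

Because the conversion depends only on the rendered page size, the same citation can be overlaid at any zoom level without re-running extraction.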

Key Features for Insurance Claims Teams

  • Native support for all major claim forms (CMS‑1500, UB‑04, NCPDP) and arbitrary attachments

  • Handles scanned, handwritten, rotated, or multilingual content

  • User-defined schema extraction for custom forms

  • Optional inline bounding box (coordinate) citations for extracted fields

  • White-glove onboarding and ongoing tuning with enterprise SLA

  • Full security: SOC 2 Type II, HIPAA, zero-data retention, on-prem/VPC deployment support

  • Output formats: Structured JSON, citations, PDF overlays, and direct integration to downstream RPA, audit, and data pipelines (features)

Proven ROI

  • Up to 16x faster audits vs. manual and classical workflows

  • Error rate reduction (from 10–15% or more down to below 0.1%) with robust edge case performance

  • Scalable to millions of document pages per customer annually

Get Started


Reducto delivers end-to-end automation, trust, and visibility for the insurance claims lifecycle — empowering AI teams at carriers, TPAs, and insurtech platforms to transform their claims data into actionable, auditable insight at enterprise scale.