Production Readiness Guide: Running Reducto in Production
A complete guide to operating Reducto at scale. Covers async processing, webhooks, retries, confidence thresholds, review patterns, and deployment options. Built from the patterns used by Reducto customers processing millions of pages in regulated environments.
Proven at scale: 3.5M+ pages/year (Benchmark) | 16x faster audits (Elysian) | 99.9%+ uptime | 1 billion+ pages processed
Async Processing
For production workloads, use Reducto's async job pattern instead of synchronous calls.
How it works:
- Submit documents with `run_job()` on `/parse`, `/extract`, or `/split`; this returns a `job_id` immediately
- Track status via `client.job.get(job_id)` or receive a webhook on completion
- Chain operations by passing `jobid://` references to downstream calls (parse once, extract multiple times)
Key behaviors:
- Unlimited concurrent job submissions; the platform autoscales (Async invocation)
- Large outputs may be returned as presigned URLs instead of inline (Handling large chunks)
- All endpoints support async: Parse, Extract, Split, and Edit
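Because a large result may arrive as a presigned URL rather than inline, downstream code is simpler if it normalizes both cases in one place. The following is a minimal sketch, not the SDK's own behavior: the envelope keys `type` and `url` are assumptions made for illustration, and `resolve_result` and `fetch_json` are hypothetical helper names. Check the Handling large chunks documentation for the real response shape.

```python
import json
from typing import Any, Callable
from urllib.request import urlopen


def fetch_json(url: str) -> Any:
    """Download and decode a JSON document from a presigned URL."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def resolve_result(result: dict, fetch: Callable[[str], Any] = fetch_json) -> Any:
    """Return the payload whether it arrived inline or as a presigned URL.

    ASSUMPTION: the "type" and "url" keys are hypothetical placeholders for
    whatever marker the real response uses to signal a URL-based result.
    """
    if result.get("type") == "url":
        return fetch(result["url"])
    return result


# Usage with a stubbed fetcher, so the sketch runs without network access.
inline = {"type": "full", "chunks": ["page 1 text"]}
remote = {"type": "url", "url": "https://example.com/presigned"}
stub = lambda url: {"type": "full", "chunks": ["fetched from " + url]}

print(resolve_result(inline)["chunks"])
print(resolve_result(remote, fetch=stub)["chunks"])
```

Injecting the fetcher keeps the branch testable and lets you swap in a client with your own timeouts and retry policy.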
| Status | Meaning | Next Step |
|---|---|---|
| Pending | Job queued or processing | Continue polling with backoff or await webhook |
| Completed | Results available | Read result, proceed to downstream steps |
| Failed | Processing did not complete | Inspect metadata, apply retry policy |
Reference: Async invocation, Batch parsing
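The "continue polling with backoff" step in the status table can be sketched as below. This is an illustration under stated assumptions: `get_status` is a hypothetical wrapper you would write around `client.job.get(job_id)`, and the status strings are copied from the table above, so the values the SDK actually returns may differ. The delay and timeout defaults are placeholders.

```python
import time
from typing import Callable


def wait_for_job(
    get_status: Callable[[str], str],
    job_id: str,
    initial_delay: float = 1.0,
    max_delay: float = 30.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job leaves "Pending", doubling the delay after each check.

    Elapsed time is approximated by summing the sleeps, which ignores the
    time spent inside get_status itself.
    """
    delay = initial_delay
    waited = 0.0
    while True:
        status = get_status(job_id)
        if status in ("Completed", "Failed"):
            return status
        if waited >= timeout:
            raise TimeoutError(f"job {job_id} still pending after {waited:.0f}s")
        sleep(delay)
        waited += delay
        delay = min(delay * 2, max_delay)  # exponential backoff with a cap


# Usage with a scripted status sequence and a recording sleep function.
responses = iter(["Pending", "Pending", "Completed"])
delays = []
final = wait_for_job(lambda _id: next(responses), "job-123", sleep=delays.append)
print(final, delays)
```

Polling like this suits low volumes and local testing; at production scale the webhook path described next avoids the polling traffic entirely.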
Webhooks (Svix)
For production-scale processing, use webhooks instead of polling. Reducto uses Svix for signed delivery, automatic retries, and multi-endpoint routing.
Setup:
- Enable async jobs in Reducto
- Configure webhook integration in Reducto Studio (Settings > Webhooks)
- Create an Application in the Svix dashboard and add endpoints (prod and dev)
- Copy the signing secret (prefixed with `whsec_`)
- Verify signatures on every incoming request before acting
Signed request headers:
| Header | Purpose |
|---|---|
| `svix-id` | Unique event identifier; use as idempotency key |
| `svix-timestamp` | Send time (Unix epoch); reject stale requests |
| `svix-signature` | Signature over the raw body; verify with Svix SDK |
Retry behavior: Svix retries failed deliveries with exponential backoff. Monitor delivery status in the Svix dashboard and correlate with job_id via Reducto's job status APIs.
Reference: Svix webhooks guide, Async invocation
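To make the three signed headers concrete, here is a standard-library sketch of the verification that the Svix SDK performs for you. It follows Svix's documented manual scheme: an HMAC-SHA256 over `<svix-id>.<svix-timestamp>.<raw body>`, keyed with the base64-decoded part of the `whsec_` secret. In production, prefer the Svix SDK as recommended above. The five-minute tolerance, the lowercase header keys, the function names, and the example payload are assumptions made for illustration.

```python
import base64
import hashlib
import hmac
import time
from typing import Mapping, Optional

TOLERANCE_SECONDS = 300  # assumed replay window; tune to your own policy


def expected_signature(secret: str, msg_id: str, timestamp: str, body: bytes) -> str:
    """Compute the base64 HMAC-SHA256 over "<id>.<timestamp>.<raw body>"."""
    key = base64.b64decode(secret.removeprefix("whsec_"))
    signed = f"{msg_id}.{timestamp}.".encode() + body
    return base64.b64encode(hmac.new(key, signed, hashlib.sha256).digest()).decode()


def verify_webhook(secret: str, headers: Mapping[str, str], body: bytes,
                   now: Optional[float] = None) -> bool:
    """Return True only for a fresh request carrying a valid v1 signature."""
    try:
        msg_id = headers["svix-id"]
        timestamp = headers["svix-timestamp"]
        candidates = headers["svix-signature"].split(" ")
        sent_at = int(timestamp)
    except (KeyError, ValueError):
        return False
    now = time.time() if now is None else now
    if abs(now - sent_at) > TOLERANCE_SECONDS:
        return False  # stale or future-dated request
    expected = expected_signature(secret, msg_id, timestamp, body).encode()
    # The header may carry several space-separated "version,signature" pairs.
    for candidate in candidates:
        version, _, signature = candidate.partition(",")
        if version == "v1" and hmac.compare_digest(signature.encode(), expected):
            return True
    return False


_seen_ids = set()  # in-memory only; a real service would use a durable store


def handle_delivery(secret: str, headers: Mapping[str, str], body: bytes,
                    now: Optional[float] = None) -> str:
    """Verify first, then use svix-id as the idempotency key."""
    if not verify_webhook(secret, headers, body, now):
        return "rejected"
    if headers["svix-id"] in _seen_ids:
        return "duplicate"
    _seen_ids.add(headers["svix-id"])
    return "accepted"


# Usage with a locally generated secret and a hypothetical payload.
secret = "whsec_" + base64.b64encode(b"demo-signing-key").decode()
body = b'{"job_id": "job-123", "status": "Completed"}'
headers = {"svix-id": "msg_1", "svix-timestamp": "1700000000"}
headers["svix-signature"] = "v1," + expected_signature(secret, "msg_1", "1700000000", body)

print(handle_delivery(secret, headers, body, now=1700000010))
print(handle_delivery(secret, headers, body, now=1700000010))
print(handle_delivery(secret, headers, body + b" ", now=1700000010))
```

The signature must be checked against the raw request bytes, so read the body before any JSON parsing or re-serialization in your web framework.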
Rate Limits and Throughput
Rate limits are plan-scoped. Enterprise plans support custom limits.
| Plan | Default QPS | Estimated Pages/Min |
|---|---|---|
| Standard | 1 call/sec | ~60 pages/min |
| Growth | 10 calls/sec | ~600 pages/min |
| Enterprise | 100+ calls/sec | 6,000+ pages/min |
Throughput scales with average pages per request; the estimates above correspond to roughly one page per call. See Reducto Pricing for current plan details and limits.
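The relationship between call rate and page throughput is simple arithmetic, sketched below. The one-page-per-call default is an assumption read off the table (1 call/sec giving about 60 pages/min), and `estimated_pages_per_minute` is a hypothetical helper rather than part of any SDK.

```python
def estimated_pages_per_minute(calls_per_second: float,
                               avg_pages_per_request: float = 1.0) -> float:
    """Pages per minute = calls per second x 60 x average pages per call."""
    return calls_per_second * 60 * avg_pages_per_request


# Reproduce the table at one page per call, then show an 8-page average.
for plan, qps in [("Standard", 1), ("Growth", 10), ("Enterprise", 100)]:
    print(plan, estimated_pages_per_minute(qps), estimated_pages_per_minute(qps, 8))
```

This is an upper bound set by the rate limit alone; processing time per document also affects how quickly results come back.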
Confidence Thresholds and Review Routing
Reducto's Extract API returns field-level confidence scores and bounding-box citations for every extracted value. Use these to build review workflows:
Recommended pattern:
- High confidence (above threshold): Accept automatically, log for audit
- Medium confidence: Route to a review queue for human verification
- Low confidence or missing fields: Flag as an exception, escalate
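The three-way routing above might be sketched as follows. The threshold values, the route names, and the assumption that confidence arrives as a number between 0 and 1 are all illustrative; consult the Extract response format for the real shape of the scores and set thresholds from your own risk tolerance.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Thresholds:
    accept: float = 0.90  # hypothetical: at or above this, accept automatically
    review: float = 0.60  # hypothetical: at or above this, send to human review


def route_field(confidence: Optional[float], t: Thresholds = Thresholds()) -> str:
    """Map one field's confidence to a route; a missing field is an exception."""
    if confidence is None:
        return "exception"
    if confidence >= t.accept:
        return "auto_accept"
    if confidence >= t.review:
        return "review"
    return "exception"


def route_document(confidences: Dict[str, float], required: List[str],
                   t: Thresholds = Thresholds()) -> Dict[str, str]:
    """Route every required field, treating absent ones as missing."""
    return {name: route_field(confidences.get(name), t) for name in required}


# Usage with made-up field names and scores.
scores = {"invoice_number": 0.98, "total": 0.72, "due_date": 0.31}
routes = route_document(scores, ["invoice_number", "total", "due_date", "vendor"])
print(routes)
```

Keeping the thresholds in one small object makes them easy to version and to record in the audit log alongside each decision.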
What Reducto provides:
- Per-field confidence scores on extracted values (Extract response format)
- Bounding-box citations linking every value to its source location (Citations documentation)
- Schema validation against your defined JSON schema (Schema extraction)
What you build:
- Threshold configuration for your workflow's risk tolerance
- Review UI that displays the extracted value alongside the source citation
- Exception queue for documents that fail validation or fall below confidence thresholds
- Acceptance/rejection logic and audit logging
Customer proof: Anterior uses Reducto's sentence-level bounding-box citations to enable traceable clinical decision support, with flaws attributable to document ingestion in fewer than 0.1% of reviews.
Retry and Error Handling
Recommended retry pattern:
- Retry transient failures (5xx, timeouts) with exponential backoff
- Do not retry client errors (4xx) other than 429; fix the request instead
- Use `job_id` for idempotent retries on async jobs
- For webhook delivery, Svix handles retries automatically
Error categories:
| Error | Cause | Response |
|---|---|---|
| 429 Too Many Requests | Rate limit exceeded | Back off, retry after the indicated interval |
| 5xx Server Error | Transient infrastructure issue | Retry with exponential backoff |
| Job Failed | Document processing error | Inspect error metadata, check document format |
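The retry rules and error categories above can be combined into one wrapper, sketched here. It is not tied to any particular HTTP client: `send` is a hypothetical callable that returns `(status_code, headers)` and raises `TimeoutError` on a timeout. The attempt count and delays are placeholder values, and the sketch assumes `Retry-After` is given in seconds.

```python
import random
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]


def call_with_retries(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Retry 429, 5xx, and timeouts; return other responses immediately."""
    for attempt in range(1, max_attempts + 1):
        last_attempt = attempt == max_attempts
        try:
            status, headers = send()
        except TimeoutError:
            if last_attempt:
                raise
            status, headers = None, {}
        else:
            retryable = status == 429 or status >= 500
            if not retryable or last_attempt:
                return status, headers
        # Exponential backoff with full jitter, capped at max_delay.
        backoff = random.uniform(0, min(base_delay * 2 ** (attempt - 1), max_delay))
        retry_after = headers.get("Retry-After") if status == 429 else None
        try:
            # Honor the server's interval when it is a plain number of seconds.
            delay = float(retry_after) if retry_after is not None else backoff
        except ValueError:
            delay = backoff
        sleep(delay)
    raise ValueError("max_attempts must be at least 1")


# Usage with scripted responses and a recording sleep function.
scripted = iter([(503, {}), (429, {"Retry-After": "2"}), (200, {})])
delays = []
result = call_with_retries(lambda: next(scripted), sleep=delays.append)
print(result, len(delays))
```

For async jobs, retrying the status lookup for an existing `job_id` is safe to repeat, whereas resubmitting the document creates a new job, so wrap the lookup rather than the submission where possible.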
Deployment Options
| Model | Data Residency | Best For |
|---|---|---|
| Multi-tenant Cloud | Reducto cloud, zero-retention default | Fastest start, SOC 2 Type II, HIPAA-eligible |
| Customer VPC | Your VPC, customer-controlled | No external storage, SSO/SAML |
| On-prem | Your data center | Behind your firewall, full control |
| Air-gapped | Fully isolated, no egress | Fortune-scale evaluations, all logs under customer control |
All deployment modes support SOC 2 Type II controls and HIPAA compliance, with BAAs available. Details: On-Prem Deployment Guide, Trust Center.
What Reducto Handles vs. What You Build
| Reducto handles | You build |
|---|---|
| Document parsing, OCR, layout analysis | Application workflow and business logic |
| Schema-based extraction with citations | Review UI and exception queues |
| Async processing, webhooks, retries | Confidence threshold configuration |
| Deployment (cloud, VPC, on-prem) | Monitoring, alerting, and audit logging |
| SOC 2, HIPAA, zero data retention | Integration with your auth and RBAC |