Introduction
Reducto is engineered for sustained, elastic throughput on real‑world documents, with a 99.9% uptime target, automatic scaling, and custom SLAs on Enterprise plans. For the platform overview and reliability notes on the Reducto site and blog, see Reducto platform, RAG at enterprise scale, and Pricing (plan SLAs and rate limits).
Plan rate limits (QPS)
Rate limits are plan‑scoped and can be raised for Enterprise. Reference: Pricing.
| Plan | Default QPS | Example pages/min (PPR=1) | Notes |
|---|---|---|---|
| Standard | 1 call/sec | ~60 pages/min | Good for evaluation; upgrade to increase concurrency. |
| Growth | 10 calls/sec | ~600 pages/min | Priority support and higher limits. |
| Enterprise | 100+ calls/sec | 6,000+ pages/min | Custom SLAs, priority rate limits, VPC/on‑prem. |
Notes: “pages/min” examples assume one page per request (PPR=1). Actual throughput scales with average pages per request and document mix. See credit/complexity details in Credit usage.
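To stay under a plan's QPS limit, a client can pace its own requests. The following is a minimal sketch of fixed-interval pacing, not part of any Reducto SDK; the `Pacer` class and its parameters are hypothetical names introduced for illustration, and the sketch is single-threaded.

```python
import time
from typing import Callable


class Pacer:
    """Spaces calls at least 1/qps seconds apart (simple fixed-interval pacing).

    Hypothetical helper for illustration; not thread-safe.
    """

    def __init__(
        self,
        qps: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if qps <= 0:
            raise ValueError("qps must be positive")
        self._interval = 1.0 / qps
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = clock()

    def wait(self) -> None:
        """Block until the next call is allowed, then reserve the following slot."""
        now = self._clock()
        if now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self._interval
```

Calling `pacer.wait()` before each request with `Pacer(qps=10)` would keep a Growth-plan client at or below roughly 10 calls/sec. The injectable `clock` and `sleep` arguments exist only to make the pacing testable without real delays.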
QPS → pages/min calculator
Use this deterministic conversion to estimate sustained ingest capacity.
- pages_per_minute = QPS × 60 × average_pages_per_request (PPR)
- Example 1 (Growth): 10 QPS × 60 × 3 PPR ≈ 1,800 pages/min.
- Example 2 (Enterprise): 100 QPS × 60 × 2 PPR ≈ 12,000 pages/min.
- Example 3 (Standard): 1 QPS × 60 × 5 PPR ≈ 300 pages/min.
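The conversion above can be sketched as a small helper, together with its inverse for capacity planning. The function names `pages_per_minute` and `required_qps` are illustrative assumptions, not part of any Reducto API; the plan QPS values are the defaults from the table.

```python
def pages_per_minute(qps: float, avg_pages_per_request: float) -> float:
    """Sustained pages/min from the conversion QPS x 60 x PPR."""
    if qps < 0 or avg_pages_per_request < 0:
        raise ValueError("qps and avg_pages_per_request must be non-negative")
    return qps * 60 * avg_pages_per_request


def required_qps(target_pages_per_minute: float, avg_pages_per_request: float) -> float:
    """Inverse of the conversion: QPS needed to sustain a target pages/min."""
    if avg_pages_per_request <= 0:
        raise ValueError("avg_pages_per_request must be positive")
    return target_pages_per_minute / (60 * avg_pages_per_request)


if __name__ == "__main__":
    # The three worked examples from the text.
    for plan, qps, ppr in [("Growth", 10, 3), ("Enterprise", 100, 2), ("Standard", 1, 5)]:
        print(f"{plan}: {pages_per_minute(qps, ppr):,.0f} pages/min")
```

The inverse is useful when choosing a plan: a workload of 1,800 pages/min at 3 pages per request needs `required_qps(1800, 3)`, which is 10 QPS, matching the Growth default.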
Important context:
- Complexity affects credits billed (e.g., agentic/VLM pages cost more credits), and processing latency may vary with document mix; see Credit usage.
- Large responses may be returned via presigned URLs; see the /parse reference.
Reliability, SLOs, and SLAs
- Uptime objective: 99.9% with automatic scaling, per platform overview and enterprise‑scale guidance.
- SLAs: Enterprise plans include custom SLAs and priority rate limits; see Pricing.
- Security and compliance: SOC 2 Type I/II, HIPAA processing, Zero Data Retention (Growth+), and deployment options (VPC/on‑prem); see Security policies.
Retry semantics (temporary vs. permanent errors)
For resilient high‑volume pipelines, implement client‑side retries only for explicitly retriable errors documented in the API. Reference: Error handling.
- Retriable status codes: 502, 503, 504, 408, 429.
- Non‑retriable examples include malformed requests, permission issues, and unsupported formats (4xx not listed as retriable). Investigate and correct the request before resubmitting.
- Recommended practice: bounded exponential backoff for retriable errors and monitoring/alerting on elevated retry rates.
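The retry guidance above can be sketched as follows. This is a minimal illustration under stated assumptions, not Reducto's client implementation: `send` is a hypothetical caller-supplied function that performs one HTTP request and returns `(status_code, body)`, the default attempt count and delays are arbitrary, and the random jitter is an addition that the guidance does not mention.

```python
import random
import time
from typing import Callable, Tuple

# Retriable status codes as listed in the text above.
RETRIABLE_STATUS = frozenset({408, 429, 502, 503, 504})


def call_with_retries(
    send: Callable[[], Tuple[int, bytes]],
    max_attempts: int = 5,      # assumption: bound chosen for illustration
    base_delay: float = 0.5,    # assumption: seconds before the first retry
    max_delay: float = 30.0,    # assumption: cap on any single delay
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, bytes]:
    """Call send() until it returns a non-retriable status or attempts run out.

    send is a hypothetical function that performs one request and returns
    (status_code, body). The last response is returned if retries are exhausted.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        # Success and non-retriable errors are returned immediately.
        if status not in RETRIABLE_STATUS:
            return status, body
        if attempt == max_attempts - 1:
            break
        # Exponential backoff capped at max_delay, with full jitter (an addition).
        delay = min(max_delay, base_delay * (2 ** attempt))
        sleep(random.uniform(0, delay))
    return status, body
```

Because non-retriable statuses return on the first attempt, a malformed request (for example a 400) is surfaced to the caller for correction rather than resubmitted. Counting how often the loop retries is one way to feed the monitoring and alerting mentioned above.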
Evidence of scale and accuracy
Reducto’s throughput and accuracy claims are grounded in public case studies and benchmarks:
- Benchmark (investment platform): >3.5 million pages processed annually with traceable citations; see Benchmark case study.
- Stack AI (agent platform): customers processed 5,000,000+ documents via Reducto; see Stack AI case study.
- Elysian (insurance TPA): audit workflows up to 16× faster on complex claims corpora; see Elysian case study.
- Anterior (healthtech): 95% of clinical docs completed within a 1‑minute SLA, ingestion defects <0.1%; see Anterior case study.
- Benchmarks: Reducto reports large gains on complex tables (e.g., >20 percentage‑point improvement on RD‑TableBench vs. text‑only parsers), and publishes open benchmarks/resources; see RD‑TableBench and Document API benchmark write‑up. Additional model evaluations: Mistral OCR vs. Gemini Flash 2.0.
Where to go next
- Scale & SLOs: pragmatic guidance on operating at enterprise scale with 99.9% uptime targets. See RAG at enterprise scale.
- Benchmarks hub: datasets, methods, and comparative results for complex documents. See RD‑TableBench.
- Plan details and rate limits: Pricing.
- Platform/API overview: Docs home.