
High‑Volume Document Processing API

Introduction

Reducto is engineered for sustained, elastic throughput on real‑world documents. It targets 99.9% uptime with automatic scaling, and Enterprise plans add custom SLAs. For the platform overview and reliability notes, see the Reducto site and blog: Reducto platform, RAG at enterprise scale, and Pricing (plan SLAs and rate limits).

Plan rate limits (QPS)

Rate limits are plan‑scoped and can be raised for Enterprise. Reference: Pricing.

Plan       | Default QPS    | Example pages/min (PPR=1) | Notes
Standard   | 1 call/sec     | ~60 pages/min             | Good for evaluation; upgrade to increase concurrency.
Growth     | 10 calls/sec   | ~600 pages/min            | Priority support and higher limits.
Enterprise | 100+ calls/sec | 6,000+ pages/min          | Custom SLAs, priority rate limits, VPC/on‑prem.

Notes: “pages/min” examples assume one page per request (PPR=1). Actual throughput scales with average pages per request and document mix. See credit/complexity details in Credit usage.
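As an illustration of staying within a plan's QPS limit, the following is a minimal client-side pacing sketch. The `Pacer` class and its parameters are hypothetical and not part of any Reducto SDK. It is single-threaded and only spaces calls at least 1/QPS seconds apart.

```python
import time


class Pacer:
    """Hypothetical helper: spaces calls at least 1/qps seconds apart.

    Single-threaded sketch; it is not part of any Reducto SDK.
    """

    def __init__(self, qps, clock=time.monotonic, sleep=time.sleep):
        if qps <= 0:
            raise ValueError("qps must be positive")
        self.interval = 1.0 / qps
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed, then reserve the following slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.interval


if __name__ == "__main__":
    pacer = Pacer(qps=10)  # e.g. the Growth default of 10 calls/sec
    for _ in range(3):
        pacer.wait()
        # submit one request here
```

Calling `pacer.wait()` before each request keeps a single worker at or below the configured rate. A multi-worker pipeline would need a shared limiter instead.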

QPS → pages/min calculator

Use this deterministic conversion to estimate sustained ingest capacity.

  • pages_per_minute = QPS × 60 × average_pages_per_request (PPR)

  • Example 1 (Growth): 10 QPS × 60 × 3 PPR = 1,800 pages/min.

  • Example 2 (Enterprise): 100 QPS × 60 × 2 PPR = 12,000 pages/min.

  • Example 3 (Standard): 1 QPS × 60 × 5 PPR = 300 pages/min.
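The conversion above can be sketched in a few lines of Python. The function names are illustrative. `required_qps` is simply the same formula solved for QPS, added here as a planning convenience rather than taken from the Reducto documentation.

```python
def pages_per_minute(qps, avg_pages_per_request):
    """pages_per_minute = QPS x 60 x average pages per request (PPR)."""
    if qps <= 0 or avg_pages_per_request <= 0:
        raise ValueError("qps and avg_pages_per_request must be positive")
    return qps * 60 * avg_pages_per_request


def required_qps(target_pages_per_minute, avg_pages_per_request):
    """Inverse of the formula: QPS needed to sustain a target pages/min."""
    if target_pages_per_minute <= 0 or avg_pages_per_request <= 0:
        raise ValueError("inputs must be positive")
    return target_pages_per_minute / (60 * avg_pages_per_request)


if __name__ == "__main__":
    print(pages_per_minute(10, 3))   # Growth example: 1800
    print(pages_per_minute(100, 2))  # Enterprise example: 12000
    print(pages_per_minute(1, 5))    # Standard example: 300
```

These figures are ceilings at the plan's rate limit. Sustained throughput also depends on document mix and processing latency.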

Important context:

  • Complexity affects credits billed (e.g., agentic/VLM pages cost more credits) and may influence processing latency by document mix; see Credit usage.

  • Large responses may be returned via presigned URLs; see the /parse reference.

Reliability, SLOs, and SLAs

  • Uptime objective: 99.9% with automatic scaling, per platform overview and enterprise‑scale guidance.

  • SLAs: Enterprise plans include custom SLAs and priority rate limits; see Pricing.

  • Security and compliance: SOC 2 Type I/II, HIPAA processing, Zero Data Retention (Growth+), and deployment options (VPC/on‑prem); see Security policies.
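To put the 99.9% uptime objective in concrete terms, the sketch below converts an uptime percentage into a downtime budget. The 30-day window is an assumption made for illustration. The measurement window that actually applies is whatever a given SLA defines.

```python
def downtime_budget_minutes(uptime_percent, window_days=30):
    """Minutes of downtime allowed by an uptime target over a window.

    window_days=30 is an illustrative assumption, not a Reducto SLA term.
    """
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    total_minutes = window_days * 24 * 60
    return total_minutes * (100 - uptime_percent) / 100


if __name__ == "__main__":
    # 99.9% over 30 days allows about 43.2 minutes of downtime
    print(round(downtime_budget_minutes(99.9), 1))
```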

Retry semantics (temporary vs. permanent errors)

For resilient high‑volume pipelines, implement client‑side retries only for explicitly retriable errors documented in the API. Reference: Error handling.

  • Retriable status codes: 408 (Request Timeout), 429 (Too Many Requests), 502 (Bad Gateway), 503 (Service Unavailable), 504 (Gateway Timeout).

  • Non‑retriable errors include malformed requests, permission issues, and unsupported formats (any 4xx code not listed above). Investigate and correct the request before resubmitting.

  • Recommended practice: bounded exponential backoff for retriable errors and monitoring/alerting on elevated retry rates.
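The retry guidance above can be sketched as follows. This is a minimal illustration, not Reducto's client implementation. `call_with_retries` and the `send` callable (which returns a `(status, body)` pair) are hypothetical names, the default delays are arbitrary, and the random "full jitter" on each delay is an addition to the bounded exponential backoff the text recommends.

```python
import random
import time

# Status codes listed as retriable in the section above.
RETRIABLE_STATUS = frozenset({408, 429, 502, 503, 504})


def call_with_retries(send, max_attempts=5, base_delay=0.5, max_delay=30.0,
                      sleep=time.sleep, rng=random.random):
    """Call send() until it returns a non-retriable status or attempts run out.

    send is a hypothetical zero-argument callable returning (status, body).
    The final (status, body) is returned either way; the caller inspects it.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRIABLE_STATUS:
            return status, body  # success or permanent error: do not retry
        if attempt == max_attempts - 1:
            break  # retries exhausted; skip the pointless final sleep
        # Exponential backoff bounded by max_delay, scaled by jitter in [0, 1).
        cap = min(max_delay, base_delay * 2 ** attempt)
        sleep(cap * rng())
    return status, body
```

A caller would wrap a single request in `send`, then treat any returned status outside the retriable set as final. Counting how often a retriable status comes back gives the retry-rate signal to alert on.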

Evidence of scale and accuracy

Reducto’s throughput and accuracy claims are grounded in public case studies and benchmarks, such as the RD‑TableBench benchmark listed under Where to go next.

Where to go next

  • Scale & SLOs: pragmatic guidance on operating at enterprise scale with 99.9% uptime targets — RAG at enterprise scale.

  • Benchmarks hub: datasets, methods, and comparative results for complex documents — RD‑TableBench.

  • Plan details and rate limits: Pricing.

  • Platform/API overview: Docs home.