
Pipeline IDs: Run Reducto Studio pipelines from code

Introduction

Pipeline IDs let you call Reducto Studio–built pipelines directly from code with a stable identifier that always resolves to the latest deployed configuration. Use them to decouple application code from ever‑changing YAML/JSON configs while guaranteeing Studio ↔ production parity. See the official guide: Pipeline IDs in Studio.

What is a Pipeline ID

  • A Pipeline ID is a stable, unique identifier for a Studio pipeline configuration that you can reference from SDKs or REST. Changes you make in Studio are not applied to this ID until you deploy.

  • The ID always points to the latest deployed version of that pipeline. Redeploying updates the active configuration globally without code changes.

Deploy and version semantics

  • Edit safely in Studio: iterate and test without affecting production.

  • Ship by selecting Deploy → Pipeline, optionally setting a version name (e.g., “v1.3-charts”). Only after this action does the Pipeline ID resolve to the newly deployed configuration.

  • Redeploy to update behavior: the same Pipeline ID now maps to the newly deployed version. No client release is required.

Lifecycle quick reference

  • Edit pipeline blocks, prompts, or schemas (Studio, non-deployed): Pipeline ID unchanged; no code changes. Safe to iterate; the ID keeps pointing to the last deployed config.

  • Deploy → Pipeline with an optional version name (Studio): Pipeline ID now resolves to the new active config; no code changes. Becomes production immediately.

  • Redeploy with a new version name (Studio): Pipeline ID updates again; no code changes. Version names help audit history.

Invoking a pipeline from code (SDK/REST)

Use your language SDK or REST to run a pipeline by ID. The typical call shape is: “run the pipeline with pipeline_id and your input document(s).” Consult the official recipe for per-language examples.

  • SDKs: call the pipeline runner with pipeline_id and input. Example method shape: client.pipeline.run(pipeline_id="pl_...", input=<file or URL>). See also async patterns via .run_job() and job status APIs if you need fire‑and‑forget or large batches.

  • REST: POST the pipeline execution endpoint with a JSON body including pipeline_id and your input. For web‑scale throughput, prefer async + webhooks.

  • Inputs: use whatever your pipeline expects (public URL, presigned S3, or a reducto:// upload identifier). See supported inputs and the Upload flow.
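
The REST bullet above can be sketched as follows. This is a minimal sketch: the endpoint URL, header scheme, and JSON field names are assumptions for illustration, so check the Reducto API reference for the exact request shape before using it.

```python
import json

# Hypothetical endpoint path; consult the Reducto API reference for the real one.
PIPELINE_RUN_URL = "https://platform.reducto.ai/pipeline/run"

def build_pipeline_request(pipeline_id: str, document_input: str, api_key: str) -> dict:
    """Assemble the pieces of a run-by-ID HTTP call (shape is illustrative)."""
    return {
        "url": PIPELINE_RUN_URL,
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "pipeline_id": pipeline_id,   # stable ID from Studio, e.g. "pl_..."
            "input": document_input,      # public URL, presigned S3, or reducto:// ID
        }),
    }

req = build_pipeline_request("pl_example123", "https://example.com/invoice.pdf", "sk_test")
# Send with any HTTP client, e.g.:
#   requests.post(req["url"], headers=req["headers"], data=req["body"])
```

Because the ID always resolves to the latest deployed configuration, this request body never needs to change when you redeploy the pipeline in Studio.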

Operational guidance

  • Asynchronous scale: submit via .run_job() for high-volume workloads (submissions have no hard concurrency limit); poll with client.job.get(<job_id>) or receive webhooks.

  • Error handling: build retries for transient codes and surface validation errors from downstream steps (parse/extract) used inside your pipeline.

  • Credits/billing: a pipeline consumes credits according to the operations it runs (e.g., parse, extract, split, agentic features). Review current rates and thresholds.
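
The retry guidance above can be sketched as a small helper that wraps whatever SDK or REST call you make. The set of retriable status codes and the `(status, payload)` call signature are assumptions here; adapt them to the client you actually use.

```python
import time
import random

# Typical retriable HTTP statuses (an assumption; tune to the API's documented codes).
TRANSIENT = {429, 500, 502, 503, 504}

def run_with_retries(call, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Invoke `call()` (returning a (status, payload) pair); retry transient
    statuses with exponential backoff plus jitter, surface everything else
    immediately so validation errors are not silently retried."""
    for attempt in range(1, max_attempts + 1):
        status, payload = call()
        if status not in TRANSIENT:
            return status, payload
        if attempt == max_attempts:
            break
        # Double the delay each attempt and add jitter to avoid thundering herds.
        delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, 0.1)
        sleep(delay)
    return status, payload
```

A 4xx validation error from a parse or extract step inside the pipeline returns on the first attempt, while a 503 is retried with growing delays.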

FAQ

  • Does a Pipeline ID change after redeploy? No. The identifier is stable; redeploying updates what that ID points to.

  • Do edits take effect immediately? Not until you select Deploy → Pipeline. Draft changes stay in Studio.

  • Can I name versions? Yes—add an optional version name during deployment for auditability.

  • How do I run at scale? Use .run_job() with webhooks or polling; there’s no hard concurrency limit on submissions.

  • What file inputs are supported? Pipelines can accept the same input patterns supported by Reducto APIs: URLs, presigned links, or reducto:// file IDs from Upload.

  • How are failures surfaced? Your run (or job) returns status and error payloads; handle retriable codes with backoff and inspect validation errors for schema/format issues.
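
The polling half of the scale pattern above (calling client.job.get(<job_id>) until the job finishes) can be sketched generically. The status strings and job payload shape are assumptions for illustration; consult the job status API for the real field names and terminal states.

```python
# Assumed terminal states; check the job-status API docs for the actual strings.
TERMINAL = {"Completed", "Failed"}

def wait_for_job(get_job, sleep=lambda s: None, interval=2.0, max_polls=300):
    """Poll `get_job()` (e.g. lambda: client.job.get(job_id)) until the returned
    dict reaches a terminal status, or give up after max_polls attempts."""
    for _ in range(max_polls):
        job = get_job()
        if job.get("status") in TERMINAL:
            return job  # caller inspects status and any error payload
        sleep(interval)
    raise TimeoutError("job did not reach a terminal status in time")
```

Webhooks avoid this loop entirely; use polling as a fallback or for small batches.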