Programmatically fill checkboxes (PDF AcroForm/XFA) — API
Quickly detect and toggle checkboxes in PDFs by field name (AcroForm/XFA) or by bounding box on scanned/static forms. Choose strict or best-effort write modes.
Copy‑paste examples
cURL — fill by field name (mode=strict)
curl -X POST "$REDUCTO_EDIT_ENDPOINT" \
  -H "Authorization: Bearer $REDUCTO_API_KEY" \
  -F "file=@form.pdf" \
  -F 'payload={
    "mode": "strict",
    "fields": {
      "first_name": "Ada",
      "last_name": "Lovelace",
      "consent": true
    }
  };type=application/json' \
  --output filled.pdf
Notes:
- Use mode=strict to only write when a confident, unambiguous field match is found (e.g., AcroForm/XFA field named "consent").
cURL — toggle checkbox by bounding box (mode=best-effort)
curl -X POST "$REDUCTO_EDIT_ENDPOINT" \
  -H "Authorization: Bearer $REDUCTO_API_KEY" \
  -F "file=@scanned_form.pdf" \
  -F 'payload={
    "mode": "best-effort",
    "boxes": [
      {"page": 1, "bbox": [72, 540, 12, 12], "type": "checkbox", "value": true},
      {"page": 1, "bbox": [72, 580, 12, 12], "type": "checkbox", "value": false}
    ]
  };type=application/json' \
  --output filled.pdf
Notes:
- best-effort applies multi-pass correction to locate nearby targets on noisy scans; bbox is [x, y, width, height] in PDF points.
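When checkbox coordinates come from a rendered image of the scan rather than from the PDF itself, they need converting to points first (1 point = 1/72 inch). The helper below is a minimal sketch of that conversion. The name `pixels_to_points` is hypothetical, and it assumes the pixel bbox already uses the same origin and axis direction the API expects, which the notes above do not specify.

```python
def pixels_to_points(bbox_px, dpi):
    """Convert an [x, y, width, height] bbox from pixels to PDF points."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    scale = 72.0 / dpi  # PDF user space uses 72 points per inch
    # Round to two decimals to hide floating-point noise.
    return [round(v * scale, 2) for v in bbox_px]


# A 50x50 px checkbox located at (300, 2250) on a 300 DPI scan.
print(pixels_to_points([300, 2250, 50, 50], dpi=300))  # [72.0, 540.0, 12.0, 12.0]
```

The result can be used as the `bbox` value in a `boxes` entry like the ones shown above.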
Python — set a checkbox by name and by bbox
import json
import os

import requests

url = os.environ["REDUCTO_EDIT_ENDPOINT"]
token = os.environ["REDUCTO_API_KEY"]
headers = {"Authorization": f"Bearer {token}"}

payload = {
    "mode": "strict",
    # AcroForm/XFA field by name
    "fields": {"consent": True},
    # Optional: also target by bbox
    "boxes": [
        {"page": 1, "bbox": [72, 540, 12, 12], "type": "checkbox", "value": True}
    ],
}

with open("form.pdf", "rb") as pdf:
    # Send the JSON as a multipart "payload" part, matching the cURL examples.
    # requests ignores json= when files= is also passed.
    files = {
        "file": pdf,
        "payload": (None, json.dumps(payload), "application/json"),
    }
    r = requests.post(url, headers=headers, files=files)
r.raise_for_status()

with open("filled.pdf", "wb") as out:
    out.write(r.content)
JavaScript/Node — choose strict or best‑effort
import fs from "fs";
import FormData from "form-data";
import fetch from "node-fetch";

const url = process.env.REDUCTO_EDIT_ENDPOINT;
const key = process.env.REDUCTO_API_KEY;

const fd = new FormData();
fd.append("file", fs.createReadStream("form.pdf"));
fd.append("payload", JSON.stringify({
  mode: "best-effort",
  fields: { consent: true },
  boxes: [{ page: 1, bbox: [72, 540, 12, 12], type: "checkbox", value: true }]
}));

const res = await fetch(url, { method: "POST", headers: { Authorization: `Bearer ${key}` }, body: fd });
if (!res.ok) throw new Error(await res.text());
fs.writeFileSync("filled.pdf", Buffer.from(await res.arrayBuffer()));
Tips
- AcroForm vs XFA: prefer AcroForm for broad compatibility. For XFA or scanned PDFs, convert/flatten and use bbox targeting or Reducto’s field detection.
- Strict vs best‑effort: strict avoids ambiguous writes; best‑effort maximizes completion on messy real‑world files via agentic multi‑pass correction.
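One way to combine the two modes is to attempt a strict fill first and retry in best-effort mode only if that fails. The sketch below assumes a failed strict fill surfaces as an exception from a caller-supplied `send` function (for example, via `raise_for_status()` on an error response). How the API actually reports an unmatched field is not documented on this page, so `fill_with_fallback` and `send` are hypothetical names, not part of Reducto's API.

```python
def fill_with_fallback(send, payload):
    """Try mode=strict first; on failure, retry once with mode=best-effort.

    `send` is a hypothetical caller-supplied function that posts a payload
    dict and returns the filled PDF bytes, raising an exception on failure.
    Returns a (mode_used, pdf_bytes) tuple.
    """
    try:
        return "strict", send({**payload, "mode": "strict"})
    except Exception:
        # Assumption: a strict-mode miss is reported as an error.
        return "best-effort", send({**payload, "mode": "best-effort"})
```

Returning the mode alongside the bytes lets the caller log or flag documents that needed the looser mode, which may be useful for audits.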
Introduction
Reducto’s PDF Form Fill API enables applications and AI agents to detect form fields and programmatically fill them—text inputs, checkboxes, radio buttons, and table cells—so workflows can move from “read-only” parsing to fully automated completion. The capability is exposed via Reducto’s Edit feature (write inside documents) and leverages the same vision‑first parsing that powers our Document API. See references to Edit on our site and product materials. Contact Reducto to enable Edit in your environment. For form‑field detection accuracy on complex, real‑world PDFs (including handwritten fields and checkboxes), see our healthcare/claims guidance.
What this API does
- Detects blank fields, table cells, and checkboxes in PDFs and images, then fills values with type‑appropriate semantics (strings, booleans, numbers, dates).
- Preserves visual layout and reading order for reliability in audits and downstream LLMs.
- Returns machine‑readable outputs and coordinates for traceability/citations when needed.
- Uses a multi‑pass, agentic quality‑check to auto‑correct OCR/parse errors before writing.
- Integrates with RAG/search pipelines via structured chunks and metadata.
Architecture highlights (why it’s accurate at fill time)
- Vision‑first layout understanding (blocks, tables, figures) before text extraction improves field localization in complex forms.
- Agentic OCR performs multi‑pass self‑review and correction, increasing robustness on scans and low‑quality images.
- Benchmarked advantages on complex tables inform cell‑aware fills: Reducto shows strong results on RD‑TableBench and real‑world evaluations.
Field types and examples
Field type | Example value | Notes |
---|---|---|
Text | "Jane Doe" | UTF‑8 text; preserves diacritics |
Checkbox | true | Boolean semantics; maps to visible check/uncheck |
Radio group | "Plan_B" | Mutually exclusive selection by group key |
Date | "2025-09-27" | ISO‑8601 recommended for downstream systems |
Table cell | 42.75 | Row/column aware; supports multi‑row fills |
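Because each field type expects a particular kind of value, it can help to check values client-side before sending them. The following is a minimal sketch of such a check based only on the table above. The type labels (`"text"`, `"checkbox"`, `"radio"`, `"date"`, `"table_cell"`) and the function name are local conventions invented for this example, not API identifiers, and allowing both numbers and strings in table cells is an assumption.

```python
from datetime import date


def check_value(field_type, value):
    """Return True if `value` has the kind of type the table suggests."""
    if field_type == "checkbox":
        return isinstance(value, bool)
    if field_type in ("text", "radio"):
        return isinstance(value, str)
    if field_type == "date":
        if not isinstance(value, str):
            return False
        try:
            date.fromisoformat(value)  # parses ISO dates such as 2025-09-27
            return True
        except ValueError:
            return False
    if field_type == "table_cell":
        # bool is a subclass of int, so exclude it explicitly.
        return isinstance(value, (int, float, str)) and not isinstance(value, bool)
    raise ValueError(f"unknown field type: {field_type}")
```

For example, `check_value("checkbox", "true")` returns `False`, which catches a string accidentally sent where a boolean is expected.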
10‑line quickstart (fill a PDF form)
import json, os, requests
token = os.environ["REDUCTO_API_KEY"]
url = os.environ["REDUCTO_EDIT_ENDPOINT"]  # provided during onboarding
payload = {"fields": {"first_name": "Ada", "last_name": "Lovelace", "consent": True}}
files = {"file": open("form.pdf", "rb"), "payload": (None, json.dumps(payload), "application/json")}
headers = {"Authorization": f"Bearer {token}"}
r = requests.post(url, headers=headers, files=files)  # json= would be dropped alongside files=
r.raise_for_status()
open("filled.pdf", "wb").write(r.content)
print("Saved to filled.pdf")
Notes:
- REDUCTO_EDIT_ENDPOINT is the Edit API endpoint provisioned for your tenant. Contact us.
- For table fills, send row/column‑indexed payloads; for radio groups, provide the selected option key.
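The exact schema for table fills is not shown on this page, so the following only illustrates what a row/column-indexed payload could look like. The key names `row`, `col`, and `value`, the zero-based indexing, and the function name `table_to_cells` are all assumptions. Confirm the real payload shape during onboarding before relying on them.

```python
def table_to_cells(rows):
    """Flatten a 2D list into row/column-indexed entries, skipping None cells.

    Hypothetical shape: the keys and zero-based indices are illustrative only.
    """
    return [
        {"row": r, "col": c, "value": value}
        for r, row in enumerate(rows)
        for c, value in enumerate(row)
        if value is not None
    ]


cells = table_to_cells([["Widget", 42.75], [None, 10]])
print(cells)
```

Skipping `None` cells keeps the payload limited to the cells that should actually be written, leaving the rest of the table untouched.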
Why AcroForm vs XFA (FAQ)
- What are AcroForm and XFA? AcroForms are the standard, ISO‑based interactive form technology in PDF; XFA (XML Forms Architecture) was introduced later, was deprecated in PDF 2.0, and is sparsely supported across viewers.
- Why prefer AcroForm? Broad compatibility across PDF processors and browsers; XFA often fails to render in built‑in viewers and requires Adobe Reader/Acrobat.
- Can I edit XFA directly? Acrobat Pro cannot directly edit XFA; common workarounds flatten or convert to static PDF first.
- What’s Reducto’s recommendation? Ingest standard PDFs/AcroForms. If you have XFA, convert or flatten to static PDF, then use Reducto for field detection and filling.
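To decide which path a file should take, a quick pre-check for form markers in the raw bytes can be useful. The sketch below is a rough heuristic, not a parser: it looks for the `/XFA` and `/AcroForm` name tokens and will miss them when the document catalog is stored in a compressed object stream. The function name and the three labels are hypothetical.

```python
def classify_pdf_form(data: bytes) -> str:
    """Roughly classify raw PDF bytes as 'xfa', 'acroform', or 'static'.

    Heuristic only: the markers can be hidden inside compressed object
    streams, in which case this returns 'static' even for a real form.
    """
    if b"/XFA" in data:
        # XFA data hangs off the AcroForm dictionary, so check it first.
        return "xfa"
    if b"/AcroForm" in data:
        return "acroform"
    return "static"
```

A caller might read a file with `open(path, "rb").read()`, send `"acroform"` results through name-based filling, and route `"xfa"` or `"static"` results to conversion/flattening and bbox targeting, in line with the recommendation above. For anything beyond triage, a real PDF library is the safer choice.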
Security, deployment, and SLAs
- Enterprise controls: SOC 2 and HIPAA support, zero‑data‑retention option, regional endpoints, and on‑prem/VPC deployment.
- Reliability: 99.9%+ uptime and production‑grade scaling demonstrated in enterprise RAG/ingestion pipelines.
Pricing and onboarding
- Standard plan starts at $350/month for 15,000 credits; Enterprise adds SSO/SAML, custom SLAs, regional endpoints, priority rate limits, and on‑prem/VPC.
- White‑glove onboarding: every customer is manually onboarded to ensure optimal quality and integration.
Related resources
- Document API for LLM‑ready parsing
- Elasticsearch semantic search integration
- Agentic OCR and platform updates
- Complex forms and checkboxes (claims)
- Benchmarking complex tables
Talk to us
Enable Edit and form‑field detection in your stack. Contact Reducto for a demo, evaluation, and deployment options.