If you process large documents (long contracts, multi-section reports, bulk scanned batches), you've probably hit a file size limit at some point. The usual workaround is to split the PDF, upload each piece, and stitch the results back together. It works, but it's tedious.
FormX used to cap uploads at 10MB. That limit is gone.
You can now upload files up to 500MB and 1,000 pages, either through the FormX workspace UI or through a new upload API endpoint.
The extraction works the same way it always has. You upload the document, FormX runs OCR and extraction on every page, and you get structured output. The difference is you no longer need to split large files before uploading.
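If you want to catch oversized files before they leave your machine, a small pre-flight check against the two limits above is enough. This is a minimal sketch, not part of FormX: `check_upload_limits` is a made-up helper, the page count has to come from whatever PDF tooling you already use, and it assumes "500MB" means 500,000,000 bytes (the stricter reading).

```python
import os

# Limits taken from the announcement. Treating "500MB" as decimal megabytes
# is an assumption; it is the stricter of the two common readings.
MAX_BYTES = 500 * 1000 * 1000
MAX_PAGES = 1000


def check_upload_limits(path, page_count=None):
    """Return a list of human-readable problems; an empty list means OK.

    page_count is optional because counting pages needs a PDF library,
    which this sketch deliberately leaves out.
    """
    problems = []
    size = os.path.getsize(path)
    if size > MAX_BYTES:
        problems.append(f"{path} is {size} bytes, over the {MAX_BYTES}-byte limit")
    if page_count is not None and page_count > MAX_PAGES:
        problems.append(f"{path} has {page_count} pages, over the {MAX_PAGES}-page limit")
    return problems
```

Files that fail the check are the only ones that still need splitting.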
This is most useful for the kinds of documents mentioned earlier: long contracts, multi-section reports, and bulk scanned batches.
For programmatic uploads, there's a new API endpoint that handles large files.
It uses a pre-signed URL: you request an upload URL, then PUT the file directly to that URL. This avoids the timeouts a single large POST request can run into.
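Here is roughly what that two-step flow looks like in Python, using only the standard library. Treat it as a sketch under assumptions: the endpoint URL, the bearer-token header, the `filename` request field, and the `upload_url` response field are all placeholders, so check the FormX API reference for the real names.

```python
import json
import os
import urllib.request

# Placeholder endpoint; the real path is in the FormX API reference.
REQUEST_UPLOAD_URL = "https://api.example.invalid/large-file/upload-url"


def upload_large_file(path, access_token, urlopen=urllib.request.urlopen):
    """Request a pre-signed URL, then PUT the file to it. Returns the PUT status."""
    # Step 1: ask the API for a pre-signed upload URL.
    request_url = urllib.request.Request(
        REQUEST_UPLOAD_URL,
        data=json.dumps({"filename": os.path.basename(path)}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urlopen(request_url) as resp:
        upload_url = json.load(resp)["upload_url"]  # assumed field name

    # Step 2: PUT the file straight to the pre-signed URL. Passing the open
    # file object streams it from disk instead of loading it into memory;
    # the explicit Content-Length keeps urllib from using chunked encoding.
    with open(path, "rb") as f:
        put = urllib.request.Request(
            upload_url,
            data=f,
            headers={"Content-Length": str(os.path.getsize(path))},
            method="PUT",
        )
        with urlopen(put) as resp:
            return resp.status
```

The `urlopen` parameter is only there so the function can be exercised without a network connection. In normal use you would call `upload_large_file("contract.pdf", token)` and leave it at the default.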
Bigger files take longer to process. A 1,000-page document will not return results as quickly as a 5-page one. Plan accordingly if you're building this into an automated pipeline.
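One way to plan for that in a pipeline is to poll for the result with a growing delay and an overall deadline, rather than blocking on one long request. The sketch below assumes results are fetched asynchronously, which this post does not confirm. `fetch_status` is a hypothetical callable you would write against the actual API, and the `"done"` and `"failed"` status values are invented for illustration.

```python
import time


def wait_for_result(fetch_status, timeout_s=1800.0, initial_delay_s=2.0,
                    max_delay_s=60.0, sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until the job finishes, fails, or the deadline passes.

    fetch_status is a hypothetical callable returning a dict with a
    "status" key; the status values checked here are assumptions.
    """
    deadline = clock() + timeout_s
    delay = initial_delay_s
    while True:
        job = fetch_status()
        if job["status"] == "done":
            return job
        if job["status"] == "failed":
            raise RuntimeError(f"extraction failed: {job}")
        # Give up if the next wait would run past the deadline.
        if clock() + delay > deadline:
            raise TimeoutError(f"no result after {timeout_s} seconds")
        sleep(delay)
        # Double the delay each round, capped at max_delay_s.
        delay = min(delay * 2, max_delay_s)
```

Short documents finish within the first couple of polls, while a 1,000-page file settles into one request per minute instead of hammering the API.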