Businesses can use Robotic Process Automation (RPA) software to automate operations and eliminate tedious tasks, which reduces costs. These tools can simulate human digital activities such as data entry, validation and processing. UiPath is a popular choice among RPA solutions. However, data extraction is difficult to automate for non-standard document formats such as receipts and invoices.
FormX is an AI-based intelligent data capture solution that can understand the context in documents. By integrating FormX into the UiPath workflow, you can bring its advanced extraction models into your robot process. FormX enables automations to read and understand data intelligently from complex document types.
In this article, we will show you how to enhance RPA workflows with the FormX OCR solution.
In this sample project, the bot will look into a folder of receipt images and extract the transaction dates and amounts into an Excel spreadsheet.
You can download the UiPath project via this link: https://go.formx.ai/uipath-receipt
To try the workflow, follow the steps below.
In the FormX portal, click “Add New Form” and then “My Document don’t have a fixed format” to create an extractor. Select the “Receipt” model under document type, then pick the items you want to extract.
In this example, we will choose the “Total Amount” and “Date”.
Press “Save” to save the extractor.
In UiPath Studio, click “Manage Packages” and install the “RestSharp” package. This package will help us connect the workflow to the FormX API.
First, add a “Try Catch” activity.
Inside the “Try Catch” activity, create an “Invoke Code” activity.
Select “VBNet” as the language.
Create “endpoint”, “form_id”, “access_token”, “inn” and “outt” in the argument list.
Your code editor should look like the following.
Create the “endpoint”, “form_id”, and “access_token” variables in the “Variables” panel, in the scope of the main sequence.
Copy the access token and Form ID from the portal and put them into the variables.
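For readers who want to see what the “Invoke Code” activity has to do, independent of UiPath, the request can be sketched in Python. This is a minimal sketch and not the VB.NET code shipped in the sample project. The header names `X-WORKER-TOKEN` and `X-WORKER-FORM-ID`, the raw image bytes as the request body, and the `image/jpeg` content type are assumptions that should be checked against the FormX API documentation. `build_extract_request` and `call_formx` are hypothetical helper names.

```python
import urllib.error
import urllib.request


def build_extract_request(endpoint, form_id, access_token, image_bytes):
    """Build (but do not send) a POST request carrying one receipt image."""
    headers = {
        # Assumed header names; confirm them in the FormX API documentation.
        "X-WORKER-TOKEN": access_token,
        "X-WORKER-FORM-ID": form_id,
        # Assumed content type; use the type that matches your image files.
        "Content-Type": "image/jpeg",
    }
    return urllib.request.Request(
        endpoint, data=image_bytes, headers=headers, method="POST"
    )


def call_formx(endpoint, form_id, access_token, image_bytes, timeout=60):
    """Send the request and return the raw JSON text of the response."""
    request = build_extract_request(endpoint, form_id, access_token, image_bytes)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read().decode("utf-8")
    except urllib.error.URLError as exc:
        # Plays the role of the "Try Catch" activity: fail with a readable message.
        raise RuntimeError(f"FormX request failed: {exc}") from exc
```

In this sketch the three configuration values correspond to the “endpoint”, “form_id” and “access_token” arguments, and the returned text corresponds to what the workflow stores in “outt”.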
Use the “Browse for folder” and “Select file” actions to read the image files.
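The folder-scanning step can be illustrated outside UiPath with a short Python sketch. The set of accepted extensions and the helper name `list_receipt_images` are assumptions made for illustration; adjust them to the image formats you actually process.

```python
from pathlib import Path

# Assumed set of extensions; adjust to the formats your extractor accepts.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_receipt_images(folder):
    """Return the image files directly inside folder, sorted by path."""
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )
```

Each returned path can then be read as bytes and sent to the extractor one at a time, which mirrors the loop over files in the workflow.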
The JSON response from FormX will be stored in the variable “outt”. Handle the response by deserializing the JSON object and then passing the values into the next step in the sequence.
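As an illustration of the deserialization step, the Python sketch below pulls the two values out of a response. The response shape used here is hypothetical: the key names `fields`, `date` and `total_amount` are assumptions, and the real FormX response may name or nest them differently, so inspect an actual response before relying on them.

```python
import json

# Hypothetical response shape, for illustration only.
SAMPLE_RESPONSE = (
    '{"status": "ok", "fields": {"date": "2021-03-15", "total_amount": 42.5}}'
)


def parse_receipt(raw_json):
    """Deserialize the response text and keep only the date and total amount."""
    payload = json.loads(raw_json)
    fields = payload.get("fields", {})
    # Missing keys become None so that one bad receipt does not stop the run.
    return {
        "date": fields.get("date"),
        "total_amount": fields.get("total_amount"),
    }
```

Returning `None` for missing values, rather than raising an error, lets the rest of the sequence decide how to report receipts that could not be read.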
To save the resulting values into an Excel spreadsheet, the sequence will look something like this.
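Outside UiPath, the same final step can be approximated in Python. Note the substitution: this sketch writes a CSV file, which Excel can open, instead of a native Excel workbook, so that it needs only the standard library. The column names and the helper name `write_results` are assumptions made for illustration.

```python
import csv


def write_results(rows, csv_path):
    """Write one line per receipt: file name, date and total amount."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["file", "date", "total_amount"])
        for row in rows:
            writer.writerow([row["file"], row["date"], row["total_amount"]])
```

Each item in `rows` is expected to be a dictionary with the keys `file`, `date` and `total_amount`, one per processed receipt.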
In this article, we showed how your RPA bot can use the extraction capabilities of FormX to process complex document types intelligently.
Besides receipts and invoices, businesses also use FormX to capture data from identity documents, licenses, certificates, address proofs, and many other forms. FormX is also optimized for mobile-captured images. Its state-of-the-art machine learning model can accurately extract information from images despite differences in lighting conditions, skew and blurriness.
You can download the UiPath sample project to get started: https://go.formx.ai/uipath-receipt
Contact us to schedule a demo today to bring automation to your work!