Industry Insights

Thoughts from our CEO

The Importance of Capture in Document-Based Process Automation

Capture is dead, long live capture!

 

History of capture
Twenty-five years ago, turning a paper document into a digital image was a rather big challenge. One needed unwieldy devices called scanners, plus document capture software. The capture software would create an image and typically store it in TIFF format. Then, before these captured images could be put into an electronic archive, a person had to add index data manually. This was known as ‘late’ capture: ‘late’ because paper documents were captured and electronically archived only at the very end of an entirely paper-based (manual) process. The whole purpose was simply to replace paper-based filing systems with an electronic archive for easy storage and retrieval. Such were the times!

Capturing forms and data developed relatively early, in the 1990s, and was initially called data capture to differentiate it from document capture, since only the data was of interest and the images were often discarded. Different technologies existed, each focused on a particular kind of content: machine-printed text, handwriting, or special information (e.g. checkboxes).

Around 15 years ago, ‘early’ capture began to surface, meaning that documents are captured (scanned) immediately upon arrival at the organization. Again an image is produced, but advances in technology mean that OCR is applied to obtain the textual content. The normalization standard is typically PDF/A, which keeps the image and stores its text data in a separate layer. ‘Early’ capture is widespread nowadays. It is called digitization and is the important first step toward complete digital processing.

From digitization to intelligent document processing (IDP)
With the arrival of the internet, it became possible to scan documents from any office with so-called desktop scanners, and the most prevalent ‘documents’ became emails. Capturing the body of an email is straightforward. Email attachments, however, are an entirely different matter: the variety of attachment formats is mind-boggling, and that’s before we even get into emails attached to other emails, or attachments that are archive files (of which there are again multiple types: .zip, .rar, etc.). Recognizing the formats and normalizing them into one single standard is the first challenge.
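To make the attachment problem concrete, here is a minimal sketch in Python, using only the standard library, of the recognition step: it walks an email, descends into attached emails and .zip archives, and identifies each leaf document by its leading bytes rather than by its file name. The signature table and the function names (`sniff`, `inventory_email`) are illustrative assumptions, not part of any particular capture product. .rar archives are only detected here, since unpacking them needs a third-party tool, and a real system would also need size limits and special handling for Office files, which are themselves zip containers.

```python
import io
import zipfile
from email import message_from_bytes, policy

# Illustrative signature table (an assumption, far from complete).
SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"II*\x00", "tiff"),          # little-endian TIFF
    (b"MM\x00*", "tiff"),          # big-endian TIFF
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"PK\x03\x04", "zip"),
    (b"Rar!\x1a\x07", "rar"),
]


def sniff(data: bytes) -> str:
    """Guess a format from the leading bytes, ignoring the file name."""
    for signature, name in SIGNATURES:
        if data.startswith(signature):
            return name
    return "unknown"


def iter_leaf_documents(data: bytes, name: str):
    """Yield (path, format) pairs, unpacking zip archives recursively."""
    fmt = sniff(data)
    if fmt == "zip":
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            for info in archive.infolist():
                if not info.is_dir():
                    member_path = f"{name}/{info.filename}"
                    yield from iter_leaf_documents(archive.read(info), member_path)
    else:
        yield name, fmt


def inventory_email(raw: bytes) -> list:
    """List every attached document in an email, however deeply nested."""
    message = message_from_bytes(raw, policy=policy.default)
    found = []
    for part in message.walk():
        # walk() descends into multipart containers and attached emails,
        # so only the leaf parts need to be inspected here.
        if part.is_multipart():
            continue
        if part.get_content_disposition() != "attachment":
            continue  # skip message bodies
        payload = part.get_payload(decode=True) or b""
        found.extend(iter_leaf_documents(payload, part.get_filename() or "unnamed"))
    return found
```

Calling `inventory_email` on the raw bytes of a message returns a flat list of (path, format) pairs. Normalization into a single standard such as PDF/A would then be a separate step applied to each entry.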

With a personal scanner in everyone's pocket and data entering organizations in an ever greater variety of formats, organizations need to be able to extract data from a wide range of sources, turn that data into information and gather insights.

The criticality of capture
Nowadays capture is certainly not what generates the buzz; everyone is talking about AI. But capture is critical to get right, and part of this is leveraging the synergies between capture and artificial intelligence. Organizations should be aware of the challenges associated with relying solely on artificial intelligence for data extraction and classification.

Ultimately, customers want to choose how they communicate with organizations (format, channel), and organizations want to remove friction from customer-centric processes. To achieve this, organizations must have a system and process in place that treats every ingested document in the same manner. Only this centralized, standardized normalization approach to capture keeps processing errors downstream to a minimum.

Of course, I would not write this if the products and solutions from TCG Process could not fully deliver the necessary functionality.

Capture is neither a thing of the past nor becoming less important; quite the contrary. Capture remains a critical aspect of intelligent document processing (IDP) and digitization efforts, serving as the foundation for automating and optimizing business processes in organizations that are driven by document-based information flow. By implementing effective capture strategies, businesses can unlock the full potential of their data and leverage it to drive digital transformation.

Pay attention to proper capture. TCG Process’ strong roots in capture and IDP mean we know how important it is to get this right. Otherwise you pay for it somewhere downstream, where correction costs explode, process times increase dramatically, and customer satisfaction suffers. Long live capture!

About Arnold
Arnold von Büren is a Swiss entrepreneur with three decades of experience in capture and input management. He was a founding member of DICOM Group plc and played an instrumental role in the acquisition of Kofax, Inc. USA, becoming Kofax CEO in 2000. From 2003 to 2006 Arnold was CEO of DICOM Group plc. Since 2007 he has been CEO of TCG Process, providing leading process automation software to businesses of all sizes and growing the company into a global organization with more than a dozen subsidiaries across Europe, the Americas and Asia Pacific.