Leveraging AI Models to Improve IDP Outcomes

In our previous post, we discussed how we are still in the early stages of understanding the potential impact of newer AI models on process automation, and how, given this uncertainty, a platform that allows you to take advantage of different models for different use cases offers a future-proofing benefit. In this blog post we’ll dive a bit deeper into those various AI models and how different applications can optimize intelligent document processing (IDP) outcomes.

Types of AI models & deployment options

Large Language Models (LLMs) & Generative AI
Large language models (LLMs) are a class of natural language processing (NLP) models trained on massive datasets. The LLMs people are likely most familiar with today are OpenAI’s GPT-3 and GPT-4, thanks to the everyday, consumer-friendly interface of their chatbot, ChatGPT. The latest version of the Bing search engine uses a next-generation LLM from OpenAI that has been customized for search, and Google’s Bard is based on its Pathways Language Model (PaLM).

Generative AI refers to the content-creating functionality of these systems, and covers not just chat but also image and video creation (e.g. DALL-E, Midjourney).

Why is this distinction important for document-based process automation? Decoupling LLMs from the content-creating functionality of the system means that a solution can benefit from their language understanding without taking on some of the downsides, such as the fact that generative AI services are prone to hallucinations. It also means that results can be combined from multiple models, or combined with traditional capture, to catch these falsehoods, as in the sketch below.
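
To make this concrete, here is a minimal sketch, in Python, of one way an LLM extraction could be cross-checked against traditional rule-based capture before a value is trusted. The llm_extract_total function is a hypothetical stand-in for whatever LLM service a solution actually calls; only the rule-based pattern and the comparison logic are shown concretely.

    import re
    from typing import Optional

    def regex_extract_total(ocr_text: str) -> Optional[str]:
        """Traditional capture: pull an invoice total with a simple pattern."""
        match = re.search(r"Total\s*[:\-]?\s*\$?([\d,]+\.\d{2})", ocr_text, re.IGNORECASE)
        return match.group(1).replace(",", "") if match else None

    def llm_extract_total(ocr_text: str) -> Optional[str]:
        """Hypothetical wrapper around an LLM call that returns the invoice total.
        In a real solution this would call whichever LLM service is in use."""
        return "1250.00"  # placeholder value for illustration only

    def trusted_total(ocr_text: str) -> Optional[str]:
        """Accept a value only when both extraction methods agree;
        otherwise leave it unset so the document is routed to human review."""
        llm_value = llm_extract_total(ocr_text)
        rule_value = regex_extract_total(ocr_text)
        if llm_value is not None and llm_value == rule_value:
            return llm_value
        return None  # disagreement or missing value -> flag for review

    print(trusted_total("Invoice ... Total: $1,250.00"))  # "1250.00" when both agree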

Niche AI Models
Niche AI models are designed for specific, specialized tasks or domains and are more finely tuned to excel in a particular area. They are pre-trained on a large and diverse dataset and subsequently fine-tuned on a more specific dataset relevant to a particular domain or task. Like LLMs, these are not limited to one organization and are available to connect to business processes for more specialized tasks.
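
As an illustration of that pre-train-then-fine-tune pattern, the sketch below specializes a general-purpose pre-trained transformer on a small domain dataset using the Hugging Face libraries. The file name claims.csv, the base model choice, and the three labels are assumptions for illustration; the exact model and training settings would depend on the task.

    # Sketch: fine-tuning a pre-trained model on a domain-specific dataset.
    # Assumes the Hugging Face "transformers" and "datasets" packages and a
    # hypothetical claims.csv file with "text" and "label" columns.
    from datasets import load_dataset
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              Trainer, TrainingArguments)

    dataset = load_dataset("csv", data_files="claims.csv")["train"]
    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

    def tokenize(batch):
        # Convert raw document text into model inputs.
        return tokenizer(batch["text"], truncation=True, padding="max_length")

    tokenized = dataset.map(tokenize, batched=True)

    # Start from a general-purpose pre-trained model, then specialize it.
    model = AutoModelForSequenceClassification.from_pretrained(
        "distilbert-base-uncased", num_labels=3)  # e.g. claim / invoice / correspondence

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="finetuned-model", num_train_epochs=3),
        train_dataset=tokenized,
    )
    trainer.train()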

Custom AI Models
Unlike LLMs, this approach starts with no pre-trained dataset. Custom models are machine learning models that are tailored and trained specifically for a particular task or domain.

With a custom model, the first step is to label a dataset of documents with the values to be extracted, and then train a model on that dataset.

This approach allows for more targeted and fine-tuned results. Models can be created with relatively small sample sets and refined to address specific issues. It requires some specialist knowledge, works for images as well as text, and can be used for scenarios with poor, variable-quality images (e.g. those captured on a mobile device).
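
As a minimal sketch of this label-then-train workflow, assuming scikit-learn is available, here is a custom classifier trained from scratch (no pre-trained weights) on a small, hand-labelled sample set. The documents, labels, and the choice of a text classifier are illustrative only; a real custom model might equally target image classification or field extraction.

    # Sketch: a custom model trained from scratch on a small labelled sample set.
    # Assumes scikit-learn; the documents and labels below are illustrative only.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Step 1: label a (small) dataset of documents with the values of interest.
    documents = [
        "Invoice number INV-001, total due $540.00",
        "Purchase order PO-7781 for 12 units",
        "Dear claims team, please find my accident report attached",
    ]
    labels = ["invoice", "purchase_order", "claim"]

    # Step 2: train a model on that labelled dataset.
    model = make_pipeline(TfidfVectorizer(), LogisticRegression())
    model.fit(documents, labels)

    # Step 3: refine by adding more labelled samples where the model struggles,
    # then retrain; predictions can now drive routing or extraction rules.
    print(model.predict(["Invoice INV-002, total due $99.00"]))  # expected: ['invoice']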

Hyperscaler Services
The large hyperscalers (Microsoft, Amazon, Google) continue to add document- and data-related services to their cloud platforms. Here, models are continually updated and refined, so it is easy to take advantage of continuous improvement cycles (or to stick with a tried-and-trusted model version).

There are options for both pre-built and custom models, allowing you to choose the best approach for what you are trying to achieve. Some services are only available to run in the cloud, while others can be downloaded and run locally as containers, which may pose limitations for some organizations. Pricing can be somewhat dynamic, as providers reserve the right to change prices at any time.
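
As one hedged example of a pre-built hyperscaler service, the sketch below calls AWS Textract (one of the Amazon services in this category) through the boto3 SDK to analyse a local image. It assumes AWS credentials are already configured and that a file named invoice.png exists; equivalent document services exist on the Microsoft and Google platforms.

    # Sketch: using a hyperscaler's pre-built document service (AWS Textract here).
    # Assumes boto3 is installed, AWS credentials are configured, and a local
    # file named invoice.png exists (hypothetical example document).
    import boto3

    client = boto3.client("textract")

    with open("invoice.png", "rb") as f:
        response = client.analyze_document(
            Document={"Bytes": f.read()},
            FeatureTypes=["FORMS", "TABLES"],  # key-value pairs and table structure
        )

    # Print the detected lines of text; form fields and tables are also present
    # in the response as related "Block" entries.
    for block in response["Blocks"]:
        if block["BlockType"] == "LINE":
            print(block["Text"])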

Where each of these is most appropriate:

The choice of which model to use depends on various factors, including the nature of the data, resource availability, desired accuracy levels, complexity, and data processing requirements. A simple routing sketch follows the list below.

  • LLMs are suitable for analytical processes (e.g. contract analysis), sentiment analysis and classification, content summarization, intelligent search, and document tagging. They are easy to deploy and require minimal fine-tuning; however, they might not be as accurate or domain-specific as custom models.
  • Custom models are beneficial when data quality and compliance are top concerns, or when dealing with specific document types or image classification, as they provide more control over the training process, allowing organizations to tailor models to their specific needs.
  • Hyperscaler services are an excellent alternative to OCR for difficult or noisy documents and those containing handwritten elements, and they allow for fast deployment as they are ready to use with minimal set-up.
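
The snippet below is a simplified sketch of how such a choice might be encoded as routing logic in an orchestration layer; the document attributes, thresholds, and route names are assumptions for illustration only.

    # Sketch: routing a document to the most appropriate model type based on a
    # few of the factors above. Attribute names and routes are illustrative only.
    from dataclasses import dataclass

    @dataclass
    class Document:
        doc_type: str          # e.g. "contract", "invoice", "claim_form"
        has_handwriting: bool  # detected during pre-processing
        image_quality: str     # "good" or "poor"

    def choose_model(doc: Document) -> str:
        if doc.has_handwriting or doc.image_quality == "poor":
            return "hyperscaler_service"   # strong OCR/handwriting for noisy input
        if doc.doc_type == "contract":
            return "llm"                   # analysis, summarization, clause search
        return "custom_model"              # domain-specific extraction/classification

    print(choose_model(Document("contract", False, "good")))    # llm
    print(choose_model(Document("claim_form", True, "poor")))   # hyperscaler_service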


Orchestrating AI for regulatory compliance

With the increasing implementation of governance regulations concerning AI, especially in relation to how companies engage with individuals, we believe that the demand for "explainable AI" will further intensify – meaning there will need to be clarity around how decisions or predictions are made. This trend will likely lead to decreased use of single large custom models that only provide answers, without sufficient explanation of how those answers were derived. Instead, experience in our field tells us there will be a preference for combining multiple smaller models, with the intelligent navigation between those models controlled and managed by an automated orchestration platform.

This same approach can also play a vital role in mitigating mistakes and errors. Ultimately, it is much more expensive to correct a false positive downstream than it is to fix a correctly flagged error early in the process. Orchestration gives organizations the ability to combine and validate different approaches against each other, as well as against known data sources, as in the sketch below. By leveraging this methodology, organizations can ensure greater accuracy and reliability in their operations.
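
As a minimal sketch of that cross-validation idea, assuming two extraction results and a known vendor master list, the logic below only auto-approves a field when the models agree and the value exists in the reference data; anything else is routed to review or rejected. All names and values are illustrative.

    # Sketch: validating results from multiple models against each other and
    # against a known data source before accepting them. Values are illustrative.
    KNOWN_VENDORS = {"Acme Corp", "Globex Ltd"}  # e.g. pulled from an ERP system

    def reconcile_vendor(llm_value: str, capture_value: str) -> dict:
        """Combine two model outputs and a reference list into one decision."""
        if llm_value == capture_value and llm_value in KNOWN_VENDORS:
            return {"vendor": llm_value, "status": "auto-approved"}
        if llm_value in KNOWN_VENDORS or capture_value in KNOWN_VENDORS:
            return {"vendor": llm_value, "status": "needs-review"}  # partial agreement
        return {"vendor": None, "status": "rejected"}  # likely hallucination or misread

    print(reconcile_vendor("Acme Corp", "Acme Corp"))   # auto-approved
    print(reconcile_vendor("Acme Inc", "Acme Corp"))    # needs-review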

----------------------------

Continue reading for more insight on artificial intelligence for human-related decisioning.

Interested in learning more?

Revolutionize your document processing with end-to-end automation
