Your business is more ready for AI than you think

Published: Sep 11, 2024
Author: Brandon Barnett

Businesses have always pursued efficiencies as a means of driving profit margins, and over the past few years of global economic uncertainty in particular, they are increasingly looking to do more with less. The growing accessibility of AI solutions such as Large Language Models (LLMs) has brought AI tools to the forefront of executive-level conversations about opportunities to gain competitive advantages through technology.

This article discusses the value that LLMs can provide any business and addresses two common and critical questions: how can a business get an LLM to reason over its proprietary, non-public data, and how can it get started now without spending millions on data cleansing and warehousing projects?

LLMs enable a more focused workforce

LLMs are designed to process and analyze large volumes of text data, which is a common feature of most corporate datasets – including anything from internal reports and emails to customer feedback and market research. The volume and complexity of this data often make it difficult for human employees to extract actionable insights with any level of agility.

LLMs can streamline this process by:

  • Understanding: LLMs can rapidly analyze unstructured text, making sense of data that would take humans significantly more time to process. For instance, an LLM can quickly sift through thousands of customer feedback entries to identify common themes and sentiments.
  • Recognizing Patterns: By leveraging their deep learning capabilities, LLMs can detect patterns within data that might not be immediately obvious and that could require an unsustainable number of workforce hours to find manually. This includes identifying trends in sales data, spotting anomalies in financial reports, or recognizing recurring issues in product support tickets.
  • Summarizing: LLMs can distill complex information at scale into concise summaries, making it easier for decision makers to extract critical insights on a significantly reduced timeline.

These capabilities translate into direct productivity gains for businesses. By automating the understanding, pattern recognition, and summarization of data, LLMs can reduce the need for extensive manual analysis, freeing up valuable human resources for more strategic tasks.

Reasoning over private data without a team of PhDs

Executives who have been using ChatGPT or similar tools for any length of time have likely been impressed by the quality and speed of the answers. Many wonder how they can have a similar experience with their own proprietary corporate data: asking questions and receiving valuable insights on the fly. Short of hiring a team of data scientists and machine learning experts, how does a business get started?

One answer lies in a software architecture pattern called Retrieval Augmented Generation (RAG), which leverages semantic search and prompt engineering techniques to retrieve relevant information, apply reasoning and generate natural language answers to user questions.
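At a high level, the pattern has three steps: retrieve, augment, and generate. A minimal Python sketch of those steps, with retrieve() as a hypothetical stand-in for a real vector-database search and the final LLM call left as a comment:

```python
def retrieve(question, top_k=3):
    # Placeholder for a semantic search against a vector database;
    # returns the top-k most relevant document excerpts.
    corpus = ["excerpt about spindle vibration",
              "excerpt about coolant maintenance",
              "excerpt about tool changer alignment"]
    return corpus[:top_k]

def build_prompt(question, excerpts):
    # Prompt engineering: instruct the LLM to reason only over the
    # retrieved context rather than its general training data.
    context = "\n\n".join(excerpts)
    return ("Answer the question using only the context below.\n\n"
            f"Context:\n{context}\n\n"
            f"Question: {question}")

def answer(question):
    excerpts = retrieve(question)              # 1. retrieval
    prompt = build_prompt(question, excerpts)  # 2. augmentation
    # 3. generation: in production, send `prompt` to an LLM API here.
    return prompt
```

The heavy lifting happens in the retrieval step; the sections below walk through how the prompt and the search actually work.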

Retrieval Augmented Generation example

Retrieval Augmented Generation is a pattern that combines the concept of prompt engineering – the art of providing additional instructions to an LLM to enhance the quality of its reasoning – with sophisticated data search techniques to provide answers to user queries over a set of non-public data and documents.  

Here’s what an engineered prompt might look like in the context of a manufacturing facility that wants to troubleshoot CNC machine issues:
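Since an engineered prompt is ordinary text, a representative template (an illustrative sketch, not the facility's actual prompt) can be written as a Python string with context and question placeholders:

```python
# An illustrative engineered prompt for CNC troubleshooting. The
# {context} and {question} placeholders are filled in at query time
# with retrieved manual excerpts and the technician's question.
PROMPT_TEMPLATE = """You are a maintenance assistant for a CNC machining facility.
Using only the manual excerpts provided in the context below, give the
technician clear, step-by-step troubleshooting guidance. If the context
does not contain the answer, say so rather than guessing.

Context:
{context}

Question:
{question}"""
```

The instructions at the top steer the LLM's reasoning; the placeholders are where retrieval results are injected.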

In this scenario, the maintenance tech asks the following question based on an observation:

One way to answer this question is to search excerpts from the equipment manuals. In RAG, it is common to store and search such excerpts in a Vector Database, which enables semantic similarity search. This is a more advanced methodology than a traditional keyword search because it finds relevant excerpts that match the question’s semantic intent rather than requiring exact keyword matches.
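The intuition behind semantic similarity search can be sketched with cosine similarity over embedding vectors. The three-dimensional vectors and excerpt titles below are toy values for illustration; real embedding models produce vectors with hundreds or thousands of dimensions:

```python
from math import sqrt

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 means identical
    # direction (semantically similar), 0.0 means unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

# Toy embeddings for a technician's question and three manual excerpts.
question_vec = [0.9, 0.1, 0.2]  # e.g., "spindle vibrates at high RPM"
excerpts = {
    "Spindle vibration troubleshooting": [0.8, 0.2, 0.1],
    "Coolant system maintenance": [0.1, 0.9, 0.3],
    "Tool changer alignment": [0.3, 0.2, 0.9],
}

# Rank manual excerpts by semantic closeness to the question.
ranked = sorted(excerpts,
                key=lambda name: cosine_similarity(question_vec, excerpts[name]),
                reverse=True)
```

A vector database performs essentially this ranking, but at scale and with indexing structures that avoid comparing the question against every stored excerpt.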

Here are three examples of manual excerpts that might have been retrieved from a similarity search based on the tech’s question above:

A RAG application works by replacing the context and question placeholders in the prompt above with the retrieved manual excerpts and the tech’s question, then sending the combined prompt to an LLM for reasoning. In this example, the combined prompt was sent to OpenAI’s GPT-4o model, which produced the following output, highlighting the LLM’s ability to summarize textual data and respond with actionable steps:

The LLM’s value proposition

This interaction represents multiple efficiencies to the troubleshooting process:

  1. The relevant excerpts returned by search are commonly scattered across multiple sections of different manuals. Depending on the facility’s document management capabilities, those manuals may be time consuming for a technician to find, and the technician may have to read through entire sections to determine their relevance to the problem at hand.
  2. The LLM quickly summarizes the disparate content and presents it to the technician in coherent language. It can also include citations in its response in case the technician wants to review any of the excerpts in more detail.
  3. In most production cases, RAG applications support conversations in which the technician can ask follow-up questions, allowing them to iterate through this process and multiplying the time savings over the course of each conversation.
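Follow-up questions work because the application keeps the conversation history and resends it with each turn. A minimal sketch, with generate() as a hypothetical stand-in for the real LLM call:

```python
def generate(messages):
    # Stand-in for a real LLM API call that would receive the full
    # message history and return the model's reply.
    return f"(answer to: {messages[-1]['content']})"

def ask(history, question):
    # Append the question, generate an answer over the whole history,
    # and record the answer so later turns can build on it.
    history.append({"role": "user", "content": question})
    answer = generate(history)
    history.append({"role": "assistant", "content": answer})
    return answer

history = [{"role": "system",
            "content": "Answer using the retrieved manual excerpts."}]
ask(history, "Why is the spindle vibrating at high RPM?")
ask(history, "Which of those checks should I do first?")  # sees prior turns
```

Because each turn carries the full history, the second question can refer back to "those checks" without restating them.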

The scenario presented illustrates the potential time savings that a technician could achieve over one interaction with the application. When extrapolated out over multiple technicians and additional datasets (e.g., SQL or Excel workbooks), this represents an opportunity to significantly reduce the cycle time of documentation searches so that the technicians can spend less time researching and more time solving problems.

Getting started with AI

Given the number of conversations over the past decade about “big data” and the infrastructure required to support machine learning activities, it is natural for business leaders to assume that they cannot execute AI projects until the business is running a highly polished data warehouse or lakehouse.

However, there is no such thing as perfectly “clean” data; data simply needs to be processed into a format consistent with how it will be used, searched, and retrieved. Any business with access to documents (e.g., Excel, Word, or PDFs) or databases (e.g., SQL) that contain valuable insights can get started today by starting small: focus on a single use case or document type, find what does and doesn’t work, and then iterate to expand scope. Going through this exercise, even on a subset of the business’ data universe, requires just enough planning and iteration that it will inform and influence the organization’s data strategy and readiness moving forward.
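Starting small can be as simple as splitting a handful of documents into searchable excerpts. A minimal chunking sketch (the chunk size and overlap values are illustrative defaults, not recommendations):

```python
def chunk_text(text, chunk_size=500, overlap=100):
    # Split a document into overlapping chunks so each excerpt is
    # small enough to embed and search, while the overlap preserves
    # context that spans chunk boundaries.
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

Each chunk can then be embedded and stored in a vector database, with the chunking strategy refined as you learn what retrieval quality your use case needs.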

Consider whether your business has access to the following types of data that are commonly used in RAG applications:

  • Technical Manuals and User Guides - Used to answer technical queries, troubleshoot issues, or guide users through complex processes.
  • Inventory and Asset Management Workbooks/Tables - Provide real-time information on stock availability, asset tracking, and inventory turnover for supply chain management and operational planning.
  • Contracts and Agreements - Assist in clarifying contract terms, obligations, and legal responsibilities when answering legal or compliance-related questions.
  • Maintenance and Inspection Logs - Provide insights into maintenance history, upcoming inspections, and equipment reliability.
  • Customer Support Logs and CRM Data - Provide context for customer service interactions and generate personalized responses based on past interactions.

Conclusion

The pursuit of business efficiencies has never been more critical, especially in today’s climate of economic uncertainty. Leveraging Large Language Model technologies on proprietary data is not only feasible but can be initiated with the data resources already at hand. This approach allows businesses to take immediate action, capitalizing on AI's potential without the barrier of large-scale data cleansing or warehousing projects.

Choose a use case for your business and get started today!