Why RAG isn’t enough for GenAI apps

Retrieval Augmented Generation (RAG) is a popular, low-cost technique to boost GenAI response quality. But for many use cases, it still falls short.

John Kanarowski

Aug 5, 2024

Retrieval Augmented Generation, or RAG, is a technique that many teams use to feed up-to-date and use case-specific context to generalized Large Language Models (LLMs).

However, there are many use cases in which RAG isn’t enough to get the results you need. In the most relevant empirical benchmarking study, a research team at Stanford found that domain-specific RAG solutions achieve accuracy rates of only 20% to 65%. The remaining responses are either hallucinated or incomplete. While that is higher than the accuracy of general purpose GenAI, it is still too low to drive user adoption and business ROI. The key takeaway is that RAG alone does not solve the GenAI accuracy problem in domain-specific contexts. In this article, we’ll discuss where RAG falls short and how driving RAG with an entity graph can greatly improve GenAI output accuracy.

The keys to GenAI success

First, before diving into RAG, I want to make a few observations based on our experiences with customers who’ve deployed GenAI solutions.

A “successful” GenAI solution is one that achieves its desired outcomes. These will differ by use case. Agolo’s domain focus is product support GenAI for consumer and industrial manufacturers. In this product support domain, the desired outcomes usually include:

Improved product support due to increased issue resolution and personalization
Better support team experience, enabling them to respond to customer queries more accurately
Critical user feedback supplied to product teams so that they can drive improvements to the product

The good news about GenAI is that it works. Achieving your desired outcomes, however, requires a few key project principles:

Taking a slow, pragmatic approach. You’ll get results given time - but it requires having a robust enterprise data pipeline and working through the initial stumbling blocks. There is no “easy button” solution. This is how we approach projects.
Having clean, well-tagged data. Data is 85% of a working GenAI solution (and you should avoid anyone who tells you otherwise!). This can be tricky, as data is dynamic and your outcomes can change over time. You’ll need to be adaptive and open to making changes as your data changes. We can help you with this.
Iterating. Take the time to develop, deploy, test, and develop again. You’ll know you have your data down once you have a few initial successes under your belt. After those first few successes, your results will begin to snowball as you further iterate and refine your model.

The promise of RAG

These goals are impossible to achieve with a general purpose LLM alone. Most LLMs are built for general use and have been trained on broad, publicly available data. Typically, they haven’t been trained deeply on a company’s products and issues. As a result, they don't know much - if anything - about a company’s product support data.

The data in an LLM also remains dated and limited, despite recent releases. Depending on the LLM, the training data ends at a cutoff that may be two or three years in the past. That limits its usefulness in a day-to-day business context.

RAG is a GenAI application development pattern that can help you achieve your GenAI goals by enhancing an LLM’s general purpose data with more up-to-date and domain-specific supplementary data. This may include, for example, a product catalog, internal documentation on a product, support ticket information, etc.

RAG data is often stored in a vector database and queried before a prompt is sent to an LLM. A GenAI app includes the information returned as additional context in the prompt. The LLM can use this supplementary text to deliver a more accurate, current answer.
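
As a rough illustration of this flow, here is a minimal Python sketch. It is not a production design: the in-memory DOCS dictionary stands in for a vector database, the bag-of-words embed function stands in for a real embedding model, and all document IDs and contents are made up for the example.

```python
import math
import re
from collections import Counter

# Toy document store standing in for a vector database (contents are made up).
DOCS = {
    "kb-101": "To reset the X200 router, hold the reset button for ten seconds.",
    "kb-102": "The X200 router supports firmware updates through the admin page.",
    "kb-103": "Warranty claims for headphones must be filed within one year.",
}


def embed(text: str) -> Counter:
    """Stand-in embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, k: int = 2) -> list:
    """Return the ids of the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(DOCS, key=lambda d: cosine(q, embed(DOCS[d])), reverse=True)
    return ranked[:k]


def build_prompt(query: str, k: int = 2) -> str:
    """Prepend the retrieved text to the question as additional context."""
    context = "\n".join(f"[{d}] {DOCS[d]}" for d in retrieve(query, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


prompt = build_prompt("How do I reset my X200 router?")
print(prompt)
```

The resulting prompt string is what a GenAI app would send to the LLM in place of the bare question.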

Many teams use RAG because it’s a cost-effective way to extend an LLM foundation model without building a custom model solution from the ground up. It also gives developers more control over an LLM’s responses, as they can change the data they send to adjust replies or to respond to changing requirements and circumstances.

The challenges with RAG

However, while RAG can provide a number of advantages in certain use cases, it doesn’t perform as well in others.

One drawback is that RAG struggles in use cases where you have to connect the dots. Simple, naive RAG implementations work best when RAG can pull the answer from a single document source. They don’t work as well when the answer is spread across multiple documents or data sources.

For example, if a solution to a product issue is discussed across numerous Knowledge Base articles, RAG will often limit its response to information in the “closest” match. That can result in an incomplete or even irrelevant answer.
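
The sketch below illustrates this failure mode under simplified assumptions: two made-up Knowledge Base articles, a word-overlap (Jaccard) score standing in for vector similarity, and a retriever that keeps only the single closest match. The closest article explains the cause of the problem, while the fix lives in a second article that a top-1 lookup never returns.

```python
import re

# Two hypothetical Knowledge Base articles that together answer one question.
ARTICLES = {
    "kb-1": "The X200 drops wifi after update 2.1 because of a channel setting bug.",
    "kb-2": "To fix the channel setting bug, install patch 2.1.1 from the admin page.",
}


def tokens(text: str) -> set:
    """Lowercased word set; a crude stand-in for an embedding."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(query: str, text: str) -> float:
    """Jaccard similarity between the query words and the article words."""
    q, t = tokens(query), tokens(text)
    return len(q & t) / len(q | t)


def top_k(query: str, k: int) -> list:
    """Ids of the k articles that best match the query."""
    ranked = sorted(ARTICLES, key=lambda a: overlap(query, ARTICLES[a]), reverse=True)
    return ranked[:k]


question = "Why does my X200 drop wifi after the update?"

closest = top_k(question, 1)[0]
print(closest, "->", ARTICLES[closest])

# The fix ("install patch ...") is not in the closest match at all.
print("mentions the patch:", "patch" in ARTICLES[closest])
```

Raising k helps in this toy case, but with a large corpus the second article has to compete with every other weakly related document, which is why the answer often stays incomplete.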

RAG is also highly dependent upon the search terms used. If the search terms are even slightly off, the retrieval step may not supply the right text chunks from the docs needed to answer the user’s question.

Another related drawback is that RAG lacks holistic understanding. It can’t synthesize a concept across documents or track data through time. If the answer to a question has evolved over time, RAG won’t be able to capture that nuance.

Finally, simple, naive RAG doesn’t have traceability. This can lead to losing valuable context. For example, if the answer to a question in a document contains accompanying images and diagrams, these will likely be detached from the RAG response.

Overcoming the limitations of RAG with an entity graph

The good news is that you can overcome these limitations of RAG by using it in conjunction with a better underlying data model: an entity graph.

An entity graph is a data structure that models the most contextually relevant relationships between an organization’s entities. The nodes are the entities, whereas the edges represent the connections between the entities. We can think of entity graphs as graphs that provide a more comprehensive view of the relationships around the entities critical to your organization.
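
To make the structure concrete, here is a minimal sketch of nodes and edges in Python. The EntityGraph class, the relation names, and the sample entities are all hypothetical and chosen for illustration; they are not Agolo's data model.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class EntityGraph:
    """Entities are nodes; each edge is a labeled relationship between two entities."""

    # Adjacency list: entity -> list of (relation, other entity).
    edges: dict = field(default_factory=lambda: defaultdict(list))

    def add_edge(self, source: str, relation: str, target: str) -> None:
        """Store the edge in both directions so either entity can be a starting point."""
        self.edges[source].append((relation, target))
        self.edges[target].append((f"inverse:{relation}", source))

    def neighbors(self, entity: str) -> list:
        """All (relation, entity) pairs directly connected to the given entity."""
        return list(self.edges.get(entity, []))


graph = EntityGraph()
graph.add_edge("X200 Router", "has_part", "PA-5 Power Adapter")
graph.add_edge("Ticket 4812", "reports_issue_with", "X200 Router")

for relation, other in graph.neighbors("X200 Router"):
    print(relation, "->", other)
```

Starting from a single product entity, one lookup surfaces both its parts and the support tickets that mention it, which is the kind of connection a flat list of text chunks does not expose.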

Unlike traditional knowledge graphs, which are often static and human-curated, entity graphs are built and kept up to date using automation. They disambiguate and normalize terms across references, resolving name variations, partial names, misspellings, and other noise in unstructured data.
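
A simplified sketch of that normalization step is shown below. It assumes a small hand-written alias table plus fuzzy string matching from Python's standard difflib module; the entity names, aliases, and the 0.8 cutoff are illustrative assumptions, and a real system would likely use far more context than string similarity.

```python
import difflib
from typing import Optional

# Hypothetical canonical entities and known aliases (partial names, shorthand).
CANONICAL = ["X200 Router", "PA-5 Power Adapter"]
ALIASES = {
    "x200": "X200 Router",
    "the x200": "X200 Router",
    "pa5 adapter": "PA-5 Power Adapter",
}


def resolve(mention: str, cutoff: float = 0.8) -> Optional[str]:
    """Map a raw mention to a canonical entity name, or None if nothing is close."""
    key = mention.strip().lower()

    # 1. Exact alias lookup handles known name variations and partial names.
    if key in ALIASES:
        return ALIASES[key]

    # 2. Fuzzy matching against canonical names catches misspellings.
    by_lower = {name.lower(): name for name in CANONICAL}
    match = difflib.get_close_matches(key, list(by_lower), n=1, cutoff=cutoff)
    return by_lower[match[0]] if match else None


for raw in ["X200", "x200 routr", "headphones"]:
    print(repr(raw), "->", resolve(raw))
```

Mentions that resolve to the same canonical name end up on the same node, so references scattered across tickets and manuals with different spellings are linked rather than treated as unrelated terms.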

Turning data into an entity graph produces better results than RAG as normally implemented. Because the RAG query can leverage relationships among entities, it can return more relevant and targeted results.

An entity graph RAG query is also more traceable. The query preserves the underlying context, providing data lineage back to source documentation. That means you can preserve critical context such as images, diagrams, and more.

Finally, entity graphs are easy to integrate into an existing RAG-enabled pipeline. You implement an entity graph by using parsing software to read in all of the relevant structured and unstructured data you want to organize - how-tos, tutorials, customer service tickets, parts databases, etc. The parser turns the terms it identifies into entities and relationships, automatically populating the entity graph. User queries are passed to the graph, which then returns relevant chunks of text based upon the entities and relationships that answer the query. That content is passed back to the RAG pipeline, which pushes it to the LLM for a response.
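
The steps above can be sketched end to end as follows. This is a toy illustration under stated assumptions, not Agolo's implementation: a fixed KNOWN_ENTITIES vocabulary stands in for the parser, entities are related when they appear in the same chunk, and the chunk texts and file names are invented. Each returned chunk keeps its source document so the answer can be traced back.

```python
from collections import defaultdict

# Hypothetical parsed input: chunk id -> (source document, text).
CHUNKS = {
    "c1": ("kb-1.pdf", "The X200 drops wifi after update 2.1 because of a channel bug."),
    "c2": ("kb-2.pdf", "Patch 2.1.1 fixes the channel bug. Install it from the admin page."),
    "c3": ("kb-3.pdf", "Headphone warranty claims must be filed within one year."),
}

# Stand-in for a real entity extractor: a fixed vocabulary of known entities.
KNOWN_ENTITIES = ["x200", "update 2.1", "channel bug", "patch 2.1.1", "warranty"]


def extract_entities(text: str) -> set:
    lowered = text.lower()
    return {e for e in KNOWN_ENTITIES if e in lowered}


def build_graph(chunks: dict):
    """Populate entity -> chunks and entity -> co-mentioned entities."""
    mentions = defaultdict(set)
    related = defaultdict(set)
    for chunk_id, (_source, text) in chunks.items():
        found = extract_entities(text)
        for entity in found:
            mentions[entity].add(chunk_id)
            related[entity] |= found - {entity}
    return mentions, related


def graph_retrieve(query: str, mentions, related) -> list:
    """Find entities in the query, expand one hop, and collect their chunks."""
    seeds = extract_entities(query)
    expanded = set(seeds)
    for entity in seeds:
        expanded |= related[entity]
    chunk_ids = sorted({c for e in expanded for c in mentions[e]})
    # Keep the source document with each chunk to preserve lineage.
    return [(c, CHUNKS[c][0], CHUNKS[c][1]) for c in chunk_ids]


def build_prompt(query: str, results: list) -> str:
    context = "\n".join(f"[{src}#{cid}] {text}" for cid, src, text in results)
    return f"Context:\n{context}\n\nQuestion: {query}"


mentions, related = build_graph(CHUNKS)
query = "Why does my X200 drop wifi?"
results = graph_retrieve(query, mentions, related)
prompt = build_prompt(query, results)
print(prompt)
```

In this toy data the query only names the X200, yet the chunk describing the patch is also returned because it shares the "channel bug" entity with the chunk that mentions the X200. The unrelated warranty chunk is left out.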

In short, an entity graph doesn’t replace RAG. It just makes RAG better.

Using Agolo for Entity Graph RAG: An example

The Entity Graph is the heart of Agolo Entity Intelligence. Agolo turns structured and unstructured data into a clean entity graph representation that can power many enterprise AI use cases.

A GraphRAG powered by Agolo can deliver results that other approaches to RAG can’t. For example, one of our customers, a leading consumer technology manufacturer, came to us recently because their GenAI chatbot was yielding inaccurate responses around 50% of the time.

After a deep dive, they discovered the RAG implementation powering the bot couldn’t handle complex documents in multiple formats (like PDF), in multiple languages, and with complex embedded figures such as tables and images. The result was poor adoption due to a lack of trust in the bot’s responses.

Working together, we helped the customer replace their simple RAG implementation with a GraphRAG solution powered by the Agolo Entity Graph trained from all available sources.

The result? The quality of the chatbot and related AI applications went up, with inaccurate responses reduced by 90%. This quality boost increased adoption of the GenAI chatbot across the company.

Conclusion

GenAI use cases require high-quality data that’s up-to-date and relevant to a customer’s query. Simple, naive RAG implementations can only deliver this for the most basic use cases. A RAG implementation powered by an entity graph can deliver greater accuracy and relevance, turning a struggling GenAI rollout into a success story.