Understanding the Path to Graph RAG

Generative AI uses machine learning models, particularly generative models, to produce synthetic data, including text, code, images, audio, and other forms of content, based on patterns learned from existing data. You might have noticed the recent developments in this field, such as the launch of large language models (LLMs) like GPT, Llama, Gemini, and Claude by OpenAI, Meta AI, Google, and Anthropic, respectively. These GenAI models are designed to understand and generate human language. We call them large language models because they have billions of parameters, which allow them to capture complex patterns and nuances in language. These models are trained on massive datasets consisting of text from books, articles, websites, and other sources, allowing them to learn grammar, facts, reasoning abilities, and even some level of context awareness. Chatbots use these models to interact with users in a conversational manner.

Below, we show a demo of a famous chatbot, ChatGPT, which uses GPT as the large language model at the backend. We ask ChatGPT questions, and it generates responses to our questions.

Limitations of LLMs and RAG as a solution

LLMs generally produce correct responses when a question pertains to information present in their training data. If the requested information postdates the training cutoff or was never part of the training data, the models will struggle to provide correct answers.

A chatbot that generates a response to a user query based solely on the data the LLM is trained on

A chatbot that depends solely on the LLM’s training data to generate responses can produce inaccurate or outdated information when dealing with specialized datasets or rapidly changing data domains. A technique known as retrieval-augmented generation (RAG) addresses this issue. Instead of relying solely on the LLM’s static training data, RAG allows the chatbot to dynamically fetch relevant and up-to-date information from external sources, such as databases, knowledge bases, or the web, during the conversation. This data is provided as context to the LLM, along with the user’s query. With the provided context, the LLM can generate up-to-date and contextually accurate answers.

An enhanced chatbot architecture incorporating a retrieval-augmented generation (RAG) approach. The user's query is processed by a data retriever, which fetches relevant information from external sources. This retrieved data is combined with the original query to form an augmented input, which is then processed by the pretrained LLM. The final response, grounded in up-to-date information, is returned to the user through the chatbot interface.
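The retrieve, augment, and generate stages described above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the keyword-overlap retriever, the prompt wording, and the names `retrieve`, `build_augmented_prompt`, `answer_with_rag`, and `echo_llm` are all assumptions made for this example, and `echo_llm` is only a stand-in for a real model call.

```python
import re
from typing import Callable


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Toy retriever: rank documents by the number of words shared with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_augmented_prompt(query: str, context: list[str]) -> str:
    """Combine the retrieved passages and the user's query into one prompt."""
    context_block = "\n".join(f"- {passage}" for passage in context)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {query}"
    )


def answer_with_rag(query: str, documents: list[str], llm: Callable[[str], str]) -> str:
    """Retrieve context, augment the query with it, then let the LLM generate."""
    context = retrieve(query, documents)
    return llm(build_augmented_prompt(query, context))


def echo_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call: returns the prompt it receives."""
    return prompt


documents = [
    "The Eiffel Tower was completed in 1889.",
    "The Eiffel Tower stands 330 meters tall.",
    "Central Park is a large public park in New York City.",
]
query = "How tall is the Eiffel Tower?"

# Prints the augmented prompt that a real LLM would receive.
print(answer_with_rag(query, documents, echo_llm))
```

In a real chatbot, `echo_llm` would be replaced by a call to whichever model the application uses, and the retriever would query a database, knowledge base, or the web.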

Basic RAG vs. knowledge graph-based RAG

In basic RAG, we typically retrieve raw text passages that closely match the user’s query from a database or document corpus. These retrieved texts are then provided as context to LLMs, which use them to generate content. This approach is effective for enabling LLMs to produce more accurate and relevant responses by grounding their generation in the provided context, which might belong to a specialized dataset or new data from the web that falls outside the training data.

Basic retrieval-augmented generation process: The retriever fetches unstructured raw text relevant to the user’s query, which is then provided as context to a pretrained LLM for generating responses.
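To make "passages that closely match the user's query" concrete, the sketch below ranks passages by cosine similarity between simple word-count vectors. Treat it as an assumption-laden illustration: practical systems usually compare dense embeddings rather than word counts, and the names `bag_of_words`, `cosine_similarity`, and `top_k_passages` are invented for this example.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Represent a text as a count of its lowercase word tokens."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_passages(query: str, passages: list[str], k: int = 1) -> list[str]:
    """Return the k passages most similar to the query."""
    query_vector = bag_of_words(query)
    ranked = sorted(
        passages,
        key=lambda passage: cosine_similarity(query_vector, bag_of_words(passage)),
        reverse=True,
    )
    return ranked[:k]


passages = [
    "Sarah saw the Statue of Liberty, which was completed in 1886.",
    "Sarah visited the Empire State Building, which was completed in 1931.",
    "Central Park is a large public park in New York City.",
]
query = "When was the Empire State Building completed?"

print(top_k_passages(query, passages, k=1))
```

Only the highest-scoring passages reach the LLM, so any fact that lives in a lower-scoring passage is simply absent from the context.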

While basic RAG is a valuable method for enhancing an LLM’s ability to generate context-based responses, the model remains vulnerable to hallucination. Unstructured raw text as context can be ambiguous and open to multiple interpretations. Additionally, processing large volumes of unstructured text can be computationally intensive and less efficient. In contrast, graph-based RAG offers a more structured and semantically rich approach by leveraging knowledge graphs. Knowledge graphs represent information in a highly organized format, consisting of entities (such as people, places, or concepts) and the relationships between them. Instead of retrieving raw text, graph-based RAG retrieves relevant entities and their interconnections from a knowledge graph. This structured data is then provided as context to the LLM, enabling the model to generate responses that are not only accurate but also contextually coherent.

Knowledge graph-based retrieval-augmented generation process: The retriever fetches structured data from a knowledge graph, including entities and their relationships. This structured context is then provided to a pretrained LLM for generating more accurate and contextually coherent responses.
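A rough sketch of graph-based retrieval follows, assuming the knowledge graph is stored as a list of (subject, relation, object) tuples. The retriever keeps every tuple whose subject or object is mentioned in the query and serializes the result as context lines. The substring matching and the function names `retrieve_triples` and `triples_to_context` are simplifications chosen for this example; a real system would typically use entity linking and a graph database query.

```python
Triple = tuple[str, str, str]


def retrieve_triples(query: str, triples: list[Triple]) -> list[Triple]:
    """Keep triples whose subject or object name appears in the query text."""
    query_text = query.lower()
    return [
        (subject, relation, obj)
        for subject, relation, obj in triples
        if subject.lower() in query_text or obj.lower() in query_text
    ]


def triples_to_context(triples: list[Triple]) -> str:
    """Serialize triples as one short statement per line for the LLM prompt."""
    return "\n".join(f"{subject} {relation} {obj}." for subject, relation, obj in triples)


triples = [
    ("Eiffel Tower", "located in", "Paris"),
    ("Eiffel Tower", "designed by", "Gustave Eiffel"),
    ("Louvre", "located in", "Paris"),
]
query = "Who designed the Eiffel Tower?"

context = triples_to_context(retrieve_triples(query, triples))
print(context)
```

The resulting context contains explicit entity-relationship statements instead of free-form passages, which is the structured input the LLM receives in graph-based RAG.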

Formally defining a knowledge graph (KG)

A knowledge graph is a structured representation of knowledge that captures entities (objects) and their relationships in a graph format. It uses nodes to represent entities and edges to represent the relationships between them. The following are the three key components of a knowledge graph:

  • Entities: These are the nodes or vertices in the graph. They represent real-world objects or concepts, such as people, places, or items.

  • Relationships: These are edges or links connecting the nodes. They define the type of relationship between entities.

  • Attributes: These are properties or characteristics of entities that provide additional information about them.

Consider the following information in the raw text format:

Information in raw text format: "The Eiffel Tower is a famous landmark located in Paris, France. It was designed by Gustave Eiffel and completed in 1889. The tower stands 330 meters tall and attracts millions of tourists every year."

In structured format, this information looks like the following:

List of entities

['The Eiffel Tower', 'Landmark', 'Paris, France', 'Gustave Eiffel', '1889', '330 meters', 'Tourists']

Relationship tuples list

[('The Eiffel Tower', 'is a', 'Landmark'),
('The Eiffel Tower', 'located in', 'Paris, France'),
('The Eiffel Tower', 'designed by', 'Gustave Eiffel'),
('The Eiffel Tower', 'completed in', '1889'),
('The Eiffel Tower', 'height', '330 meters'),
('The Eiffel Tower', 'attracts', 'Tourists')]

Attributes

{'Year of Completion': '1889',
'Height': '330 meters'}

Visual representation of a knowledge graph
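One way to hold the three components in code is sketched below: entities as nodes that carry an attribute dictionary, and relationships as (subject, relation, object) tuples. The `Entity` and `KnowledgeGraph` classes and their method names are hypothetical, introduced only to illustrate the structure; they are not the API of any particular graph library.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entity:
    """A node in the graph, with optional descriptive attributes."""
    name: str
    attributes: dict[str, str] = field(default_factory=dict)


@dataclass
class KnowledgeGraph:
    """Entities keyed by name, plus a list of relationship tuples (edges)."""
    entities: dict[str, Entity] = field(default_factory=dict)
    relationships: list[tuple[str, str, str]] = field(default_factory=list)

    def add_entity(self, name: str, attributes: Optional[dict[str, str]] = None) -> Entity:
        """Create the entity if needed and merge in any attributes."""
        entity = self.entities.setdefault(name, Entity(name))
        entity.attributes.update(attributes or {})
        return entity

    def add_relationship(self, subject: str, relation: str, obj: str) -> None:
        """Add an edge, creating both endpoint entities if they are missing."""
        self.add_entity(subject)
        self.add_entity(obj)
        self.relationships.append((subject, relation, obj))

    def neighbors(self, name: str) -> list[tuple[str, str]]:
        """Return (relation, object) pairs for edges leaving the given entity."""
        return [(rel, obj) for subj, rel, obj in self.relationships if subj == name]


kg = KnowledgeGraph()
kg.add_entity("The Eiffel Tower", {"Year of Completion": "1889", "Height": "330 meters"})
kg.add_relationship("The Eiffel Tower", "is a", "Landmark")
kg.add_relationship("The Eiffel Tower", "located in", "Paris, France")
kg.add_relationship("The Eiffel Tower", "designed by", "Gustave Eiffel")

print(kg.entities["The Eiffel Tower"].attributes)
print(kg.neighbors("The Eiffel Tower"))
```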

Benefits of graph RAG over basic RAG

The benefits of graph RAG over basic RAG lie in the richness of the response to a query. This is due to the difference in the context retrieval process, as shown in the illustration below.

Difference between basic RAG and graph RAG retrieval process

While basic RAG retrieves the most relevant text fragments based on similarity scores, it often overlooks important connections or related information present in less relevant sections of the data. Graph RAG, on the other hand, uses a knowledge graph to retrieve context by leveraging relationships between entities. This ensures that the LLM has access to a broader, more interconnected view of the data, leading to:

  • Holistic understanding: Graph RAG considers the full structure of the data, not just isolated text snippets.

  • Reduced ambiguity: The relationships in the graph help clarify meanings and reduce hallucinations.

  • Comprehensive responses: Instead of providing fragmented information, graph RAG offers a more thorough and complete response by pulling in interconnected facts.

Example: Comparing basic RAG vs. graph RAG

We’ll use an example to understand the difference between an LLM’s responses using raw text data as context and one using a knowledge graph built from the same text as context.

Raw text data

“Sarah is an avid traveler who recently visited New York City. During her trip, she saw the Statue of Liberty, which was designed by Frédéric Auguste Bartholdi and completed in 1886. Sarah also visited the Empire State Building, which was completed in 1931 and was designed by Shreve, Lamb & Harmon. Sarah took a memorable photo in front of the Brooklyn Bridge, which was designed by John A. Roebling and completed in 1883. She also visited Central Park, a large public park in New York City.”

Knowledge graph construction from the raw text data

From the raw text data, we can construct a knowledge graph by identifying entities and their relationships:

  • Entities:

['Sarah', 'Traveler', 'New York City', 'Statue of Liberty', 'Frédéric Auguste Bartholdi', '1886', 'Empire State Building', 'Shreve, Lamb & Harmon', '1931', 'Brooklyn Bridge', 'John A. Roebling', '1883', 'Central Park', 'Public Park']

  • Relationships:

[('Sarah', 'is', 'Traveler'),
('Sarah', 'visited', 'New York City'),
('Sarah', 'saw', 'Statue of Liberty'),
('Statue of Liberty', 'was designed by', 'Frédéric Auguste Bartholdi'),
('Statue of Liberty', 'was completed in', '1886'),
('Sarah', 'visited', 'Empire State Building'),
('Empire State Building', 'was designed by', 'Shreve, Lamb & Harmon'),
('Empire State Building', 'was completed in', '1931'),
('Sarah', 'took a photo in front of', 'Brooklyn Bridge'),
('Brooklyn Bridge', 'was designed by', 'John A. Roebling'),
('Brooklyn Bridge', 'was completed in', '1883'),
('Sarah', 'visited', 'Central Park'),
('Central Park', 'is', 'Public Park in New York City')]

Response comparison

Query 1: Which landmark visited by Sarah was designed by multiple architects?

Responses to the query 1: "Which landmark visited by Sarah was designed by multiple architects?"

The response generated using raw text as context identifies the Empire State Building as a landmark designed by multiple architects. The phrase “multiple architects” could be interpreted in various ways, leading to potential ambiguity. The lack of specific details about the architects (Shreve, Lamb & Harmon) leaves room for misinterpretation or oversimplification.

The response generated using the KG context provides a clear and unambiguous answer. It not only identifies the Empire State Building as a landmark designed by multiple architects but also clarifies that Shreve, Lamb & Harmon is a group of architects. This additional detail reduces the chance of misinterpretation, ensuring that the user understands exactly what “multiple architects” refers to.

Query 2: Which landmark visited by Sarah was designed by different architects?

Responses to the query 2: "Which landmark visited by Sarah was designed by different architects?"

The raw text response implies that only the Empire State Building was designed by different architects, omitting other relevant landmarks. This oversimplification introduces ambiguity, potentially misleading the user to believe there is only one correct answer when, in fact, there are multiple.

The KG-based response, on the other hand, is unambiguous. It correctly identifies all landmarks designed by different architects (Statue of Liberty, Empire State Building, and Brooklyn Bridge), providing a detailed and accurate answer. Using the structured information in the KG, the LLM disambiguates the query, ensuring the user receives a complete and correct response.
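The sketch below suggests why the structured context surfaces all three landmarks. It indexes a subset of the relationship tuples from the example by subject and follows two hops: from Sarah to each place she encountered, then from each place to its designer. This is an illustrative traversal under assumed names (`outgoing`, `designers_of_places_seen_by`), not a description of what the LLM does internally.

```python
# Subset of the relationship tuples from the example above.
triples = [
    ("Sarah", "visited", "New York City"),
    ("Sarah", "saw", "Statue of Liberty"),
    ("Statue of Liberty", "was designed by", "Frédéric Auguste Bartholdi"),
    ("Sarah", "visited", "Empire State Building"),
    ("Empire State Building", "was designed by", "Shreve, Lamb & Harmon"),
    ("Sarah", "took a photo in front of", "Brooklyn Bridge"),
    ("Brooklyn Bridge", "was designed by", "John A. Roebling"),
    ("Sarah", "visited", "Central Park"),
]

# Index edges by subject so each hop is a dictionary lookup.
outgoing: dict[str, list[tuple[str, str]]] = {}
for subject, relation, obj in triples:
    outgoing.setdefault(subject, []).append((relation, obj))


def designers_of_places_seen_by(person: str) -> dict[str, str]:
    """Two-hop traversal: person -> place -> designer of that place."""
    designers = {}
    for _, place in outgoing.get(person, []):
        for relation, target in outgoing.get(place, []):
            if relation == "was designed by":
                designers[place] = target
    return designers


print(designers_of_places_seen_by("Sarah"))
```

Every landmark connected to Sarah appears together with its designer, regardless of how far apart the corresponding sentences were in the raw text. Places without a designer edge, such as Central Park, are left out.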

In both examples, the raw text context leads to responses that are prone to ambiguity and oversimplification, which can mislead the user. The knowledge graph context, however, helps the LLM provide clear, unambiguous, and detailed answers by leveraging the structured relationships between entities. This reduces the risk of hallucination and ensures more accurate and reliable responses.

Quiz: Understanding the Path to Graph RAG

1. What issue does retrieval-augmented generation (RAG) aim to solve in large language models?

A) Reducing computational costs
B) Generating creative images
C) Providing accurate and up-to-date information
D) Reducing the size of training data


What’s next?

Next, we’ll explore techniques for extracting entities and relationships from raw text. We’ll learn how to improve entity and relationship extraction with LLMs. We’ll also learn to create a knowledge graph from the extracted entities and relationships in Neo4j, a graph database. Lastly, we’ll enhance the capabilities of an LLM with the constructed knowledge graph.