Discover how to retrieve information from documents using retrieval-augmented generation in LangChain.
Retrieval-augmented generation (RAG) is an NLP architecture that combines retrieval-based and generation-based approaches, enabling a model to extract information from a specified document. The language model pulls the relevant information from user-specific data. By leveraging a retrieval mechanism, RAG overcomes a language model's limitations in generating contextually relevant and accurate responses, resulting in more informed and contextually appropriate answers.
LangChain facilitates the implementation of RAG applications, empowering developers to seamlessly replace specific functionalities within an application.
LangChain provides a retrieval system through its document loaders, document transformers, text embedding models, vector stores, and retrievers.
Document loaders
Document loaders load documents from different sources and types, including HTML, PDF, code, and CSV. They also support loading documents from private data sources.
Loading a text file
The most basic type of document loading available in LangChain is the TextLoader. Its load
method reads a file as text and saves it in a single document as follows:
from langchain.document_loaders import TextLoader

loader = TextLoader("inputFile.txt")
loader.load()
Loading a CSV file
Another commonly used document is a comma-separated value (CSV) file. A CSV file is a delimited text file often used for storing and exchanging tabular data in plain text form. Each line of the file represents a table row, and commas separate the values within each row. CSV files are widely used for data manipulation and analysis across different applications and platforms.
from langchain.document_loaders.csv_loader import CSVLoader

loader = CSVLoader(file_path='inputRecord.csv')
data = loader.load()
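Each row of the CSV file is loaded as a separate document. As a quick check (assuming an inputRecord.csv file is available), we can inspect the first loaded row and its metadata:

# Each row of the CSV file becomes its own document
print(len(data))               # number of rows loaded
print(data[0].page_content)    # first row rendered as "column: value" lines
print(data[0].metadata)        # e.g., {'source': 'inputRecord.csv', 'row': 0}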
Loading a PDF file
Portable Document Format (PDF) files are commonly used for storing documents across different platforms. LangChain provides a PyPDFLoader
method to load a PDF file as follows:
from langchain.document_loaders import PyPDFLoader

loader = PyPDFLoader("inputDocument.pdf")
pages = loader.load_and_split()
Here, the PyPDFLoader
loads the document into an array of documents, each containing a single page's content along with metadata holding the corresponding page number.
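For example, assuming an inputDocument.pdf file is available, we can inspect the first page's content and its metadata:

# Inspect the first page returned by load_and_split()
first_page = pages[0]
print(first_page.page_content[:200])   # beginning of the page's text
print(first_page.metadata)             # e.g., {'source': 'inputDocument.pdf', 'page': 0}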
Document transformers
After loading the document, it’s important to transform it according to our application or model requirements. This is where document transformers come into play. LangChain has various built-in transformers for documents that can perform several operations, including:
Splitting
Filtering
Combining
Translating to another language
Manipulating data
Let’s look into the simplest transformation: splitting. We use text splitters in LangChain for this purpose.
Text splitters
When working with a long document, splitting it into smaller pieces is often necessary. The text splitters follow the steps below:
Split the document into smaller, readable chunks.
Combine the small chunks into larger ones to reach the desired chunk size.
Overlap part of smaller chunks at the boundary to keep the context between chunks.
Let’s see how the recursive text splitter works in this example. We use the following syntax:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    # The chunk_size and chunk_overlap can be modified according to the requirements
    length_function = len,
    chunk_size = 200,
    chunk_overlap = 10,
    add_start_index = True,
)
texts = text_splitter.create_documents([input_document])
Line 5: We define the length of a chunk. By default, this is done by counting the number of characters.
Line 6: We define the size of the chunks we want the document to be split into.
Line 7: We define the overlapping between chunks from the document to maintain context between them.
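To see the effect of these parameters, assuming input_document holds a plain string of text, we can print the size of each chunk and the start index that add_start_index stores in each chunk's metadata:

# Inspect the generated chunks and their metadata
for chunk in texts:
    print(len(chunk.page_content), chunk.metadata)   # e.g., 198 {'start_index': 0}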
The following table represents the different types of text splitters available to use:
Text Splitters

| Text Splitter | Usage |
|---|---|
| HTML header text splitter | Splits text at the element level and adds relevant metadata for each header to a chunk |
| Split by character | Splits based on characters, where chunk size is measured by the number of characters in a chunk |
| Split code | Enables splitting code written in multiple programming languages |
| Markdown header text splitter | Splits a document into chunks identified by various headers, creating header groups |
| Recursively split by character | Splits generic text based on a parameterized list of characters. The default list is ["\n\n", "\n", " ", ""] |
| Split by tokens | Splits text based on the token limit of a language model |
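As an illustration of one of these, here is a minimal sketch of splitting by character with CharacterTextSplitter, reusing the chunk_size and chunk_overlap values from above (the separator is our own choice):

from langchain.text_splitter import CharacterTextSplitter

# Split on newlines, measuring chunk size by the number of characters
character_splitter = CharacterTextSplitter(
    separator = "\n",
    chunk_size = 200,
    chunk_overlap = 10,
    length_function = len,
)
character_chunks = character_splitter.create_documents([input_document])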
Text embedding models
Embeddings in LangChain capture a text's semantic meaning in a vector representation, which we can use to determine its similarity to other texts in a document. For this purpose, LangChain provides the Embeddings class, an interface for interacting with all the embedding models. Several embedding model integrations are available, including Hugging Face, OpenAI, Cohere, etc.
The Embeddings class at a base level provides embeddings using two methods:
Embedding documents: This embeds multiple documents or texts into their numerical representations. To embed documents, we use the following syntax:

# embeddings_model is an instance of an embedding model, e.g., OpenAIEmbeddings()
embeddings = embeddings_model.embed_documents(
    [
        "This is the first text",
        "This is the second text",
        "This is the third text"
    ]
)
Embedding a query: This embeds a single query into its numerical representation. The query is the text we want to search for in the document.
embedded_query = embeddings_model.embed_query("WRITE_YOUR_QUERY_HERE")
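Both methods return plain lists of floating-point numbers. As a sketch of how the similarity mentioned above can be computed, assuming embeddings and embedded_query come from the previous snippets, we can compare the query embedding with each document embedding using cosine similarity:

import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors
    a, b = np.array(a), np.array(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Score each embedded document against the embedded query
scores = [cosine_similarity(embedded_query, doc_vector) for doc_vector in embeddings]
print(scores)   # a higher score means the text is more similar to the query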
Vector stores
Documents often contain unstructured data that needs some structure before a model can search it. We most commonly structure that data by embedding it into a vector space. When a query is passed to retrieve data from the document, the unstructured query is embedded, and its similarity to the data in the vector space is computed to find the most similar results. Vector stores handle this entire search process.
There are two methods for searching for similar data stored in vector stores:
Computing a simple similarity index for searching data
In the simple similarity method, we pass the query directly to the database, which computes a similarity index against the stored documents and retrieves the most similar results.
The following code shows how we can use the simple similarity_search
method for retrieving data:
from langchain.document_loaders import TextLoader
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma

# Load the document
input_document = TextLoader('my_document.txt').load()

# transform the document
text_splitter = RecursiveCharacterTextSplitter(
    # The chunk_size and chunk_overlap can be modified according to the requirements
    length_function = len,
    chunk_size = 200,
    chunk_overlap = 10,
    add_start_index = True,
)

documents = text_splitter.split_documents(input_document)

# embed the chunks
db = Chroma.from_documents(documents, OpenAIEmbeddings())

# user query
query = "WRITE_YOUR_QUERY_HERE"

# computing the search using the similarity_search() method
docs = db.similarity_search(query)
Line 4: We import the Chroma module, which is an open-source vector store for building AI applications.
Line 7: We load the input document my_document.txt using the TextLoader function.
Lines 10–18: We split the document using the RecursiveCharacterTextSplitter and store the chunks in a documents array.
Line 21: We use OpenAIEmbeddings to create a Chroma database db as the vector store.
Line 27: We retrieve the data based on the query using the similarity_search method; the result can be inspected as shown below.
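docs is a list of Document objects ordered by similarity to the query, so printing the first element shows the most relevant chunk:

# The first result is the chunk most similar to the query
print(docs[0].page_content)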
Using vectors to compute the similarity
We can also embed the query ourselves and pass the resulting vector to the vector store to find the similarity index and retrieve data.
In the following code, after transforming and embedding the document chunks, we first embed the user query on line 5 using the embed_query
method. Then we conduct the similarity search with vectors by using the similarity_search_by_vector
method on line 8:
# user query
query = "WRITE_YOUR_QUERY_HERE"

# embedding the query
embedding_vector = OpenAIEmbeddings().embed_query(query)

# computing the search using the similarity_search_by_vector() method
docs = db.similarity_search_by_vector(embedding_vector)
Note: Embedding the query ourselves does not change the result; it retrieves the same documents as similarity_search.
Retrievers
As the name suggests, a retriever's sole purpose is to retrieve data and documents; unlike a vector store, it is not required to store the documents. A retriever can use any backbone for storing documents, including vector stores.
To use a retriever with vector stores, we use the following code:
from langchain.vectorstores import Chroma

db = Chroma.from_texts(texts, embeddings)
retriever = db.as_retriever()

# invoking the retriever
retrieved_docs = retriever.invoke(
    # write your query here
)
We declare the retriever for our vector store using the as_retriever
method on line 4 and pass our query to the retriever.invoke
method on lines 7–9.
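The as_retriever method also accepts optional search parameters. For example, here is a sketch (the specific values are our own choice) that configures the retriever to use maximal marginal relevance (MMR) and return only the top two documents:

# Configure the retriever to use MMR and return the top 2 documents
retriever = db.as_retriever(
    search_type = "mmr",
    search_kwargs = {"k": 2},
)
retrieved_docs = retriever.invoke("WRITE_YOUR_QUERY_HERE")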
Test yourself
Check how well you understand RAG in LangChain with a short quiz.
Which option is not a transformer?
CharacterTextSplitter
Embed_documents
tiktoken
MarkdownHeaderTextSplitter