What are agents?

Agents in LangChain make decisions and take actions by using a language model. Unlike traditional chains, where the sequence of actions is hardcoded, agents use a language model to decide which actions to take and to generate the response based on the user's input and the available tools.

[Figure: An agent takes in user input and utilizes the available tools]

What are tools?

Tools are the functionalities available to an agent for making observations. We can think of tools as a way to connect agents to external sources of information and use them to answer users' questions. Tools allow agents to interact with data stores, fetch data from APIs, or even use the functionality of another programming language to reach an observation; any Python function can also be wrapped as a custom tool, as shown in the sketch after this list. Let's look at some commonly used tools:

  • LLM math: This is an excellent tool that we can utilize to answer math problems. This can be particularly useful when building an educational platform to provide step-by-step solutions to students.

  • Google: LangChain supports various Google products, including Google Drive, Finance, Jobs, Lens, Places, Scholar, Search, Serper, and Trends. For example, we can utilize the Google Search tool to answer questions about current events outside of the knowledge base of GPT.

  • Python REPL: This tool is useful for executing Python commands to solve complex problems. This allows us to integrate LLMs into our data science application that runs data analysis and machine learning algorithms interactively.

  • YouTube search tool: This tool allows us to search for YouTube videos by scraping the results page. This lets users search for and surface videos from within an application.

  • Twilio API wrapper: This allows us to send short messages using Twilio’s messaging channels, like WhatsApp, Facebook Messenger, and Google Messages. This allows us to build a customer support application through instant messaging platforms.

  • OpenWeatherMap API wrapper: This tool allows us to fetch weather information that we can use in travel-planning applications.
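
Most of these built-in tools can be loaded by name with load_tools (as we'll do below), and we can also wrap any Python function as a custom tool. The snippet below is a minimal sketch of that idea; the get_order_status function and its behavior are hypothetical, not part of LangChain.

from langchain.agents import Tool

# A hypothetical helper that looks up an order in our own system.
def get_order_status(order_id: str) -> str:
    # In a real application, this would query a database or an internal API.
    return f"Order {order_id} has been shipped."

# Wrap the function as a tool; the description tells the agent when to use it.
order_status_tool = Tool(
    name="order-status",
    func=get_order_status,
    description="Looks up the shipping status of an order, given its order ID.")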

Types of agents

LangChain provides several types of agents that are useful for different applications. In this lesson, we’ll cover some of the most useful types.

Zero-shot ReAct

Zero-shot ReAct is the most commonly used agent. It uses the reasoning and acting (ReAct) framework to choose from the available tools based solely on each tool's description. The ReAct framework interleaves reasoning traces with task-specific actions.

  • The reasoning part helps the model figure out, keep track of, and update a plan for what to do.

  • The action part carries out that plan, such as calling a tool or fetching information from an external source.

This is like having a smart team where one part plans things out, and the other gets things done by talking to the outside world for information.
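
Conceptually, the agent keeps cycling through thought, action, and observation until it can give a final answer. The sketch below is a simplified, hypothetical view of that loop; none of the names in it come from LangChain, whose agent executor runs the real loop for us.

# An illustrative thought/action/observation loop; not LangChain's actual implementation.
def react_loop(decide_next_step, tools, question, max_steps=5):
    scratchpad = []  # accumulated (thought, action, observation) triples
    for _ in range(max_steps):
        # The language model reasons over the question and everything observed so far.
        thought, action, tool_input, final_answer = decide_next_step(question, scratchpad)
        if final_answer is not None:
            return final_answer  # the model is confident enough to stop
        # Otherwise, run the chosen tool and record what it returns.
        observation = tools[action](tool_input)
        scratchpad.append((thought, action, observation))
    return "No final answer within the step limit."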

Example: Getting information from an external API

Let's look at an example of using the ReAct framework in LangChain. We'll use the llm-math tool for calculations and SerpAPI to search for information on the internet. SerpAPI queries Google search results through API calls.

To get the API key for SerpAPI, follow the steps below:

  1. Visit the SerpAPI website.

  2. Click “Register,” and sign up using your Google account or GitHub account, or follow the sign-up process.

  3. Follow the steps to verify your email address and phone number.

  4. Copy the API key from your SerpAPI dashboard (it appears in the navigation tabs on the left side) and save it; we'll set it as the SERPAPI_API_KEY environment variable.

In this example, we ask the model to tell us the world's current population and calculate the percentage change from five years ago. Because the LLM's knowledge is limited to the public data it was trained on, it needs to look up the world's current population through Google. That's where SerpAPI helps. We also provide the llm-math tool to calculate the percentage change once the current population and the population from five years ago are available. Let's see how we can pass SerpAPI and llm-math as tools to the agent:

import os
# importing LangChain modules
from langchain.llms import OpenAI
from langchain.agents import AgentType, initialize_agent, load_tools

os.environ["SERPAPI_API_KEY"] = "{{SERPAPI_API_KEY}}"

# Insert your key here
llm = OpenAI(temperature=0.0,
            openai_api_key = "{{OpenAI_Key}}")

# loading tools
tools = load_tools(["serpapi", 
                    "llm-math"], 
                    llm=llm)

agent = initialize_agent(tools, 
                        llm, 
                        agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, 
                        verbose=True)

# user's query
print(agent.run("What is the current population of the world, and calculate the percentage change compared to the population five years ago"))
Zero-shot ReAct agent in LangChain
  • Line 4: We import AgentType, initialize_agent, and load_tools from langchain.agents.

  • Line 6: We load the SerpAPI API key as an environment variable. We use this to search the internet for information and respond to the user’s query.

  • Lines 13–15: We load the llm-math and the serpapi tools using the load_tools method.

  • Lines 17–20: We initialize the zero-shot ReAct agent with tools.

  • Line 23: We run the agent with a query asking about the percentage change in the world’s current population compared to the population from five years ago.

[Figure: Calculating the percentage change in the current population compared to the population from five years ago]

The output shows the complete thought process of our agent. Note that the agent has access to SerpAPI for searching and a math tool.

  1. After entering the chain, the agent first determines that it needs two pieces of information to reach an answer. It figures out that it needs the current population and the population five years ago.

  2. The agent then takes its first action as “search.” It creates an action input of “world population” and makes an observation that the current world population is 7.88 billion people.

  3. Based on the observation, the agent then devises the next step, represented by the "thought" here.

  4. This thought triggers a search action again, and the agent creates the corresponding input.

  5. Now that the current population and the population five years ago are known, the agent realizes it needs a calculator as its next action. It creates an action input to calculate the percentage change and arrives at the final answer (the formula it applies is sketched below).
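
The calculator step boils down to the standard percentage-change formula. Here is a quick sketch; the population figures are placeholders for illustration, not the agent's actual observations.

# The percentage-change formula the llm-math step evaluates.
def percent_change(old, new):
    return (new - old) / old * 100

# Placeholder figures for illustration only, not the agent's observed values.
print(percent_change(7.5e9, 7.88e9))  # roughly 5.07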

Conversational ReAct

Many applications require responses in a conversational style. The conversational agent specializes in this: it keeps the chat history in memory so that its responses are helpful and conversational. Additionally, as the name suggests, it uses the ReAct framework to decide which tool should take action.

Example: Empowering ChatGPT with current knowledge

Let's look at an example of how we can use the conversational ReAct agent in LangChain to create a clone of ChatGPT. However, we'll strive to make a better version, one that overcomes ChatGPT's knowledge limitation by giving it Google search capability through SerpAPI.

import os
# importing LangChain modules
from langchain.llms import OpenAI
from langchain.memory import ConversationBufferMemory
from langchain.agents import AgentType, initialize_agent, load_tools

os.environ["SERPAPI_API_KEY"] = "{{SERPAPI_API_KEY}}"
# Insert your key here
llm = OpenAI(temperature=0.0,
            openai_api_key = "{{OpenAI_Key}}")

memory = ConversationBufferMemory(memory_key="chat_history")

# loading tools
tools = load_tools(["serpapi"], 
                    llm=llm)

agent = initialize_agent(tools,
                        llm,
                        agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION,
                        verbose=True,
                        memory=memory)

agent.run("Hi, my name is Alex, and I live in the New York City.")
agent.run("My favorite game is basketball.")
print(agent.run("Give me the list of stadiums to watch a basketball game in my city today. Also give the teams that are playing."))
Conversational agent in LangChain
  • Line 4: We import ConversationBufferMemory, which we use on line 12 to maintain the chat history for the conversational agent.

  • Lines 15–16: We load the SerpAPI tool to equip our conversational agent with search capabilities.

  • Lines 18–22: We initialize the conversational ReAct agent with the tool, the LLM, and the memory variable.

  • Lines 24–26: We prompt the agent with queries. Note that the third prompt on line 26 asks about a list of stadiums without providing the city’s name. This is because the agent remembers the city name from the chat history.
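
To confirm that the chat history is actually being stored, we can inspect the memory object after the three agent.run calls above. This is a quick check using attributes that ConversationBufferMemory exposes.

# Print the transcript that ConversationBufferMemory accumulated during the run.
print(memory.buffer)

# The agent retrieves the same history internally under the "chat_history" key.
print(memory.load_memory_variables({})["chat_history"])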

Test yourself

Check how well you understand how agents and tools are used in LangChain.

1. (True or False) Agents are the core of LangChain that allow us to communicate with the external world to obtain information with the help of tools.

A) True

B) False


2. What is the correct order of actions that an agent must take to reach a final response?


