LLM Super Powers with Langchain Agents

Langchain is one of my favorite tools at the moment because of how much it simplifies building LLM-powered applications.

In some cases you need more than just a well-written prompt: you may want to trigger different data sources or actions based on what the user asks.

For example: in an e-commerce site, if the user asks to view a list of "shoes", you would probably run a keyword or semantic search and return a list of matching shoes.

If they ask about delivery information, you may want to look up that data from an SQL DB or API.

Langchain agents are a powerful mechanism at your disposal for building complex custom LLM chatbots. In this article, we will go over what an agent is and how to build one.

What is a Langchain Agent?

When you use an LLM like OpenAI's "gpt-3.5-turbo", you typically send the model a prompt consisting of one or more messages, and the LLM responds accordingly.

So essentially it's text in and text out. This is fine for a chatbot performing one particular task, like answering support questions, but what if you need to switch data sources, run a web search, or record something in the DB?

This is where agents come in handy; an agent is basically a "task executor". It allows the LLM to execute functions and other code in your application based on its reasoning and the user's input.

Think of a switch statement: depending on which case matches, a particular block of code executes.
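To make the analogy concrete, here is a plain-Python sketch; the intent names and actions are made up purely for illustration:

```python
def route(intent: str) -> str:
    # Each branch plays the role of one case in a switch statement.
    if intent == "product_search":
        return "run a semantic search over the catalog"
    if intent == "delivery_info":
        return "fetch delivery data from the external API"
    return "answer directly from the model"
```

An agent does the same kind of routing, except the LLM, rather than hard-coded logic, decides which branch (tool) to take.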

Agents are not exclusive to Langchain. Each LLM provider handles tool calling differently; Langchain simply provides a consistent API regardless of which backend LLM you are using.

Tools

Since agents are task executors, we need some kind of "callback" for the agent to execute, such as a function or class.

Tools take in either the raw prompt or a list of arguments, and return some sort of output. Usually you would return a string, but it's also possible to return more complex data, like a LangChain document.

There are multiple ways of declaring a tool. We will cover the decorator approach since it's the most common and easiest solution to understand.

Here is an example:

import requests
from langchain.tools import tool

@tool
def search_delivery_information(prompt: str) -> str:
    """When the user requests delivery information."""

    # We now return some text from an external API
    # The LLM will analyze this text and
    # - return the appropriate answer to the user.
    return requests.get("https://example.com/somewhere/delivery.json").text

Three essential components make up a tool:

  1. @tool - This decorator will take care of handling the input/output of your function in a way that the LLM can understand.

  2. """ - The docstring; think of this as a prompt system message to the LLM, where you tell the LLM when to execute this function and provide any other useful context data.

  3. The returned data. The LLM will ingest your function's response as context data and scope its response to that context.

To clarify what I mean by "scope": without any context data, if the user asks "Which is the best shoe brand?", the LLM will respond with some generic answer based on its training data, similar to asking ChatGPT directly, so it may respond with "Nike" or "Reebok".

However, if the tool returns "Adidas is our best brand.", then the LLM's response will regard "Adidas" as the best brand and not "Nike" or "Reebok".

Putting it all together

Okay great! Now you know what an agent is and how to create your custom callback functions to help the LLM better answer the user's question.

Let's now build an Agent and link it to our custom tool:

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.agents.format_scratchpad.openai_tools import (
    format_to_openai_tool_messages,
)
from langchain.agents.output_parsers.openai_tools import OpenAIToolsAgentOutputParser

# Create a standard chat LLM
llm = ChatOpenAI(model_name="gpt-3.5-turbo-0125", temperature=0)

# Create a list of all tools you want to enable
tools = [search_delivery_information]

# Connect our tools to the LLM
llm_with_tools = llm.bind_tools(tools)

# Build a chat prompt.
# Notice we have a placeholder for the user's input
# - and a second placeholder for the agent's context data.
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are an e-commerce assistant."),
        ("user", "{input}"),
        MessagesPlaceholder(variable_name="agent_ctx"),
    ]
)

# Next we build the actual agent.
agent = (
    {
        "input": lambda x: x["input"],
        "agent_ctx": lambda x: format_to_openai_tool_messages(
            x["intermediate_steps"]
        ),
    }
    | prompt
    | llm_with_tools
    | OpenAIToolsAgentOutputParser()
)

The dictionary at the start of the agent pipeline might seem confusing at first glance. The "input" entry carries the user's input; the prompt template created earlier will replace "{input}" with the user's actual question or message.

The second entry, "agent_ctx", handles tool output: since our tool callbacks are plain Python functions, there needs to be a translation step (format_to_openai_tool_messages) that converts their output into messages the model can understand and the agent can transmit via the REST API.

You will also notice we chain one more object at the end, "OpenAIToolsAgentOutputParser". It parses the model's response into a form the agent executor understands: either a finished answer for the user, or a request to execute one of the tools.
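Conceptually, the parser's job can be sketched in plain Python. This is only an illustration of the idea, not the real implementation, and the dict shape here is invented:

```python
def parse_model_output(message: dict):
    # If the model requested tool calls, the agent must execute them;
    # otherwise, the message content is the final answer for the user.
    if message.get("tool_calls"):
        return ("actions", message["tool_calls"])
    return ("finish", message["content"])
```

The agent executor loops on this decision: "actions" results are run and fed back in as intermediate steps, while "finish" ends the loop.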

Finally, we can instantiate an agent executor and prompt the LLM:

from langchain.agents import AgentExecutor

agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=False)
result = agent_executor.invoke({"input": question})
print(result["output"])  # invoke() returns a dict; "output" holds the final answer

Wait! What about memory?

Naturally, users will ask follow-up questions, and the LLM needs to be aware of the earlier turns to provide accurate answers.

Luckily, Langchain makes managing memory so much easier. Here is an example:

from myapp.models import Message
from langchain_openai import ChatOpenAI
from langchain_core.messages import AIMessage, HumanMessage
from langchain_core.prompts import (
    ChatPromptTemplate,
    MessagesPlaceholder,
)
from langchain.agents.format_scratchpad.openai_tools import (
    format_to_openai_tool_messages,
)
from langchain.agents.output_parsers.openai_tools import (
    OpenAIToolsAgentOutputParser,
)

# ... other agent code as per above snippets
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are an e-commerce assistant."),
        # Add a new variable to replace with previous chat messages.
        MessagesPlaceholder(variable_name="chat_history"),
        ("user", "{input}"),
        MessagesPlaceholder(variable_name="agent_ctx"),
    ]
)

# Build a list of messages from the DB.
chat_history = []

# Very simple query, you may want to limit to
# - the last 10 conversations or something to that effect.
messages = Message.objects.filter(is_deleted=False)
for m in messages:
    if m.model_answer is not None:
        # Use extend (not append) so the history stays a flat list of messages.
        chat_history.extend(
            [
                HumanMessage(content=m.user_question),
                AIMessage(content=m.model_answer),
            ]
        )

agent = (
    {
        "input": lambda x: x["input"],
        "agent_ctx": lambda x: format_to_openai_tool_messages(
            x["intermediate_steps"]
        ),
        "chat_history": lambda x: x["chat_history"],
    }
    | prompt
    | llm_with_tools
    | OpenAIToolsAgentOutputParser()
)

In the above code, we first add a new "chat_history" variable to the original prompt template, and then in the agent pipeline we map that variable to the list of messages constructed from the DB.
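When invoking the executor, the history is passed alongside the input. Here is a minimal sketch; the conversation turns are invented for illustration, and recent langchain_core versions will also coerce plain (role, content) tuples into message objects:

```python
# Flatten (question, answer) pairs, as in the DB loop above,
# into a single list of alternating messages.
chat_history = []
for question, answer in [
    ("Do you stock running shoes?", "Yes, we carry several brands."),
    ("Do you ship overnight?", "Overnight shipping is available."),
]:
    chat_history.extend([("human", question), ("ai", answer)])

payload = {
    "input": "What was my first question?",
    "chat_history": chat_history,
}
# agent_executor.invoke(payload) would now see the earlier turns.
```

The key point is that "chat_history" must be a flat list of messages, not a list of lists, or the placeholder substitution will fail.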

Security concerns

Since each tool's name, docstring, and argument schema are sent to the OpenAI REST API, it's important to ensure you do not expose any sensitive information in them.

Furthermore, if you are allowing the LLM to interact with your SQL RDBMS, be careful of SQL injection: use parameterized queries, build an API around your data, or use a vector database like Qdrant.
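One way to guard a tool against SQL injection is to bind user-supplied values as parameters. A minimal sketch using sqlite3 and a made-up orders table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
conn.execute("INSERT INTO orders VALUES (1, 'shipped')")

def order_status(order_id: str) -> str:
    # The ? placeholder binds order_id as data, never as SQL text,
    # so an injection attempt cannot alter the query.
    row = conn.execute(
        "SELECT status FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    return row[0] if row else "not found"
```

A normal lookup like order_status("1") returns the stored status, while an injection attempt such as order_status("1 OR 1=1") is treated as a literal value and simply matches no row.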