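Build a custom RAG agent

Overview

In this tutorial, we will build a retrieval agent using LangGraph. LangChain provides a built-in agent implementation built on LangGraph primitives. If you need deeper customization, you can implement the agent directly in LangGraph. This guide shows an example implementation of a retrieval agent. Retrieval agents are useful when you want an LLM to decide whether to retrieve context from a vectorstore or to respond to the user directly. By the end of the tutorial you will have done the following:

Fetch and preprocess the documents that will be used for retrieval.
Index those documents for semantic search and create a retriever tool for the agent.
Build an agentic RAG system that can decide when to use the retriever tool.

Concepts

We will cover the following concepts:

Retrieval using document loaders, text splitters, embeddings, and vector stores
The LangGraph Graph API, including state, nodes, edges, and conditional edges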

Setup

Let's install the required packages and set our API keys:

pip install -U langgraph "langchain[openai]" langchain-community langchain-text-splitters bs4

import getpass
import os


def _set_env(key: str):
    if key not in os.environ:
        os.environ[key] = getpass.getpass(f"{key}:")


_set_env("OPENAI_API_KEY")

Sign up for LangSmith to quickly spot issues and improve the performance of your LangGraph projects. LangSmith lets you use trace data to debug, test, and monitor LLM apps built with LangGraph.

1. Preprocess documents

Fetch the documents to use in the RAG system. We will use three of the most recent pages from Lilian Weng's excellent blog. We start by fetching the content of the pages with the WebBaseLoader utility:

from langchain_community.document_loaders import WebBaseLoader

urls = [
    "https://lilianweng.github.io/posts/2024-11-28-reward-hacking/",
    "https://lilianweng.github.io/posts/2024-07-07-hallucination/",
    "https://lilianweng.github.io/posts/2024-04-12-diffusion-video/",
]

docs = [WebBaseLoader(url).load() for url in urls]

docs[0][0].page_content.strip()[:1000]

Split the fetched documents into smaller chunks so they can be indexed into the vectorstore:

from langchain_text_splitters import RecursiveCharacterTextSplitter

docs_list = [item for sublist in docs for item in sublist]

text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=100, chunk_overlap=50
)
doc_splits = text_splitter.split_documents(docs_list)

doc_splits[0].page_content.strip()
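
To make the roles of chunk_size and chunk_overlap concrete, here is a minimal sketch of a sliding window over a token list. It is a simplified illustration only: chunk_with_overlap is a hypothetical helper, whitespace-separated words stand in for tokens, and the real RecursiveCharacterTextSplitter also splits on separators and measures length with a tokenizer.

```python
def chunk_with_overlap(tokens, chunk_size, chunk_overlap):
    """Split a token list into windows of chunk_size that share chunk_overlap tokens.

    Hypothetical helper for illustration; not the splitter used in the tutorial.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end of the input
    return chunks


# Assumption: whitespace-separated words stand in for tokenizer tokens.
words = "a b c d e f g h i j".split()
chunks = chunk_with_overlap(words, chunk_size=4, chunk_overlap=2)
for chunk in chunks:
    print(chunk)
```

With a window of 4 and an overlap of 2, each chunk repeats the last two tokens of the previous one, so a sentence that straddles a boundary still appears intact in at least one chunk.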

2. Create a retriever tool

Now that we have the split documents, we can index them into a vector store to use for semantic search.

Use an in-memory vector store and OpenAI embeddings:

from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import OpenAIEmbeddings

vectorstore = InMemoryVectorStore.from_documents(
    documents=doc_splits, embedding=OpenAIEmbeddings()
)
retriever = vectorstore.as_retriever()
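
As a rough mental model of what the vector store and retriever do, the sketch below ranks texts by cosine similarity to a query. Everything in it is an assumption made for illustration: ToyVectorStore, embed, and cosine are hypothetical names, and bag-of-words counts stand in for the dense vectors an embedding model such as OpenAIEmbeddings would produce.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store: embeds texts once, then ranks them per query."""

    def __init__(self, texts):
        self.entries = [(text, embed(text)) for text in texts]

    def search(self, query, k=2):
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore([
    "reward hacking exploits flaws in the reward function",
    "diffusion models generate video frame by frame",
    "hallucination is fabricated content in model output",
])
top = store.search("types of reward hacking", k=1)
print(top)
```

A real embedding model captures meaning beyond exact word overlap, which is why the tutorial uses one instead of word counts.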

Create a retriever tool using LangChain's prebuilt create_retriever_tool:

from langchain_classic.tools.retriever import create_retriever_tool

retriever_tool = create_retriever_tool(
    retriever,
    "retrieve_blog_posts",
    "Search and return information about Lilian Weng blog posts.",
)

Test the tool:

retriever_tool.invoke({"query": "types of reward hacking"})

3. Generate query

Now we will start building the components (nodes and edges) for the agentic RAG graph. Note that the components operate on MessagesState, a graph state with a messages key that holds a list of chat messages.
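
The nodes in this tutorial return a partial update of the form {"messages": [...]}, and the new messages are added to the list already in the state. The following is a minimal sketch of that append behavior using plain dicts. append_messages and apply_update are hypothetical names, and the sketch is a simplification: the reducer that MessagesState actually uses also handles details such as updating an existing message with the same ID.

```python
def append_messages(existing, update):
    """Simplified reducer: new messages are appended to the existing list."""
    return existing + update


def apply_update(state, node_output):
    """Merge a node's partial update into the state (hypothetical stand-in for the runtime)."""
    return {"messages": append_messages(state["messages"], node_output["messages"])}


state = {"messages": [{"role": "user", "content": "hello!"}]}
node_output = {
    "messages": [{"role": "assistant", "content": "Hello! How can I help you today?"}]
}

state = apply_update(state, node_output)
roles = [m["role"] for m in state["messages"]]
print(roles)
```

Because updates accumulate, later nodes can read the original question from the start of the list and the most recent tool output from its end.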

Build a generate_query_or_respond node. It calls an LLM to generate a response based on the current graph state (the list of messages). Given the input messages, it decides whether to retrieve using the retriever tool or to respond to the user directly. Note that we give the chat model access to the retriever_tool we created earlier via .bind_tools:

from langgraph.graph import MessagesState
from langchain.chat_models import init_chat_model

response_model = init_chat_model("openai:gpt-4o", temperature=0)


def generate_query_or_respond(state: MessagesState):
    """Call the model to generate a response based on the current state. Given
    the question, it will decide to retrieve using the retriever tool, or simply respond to the user.
    """
    response = (
        response_model
        .bind_tools([retriever_tool]).invoke(state["messages"])  
    )
    return {"messages": [response]}

Try it on a random input:

input = {"messages": [{"role": "user", "content": "hello!"}]}
generate_query_or_respond(input)["messages"][-1].pretty_print()

Output:

================================== Ai Message ==================================

Hello! How can I help you today?

Ask a question that requires semantic search:

input = {
    "messages": [
        {
            "role": "user",
            "content": "What does Lilian Weng say about types of reward hacking?",
        }
    ]
}
generate_query_or_respond(input)["messages"][-1].pretty_print()

Output:

================================== Ai Message ==================================
Tool Calls:
retrieve_blog_posts (call_tYQxgfIlnQUDMdtAhdbXNwIM)
Call ID: call_tYQxgfIlnQUDMdtAhdbXNwIM
Args:
    query: types of reward hacking
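
The two outputs above differ in one respect: the second AI message carries tool calls and the first does not. A router can branch on exactly that. The sketch below shows the idea with plain dicts standing in for message objects. route_on_tool_calls is a hypothetical function, and the END value is an assumed sentinel for this illustration rather than an import from LangGraph.

```python
END = "__end__"  # assumed sentinel for "stop here" in this sketch


def route_on_tool_calls(message):
    """Return "tools" when the AI message requests a tool call, otherwise END."""
    return "tools" if message.get("tool_calls") else END


direct_reply = {"role": "assistant", "content": "Hello! How can I help you today?"}
tool_request = {
    "role": "assistant",
    "content": "",
    "tool_calls": [
        {
            "id": "1",
            "name": "retrieve_blog_posts",
            "args": {"query": "types of reward hacking"},
        }
    ],
}

print(route_on_tool_calls(direct_reply))
print(route_on_tool_calls(tool_request))
```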

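4. Grade documents

Add a conditional edge, grade_documents, to determine whether the retrieved documents are relevant to the question. We use a model with a structured output schema, GradeDocuments, for document grading. The grade_documents function returns the name of the node to go to next based on the grading decision (generate_answer or rewrite_question):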

from pydantic import BaseModel, Field
from typing import Literal

GRADE_PROMPT = (
    "You are a grader assessing relevance of a retrieved document to a user question. \n "
    "Here is the retrieved document: \n\n {context} \n\n"
    "Here is the user question: {question} \n"
    "If the document contains keyword(s) or semantic meaning related to the user question, grade it as relevant. \n"
    "Give a binary score 'yes' or 'no' score to indicate whether the document is relevant to the question."
)


class GradeDocuments(BaseModel):  
    """Grade documents using a binary score for relevance check."""

    binary_score: str = Field(
        description="Relevance score: 'yes' if relevant, or 'no' if not relevant"
    )


grader_model = init_chat_model("openai:gpt-4o", temperature=0)


def grade_documents(
    state: MessagesState,
) -> Literal["generate_answer", "rewrite_question"]:
    """Determine whether the retrieved documents are relevant to the question."""
    question = state["messages"][0].content
    context = state["messages"][-1].content

    prompt = GRADE_PROMPT.format(question=question, context=context)
    response = (
        grader_model
        .with_structured_output(GradeDocuments).invoke(  
            [{"role": "user", "content": prompt}]
        )
    )
    score = response.binary_score

    if score == "yes":
        return "generate_answer"
    else:
        return "rewrite_question"
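
Because binary_score is typed as a plain string, the exact-match check on "yes" sends any other spelling, such as "Yes" or " yes ", down the rewrite path. The sketch below shows one possible way to normalize the score before routing, and how the routing can be exercised offline. route_from_score, grade_with, and keyword_grader are hypothetical names, and keyword_grader is a crude stand-in for the LLM grader, not a replacement for it.

```python
from typing import Callable, Literal

Route = Literal["generate_answer", "rewrite_question"]


def route_from_score(raw_score: str) -> Route:
    """Map a grader's score to a node name, tolerating case and surrounding whitespace."""
    if raw_score.strip().lower() == "yes":
        return "generate_answer"
    return "rewrite_question"


def grade_with(grader: Callable[[str, str], str], question: str, context: str) -> Route:
    """Run any grader callable and route on its (normalized) score."""
    return route_from_score(grader(question, context))


def keyword_grader(question: str, context: str) -> str:
    """Hypothetical offline grader: 'Yes' if any longer question word appears in the context."""
    words = {w.strip("?.,").lower() for w in question.split() if len(w) > 4}
    return "Yes" if any(w in context.lower() for w in words) else "no"


question = "What does Lilian Weng say about types of reward hacking?"
irrelevant = grade_with(keyword_grader, question, "meow")
relevant = grade_with(
    keyword_grader, question, "reward hacking can be categorized into two types"
)
print(irrelevant)
print(relevant)
```

Injecting the grader as a callable keeps the routing logic testable without an API key.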

Run this with irrelevant documents in the tool response:

from langchain_core.messages import convert_to_messages

input = {
    "messages": convert_to_messages(
        [
            {
                "role": "user",
                "content": "What does Lilian Weng say about types of reward hacking?",
            },
            {
                "role": "assistant",
                "content": "",
                "tool_calls": [
                    {
                        "id": "1",
                        "name": "retrieve_blog_posts",
                        "args": {"query": "types of reward hacking"},
                    }
                ],
            },
            {"role": "tool", "content": "meow", "tool_call_id": "1"},
        ]
    )
}
grade_documents(input)

Confirm that relevant documents are classified as such:

input = {
    "messages": convert_to_messages(
        [
            {
                "role": "user",
                "content": "What does Lilian Weng say about types of reward hacking?",
            },
            {
                "role": "assistant",
                "content": "",
                "tool_calls": [
                    {
                        "id": "1",
                        "name": "retrieve_blog_posts",
                        "args": {"query": "types of reward hacking"},
                    }
                ],
            },
            {
                "role": "tool",
                "content": "reward hacking can be categorized into two types: environment or goal misspecification, and reward tampering",
                "tool_call_id": "1",
            },
        ]
    )
}
grade_documents(input)

5. Rewrite question

Build the rewrite_question node. The retriever tool can return potentially irrelevant documents, which indicates that the original user question needs to be improved. To do so, we call the rewrite_question node:

REWRITE_PROMPT = (
    "Look at the input and try to reason about the underlying semantic intent / meaning.\n"
    "Here is the initial question:"
    "\n ------- \n"
    "{question}"
    "\n ------- \n"
    "Formulate an improved question:"
)


def rewrite_question(state: MessagesState):
    """Rewrite the original user question."""
    messages = state["messages"]
    question = messages[0].content
    prompt = REWRITE_PROMPT.format(question=question)
    response = response_model.invoke([{"role": "user", "content": prompt}])
    return {"messages": [{"role": "user", "content": response.content}]}

Try it out:

input = {
    "messages": convert_to_messages(
        [
            {
                "role": "user",
                "content": "What does Lilian Weng say about types of reward hacking?",
            },
            {
                "role": "assistant",
                "content": "",
                "tool_calls": [
                    {
                        "id": "1",
                        "name": "retrieve_blog_posts",
                        "args": {"query": "types of reward hacking"},
                    }
                ],
            },
            {"role": "tool", "content": "meow", "tool_call_id": "1"},
        ]
    )
}

response = rewrite_question(input)
print(response["messages"][-1]["content"])

Output:

What are the different types of reward hacking described by Lilian Weng, and how does she explain them?

6. Generate an answer

Build the generate_answer node: if the grader check passes, we can generate the final answer based on the original question and the retrieved context:

GENERATE_PROMPT = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer the question. "
    "If you don't know the answer, just say that you don't know. "
    "Use three sentences maximum and keep the answer concise.\n"
    "Question: {question} \n"
    "Context: {context}"
)


def generate_answer(state: MessagesState):
    """Generate an answer."""
    question = state["messages"][0].content
    context = state["messages"][-1].content
    prompt = GENERATE_PROMPT.format(question=question, context=context)
    response = response_model.invoke([{"role": "user", "content": prompt}])
    return {"messages": [response]}
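
It can help to see which messages the two index lookups select once the graph has gone through a rewrite. The sketch below builds a message list shaped like the state after one rewrite loop, using plain dicts and shortened content. latest_user_question is a hypothetical helper shown only for contrast; the tutorial's nodes do not use it.

```python
def latest_user_question(messages):
    """Hypothetical helper: return the most recent user message's content."""
    for message in reversed(messages):
        if message["role"] == "user":
            return message["content"]
    raise ValueError("no user message in state")


# Assumed state after one rewrite: original question, failed retrieval,
# rewritten question, second retrieval.
messages = [
    {"role": "user", "content": "What does Lilian Weng say about types of reward hacking?"},
    {"role": "assistant", "content": "", "tool_calls": [{"id": "1"}]},
    {"role": "tool", "content": "meow", "tool_call_id": "1"},
    {"role": "user", "content": "What are the different types of reward hacking described by Lilian Weng?"},
    {"role": "assistant", "content": "", "tool_calls": [{"id": "2"}]},
    {"role": "tool", "content": "reward hacking can be categorized into two types", "tool_call_id": "2"},
]

question = messages[0]["content"]    # index 0: the original user question
context = messages[-1]["content"]    # index -1: the most recent tool output
rewritten = latest_user_question(messages)

print(question)
print(context)
print(rewritten)
```

In this layout, index 0 stays on the original question while the rewritten question sits later in the list, where the model sees it as part of the history on the next generate_query_or_respond call.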

Try it out:

input = {
    "messages": convert_to_messages(
        [
            {
                "role": "user",
                "content": "What does Lilian Weng say about types of reward hacking?",
            },
            {
                "role": "assistant",
                "content": "",
                "tool_calls": [
                    {
                        "id": "1",
                        "name": "retrieve_blog_posts",
                        "args": {"query": "types of reward hacking"},
                    }
                ],
            },
            {
                "role": "tool",
                "content": "reward hacking can be categorized into two types: environment or goal misspecification, and reward tampering",
                "tool_call_id": "1",
            },
        ]
    )
}

response = generate_answer(input)
response["messages"][-1].pretty_print()

Output:

================================== Ai Message ==================================

Lilian Weng categorizes reward hacking into two types: environment or goal misspecification, and reward tampering. She considers reward hacking as a broad concept that includes both of these categories. Reward hacking occurs when an agent exploits flaws or ambiguities in the reward function to achieve high rewards without performing the intended behaviors.

7. Assemble the graph

Now we assemble all the nodes and edges into a complete graph:

Start with generate_query_or_respond and determine whether we need to call retriever_tool
Route to the next step using tools_condition:
- If generate_query_or_respond returned tool_calls, call retriever_tool to retrieve context
- Otherwise, respond directly to the user
Grade the retrieved document content for relevance to the question (grade_documents) and route to the next step:
- If not relevant, rewrite the question using rewrite_question and then call generate_query_or_respond again
- If relevant, proceed to generate_answer and generate the final response using the ToolMessage with the retrieved document context

from langgraph.graph import StateGraph, START, END
from langgraph.prebuilt import ToolNode, tools_condition

workflow = StateGraph(MessagesState)

# Define the nodes we will cycle between
workflow.add_node(generate_query_or_respond)
workflow.add_node("retrieve", ToolNode([retriever_tool]))
workflow.add_node(rewrite_question)
workflow.add_node(generate_answer)

workflow.add_edge(START, "generate_query_or_respond")

# Decide whether to retrieve
workflow.add_conditional_edges(
    "generate_query_or_respond",
    # Assess LLM decision (call `retriever_tool` tool or respond to the user)
    tools_condition,
    {
        # Translate the condition outputs to nodes in our graph
        "tools": "retrieve",
        END: END,
    },
)

# Edges taken after the `retrieve` node is called.
workflow.add_conditional_edges(
    "retrieve",
    # Assess agent decision
    grade_documents,
)
workflow.add_edge("generate_answer", END)
workflow.add_edge("rewrite_question", "generate_query_or_respond")

# Compile
graph = workflow.compile()
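
The control flow wired up above can be traced in plain Python, with ordinary callables standing in for the nodes. This is a sketch of the routing only, not of how LangGraph executes a graph. run_agentic_rag and its parameters are hypothetical, and the max_rewrites cap is an addition made for this illustration; the graph defined above has no such counter.

```python
def run_agentic_rag(question, decide, retrieve, grade, rewrite, answer, max_rewrites=2):
    """Trace the tutorial's control flow with plain callables standing in for nodes."""
    trace = []
    current = question
    rewrites = 0
    while True:
        trace.append("generate_query_or_respond")
        query = decide(current)  # None means "respond directly, no retrieval"
        if query is None:
            return trace, "direct response"
        trace.append("retrieve")
        context = retrieve(query)
        # Assumption: stop rewriting after max_rewrites and answer with what we have.
        if grade(question, context) or rewrites >= max_rewrites:
            trace.append("generate_answer")
            return trace, answer(question, context)
        trace.append("rewrite_question")
        current = rewrite(current)
        rewrites += 1


# Stub components: the first query misses, the rewritten query hits.
corpus = {"types of reward hacking": "two types: misspecification and reward tampering"}
trace, result = run_agentic_rag(
    question="reward hacking kinds?",
    decide=lambda q: q,
    retrieve=lambda q: corpus.get(q, "meow"),
    grade=lambda q, ctx: ctx != "meow",
    rewrite=lambda q: "types of reward hacking",
    answer=lambda q, ctx: f"Answer based on: {ctx}",
)
print(trace)
print(result)
```

The trace makes the single cycle in the graph visible: rewrite_question is the only path that leads back to generate_query_or_respond.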

Visualize the graph:

from IPython.display import Image, display

display(Image(graph.get_graph().draw_mermaid_png()))

8. Run the agentic RAG

Now let's test the complete graph by running it with a question:

for chunk in graph.stream(
    {
        "messages": [
            {
                "role": "user",
                "content": "What does Lilian Weng say about types of reward hacking?",
            }
        ]
    }
):
    for node, update in chunk.items():
        print("Update from node", node)
        update["messages"][-1].pretty_print()
        print("\n\n")

Output:

Update from node generate_query_or_respond
================================== Ai Message ==================================
Tool Calls:
  retrieve_blog_posts (call_NYu2vq4km9nNNEFqJwefWKu1)
 Call ID: call_NYu2vq4km9nNNEFqJwefWKu1
  Args:
    query: types of reward hacking



Update from node retrieve
================================= Tool Message ==================================
Name: retrieve_blog_posts

(Note: Some work defines reward tampering as a distinct category of misalignment behavior from reward hacking. But I consider reward hacking as a broader concept here.)
At a high level, reward hacking can be categorized into two types: environment or goal misspecification, and reward tampering.

Why does Reward Hacking Exist?#

Pan et al. (2022) investigated reward hacking as a function of agent capabilities, including (1) model size, (2) action space resolution, (3) observation space noise, and (4) training time. They also proposed a taxonomy of three types of misspecified proxy rewards:

Let's Define Reward Hacking#
Reward shaping in RL is challenging. Reward hacking occurs when an RL agent exploits flaws or ambiguities in the reward function to obtain high rewards without genuinely learning the intended behaviors or completing the task as designed. In recent years, several related concepts have been proposed, all referring to some form of reward hacking:



Update from node generate_answer
================================== Ai Message ==================================

Lilian Weng categorizes reward hacking into two types: environment or goal misspecification, and reward tampering. She considers reward hacking as a broad concept that includes both of these categories. Reward hacking occurs when an agent exploits flaws or ambiguities in the reward function to achieve high rewards without performing the intended behaviors.
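
Each streamed chunk in the loop above maps a node name to that node's update. If you only need the path taken and the final message, you can collect them while iterating. The sketch below shows the pattern against a stand-in generator. fake_stream is hypothetical, and its placeholder strings take the place of real message objects.

```python
def fake_stream():
    """Stand-in generator yielding chunks shaped like {node_name: {"messages": [...]}}."""
    yield {"generate_query_or_respond": {"messages": ["<tool call: retrieve_blog_posts>"]}}
    yield {"retrieve": {"messages": ["<retrieved context>"]}}
    yield {"generate_answer": {"messages": ["<final answer>"]}}


visited = []          # node names in the order they produced updates
final_message = None  # last message of the most recent update

for chunk in fake_stream():
    for node, update in chunk.items():
        visited.append(node)
        final_message = update["messages"][-1]

print(visited)
print(final_message)
```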
