This notebook shows how to use an agent to interact with a Pandas DataFrame. It is mostly optimized for question answering.

NOTE: this agent calls the Python agent under the hood, which executes LLM-generated Python code. That code can be harmful, so use it cautiously.

NOTE: Since langchain migrated to v0.3, you should upgrade langchain_openai and langchain. This helps avoid import errors.

pip install -U langchain_openai
pip install -U langchain

from langchain.agents.agent_types import AgentType
from langchain_experimental.agents.agent_toolkits import create_pandas_dataframe_agent
from langchain_openai import ChatOpenAI
import pandas as pd
from langchain_openai import OpenAI

df = pd.read_csv(
    "https://raw.githubusercontent.com/pandas-dev/pandas/main/doc/data/titanic.csv"
)

Using ZERO_SHOT_REACT_DESCRIPTION

This shows how to initialize the agent using the ZERO_SHOT_REACT_DESCRIPTION agent type.
agent = create_pandas_dataframe_agent(OpenAI(temperature=0), df, verbose=True)

Using OpenAI Functions

This shows how to initialize the agent using the OPENAI_FUNCTIONS agent type. It is an alternative to the approach above.
agent = create_pandas_dataframe_agent(
    ChatOpenAI(temperature=0, model="gpt-3.5-turbo-0613"),
    df,
    verbose=True,
    agent_type=AgentType.OPENAI_FUNCTIONS,
)
agent.invoke("how many rows are there?")
> Entering new  chain...

Invoking: `python_repl_ast` with `df.shape[0]`


891There are 891 rows in the dataframe.

> Finished chain.
'There are 891 rows in the dataframe.'
agent.invoke("how many people have more than 3 siblings")
> Entering new AgentExecutor chain...
Thought: I need to count the number of people with more than 3 siblings
Action: python_repl_ast
Action Input: df[df['SibSp'] > 3].shape[0]
Observation: 30
Thought: I now know the final answer
Final Answer: 30 people have more than 3 siblings.

> Finished chain.
'30 people have more than 3 siblings.'
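
The expression the agent ran in the trace above, df[df['SibSp'] > 3].shape[0], is an ordinary boolean-mask filter followed by a row count. A minimal sketch of that pattern on a small made-up frame (the values and the name `toy` are illustrative assumptions, not Titanic data):

```python
import pandas as pd

# Hypothetical toy data; only the column name SibSp mirrors the Titanic CSV.
toy = pd.DataFrame({"SibSp": [0, 1, 4, 5, 3, 8]})

# Boolean mask: True for rows where SibSp is strictly greater than 3.
mask = toy["SibSp"] > 3

# Filter, then count rows: the same shape of expression the agent generated.
count_via_filter = toy[mask].shape[0]

# Equivalent count: summing the mask counts its True values.
count_via_sum = int(mask.sum())

print(count_via_filter, count_via_sum)
```

Note that in the Titanic data SibSp counts siblings and spouses aboard, so the agent's phrasing "siblings" is a slight simplification of what the column measures.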
agent.invoke("whats the square root of the average age?")
> Entering new AgentExecutor chain...
Thought: I need to calculate the average age first
Action: python_repl_ast
Action Input: df['Age'].mean()
Observation: 29.69911764705882
Thought: I now need to calculate the square root of the average age
Action: python_repl_ast
Action Input: math.sqrt(df['Age'].mean())
Observation: NameError("name 'math' is not defined")
Thought: I need to import the math library
Action: python_repl_ast
Action Input: import math
Observation:
Thought: I now need to calculate the square root of the average age
Action: python_repl_ast
Action Input: math.sqrt(df['Age'].mean())
Observation: 5.449689683556195
Thought: I now know the final answer
Final Answer: The square root of the average age is 5.449689683556195.

> Finished chain.
'The square root of the average age is 5.449689683556195.'
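
The trace above shows the agent recovering from a NameError by importing math and retrying. The same calculation can be sketched directly in pandas. The frame below is made-up illustrative data (the name `toy` is an assumption), and the last lines only re-check the arithmetic on the mean reported in the trace:

```python
import math

import pandas as pd

# Hypothetical toy data with one missing age, as in the real Age column.
toy = pd.DataFrame({"Age": [16.0, None, 25.0, 34.0]})

# Series.mean() skips NaN by default, so this averages 16, 25 and 34.
mean_age = toy["Age"].mean()

# Square root via the math module, as in the agent's final step.
root = math.sqrt(mean_age)

# Same result without any import, using the power operator.
root_no_import = mean_age ** 0.5

# Re-check the arithmetic on the mean reported in the trace.
reported_root = math.sqrt(29.69911764705882)

print(mean_age, root, root_no_import, reported_root)
```

Using `** 0.5` avoids the missing-import round trip seen in the trace, although the agent is free to choose either form.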

Multi DataFrame Example

The next part shows how the agent can interact with multiple dataframes passed in as a list.
df1 = df.copy()
df1["Age"] = df1["Age"].fillna(df1["Age"].mean())
agent = create_pandas_dataframe_agent(OpenAI(temperature=0), [df, df1], verbose=True)
agent.invoke("how many rows in the age column are different?")
> Entering new AgentExecutor chain...
Thought: I need to compare the age columns in both dataframes
Action: python_repl_ast
Action Input: len(df1[df1['Age'] != df2['Age']])
Observation: 177
Thought: I now know the final answer
Final Answer: 177 rows in the age column are different.

> Finished chain.
'177 rows in the age column are different.'
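
In the trace above, df1 and df2 appear to be the agent's own names for the first and second items of the list, not the notebook variables df and df1. The count is driven by missing values: a NaN compares as not equal to everything, so each row whose Age was missing before fillna is reported as different. A minimal sketch on made-up data (the names `original` and `filled` are assumptions for illustration):

```python
import pandas as pd

# Hypothetical toy data: two of the five ages are missing.
original = pd.DataFrame({"Age": [22.0, None, 30.0, None, 38.0]})

# Same preparation as in the notebook: fill missing ages with the mean.
filled = original.copy()
filled["Age"] = filled["Age"].fillna(filled["Age"].mean())

# Element-wise comparison; NaN != value evaluates to True.
differs = original["Age"] != filled["Age"]
n_different = int(differs.sum())

# The number of differing rows equals the number of originally missing ages.
n_missing = int(original["Age"].isna().sum())

# Caveat: NaN != NaN is also True, so a column with NaN "differs" from itself.
n_self_different = int((original["Age"] != original["Age"]).sum())

print(n_different, n_missing, n_self_different)
```

The reported 177 is consistent with the number of missing Age values in the Titanic CSV. Because of the NaN caveat, a comparison like this counts rows that were missing rather than rows whose known value changed.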
