YDB

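YDB is a versatile open-source distributed SQL database that combines high availability and scalability with strong consistency and ACID transactions. It can handle transactional (OLTP), analytical (OLAP), and streaming workloads simultaneously.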

This notebook shows how to use functionality related to the YDB vector store.

Setup

First, set up a local YDB instance with Docker:

! docker run -d -p 2136:2136 --name ydb-langchain -e YDB_USE_IN_MEMORY_PDISKS=true -h localhost ydbplatform/local-ydb:trunk
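The container can take a few seconds to start. As a rough sketch (not part of the YDB tooling), the standard-library check below tests whether the port published by the command above accepts TCP connections. The helper name `port_is_open` is made up for this example, and an open port does not guarantee that the database is fully initialized.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 2136 is the port published by the docker command above.
    print("YDB port reachable:", port_is_open("localhost", 2136))
```

If this prints `False`, wait a moment and run it again before continuing.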

To use this integration, install the langchain-ydb package:

! pip install -qU langchain-ydb

Credentials

No credentials are needed for this notebook; just make sure you have installed the packages shown above. If you want automated tracing of your model calls, you can also set your LangSmith API key by uncommenting the lines below:

import getpass
import os
# os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
# os.environ["LANGSMITH_TRACING"] = "true"

Initialization

from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-large")

from langchain_ydb.vectorstores import YDB, YDBSearchStrategy, YDBSettings

settings = YDBSettings(
    table="ydb_example",
    strategy=YDBSearchStrategy.COSINE_SIMILARITY,
)
vector_store = YDB(embeddings, config=settings)
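The settings above select a cosine similarity strategy. For intuition only, here is a minimal pure-Python sketch of cosine similarity between two vectors. It illustrates the general formula and is not how langchain-ydb or YDB computes it internally; the function name is an assumption of this example.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # vectors pointing the same way
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors
```

Values close to 1 mean the two embeddings point in nearly the same direction, which is why higher scores indicate closer matches under this strategy.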

Manage vector store

Once you have created your vector store, you can interact with it by adding and deleting items.

Add items to vector store

Prepare the documents to work with:

from uuid import uuid4

from langchain_core.documents import Document

document_1 = Document(
    page_content="I had chocolate chip pancakes and scrambled eggs for breakfast this morning.",
    metadata={"source": "tweet"},
)

document_2 = Document(
    page_content="The weather forecast for tomorrow is cloudy and overcast, with a high of 62 degrees.",
    metadata={"source": "news"},
)

document_3 = Document(
    page_content="Building an exciting new project with LangChain - come check it out!",
    metadata={"source": "tweet"},
)

document_4 = Document(
    page_content="Robbers broke into the city bank and stole $1 million in cash.",
    metadata={"source": "news"},
)

document_5 = Document(
    page_content="Wow! That was an amazing movie. I can't wait to see it again.",
    metadata={"source": "tweet"},
)

document_6 = Document(
    page_content="Is the new iPhone worth the price? Read this review to find out.",
    metadata={"source": "website"},
)

document_7 = Document(
    page_content="The top 10 soccer players in the world right now.",
    metadata={"source": "website"},
)

document_8 = Document(
    page_content="LangGraph is the best framework for building stateful, agentic applications!",
    metadata={"source": "tweet"},
)

document_9 = Document(
    page_content="The stock market is down 500 points today due to fears of a recession.",
    metadata={"source": "news"},
)

document_10 = Document(
    page_content="I have a bad feeling I am going to get deleted :(",
    metadata={"source": "tweet"},
)

documents = [
    document_1,
    document_2,
    document_3,
    document_4,
    document_5,
    document_6,
    document_7,
    document_8,
    document_9,
    document_10,
]
uuids = [str(uuid4()) for _ in range(len(documents))]

You can add items to your vector store with the add_documents method.

vector_store.add_documents(documents=documents, ids=uuids)

Inserting data...: 100%|██████████| 10/10 [00:00<00:00, 14.67it/s]

['947be6aa-d489-44c5-910e-62e4d58d2ffb',
 '7a62904d-9db3-412b-83b6-f01b34dd7de3',
 'e5a49c64-c985-4ed7-ac58-5ffa31ade699',
 '99cf4104-36ab-4bd5-b0da-e210d260e512',
 '5810bcd0-b46e-443e-a663-e888c9e028d1',
 '190c193d-844e-4dbb-9a4b-b8f5f16cfae6',
 'f8912944-f80a-4178-954e-4595bf59e341',
 '34fc7b09-6000-42c9-95f7-7d49f430b904',
 '0f6b6783-f300-4a4d-bb04-8025c4dfd409',
 '46c37ba9-7cf2-4ac8-9bd1-d84e2cb1155c']
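The example above uses random IDs from `uuid4`, so every run produces new ones. If you would rather get the same ID for the same document across runs, one option is to derive it from the content with `uuid5`. The sketch below is an assumption of this guide, not something langchain-ydb requires: the `content_id` helper and its `source:content` key format are hypothetical, and whether re-adding an existing ID overwrites the stored row is something to verify before relying on it.

```python
from uuid import NAMESPACE_URL, uuid5


def content_id(page_content: str, source: str) -> str:
    """Derive a stable UUID string from a document's text and source."""
    # The "source:content" key format is an arbitrary choice for this sketch.
    return str(uuid5(NAMESPACE_URL, f"{source}:{page_content}"))


first = content_id("The top 10 soccer players in the world right now.", "website")
second = content_id("The top 10 soccer players in the world right now.", "website")
print(first == second)  # prints True: identical input gives an identical ID
```

The resulting strings can be passed as the `ids` argument in place of the random `uuids` list.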

Delete items from vector store

You can delete items from your vector store by ID with the delete method.

vector_store.delete(ids=[uuids[-1]])

True

Query vector store

Once your vector store has been created and the relevant documents have been added, you will most likely want to query it while running your chain or agent.

Query directly

Similarity search

A simple similarity search can be performed as follows:

results = vector_store.similarity_search(
    "LangChain provides abstractions to make working with LLMs easy", k=2
)
for res in results:
    print(f"* {res.page_content} [{res.metadata}]")

* Building an exciting new project with LangChain - come check it out! [{'source': 'tweet'}]
* LangGraph is the best framework for building stateful, agentic applications! [{'source': 'tweet'}]

Similarity search with score

You can also run a search that returns a score with each result:

results = vector_store.similarity_search_with_score("Will it be hot tomorrow?", k=3)
for res, score in results:
    print(f"* [SIM={score:.3f}] {res.page_content} [{res.metadata}]")

* [SIM=0.595] The weather forecast for tomorrow is cloudy and overcast, with a high of 62 degrees. [{'source': 'news'}]
* [SIM=0.212] I had chocolate chip pancakes and scrambled eggs for breakfast this morning. [{'source': 'tweet'}]
* [SIM=0.118] Wow! That was an amazing movie. I can't wait to see it again. [{'source': 'tweet'}]
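In the output above, only the first result is a strong match; the other two are returned simply because `k=3` was requested. A common follow-up is to drop weak matches with a score cutoff. The sketch below assumes higher scores mean closer matches, as with the cosine similarity strategy used here. The `keep_above` helper and the 0.5 threshold are hypothetical choices for illustration, and strings stand in for the Document objects.

```python
from typing import List, Tuple, TypeVar

T = TypeVar("T")


def keep_above(scored: List[Tuple[T, float]], min_score: float) -> List[Tuple[T, float]]:
    """Keep (item, score) pairs whose score is at least min_score."""
    return [(item, score) for item, score in scored if score >= min_score]


# Scores copied from the output above; strings stand in for Document objects.
scored = [
    ("weather forecast", 0.595),
    ("pancakes", 0.212),
    ("movie", 0.118),
]
print(keep_above(scored, min_score=0.5))
```

With a distance-based strategy the comparison would likely need to be reversed, since smaller distances would then indicate closer matches.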

Filtering

You can restrict a search with a metadata filter, as shown below:

results = vector_store.similarity_search_with_score(
    "What did I eat for breakfast?",
    k=4,
    filter={"source": "tweet"},
)
for res, _ in results:
    print(f"* {res.page_content} [{res.metadata}]")

* I had chocolate chip pancakes and scrambled eggs for breakfast this morning. [{'source': 'tweet'}]
* Wow! That was an amazing movie. I can't wait to see it again. [{'source': 'tweet'}]
* Building an exciting new project with LangChain - come check it out! [{'source': 'tweet'}]
* LangGraph is the best framework for building stateful, agentic applications! [{'source': 'tweet'}]
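The results above are consistent with an exact-match filter: only documents whose `source` metadata equals `"tweet"` come back. As an illustration of that semantics only (how the filter is evaluated inside YDB is not shown in this guide), here is a pure-Python equivalent. The `matches` helper is hypothetical, and plain dicts stand in for Document objects.

```python
from typing import Any, Dict, List


def matches(metadata: Dict[str, Any], flt: Dict[str, Any]) -> bool:
    """True when every key in flt is present in metadata with an equal value."""
    return all(key in metadata and metadata[key] == value for key, value in flt.items())


# Plain dicts stand in for Document objects in this sketch.
docs: List[Dict[str, Any]] = [
    {"text": "pancakes for breakfast", "metadata": {"source": "tweet"}},
    {"text": "weather forecast", "metadata": {"source": "news"}},
    {"text": "iPhone review", "metadata": {"source": "website"}},
]
tweets = [d for d in docs if matches(d["metadata"], {"source": "tweet"})]
print([d["text"] for d in tweets])
```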

Query by turning into retriever

You can also transform the vector store into a retriever for easier use in your chains. The example below converts the vector store into a retriever, then invokes it with a simple query and a filter.

retriever = vector_store.as_retriever(
    search_kwargs={"k": 2},
)
results = retriever.invoke(
    "Stealing from the bank is a crime", filter={"source": "news"}
)
for res in results:
    print(f"* {res.page_content} [{res.metadata}]")

* Robbers broke into the city bank and stole $1 million in cash. [{'source': 'news'}]
* The stock market is down 500 points today due to fears of a recession. [{'source': 'news'}]

Usage for retrieval-augmented generation

For guides on using this vector store for retrieval-augmented generation (RAG), see the RAG guides in the LangChain documentation.
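As a rough idea of what the "augmented" step looks like, the sketch below joins the text of retrieved documents into a context block and appends the user's question. It is a simplified illustration, not the approach prescribed by LangChain: the `Doc` class is a hypothetical stand-in for a retrieved Document, and the `build_prompt` helper and prompt wording are assumptions of this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved Document."""

    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


def build_prompt(question: str, docs: List[Doc]) -> str:
    """Join retrieved texts into a context block and append the question."""
    context = "\n\n".join(doc.page_content for doc in docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


retrieved = [
    Doc("Robbers broke into the city bank and stole $1 million in cash.", {"source": "news"}),
]
print(build_prompt("How much was stolen from the bank?", retrieved))
```

In a real pipeline, the list of documents would come from the retriever, and the resulting string would be sent to a chat model.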

API reference

For detailed documentation of all YDB features and configurations, see the API reference: python.langchain.com/api_reference/community/vectorstores/langchain_community.vectorstores.ydb.YDB.html
