openGauss VectorStore

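This notebook covers how to get started with the openGauss VectorStore. openGauss is a high-performance relational database with native vector storage and retrieval capabilities. This integration enables ACID-compliant vector operations within LangChain applications, combining traditional SQL functionality with modern AI-driven similarity search.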

Setup

Launch openGauss Container

docker run --name opengauss \
  -d \
  -e GS_PASSWORD='MyStrongPass@123' \
  -p 8888:5432 \
  opengauss/opengauss-server:latest
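The container may take a while to start accepting connections after `docker run` returns. The following is a minimal sketch, using only the Python standard library, that polls the mapped host port (8888 in the command above) until a TCP connection succeeds. The helper name `wait_for_port` is hypothetical and not part of langchain-opengauss. An open port only shows that something is listening; it does not guarantee that the database has finished initializing.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("localhost", 8888)` could be called before creating the vector store.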

Install langchain-opengauss

pip install langchain-opengauss

System requirements:

openGauss ≥ 7.0.0
Python ≥ 3.8
psycopg2-binary

Credentials

Using your openGauss credentials

Initialization

from langchain_opengauss import OpenGauss, OpenGaussSettings

# Configure with schema validation
config = OpenGaussSettings(
    table_name="test_langchain",
    embedding_dimension=384,
    index_type="HNSW",
    distance_strategy="COSINE",
)
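# `embeddings` must be a LangChain embeddings instance defined beforehand;
# its output size has to match embedding_dimension (384 here).
vector_store = OpenGauss(embedding=embeddings, config=config)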
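The initialization code assumes that an `embeddings` object already exists. For local experiments without a real model, a deterministic stand-in can be sketched as below. `ToyHashEmbeddings` is a hypothetical class, not part of LangChain or langchain-opengauss; it exposes the `embed_documents` and `embed_query` methods of the LangChain embeddings interface, but whether `OpenGauss` accepts a duck-typed object rather than an `Embeddings` subclass is an assumption. The vectors carry no semantic meaning, so search results will be arbitrary.

```python
import hashlib
import math
from typing import List


class ToyHashEmbeddings:
    """Hypothetical stand-in: deterministic pseudo-embeddings with no semantic meaning."""

    def __init__(self, size: int = 384) -> None:
        self.size = size

    def _embed(self, text: str) -> List[float]:
        values: List[float] = []
        counter = 0
        # Hash the text with a running counter until enough bytes are produced.
        while len(values) < self.size:
            digest = hashlib.sha256(f"{counter}:{text}".encode("utf-8")).digest()
            values.extend(b / 255.0 - 0.5 for b in digest)
            counter += 1
        values = values[: self.size]
        # Scale to unit length so cosine and inner-product scores are comparable.
        norm = math.sqrt(sum(v * v for v in values)) or 1.0
        return [v / norm for v in values]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [self._embed(t) for t in texts]

    def embed_query(self, text: str) -> List[float]:
        return self._embed(text)


# 384 matches embedding_dimension in the configuration above.
embeddings = ToyHashEmbeddings(size=384)
```

For anything beyond a smoke test, a real embedding model with a matching output dimension should be used instead.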

Manage vector store

Add items to vector store

from langchain_core.documents import Document

document_1 = Document(page_content="foo", metadata={"source": "https://example.com"})

document_2 = Document(page_content="bar", metadata={"source": "https://example.com"})

document_3 = Document(page_content="baz", metadata={"source": "https://example.com"})

documents = [document_1, document_2, document_3]

vector_store.add_documents(documents=documents, ids=["1", "2", "3"])

Update items in vector store

updated_document = Document(
    page_content="qux", metadata={"source": "https://another-example.com"}
)

# If the ID already exists, the stored document is updated
vector_store.add_documents(documents=[updated_document], ids=["1"])

Delete items from vector store

vector_store.delete(ids=["3"])

Query vector store

Once your vector store has been created and the relevant documents have been added, you will most likely want to query it while running your chain or agent.

Query directly

A simple similarity search can be performed as follows:

results = vector_store.similarity_search(
    query="thud", k=1, filter={"source": "https://another-example.com"}
)
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")

To run a similarity search and receive the corresponding scores, use:

results = vector_store.similarity_search_with_score(
    query="thud", k=1, filter={"source": "https://example.com"}
)
for doc, score in results:
    print(f"* [SIM={score:.3f}] {doc.page_content} [{doc.metadata}]")

Query by turning into retriever

You can also transform the vector store into a retriever for easier use in your chains.

retriever = vector_store.as_retriever(search_type="mmr", search_kwargs={"k": 1})
retriever.invoke("thud")
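The retriever above uses `search_type="mmr"`, that is, maximal marginal relevance: each result is chosen to be relevant to the query while differing from results already selected. The following is a pure-Python sketch of the general idea, not the implementation used by LangChain or langchain-opengauss. The names `mmr_select` and `lambda_mult` and the use of cosine similarity are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(
    query: Sequence[float],
    candidates: Sequence[Sequence[float]],
    k: int = 1,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Return indices of up to k candidates chosen greedily by MMR.

    lambda_mult=1.0 ranks by relevance only; lower values favour diversity.
    """
    selected: List[int] = []
    remaining = list(range(len(candidates)))

    def score(i: int) -> float:
        relevance = cosine_similarity(query, candidates[i])
        # Similarity to the closest already-selected item (0.0 if none yet).
        redundancy = max(
            (cosine_similarity(candidates[i], candidates[j]) for j in selected),
            default=0.0,
        )
        return lambda_mult * relevance - (1.0 - lambda_mult) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With `k=1`, as in the retriever call, the first pick is simply the most relevant candidate; the diversity term only matters from the second result onward.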

Usage for retrieval-augmented generation

For guides on how to use this vector store for retrieval-augmented generation (RAG), see the RAG tutorials and how-to guides in the LangChain documentation.

Configuration

Connection Settings

Parameter	Default	Description
`host`	localhost	Database server address
`port`	8888	Database connection port
`user`	gaussdb	Database username
`password`	-	Complex password string
`database`	postgres	Default database name
`min_connections`	1	Minimum connection pool size
`max_connections`	5	Maximum connection pool size
`table_name`	langchain_docs	Name of the table that stores vector data and metadata
`index_type`	IndexType.HNSW	Vector index algorithm type. Options: HNSW or IVFFLAT. Default is HNSW.
`vector_type`	VectorType.vector	Type of vector representation to use. Default is Vector.
`distance_strategy`	DistanceStrategy.COSINE	Vector similarity metric to use for retrieval. Options: euclidean (L2 distance), cosine (angular distance, ideal for text embeddings), manhattan (L1 distance for sparse data), negative_inner_product (dot product for normalized vectors). Default is cosine.
`embedding_dimension`	1536	Dimensionality of the vector embeddings.
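To make the four `distance_strategy` options concrete, the sketch below computes each metric for a pair of vectors in plain Python. It assumes the usual textbook definitions (cosine distance as one minus cosine similarity, negative inner product as the negated dot product); the exact values returned by openGauss operators are not confirmed here. All function names are illustrative.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """L2 distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a: Sequence[float], b: Sequence[float]) -> float:
    """L1 distance."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; depends only on the angle between the vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def negative_inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Negated dot product, so that smaller still means more similar."""
    return -sum(x * y for x, y in zip(a, b))
```

For unit-length vectors, cosine distance and negative inner product produce the same ranking, which is why the inner product is described as suited to normalized vectors.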

Supported Combinations

Vector Type	Dimensions	Index Types	Supported Distance Strategies
vector	≤2000	HNSW/IVFFLAT	COSINE/EUCLIDEAN/MANHATTAN/INNER_PROD
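The table above can be turned into a small pre-flight check that runs before a configuration is sent to the database. This is a sketch under the assumption that the table is complete; `SUPPORTED_COMBINATIONS` and `check_combination` are hypothetical names, and langchain-opengauss may perform its own validation.

```python
from typing import List

# Transcribed from the supported-combinations table.
SUPPORTED_COMBINATIONS = {
    "vector": {
        "max_dimension": 2000,
        "index_types": {"HNSW", "IVFFLAT"},
        "distance_strategies": {"COSINE", "EUCLIDEAN", "MANHATTAN", "INNER_PROD"},
    }
}


def check_combination(
    vector_type: str, dimension: int, index_type: str, distance_strategy: str
) -> List[str]:
    """Return a list of human-readable problems; an empty list means supported."""
    rules = SUPPORTED_COMBINATIONS.get(vector_type)
    if rules is None:
        return [f"unsupported vector type: {vector_type}"]
    problems: List[str] = []
    if not 1 <= dimension <= rules["max_dimension"]:
        problems.append(
            f"dimension {dimension} is outside 1..{rules['max_dimension']}"
        )
    if index_type.upper() not in rules["index_types"]:
        problems.append(f"unsupported index type: {index_type}")
    if distance_strategy.upper() not in rules["distance_strategies"]:
        problems.append(f"unsupported distance strategy: {distance_strategy}")
    return problems
```

For instance, a 3072-dimensional embedding would be flagged, because the `vector` type is limited to 2000 dimensions.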

Performance Optimization

Index Tuning Guidelines

HNSW Parameters:

m: 16-100 (balance between recall and memory)
ef_construction: 64-1000 (must be greater than 2*m)
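The HNSW guidelines above can be expressed as a small validation helper. The ranges and the `ef_construction > 2*m` rule are taken directly from the list; the function name `validate_hnsw_params` is hypothetical, and how these parameters are passed to langchain-opengauss is not shown here.

```python
def validate_hnsw_params(m: int, ef_construction: int) -> None:
    """Raise ValueError if the pair falls outside the guideline ranges."""
    if not 16 <= m <= 100:
        raise ValueError(f"m={m} is outside the recommended range 16-100")
    if not 64 <= ef_construction <= 1000:
        raise ValueError(
            f"ef_construction={ef_construction} is outside the recommended range 64-1000"
        )
    if ef_construction <= 2 * m:
        raise ValueError(
            f"ef_construction={ef_construction} must be greater than 2*m={2 * m}"
        )
```

Note that the last rule tightens the lower bound for larger `m`: with `m=100`, `ef_construction` has to be at least 201.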

IVFFLAT recommendations:

import math

lists = min(
    int(math.sqrt(total_rows)) if total_rows > 1e6 else int(total_rows / 1000),
    2000,  # openGauss maximum
)
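The expression above leaves `total_rows` undefined and yields 0 for tables with fewer than 1000 rows. A self-contained variant can be sketched as a function; the name `recommended_ivfflat_lists` and the lower clamp to 1 are assumptions added here, while the two branches and the cap of 2000 follow the original snippet.

```python
import math


def recommended_ivfflat_lists(total_rows: int, max_lists: int = 2000) -> int:
    """Suggest an IVFFLAT `lists` value from the table's row count."""
    if total_rows > 1_000_000:
        # Large tables: square root of the row count.
        lists = math.isqrt(total_rows)
    else:
        # Up to one million rows: one list per thousand rows.
        lists = total_rows // 1000
    # Clamp to [1, max_lists]; the lower bound of 1 is an added assumption.
    return max(1, min(lists, max_lists))
```

Under these rules, 200,000 rows give 200 lists, and any table with four million rows or more hits the cap of 2000.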

Connection Pooling

OpenGaussSettings(min_connections=3, max_connections=20)

Limitations

The bit and sparsevec vector types are currently under development
Maximum vector dimension: 2000 for the vector type

API reference

For detailed documentation of all OpenGauss VectorStore features and configurations, head to the API reference: python.langchain.com/api_reference/en/latest/vectorstores/opengauss.OpenGuass.html
