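This guide gives a quick overview of getting started with WRITER's text splitter. WRITER's context-aware splitting endpoint provides intelligent text splitting for long documents (up to 4000 words). Unlike simple character-based splitting, it preserves semantic meaning and context across chunks, which makes it well suited to processing long-form content while keeping it coherent. The langchain-writer package exposes WRITER's context-aware splitting endpoint as a LangChain text splitter.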
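Because the endpoint accepts documents of up to 4000 words, longer input may need to be grouped into smaller batches before it is sent. The sketch below is one possible pre-processing step, not part of langchain-writer: the helper name `batch_by_word_limit` is hypothetical, and counting words by whitespace is an assumption that may not match how the endpoint counts them.

```python
from typing import List


def batch_by_word_limit(text: str, max_words: int = 4000) -> List[str]:
    """Group paragraphs into batches of at most `max_words` words.

    Assumption: words are counted by splitting on whitespace.
    A single paragraph longer than `max_words` is kept whole, so it
    still needs separate handling.
    """
    batches: List[str] = []
    current: List[str] = []
    current_words = 0
    for paragraph in text.split("\n\n"):
        n_words = len(paragraph.split())
        if n_words == 0:
            continue  # skip empty paragraphs
        # Start a new batch when this paragraph would exceed the limit.
        if current and current_words + n_words > max_words:
            batches.append("\n\n".join(current))
            current, current_words = [], 0
        current.append(paragraph)
        current_words += n_words
    if current:
        batches.append("\n\n".join(current))
    return batches
```

Each returned batch can then be passed to the splitter separately.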

Overview

Integration details

Class | Package | Local | Serializable | JS support | Downloads | Version
WriterTextSplitter | langchain-writer | | | | PyPI - Downloads | PyPI - Version

Setup

WriterTextSplitter is available in the langchain-writer package:
pip install --quiet -U langchain-writer
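After installing, it can be useful to confirm which version of the package is present. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "langchain-writer") -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print(installed_version())
```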

Credentials

Sign up for WRITER AI Studio and generate an API key (you can follow this Quickstart). Then set the WRITER_API_KEY environment variable:
import getpass
import os

if not os.getenv("WRITER_API_KEY"):
    os.environ["WRITER_API_KEY"] = getpass.getpass("Enter your WRITER API key: ")
Setting up LangSmith for observability is also helpful, though not required. To do so, set the LANGSMITH_TRACING and LANGSMITH_API_KEY environment variables:
os.environ["LANGSMITH_TRACING"] = "true"
# os.environ["LANGSMITH_API_KEY"] = getpass.getpass()

Instantiation

Create a WriterTextSplitter instance with the strategy parameter set to one of the following:
  • llm_split: uses a language model for precise semantic splitting
  • fast_split: uses a heuristic-based approach for quick splitting
  • hybrid_split: combines both approaches
from langchain_writer.text_splitter import WriterTextSplitter

splitter = WriterTextSplitter(strategy="fast_split")
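If the strategy comes from configuration rather than a literal, it may be worth checking it against the three names listed above before constructing the splitter. This is only a sketch: `resolve_strategy` and the `SPLIT_STRATEGY` environment variable are hypothetical names, and falling back to fast_split is an arbitrary choice made for illustration.

```python
from typing import Optional

# The three strategy names listed in this guide.
VALID_STRATEGIES = ("llm_split", "fast_split", "hybrid_split")


def resolve_strategy(name: Optional[str], default: str = "fast_split") -> str:
    """Normalize a strategy name and reject anything not listed above."""
    if name is None or not name.strip():
        return default
    candidate = name.strip().lower()
    if candidate not in VALID_STRATEGIES:
        raise ValueError(
            f"Unknown strategy {name!r}; expected one of {VALID_STRATEGIES}"
        )
    return candidate


# Hypothetical usage with an environment variable of your own choosing:
# import os
# splitter = WriterTextSplitter(strategy=resolve_strategy(os.getenv("SPLIT_STRATEGY")))
```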

Usage

WriterTextSplitter can be used synchronously or asynchronously.

Synchronous usage

To use WriterTextSplitter synchronously, call the split_text method with the text you want to split:
text = """Reeeeeeeeeeeeeeeeeeeeeaally long text you want to divide into smaller chunks. For example you can add a poem multiple times:
Two roads diverged in a yellow wood,
And sorry I could not travel both
And be one traveler, long I stood
And looked down one as far as I could
To where it bent in the undergrowth;

Then took the other, as just as fair,
And having perhaps the better claim,
Because it was grassy and wanted wear;
Though as for that the passing there
Had worn them really about the same,

And both that morning equally lay
In leaves no step had trodden black.
Oh, I kept the first for another day!
Yet knowing how way leads on to way,
I doubted if I should ever come back.

I shall be telling this with a sigh
Somewhere ages and ages hence:
Two roads diverged in a wood, and I—
I took the one less traveled by,
And that has made all the difference.

Two roads diverged in a yellow wood,
And sorry I could not travel both
And be one traveler, long I stood
And looked down one as far as I could
To where it bent in the undergrowth;

Then took the other, as just as fair,
And having perhaps the better claim,
Because it was grassy and wanted wear;
Though as for that the passing there
Had worn them really about the same,

And both that morning equally lay
In leaves no step had trodden black.
Oh, I kept the first for another day!
Yet knowing how way leads on to way,
I doubted if I should ever come back.

I shall be telling this with a sigh
Somewhere ages and ages hence:
Two roads diverged in a wood, and I—
I took the one less traveled by,
And that has made all the difference.

Two roads diverged in a yellow wood,
And sorry I could not travel both
And be one traveler, long I stood
And looked down one as far as I could
To where it bent in the undergrowth;

Then took the other, as just as fair,
And having perhaps the better claim,
Because it was grassy and wanted wear;
Though as for that the passing there
Had worn them really about the same,

And both that morning equally lay
In leaves no step had trodden black.
Oh, I kept the first for another day!
Yet knowing how way leads on to way,
I doubted if I should ever come back.

I shall be telling this with a sigh
Somewhere ages and ages hence:
Two roads diverged in a wood, and I—
I took the one less traveled by,
And that has made all the difference.
"""

chunks = splitter.split_text(text)
chunks
To see how many chunks were created, print the length of the chunk list:
print(len(chunks))
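Beyond the count, a quick look at each chunk's size and opening line can help judge where the boundaries fell. The helper below is an illustrative sketch that works on any list of strings, such as the list returned by split_text; the name `summarize_chunks` is hypothetical.

```python
from typing import Dict, List, Union


def summarize_chunks(
    chunks: List[str], preview_chars: int = 40
) -> List[Dict[str, Union[int, str]]]:
    """Return the index, word count, and first-line preview of each chunk."""
    summary: List[Dict[str, Union[int, str]]] = []
    for index, chunk in enumerate(chunks):
        stripped = chunk.strip()
        first_line = stripped.splitlines()[0] if stripped else ""
        summary.append(
            {
                "index": index,
                "words": len(chunk.split()),  # whitespace-separated words
                "preview": first_line[:preview_chars],
            }
        )
    return summary


# Usage with the list produced above:
# for row in summarize_chunks(chunks):
#     print(row)
```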

Asynchronous usage

To use WriterTextSplitter asynchronously, call the asplit_text method with the text you want to split:
async_chunks = await splitter.asplit_text(text)
async_chunks
To see how many chunks were created, print the length of the chunk list:
print(len(async_chunks))
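The bare `await` above works in a notebook or another running event loop; in a regular script the call has to be driven by `asyncio.run`. The sketch below also shows one way to split several texts concurrently. It assumes only that the splitter object has the asplit_text coroutine method used above; `split_many` is a hypothetical helper, and concurrent requests may be subject to rate limits on the service side.

```python
import asyncio
from typing import Any, List


async def split_many(splitter: Any, texts: List[str]) -> List[List[str]]:
    """Split several texts concurrently via `splitter.asplit_text`.

    Results are returned in the same order as `texts`.
    """
    results = await asyncio.gather(*(splitter.asplit_text(t) for t in texts))
    return list(results)


# Usage in a script, where top-level await is not available:
# all_chunks = asyncio.run(split_many(splitter, [text, text]))
# print([len(chunks) for chunks in all_chunks])
```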

API reference

For detailed documentation of all WriterTextSplitter features and configuration options, see the API reference.

Additional resources

Information about WRITER's models (including costs, context windows, and supported input types) and tools is available in the WRITER docs.