---
title: Bagel
---
> [Bagel](https://www.bagel.net/) (`Open Inference platform for AI`) is like GitHub for AI data.
> It is a collaborative platform where users can create, share, and manage inference datasets.
> It can support private projects for independent developers, internal collaboration for enterprises, and public contributions for data DAOs.
### Installation and Setup
```bash
pip install bagelML langchain-community
```
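
By default, the vector store talks to the hosted Bagel service over REST. If you need to point the client elsewhere, you can pass a `client_settings` object when constructing the store. A minimal sketch, assuming the `bagel` client's `Settings` class and its REST settings fields (verify the exact field names against the client version you installed):

```python
from bagel.config import Settings
from langchain_community.vectorstores import Bagel

# Assumed REST settings for the hosted Bagel API; swap the host for a
# self-hosted server if needed.
client_settings = Settings(
    bagel_api_impl="rest",
    bagel_server_host="api.bageldb.ai",
)
cluster = Bagel(cluster_name="testing", client_settings=client_settings)
```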
### Create a VectorStore from texts

```python
from langchain_community.vectorstores import Bagel
texts = ["hello bagel", "hello langchain", "I love salad", "my car", "a dog"]
# create cluster and add texts
cluster = Bagel.from_texts(cluster_name="testing", texts=texts)
```
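
You can also append more texts to an existing cluster. A minimal sketch using the standard `add_texts` method of LangChain vector stores, which returns the ids of the inserted entries:

```python
# Add two more texts to the same cluster; the returned list contains their ids.
new_ids = cluster.add_texts(texts=["another bagel", "fresh coffee"])
```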
```python
# similarity search
cluster.similarity_search("bagel", k=3)
```

```
[Document(page_content='hello bagel', metadata={}),
Document(page_content='my car', metadata={}),
Document(page_content='I love salad', metadata={})]
```

```python
# the score is a distance metric, so lower is better
cluster.similarity_search_with_score("bagel", k=3)
```

```
[(Document(page_content='hello bagel', metadata={}), 0.27392977476119995),
(Document(page_content='my car', metadata={}), 1.4783176183700562),
(Document(page_content='I love salad', metadata={}), 1.5342965126037598)]
```
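
Because the score is a distance, you can rank or threshold results yourself by keeping the pairs with the smallest values. A small sketch building on the call above:

```python
# Each result is a (Document, distance) pair; the list is already ordered
# from closest to farthest, so the first entry is the best match.
results = cluster.similarity_search_with_score("bagel", k=3)
best_doc, best_distance = results[0]
print(best_doc.page_content, best_distance)
```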
```python
# delete the cluster
cluster.delete_cluster()
```

### Create a VectorStore from documents

```python
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import CharacterTextSplitter
loader = TextLoader("../../how_to/state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)[:10]
```

```python
# create cluster with docs
cluster = Bagel.from_documents(cluster_name="testing_with_docs", documents=docs)
```

```python
# similarity search
query = "What did the president say about Ketanji Brown Jackson"
docs = cluster.similarity_search(query)
print(docs[0].page_content[:102])
```

```
Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the
```
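
The cluster can also be exposed as a retriever for use in chains, via the standard `as_retriever` interface shared by LangChain vector stores. A minimal sketch:

```python
# Wrap the cluster as a retriever that returns the top 2 matches per query.
retriever = cluster.as_retriever(search_kwargs={"k": 2})
docs = retriever.invoke("What did the president say about Ketanji Brown Jackson")
```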
### Get all texts/documents from the cluster

```python
texts = ["hello bagel", "this is langchain"]
cluster = Bagel.from_texts(cluster_name="testing", texts=texts)
cluster_data = cluster.get()
```

```python
# all keys
cluster_data.keys()
```

```
dict_keys(['ids', 'embeddings', 'metadatas', 'documents'])
```

```python
# all values and keys
cluster_data
```

```
{'ids': ['578c6d24-3763-11ee-a8ab-b7b7b34f99ba',
  '578c6d25-3763-11ee-a8ab-b7b7b34f99ba',
  'fb2fc7d8-3762-11ee-a8ab-b7b7b34f99ba',
  'fb2fc7d9-3762-11ee-a8ab-b7b7b34f99ba',
  '6b40881a-3762-11ee-a8ab-b7b7b34f99ba',
  '6b40881b-3762-11ee-a8ab-b7b7b34f99ba',
  '581e691e-3762-11ee-a8ab-b7b7b34f99ba',
  '581e691f-3762-11ee-a8ab-b7b7b34f99ba'],
 'embeddings': None,
 'metadatas': [{}, {}, {}, {}, {}, {}, {}, {}],
 'documents': ['hello bagel',
  'this is langchain',
  'hello bagel',
  'this is langchain',
  'hello bagel',
  'this is langchain',
  'hello bagel',
  'this is langchain']}
```

```python
cluster.delete_cluster()
```

### Create a cluster with metadata and filter using metadata

```python
texts = ["hello bagel", "this is langchain"]
metadatas = [{"source": "notion"}, {"source": "google"}]
cluster = Bagel.from_texts(cluster_name="testing", texts=texts, metadatas=metadatas)
cluster.similarity_search_with_score("hello bagel", where={"source": "notion"})
```

```
[(Document(page_content='hello bagel', metadata={'source': 'notion'}), 0.0)]
```
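
The same metadata filter can be combined with a plain `similarity_search` call. A sketch, assuming `similarity_search` accepts the same `where` keyword as the scored variant:

```python
# Only entries whose metadata matches the filter are considered.
cluster.similarity_search("hello bagel", where={"source": "google"})
```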
```python
# delete the cluster
cluster.delete_cluster()
```