---
title: SageMakerEndpoint
---
[Amazon SageMaker](https://aws.amazon.com/sagemaker/) is a system for building, training, and deploying machine learning (ML) models for any use case, with fully managed infrastructure, tools, and workflows.

This notebook covers how to use an LLM hosted on a `SageMaker endpoint`.
```python
!pip3 install langchain langchain-aws boto3
```
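## Set up

You have to set the following required parameters of the `SagemakerEndpoint` call:

- `endpoint_name`: The name of the endpoint from the deployed SageMaker model. It must be unique within an AWS Region.
- `credentials_profile_name`: The name of the profile in the `~/.aws/credentials` or `~/.aws/config` file, which has either access keys or role information specified. If not specified, the default credential profile is used or, when running on an EC2 instance, credentials from IMDS are used. See the [boto3 credentials guide](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html).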
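To make the `credentials_profile_name` lookup concrete, here is a minimal sketch of how profile names correspond to sections of the shared credentials file. It parses an inline, made-up file with Python's `configparser` instead of reading `~/.aws/credentials`; the profile names and key values are placeholders, and the `list_profiles` helper is hypothetical, not part of boto3 or LangChain.

```python
import configparser

# Made-up contents in the INI layout used by ~/.aws/credentials.
# All profile names and keys below are placeholders, not real credentials.
SAMPLE_CREDENTIALS = """
[default]
aws_access_key_id = AKIAEXAMPLEDEFAULT
aws_secret_access_key = example-secret-default

[credentials-profile-name]
aws_access_key_id = AKIAEXAMPLEPROFILE
aws_secret_access_key = example-secret-profile
"""


def list_profiles(credentials_text: str) -> list[str]:
    """Return the profile names (section headers) found in the file text."""
    parser = configparser.ConfigParser()
    parser.read_string(credentials_text)
    return parser.sections()


profiles = list_profiles(SAMPLE_CREDENTIALS)
print(profiles)
```

The string passed as `credentials_profile_name` has to match one of these section names. In `~/.aws/config`, the same non-default profile appears under a `[profile credentials-profile-name]` header.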
## Example

```python
from langchain_core.documents import Document

example_doc_1 = """
Peter and Elizabeth took a taxi to attend the night party in the city. While in the party, Elizabeth collapsed and was rushed to the hospital.
Since she was diagnosed with a brain injury, the doctor told Peter to stay besides her until she gets well.
Therefore, Peter stayed with her at the hospital for 3 days without leaving.
"""

docs = [
    Document(
        page_content=example_doc_1,
    )
]
```
### Example of initializing with an external boto3 session

#### For cross-account scenarios

```python
import json
from typing import Dict

import boto3
from langchain.chains.question_answering import load_qa_chain
from langchain_aws.llms import SagemakerEndpoint
from langchain_aws.llms.sagemaker_endpoint import LLMContentHandler
from langchain_core.prompts import PromptTemplate

query = """How long was Elizabeth hospitalized?
"""

prompt_template = """Use the following pieces of context to answer the question at the end.
{context}
Question: {question}
Answer:"""
PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)

# Assume a role in the other account to obtain temporary credentials.
roleARN = "arn:aws:iam::123456789:role/cross-account-role"
sts_client = boto3.client("sts")
response = sts_client.assume_role(
    RoleArn=roleARN, RoleSessionName="CrossAccountSession"
)

# Build the SageMaker runtime client from the temporary credentials.
client = boto3.client(
    "sagemaker-runtime",
    region_name="us-west-2",
    aws_access_key_id=response["Credentials"]["AccessKeyId"],
    aws_secret_access_key=response["Credentials"]["SecretAccessKey"],
    aws_session_token=response["Credentials"]["SessionToken"],
)


class ContentHandler(LLMContentHandler):
    content_type = "application/json"
    accepts = "application/json"

    def transform_input(self, prompt: str, model_kwargs: Dict) -> bytes:
        # Serialize the prompt and model parameters into the request body.
        input_str = json.dumps({"inputs": prompt, "parameters": model_kwargs})
        return input_str.encode("utf-8")

    def transform_output(self, output: bytes) -> str:
        # The response body is read as a stream before it is decoded.
        response_json = json.loads(output.read().decode("utf-8"))
        return response_json[0]["generated_text"]


content_handler = ContentHandler()

chain = load_qa_chain(
    llm=SagemakerEndpoint(
        endpoint_name="endpoint-name",
        client=client,
        model_kwargs={"temperature": 1e-10},
        content_handler=content_handler,
    ),
    prompt=PROMPT,
)

chain({"input_documents": docs, "question": query}, return_only_outputs=True)
```
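The temporary credentials returned by `assume_role` expire, so a long-running process may need to build a new client at some point. The sketch below works on a hand-written dictionary shaped like the `assume_role` response (all values are placeholders) and uses two hypothetical helpers, `client_kwargs_from` and `is_expired`. It makes no AWS calls and is only one possible way to organize this check.

```python
from datetime import datetime, timedelta, timezone

# Hand-written stand-in for the dictionary returned by sts_client.assume_role().
# The values are placeholders; a real response carries live credentials.
fake_response = {
    "Credentials": {
        "AccessKeyId": "ASIAEXAMPLE",
        "SecretAccessKey": "example-secret",
        "SessionToken": "example-token",
        "Expiration": datetime(2030, 1, 1, 12, 0, tzinfo=timezone.utc),
    }
}


def client_kwargs_from(response: dict) -> dict:
    """Map assumed-role credentials to the keyword arguments used for boto3.client."""
    creds = response["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


def is_expired(
    response: dict, now: datetime, margin: timedelta = timedelta(minutes=5)
) -> bool:
    """Return True when the credentials expire within `margin` of `now`."""
    return now + margin >= response["Credentials"]["Expiration"]


kwargs = client_kwargs_from(fake_response)
an_hour_before = datetime(2030, 1, 1, 11, 0, tzinfo=timezone.utc)
two_minutes_before = datetime(2030, 1, 1, 11, 58, tzinfo=timezone.utc)

print(sorted(kwargs))
print(is_expired(fake_response, an_hour_before))
print(is_expired(fake_response, two_minutes_before))
```

With a real response, `boto3.client("sagemaker-runtime", region_name="us-west-2", **kwargs)` would produce the same client as the cross-account example.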
To let `SagemakerEndpoint` create the client itself, pass `credentials_profile_name` and `region_name` instead of `client`:

```python
import json
from typing import Dict

from langchain.chains.question_answering import load_qa_chain
from langchain_aws.llms import SagemakerEndpoint
from langchain_aws.llms.sagemaker_endpoint import LLMContentHandler
from langchain_core.prompts import PromptTemplate

query = """How long was Elizabeth hospitalized?
"""

prompt_template = """Use the following pieces of context to answer the question at the end.
{context}
Question: {question}
Answer:"""
PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)


class ContentHandler(LLMContentHandler):
    content_type = "application/json"
    accepts = "application/json"

    def transform_input(self, prompt: str, model_kwargs: Dict) -> bytes:
        # Serialize the prompt and model parameters into the request body.
        input_str = json.dumps({"inputs": prompt, "parameters": model_kwargs})
        return input_str.encode("utf-8")

    def transform_output(self, output: bytes) -> str:
        # The response body is read as a stream before it is decoded.
        response_json = json.loads(output.read().decode("utf-8"))
        return response_json[0]["generated_text"]


content_handler = ContentHandler()

chain = load_qa_chain(
    llm=SagemakerEndpoint(
        endpoint_name="endpoint-name",
        credentials_profile_name="credentials-profile-name",
        region_name="us-west-2",
        model_kwargs={"temperature": 1e-10},
        content_handler=content_handler,
    ),
    prompt=PROMPT,
)

chain({"input_documents": docs, "question": query}, return_only_outputs=True)
```
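To see what the `ContentHandler` sends and expects without calling an endpoint, the following sketch reproduces its two transformations using only the standard library. The `io.BytesIO` object is a stand-in for the streaming response body, and the response payload is invented to match the `[{"generated_text": ...}]` shape the handler assumes. A model that returns a different JSON layout would need a different `transform_output`.

```python
import io
import json


def transform_input(prompt: str, model_kwargs: dict) -> bytes:
    """Serialize the prompt and parameters the same way the handler above does."""
    return json.dumps({"inputs": prompt, "parameters": model_kwargs}).encode("utf-8")


def transform_output(output) -> str:
    """Read a file-like response body and pull out the generated text."""
    response_json = json.loads(output.read().decode("utf-8"))
    return response_json[0]["generated_text"]


request_body = transform_input(
    "How long was Elizabeth hospitalized?", {"temperature": 1e-10}
)
print(request_body)

# Invented response in the shape the handler expects.
fake_body = io.BytesIO(json.dumps([{"generated_text": "3 days"}]).encode("utf-8"))
answer = transform_output(fake_body)
print(answer)
```

Running the two functions against a captured response from your own endpoint is a quick way to confirm the handler matches the model's actual output format before wiring it into a chain.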