Claude Code × Amazon Bedrock Complete Guide | Running Claude in Production on AWS

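“I want to use the Claude API in a production service, but managing API keys worries me.” “Our internal security requirements do not allow data to leave AWS.” For engineers who have hit walls like these, Amazon Bedrock is often the strongest answer.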

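At work, when I integrated AI into an API server running on ECS, I started by calling the Anthropic API directly. The security review then flagged the management of API keys for an external service as a problem. After switching to Bedrock, authentication was handled entirely by IAM roles and I no longer had API keys to manage. This article walks through building on Bedrock with Claude Code, from implementation to production operation.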

What is Amazon Bedrock?

Amazon Bedrock is AWS's managed service for foundation models. It lets you call models from several providers, including Claude (Anthropic), Llama (Meta), and Titan (Amazon), through a unified API.

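Why use Bedrock?

Aspect	Anthropic API	Amazon Bedrock
Authentication	API key	AWS IAM role
Billing	Directly with Anthropic	Consolidated into the AWS bill
VPC support	None	Fully private connectivity via PrivateLink
Data handling	Anthropic's policies	AWS's policies
Compliance	SOC 2, etc.	SOC 2 / ISO 27001 / HIPAA, etc.

For personal projects the Anthropic API is more convenient, but in enterprise, finance, and healthcare settings Bedrock is increasingly the only option that passes review.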

Step 1: Initial setup

Request model access

First, request access to the Claude models in the AWS console.

# List the available models
aws bedrock list-foundation-models \
  --by-provider anthropic \
  --region us-east-1 \
  --query 'modelSummaries[].{id:modelId, name:modelName}'

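# Example output (illustrative; the IDs your account returns may carry a date and a version suffix such as -v1:0)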
[
  {"id": "anthropic.claude-opus-4-5",     "name": "Claude Opus 4.5"},
  {"id": "anthropic.claude-sonnet-4-6",   "name": "Claude Sonnet 4.6"},
  {"id": "anthropic.claude-haiku-4-5-20251001", "name": "Claude Haiku 4.5"}
]

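Important: the main regions where Claude is available are us-east-1 (N. Virginia) and us-west-2 (Oregon). From the Tokyo region, the models can be used through cross-region inference.

Install the SDK

# @anthropic-ai/bedrock-sdk provides the AnthropicBedrock client used in Steps 2 and 3;
# @aws-sdk/client-bedrock-agent-runtime is used for the Knowledge Base call in Step 4
npm install @anthropic-ai/sdk @anthropic-ai/bedrock-sdk @aws-sdk/client-bedrock-agent-runtime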

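Step 2: Basic implementation

Using the Anthropic SDK's Bedrock support (recommended)

Anthropic publishes a Bedrock client, the @anthropic-ai/bedrock-sdk package, alongside its official SDK. It is written almost exactly like the regular Anthropic API client, which keeps the cost of migrating existing code low.

// src/lib/bedrock-client.ts
import { AnthropicBedrock } from "@anthropic-ai/bedrock-sdk";

// No explicit credentials are needed when running under an IAM role (Lambda, ECS, etc.)
export const bedrock = new AnthropicBedrock({
  awsRegion: process.env.AWS_REGION ?? "us-east-1",
  // In local development the default AWS credential chain (for example an AWS CLI profile) is used
});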

export async function generateText(
  prompt: string,
  options: { model?: string; maxTokens?: number } = {}
): Promise<string> {
  const { model = "anthropic.claude-sonnet-4-6", maxTokens = 1024 } = options;

  const response = await bedrock.messages.create({
    model,
    max_tokens: maxTokens,
    messages: [{ role: "user", content: prompt }],
  });

  return response.content[0].type === "text" ? response.content[0].text : "";
}
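
The function above reads only the first content block. A response can contain several blocks, and not all of them are necessarily text, so the following is a minimal sketch of a helper that concatenates every text block. The `ContentBlock` interface and the `extractText` name are simplified stand-ins introduced here for illustration, not the SDK's own types.

```typescript
// Simplified stand-in for the SDK's content block type (an assumption for this sketch).
interface ContentBlock {
  type: string;
  text?: string;
}

// Join the text of every "text" block; other block types (for example tool_use) are skipped.
function extractText(content: ContentBlock[]): string {
  return content
    .filter((block) => block.type === "text" && typeof block.text === "string")
    .map((block) => block.text ?? "")
    .join("");
}
```

With a helper like this, the last line of generateText could become `return extractText(response.content);`.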

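Model IDs on Bedrock differ from those of the Anthropic API:

Anthropic API: claude-sonnet-4-6
Bedrock:       anthropic.claude-sonnet-4-6  (provider prefix added)

Treat the IDs in this article as examples and copy the exact strings from the list-foundation-models output in Step 1; Bedrock IDs often carry a date and a version suffix.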

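Streaming support

Long responses should be streamed so that the user sees output as it is generated.

// src/lib/bedrock-stream.ts
import { bedrock } from "./bedrock-client"; // assumes bedrock-client.ts exports the client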
export async function* streamText(
  prompt: string,
  model = "anthropic.claude-sonnet-4-6"
): AsyncGenerator<string> {
  const stream = await bedrock.messages.stream({
    model,
    max_tokens: 4096,
    messages: [{ role: "user", content: prompt }],
  });

  for await (const chunk of stream) {
    if (
      chunk.type === "content_block_delta" &&
      chunk.delta.type === "text_delta"
    ) {
      yield chunk.delta.text;
    }
  }
}

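// Usage example (Next.js App Router route handler)
export async function POST(req: Request) {
  const { prompt } = await req.json();
  const encoder = new TextEncoder();

  const stream = new ReadableStream({
    async start(controller) {
      try {
        for await (const text of streamText(prompt)) {
          controller.enqueue(encoder.encode(text));
        }
        controller.close();
      } catch (error) {
        // Surface upstream failures to the reader instead of leaving the stream hanging
        controller.error(error);
      }
    },
  });

  return new Response(stream, {
    // Raw text chunks are sent here, so the body is plain text rather than Server-Sent Events
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
}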
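
If the client reads the response with EventSource or another Server-Sent Events consumer, each chunk needs SSE framing: one `data:` field per line of payload, terminated by a blank line. The sketch below shows that framing; `toSseEvent` is a hypothetical helper name introduced for illustration.

```typescript
// Frame one text chunk as a Server-Sent Events message.
// Every line of the payload gets its own "data:" field; a blank line ends the event.
function toSseEvent(text: string): string {
  const lines = text.split(/\r?\n/);
  return lines.map((line) => `data: ${line}`).join("\n") + "\n\n";
}
```

In the route handler, this would wrap each chunk before encoding, for example `controller.enqueue(encoder.encode(toSseEvent(text)))`, together with a `text/event-stream` content type.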

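Step 3: The Lambda + Bedrock pattern

This is one of the most common setups for offering AI features serverlessly.

claude -p "
Implement the following Lambda function in src/lambda/ai-handler.ts:
- Receive prompt and maxTokens from the event
- Call Bedrock (claude-sonnet-4-6) and return the result
- Branch error handling between ThrottlingException (retry) and ValidationException (400)
- Log the execution time
- Initialize the client outside the handler (cold start optimization)
"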

// src/lambda/ai-handler.ts
import { Handler, APIGatewayProxyEvent, APIGatewayProxyResult } from "aws-lambda";
import { AnthropicBedrock } from "@anthropic-ai/bedrock-sdk";

// Initialized at module scope so that warm invocations reuse the client
const bedrock = new AnthropicBedrock({
  awsRegion: process.env.AWS_REGION,
});

export const handler: Handler<APIGatewayProxyEvent, APIGatewayProxyResult> = async (event) => {
  const startTime = Date.now();

  try {
    const { prompt, maxTokens = 512 } = JSON.parse(event.body ?? "{}");

    if (typeof prompt !== "string" || prompt.length === 0) {
      return { statusCode: 400, body: JSON.stringify({ error: "prompt is required" }) };
    }

    const response = await bedrock.messages.create({
      model: "anthropic.claude-sonnet-4-6",
      max_tokens: maxTokens,
      messages: [{ role: "user", content: prompt }],
    });

    // Structured log line with latency and token usage
    const duration = Date.now() - startTime;
    console.log(JSON.stringify({
      level: "INFO",
      duration_ms: duration,
      input_tokens: response.usage.input_tokens,
      output_tokens: response.usage.output_tokens,
    }));

    return {
      statusCode: 200,
      body: JSON.stringify({
        text: response.content[0].type === "text" ? response.content[0].text : "",
        usage: response.usage,
      }),
    };
  } catch (error: any) {
    // Malformed JSON in the request body is the caller's error
    if (error instanceof SyntaxError) {
      return { statusCode: 400, body: JSON.stringify({ error: "Request body must be valid JSON" }) };
    }
    // The SDK's API errors carry the HTTP status; 429 means Bedrock throttled the call
    if (error?.status === 429 || error?.name === "ThrottlingException") {
      console.warn("Rate limited by Bedrock, client should retry");
      return { statusCode: 429, body: JSON.stringify({ error: "Rate limited, please retry" }) };
    }
    // 400 covers request validation failures (for example an invalid max_tokens)
    if (error?.status === 400 || error?.name === "ValidationException") {
      return { statusCode: 400, body: JSON.stringify({ error: "Invalid request" }) };
    }
    console.error("Bedrock error:", error);
    return { statusCode: 500, body: JSON.stringify({ error: "AI generation failed" }) };
  }
};
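
The handler returns 429 and leaves the retry to the caller. Where a server-side retry is preferred, a small wrapper with exponential backoff is one option; the sketch below is a generic version under stated assumptions. The `withRetry` name, the default attempt count, and the delay values are illustrative choices, not SDK settings, and the default `isRetryable` check assumes the error exposes an HTTP `status` field. The Anthropic client also accepts a `maxRetries` option, which may be enough on its own.

```typescript
// Retry an async call with exponential backoff when it fails with a retryable error.
async function withRetry<T>(
  call: () => Promise<T>,
  options: {
    maxAttempts?: number;
    baseDelayMs?: number;
    isRetryable?: (error: unknown) => boolean;
  } = {}
): Promise<T> {
  const {
    maxAttempts = 3,
    baseDelayMs = 200,
    // Assumption: throttling surfaces as an error object with status 429.
    isRetryable = (error: unknown) => (error as { status?: number } | null)?.status === 429,
  } = options;

  for (let attempt = 1; ; attempt++) {
    try {
      return await call();
    } catch (error) {
      // Give up on the last attempt or on errors that retrying cannot fix.
      if (attempt >= maxAttempts || !isRetryable(error)) throw error;
      // Full jitter: a random delay below baseDelayMs * 2^(attempt - 1).
      const delayMs = Math.random() * baseDelayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

A call site could then read `await withRetry(() => bedrock.messages.create(params))`, keeping the total retry time below the Lambda timeout.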

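IAM policy for the Lambda function

// IAM configuration in CDK (`lambdaFunction` is the Lambda construct defined elsewhere in the stack)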
import * as iam from "aws-cdk-lib/aws-iam";

lambdaFunction.addToRolePolicy(new iam.PolicyStatement({
  effect: iam.Effect.ALLOW,
  actions: [
    "bedrock:InvokeModel",
    "bedrock:InvokeModelWithResponseStream",
  ],
  resources: [
    `arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-sonnet-4-6`,
    `arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-haiku-4-5-20251001`,
  ],
}));

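Step 4: Implementing RAG (Retrieval-Augmented Generation)

This pattern has Claude read internal documents or product information and answer based on them.

claude -p "
Implement a RAG system using a Bedrock Knowledge Base.

Architecture:
- Store documents in S3
- Index them as vectors with a Bedrock Knowledge Base
- Retrieve documents based on the user's question
- Generate the answer with Claude Sonnet

Implement in TypeScript + AWS SDK v3.
Read the Knowledge Base ID from the KNOWLEDGE_BASE_ID environment variable.
"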

// src/lib/rag.ts
import {
  BedrockAgentRuntimeClient,
  RetrieveAndGenerateCommand,
} from "@aws-sdk/client-bedrock-agent-runtime";

const agentClient = new BedrockAgentRuntimeClient({ region: "us-east-1" });

export async function ragQuery(question: string): Promise<{
  answer: string;
  citations: string[];
}> {
  const response = await agentClient.send(
    new RetrieveAndGenerateCommand({
      input: { text: question },
      retrieveAndGenerateConfiguration: {
        type: "KNOWLEDGE_BASE",
        knowledgeBaseConfiguration: {
          knowledgeBaseId: process.env.KNOWLEDGE_BASE_ID!,
          modelArn: `arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-sonnet-4-6`,
          retrievalConfiguration: {
            vectorSearchConfiguration: { numberOfResults: 5 },
          },
        },
      },
    })
  );

  const answer = response.output?.text ?? "";
  const citations = (response.citations ?? [])
    .flatMap((c) => c.retrievedReferences ?? [])
    .map((r) => r.location?.s3Location?.uri ?? "")
    .filter(Boolean);

  return { answer, citations };
}
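
Several citations often point at the same S3 object, so the `citations` array returned above can contain duplicates. The sketch below shows one way to deduplicate the URIs and append them to the answer as a numbered source list. `formatAnswerWithSources` and the output layout are hypothetical choices made for illustration.

```typescript
// Append a deduplicated, numbered list of source URIs to the generated answer.
function formatAnswerWithSources(answer: string, citations: string[]): string {
  // A Set keeps the first occurrence of each URI and preserves insertion order.
  const unique = Array.from(new Set(citations));
  if (unique.length === 0) return answer;
  const sources = unique.map((uri, index) => `[${index + 1}] ${uri}`).join("\n");
  return `${answer}\n\nSources:\n${sources}`;
}
```

Usage would look like `const { answer, citations } = await ragQuery(question); return formatAnswerWithSources(answer, citations);`.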

Step 5: Cost optimization

// Model selection utility
// Prices are approximate on-demand list prices per 1M input tokens;
// check the Amazon Bedrock pricing page for current values.
type TaskType = "classify" | "extract" | "summarize" | "generate" | "complex";

const MODEL_MAP: Record<TaskType, string> = {
  classify: "anthropic.claude-haiku-4-5-20251001",  // about $1.00 per 1M input tokens
  extract:  "anthropic.claude-haiku-4-5-20251001",
  summarize: "anthropic.claude-sonnet-4-6",          // about $3.00 per 1M input tokens
  generate: "anthropic.claude-sonnet-4-6",
  complex:  "anthropic.claude-opus-4-5",             // about $5.00 per 1M input tokens
};

export function selectModel(task: TaskType): string {
  return MODEL_MAP[task];
}
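
To see what the model choice means per request, the token counts in `response.usage` can be turned into a cost estimate. The sketch below does that arithmetic; `estimateCostUsd` and the `Price` shape are hypothetical, and the prices are passed in by the caller rather than hard-coded because they change over time.

```typescript
interface Usage {
  input_tokens: number;
  output_tokens: number;
}

// Prices in USD per one million tokens, supplied by the caller.
interface Price {
  inputPerMTok: number;
  outputPerMTok: number;
}

// cost = (input tokens * input price + output tokens * output price) / 1,000,000
function estimateCostUsd(usage: Usage, price: Price): number {
  return (
    (usage.input_tokens * price.inputPerMTok +
      usage.output_tokens * price.outputPerMTok) / 1e6
  );
}
```

For example, 2,000 input tokens and 500 output tokens at illustrative prices of $3 and $15 per million tokens come to (2000 × 3 + 500 × 15) / 1,000,000 = $0.0135.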

Reducing input cost with prompt caching

// Prompt caching is also available on Bedrock
// (longSystemPrompt and userQuery are assumed to be defined by the caller)
const response = await bedrock.messages.create({
  model: "anthropic.claude-sonnet-4-6",
  max_tokens: 1024,
  system: [
    {
      type: "text",
      text: longSystemPrompt,
      cache_control: { type: "ephemeral" },  // cached for about 5 minutes
    },
  ],
  messages: [{ role: "user", content: userQuery }],
});
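
Whether the cache is actually being hit can be checked from the usage block of each response, which reports tokens written to and read from the cache separately from ordinary input tokens. The sketch below computes the share of prompt tokens served from cache; `cacheReadRatio` and the `CacheUsage` interface are names introduced here, and the two cache fields are treated as optional in case a response omits them.

```typescript
// Subset of the usage fields relevant to prompt caching.
interface CacheUsage {
  input_tokens: number;
  cache_creation_input_tokens?: number | null;
  cache_read_input_tokens?: number | null;
}

// Fraction of all prompt tokens that were read from the cache (0 when nothing was sent).
function cacheReadRatio(usage: CacheUsage): number {
  const written = usage.cache_creation_input_tokens ?? 0;
  const read = usage.cache_read_input_tokens ?? 0;
  const total = usage.input_tokens + written + read;
  return total === 0 ? 0 : read / total;
}
```

A ratio that stays near zero across repeated calls suggests the cached prefix is changing between requests or is too short to be cached.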

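Five pitfalls to watch for

1. The region is not supported

Claude on Bedrock is not available in every region. As of 2026, us-east-1 and us-west-2 are the primary regions. To use it from Tokyo, call the model through cross-region inference.

// Cross-region inference is addressed through an inference profile ID
// (a geographic prefix in front of the model ID), not a plain foundation-model ARN.
// The value below is illustrative; list the real IDs with `aws bedrock list-inference-profiles`.
const crossRegionModelId = "apac.anthropic.claude-sonnet-4-6";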

2. Forgetting to request model access

Bedrock requires a “Model access” request for each model you want to use. Calling a model without it raises AccessDeniedException. Request access before you start coding with Claude Code.

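3. The Lambda timeout is too short

Claude can take 10 to 30 seconds to respond. With Lambda's default timeout of 3 seconds, such calls will time out. Set the timeout to at least 30 seconds, and to 60 to 300 seconds for long generations.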
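
Raising the Lambda timeout still leaves the question of how long a single model call may run. One approach is to derive the request timeout from the time the invocation has left, so the handler can return a clean error before Lambda kills it. The sketch below shows the arithmetic; `requestTimeoutMs` and the one-second safety margin are assumptions chosen for illustration.

```typescript
// Time budget for one model call: the invocation's remaining time minus a
// safety margin reserved for logging and building the error response.
function requestTimeoutMs(remainingMs: number, safetyMarginMs = 1000): number {
  return Math.max(0, remainingMs - safetyMarginMs);
}
```

Inside a handler, the remaining time comes from the Lambda context, for example `requestTimeoutMs(context.getRemainingTimeInMillis())`, and the result can be passed as the per-request `timeout` option of the SDK call.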

4. Confusing Bedrock model IDs with Anthropic API IDs

❌ Using the Anthropic API ID as-is: "claude-sonnet-4-6"
✅ Bedrock ID: "anthropic.claude-sonnet-4-6"
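
This mistake is easy to catch at startup with a cheap shape check on the configured ID. The sketch below is a heuristic only: `looksLikeBedrockModelId` is a hypothetical helper, and the pattern assumes Claude IDs on Bedrock contain the `anthropic.` provider segment, optionally preceded by a short geographic prefix. It does not confirm that the model exists or that access has been granted.

```typescript
// Heuristic check that an ID has the Bedrock shape for Claude models:
// an optional geographic prefix (such as "us."), then "anthropic.claude-...".
function looksLikeBedrockModelId(id: string): boolean {
  return /^([a-z]{2,6}\.)?anthropic\.claude-[a-z0-9.:-]+$/.test(id);
}
```

A service could run this once against its configured model ID and fail fast with a clear message instead of surfacing a validation error on the first request.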

5. Ignoring the latency of cross-region inference

Calling a model in us-east-1 from Tokyo adds round-trip network latency (roughly 100-200 ms). For apps that need to feel real-time, use streaming to reduce perceived latency.

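Summary

Task	Claude Code's contribution
Basic implementation	Generates the AnthropicBedrock client and functions
Lambda integration	Generates the handler and IAM policy together
RAG implementation	Generates the Knowledge Base integration code
Cost optimization	Designs model selection logic suited to each task
Troubleshooting	Suggests causes and fixes from error logs

Develop with Claude Code and run in production on Bedrock: this combination covers security, cost, and scalability. Start with a small pay-per-use experiment on Bedrock; when you move to production, authentication comes down to configuring IAM roles.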
