Claude Code × Amazon Bedrock Complete Guide | Running Claude in Production on AWS
Complete guide to using Amazon Bedrock with Claude Code. From IAM authentication, streaming, Lambda integration, RAG implementation, to cost optimization — based on Masa's real production experience.
If you’ve hit a wall with “I want to use Claude API in production but I’m worried about API key management” or “Our internal security policy won’t allow data to leave AWS” — Amazon Bedrock is the answer.
When I was integrating AI into an API server on ECS for work, I initially used the Anthropic API directly. But a security review flagged “management of API keys to external services” as a concern. After switching to Bedrock, authentication was handled entirely through IAM roles, and I was freed from API key management. This article covers everything from implementing Bedrock with Claude Code to production operations.
What is Amazon Bedrock?
Amazon Bedrock is a managed AI model service from AWS. You can call multiple models — Claude (Anthropic), Llama (Meta), Titan (Amazon) — through a unified API.
Why Use Bedrock?
| Aspect | Anthropic API | Amazon Bedrock |
|---|---|---|
| Authentication | API key | AWS IAM role |
| Billing | Directly to Anthropic | Integrated into AWS billing |
| VPC support | None | Fully private with PrivateLink |
| Data retention | Anthropic’s policy | AWS’s policy |
| Compliance | SOC2, etc. | SOC2 / ISO27001 / HIPAA, etc. |
The Anthropic API is convenient for personal projects, but for enterprise, finance, and healthcare use cases, Bedrock is often the only option that clears security review.
Step 1: Initial Setup
Requesting Model Access
First, request access to Claude models in the AWS console.
```bash
# Check the list of available models
aws bedrock list-foundation-models \
  --by-provider anthropic \
  --region us-east-1 \
  --query 'modelSummaries[].{id:modelId, name:modelName}'
```

Sample output:

```json
[
  {"id": "anthropic.claude-opus-4-5", "name": "Claude Opus 4.5"},
  {"id": "anthropic.claude-sonnet-4-6", "name": "Claude Sonnet 4.6"},
  {"id": "anthropic.claude-haiku-4-5-20251001", "name": "Claude Haiku 4.5"}
]
```
Important: The primary available regions are us-east-1 (N. Virginia) and us-west-2 (Oregon). From the Tokyo region, call these models via cross-region inference.
SDK Installation
```bash
npm install @anthropic-ai/sdk @anthropic-ai/bedrock-sdk @aws-sdk/client-bedrock-runtime
```
Step 2: Basic Implementation
Using Anthropic's Official Bedrock SDK (Recommended)
Anthropic publishes an official Bedrock client, @anthropic-ai/bedrock-sdk, whose interface is nearly identical to the regular Anthropic API, so the migration cost from existing code is minimal.
```typescript
// src/lib/bedrock-client.ts
import AnthropicBedrock from "@anthropic-ai/bedrock-sdk";

// No credentials needed when an IAM role is used (e.g., Lambda/ECS)
export const bedrock = new AnthropicBedrock({
  awsRegion: process.env.AWS_REGION ?? "us-east-1",
  // The AWS CLI profile is picked up automatically during local development
});

export async function generateText(
  prompt: string,
  options: { model?: string; maxTokens?: number } = {}
): Promise<string> {
  const { model = "anthropic.claude-sonnet-4-6", maxTokens = 1024 } = options;

  const response = await bedrock.messages.create({
    model,
    max_tokens: maxTokens,
    messages: [{ role: "user", content: prompt }],
  });

  return response.content[0].type === "text" ? response.content[0].text : "";
}
```
Bedrock model IDs differ from the Anthropic API:

- Anthropic API: `claude-sonnet-4-6`
- Bedrock: `anthropic.claude-sonnet-4-6` (with the `anthropic.` prefix)
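Since this prefix mismatch is easy to introduce when porting existing code, a tiny normalizing helper can guard call sites. The helper below is my own sketch, not part of either SDK:

```typescript
// Hypothetical helper: normalize an Anthropic API model ID to the Bedrock
// form by adding the "anthropic." prefix when it is missing.
export function toBedrockModelId(modelId: string): string {
  return modelId.startsWith("anthropic.") ? modelId : `anthropic.${modelId}`;
}
```

Passing either form then yields a valid Bedrock ID, e.g. `toBedrockModelId("claude-sonnet-4-6")` returns `"anthropic.claude-sonnet-4-6"`.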
Streaming Support
Streaming is essential for long responses.
```typescript
// src/lib/bedrock-stream.ts
import AnthropicBedrock from "@anthropic-ai/bedrock-sdk";

const bedrock = new AnthropicBedrock({
  awsRegion: process.env.AWS_REGION ?? "us-east-1",
});

export async function* streamText(
  prompt: string,
  model = "anthropic.claude-sonnet-4-6"
): AsyncGenerator<string> {
  const stream = await bedrock.messages.stream({
    model,
    max_tokens: 4096,
    messages: [{ role: "user", content: prompt }],
  });

  for await (const chunk of stream) {
    if (
      chunk.type === "content_block_delta" &&
      chunk.delta.type === "text_delta"
    ) {
      yield chunk.delta.text;
    }
  }
}
```

```typescript
// Usage example (Next.js App Router)
import { streamText } from "@/lib/bedrock-stream";

export async function POST(req: Request) {
  const { prompt } = await req.json();
  const encoder = new TextEncoder();

  const stream = new ReadableStream({
    async start(controller) {
      for await (const text of streamText(prompt)) {
        controller.enqueue(encoder.encode(text));
      }
      controller.close();
    },
  });

  return new Response(stream, {
    headers: { "Content-Type": "text/event-stream" },
  });
}
```
Step 3: Lambda + Bedrock Pattern
The most common architecture for providing serverless AI features.
```bash
claude -p "
Implement the following Lambda function in src/lambda/ai-handler.ts:
- Accept prompt and maxTokens from the event
- Call Bedrock (claude-sonnet-4-6) and return the result
- Handle errors: ThrottlingException (retry) and ValidationException (400)
- Log execution time
- Initialize the client outside the handler (cold start optimization)
"
```
```typescript
// src/lambda/ai-handler.ts
import { Handler, APIGatewayProxyEvent, APIGatewayProxyResult } from "aws-lambda";
import AnthropicBedrock from "@anthropic-ai/bedrock-sdk";

// Initialize at module scope (cached on container reuse)
const bedrock = new AnthropicBedrock({
  awsRegion: process.env.AWS_REGION,
});

export const handler: Handler<APIGatewayProxyEvent, APIGatewayProxyResult> = async (event) => {
  const startTime = Date.now();

  try {
    const { prompt, maxTokens = 512 } = JSON.parse(event.body ?? "{}");
    if (!prompt) {
      return { statusCode: 400, body: JSON.stringify({ error: "prompt is required" }) };
    }

    const response = await bedrock.messages.create({
      model: "anthropic.claude-sonnet-4-6",
      max_tokens: maxTokens,
      messages: [{ role: "user", content: prompt }],
    });

    const duration = Date.now() - startTime;
    console.log(JSON.stringify({
      level: "INFO",
      duration_ms: duration,
      input_tokens: response.usage.input_tokens,
      output_tokens: response.usage.output_tokens,
    }));

    return {
      statusCode: 200,
      body: JSON.stringify({
        text: response.content[0].type === "text" ? response.content[0].text : "",
        usage: response.usage,
      }),
    };
  } catch (error: any) {
    if (error.name === "ThrottlingException") {
      console.warn("Rate limited by Bedrock, client should retry");
      return { statusCode: 429, body: JSON.stringify({ error: "Rate limited, please retry" }) };
    }
    console.error("Bedrock error:", error);
    return { statusCode: 500, body: JSON.stringify({ error: "AI generation failed" }) };
  }
};
```
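The handler above maps ThrottlingException to a 429 and leaves retrying to the caller. If you would rather retry inside the Lambda, a small backoff wrapper works. The sketch below is my own (name and defaults are illustrative); wrap any async call, such as `bedrock.messages.create`, in it:

```typescript
// Retry a call on ThrottlingException with exponential backoff and jitter.
// Any other error is rethrown immediately.
export async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error: any) {
      lastError = error;
      if (error?.name !== "ThrottlingException") throw error;
      // Exponential backoff: 500ms, 1s, 2s, ... plus up to 100ms of jitter
      const delay = baseDelayMs * 2 ** attempt + Math.random() * 100;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

Usage would look like `await withRetry(() => bedrock.messages.create({ ... }))`. Keep the Lambda timeout comfortably above `maxAttempts` worth of backoff.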
Lambda IAM Policy
```typescript
// IAM configuration with CDK
import * as iam from "aws-cdk-lib/aws-iam";

lambdaFunction.addToRolePolicy(new iam.PolicyStatement({
  effect: iam.Effect.ALLOW,
  actions: [
    "bedrock:InvokeModel",
    "bedrock:InvokeModelWithResponseStream",
  ],
  resources: [
    "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-sonnet-4-6",
    "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-haiku-4-5-20251001",
  ],
}));
```
Step 4: RAG (Retrieval-Augmented Generation) Implementation
A pattern where Claude reads internal documents or product information to answer questions.
```bash
claude -p "
Implement a RAG system using Bedrock Knowledge Base.
Architecture:
- Store documents in S3
- Index with Bedrock Knowledge Base vector indexing
- Retrieve documents based on user questions
- Generate answers with Claude Sonnet
Implement with TypeScript + AWS SDK v3.
Get the Knowledge Base ID from the KNOWLEDGE_BASE_ID environment variable.
"
```
```typescript
// src/lib/rag.ts
import {
  BedrockAgentRuntimeClient,
  RetrieveAndGenerateCommand,
} from "@aws-sdk/client-bedrock-agent-runtime";

const agentClient = new BedrockAgentRuntimeClient({ region: "us-east-1" });

export async function ragQuery(question: string): Promise<{
  answer: string;
  citations: string[];
}> {
  const response = await agentClient.send(
    new RetrieveAndGenerateCommand({
      input: { text: question },
      retrieveAndGenerateConfiguration: {
        type: "KNOWLEDGE_BASE",
        knowledgeBaseConfiguration: {
          knowledgeBaseId: process.env.KNOWLEDGE_BASE_ID!,
          modelArn: "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-sonnet-4-6",
          retrievalConfiguration: {
            vectorSearchConfiguration: { numberOfResults: 5 },
          },
        },
      },
    })
  );

  const answer = response.output?.text ?? "";
  const citations = (response.citations ?? [])
    .flatMap((c) => c.retrievedReferences ?? [])
    .map((r) => r.location?.s3Location?.uri ?? "")
    .filter(Boolean);

  return { answer, citations };
}
```
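When rendering the result, the citation URIs usually need light formatting. One option (purely a presentation choice of mine, not a Bedrock feature) is to append deduplicated sources as a markdown list:

```typescript
// Hypothetical helper: append deduplicated citation URIs to a RAG answer
// as a markdown "Sources" list. Returns the answer unchanged if there are
// no citations.
export function formatAnswerWithCitations(
  answer: string,
  citations: string[]
): string {
  const unique = [...new Set(citations)];
  if (unique.length === 0) return answer;
  const sources = unique.map((uri) => `- ${uri}`).join("\n");
  return `${answer}\n\nSources:\n${sources}`;
}
```

This pairs naturally with `ragQuery` above: `formatAnswerWithCitations(result.answer, result.citations)`.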
Step 5: Cost Optimization
```typescript
// Model selection utility
type TaskType = "classify" | "extract" | "summarize" | "generate" | "complex";

const MODEL_MAP: Record<TaskType, string> = {
  classify: "anthropic.claude-haiku-4-5-20251001", // $0.80/1M input
  extract: "anthropic.claude-haiku-4-5-20251001",
  summarize: "anthropic.claude-sonnet-4-6", // $3.00/1M input
  generate: "anthropic.claude-sonnet-4-6",
  complex: "anthropic.claude-opus-4-5", // $15.00/1M input
};

export function selectModel(task: TaskType): string {
  return MODEL_MAP[task];
}
```
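To sanity-check how much task-based routing saves, the per-1M-token input prices quoted in the comments above can be turned into a rough estimator. Both the prices and the helper are illustrative; verify against the current Bedrock pricing page:

```typescript
// Rough input-cost estimator using the per-1M-token input prices quoted
// above (Haiku $0.80, Sonnet $3.00, Opus $15.00). Illustrative only.
type TaskType = "classify" | "extract" | "summarize" | "generate" | "complex";

const INPUT_PRICE_PER_M_USD: Record<TaskType, number> = {
  classify: 0.8,
  extract: 0.8,
  summarize: 3.0,
  generate: 3.0,
  complex: 15.0,
};

export function estimateInputCostUSD(task: TaskType, inputTokens: number): number {
  return (inputTokens / 1_000_000) * INPUT_PRICE_PER_M_USD[task];
}
```

At these rates, classifying 1M input tokens with Haiku costs about $0.80, versus $15.00 if everything were routed to Opus.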
Reduce Input Costs with Prompt Caching
```typescript
// Prompt caching is also available in Bedrock
const response = await bedrock.messages.create({
  model: "anthropic.claude-sonnet-4-6",
  max_tokens: 1024,
  system: [
    {
      type: "text",
      text: longSystemPrompt,
      cache_control: { type: "ephemeral" }, // Cache for 5 minutes
    },
  ],
  messages: [{ role: "user", content: userQuery }],
});
```
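A quick back-of-the-envelope calculation shows why caching pays off. Assuming Anthropic's published multipliers, which I take to be roughly 1.25× the base input price for a cache write and 0.1× for a cache read (verify both on the Bedrock pricing page), repeated requests over the same system prompt get much cheaper:

```typescript
// Compare input cost for a system prompt of `cachedTokens` tokens reused
// across `requests` calls, with and without prompt caching.
// Assumes cache write = 1.25x and cache read = 0.1x the base input price.
export function cachedInputCost(
  cachedTokens: number,
  requests: number,
  pricePerTokenUSD: number
): { withCache: number; withoutCache: number } {
  const withoutCache = cachedTokens * requests * pricePerTokenUSD;
  // First request writes the cache (1.25x); later requests read it (0.1x)
  const withCache =
    cachedTokens * pricePerTokenUSD * (1.25 + 0.1 * (requests - 1));
  return { withCache, withoutCache };
}
```

For a 1,000-token system prompt reused across 10 requests at Sonnet's $3/1M input rate, this works out to roughly $0.0065 with caching versus $0.03 without, about a 78% reduction on that portion of the bill.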
5 Common Pitfalls
1. Region not supported
Claude on Bedrock is not available in all regions. As of 2026, us-east-1 and us-west-2 are the primary regions. To use it from Tokyo, enable Cross-region inference.
```typescript
// Use a cross-region inference profile ID (the model ID with a geography
// prefix) rather than a single-region model ARN; confirm the exact profile
// IDs available to your account in the Bedrock console
const crossRegionModelId = "us.anthropic.claude-sonnet-4-6";
```
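To my understanding, a cross-region inference profile ID is just the Bedrock model ID with a geography prefix (`us.`, `eu.`, or `apac.`), so a small hypothetical helper can assemble one. Always confirm the profile IDs actually enabled for your account in the Bedrock console:

```typescript
// Hypothetical helper: build a cross-region inference profile ID by
// prefixing a Bedrock model ID with a geography code.
type Geography = "us" | "eu" | "apac";

export function inferenceProfileId(geo: Geography, modelId: string): string {
  return `${geo}.${modelId}`;
}
```

From Tokyo you would typically use the `apac` profile, e.g. `inferenceProfileId("apac", "anthropic.claude-sonnet-4-6")`.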
2. Forgetting to request model access
In Bedrock, you must request “Model access” for each model you want to use. Calling a model without requesting access will result in an AccessDeniedException. Always request access before coding with Claude Code.
3. Lambda timeout too short
Claude responses can take 10–30 seconds. The Lambda default of 3 seconds will definitely time out. Set it to at least 30 seconds, and 60–300 seconds for longer generations.
4. Confusing Bedrock model IDs with Anthropic API IDs
❌ Using the Anthropic API ID directly: `"claude-sonnet-4-6"`
✅ Bedrock ID: `"anthropic.claude-sonnet-4-6"`
5. Not accounting for Cross-region inference latency
Calling models in us-east-1 from Tokyo adds round-trip network latency (approximately 100–200ms). For latency-sensitive applications, use streaming to reduce perceived delay.
Summary
| Task | Claude Code’s Contribution |
|---|---|
| Basic implementation | Generates AnthropicBedrock client and functions |
| Lambda integration | Generates handler and IAM policy together |
| RAG implementation | Auto-generates Knowledge Base integration code |
| Cost optimization | Designs model selection logic by task type |
| Troubleshooting | Identifies cause and suggests fix from error logs |
Develop with Claude Code, run in production on Bedrock — this combination satisfies security, cost, and scalability requirements all at once. Start with the free Bedrock trial, and when you’re ready to go to production, all you need is to configure the IAM role.
Related Articles
- Claude Code × AWS Lambda Complete Guide
- Claude Code × AWS IAM Complete Guide
- Complete Guide to Reducing Claude Code API Costs by 90%
About the Author
Masa
Engineer obsessed with Claude Code. Runs claudecode-lab.com, a 10-language tech media with 2,000+ pages.