Amazon Titan Embedding Pricing: Titan Text Embeddings V2 on Bedrock (April 2026)
AWS Bedrock's embedding page covers all Bedrock models. This page focuses on Titan V2, the binary embedding storage trick, and when AWS-native embeddings make sense over direct API access.
Current Pricing (US East)
| Model | $/M tokens | Dims | Context | Binary |
|---|---|---|---|---|
| Titan Text Embeddings V2 | $0.20 | 1024 | 8,192 | Yes (4x storage reduction) |
| Cohere Embed v3 (via Bedrock) | $0.10 | 1024 | 512 | No |
| Voyage 3.5 (via Bedrock) | $0.06 | 1024 | 32,000 | No |
Bedrock pricing varies by region. Rates above are for US East 1. Other regions may be 10-20% higher. Always verify at the AWS Bedrock pricing page.
Binary Embeddings: 4x Storage Reduction
Titan V2 supports binary quantization - storing each embedding dimension as 1 bit instead of 32 bits. This reduces storage cost by 96% with a 5-10% reduction in retrieval accuracy (depending on the domain). AWS buries this feature in a blog post; it is one of the most cost-effective storage optimizations available from any provider.
| Scale | Float32 storage | Binary storage | Reduction |
|---|---|---|---|
| 1M vectors | 3.8 GB | 122 MB | 32x smaller |
| 10M vectors | 38.1 GB | 1.2 GB | 32x smaller |
| 100M vectors | 381.5 GB | 11.9 GB | 32x smaller |
Binary storage uses 1 bit per dimension vs 32 bits for float32. At 1024 dims, each binary vector is 128 bytes instead of 4,096 bytes.
When Bedrock Wins: Compliance, Residency, AWS Ecosystem
- - AWS-only data processing compliance
- - Existing AWS Enterprise Agreement
- - OpenSearch Serverless integration
- - SageMaker pipeline embedding steps
- - Regional data residency required
- - Multi-model Bedrock workflow
- - Cost is primary concern (use OAI small)
- - No AWS compliance requirements
- - Need OpenAI Batch API discount
- - Need Voyage's accuracy advantage
- - No existing AWS infrastructure
OpenSearch Serverless + Titan V2: Reference Architecture Cost
For AWS-native semantic search, Titan V2 + OpenSearch Serverless is the reference architecture. Monthly cost for a medium-scale application (100k documents, 5k queries/day):
Note: OpenSearch Serverless minimum 2 OCUs at ~$175/OCU/month is the dominant cost - roughly $350/month. For smaller applications, a pgvector or Qdrant self-hosted deployment is significantly cheaper.