Embedding Cost: Real Prices from OpenAI, Cohere, Voyage, Google & AWS
The independent, engineer-grade reference for embedding model pricing. Not a vendor page, not a blog post. A calculator, comparison tables, and real-world RAG worked examples.
Quick Cost Estimate
~20,000 documents at 500 tokens each
Cheapest option for 10.0M tokens: BGE-M3 (self-hosted) at $0.001/M tokens.
Full calculator with storage
Key Pricing Landmarks
Quick Decision Guide
Per-model rates ($0.02/M small, $0.13/M large, $0.10/M ada-002), 50% Batch API discount math, Azure OpenAI differences, Matryoshka dimensions, and migration paths.
Pricing Cheat Sheet
Full comparison with MTEB scores

| Model | Provider | $/M tokens | Batch $/M | Dims | MTEB |
|---|---|---|---|---|---|
| text-embedding-3-small | OpenAI | $0.020 | $0.010 | 1,536 | 62.3 |
| text-embedding-3-large | OpenAI | $0.130 | $0.065 | 3,072 | 64.6 |
| embed-v4 | Cohere | $0.100 | - | 1,024 | 55 |
| voyage-3.5 | Voyage AI | $0.060 | $0.040 | 1,024 | 67.1 |
| voyage-3-large | Voyage AI | $0.180 | $0.120 | 2,048 | 68.9 |
| voyage-3-lite | Voyage AI | $0.020 | $0.013 | 512 | 61.7 |
| gemini-embedding-2-preview | Google | $0.200 | - | 3,072 | 68 |
| gemini-embedding-001 | Google | $0.150 | - | 3,072 | 65.4 |
| Titan Text Embeddings V2 | Amazon Bedrock | $0.200 | - | 1,024 | 62.8 |
| BGE-M3 (self-hosted) | Self-Hosted | $0.001 | $0.001 | 1,024 | 66.5 |
Prices as of May 2026. MTEB Retrieval average scores where publicly available.
How Embedding Pricing Works
Embedding models convert text into numeric vectors. Providers charge for the tokens you send in (input-only pricing) - there are no output tokens to pay for. This makes embeddings 10-100x cheaper than chat completions at similar token volumes.
The formula is simple: cost = (tokens / 1,000,000) x rate. Where engineers get surprised is the Batch API discount (50% off for OpenAI, 33% off for Voyage), which can cut your indexing bill by a third to a half.
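As a rough illustration of that formula, here is a minimal Python sketch. The function name `embedding_cost` and the `batch_discount` parameter are hypothetical helpers made up for this example, not part of any provider SDK; the rates are the ones listed in the cheat sheet above.

```python
def embedding_cost(tokens: int, rate_per_million: float, batch_discount: float = 0.0) -> float:
    """Return the USD cost of embedding `tokens` input tokens.

    rate_per_million: list price in USD per 1M tokens.
    batch_discount:   fraction taken off the list price (0.5 = 50% off).
    """
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    if not 0.0 <= batch_discount < 1.0:
        raise ValueError("batch_discount must be in [0, 1)")
    return tokens / 1_000_000 * rate_per_million * (1.0 - batch_discount)


# ~20,000 documents at 500 tokens each = 10M tokens
tokens = 20_000 * 500

# text-embedding-3-small at $0.02/M, with and without the 50% batch discount
standard = embedding_cost(tokens, 0.02)
batched = embedding_cost(tokens, 0.02, batch_discount=0.5)
print(f"standard: ${standard:.2f}  batch: ${batched:.2f}")
```

The same function covers any provider in the table: pass the list rate and, where a batch tier exists, the discount fraction.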
The other surprise is storage cost. Embedding 1B tokens with OpenAI small costs $20 once. Storing the resulting vectors in a managed vector DB can cost $200+/month for as long as you keep them. For long-lived corpora, the recurring storage bill dwarfs the one-time generation cost within a few months.
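To make the one-time versus recurring split concrete, the sketch below adds the two up over time. It reuses the illustrative figures from the paragraph above ($20 generation, $200/month storage); these are not quotes from any specific vector DB vendor, and `total_cost` is a hypothetical helper name.

```python
def total_cost(generation_usd: float, storage_usd_per_month: float, months: int) -> float:
    """One-time generation cost plus recurring storage over `months` months."""
    if months < 0:
        raise ValueError("months must be non-negative")
    return generation_usd + storage_usd_per_month * months


GENERATION_USD = 20.0          # 1B tokens at $0.02/M (figure from the text)
STORAGE_USD_PER_MONTH = 200.0  # illustrative managed vector DB bill (assumption)

for months in (1, 3, 12):
    total = total_cost(GENERATION_USD, STORAGE_USD_PER_MONTH, months)
    storage_share = (total - GENERATION_USD) / total
    print(f"{months:>2} months: total ${total:,.0f}, storage share {storage_share:.0%}")
```

Under these assumptions storage is already most of the bill after the first month, which is why a Year 1 total is a more useful planning number than the generation price alone.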
Chunk overlap also inflates token counts 10-25% over the raw text size. A 1GB corpus is typically 750M raw tokens but 830-940M billed tokens after overlapping chunking. Plan for it.
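The inflation from overlap can be estimated before indexing. The sketch below assumes a plain sliding-window chunker with a fixed chunk size and a fixed overlap, both measured in tokens; real chunkers that split on sentences or headings will behave somewhat differently, and `billed_tokens` is a hypothetical helper name.

```python
def billed_tokens(doc_tokens: int, chunk_size: int, overlap: int) -> int:
    """Count tokens sent to the embedding API for one document.

    Assumes a sliding window: each chunk starts `chunk_size - overlap`
    tokens after the previous one, and the last chunk may be shorter.
    """
    if doc_tokens < 0:
        raise ValueError("doc_tokens must be non-negative")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    stride = chunk_size - overlap
    total = 0
    start = 0
    while True:
        end = min(start + chunk_size, doc_tokens)
        total += end - start      # tokens in this chunk
        if end >= doc_tokens:     # this chunk reached the end of the document
            break
        start += stride
    return total


raw = 10_000
for overlap in (0, 50, 100):
    billed = billed_tokens(raw, chunk_size=500, overlap=overlap)
    print(f"overlap {overlap:>3}: {billed} billed tokens ({billed / raw - 1:.1%} inflation)")
```

With 500-token chunks, a 50-token overlap adds about 11% and a 100-token overlap about 24%, which lines up with the 10-25% range above.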
Cost at Common Scales
| Scale | OpenAI small | Voyage 3.5 | Self-hosted |
|---|---|---|---|
| 10M tokens | $0.20 | $0.60 | $0.01 |
| 100M tokens | $2 | $6 | $0.10 |
| 1B tokens | $20 | $60 | $1 |
| 10B tokens | $200 | $600 | $10 |
Self-hosted cost is estimated for BGE-M3 on a spot A100. Excludes DevOps overhead.
Break-even analysis
Go Deeper
Multi-provider calculator with storage cost, batch toggle, and Year 1 total
Side-by-side table with MTEB scores, context windows, and free tiers
Interactive calculator: at what volume does BGE-M3 beat the APIs?
Pinecone vs pgvector vs Qdrant at 1M / 10M / 100M vector scale