Cohere Embed Pricing: embed-v4, embed-v3 & Multilingual Options (April 2026)
Independent pricing reference for Cohere's embedding models. Text and image rates, context window caveats, multilingual performance data, and enterprise deployment options.
Current Pricing
| Model | $/M tokens | Batch | Dims | Context | Best for |
|---|---|---|---|---|---|
| embed-v4 (text) | $0.10 | None | 1,024 | 512 tokens | Multilingual text search |
| embed-v4 (images) | $0.47 | None | 1,024 | N/A | Visual search, cross-modal |
| embed-v3 (legacy) | $0.10 | None | 1,024 | 512 tokens | Migrate to v4 |
Multilingual Strength: The Core Case for Cohere
Cohere's primary competitive advantage is multilingual quality. Embed-v4 supports 100+ languages with demonstrably better performance on non-Latin scripts compared to OpenAI text-embedding-3-small. Published benchmarks show 15-20% retrieval quality improvement for Arabic, Hindi, Japanese, and Chinese content.
For applications that operate purely in English, this advantage largely disappears. At $0.10/M tokens (5x more than OpenAI small), Cohere is hard to justify for English-only RAG. The math changes when your retrieval quality requirements in non-English languages are non-negotiable.
Image Embedding with embed-v4
At $0.47/M image tokens, Cohere's multimodal capability enables a shared embedding space for text and images. A text query can retrieve relevant images, and image inputs can retrieve relevant text - useful for e-commerce product search, media libraries, and cross-modal RAG. The 1024-dimension shared space means you can index images and text into the same vector database collection.
Enterprise Deployment: AWS Marketplace & On-Premises
Cohere has a strong enterprise deployment story. Models are available through:
- -AWS Bedrock: Cohere Embed is available as a Bedrock foundation model. Data stays within your AWS VPC. Bedrock pricing applies (may differ from direct).
- -Azure AI Foundry: Available via Microsoft's model marketplace for enterprise Azure customers.
- -Private deployment: Cohere offers private cloud and on-premises deployment for regulated industries where data residency is mandatory.
When to Pick Cohere vs OpenAI vs Voyage
- - Multilingual corpus (100+ languages)
- - Non-Latin script accuracy matters
- - AWS/Azure enterprise deployment
- - Multimodal (text + image) search
- - English-only RAG applications
- - Budget is primary constraint
- - Existing OpenAI API integration
- - Long documents (8k token context)
- - Best accuracy-to-price ratio
- - Domain-specific models (code, law)
- - Long-context documents (32k)
- - Mix index/query model sizes
Free Tier
Cohere's free trial tier allows 100 API calls per minute with rate-limited access to all embedding models. There is no lifetime token cap - the limitation is rate, not volume. This is sufficient for building and testing RAG pipelines but not production workloads. For more than 100 calls/minute, you need a paid plan.