Independent resource. Not affiliated with any provider. Always verify pricing on provider sites.
$embeddingcost

Azure OpenAI Embedding Pricing 2026 (May 2026)

Per-token pricing for text-embedding-3-small, 3-large, and ada-002 on Azure OpenAI Service, plus how PTU reserved capacity changes the math and the key differences against OpenAI direct.

Verified May 2026

Current Pricing (Pay-As-You-Go)

ModelStandard $/MBatch $/MDimsMTEB
text-embedding-3-small
Recommended
$0.02not offered153662.3
text-embedding-3-large
High accuracy
$0.13not offered307264.6
text-embedding-ada-002
Legacy - migrate
$0.10not offered153660.5

Pay-as-you-go rates verified May 2026. The Batch API 50% discount available on OpenAI direct is not offered on Azure OpenAI for embedding models.

Cost at scale (pay-as-you-go)

Model100M tokens1B tokens10B tokens
text-embedding-3-small$2$20$200
text-embedding-3-large$13$130$1,300
text-embedding-ada-002$10$100$1,000

Provisioned Throughput Units (PTUs)

PTUs are Azure's reserved-capacity pricing model. You commit to a fixed throughput allocation for a monthly or annual term in exchange for predictable pricing, guaranteed capacity, and protection against pay-as-you-go list-price changes.

Embedding workloads typically need fewer PTUs than chat-completion workloads because embedding requests are short and high-throughput. The exact discount versus pay-as-you-go depends on region, term length, and any Enterprise Agreement-level negotiation. Public list reference points have historically broken even with pay-as-you-go at roughly 200-300 million tokens per month for embeddings, but Microsoft does not publish a single canonical PTU price for embeddings, so you must request a quote.

When PTU makes sense: sustained high-throughput indexing (catalogue ingestion, search-index refresh), latency-sensitive online retrieval that needs guaranteed capacity, or regulated workloads where pay-as-you-go variance is a procurement blocker. For variable or bursty workloads, pay-as-you-go is almost always cheaper.

Azure OpenAI versus OpenAI direct: key differences

AspectOpenAI directAzure OpenAI
Pay-as-you-go $/M tokens$0.02 - $0.13$0.02 - $0.13 (matches)
Batch API 50% discountYesNo
Reserved capacityNot offered for embeddingsPTU (monthly / annual term)
Regional controlGlobal only25+ regions, data residency
Enterprise complianceSOC 2, GDPRSOC 2, GDPR, HIPAA, FedRAMP, more via inheritance
Private networkingPublic endpoint onlyPrivate Endpoints, VNet
Volume discountsLimited (Enterprise tier)EA-level discounts on Azure commit

Choosing between Azure and OpenAI direct

Pick OpenAI direct if your workload is mostly bulk indexing (Batch API saves 50%), you have no data-residency requirement, and you want the lowest absolute cost.

Pick Azure OpenAI if you need EU / UK / US data residency, you are already on an Azure Enterprise Agreement (committed Azure spend reduces effective embedding cost), you need Private Endpoints / VNet, or you require HIPAA / FedRAMP compliance.

Pick Azure PTU if you have sustained, predictable embedding throughput above roughly 200-300 million tokens per month and your latency requirements rule out the Batch API anyway. Get a Microsoft account team quote before assuming PTU is cheaper - the answer is region- and term-dependent.

OpenAI direct pricingAll embedding providersCost calculatorMethodology

Updated May 2026