Independent resource. Not affiliated with any provider. Always verify pricing on provider sites.

Google Gemini Embedding Pricing: gemini-embedding-001, 2-preview & text-embedding-005 (April 2026)

Google's embedding pricing is spread across Gemini API and Vertex AI docs. This page consolidates the rates, explains Matryoshka dimension options, and clarifies the Gemini API vs Vertex AI billing differences.

Verified April 2026

Current Pricing

Model                      | $/M tokens | Dims (MRL)    | Context      | Status
gemini-embedding-2-preview | $0.20      | 768/1536/3072 | 8,192 tokens | Preview
gemini-embedding-001       | $0.15      | 768/1536/3072 | 2,048 tokens | GA (stable)
text-embedding-005         | $0.15      | 768/1536/3072 | 2,048 tokens | Legacy

All models are accessible via the Gemini API or Vertex AI. Preview pricing is subject to change at GA launch.
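The per-token rates translate into a budget with one multiplication. The sketch below is illustrative only: the function name and the sample workload are hypothetical, and the rates are simply copied from the table on this page, so re-check them against Google's pricing pages before relying on the result.

```python
from decimal import Decimal

# USD per million input tokens, copied from the table on this page (April 2026).
# These are assumptions to re-verify, not values fetched from Google.
PRICE_PER_MILLION_TOKENS = {
    "gemini-embedding-2-preview": Decimal("0.20"),
    "gemini-embedding-001": Decimal("0.15"),
    "text-embedding-005": Decimal("0.15"),
}


def embedding_cost_usd(model: str, total_tokens: int) -> Decimal:
    """Estimate the API cost in USD for embedding `total_tokens` tokens."""
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    rate = PRICE_PER_MILLION_TOKENS[model]
    return rate * Decimal(total_tokens) / Decimal(1_000_000)


# Hypothetical workload: 2 million documents averaging 500 tokens each.
corpus_tokens = 2_000_000 * 500
for model_name in PRICE_PER_MILLION_TOKENS:
    print(model_name, embedding_cost_usd(model_name, corpus_tokens))
```

At one billion tokens the gap between the $0.15 and $0.20 rates is $50, which is usually small next to the storage costs discussed in the next section.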

Matryoshka Dimensions: Storage Cost Impact

Both gemini-embedding-001 and gemini-embedding-2-preview support Matryoshka Representation Learning. You can request 768, 1536, or 3072-dimension vectors. The API token price is identical regardless of dimension count - savings are in downstream storage. For 100 million vectors:

Dimensions | Bytes/vector (float32) | GiB per 100M vectors | Storage ratio
3,072      | 12,288                 | 1,144 GiB            | 4x (baseline)
1,536      | 6,144                  | 572 GiB              | 2x
768        | 3,072                  | 286 GiB              | 1x (cheapest)
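Raw vector storage is dimensions × bytes per component × number of vectors. The sketch below assumes uncompressed float32 components (4 bytes each, which is what the bytes-per-vector column implies) and ignores index overhead, metadata, and replication, all of which vary by vector database. The function name is hypothetical.

```python
BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as uncompressed float32


def raw_vector_storage_bytes(dimensions: int, num_vectors: int) -> int:
    """Raw bytes needed to store `num_vectors` float32 vectors of `dimensions`."""
    if dimensions <= 0 or num_vectors < 0:
        raise ValueError("dimensions must be positive and num_vectors non-negative")
    return dimensions * BYTES_PER_FLOAT32 * num_vectors


NUM_VECTORS = 100_000_000
for dims in (3072, 1536, 768):
    size = raw_vector_storage_bytes(dims, NUM_VECTORS)
    # 1 GiB = 2**30 bytes
    print(f"{dims} dims: {size / 2**30:,.0f} GiB")
```

Changing `BYTES_PER_FLOAT32` to 2 or 1 gives a rough estimate for half-precision or int8-quantized storage, if your vector store supports those formats.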

Using 768 dimensions instead of 3072 cuts vector storage to one quarter. Quality loss on MTEB Retrieval is typically 2-5 points, which is a good trade-off for high-scale applications where storage costs dominate.
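If you already hold full-size vectors, MRL-trained embeddings are commonly shortened by keeping the leading components and re-normalizing to unit length so cosine and dot-product scores stay comparable. The sketch below shows that general technique in plain Python. The function name is hypothetical, and this page does not confirm that the result is identical to what the API returns when you request a smaller dimension directly, so treat it as an assumption to validate on your own data.

```python
import math


def truncate_and_normalize(vector: list[float], dimensions: int) -> list[float]:
    """Keep the first `dimensions` components and rescale to unit L2 norm."""
    if not 0 < dimensions <= len(vector):
        raise ValueError("dimensions must be between 1 and len(vector)")
    head = vector[:dimensions]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


# Stand-in for a real 3072-dimension embedding (synthetic values, not API output).
full_vector = [math.sin(i + 1) for i in range(3072)]
short_vector = truncate_and_normalize(full_vector, 768)
print(len(short_vector))
```

Storing the full vector once and deriving shorter ones on demand lets you compare retrieval quality at several sizes without paying for re-embedding.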

Vertex AI vs Gemini API: Which to Use

Gemini API (Google AI Studio)
  • Free tier: ~1,500 requests/day
  • Simple token-based billing
  • Quick setup, great for prototypes
  • Rate limits lower than Vertex
  • Data processed in Google's infra

Vertex AI
  • $300 free trial credits (90 days)
  • Enterprise quotas and SLAs
  • VPC controls, data residency
  • Google Cloud billing integration
  • Best for production GCP workloads
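To judge whether the free tier is enough for a prototype, divide the number of requests you need by the daily allowance. The sketch below uses the approximate 1,500 requests/day figure from the list above and assumes one request per text; the function name is hypothetical, and any per-minute or per-token limits (not covered on this page) are ignored.

```python
import math

# Approximate free-tier allowance quoted on this page; verify before relying on it.
FREE_TIER_REQUESTS_PER_DAY = 1_500


def days_on_free_tier(num_requests: int,
                      daily_limit: int = FREE_TIER_REQUESTS_PER_DAY) -> int:
    """Whole days needed to send `num_requests` at `daily_limit` requests per day."""
    if num_requests < 0 or daily_limit <= 0:
        raise ValueError("num_requests must be >= 0 and daily_limit must be > 0")
    return math.ceil(num_requests / daily_limit)


# Hypothetical prototype corpus: 10,000 texts, one request each.
print(days_on_free_tier(10_000))
```

If the answer runs to weeks, that is usually the point where paid Gemini API usage or Vertex AI becomes the practical choice.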

gemini-embedding-2: The Multimodal Upgrade

The gemini-embedding-2-preview model is natively multimodal: it can embed text, images, and video in a shared vector space. At $0.20/M tokens (text), it is Google's premium embedding offering. Because the model is still in preview, pricing may change at general availability. The 8,192-token context window (vs 2,048 for gemini-embedding-001) is a significant upgrade for long-document RAG applications.
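The practical effect of the larger context window is fewer chunks per document. The sketch below counts how many fixed-size chunks a document needs under each limit. The function name, the optional overlap parameter, and the 20,000-token sample document are hypothetical; the context sizes are the ones quoted on this page.

```python
import math


def chunks_needed(document_tokens: int, context_tokens: int,
                  overlap_tokens: int = 0) -> int:
    """Number of chunks needed to cover a document with fixed-size windows."""
    if context_tokens <= 0 or not 0 <= overlap_tokens < context_tokens:
        raise ValueError("need context_tokens > 0 and 0 <= overlap_tokens < context_tokens")
    if document_tokens <= 0:
        return 0
    if document_tokens <= context_tokens:
        return 1
    # Each chunk after the first advances by (context - overlap) tokens.
    stride = context_tokens - overlap_tokens
    return 1 + math.ceil((document_tokens - context_tokens) / stride)


# Context sizes quoted on this page.
CONTEXT_TOKENS = {
    "gemini-embedding-2-preview": 8_192,
    "gemini-embedding-001": 2_048,
}

document_tokens = 20_000  # hypothetical long document
for model_name, context in CONTEXT_TOKENS.items():
    print(model_name, chunks_needed(document_tokens, context))
```

Fewer chunks mean fewer vectors to store and search, although without overlap the token count billed stays the same; whether larger chunks retrieve better depends on the corpus.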

Frequently Asked Questions

How much does Google Gemini embedding cost?
gemini-embedding-001 costs $0.15 per million tokens (GA, stable). gemini-embedding-2-preview costs $0.20 per million tokens. Both support MRL dimensions of 768/1536/3072 with no additional cost.

What is the difference between Gemini API and Vertex AI for embeddings?
Gemini API is simpler with a free rate-limited tier. Vertex AI is the enterprise platform with VPC controls, data residency, and Google Cloud billing. Pricing is similar but Vertex adds enterprise SLAs and quota flexibility.

Does Google support Matryoshka dimensions for embeddings?
Yes. Both gemini-embedding-001 and gemini-embedding-2-preview support MRL with output dimensions of 3072, 1536, or 768. Using 768 dimensions instead of 3072 reduces storage cost by 4x.

What is the Google Gemini embedding free tier?
Google AI Studio offers approximately 1,500 free embedding requests per day. Vertex AI provides $300 in free credits valid for 90 days for new GCP accounts.
Disclaimer: Independent resource. Not affiliated with Google. Pricing from Google AI and Vertex AI public pricing pages, verified April 2026. Always verify at ai.google.dev/pricing.