Customer Support Bot Cost Breakdown: 50,000-Ticket RAG Implementation (April 2026)
Mid-market SaaS company. 50,000 historical support tickets. 2,000 new queries per day. Answers must cite source tickets. Full cost breakdown across three provider stacks.
Scenario Inputs
Cost by Provider Stack
OpenAI small + pgvectorrecommended
The cheapest production-grade option. pgvector on an existing Postgres instance costs nothing extra. Total Year 1 under $2.
| Cost item | Standard | With Batch API | Notes |
|---|---|---|---|
| One-time indexing (25M tokens) | $0.50 | $0.25 | Once; re-run on model upgrade |
| Monthly query embedding (1.8M tokens) | $0.04/mo | N/A (real-time) | Ongoing; can't batch queries |
| Vector storage (pgvector) | $0.0066/mo | $0.0066/mo | 307 MB at $0.02/GB |
| Year 1 Total | $1.01 | $0.76 | Embed (batch) + 12 mo ongoing |
OpenAI small + Pinecone SL
Managed vector DB adds $0.10/month storage. Easier ops than pgvector but unnecessary at this scale.
| Cost item | Standard | With Batch API | Notes |
|---|---|---|---|
| One-time indexing (25M tokens) | $0.50 | $0.25 | Once; re-run on model upgrade |
| Monthly query embedding (1.8M tokens) | $0.04/mo | N/A (real-time) | Ongoing; can't batch queries |
| Vector storage (Pinecone serverless) | $0.09/mo | $0.09/mo | 307 MB at $0.33/GB |
| Year 1 Total | $2.06 | $1.81 | Embed (batch) + 12 mo ongoing |
Voyage 3.5 + pgvector
Higher MTEB accuracy (67.1 vs 62.3). Worth it if retrieval quality directly affects answer quality. 3x embed cost.
| Cost item | Standard | With Batch API | Notes |
|---|---|---|---|
| One-time indexing (25M tokens) | $1.50 | $1.00 | Once; re-run on model upgrade |
| Monthly query embedding (1.8M tokens) | $0.11/mo | N/A (real-time) | Ongoing; can't batch queries |
| Vector storage (pgvector) | $0.0066/mo | $0.0066/mo | 307 MB at $0.02/GB |
| Year 1 Total | $2.87 | $2.37 | Embed (batch) + 12 mo ongoing |
Year 1 Cost (with 1 re-embedding pass)
If you upgrade embedding models once during Year 1 (common when moving from ada-002 to text-embedding-3-small, or from 3-small to Voyage), add one more one-time indexing cost. For this scenario at OpenAI small batch: $0.25 x 2 = $0.50 total indexing. Annual total still under $1.50 including queries and storage.
Optimization Opportunities
Use for the initial archive indexing pass. Queries must use standard real-time API.
Many support bot queries are near-identical ("how do I reset my password?"). LRU cache on exact query text captures a significant fraction.
At 307 MB, storage cost is trivial on either platform. But pgvector removes any cold-start risk and dependency on Pinecone availability.