Vector Database Comparison: Pinecone vs Weaviate vs Chroma vs pgvector
Choosing the right vector database shapes the retrieval quality, operational burden, and cost of your RAG or semantic search system. This guide compares the leading options with concrete trade-offs.
Quick comparison
| Database | Type | Hosting | Best for | Weak at |
|---|---|---|---|---|
| Pinecone | Purpose-built | Managed cloud | Zero-ops production scale | SQL joins, complex filters |
| Weaviate | Purpose-built | Self / cloud | Multi-tenancy, hybrid search | Ops complexity |
| Chroma | Purpose-built | In-process / cloud | Prototyping, local dev | Large-scale production |
| Qdrant | Purpose-built | Self / cloud | High-throughput, low latency | Managed tier maturity |
| pgvector | PostgreSQL ext | Any Postgres | Existing Postgres stacks | Billion-vector ANN scale |
| Milvus | Purpose-built | Self / cloud | Billion-vector datasets | Operational simplicity |
Pinecone
Fully managed, serverless vector database. Near-zero operational overhead.
import { Pinecone } from '@pinecone-database/pinecone';
const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
const index = pc.index('my-index');
// Upsert
await index.upsert([{
  id: 'doc-1',
  values: embedding,
  metadata: { source: 'docs/api.md', category: 'api' },
}]);
// Query
const results = await index.query({
  vector: queryEmbedding,
  topK: 5,
  filter: { category: 'api' },
  includeMetadata: true,
});
- Serverless tier: pay per read/write unit; ideal for low-traffic apps.
- Pods tier: predictable latency for high-traffic production.
- Supports sparse-dense hybrid search (sparse encoder required).
Weaviate
Open-source, GraphQL-native, with built-in hybrid search and multi-tenancy.
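Hybrid search blends a keyword (BM25) ranking with a vector ranking through a weight commonly called alpha (0 = keyword only, 1 = vector only). The sketch below illustrates the general idea with min-max normalization followed by a weighted sum. It is a simplified illustration, not Weaviate's internal fusion algorithm, and the names `Scored`, `normalizeScores`, and `fuseHybrid` are hypothetical.

```typescript
// Simplified illustration of alpha-weighted hybrid scoring.
// Not Weaviate's implementation; all names here are hypothetical.
type Scored = { id: string; score: number };

// Min-max normalize scores into [0, 1] so the two result lists are comparable.
function normalizeScores(results: Scored[]): Record<string, number> {
  const out: Record<string, number> = {};
  if (results.length === 0) return out;
  const scores = results.map((r) => r.score);
  const min = Math.min(...scores);
  const max = Math.max(...scores);
  const range = max - min;
  for (const r of results) {
    // If every score is identical, treat each one as a full match.
    out[r.id] = range === 0 ? 1 : (r.score - min) / range;
  }
  return out;
}

// alpha = 0 keeps only keyword scores; alpha = 1 keeps only vector scores.
function fuseHybrid(keyword: Scored[], vector: Scored[], alpha: number): Scored[] {
  const kw = normalizeScores(keyword);
  const vec = normalizeScores(vector);
  const merged: Record<string, number> = {};
  // Walk the union of ids; a document missing from one list scores 0 there.
  for (const id of Object.keys(kw).concat(Object.keys(vec))) {
    const vectorScore = id in vec ? vec[id] : 0;
    const keywordScore = id in kw ? kw[id] : 0;
    merged[id] = alpha * vectorScore + (1 - alpha) * keywordScore;
  }
  return Object.keys(merged)
    .map((id) => ({ id, score: merged[id] }))
    .sort((a, b) => b.score - a.score);
}

// Made-up scores for illustration only.
const keywordHits: Scored[] = [
  { id: 'a', score: 12.0 },
  { id: 'b', score: 4.0 },
];
const vectorHits: Scored[] = [
  { id: 'b', score: 0.92 },
  { id: 'c', score: 0.8 },
];
const fusedHits = fuseHybrid(keywordHits, vectorHits, 0.75);
console.log(fusedHits.map((h) => h.id)); // [ 'b', 'a', 'c' ]
```

With alpha at 0.75, document `b` wins because it tops the vector list, even though it ranks last in the keyword list. Lowering alpha toward 0 would push `a` ahead.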
import weaviate from 'weaviate-client';
const client = await weaviate.connectToWeaviateCloud(process.env.WCS_URL!, {
  authCredentials: new weaviate.ApiKey(process.env.WCS_API_KEY!),
});
const collection = client.collections.get('Document');
// Insert
await collection.data.insert({
  properties: { title: 'API Reference', content: chunk },
  vectors: embedding,
});
// Hybrid query
// If the collection has no vectorizer configured, also pass `vector: queryEmbedding`.
const results = await collection.query.hybrid(queryText, {
  limit: 5,
  alpha: 0.75, // 0=BM25 only, 1=vector only
  returnMetadata: ['score'],
});
Chroma
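Simplest to start with. In Python it can run in-process with no external dependency; the JavaScript client shown here connects to a running Chroma server (http://localhost:8000 by default).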
import { ChromaClient } from 'chromadb';
const client = new ChromaClient();
const collection = await client.getOrCreateCollection({ name: 'docs' });
// Add documents (Chroma embeds automatically if you set an embedding function)
await collection.add({
  ids: ['doc-1', 'doc-2'],
  documents: [chunk1, chunk2],
  metadatas: [{ source: 'page-1' }, { source: 'page-2' }],
});
// Query
const results = await collection.query({
  queryTexts: [userQuestion],
  nResults: 5,
  where: { source: 'page-1' },
});
pgvector
PostgreSQL extension — no new infrastructure if you already run Postgres.
-- Setup
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE documents (
  id BIGSERIAL PRIMARY KEY,
  source TEXT,
  chunk TEXT,
  embedding VECTOR(1536) -- 1536 for text-embedding-3-small
);
-- IVFFlat clusters the rows that already exist, so build this index after loading data
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);
-- Insert
INSERT INTO documents (source, chunk, embedding)
VALUES ('docs/api.md', $1, $2::vector);
-- Nearest-neighbor query
SELECT source, chunk, 1 - (embedding <=> $1::vector) AS similarity
FROM documents
ORDER BY embedding <=> $1::vector
LIMIT 5;
- HNSW index (pgvector 0.5.0+) usually gives a better speed-recall trade-off than IVFFlat, at the cost of slower index builds and higher memory use.
- Combine with full-text search using tsvector for hybrid search.
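The `$1::vector` and `$2::vector` parameters in the SQL above expect pgvector's text representation, a bracketed comma-separated list such as `[0.1,0.2,0.3]`. A minimal sketch of a helper that produces that string from a JavaScript array follows; the name `toVectorLiteral` is hypothetical and not part of any library.

```typescript
// Hypothetical helper: format an embedding as a pgvector text literal,
// e.g. "[0.1,0.2,0.3]", for use as a parameter cast with ::vector.
function toVectorLiteral(embedding: number[]): string {
  if (embedding.length === 0) {
    throw new Error('embedding must not be empty');
  }
  for (const value of embedding) {
    // pgvector rejects NaN and infinite values, so fail early on the client.
    if (!isFinite(value)) {
      throw new Error('embedding contains a non-finite value');
    }
  }
  return '[' + embedding.join(',') + ']';
}

const vectorLiteral = toVectorLiteral([0.1, 0.2, 0.3]);
console.log(vectorLiteral); // [0.1,0.2,0.3]
```

With a driver such as node-postgres, the resulting string would be passed as the query parameter in place of the raw array. Some drivers and ORMs ship their own pgvector type adapters, in which case a helper like this is unnecessary.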
Qdrant
Written in Rust for high throughput and low memory usage.
import { QdrantClient } from '@qdrant/js-client-rest';
const client = new QdrantClient({ url: 'http://localhost:6333' });
await client.upsert('my_collection', {
  wait: true,
  points: [{
    id: 1,
    vector: embedding,
    payload: { source: 'docs/api.md', category: 'api' },
  }],
});
const results = await client.search('my_collection', {
  vector: queryEmbedding,
  limit: 5,
  filter: {
    must: [{ key: 'category', match: { value: 'api' } }],
  },
});
Decision framework
| Situation | Recommended |
|---|---|
| Prototype / hackathon | Chroma (in-process) |
| Already on PostgreSQL, <10M vectors | pgvector + HNSW |
| Production, no infra team | Pinecone serverless |
| Need multi-tenant isolation | Weaviate |
| High QPS, cost-sensitive | Qdrant self-hosted |
| Billion+ vectors | Milvus / Zilliz |
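The table above can be read as a small rule chain. The sketch below encodes it as a function, purely as an illustration: the `Requirements` fields, the `recommend` name, and especially the order in which the rules are checked are assumptions, since the table does not say which row wins when several apply.

```typescript
// Hypothetical encoding of the decision table. Field names and rule
// precedence are assumptions; the table itself does not define them.
interface Requirements {
  prototype: boolean;
  vectorCount: number;
  onPostgres: boolean;
  hasInfraTeam: boolean;
  needsMultiTenancy: boolean;
  highQpsCostSensitive: boolean;
}

// Returns the matching table row, or null when no row applies.
function recommend(req: Requirements): string | null {
  if (req.prototype) return 'Chroma (in-process)';
  if (req.vectorCount >= 1e9) return 'Milvus / Zilliz';
  if (req.needsMultiTenancy) return 'Weaviate';
  if (req.onPostgres && req.vectorCount < 10e6) return 'pgvector + HNSW';
  if (req.highQpsCostSensitive) return 'Qdrant self-hosted';
  if (!req.hasInfraTeam) return 'Pinecone serverless';
  return null;
}

const exampleChoice = recommend({
  prototype: false,
  vectorCount: 2000000,
  onPostgres: true,
  hasInfraTeam: true,
  needsMultiTenancy: false,
  highQpsCostSensitive: false,
});
console.log(exampleChoice); // pgvector + HNSW
```

Returning `null` instead of a default keeps the function from recommending something the table does not cover.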
Performance tips
- Use HNSW over IVFFlat for better recall at production scale.
- Normalize vectors before storage if using cosine similarity: on unit-length vectors the inner product yields the same ranking, so the per-query norm computation can be skipped.
- Index the metadata fields you filter on (payload indexes in Qdrant, ordinary column indexes in Postgres) to avoid full-scan filtering.
- Benchmark your embedding model dimensions — smaller dimensions reduce memory but may hurt recall.
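Two of the tips above can be checked with a few lines of arithmetic. The sketch below shows that the dot product of normalized vectors equals the cosine similarity of the originals, and estimates raw storage for a given dimension count. The function names are hypothetical, and the memory figure assumes 4-byte float32 components and ignores index and metadata overhead.

```typescript
// Scale a vector to unit length (L2 norm of 1).
function normalizeVector(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  if (norm === 0) {
    throw new Error('cannot normalize a zero vector');
  }
  return v.map((x) => x / norm);
}

function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Full cosine similarity, including the norm computation that
// pre-normalized vectors let you skip.
function cosineSimilarity(a: number[], b: number[]): number {
  return dot(a, b) / Math.sqrt(dot(a, a) * dot(b, b));
}

// Assumption: float32 components (4 bytes each), vectors only.
function rawVectorBytes(vectorCount: number, dimensions: number): number {
  return vectorCount * dimensions * 4;
}

const docVec = [3, 4];
const queryVec = [4, 3];
const viaCosine = cosineSimilarity(docVec, queryVec);
const viaDot = dot(normalizeVector(docVec), normalizeVector(queryVec));
console.log(viaCosine, viaDot); // both approximately 0.96

// One million vectors at 1536 vs 512 dimensions, in decimal gigabytes.
console.log(rawVectorBytes(1000000, 1536) / 1e9); // 6.144
console.log(rawVectorBytes(1000000, 512) / 1e9); // 2.048
```

The memory estimate covers only the raw vectors. Whether a smaller dimension count keeps recall acceptable still has to be measured on your own data.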
Takeaway
Start with pgvector if you already run PostgreSQL. Graduate to Pinecone or Qdrant when operational simplicity or throughput requirements justify a dedicated vector store. Avoid over-engineering the vector layer before your retrieval quality has been validated.