What is a vector database?
A vector database stores embeddings (numeric representations of text, images, or other data) and quickly finds the ones most similar to a query vector. It is the search engine behind retrieval-augmented generation (RAG), recommendations, and similarity features, built for nearest-neighbor lookups at scale.
Why it matters
RAG and most LLM applications depend on retrieving relevant content by meaning, not keywords, and that is exactly what vector search does. As embeddings become central to AI products, knowing how to store and query them efficiently is a core production skill. In practice, much of the day-to-day work in AI engineering happens here.
What to learn
- Embeddings as points in high-dimensional space
- Similarity metrics: cosine and dot product
- Approximate nearest neighbor (ANN) search and why exact search is too slow at scale
- Vector database options: pgvector, Pinecone, Qdrant
- Indexing and the speed-accuracy trade-off
- Metadata filtering alongside vector search
- Keeping embeddings in sync with source data
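To make the similarity metrics and the cost of exact search concrete, here is a minimal sketch in plain Python. The vectors and the names `dot`, `cosine`, and `top_k` are illustrative assumptions, not part of any vector database API. It shows that cosine similarity ignores vector length while the dot product rewards it, and that exact search has to score every stored vector for every query.

```python
import math

def dot(a, b):
    """Dot product: sensitive to both direction and magnitude."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity: dot product of the length-normalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def top_k(query, vectors, k, score=cosine):
    """Exact (brute-force) search: scores every stored vector, O(n * d) per query."""
    scored = [(name, score(query, vec)) for name, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical 2-dimensional "embeddings"; real ones have hundreds of dimensions.
vectors = {
    "a": [1.0, 0.0],    # points almost the same way as the query
    "b": [10.0, 10.0],  # points elsewhere, but is much longer
    "c": [0.0, 1.0],
}
query = [1.0, 0.1]

best_by_cosine = top_k(query, vectors, k=1)[0][0]          # "a": closest direction
best_by_dot = top_k(query, vectors, k=1, score=dot)[0][0]  # "b": largest magnitude
```

The two metrics agree when all vectors are normalized to unit length, which is why many setups normalize embeddings before storing them. Approximate indexes exist to avoid the full scan in `top_k`, trading a little recall for much lower query latency.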
Common pitfall
Mixing embeddings from different models or versions in the same index. Embeddings are only comparable if produced by the same model — vectors from two different models live in incompatible spaces, so similarity is meaningless. Re-embed everything when you change the embedding model, and never mix versions.
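One way to guard against this is to record which model produced an index and reject anything else at write time. The sketch below is a toy in-memory index, not the API of any real vector database; the class name `VectorIndex` and the model identifiers `embed-model-v1` and `embed-model-v2` are hypothetical.

```python
class VectorIndex:
    """Toy in-memory index that only accepts vectors from one embedding model."""

    def __init__(self, model_id, dim):
        self.model_id = model_id  # the single model this index is tied to
        self.dim = dim            # expected vector length for that model
        self.rows = {}

    def add(self, doc_id, vector, model_id):
        # Refuse vectors produced by any other model or model version.
        if model_id != self.model_id:
            raise ValueError(
                f"index uses {self.model_id!r}, got vector from {model_id!r}"
            )
        # A dimension mismatch is a second, cheaper signal of a model mix-up.
        if len(vector) != self.dim:
            raise ValueError(f"expected {self.dim} dimensions, got {len(vector)}")
        self.rows[doc_id] = list(vector)

index = VectorIndex(model_id="embed-model-v1", dim=3)  # hypothetical model name
index.add("doc-1", [0.1, 0.2, 0.3], model_id="embed-model-v1")

try:
    index.add("doc-2", [0.3, 0.2, 0.1], model_id="embed-model-v2")
    rejected = False
except ValueError:
    rejected = True  # the mismatched vector never reaches the index
```

In a real system the same idea usually takes the form of a model-version column or a separate index per model, so that a model upgrade means building a new index and re-embedding into it rather than appending to the old one.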
Resources
Primary (free):
- pgvector — GitHub · docs
- Pinecone — Vector database guide · docs
- Qdrant — Documentation · docs
Practice
Embed a small set of documents, store the vectors in a vector store (pgvector is easy to start with), and run a similarity search for a query. Add a metadata filter so the search is scoped to a subset of the documents, for example a single source or category. You are done when semantically related items rank above unrelated ones for a given query.
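Before wiring up a real database, the shape of the exercise can be sketched in memory. Everything here is an assumption for illustration: `toy_embed` is a word-count stand-in for a real embedding model (it matches shared words, not meaning), the documents and their `source` metadata are made up, and `search` is a hypothetical helper rather than a library call.

```python
import math
from collections import Counter

def toy_embed(text, vocab):
    """Stand-in for a real embedding model: word counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocab]

def cosine(a, b):
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector is similar to nothing
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)

# Made-up documents, each with metadata to filter on.
docs = [
    {"id": 1, "text": "how to reset your password", "meta": {"source": "help"}},
    {"id": 2, "text": "password rules and login security", "meta": {"source": "policy"}},
    {"id": 3, "text": "quarterly sales report", "meta": {"source": "help"}},
]

vocab = sorted({word for d in docs for word in d["text"].lower().split()})
store = [(d["id"], toy_embed(d["text"], vocab), d["meta"]) for d in docs]

def search(query, k=2, where=None):
    """Filter rows by metadata first, then rank the survivors by cosine similarity."""
    q = toy_embed(query, vocab)
    where = where or {}
    candidates = [
        row for row in store
        if all(row[2].get(key) == value for key, value in where.items())
    ]
    ranked = sorted(candidates, key=lambda row: cosine(q, row[1]), reverse=True)
    return [doc_id for doc_id, _, _ in ranked[:k]]

unfiltered = search("reset password", k=3)               # all documents, best first
policy_only = search("reset password", where={"source": "policy"})
```

With pgvector the same steps would typically become an `INSERT` of each vector plus a query that combines a `WHERE` clause on the metadata columns with an `ORDER BY` on a distance operator such as `<=>` (cosine distance). Swapping `toy_embed` for a real embedding model is what turns word overlap into semantic ranking.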
Outcomes
- Explain embeddings and vector similarity search.
- Choose and use a vector database.
- Combine vector search with metadata filtering.
- Keep embeddings consistent with one model version.