
What Is a Vector Database?

A vector database is a specialised database that stores information as vectors: lists of numbers that represent meaning. This lets AI systems find semantically similar content, which powers search, recommendations, and retrieval-augmented generation (RAG) systems.

The Plain-English Explanation

Traditional databases find exact matches: search for "dog" and you get results containing exactly the word "dog." Vector databases find semantic matches: search for "dog" and you also get results about "puppy," "canine," "pet," and "golden retriever," because the stored representations capture that these concepts are related.

This works by using an embedding model to convert text (or images, audio, etc.) into vectors, which are lists of numbers that represent meaning. Similar concepts end up near each other in this numerical space. When you search, your query is converted into a vector the same way, and the database returns the stored vectors closest to it, which are the most semantically relevant results.
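As a rough illustration of "finding the closest vectors," the sketch below ranks a few words by cosine similarity, one common closeness measure. The three-dimensional vectors are hand-made for this example and are purely an assumption; real embeddings come from a trained model and typically have hundreds or thousands of dimensions. The names `TOY_VECTORS` and `nearest` are hypothetical.

```python
import math

# Hand-made toy vectors (assumption: invented for illustration only).
TOY_VECTORS = {
    "dog": [0.90, 0.80, 0.10],
    "puppy": [0.85, 0.90, 0.15],
    "canine": [0.88, 0.70, 0.05],
    "invoice": [0.05, 0.10, 0.95],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query_word, vectors, k=2):
    """Return the k words whose vectors are most similar to the query word's."""
    query = vectors[query_word]
    scored = [
        (word, cosine_similarity(query, vec))
        for word, vec in vectors.items()
        if word != query_word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    for word, score in nearest("dog", TOY_VECTORS):
        print(f"{word}: {score:.3f}")
```

With these toy values, "canine" and "puppy" score far above "invoice" for the query "dog." A production vector database does the same kind of ranking, but uses approximate indexes so it does not have to compare the query against every stored vector.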

Why It Matters

Vector databases are the foundation of RAG systems, semantic search, and recommendation engines. If you want AI to answer questions about your company's documents, find similar products, or match candidates to jobs based on skills rather than keywords, you need a vector database. They're the bridge between your data and AI's ability to understand it.

Examples in Practice

Common Misconceptions

Myth: Vector databases replace traditional databases.

Reality: They complement traditional databases. Vector databases handle similarity search and semantic queries; traditional databases handle structured queries, transactions, and exact lookups. Most systems use both.
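One way to picture the two working together is a query that first applies an exact, structured filter and then ranks what remains by similarity. The sketch below does this in memory; the product records, the two-dimensional vectors, and the function name `similar_in_stock` are all hypothetical and stand in for a real relational table and a real vector index.

```python
import math

# Hypothetical product records: structured fields plus a toy 2-D vector.
PRODUCTS = [
    {"id": 1, "name": "trail running shoes", "in_stock": True, "vector": [0.90, 0.10]},
    {"id": 2, "name": "road running shoes", "in_stock": False, "vector": [0.85, 0.20]},
    {"id": 3, "name": "espresso machine", "in_stock": True, "vector": [0.10, 0.95]},
    {"id": 4, "name": "hiking boots", "in_stock": True, "vector": [0.80, 0.30]},
]


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similar_in_stock(query_vector, products, k=2):
    """Filter on an exact field, then rank the survivors by similarity."""
    # Step 1: exact, structured filter (the traditional database's job).
    candidates = [p for p in products if p["in_stock"]]
    # Step 2: similarity ranking (the vector database's job).
    candidates.sort(
        key=lambda p: cosine_similarity(query_vector, p["vector"]),
        reverse=True,
    )
    return candidates[:k]


if __name__ == "__main__":
    # Toy query vector, assumed to represent something like "running footwear".
    for product in similar_in_stock([0.88, 0.15], PRODUCTS):
        print(product["id"], product["name"])
```

The out-of-stock item is excluded by the exact filter even though its vector is close to the query, which is the kind of guarantee similarity search alone does not give.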

Myth: Vector databases are only for AI companies.

Reality: Any organisation that wants semantic search, recommendations, or RAG needs a vector database. Managed services like Pinecone make them accessible to teams without infrastructure expertise.

Myth: Setting up a vector database is extremely technical.

Reality: Managed solutions like Pinecone, Weaviate Cloud, and Supabase's vector support make it straightforward. With modern tools, you can have a working vector database in under an hour.

Related Terms

Learn Vector Databases in Depth

Module 5 of AI Agents & Automation covers vector databases and RAG — from concept to hands-on implementation, including building your own knowledge retrieval system.


Frequently Asked Questions

Which vector database should I use?

Pinecone is the easiest to start with because it is fully managed. Weaviate and Qdrant are popular open-source options. Chroma works well for small projects. For most beginners, Pinecone or Supabase's vector features are the fastest path.

How much does a vector database cost?

Pinecone's free tier handles small projects. Paid plans start around $70/month. Self-hosted options (Weaviate, Qdrant) cost only your server expenses. For most use cases, costs are very manageable.

Do I need a vector database to use RAG?

For production RAG systems, yes: a vector database enables efficient semantic search across large document collections. For quick experiments, some tools (like ChatGPT's file upload) handle the vectorisation internally.
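
To make the retrieval step of RAG concrete, here is a minimal sketch that picks the most relevant text chunks for a question and assembles them into a prompt. Everything in it is an assumption for illustration: the sample chunks are invented, the names `embed`, `retrieve`, and `build_prompt` are hypothetical, and `embed` is a crude word-count placeholder rather than a real embedding model, so it matches on shared words instead of meaning. In a real system, an embedding model and a vector database would replace `embed` and `retrieve`.

```python
import math
import re
from collections import Counter

# Invented sample chunks standing in for a company's documents.
DOCUMENT_CHUNKS = [
    "Refunds are processed within 14 days of receiving the returned item.",
    "Our office is closed on public holidays.",
    "Returned items must be unused and in original packaging.",
]


def embed(text):
    """Placeholder embedding: word counts (NOT a real semantic embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    query = embed(question)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query, embed(chunk)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, chunks, k=2):
    """Combine the retrieved chunks and the question into one prompt string."""
    context = "\n".join(f"- {chunk}" for chunk in retrieve(question, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    print(build_prompt("How long do refunds take for a returned item?", DOCUMENT_CHUNKS))
```

The resulting prompt would then be sent to a language model. The point of the vector database in production is to make the `retrieve` step fast and meaning-aware across large document collections.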