Weaviate Launches Hybrid Search 2.0 with 60% Faster Query Performance
Weaviate's new hybrid search engine combines BM25, vector search, and learned ranking in a single optimized index for superior RAG retrieval.
Announcement
Weaviate has released Hybrid Search 2.0, a complete rewrite of its hybrid search engine that delivers significantly better performance and accuracy while simplifying configuration.
Key Improvements
Performance Gains
Compared to Hybrid Search 1.0:
| Metric | v1.0 | v2.0 | Improvement |
|---|---|---|---|
| Query latency (p50) | 85ms | 34ms | -60% |
| Query latency (p95) | 240ms | 78ms | -68% |
| Throughput | 1,200 q/s | 3,500 q/s | +192% |
| Index build time | 45 min | 18 min | -60% |
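The improvement column follows directly from the raw figures; a quick sanity check of the arithmetic:

```python
def pct_change(old: float, new: float) -> float:
    """Relative change in percent between two measurements."""
    return (new - old) / old * 100

# Each result matches the improvement column in the table above
print(f"p50 latency: {pct_change(85, 34):+.0f}%")
print(f"p95 latency: {pct_change(240, 78):+.0f}%")
print(f"throughput:  {pct_change(1200, 3500):+.0f}%")
print(f"build time:  {pct_change(45, 18):+.0f}%")
```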
Unified Index
v2.0 uses a single unified index for both vector and keyword search:
Old (v1.0):

```
Vector index (HNSW) + Keyword index (BM25) = 2 indexes
→ Search both, merge results
```

New (v2.0):

```
Unified hybrid index = 1 index
→ Single traversal, fused scoring
```
Benefits:
- 40% less storage
- Faster queries (no merging overhead)
- Better cache locality
Learned Fusion
Replaces manual alpha tuning with learned fusion:
Old approach:
```python
# Manual tuning required
results = (
    client.query.get("Document")
    .with_hybrid(query, alpha=0.7)  # Trial and error
    .do()
)
```
New approach:
```python
# Automatic learned fusion
results = (
    client.query.get("Document")
    .with_hybrid(query, fusion_type="learned")  # No alpha needed
    .do()
)
```
Learned fusion model trains on query patterns to optimize scoring.
Benchmark:
- Manual alpha: 52.3% nDCG@10
- Learned fusion: 57.8% nDCG@10 (+10.5%)
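Weaviate's fusion model itself isn't described in detail here, but the core idea can be sketched as a tiny logistic regression over the two raw signals. Everything below (the training triples, the update loop, `fused_score`) is illustrative, not the real implementation:

```python
import math

# Toy training set: (bm25_score, vector_score, relevant?) triples.
# In the real system such labels would come from observed query patterns;
# these values are invented for illustration.
data = [
    (0.9, 0.2, 1), (0.8, 0.3, 1), (0.1, 0.9, 1), (0.2, 0.8, 1),
    (0.4, 0.3, 0), (0.3, 0.2, 0), (0.2, 0.3, 0), (0.1, 0.1, 0),
]

w_bm25, w_vec, bias = 0.0, 0.0, 0.0
lr = 0.5
for _ in range(2000):
    for b, v, y in data:
        p = 1 / (1 + math.exp(-(w_bm25 * b + w_vec * v + bias)))
        grad = p - y  # gradient of log-loss w.r.t. the logit
        w_bm25 -= lr * grad * b
        w_vec -= lr * grad * v
        bias -= lr * grad

def fused_score(bm25: float, vector: float) -> float:
    """Learned combination of the two raw signals."""
    return 1 / (1 + math.exp(-(w_bm25 * bm25 + w_vec * vector + bias)))

# A document strong on either signal outranks one weak on both
print(fused_score(0.9, 0.2) > fused_score(0.2, 0.2))  # True
```

The appeal over a fixed alpha is that the weights adapt to whatever score distributions the corpus actually produces, rather than being guessed once at query time.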
New Features
Filter-Aware Hybrid Search
Hybrid search now respects filters efficiently:
```python
results = (
    client.query.get("Product")
    .with_hybrid("wireless headphones", fusion_type="learned")
    .with_where({
        "path": ["price"],
        "operator": "LessThan",
        "valueNumber": 200
    })
    .with_limit(10)
    .do()
)
```
Performance:
- v1.0: Post-filter (slow)
- v2.0: Filter-aware index traversal (60% faster)
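The difference between the two strategies can be shown with a toy corpus (scores and prices invented for illustration). Post-filtering can silently return fewer than `k` results; filter-aware traversal keeps `k` matches whenever `k` exist:

```python
# Toy corpus: (doc_id, relevance_score, price)
docs = [(1, 0.95, 350), (2, 0.90, 150), (3, 0.85, 500),
        (4, 0.80, 120), (5, 0.75, 90), (6, 0.70, 400)]

def post_filter(docs, k, max_price):
    # v1.0-style: rank first, filter after - the top-k can shrink
    top = sorted(docs, key=lambda d: d[1], reverse=True)[:k]
    return [d for d in top if d[2] < max_price]

def filter_aware(docs, k, max_price):
    # v2.0-style: skip non-matching docs during traversal,
    # so the result still holds k matches when k exist
    matching = (d for d in docs if d[2] < max_price)
    return sorted(matching, key=lambda d: d[1], reverse=True)[:k]

print([d[0] for d in post_filter(docs, 3, 200)])   # [2]
print([d[0] for d in filter_aware(docs, 3, 200)])  # [2, 4, 5]
```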
Multi-Vector Hybrid
Support for multiple vector representations:
```python
# Index with multiple embeddings
client.data_object.create({
    "text": "Product description...",
    "vectors": {
        "semantic": [...],      # General embedding
        "domain": [...],        # Domain-specific embedding
        "multilingual": [...]   # Cross-lingual embedding
    }
})

# Query with automatic vector selection
results = (
    client.query.get("Product")
    .with_hybrid(query, vector_name="auto")  # Selects best vector
    .do()
)
```
Hybrid Explain
Debug hybrid search scoring:
```python
results = (
    client.query.get("Document")
    .with_hybrid(query, explain=True)
    .do()
)

for result in results:
    print(f"Combined score: {result['_additional']['score']}")
    print(f"  BM25 score:    {result['_additional']['explainScore']['bm25']}")
    print(f"  Vector score:  {result['_additional']['explainScore']['vector']}")
    print(f"  Fusion weight: {result['_additional']['explainScore']['fusion']}")
```
This makes it clear why each document ranked where it did.
Architecture Changes
HNSW-BM25 Fusion Index
New index structure:
HNSW Graph Nodes:
- Vector embedding
- BM25 term frequencies
- Metadata
- Filters
Single traversal:
- Navigate HNSW graph
- Calculate vector similarity
- Calculate BM25 score
- Apply learned fusion
- Check filters (early termination)
Key innovation: BM25 data collocated with HNSW nodes.
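A minimal sketch of what that collocation buys. The structures below (`HybridNode`, the crude term-frequency stand-in for BM25) are illustrative, not Weaviate internals; the point is that one graph visit yields both signals with no second index lookup:

```python
import math
from dataclasses import dataclass, field

@dataclass
class HybridNode:
    vector: list
    term_freqs: dict                      # term -> frequency in this document
    metadata: dict = field(default_factory=dict)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def score_node(node, query_vec, query_terms, alpha=0.5):
    # Both signals computed in the same visit to the node
    vec_score = cosine(node.vector, query_vec)
    tf = sum(node.term_freqs.get(t, 0) for t in query_terms)  # crude BM25 stand-in
    return alpha * vec_score + (1 - alpha) * min(tf / 5, 1.0)

node = HybridNode(vector=[0.1, 0.9], term_freqs={"wireless": 2, "headphones": 1})
print(round(score_node(node, [0.0, 1.0], ["wireless", "headphones"]), 2))  # 0.8
```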
Dynamic Fusion
Fusion weights adapt per query:
```python
# Query analysis
query_type = analyze(query)  # keyword-heavy vs. semantic

if query_type == "keyword-heavy":
    fusion_weights = {"bm25": 0.7, "vector": 0.3}
elif query_type == "semantic":
    fusion_weights = {"bm25": 0.3, "vector": 0.7}
else:
    fusion_weights = {"bm25": 0.5, "vector": 0.5}

# Apply dynamically
score = (
    fusion_weights["bm25"] * bm25_score
    + fusion_weights["vector"] * vector_score
)
```
Eliminates need for manual alpha tuning.
Migration Guide
Upgrading from v1.0
```python
# Old (v1.0)
results = (
    client.query.get("Document")
    .with_hybrid(query="search query", alpha=0.75)
    .with_limit(10)
    .do()
)

# New (v2.0) - minimal changes
results = (
    client.query.get("Document")
    .with_hybrid(
        query="search query",
        fusion_type="learned"  # Replaces alpha
    )
    .with_limit(10)
    .do()
)
```
Reindexing
v2.0 requires reindexing:
```bash
# Export data
weaviate export --collection Documents --output backup.json

# Upgrade Weaviate
docker pull semitechnologies/weaviate:1.25.0

# Reimport (automatically uses the new index)
weaviate import --collection Documents --input backup.json
```
Downtime: ~2 hours for 10M documents
Benchmarks
BEIR Benchmark
Tested on BEIR retrieval benchmark:
| Dataset | BM25 | Vector | Hybrid v1 | Hybrid v2 |
|---|---|---|---|---|
| MS MARCO | 22.8 | 38.2 | 41.3 | 45.7 |
| NQ | 32.9 | 52.3 | 56.8 | 61.2 |
| FiQA | 23.6 | 32.1 | 35.4 | 39.8 |
| ArguAna | 41.5 | 38.9 | 43.2 | 46.1 |
| SciFact | 66.5 | 67.2 | 72.1 | 75.8 |
Average improvement: +4.0 nDCG points over v1.0 (roughly +8% relative)
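The average can be checked directly from the table's own columns:

```python
# nDCG@10 values from the Hybrid v1 and Hybrid v2 columns above
hybrid_v1 = [41.3, 56.8, 35.4, 43.2, 72.1]
hybrid_v2 = [45.7, 61.2, 39.8, 46.1, 75.8]

avg_v1 = sum(hybrid_v1) / len(hybrid_v1)
avg_v2 = sum(hybrid_v2) / len(hybrid_v2)
print(round(avg_v2 - avg_v1, 2))                   # absolute gain in nDCG points
print(round((avg_v2 - avg_v1) / avg_v1 * 100, 1))  # relative gain in percent
```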
Real-World Performance
Customer report (10M documents):
Latency:
- p50: 34ms (was 85ms)
- p95: 78ms (was 240ms)
- p99: 145ms (was 580ms)
Throughput:
- Single node: 3,500 q/s (was 1,200 q/s)
- 3-node cluster: 9,800 q/s (was 3,100 q/s)
Cost:
- Same infrastructure handles 3x traffic
- 66% cost reduction per query
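The cost figure follows from the throughput gain: the same node serves nearly three times the queries, so per-query cost falls by roughly two thirds:

```python
# Per-query cost scales inversely with throughput on fixed hardware
old_qps, new_qps = 1200, 3500
reduction = 1 - old_qps / new_qps
print(f"{reduction:.0%}")  # 66%
```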
Best Practices
Fusion Type Selection
```python
# Use learned fusion (default)
.with_hybrid(query, fusion_type="learned")

# Use relative score (for specific use cases)
.with_hybrid(query, fusion_type="relative_score", alpha=0.7)

# Use RRF (rank-based fusion)
.with_hybrid(query, fusion_type="rrf")
```
Recommendation: Start with learned fusion.
Filter Optimization
```python
# Good: selective filters first
.with_where({
    "operator": "And",
    "operands": [
        {"path": ["category"], "operator": "Equal", "valueString": "electronics"},
        {"path": ["price"], "operator": "LessThan", "valueNumber": 200}
    ]
})

# Bad: non-selective filters first (slower)
```
Vector Selection
```python
# Let Weaviate choose
.with_hybrid(query, vector_name="auto")

# Or specify explicitly
.with_hybrid(query, vector_name="semantic")
```
Availability
- Weaviate 1.25+ (released October 2025)
- Weaviate Cloud Services (WCS) - auto-upgraded
- Self-hosted - upgrade available
Limitations
Reindexing Required
- Cannot upgrade in-place
- Must rebuild indexes
- Plan for downtime
Memory Usage
- Unified index uses 15% more RAM (but less disk)
- Benefits outweigh costs for most use cases
Learning Period
- Learned fusion requires ~1000 queries to optimize
- Falls back to heuristics until trained
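The fallback behavior might look something like this sketch; the threshold and names are invented, based only on the "~1000 queries" figure above:

```python
# Hypothetical fallback: use a fixed heuristic until enough queries
# have been observed to trust the learned fusion model.
MIN_QUERIES_FOR_LEARNED = 1000

class FusionSelector:
    def __init__(self):
        self.queries_seen = 0

    def mode_for_next_query(self) -> str:
        self.queries_seen += 1
        if self.queries_seen < MIN_QUERIES_FOR_LEARNED:
            return "heuristic"  # fixed-alpha fallback
        return "learned"        # enough data to trust the model

selector = FusionSelector()
print(selector.mode_for_next_query())  # heuristic on the first query
```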
Future Roadmap
Planned for 2026:
- Multi-modal hybrid search (text + images)
- Semantic BM25 (contextual term weighting)
- Graph-augmented hybrid search
- Real-time fusion model updates
Resources
- Documentation: weaviate.io/developers/hybrid-search-v2
- Migration guide: weaviate.io/developers/migration/v2
- Benchmarks: weaviate.io/benchmarks/hybrid-search
Conclusion
Weaviate's Hybrid Search 2.0 represents a significant leap in retrieval technology, combining performance improvements with better accuracy through learned fusion. The unified index architecture sets a new standard for hybrid search in vector databases, making it an excellent choice for production RAG applications.