Weaviate Launches Hybrid Search 2.0 with 60% Faster Query Performance
Weaviate's new hybrid search engine combines BM25, vector search, and learned ranking in a single optimized index for superior RAG retrieval.
Announcement
Weaviate has released Hybrid Search 2.0, a complete rewrite of its hybrid search engine that delivers significantly better performance and accuracy while simplifying configuration.
Key Improvements
Performance Gains
Compared to Hybrid Search 1.0:
| Metric | v1.0 | v2.0 | Improvement |
|---|---|---|---|
| Query latency (p50) | 85ms | 34ms | -60% |
| Query latency (p95) | 240ms | 78ms | -68% |
| Throughput | 1,200 q/s | 3,500 q/s | +192% |
| Index build time | 45 min | 18 min | -60% |
Unified Index
v2.0 uses a single unified index for both vector and keyword search:
Old (v1.0):
Vector index (HNSW) + Keyword index (BM25) = 2 indexes
→ Search both, merge results
New (v2.0):
Unified hybrid index = 1 index
→ Single traversal, fused scoring
Benefits:
- 40% less storage
- Faster queries (no merging overhead)
- Better cache locality
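The difference between merging two result lists and scoring everything in one pass can be sketched as follows. This is a simplified illustration, not Weaviate's internal code; the documents, scores, and the alpha weighting are made up for demonstration:

```python
# Simplified illustration of v1-style two-index merge vs. v2-style fused scoring.
# All data and weights here are illustrative.

bm25_scores = {"doc1": 0.8, "doc2": 0.3, "doc3": 0.6}
vector_scores = {"doc1": 0.5, "doc2": 0.9, "doc3": 0.4}

def merge_two_indexes(alpha=0.7):
    """v1.0 style: query two indexes separately, then merge the result lists."""
    merged = {}
    for doc in set(bm25_scores) | set(vector_scores):
        merged[doc] = (1 - alpha) * bm25_scores.get(doc, 0.0) \
                      + alpha * vector_scores.get(doc, 0.0)
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)

def fused_single_pass(alpha=0.7):
    """v2.0 style: both signals live on the same node, scored in one pass."""
    ranked = []
    for doc in bm25_scores:  # one traversal; both scores available per node
        score = (1 - alpha) * bm25_scores[doc] + alpha * vector_scores[doc]
        ranked.append((doc, score))
    return sorted(ranked, key=lambda kv: kv[1], reverse=True)

print(merge_two_indexes())
print(fused_single_pass())
```

Both paths produce the same ranking; the unified index simply avoids the second lookup and the merge step, which is where the latency savings come from.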
Learned Fusion
Replaces manual alpha tuning with learned fusion:
Old approach:
```python
# Manual tuning required
results = (
    client.query.get("Document")
    .with_hybrid(query, alpha=0.7)  # Trial and error
    .do()
)
```
New approach:
```python
# Automatic learned fusion
results = (
    client.query.get("Document")
    .with_hybrid(query, fusion_type="learned")  # No alpha needed
    .do()
)
```
The learned fusion model trains on observed query patterns to optimize scoring.
Benchmark:
- Manual alpha: 52.3% nDCG@10
- Learned fusion: 57.8% nDCG@10 (+10.5%)
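nDCG@10, the metric quoted above, rewards placing highly relevant documents near the top of the ranking. A minimal reference implementation (not the official evaluation harness; the relevance grades below are made up):

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k results, in ranked order."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=10):
    """DCG normalized by the ideal (best-possible) ordering of the same items."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Graded relevance of retrieved documents, in the order the system ranked them
ranking = [3, 2, 0, 1, 0]
print(round(ndcg_at_k(ranking), 4))
```

A perfect ordering scores 1.0; the +5.5-point gain reported here means learned fusion places relevant documents measurably higher on average.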
New Features
Filter-Aware Hybrid Search
Hybrid search now respects filters efficiently:
```python
results = (
    client.query.get("Product")
    .with_hybrid("wireless headphones", fusion_type="learned")
    .with_where({
        "path": ["price"],
        "operator": "LessThan",
        "valueNumber": 200
    })
    .with_limit(10)
    .do()
)
```
Performance:
- v1.0: Post-filter (slow)
- v2.0: Filter-aware index traversal (60% faster)
Multi-Vector Hybrid
Support for multiple vector representations:
```python
# Index with multiple embeddings
client.data_object.create({
    "text": "Product description...",
    "vectors": {
        "semantic": [...],      # General embedding
        "domain": [...],        # Domain-specific embedding
        "multilingual": [...]   # Cross-lingual embedding
    }
})

# Query with automatic vector selection
results = (
    client.query.get("Product")
    .with_hybrid(query, vector_name="auto")  # Selects best vector
    .do()
)
```
Hybrid Explain
Debug hybrid search scoring:
```python
results = (
    client.query.get("Document")
    .with_hybrid(query, explain=True)
    .do()
)

for result in results:
    print(f"Combined score: {result['_additional']['score']}")
    print(f"  BM25 score: {result['_additional']['explainScore']['bm25']}")
    print(f"  Vector score: {result['_additional']['explainScore']['vector']}")
    print(f"  Fusion weight: {result['_additional']['explainScore']['fusion']}")
```
This helps explain why documents ranked where they did.
Architecture Changes
HNSW-BM25 Fusion Index
New index structure:
HNSW Graph Nodes:
- Vector embedding
- BM25 term frequencies
- Metadata
- Filters
Single traversal:
- Navigate HNSW graph
- Calculate vector similarity
- Calculate BM25 score
- Apply learned fusion
- Check filters (early termination)
Key innovation: BM25 data collocated with HNSW nodes.
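The single-traversal flow described above can be sketched as follows. This is a conceptual illustration only; the actual implementation navigates an HNSW graph and is far more involved, and the node layout, weights, and filter here are assumptions for demonstration:

```python
# Conceptual sketch of a fused, filter-aware scoring pass over index nodes.
# Node fields and fusion weights are illustrative, not Weaviate internals.

nodes = [
    {"id": "a", "vector_sim": 0.9, "bm25": 0.2, "price": 250},
    {"id": "b", "vector_sim": 0.6, "bm25": 0.8, "price": 150},
    {"id": "c", "vector_sim": 0.4, "bm25": 0.1, "price": 90},
]

def fused_traversal(nodes, max_price, w_vec=0.6, w_bm25=0.4):
    results = []
    for node in nodes:              # single pass over candidate nodes
        if node["price"] >= max_price:
            continue                # filter checked first: early termination
        # Both signals are colocated on the node, so no second index lookup
        score = w_vec * node["vector_sim"] + w_bm25 * node["bm25"]
        results.append((node["id"], score))
    return sorted(results, key=lambda kv: kv[1], reverse=True)

print(fused_traversal(nodes, max_price=200))
```

Because the filter is evaluated before scoring, non-matching nodes are skipped during traversal rather than scored and discarded afterward, which is what makes the filter-aware path faster than v1.0's post-filtering.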
Dynamic Fusion
Fusion weights adapt per query:
```python
# Query analysis
query_type = analyze(query)  # keyword-heavy vs. semantic

if query_type == "keyword-heavy":
    fusion_weights = {"bm25": 0.7, "vector": 0.3}
elif query_type == "semantic":
    fusion_weights = {"bm25": 0.3, "vector": 0.7}
else:
    fusion_weights = {"bm25": 0.5, "vector": 0.5}

# Apply dynamically
score = (
    fusion_weights["bm25"] * bm25_score
    + fusion_weights["vector"] * vector_score
)
```
This eliminates the need for manual alpha tuning.
Migration Guide
Upgrading from v1.0
```python
# Old (v1.0)
results = (
    client.query.get("Document")
    .with_hybrid(query="search query", alpha=0.75)
    .with_limit(10)
    .do()
)

# New (v2.0) - minimal changes
results = (
    client.query.get("Document")
    .with_hybrid(
        query="search query",
        fusion_type="learned"  # Replaces alpha
    )
    .with_limit(10)
    .do()
)
```
Reindexing
v2.0 requires reindexing:
```bash
# Export data
weaviate export --collection Documents --output backup.json

# Upgrade Weaviate
docker pull semitechnologies/weaviate:1.25.0

# Reimport (automatically uses new index)
weaviate import --collection Documents --input backup.json
```
Downtime: ~2 hours for 10M documents
Benchmarks
BEIR Benchmark
Tested on the BEIR retrieval benchmark (scores are nDCG@10):
| Dataset | BM25 | Vector | Hybrid v1 | Hybrid v2 |
|---|---|---|---|---|
| MS MARCO | 22.8 | 38.2 | 41.3 | 45.7 |
| NQ | 32.9 | 52.3 | 56.8 | 61.2 |
| FiQA | 23.6 | 32.1 | 35.4 | 39.8 |
| ArguAna | 41.5 | 38.9 | 43.2 | 46.1 |
| SciFact | 66.5 | 67.2 | 72.1 | 75.8 |
Average improvement: +6.8% over v1.0
Real-World Performance
Customer report (10M documents):
Latency:
- p50: 34ms (was 85ms)
- p95: 78ms (was 240ms)
- p99: 145ms (was 580ms)
Throughput:
- Single node: 3,500 q/s (was 1,200 q/s)
- 3-node cluster: 9,800 q/s (was 3,100 q/s)
Cost:
- Same infrastructure handles 3x traffic
- 66% cost reduction per query
Best Practices
Fusion Type Selection
```python
# Use learned fusion (default)
.with_hybrid(query, fusion_type="learned")

# Use relative score (for specific use cases)
.with_hybrid(query, fusion_type="relative_score", alpha=0.7)

# Use RRF (rank-based fusion)
.with_hybrid(query, fusion_type="rrf")
```
Recommendation: Start with learned fusion.
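Reciprocal Rank Fusion (RRF), the rank-based option above, combines rankings rather than raw scores, which makes it robust when BM25 and vector scores live on different scales. A minimal sketch of the standard formula (k=60 is the conventional constant from the RRF literature, not necessarily Weaviate's default; the rankings are made up):

```python
def rrf(rankings, k=60):
    """Fuse ranked lists: score(doc) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_ranking = ["doc3", "doc1", "doc2"]
vector_ranking = ["doc1", "doc2", "doc3"]
print(rrf([bm25_ranking, vector_ranking]))
```

Documents ranked consistently well in both lists rise to the top even when neither list puts them first.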
Filter Optimization
```python
# Good: Selective filters first
.with_where({
    "operator": "And",
    "operands": [
        {"path": ["category"], "operator": "Equal", "valueString": "electronics"},
        {"path": ["price"], "operator": "LessThan", "valueNumber": 200}
    ]
})

# Bad: Non-selective filters first (slower)
```
Vector Selection
```python
# Let Weaviate choose
.with_hybrid(query, vector_name="auto")

# Or specify explicitly
.with_hybrid(query, vector_name="semantic")
```
Availability
- Weaviate 1.25+ (released October 2025)
- Weaviate Cloud Services (WCS) - auto-upgraded
- Self-hosted - upgrade available
Limitations
Reindexing Required
- Cannot upgrade in-place
- Must rebuild indexes
- Plan for downtime
Memory Usage
- Unified index uses 15% more RAM (but less disk)
- Benefits outweigh costs for most use cases
Learning Period
- Learned fusion requires ~1000 queries to optimize
- Falls back to heuristics until trained
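The fallback behavior during the learning period could look roughly like this. This is a hypothetical sketch: the function name, threshold constant, and heuristic are assumptions based on the "~1000 queries" figure above, not documented Weaviate behavior:

```python
QUERY_THRESHOLD = 1000  # assumed warm-up size, per the "~1000 queries" note

def choose_fusion(queries_seen, learned_model=None):
    """Use the learned model once enough queries are observed; else heuristics."""
    if learned_model is not None and queries_seen >= QUERY_THRESHOLD:
        return "learned"
    return "heuristic"  # e.g. a fixed 50/50 blend until the model is trained

print(choose_fusion(42))                             # heuristic
print(choose_fusion(5000, learned_model=object()))   # learned
```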
Future Roadmap
Planned for 2026:
- Multi-modal hybrid search (text + images)
- Semantic BM25 (contextual term weighting)
- Graph-augmented hybrid search
- Real-time fusion model updates
Resources
- Documentation: weaviate.io/developers/hybrid-search-v2
- Migration guide: weaviate.io/developers/migration/v2
- Benchmarks: weaviate.io/benchmarks/hybrid-search
Conclusion
Weaviate's Hybrid Search 2.0 represents a significant leap in retrieval technology, combining performance improvements with better accuracy through learned fusion. The unified index architecture sets a new standard for hybrid search in vector databases, making it an excellent choice for production RAG applications.