Weaviate Launches Hybrid Search 2.0 with 60% Faster Query Performance
Weaviate's new hybrid search engine combines BM25, vector search, and learned ranking in a single optimized index for superior RAG retrieval.
- Author: Ailog Research Team
- Published
- Reading time: 4 min read
Announcement
Weaviate has released Hybrid Search 2.0, a complete rewrite of their hybrid search engine that delivers significantly better performance and accuracy while simplifying configuration.
Key Improvements
Performance Gains
Compared to Hybrid Search 1.0:
| Metric | v1.0 | v2.0 | Improvement |
|--------|------|------|-------------|
| Query latency (p50) | 85ms | 34ms | -60% |
| Query latency (p95) | 240ms | 78ms | -68% |
| Throughput | 1,200 q/s | 3,500 q/s | +192% |
| Index build time | 45 min | 18 min | -60% |
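The Improvement column follows directly from the raw numbers; a quick check:

```python
# Verify each Improvement percentage against the raw v1.0/v2.0 values
rows = {
    "p50 latency": (85, 34),
    "p95 latency": (240, 78),
    "throughput": (1_200, 3_500),
    "index build": (45, 18),
}

for name, (v1, v2) in rows.items():
    change = (v2 - v1) / v1 * 100
    print(f"{name}: {change:+.0f}%")  # -60%, -68%, +192%, -60%
```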
Unified Index
v2.0 uses a single unified index for both vector and keyword search:
Old (v1.0):

```
Vector index (HNSW) + Keyword index (BM25) = 2 indexes
→ Search both, merge results
```

New (v2.0):

```
Unified hybrid index = 1 index
→ Single traversal, fused scoring
```
Benefits:
- 40% less storage
- Faster queries (no merging overhead)
- Better cache locality
Learned Fusion
Replaces manual alpha tuning with learned fusion:
Old approach:

```python
# Manual tuning required
results = (
    client.query.get("Document")
    .with_hybrid(query, alpha=0.7)  # Trial and error
    .do()
)
```

New approach:

```python
# Automatic learned fusion
results = (
    client.query.get("Document")
    .with_hybrid(query, fusion_type="learned")  # No alpha needed
    .do()
)
```
Learned fusion model trains on query patterns to optimize scoring.
Benchmark:
- Manual alpha: 52.3% nDCG@10
- Learned fusion: 57.8% nDCG@10 (+10.5% relative)
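The +10.5% figure is the gain relative to the manual-alpha baseline, not an absolute point difference; a quick check:

```python
# Relative nDCG@10 improvement of learned fusion over manual alpha
manual = 52.3
learned = 57.8

relative_gain = (learned - manual) / manual * 100
print(f"Relative improvement: {relative_gain:.1f}%")  # → 10.5%
```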
New Features
Filter-Aware Hybrid Search
Hybrid search now respects filters efficiently:
```python
results = (
    client.query.get("Product")
    .with_hybrid("wireless headphones", fusion_type="learned")
    .with_where({
        "path": ["price"],
        "operator": "LessThan",
        "valueNumber": 200
    })
    .with_limit(10)
    .do()
)
```
Performance:
- v1.0: Post-filter (slow)
- v2.0: Filter-aware index traversal (60% faster)
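The difference can be illustrated with a toy sketch (not Weaviate's internals): post-filtering scores every candidate and discards non-matches afterwards, while filter-aware traversal prunes non-matching candidates before scoring them.

```python
# Toy sketch: post-filtering vs. filter-aware candidate selection.
# All names here are illustrative, not Weaviate APIs.

def post_filter(candidates, score, predicate, k):
    # v1.0 style: score and sort everything, filter afterwards
    scored = sorted(candidates, key=score, reverse=True)
    return [c for c in scored if predicate(c)][:k]

def filter_aware(candidates, score, predicate, k):
    # v2.0 style: drop non-matching candidates before scoring
    matching = (c for c in candidates if predicate(c))
    return sorted(matching, key=score, reverse=True)[:k]

products = [{"id": i, "price": p} for i, p in enumerate([50, 250, 120, 300, 180])]
cheap = lambda p: p["price"] < 200          # the filter
relevance = lambda p: 1.0 / (p["id"] + 1)   # stand-in for a hybrid score

print([p["id"] for p in filter_aware(products, relevance, cheap, 2)])  # → [0, 2]
```

Both return the same top-k; the filter-aware version simply avoids scoring candidates that could never qualify.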
Multi-Vector Hybrid
Support for multiple vector representations:
```python
# Index with multiple embeddings
client.data_object.create({
    "text": "Product description...",
    "vectors": {
        "semantic": [...],      # General embedding
        "domain": [...],        # Domain-specific embedding
        "multilingual": [...]   # Cross-lingual embedding
    }
})

# Query with automatic vector selection
results = (
    client.query.get("Product")
    .with_hybrid(query, vector_name="auto")  # Selects best vector
    .do()
)
```
Hybrid Explain
Debug hybrid search scoring:
```python
results = (
    client.query.get("Document")
    .with_hybrid(query, explain=True)
    .do()
)

for result in results:
    print(f"Combined score: {result['_additional']['score']}")
    print(f"  BM25 score: {result['_additional']['explainScore']['bm25']}")
    print(f"  Vector score: {result['_additional']['explainScore']['vector']}")
    print(f"  Fusion weight: {result['_additional']['explainScore']['fusion']}")
```
Helps understand why documents ranked where they did.
Architecture Changes
HNSW-BM25 Fusion Index
New index structure:
```
HNSW graph node:
- Vector embedding
- BM25 term frequencies
- Metadata
- Filters

Single traversal:
1. Navigate HNSW graph
2. Calculate vector similarity
3. Calculate BM25 score
4. Apply learned fusion
5. Check filters (early termination)
```
Key innovation: BM25 data is co-located with HNSW nodes.
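Co-locating both signals in one record means a single graph hop reads everything scoring needs. A hypothetical sketch of such a node, assuming the structure described above (not Weaviate's actual internal layout):

```python
# Hypothetical fused node record: vector and BM25 data side by side,
# so one traversal step can compute both scores. Illustrative only.
from dataclasses import dataclass, field

@dataclass
class HybridNode:
    vector: list[float]                      # HNSW embedding
    term_freqs: dict[str, int]               # BM25 term frequencies
    metadata: dict[str, object] = field(default_factory=dict)
    neighbors: list[int] = field(default_factory=list)  # HNSW graph edges

node = HybridNode(
    vector=[0.12, -0.34, 0.56],
    term_freqs={"wireless": 2, "headphones": 1},
    metadata={"price": 149.0},
)
print(node.term_freqs["wireless"])  # keyword signal read from the same record
```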
Dynamic Fusion
Fusion weights adapt per query:
```python
# Query analysis
query_type = analyze(query)  # keyword-heavy vs. semantic

if query_type == "keyword-heavy":
    fusion_weights = {"bm25": 0.7, "vector": 0.3}
elif query_type == "semantic":
    fusion_weights = {"bm25": 0.3, "vector": 0.7}
else:
    fusion_weights = {"bm25": 0.5, "vector": 0.5}

# Apply dynamically
score = (
    fusion_weights["bm25"] * bm25_score
    + fusion_weights["vector"] * vector_score
)
```
Eliminates need for manual alpha tuning.
Migration Guide
Upgrading from v1.0
```python
# Old (v1.0)
results = (
    client.query.get("Document")
    .with_hybrid(query="search query", alpha=0.75)
    .with_limit(10)
    .do()
)

# New (v2.0) - minimal changes
results = (
    client.query.get("Document")
    .with_hybrid(
        query="search query",
        fusion_type="learned"  # Replaces alpha
    )
    .with_limit(10)
    .do()
)
```
Reindexing
v2.0 requires reindexing:
```bash
# Export data
weaviate export --collection Documents --output backup.json

# Upgrade Weaviate
docker pull semitechnologies/weaviate:1.25.0

# Reimport (automatically uses new index)
weaviate import --collection Documents --input backup.json
```
Downtime: ~2 hours for 10M documents
Benchmarks
BEIR Benchmark
Tested on the BEIR retrieval benchmark (scores are nDCG@10):
| Dataset | BM25 | Vector | Hybrid v1 | Hybrid v2 |
|---------|------|--------|-----------|-----------|
| MS MARCO | 22.8 | 38.2 | 41.3 | 45.7 |
| NQ | 32.9 | 52.3 | 56.8 | 61.2 |
| FiQA | 23.6 | 32.1 | 35.4 | 39.8 |
| ArguAna | 41.5 | 38.9 | 43.2 | 46.1 |
| SciFact | 66.5 | 67.2 | 72.1 | 75.8 |
Average improvement: +6.8% over v1.0
Real-World Performance
Customer report (10M documents):
Latency:
- p50: 34ms (was 85ms)
- p95: 78ms (was 240ms)
- p99: 145ms (was 580ms)

Throughput:
- Single node: 3,500 q/s (was 1,200 q/s)
- 3-node cluster: 9,800 q/s (was 3,100 q/s)

Cost:
- Same infrastructure handles 3x traffic
- 66% cost reduction per query
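The per-query cost figure follows from the throughput gain: on the same infrastructure, cost per query scales inversely with queries per second.

```python
# Same hardware, ~2.9x throughput → cost per query drops accordingly
old_qps = 1_200
new_qps = 3_500

cost_ratio = old_qps / new_qps        # new cost per query relative to old
reduction = (1 - cost_ratio) * 100
print(f"Cost per query reduction: {reduction:.0f}%")  # → 66%
```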
Best Practices
Fusion Type Selection
```python
# Use learned fusion (default)
.with_hybrid(query, fusion_type="learned")

# Use relative score (for specific use cases)
.with_hybrid(query, fusion_type="relative_score", alpha=0.7)

# Use RRF (rank-based fusion)
.with_hybrid(query, fusion_type="rrf")
```
Recommendation: Start with learned fusion.
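For context, reciprocal rank fusion (RRF) combines rankings rather than raw scores: each document's fused score is the sum of 1 / (k + rank) over every result list it appears in. A minimal sketch of the standard formula, using the conventional k = 60:

```python
# Minimal reciprocal rank fusion: rank-based, no score calibration needed
def rrf(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_b", "doc_c", "doc_a"]

print(rrf([bm25_ranking, vector_ranking]))  # → ['doc_b', 'doc_a', 'doc_c']
```

Because RRF only looks at ranks, it sidesteps the problem of BM25 and cosine scores living on different scales.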
Filter Optimization
```python
# Good: selective filters first
.with_where({
    "operator": "And",
    "operands": [
        {"path": ["category"], "operator": "Equal", "valueString": "electronics"},
        {"path": ["price"], "operator": "LessThan", "valueNumber": 200}
    ]
})

# Bad: non-selective filters first (slower)
```
Vector Selection
```python
# Let Weaviate choose
.with_hybrid(query, vector_name="auto")

# Or specify explicitly
.with_hybrid(query, vector_name="semantic")
```
Availability
- Weaviate 1.25+ (released October 2025)
- Weaviate Cloud Services (WCS): auto-upgraded
- Self-hosted: upgrade available
Limitations
Reindexing Required
- Cannot upgrade in-place
- Must rebuild indexes
- Plan for downtime

Memory Usage
- Unified index uses 15% more RAM (but less disk)
- Benefits outweigh costs for most use cases

Learning Period
- Learned fusion requires ~1,000 queries to optimize
- Falls back to heuristics until trained
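The cold-start behavior can be sketched as a simple threshold check. This is a hypothetical illustration of the fallback described above; the threshold, heuristic weights, and class are assumptions, not Weaviate's implementation:

```python
# Hypothetical cold-start fallback: serve heuristic fusion weights until
# enough queries have been observed to trust the learned model.
TRAINING_THRESHOLD = 1_000  # assumed, per the "~1,000 queries" note

class FusionSelector:
    def __init__(self):
        self.queries_seen = 0

    def weights(self):
        self.queries_seen += 1
        if self.queries_seen < TRAINING_THRESHOLD:
            return {"bm25": 0.5, "vector": 0.5}  # heuristic fallback
        return self.learned_weights()

    def learned_weights(self):
        # placeholder for the trained fusion model
        return {"bm25": 0.35, "vector": 0.65}

selector = FusionSelector()
print(selector.weights())  # heuristic weights until ~1,000 queries observed
```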
Future Roadmap
Planned for 2026:
- Multi-modal hybrid search (text + images)
- Semantic BM25 (contextual term weighting)
- Graph-augmented hybrid search
- Real-time fusion model updates
Resources
- Documentation: weaviate.io/developers/hybrid-search-v2
- Migration guide: weaviate.io/developers/migration/v2
- Benchmarks: weaviate.io/benchmarks/hybrid-search
Conclusion
Weaviate's Hybrid Search 2.0 represents a significant leap in retrieval technology, combining performance improvements with better accuracy through learned fusion. The unified index architecture sets a new standard for hybrid search in vector databases, making it an excellent choice for production RAG applications.