Cohere Launches Embed v4: 30% Better Accuracy with Smaller Model Size

October 28, 2025
4 min read
Ailog Research Team

Cohere's new embedding model delivers state-of-the-art performance on the MTEB benchmark while reducing dimensions from 1024 to 768, cutting costs and improving speed.

Announcement

Cohere has released Embed v4, its latest embedding model, delivering significant improvements in accuracy, efficiency, and multilingual performance.

Key Improvements

Performance Gains

MTEB (Massive Text Embedding Benchmark) scores:

Model                         | Dimensions | Avg Score | Retrieval | Classification
Embed v3                      | 1024       | 64.2      | 52.3      | 71.8
Embed v4                      | 768        | 66.8      | 55.1      | 74.2
OpenAI ada-002                | 1536       | 60.9      | 49.2      | 68.5
OpenAI text-embedding-3-large | 3072       | 64.6      | 54.6      | 70.1

Reduced Dimensions

Moving from 1024 to 768 dimensions provides:

  • 25% less storage per embedding
  • 20% faster similarity search
  • 15% lower API costs
  • No accuracy loss (actually improved)
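The storage claim is easy to sanity-check. A back-of-the-envelope sketch for float32 vectors (illustrative arithmetic, not Cohere's published figures):

```python
# Raw vector storage for a corpus of float32 embeddings,
# ignoring index overhead (illustrative arithmetic only).
BYTES_PER_FLOAT32 = 4

def index_size_gb(num_vectors: int, dims: int) -> float:
    """Raw embedding storage in gigabytes."""
    return num_vectors * dims * BYTES_PER_FLOAT32 / 1e9

n = 100_000_000  # e.g. a 100M-document corpus
v3 = index_size_gb(n, 1024)
v4 = index_size_gb(n, 768)
print(f"v3: {v3:.1f} GB, v4: {v4:.1f} GB, saved: {1 - v4 / v3:.0%}")
```

At 100M documents the move from 1024 to 768 dimensions saves roughly 100 GB of raw vector storage, which is where the 25% figure comes from.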

Multilingual Excellence

Embed v4 supports 100+ languages with strong performance:

  • English: 68.2 (MTEB)
  • Chinese: 65.1
  • Spanish: 64.8
  • Arabic: 62.3
  • Hindi: 61.7

Cross-lingual retrieval (query in one language, retrieve in another) improved by 35%.
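Because all languages share one vector space, cross-lingual retrieval reduces to ordinary nearest-neighbor search. A toy sketch with made-up vectors (real embeddings would come from the API):

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-in vectors: an English query compared against Spanish documents.
# These numbers are invented for illustration, not model output.
query_en = np.array([0.9, 0.1, 0.2])     # "how do solar panels work"
doc_es_solar = np.array([0.85, 0.15, 0.25])  # Spanish doc about solar panels
doc_es_other = np.array([0.1, 0.9, 0.3])     # unrelated Spanish doc

scores = {
    "solar_doc": cosine(query_en, doc_es_solar),
    "other_doc": cosine(query_en, doc_es_other),
}
best = max(scores, key=scores.get)  # the on-topic Spanish doc wins
```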

Technical Innovations

Matryoshka Embeddings

Embed v4 uses Matryoshka Representation Learning, allowing flexible dimension reduction:

```python
# Generate the full 768-dim embedding
response = co.embed(texts=["sample text"], model="embed-v4")
full_embedding = response.embeddings[0]

# Truncate to smaller dimensions without recomputing
embedding_256 = full_embedding[:256]  # use first 256 dims
embedding_512 = full_embedding[:512]  # use first 512 dims
# Trade-off: smaller size vs. slight accuracy loss
```

Dimension vs. accuracy:

  • 768 dims: 100% accuracy (baseline)
  • 512 dims: 98.5% accuracy
  • 256 dims: 95.2% accuracy
  • 128 dims: 89.1% accuracy
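Note that truncated Matryoshka embeddings should be re-normalized before cosine similarity is computed. A minimal sketch using a random stand-in vector:

```python
import numpy as np

def truncate(embedding: np.ndarray, dims: int) -> np.ndarray:
    """Keep the first `dims` components and renormalize to unit length."""
    head = embedding[:dims]
    return head / np.linalg.norm(head)

# Random stand-in for a real 768-dim model output.
rng = np.random.default_rng(0)
full = rng.standard_normal(768)

e256 = truncate(full, 256)
e512 = truncate(full, 512)
```

The renormalization keeps truncated vectors comparable under cosine similarity, which is what makes the dimension/accuracy trade-off above usable in practice.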

Instruction-Aware Embeddings

Embed v4 accepts an optional input type that tells the model how the embedding will be used, improving domain adaptation:

```python
# Standard embedding
embedding = co.embed(
    texts=["Machine learning model"],
    model="embed-v4",
)

# With task instruction for better domain alignment
embedding = co.embed(
    texts=["Machine learning model"],
    model="embed-v4",
    input_type="search_document",
    embedding_types=["float"],
)

# For queries (different from documents)
query_embedding = co.embed(
    texts=["How does ML work?"],
    model="embed-v4",
    input_type="search_query",
)
```

Training Improvements

Trained on:

  • 1.2 trillion tokens (3x more than v3)
  • Synthetic hard negatives
  • Contrastive learning with dynamic batching
  • Multi-task training across 50+ tasks
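As a rough illustration of the contrastive objective behind this kind of training, here is a generic InfoNCE-style loss with in-batch negatives (a sketch of the technique, not Cohere's actual training code):

```python
import numpy as np

def info_nce_loss(queries: np.ndarray, docs: np.ndarray, temp: float = 0.05) -> float:
    """queries[i] should match docs[i]; every other doc in the batch
    serves as a (possibly hard) negative."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    d = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    logits = q @ d.T / temp                       # (batch, batch) similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))    # NLL of the true pairs

rng = np.random.default_rng(1)
q = rng.standard_normal((8, 32))
loss_random = info_nce_loss(q, rng.standard_normal((8, 32)))  # mismatched docs
loss_aligned = info_nce_loss(q, q)                            # perfect matches
```

Training pushes each query toward its paired document and away from the rest of the batch, which is why aligned pairs score a much lower loss than random ones.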

Pricing

Embed v4 pricing (per 1M tokens):

  • embed-v4: $0.10
  • embed-v4-light: $0.02 (384 dims, slightly lower accuracy)

Compared to competitors:

  • OpenAI text-embedding-3-small: $0.02 (1536 dims)
  • OpenAI text-embedding-3-large: $0.13 (3072 dims)
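At these rates, cost comparisons are simple arithmetic. A quick sketch (list prices from above, example volume invented):

```python
# Per-1M-token list prices quoted above.
PRICE_PER_M = {
    "embed-v4": 0.10,
    "embed-v4-light": 0.02,
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
}

def monthly_cost(model: str, tokens_per_month: int) -> float:
    """Dollar cost of embedding a given monthly token volume."""
    return tokens_per_month / 1_000_000 * PRICE_PER_M[model]

# Example: embedding 5B tokens per month
for model in PRICE_PER_M:
    print(f"{model}: ${monthly_cost(model, 5_000_000_000):,.2f}")
```

Note that price per token is only half the story: the dimension count drives downstream storage and search costs, so a cheaper high-dimensional model can cost more overall.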

Migration Guide

Upgrading from v3 to v4:

```python
# Old (v3)
response = co.embed(
    texts=texts,
    model="embed-english-v3.0",
)

# New (v4)
response = co.embed(
    texts=texts,
    model="embed-v4",
    input_type="search_document",  # new parameter
)
```

Note: v3 and v4 embeddings are not compatible. You must re-embed your entire corpus.

Use Cases

Embed v4 particularly excels at:

  • Multilingual search: Better cross-language retrieval
  • Code search: Improved semantic code understanding
  • Domain-specific RAG: Instruction parameter helps adaptation
  • Large-scale systems: Reduced dimensions = lower costs

Benchmarks

Retrieval Tasks

Tested on BeIR benchmark (zero-shot retrieval):

Dataset       | Embed v3 | Embed v4 | Improvement
NQ            | 52.8     | 56.3     | +6.6%
HotpotQA      | 63.2     | 67.1     | +6.2%
FEVER         | 75.3     | 79.8     | +6.0%
Climate-FEVER | 23.1     | 28.4     | +22.9%
SciFact       | 66.2     | 71.8     | +8.5%

Classification

On standard text classification benchmarks:

  • Banking77: 86.2% → 89.1% (+3.4%)
  • Amazon Reviews: 63.8% → 67.2% (+5.3%)
  • TREC: 91.3% → 93.7% (+2.6%)

Availability

  • Generally available via Cohere API
  • Supported in all SDKs (Python, Node.js, Go, Java)
  • Coming soon to AWS Bedrock and Azure
  • Self-hosted option via Cohere Private Deployment

Best Practices

Dimension Selection

  • 768 dims: Default, best quality
  • 512 dims: Good balance for most use cases
  • 256 dims: Cost-optimized, still strong performance

Input Types

  • search_document: For documents being indexed
  • search_query: For search queries
  • classification: For classification tasks
  • clustering: For clustering tasks

Migration Strategy

  1. Test v4 on sample queries
  2. Compare retrieval quality
  3. Re-embed corpus incrementally
  4. Use A/B testing during transition
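Steps 3 and 4 above can be sketched as follows; `embed_v4`, `v4_index`, and the search callables are hypothetical placeholders for your own client and vector store:

```python
import random

def migrate_batch(doc_ids, fetch_text, embed_v4, v4_index, batch_size=96):
    """Re-embed documents in batches and write them to the v4 index.
    All collaborators are injected, so any client/store can be used."""
    for i in range(0, len(doc_ids), batch_size):
        batch = doc_ids[i:i + batch_size]
        texts = [fetch_text(d) for d in batch]
        vectors = embed_v4(texts)                 # one embedding call per batch
        v4_index.upsert(list(zip(batch, vectors)))

def route_query(query, v3_search, v4_search, v4_fraction=0.1):
    """During the A/B test, send a fraction of live traffic to v4."""
    if random.random() < v4_fraction:
        return "v4", v4_search(query)
    return "v3", v3_search(query)
```

Keeping both indexes live while `v4_fraction` ramps up lets you compare retrieval quality on real traffic before committing to the full cutover.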

Conclusion

Embed v4 sets a new standard for production embedding models, combining state-of-the-art accuracy with practical efficiency improvements. The flexible dimensions via Matryoshka embeddings make it suitable for a wide range of deployment scenarios and budgets.

Tags

embeddings, Cohere, models, performance
