Multilingual Embeddings for Global RAG
Build RAG systems that work across languages using multilingual embedding models and cross-lingual retrieval.
- Author
- Ailog Research Team
- Published
- Reading time
- 11 min read
- Level
- intermediate
- RAG Pipeline Step
- Embedding
Why Multilingual?
A single multilingual model embeds text from every supported language into one shared vector space, so a query in one language can retrieve documents written in another. Use cases: • Global customer support • Cross-language research • International knowledge bases • Multilingual chatbots
Top Multilingual Models (Nov 2025)
mE5-large (Microsoft): • 100+ languages • 1024 dimensions • Best performance/cost
multilingual-e5-large-instruct: • Instruction-tuned • Query/passage optimization • SOTA on MIRACL benchmark
LaBSE (Google): • 109 languages • 768 dimensions • Excellent for similar languages
Basic Implementation
```python
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

model = SentenceTransformer('intfloat/multilingual-e5-large-instruct')

# English query, French documents
query_en = "What is machine learning?"
docs_fr = [
    "L'apprentissage automatique est une branche de l'IA",
    "Les réseaux de neurones sont utilisés en ML"
]

# Embed with the E5 instruction prefixes
query_emb = model.encode(f"query: {query_en}")
doc_embs = model.encode([f"passage: {doc}" for doc in docs_fr])

# Compute cross-lingual similarity
scores = cosine_similarity([query_emb], doc_embs)[0]
```
Cross-Lingual Retrieval
Search in any language, retrieve in any language:
```python
# Query in English, documents in multiple languages
query = "How to bake bread?"

documents = {
    "en": "To bake bread, mix flour, water, yeast...",
    "fr": "Pour faire du pain, mélanger farine, eau, levure...",
    "es": "Para hacer pan, mezclar harina, agua, levadura...",
    "de": "Um Brot zu backen, Mehl, Wasser, Hefe mischen..."
}

# Encode everything with the same model
query_emb = model.encode(f"query: {query}")
doc_embs = {
    lang: model.encode(f"passage: {text}")
    for lang, text in documents.items()
}

# All documents are now comparable in the same vector space
```
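The payoff can be sketched without the model itself: once every document lives in the same vector space, ranking is a single cosine-similarity pass over all languages at once. The vectors below are toy stand-ins for `model.encode(...)` outputs, not real embeddings.

```python
from math import sqrt

def cosine(a, b):
    # Cosine similarity between two vectors
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

# Toy stand-ins for model.encode(...) outputs (not real embeddings)
query_emb = [0.9, 0.1, 0.0]
doc_embs = {
    "en": [0.8, 0.2, 0.1],    # on-topic
    "fr": [0.85, 0.15, 0.05], # on-topic, different language
    "de": [0.1, 0.1, 0.9],    # off-topic
}

# Rank all languages in one pass over the shared space
ranked = sorted(doc_embs, key=lambda lang: cosine(query_emb, doc_embs[lang]), reverse=True)
print(ranked)  # off-topic document ranks last regardless of language
```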
Language Detection + Routing
```python
from langdetect import detect

# Assumes `model` (the embedding model) and `vector_db` (a vector DB client)
# are initialized elsewhere
def multilingual_rag(query):
    # Detect the query language
    lang = detect(query)

    # Route to a language-specific index (optional optimization)
    if lang in ['en', 'fr', 'de']:
        index = f"docs_{lang}"
    else:
        index = "docs_multilingual"

    # Search the selected collection
    results = vector_db.search(
        collection_name=index,
        query_vector=model.encode(f"query: {query}")
    )

    return results
```
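The routing decision itself is worth isolating from `langdetect` and the vector DB: dedicated collections exist for a few high-volume languages, and everything else falls through to a shared multilingual index. A minimal sketch of that logic (helper name and collection names are illustrative):

```python
# Languages that get their own dedicated collection (illustrative set)
DEDICATED_LANGS = {'en', 'fr', 'de'}

def route_collection(lang: str) -> str:
    # High-volume languages get a dedicated index;
    # everything else falls back to the shared multilingual one
    if lang in DEDICATED_LANGS:
        return f"docs_{lang}"
    return "docs_multilingual"

print(route_collection('fr'))  # docs_fr
print(route_collection('ja'))  # docs_multilingual
```

Keeping this as a pure function makes the routing policy trivial to unit-test and to change without touching search code.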
Translation Fallback
For rare languages, translate first:
```python
from transformers import pipeline

translator = pipeline("translation", model="facebook/nllb-200-distilled-600M")

# NLLB expects FLORES-200 language codes, e.g. 'fra_Latn', 'eng_Latn'
def translate_then_search(query, source_lang, target_lang='eng_Latn'):
    # Translate the query into a common language
    if source_lang != target_lang:
        translated = translator(
            query,
            src_lang=source_lang,
            tgt_lang=target_lang
        )[0]['translation_text']
    else:
        translated = query

    # Search in the translated space (vector_search defined elsewhere)
    results = vector_search(translated)

    return results
```
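One practical wrinkle: `langdetect` returns ISO 639-1 codes (`'fr'`), while NLLB-200 expects FLORES-200 codes (`'fra_Latn'`). A small mapping bridges the two; the table below covers only a handful of languages as an illustration and would need extending for production use.

```python
# Partial ISO 639-1 → FLORES-200 mapping for NLLB (extend as needed)
ISO_TO_FLORES = {
    'en': 'eng_Latn',
    'fr': 'fra_Latn',
    'es': 'spa_Latn',
    'de': 'deu_Latn',
    'zh': 'zho_Hans',
    'ar': 'arb_Arab',
}

def to_flores(iso_code: str, default: str = 'eng_Latn') -> str:
    # Fall back to English when the language is not in the table
    return ISO_TO_FLORES.get(iso_code, default)

print(to_flores('fr'))  # fra_Latn
```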
Qdrant with Language Filtering
```python
from qdrant_client import QdrantClient
from qdrant_client.models import Filter, FieldCondition, MatchValue

client = QdrantClient("localhost", port=6333)

# Index documents with language metadata
client.upsert(
    collection_name="multilingual_docs",
    points=[{
        "id": 1,
        "vector": embedding,
        "payload": {"text": "...", "language": "fr"}
    }]
)

# Search with a language filter
results = client.search(
    collection_name="multilingual_docs",
    query_vector=query_embedding,
    query_filter=Filter(
        must=[
            FieldCondition(
                key="language",
                match=MatchValue(value="fr")
            )
        ]
    )
)
```
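Conceptually, a payload filter restricts the candidate set by metadata before similarity scoring. A minimal in-memory sketch of that behavior (this is an illustration of the idea, not the Qdrant API):

```python
from math import sqrt

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

# In-memory stand-in for a collection: (vector, payload) pairs
points = [
    ([0.9, 0.1], {"text": "Pour faire du pain...", "language": "fr"}),
    ([0.8, 0.2], {"text": "To bake bread...", "language": "en"}),
    ([0.7, 0.3], {"text": "Faire une baguette...", "language": "fr"}),
]

def filtered_search(query_vec, language, top_k=2):
    # Filter by payload first, then score only the survivors
    candidates = [(v, p) for v, p in points if p["language"] == language]
    candidates.sort(key=lambda vp: cosine(query_vec, vp[0]), reverse=True)
    return [p["text"] for _, p in candidates[:top_k]]

print(filtered_search([1.0, 0.0], "fr"))  # only French documents are scored
```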
Evaluation Across Languages
```python
# MIRACL benchmark (Multilingual Information Retrieval Across a Continuum of Languages)
from mteb import MTEB
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('intfloat/multilingual-e5-large-instruct')

evaluation = MTEB(tasks=["MIRACLRetrieval"])
results = evaluation.run(model, output_folder="results/")

# Per-language nDCG@10 scores are written as JSON files under results/
```
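MIRACL reports nDCG@10, so it helps to know what that number actually measures: how close the retrieved ranking is to the ideal ranking, with relevant documents discounted the further down the list they appear. A from-scratch sketch of the metric:

```python
from math import log2

def dcg(relevances):
    # Discounted cumulative gain: relevance discounted by log2 of rank
    return sum(rel / log2(i + 2) for i, rel in enumerate(relevances))

def ndcg_at_k(ranked_rels, k=10):
    # Normalize by the DCG of the ideal (sorted) ranking
    ideal_dcg = dcg(sorted(ranked_rels, reverse=True)[:k])
    return dcg(ranked_rels[:k]) / ideal_dcg if ideal_dcg > 0 else 0.0

# Relevance grades of the top retrieved docs for one query (1 = relevant)
print(round(ndcg_at_k([1, 0, 1, 0, 0]), 3))  # 0.92
```

A perfect ranking scores 1.0; misplacing a relevant document below irrelevant ones pulls the score down.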
Multilingual embeddings unlock global RAG. As of November 2025, the mE5 family (multilingual-e5-large-instruct) offers the best balance of quality, language coverage, and cost.