News

OpenAI Assistants v2: Improved Integrated RAG

April 30, 2026
6 min read
Ailog Team

OpenAI launches Assistants v2 with enhanced native RAG capabilities: improved file search, source annotations, and integrated vector stores.

OpenAI Strengthens Native RAG Offering

OpenAI launches version 2 of its Assistants API with significant improvements to RAG capabilities. File search becomes more powerful, source annotations more precise, and vector stores more flexible.

"Assistants v2 represents our vision of turnkey RAG," explains Sam Altman during the keynote. "Developers can build production-ready RAG applications in just a few lines of code."

Assistants v2 API Updates

Improved File Search

File search v2 brings major improvements:

Feature                  v1        v2
Files per vector store   100       10,000
Max size per file        512 MB    2 GB
Supported formats        12        25+
Table parsing            Basic     Advanced
Image parsing            No        Yes (OCR)
```python
from openai import OpenAI

client = OpenAI()

# Create a vector store
vector_store = client.beta.vector_stores.create(
    name="knowledge-base",
    chunking_strategy={
        "type": "semantic",  # New: semantic chunking
        "min_chunk_size": 100,
        "max_chunk_size": 800,
    },
)

# Upload files
client.beta.vector_stores.files.upload(
    vector_store_id=vector_store.id,
    file=open("document.pdf", "rb"),
)

# Create an assistant with RAG
assistant = client.beta.assistants.create(
    name="RAG Assistant",
    model="gpt-4-turbo",
    tools=[{"type": "file_search"}],
    tool_resources={
        "file_search": {"vector_store_ids": [vector_store.id]}
    },
)
```

Chunking strategies are now configurable directly in the API.

Source Annotations

Responses now include precise annotations:

```python
# Response with annotations
{
    "content": "Revenue increased by 15%[1].",
    "annotations": [
        {
            "type": "file_citation",
            "text": "[1]",
            "file_id": "file-abc123",
            "quote": "Annual revenue shows 15% growth",
            "page": 12,
            "confidence": 0.94
        }
    ]
}
```

Annotations include:

  • Exact quote from source document
  • Page number (for PDFs)
  • Confidence score
  • Link to source file

This feature is crucial for hallucination detection.
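For example, a downstream check could flag low-confidence citations for human review. The helper and the 0.8 threshold below are hypothetical, not part of the API; they only assume the annotation structure shown above.

```python
# Hypothetical post-processing step: surface citations whose
# confidence score falls below a chosen threshold.

def flag_weak_citations(annotations, threshold=0.8):
    """Return file citations with a confidence score below the threshold."""
    return [
        a for a in annotations
        if a.get("type") == "file_citation"
        and a.get("confidence", 0.0) < threshold
    ]

annotations = [
    {"type": "file_citation", "text": "[1]", "confidence": 0.94},
    {"type": "file_citation", "text": "[2]", "confidence": 0.52},
]

print([a["text"] for a in flag_weak_citations(annotations)])  # → ['[2]']
```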

Shared Vector Stores

Vector stores can now be shared between assistants:

```python
# Create a shared vector store
shared_store = client.beta.vector_stores.create(
    name="company-knowledge",
    sharing="organization",  # New
)

# Use it in multiple assistants
for assistant_id in [assistant1, assistant2, assistant3]:
    client.beta.assistants.update(
        assistant_id,
        tool_resources={
            "file_search": {"vector_store_ids": [shared_store.id]}
        },
    )
```

Improved Streaming

RAG response streaming is more granular:

```python
with client.beta.threads.runs.stream(
    thread_id=thread.id,
    assistant_id=assistant.id,
) as stream:
    for event in stream:
        if event.event == "thread.message.delta":
            print(event.data.delta.content[0].text.value, end="")
        elif event.event == "file_search.start":
            print(f"\n[Searching in {len(event.data.files)} files...]")
        elif event.event == "file_search.results":
            print(f"\n[{len(event.data.results)} results found]")
```

Performance and Limits

Benchmarks

OpenAI publishes benchmarks on standard RAG tasks:

Metric              Assistants v1   Assistants v2
Recall@5            72%             86%
Precision@5         68%             81%
Median latency      2.1s            1.4s
Citation accuracy   78%             91%
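For reference, Recall@k and Precision@k follow the standard information-retrieval definitions. This is generic evaluation code, not OpenAI's benchmark harness:

```python
# Standard IR metrics: how many relevant documents appear in the top-k results.

def recall_at_k(retrieved, relevant, k=5):
    """Fraction of all relevant documents found in the top-k results."""
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant) if relevant else 0.0

def precision_at_k(retrieved, relevant, k=5):
    """Fraction of the top-k results that are relevant."""
    top_k = retrieved[:k]
    hits = len(set(top_k) & set(relevant))
    return hits / len(top_k) if top_k else 0.0

retrieved = ["d1", "d7", "d3", "d9", "d2"]
relevant = ["d1", "d2", "d3", "d4"]
print(recall_at_k(retrieved, relevant))     # → 0.75
print(precision_at_k(retrieved, relevant))  # → 0.6
```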

Current Limits

Limit                            Value
Vector stores per organization   100
Files per vector store           10,000
Tokens per file                  5M
Parallel requests                50
Vector store retention           30 days (configurable)

Pricing

New Pricing Model

Component                   Price
Vector store (GB/day)       $0.10
File search (1K requests)   $0.03
Input tokens                $10/M
Output tokens               $30/M
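A back-of-the-envelope estimate shows how the components combine. The workload figures below (request volume, corpus size, tokens per request) are illustrative assumptions, and at high volumes the token costs dominate:

```python
# Rough monthly cost estimate from the pricing table above.
# All usage numbers are illustrative assumptions.

PRICE_STORAGE_GB_DAY = 0.10   # vector store, $ per GB per day
PRICE_SEARCH_PER_1K = 0.03    # file search, $ per 1K requests
PRICE_INPUT_PER_M = 10.0      # $ per million input tokens
PRICE_OUTPUT_PER_M = 30.0     # $ per million output tokens

requests = 10_000             # assumed requests per month
storage_gb = 1                # assumed corpus size
in_tokens_per_req = 1_000     # assumed prompt + retrieved context
out_tokens_per_req = 200      # assumed answer length

storage = storage_gb * 30 * PRICE_STORAGE_GB_DAY
search = requests / 1_000 * PRICE_SEARCH_PER_1K
inputs = requests * in_tokens_per_req / 1e6 * PRICE_INPUT_PER_M
outputs = requests * out_tokens_per_req / 1e6 * PRICE_OUTPUT_PER_M

total = storage + search + inputs + outputs
print(f"${total:,.2f}/month")  # → $163.30/month
```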

Comparison with Custom Solutions

Approach                     Estimated Monthly Cost*
Assistants v2                $200-500
Pinecone + GPT-4             $300-700
Qdrant self-hosted + GPT-4   $150-400
Ailog RAG-as-a-Service       $50-200

*For 100K requests/month, 1000 documents

Check our guide on RAG cost optimization.

Use Cases

When to Use Assistants v2

Ideal for:

  • Rapid prototypes
  • Teams without RAG expertise
  • Moderate traffic applications
  • All-in-one integration

Less suitable for:

  • Very high volume (> 1M requests/month)
  • Advanced customization needs
  • Data sovereignty constraints
  • Multi-LLM architectures

Complete Example

```python
from openai import OpenAI

client = OpenAI()

# 1. Create vector store with documents
vector_store = client.beta.vector_stores.create(name="docs")
client.beta.vector_stores.file_batches.upload_and_poll(
    vector_store_id=vector_store.id,
    files=[open(f, "rb") for f in ["doc1.pdf", "doc2.pdf"]],
)

# 2. Create the assistant
assistant = client.beta.assistants.create(
    name="Support Bot",
    model="gpt-4-turbo",
    instructions="You are a support assistant. Always cite your sources.",
    tools=[{"type": "file_search"}],
    tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},
)

# 3. Create a conversation
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="How do I configure product X?",
)

# 4. Execute and stream
with client.beta.threads.runs.stream(
    thread_id=thread.id,
    assistant_id=assistant.id,
) as stream:
    for text in stream.text_deltas:
        print(text, end="")
```

Migration from v1

Breaking Changes

  • retrieval renamed to file_search
  • New annotation structure
  • Vector stores mandatory (no more direct file attachments)

Migration Guide

```python
# Before (v1)
assistant = client.beta.assistants.create(
    tools=[{"type": "retrieval"}],
    file_ids=["file-123"],
)

# After (v2)
vector_store = client.beta.vector_stores.create()
client.beta.vector_stores.files.create(
    vector_store_id=vector_store.id,
    file_id="file-123",
)
assistant = client.beta.assistants.create(
    tools=[{"type": "file_search"}],
    tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},
)
```

Our Take

Assistants v2 represents a significant improvement:

Strengths:

  • Simplified turnkey RAG
  • Precise source annotations
  • Good integration with OpenAI ecosystem

Points of attention:

  • OpenAI lock-in
  • Limited customization
  • Data hosted at OpenAI

For projects requiring more control or sovereignty, solutions like Ailog offer an alternative with French hosting and advanced customization.

Check our guide to best RAG platforms to compare.

Tags

RAG, OpenAI, Assistants API, GPT-4, LLM
