Parent Document Retrieval: Context Without Noise
November 13, 2025
9 min read
Ailog Research Team
Search small chunks, retrieve full documents: the best of both precision and context for RAG systems.
The Problem
Small chunks:
- ✅ Precise retrieval
- ❌ Missing context
Large chunks:
- ✅ Full context
- ❌ Noisy retrieval
Solution: Search small, return large.
How It Works
- Index: Small chunks (200 tokens)
- Search: Find relevant small chunks
- Retrieve: Return parent document (2000 tokens)
Basic Implementation
```python
import uuid

# `raw_documents`, `embed`, `split_into_chunks`, and `vector_db` are assumed
# to be provided by your own pipeline.
chunks = []
documents = []

for doc in raw_documents:
    parent_id = str(uuid.uuid4())

    # Store the full document. Parents are never searched directly,
    # so they do not need an embedding.
    documents.append({
        "id": parent_id,
        "content": doc,
    })

    # Create small chunks, each linked to its parent
    for chunk in split_into_chunks(doc, size=200):
        chunks.append({
            "id": str(uuid.uuid4()),
            "content": chunk,
            "embedding": embed(chunk),
            "parent_id": parent_id,  # Link to parent
        })

# Index chunks only
vector_db.upsert(collection="chunks", documents=chunks)
```
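The indexing code above calls a `split_into_chunks` helper that this guide does not define. A minimal sketch follows, assuming whitespace-separated words as a rough stand-in for tokens (a real pipeline would count tokens with the embedding model's tokenizer). The `overlap` parameter is an addition for illustration, not something the guide specifies.

```python
def split_into_chunks(text, size=200, overlap=0):
    """Split text into chunks of at most `size` whitespace-separated words.

    Words only approximate tokens (assumption for illustration).
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

A small overlap can help when a relevant sentence straddles a chunk boundary, at the cost of indexing slightly more text.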
Retrieval
```python
def parent_document_retrieval(query, k=5):
    # Search small chunks
    chunk_results = vector_db.search(
        collection="chunks",
        query_vector=embed(query),
        limit=k,
    )

    # Collect parent IDs in rank order, dropping duplicates
    # (several matching chunks can share one parent)
    parent_ids = list(dict.fromkeys(chunk["parent_id"] for chunk in chunk_results))

    # Fetch parent documents, best match first
    docs_by_id = {doc["id"]: doc for doc in documents}
    return [docs_by_id[pid] for pid in parent_ids if pid in docs_by_id]
```
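To try the whole flow end to end without a vector database, here is a minimal sketch that uses only the standard library. The word-count `embed`, the `cosine` helper, `build_index`, and the 8-word chunk size are all stand-ins chosen for illustration; they do not represent a real embedding model or store.

```python
import math
import re
import uuid
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(raw_documents, chunk_words=8):
    """Return (parents_by_id, chunks); every chunk links to its parent."""
    parents, chunks = {}, []
    for doc in raw_documents:
        parent_id = str(uuid.uuid4())
        parents[parent_id] = doc
        words = doc.split()
        for i in range(0, len(words), chunk_words):
            text = " ".join(words[i:i + chunk_words])
            chunks.append({"content": text, "embedding": embed(text), "parent_id": parent_id})
    return parents, chunks


def retrieve_parents(query, parents, chunks, k=3):
    """Rank chunks by similarity, then return each parent once, best first."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, c["embedding"]), reverse=True)[:k]
    parent_ids = dict.fromkeys(c["parent_id"] for c in ranked)
    return [parents[pid] for pid in parent_ids]


docs = [
    "Gradient descent updates model weights step by step. The learning rate "
    "controls the step size. Too large a rate makes training diverge.",
    "Sourdough bread needs a lively starter. Feed the starter daily with flour "
    "and water. Bake the loaf in a hot oven.",
]
parents, chunks = build_index(docs)
# Both top chunks come from the first document, so it is returned once.
top = retrieve_parents("learning rate", parents, chunks, k=2)
```

The query matches two small chunks, but the caller receives the single full parent document they both belong to.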
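LangChain Implementation
```python
# Import paths depend on the installed LangChain version. Newer releases ship
# the splitters in `langchain_text_splitters` and Chroma in `langchain_chroma`.
from langchain.retrievers import ParentDocumentRetriever
from langchain.storage import InMemoryStore
from langchain.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Store for parent documents
docstore = InMemoryStore()

# Vector store for chunks (`embeddings` is an embedding model you define elsewhere)
vectorstore = Chroma(embedding_function=embeddings)

# Splitters (chunk_size counts characters by default, not tokens)
child_splitter = RecursiveCharacterTextSplitter(chunk_size=200)
parent_splitter = RecursiveCharacterTextSplitter(chunk_size=2000)

# Create retriever
retriever = ParentDocumentRetriever(
    vectorstore=vectorstore,
    docstore=docstore,
    child_splitter=child_splitter,
    parent_splitter=parent_splitter,
)

# Add documents (a list of LangChain `Document` objects)
retriever.add_documents(documents)

# Retrieve: returns the parent chunks produced by parent_splitter.
# Omit parent_splitter to get whole documents back.
# Older releases use retriever.get_relevant_documents(...) instead of invoke.
results = retriever.invoke("machine learning")
```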
Multi-Level Hierarchy
```python
import uuid

# In-memory stores for the upper levels (assumed; any key-value store works)
chapters = {}
books = {}

# Book → Chapter → Paragraph structure
def create_hierarchy(book):
    book_id = str(uuid.uuid4())

    for chapter in book.chapters:
        chapter_id = str(uuid.uuid4())

        # Index paragraphs (small)
        for paragraph in chapter.paragraphs:
            vector_db.upsert({
                "id": str(uuid.uuid4()),
                "content": paragraph,
                "embedding": embed(paragraph),
                "parent_id": chapter_id,    # Chapter
                "grandparent_id": book_id,  # Book
            })

        # Store chapter
        chapters[chapter_id] = chapter

    # Store book
    books[book_id] = book

def retrieve_with_context(query):
    # Find relevant paragraphs
    paragraphs = vector_db.search(embed(query), limit=3)

    # Attach the enclosing chapter and book to each match
    results = []
    for p in paragraphs:
        chapter = chapters[p["parent_id"]]
        book = books[p["grandparent_id"]]

        results.append({
            "match": p["content"],
            "chapter": chapter,
            "book_title": book.title,
        })

    return results
```
Windowed Retrieval
Return chunk + surrounding context:
```python
def windowed_retrieval(query, window_size=2):
    # Find relevant chunks
    chunk_results = vector_db.search(embed(query), limit=5)

    # Get chunks before and after each match
    expanded_results = []
    for chunk in chunk_results:
        # `get_document` and `find_chunk_index` are assumed helpers;
        # `parent_doc.chunks` is the parent's ordered list of chunk texts
        parent_doc = get_document(chunk["parent_id"])
        chunk_index = find_chunk_index(parent_doc, chunk["content"])

        # Get window, clamped to the document boundaries
        start = max(0, chunk_index - window_size)
        end = min(len(parent_doc.chunks), chunk_index + window_size + 1)

        # Join with a space so adjacent chunks do not run together
        expanded_chunk = " ".join(parent_doc.chunks[start:end])
        expanded_results.append(expanded_chunk)

    return expanded_results
```
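When two matches sit close together in the same parent, their windows overlap and the same text is returned twice. One way to avoid that is to merge the index ranges per parent before joining the chunks. The sketch below assumes each hit is available as a `(parent_id, chunk_index)` pair; `merge_windows` is a hypothetical helper, not part of any library.

```python
from collections import defaultdict


def merge_windows(hits, window_size=2):
    """Merge overlapping or touching chunk windows within each parent.

    `hits` is a list of (parent_id, chunk_index) pairs. Returns a dict that
    maps parent_id to a sorted list of (start, end) ranges, `end` exclusive.
    Ranges are not clipped to the parent's length; list slicing handles that.
    """
    by_parent = defaultdict(list)
    for parent_id, index in hits:
        start = max(0, index - window_size)
        by_parent[parent_id].append((start, index + window_size + 1))

    merged = {}
    for parent_id, ranges in by_parent.items():
        ranges.sort()
        result = [ranges[0]]
        for start, end in ranges[1:]:
            last_start, last_end = result[-1]
            if start <= last_end:  # overlaps or touches the previous range
                result[-1] = (last_start, max(last_end, end))
            else:
                result.append((start, end))
        merged[parent_id] = result
    return merged
```

Each merged range can then be sliced out of the parent's chunk list and joined once, for example `" ".join(parent_doc.chunks[start:end])`.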
Qdrant Implementation
```python
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

client = QdrantClient("localhost", port=6333)

# Create collection (size must match the embedding model's dimension)
client.create_collection(
    collection_name="chunks",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
)

# Insert chunks with the parent ID stored in the payload
points = []
for i, chunk in enumerate(chunks):
    points.append(PointStruct(
        id=i,
        vector=chunk["embedding"],
        payload={
            "content": chunk["content"],
            "parent_id": chunk["parent_id"],
        },
    ))

client.upsert(collection_name="chunks", points=points)

# Retrieve
def retrieve_parents(query):
    # Newer qdrant-client releases deprecate `search` in favor of `query_points`
    results = client.search(
        collection_name="chunks",
        query_vector=embed(query),
        limit=5,
    )

    # Get unique parent IDs, preserving rank order
    parent_ids = list(dict.fromkeys(r.payload["parent_id"] for r in results))

    # Fetch parents from the document store (`get_document` is an assumed helper)
    return [get_document(pid) for pid in parent_ids]
```
When to Use
✅ Use parent document retrieval when:
- Documents have clear structure
- The LLM needs the full surrounding context to answer
- Precision is important
❌ Don't use when:
- Documents are already small (< 500 tokens)
- You want to minimize token usage
- Context is not important
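The token cost is easy to underestimate: with the sizes used in this guide, five 200-token chunks add about 1,000 tokens to the prompt, while five distinct 2,000-token parents add up to 10,000. If that matters, one option is to cap the returned parents at a budget. This is a minimal sketch; `fit_to_budget` is a hypothetical helper, and the default whitespace word count only approximates a real tokenizer.

```python
def fit_to_budget(parent_docs, max_tokens, count_tokens=lambda text: len(text.split())):
    """Keep ranked parent documents, in order, while they fit in `max_tokens`.

    `count_tokens` defaults to a whitespace word count, which only
    approximates real tokenizer output (assumption for illustration).
    Returns (kept_documents, tokens_used).
    """
    kept, used = [], 0
    for doc in parent_docs:
        cost = count_tokens(doc)
        if used + cost > max_tokens:
            break  # stop at the first parent that does not fit, keeping rank order
        kept.append(doc)
        used += cost
    return kept, used
```

Stopping at the first parent that does not fit keeps the highest-ranked context intact instead of skipping ahead to lower-ranked but shorter documents.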
Parent document retrieval gives you precision without sacrificing context. Best of both worlds.