Hierarchical Chunking: Preserving Document Structure
Hierarchical chunking maintains parent-child relationships in your documents. Learn how to implement this advanced technique to improve RAG retrieval quality.
TL;DR
- Hierarchical chunking = preserve sections, subsections, and paragraphs
- Benefit: rich context + fine granularity simultaneously
- Implementation: nested chunks with hierarchy metadata
- Typical gain: +20-35% relevance on structured documents
- Test hierarchical chunking on Ailog
Why Hierarchical Chunking?
Real documents have structure:
- Chapters > Sections > Subsections > Paragraphs
- This hierarchy carries semantic meaning
Classic chunking (fixed-size or semantic) ignores this structure:
```
Original document:
├── Chapter 1: Introduction
│   ├── 1.1 Background
│   └── 1.2 Objectives
└── Chapter 2: Methods
    ├── 2.1 Approach A
    └── 2.2 Approach B

Classic chunking:
[Chunk 1: "...end of background. 1.2 Objectives..."]  ❌ Mixed sections
[Chunk 2: "...beginning of methods..."]               ❌ Lost hierarchy
```
Hierarchical Chunking Principle
Create chunks at multiple levels with parent-child links:
```python
# Preserved hierarchical structure
{
    "id": "doc1_ch2_s1",
    "content": "2.1 Approach A - Detailed description...",
    "metadata": {
        "level": 3,
        "parent_id": "doc1_ch2",
        "path": ["Chapter 2: Methods", "2.1 Approach A"],
        "document_id": "doc1"
    }
}
```
Python Implementation
Hierarchy Extraction
```python
import re
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class HierarchicalChunk:
    id: str
    content: str
    level: int
    title: str
    parent_id: Optional[str]
    path: List[str]
    children_ids: List[str]

def extract_hierarchy(text: str, patterns: Optional[dict] = None) -> List[HierarchicalChunk]:
    """
    Extracts hierarchical structure from a document.
    patterns: regex patterns to detect heading levels
    """
    if patterns is None:
        patterns = {
            1: r'^# (.+)$',     # Main title
            2: r'^## (.+)$',    # Sections
            3: r'^### (.+)$',   # Subsections
            4: r'^#### (.+)$',  # Sub-subsections
        }

    chunks = []
    current_path = []
    parent_stack = []  # Stack of (level, chunk_id)

    current_content = []
    current_title = "Document"
    current_level = 0
    current_parent_id = None
    chunk_counter = 0

    for line in text.split('\n'):
        header_found = False
        for level, pattern in patterns.items():
            match = re.match(pattern, line)
            if match:
                # Close the section that just ended (headers with no
                # body still produce a chunk so parent ids stay valid)
                if current_content or current_level > 0:
                    chunks.append(HierarchicalChunk(
                        id=f"chunk_{chunk_counter}",
                        content='\n'.join(current_content).strip(),
                        level=current_level,
                        title=current_title,
                        parent_id=current_parent_id,
                        path=current_path.copy(),
                        children_ids=[]
                    ))
                    chunk_counter += 1

                # Unwind path and parent stack to the new header's level
                while parent_stack and parent_stack[-1][0] >= level:
                    parent_stack.pop()
                    if current_path:
                        current_path.pop()

                # The parent is whatever remains on top of the stack
                current_parent_id = parent_stack[-1][1] if parent_stack else None
                current_title = match.group(1)
                current_level = level
                current_content = []
                current_path.append(current_title)
                parent_stack.append((level, f"chunk_{chunk_counter}"))
                header_found = True
                break
        if not header_found:
            current_content.append(line)

    # Don't forget the last chunk
    if current_content or current_level > 0:
        chunks.append(HierarchicalChunk(
            id=f"chunk_{chunk_counter}",
            content='\n'.join(current_content).strip(),
            level=current_level,
            title=current_title,
            parent_id=current_parent_id,
            path=current_path.copy(),
            children_ids=[]
        ))

    # Backfill children links from the parent ids
    by_id = {c.id: c for c in chunks}
    for c in chunks:
        if c.parent_id and c.parent_id in by_id:
            by_id[c.parent_id].children_ids.append(c.id)

    return chunks
```
Multi-Level Indexing
```python
def index_hierarchical_chunks(chunks: List[HierarchicalChunk], vector_db):
    """
    Indexes chunks with their hierarchical context.
    """
    for chunk in chunks:
        # Create enriched context
        path_context = " > ".join(chunk.path)
        enriched_content = f"{path_context}\n\n{chunk.content}"

        # Generate embedding
        embedding = embed(enriched_content)

        # Store with metadata
        vector_db.upsert(
            id=chunk.id,
            embedding=embedding,
            metadata={
                "content": chunk.content,
                "title": chunk.title,
                "level": chunk.level,
                "parent_id": chunk.parent_id,
                "path": path_context,
                "path_list": chunk.path
            }
        )
```
Contextual Retrieval
Strategy: Small-to-Big
Search in fine chunks, return parent context:
```python
def hierarchical_retrieve(query: str, vector_db, k: int = 3) -> List[dict]:
    """
    Retrieves relevant chunks with their parent context.
    """
    # 1. Fine search (lowest levels)
    results = vector_db.query(
        query_embedding=embed(query),
        filter={"level": {"$gte": 3}},  # Subsections and below
        limit=k * 2
    )

    # 2. Enrich with parent context
    enriched_results = []
    seen_parents = set()

    for result in results:
        parent_id = result.metadata.get("parent_id")

        # Walk up the parent chain
        context_chain = [result.metadata["content"]]
        current_parent = parent_id
        while current_parent and current_parent not in seen_parents:
            parent = vector_db.get(current_parent)
            if parent:
                context_chain.insert(0, parent.metadata["content"])
                seen_parents.add(current_parent)
                current_parent = parent.metadata.get("parent_id")
            else:
                break

        enriched_results.append({
            "chunk": result,
            "full_context": "\n\n---\n\n".join(context_chain),
            "path": result.metadata["path"]
        })

    return enriched_results[:k]
```
Strategy: Big-to-Small
Search at section level, then drill-down:
```python
def drill_down_retrieve(query: str, vector_db, k: int = 3) -> List[dict]:
    """
    Start with sections, then refine to details.
    """
    # 1. Search at section level
    sections = vector_db.query(
        query_embedding=embed(query),
        filter={"level": 2},
        limit=k
    )

    # 2. For each relevant section, search for details
    detailed_results = []
    for section in sections:
        # Search the children of this section
        children = vector_db.query(
            query_embedding=embed(query),
            filter={"parent_id": section.id},
            limit=3
        )

        detailed_results.append({
            "section": section,
            "details": children,
            "combined_context": (
                section.metadata["content"] + "\n\n" +
                "\n".join([c.metadata["content"] for c in children])
            )
        })

    return detailed_results
```
LlamaIndex: Parent Document Retriever
LlamaIndex offers a native implementation. Note that all nodes go into the docstore, but only the leaf nodes are embedded and indexed; `AutoMergingRetriever` then merges leaf hits back into their parents:

```python
from llama_index import VectorStoreIndex, StorageContext
from llama_index.node_parser import HierarchicalNodeParser, get_leaf_nodes
from llama_index.retrievers import AutoMergingRetriever
from llama_index.query_engine import RetrieverQueryEngine

# 1. Hierarchical parser
node_parser = HierarchicalNodeParser.from_defaults(
    chunk_sizes=[2048, 512, 128]  # Granularity levels
)

# 2. Create nodes at all levels, keep the leaves for search
nodes = node_parser.get_nodes_from_documents(documents)
leaf_nodes = get_leaf_nodes(nodes)

# 3. Store every node, index only the leaves
storage_context = StorageContext.from_defaults()
storage_context.docstore.add_documents(nodes)
index = VectorStoreIndex(leaf_nodes, storage_context=storage_context)

# 4. Retriever with auto-merging
retriever = AutoMergingRetriever(
    index.as_retriever(similarity_top_k=6),
    storage_context,
    verbose=True
)

# 5. Query engine
query_engine = RetrieverQueryEngine.from_args(retriever)
response = query_engine.query("What methods are used?")
```
LangChain: Parent Document Retriever
```python
from langchain.retrievers import ParentDocumentRetriever
from langchain.storage import InMemoryStore
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma

# Splitters for different levels
parent_splitter = RecursiveCharacterTextSplitter(chunk_size=2000)
child_splitter = RecursiveCharacterTextSplitter(chunk_size=400)

# Store for parents
docstore = InMemoryStore()

# Vectorstore for children (fine search)
vectorstore = Chroma(embedding_function=embeddings)

# Parent Document Retriever
retriever = ParentDocumentRetriever(
    vectorstore=vectorstore,
    docstore=docstore,
    child_splitter=child_splitter,
    parent_splitter=parent_splitter,
)

# Add documents
retriever.add_documents(documents)

# Search: children match, parents returned
results = retriever.get_relevant_documents("Question about methods")
```
Metadata Optimization
Enrich Semantic Path
```python
def create_semantic_path(chunk: HierarchicalChunk) -> str:
    """
    Creates a readable semantic path for the LLM.
    """
    path_parts = []
    for i, title in enumerate(chunk.path):
        level_prefix = {
            0: "Document:",
            1: "Chapter:",
            2: "Section:",
            3: "Subsection:",
            4: "Paragraph:"
        }.get(i, "")
        path_parts.append(f"{level_prefix} {title}")
    return " → ".join(path_parts)

# Example output:
# "Document: Technical Manual → Chapter: Installation → Section: Prerequisites"
```
Add Breadcrumbs to Context
```python
def format_context_with_breadcrumbs(chunks: List[dict]) -> str:
    """
    Formats context with breadcrumbs for the LLM.
    """
    formatted = []
    for chunk in chunks:
        breadcrumb = chunk['path']
        content = chunk['content']
        formatted.append(f"📍 {breadcrumb}\n\n{content}")
    return "\n\n---\n\n".join(formatted)
```
When to Use Hierarchical Chunking
Use it when:
- Long, structured documents (manuals, technical docs)
- Clear hierarchy (chapters, sections, subsections)
- Need both broad context AND fine precision
- Questions that span multiple levels
Avoid when:
- Flat documents (emails, chats, logs)
- Very homogeneous content
- Strict latency constraints (retrieval overhead)
- Very short documents (< 2000 tokens)
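The criteria above can be condensed into a quick heuristic for routing documents at ingestion time. This is a minimal sketch; the ~4-characters-per-token estimate, the 2000-token floor, and the heading regex are illustrative assumptions, not fixed rules:

```python
import re

def should_use_hierarchical_chunking(text: str, min_tokens: int = 2000) -> bool:
    """Rough heuristic: long AND visibly structured -> hierarchical."""
    # Approximate token count (~4 chars per token for English text)
    approx_tokens = len(text) / 4
    if approx_tokens < min_tokens:
        return False  # Very short documents: the overhead isn't worth it

    # Detect explicit structure: markdown headings or numbered sections
    headings = re.findall(r'^(#{1,4} |\d+(\.\d+)* )', text, flags=re.MULTILINE)
    return len(headings) >= 3  # Needs a real hierarchy, not a lone title
```

Flat content such as chat logs fails both checks and can fall back to fixed-size or semantic chunking.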
Benchmarks
| Document Type | Fixed Chunking | Semantic | Hierarchical |
|---|---|---|---|
| Technical docs | 65% | 72% | 88% |
| Structured reports | 58% | 68% | 85% |
| Scientific papers | 62% | 75% | 82% |
| Narrative text | 70% | 78% | 72% |
MRR@5 on internal test datasets
Hierarchical excels on structured documents but provides no gain on narrative content.
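To run the same comparison on your own corpus, MRR@k takes only a few lines. This sketch assumes each query is labeled with the id of its single relevant chunk; with multiple relevant chunks per query you would check membership in a set instead:

```python
from typing import List

def mrr_at_k(ranked_ids: List[List[str]], relevant_ids: List[str], k: int = 5) -> float:
    """Mean Reciprocal Rank: average of 1/rank of the first relevant hit per query."""
    total = 0.0
    for ranking, relevant in zip(ranked_ids, relevant_ids):
        for rank, chunk_id in enumerate(ranking[:k], start=1):
            if chunk_id == relevant:
                total += 1.0 / rank
                break  # Only the first relevant hit counts
    return total / len(ranked_ids)

# Two queries: relevant chunk at rank 1 and rank 2 -> (1 + 0.5) / 2 = 0.75
mrr_at_k([["a", "b"], ["x", "b"]], ["a", "b"])
```

Run it once per chunking strategy over the same query set and compare the scores.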
Related Guides
Chunking:
- Chunking Strategies - Overview of approaches
- Semantic Chunking - Meaning-based chunking
- Fixed-Size Chunking - Classic approach
Retrieval:
- Parent Document Retrieval - Retrieval with parent context
- Retrieval Strategies - Advanced techniques
Need help implementing hierarchical chunking on your complex documents? Let's discuss your project →
Related Articles
RAG Chunking Strategies 2025: Optimal Chunk Sizes & Techniques
Master document chunking for RAG: optimal chunk sizes (512-1024 tokens), overlap strategies, semantic vs fixed-size splitting. Improve retrieval by 25%+.
Semantic Chunking for Better Retrieval
Split documents intelligently based on meaning, not just length. Learn semantic chunking techniques for RAG.
Fixed-Size Chunking: Fast and Reliable
Master the basics: implement fixed-size chunking with overlaps for consistent, predictable RAG performance.