Entity Memory: Remember Mentioned Entities
Complete guide to implementing entity memory in a RAG system: tracking people, products and concepts mentioned in the conversation.
Entity Memory extracts and stores the key entities mentioned in a conversation: people, products, companies, locations, dates, and so on. Unlike Buffer or Summary Memory, which keep raw text, Entity Memory builds a dynamic "knowledge graph" that is enriched with every exchange. This is what makes natural references like "it", "this product", or "that option" resolvable.
Why Entity Memory?
The Entity Reference Problem
In natural conversation, users constantly reference previously mentioned entities:
```
User: "I'm looking for the Dell XPS 15"
AI:   "The Dell XPS 15 is an excellent laptop..."
User: "What's its warranty?"              <- "its" = Dell XPS 15
AI:   "The Dell XPS 15 warranty..."
User: "And compared to the Lenovo one?"   <- "the Lenovo one" = which Lenovo product?
AI:   ???
```
Without Entity Memory, the LLM must guess which entity the user is referring to.
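The mechanism can be sketched with a naive, recency-ordered entity store. This is a minimal illustration, not a library API: the `SimpleEntityStore` class and its type heuristic are hypothetical.

```python
from collections import OrderedDict
from typing import Optional

class SimpleEntityStore:
    """Naive recency-ordered entity store for pronoun resolution."""

    def __init__(self):
        self._entities = OrderedDict()  # name -> type, most recent last

    def mention(self, name: str, entity_type: str) -> None:
        # Re-inserting moves the entity to the "most recent" position
        self._entities.pop(name, None)
        self._entities[name] = entity_type

    def resolve(self, reference: str) -> Optional[str]:
        """'it'/'its' -> most recent entity; 'this product' -> most recent PRODUCT."""
        wanted_type = "PRODUCT" if "product" in reference.lower() else None
        for name, etype in reversed(self._entities.items()):
            if wanted_type is None or etype == wanted_type:
                return name
        return None

store = SimpleEntityStore()
store.mention("Dell XPS 15", "PRODUCT")
store.mention("Dell", "ORGANIZATION")
print(store.resolve("its"))           # → Dell (most recent mention)
print(store.resolve("this product"))  # → Dell XPS 15 (most recent PRODUCT)
```

A production resolver needs more than recency (type constraints from the pronoun, gender and number agreement in some languages), which is why the full implementation later in this guide delegates the decision to the LLM.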
Entity Memory Architecture
```
┌────────────────────────────────────────────────┐
│                 ENTITY MEMORY                  │
├────────────────────────────────────────────────┤
│                                                │
│  ┌──────────────────────────────────────────┐  │
│  │               ENTITY STORE               │  │
│  ├──────────────────────────────────────────┤  │
│  │                                          │  │
│  │  "Dell XPS 15" (PRODUCT)                 │  │
│  │    - Type: Laptop                        │  │
│  │    - Price: $1599                        │  │
│  │    - Warranty: 2 years                   │  │
│  │    - Mentions: 3                         │  │
│  │                                          │  │
│  │  "Lenovo ThinkPad X1" (PRODUCT)          │  │
│  │    - Type: Laptop                        │  │
│  │    - Price: $1799                        │  │
│  │    - Mentions: 1                         │  │
│  │                                          │  │
│  └──────────────────────────────────────────┘  │
│                                                │
└────────────────────────────────────────────────┘
```
Key benefit: Entity Memory can improve response relevance by 40-60% in conversations that repeatedly reference the same objects.
Use Cases
| Domain | Tracked Entities | Reference Example |
|---|---|---|
| E-commerce | Products, brands, categories | "This product", "the same in black" |
| Support | Tickets, products, issues | "My ticket", "this error" |
| HR | Candidates, positions, companies | "This candidate", "the position" |
| Real Estate | Properties, neighborhoods, buyers | "This apartment", "the seller" |
Basic Implementation
Entity Structure
```python
from typing import Dict, List, Optional
from dataclasses import dataclass, field
from datetime import datetime
import json

@dataclass
class Entity:
    """Represents an extracted entity"""
    name: str
    type: str  # PERSON, PRODUCT, ORGANIZATION, etc.
    attributes: Dict = field(default_factory=dict)
    mentions: int = 1
    first_seen: datetime = field(default_factory=datetime.now)
    last_seen: datetime = field(default_factory=datetime.now)

    def update(self, new_attributes: Dict) -> None:
        """Update entity attributes"""
        self.attributes.update(new_attributes)
        self.mentions += 1
        self.last_seen = datetime.now()

    def to_dict(self) -> Dict:
        """Convert to dictionary"""
        return {
            "name": self.name,
            "type": self.type,
            "attributes": self.attributes,
            "mentions": self.mentions,
        }
```
Complete Entity Memory
```python
class EntityMemory:
    """
    Entity-based conversational memory
    """

    def __init__(self, llm, entity_types: List[str] = None):
        self.llm = llm
        self.entity_types = entity_types or [
            "PERSON", "PRODUCT", "ORGANIZATION",
            "LOCATION", "DATE", "MONEY", "CONCEPT"
        ]
        self.entities: Dict[str, Entity] = {}
        self.recent_messages: List[Dict] = []

    def add_message(self, role: str, content: str) -> None:
        """Add a message and extract entities"""
        self.recent_messages.append({
            "role": role,
            "content": content,
            "timestamp": datetime.now().isoformat()
        })
        # Extract entities from message
        entities = self._extract_entities(content)
        self._update_entity_store(entities)

    def _extract_entities(self, text: str) -> List[Dict]:
        """Extract entities from text using LLM"""
        prompt = f"""Extract named entities from this text.

Entity types to detect: {', '.join(self.entity_types)}

Text: "{text}"

Return a JSON array: [{{"name": "name", "type": "TYPE", "attributes": {{}}}}, ...]
If no entities, return: []

JSON:"""
        response = self.llm.invoke(prompt)
        try:
            return json.loads(response)
        except json.JSONDecodeError:
            return []

    def _update_entity_store(self, entities: List[Dict]) -> None:
        """Update entity store"""
        for entity_data in entities:
            name = entity_data.get("name", "").lower()
            if not name:
                continue
            if name in self.entities:
                self.entities[name].update(entity_data.get("attributes", {}))
            else:
                self.entities[name] = Entity(
                    name=entity_data.get("name"),
                    type=entity_data.get("type", "UNKNOWN"),
                    attributes=entity_data.get("attributes", {})
                )

    def get_recent_entities(self, limit: int = 5) -> List[Entity]:
        """Get most recently mentioned entities"""
        sorted_entities = sorted(
            self.entities.values(),
            key=lambda e: e.last_seen,
            reverse=True
        )
        return sorted_entities[:limit]

    def resolve_reference(self, reference: str) -> Optional[str]:
        """Resolve a reference like 'it', 'this product'"""
        if not self.entities:
            return None
        entities_json = json.dumps([e.to_dict() for e in self.get_recent_entities(5)])
        prompt = f"""Reference to resolve: "{reference}"

Known entities: {entities_json}

Which entity corresponds to "{reference}"?
Reply only with the name, or "NONE" if no match."""
        result = self.llm.invoke(prompt).strip()
        return result if result != "NONE" else None

    def get_context(self) -> str:
        """Generate context to inject into prompt"""
        if not self.entities:
            return ""
        recent = self.get_recent_entities(5)
        entity_lines = []
        for e in recent:
            attrs = ", ".join([f"{k}: {v}" for k, v in e.attributes.items()])
            line = f"- {e.name} ({e.type})"
            if attrs:
                line += f": {attrs}"
            entity_lines.append(line)
        return "Entities in context:\n" + "\n".join(entity_lines)

    def clear(self) -> None:
        """Reset memory"""
        self.entities = {}
        self.recent_messages = []
```
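A fragile point in `_extract_entities` is calling `json.loads` on raw model output: LLMs often wrap JSON in markdown fences or surround it with prose. A defensive parser can recover the payload in those cases — a sketch, where `parse_llm_json` is our own hypothetical helper, not a library function:

```python
import json
import re
from typing import Any

def parse_llm_json(raw: str, default: Any = None) -> Any:
    """Best-effort JSON extraction from an LLM response."""
    text = raw.strip()
    # Strip a ```json ... ``` fence if present
    fence = re.match(r"```(?:json)?\s*(.*?)\s*```", text, re.DOTALL)
    if fence:
        text = fence.group(1)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Fall back to the first [...] or {...} span in the text
        match = re.search(r"\[.*\]|\{.*\}", text, re.DOTALL)
        if match:
            try:
                return json.loads(match.group(0))
            except json.JSONDecodeError:
                pass
    return default

print(parse_llm_json('```json\n[{"name": "Dell XPS 15", "type": "PRODUCT"}]\n```'))
# → [{'name': 'Dell XPS 15', 'type': 'PRODUCT'}]
```

Swapping `json.loads(response)` for `parse_llm_json(response, default=[])` in `_extract_entities` makes the extraction noticeably more robust without changing its contract.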
Integration with RAG
```python
import re

class RAGWithEntityMemory:
    """RAG Pipeline with Entity Memory"""

    def __init__(self, vector_store, llm):
        self.vector_store = vector_store
        self.llm = llm
        self.memory = EntityMemory(llm)

    def query(self, user_message: str) -> str:
        """Execute RAG query with entity context"""
        # 1. Resolve references
        resolved_message = self._resolve_references(user_message)

        # 2. Extract entities
        self.memory.add_message("user", resolved_message)

        # 3. Retrieve documents
        docs = self.vector_store.similarity_search(resolved_message, k=3)
        doc_context = "\n\n".join([d.page_content for d in docs])

        # 4. Build prompt
        entity_context = self.memory.get_context()
        prompt = f"""{entity_context}

Relevant documents:
{doc_context}

Question: {user_message}

Answer considering mentioned entities."""

        # 5. Generate response
        response = self.llm.invoke(prompt)
        self.memory.add_message("assistant", response)
        return response

    def _resolve_references(self, text: str) -> str:
        """Resolve pronominal references.

        Matches on word boundaries (so 'it' never fires inside 'with'),
        longest references first (so 'this product' wins over 'this').
        """
        references = ["this product", "this one", "this", "that", "it"]
        for ref in references:
            pattern = re.compile(rf"\b{re.escape(ref)}\b", re.IGNORECASE)
            if pattern.search(text):
                resolved = self.memory.resolve_reference(ref)
                if resolved:
                    text = pattern.sub(resolved, text, count=1)
        return text
```
Advanced Techniques
Extraction with Relations
```python
class RelationalEntityMemory(EntityMemory):
    """Entity Memory with entity relations"""

    def __init__(self, llm, **kwargs):
        super().__init__(llm, **kwargs)
        self.relations: List[Dict] = []

    def _extract_entities(self, text: str) -> List[Dict]:
        """Extract entities AND relations"""
        prompt = f"""Analyze this text and extract:
1. Named entities
2. Relations between entities

Relation types: COMPARES_TO, PREFERS, OWNS, INTERESTED_IN

Text: "{text}"

JSON:
{{
  "entities": [{{"name": "...", "type": "...", "attributes": {{}}}}],
  "relations": [{{"source": "e1", "relation": "TYPE", "target": "e2"}}]
}}"""
        response = self.llm.invoke(prompt)
        try:
            data = json.loads(response)
            for rel in data.get("relations", []):
                self.relations.append(rel)
            return data.get("entities", [])
        except json.JSONDecodeError:
            return []

    def get_related_entities(self, entity_name: str) -> List[Dict]:
        """Find related entities"""
        related = []
        name_lower = entity_name.lower()
        for rel in self.relations:
            if rel["source"].lower() == name_lower:
                related.append({"entity": rel["target"], "relation": rel["relation"]})
            elif rel["target"].lower() == name_lower:
                related.append({"entity": rel["source"], "relation": rel["relation"]})
        return related
```
Multi-Session Persistence
```python
from pathlib import Path

class PersistentEntityMemory(EntityMemory):
    """Entity Memory with persistent storage"""

    def __init__(self, llm, user_id: str, storage_dir: str = "./entity_store"):
        super().__init__(llm)
        self.user_id = user_id
        self.storage_path = Path(storage_dir) / f"{user_id}.json"
        self._load()

    def _load(self) -> None:
        """Load entities from file"""
        if self.storage_path.exists():
            with open(self.storage_path, "r") as f:
                data = json.load(f)
            for name, entity_data in data.get("entities", {}).items():
                self.entities[name] = Entity(
                    name=entity_data["name"],
                    type=entity_data["type"],
                    attributes=entity_data.get("attributes", {}),
                    mentions=entity_data.get("mentions", 1)
                )

    def save(self) -> None:
        """Save entities"""
        self.storage_path.parent.mkdir(parents=True, exist_ok=True)
        data = {
            "user_id": self.user_id,
            "entities": {name: e.to_dict() for name, e in self.entities.items()}
        }
        with open(self.storage_path, "w") as f:
            json.dump(data, f, indent=2)

    def add_message(self, role: str, content: str) -> None:
        super().add_message(role, content)
        self.save()
```
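Since `save()` runs after every message, a crash mid-write could leave a truncated JSON file that then fails to load. A common remedy is an atomic write: dump to a temporary file, then rename it over the target. A sketch — the `atomic_save_json` helper is our own, not part of any library:

```python
import json
import os
import tempfile
from pathlib import Path

def atomic_save_json(path: Path, data: dict) -> None:
    """Write JSON atomically: readers never see a half-written file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f, indent=2)
        os.replace(tmp, path)  # atomic rename on POSIX and Windows
    except BaseException:
        os.unlink(tmp)
        raise

store = Path("./entity_store") / "demo_user.json"
atomic_save_json(store, {"user_id": "demo_user", "entities": {}})
print(json.loads(store.read_text())["user_id"])  # → demo_user
```

Replacing the `open(..., "w")` block in `save()` with this helper costs nothing and removes the corruption window.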
Implementation with LangChain
```python
from langchain.memory import ConversationEntityMemory
from langchain.chains import ConversationChain
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4", temperature=0.7)

# Native LangChain Entity Memory
memory = ConversationEntityMemory(llm=llm, return_messages=True)
conversation = ConversationChain(llm=llm, memory=memory, verbose=True)

# Usage
response = conversation.predict(input="I'm looking for a Dell XPS 15")
response = conversation.predict(input="What's its price?")

# Inspect entities
print(memory.entity_store)
```
Best Practices
1. Entity Types by Domain
```python
# E-commerce
ECOMMERCE_TYPES = ["PRODUCT", "BRAND", "CATEGORY", "PRICE", "FEATURE"]

# Support
SUPPORT_TYPES = ["PRODUCT", "ISSUE", "SOLUTION", "TICKET_ID", "PERSON"]

# HR
HR_TYPES = ["PERSON", "POSITION", "COMPANY", "SKILL", "EDUCATION"]
```
2. Entity Expiration
```python
from datetime import timedelta

class ExpiringEntityMemory(EntityMemory):
    def __init__(self, llm, ttl_minutes: int = 30, **kwargs):
        super().__init__(llm, **kwargs)
        self.ttl = timedelta(minutes=ttl_minutes)

    def _cleanup_expired(self) -> None:
        now = datetime.now()
        expired = [name for name, e in self.entities.items()
                   if now - e.last_seen > self.ttl]
        for name in expired:
            del self.entities[name]

    def get_context(self) -> str:
        self._cleanup_expired()
        return super().get_context()
```
3. Intelligent Prioritization
```python
# Method to add to EntityMemory
def get_prioritized_entities(self, query: str, limit: int = 5) -> List[Entity]:
    """Return most relevant entities for a query"""
    scored = []
    for entity in self.entities.values():
        score = 0
        # Recency: score fades as the last mention gets older
        recency = (datetime.now() - entity.last_seen).total_seconds()
        score += max(0, 100 - recency / 60)
        # Frequency: reward repeated mentions
        score += entity.mentions * 10
        # Direct mention in the query
        if entity.name.lower() in query.lower():
            score += 50
        scored.append((entity, score))
    scored.sort(key=lambda x: x[1], reverse=True)
    return [e for e, _ in scored[:limit]]
```
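To sanity-check the weights, the scoring logic can be pulled out into a standalone function — a self-contained sketch mirroring the heuristic above, where the `ScoredEntity` dataclass is just a stand-in for the full `Entity`:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class ScoredEntity:
    name: str
    mentions: int
    last_seen: datetime

def score(entity: ScoredEntity, query: str, now: datetime) -> float:
    """Recency + frequency + exact-name match, as in get_prioritized_entities."""
    age_seconds = (now - entity.last_seen).total_seconds()
    s = max(0.0, 100.0 - age_seconds / 60.0)  # recency: fades to 0 after ~100 min
    s += entity.mentions * 10                  # frequency
    if entity.name.lower() in query.lower():   # explicit mention in the query
        s += 50
    return s

now = datetime.now()
fresh = ScoredEntity("Dell XPS 15", mentions=3, last_seen=now)
stale = ScoredEntity("Lenovo ThinkPad X1", mentions=1, last_seen=now - timedelta(hours=3))
query = "What is the warranty on the Dell XPS 15?"
assert score(fresh, query, now) > score(stale, query, now)
```

Tuning is mostly a matter of the 100/10/50 weights: raise the mention weight for long sessions, shorten the recency window for fast-moving conversations.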
When to Use Entity Memory?
Ideal Cases
- E-commerce: Tracking compared products
- Technical support: Tracking problem and solutions
- CRM/Sales: Tracking contacts and opportunities
- Consulting: Retaining preferences and constraints
When to Avoid
| Situation | Alternative |
|---|---|
| Generic conversations | Buffer Memory |
| Critical latency | Simple buffer |
| Very short conversations | Buffer Memory |
Related Guides
- Buffer Memory - For simple conversations
- Summary Memory - For long conversations
- Conversational RAG - Overview
Entity Memory with Ailog
With Ailog, you get native entity management:
- Automatic extraction of entities with configurable types
- Coreference resolution ("it", "this one", etc.)
- Multi-session persistence of user entities
- Relationship graph between entities
- Analytics on most mentioned entities
Try Ailog for free and deploy a chatbot that remembers important entities.
Related Posts
Conversational RAG: Memory and Multi-Session Context
Implement RAG with conversational memory: context management, multi-session history, and personalized responses.
Buffer Memory: Simple Conversation History
Complete guide to implementing buffer memory in a conversational RAG system: keeping context of recent exchanges for coherent responses.
RAG Agents: Orchestrating Multi-Agent Systems
Architect multi-agent RAG systems: orchestration, specialization, collaboration and failure handling for complex assistants.