Appearance
Agent Memory & History Pipelines
Elsai Agents supports a highly flexible, modular memory system designed to manage agent state, conversation history length, and semantic contexts. Instead of relying on a static, single conversation buffer, Elsai enables you to build custom memory pipelines that shape history dynamically.
With the standalone elsai-model, elsai-embeddings, and elsai-vectordb packages, your agents can easily use external embedding models and vector databases to retrieve relevant historic context and persist memory over time.
Memory Pipeline Architecture
Elsai divides memory management into two types of steps within a pipeline:
- Elsai-Native Conversation Managers: Simple conversation sizing filters designed to run in memory or persist to a local folder.
- Persistence: Requires
persistence="elsai_file". - Classes:
ElsaiSlidingStep,ElsaiSummarizingStep.
- Persistence: Requires
- ElsAI Shaping Pipeline Steps: Advanced strategies that trim, summarize, or age-out messages based on token limits or similarity scoring.
- Persistence: Requires
persistence="elsai_json". - Classes:
ElsaiTrimmingStep,ElsaiSummarizationStep,ElsaiLRUStep,ElsaiTTLStep,ElsaiSimilarityStep.
- Persistence: Requires
IMPORTANT
To configure an agent with a custom memory pipeline, use the build_agent_with_memory builder function and configure MemoryConfig.
Pipeline Steps Reference
Below are the supported pipeline strategy steps that you can register in MemoryConfig.pipeline:
ElsaiSlidingStep
Maintains a simple sliding window of the most recent messages.
- Parameters:
window_size(int): Number of recent messages to preserve. Default:40.should_truncate_results(bool): Truncate oldest messages when exceeding size. Default:True.
ElsaiSummarizingStep
Compacts old messages by summarizing them using a language model once the history grows.
- Parameters:
preserve_recent_messages(int): Number of recent messages to leave untouched. Default:10.summary_ratio(float): Target ratio of summarization. Default:0.3.
ElsaiTrimmingStep
Trims older messages once a limit on message count or token count is exceeded.
- Parameters:
max_messages(int): Maximum messages to allow. Default:30.max_tokens(int | None): Optional token count threshold. Default:None.preserve_system(bool): Always keep the initial system prompt. Default:True.preserve_recent(int): Number of most recent messages to protect from trimming. Default:3.
ElsaiSummarizationStep
Converts older messages in the active window into a high-level prose summary.
- Parameters:
trigger_count(int): Trigger summarization when window exceeds this size. Default:20.preserve_system(bool): Keep system prompt. Default:True.
- Requires configuring
MemoryConfig.summarizer_llm.
ElsaiLRUStep
Performs Least Recently Used (LRU) eviction on conversations.
- Parameters:
max_messages(int): Maximum message window limit. Default:30.preserve_system(bool): Keep system prompt. Default:True.preserve_recent(int): Protect the last N messages from eviction. Default:5.
ElsaiTTLStep
Ages out old messages from history based on elapsed time.
- Parameters:
ttl_seconds(int): Time-to-live threshold in seconds. Default:3600(1 hour).preserve_system(bool): Keep system prompt. Default:True.preserve_recent(int): Protect recent messages. Default:5.
Semantic Context Injection Hooks
To provide long-term associative memory, you can attach similarity search and semantic memory hooks directly through MemoryConfig:
1. Similarity Retrieval Config (MemoryConfig.similarity)
Automatically performs vector similarity search on user input against the conversation database and injects matching context.
- Key parameters:
similarity_config(dict): Connection configurations (includes vector DB client and embedding client).top_k(int): Number of matched messages to retrieve. Default:5.injection_mode(str):"system_append"(appends to system prompt) or"user_preamble"(prepends to user message).
2. Semantic Memory Config (MemoryConfig.semantic)
Maintains abstract facts (like preferences) about a user across multiple sessions and injects them.
- Key parameters:
user_id_key(str): The metadata key matching the user ID. Default:"user_id".injection_mode(str): Where to insert the retrieved facts. Default:"system_append".
Basic Usage Example
Below is a complete example of creating an Elsai agent with a custom memory pipeline, using a Chroma local vector database and Titan embeddings on Bedrock to run similarity searches.
python
import os
from pathlib import Path
from elsai_model.openai import OpenAIConnector
from elsai.integrations.elsai_embeddings import EmbeddingBackendConfig, build_embedding_client
from elsai.integrations.elsai_vectordb import VectorBackendConfig, build_vectordb_client
from elsai.integrations.elsai_memory import (
MemoryConfig,
ElsaiTrimmingStep,
SimilarityRetrievalConfig,
build_agent_with_memory,
)
# 1. Initialize standalone Embedding & Vector DB clients
embed_client = build_embedding_client(
EmbeddingBackendConfig(
provider="bedrock",
aws_region="us-east-1",
model_name="amazon.titan-embed-text-v1"
)
)
vector_db = build_vectordb_client(
VectorBackendConfig(
provider="chroma",
collection_name="agent_history",
persist_directory="./chroma_db",
)
)
# 2. Build the similarity config dictionary
similarity_setup = {
"vector_database": {
"name": "chroma",
"client": vector_db,
"collection_name": "agent_history",
},
"embedding_model": {
"name": "bedrock",
"client": embed_client,
},
}
# 3. Define your memory and persistent pipelines
memory_config = MemoryConfig(
run_id="session_user_123",
role="customer_support",
persistence="elsai_json",
pipeline=[ElsaiTrimmingStep(max_messages=15)],
similarity=SimilarityRetrievalConfig(
similarity_config=similarity_setup,
top_k=3
)
)
# 4. Spin up the agent using build_agent_with_memory
model = OpenAIConnector(model_name="gpt-4o")
agent = build_agent_with_memory(
config=memory_config,
model=model,
)
# 5. Run the agent
result = agent("What did we talk about during our last chat regarding database deployment?")
print(result)