Interactive playground for testing and comparing nine AI agent memory optimization strategies.

## Overview

This project implements nine memory optimization techniques for AI agents, giving you a practical toolkit for managing conversation history and context in production AI systems. Each strategy is implemented as a modular, plug-and-play class behind a unified interface.
## Why Memory Optimization?

- **Token Cost Reduction**: keep LLM API costs from ballooning as conversations grow
- **Context Preservation**: maintain relevant information across conversations
- **Scalability**: handle long conversations efficiently
- **Performance**: optimize response times and memory usage
## The Nine Strategies

1. **Sequential Memory** - complete conversation history storage
2. **Sliding Window Memory** - fixed-size window over the most recent turns (see the sketch after this list)
3. **Summarization Memory** - LLM-based conversation compression
4. **Retrieval Memory (RAG)** - vector similarity search for semantic retrieval
5. **Memory-Augmented Memory** - persistent memory tokens combined with a sliding window
6. **Hierarchical Memory** - multi-layered working memory plus long-term memory
7. **Graph Memory** - knowledge graph with entity relationships
8. **Compression Memory** - intelligent compression with importance scoring
9. **OS-like Memory** - RAM/disk simulation with paging mechanisms
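To give a feel for how small these strategies can be, here is a minimal sketch of the sliding-window idea. It is illustrative only: the class below is hypothetical, and the real implementations (with their exact signatures) live in `memory_strategies.py`.

```python
from collections import deque

class SlidingWindowSketch:
    """Illustrative stand-in for a sliding-window memory strategy."""

    def __init__(self, window_size: int = 4):
        # Two messages per turn (user + assistant); maxlen silently evicts the oldest.
        self.window = deque(maxlen=window_size * 2)

    def add_message(self, user_text: str, ai_text: str) -> None:
        self.window.append(("user", user_text))
        self.window.append(("assistant", ai_text))

    def get_context(self) -> str:
        # The context handed to the LLM stays bounded no matter how long the chat runs.
        return "\n".join(f"{role}: {text}" for role, text in self.window)

    def clear(self) -> None:
        self.window.clear()
```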
## Features

- **Modular Architecture** - strategy pattern for easy swapping
- **Interactive Playground** - Streamlit web interface for testing
- **Performance Analytics** - token usage and response time tracking
- **Batch Comparison** - test multiple strategies simultaneously
- **Production Ready** - FastAPI endpoints for deployment
- **Real-time Metrics** - memory statistics and performance monitoring
## Prerequisites

- Python 3.10+
- An OpenAI API key

## Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/AIAnytime/Agent-Memory-Playground.git
   cd Agent-Memory-Playground
   ```

2. Install the dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Configure your environment:

   ```bash
   # Create a .env file holding your key
   echo "OPENAI_API_KEY=your_openai_api_key_here" > .env
   ```
## Quick Start

### Streamlit Playground

```bash
streamlit run streamlit_playground.py
```

1. Open http://localhost:8501 in your browser
2. Enter your OpenAI API key in the sidebar
3. Select a memory strategy and start testing!

### REST API Server

```bash
uvicorn api:app --reload
```

- API documentation: http://localhost:8000/docs
- Create sessions, chat, and monitor performance via the REST API

### Command-Line Example

```bash
python example_usage.py
```

- Interactive CLI for testing all memory strategies
- Detailed memory statistics and performance metrics
## Usage Examples

### Basic Usage

```python
from memory_strategies import SequentialMemory, AIAgent

# Initialize a memory strategy and hand it to the agent
memory = SequentialMemory()
agent = AIAgent(memory_strategy=memory)

# Chat with the agent
response = agent.chat("Hello! My name is Alex.")
print(response["ai_response"])

# Memory is automatically preserved for the next interaction
response = agent.chat("What's my name?")
print(response["ai_response"])  # Will remember "Alex"
```
### Retrieval Memory (RAG)

```python
from memory_strategies import RetrievalMemory, AIAgent

# Initialize RAG-based memory
memory = RetrievalMemory(k=3)  # retrieve the top 3 similar conversations
agent = AIAgent(memory_strategy=memory)

# Build up some conversation history
agent.chat("I'm a software engineer working on ML projects")
agent.chat("I prefer Python and love coffee")
agent.chat("I'm building a recommendation system")

# Query with semantic similarity
response = agent.chat("What do you know about my work?")
# Retrieves the relevant context about ML, Python, and recommendation systems
```
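Conceptually, retrieval memory embeds every stored turn and ranks the stored turns against the query embedding. The project uses FAISS for this; the snippet below is only a NumPy sketch of the same top-k cosine-similarity ranking, not the repo's actual index code.

```python
import numpy as np

def top_k_similar(query_vec: np.ndarray, stored_vecs: np.ndarray, k: int = 3) -> np.ndarray:
    """Indices of the k stored embeddings most similar to the query (cosine similarity)."""
    # Normalize so a plain dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    s = stored_vecs / np.linalg.norm(stored_vecs, axis=1, keepdims=True)
    return np.argsort(s @ q)[::-1][:k]
```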
### REST API

```bash
# Create a session with hierarchical memory
curl -X POST "http://localhost:8000/sessions" \
  -H "Content-Type: application/json" \
  -d '{
    "strategy_type": "hierarchical",
    "system_prompt": "You are a helpful AI assistant.",
    "api_key": "your_openai_key"
  }'

# Chat within the session
curl -X POST "http://localhost:8000/sessions/{session_id}/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Remember that I prefer concise responses",
    "api_key": "your_openai_key"
  }'
```
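The same two calls from Python with `requests`. The request fields mirror the curl examples above, but the key used to read the session id from the response is an assumption; check http://localhost:8000/docs for the actual schema.

```python
import requests

BASE = "http://localhost:8000"

# Create a session backed by hierarchical memory
session = requests.post(f"{BASE}/sessions", json={
    "strategy_type": "hierarchical",
    "system_prompt": "You are a helpful AI assistant.",
    "api_key": "your_openai_key",
}).json()

session_id = session["session_id"]  # assumed field name; verify against /docs

# Chat within that session
reply = requests.post(f"{BASE}/sessions/{session_id}/chat", json={
    "message": "Remember that I prefer concise responses",
    "api_key": "your_openai_key",
}).json()
print(reply)
```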
## Strategy Comparison

| Strategy | Token Efficiency | Retrieval Speed | Memory Usage | Best For |
|---|---|---|---|---|
| Sequential | Low | Instant | High | Short conversations |
| Sliding Window | High | Instant | Constant | Real-time chat |
| Retrieval (RAG) | High | Fast | Medium | Production systems |
| Hierarchical | Very High | Fast | Medium | Complex applications |
| Graph Memory | Medium | Slow | High | Knowledge systems |
## Architecture

```
AIAgent
├── BaseMemoryStrategy (Abstract)
│   ├── add_message()
│   ├── get_context()
│   └── clear()
├── SequentialMemory
├── SlidingWindowMemory
├── RetrievalMemory
└── ... (6 more strategies)
```
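In code, that contract amounts to a small abstract base class. The sketch below is a plausible reading of the method names shown above, not a copy of the repo's actual base class; the exact signatures may differ in `memory_strategies.py`.

```python
from abc import ABC, abstractmethod

class BaseMemoryStrategy(ABC):
    """Interface every memory strategy fulfils, so AIAgent can swap them freely."""

    @abstractmethod
    def add_message(self, user_text: str, ai_text: str) -> None:
        """Persist one conversation turn."""

    @abstractmethod
    def get_context(self, query: str) -> str:
        """Build the context string injected into the next LLM prompt."""

    @abstractmethod
    def clear(self) -> None:
        """Reset all stored memory."""
```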
### Components

- **Memory Strategies**: modular memory implementations
- **AI Agent**: core agent built on the strategy pattern
- **Utilities**: token counting, embeddings, LLM integration
- **API Layer**: FastAPI endpoints for production use
- **Playground**: Streamlit interface for testing
## Performance Monitoring

Track the essential performance metrics:

```python
{
    "total_content_tokens": 1250,    # raw conversation data
    "total_prompt_tokens": 4800,     # actual LLM costs
    "average_retrieval_time": 0.15,  # memory access speed
    "memory_efficiency": 0.73,       # compression ratio
    "context_relevance_score": 0.89  # quality of retrieved context
}
```
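A stats dict like this makes strategies easy to compare side by side. Here is a small helper for printing the headline numbers; the dict layout is taken from the example above, and how a given strategy actually exposes its stats may differ.

```python
def report(stats: dict) -> None:
    """One-line summary of a strategy's memory stats."""
    overhead = stats["total_prompt_tokens"] - stats["total_content_tokens"]
    print(
        f"prompt tokens: {stats['total_prompt_tokens']} "
        f"(+{overhead} over raw content), "
        f"avg retrieval time: {stats['average_retrieval_time']}, "
        f"relevance: {stats['context_relevance_score']}"
    )

report({
    "total_content_tokens": 1250,
    "total_prompt_tokens": 4800,
    "average_retrieval_time": 0.15,
    "memory_efficiency": 0.73,
    "context_relevance_score": 0.89,
})
# -> prompt tokens: 4800 (+3550 over raw content), avg retrieval time: 0.15, relevance: 0.89
```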
## Configuration Examples

**Sliding Window Memory**

```python
SlidingWindowMemory(window_size=4)  # keep the last 4 conversation turns
```

**Retrieval Memory (RAG)**

```python
RetrievalMemory(k=3)  # retrieve the top 3 similar conversations
```

**Hierarchical Memory**

```python
HierarchicalMemory(
    window_size=2,  # working memory size
    k=3             # long-term retrieval count
)
```
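Any of these drops straight into the agent, for example (same `AIAgent` interface as in the usage examples above):

```python
from memory_strategies import HierarchicalMemory, AIAgent

# Small working window for the immediate exchange, RAG-style recall for older turns
agent = AIAgent(memory_strategy=HierarchicalMemory(window_size=2, k=3))
agent.chat("I deploy everything to Kubernetes.")
print(agent.chat("What do I deploy to?")["ai_response"])
```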
## Deployment

### Docker

```dockerfile
# Base image matches the Python 3.10+ requirement stated above
FROM python:3.10-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "api:app", "--host", "0.0.0.0", "--port", "8000"]
```
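Build and run it like so (the image tag is a placeholder; the `.env` file is the one created during installation):

```bash
docker build -t agent-memory-playground .
docker run -p 8000:8000 --env-file .env agent-memory-playground
```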
### Environment Variables

```env
OPENAI_API_KEY=your_openai_api_key
OPENAI_MODEL=gpt-4o-mini
EMBEDDING_MODEL=text-embedding-3-small
```
## Testing

Run the test suite:

```bash
python -m pytest tests/
```

Run the performance benchmarks:

```bash
python benchmark.py
```
## Documentation

- **Technical Guide** - comprehensive implementation details
- **API Documentation** - FastAPI interactive docs
- **Strategy Comparison** - performance analysis
- **Production Guide** - deployment best practices
## Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
## License

This project is licensed under the Apache 2.0 License; see the LICENSE file for details.
## Acknowledgments

- **OpenAI** for providing the GPT models and embeddings
- **Streamlit** for the amazing web framework
- **FastAPI** for the high-performance API framework
- **FAISS** for efficient vector similarity search
## Connect

- **Website**: aianytime.net
- **Creator Portfolio**: sonukumar.site
- **YouTube**: @AIAnytime
- **Issues**: GitHub Issues
Built with ❤️ by AI Anytime

Star this repo if you find it helpful!