AI Agent Memory Design & Optimization Playground


Interactive playground for testing and comparing 9 different AI agent memory optimization strategies


Overview

This project implements 9 different memory optimization techniques for AI agents, providing a comprehensive solution for managing conversation history and context in production AI systems. Each strategy is implemented as a modular, plug-and-play class with a unified interface.

Why Memory Optimization Matters

  • Token Cost Reduction: Prevent unbounded growth in LLM API costs as conversations lengthen
  • Context Preservation: Maintain relevant information across conversations
  • Scalability: Handle long conversations efficiently
  • Performance: Optimize response times and memory usage

Memory Strategies Implemented

Basic Strategies

  1. Sequential Memory - Complete conversation history storage
  2. Sliding Window Memory - Fixed-size recent conversation window (sketched after this list)
  3. Summarization Memory - LLM-based conversation compression
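To make these concrete, here is a minimal sketch of the sliding-window idea using `collections.deque`. The method names mirror the BaseMemoryStrategy interface shown under Architecture; the signatures are assumptions, not the repo's exact code:

```python
from collections import deque

class SlidingWindowSketch:
    """Toy sliding-window memory: keep only the last N conversation turns."""

    def __init__(self, window_size: int = 4):
        # A deque with maxlen silently drops the oldest turn once full
        self.turns = deque(maxlen=window_size)

    def add_message(self, user_input: str, ai_response: str) -> None:
        self.turns.append({"user": user_input, "assistant": ai_response})

    def get_context(self, query: str) -> str:
        # Context is simply the recent turns, oldest first
        return "\n".join(
            f"User: {t['user']}\nAssistant: {t['assistant']}" for t in self.turns
        )

    def clear(self) -> None:
        self.turns.clear()
```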

Advanced Strategies

  4. Retrieval Memory (RAG) - Vector similarity search for semantic retrieval (sketched after this list)
  5. Memory-Augmented Memory - Persistent memory tokens with sliding window
  6. Hierarchical Memory - Multi-layered working + long-term memory
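Retrieval Memory is the heart of the RAG approach: embed every turn, index the vectors, and fetch the top-k most similar turns at query time. Below is a hedged sketch using FAISS (credited in the Acknowledgments) and OpenAI embeddings; the `embed` helper and the class wiring are illustrative assumptions, not the repo's actual code:

```python
import numpy as np
import faiss
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed(text: str) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.array(resp.data[0].embedding, dtype="float32")

class RetrievalSketch:
    """Toy RAG memory: store turn embeddings, retrieve top-k similar turns."""

    def __init__(self, k: int = 3, dim: int = 1536):  # 1536 dims for text-embedding-3-small
        self.k = k
        self.index = faiss.IndexFlatL2(dim)  # exact (brute-force) L2 search
        self.texts: list[str] = []

    def add_message(self, user_input: str, ai_response: str) -> None:
        turn = f"User: {user_input}\nAssistant: {ai_response}"
        self.index.add(embed(turn).reshape(1, -1))
        self.texts.append(turn)

    def get_context(self, query: str) -> str:
        if not self.texts:
            return ""
        k = min(self.k, len(self.texts))
        _, ids = self.index.search(embed(query).reshape(1, -1), k)
        return "\n\n".join(self.texts[i] for i in ids[0] if i != -1)
```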

Complex Strategies

  7. Graph Memory - Knowledge graph with entity relationships
  8. Compression Memory - Intelligent compression with importance scoring (a toy scoring sketch follows this list)
  9. OS-like Memory - RAM/disk simulation with paging mechanisms
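Importance scoring, as used by the compression strategy, can be as simple as ranking turns by a weighted mix of recency and length and keeping only the top scorers. The weights and formula below are toy assumptions for illustration; the repo's actual scoring is likely richer:

```python
def compress_by_importance(turns: list[str], keep: int = 5) -> list[str]:
    """Keep the `keep` highest-scoring turns, preserving chronological order.

    Toy score: recent turns rank higher (weight 0.7), and longer turns are
    assumed to carry more information (weight 0.3, capped at 500 chars).
    """
    n = len(turns)
    scored = [
        (0.7 * (i + 1) / n + 0.3 * min(len(t) / 500, 1.0), i, t)
        for i, t in enumerate(turns)
    ]
    top = sorted(scored, reverse=True)[:keep]  # highest scores first
    return [t for _, _, t in sorted(top, key=lambda s: s[1])]  # restore order
```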

Features

  • Modular Architecture - Strategy pattern for easy swapping
  • Interactive Playground - Streamlit web interface for testing
  • Performance Analytics - Token usage and response time tracking
  • Batch Comparison - Test multiple strategies simultaneously
  • Production Ready - FastAPI endpoints for deployment
  • Real-time Metrics - Memory statistics and performance monitoring

Installation

Prerequisites

  • Python 3.10+
  • OpenAI API Key

Setup

  1. Clone the repository

```bash
git clone https://github.com/AIAnytime/Agent-Memory-Playground.git
cd Agent-Memory-Playground
```

  2. Install dependencies

```bash
pip install -r requirements.txt
```

  3. Configure environment

```bash
# Create .env file
echo "OPENAI_API_KEY=your_openai_api_key_here" > .env
```

Quick Start

1. Interactive Playground (Streamlit)

```bash
streamlit run streamlit_playground.py
```
  • Open http://localhost:8501 in your browser
  • Enter your OpenAI API key in the sidebar
  • Select a memory strategy and start testing!

2. API Server (FastAPI)

```bash
uvicorn api:app --reload
```
  • API available at http://localhost:8000; interactive docs (auto-generated by FastAPI) at http://localhost:8000/docs

3. Command Line Example

```bash
python example_usage.py
```
  • Interactive CLI for testing all memory strategies
  • Detailed memory statistics and performance metrics

Usage Examples

Basic Usage

```python
from memory_strategies import SequentialMemory, AIAgent

# Initialize memory strategy
memory = SequentialMemory()
agent = AIAgent(memory_strategy=memory)

# Chat with the agent
response = agent.chat("Hello! My name is Alex.")
print(response["ai_response"])

# Memory automatically preserved for next interaction
response = agent.chat("What's my name?")
print(response["ai_response"])  # Will remember "Alex"
```

Advanced RAG Implementation

```python
from memory_strategies import RetrievalMemory, AIAgent

# Initialize RAG-based memory
memory = RetrievalMemory(k=3)  # Retrieve top 3 similar conversations
agent = AIAgent(memory_strategy=memory)

# Build conversation history
agent.chat("I'm a software engineer working on ML projects")
agent.chat("I prefer Python and love coffee")
agent.chat("I'm building a recommendation system")

# Query with semantic similarity
response = agent.chat("What do you know about my work?")
# Will retrieve relevant context about ML, Python, and recommendation systems
```

Production API Usage

```bash
# Create a session with hierarchical memory
curl -X POST "http://localhost:8000/sessions" \
  -H "Content-Type: application/json" \
  -d '{
    "strategy_type": "hierarchical",
    "system_prompt": "You are a helpful AI assistant.",
    "api_key": "your_openai_key"
  }'

# Chat with the session
curl -X POST "http://localhost:8000/sessions/{session_id}/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Remember that I prefer concise responses",
    "api_key": "your_openai_key"
  }'
```

Performance Comparison

| Strategy | Token Efficiency | Retrieval Speed | Memory Usage | Best For |
|----------|------------------|-----------------|--------------|----------|
| Sequential | ❌ Low | ⚡ Instant | 📈 High | Short conversations |
| Sliding Window | ✅ High | ⚡ Instant | 📊 Constant | Real-time chat |
| Retrieval (RAG) | ✅ High | 🔍 Fast | 📊 Medium | Production systems |
| Hierarchical | ✅ Very High | 🔍 Fast | 📊 Medium | Complex applications |
| Graph Memory | 🔍 Medium | 🐌 Slow | 📈 High | Knowledge systems |

Architecture

Strategy Pattern Design

```
AIAgent
├── BaseMemoryStrategy (Abstract)
│   ├── add_message()
│   ├── get_context()
│   └── clear()
├── SequentialMemory
├── SlidingWindowMemory
├── RetrievalMemory
└── ... (6 more strategies)
```
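Judging by the method names in the tree, the abstract base class plausibly looks like the sketch below (the signatures are assumptions, not the repo's exact code). Because AIAgent depends only on this interface, swapping strategies is a one-line change:

```python
from abc import ABC, abstractmethod

class BaseMemoryStrategy(ABC):
    """Unified interface implemented by every memory strategy (sketch)."""

    @abstractmethod
    def add_message(self, user_input: str, ai_response: str) -> None:
        """Store one completed conversation turn."""

    @abstractmethod
    def get_context(self, query: str) -> str:
        """Return the context to prepend to the next LLM prompt."""

    @abstractmethod
    def clear(self) -> None:
        """Reset the memory."""
```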

Key Components

  • Memory Strategies: Modular memory implementations
  • AI Agent: Core agent using strategy pattern
  • Utilities: Token counting, embeddings, LLM integration
  • API Layer: FastAPI endpoints for production use
  • Playground: Streamlit interface for testing

Monitoring & Metrics

Track essential performance metrics:

```python
{
    "total_content_tokens": 1250,      # Raw conversation data
    "total_prompt_tokens": 4800,       # Actual LLM costs
    "average_retrieval_time": 0.15,    # Memory access speed (seconds)
    "memory_efficiency": 0.73,         # Compression ratio
    "context_relevance_score": 0.89    # Quality of retrieved context
}
```

Configuration

Memory Strategy Parameters

Sliding Window Memory

```python
SlidingWindowMemory(window_size=4)  # Keep last 4 conversation turns
```

Retrieval Memory (RAG)

```python
RetrievalMemory(k=3)  # Retrieve top 3 similar conversations
```

Hierarchical Memory

```python
HierarchicalMemory(
    window_size=2,  # Working memory size
    k=3             # Long-term retrieval count
)
```

Production Deployment

Docker Deployment

```dockerfile
FROM python:3.10-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .
EXPOSE 8000

CMD ["uvicorn", "api:app", "--host", "0.0.0.0", "--port", "8000"]
```

Environment Variables

```bash
OPENAI_API_KEY=your_openai_api_key
OPENAI_MODEL=gpt-4o-mini
EMBEDDING_MODEL=text-embedding-3-small
```
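These are typically loaded at startup with python-dotenv; a sketch using the variable names from the list above (the fallback defaults are assumptions):

```python
import os
from dotenv import load_dotenv

load_dotenv()  # read .env into the process environment

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
OPENAI_MODEL = os.getenv("OPENAI_MODEL", "gpt-4o-mini")
EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL", "text-embedding-3-small")

if not OPENAI_API_KEY:
    raise RuntimeError("OPENAI_API_KEY is not set")
```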

Testing

Run the test suite:

```bash
python -m pytest tests/
```

Run performance benchmarks:

```bash
python benchmark.py
```

Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

  1. Fork the repository
  2. Create a feature branch (`git checkout -b feature/amazing-feature`)
  3. Commit your changes (`git commit -m 'Add amazing feature'`)
  4. Push to the branch (`git push origin feature/amazing-feature`)
  5. Open a Pull Request

License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.

Acknowledgments

  • OpenAI for providing the GPT models and embeddings
  • Streamlit for the amazing web framework
  • FastAPI for the high-performance API framework
  • FAISS for efficient vector similarity search

Support & Contact


Built with ❤️ by AI Anytime

Star this repo if you find it helpful!
