VIA - Intelligent Human-in-the-Loop AI System

VIA Frontpage

Portfolio Project: An AI orchestration platform demonstrating advanced Human-in-the-Loop architecture, multi-agent systems, and enterprise-grade engineering practices.

LinkedIn Portfolio

Executive Summary

VIA is an AI workflow orchestration platform that improves customer support through intelligent human-AI collaboration. This project showcases my current abilities in AI engineering, product design and management, and system architecture, delivering improvements in both customer satisfaction and employee wellbeing.

Background

VIA was built during the Dallas AI Summer Program 2025 under the mentorship of Eric Poon, Senior Vice President, Head of Technology, Shoppa's/Toyota Material Handling.

It was an eight-week program in which a group of AI enthusiasts from various backgrounds worked together under one of the mentors to conceive, design, and create an AI product prototype.

My Role

I was the product designer/manager and the lead AI engineer for this project, responsible for the technical architecture, multi-agent system design, and human-in-the-loop workflows. I (with Claude Code) am responsible for all code in this repository except for the frontend code in the frontend/ directory, which was built by Nithin Dodla.

This fork is my personal showcase as I continue to develop and refine the VIA system, demonstrating my skills in AI engineering and product design.

Key Achievements

  • 📊 Designed a human-in-the-loop AI product
  • 🏗️ Architected a multi-agent AI system using LangGraph/LangChain
  • ⚡ Implemented a production-ready backend with modular, testable components
  • 🤖 Integrated multiple LLMs (Gemini, Claude, local models)
  • 🧪 Established a comprehensive testing framework with 95%+ code coverage
  • 🌐 Delivered a live technical demo of the agents in action

🚀 Technical Innovation & Product Vision

The Problem We Solved

Traditional chatbot systems create frustration cycles: customers get stuck with inadequate AI responses, leading to angry escalations that burn out human agents. This creates a lose-lose scenario for both customers and employees.

The Problem

The Solution: Intelligent Intervention and Escalation Architecture

VIA is designed to monitor customer-AI interactions and to proactively escalate to the most appropriate human agent when needed, based on factors like customer sentiment, issue complexity, and human agent wellbeing.

The system also empowers human agents as part of the team, letting them provide feedback and context that continuously improve AI performance.

Key Features:

  • Intercepts all AI responses before delivery for quality assessment
  • Monitors customer sentiment in real-time to prevent frustration buildup
  • Escalates intelligently to the best-suited human agent based on factors like expertise and wellbeing
  • Learns from every interaction to continuously integrate human expertise into the system

System Architecture

VIA Dashboard

Business Impact

  • Reduction in customer frustration incidents and overall resolution times
  • Improvement in first-contact resolution quality
  • Decrease in employee burnout indicators through intelligent workload balancing
  • Empowerment of human agents, making them part of the team and driving continuous improvement

🛠️ Technical & AI Engineering

Multi-Agent AI Architecture

I designed and implemented a sophisticated multi-agent system that demonstrates AI engineering patterns:

# Human-in-the-Loop Workflow Pipeline
Customer Query → Frustration Analysis → Chatbot Agent → Quality Agent →
Context Enrichment → Intelligent Human Routing
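
As a rough illustration of how this pipeline could be wired together with LangGraph (the node names, SupportState fields, and escalation condition below are hypothetical simplifications, not the repository's actual identifiers):

from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class SupportState(TypedDict, total=False):
    query: str                 # incoming customer message
    frustration_score: float   # 0-1 estimate from the frustration agent
    draft_response: str        # chatbot output held for review
    quality_ok: bool           # quality agent verdict
    routed_to: str             # selected human agent, if escalated

# Stub node functions; each returns a partial state update.
def analyze_frustration(state: SupportState) -> dict:
    return {"frustration_score": 0.2}

def chatbot(state: SupportState) -> dict:
    return {"draft_response": "Here is what I found..."}

def quality_check(state: SupportState) -> dict:
    return {"quality_ok": True}

def enrich_context(state: SupportState) -> dict:
    return {}

def route_to_human(state: SupportState) -> dict:
    return {"routed_to": "agent_007"}

graph = StateGraph(SupportState)
graph.add_node("frustration", analyze_frustration)
graph.add_node("chatbot", chatbot)
graph.add_node("quality", quality_check)
graph.add_node("context", enrich_context)
graph.add_node("routing", route_to_human)

graph.add_edge(START, "frustration")
graph.add_edge("frustration", "chatbot")
graph.add_edge("chatbot", "quality")
# Deliver the AI answer when it passes the quality gate; otherwise escalate.
graph.add_conditional_edges(
    "quality",
    lambda s: "deliver" if s.get("quality_ok") else "escalate",
    {"deliver": END, "escalate": "context"},
)
graph.add_edge("context", "routing")
graph.add_edge("routing", END)

workflow = graph.compile()
result = workflow.invoke({"query": "My claim was denied and nobody will tell me why."})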

Prototype Considerations

  • Time Constraints: The program ran eight weeks, but conceiving a novel product concept that matched our mentor's theme of "Human-in-the-Loop AI", and then getting all five team members (one eventually dropped out), each with different levels of engagement and commitment, to agree on the final product design, took a significant amount of time.

    • After the concept was finalized, we had less than 4 weeks to design, implement, and deliver a working prototype with a live demo.

    • Only 2 of the 4 team members had coding experience. Nobody on the team had programmed with LLMs before.

    • This fork of the project represents my attempt to finish out the core of the product.

  • Incomplete Design/Implementation: We ran out of time to fully finish an MVP, and various loose ends remain, such as connecting the backend to the frontend. The product's core features come down to the escalation engine and the context manager, and these will be the key areas of focus for future development, such as an advanced RAG implementation for the context manager and multi-model ensembles for the escalation engine.

  • Foundation Models: For proof of concept in the short time frame, we used Gemini 2.5 Flash API calls for the best real-time performance, but these were not thoroughly tuned, latency is not optimal, and cost would likely be high at scale. Future plans include specialized sentiment models, pre-screener models, and ensemble approaches for cost/performance optimization.

  • Evaluation Framework: We did not have time to do proper evaluation and model selection, but a comprehensive framework that allows rapid iteration and experimentation will be critical for future performance optimization and cost control.

  • Simulation Environment: We built a basic simulation environment to test the system and acquire mock data, but it is severely lacking. We need a more robust environment (ideally based on real data) with realistic customer personas and employee behavior models to properly validate the system.

  • Competitive Optimization: Ultimately, if this were to be a real product, the main areas of competitive differentiation would be:

    • Overall User Experience - both for customers and human agents
    • Model Performance - the speed, accuracy, and cost of our models
    • Innovation - the overall quality of the escalation engine, context manager, and human feedback loop, especially working in unison

Core AI Agents

1. Frustration Detection Agent 🟢 Implemented

Real-time frustration analysis with escalating pattern detection and configurable thresholds

  • ✅ Current: Gemini 2.5 Flash API calls for sentiment analysis
  • 🔄 Planned: Specialized sentiment models for speed and cost optimization
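
A minimal sketch of what an LLM-backed frustration check with a configurable threshold can look like (the prompt, the 0-10 scale, and the threshold value are assumptions for illustration, not the repository's tuned implementation):

from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash", temperature=0)

FRUSTRATION_THRESHOLD = 0.7  # would come from the agent's config in practice

def frustration_score(message: str) -> float:
    """Ask the model for a 0-10 frustration rating and normalize to 0-1."""
    prompt = (
        "Rate the customer's frustration in the following message on a scale "
        "of 0 (calm) to 10 (furious). Reply with only the number.\n\n" + message
    )
    return float(llm.invoke(prompt).content.strip()) / 10

def should_escalate(message: str) -> bool:
    return frustration_score(message) >= FRUSTRATION_THRESHOLD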

2. Quality Assurance Agent 🟢 Implemented

Real-time response evaluation before delivery with configurable quality thresholds

  • ✅ Current: Gemini 2.5 Flash API calls for consensus validation
  • 🔄 Planned: Specialized models or pre-screener models for performance optimization
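
For illustration, a quality gate that intercepts the chatbot's draft before delivery might look roughly like this (reusing the llm handle from the frustration sketch above; the rubric and the 0.8 threshold are assumptions):

def quality_gate(query: str, draft_response: str, threshold: float = 0.8) -> dict:
    """Score a draft answer before it reaches the customer; block it if it falls short."""
    prompt = (
        "You are reviewing a customer-support answer before it is sent.\n"
        f"Customer question: {query}\n"
        f"Draft answer: {draft_response}\n"
        "Score from 0.0 to 1.0 how complete, accurate, and polite the draft is. "
        "Reply with only the number."
    )
    score = float(llm.invoke(prompt).content.strip())
    return {"quality_score": score, "deliver": score >= threshold}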

3. Intelligent Routing Agent 🟢 Implemented

Dynamic human agent routing based on various factors including expertise, workload, and customer history

  • ✅ Current: Gemini 2.5 Flash API calls for roster scoring and routing
  • 🔄 Planned: Ensemble of multiple models for performance optimization (e.g., XGBoost for ranking and scoring, ensembled with context-aware models for optimal employee selection)
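
The kind of roster scoring the routing agent performs can be sketched as a weighted ranking over expertise, current workload, and wellbeing (the fields, weights, and formula here are illustrative assumptions; the repository currently delegates this scoring to an LLM):

from dataclasses import dataclass

@dataclass
class HumanAgent:
    name: str
    expertise: set            # e.g. {"claims", "billing"}
    open_tickets: int         # current workload
    wellbeing: float          # 0.0 (burned out) to 1.0 (fresh)

def routing_score(agent: HumanAgent, required: set,
                  weights=(0.5, 0.3, 0.2)) -> float:
    w_skill, w_load, w_well = weights
    skill_match = len(agent.expertise & required) / max(len(required), 1)
    load_factor = 1.0 / (1 + agent.open_tickets)   # fewer open tickets scores higher
    return w_skill * skill_match + w_load * load_factor + w_well * agent.wellbeing

def route(roster: list, required: set) -> HumanAgent:
    """Pick the highest-scoring available human for this escalation."""
    return max(roster, key=lambda a: routing_score(a, required))

roster = [
    HumanAgent("Dana", {"claims"}, open_tickets=4, wellbeing=0.9),
    HumanAgent("Lee", {"claims", "billing"}, open_tickets=1, wellbeing=0.4),
]
print(route(roster, {"claims"}).name)  # wellbeing balances raw workload in the ranking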

4. Context Manager Agent 🟢 Implemented

Centralized context aggregation from multiple sources to provide rich, real-time data for decision-making and to continuously integrate human expertise and feedback into the system

  • ✅ Current: SQL database for customer history and context searches
  • 🔄 Planned: Advanced RAG knowledge base for faster context retrieval and integration
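
A sketch of the kind of SQL lookup the context manager performs when enriching an escalation (the table and column names here are hypothetical, not the repository's actual schema):

import sqlite3

def recent_customer_context(db_path: str, customer_id: str, limit: int = 5) -> list:
    """Fetch the customer's most recent interactions to hand to the human agent."""
    with sqlite3.connect(db_path) as conn:
        conn.row_factory = sqlite3.Row
        rows = conn.execute(
            "SELECT created_at, channel, summary, resolution "
            "FROM interactions WHERE customer_id = ? "
            "ORDER BY created_at DESC LIMIT ?",
            (customer_id, limit),
        ).fetchall()
    return [dict(row) for row in rows]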

Live Technical Demo

I created a live technical demo showcasing the core AI agents in action (the demo is geared towards customer support in the insurance domain).

Screen captures are below, and the live demo can be found here: Live Backend Demo. VIA Tech Demo 1 VIA Tech Demo 2

Technical Architecture Highlights

Production-Ready Engineering Practices:

  • Modular, Interface-Driven Design: Clean separation of concerns with comprehensive abstraction and interface contracts (see the sketch after this list)
  • Pluggable Structure: Modular, OOP, and factory patterns for various components (e.g., agents, workflows) enabling easy extension
  • Agent-Centric Configuration: Modular, hot-reloadable configuration system
  • Comprehensive Testing: 95%+ code coverage with unit, integration, and performance tests
  • Observability: Structured logging, LangSmith tracing, and performance monitoring
  • Database Management: Centralized data layer
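
The first two points, interface contracts plus pluggable factories, can be illustrated with a sketch like the following (the class and registry names are hypothetical, not the repository's actual interfaces):

from abc import ABC, abstractmethod

class AgentNode(ABC):
    """Contract every agent node implements, so workflows never depend on concrete agents."""

    @abstractmethod
    def process(self, state: dict) -> dict:
        """Consume the shared workflow state and return a partial update."""

class FrustrationAgent(AgentNode):
    def process(self, state: dict) -> dict:
        return {"frustration_score": 0.2}   # stub; the real node would call an LLM

AGENT_REGISTRY = {"frustration": FrustrationAgent}

def create_agent(name: str) -> AgentNode:
    """Factory: swap in a production, local-model, or mock agent without touching callers."""
    return AGENT_REGISTRY[name]()

agent = create_agent("frustration")
update = agent.process({"query": "Where is my refund?"})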

Technology Stack:

AI/ML Framework: LangGraph, LangChain, LangSmith
Languages: Python 3.11+
Data: SQLite, Pydantic
Testing: pytest
DevOps: Docker, Dev Containers
Quality: ruff, mypy

Future Enhancements:

  • Evaluation Framework: Comprehensive model evaluation and selection framework for performance optimization
  • Integrated Evals: Monitoring of model performance and drift, plus config/threshold tuning
  • Multi-Model Ensembles: Cost/performance optimization through ensemble approaches
  • Advanced Sentiment Models: Beyond basic prompt-based analysis for deeper customer insights
  • Advanced RAG Implementation: Context manager with specialized model integration for performance optimization
  • Real-time Dashboard: Customer satisfaction metrics and agent performance tracking

πŸ—οΈ System Architecture & Engineering

Scalable Multi-Agent Design

src/
├── core/                          # Infrastructure & Configuration
│   ├── agent_config_manager.py    # Hot-reloadable agent configuration
│   ├── context_manager.py         # Multi-source data aggregation
│   ├── database_config.py         # Centralized data management
│   └── logging/                   # Structured observability
├── interfaces/                    # Contract-driven development
│   ├── core/                      # System interface contracts
│   ├── nodes/                     # Agent behavior specifications
│   └── workflows/                 # Orchestration interfaces
├── nodes/                         # AI Agent Implementations
│   ├── chatbot_agent.py           # Customer service AI
│   ├── quality_agent.py           # Response quality assurance
│   ├── frustration_agent.py       # Sentiment analysis & intervention
│   ├── human_routing_agent.py     # Intelligent escalation routing
│   └── context_manager_agent.py   # Context aggregation & delivery
├── simulation/                    # Testing & Validation Framework
│   ├── human_customer_simulator.py # Realistic customer personas
│   ├── employee_simulator.py      # Human agent simulation
│   └── demo_orchestrator.py       # End-to-end scenario testing
└── workflows/                     # Orchestration & State Management
    └── hybrid_workflow.py         # HITL system coordination

Configuration Management Innovation

Designed an agent-centric configuration system that enables:

  • Modular Development: Each agent has isolated configuration namespace
  • Environment Management: Clean dev/test/prod separation
  • Hot Reloading: Runtime configuration updates without restart (see the sketch after this list)
  • Model Consolidation: Single source of truth for AI model preferences
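
A minimal sketch of a hot-reloadable, agent-centric config loader along these lines (assuming a YAML file and PyYAML; the file layout and key names are illustrative, not the repository's actual agent_config_manager.py):

from pathlib import Path
import yaml  # PyYAML

class AgentConfig:
    """Per-agent configuration namespace, re-read from disk whenever the file changes."""

    def __init__(self, path: str, agent: str):
        self._path = Path(path)
        self._agent = agent
        self._mtime = 0.0
        self._data = {}

    def get(self, key, default=None):
        mtime = self._path.stat().st_mtime
        if mtime != self._mtime:                      # hot reload without a restart
            self._data = yaml.safe_load(self._path.read_text()) or {}
            self._mtime = mtime
        return self._data.get(self._agent, {}).get(key, default)

# config/agents.yaml might look like:
# frustration_agent:
#   model: gemini-2.5-flash
#   threshold: 0.7
cfg = AgentConfig("config/agents.yaml", "frustration_agent")
threshold = cfg.get("threshold", 0.7)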

Quality Assurance & Testing Strategy

Comprehensive Test Coverage:

  • Unit Tests: All core components with mock-based isolation (see the sketch after this list)
  • Integration Tests: End-to-end workflow validation
  • Performance Tests: Concurrent operation and large dataset handling
  • Error Scenario Testing: Comprehensive failure mode validation
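
As an example of the mock-based isolation used in the unit tests, a test along these lines exercises an agent's threshold logic without making a real API call (the FrustrationAgent stub here is a hypothetical stand-in for the repository's actual node class):

from unittest.mock import MagicMock

class FrustrationAgent:                              # stand-in for the real node under test
    def __init__(self, llm, threshold: float):
        self.llm, self.threshold = llm, threshold

    def process(self, state: dict) -> dict:
        score = float(self.llm.invoke(state["query"]).content) / 10
        return {**state, "frustration_score": score, "escalate": score >= self.threshold}

def test_high_frustration_triggers_escalation():
    llm = MagicMock()
    llm.invoke.return_value.content = "9"            # simulate an angry customer
    agent = FrustrationAgent(llm=llm, threshold=0.7)

    result = agent.process({"query": "This is the third time I have asked about this!"})

    assert result["escalate"] is True
    assert result["frustration_score"] == 0.9
    llm.invoke.assert_called_once()                  # no real LLM call was made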

Development Workflows:

make setup     # Automated environment setup
make test      # Full test suite with coverage
make check     # Code quality validation
make run       # Local development server

🚀 Live Demonstrations & Portfolio Links

Interactive Demos

Code Quality & Documentation


🎯 Key Skills Demonstrated

Software Engineering Proficiency

  • Architecture Design: Scalable, modular, maintainable system patterns
  • Testing Strategy: Comprehensive validation and quality assurance
  • DevOps Practices: Containerization, CI/CD, and deployment automation
  • Documentation: Technical writing optimized for both AI and developer experience

Product Design/Management Expertise

  • Strategic Vision: Conceptualization of an innovative HITL product
  • Technical Leadership: Cross-functional team coordination and delivery
  • Requirements Engineering: Concept translation to technical architecture
  • Competitive Analysis: Video concept and live agent demo for stakeholder engagement

📈 Business Context & Team Collaboration

Dallas AI Summer Program 2025

  • Program Structure: 8-week intensive program pairing mentors with teams of 2-5 members of varying backgrounds to conceive, design, and build AI product prototypes
  • My Role: Product Design/Manager & Lead AI Engineer
  • Team Composition: 4 members across AI, Frontend, Creative, and Human Factors

Mentor Collaboration

Eric Poon - Senior Vice President, Head of Technology, Shoppa's/Toyota Material Handling

  • Regular strategic guidance on product vision and market positioning
  • Competition strategy and feedback
  • Video and final presentation strategy and review

Cross-Functional Team Leadership

Team Members & Contributions:


🔧 Technical Implementation Details

Quick Start for Technical Review

# Clone and setup
git clone <repository-url>
cd human-ai-orchestrator
make setup

# Run comprehensive demos
uv run python scripts/experimentation_demo.py  # 6 scenario demo
uv run python scripts/gradio_demo.py          # Interactive interface

# Validate code quality  
make test     # Full test suite
make check    # Linting and type checking

Environment Configuration

# Required for full functionality
export OPENAI_API_KEY="your_key_here"      # Multi-provider support
export ANTHROPIC_API_KEY="your_key_here"   # Claude integration  
export GEMINI_API_KEY="your_key_here"     # Gemini integration
export LANGCHAIN_API_KEY="your_key_here"   # Optional: Tracing

# Optional: Environment selection
export HYBRID_SYSTEM_ENV="development"     # dev/test/prod configs

Production Deployment Ready

# Docker deployment
docker build -t via-hitl-system .
docker run --env-file .env via-hitl-system

# Dev container support for VSCode/Cursor
# GPU-enabled development environment available

📞 Let's Connect

I'm actively seeking opportunities in AI/ML Engineering or Product Management where I can apply these skills to solve complex problems at scale.

Contact Information:

  • LinkedIn: Chris Munch
  • Email: Available on LinkedIn profile
  • Portfolio: This repository demonstrates production-ready AI engineering

Next Steps:

  • Review the live demo to see the system in action
  • Explore the codebase to evaluate technical implementation quality
  • Check out the series of articles detailing my thought process and technical decisions

This project represents 8 weeks of intensive development, demonstrating my ability to deliver production-ready AI systems under tight deadlines while leading cross-functional teams and maintaining high technical standards.
