Portfolio Project: An AI orchestration platform demonstrating advanced Human-in-the-Loop architecture, multi-agent systems, and enterprise-grade engineering practices.
VIA is an AI workflow orchestration platform that improves customer support through intelligent human-AI collaboration. This project showcases my current abilities in AI engineering, product design and management, and system architecture, delivering improvements in both customer satisfaction and employee wellbeing.
VIA was built during the Dallas AI Summer Program 2025 under the mentorship of Eric Poon, Senior Vice President and Head of Technology at Shoppa's/Toyota Material Handling.
It was an eight-week program in which a group of AI enthusiasts from various backgrounds worked together under one of the mentors to conceive, design, and create an AI product prototype.
I was the product designer/manager and the lead AI engineer for this project, responsible for the technical architecture, multi-agent system design, and human-in-the-loop workflows. I (with Claude Code) am responsible for all code in this repository except the frontend code in the frontend/ directory, which was built by Nithin Dodla.
This fork is my personal showcase as I continue to develop and refine the VIA system, demonstrating my skills in AI engineering and product design.
- Designed a human-in-the-loop AI product
- Architected a multi-agent AI system using LangGraph/LangChain
- Implemented a production-ready backend with modular, testable components
- Integrated multiple LLMs (Gemini, Claude, local models)
- Established a comprehensive testing framework with 95%+ code coverage
- Delivered a live technical demo of the agents in action
Traditional chatbot systems create frustration cycles: customers get stuck with inadequate AI responses, leading to angry escalations that burn out human agents. This creates a lose-lose scenario for both customers and employees.
VIA is designed to monitor customer-AI interactions and to proactively escalate to the most appropriate human, based on factors like customer sentiment, issue complexity, and human agent wellbeing.
This system also empowers human agents to be part of the team, providing feedback and context to continuously improve AI performance.
Key Features:
- Intercepts all AI responses before delivery for quality assessment
- Monitors customer sentiment in real-time to prevent frustration buildup
- Escalates intelligently to the optimal human agent based on factors like expertise and wellbeing
- Learns from every interaction to continuously integrate human expertise into the system
- Reduction in customer frustration incidents and overall resolution times
- Improvement in first-contact resolution quality
- Decrease in employee burnout indicators through intelligent workload balancing
- Empowerment of human agents, making them part of the team and driving continuous improvement
I designed and implemented a multi-agent system that demonstrates modern AI engineering patterns:
# Human-in-the-Loop Workflow Pipeline
Customer Query → Frustration Analysis → Chatbot Agent → Quality Agent →
  → Context Enrichment → Intelligent Human Routing
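The pipeline above can be sketched as a plain-Python chain of stage functions. This is a simplified stand-in for the actual LangGraph workflow: the state fields, heuristics, and thresholds below are illustrative, and the real stages call LLMs rather than keyword checks.

```python
from dataclasses import dataclass, field

@dataclass
class TicketState:
    """State handed between pipeline stages (simplified stand-in for the LangGraph state)."""
    query: str
    frustration: float = 0.0      # 0.0 (calm) .. 1.0 (very frustrated)
    draft_reply: str = ""
    quality_ok: bool = False
    escalate: bool = False
    context: dict = field(default_factory=dict)

def frustration_analysis(s: TicketState) -> TicketState:
    # Placeholder heuristic; the real system uses an LLM for sentiment analysis.
    s.frustration = 0.9 if "!!" in s.query or "angry" in s.query.lower() else 0.1
    return s

def chatbot_agent(s: TicketState) -> TicketState:
    s.draft_reply = f"Thanks for reaching out about: {s.query}"
    return s

def quality_agent(s: TicketState) -> TicketState:
    # Intercept the draft before delivery; reject low-quality or high-risk replies.
    s.quality_ok = len(s.draft_reply) > 0 and s.frustration < 0.8
    return s

def context_enrichment(s: TicketState) -> TicketState:
    s.context["history_loaded"] = True
    return s

def human_routing(s: TicketState) -> TicketState:
    # Escalate whenever the quality gate fails or frustration is high.
    s.escalate = not s.quality_ok or s.frustration >= 0.8
    return s

def run_pipeline(query: str) -> TicketState:
    state = TicketState(query=query)
    for stage in (frustration_analysis, chatbot_agent, quality_agent,
                  context_enrichment, human_routing):
        state = stage(state)
    return state
```

The key design point is that every stage receives and returns the full state object, so any stage can trigger escalation without the chatbot agent knowing about routing.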
- Time Constraints: This was planned as an 8-week project, but coming up with a novel product concept that matched our mentor's theme of "Human-in-the-Loop AI", and then getting all five team members (one eventually dropped out), with varying levels of engagement and commitment, to agree on the final product design, took a significant amount of time.
  - After the concept was finalized, we had less than 4 weeks to design, implement, and deliver a working prototype with a live demo.
  - Only 2 of the 4 remaining team members had coding experience, and nobody on the team had programmed with LLMs before.
  - This fork of the project represents my attempt to finish out the core of the product.
- Incomplete Design/Implementation: We ran out of time to truly finish an MVP, and various loose ends remain, like attaching the backend to the frontend. The product's core features come down to the escalation engine and the context manager, and these will be the key areas of future development, such as advanced RAG for the context manager and multi-model ensembles for the escalation engine.
- Foundation Models: For a proof of concept in the short time frame, we used Gemini 2.5 Flash API calls for the best real-time performance, but these were not thoroughly tuned, latency is not optimal, and cost would probably be high at scale. Future plans include specialized sentiment models, pre-screener models, and ensemble approaches for cost/performance optimization.
- Evaluation Framework: We did not have time for proper evaluation and model selection, but a comprehensive framework that allows rapid iteration and experimentation will be critical for future performance optimization and cost control.
- Simulation Environment: We built a basic simulation environment to test the system and acquire mock data, but it is severely lacking. We need a more robust environment (ideally based on real data) with realistic customer personas and employee behavior models to properly validate the system.
- Competitive Optimization: Ultimately, if this were to become a real product, the main areas of competitive differentiation would be:
  - Overall User Experience: both for customers and human agents
  - Model Performance: the speed, accuracy, and cost of our models
  - Innovation: the overall quality of the escalation engine, context manager, and human feedback loop, especially working in unison
1. Frustration Detection Agent (Implemented)
Real-time frustration analysis with escalating pattern detection and configurable thresholds
- Current: Gemini 2.5 Flash API calls for sentiment analysis
- Planned: Specialized sentiment models for speed and cost optimization
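The escalating-pattern idea can be sketched as a rolling score over recent messages. This is a simplified heuristic stand-in for the LLM-based agent; the keyword list, weights, and thresholds are illustrative, not the project's tuned values.

```python
from collections import deque

class FrustrationTracker:
    """Rolling frustration score with escalating-pattern detection (illustrative sketch)."""

    NEGATIVE_WORDS = {"angry", "useless", "ridiculous", "terrible", "cancel"}

    def __init__(self, threshold: float = 0.7, window: int = 3):
        self.threshold = threshold          # configurable escalation threshold
        self.scores = deque(maxlen=window)  # last N per-message scores

    def score_message(self, text: str) -> float:
        # Crude lexical proxy for what the LLM sentiment call does in the real system.
        words = text.lower().split()
        hits = sum(1 for w in words if w.strip("!?.,") in self.NEGATIVE_WORDS)
        exclaims = text.count("!")
        return min(1.0, 0.3 * hits + 0.1 * exclaims)

    def update(self, text: str) -> bool:
        """Record a message; return True when escalation should trigger."""
        self.scores.append(self.score_message(text))
        rising = len(self.scores) >= 2 and list(self.scores) == sorted(self.scores)
        avg = sum(self.scores) / len(self.scores)
        # Escalate on a high average OR a rising pattern that crosses half the threshold.
        return avg >= self.threshold or (rising and self.scores[-1] >= self.threshold / 2)
```

Tracking a window rather than single messages is what lets the system catch frustration that builds gradually before it boils over.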
2. Quality Assurance Agent (Implemented)
Real-time response evaluation before delivery with configurable quality thresholds
- Current: Gemini 2.5 Flash API calls for consensus validation
- Planned: Specialized models or pre-screener models for performance optimization
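Consensus validation can be reduced to a small gating function: deliver the reply only if enough evaluator scores clear the quality bar. The threshold and quorum values here are hypothetical, not the system's configured ones.

```python
def consensus_ok(scores: list[float], pass_threshold: float = 0.7,
                 quorum: float = 0.66) -> bool:
    """Deliver the draft reply only if a quorum of evaluator scores clears the threshold.

    Illustrative stand-in for the multi-call consensus validation; in the real
    system each score would come from a separate LLM evaluation pass.
    """
    passing = sum(1 for s in scores if s >= pass_threshold)
    return passing / len(scores) >= quorum
```

A quorum rule like this makes the gate robust to a single noisy evaluation, at the cost of extra model calls per response.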
3. Intelligent Routing Agent (Implemented)
Dynamic human agent routing based on various factors including expertise, workload, and customer history
- Current: Gemini 2.5 Flash API calls for roster scoring and routing
- Planned: Ensemble of multiple models for performance optimization (e.g. XGBoost for ranking and scoring, ensembled with context-aware models for optimal employee selection)
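Roster scoring can be sketched as a weighted composite over expertise match, current workload, and wellbeing. The weights and fields below are illustrative assumptions, not the values the LLM-based scorer actually produces.

```python
from dataclasses import dataclass

@dataclass
class HumanAgent:
    name: str
    skills: set            # domains this agent handles
    open_tickets: int      # current workload
    wellbeing: float       # 0.0 (burned out) .. 1.0 (fresh), e.g. from recent shift data

def routing_score(agent: HumanAgent, issue_domain: str, max_load: int = 10) -> float:
    """Composite routing score; the weights below are illustrative, not tuned."""
    expertise = 1.0 if issue_domain in agent.skills else 0.2
    load = 1.0 - min(agent.open_tickets, max_load) / max_load  # lighter load scores higher
    return 0.5 * expertise + 0.25 * load + 0.25 * agent.wellbeing

def pick_agent(roster: list[HumanAgent], issue_domain: str) -> HumanAgent:
    return max(roster, key=lambda a: routing_score(a, issue_domain))
```

Weighting wellbeing alongside expertise is the point of the design: an overloaded expert can lose the routing decision to a fresher colleague with the same skills.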
4. Context Manager Agent (Implemented)
Centralized context aggregation from multiple sources, providing rich real-time data for decision-making and continuously integrating human expertise and feedback into the system
- Current: SQL database for customer history and context searches
- Planned: Advanced RAG knowledgebase for faster context retrieval and integration
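The current SQL-backed context lookup can be sketched with an in-memory SQLite store. The schema, table name, and sample rows are illustrative assumptions, not the repository's actual schema.

```python
import sqlite3

def build_context_store() -> sqlite3.Connection:
    """In-memory stand-in for the customer-history database (schema is illustrative)."""
    con = sqlite3.connect(":memory:")
    con.execute("""CREATE TABLE interactions (
        customer_id TEXT, ts TEXT, channel TEXT, summary TEXT)""")
    con.executemany(
        "INSERT INTO interactions VALUES (?, ?, ?, ?)",
        [("c42", "2025-07-01", "chat", "Asked about claim status"),
         ("c42", "2025-07-03", "email", "Escalated: claim denied"),
         ("c99", "2025-07-02", "chat", "Password reset")])
    return con

def customer_context(con: sqlite3.Connection, customer_id: str,
                     limit: int = 5) -> list[str]:
    """Return the most recent interaction summaries for a human hand-off packet."""
    rows = con.execute(
        "SELECT summary FROM interactions WHERE customer_id = ? "
        "ORDER BY ts DESC LIMIT ?",
        (customer_id, limit)).fetchall()
    return [r[0] for r in rows]
```

The planned RAG knowledgebase would replace the exact-match SQL query with semantic retrieval, but the hand-off contract (a short ranked list of context snippets) stays the same.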
I created a live technical demo showcasing the core AI agents in action (the demo is geared toward customer support in the insurance domain).
Screen captures are below, and the live demo can be found here: Live Backend Demo
Production-Ready Engineering Practices:
- Modular, Interface-Driven Design: Clean separation of concerns with comprehensive abstraction and interface contracts
- Pluggable Structure: Modular, OOP, and factory patterns for various components (e.g., agents, workflows) enabling easy extension
- Agent-Centric Configuration: Modular, hot-reloadable configuration system
- Comprehensive Testing: 95%+ code coverage with unit, integration, and performance tests
- Observability: Structured logging, LangSmith tracing, and performance monitoring
- Database Management: Centralized data layer
Technology Stack:
- AI/ML Framework: LangGraph, LangChain, LangSmith
- Languages: Python 3.11+
- Data: SQLite, Pydantic
- Testing: pytest
- DevOps: Docker, Dev Containers
- Quality: ruff, mypy
Future Enhancements:
- Evaluation Framework: Comprehensive model evaluation and selection framework for performance optimization
- Integrated Evals: Monitoring of model performance and drift, plus config/threshold tuning
- Multi-Model Ensembles: Cost/performance optimization through ensemble approaches
- Advanced Sentiment Models: Beyond basic prompt-based analysis for deeper customer insights
- Advanced RAG Implementation: Context manager with specialized model integration for performance optimization
- Real-time Dashboard: Customer satisfaction metrics and agent performance tracking
src/
├── core/                            # Infrastructure & Configuration
│   ├── agent_config_manager.py      # Hot-reloadable agent configuration
│   ├── context_manager.py           # Multi-source data aggregation
│   ├── database_config.py           # Centralized data management
│   └── logging/                     # Structured observability
├── interfaces/                      # Contract-driven development
│   ├── core/                        # System interface contracts
│   ├── nodes/                       # Agent behavior specifications
│   └── workflows/                   # Orchestration interfaces
├── nodes/                           # AI Agent Implementations
│   ├── chatbot_agent.py             # Customer service AI
│   ├── quality_agent.py             # Response quality assurance
│   ├── frustration_agent.py         # Sentiment analysis & intervention
│   ├── human_routing_agent.py       # Intelligent escalation routing
│   └── context_manager_agent.py     # Context aggregation & delivery
├── simulation/                      # Testing & Validation Framework
│   ├── human_customer_simulator.py  # Realistic customer personas
│   ├── employee_simulator.py        # Human agent simulation
│   └── demo_orchestrator.py         # End-to-end scenario testing
└── workflows/                       # Orchestration & State Management
    └── hybrid_workflow.py           # HITL system coordination
Designed an agent-centric configuration system that enables:
- Modular Development: Each agent has isolated configuration namespace
- Environment Management: Clean dev/test/prod separation
- Hot Reloading: Runtime configuration updates without restart
- Model Consolidation: Single source of truth for AI model preferences
Comprehensive Test Coverage:
- Unit Tests: All core components with mock-based isolation
- Integration Tests: End-to-end workflow validation
- Performance Tests: Concurrent operation and large dataset handling
- Error Scenario Testing: Comprehensive failure mode validation
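A mock-based unit test in this style isolates decision logic from any real model call. The function under test and its names here are hypothetical, written only to show the pattern.

```python
from unittest.mock import MagicMock

def escalation_decision(llm, message: str, threshold: float = 0.7) -> bool:
    """Toy function under test: escalate when the sentiment score crosses the threshold."""
    return llm.score_sentiment(message) >= threshold

def test_escalates_on_high_frustration():
    llm = MagicMock()
    llm.score_sentiment.return_value = 0.9   # no real API call: the LLM is mocked out
    assert escalation_decision(llm, "This is unacceptable!") is True
    llm.score_sentiment.assert_called_once_with("This is unacceptable!")

def test_no_escalation_when_calm():
    llm = MagicMock()
    llm.score_sentiment.return_value = 0.2
    assert escalation_decision(llm, "Thanks for the help") is False
```

Mocking the model boundary keeps the tests fast and deterministic, which is what makes 95%+ coverage practical for an LLM-heavy codebase.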
Development Workflows:
make setup # Automated environment setup
make test # Full test suite with coverage
make check # Code quality validation
make run # Local development server
- Live Backend Demo - Hugging Face Spaces deployment
- Frontend Prototype - React-based user interface
- Presentation Deck - Complete technical overview
- Development Guide - Comprehensive technical documentation
- Test Suite - 95%+ coverage with real-world scenarios
- Configuration Examples - Production-ready setup guides
- Architecture Design: Scalable, modular, maintainable system patterns
- Testing Strategy: Comprehensive validation and quality assurance
- DevOps Practices: Containerization, CI/CD, and deployment automation
- Documentation: Technical writing and AI/developer experience optimization
- Strategic Vision: Conceptualization of innovative HITL product
- Technical Leadership: Cross-functional team coordination and delivery
- Requirements Engineering: Concept translation to technical architecture
- Competitive Analysis: Video concept and live agent demo for stakeholder engagement
- Program Structure: 8-week intensive program pairing mentors with teams of 2-5 members of varying backgrounds to conceive, design, and build AI product prototypes
- My Role: Product Designer/Manager & Lead AI Engineer
- Team Composition: 4 members across AI, Frontend, Creative, and Human Factors
Eric Poon - Senior Vice President, Head of Technology, Shoppa's/Toyota Material Handling
- Regular strategic guidance on product vision and market positioning
- Competition strategy and feedback
- Video and final presentation strategy and review
Team Members & Contributions:
- Chris Munch - Product Designer/Manager, AI Architecture, Backend Development
- Snehaa Muthiah - Creative Director, Branding, Presentation Design
- Nithin Dodla - Frontend Development, UI/UX
- Thomas Siskos - Human Factors Research, Marketing Strategy
# Clone and setup
git clone <repository-url>
cd human-ai-orchestrator
make setup
# Run comprehensive demos
uv run python scripts/experimentation_demo.py # 6 scenario demo
uv run python scripts/gradio_demo.py # Interactive interface
# Validate code quality
make test # Full test suite
make check # Linting and type checking
# Required for full functionality
export OPENAI_API_KEY="your_key_here" # Multi-provider support
export ANTHROPIC_API_KEY="your_key_here" # Claude integration
export GEMINI_API_KEY="your_key_here" # Gemini integration
export LANGCHAIN_API_KEY="your_key_here" # Optional: Tracing
# Optional: Environment selection
export HYBRID_SYSTEM_ENV="development" # dev/test/prod configs
# Docker deployment
docker build -t via-hitl-system .
docker run --env-file .env via-hitl-system
# Dev container support for VSCode/Cursor
# GPU-enabled development environment available
I'm actively seeking opportunities in AI/ML Engineering or Product Management where I can apply these skills to solve complex problems at scale.
Contact Information:
- LinkedIn: Chris Munch
- Email: Available on LinkedIn profile
- Portfolio: This repository demonstrates production-ready AI engineering
Next Steps:
- Review the live demo to see the system in action
- Explore the codebase to evaluate technical implementation quality
- Check out the series of articles detailing my thought process and technical decisions
This project represents 8 weeks of intensive development, demonstrating my ability to deliver production-ready AI systems under tight deadlines while leading cross-functional teams and maintaining high technical standards.