VeriBot is a lightweight, configurable framework for automated testing of AI language models. It allows testers to validate AI responses against expected keywords and criteria, providing a structured approach to quality assurance for conversational AI systems.
- Test Case Management: Parse and execute test cases from structured text files
- Keyword Validation: Verify AI responses contain expected keywords and phrases
- Multi-turn Conversation Testing: Support for contextual tests that span multiple exchanges
- Detailed Reporting: Generate CSV reports with test results and failure details
- Progress Tracking: Real-time visibility into test execution status
- Configurable API Integration: Currently supports DeepSeek API with extensible design
```bash
# Clone the repository
git clone https://github.com/antonyga/VeriBot.git
cd VeriBot

# Install dependencies
pip install requests
```
- Create a `config.py` file in the project root with your API credentials:
```python
# DeepSeek API configuration
DEEPSEEK_API_KEY = 'your-api-key-here'
```
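`test_runner.py` can then pick up the credential with a plain import. A minimal illustration, assuming a standard Bearer-token header for the request:

```python
from config import DEEPSEEK_API_KEY

headers = {"Authorization": f"Bearer {DEEPSEEK_API_KEY}"}
```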
- Customize test cases in `testCases.txt` following the format:
```text
**1. Test Name**
Prompt: "Your test prompt here"
Expected Keywords: ["keyword1", "keyword2"]
Pass criteria: Response contains "keyword1", "keyword2"
```
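For reference, a block in this format can be pulled apart with a few regular expressions. This is only an illustrative sketch; the actual parser in `test_runner.py` may be structured differently:

```python
import ast
import re

def parse_test_case(block: str) -> dict:
    """Extract the name, prompt, and expected keywords from one test block."""
    name = re.search(r"\*\*\d+\.\s*(.+?)\*\*", block).group(1)
    prompt = re.search(r'Prompt:\s*"(.*)"', block).group(1)
    keywords = ast.literal_eval(re.search(r"Expected Keywords:\s*(\[.*\])", block).group(1))
    return {"name": name, "prompt": prompt, "keywords": keywords}
```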
Run the test suite:

```bash
python test_runner.py
```
The script will:
- Parse all test cases from your test file
- Send each prompt to the AI service
- Validate responses against expected keywords
- Generate a detailed report of results
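The validation step in the list above is plain keyword containment. A minimal sketch of that check, assuming case-insensitive substring matching (the shipped implementation may differ):

```python
def validate_response(response, expected_keywords):
    """Return (passed, missing) where missing lists keywords absent from the response."""
    missing = [kw for kw in expected_keywords if kw.lower() not in response.lower()]
    return len(missing) == 0, missing

# Example: a response mentioning "1969" passes the moon-landing case
passed, missing = validate_response("Apollo 11 landed on the Moon in 1969.", ["1969"])
```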
VeriBot supports various test case types:
```text
**1. Factual Q&A**
Prompt: "What year did humans first land on the moon?"
Expected Keywords: ["1969"]
Pass criteria: Response contains "1969"
```
```text
**5. Multi-Turn Context**
Prompt 1: "Who wrote Romeo and Juliet?"
Expected Keywords: ["Shakespeare"]
Pass criteria: Response contains "Shakespeare"

Prompt 2 (follow-up): "What other tragedies did they write?"
Expected Keywords: ["Hamlet", "Macbeth"]
Pass criteria: Response contains "Hamlet", "Macbeth"
```
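For multi-turn cases the first exchange has to be carried into the follow-up request. One common approach with chat-style APIs is to accumulate a message list, roughly as sketched below (this assumes `call_deepseek_api()` accepts a full message history, which may not match its actual signature):

```python
# Accumulate chat-style messages so the follow-up prompt is answered in context.
messages = [{"role": "user", "content": "Who wrote Romeo and Juliet?"}]
first_reply = call_deepseek_api(messages)                      # checked for "Shakespeare"
messages.append({"role": "assistant", "content": first_reply})
messages.append({"role": "user", "content": "What other tragedies did they write?"})
second_reply = call_deepseek_api(messages)                     # checked for "Hamlet", "Macbeth"
```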
The included test cases cover a wide range of AI capabilities:
- Factual knowledge retrieval
- Creative content generation
- Instructional responses
- Role-playing scenarios
- Contextual understanding
- Ambiguity handling
- Mathematical computations
- Linguistic capabilities
- Cultural knowledge
Results are saved to `test_results.csv` with the following information:
- Test number and name
- Prompt used
- Expected keywords
- Pass/fail status
- Missing keywords (if any)
- Response snippet
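A sketch of how such a row could be written with the standard `csv` module; the column names here are illustrative and may not match the generated file exactly:

```python
import csv

def write_report(rows, path="test_results.csv"):
    """Write one dict per executed test case to the results CSV."""
    fieldnames = ["test_number", "test_name", "prompt", "expected_keywords",
                  "status", "missing_keywords", "response_snippet"]
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```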
Add new test cases to `testCases.txt` following the established format.
To use a different AI provider:
- Update the API endpoint in `test_runner.py`
- Modify the request structure in `call_deepseek_api()`
- Adjust the response parsing logic if needed
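As a rough example, a replacement call for an OpenAI-compatible chat endpoint might look like the following; the URL, model name, and payload shape are assumptions you would adapt to your provider:

```python
import requests
from config import DEEPSEEK_API_KEY  # swap in your provider's key

API_URL = "https://api.example.com/v1/chat/completions"  # hypothetical endpoint

def call_chat_api(prompt):
    """Send a single prompt and return the model's text reply."""
    payload = {
        "model": "your-model-name",
        "messages": [{"role": "user", "content": prompt}],
    }
    headers = {"Authorization": f"Bearer {DEEPSEEK_API_KEY}"}
    resp = requests.post(API_URL, json=payload, headers=headers, timeout=60)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```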
```text
VeriBot/
├── test_runner.py     # Main execution script
├── config.py          # API credentials
├── testCases.txt      # Test case definitions
└── test_results.csv   # Generated test results
```
- QA Testing: Verify AI responses meet quality standards
- Regression Testing: Ensure new model versions maintain expected behavior
- Response Validation: Check factual accuracy and keyword presence
- Multi-turn Validation: Test conversational memory and context handling
- Support for more complex validation beyond keyword matching
- Response time measurement and performance benchmarking
- HTML report generation with interactive visualizations
- Integration with CI/CD pipelines
This project is licensed under the MIT License - see the LICENSE file for details.
Created by @antonyga for Q-Aware Labs - ISTQB Certified AI Software Tester, specializing in AI system validation and prompt engineering.