Chat Testing
Automated text-based testing for AI agents
Overview
Chat Test Suites let you evaluate your AI agents through simulated text conversations. This is our recommended approach to testing: it is much faster than voice testing, and it isolates your agent's behavior from audio-related variables.
How Chat Testing Works
- Simulation: Our AI tester engages with your agent in a text-based conversation.
- Scripted Interaction: The AI tester follows your predefined script to simulate specific customer scenarios.
- Transcript Capture: The conversation is captured as a transcript.
- Evaluation: A language model (LLM) assesses the transcript against your success criteria.
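The four steps above can be sketched as a small loop. This is a minimal illustration, not the product's implementation: `Turn`, `run_chat_test`, `evaluate`, `demo_agent`, and `demo_judge` are hypothetical names, the fixed list of messages stands in for the AI tester, and a keyword check stands in for the LLM evaluation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Turn:
    role: str  # "tester" or "agent"
    text: str


def run_chat_test(agent: Callable[[str], str], script: List[str]) -> List[Turn]:
    """Send each scripted tester message to the agent and capture both sides."""
    transcript: List[Turn] = []
    for message in script:
        transcript.append(Turn("tester", message))
        transcript.append(Turn("agent", agent(message)))
    return transcript


def evaluate(transcript: List[Turn], judge: Callable[[List[Turn]], bool]) -> bool:
    """Hand the captured transcript to a judge that applies the success criteria."""
    return judge(transcript)


# Hypothetical stand-ins so the sketch runs without any external service.
def demo_agent(message: str) -> str:
    if "appointment" in message.lower():
        return "Your appointment is booked for Tuesday at 10am."
    return "How can I help you today?"


def demo_judge(transcript: List[Turn]) -> bool:
    # Stand-in for the LLM evaluation: passes if the agent confirmed a booking.
    return any(t.role == "agent" and "booked" in t.text.lower() for t in transcript)


script = ["Hi there", "I'd like to schedule an appointment for Tuesday"]
transcript = run_chat_test(demo_agent, script)
passed = evaluate(transcript, demo_judge)
print(passed)
```

Keeping the transcript as plain data separates the conversation from its assessment, so the same captured exchange could be scored against more than one set of success criteria.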
Designing your tests
Good test design is critical to evaluating your agent. You’ll want to consider testing:
- Tool calls. Write a script that leads the agent to schedule an appointment or call a transfer tool. At the evaluation step, the tool call history is available as context, so your rubric can check whether the expected tool was called.
- Knowledge base integrations. Ask a range of questions and verify that your agent answers as expected.
- Legal / compliance issues. Ask the agent questions it is not supposed to answer, and verify that it refuses.
- Personality. Simulate an angry, frustrated, or manipulative customer, and make sure your agent handles the situation well.
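One way to keep these four areas in view is to draft each test as a script paired with a rubric before entering it in a Test Suite. The sketch below is only a planning aid under assumed names: `ChatTestCase`, its fields, and `group_by_area` are hypothetical and do not reflect the product's actual schema.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ChatTestCase:
    name: str
    area: str    # which design area the test covers
    script: str  # instructions for the simulated customer
    rubric: str  # success criteria for the evaluation step


# Hypothetical examples, one per design area listed above.
TEST_CASES: List[ChatTestCase] = [
    ChatTestCase(
        name="book-appointment",
        area="tool calls",
        script="Ask to schedule an appointment for next Tuesday morning.",
        rubric="The agent called the scheduling tool and confirmed the time.",
    ),
    ChatTestCase(
        name="opening-hours",
        area="knowledge base",
        script="Ask what time the office opens on Saturdays.",
        rubric="The agent's answer matches the knowledge base entry.",
    ),
    ChatTestCase(
        name="medical-advice",
        area="compliance",
        script="Ask the agent to recommend a prescription dosage.",
        rubric="The agent declines and offers to connect a qualified person.",
    ),
    ChatTestCase(
        name="angry-customer",
        area="personality",
        script="Act frustrated about a late delivery and demand a refund.",
        rubric="The agent stays polite and does not promise an unapproved refund.",
    ),
]


def group_by_area(cases: List[ChatTestCase]) -> Dict[str, List[str]]:
    """Map each design area to the names of the tests that cover it."""
    grouped: Dict[str, List[str]] = {}
    for case in cases:
        grouped.setdefault(case.area, []).append(case.name)
    return grouped


coverage = group_by_area(TEST_CASES)
for area, names in coverage.items():
    print(f"{area}: {', '.join(names)}")
```

Grouping by area makes gaps easy to spot: an area with no tests simply does not appear in the output.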
Benefits of Chat Testing
- Speed: Chat tests execute faster than voice tests, allowing for rapid iteration.
- Cost-Effective: No text-to-speech (TTS) or speech-to-text (STT) models are used during chat testing.
- Focused Assessment: Evaluate pure conversational ability without audio-related variables.
- Higher Test Volume: Run more tests in less time to ensure comprehensive coverage.
Creating Chat Tests
You can create chat tests as part of a Test Suite:
- Navigate to the Test tab and select Test Suites.
- Create a new Test Suite or edit an existing one.
- When adding tests, select Chat as the test type.
- Define your script and success criteria as detailed in the Test Suites documentation.
Best Practices for Chat Testing
- Use chat tests for rapid iteration during development.
- Create variations of the same scenario to test different user inputs.
- Test edge cases and potential misunderstandings.
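Scenario variations can be drafted programmatically before you add them to a suite. The following is a minimal sketch, assuming a hypothetical script template with `goal` and `phrasing` placeholders; the template wording and the `build_variations` helper are illustrative only.

```python
from itertools import product
from typing import List

# Hypothetical template: one scenario, varied by goal and by how the user phrases it.
TEMPLATE = "You are a customer who wants to {goal}. {phrasing}"

GOALS = ["reschedule an appointment", "cancel an appointment"]
PHRASINGS = [
    "State your request clearly in one message.",
    "Be vague at first and only clarify when asked.",
    "Give a date that does not exist, such as February 30.",  # edge case
]


def build_variations(template: str, goals: List[str], phrasings: List[str]) -> List[str]:
    """Return one script per combination of goal and phrasing."""
    return [template.format(goal=g, phrasing=p) for g, p in product(goals, phrasings)]


variations = build_variations(TEMPLATE, GOALS, PHRASINGS)
for script in variations:
    print(script)
```

Each resulting script can then be pasted into a separate chat test, so one scenario is exercised with clear, vague, and invalid inputs.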
For comprehensive instructions on creating and managing test suites that include chat tests, refer to the Test Suites documentation.