Simulations quickstart
Test your AI assistants with realistic AI-powered callers
This quickstart guide will help you test your AI assistants and squads using realistic, AI-powered callers. In just a few minutes, you’ll create test scenarios, define evaluation criteria, and validate that your agents work correctly under different conditions.
Simulations is Vapi’s voice agent testing framework that enables you to systematically test assistants and squads using AI-powered callers that follow defined instructions and evaluate outcomes using structured outputs. Instead of relying on manual testing or rigid scripts, Simulations recreate real conversations and measure whether your assistant behaves correctly. Test your agents by:
Simulations help you maintain quality and catch issues early:
Simulations support two transport modes:
Use chat mode during development for quick iteration, then switch to voice mode for final validation before deployment.
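The development-then-validation workflow above can be sketched as a small helper that picks a transport per stage. This is a minimal illustration, not Vapi's API: the `choose_transport` function and the `{"provider": ...}` shape are assumptions, and only the identifiers `vapi.websocket` (voice, default) and `vapi.webchat` (chat, no audio) come from this guide.

```python
def choose_transport(stage: str) -> dict:
    """Pick a transport for a simulation run based on the workflow stage.

    "development" selects chat mode (faster, no audio); any other stage
    selects voice mode, which this guide lists as the default.
    The {"provider": ...} shape is an assumed payload fragment.
    """
    provider = "vapi.webchat" if stage == "development" else "vapi.websocket"
    return {"provider": provider}


print(choose_transport("development"))  # {'provider': 'vapi.webchat'}
print(choose_transport("pre-deploy"))   # {'provider': 'vapi.websocket'}
```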
A simulation suite for an appointment booking assistant that tests:
Sign up at dashboard.vapi.ai
Get your API key from API Keys in the sidebar
You’ll also need an existing assistant or squad to test. You can create one in the Dashboard or use the API.
Personalities define how the AI tester behaves during a simulation. A personality is a full assistant configuration that controls the tester’s voice, model, and behavior via system prompt.
Start with the built-in default personalities to get familiar with the system before creating custom ones.
Personality types: Consider creating personalities for different customer types you encounter: decisive buyers, confused users, detail-oriented customers, or frustrated callers.
Scenarios define what the test is evaluating. A scenario contains:
Evaluations use structured outputs to extract data from the conversation and compare against expected values.
Each evaluation consists of:
Schema type restrictions: Evaluations only support primitive schema types: string, number, integer, boolean. Objects and arrays are not supported.
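A local pre-check can catch an unsupported schema type before you create an evaluation. The sketch below is a hypothetical helper, not part of any Vapi SDK; it assumes the evaluation schema is a JSON-Schema-style dict with a top-level `type` key.

```python
# Primitive types this guide lists as supported for evaluation schemas.
SUPPORTED_TYPES = {"string", "number", "integer", "boolean"}


def is_supported_schema(schema: dict) -> bool:
    """Return True if the schema's top-level type is a supported primitive.

    Assumes a JSON-Schema-style dict such as {"type": "boolean"}.
    Illustrative only; the real validation happens on Vapi's side.
    """
    schema_type = schema.get("type")
    return isinstance(schema_type, str) and schema_type in SUPPORTED_TYPES


print(is_supported_schema({"type": "boolean"}))  # True
print(is_supported_schema({"type": "object"}))   # False
```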
Evaluation tips: Use boolean structured outputs for pass/fail checks like “appointment_booked” or “issue_resolved”. Use numeric outputs with comparators for metrics like “satisfaction_score >= 4”.
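To make the comparison step concrete, here is a minimal sketch of checking an extracted value against an expected value. The comparator table and the `evaluate` helper are assumptions for illustration; this guide only shows `>=` (in `satisfaction_score >= 4`), and the actual comparison is performed by Vapi.

```python
import operator

# Hypothetical comparator table; only ">=" appears in this guide.
COMPARATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def evaluate(extracted, comparator: str, expected) -> bool:
    """Compare a value extracted from the conversation with the expected value."""
    return COMPARATORS[comparator](extracted, expected)


# Boolean pass/fail check, e.g. appointment_booked should be true.
print(evaluate(True, "=", True))  # True
# Numeric metric, e.g. satisfaction_score >= 4 with an extracted score of 3.
print(evaluate(3, ">=", 4))       # False
```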
Simulations pair a scenario with a personality. The target assistant or squad is specified when you run the simulation.
Multiple simulations: Create several simulations with different personality and scenario combinations to thoroughly test your assistant across various conditions.
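One way to cover every personality and scenario combination is to generate the pairs programmatically. The sketch below uses placeholder IDs, and the `personalityId` and `scenarioId` key names are assumptions for illustration rather than confirmed API fields.

```python
from itertools import product

# Placeholder IDs; real values come from the personalities and scenarios you created.
personality_ids = ["personality-decisive", "personality-confused"]
scenario_ids = ["scenario-booking", "scenario-cancellation"]


def build_simulations(personalities, scenarios):
    """Pair every personality with every scenario.

    The "personalityId" and "scenarioId" keys are assumed names.
    """
    return [
        {"personalityId": p, "scenarioId": s}
        for p, s in product(personalities, scenarios)
    ]


simulations = build_simulations(personality_ids, scenario_ids)
print(len(simulations))  # 4
```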
Simulation suites group multiple simulations into a single batch that runs together.
Suite organization: Group related simulations together. For example, create separate suites for “Booking Tests”, “Cancellation Tests”, and “Rescheduling Tests”.
Execute simulations against your assistant or squad. You can run individual simulations or entire suites.
vapi.websocket (default)
vapi.webchat (faster, no audio)
Analyze the results of your simulation runs to understand how your assistant performed.
When all evaluations pass, you’ll see:
Pass criteria:
status is “ended”
itemCounts.passed equals itemCounts.total
passed: true
When an evaluation fails, you’ll see details about what went wrong:
Failure indicators:
itemCounts.failed > 0
Full conversation transcripts are available for all simulation runs, making it easy to understand exactly what happened during each test.
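The pass criteria and failure indicator described above can be expressed as a small check, for example to gate a CI job. This sketch assumes the run result is available as a dict with `status` and `itemCounts` keys, as the field names in this guide suggest; the helper names are hypothetical.

```python
def run_passed(run: dict) -> bool:
    """Pass criteria: the run has ended and every item passed."""
    counts = run.get("itemCounts", {})
    return run.get("status") == "ended" and counts.get("passed") == counts.get("total")


def run_has_failures(run: dict) -> bool:
    """Failure indicator: itemCounts.failed > 0."""
    return run.get("itemCounts", {}).get("failed", 0) > 0


# Assumed result shapes for illustration.
ok_run = {"status": "ended", "itemCounts": {"total": 3, "passed": 3, "failed": 0}}
bad_run = {"status": "ended", "itemCounts": {"total": 3, "passed": 2, "failed": 1}}

print(run_passed(ok_run), run_has_failures(ok_run))    # True False
print(run_passed(bad_run), run_has_failures(bad_run))  # False True
```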
Learn about tool mocks, hooks, CI/CD integration, and testing strategies
Create and configure assistants to test
Learn about chat-based testing with mock conversations
Learn how to define structured outputs for evaluations
Best practices for effective simulation testing:
vapi.webchat for rapid iteration, then validate with voice
Simulation concurrency follows your organization’s call concurrency limits. Each voice simulation uses 2 concurrent call slots (one for the AI tester, one for your assistant being tested). Chat mode simulations are more efficient since they don’t require audio processing. If you need higher concurrency limits, contact support.
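As a quick worked example of the voice-mode arithmetic: with 2 call slots per voice simulation, the number of voice simulations that can run at once is the concurrency limit divided by 2, rounded down. The helper below is a hypothetical sketch and assumes no other calls are occupying slots at the same time.

```python
SLOTS_PER_VOICE_SIMULATION = 2  # one for the AI tester, one for the assistant under test


def max_parallel_voice_simulations(call_concurrency_limit: int) -> int:
    """Upper bound on simultaneous voice simulations for a given call concurrency limit."""
    return call_concurrency_limit // SLOTS_PER_VOICE_SIMULATION


print(max_parallel_voice_simulations(10))  # 5
print(max_parallel_voice_simulations(7))   # 3
```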
Simulations use AI-powered testers that have actual conversations with your assistant, producing real call recordings and transcripts. Evals use mock conversations with predefined messages and judge the responses. Use Simulations for realistic end-to-end testing; use Evals for faster, more controlled validation.
Yes! You can either define inline structured outputs in your scenario evaluations, or reference existing structured outputs by ID using the structuredOutputId field.
Run the simulation against a squad instead of an assistant: when creating a run, set target.type to "squad" and provide the squad's ID in target.squadId.
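A minimal sketch of building the `target` object for a run follows. The `type: "squad"` and `squadId` fields come from this guide; the `"assistant"` type value and `assistantId` key are assumed by analogy, and `build_target` is a hypothetical helper rather than an SDK function.

```python
def build_target(kind: str, target_id: str) -> dict:
    """Build the target object for a simulation run.

    "squad" uses the fields named in this guide; the "assistant"
    branch is an assumption made by analogy.
    """
    if kind == "squad":
        return {"type": "squad", "squadId": target_id}
    if kind == "assistant":
        return {"type": "assistant", "assistantId": target_id}  # assumed field names
    raise ValueError(f"unknown target kind: {kind!r}")


print(build_target("squad", "your-squad-id"))
```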