Simulations quickstart

Pre-release

Test your AI assistants with realistic AI-powered callers

Overview

This quickstart guide will help you test your AI assistants and squads using realistic, AI-powered callers. In just a few minutes, you’ll create test scenarios, define evaluation criteria, and validate that your agents work correctly under different conditions.

What are Simulations?

Simulations is Vapi’s voice agent testing framework. It enables you to systematically test assistants and squads using AI-powered callers that follow defined instructions and evaluate outcomes using structured outputs. Instead of relying on manual testing or rigid scripts, a simulation recreates a real conversation and measures whether your assistant behaves correctly. Test your agents by:

  1. Creating personalities - Define a full assistant configuration for the AI tester (voice, model, system prompt)
  2. Defining scenarios - Specify instructions for the tester and evaluations using structured outputs
  3. Creating simulations - Pair scenarios with personalities
  4. Running simulations - Execute tests against your assistant or squad in voice or chat mode
  5. Reviewing results - Analyze pass/fail outcomes based on structured output evaluations

When are Simulations useful?

Simulations help you maintain quality and catch issues early:

  • Pre-deployment testing - Validate new assistant configurations before going live
  • Regression testing - Ensure prompt or tool changes don’t break existing behaviors
  • Conversation flow validation - Test multi-turn interactions and complex scenarios
  • Personality-based testing - Verify your agent handles different caller types appropriately
  • Squad handoff testing - Ensure smooth transitions between squad members
  • Performance monitoring - Track success rates over time and identify regressions

Voice vs Chat mode

Simulations support two transport modes:

Voice mode
  • Full voice simulation with audio
  • Realistic end-to-end testing
  • Tests speech recognition and synthesis
  • Produces call recordings
Chat mode
  • Text-based chat simulation
  • Faster execution
  • Lower cost (no audio processing)
  • Ideal for rapid iteration

Use chat mode during development for quick iteration, then switch to voice mode for final validation before deployment.

What you’ll build

A simulation suite for an appointment booking assistant that tests:

  • Different caller personalities (confused user, impatient customer)
  • Evaluation criteria using structured outputs with comparators
  • Real-time monitoring of test runs
  • Both voice and chat mode execution

Prerequisites

  • A Vapi account
  • Your API key, available under API Keys in the dashboard sidebar

You’ll also need an existing assistant or squad to test. You can create one in the Dashboard or use the API.
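If you don’t have an assistant yet, a single API call creates a minimal one to test against. The sketch below (TypeScript, Node 18+) uses Vapi’s assistant-creation endpoint; the name and system prompt are placeholder values.

// Minimal sketch: create an assistant to test against.
// POST https://api.vapi.ai/assistant is Vapi's assistant-creation endpoint;
// the name and prompt below are placeholders.
const res = await fetch("https://api.vapi.ai/assistant", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.VAPI_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    name: "Appointment Booking Assistant",
    model: {
      provider: "openai",
      model: "gpt-4o",
      messages: [
        {
          role: "system",
          content: "You help callers book appointments and always provide a confirmation number.",
        },
      ],
    },
  }),
});
const assistant = await res.json();
console.log("Assistant ID:", assistant.id); // use this ID as the simulation target later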

Step 1: Create a personality

Personalities define how the AI tester behaves during a simulation. A personality is a full assistant configuration that controls the tester’s voice, model, and behavior via system prompt.


Create a personality

  1. Click Create Personality
  2. Name: Enter “Impatient Customer”
  3. Assistant Configuration: Configure the tester assistant:
    • Model: Select your preferred LLM (e.g., GPT-4o)
    • System Prompt: Define the personality behavior:
      You are an impatient customer who wants quick answers.
      You speak directly and may interrupt if responses are too long.
      You expect immediate solutions to your problems.
    • Voice: Select a voice for the tester (optional for chat mode)
  4. Click Save

Start with the built-in default personalities to get familiar with the system before creating custom ones.

Personality types: Consider creating personalities for different customer types you encounter: decisive buyers, confused users, detail-oriented customers, or frustrated callers.
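If you prefer to script this step, the same personality could be expressed as an API payload. Simulations is pre-release, so the POST /personality route below is an assumed endpoint name; the fields mirror the dashboard form above.

// Hedged sketch: create a tester personality via the API.
// NOTE: /personality is an assumed route name for this pre-release feature;
// the payload mirrors the dashboard fields (name, model, system prompt, voice).
const personality = await fetch("https://api.vapi.ai/personality", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.VAPI_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    name: "Impatient Customer",
    assistant: {
      model: {
        provider: "openai",
        model: "gpt-4o",
        messages: [
          {
            role: "system",
            content:
              "You are an impatient customer who wants quick answers. " +
              "You speak directly and may interrupt if responses are too long. " +
              "You expect immediate solutions to your problems.",
          },
        ],
      },
      // voice omitted here: it is optional for chat mode
    },
  }),
}).then((r) => r.json());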

Step 2: Create a scenario

Scenarios define what the test is evaluating. A scenario contains:

  • Instructions: What the tester should do during the call
  • Evaluations: Structured outputs with expected values to validate outcomes

Configure the scenario

  1. Name: Enter “Book Appointment”
  2. Instructions: Define what the tester should do:
    You are calling to book an appointment for next Monday at 2pm.
    Confirm your identity when asked and provide any required information.
    End the call once you receive a confirmation number.

Add evaluations

Evaluations use structured outputs to extract data from the conversation and compare against expected values.

  1. Click Add Evaluation
  2. Create or select a structured output:
    • Name: “appointment_booked”
    • Schema Type: boolean
  3. Set the Comparator: =
  4. Set the Expected Value: true
  5. Mark as Required: Yes
  6. Add another evaluation for confirmation number:
    • Name: “confirmation_provided”
    • Schema Type: boolean
    • Comparator: =
    • Expected Value: true
  7. Click Save Scenario

Evaluation structure

Each evaluation consists of:

  • structuredOutputId - Reference to an existing structured output (mutually exclusive with structuredOutput)
  • structuredOutput - Inline structured output definition (mutually exclusive with structuredOutputId)
  • comparator - Comparison operator: =, !=, >, <, >=, <=
  • value - Expected value (string, number, or boolean)
  • required - Whether this evaluation must pass for the simulation to pass (default: true)

Schema type restrictions: Evaluations only support primitive schema types: string, number, integer, boolean. Objects and arrays are not supported.

Comparator options

  • = - Equals (string, number, integer, boolean)
  • != - Not equals (string, number, integer, boolean)
  • > - Greater than (number, integer)
  • < - Less than (number, integer)
  • >= - Greater than or equal (number, integer)
  • <= - Less than or equal (number, integer)

Evaluation tips: Use boolean structured outputs for pass/fail checks like “appointment_booked” or “issue_resolved”. Use numeric outputs with comparators for metrics like “satisfaction_score >= 4”.
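To make these shapes concrete, here is a hedged sketch of a scenario’s evaluations as data. The field names (structuredOutputId, structuredOutput, comparator, value, required) come from the tables above; the inline-definition shape and the IDs are illustrative assumptions.

// Hedged sketch of scenario evaluations using the documented fields.
const evaluations = [
  {
    // Reference an existing structured output by ID (hypothetical ID).
    structuredOutputId: "SO_APPOINTMENT_BOOKED_ID",
    comparator: "=",
    value: true, // boolean pass/fail check
    required: true, // must pass for the simulation to pass
  },
  {
    // Inline definition (mutually exclusive with structuredOutputId);
    // the exact inline shape here is an assumption.
    structuredOutput: { name: "satisfaction_score", schema: { type: "number" } },
    comparator: ">=",
    value: 4, // numeric threshold, e.g. "satisfaction_score >= 4"
    required: false, // informational; does not fail the run by itself
  },
];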

Step 3: Create a simulation

Simulations pair a scenario with a personality. The target assistant or squad is specified when you run the simulation.


Configure the simulation

  1. Name: Enter “Appointment Booking - Impatient Customer” (optional)
  2. Scenario: Select “Book Appointment” from the dropdown
  3. Personality: Select “Impatient Customer” from the dropdown
  4. Click Save Simulation

Multiple simulations: Create several simulations with different personality and scenario combinations to thoroughly test your assistant across various conditions.
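In API terms, a simulation is just this pairing. A hedged sketch, assuming a POST /simulation route and placeholder IDs:

// Hedged sketch: pair a scenario with a personality (route name assumed).
const simulation = await fetch("https://api.vapi.ai/simulation", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.VAPI_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    name: "Appointment Booking - Impatient Customer", // optional
    scenarioId: "SCENARIO_ID", // from Step 2
    personalityId: "PERSONALITY_ID", // from Step 1
  }),
}).then((r) => r.json());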

Step 4: Create a simulation suite (optional)

Simulation suites group multiple simulations into a single batch that runs together.


Configure the suite

  1. Name: Enter “Appointment Booking Regression Suite”
  2. Click Add Simulations
  3. Select the simulations you want to include:
    • “Appointment Booking - Impatient Customer”
    • “Appointment Booking - Confused User”
    • “Appointment Booking - Decisive Customer”
  4. Click Save Suite

Suite organization: Group related simulations together. For example, create separate suites for “Booking Tests”, “Cancellation Tests”, and “Rescheduling Tests”.
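Scripted, a suite is a named list of simulation IDs. Again a hedged sketch, with an assumed route name and placeholder IDs:

// Hedged sketch: group simulations into a suite (route name assumed).
const suite = await fetch("https://api.vapi.ai/simulation-suite", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.VAPI_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    name: "Appointment Booking Regression Suite",
    simulationIds: ["SIM_ID_1", "SIM_ID_2", "SIM_ID_3"], // placeholders
  }),
}).then((r) => r.json());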

Step 5: Run a simulation

Execute simulations against your assistant or squad. You can run individual simulations or entire suites.


Start a run

  1. Navigate to your simulation or suite
  2. Click Run
  3. Select the Target:
    • Choose Assistant or Squad
    • Select from the dropdown
  4. Configure Transport (optional):
    • Voice: vapi.websocket (default)
    • Chat: vapi.webchat (faster, no audio)
  5. Set Iterations (optional): Number of times to run each simulation
  6. Click Start Run
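Runs can also be started programmatically. The target and transport fields below match the options documented on this page (target.type and target.squadId also appear in the FAQ); the route itself is an assumption for this pre-release feature.

// Hedged sketch: start a run in chat mode (route name assumed).
const run = await fetch("https://api.vapi.ai/simulation/SIMULATION_ID/run", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.VAPI_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    target: { type: "assistant", assistantId: "ASSISTANT_ID" }, // or { type: "squad", squadId: "..." }
    transport: { provider: "vapi.webchat" }, // chat mode; use "vapi.websocket" for voice
    iterations: 3, // optional: run the simulation three times
  }),
}).then((r) => r.json());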

Monitor progress

  1. Click the Runs tab to see live status updates
  2. Watch as each simulation progresses:
    • Queued - Waiting to start
    • Running - Test in progress
    • Ended - Test finished
  3. For voice mode, click Listen on any running test to hear the call live

Step 6: Review results

Analyze the results of your simulation runs to understand how your assistant performed.

Successful run

When all evaluations pass, you’ll see:

{
  "id": "550e8400-e29b-41d4-a716-446655440007",
  "status": "ended",
  "itemCounts": {
    "total": 3,
    "passed": 3,
    "failed": 0,
    "running": 0,
    "queued": 0,
    "canceled": 0
  },
  "startedAt": "2024-01-15T09:50:05Z",
  "endedAt": "2024-01-15T09:52:30Z"
}

Pass criteria:

  • status is “ended”
  • itemCounts.passed equals itemCounts.total
  • All required evaluations show passed: true
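These criteria translate directly into a check you can drop into CI. The status and itemCounts fields come from the example run payload shown above.

// Pass/fail check over the run fields shown in the example payload above.
type RunResult = {
  status: string;
  itemCounts: { total: number; passed: number; failed: number };
};

function runPassed(run: RunResult): boolean {
  // A run passes when it has ended and every item passed.
  return run.status === "ended" && run.itemCounts.passed === run.itemCounts.total;
}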

Failed run

When an evaluation fails, you’ll see details about what went wrong:

{
  "id": "550e8400-e29b-41d4-a716-446655440008",
  "status": "ended",
  "itemCounts": {
    "total": 3,
    "passed": 2,
    "failed": 1,
    "running": 0,
    "queued": 0,
    "canceled": 0
  }
}

Failure indicators:

  • itemCounts.failed > 0
  • Individual run items show which evaluations failed and why

View run results

  1. Navigate to the Runs tab
  2. Click on a completed run to see details
  3. View the summary showing pass/fail counts

Investigate failures

  1. Click on any failed simulation
  2. Review the Conversation to see the full transcript
  3. Check which evaluations failed and their actual vs expected values
  4. For voice mode, click Listen to Recording to hear the full call

Track performance over time

  1. Go to the main Simulations page
  2. View historical runs and their pass rates
  3. Monitor trends to identify regressions

Full conversation transcripts are available for all simulation runs, making it easy to understand exactly what happened during each test.


Tips for success

Best practices for effective simulation testing:

  • Start with chat mode - Use vapi.webchat for rapid iteration, then validate with voice
  • Use realistic personalities - Model your test callers after actual customer types
  • Define clear evaluations - Use specific, measurable structured outputs
  • Group related tests - Organize suites by feature or user flow
  • Monitor trends - Track pass rates over time to catch regressions early
  • Test after changes - Run your simulation suites after updating prompts or tools
  • Listen to recordings - Audio recordings reveal issues that metrics alone miss
  • Iterate on failures - Use failed tests to improve both your assistant and test design

Frequently asked questions

How many simulations can run at the same time?

Simulation concurrency follows your organization’s call concurrency limits. Each voice simulation uses 2 concurrent call slots (one for the AI tester, one for your assistant being tested), so a concurrency limit of 10 allows at most 5 voice simulations at once. Chat mode simulations are more efficient since they don’t require audio processing. If you need higher concurrency limits, contact support.

How do Simulations differ from Evals?

Simulations use AI-powered testers that have actual conversations with your assistant, producing real call recordings and transcripts. Evals use mock conversations with predefined messages and judge the responses. Use Simulations for realistic end-to-end testing; use Evals for faster, more controlled validation.

Can I reuse existing structured outputs in evaluations?

Yes! You can either define inline structured outputs in your scenario evaluations, or reference existing structured outputs by ID using the structuredOutputId field.

How do I test a squad?

Run a simulation against a squad instead of an assistant by setting the target.type: "squad" and target.squadId fields when creating a run.
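For example, the run request’s target block would look like this (field names from the answer above; the ID is a placeholder):

// Target a squad instead of an assistant when starting a run.
const target = { type: "squad", squadId: "SQUAD_ID" }; // SQUAD_ID is a placeholder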

Get help

Need assistance? We’re here to help. Reach out to Vapi support with any questions.