Call queue management with Twilio

Handle high-volume calls with Twilio queues when hitting Vapi concurrency limits

Overview

When your application receives more simultaneous calls than your Vapi concurrency limit allows, calls can be rejected. A call queue system using Twilio queues solves this by holding excess calls in a queue and processing them as capacity becomes available.

In this guide, you’ll learn to:

  • Set up Twilio call queues for high-volume scenarios
  • Implement concurrency tracking to respect Vapi limits
  • Build a queue processing system with JavaScript
  • Handle call dequeuing and Vapi integration seamlessly

This approach is ideal for call centers, customer support lines, or any application expecting call volumes that exceed your Vapi concurrency limit.

Prerequisites

Before implementing call queue management, ensure you have:

  • Vapi Account: Access to the Vapi Dashboard with your API key
  • Twilio Account: Active Twilio account with Account SID and Auth Token
  • Twilio CLI: Install from twil.io/cli for queue management
  • Phone Number: Twilio phone number configured for incoming calls
  • Assistant: Configured Vapi assistant ID for handling calls
  • Server Environment: Node.js server capable of receiving webhooks

You’ll need to know your Vapi account’s concurrency limit. Check your plan details in the Vapi Dashboard under billing settings.

How it works

The queue management system operates in three phases:

Queue Incoming

Incoming calls are automatically placed in a Twilio queue when received

Track Capacity

Server monitors active Vapi calls against your concurrency limit

Process Queue

When capacity is available, calls are dequeued and connected to Vapi

Call Flow:

  1. Incoming call → Twilio receives the call and requests your /incoming webhook
  2. Queue placement → The call is placed in the Twilio queue with hold music
  3. Automatic processing → The server checks the queue every second
  4. Capacity check → The server verifies that the Vapi concurrency limit allows a new call
  5. Dequeue & connect → Available calls are dequeued and connected to Vapi assistants
  6. Concurrency tracking → System tracks active calls via end-of-call webhooks

Implementation Guide

1

Create Twilio Queue

First, create a Twilio queue using the Twilio CLI to hold incoming calls.

twilio api:core:queues:create \
  --friendly-name customer-support

Expected Response:

{
  "account_sid": "ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "average_wait_time": 0,
  "current_size": 0,
  "date_created": "2024-01-15T18:39:09.000Z",
  "date_updated": "2024-01-15T18:39:09.000Z",
  "friendly_name": "customer-support",
  "max_size": 100,
  "sid": "QUaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "uri": "/2010-04-01/Accounts/ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/Queues/QUaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa.json"
}

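Save the queue sid (e.g., QUaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa). The server uses it as TWILIO_QUEUE_SID to list waiting calls. The friendly name (customer-support) must match the queue name passed to <Enqueue> in the /incoming handler.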
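
If you lose the sid, you can look it up again by friendly name. A minimal sketch, assuming `client` is a Twilio REST client created with `require('twilio')(accountSid, authToken)`; `findQueueSid` is a hypothetical helper name:

```javascript
// Sketch: look up a queue SID by its friendly name.
// `client` is assumed to be a Twilio REST client; it is passed in so the
// function can also be exercised with a stub.
async function findQueueSid(client, friendlyName) {
  const queues = await client.queues.list({ limit: 100 });
  const match = queues.find((queue) => queue.friendlyName === friendlyName);
  return match ? match.sid : null; // null when no queue has that name
}

// Usage:
// const client = require('twilio')(process.env.TWILIO_ACCOUNT_SID, process.env.TWILIO_AUTH_TOKEN);
// findQueueSid(client, 'customer-support').then(console.log);
```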

2

Configure Phone Number Webhook

Configure your Twilio phone number to send incoming calls to your queue endpoint.

  1. Go to Twilio Console > Phone Numbers
  2. Select your phone number
  3. Set the "A call comes in" webhook to: https://your-server.com/incoming
  4. Set HTTP method to POST
  5. Save configuration
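
The same setting can also be applied through the Twilio REST API instead of the Console. A minimal sketch, assuming `client` is a Twilio REST client and `phoneNumberSid` is the SID of your incoming number (it starts with PN); `setVoiceWebhook` is a hypothetical helper name:

```javascript
// Sketch: point a Twilio number's voice webhook at the queue endpoint.
// `client` is assumed to be a Twilio REST client created with
// require('twilio')(accountSid, authToken).
async function setVoiceWebhook(client, phoneNumberSid, webhookUrl) {
  const number = await client
    .incomingPhoneNumbers(phoneNumberSid)
    .update({ voiceUrl: webhookUrl, voiceMethod: 'POST' });
  return { voiceUrl: number.voiceUrl, voiceMethod: number.voiceMethod };
}

// Usage:
// setVoiceWebhook(client, 'PNxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx', 'https://your-server.com/incoming');
```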
3

Set up Server Environment

Create your Node.js server with the required dependencies and environment variables.

Install Dependencies:

npm install express twilio axios dotenv

Environment Variables (.env):

# Vapi Configuration
VAPI_API_KEY=your_vapi_api_key_here
VAPI_PHONE_NUMBER_ID=your_phone_number_id
VAPI_ASSISTANT_ID=your_assistant_id

# Twilio Configuration
TWILIO_ACCOUNT_SID=your_twilio_account_sid
TWILIO_AUTH_TOKEN=your_twilio_auth_token
TWILIO_QUEUE_SID=QUaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

# Server Configuration
PORT=3000
MAX_CONCURRENCY=5
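
A missing variable otherwise only surfaces when the first call fails, so it can help to validate the environment at startup. A minimal sketch; `loadConfig` and `REQUIRED_VARS` are hypothetical names, and the defaults mirror the values in the .env example:

```javascript
// Variables the server cannot run without (names taken from the .env example).
const REQUIRED_VARS = [
  'VAPI_API_KEY',
  'VAPI_PHONE_NUMBER_ID',
  'VAPI_ASSISTANT_ID',
  'TWILIO_ACCOUNT_SID',
  'TWILIO_AUTH_TOKEN',
  'TWILIO_QUEUE_SID',
];

// Sketch: fail fast on missing or malformed settings.
function loadConfig(env) {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }

  const maxConcurrency = Number.parseInt(env.MAX_CONCURRENCY ?? '5', 10);
  if (!Number.isInteger(maxConcurrency) || maxConcurrency < 1) {
    throw new Error('MAX_CONCURRENCY must be a positive integer');
  }

  const port = Number.parseInt(env.PORT ?? '3000', 10);
  return { maxConcurrency, port };
}

// Usage, after require('dotenv').config():
// const config = loadConfig(process.env);
```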
4

Implement Queue Management Server

Create the main server file with queue handling, concurrency tracking, and Vapi integration.

server.js
const express = require('express');
const twilio = require('twilio');
const axios = require('axios');
require('dotenv').config();

const app = express();
const twilioClient = twilio(process.env.TWILIO_ACCOUNT_SID, process.env.TWILIO_AUTH_TOKEN);

// Concurrency and queue tracking
let activeCalls = 0;
let callsInQueue = 0;
let isProcessing = false; // Prevents overlapping queue passes
const MAX_CONCURRENCY = parseInt(process.env.MAX_CONCURRENCY, 10) || 5;

// Middleware
app.use(express.json());
app.use(express.urlencoded({ extended: true }));

// Incoming call handler - adds calls to queue
app.post('/incoming', (req, res) => {
  try {
    const twiml = `<?xml version="1.0" encoding="UTF-8"?>
      <Response>
        <Enqueue>customer-support</Enqueue>
      </Response>`;

    res.set('Content-Type', 'application/xml');
    res.send(twiml);

    // Increment queue counter
    callsInQueue++;
    console.log(`Call ${req.body.CallSid} added to queue. Calls in queue: ${callsInQueue}`);
  } catch (error) {
    console.error('Error handling incoming call:', error);
    res.status(500).send('Error processing call');
  }
});

// Queue processing function
async function processQueue() {
  // Skip this pass if a previous one is still waiting on Twilio or Vapi.
  // Without this guard, overlapping passes could exceed MAX_CONCURRENCY.
  if (isProcessing) {
    return;
  }
  isProcessing = true;

  try {
    // Check if we have capacity for more calls
    if (activeCalls >= MAX_CONCURRENCY) {
      return;
    }

    // Check if there are calls in queue
    if (callsInQueue === 0) {
      return;
    }

    // Get next call from queue
    const members = await twilioClient.queues(process.env.TWILIO_QUEUE_SID)
      .members
      .list({ limit: 1 });

    if (members.length === 0) {
      // No calls in queue - sync our counter
      callsInQueue = 0;
      return;
    }

    const member = members[0];
    console.log(`Processing queued call: ${member.callSid}`);

    // Queue members do not include the caller's number,
    // so read it from the call record
    const call = await twilioClient.calls(member.callSid).fetch();

    // Get Vapi TwiML for this call
    const twiml = await initiateVapiCall(member.callSid, call.from);

    if (twiml) {
      // Update call with Vapi TwiML
      await twilioClient.calls(member.callSid).update({ twiml });

      // Increment active call counter and decrement queue counter
      activeCalls++;
      callsInQueue--;
      console.log(`Call connected to Vapi. Active calls: ${activeCalls}/${MAX_CONCURRENCY}, Queue: ${callsInQueue}`);
    } else {
      console.error(`Failed to get TwiML for call ${member.callSid}`);
    }
  } catch (error) {
    console.error('Error processing queue:', error);
  } finally {
    isProcessing = false;
  }
}

// Generate Vapi TwiML for a call
async function initiateVapiCall(callSid, customerNumber) {
  const payload = {
    phoneNumberId: process.env.VAPI_PHONE_NUMBER_ID,
    phoneCallProviderBypassEnabled: true,
    customer: { number: customerNumber },
    assistantId: process.env.VAPI_ASSISTANT_ID,
  };

  const headers = {
    'Authorization': `Bearer ${process.env.VAPI_API_KEY}`,
    'Content-Type': 'application/json',
  };

  try {
    const response = await axios.post('https://api.vapi.ai/call', payload, { headers });

    if (response.data && response.data.phoneCallProviderDetails) {
      return response.data.phoneCallProviderDetails.twiml;
    } else {
      throw new Error('Invalid response structure from Vapi');
    }
  } catch (error) {
    console.error(`Error initiating Vapi call for ${callSid}:`, error.message);
    return null;
  }
}

// Webhook for call completion (decrements active calls)
app.post('/call-ended', (req, res) => {
  try {
    // Handle Vapi end-of-call-report webhook
    const message = req.body.message;

    if (message && message.type === 'end-of-call-report') {
      const callId = message.call?.id;

      if (activeCalls > 0) {
        activeCalls--;
        console.log(`Vapi call ${callId} ended. Active calls: ${activeCalls}/${MAX_CONCURRENCY}`);
        // Note: Queue processing happens automatically every 1 second
      }
    }

    res.status(200).send('OK');
  } catch (error) {
    console.error('Error handling Vapi webhook:', error);
    res.status(500).send('Error');
  }
});

// Manual queue processing endpoint (for testing/monitoring)
app.post('/process-queue', async (req, res) => {
  try {
    await processQueue();
    res.json({
      message: 'Queue processing triggered',
      activeCalls,
      callsInQueue,
      maxConcurrency: MAX_CONCURRENCY
    });
  } catch (error) {
    console.error('Error in manual queue processing:', error);
    res.status(500).json({ error: 'Failed to process queue' });
  }
});

// Health check endpoint
app.get('/health', (req, res) => {
  res.json({
    status: 'healthy',
    activeCalls,
    callsInQueue,
    maxConcurrency: MAX_CONCURRENCY,
    availableCapacity: MAX_CONCURRENCY - activeCalls
  });
});

// Start server
const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`Queue management server running on port ${PORT}`);
  console.log(`Max concurrency: ${MAX_CONCURRENCY}`);

  // Start automatic queue processing every 1 second
  startQueueProcessor();
});

// Automatic queue processing with 1-second interval
function startQueueProcessor() {
  setInterval(async () => {
    try {
      // Only process queue if there are calls waiting
      if (callsInQueue > 0) {
        await processQueue();
      }
    } catch (error) {
      console.error('Error in automatic queue processing:', error);
    }
  }, 1000); // Check queue every 1 second

  console.log('Automatic queue processor started (1-second interval)');
}

module.exports = app;
5

Configure Vapi Webhooks for Call Tracking

Configure your Vapi assistant to send end-of-call-report webhooks for accurate concurrency tracking.

Assistant Configuration: You need to configure your assistant with proper webhook settings to receive call status updates.

assistant-configuration.js
const assistantConfig = {
  name: "Queue Management Assistant",
  // ... other assistant configuration

  // Configure server URL for webhooks.
  // This must be the full URL of the route that handles end-of-call reports.
  server: {
    url: "https://your-server.com/call-ended",
    timeoutSeconds: 20
  },

  // Configure which messages to send to your server
  serverMessages: ["end-of-call-report", "status-update"]
};

The webhook will be sent to your server URL with the message type end-of-call-report when calls end. This allows you to decrement your active call counter accurately. See the Assistant API reference for all available server message types.
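
These settings can be applied in the Dashboard or over the API. The sketch below builds the request for an API update, assuming the assistant is updated with PATCH https://api.vapi.ai/assistant/{id} and the same Bearer authentication used for call creation; confirm both against the Assistant API reference. `buildAssistantUpdate` is a hypothetical helper name:

```javascript
// Sketch: build (but do not send) a request that sets the webhook
// configuration on an existing assistant.
// Assumption: PATCH https://api.vapi.ai/assistant/{id} accepts these fields.
function buildAssistantUpdate(assistantId, apiKey, serverUrl) {
  return {
    url: `https://api.vapi.ai/assistant/${assistantId}`,
    options: {
      method: 'PATCH',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        server: { url: serverUrl, timeoutSeconds: 20 },
        serverMessages: ['end-of-call-report', 'status-update'],
      }),
    },
  };
}

// Usage (Node 18+ provides a global fetch):
// const { url, options } = buildAssistantUpdate(
//   process.env.VAPI_ASSISTANT_ID,
//   process.env.VAPI_API_KEY,
//   'https://your-server.com/call-ended'
// );
// fetch(url, options).then((res) => console.log(res.status));
```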

Webhook Payload Example: Your /call-ended endpoint will receive a webhook with this structure:

end-of-call-report-payload.json
{
  "message": {
    "type": "end-of-call-report",
    "call": {
      "id": "73a6da0f-c455-4bb6-bf4a-5f0634871430",
      "status": "ended",
      "endedReason": "assistant-ended-call"
    }
  }
}
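
A plain counter decrements on every end-of-call-report, including duplicates and reports for calls that never came through the queue. One possible alternative is to track call IDs. This sketch assumes you record the `id` field from the Vapi call creation response when a call is connected; `ActiveCallTracker` and its methods are hypothetical names:

```javascript
// Sketch: track active Vapi calls by ID so each call frees its slot once.
class ActiveCallTracker {
  constructor(maxConcurrency) {
    this.maxConcurrency = maxConcurrency;
    this.activeIds = new Set();
  }

  hasCapacity() {
    return this.activeIds.size < this.maxConcurrency;
  }

  // Call this after a queued call has been connected to Vapi.
  start(callId) {
    this.activeIds.add(callId);
  }

  // Pass the parsed webhook body. Returns true only when a tracked
  // call ended and its slot was released.
  handleWebhook(body) {
    const message = body && body.message;
    if (!message || message.type !== 'end-of-call-report') {
      return false;
    }
    const callId = message.call && message.call.id;
    return callId ? this.activeIds.delete(callId) : false;
  }
}

const tracker = new ActiveCallTracker(2);
tracker.start('73a6da0f-c455-4bb6-bf4a-5f0634871430');
console.log(tracker.hasCapacity()); // true (1 of 2 slots in use)
```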
6

Test the Queue System

Deploy your server and test the complete queue management flow.

Start Your Server:

node server.js

Test Scenarios:

  1. Single call: Call your Twilio number. With no other calls active, it should leave the queue and connect within a second or two
  2. Multiple calls: Make several simultaneous calls to test queuing
  3. Capacity limit: Make more calls than your MAX_CONCURRENCY setting
  4. Queue processing: Check that calls are processed as others end

Monitor Queue Status:

# Check server health and capacity
curl https://your-server.com/health

# Manually trigger queue processing
curl -X POST https://your-server.com/process-queue

Automatic Queue Processing

The system includes automatic queue processing that runs continuously to ensure optimal call handling:

How It Works

  • Smart checking: Only runs processQueue() when callsInQueue > 0
  • Queue counter tracking: Increments when calls enter queue, decrements when processed
  • 1-second intervals: The server checks for queued calls every second
  • Efficient processing: Only processes calls when both capacity is available AND calls are waiting
  • Error handling: Continues running even if individual queue checks fail
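
The interval loop described above can be isolated into a small helper. A minimal sketch; `startPoller` is a hypothetical name, and the `running` flag is an addition that skips a tick while the previous pass is still waiting on an API call:

```javascript
// Sketch: run an async task on a fixed interval without overlapping runs.
// Returns a function that stops the poller.
function startPoller(task, intervalMs) {
  let running = false;

  const timer = setInterval(async () => {
    if (running) {
      return; // Previous pass is still in flight, skip this tick
    }
    running = true;
    try {
      await task();
    } catch (error) {
      // Log and keep polling; one failed pass should not stop the loop
      console.error('Queue pass failed:', error.message);
    } finally {
      running = false;
    }
  }, intervalMs);

  return () => clearInterval(timer);
}

// Usage:
// const stop = startPoller(processQueue, 1000);
// stop(); // e.g. during graceful shutdown
```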

Benefits

Immediate Response

The next queued call is picked up within about 1 second of capacity becoming available

No Manual Triggers

No need to manually trigger queue processing - it happens automatically

Fault Tolerant

System continues running even if individual API calls fail

Scalable

Handles high-volume scenarios without missing queued calls

Performance Considerations

  • Minimal API Calls: Only queries Twilio API when callsInQueue > 0
  • Counter Synchronization: Automatically syncs queue counter with actual Twilio queue state
  • Efficient Resource Usage: Avoids unnecessary processing when queue is empty
  • Graceful Degradation: Handles temporary API failures without crashing
  • Smart Logging: Provides clear visibility into queue and active call counts
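
The server resets its queue counter only when Twilio reports an empty queue, so callers who hang up while waiting can leave the counter too high. One way to resync is to read the queue size from Twilio. A minimal sketch, assuming `client` is a Twilio REST client; `fetchQueueSize` is a hypothetical helper name:

```javascript
// Sketch: read the current queue size from Twilio.
// `client` is assumed to be a Twilio REST client created with
// require('twilio')(accountSid, authToken).
async function fetchQueueSize(client, queueSid) {
  const queue = await client.queues(queueSid).fetch();
  return queue.currentSize; // Number of calls waiting in the queue
}

// Usage inside the server, for example once every 30 seconds:
// callsInQueue = await fetchQueueSize(twilioClient, process.env.TWILIO_QUEUE_SID);
```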

With automatic processing, once a Vapi call ends and frees capacity, the next queued call is picked up on the following 1-second tick, plus the time the Twilio and Vapi API requests take. Each pass connects at most one call, so a backlog drains at roughly one call per second.

Troubleshooting

Common causes:

  • Server not receiving call-ended webhooks (check webhook URLs)
  • Concurrency counter stuck (restart server to reset)
  • Vapi API errors (check API key and assistant ID)

Solutions:

  • Verify webhook URLs are publicly accessible
  • Add logging to track concurrency changes
  • Test Vapi API calls independently
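
Restarting the server also resets the counter while calls may still be active. A gentler option for a stuck counter is to release slots for calls that have been open longer than any real call should last. A minimal sketch, assuming you keep a Map of call ID to start time; `reapStaleCalls` and the duration limit are hypothetical:

```javascript
// Sketch: free slots held by calls whose end-of-call webhook never arrived.
// startedAt - Map of call ID -> start time in milliseconds
// Returns the IDs that were removed.
function reapStaleCalls(startedAt, nowMs, maxCallMs) {
  const reaped = [];
  for (const [callId, startMs] of startedAt) {
    if (nowMs - startMs > maxCallMs) {
      startedAt.delete(callId); // Deleting during Map iteration is safe
      reaped.push(callId);
    }
  }
  return reaped;
}

const startedAt = new Map([
  ['call-a', 0],       // started 2 hours before "now"
  ['call-b', 7140000], // started 1 minute before "now"
]);
console.log(reapStaleCalls(startedAt, 7200000, 3600000)); // [ 'call-a' ]
```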

Check these items:

  • MAX_CONCURRENCY setting is appropriate for your Vapi plan
  • Queue processing is being triggered (check logs)
  • No errors in Vapi TwiML generation

Debug steps:

  • Call /process-queue endpoint manually
  • Check /health endpoint for current capacity
  • Review server logs for error messages

Potential issues:

  • Invalid phone number format (use E.164 format)
  • Incorrect Vapi configuration (phone number ID, assistant ID)
  • Network timeouts during TwiML generation

Solutions:

  • Validate all phone numbers before processing
  • Add timeout handling to API calls
  • Implement retry logic for failed Vapi requests
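
The solutions above can be sketched with a few small helpers. The names `isE164`, `withTimeout`, and `withRetry`, and the default retry settings, are hypothetical:

```javascript
// E.164: a plus sign, then up to 15 digits with no leading zero.
const E164_PATTERN = /^\+[1-9]\d{1,14}$/;

function isE164(number) {
  return typeof number === 'string' && E164_PATTERN.test(number);
}

// Reject if the promise does not settle within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Retry an async function with exponential backoff between attempts.
async function withRetry(fn, { retries = 3, baseDelayMs = 200, timeoutMs = 5000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await withTimeout(fn(), timeoutMs);
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        const delayMs = baseDelayMs * 2 ** attempt;
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

console.log(isE164('+14155552671')); // true
console.log(isE164('4155552671'));   // false (missing the leading +)
```

Take care when retrying call creation: a request that timed out on your side may still have succeeded, so a retry could create a second Vapi call for the same caller.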