Call queue management with Twilio

Handle high-volume calls with Twilio queues when hitting Vapi concurrency limits

Overview

When your application receives more simultaneous calls than your Vapi concurrency limit allows, calls can be rejected. A call queue system using Twilio queues solves this by holding excess calls in a queue and processing them as capacity becomes available.

In this guide, you’ll learn to:

  • Set up Twilio call queues for high-volume scenarios
  • Implement concurrency tracking to respect Vapi limits
  • Build a queue processing system with JavaScript
  • Handle call dequeuing and Vapi integration seamlessly

This approach is ideal for call centers, customer support lines, or any application expecting call volumes that exceed your Vapi concurrency limit.

Prerequisites

Before implementing call queue management, ensure you have:

  • Vapi Account: Access to the Vapi Dashboard with your API key
  • Twilio Account: Active Twilio account with Account SID and Auth Token
  • Twilio CLI: Install from twil.io/cli for queue management
  • Phone Number: Twilio phone number configured for incoming calls
  • Assistant: Configured Vapi assistant ID for handling calls
  • Server Environment: Node.js server capable of receiving webhooks
  • Redis Instance: Redis server for persistent state management (local, cloud, or serverless-compatible)

You’ll need to know your Vapi account’s concurrency limit. Check your plan details in the Vapi Dashboard under billing settings.

For production deployments, especially in serverless environments, Redis ensures your call counters persist across server restarts and function invocations.

How it works

The queue management system operates in three phases:

Queue Incoming

Incoming calls are automatically placed in a Twilio queue when received

Track Capacity

Server monitors active Vapi calls against your concurrency limit

Process Queue

When capacity is available, calls are dequeued and connected to Vapi

Call Flow:

  1. Incoming call → Twilio receives call and executes webhook
  2. Queue placement → Call is placed in Twilio queue with hold music
  3. Automatic processing → Server processes queue immediately when capacity changes
  4. Capacity check → Server verifies if Vapi concurrency limit allows new calls using Redis
  5. Dequeue & connect → Available calls are dequeued and connected to Vapi assistants
  6. Persistent tracking → Redis tracks active calls across server restarts and serverless invocations
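The TwiML behind the queue-placement step is tiny. Here is a sketch of the kind of response the incoming-call webhook returns; the waitUrl is a hypothetical hold-music endpoint, and omitting the attribute makes Twilio play its default hold music:

```javascript
// Builds the TwiML that parks a caller in the named queue.
// 'customer-support' must match the queue's friendly name;
// waitUrl (optional; hypothetical URL here) customizes hold music.
function enqueueTwiml(queueName, waitUrl) {
  const attrs = waitUrl ? ` waitUrl="${waitUrl}"` : '';
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<Response>',
    `  <Enqueue${attrs}>${queueName}</Enqueue>`,
    '</Response>',
  ].join('\n');
}

console.log(enqueueTwiml('customer-support', 'https://your-server.com/hold-music'));
```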

Implementation Guide

Step 1: Create Twilio Queue

First, create a Twilio queue using the Twilio CLI to hold incoming calls.

twilio api:core:queues:create \
  --friendly-name customer-support

Expected Response:

{
  "account_sid": "ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "average_wait_time": 0,
  "current_size": 0,
  "date_created": "2024-01-15T18:39:09.000Z",
  "date_updated": "2024-01-15T18:39:09.000Z",
  "friendly_name": "customer-support",
  "max_size": 100,
  "sid": "QUaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "uri": "/2010-04-01/Accounts/ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/Queues/QUaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa.json"
}

Save the queue sid (e.g., QUaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa); you'll need it for queue operations and for the TWILIO_QUEUE_SID environment variable.
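If you prefer the Node helper library over the CLI, the same queue can be created programmatically. A sketch (the client is injected as a parameter so the helper can be exercised with a stub; in real use pass a `twilio(accountSid, authToken)` client):

```javascript
// Creates a Twilio queue via the REST API and returns its SID.
// `client` is a twilio-node client (or a stub in tests).
async function createSupportQueue(client, friendlyName = 'customer-support') {
  const queue = await client.queues.create({ friendlyName });
  return queue.sid;
}

// Real usage (requires `npm install twilio` and credentials in env):
// const twilio = require('twilio');
// const client = twilio(process.env.TWILIO_ACCOUNT_SID, process.env.TWILIO_AUTH_TOKEN);
// createSupportQueue(client).then((sid) => console.log(`Queue SID: ${sid}`));
```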

Step 2: Configure Phone Number Webhook

Configure your Twilio phone number to send incoming calls to your queue endpoint.

  1. Go to Twilio Console > Phone Numbers
  2. Select your phone number
  3. Set A call comes in webhook to: https://your-server.com/incoming
  4. Set HTTP method to POST
  5. Save configuration
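The console steps above can also be scripted with twilio-node's `incomingPhoneNumbers(...).update(...)`. A sketch, with the phone number SID and server URL as placeholders and the client injected for testability:

```javascript
// Points a Twilio number's voice webhook at the /incoming endpoint.
// `phoneNumberSid` is the PN... SID of your number (placeholder below).
async function pointNumberAtServer(client, phoneNumberSid, serverUrl) {
  const number = await client.incomingPhoneNumbers(phoneNumberSid).update({
    voiceUrl: `${serverUrl}/incoming`,
    voiceMethod: 'POST',
  });
  return number.voiceUrl;
}

// Real usage with a twilio-node client:
// pointNumberAtServer(client, 'PN...', 'https://your-server.com');
```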

Step 3: Set up Redis for Persistent State

Configure Redis for persistent call counter storage. Choose the option that best fits your deployment:

Install Redis locally:

# macOS (using Homebrew)
brew install redis
brew services start redis

# Ubuntu/Debian
sudo apt update
sudo apt install redis-server
sudo systemctl start redis-server

# Docker
docker run -d -p 6379:6379 redis:alpine

Test connection:

redis-cli ping
# Should return: PONG
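The same check can run from Node at startup so a bad REDIS_URL fails fast. A sketch assuming a connected node-redis v4 client (injected here so it can be tested with a stub):

```javascript
// Verifies Redis connectivity, mirroring `redis-cli ping`.
async function assertRedisAlive(client) {
  const reply = await client.ping();
  if (reply !== 'PONG') {
    throw new Error(`Unexpected PING reply: ${reply}`);
  }
  return reply;
}

// Real usage, after redisClient.connect():
// await assertRedisAlive(redisClient);
```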

Step 4: Set up Server Environment

Create your Node.js server with the required dependencies and environment variables.

Install Dependencies:

npm install express twilio axios dotenv redis

Environment Variables (.env):

# Vapi Configuration
VAPI_API_KEY=your_vapi_api_key_here
VAPI_PHONE_NUMBER_ID=your_phone_number_id
VAPI_ASSISTANT_ID=your_assistant_id

# Twilio Configuration
TWILIO_ACCOUNT_SID=your_twilio_account_sid
TWILIO_AUTH_TOKEN=your_twilio_auth_token
TWILIO_QUEUE_SID=QUaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

# Redis Configuration (for persistent state)
REDIS_URL=redis://localhost:6379
# For Redis Cloud: REDIS_URL=rediss://username:password@host:port
# For Upstash (serverless): REDIS_URL=rediss://default:password@host:port

# Server Configuration
PORT=3000
MAX_CONCURRENCY=5
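A missing variable here tends to surface later as a confusing runtime error, so it's worth validating configuration at startup. A minimal sketch (the list mirrors the .env file above):

```javascript
// Names of the env vars the server cannot run without.
const REQUIRED_ENV = [
  'VAPI_API_KEY', 'VAPI_PHONE_NUMBER_ID', 'VAPI_ASSISTANT_ID',
  'TWILIO_ACCOUNT_SID', 'TWILIO_AUTH_TOKEN', 'TWILIO_QUEUE_SID',
  'REDIS_URL',
];

// Returns the names of any required vars that are unset or empty.
function missingEnv(env = process.env) {
  return REQUIRED_ENV.filter((name) => !env[name]);
}

// In server.js, before starting the server:
// const missing = missingEnv();
// if (missing.length > 0) {
//   console.error(`Missing required env vars: ${missing.join(', ')}`);
//   process.exit(1);
// }
```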

Step 5: Implement Queue Management Server

Create the main server file with queue handling, concurrency tracking, and Vapi integration.

server.js

const express = require('express');
const twilio = require('twilio');
const axios = require('axios');
const redis = require('redis');
require('dotenv').config();

const app = express();
const twilioClient = twilio(process.env.TWILIO_ACCOUNT_SID, process.env.TWILIO_AUTH_TOKEN);

// Redis client for persistent state management
const redisClient = redis.createClient({
  url: process.env.REDIS_URL || 'redis://localhost:6379'
});

const MAX_CONCURRENCY = parseInt(process.env.MAX_CONCURRENCY, 10) || 5;
const REDIS_KEYS = {
  ACTIVE_CALLS: 'vapi:queue:active_calls',
  CALLS_IN_QUEUE: 'vapi:queue:calls_in_queue'
};

// Middleware
app.use(express.json());
app.use(express.urlencoded({ extended: true }));

// Initialize Redis connection
async function initializeRedis() {
  try {
    await redisClient.connect();
    console.log('Connected to Redis');

    // Initialize counters if they don't exist
    const activeCalls = await redisClient.get(REDIS_KEYS.ACTIVE_CALLS);
    const callsInQueue = await redisClient.get(REDIS_KEYS.CALLS_IN_QUEUE);

    if (activeCalls === null) {
      await redisClient.set(REDIS_KEYS.ACTIVE_CALLS, '0');
    }
    if (callsInQueue === null) {
      await redisClient.set(REDIS_KEYS.CALLS_IN_QUEUE, '0');
    }
  } catch (error) {
    console.error('Redis connection failed:', error);
    process.exit(1);
  }
}

// Helper functions for Redis operations
async function getActiveCalls() {
  const count = await redisClient.get(REDIS_KEYS.ACTIVE_CALLS);
  return parseInt(count, 10) || 0;
}

async function getCallsInQueue() {
  const count = await redisClient.get(REDIS_KEYS.CALLS_IN_QUEUE);
  return parseInt(count, 10) || 0;
}

async function incrementActiveCalls() {
  return await redisClient.incr(REDIS_KEYS.ACTIVE_CALLS);
}

// Note: the read-then-decrement pattern below is not atomic; under very
// heavy concurrency, consider a Lua script to decrement with a floor of zero
async function decrementActiveCalls() {
  const current = await getActiveCalls();
  if (current > 0) {
    return await redisClient.decr(REDIS_KEYS.ACTIVE_CALLS);
  }
  return current;
}

async function incrementCallsInQueue() {
  return await redisClient.incr(REDIS_KEYS.CALLS_IN_QUEUE);
}

async function decrementCallsInQueue() {
  const current = await getCallsInQueue();
  if (current > 0) {
    return await redisClient.decr(REDIS_KEYS.CALLS_IN_QUEUE);
  }
  return current;
}

async function syncCallsInQueue() {
  await redisClient.set(REDIS_KEYS.CALLS_IN_QUEUE, '0');
}

// Incoming call handler - adds calls to queue
app.post('/incoming', async (req, res) => {
  try {
    // The queue name must match the friendly name created in Step 1
    const twiml = `<?xml version="1.0" encoding="UTF-8"?>
      <Response>
        <Enqueue>customer-support</Enqueue>
      </Response>`;

    res.set('Content-Type', 'application/xml');
    res.send(twiml);

    // Increment queue counter in Redis
    const queueCount = await incrementCallsInQueue();
    console.log(`Call ${req.body.CallSid} added to queue. Calls in queue: ${queueCount}`);

    // Immediately check if we can process this call
    setImmediate(() => processQueue());

  } catch (error) {
    console.error('Error handling incoming call:', error);
    if (!res.headersSent) {
      res.status(500).send('Error processing call');
    }
  }
});

async function processQueue() {
  try {
    const activeCalls = await getActiveCalls();
    const callsInQueue = await getCallsInQueue();

    // Check if we have capacity for more calls
    if (activeCalls >= MAX_CONCURRENCY) {
      return;
    }

    // Check if there are calls in queue
    if (callsInQueue === 0) {
      return;
    }

    // Get next call from queue
    const members = await twilioClient.queues(process.env.TWILIO_QUEUE_SID)
      .members
      .list({ limit: 1 });

    if (members.length === 0) {
      // No calls in queue - sync our counter
      await syncCallsInQueue();
      return;
    }

    const member = members[0];
    console.log(`Processing queued call: ${member.callSid}`);

    // The queue Member resource doesn't expose the caller's number,
    // so fetch the call itself to get it
    const call = await twilioClient.calls(member.callSid).fetch();

    // Get Vapi TwiML for this call
    const twiml = await initiateVapiCall(member.callSid, call.from);

    if (twiml) {
      // Update call with Vapi TwiML
      await twilioClient.calls(member.callSid).update({ twiml });

      // Update counters in Redis
      const newActiveCalls = await incrementActiveCalls();
      const newQueueCount = await decrementCallsInQueue();

      console.log(`Call connected to Vapi. Active calls: ${newActiveCalls}/${MAX_CONCURRENCY}, Queue: ${newQueueCount}`);

      // Check if we can process more calls immediately
      if (newActiveCalls < MAX_CONCURRENCY && newQueueCount > 0) {
        setImmediate(() => processQueue());
      }
    } else {
      console.error(`Failed to get TwiML for call ${member.callSid}`);
    }
  } catch (error) {
    console.error('Error processing queue:', error);
  }
}

// Generate Vapi TwiML for a call
async function initiateVapiCall(callSid, customerNumber) {
  const payload = {
    phoneNumberId: process.env.VAPI_PHONE_NUMBER_ID,
    phoneCallProviderBypassEnabled: true,
    customer: { number: customerNumber },
    assistantId: process.env.VAPI_ASSISTANT_ID,
  };

  const headers = {
    'Authorization': `Bearer ${process.env.VAPI_API_KEY}`,
    'Content-Type': 'application/json',
  };

  try {
    const response = await axios.post('https://api.vapi.ai/call', payload, { headers });

    if (response.data && response.data.phoneCallProviderDetails) {
      return response.data.phoneCallProviderDetails.twiml;
    } else {
      throw new Error('Invalid response structure from Vapi');
    }
  } catch (error) {
    console.error(`Error initiating Vapi call for ${callSid}:`, error.message);
    return null;
  }
}

// Webhook for call completion - triggers immediate queue processing
app.post('/call-ended', async (req, res) => {
  try {
    // Handle Vapi end-of-call-report webhook
    const message = req.body.message;

    if (message && message.type === 'end-of-call-report') {
      const callId = message.call?.id;

      const newActiveCalls = await decrementActiveCalls();
      console.log(`Vapi call ${callId} ended. Active calls: ${newActiveCalls}/${MAX_CONCURRENCY}`);

      // Immediately process queue when capacity becomes available
      setImmediate(() => processQueue());
    }

    res.status(200).send('OK');
  } catch (error) {
    console.error('Error handling Vapi webhook:', error);
    res.status(500).send('Error');
  }
});

// Manual queue processing endpoint (for testing/monitoring)
app.post('/process-queue', async (req, res) => {
  try {
    await processQueue();
    const activeCalls = await getActiveCalls();
    const callsInQueue = await getCallsInQueue();

    res.json({
      message: 'Queue processing triggered',
      activeCalls,
      callsInQueue,
      maxConcurrency: MAX_CONCURRENCY
    });
  } catch (error) {
    console.error('Error in manual queue processing:', error);
    res.status(500).json({ error: 'Failed to process queue' });
  }
});

// Health check endpoint
app.get('/health', async (req, res) => {
  try {
    const activeCalls = await getActiveCalls();
    const callsInQueue = await getCallsInQueue();

    res.json({
      status: 'healthy',
      activeCalls,
      callsInQueue,
      maxConcurrency: MAX_CONCURRENCY,
      availableCapacity: MAX_CONCURRENCY - activeCalls,
      redis: redisClient.isOpen ? 'connected' : 'disconnected'
    });
  } catch (error) {
    console.error('Error in health check:', error);
    res.status(500).json({
      status: 'error',
      error: error.message,
      redis: redisClient.isOpen ? 'connected' : 'disconnected'
    });
  }
});

// Graceful shutdown
process.on('SIGINT', async () => {
  console.log('Shutting down gracefully...');
  await redisClient.quit();
  process.exit(0);
});

process.on('SIGTERM', async () => {
  console.log('Shutting down gracefully...');
  await redisClient.quit();
  process.exit(0);
});

// Start server
async function startServer() {
  await initializeRedis();

  const PORT = process.env.PORT || 3000;
  app.listen(PORT, () => {
    console.log(`Queue management server running on port ${PORT}`);
    console.log(`Max concurrency: ${MAX_CONCURRENCY}`);
    console.log('Using callback-driven queue processing (no timers)');
  });
}

startServer().catch(console.error);

module.exports = app;

Step 6: Configure Vapi Webhooks for Call Tracking

Configure your Vapi assistant to send end-of-call-report webhooks for accurate concurrency tracking.

Assistant Configuration: You need to configure your assistant with proper webhook settings to receive call status updates.

assistant-configuration.js

const assistantConfig = {
  name: "Queue Management Assistant",
  // ... other assistant configuration

  // Configure server URL for webhooks - point it at the
  // /call-ended endpoint defined in server.js
  server: {
    url: "https://your-server.com/call-ended",
    timeoutSeconds: 20
  },

  // Configure which messages to send to your server
  serverMessages: ["end-of-call-report", "status-update"]
};

The webhook will be sent to your server URL with the message type end-of-call-report when calls end. This allows you to decrement your active call counter accurately. See the Assistant API reference for all available server message types.

Webhook Payload Example: Your /call-ended endpoint will receive a webhook with this structure:

end-of-call-report-payload.json

{
  "message": {
    "type": "end-of-call-report",
    "call": {
      "id": "73a6da0f-c455-4bb6-bf4a-5f0634871430",
      "status": "ended",
      "endedReason": "assistant-ended-call"
    }
  }
}

Step 7: Test the Queue System

Deploy your server and test the complete queue management flow.

Start Your Server:

node server.js

Test Scenarios:

  1. Single call: Call your Twilio number - should connect immediately
  2. Multiple calls: Make several simultaneous calls to test queuing
  3. Capacity limit: Make more calls than your MAX_CONCURRENCY setting
  4. Queue processing: Check that calls are processed as others end

Monitor Queue Status:

# Check server health and capacity
curl https://your-server.com/health

# Manually trigger queue processing
curl -X POST https://your-server.com/process-queue
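For the multiple-calls scenario, you can generate simultaneous test calls through the Twilio REST API rather than dialing by hand. A sketch in which the from/to numbers and the TwiML URL are placeholders, and the client is injected so the helper can be dry-run against a stub:

```javascript
// Places `count` simultaneous outbound calls at the queue-fronted number.
// The url should serve TwiML that keeps the test caller on the line
// (e.g. a long <Pause>); the one below is a placeholder.
async function placeTestCalls(client, { from, to, count }) {
  const calls = await Promise.all(
    Array.from({ length: count }, () =>
      client.calls.create({
        from, // a Twilio number you own, E.164 format
        to,   // your queue-fronted Twilio number
        url: 'https://your-server.com/test-caller-twiml',
      })
    )
  );
  return calls.map((call) => call.sid);
}
```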

Callback-Driven Queue Processing

The system uses event-driven queue processing that responds immediately to capacity changes, eliminating the need for timers and preventing memory leaks:

How It Works

  • Event-driven: Queue processing is triggered by actual events (call start, call end)
  • Redis persistence: Call counters are stored in Redis, surviving server restarts and serverless deployments
  • Immediate processing: Uses setImmediate() to process queue as soon as capacity becomes available
  • No timers: Eliminates memory leak risks from long-running intervals
  • Recursive processing: Automatically processes multiple queued calls when capacity allows

Key Improvements

Instant Response

Queue processing happens immediately when calls end or arrive

Serverless Ready

Redis persistence works across serverless function invocations

Memory Safe

No timers means no memory leaks from long-running processes

Production Resilient

Counters survive server restarts and deployments

Architecture Benefits

  • Event-driven triggers: Processing occurs on actual state changes, not arbitrary intervals
  • Persistent state: Redis ensures counters are never lost, even in serverless environments
  • Efficient resource usage: No CPU cycles wasted on empty queue checks
  • Immediate capacity utilization: New calls are processed instantly when space becomes available
  • Graceful degradation: Redis connection failures are handled with proper error logging

Processing Triggers

Queue processing is automatically triggered when:

  1. New call arrives → setImmediate(() => processQueue()) after adding to queue
  2. Call ends → setImmediate(() => processQueue()) after decrementing active count
  3. Successful processing → Recursively processes more calls if capacity and queue allow

Redis is required for this implementation. Ensure your Redis instance is properly configured and accessible from your deployment environment.

Troubleshooting

Redis connection errors

Common causes:

  • Redis server not running or unreachable
  • Incorrect REDIS_URL configuration
  • Network connectivity issues in production

Solutions:

  • Test Redis connection: redis-cli ping (should return PONG)
  • Verify REDIS_URL format matches your provider
  • Check firewall rules and security groups
  • Monitor Redis logs for authentication errors

Health check endpoint shows Redis status:

curl https://your-server.com/health
# Check "redis" field in response

Calls not being processed from the queue

Common causes:

  • Server not receiving call-ended webhooks (check webhook URLs)
  • Redis counter desync (rare, but possible)
  • Vapi API errors (check API key and assistant ID)

Solutions:

  • Verify webhook URLs are publicly accessible
  • Check Redis counters: redis-cli get vapi:queue:active_calls
  • Reset counters manually if needed: redis-cli set vapi:queue:active_calls 0
  • Test Vapi API calls independently

Debug Redis state:

# Check current counter values
redis-cli mget vapi:queue:active_calls vapi:queue:calls_in_queue

Queue not draining despite available capacity

Check these items:

  • MAX_CONCURRENCY setting is appropriate for your Vapi plan
  • Redis counters are accurate (compare with actual Twilio queue)
  • No errors in Vapi TwiML generation

Debug steps:

  • Call /process-queue endpoint manually
  • Check /health endpoint for current capacity and Redis status
  • Review server logs for Redis connection errors
  • Verify queue processing triggers are firing

Serverless-specific considerations:

  • Use connection pooling for Redis (Upstash recommended)
  • Cold starts may cause initial Redis connection delays
  • Function timeout limits may interrupt long-running operations

Solutions:

  • Configure appropriate function timeout (30+ seconds)
  • Use Redis providers optimized for serverless (Upstash)
  • Implement connection retry logic
  • Monitor function execution logs for timeout errors

Calls failing to connect to Vapi

Potential issues:

  • Invalid phone number format (use E.164 format)
  • Incorrect Vapi configuration (phone number ID, assistant ID)
  • Network timeouts during TwiML generation
  • Redis operations timing out

Solutions:

  • Validate all phone numbers before processing
  • Add timeout handling to API calls and Redis operations
  • Implement retry logic for failed Vapi requests
  • Monitor Redis response times

Production considerations:

  • Redis connection pooling for high-traffic scenarios
  • Monitor Redis memory usage and eviction policies
  • Consider Redis clustering for extreme scale
  • Implement circuit breakers for external API calls
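Several items above call for retry logic around external calls. A minimal exponential-backoff sketch that could wrap the Vapi request in initiateVapiCall; the attempt counts and delays are illustrative defaults, not Vapi guidance:

```javascript
// Retries an async operation with exponential backoff: baseDelayMs,
// then 2x, 4x, ... between attempts. Throws the last error on exhaustion.
async function withRetry(fn, { attempts = 3, baseDelayMs = 200 } = {}) {
  let lastError;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt < attempts - 1) {
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
      }
    }
  }
  throw lastError;
}

// Usage sketch inside initiateVapiCall:
// const response = await withRetry(() => axios.post('https://api.vapi.ai/call', payload, { headers }));
```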

Monitoring recommendations:

  • Track Redis connection health
  • Monitor queue processing latency
  • Alert on Redis counter anomalies
  • Log all state transitions for debugging

Next steps

Now that you have a production-ready call queue system with Redis persistence and callback-driven processing:

Consider implementing health checks, metrics collection, and alerting around your Redis counters and queue processing latency for production monitoring.