Call queue management with Twilio

Overview

When your application receives more simultaneous calls than your Vapi concurrency limit allows, calls can be rejected. A call queue system using Twilio queues solves this by holding excess calls in a queue and processing them as capacity becomes available.

In this guide, you’ll learn to:

Set up Twilio call queues for high-volume scenarios
Implement concurrency tracking to respect Vapi limits
Build a queue processing system with JavaScript
Handle call dequeuing and Vapi integration seamlessly

This approach is ideal for call centers, customer support lines, or any application expecting call volumes that exceed your Vapi concurrency limit.

Prerequisites

Before implementing call queue management, ensure you have:

Vapi Account: Access to the Vapi Dashboard with your API key
Twilio Account: Active Twilio account with Account SID and Auth Token
Twilio CLI: Install from twil.io/cli for queue management
Phone Number: Twilio phone number configured for incoming calls
Assistant: Configured Vapi assistant ID for handling calls
Server Environment: Node.js server capable of receiving webhooks
Redis Instance: Redis server for persistent state management (local, cloud, or serverless-compatible)

You’ll need to know your Vapi account’s concurrency limit. Check your plan details in the Vapi Dashboard under billing settings.

For production deployments, especially in serverless environments, Redis ensures your call counters persist across server restarts and function invocations.

How it works

The queue management system operates in three phases:

Queue Incoming

Incoming calls are automatically placed in a Twilio queue when received

Track Capacity

Server monitors active Vapi calls against your concurrency limit

Process Queue

When capacity is available, calls are dequeued and connected to Vapi

Call Flow:

Incoming call → Twilio receives call and executes webhook
Queue placement → Call is placed in Twilio queue with hold music
Automatic processing → Server processes queue immediately when capacity changes
Capacity check → Server verifies if Vapi concurrency limit allows new calls using Redis
Dequeue & connect → Available calls are dequeued and connected to Vapi assistants
Persistent tracking → Redis tracks active calls across server restarts and serverless invocations
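
The capacity check in the flow above reduces to a small calculation. The following is a minimal sketch, assuming a hypothetical helper named slotsToFill that is not part of the server code in this guide:

```javascript
// Hypothetical helper: how many queued calls can be connected right now.
function slotsToFill(activeCalls, callsInQueue, maxConcurrency) {
  // Free slots never go below zero, even if the counter overshoots the limit.
  const freeSlots = Math.max(0, maxConcurrency - activeCalls);
  return Math.min(freeSlots, Math.max(0, callsInQueue));
}

console.log(slotsToFill(3, 4, 5)); // 2 free slots, 4 waiting -> 2
console.log(slotsToFill(5, 4, 5)); // at the limit -> 0
console.log(slotsToFill(1, 0, 5)); // nothing waiting -> 0
```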

Implementation Guide

Create Twilio Queue

First, create a Twilio queue using the Twilio CLI to hold incoming calls.

$ twilio api:core:queues:create \
>    --friendly-name customer-support

Expected Response:

{
  "account_sid": "ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "average_wait_time": 0,
  "current_size": 0,
  "date_created": "2024-01-15T18:39:09.000Z",
  "date_updated": "2024-01-15T18:39:09.000Z",
  "friendly_name": "customer-support",
  "max_size": 100,
  "sid": "QUaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "uri": "/2010-04-01/Accounts/ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/Queues/QUaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa.json"
}

Save the queue SID (for example, QUaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa). You’ll need it for queue operations; the server reads it from the TWILIO_QUEUE_SID environment variable.
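
If you would rather create the queue from Node.js than from the CLI, the Twilio helper library exposes the same resource. This is a sketch only: ensureQueue is a hypothetical helper name, and client is assumed to be a Twilio REST client created with your Account SID and Auth Token.

```javascript
// Sketch: find a queue by friendly name, or create it, and return its SID.
// `client` is assumed to be a Twilio REST client, for example:
//   const client = require('twilio')(accountSid, authToken);
async function ensureQueue(client, friendlyName) {
  const queues = await client.queues.list();
  const existing = queues.find((queue) => queue.friendlyName === friendlyName);
  if (existing) {
    return existing.sid;
  }
  const created = await client.queues.create({ friendlyName });
  return created.sid;
}

// Example usage (requires real credentials):
//   ensureQueue(client, 'customer-support').then((sid) => console.log(sid));
```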

Configure Phone Number Webhook

Configure your Twilio phone number to send incoming calls to your queue endpoint.

Go to Twilio Console > Phone Numbers
Select your phone number
Set the "A call comes in" webhook to: https://your-server.com/incoming
Set HTTP method to POST
Save configuration
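
The same setting can be applied through the Twilio API instead of the Console. The sketch below assumes client is a Twilio REST client and that you know the phone number's SID (it starts with "PN"); setIncomingWebhook is a hypothetical helper name.

```javascript
// Sketch: point a Twilio number's "A call comes in" webhook at the queue endpoint.
// `phoneNumberSid` is the number's resource SID, not the phone number itself.
async function setIncomingWebhook(client, phoneNumberSid, voiceUrl) {
  const number = await client.incomingPhoneNumbers(phoneNumberSid).update({
    voiceUrl,
    voiceMethod: 'POST',
  });
  return number.voiceUrl;
}

// Example usage (requires real credentials):
//   setIncomingWebhook(client, 'PNxxxxxxxx', 'https://your-server.com/incoming');
```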

Set up Redis for Persistent State

Configure Redis for persistent call counter storage. Choose the option that best fits your deployment: local development, a cloud Redis service, or a serverless provider such as Upstash. The commands below cover local development.

Install Redis locally:

# macOS (using Homebrew)
brew install redis
brew services start redis

# Ubuntu/Debian
sudo apt update
sudo apt install redis-server
sudo systemctl start redis-server

# Docker
docker run -d -p 6379:6379 redis:alpine

Test connection:

$ redis-cli ping
# Should return: PONG
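
The same check can be made from Node.js once the redis package is installed. A minimal sketch, assuming client is an already connected node-redis client; redisIsReachable is a hypothetical helper name:

```javascript
// Sketch: returns true when Redis answers PING with PONG, false otherwise.
// `client` is assumed to be a connected client from the `redis` package.
async function redisIsReachable(client) {
  try {
    const reply = await client.ping();
    return reply === 'PONG';
  } catch (error) {
    return false;
  }
}
```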

Set up Server Environment

Create your Node.js server with the required dependencies and environment variables.

Install Dependencies:

$ npm install express twilio axios dotenv redis

Environment Variables (.env):

# Vapi Configuration
VAPI_API_KEY=your_vapi_api_key_here
VAPI_PHONE_NUMBER_ID=your_phone_number_id
VAPI_ASSISTANT_ID=your_assistant_id

# Twilio Configuration
TWILIO_ACCOUNT_SID=your_twilio_account_sid
TWILIO_AUTH_TOKEN=your_twilio_auth_token
TWILIO_QUEUE_SID=QUaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

# Redis Configuration (for persistent state)
REDIS_URL=redis://localhost:6379
# For Redis Cloud: REDIS_URL=rediss://username:password@host:port
# For Upstash (serverless): REDIS_URL=rediss://default:password@host:port

# Server Configuration
PORT=3000
MAX_CONCURRENCY=5
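
A missing variable tends to surface later as a confusing API error, so it can help to validate the environment at startup. The following is a sketch, not part of the server in this guide; loadConfig and REQUIRED_VARS are hypothetical names, and the defaults mirror the values shown above.

```javascript
// Hypothetical startup check for the variables listed in the .env file.
const REQUIRED_VARS = [
  'VAPI_API_KEY',
  'VAPI_PHONE_NUMBER_ID',
  'VAPI_ASSISTANT_ID',
  'TWILIO_ACCOUNT_SID',
  'TWILIO_AUTH_TOKEN',
  'TWILIO_QUEUE_SID',
];

function loadConfig(env) {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }

  // Reject values such as "abc" or "0" instead of silently falling back.
  const maxConcurrency = Number.parseInt(env.MAX_CONCURRENCY ?? '5', 10);
  if (!Number.isInteger(maxConcurrency) || maxConcurrency < 1) {
    throw new Error('MAX_CONCURRENCY must be a positive integer');
  }

  return {
    maxConcurrency,
    port: Number.parseInt(env.PORT ?? '3000', 10),
    redisUrl: env.REDIS_URL ?? 'redis://localhost:6379',
  };
}

// Example usage at startup: const config = loadConfig(process.env);
```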

Implement Queue Management Server

Create the main server file with queue handling, concurrency tracking, and Vapi integration.

server.js

const express = require('express');
const twilio = require('twilio');
const axios = require('axios');
const redis = require('redis');
require('dotenv').config();

const app = express();
const twilioClient = twilio(process.env.TWILIO_ACCOUNT_SID, process.env.TWILIO_AUTH_TOKEN);

// Redis client for persistent state management
const redisClient = redis.createClient({
  url: process.env.REDIS_URL || 'redis://localhost:6379'
});

const MAX_CONCURRENCY = parseInt(process.env.MAX_CONCURRENCY) || 5;
const REDIS_KEYS = {
  ACTIVE_CALLS: 'vapi:queue:active_calls',
  CALLS_IN_QUEUE: 'vapi:queue:calls_in_queue'
};

// Middleware
app.use(express.json());
app.use(express.urlencoded({ extended: true }));

// Initialize Redis connection
async function initializeRedis() {
  try {
    await redisClient.connect();
    console.log('Connected to Redis');

    // Initialize counters if they don't exist
    const activeCalls = await redisClient.get(REDIS_KEYS.ACTIVE_CALLS);
    const callsInQueue = await redisClient.get(REDIS_KEYS.CALLS_IN_QUEUE);

    if (activeCalls === null) {
      await redisClient.set(REDIS_KEYS.ACTIVE_CALLS, '0');
    }
    if (callsInQueue === null) {
      await redisClient.set(REDIS_KEYS.CALLS_IN_QUEUE, '0');
    }
  } catch (error) {
    console.error('Redis connection failed:', error);
    process.exit(1);
  }
}

// Helper functions for Redis operations
async function getActiveCalls() {
  const count = await redisClient.get(REDIS_KEYS.ACTIVE_CALLS);
  return parseInt(count, 10) || 0;
}

async function getCallsInQueue() {
  const count = await redisClient.get(REDIS_KEYS.CALLS_IN_QUEUE);
  return parseInt(count, 10) || 0;
}

// Decrement a counter without leaving it below zero.
// DECR itself is atomic, so two concurrent callers cannot both act on the
// same stale read; a negative result is clamped back to zero.
async function decrementCounter(key) {
  const value = await redisClient.decr(key);
  if (value < 0) {
    await redisClient.set(key, '0');
    return 0;
  }
  return value;
}

async function incrementActiveCalls() {
  return await redisClient.incr(REDIS_KEYS.ACTIVE_CALLS);
}

async function decrementActiveCalls() {
  return await decrementCounter(REDIS_KEYS.ACTIVE_CALLS);
}

async function incrementCallsInQueue() {
  return await redisClient.incr(REDIS_KEYS.CALLS_IN_QUEUE);
}

async function decrementCallsInQueue() {
  return await decrementCounter(REDIS_KEYS.CALLS_IN_QUEUE);
}

// Reset the queue counter when Twilio reports an empty queue
async function syncCallsInQueue() {
  await redisClient.set(REDIS_KEYS.CALLS_IN_QUEUE, '0');
}

// Incoming call handler - adds calls to queue
app.post('/incoming', async (req, res) => {
  try {
    const twiml = `<?xml version="1.0" encoding="UTF-8"?>
      <Response>
        <Enqueue>customer-support</Enqueue>
      </Response>`;

    res.set('Content-Type', 'application/xml');
    res.send(twiml);

    // Increment queue counter in Redis
    const queueCount = await incrementCallsInQueue();
    console.log(`Call ${req.body.CallSid} added to queue. Calls in queue: ${queueCount}`);

    // Immediately check if we can process this call
    setImmediate(() => processQueue());

  } catch (error) {
    console.error('Error handling incoming call:', error);
    // The TwiML may already have been sent; only reply if it has not
    if (!res.headersSent) {
      res.status(500).send('Error processing call');
    }
  }
});

async function processQueue() {
  try {
    const activeCalls = await getActiveCalls();
    const callsInQueue = await getCallsInQueue();

    // Check if we have capacity for more calls
    if (activeCalls >= MAX_CONCURRENCY) {
      return;
    }

    // Check if there are calls in queue
    if (callsInQueue === 0) {
      return;
    }

    // Get next call from queue
    const members = await twilioClient.queues(process.env.TWILIO_QUEUE_SID)
      .members
      .list({ limit: 1 });

    if (members.length === 0) {
      // No calls in queue - sync our counter
      await syncCallsInQueue();
      return;
    }

    const member = members[0];
    console.log(`Processing queued call: ${member.callSid}`);

    // Queue members carry no caller number, so read it from the call itself
    const call = await twilioClient.calls(member.callSid).fetch();

    // Get Vapi TwiML for this call
    const twiml = await initiateVapiCall(member.callSid, call.from);

    if (twiml) {
      // Update call with Vapi TwiML
      await twilioClient.calls(member.callSid).update({ twiml });

      // Update counters in Redis
      const newActiveCalls = await incrementActiveCalls();
      const newQueueCount = await decrementCallsInQueue();

      console.log(`Call connected to Vapi. Active calls: ${newActiveCalls}/${MAX_CONCURRENCY}, Queue: ${newQueueCount}`);

      // Check if we can process more calls immediately
      if (newActiveCalls < MAX_CONCURRENCY && newQueueCount > 0) {
        setImmediate(() => processQueue());
      }
    } else {
      console.error(`Failed to get TwiML for call ${member.callSid}`);
    }
  } catch (error) {
    console.error('Error processing queue:', error);
  }
}

// Generate Vapi TwiML for a call
async function initiateVapiCall(callSid, customerNumber) {
  const payload = {
    phoneNumberId: process.env.VAPI_PHONE_NUMBER_ID,
    phoneCallProviderBypassEnabled: true,
    customer: { number: customerNumber },
    assistantId: process.env.VAPI_ASSISTANT_ID,
  };

  const headers = {
    'Authorization': `Bearer ${process.env.VAPI_API_KEY}`,
    'Content-Type': 'application/json',
  };

  try {
    const response = await axios.post('https://api.vapi.ai/call', payload, { headers });

    if (response.data && response.data.phoneCallProviderDetails) {
      return response.data.phoneCallProviderDetails.twiml;
    } else {
      throw new Error('Invalid response structure from Vapi');
    }
  } catch (error) {
    console.error(`Error initiating Vapi call for ${callSid}:`, error.message);
    return null;
  }
}

// Webhook for call completion - triggers immediate queue processing
app.post('/call-ended', async (req, res) => {
  try {
    // Handle Vapi end-of-call-report webhook
    const message = req.body.message;

    if (message && message.type === 'end-of-call-report') {
      const callId = message.call?.id;

      const newActiveCalls = await decrementActiveCalls();
      console.log(`Vapi call ${callId} ended. Active calls: ${newActiveCalls}/${MAX_CONCURRENCY}`);

      // Immediately process queue when capacity becomes available
      setImmediate(() => processQueue());
    }

    res.status(200).send('OK');
  } catch (error) {
    console.error('Error handling Vapi webhook:', error);
    res.status(500).send('Error');
  }
});

// Manual queue processing endpoint (for testing/monitoring)
app.post('/process-queue', async (req, res) => {
  try {
    await processQueue();
    const activeCalls = await getActiveCalls();
    const callsInQueue = await getCallsInQueue();

    res.json({
      message: 'Queue processing triggered',
      activeCalls,
      callsInQueue,
      maxConcurrency: MAX_CONCURRENCY
    });
  } catch (error) {
    console.error('Error in manual queue processing:', error);
    res.status(500).json({ error: 'Failed to process queue' });
  }
});

// Health check endpoint
app.get('/health', async (req, res) => {
  try {
    const activeCalls = await getActiveCalls();
    const callsInQueue = await getCallsInQueue();

    res.json({
      status: 'healthy',
      activeCalls,
      callsInQueue,
      maxConcurrency: MAX_CONCURRENCY,
      availableCapacity: MAX_CONCURRENCY - activeCalls,
      redis: redisClient.isOpen ? 'connected' : 'disconnected'
    });
  } catch (error) {
    console.error('Error in health check:', error);
    res.status(500).json({
      status: 'error',
      error: error.message,
      redis: redisClient.isOpen ? 'connected' : 'disconnected'
    });
  }
});

// Graceful shutdown: close the Redis connection before exiting
async function shutdown() {
  console.log('Shutting down gracefully...');
  await redisClient.quit();
  process.exit(0);
}

process.on('SIGINT', shutdown);
process.on('SIGTERM', shutdown);

// Start server
async function startServer() {
  await initializeRedis();

  const PORT = process.env.PORT || 3000;
  app.listen(PORT, () => {
    console.log(`Queue management server running on port ${PORT}`);
    console.log(`Max concurrency: ${MAX_CONCURRENCY}`);
    console.log('Using callback-driven queue processing (no timers)');
  });
}

startServer().catch(console.error);

module.exports = app;
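
One thing to be aware of in the server above: processQueue reads the active call count first and increments it only after the call is connected, so two triggers that run at the same moment could both pass the capacity check. A possible alternative, shown here as a sketch and not as the guide's implementation, is to reserve a slot with INCR before doing any work and release it if the limit was exceeded. The helper name tryReserveSlot is hypothetical; client is assumed to expose incr and decr as the redis package does.

```javascript
// Sketch: reserve a concurrency slot before connecting a call.
// INCR is atomic in Redis, so each caller sees a distinct counter value.
async function tryReserveSlot(client, key, maxConcurrency) {
  const count = await client.incr(key);
  if (count > maxConcurrency) {
    // Over the limit: give the slot back and report that none was available.
    await client.decr(key);
    return false;
  }
  return true;
}

// Example usage inside a queue processor:
//   if (!(await tryReserveSlot(redisClient, 'vapi:queue:active_calls', 5))) return;
```

With this approach the counter is already incremented when the Vapi request is made, so a failed request should release the slot with a matching decrement.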

Configure Vapi Webhooks for Call Tracking

Configure your Vapi assistant to send end-of-call-report webhooks for accurate concurrency tracking.

Assistant Configuration: You need to configure your assistant with proper webhook settings to receive call status updates.

assistant-configuration.js

const assistantConfig = {
  name: "Queue Management Assistant",
  // ... other assistant configuration

  // Configure server URL for webhooks.
  // The path must match the route that handles them in server.js (/call-ended).
  server: {
    url: "https://your-server.com/call-ended",
    timeoutSeconds: 20
  },

  // Configure which messages to send to your server
  serverMessages: ["end-of-call-report", "status-update"]
};

The webhook will be sent to your server URL with the message type end-of-call-report when calls end. This allows you to decrement your active call counter accurately. See the Assistant API reference for all available server message types.

Webhook Payload Example: Your /call-ended endpoint will receive a webhook with this structure:

end-of-call-report-payload.json

{
  "message": {
    "type": "end-of-call-report",
    "call": {
      "id": "73a6da0f-c455-4bb6-bf4a-5f0634871430",
      "status": "ended",
      "endedReason": "assistant-ended-call"
    }
  }
}
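
Because the assistant may also send other message types (such as status-update), the handler should act only on end-of-call-report, and ideally only once per call. The sketch below illustrates both checks under stated assumptions: endedCallId and isFirstReport are hypothetical helpers, and the duplicate filter is an in-memory Set, which would not be shared across multiple server instances or serverless invocations.

```javascript
// Returns the call ID for an end-of-call-report, or null for any other body.
function endedCallId(body) {
  const message = body && body.message;
  if (!message || message.type !== 'end-of-call-report') {
    return null;
  }
  return (message.call && message.call.id) || null;
}

// In-memory duplicate filter (assumption: a single long-running process).
const reportedCalls = new Set();

function isFirstReport(callId) {
  if (reportedCalls.has(callId)) {
    return false;
  }
  reportedCalls.add(callId);
  return true;
}

// Example usage in a handler:
//   const callId = endedCallId(req.body);
//   if (callId && isFirstReport(callId)) { /* decrement the active counter */ }
```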

Test the Queue System

Deploy your server and test the complete queue management flow.

Start Your Server:

$ node server.js

Test Scenarios:

Single call: Call your Twilio number - should connect immediately
Multiple calls: Make several simultaneous calls to test queuing
Capacity limit: Make more calls than your MAX_CONCURRENCY setting
Queue processing: Check that calls are processed as others end

Monitor Queue Status:

# Check server health and capacity
curl https://your-server.com/health

# Manually trigger queue processing
curl -X POST https://your-server.com/process-queue

Callback-Driven Queue Processing

The system uses event-driven queue processing that responds immediately to capacity changes, so it does not need polling timers:

How It Works

Event-driven: Queue processing is triggered by actual events (call start, call end)
Redis persistence: Call counters are stored in Redis, surviving server restarts and serverless deployments
Immediate processing: Uses setImmediate() to process queue as soon as capacity becomes available
No timers: There is no polling interval to start, stop, or clean up
Recursive processing: Automatically processes multiple queued calls when capacity allows

Key Improvements

Instant Response

Queue processing happens immediately when calls end or arrive

Serverless Ready

Redis persistence works across serverless function invocations

Memory Safe

No polling intervals to keep alive or clean up in long-running processes

Production Resilient

Counters survive server restarts and deployments

Architecture Benefits

Event-driven triggers: Processing occurs on actual state changes, not arbitrary intervals
Persistent state: Redis ensures counters are never lost, even in serverless environments
Efficient resource usage: No CPU cycles wasted on empty queue checks
Immediate capacity utilization: New calls are processed instantly when space becomes available
Error handling: Redis failures are logged, and a failed connection at startup stops the server rather than letting it run without tracked state

Processing Triggers

Queue processing is automatically triggered when:

New call arrives → setImmediate(() => processQueue()) after adding to queue
Call ends → setImmediate(() => processQueue()) after decrementing active count
Successful processing → Recursively processes more calls if capacity and queue allow
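
Since these triggers can fire at nearly the same time, it may be worth making sure only one queue pass runs at once within a process. The following sketch shows one way to coalesce overlapping triggers; createSingleFlight is a hypothetical helper, and it only serializes work inside a single Node.js process, not across multiple instances.

```javascript
// Sketch: wrap a task so overlapping triggers never run it concurrently.
// A trigger that arrives mid-run schedules exactly one follow-up run.
function createSingleFlight(task) {
  let running = false;
  let rerun = false;

  return async function trigger() {
    if (running) {
      rerun = true; // remember that something changed while we were busy
      return;
    }
    running = true;
    try {
      do {
        rerun = false;
        await task();
      } while (rerun);
    } finally {
      running = false;
    }
  };
}

// Example usage:
//   const triggerQueue = createSingleFlight(processQueue);
//   setImmediate(() => triggerQueue());
```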

Redis is required for this implementation. Ensure your Redis instance is properly configured and accessible from your deployment environment.

Troubleshooting

Redis connection issues

Common causes:

Redis server not running or unreachable
Incorrect REDIS_URL configuration
Network connectivity issues in production

Solutions:

Test Redis connection: redis-cli ping (should return PONG)
Verify REDIS_URL format matches your provider
Check firewall rules and security groups
Monitor Redis logs for authentication errors

Health check endpoint shows Redis status:

$ curl https://your-server.com/health
# Check "redis" field in response

Calls not being dequeued

Common causes:

Server not receiving call-ended webhooks (check webhook URLs)
Redis counter desync (rare, but possible)
Vapi API errors (check API key and assistant ID)

Solutions:

Verify webhook URLs are publicly accessible
Check Redis counters: redis-cli get vapi:queue:active_calls
Reset counters manually if needed: redis-cli set vapi:queue:active_calls 0
Test Vapi API calls independently

Debug Redis state:

# Check current counter values
redis-cli mget vapi:queue:active_calls vapi:queue:calls_in_queue

Queue filling up but not processing

Check these items:

MAX_CONCURRENCY setting is appropriate for your Vapi plan
Redis counters are accurate (compare with actual Twilio queue)
No errors in Vapi TwiML generation

Debug steps:

Call /process-queue endpoint manually
Check /health endpoint for current capacity and Redis status
Review server logs for Redis connection errors
Verify queue processing triggers are firing
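
To compare the Redis counter with the actual Twilio queue, you can read the queue's current size from Twilio and write it back to Redis. This is a sketch under assumptions: reconcileQueueCounter is a hypothetical helper, twilioClient is a Twilio REST client, and redisClient is a connected client from the redis package.

```javascript
// Sketch: overwrite the queue counter in Redis with Twilio's own count.
async function reconcileQueueCounter(twilioClient, redisClient, queueSid, key) {
  const queue = await twilioClient.queues(queueSid).fetch();
  await redisClient.set(key, String(queue.currentSize));
  return queue.currentSize;
}

// Example usage:
//   reconcileQueueCounter(twilioClient, redisClient,
//     process.env.TWILIO_QUEUE_SID, 'vapi:queue:calls_in_queue');
```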

Serverless deployment issues

Serverless-specific considerations:

Use connection pooling for Redis (Upstash recommended)
Cold starts may cause initial Redis connection delays
Function timeout limits may interrupt long-running operations

Solutions:

Configure appropriate function timeout (30+ seconds)
Use Redis providers optimized for serverless (Upstash)
Implement connection retry logic
Monitor function execution logs for timeout errors
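
Connection retry logic can be as simple as a wrapper with exponential backoff. A minimal sketch, assuming a hypothetical helper named withRetry; the retry count and delays are illustrative values, not recommendations from Vapi or Twilio.

```javascript
// Sketch: retry an async operation with exponential backoff.
async function withRetry(operation, { retries = 3, baseDelayMs = 100 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt === retries) {
        break; // out of attempts
      }
      // Wait 100 ms, 200 ms, 400 ms, ... before the next attempt.
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}

// Example usage: await withRetry(() => redisClient.connect());
```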

Calls dropping or hanging up

Potential issues:

Invalid phone number format (use E.164 format)
Incorrect Vapi configuration (phone number ID, assistant ID)
Network timeouts during TwiML generation
Redis operations timing out

Solutions:

Validate all phone numbers before processing
Add timeout handling to API calls and Redis operations
Implement retry logic for failed Vapi requests
Monitor Redis response times
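
The first two solutions can be sketched as small helpers. Both names below (isE164 and withTimeout) are hypothetical, and the E.164 check only validates the general shape of a number (a plus sign followed by up to 15 digits, not starting with zero), not whether the number is actually assigned.

```javascript
// Shape check for E.164 numbers such as +14155552671.
const E164_PATTERN = /^\+[1-9]\d{1,14}$/;

function isE164(number) {
  return typeof number === 'string' && E164_PATTERN.test(number);
}

// Reject if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms, label = 'operation') {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });
  // Clear the timer either way so it does not keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Example usage: await withTimeout(redisClient.get('some:key'), 2000, 'Redis GET');
```

For the Vapi request itself, axios also accepts a timeout option in milliseconds in its request config, which may be simpler than wrapping the call.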

Performance optimization

Production considerations:

Redis connection pooling for high-traffic scenarios
Monitor Redis memory usage and eviction policies
Consider Redis clustering for extreme scale
Implement circuit breakers for external API calls

Monitoring recommendations:

Track Redis connection health
Monitor queue processing latency
Alert on Redis counter anomalies
Log all state transitions for debugging
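
As an illustration of the circuit breaker idea, the sketch below stops calling a failing dependency for a cool-down period after repeated errors. It is a simplified example with hypothetical names and thresholds; after the cool-down it lets calls through again without limiting how many trial calls run at once, which a production breaker would usually restrict.

```javascript
// Sketch: minimal circuit breaker around an async operation.
function createCircuitBreaker(operation, options = {}) {
  const { failureThreshold = 3, resetAfterMs = 10000, now = Date.now } = options;
  let failures = 0;
  let openedAt = null;

  return async function guarded(...args) {
    if (openedAt !== null) {
      if (now() - openedAt < resetAfterMs) {
        throw new Error('Circuit open: skipping call');
      }
      // Cool-down elapsed: allow a trial call. The failure count is kept,
      // so one more failure reopens the circuit immediately.
      openedAt = null;
    }
    try {
      const result = await operation(...args);
      failures = 0; // success closes the circuit fully
      return result;
    } catch (error) {
      failures += 1;
      if (failures >= failureThreshold) {
        openedAt = now();
      }
      throw error;
    }
  };
}

// Example usage:
//   const guardedVapiCall = createCircuitBreaker(initiateVapiCall);
```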

Next steps

Now that you have a production-ready call queue system with Redis persistence and callback-driven processing:

Advanced Call Features: Explore call recording, analysis, and advanced routing options
Monitoring & Analytics: Set up comprehensive call analytics and performance monitoring
Scaling Considerations: Learn about enterprise features for high-volume deployments
Assistant Optimization: Enhance your assistants with personalization and dynamic variables

Consider implementing health checks, metrics collection, and alerting around your Redis counters and queue processing latency for production monitoring.