Call queue management with Twilio
Handle high-volume calls with Twilio queues when hitting Vapi concurrency limits
Overview
When your application receives more simultaneous calls than your Vapi concurrency limit allows, calls can be rejected. A call queue system using Twilio queues solves this by holding excess calls in a queue and processing them as capacity becomes available.
In this guide, you’ll learn to:
- Set up Twilio call queues for high-volume scenarios
- Implement concurrency tracking to respect Vapi limits
- Build a queue processing system with JavaScript
- Handle call dequeuing and Vapi integration seamlessly
This approach is ideal for call centers, customer support lines, or any application expecting call volumes that exceed your Vapi concurrency limit.
Prerequisites
Before implementing call queue management, ensure you have:
- Vapi Account: Access to the Vapi Dashboard with your API key
- Twilio Account: Active Twilio account with Account SID and Auth Token
- Twilio CLI: Install from twil.io/cli for queue management
- Phone Number: Twilio phone number configured for incoming calls
- Assistant: Configured Vapi assistant ID for handling calls
- Server Environment: Node.js server capable of receiving webhooks
- Redis Instance: Redis server for persistent state management (local, cloud, or serverless-compatible)
You’ll need to know your Vapi account’s concurrency limit. Check your plan details in the Vapi Dashboard under billing settings.
For production deployments, especially in serverless environments, Redis ensures your call counters persist across server restarts and function invocations.
How it works
The queue management system operates in three phases:
1. Incoming calls are automatically placed in a Twilio queue when received
2. The server monitors active Vapi calls against your concurrency limit
3. When capacity is available, calls are dequeued and connected to Vapi
Call Flow:
- Incoming call → Twilio receives call and executes webhook
- Queue placement → Call is placed in Twilio queue with hold music
- Automatic processing → Server processes queue immediately when capacity changes
- Capacity check → Server verifies if Vapi concurrency limit allows new calls using Redis
- Dequeue & connect → Available calls are dequeued and connected to Vapi assistants
- Persistent tracking → Redis tracks active calls across server restarts and serverless invocations
Implementation Guide
Create Twilio Queue
First, create a Twilio queue using the Twilio CLI to hold incoming calls.
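A sketch of the CLI command (the friendly name vapi-call-queue is this guide's convention; any name works):

```shell
twilio api:core:queues:create --friendly-name "vapi-call-queue"
```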
Expected Response:
Save the queue SID (e.g., QUaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa) - you'll need it for queue operations.
Configure Phone Number Webhook
Configure your Twilio phone number to send incoming calls to your queue endpoint.
- Go to Twilio Console > Phone Numbers
- Select your phone number
- Set A call comes in webhook to: https://your-server.com/incoming
- Set HTTP method to POST
- Save configuration
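When the incoming-call webhook fires, the server responds with TwiML that places the caller in the queue. A minimal sketch (the queue name and /wait hold-music URL are this guide's conventions):

```xml
<Response>
  <Enqueue waitUrl="/wait">vapi-call-queue</Enqueue>
</Response>
```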
Set up Redis for Persistent State
Configure Redis for persistent call counter storage. Choose the option that best fits your deployment:
Local Development
Cloud Redis
Serverless (Upstash)
Install Redis locally:
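For example, on macOS or Debian/Ubuntu:

```shell
# macOS (Homebrew)
brew install redis && brew services start redis

# Debian/Ubuntu
sudo apt-get install redis-server && sudo systemctl enable --now redis-server
```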
Test connection:
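```shell
redis-cli ping
# should return: PONG
```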
Set up Server Environment
Create your Node.js server with the required dependencies and environment variables.
Install Dependencies:
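```shell
npm install express twilio redis dotenv
```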
Environment Variables (.env):
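An illustrative .env; the variable names here are this guide's conventions and should match whatever your server code reads:

```shell
TWILIO_ACCOUNT_SID=ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TWILIO_AUTH_TOKEN=your_auth_token
TWILIO_QUEUE_SID=QUxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
VAPI_API_KEY=your_vapi_api_key
VAPI_ASSISTANT_ID=your_assistant_id
VAPI_PHONE_NUMBER=+15551234567
REDIS_URL=redis://localhost:6379
MAX_CONCURRENCY=10
SERVER_URL=https://your-server.com
PORT=3000
```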
Implement Queue Management Server
Create the main server file with queue handling, concurrency tracking, and Vapi integration.
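A minimal sketch of server.js under the assumptions above. Endpoint paths (/incoming, /wait, /call-ended, /process-queue, /health), the Redis key vapi:queue:active_calls, and the environment variable names are this guide's conventions, not Twilio or Vapi APIs. The /connect handler assumes one common pattern - forwarding the dequeued call to a phone number attached to your Vapi assistant - so adapt it to however your assistant receives calls:

```javascript
// server.js - queue management sketch: Twilio enqueues callers,
// Redis tracks active Vapi calls, and processQueue() dequeues on capacity.
require('dotenv').config();
const express = require('express');
const twilio = require('twilio');
const { createClient } = require('redis');

const app = express();
app.use(express.json());
app.use(express.urlencoded({ extended: false }));

const redis = createClient({ url: process.env.REDIS_URL });
redis.on('error', (err) => console.error('Redis error:', err));

const twilioClient = twilio(process.env.TWILIO_ACCOUNT_SID, process.env.TWILIO_AUTH_TOKEN);
const ACTIVE_CALLS_KEY = 'vapi:queue:active_calls';
const MAX_CONCURRENCY = parseInt(process.env.MAX_CONCURRENCY || '10', 10);

// 1. Incoming calls are enqueued, then we immediately try to process the queue.
app.post('/incoming', (req, res) => {
  const twiml = new twilio.twiml.VoiceResponse();
  twiml.enqueue({ waitUrl: '/wait' }, 'vapi-call-queue');
  res.type('text/xml').send(twiml.toString());
  setImmediate(() => processQueue());
});

// 2. Hold music while callers wait in the queue.
app.post('/wait', (req, res) => {
  const twiml = new twilio.twiml.VoiceResponse();
  twiml.play('https://demo.twilio.com/docs/classic.mp3');
  res.type('text/xml').send(twiml.toString());
});

// 3. Vapi end-of-call-report webhook frees capacity.
app.post('/call-ended', async (req, res) => {
  if (req.body?.message?.type === 'end-of-call-report') {
    const active = parseInt((await redis.get(ACTIVE_CALLS_KEY)) || '0', 10);
    if (active > 0) await redis.decr(ACTIVE_CALLS_KEY);
    setImmediate(() => processQueue()); // capacity freed - process the queue
  }
  res.sendStatus(200);
});

// Dequeue the front call while capacity remains, recursing via setImmediate.
async function processQueue() {
  const active = parseInt((await redis.get(ACTIVE_CALLS_KEY)) || '0', 10);
  if (active >= MAX_CONCURRENCY) return;
  await redis.incr(ACTIVE_CALLS_KEY); // count before connecting
  try {
    // Updating the "Front" member dequeues it and runs the TwiML at `url`.
    await twilioClient
      .queues(process.env.TWILIO_QUEUE_SID)
      .members('Front')
      .update({ url: `${process.env.SERVER_URL}/connect`, method: 'POST' });
    setImmediate(() => processQueue()); // fill remaining capacity
  } catch (err) {
    await redis.decr(ACTIVE_CALLS_KEY); // empty queue or API error: roll back
  }
}

// Hand the dequeued call to the Vapi assistant (here: dial its phone number).
app.post('/connect', (req, res) => {
  const twiml = new twilio.twiml.VoiceResponse();
  twiml.dial(process.env.VAPI_PHONE_NUMBER);
  res.type('text/xml').send(twiml.toString());
});

// Manual trigger, useful for debugging.
app.post('/process-queue', (req, res) => {
  setImmediate(() => processQueue());
  res.sendStatus(202);
});

// Health check reports Redis connectivity and current capacity.
app.get('/health', async (req, res) => {
  const active = parseInt((await redis.get(ACTIVE_CALLS_KEY)) || '0', 10);
  res.json({ status: 'ok', redis: redis.isReady ? 'connected' : 'down',
             activeCalls: active, maxConcurrency: MAX_CONCURRENCY });
});

redis.connect().then(() => {
  app.listen(process.env.PORT || 3000, () => console.log('Queue server listening'));
});
```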
Configure Vapi Webhooks for Call Tracking
Configure your Vapi assistant to send end-of-call-report webhooks for accurate concurrency tracking.
Assistant Configuration: You need to configure your assistant with proper webhook settings to receive call status updates.
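One way to do this is a PATCH request to the Assistant API; this is a sketch, so verify the exact field names against the Assistant API reference:

```shell
curl -X PATCH "https://api.vapi.ai/assistant/$VAPI_ASSISTANT_ID" \
  -H "Authorization: Bearer $VAPI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "server": { "url": "https://your-server.com/call-ended" },
    "serverMessages": ["end-of-call-report"]
  }'
```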
The webhook will be sent to your server URL with the message type end-of-call-report when calls end, allowing you to decrement your active call counter accurately. See the Assistant API reference for all available server message types.
Webhook Payload Example:
Your /call-ended endpoint will receive a webhook with this structure:
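An illustrative, trimmed payload (field set varies; see the webhook reference for the full shape):

```json
{
  "message": {
    "type": "end-of-call-report",
    "endedReason": "customer-ended-call",
    "call": { "id": "your-call-id" }
  }
}
```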
Test the Queue System
Deploy your server and test the complete queue management flow.
Start Your Server:
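```shell
node server.js
# or, with automatic reloads during development:
npx nodemon server.js
```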
Test Scenarios:
- Single call: Call your Twilio number - should connect immediately
- Multiple calls: Make several simultaneous calls to test queuing
- Capacity limit: Make more calls than your MAX_CONCURRENCY setting
- Queue processing: Check that calls are processed as others end
Monitor Queue Status:
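For example, with the Twilio CLI and redis-cli (substitute your own queue SID; the Redis key name is this guide's convention):

```shell
# Current queue size and average wait time
twilio api:core:queues:fetch --sid QUxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

# Active call counter in Redis
redis-cli get vapi:queue:active_calls
```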
Callback-Driven Queue Processing
The system uses event-driven queue processing that responds immediately to capacity changes, eliminating the need for timers and preventing memory leaks:
How It Works
- Event-driven: Queue processing is triggered by actual events (call start, call end)
- Redis persistence: Call counters are stored in Redis, surviving server restarts and serverless deployments
- Immediate processing: Uses setImmediate() to process the queue as soon as capacity becomes available
- No timers: Eliminates memory leak risks from long-running intervals
- Recursive processing: Automatically processes multiple queued calls when capacity allows
Key Improvements
Queue processing happens immediately when calls end or arrive
Redis persistence works across serverless function invocations
No timers means no memory leaks from long-running processes
Counters survive server restarts and deployments
Architecture Benefits
- Event-driven triggers: Processing occurs on actual state changes, not arbitrary intervals
- Persistent state: Redis ensures counters are never lost, even in serverless environments
- Efficient resource usage: No CPU cycles wasted on empty queue checks
- Immediate capacity utilization: New calls are processed instantly when space becomes available
- Graceful degradation: Redis connection failures are handled with proper error logging
Processing Triggers
Queue processing is automatically triggered when:
- New call arrives → setImmediate(() => processQueue()) after adding to queue
- Call ends → setImmediate(() => processQueue()) after decrementing active count
- Successful processing → Recursively processes more calls if capacity and queue allow
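The trigger pattern above can be sketched in isolation. Here store and dequeueNextCall are illustrative stand-ins for the real Redis client and Twilio dequeue call, which makes the recursion easy to reason about:

```javascript
// Callback-driven queue processing: each successful dequeue schedules
// another pass via setImmediate(), stopping at capacity or an empty queue.
const ACTIVE_CALLS_KEY = 'vapi:queue:active_calls';

async function processQueue({ store, dequeueNextCall, maxConcurrency }) {
  const active = parseInt((await store.get(ACTIVE_CALLS_KEY)) ?? '0', 10);
  if (active >= maxConcurrency) return; // at capacity: wait for a call-ended event
  const call = await dequeueNextCall(); // pops the front call from the queue
  if (!call) return;                    // queue empty: nothing to do
  await store.incr(ACTIVE_CALLS_KEY);   // count the call before connecting it
  // ...connect `call` to the Vapi assistant here...
  setImmediate(() => processQueue({ store, dequeueNextCall, maxConcurrency }));
}
```

With a limit of 2 and four queued calls, two passes run and the loop then stops until a call-ended event frees capacity.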
Redis is required for this implementation. Ensure your Redis instance is properly configured and accessible from your deployment environment.
Troubleshooting
Redis connection issues
Common causes:
- Redis server not running or unreachable
- Incorrect REDIS_URL configuration
- Network connectivity issues in production
Solutions:
- Test Redis connection: redis-cli ping (should return PONG)
- Verify REDIS_URL format matches your provider
- Check firewall rules and security groups
- Monitor Redis logs for authentication errors
Health check endpoint shows Redis status:
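Assuming a /health endpoint like the one sketched earlier in this guide, the response shape below is illustrative:

```shell
curl https://your-server.com/health
# e.g. {"status":"ok","redis":"connected","activeCalls":2,"maxConcurrency":10}
```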
Calls not being dequeued
Common causes:
- Server not receiving call-ended webhooks (check webhook URLs)
- Redis counter desync (rare, but possible)
- Vapi API errors (check API key and assistant ID)
Solutions:
- Verify webhook URLs are publicly accessible
- Check Redis counters: redis-cli get vapi:queue:active_calls
- Reset counters manually if needed: redis-cli set vapi:queue:active_calls 0
- Test Vapi API calls independently
Debug Redis state:
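```shell
redis-cli get vapi:queue:active_calls   # current counter value
redis-cli monitor                       # watch commands hitting Redis in real time
```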
Queue filling up but not processing
Check these items:
- MAX_CONCURRENCY setting is appropriate for your Vapi plan
- Redis counters are accurate (compare with actual Twilio queue)
- No errors in Vapi TwiML generation
Debug steps:
- Call /process-queue endpoint manually
- Check /health endpoint for current capacity and Redis status
- Review server logs for Redis connection errors
- Verify queue processing triggers are firing
Serverless deployment issues
Serverless-specific considerations:
- Use connection pooling for Redis (Upstash recommended)
- Cold starts may cause initial Redis connection delays
- Function timeout limits may interrupt long-running operations
Solutions:
- Configure appropriate function timeout (30+ seconds)
- Use Redis providers optimized for serverless (Upstash)
- Implement connection retry logic
- Monitor function execution logs for timeout errors
Calls dropping or hanging up
Potential issues:
- Invalid phone number format (use E.164 format)
- Incorrect Vapi configuration (phone number ID, assistant ID)
- Network timeouts during TwiML generation
- Redis operations timing out
Solutions:
- Validate all phone numbers before processing
- Add timeout handling to API calls and Redis operations
- Implement retry logic for failed Vapi requests
- Monitor Redis response times
Performance optimization
Production considerations:
- Redis connection pooling for high-traffic scenarios
- Monitor Redis memory usage and eviction policies
- Consider Redis clustering for extreme scale
- Implement circuit breakers for external API calls
Monitoring recommendations:
- Track Redis connection health
- Monitor queue processing latency
- Alert on Redis counter anomalies
- Log all state transitions for debugging
Next steps
Now that you have a production-ready call queue system with Redis persistence and callback-driven processing:
- Advanced Call Features: Explore call recording, analysis, and advanced routing options
- Monitoring & Analytics: Set up comprehensive call analytics and performance monitoring
- Scaling Considerations: Learn about enterprise features for high-volume deployments
- Assistant Optimization: Enhance your assistants with personalization and dynamic variables
Consider implementing health checks, metrics collection, and alerting around your Redis counters and queue processing latency for production monitoring.