Data Flow
Overview
When using Vapi, data flows through multiple components during a voice conversation. Understanding this flow is essential for security-conscious organizations, especially when integrating custom bucket storage or custom model providers.
This guide explains:
- The complete voice pipeline architecture
- What data passes through each component
- What data is stored on Vapi’s infrastructure vs your own
- Which components support “bring your own” infrastructure
Understanding Log Types
Vapi generates two distinct types of logs during calls: call logs, which are customer-accessible, and system logs, which are internal to Vapi.
System Logs are strictly internal to Vapi and are never shared with customers or uploaded to custom storage buckets. They contain infrastructure-level data used for Vapi’s operational purposes only.
Voice Pipeline Architecture
Vapi orchestrates a sophisticated voice pipeline with multiple modular components. Each component can be configured to use Vapi’s default providers, your own API keys, or your own custom servers.
Complete Pipeline Flow
Pipeline Components
1. Transport Layer
The transport layer handles real-time audio streaming between users and Vapi.
Audio Formats:
- PCM: 16-bit, 16kHz (highest quality)
- Mu-Law: 8-bit, 8kHz (telephony standard)
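To make the difference between the two formats concrete, the raw data rate follows directly from sample rate and sample width. The sketch below assumes single-channel (mono) audio, no container or transport framing overhead, and a 20 ms frame duration; none of these are stated in this guide, and bytes_per_second is an illustrative helper rather than part of any Vapi SDK.

```python
def bytes_per_second(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Raw (unframed) audio data rate in bytes per second."""
    return sample_rate_hz * bits_per_sample * channels // 8


# The two formats listed above; mono is an assumption.
pcm_rate = bytes_per_second(16_000, 16)
mulaw_rate = bytes_per_second(8_000, 8)

# Payload size of one 20 ms frame (frame duration is an assumed value).
FRAME_MS = 20
pcm_frame_bytes = pcm_rate * FRAME_MS // 1000
mulaw_frame_bytes = mulaw_rate * FRAME_MS // 1000

print(f"PCM:    {pcm_rate} B/s, {pcm_frame_bytes} B per {FRAME_MS} ms frame")
print(f"Mu-Law: {mulaw_rate} B/s, {mulaw_frame_bytes} B per {FRAME_MS} ms frame")
```

Under these assumptions, PCM carries four times as much raw audio data per second as Mu-Law.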
2. Speech-to-Text (Transcriber)
Converts user audio into text in real-time using streaming recognition.
Custom Transcriber: Vapi supports custom transcriber integration via WebSocket. See Custom Transcriber.
Bring Your Own API Key:
- ✅ Supported: Deepgram, Gladia, AssemblyAI, Speechmatics, Google, Azure
- ❌ Not supported: Talkscriber
3. Orchestration Layer (Vapi Proprietary)
Vapi runs proprietary real-time models that make conversations feel natural. These models are not customizable and run on Vapi’s infrastructure.
Orchestration models process data in real-time but do not persist the audio or intermediate results. All processing is ephemeral. Only final transcripts and call logs are stored (unless HIPAA mode is enabled).
4. Language Model (LLM)
Generates conversational responses based on transcribed user input.
Custom LLM: Vapi supports custom LLM integration via OpenAI-compatible endpoints. See Custom LLM.
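For orientation, an OpenAI-compatible endpoint typically returns responses in the OpenAI chat completions shape, often streamed as server-sent events. The sketch below builds such a stream in plain Python without any web framework. Whether Vapi requires streaming, and which request fields it sends, is not covered in this guide, so treat that as an assumption; the model name and completion id are hypothetical placeholders.

```python
import json
import time
from typing import Iterator, List, Optional


def chat_chunk(completion_id: str, model: str, content: Optional[str],
               finish_reason: Optional[str] = None) -> dict:
    """Build one chunk in the OpenAI chat-completions streaming shape."""
    delta = {"content": content} if content is not None else {}
    return {
        "id": completion_id,
        "object": "chat.completion.chunk",
        "created": int(time.time()),
        "model": model,
        "choices": [{"index": 0, "delta": delta, "finish_reason": finish_reason}],
    }


def sse_stream(tokens: List[str], model: str = "my-model") -> Iterator[str]:
    """Yield server-sent-event frames for a sequence of text tokens."""
    completion_id = "chatcmpl-example"  # hypothetical id
    for token in tokens:
        yield f"data: {json.dumps(chat_chunk(completion_id, model, token))}\n\n"
    # The last chunk has an empty delta and a finish reason; a sentinel follows.
    yield f"data: {json.dumps(chat_chunk(completion_id, model, None, 'stop'))}\n\n"
    yield "data: [DONE]\n\n"


events = list(sse_stream(["Hello", ", ", "world"]))
```

In a real server, the generator would be returned from an HTTP handler with the text/event-stream content type, and the tokens would come from your model rather than a fixed list.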
Bring Your Own API Key:
- ✅ Supported: OpenAI, Anthropic, Azure OpenAI, Google Gemini, Groq, DeepSeek, OpenRouter, Together AI, Cerebras, DeepInfra, Perplexity, Anyscale, xAI
5. Text-to-Speech (Voice)
Converts LLM responses into spoken audio.
Custom Voice: Vapi supports custom TTS integration via audio streaming endpoints. See Custom TTS.
Bring Your Own API Key:
- ✅ Supported: ElevenLabs, PlayHT, Cartesia, Deepgram, OpenAI TTS, Azure, LMNT, Rime AI, Smallest AI, Neuphonic, WellSaid, Hume
Default Data Flow
In the default configuration, Vapi handles all pipeline components and stores artifacts on Vapi’s infrastructure.
Default storage on Vapi:
- Call Logs (Customer-Accessible):
  - Call recordings (configurable retention)
  - Full transcripts with timestamps
  - Call logs with component-level detail
  - Structured outputs from call analysis
- Internal (Vapi Only):
  - Product usage metrics and analytics
  - System logs for operational monitoring
Custom Storage Data Flow
When you configure custom bucket storage, call recordings and call logs are uploaded to your infrastructure. System logs and product usage metrics remain on Vapi’s infrastructure.
Supported storage providers:
- AWS S3
- GCP Cloud Storage
- Cloudflare R2
- Supabase Storage
- Azure Blob Storage
System Logs and Product Usage Metrics are always stored on Vapi’s infrastructure and are never uploaded to custom storage buckets. These are internal operational data used by Vapi only.
Custom Models Data Flow
When using custom transcriber, LLM, or voice servers, data flows to your infrastructure for processing.
With full custom configuration:
- Your servers process: Audio transcription, LLM inference, speech synthesis
- Vapi handles: Orchestration (endpointing, interruptions, etc.), transport routing
- Your storage receives: Recordings, transcripts, call logs
- Vapi storage retains: Product usage metrics, system logs (internal only)
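The split of responsibilities above can be expressed as a small lookup. This is a sketch of the rules as described in this guide, not a Vapi API: the Stage enum, the processing_location helper, and the returned labels are all illustrative names.

```python
from enum import Enum
from typing import Set


class Stage(str, Enum):
    TRANSPORT = "transport"
    TRANSCRIBER = "transcriber"
    ORCHESTRATION = "orchestration"
    LLM = "llm"
    VOICE = "voice"


# Stages this guide describes as replaceable with your own servers.
CUSTOMIZABLE = {Stage.TRANSCRIBER, Stage.LLM, Stage.VOICE}


def processing_location(stage: Stage, custom_stages: Set[Stage]) -> str:
    """Return who processes a stage: 'your servers' or 'vapi-managed'."""
    if stage in CUSTOMIZABLE and stage in custom_stages:
        return "your servers"
    # Transport and orchestration always stay with Vapi; non-custom stages
    # use Vapi's default providers or providers called with your API keys.
    return "vapi-managed"


full_custom = {Stage.TRANSCRIBER, Stage.LLM, Stage.VOICE}
plan = {stage.value: processing_location(stage, full_custom) for stage in Stage}
```

Requesting a custom orchestration stage has no effect in this model, which reflects the guide's statement that orchestration is not customizable.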
Bring Your Own Infrastructure Summary
The Orchestration Layer (endpointing, interruption detection, emotion detection, backchanneling, filler injection) is Vapi’s core value proposition and runs exclusively on Vapi infrastructure. Audio processed by these models is ephemeral and not stored.
Artifacts Storage Summary
HIPAA Mode Important Notice: When HIPAA mode is enabled (hipaaEnabled: true) and no custom storage is configured, Vapi will not store call recordings or transcripts. This data will be lost after the call ends. To retain call data in HIPAA mode, you must configure a custom storage bucket.
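The storage rules in this guide can be summarized as a decision function. This is a sketch under stated assumptions, not Vapi's implementation: the artifact names and the storage_location helper are illustrative, and treating call logs the same as recordings and transcripts in HIPAA mode is an assumption, since the notice above names only recordings and transcripts explicitly.

```python
CUSTOMER_ARTIFACTS = {"recording", "transcript", "call_log"}
INTERNAL_ARTIFACTS = {"system_log", "usage_metrics"}


def storage_location(artifact: str, custom_storage: bool, hipaa_enabled: bool) -> str:
    """Return 'vapi', 'custom_bucket', or 'not_stored' for an artifact."""
    if artifact in INTERNAL_ARTIFACTS:
        # Internal data never leaves Vapi, regardless of configuration.
        return "vapi"
    if artifact not in CUSTOMER_ARTIFACTS:
        raise ValueError(f"unknown artifact: {artifact}")
    if custom_storage:
        return "custom_bucket"
    if hipaa_enabled:
        # HIPAA mode without a custom bucket: the data is lost after the call.
        return "not_stored"
    return "vapi"


default_recording = storage_location("recording", custom_storage=False, hipaa_enabled=False)
hipaa_no_bucket = storage_location("transcript", custom_storage=False, hipaa_enabled=True)
hipaa_with_bucket = storage_location("transcript", custom_storage=True, hipaa_enabled=True)
```

The case to watch for is hipaa_no_bucket: it is the only combination in which customer-accessible data is not retained anywhere.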
What Data Passes Through Vapi
Even with maximum custom configuration, certain data still passes through Vapi’s orchestration layer.
Recommendations by Use Case
Maximum data control (enterprise/regulated)
Configure:
- Custom Transcriber via WebSocket endpoint
- Custom LLM via OpenAI-compatible server
- Custom Voice via audio streaming endpoint
- Custom bucket storage for all call logs
- HIPAA mode to prevent Vapi call log storage
Result: Only orchestration signals (ephemeral) pass through Vapi. System logs remain on Vapi infrastructure (never shared).
Data residency compliance
- Use custom bucket storage in your required region
- Use a custom LLM hosted in-region, or a provider with regional endpoints
- Use custom Voice hosted in-region if needed
Note: Orchestration models run on Vapi’s US/EU infrastructure (data is ephemeral). System logs remain on Vapi infrastructure.
Cost optimization with own API keys
- Enable Provider Keys for Transcriber, LLM, and Voice
- Vapi uses your API keys, and the providers bill you directly
- No custom server setup required
HIPAA compliance
- Enable hipaaEnabled: true
- Important: Configure custom storage to retain call recordings and transcripts
- Use only HIPAA-compliant providers (Deepgram, Azure, OpenAI, Anthropic, ElevenLabs)
- See HIPAA Compliance
Without custom storage configured, HIPAA mode will result in no call recordings or transcripts being stored. Data will be lost after call completion.
Next Steps
Custom Integration Guides
- Custom Transcriber - Bring your own speech-to-text
- Custom LLM - Bring your own language model
- Custom TTS - Bring your own voice synthesis
Storage Configuration
- AWS S3 - S3 bucket setup
- GCP Cloud Storage - GCP bucket setup
- Cloudflare R2 - R2 setup
Compliance
- HIPAA Compliance - Healthcare data handling
- PCI Compliance - Payment data handling
- GDPR Compliance - EU data protection