Data Flow

Understand how data flows through Vapi when using custom storage and custom models

Overview

When using Vapi, data flows through multiple components during a voice conversation. Understanding this flow is essential for security-conscious organizations, especially when integrating custom bucket storage or custom model providers.

This guide explains:

  • The complete voice pipeline architecture
  • What data passes through each component
  • What data is stored on Vapi’s infrastructure vs your own
  • Which components support “bring your own” infrastructure

Understanding Log Types

Vapi generates two distinct types of logs during calls:

Log TypeDescriptionVisibilityCustom Storage
System LogsInternal operational logs used by Vapi for debugging, monitoring, and system healthVapi internal only❌ Never uploaded to custom bucket
Call LogsConversation data including transcripts, recordings, and call metadataAvailable to customers via API/Dashboard✅ Can be uploaded to custom bucket

System Logs are strictly internal to Vapi and are never shared with customers or uploaded to custom storage buckets. They contain infrastructure-level data used for Vapi’s operational purposes only.


Voice Pipeline Architecture

Vapi orchestrates a sophisticated voice pipeline with multiple modular components. Each component can be configured to use Vapi’s default providers, your own API keys, or your own custom servers.

Complete Pipeline Flow


Pipeline Components

1. Transport Layer

The transport layer handles real-time audio streaming between users and Vapi.

Transport TypeDescriptionUse Case
SIPSession Initiation ProtocolTraditional phone systems, PBX integration
TelephonyTwilio, Telnyx, Plivo integrationsPSTN calls, phone numbers
WebSocketDirect bidirectional audio streamingWeb applications, custom integrations
WebRTCBrowser-based real-time communicationWeb and mobile apps via LiveKit/Daily

Audio Formats:

  • PCM: 16-bit, 16kHz (highest quality)
  • Mu-Law: 8-bit, 8kHz (telephony standard)

2. Speech-to-Text (Transcriber)

Converts user audio into text in real-time using streaming recognition.

Custom Transcriber: Vapi supports custom transcriber integration via WebSocket. See Custom Transcriber.

Bring Your Own API Key:

  • ✅ Supported: Deepgram, Gladia, AssemblyAI, Speechmatics, Google, Azure
  • ❌ Not supported: Talkscriber

3. Orchestration Layer (Vapi Proprietary)

Vapi runs proprietary real-time models that make conversations feel natural. These models are not customizable and run on Vapi’s infrastructure.

ModelPurpose
EndpointingDetects when user finishes speaking using audio-text fusion
Interruption DetectionDistinguishes barge-in from affirmations like “uh-huh”
Background Noise FilteringRemoves ambient sounds in real-time
Background Voice FilteringIsolates primary speaker from TVs, echoes, others
BackchannelingAdds natural affirmations (“uh-huh”, “yeah”, “got it”)
Emotion DetectionAnalyzes emotional tone and passes to LLM
Filler InjectionAdds natural speech patterns (“um”, “like”, “so”)

Orchestration models process data in real-time but do not persist the audio or intermediate results. All processing is ephemeral. Only final transcripts and call logs are stored (unless HIPAA mode is enabled).

4. Language Model (LLM)

Generates conversational responses based on transcribed user input.

Custom LLM: Vapi supports custom LLM integration via OpenAI-compatible endpoints. See Custom LLM.

Bring Your Own API Key:

  • ✅ Supported: OpenAI, Anthropic, Azure OpenAI, Google Gemini, Groq, DeepSeek, OpenRouter, Together AI, Cerebras, DeepInfra, Perplexity, Anyscale, xAI

5. Text-to-Speech (Voice)

Converts LLM responses into spoken audio.

Custom Voice: Vapi supports custom TTS integration via audio streaming endpoints. See Custom TTS.

Bring Your Own API Key:

  • ✅ Supported: ElevenLabs, PlayHT, Cartesia, Deepgram, OpenAI TTS, Azure, LMNT, Rime AI, Smallest AI, Neuphonic, WellSaid, Hume

Default Data Flow

In the default configuration, Vapi handles all pipeline components and stores artifacts on Vapi’s infrastructure.

Default storage on Vapi:

  • Call Logs (Customer-Accessible):
    • Call recordings (configurable retention)
    • Full transcripts with timestamps
    • Call logs with component-level detail
    • Structured outputs from call analysis
  • Internal (Vapi Only):
    • Product usage metrics and analytics
    • System logs for operational monitoring

Custom Storage Data Flow

When you configure custom bucket storage, call recordings and call logs are uploaded to your infrastructure. System logs and product usage metrics remain on Vapi’s infrastructure.

Supported storage providers:

  • AWS S3
  • GCP Cloud Storage
  • Cloudflare R2
  • Supabase Storage
  • Azure Blob Storage

System Logs and Product Usage Metrics are always stored on Vapi’s infrastructure and are never uploaded to custom storage buckets. These are internal operational data used by Vapi only.


Custom Models Data Flow

When using custom transcriber, LLM, or voice servers, data flows to your infrastructure for processing.

With full custom configuration:

  • Your servers process: Audio transcription, LLM inference, speech synthesis
  • Vapi handles: Orchestration (endpointing, interruptions, etc.), transport routing
  • Your storage receives: Recordings, transcripts, call logs
  • Vapi storage retains: Product usage metrics, system logs (internal only)

Bring Your Own Infrastructure Summary

ComponentBring Your Own Key (BYOK)Custom Server
Transport✅ Twilio, Telnyx, Vonage, etc.✅ WebSocket/SIP
Transcriber✅ Most providersCustom Transcriber
Orchestration❌ Vapi only❌ Vapi only
LLM✅ All providersCustom LLM
Voice✅ All providersCustom TTS
Storage✅ S3/GCP/R2/Azure✅ S3/GCP/R2/Azure

The Orchestration Layer (endpointing, interruption detection, emotion detection, backchanneling, filler injection) is Vapi’s core value proposition and runs exclusively on Vapi infrastructure. Audio processed by these models is ephemeral and not stored.


Artifacts Storage Summary

ArtifactDefault LocationCustom Storage SupportedHIPAA Mode
Call RecordingsVapi✅ YesNot stored on Vapi
TranscriptsVapi✅ YesNot stored on Vapi
Call LogsVapi✅ YesNot stored on Vapi
Product Usage MetricsVapi❌ NoVapi only
System LogsVapi❌ NoVapi only
Structured OutputsVapi✅ Yes (via webhook)Configurable

HIPAA Mode Important Notice: When HIPAA mode is enabled (hipaaEnabled: true) and no custom storage is configured, Vapi will not store call recordings or transcripts. This data will be lost after the call ends. To retain call data in HIPAA mode, you must configure a custom storage bucket.


What Data Passes Through Vapi

Even with maximum custom configuration, certain data passes through Vapi’s orchestration:

Data TypeProcessingRetention
Raw audio streamsReal-time routing to Transcriber/VoiceEphemeral (not stored)
Transcribed textOrchestration analysis, LLM routingCall logs (unless HIPAA)
LLM responsesFiller injection, Voice routingCall logs (unless HIPAA)
Emotion metadataPassed to LLM contextEphemeral
Call signalingSIP/WebSocket managementMetadata only

Recommendations by Use Case

Configure:

  • Custom Transcriber via WebSocket endpoint
  • Custom LLM via OpenAI-compatible server
  • Custom Voice via audio streaming endpoint
  • Custom bucket storage for all call logs
  • HIPAA mode to prevent Vapi call log storage

Result: Only orchestration signals (ephemeral) pass through Vapi. System logs remain on Vapi infrastructure (never shared).

  • Use custom bucket storage in your required region
  • Use custom LLM hosted in-region OR provider with regional endpoints
  • Use custom Voice hosted in-region if needed

Note: Orchestration models run on Vapi’s US/EU infrastructure (data is ephemeral). System logs remain on Vapi infrastructure.

  • Enable Provider Keys for Transcriber, LLM, and Voice
  • Vapi uses your API keys, you’re billed directly by providers
  • No custom server setup required
  • Enable hipaaEnabled: true
  • Important: Configure custom storage to retain call recordings and transcripts
  • Use only HIPAA-compliant providers (Deepgram, Azure, OpenAI, Anthropic, ElevenLabs)
  • See HIPAA Compliance

Without custom storage configured, HIPAA mode will result in no call recordings or transcripts being stored. Data will be lost after call completion.


Next Steps

Custom Integration Guides

Storage Configuration

Compliance