Background speech denoising
Overview
Background speech denoising helps create clearer conversations by filtering out unwanted sounds while users speak. Vapi offers two complementary denoising technologies that can be used independently or together for optimal results.
In this guide, you’ll learn to:
- Enable Smart Denoising using Krisp technology (recommended for most users)
- Configure experimental Fourier denoising with customizable parameters
- Combine both methods for enhanced noise reduction
- Fine-tune settings for different environments
For most use cases, Smart Denoising alone provides excellent results. Fourier denoising is a highly experimental feature that requires significant tuning and may not work well in all environments.
Denoising methods
Smart Denoising (Krisp)
Smart Denoising uses Krisp’s AI-powered technology to remove background noise in real time. This method is highly effective for common noise sources like:
- Keyboard typing
- Background conversations
- Traffic and street noise
- Air conditioning and fans
- Pet sounds
Fourier Denoising (Experimental)
Fourier denoising uses frequency-domain filtering to remove consistent background noise. This experimental method offers fine-grained control through multiple parameters and includes automatic media detection for TV/music/radio backgrounds.
Fourier denoising is highly experimental and comes with significant limitations:
- Requires extensive tweaking to work properly
- May not work well in all audio environments (e.g., when headphones are used)
- Can introduce audio artifacts or distortions
- Should only be used when Smart Denoising alone is insufficient
For most users, Smart Denoising should be sufficient. Only proceed with Fourier denoising if you have specific requirements and are prepared to test extensively.
Configuration
Background speech denoising is configured through the backgroundSpeechDenoisingPlan property on your assistant.
Smart Denoising configuration
Smart Denoising has a simple on/off configuration:
Enable or disable Krisp-powered smart denoising
Example: Smart Denoising only
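A minimal sketch of a Smart Denoising only configuration, written as a Python dict that serializes to a JSON assistant payload. Only backgroundSpeechDenoisingPlan is named in this guide; the nested smartDenoisingPlan key and its enabled flag are assumptions here and should be checked against the API reference.

```python
import json

# Assumed shape: the nested key names are not spelled out in this guide.
smart_only_plan = {
    "backgroundSpeechDenoisingPlan": {
        "smartDenoisingPlan": {"enabled": True},
    }
}

# Serialize to JSON, e.g. for inclusion in an assistant create/update request body.
payload = json.dumps(smart_only_plan, indent=2)
print(payload)
```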
Fourier Denoising configuration
Fourier denoising offers multiple parameters for fine-tuning:
Enable or disable experimental Fourier denoising
mediaDetectionEnabled: Automatically detect and filter consistent background media (TV/music/radio)
staticThreshold: Fallback threshold in dB when no baseline is established (-80 to 0)
baselineOffsetDb: How far below the rolling baseline to filter audio, in dB (-30 to -5)
- Values closer to 0 (e.g., -10) = more aggressive filtering
- More negative values (e.g., -20) = more conservative filtering
windowSizeMs: Rolling window size in milliseconds for baseline calculation (1000 to 30000)
- Larger windows = slower adaptation, more stability
- Smaller windows = faster adaptation, less stability
baselinePercentile: Percentile for baseline calculation (1 to 99)
- Higher percentiles (e.g., 85) = focus on louder speech
- Lower percentiles (e.g., 50) = include quieter speech
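To make the parameters above more concrete, here is a rough sketch of how a rolling percentile baseline could be turned into a filter threshold. It is an illustration under assumptions, not Vapi's implementation: the function name, the per-frame dB input, the nearest-rank percentile, and the rule "threshold = baseline + offset" are all hypothetical.

```python
from typing import Sequence


def filter_threshold_db(
    frame_levels_db: Sequence[float],
    baseline_percentile: int,
    baseline_offset_db: float,
    static_threshold_db: float,
) -> float:
    """Return the level (in dB) below which audio would be filtered.

    frame_levels_db holds the levels of the frames inside the rolling window
    (windowSizeMs would decide how many frames that is). Hypothetical helper.
    """
    if not frame_levels_db:
        # No baseline established yet: fall back to the static threshold.
        return static_threshold_db
    ordered = sorted(frame_levels_db)
    # Nearest-rank percentile, using integer ceiling division.
    rank = max(1, (baseline_percentile * len(ordered) + 99) // 100)
    baseline = ordered[rank - 1]
    # The offset is negative, so the threshold sits below the baseline.
    return baseline + baseline_offset_db


levels = [-60.0, -50.0, -40.0, -30.0, -20.0]
aggressive = filter_threshold_db(levels, 80, -10.0, -35.0)
conservative = filter_threshold_db(levels, 80, -20.0, -35.0)
print(aggressive, conservative)
```

Under this model, an offset of -10 puts the threshold only 10 dB below the baseline, so more of the quiet audio falls under it and gets filtered than with an offset of -20.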
Example: Adding Fourier Denoising to Smart Denoising
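A sketch of a plan that adds Fourier denoising on top of Smart Denoising, together with a small range check built from the limits listed in the parameter descriptions. The five tuning parameter names come from this guide; the nested smartDenoisingPlan and fourierDenoisingPlan keys, the enabled flags, and the specific values are illustrative assumptions rather than documented defaults.

```python
from typing import Dict, List, Tuple

# Documented inclusive ranges for the Fourier tuning parameters.
FOURIER_RANGES: Dict[str, Tuple[int, int]] = {
    "staticThreshold": (-80, 0),
    "baselineOffsetDb": (-30, -5),
    "windowSizeMs": (1000, 30000),
    "baselinePercentile": (1, 99),
}


def out_of_range(fourier_plan: dict) -> List[str]:
    """Return the names of parameters that fall outside their documented range."""
    return [
        name
        for name, (low, high) in FOURIER_RANGES.items()
        if name in fourier_plan and not low <= fourier_plan[name] <= high
    ]


combined_plan = {
    "backgroundSpeechDenoisingPlan": {
        # Assumed nested keys; values are examples, not recommended defaults.
        "smartDenoisingPlan": {"enabled": True},
        "fourierDenoisingPlan": {
            "enabled": True,
            "mediaDetectionEnabled": True,
            "staticThreshold": -35,
            "baselineOffsetDb": -15,
            "windowSizeMs": 3000,
            "baselinePercentile": 85,
        },
    }
}

fourier = combined_plan["backgroundSpeechDenoisingPlan"]["fourierDenoisingPlan"]
print(out_of_range(fourier))  # prints []
```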
Combined denoising
For maximum noise reduction, combine both methods. Processing order:
- Smart Denoising (Krisp) processes first
- Fourier Denoising processes the Krisp output
Environment-specific configurations
Quiet office environment
Minimal speech denoising for clear environments:
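One possible preset for a quiet office, sketched as a Python dict: Smart Denoising on, Fourier denoising left off. The nested key names and enabled flags are assumptions, as this guide only names backgroundSpeechDenoisingPlan.

```python
import json

# Illustrative preset: Smart Denoising only, Fourier explicitly disabled.
quiet_office_plan = {
    "backgroundSpeechDenoisingPlan": {
        "smartDenoisingPlan": {"enabled": True},
        "fourierDenoisingPlan": {"enabled": False},
    }
}

print(json.dumps(quiet_office_plan, indent=2))
```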
Noisy call center
Aggressive filtering for high-noise environments:
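A possible starting point for a noisy call center, sketched as a Python dict. The values are illustrative picks inside the documented ranges, chosen to lean aggressive; they are not tested recommendations, and the nested key names and enabled flags are assumptions.

```python
import json

# Illustrative preset: both methods on, Fourier tuned toward aggressive filtering.
noisy_call_center_plan = {
    "backgroundSpeechDenoisingPlan": {
        "smartDenoisingPlan": {"enabled": True},
        "fourierDenoisingPlan": {
            "enabled": True,
            "mediaDetectionEnabled": True,
            "staticThreshold": -30,
            "baselineOffsetDb": -10,   # closer to 0: more aggressive filtering
            "windowSizeMs": 2000,      # smaller window: faster adaptation
            "baselinePercentile": 90,  # higher percentile: focus on louder speech
        },
    }
}

print(json.dumps(noisy_call_center_plan, indent=2))
```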
Home environment with TV/music
Optimized for media background noise:
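A possible starting point for a home with TV or music in the background, sketched as a Python dict. Media detection is switched on and the window is made larger for stability; the specific numbers are illustrative values inside the documented ranges, and the nested key names and enabled flags are assumptions.

```python
import json

# Illustrative preset: both methods on, media detection enabled.
home_media_plan = {
    "backgroundSpeechDenoisingPlan": {
        "smartDenoisingPlan": {"enabled": True},
        "fourierDenoisingPlan": {
            "enabled": True,
            "mediaDetectionEnabled": True,  # filter consistent TV/music/radio
            "staticThreshold": -35,
            "baselineOffsetDb": -12,
            "windowSizeMs": 4000,           # larger window: more stability
            "baselinePercentile": 80,
        },
    }
}

print(json.dumps(home_media_plan, indent=2))
```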
Best practices
For most users, Smart Denoising alone is the recommended solution. It handles the vast majority of common noise scenarios effectively without configuration complexity. Only consider adding Fourier denoising if you have specific requirements that Smart Denoising cannot address.
When to use each method
Smart Denoising only:
- General-purpose noise reduction
- Unpredictable noise patterns
- When simplicity is preferred
Smart Denoising + Fourier Denoising:
- Maximum noise reduction required
- Consistent background noise that Smart Denoising alone cannot fully handle
- Complex acoustic environments with media (TV/music/radio)
- Premium user experiences requiring fine-tuned control
- You are willing to invest time in testing and tuning
- Users are not on headphones (Fourier may cause issues with headphone audio)
Fourier Denoising should never be used alone. It’s designed to complement Smart Denoising by providing additional filtering after Krisp has done the initial noise reduction.
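A small guard like the following could catch a Fourier-only configuration before it is sent. This is a sketch: the helper name is hypothetical, and it assumes the plan nests smartDenoisingPlan and fourierDenoisingPlan objects with an enabled flag.

```python
def fourier_without_smart(plan: dict) -> bool:
    """Return True when Fourier denoising is enabled but Smart Denoising is not."""
    # Missing sub-plans are treated as disabled (an assumption of this sketch).
    smart_on = bool(plan.get("smartDenoisingPlan", {}).get("enabled", False))
    fourier_on = bool(plan.get("fourierDenoisingPlan", {}).get("enabled", False))
    return fourier_on and not smart_on


fourier_only = {"fourierDenoisingPlan": {"enabled": True}}
both = {
    "smartDenoisingPlan": {"enabled": True},
    "fourierDenoisingPlan": {"enabled": True},
}
print(fourier_without_smart(fourier_only), fourier_without_smart(both))
```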
Performance considerations
Audio quality: Aggressive filtering may affect voice quality. Test different settings to find the right balance between noise reduction and natural speech preservation.
Testing recommendations
- Test in your target environment
- Start with default settings
- Adjust parameters incrementally
- Monitor user feedback
- A/B test different configurations
Troubleshooting Fourier denoising
Voice sounds robotic or distorted
Reduce filtering aggressiveness:
- Make baselineOffsetDb more negative (e.g., -20 instead of -15)
- Decrease baselinePercentile (e.g., 75 instead of 85)
- Try Smart Denoising only
Background noise still audible
Increase filtering:
- Enable both denoising methods
- Move baselineOffsetDb closer to 0 (e.g., -12 instead of -15)
- Ensure mediaDetectionEnabled is true for TV/music
Speech cutting out intermittently
Adjust detection sensitivity:
- Increase windowSizeMs for more stability
- Adjust staticThreshold if the baseline isn’t being established
- Check whether the user’s voice level is consistent
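The first two troubleshooting adjustments in this section can be sketched as small helpers that return a tuned copy of the Fourier settings. The helper names are hypothetical, and the step sizes simply mirror the examples in this section (-15 to -20, 85 to 75, -15 to -12); results are clamped to the documented ranges.

```python
def soften(fourier_plan: dict) -> dict:
    """Less aggressive filtering, for a voice that sounds robotic or distorted."""
    tuned = dict(fourier_plan)
    # More negative offset: threshold moves further below the baseline.
    tuned["baselineOffsetDb"] = max(-30, fourier_plan["baselineOffsetDb"] - 5)
    tuned["baselinePercentile"] = max(1, fourier_plan["baselinePercentile"] - 10)
    return tuned


def strengthen(fourier_plan: dict) -> dict:
    """More aggressive filtering, for background noise that is still audible."""
    tuned = dict(fourier_plan)
    # Offset closer to 0: threshold moves up toward the baseline.
    tuned["baselineOffsetDb"] = min(-5, fourier_plan["baselineOffsetDb"] + 3)
    tuned["mediaDetectionEnabled"] = True
    return tuned


current = {
    "baselineOffsetDb": -15,
    "baselinePercentile": 85,
    "mediaDetectionEnabled": False,
}
print(soften(current))
print(strengthen(current))
```

Applying one helper at a time and retesting fits the advice to adjust parameters incrementally.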