Background speech denoising

Overview

Background speech denoising helps create clearer conversations by filtering out unwanted sounds while users speak. Vapi offers two complementary denoising technologies: Smart Denoising, which works well on its own, and experimental Fourier denoising, which is designed to be layered on top of Smart Denoising when extra filtering is needed.

In this guide, you’ll learn to:

  • Enable Smart Denoising using Krisp technology (recommended for most users)
  • Configure experimental Fourier denoising with customizable parameters
  • Combine both methods for enhanced noise reduction
  • Fine-tune settings for different environments

For most use cases, Smart Denoising alone provides excellent results. Fourier denoising is a highly experimental feature that requires significant tuning and may not work well in all environments.

Denoising methods

Smart Denoising (Krisp)

Smart Denoising uses Krisp’s AI-powered technology to remove background noise in real time. It is highly effective against common noise sources such as:

  • Keyboard typing
  • Background conversations
  • Traffic and street noise
  • Air conditioning and fans
  • Pet sounds

Fourier Denoising (Experimental)

Fourier denoising uses frequency-domain filtering to remove consistent background noise. This experimental method offers fine-grained control through multiple parameters and includes automatic media detection for TV/music/radio backgrounds.

Fourier denoising is highly experimental and comes with significant limitations:

  • Requires extensive tweaking to work properly
  • May not work well in all audio environments (e.g., when headphones are used)
  • Can introduce audio artifacts or distortions
  • Should only be used when Smart Denoising alone is insufficient

For most users, Smart Denoising should be sufficient. Only proceed with Fourier denoising if you have specific requirements and are prepared to test extensively.

Configuration

Background speech denoising is configured through the backgroundSpeechDenoisingPlan property on your assistant:

import { VapiClient } from "@vapi-ai/server-sdk";

const vapi = new VapiClient({
  token: process.env.VAPI_API_KEY
});

const assistant = await vapi.assistants.create({
  name: "Customer Support",
  backgroundSpeechDenoisingPlan: {
    // Enable Smart Denoising
    smartDenoisingPlan: {
      enabled: true
    },
    // Enable Fourier Denoising (optional)
    fourierDenoisingPlan: {
      enabled: true,
      mediaDetectionEnabled: true,
      staticThreshold: -35,
      baselineOffsetDb: -15,
      windowSizeMs: 3000,
      baselinePercentile: 85
    }
  }
});

Smart Denoising configuration

Smart Denoising has a simple on/off configuration:

smartDenoisingPlan.enabled
boolean, defaults to false

Enable or disable Krisp-powered smart denoising

Example: Smart Denoising only

const assistant = await vapi.assistants.create({
  name: "Support Agent",
  backgroundSpeechDenoisingPlan: {
    smartDenoisingPlan: {
      enabled: true
    }
  }
});

Fourier Denoising configuration

Fourier denoising offers multiple parameters for fine-tuning:

fourierDenoisingPlan.enabled
boolean, defaults to false

Enable or disable experimental Fourier denoising

fourierDenoisingPlan.mediaDetectionEnabled
boolean, defaults to true

Automatically detect and filter consistent background media (TV/music/radio)

fourierDenoisingPlan.staticThreshold
number, defaults to -35

Fallback threshold in dB when no baseline is established (-80 to 0)

fourierDenoisingPlan.baselineOffsetDb
number, defaults to -15

How far below the rolling baseline to filter audio, in dB (-30 to -5)

  • Values closer to 0 (e.g., -10) = more aggressive filtering
  • More negative values (e.g., -20) = more conservative filtering

fourierDenoisingPlan.windowSizeMs
number, defaults to 3000

Rolling window size in milliseconds for baseline calculation (1000 to 30000)

  • Larger windows = slower adaptation, more stability
  • Smaller windows = faster adaptation, less stability

fourierDenoisingPlan.baselinePercentile
number, defaults to 85

Percentile for baseline calculation (1 to 99)

  • Higher percentiles (e.g., 85) = focus on louder speech
  • Lower percentiles (e.g., 50) = include quieter speech
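
To build intuition for how baselinePercentile, baselineOffsetDb, and staticThreshold could interact, here is a minimal sketch of a rolling-percentile gate. It is an illustration under assumptions, not Vapi's implementation: the function names, the nearest-rank percentile, and the rule "threshold = baseline + offset" are all hypothetical.

```typescript
// Illustrative only: NOT Vapi's actual algorithm. All names are hypothetical.

// Nearest-rank percentile of a non-empty list of dB levels.
function percentile(levelsDb: number[], p: number): number {
  const sorted = [...levelsDb].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  const index = Math.min(sorted.length - 1, Math.max(0, rank - 1));
  return sorted[index];
}

// Threshold below which audio would be filtered. The levels passed in are
// assumed to be the frames that fall inside the rolling window (windowSizeMs).
// Falls back to staticThreshold when no baseline can be computed yet.
function gateThresholdDb(
  windowLevelsDb: number[],
  baselinePercentile: number,
  baselineOffsetDb: number,
  staticThreshold: number
): number {
  if (windowLevelsDb.length === 0) return staticThreshold;
  return percentile(windowLevelsDb, baselinePercentile) + baselineOffsetDb;
}

// Made-up frame levels: quiet background around -50 dB, speech around -20 dB.
const windowLevelsDb = [-50, -48, -45, -30, -22, -20, -20, -18, -18, -16];

// An offset closer to 0 puts the threshold nearer the speech baseline,
// so more of the quieter audio falls below it and is filtered.
const conservative = gateThresholdDb(windowLevelsDb, 85, -20, -35);
const aggressive = gateThresholdDb(windowLevelsDb, 85, -10, -35);
console.log({ conservative, aggressive });
```

Under these assumptions, the -10 offset yields a higher threshold than the -20 offset, which is why values closer to 0 filter more audio.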

Example: Adding Fourier Denoising to Smart Denoising

const assistant = await vapi.assistants.create({
  name: "Call Center Agent",
  backgroundSpeechDenoisingPlan: {
    // Always enable Smart Denoising first
    smartDenoisingPlan: {
      enabled: true
    },
    // Add Fourier Denoising for additional filtering
    fourierDenoisingPlan: {
      enabled: true,
      mediaDetectionEnabled: true,
      // More aggressive filtering for noisy environments
      baselineOffsetDb: -10,
      // Faster adaptation for dynamic environments
      windowSizeMs: 2000,
      // Focus on louder, clearer speech
      baselinePercentile: 90
    }
  }
});
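
Because each Fourier parameter has a documented range, it can help to check a plan locally before creating the assistant. The following is a minimal sketch: validateFourierPlan and the FourierDenoisingPlan interface are hypothetical helpers written for this guide, not part of the Vapi SDK, and the ranges are copied from the parameter reference above.

```typescript
// Hypothetical local shape; mirrors the fields described in this guide.
interface FourierDenoisingPlan {
  enabled?: boolean;
  mediaDetectionEnabled?: boolean;
  staticThreshold?: number;
  baselineOffsetDb?: number;
  windowSizeMs?: number;
  baselinePercentile?: number;
}

type NumericKey =
  | "staticThreshold"
  | "baselineOffsetDb"
  | "windowSizeMs"
  | "baselinePercentile";

// Inclusive [min, max] ranges taken from the parameter reference.
const RANGES: Record<NumericKey, [number, number]> = {
  staticThreshold: [-80, 0],
  baselineOffsetDb: [-30, -5],
  windowSizeMs: [1000, 30000],
  baselinePercentile: [1, 99],
};

// Returns a list of human-readable problems; an empty list means the plan
// stays within the documented ranges. Omitted fields are skipped.
function validateFourierPlan(plan: FourierDenoisingPlan): string[] {
  const errors: string[] = [];
  for (const key of Object.keys(RANGES) as NumericKey[]) {
    const [min, max] = RANGES[key];
    const value = plan[key];
    if (value === undefined) continue;
    // The negated comparison also rejects NaN.
    if (!(value >= min && value <= max)) {
      errors.push(`${key} must be between ${min} and ${max}, got ${value}`);
    }
  }
  return errors;
}

const problems = validateFourierPlan({
  enabled: true,
  baselineOffsetDb: -10,
  windowSizeMs: 500, // below the documented minimum of 1000
  baselinePercentile: 90,
});
console.log(problems);
```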

Combined denoising

For maximum noise reduction, combine both methods. Audio is processed in this order:

  1. Smart Denoising (Krisp) processes first
  2. Fourier Denoising processes the Krisp output
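
When logging or debugging a configuration, it can be useful to see which stages a plan would run and in what order. This small sketch assumes a hypothetical activeStages helper and a trimmed-down local plan type; only the ordering itself comes from the list above.

```typescript
// Hypothetical local shape covering only the enabled flags.
interface DenoisingPlan {
  smartDenoisingPlan?: { enabled?: boolean };
  fourierDenoisingPlan?: { enabled?: boolean };
}

type Stage = "smart (Krisp)" | "fourier";

// Lists the enabled stages in processing order: Smart Denoising first,
// then Fourier denoising on its output. Both flags default to false.
function activeStages(plan: DenoisingPlan): Stage[] {
  const stages: Stage[] = [];
  if (plan.smartDenoisingPlan?.enabled === true) stages.push("smart (Krisp)");
  if (plan.fourierDenoisingPlan?.enabled === true) stages.push("fourier");
  return stages;
}

const order = activeStages({
  smartDenoisingPlan: { enabled: true },
  fourierDenoisingPlan: { enabled: true },
}).join(" -> ");
console.log(order);
```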

Environment-specific configurations

Quiet office environment

Minimal denoising for quiet environments:

const assistant = await vapi.assistants.create({
  name: "Office Assistant",
  backgroundSpeechDenoisingPlan: {
    smartDenoisingPlan: {
      enabled: true
    }
    // No Fourier denoising needed
  }
});

Noisy call center

Aggressive filtering for high-noise environments:

const assistant = await vapi.assistants.create({
  name: "Call Center Agent",
  backgroundSpeechDenoisingPlan: {
    smartDenoisingPlan: {
      enabled: true
    },
    fourierDenoisingPlan: {
      enabled: true,
      mediaDetectionEnabled: true,
      baselineOffsetDb: -10, // Aggressive filtering
      windowSizeMs: 2000, // Fast adaptation
      baselinePercentile: 90 // Focus on clear speech
    }
  }
});

Home environment with TV/music

Optimized for media background noise:

const assistant = await vapi.assistants.create({
  name: "Home Assistant",
  backgroundSpeechDenoisingPlan: {
    smartDenoisingPlan: {
      enabled: true
    },
    fourierDenoisingPlan: {
      enabled: true,
      mediaDetectionEnabled: true, // Essential for TV/music
      baselineOffsetDb: -15,
      windowSizeMs: 4000,
      baselinePercentile: 80
    }
  }
});
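
If you manage several assistants, the three configurations above can be kept in one place and looked up by environment. This is a sketch only: the Environment names, the PRESETS table, and planForEnvironment are hypothetical, while the parameter values are copied from the examples in this section.

```typescript
// Hypothetical environment labels for the three scenarios in this section.
type Environment = "quietOffice" | "noisyCallCenter" | "homeWithMedia";

// Hypothetical local shape; mirrors the fields used in the examples.
interface BackgroundSpeechDenoisingPlan {
  smartDenoisingPlan: { enabled: boolean };
  fourierDenoisingPlan?: {
    enabled: boolean;
    mediaDetectionEnabled: boolean;
    baselineOffsetDb: number;
    windowSizeMs: number;
    baselinePercentile: number;
  };
}

const PRESETS: Record<Environment, BackgroundSpeechDenoisingPlan> = {
  // Smart Denoising only; no Fourier denoising needed.
  quietOffice: { smartDenoisingPlan: { enabled: true } },
  // Aggressive filtering with fast adaptation.
  noisyCallCenter: {
    smartDenoisingPlan: { enabled: true },
    fourierDenoisingPlan: {
      enabled: true,
      mediaDetectionEnabled: true,
      baselineOffsetDb: -10,
      windowSizeMs: 2000,
      baselinePercentile: 90,
    },
  },
  // Media detection with a longer, more stable window.
  homeWithMedia: {
    smartDenoisingPlan: { enabled: true },
    fourierDenoisingPlan: {
      enabled: true,
      mediaDetectionEnabled: true,
      baselineOffsetDb: -15,
      windowSizeMs: 4000,
      baselinePercentile: 80,
    },
  },
};

// Returns a deep copy so callers can tweak a preset without mutating the table.
function planForEnvironment(env: Environment): BackgroundSpeechDenoisingPlan {
  return JSON.parse(JSON.stringify(PRESETS[env])) as BackgroundSpeechDenoisingPlan;
}

const callCenterPlan = planForEnvironment("noisyCallCenter");
console.log(callCenterPlan);
```

The returned object could then be passed as backgroundSpeechDenoisingPlan when creating an assistant, as in the examples above.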

Best practices

For most users, Smart Denoising alone is the recommended solution. It handles the vast majority of common noise scenarios effectively without configuration complexity. Only consider adding Fourier denoising if you have specific requirements that Smart Denoising cannot address.

When to use each method

Smart Denoising only:

  • General-purpose noise reduction
  • Unpredictable noise patterns
  • When simplicity is preferred

Smart Denoising + Fourier Denoising:

  • Maximum noise reduction required
  • Consistent background noise that Smart Denoising alone cannot fully handle
  • Complex acoustic environments with media (TV/music/radio)
  • Premium user experiences requiring fine-tuned control
  • You are willing to invest time in testing and tuning
  • Callers are not using headphones (Fourier may cause issues with headphone audio)

Fourier Denoising should never be used alone. It’s designed to complement Smart Denoising by providing additional filtering after Krisp has done the initial noise reduction.
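
A simple pre-flight check can catch a Fourier-only configuration before it reaches production. The helper below is a hypothetical sketch (fourierWithoutSmart is not an SDK function); it only inspects the two enabled flags, which both default to false.

```typescript
// Hypothetical local shape covering only the enabled flags.
interface DenoisingPlan {
  smartDenoisingPlan?: { enabled?: boolean };
  fourierDenoisingPlan?: { enabled?: boolean };
}

// True when Fourier denoising is enabled but Smart Denoising is not,
// which is the combination this guide advises against.
function fourierWithoutSmart(plan: DenoisingPlan): boolean {
  const fourierOn = plan.fourierDenoisingPlan?.enabled === true;
  const smartOn = plan.smartDenoisingPlan?.enabled === true;
  return fourierOn && !smartOn;
}

const risky: DenoisingPlan = { fourierDenoisingPlan: { enabled: true } };
if (fourierWithoutSmart(risky)) {
  console.warn("Fourier denoising is enabled without Smart Denoising.");
}
```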

Performance considerations

Audio quality: Aggressive filtering may affect voice quality. Test different settings to find the right balance between noise reduction and natural speech preservation.

Testing recommendations

  1. Test in your target environment
  2. Start with default settings
  3. Adjust parameters incrementally
  4. Monitor user feedback
  5. A/B test different configurations
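
For step 5, one possible approach is to assign each call to a configuration deterministically, so the same caller or call ID always lands in the same variant. This sketch assumes you have some stable string identifier (called callId here, a hypothetical name) and uses a simple FNV-1a-style hash; how you route calls to assistants is up to your own setup.

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units. Good enough for
// bucketing; not suitable for security purposes.
function hashString(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Picks one variant for the given ID. The same ID always maps to the
// same variant as long as the variants list does not change.
function pickVariant<T>(callId: string, variants: readonly T[]): T {
  if (variants.length === 0) {
    throw new Error("variants must not be empty");
  }
  return variants[hashString(callId) % variants.length];
}

// Hypothetical variants: defaults versus a more aggressive offset.
const variants = [
  { label: "default", baselineOffsetDb: -15 },
  { label: "aggressive", baselineOffsetDb: -10 },
];

const chosen = pickVariant("call-123", variants);
console.log(chosen.label);
```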

Troubleshooting Fourier denoising

Reduce filtering aggressiveness:

  • Make baselineOffsetDb more negative (e.g., -20 instead of -15)
  • Decrease baselinePercentile (e.g., 75 instead of 85)
  • Try Smart Denoising only

Increase filtering:

  • Enable both denoising methods
  • Move baselineOffsetDb closer to 0 (e.g., -12 instead of -15)
  • Ensure mediaDetectionEnabled is true for TV/music

Adjust detection sensitivity:

  • Increase windowSizeMs for more stability
  • Adjust staticThreshold if a baseline is not being established
  • Check if user’s voice level is consistent
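
The adjustments above can be applied in small, repeatable steps. The helpers below are a hypothetical sketch (lessAggressive and moreAggressive are not SDK functions): they nudge the two main tuning parameters and clamp the results to the documented ranges. The default step sizes are assumptions chosen to reproduce the example values in this section.

```typescript
// Hypothetical local shape holding the two parameters being tuned.
interface FourierTuning {
  baselineOffsetDb: number; // documented range: -30 to -5
  baselinePercentile: number; // documented range: 1 to 99
}

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Reduce filtering: move the offset further from 0 and lower the percentile.
function lessAggressive(t: FourierTuning, stepDb = 5, stepPct = 10): FourierTuning {
  return {
    baselineOffsetDb: clamp(t.baselineOffsetDb - stepDb, -30, -5),
    baselinePercentile: clamp(t.baselinePercentile - stepPct, 1, 99),
  };
}

// Increase filtering: move the offset closer to 0, leave the percentile alone.
function moreAggressive(t: FourierTuning, stepDb = 3): FourierTuning {
  return {
    ...t,
    baselineOffsetDb: clamp(t.baselineOffsetDb + stepDb, -30, -5),
  };
}

const defaults: FourierTuning = { baselineOffsetDb: -15, baselinePercentile: 85 };
const softer = lessAggressive(defaults);
const stronger = moreAggressive(defaults);
console.log({ softer, stronger });
```

Changing one step at a time and re-testing after each change keeps it clear which adjustment caused an improvement or a regression.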