Email address reading

Overview

Email addresses are one of the trickiest pieces of information for a voice agent to handle. They contain special characters (@, ., -, _), mixed-case text, and domain names that text-to-speech (TTS) engines often mispronounce or blur together when spoken aloud.

This guide covers three layers of the solution:

Built-in formatting — Vapi automatically transforms email characters for TTS so they sound natural, with zero configuration.
API configuration — You can fine-tune which formatters run, disable email formatting selectively, or customize the entire formatting pipeline.
Prompt engineering — You instruct the LLM how to collect, read back, and confirm emails in conversation so users feel confident their address was captured correctly.

How Vapi handles emails automatically

Vapi’s voice formatting plan runs a 14-step pipeline that transforms raw LLM text into natural-sounding speech before it reaches the TTS provider. The email formatter is step 8 in this pipeline. It replaces @ with “at” and . with “dot” and, as the examples below show, - with “dash” and _ with “underscore”, so the spoken output is intelligible without any prompt changes.

Raw LLM output	What the user hears
`john.doe@example.com`	“john dot doe at example dot com”
`SALES@company.org`	“SALES at company dot org”
`jane_smith-work@my-company.co.uk`	“jane underscore smith dash work at my dash company dot co dot uk”

The email formatter is enabled by default. You do not need to configure anything for basic email reading to work.
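
To make the transformation in the table above concrete, here is a minimal TypeScript sketch that produces the same spoken forms. It is not Vapi's implementation: the function name `speakEmails`, the symbol map, and the deliberately loose email pattern are all assumptions made for illustration.

```typescript
// Illustrative sketch only; not Vapi's implementation.
// Spoken replacements taken from the examples in the table above.
const SPOKEN_SYMBOLS: Record<string, string> = {
  "@": "at",
  ".": "dot",
  "-": "dash",
  "_": "underscore",
};

// Loose, assumed pattern for text that looks like an email address.
// A trailing sentence period is not matched because each domain label
// must contain at least one character after its dot.
const EMAIL_PATTERN = /[A-Za-z0-9._-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+/g;

// Replaces the symbols inside every email-like token with spoken words,
// leaving the rest of the text untouched.
function speakEmails(text: string): string {
  return text.replace(EMAIL_PATTERN, (email: string) =>
    email
      .replace(/[@._-]/g, (symbol: string) => ` ${SPOKEN_SYMBOLS[symbol]} `)
      .replace(/\s+/g, " ")
      .trim()
  );
}
```

A sketch like this can be handy in unit tests for your own tooling, for example to predict what a caller should hear for a given address. The real formatter may handle edge cases differently.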

Where email formatting fits in the pipeline

The formatter runs after acronym and dollar-amount formatting, and before date, time, and phone number formatting. Here is the full 14-step pipeline; the email formatter is step 8:

Step	Formatter key	What it does	Default
1	`removeAngleBrackets`	Removes `<...>` tags (except `<break>`, `<spell>`, `<< >>`)	On
2	`markdown`	Removes markdown symbols (`_`, `` ` ``, `~`)	On
3	`asterisk`	Removes text wrapped in `**` or `*`	Off
4	`newline`	Converts `\n` to `.` for smoother phrasing	On
5	`colon`	Replaces `:` with `.`	On
6	`acronym`	Formats acronyms (e.g., `NASA` to `nasa`)	On
7	`dollarAmount`	`$42.50` to “forty two dollars and fifty cents”	On
8	`email`	`@` to “at”, `.` to “dot” in email addresses	On
9	`date`	`2023-05-10` to “Wednesday, May 10, 2023”	On
10	`time`	`14:00` to “14”	On
11	`distance` / `unit` / `percentage`	`5km` to “5 kilometers”, `50%` to “50 percent”	On
12	`phoneNumber`	`123-456-7890` to “1 2 3 4 5 6 7 8 9 0”	On
13	`number`	Formats general numbers, years, decimals	On
14	`stripAsterisk`	Removes remaining `*` characters	On

For full details on every step, see the voice formatting plan reference.

Configuring email formatting via the API

The email formatter runs automatically with no configuration needed. However, you can customize the behavior through the formatPlan on your assistant’s voice configuration.

Configuration path

assistant.voice.chunkPlan.formatPlan

The relevant TypeScript type:

FormatPlan type definition

interface FormatPlan {
  enabled?: boolean;                                      // default: true
  numberToDigitsCutoff?: number;                          // default: 2025
  replacements?: FormatPlanReplacementsItem[];            // default: []
  formattersEnabled?: FormatPlanFormattersEnabledItem[];  // default: all formatters
}

The formattersEnabled array accepts any combination of these values: removeAngleBrackets, markdown, asterisk, newline, colon, acronym, dollarAmount, email, date, time, distance, unit, percentage, phoneNumber, number, stripAsterisk.

The formattersEnabled property was introduced on 2025-02-20. Before that date, you could only toggle all formatting on or off with the enabled flag. If you are using an older API version, use enabled: false to disable all formatting.

Default behavior (no configuration needed)

By default, all formatters — including email — are enabled. You do not need to set anything for email addresses to be read correctly:

Default -- email formatting is already on

{
  "voice": {
    "chunkPlan": {
      "formatPlan": {
        "enabled": true
      }
    }
  }
}

Enable only specific formatters

If you want tight control over which transformations run, pass only the formatter keys you need. This example enables only the email and phoneNumber formatters:

{
  "voice": {
    "chunkPlan": {
      "formatPlan": {
        "formattersEnabled": ["email", "phoneNumber"]
      }
    }
  }
}

When you set formattersEnabled, only the listed formatters run. All others are disabled. Make sure to include every formatter you need.

Disable email formatting while keeping all others

Omit email from the formattersEnabled array. The TTS provider will then receive the raw @ and . characters, and pronunciation depends entirely on the provider and your prompt:

All formatters except email

{
  "voice": {
    "chunkPlan": {
      "formatPlan": {
        "formattersEnabled": [
          "removeAngleBrackets",
          "markdown",
          "newline",
          "colon",
          "acronym",
          "dollarAmount",
          "date",
          "time",
          "distance",
          "unit",
          "percentage",
          "phoneNumber",
          "number",
          "stripAsterisk"
        ]
      }
    }
  }
}
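
Writing the full list by hand is error-prone, because a forgotten key silently disables that formatter. The TypeScript sketch below builds the same list programmatically. The helper `formatPlanWithout` is hypothetical and not part of any Vapi SDK; the key list is copied from the rows marked "On" in the pipeline table of this guide.

```typescript
// Formatter keys marked "On" in the pipeline table, in pipeline order.
const DEFAULT_ON_FORMATTERS = [
  "removeAngleBrackets",
  "markdown",
  "newline",
  "colon",
  "acronym",
  "dollarAmount",
  "email",
  "date",
  "time",
  "distance",
  "unit",
  "percentage",
  "phoneNumber",
  "number",
  "stripAsterisk",
] as const;

type FormatterKey = (typeof DEFAULT_ON_FORMATTERS)[number] | "asterisk";

// Hypothetical helper: returns a formatPlan fragment that lists every
// default-on formatter except the excluded ones.
function formatPlanWithout(
  ...excluded: FormatterKey[]
): { formattersEnabled: FormatterKey[] } {
  const formattersEnabled: FormatterKey[] = DEFAULT_ON_FORMATTERS.filter(
    (key) => !excluded.includes(key)
  );
  return { formattersEnabled };
}

// Fragment to place under assistant.voice.chunkPlan.formatPlan.
const formatPlanWithoutEmail = formatPlanWithout("email");
```

Serializing `formatPlanWithoutEmail` with `JSON.stringify` yields the same `formattersEnabled` array as the JSON example above.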

Disable all formatting

To send raw LLM output directly to TTS with no transformations at all:

{
  "voice": {
    "chunkPlan": {
      "formatPlan": {
        "enabled": false
      }
    }
  }
}

Disabling all formatting means numbers, currencies, dates, phone numbers, and emails will all be sent raw to the TTS provider. Most providers will produce unnatural or garbled speech for these patterns.

Why prompt engineering still matters

Even though TTS formatting handles the character-level pronunciation, the LLM still controls how the conversation flows. Without explicit instructions, the agent might:

Read the email once at normal speed and move on, leaving the user unsure.
Fail to spell out ambiguous parts (was it “Jon” or “John”?).
Mispronounce uncommon domain names.
Skip a confirmation step entirely.

Good prompt instructions solve these problems at the conversational level.

System prompt: collecting an email

When asking a user for their email, instruct the agent to be patient and explicit about what it needs. The following snippet can be added to your system prompt.

System prompt -- collecting email

[Email Collection]
When you need to collect the user's email address:
1. Ask clearly: "Could you please tell me your email address?"
2. Listen to the full response before repeating anything back.
3. Once you have the email, read it back using these pronunciation rules:
   - Say "@" as "at"
   - Say "." as "dot"
   - Say "-" as "dash"
   - Say "_" as "underscore"
4. After reading it back, ask "Is that correct?"
5. If the user says no, ask them to spell it out letter by letter.
6. Never guess or autocorrect the email. Use exactly what the user provides.

System prompt: reading back and confirming an email

The confirmation step is where agents commonly fail: they read the email too fast or only once. This snippet teaches the agent to slow down and spell when needed.

System prompt -- confirming email

[Email Confirmation]
When reading an email address back to the user:
1. Speak slowly and clearly. Pause briefly between each part of the email
   (username, "at", domain, "dot", extension).
2. For the username part, if it contains common words, say the words.
   If it is ambiguous or uncommon, spell it out letter by letter.
   For example:
   - "john.doe" → "john dot doe"
   - "jdoe42" → "j, d, o, e, four, two"
   - "msmith" → "m, s, m, i, t, h"
3. For the domain, use the familiar name if it is a well-known provider:
   - "gmail.com" → "gmail dot com"
   - "yahoo.com" → "yahoo dot com"
   - "outlook.com" → "outlook dot com"
   - "hotmail.com" → "hotmail dot com"
   If the domain is uncommon, spell it out letter by letter.
4. Always end with: "Is that correct?"
5. If the user corrects any part, repeat the entire email back again
   after applying the correction.

Spelling out letter by letter

For ambiguous usernames or unfamiliar domains, letter-by-letter spelling removes all doubt. Add this instruction to your prompt so the agent knows when and how to spell.

System prompt -- letter-by-letter spelling

[Letter-by-Letter Spelling]
When spelling out part of an email:
- Say each letter individually with a brief pause between letters.
- For numbers, say the digit name ("one", "two", "three"), not the numeral.
- For uppercase vs lowercase, only mention case if the email is case-sensitive
  or the user specifically asks.
- Use the NATO phonetic alphabet only if the user is having trouble
  understanding individual letters. For example:
  "b as in bravo, d as in delta"
Most email providers treat addresses as case-insensitive, so you typically do not need to distinguish uppercase from lowercase. Your prompt can note this to keep the conversation simpler.
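
If you generate few-shot examples for your prompt, or want to check an agent's readback in tests, the spelling rules above can be sketched in a few lines of TypeScript. The function name `spellOut` is hypothetical, and lowercasing the input is an assumption based on the case-insensitivity note above.

```typescript
const DIGIT_NAMES = [
  "zero", "one", "two", "three", "four",
  "five", "six", "seven", "eight", "nine",
];

const SYMBOL_NAMES = new Map<string, string>([
  ["@", "at"],
  [".", "dot"],
  ["-", "dash"],
  ["_", "underscore"],
]);

// Spells a string one character at a time: digits become digit names,
// known symbols become words, and letters are kept as-is (lowercased).
function spellOut(part: string): string {
  return Array.from(part.toLowerCase())
    .map((ch) => {
      if (ch >= "0" && ch <= "9") return DIGIT_NAMES[Number(ch)];
      return SYMBOL_NAMES.get(ch) ?? ch;
    })
    .join(", ");
}
```

For example, `spellOut("jdoe42")` returns "j, d, o, e, four, two", matching the readback format used in the confirmation prompt.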

Handling common domains naturally

You can make the agent sound more natural by teaching it to recognize popular email domains and say them as single words rather than spelling them out.

System prompt -- common domains

[Common Email Domains]
When reading these domains, say them as words, not spelled out:
- gmail.com → "gmail dot com"
- yahoo.com → "yahoo dot com"
- outlook.com → "outlook dot com"
- hotmail.com → "hotmail dot com"
- icloud.com → "icloud dot com"
- aol.com → "A O L dot com"
- protonmail.com → "proton mail dot com"
For any domain not in this list, spell it out letter by letter to avoid confusion.
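
The lookup-then-spell rule in this snippet can also be expressed as code, which is useful if you pre-compute the expected readback for test cases. This is a sketch under assumptions: `speakDomain` is a hypothetical helper, and the fallback leaves digits as numerals for brevity.

```typescript
// Spoken forms copied from the prompt snippet above.
const SPOKEN_DOMAINS = new Map<string, string>([
  ["gmail.com", "gmail dot com"],
  ["yahoo.com", "yahoo dot com"],
  ["outlook.com", "outlook dot com"],
  ["hotmail.com", "hotmail dot com"],
  ["icloud.com", "icloud dot com"],
  ["aol.com", "A O L dot com"],
  ["protonmail.com", "proton mail dot com"],
]);

// Returns the familiar spoken form for well-known domains and falls back
// to letter-by-letter spelling for everything else.
function speakDomain(domain: string): string {
  const key = domain.toLowerCase();
  const known = SPOKEN_DOMAINS.get(key);
  if (known !== undefined) return known;

  return key
    .split(".")
    .map((label) =>
      Array.from(label)
        .map((ch) => (ch === "-" ? "dash" : ch))
        .join(", ")
    )
    .join(", dot, ");
}
```

With this sketch, `speakDomain("Gmail.com")` gives "gmail dot com", while an unlisted domain such as `newcompany.io` is spelled out one letter at a time.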

Complete example: appointment booking agent

Below is a full system prompt section you can copy into your assistant configuration. It combines all the techniques above into a single, production-ready block.

Complete system prompt section

[Identity]
You are Sarah, a friendly appointment scheduling assistant for Acme Dental.

[Email Collection and Confirmation]
When you need the user's email address:
1. Ask: "What email address should we send the confirmation to?"
2. Wait for the full response. Do not interrupt.
3. Read the email back to the user following these rules:
   - Say "@" as "at"
   - Say "." as "dot"
   - Say "-" as "dash"
   - Say "_" as "underscore"
   - Speak slowly with a brief pause between each part.
   - For well-known domains (gmail, yahoo, outlook, hotmail, icloud),
     say the domain name naturally.
   - For unfamiliar domains, spell them out letter by letter.
   - For the username, if it is a recognizable name or word, say it normally.
     If it looks like an abbreviation or random string, spell it out letter
     by letter.
4. After reading the email, ask: "Did I get that right?"
5. If the user says no:
   - Ask: "Could you spell it out for me letter by letter?"
   - Listen carefully, then read the corrected version back.
   - Ask again: "Is that correct now?"
6. Do not proceed to the next step until the user confirms the email.
7. Never modify, autocorrect, or guess any part of the email address.

[Example Conversation]
Agent: "What email address should we send the confirmation to?"
User: "It's jsmith42@newcompany.io"
Agent: "Let me read that back. j, s, m, i, t, h, four, two ...at... new company
        ...dot... i, o. Did I get that right?"
User: "Yes, that's correct."

Including an example conversation in your system prompt helps the LLM understand the exact pacing and format you expect. This is one of the most effective techniques for consistent behavior.
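
To show how the prompt and the formatting configuration fit together, here is a hedged TypeScript sketch that assembles a partial assistant configuration. Only the `voice.chunkPlan.formatPlan` path comes from this guide. The `model.messages` shape and the helper name `buildAssistantFragment` are assumptions for illustration, so verify the field names against the API reference before relying on them.

```typescript
// Partial assistant configuration. Only voice.chunkPlan.formatPlan is
// documented in this guide; the model.messages shape is an assumption.
interface AssistantFragment {
  model: { messages: { role: "system"; content: string }[] };
  voice: { chunkPlan: { formatPlan: { enabled: boolean } } };
}

// Hypothetical helper: pairs a system prompt with voice formatting left on,
// so the email formatter and the prompt instructions work together.
function buildAssistantFragment(systemPrompt: string): AssistantFragment {
  return {
    model: { messages: [{ role: "system", content: systemPrompt }] },
    voice: { chunkPlan: { formatPlan: { enabled: true } } },
  };
}
```

Keeping the prompt and the format plan in one place makes it harder to ship an assistant whose prompt assumes email formatting while the plan has it disabled.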

Using pronunciation dictionaries for domains

If your agents frequently encounter a specific company or domain name that TTS mispronounces, you can use pronunciation dictionaries (available with ElevenLabs voices) to set the correct pronunciation at the TTS level.

For example, if the domain “vapi.ai” is being pronounced as “vappy dot ay-eye”, you could create an alias rule:

Pronunciation dictionary rule

{
  "rules": [
    {
      "stringToReplace": "vapi",
      "type": "alias",
      "alias": "vaahpee"
    }
  ]
}

This approach is complementary to prompt engineering — pronunciation dictionaries fix TTS-level pronunciation, while prompt instructions control the conversational flow.
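
Before uploading alias rules, it can help to preview what a piece of text would look like after substitution. The TypeScript sketch below does that locally. It is not how the TTS provider applies dictionaries: exact, case-sensitive substring replacement is an assumption, and `previewAliases` is a hypothetical helper.

```typescript
// Mirrors the rule shape shown in the JSON example above.
interface AliasRule {
  stringToReplace: string;
  type: "alias";
  alias: string;
}

// Local preview only. Assumes exact, case-sensitive substring replacement,
// which may differ from how the TTS provider matches rules.
function previewAliases(text: string, rules: AliasRule[]): string {
  return rules.reduce(
    (out, rule) =>
      rule.stringToReplace === ""
        ? out // ignore empty patterns instead of splitting every character
        : out.split(rule.stringToReplace).join(rule.alias),
    text
  );
}

const exampleRules: AliasRule[] = [
  { stringToReplace: "vapi", type: "alias", alias: "vaahpee" },
];
```

Running `previewAliases("support at vapi dot ai", exampleRules)` yields "support at vaahpee dot ai", which makes it easy to spot rules that match more text than intended.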

Using custom keywords for transcription accuracy

If the speech-to-text (STT) transcriber is mishearing specific email domains or usernames, custom keywords can boost transcription accuracy for those terms.

For example, if users frequently mention their company email domain “contoso.com” and the transcriber misinterprets it, you can add “contoso” as a custom keyword to improve recognition.

Best practices

Always confirm the full email address

Never assume an email is correct after hearing it once. Always read the complete email back and wait for confirmation before proceeding. This single step prevents many email capture errors.

Use a two-pass approach for difficult emails

First, try reading the email back naturally (words and common domains). If the user says it is wrong, switch to letter-by-letter spelling for the entire address. This keeps simple emails fast while still handling complex ones reliably.
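
The two-pass idea can be sketched as a function that chooses the readback style from the attempt number. This is an illustration only: in practice the LLM produces the readback from your prompt, and the names `readBack`, `natural`, and `spelled` are hypothetical.

```typescript
const SYMBOL_WORDS = new Map<string, string>([
  [".", "dot"],
  ["-", "dash"],
  ["_", "underscore"],
]);

const DIGIT_WORDS = [
  "zero", "one", "two", "three", "four",
  "five", "six", "seven", "eight", "nine",
];

// First pass: keep words intact and only replace symbols.
function natural(part: string): string {
  return part
    .replace(/[._-]/g, (ch: string) => ` ${SYMBOL_WORDS.get(ch) ?? ch} `)
    .replace(/\s+/g, " ")
    .trim();
}

// Second pass: spell every character individually.
function spelled(part: string): string {
  return Array.from(part)
    .map((ch) =>
      /[0-9]/.test(ch) ? DIGIT_WORDS[Number(ch)] : (SYMBOL_WORDS.get(ch) ?? ch)
    )
    .join(", ");
}

// Attempt 1 reads naturally; any later attempt spells the whole address.
function readBack(email: string, attempt: number): string {
  const at = email.lastIndexOf("@");
  if (at < 1 || at === email.length - 1) {
    throw new Error(`Not an email address: ${email}`);
  }
  const local = email.slice(0, at);
  const domain = email.slice(at + 1);
  return attempt <= 1
    ? `${natural(local)} at ${natural(domain)}`
    : `${spelled(local)}, at, ${spelled(domain)}`;
}
```

For `jon@acme.co`, the first attempt produces "jon at acme dot co". If the user rejects it, the second attempt produces the fully spelled form, which removes the "jon" versus "john" ambiguity.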

Do not autocorrect or assume

Instruct the agent to never modify any part of the email address. Common mistakes include changing “jon” to “john” or assuming “.com” when the user said “.co”. Treat the email as an exact string.

Handle interruptions gracefully

Users sometimes interrupt mid-readback with a correction. Instruct the agent to accept the correction, incorporate it, and then restart the full readback from the beginning so both parties are aligned.

Keep voice formatting enabled

Vapi’s built-in email formatter handles the TTS-level conversion of “@” and “.” automatically. Disabling the voice formatting plan causes the TTS provider to receive raw characters, which may produce garbled output. Keep voice.chunkPlan.formatPlan.enabled set to true (the default), and if you set formattersEnabled, make sure email is in the list.

Common issues

TTS reads the email as a URL or gibberish

This usually happens when voice formatting is disabled or when email is missing from formattersEnabled. Verify that voice.chunkPlan.formatPlan.enabled is set to true (the default) and that any formattersEnabled list includes email. See the voice formatting plan for details.

Agent skips the confirmation step

Add an explicit instruction like “Do not proceed until the user confirms the email” to your system prompt. Reinforcing this with an example conversation in the prompt helps the LLM follow the flow consistently.

Agent modifies or autocorrects the email

LLMs sometimes try to be helpful by fixing perceived typos. Add a clear rule: “Never modify, autocorrect, or guess any part of the email address. Use exactly what the user provides.”

User says a letter but transcriber hears a different one

Letters like “b” and “d”, or “m” and “n”, sound similar over phone audio. If this happens frequently, instruct the agent to ask the user to use the NATO phonetic alphabet (“b as in bravo”) or use custom keywords to improve transcription accuracy for commonly confused terms.
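
If you want to include phonetic readbacks in example conversations, or check them in tests, the "b as in bravo" format can be generated with a small lookup table. This is a sketch: `phonetic` is a hypothetical helper, and the table uses the common English spellings "alpha" and "juliet" rather than the official "alfa" and "juliett".

```typescript
// NATO phonetic alphabet, using common English spellings.
const NATO_WORDS: Record<string, string> = {
  a: "alpha", b: "bravo", c: "charlie", d: "delta", e: "echo",
  f: "foxtrot", g: "golf", h: "hotel", i: "india", j: "juliet",
  k: "kilo", l: "lima", m: "mike", n: "november", o: "oscar",
  p: "papa", q: "quebec", r: "romeo", s: "sierra", t: "tango",
  u: "uniform", v: "victor", w: "whiskey", x: "x-ray", y: "yankee",
  z: "zulu",
};

// Expands each letter to "<letter> as in <word>". Characters without a
// phonetic word (digits, symbols) are passed through unchanged.
function phonetic(part: string): string {
  return Array.from(part.toLowerCase())
    .map((ch) => {
      const word = NATO_WORDS[ch];
      return word ? `${ch} as in ${word}` : ch;
    })
    .join(", ");
}
```

For example, `phonetic("bd")` returns "b as in bravo, d as in delta", the same phrasing used in the spelling prompt earlier in this guide.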

Next steps

Now that your agent handles email addresses reliably, explore these related guides:

Voice formatting plan — Full reference for all 14 formatting steps and customization options.
Prompting guide — General techniques for writing effective voice AI prompts.
Pronunciation dictionaries — Fine-tune TTS pronunciation for specific words and names.
Custom keywords — Improve transcription accuracy for specific terms.
Speech configuration — Configure endpointing, silence detection, and other speech settings.