Web Calling

Get started with Vapi on the Web.

Anywhere you can run client-side JavaScript, you can run Vapi: from vanilla JavaScript all the way to component-based applications built with React and Next.js.

Installation

Install the package:

```shell
yarn add @vapi-ai/web
```

or with npm:

```shell
npm install @vapi-ai/web
```

Import the package:

```javascript
import Vapi from "@vapi-ai/web";
```

Then, create a new instance of the Vapi class, passing your Public Key as a parameter to the constructor:

```javascript
const vapi = new Vapi("your-public-key");
```

You can find your public key in the Vapi Dashboard.
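The Vapi instance also emits events you can subscribe to for the call lifecycle and live transcripts. The event names below (`call-start`, `call-end`, `message`, `error`) come from the SDK; `attachLogging` is a hypothetical helper of ours that works with any object exposing `.on()`, including the Vapi instance. A minimal sketch:

```javascript
// Hypothetical helper: registers logging handlers on any emitter that
// exposes .on(event, handler), such as the Vapi client instance.
function attachLogging(vapi, log = console.log) {
  vapi.on("call-start", () => log("call started"));
  vapi.on("call-end", () => log("call ended"));
  vapi.on("message", (msg) => {
    // Transcript messages carry the live speech-to-text output.
    if (msg.type === "transcript") log(`${msg.role}: ${msg.transcript}`);
  });
  vapi.on("error", (err) => log(`error: ${err.message || err}`));
}
```

After creating the instance, call `attachLogging(vapi)` so you can watch the call lifecycle in the console while developing.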

Starting a Call

Assistants can either be created on the fly (temporary) or created & persisted to your account (persistent).

Option 1: Temporary Assistant

If you want to customize properties from the frontend on the fly, you can create an assistant configuration object and pass it to the .start() method.

Here are the options we will pass to .start():

```javascript
const assistantOptions = {
  name: "Vapi’s Pizza Front Desk",
  firstMessage: "Vappy’s Pizzeria speaking, how can I help you?",
  transcriber: {
    provider: "deepgram",
    model: "nova-2",
    language: "en-US",
  },
  voice: {
    provider: "playht",
    voiceId: "jennifer",
  },
  model: {
    provider: "openai",
    model: "gpt-4",
    messages: [
      {
        role: "system",
        content: `You are a voice assistant for Vappy’s Pizzeria, a pizza shop located on the Internet.

Your job is to take the order of customers calling in. The menu has only 3 types
of items: pizza, sides, and drinks. There are no other types of items on the menu.

1) There are 3 kinds of pizza: cheese pizza, pepperoni pizza, and vegetarian pizza
(often called "veggie" pizza).
2) There are 3 kinds of sides: french fries, garlic bread, and chicken wings.
3) There are 2 kinds of drinks: soda, and water. (if a customer asks for a
brand name like "coca cola", just let them know that we only offer "soda")

Customers can only order 1 of each item. If a customer tries to order more
than 1 item within each category, politely inform them that only 1 item per
category may be ordered.

Customers must order 1 item from at least 1 category to have a complete order.
They can order just a pizza, or just a side, or just a drink.

Be sure to introduce the menu items, don't assume that the caller knows what
is on the menu (most appropriate at the start of the conversation).

If the customer goes off-topic or off-track and talks about anything but the
process of ordering, politely steer the conversation back to collecting their order.

Once you have all the information you need pertaining to their order, you can
end the conversation. You can say something like "Awesome, we'll have that ready
for you in 10-20 minutes." to naturally let the customer know the order has been
fully communicated.

It is important that you collect the order in an efficient manner (succinct replies
& direct questions). You only have 1 task here, and it is to collect the customer's
order, then end the conversation.

- Be sure to be kind of funny and witty!
- Keep all your responses short and simple. Use casual language, phrases like "Umm...", "Well...", and "I mean" are preferred.
- This is a voice conversation, so keep your responses short, like in a real conversation. Don't ramble for too long.`,
      },
    ],
  },
};
```

Let’s break down the configuration options we passed:

  • name: the display name for the assistant in our dashboard (for internal purposes only)
  • firstMessage: the first message that our assistant will say when it picks up the web call
  • transcriber: the transcriber is what turns user speech into processable text for our LLM. This is the first step in the end-to-end voice pipeline. We are using Deepgram for transcription, specifically, their Nova 2 model. We also set the language to be transcribed as English.
  • voice: the final portion of the voice pipeline is turning LLM output-text into speech. This process is called “Text-to-speech” (or TTS for short). We use a voice provider called PlayHT, & a voice provided by them called jennifer.
  • model: for our LLM, we use gpt-4 (from OpenAI) & set our system prompt for the assistant. The system prompt configures the context, role, personality, instructions and so on for the assistant. In our case, the system prompt above will give us the behaviour we want.

Now we can call .start(), passing the temporary assistant configuration:

```javascript
vapi.start(assistantOptions);
```

More configuration options can be found in the Assistant API reference.
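In a typical UI you wire `.start()` and `.stop()` to a single toggle button. The sketch below assumes the SDK's `call-start`/`call-end` events and its `.stop()` method; `createCallToggle` is a hypothetical helper of ours, not part of the SDK:

```javascript
// Hypothetical helper: returns a click handler that starts a call when
// idle and stops it when active, tracking state via the SDK's events.
function createCallToggle(vapi, assistantOptions) {
  let active = false;
  vapi.on("call-start", () => { active = true; });
  vapi.on("call-end", () => { active = false; });
  return () => {
    if (active) {
      vapi.stop();
    } else {
      vapi.start(assistantOptions);
    }
  };
}
```

You would attach the returned function to a button, e.g. `button.onclick = createCallToggle(vapi, assistantOptions);`.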

Option 2: Persistent Assistant

If you want to create an assistant that you can reuse across multiple calls, you can create a persistent assistant in the Vapi Dashboard. Here’s how you can do that:

If you haven’t already signed up, you’ll need an account before you can use the web dashboard. When you visit dashboard.vapi.ai you may see something like this:

Sign-up for an account (or log-in to your existing account) — you will then find yourself inside the web dashboard. It will look something like this:

Your dashboard may look a bit different if you already have an account with assistants in it. The main idea is that we’re in the dashboard now.

Now that you’re in your dashboard, we’re going to create an assistant.

Assistants are at the heart of how Vapi models AI voice agents — we will be setting certain properties on a new assistant to model an order-taking agent.

Once in the “Assistants” dashboard tab (you should be there by default after logging in), you will see a button to create a new assistant.

Ensure you are in the 'Assistants' dashboard tab, then this button will allow you to begin the assistant creation flow.

After clicking the create new assistant button, you will see a pop-up modal that asks you to pick a starter template. For our example, we will start from a blank slate, so choose the Blank Template option.

Choose the “Blank Template” option in the template selection modal.

You will then be able to name your assistant — you can name it whatever you’d like (Vapi’s Pizza Front Desk, for example):

This name is only for internal labeling use. It is not an identifier, nor will the assistant be aware of this name.

Name your assistant.

Once you have named your assistant, you can hit “Create” to create it. You will then see something like this:

The assistant overview. You can edit your assistant’s transcriber, model, & voice — and edit other advanced configuration.

This is the assistant overview. It lets you edit different attributes of your assistant, as well as see cost & latency projections for each portion of its voice pipeline (very important data to have handy when building out your assistants).

Now we’re going to set the “brains” of the assistant, the large language model. We’re going to be using GPT-4 (from OpenAI) for this demo (though you’re free to use GPT-3.5, or any one of your favorite LLMs).

Before we proceed, we can set our provider key for OpenAI (this is just your OpenAI secret key).

You can see all of your provider keys in the “Provider Keys” dashboard tab. You can also go directly to dashboard.vapi.ai/keys.

Vapi uses the provider keys you supply to communicate with LLM, TTS, & STT vendors on your behalf. Ideally, set keys for the vendors you intend to use ahead of time.

We set our provider key for OpenAI so Vapi can make requests to their API.

While you’re here, it’s a good idea to set up provider keys for any other providers you’re familiar with & intend to use later.

Assistants can optionally be configured with a First Message. This first message will be spoken by your assistant when either:

  • A Web Call Connects: when a web call is started with your assistant
  • An Inbound Call is Picked-up: an inbound call is picked-up & answered by your assistant
  • An Outbound Call is Dialed & Picked-up: an outbound call is dialed by your assistant & a person picks up

Note that this first message cannot be interrupted & is guaranteed to be spoken. Certain use cases need a first message, while others do not.

For our use case, we will want a first message. It would be ideal for us to have a first message like this:

Vappy’s Pizzeria speaking, how can I help you?

Some text-to-speech voices may struggle to pronounce “Vapi” correctly, spelling it out letter by letter as “V. A. P. I.”

Some aspects of configuring your voice pipeline will require tweaks like this to get the target behaviour you want.

This will be spoken by the assistant when a web or inbound phone call is received.

We will now set the System Prompt for our assistant. If you’re familiar with OpenAI’s API, this is the first prompt in the message list that we feed our LLM (learn more about prompt engineering on the OpenAI docs).

The system prompt can be used to configure the context, role, personality, instructions and so on for the assistant. In our case, a system prompt like this will give us the behaviour we want:

You are a voice assistant for Vappy’s Pizzeria,
a pizza shop located on the Internet.
Your job is to take the order of customers calling in. The menu has only 3 types
of items: pizza, sides, and drinks. There are no other types of items on the menu.
1) There are 3 kinds of pizza: cheese pizza, pepperoni pizza, and vegetarian pizza
(often called "veggie" pizza).
2) There are 3 kinds of sides: french fries, garlic bread, and chicken wings.
3) There are 2 kinds of drinks: soda, and water. (if a customer asks for a
brand name like "coca cola", just let them know that we only offer "soda")
Customers can only order 1 of each item. If a customer tries to order more
than 1 item within each category, politely inform them that only 1 item per
category may be ordered.
Customers must order 1 item from at least 1 category to have a complete order.
They can order just a pizza, or just a side, or just a drink.
Be sure to introduce the menu items, don't assume that the caller knows what
is on the menu (most appropriate at the start of the conversation).
If the customer goes off-topic or off-track and talks about anything but the
process of ordering, politely steer the conversation back to collecting their order.
Once you have all the information you need pertaining to their order, you can
end the conversation. You can say something like "Awesome, we'll have that ready
for you in 10-20 minutes." to naturally let the customer know the order has been
fully communicated.
It is important that you collect the order in an efficient manner (succinct replies
& direct questions). You only have 1 task here, and it is to collect the customer's
order, then end the conversation.
- Be sure to be kind of funny and witty!
- Keep all your responses short and simple. Use casual language, phrases like "Umm...", "Well...", and "I mean" are preferred.
- This is a voice conversation, so keep your responses short, like in a real conversation. Don't ramble for too long.

You can copy & paste the above prompt into the System Prompt field. Now the model configuration for your assistant should look something like this:

Note how our model provider is set to OpenAI & the model is set to GPT-4.

The transcriber is what turns user speech into processable text for our LLM. This is the first step in the end-to-end voice pipeline.

We will be using Deepgram (which provides blazing-fast & accurate Speech-to-Text) as our STT provider.

We will set our provider key for them in “Provider Keys”:

We will set the model to Nova 2 & the language to en for English. Now your assistant’s transcriber configuration should look something like this:

Note how our transcriber is set to 'deepgram', the model is set to 'Nova 2', & the language is set to English.

The final portion of the voice pipeline is turning LLM output-text into speech. This process is called “Text-to-speech” (or TTS for short).

We will be using a voice provider called PlayHT (they have very conversational voices), & a voice provided by them labeled Jennifer (female, en-US).

You are free to use your favorite TTS voice platform here. ElevenLabs is another alternative — by now you should get the flow of plugging in vendors into Vapi (add provider key + pick provider in assistant config).

You can skip the next step(s) if you don’t intend to use PlayHT.

If you haven’t already, sign up for an account with PlayHT at play.ht. Since their flows are liable to change, the main things you’ll need from them are your API Key & User ID.

You will want to select playht in the “provider” field, & Jennifer in the “voice” field. We will leave all of the other settings untouched.

Each voice provider offers a host of settings you can modulate to customize voices. Here we will leave all the defaults alone.

Additional fields can be customized via the Assistant API instead.
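As a sketch of what that looks like: the helper below assembles a request against the Assistant API (assuming a `PATCH /assistant/:id` endpoint; `buildAssistantPatch` is a hypothetical helper of ours). This must run server-side with your Private Key, never the Public Key used in the browser.

```javascript
// Hypothetical helper: assembles a fetch request that updates fields on a
// persistent assistant via the Assistant API (server-side only).
function buildAssistantPatch(assistantId, privateKey, patch) {
  return {
    url: `https://api.vapi.ai/assistant/${assistantId}`,
    options: {
      method: "PATCH",
      headers: {
        Authorization: `Bearer ${privateKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(patch),
    },
  };
}

// Usage (server-side):
// const { url, options } = buildAssistantPatch(assistantId, privateKey, {
//   voice: { provider: "playht", voiceId: "jennifer" },
// });
// const res = await fetch(url, options);
```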

Then, you can copy the assistant’s ID at the top of the assistant detail page:

Now we can call .start(), passing the persistent assistant’s ID:

```javascript
vapi.start("79f3XXXX-XXXX-XXXX-XXXX-XXXXXXXXce48");
```

If you need to override any assistant settings or set template variables, you can pass assistantOverrides as the second argument.

For example, if the first message is “Hello {{name}}”, you can set assistantOverrides to replace {{name}} with John:

```javascript
const assistantOverrides = {
  transcriber: {
    provider: "deepgram",
    model: "nova-2",
    language: "en-US",
  },
  recordingEnabled: false,
  variableValues: {
    name: "John",
  },
};

vapi.start("79f3XXXX-XXXX-XXXX-XXXX-XXXXXXXXce48", assistantOverrides);
```