Streaming chat

Build real-time chat experiences with token-by-token responses like ChatGPT

Overview

Build a real-time chat interface that displays responses as they’re generated, creating an engaging user experience similar to ChatGPT. Perfect for interactive applications where users expect immediate visual feedback.

What You’ll Build:

  • Real-time streaming chat interface with progressive text display
  • Context management across multiple messages
  • A basic TypeScript implementation you can extend for production use

Prerequisites

  • Completed Chat quickstart tutorial
  • Basic knowledge of TypeScript/JavaScript and async/await

Scenario

We’ll enhance the TechFlow support chat from the quickstart to provide real-time streaming responses. Users will see text appear progressively as the AI generates it.


1. Enable Streaming in Your Requests

Step 1: Add the stream parameter

Modify your chat request to enable streaming by adding "stream": true:

Streaming Chat Request
curl -X POST https://api.vapi.ai/chat \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "assistantId": "your-assistant-id",
    "input": "Explain how to set up API authentication in detail",
    "stream": true
  }'
Step 2: Understand the streaming response format

Instead of a single JSON response, you’ll receive Server-Sent Events (SSE):

SSE Event Format
// Example SSE events received:
data: {"id":"stream_123","path":"chat.output[0].content","delta":"Hello"}
data: {"id":"stream_123","path":"chat.output[0].content","delta":" there!"}
data: {"id":"stream_123","path":"chat.output[0].content","delta":" How can"}
data: {"id":"stream_123","path":"chat.output[0].content","delta":" I help?"}

// TypeScript interface for SSE events:
interface SSEEvent {
  id: string;
  path: string;
  delta: string;
}
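
Each data: line is an independent JSON payload, so parsing can be isolated in a small helper. The sketch below is one way to do it; the parseSSELine name is illustrative and not part of the Vapi API.

Parse an SSE Line
// Hypothetical helper: turn one "data: ..." line into an SSEEvent, or null if it isn't one
function parseSSELine(line: string): SSEEvent | null {
  if (!line.startsWith('data: ')) return null;
  try {
    return JSON.parse(line.slice(6)) as SSEEvent;
  } catch {
    // Ignore malformed or partial lines rather than crashing the stream
    return null;
  }
}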

2. Basic TypeScript Streaming Implementation

Step 1: Create a simple streaming function

Here’s a basic streaming implementation:

streaming-chat.ts
async function streamChatMessage(
  message: string,
  previousChatId?: string
): Promise<string> {
  const response = await fetch('https://api.vapi.ai/chat', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer YOUR_API_KEY',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      assistantId: 'your-assistant-id',
      input: message,
      stream: true,
      ...(previousChatId && { previousChatId })
    })
  });

  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }

  const reader = response.body?.getReader();
  if (!reader) throw new Error('No reader available');

  const decoder = new TextDecoder();
  let fullResponse = '';
  let buffer = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    // Decode incrementally and keep any partial line for the next chunk
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop() ?? '';

    for (const line of lines) {
      if (line.startsWith('data: ')) {
        const data = JSON.parse(line.slice(6));
        if (data.path && data.delta) {
          fullResponse += data.delta;
          process.stdout.write(data.delta);
        }
      }
    }
  }

  return fullResponse;
}
Step 2: Test the streaming function

Try it out:

Test Streaming
const response = await streamChatMessage("Explain API rate limiting in detail");
console.log('\nComplete response:', response);
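
Network failures or non-2xx responses surface as thrown errors from streamChatMessage. A minimal sketch of catching them, assuming you simply want to log the failure instead of crashing:

Handle Streaming Errors
try {
  const response = await streamChatMessage("Explain API rate limiting in detail");
  console.log('\nComplete response:', response);
} catch (error) {
  // Covers fetch/network errors and the explicit throws in streamChatMessage
  console.error('Streaming failed:', error);
}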

3. Streaming with Context Management

Step 1: Handle conversation context

Maintain context across multiple streaming messages:

context-streaming.ts
async function createStreamingConversation() {
  let lastChatId: string | undefined;

  async function sendMessage(input: string): Promise<string> {
    const response = await fetch('https://api.vapi.ai/chat', {
      method: 'POST',
      headers: {
        'Authorization': 'Bearer YOUR_API_KEY',
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        assistantId: 'your-assistant-id',
        input: input,
        stream: true,
        // Link this message to the previous chat so the assistant keeps context
        ...(lastChatId && { previousChatId: lastChatId })
      })
    });

    if (!response.ok) {
      throw new Error(`Request failed with status ${response.status}`);
    }

    const reader = response.body?.getReader();
    if (!reader) throw new Error('No reader available');

    const decoder = new TextDecoder();
    let fullContent = '';
    let currentChatId: string | undefined;
    let buffer = '';

    while (true) {
      const { done, value } = await reader.read();
      if (done) break;

      // Decode incrementally and keep any partial line for the next chunk
      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split('\n');
      buffer = lines.pop() ?? '';

      for (const line of lines) {
        if (line.startsWith('data: ')) {
          const event = JSON.parse(line.slice(6));

          // The first event carries the chat ID needed for follow-up messages
          if (event.id && !currentChatId) {
            currentChatId = event.id;
          }

          if (event.path && event.delta) {
            fullContent += event.delta;
            process.stdout.write(event.delta);
          }
        }
      }
    }

    if (currentChatId) {
      lastChatId = currentChatId;
    }

    return fullContent;
  }

  return { sendMessage };
}
Step 2: Use the conversation manager

Test Context
const conversation = await createStreamingConversation();

await conversation.sendMessage("My name is Alice");
console.log('\n---');
await conversation.sendMessage("What's my name?"); // Should remember Alice
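
The examples above print deltas with process.stdout.write, which only works in Node.js. In a browser chat UI you would append each delta to the page instead. A minimal sketch, assuming a <div id="chat-output"> element exists and that you call handleDelta wherever the examples call process.stdout.write:

Render Deltas in the Browser
// Hypothetical browser-side handler: append each streamed chunk to the page
const output = document.getElementById('chat-output');

function handleDelta(delta: string): void {
  if (output) {
    // textContent keeps the raw streamed text safe from HTML injection
    output.textContent += delta;
  }
}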

Next Steps

Enhance your streaming chat further.

Need help? Chat with the team on our Discord or mention us on X/Twitter.