Streaming chat

Build real-time chat experiences with token-by-token responses like ChatGPT

Overview

Build a real-time chat interface that displays responses as they’re generated, creating an engaging user experience similar to ChatGPT. Perfect for interactive applications where users expect immediate visual feedback.

What You’ll Build:

  • Real-time streaming chat interface with progressive text display
  • Context management across multiple messages
  • A basic TypeScript implementation you can extend for production use

Prerequisites

  • Completed Chat quickstart tutorial
  • Basic knowledge of TypeScript/JavaScript and async/await

Scenario

We’ll enhance the TechFlow support chat from the quickstart to provide real-time streaming responses. Users will see text appear progressively as the AI generates it.


1. Enable Streaming in Your Requests

Step 1: Add the stream parameter

Modify your chat request to enable streaming by adding "stream": true:

Streaming Chat Request
curl -X POST https://api.vapi.ai/chat \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "assistantId": "your-assistant-id",
    "input": "Explain how to set up API authentication in detail",
    "stream": true
  }'
Step 2: Understand the streaming response format

Instead of a single JSON response, you’ll receive Server-Sent Events (SSE):

SSE Event Format
// Example SSE events received:
data: {"id":"stream_123","path":"chat.output[0].content","delta":"Hello"}
data: {"id":"stream_123","path":"chat.output[0].content","delta":" there!"}
data: {"id":"stream_123","path":"chat.output[0].content","delta":" How can"}
data: {"id":"stream_123","path":"chat.output[0].content","delta":" I help?"}

// TypeScript interface for SSE events:
interface SSEEvent {
  id: string;
  path: string;
  delta: string;
}
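
Each data: line is an independent JSON payload, so parsing can be isolated in a small helper. The sketch below is one way to do it; the parseSSELine name is illustrative and not part of the Vapi API.

Parse an SSE Line
// Hypothetical helper: turn one "data: ..." line into an SSEEvent, or null if it isn't one
function parseSSELine(line: string): SSEEvent | null {
  if (!line.startsWith('data: ')) return null;
  try {
    return JSON.parse(line.slice(6)) as SSEEvent;
  } catch {
    // Ignore malformed or partial lines rather than crashing the stream
    return null;
  }
}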

2. Basic TypeScript Streaming Implementation

Step 1: Create a simple streaming function

Here’s a basic streaming implementation:

streaming-chat.ts
async function streamChatMessage(
  message: string,
  previousChatId?: string
): Promise<string> {
  const response = await fetch('https://api.vapi.ai/chat', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer YOUR_API_KEY',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      assistantId: 'your-assistant-id',
      input: message,
      stream: true,
      ...(previousChatId && { previousChatId })
    })
  });

  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }

  const reader = response.body?.getReader();
  if (!reader) throw new Error('No reader available');

  const decoder = new TextDecoder();
  let fullResponse = '';
  let buffer = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    // Decode incrementally and keep any partial line for the next chunk
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop() ?? '';

    for (const line of lines) {
      if (line.startsWith('data: ')) {
        const data = JSON.parse(line.slice(6));
        if (data.path && data.delta) {
          fullResponse += data.delta;
          process.stdout.write(data.delta);
        }
      }
    }
  }

  return fullResponse;
}
Step 2: Test the streaming function

Try it out:

Test Streaming
const response = await streamChatMessage("Explain API rate limiting in detail");
console.log('\nComplete response:', response);
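
Network failures or non-2xx responses surface as thrown errors from streamChatMessage. A minimal sketch of catching them, assuming you simply want to log the failure instead of crashing:

Handle Streaming Errors
try {
  const response = await streamChatMessage("Explain API rate limiting in detail");
  console.log('\nComplete response:', response);
} catch (error) {
  // Covers fetch/network errors and the explicit throws in streamChatMessage
  console.error('Streaming failed:', error);
}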

3. Streaming with Context Management

Step 1: Handle conversation context

Maintain context across multiple streaming messages:

context-streaming.ts
async function createStreamingConversation() {
  let lastChatId: string | undefined;

  async function sendMessage(input: string): Promise<string> {
    const response = await fetch('https://api.vapi.ai/chat', {
      method: 'POST',
      headers: {
        'Authorization': 'Bearer YOUR_API_KEY',
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        assistantId: 'your-assistant-id',
        input: input,
        stream: true,
        // Link this message to the previous chat so the assistant keeps context
        ...(lastChatId && { previousChatId: lastChatId })
      })
    });

    if (!response.ok) {
      throw new Error(`Request failed with status ${response.status}`);
    }

    const reader = response.body?.getReader();
    if (!reader) throw new Error('No reader available');

    const decoder = new TextDecoder();
    let fullContent = '';
    let currentChatId: string | undefined;
    let buffer = '';

    while (true) {
      const { done, value } = await reader.read();
      if (done) break;

      // Decode incrementally and keep any partial line for the next chunk
      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split('\n');
      buffer = lines.pop() ?? '';

      for (const line of lines) {
        if (line.startsWith('data: ')) {
          const event = JSON.parse(line.slice(6));

          // The first event carries the chat ID needed for follow-up messages
          if (event.id && !currentChatId) {
            currentChatId = event.id;
          }

          if (event.path && event.delta) {
            fullContent += event.delta;
            process.stdout.write(event.delta);
          }
        }
      }
    }

    if (currentChatId) {
      lastChatId = currentChatId;
    }

    return fullContent;
  }

  return { sendMessage };
}
Step 2: Use the conversation manager

Test Context
const conversation = await createStreamingConversation();

await conversation.sendMessage("My name is Alice");
console.log('\n---');
await conversation.sendMessage("What's my name?"); // Should remember Alice
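
The examples above print deltas with process.stdout.write, which only works in Node.js. In a browser chat UI you would append each delta to the page instead. A minimal sketch, assuming a <div id="chat-output"> element exists and that you call handleDelta wherever the examples call process.stdout.write:

Render Deltas in the Browser
// Hypothetical browser-side handler: append each streamed chunk to the page
const output = document.getElementById('chat-output');

function handleDelta(delta: string): void {
  if (output) {
    // textContent keeps the raw streamed text safe from HTML injection
    output.textContent += delta;
  }
}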

Next Steps

Enhance your streaming chat further.

Need help? Chat with the team on our Discord or mention us on X/Twitter.