Get the (almost) daily changelog

Enhanced Transcription Features & Speech Processing

  1. Gladia Transcription Enhancements: Improve transcription accuracy and performance with new GladiaTranscriber features:

    • region: Choose between us-west and eu-west for optimal latency and data residency compliance
    • receivePartialTranscripts: Enable low-latency streaming transcription for real-time conversation flow
    • Enhanced language detection with support for both single and multiple language modes
  2. Advanced Deepgram Controls: Fine-tune speech recognition with enhanced DeepgramTranscriber settings:

    • eotThreshold: End-of-turn detection threshold for precise conversation boundaries (e.g., 0.7)
    • eotTimeoutMs: Maximum wait time for end-of-turn detection in milliseconds (e.g., 5000ms)
    • eagerEotThreshold: Early end-of-turn detection for responsive conversations (e.g., 0.3)
  3. AssemblyAI Keyterms Enhancement: Boost recognition accuracy for critical terms with AssemblyAITranscriber.keytermsPrompt:

    • Support for up to 100 keyterms, each up to 50 characters
    • Improved recognition for specific words and phrases
    • Additional cost: $0.04/hour when enabled
  4. Speechmatics Custom Vocabulary: Enhance recognition accuracy with SpeechmaticsCustomVocabularyItem:

    • content: The word or phrase to add (e.g., “Speechmatics”)
    • soundsLike: Alternative phonetic representations (e.g., [“speech mattix”]) for better pronunciation handling
  5. Word-Level Confidence: Access detailed transcription confidence data with CustomLLMModel.wordLevelConfidenceEnabled, providing word-by-word accuracy metrics for quality assessment and debugging.

  6. Enhanced Message Metadata: Store transcription confidence and other metadata in UserMessage.metadata, enabling detailed analysis of transcription quality and user speech patterns.

AssemblyAITranscriber.wordFinalizationMaxWaitTime is now deprecated. Use the new smart endpointing plans for better speech timing control. The deprecated property will be removed in a future release.

Transcription Improvements

Regional Processing

Choose optimal transcription regions with Gladia’s us-west and eu-west options for reduced latency and compliance.

Real-Time Streaming

Enable partial transcripts for immediate response processing, reducing perceived latency in conversations.

Advanced Speech Detection

Fine-tune end-of-turn detection with configurable thresholds and timeouts for natural conversation flow.

Custom Vocabulary

Improve accuracy for domain-specific terms, company names, and technical jargon with enhanced vocabulary support.