Quickstart (WebSocket)
Get real‑time transcripts in minutes using the built‑in recorder and a WebSocket‑capable provider. This guide uses Deepgram, but any WebSocket provider works the same way.
What you’ll build
- Start a microphone recorder (normalized PCM frames)
- Connect a WebSocket transcription stream
- Receive partial and final transcripts
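To make the difference between partial and final transcripts concrete, here is a minimal sketch of a display buffer. It assumes each partial replaces the previous one and each final commits a finished utterance; `TranscriptView` and its methods are hypothetical names for illustration, not part of any SARAUDIO package.

```typescript
// Keeps committed (final) text separate from the in-flight partial.
// Assumption: a new partial supersedes the previous partial.
class TranscriptView {
  private finals: string[] = [];
  private partial = '';

  // Provisional text: overwrite whatever partial was shown before.
  setPartial(text: string): void {
    this.partial = text;
  }

  // Final text: commit the utterance and clear the provisional text.
  commitFinal(text: string): void {
    this.finals.push(text);
    this.partial = '';
  }

  // Committed utterances followed by the current partial, if any.
  render(): string {
    return [...this.finals, this.partial].filter((s) => s.length > 0).join(' ');
  }
}
```

With the event wiring from step 4, `setPartial` would be called from `ctrl.onPartial` and `commitFinal(r.text)` from `ctrl.onTranscript`.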
Prerequisites
- HTTPS (or localhost)
- Microphone permission
- A provider key or short‑lived token endpoint
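The first two prerequisites can be checked up front, before starting the recorder. The sketch below is one possible way to do that; `findCaptureProblems` and `CaptureEnvironment` are hypothetical helpers, and the environment is passed in as a plain object so the logic also runs outside a browser.

```typescript
// Minimal view of the browser state this check needs.
// In a real page you could build it from the standard globals:
//   { isSecureContext: window.isSecureContext,
//     hasGetUserMedia: !!navigator.mediaDevices?.getUserMedia }
interface CaptureEnvironment {
  isSecureContext: boolean;
  hasGetUserMedia: boolean;
}

// Returns human-readable problems; an empty array means capture can be attempted.
function findCaptureProblems(env: CaptureEnvironment): string[] {
  const problems: string[] = [];
  if (!env.isSecureContext) {
    problems.push('Page is not a secure context: serve it over HTTPS or from localhost.');
  }
  if (!env.hasGetUserMedia) {
    problems.push('navigator.mediaDevices.getUserMedia is unavailable in this browser.');
  }
  return problems;
}
```

Microphone permission itself is only requested when capture starts, so this check covers what can be known in advance.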
1) Install
# Required
pnpm add @saraudio/runtime-browser @saraudio/deepgram

# Optional stages (VAD + Meter)
pnpm add @saraudio/vad-energy @saraudio/meter

2) Create a provider (Deepgram example)
Simplest form: use a raw API key (great for quick local tries and server‑side usage).
import { deepgram } from '@saraudio/deepgram';

export const provider = deepgram({
  model: 'nova-3',
  auth: {
    apiKey: '<DEEPGRAM_API_KEY>',
  },
});

Note: For production browsers, prefer short‑lived tokens from your backend. We’ll cover this in a separate Auth guide.
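For orientation, a backend token endpoint can be quite small. The sketch below is framework-agnostic and makes several assumptions: `handleTokenRequest` and `issueToken` are hypothetical names, the upstream provider call is left to the caller because it depends on the provider's token API, and the response shape (`access_token`, `expires_in`) is the one the secure browser variant in this guide reads.

```typescript
// Response shape the browser-side getToken in this guide expects.
type EphemeralTokenResponse = { access_token: string; expires_in: number };

// Hypothetical callback: performs the real provider call with the
// long-lived key, which stays on the server.
type IssueToken = () => Promise<EphemeralTokenResponse>;

type HandlerResult = {
  status: number;
  body: EphemeralTokenResponse | { error: string };
};

async function handleTokenRequest(
  method: string,
  issueToken: IssueToken,
): Promise<HandlerResult> {
  // The browser code requests the token with POST.
  if (method !== 'POST') {
    return { status: 405, body: { error: 'Use POST' } };
  }
  try {
    return { status: 200, body: await issueToken() };
  } catch {
    // Keep upstream error details out of the response.
    return { status: 502, body: { error: 'Could not issue token' } };
  }
}
```

In a real server you would call this from your route handler and add whatever user authentication your app already uses, so that only signed-in users can mint tokens.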
Secure variant (browser)
Issue a short‑lived token on your server and use it via auth.getToken:
import { deepgram } from '@saraudio/deepgram';

type EphemeralTokenResponse = {
  access_token: string;
  expires_in: number; // seconds
};

let tokenCache: { value: string; expiresAt: number } | null = null;
const nowMs = () => Date.now();

async function getToken(): Promise<string> {
  // Reuse the cached token while it has more than 2 seconds left.
  if (tokenCache && tokenCache.expiresAt - nowMs() > 2000) {
    return tokenCache.value;
  }

  const response = await fetch('/api/deepgram/token', { method: 'POST' });
  if (!response.ok) {
    throw new Error(`Failed to obtain Deepgram token (status ${response.status})`);
  }

  const body: EphemeralTokenResponse = await response.json();
  const token = body.access_token;
  const ttlSeconds = body.expires_in;
  // Expire the cache entry 2 seconds early, but never cache for less than 1 second.
  const safeTtlMs = Math.max(1, ttlSeconds - 2) * 1000;

  tokenCache = { value: token, expiresAt: nowMs() + safeTtlMs };
  return token;
}

export const provider = deepgram({
  model: 'nova-3',
  auth: { getToken },
});

3) Create recorder and controller
import { createRecorder, createTranscription } from '@saraudio/runtime-browser';
import { vadEnergy } from '@saraudio/vad-energy';
import { meter } from '@saraudio/meter';

const recorder = createRecorder({
  // Recommended: mono 16 kHz for low latency
  format: { sampleRate: 16000, channels: 1 },
  // Stages: VAD for speech events; Meter for level visualization
  stages: [
    vadEnergy({ thresholdDb: -50, attackMs: 80, releaseMs: 200 }),
    meter(),
  ],
  segmenter: true,
});

const ctrl = createTranscription({
  provider,
  recorder,
  transport: 'websocket',
  connection: {
    ws: { silencePolicy: 'keep' }, // 'keep' | 'drop' | 'mute'
  },
});

4) Wire up events and start
ctrl.onPartial((text) => console.log('partial:', text));
ctrl.onTranscript((r) => console.log('final:', r.text));
ctrl.onError((e) => console.error(e));

await recorder.start();
await ctrl.connect();

// Later: stop
// await ctrl.disconnect();
// await recorder.stop();

Silence policy (optional)
- keep (default): send all frames (best quality, more bandwidth)
- drop: send only during speech (based on VAD)
- mute: keep cadence by sending zeroed frames in silence
Set the policy when you create the controller, via connection.ws.silencePolicy.
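As a conceptual illustration of the three policies, the sketch below shows what each one might do with a single frame. It is not the library's implementation: `applySilencePolicy` is a hypothetical function, frames are assumed to be 16-bit PCM in an `Int16Array`, and the speech/silence decision is assumed to come from a VAD stage.

```typescript
type SilencePolicy = 'keep' | 'drop' | 'mute';

// Returns the frame to send, or null when nothing should be sent.
function applySilencePolicy(
  frame: Int16Array,
  isSpeech: boolean,
  policy: SilencePolicy,
): Int16Array | null {
  // Speech is always sent unchanged; 'keep' sends silence unchanged too.
  if (isSpeech || policy === 'keep') return frame;
  // 'drop': skip silent frames entirely.
  if (policy === 'drop') return null;
  // 'mute': same length, all zeros, so the frame cadence is preserved.
  return new Int16Array(frame.length);
}
```

Under the same 16-bit assumption, a 16 kHz mono stream is 16000 × 2 = 32000 bytes per second of audio, so keep and mute both send roughly that much continuously, while drop sends it only while speech is detected.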
- Concepts → Controller & Transport (policies, retries)
- Providers → Deepgram / Soniox (WS options)