Soniox
Soniox provides two distinct paths:
- WebSocket realtime (`stt-rt-v3`): partials and finals with low latency.
- HTTP batch via Files API (`stt-async-v3`): upload → create job → poll → transcript.
Use raw model IDs. The same provider instance exposes both methods; the controller chooses the transport.
Install
```bash
# Required
pnpm add @saraudio/soniox

# Optional stages (VAD + Meter)
pnpm add @saraudio/vad-energy @saraudio/meter
```

Create a provider

```ts
import { soniox } from '@saraudio/soniox';

export const provider = soniox({
  model: 'stt-rt-v3', // or 'stt-async-v3' when targeting HTTP batch
  auth: { apiKey: '<SONIOX_API_KEY>' },
});
```

Tip: You can use the same instance for both transports; set `transport` on the controller per session.
Authentication
Soniox has different rules for WS vs REST. Do not expose a permanent API key in the browser.
- WebSocket (realtime): issue a short-lived temporary API key on your server using `POST /v1/auth/temporary-api-key`, then pass it in the first WS message as `api_key`. This is the recommended browser pattern for secure realtime streaming.
- REST (Files/Transcriptions): requires a permanent project API key in `Authorization: Bearer <key>`. Temporary keys are not valid for REST and will return `401 unauthenticated`.
How SARAUDIO maps this:
- WS sends credentials inside the init JSON (`api_key`).
- REST sets the `Authorization` header.

Auth priority is `getToken → token → apiKey`. In browsers, prefer a `getToken` that returns the correct credential for the chosen transport.
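The priority order and the two credential placements described above could be sketched roughly as follows. This is an illustration only: `Auth`, `resolveCredential`, `buildWsInit`, and `buildRestHeaders` are hypothetical names, not SARAUDIO exports, and the `model` field in the init JSON is an assumption about the message shape.

```typescript
// Hypothetical types and helpers for illustration; not the SARAUDIO API.
type Auth = {
  apiKey?: string;
  token?: string;
  getToken?: () => Promise<string>;
};

// Resolve a credential using the documented priority: getToken → token → apiKey.
async function resolveCredential(auth: Auth): Promise<string> {
  if (auth.getToken) return auth.getToken();
  if (auth.token) return auth.token;
  if (auth.apiKey) return auth.apiKey;
  throw new Error('No credential configured');
}

// WS: the credential travels inside the first (init) JSON message as api_key.
// The `model` field is an assumed part of that message.
function buildWsInit(credential: string, model: string): string {
  return JSON.stringify({ api_key: credential, model });
}

// REST: the credential travels in the Authorization header.
function buildRestHeaders(credential: string): Record<string, string> {
  return { Authorization: `Bearer ${credential}` };
}
```

Because `getToken` wins over the static fields, a browser client can leave `apiKey` unset and fetch a fresh credential per session.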
Recommended patterns:
- Realtime (browser): server endpoint returns a temporary API key; client uses `auth.getToken`, transport `'websocket'`.
- Batch REST: call Soniox from your server (or via server proxy routes) with a permanent key; avoid calling `/v1/files` and `/v1/transcriptions` directly from the browser.
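As a rough sketch of the realtime pattern, a server route could exchange the permanent key for a temporary one and return only that to the browser. The base URL `https://api.soniox.com`, the request fields `usage_type` and `expires_in_seconds`, and the `api_key` response field are assumptions here and should be checked against the Soniox API reference; `FetchLike`, `buildTemporaryKeyRequest`, and `issueTemporaryKey` are hypothetical names.

```typescript
// Assumed request shape; verify against the Soniox API reference.
type TempKeyRequest = {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
};

// Build the server-side request that exchanges the permanent key for a temporary one.
function buildTemporaryKeyRequest(permanentKey: string, expiresInSeconds = 60): TempKeyRequest {
  return {
    url: 'https://api.soniox.com/v1/auth/temporary-api-key', // assumed base URL
    method: 'POST',
    headers: {
      Authorization: `Bearer ${permanentKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      usage_type: 'transcribe_websocket', // assumed field and value
      expires_in_seconds: expiresInSeconds, // assumed field
    }),
  };
}

// Minimal fetch-like signature so the sketch can be exercised without a network.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Server side: call Soniox and hand back only the temporary key.
async function issueTemporaryKey(fetchFn: FetchLike, permanentKey: string): Promise<string> {
  const { url, ...init } = buildTemporaryKeyRequest(permanentKey);
  const res = await fetchFn(url, init);
  if (!res.ok) throw new Error(`Soniox returned ${res.status}`);
  const data = (await res.json()) as { api_key?: string };
  if (!data.api_key) throw new Error('Response did not contain api_key');
  return data.api_key;
}
```

On the client, `auth.getToken` would then call your own route (for example a hypothetical `/api/soniox-token`) and resolve to the returned key, so the permanent key never leaves the server.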
If you use a temporary key against REST, Soniox will respond with 401 unauthenticated — this is expected.
WebSocket quickstart (realtime)
```ts
import { createRecorder, createTranscription } from '@saraudio/runtime-browser';
import { soniox } from '@saraudio/soniox';
import { vadEnergy } from '@saraudio/vad-energy';
import { meter } from '@saraudio/meter';

const provider = soniox({ model: 'stt-rt-v3', auth: { apiKey: '<KEY>' } });

const recorder = createRecorder({
  format: { sampleRate: 16000, channels: 1 },
  stages: [vadEnergy({ thresholdDb: -50 }), meter()],
  segmenter: true,
});

const ctrl = createTranscription({
  provider,
  recorder,
  transport: 'websocket',
  connection: { ws: { silencePolicy: 'keep' } },
});

ctrl.onPartial((t) => console.log('partial:', t));
ctrl.onTranscript((r) => console.log('final:', r.text));

await recorder.start();
await ctrl.connect();
```

Notes
- Soniox streams results as tokens; partials are coalesced into text for you.
- Use `silencePolicy: 'drop'` to send frames only during speech.
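To illustrate what coalescing tokens into text can look like, here is a minimal sketch. It assumes each WS response carries tokens with a `text` and an `is_final` flag, that final tokens are delivered once, and that non-final tokens are re-sent in full with every response. `PartialCoalescer` is a hypothetical name and this is not necessarily how the provider implements it.

```typescript
// Assumed token shape: each WS response carries tokens flagged final or non-final.
type Token = { text: string; is_final: boolean };

class PartialCoalescer {
  private finalText = '';

  // Returns the current best transcript: committed finals plus the latest non-final tail.
  push(tokens: Token[]): string {
    let tail = '';
    for (const t of tokens) {
      if (t.is_final) this.finalText += t.text; // finals are appended once and kept
      else tail += t.text; // non-finals are replaced by the next response
    }
    return this.finalText + tail;
  }
}
```

Each call to `push` yields the string a partial callback could report; the non-final tail is discarded on the next response, so revised guesses never accumulate.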
HTTP quickstart (Files API batch)
The controller's HTTP path calls `provider.transcribe()` for each chunk. The Soniox provider maps this to the Files API: upload → create transcription job → poll → fetch transcript.
```ts
import { createRecorder, createTranscription } from '@saraudio/runtime-browser';
import { soniox } from '@saraudio/soniox';
import { vadEnergy } from '@saraudio/vad-energy';
import { meter } from '@saraudio/meter';

const provider = soniox({ model: 'stt-async-v3', auth: { apiKey: '<KEY>' } });

const recorder = createRecorder({
  stages: [vadEnergy({ thresholdDb: -50 }), meter()],
  segmenter: true,
});

const ctrl = createTranscription({
  provider,
  recorder,
  transport: 'http',
  flushOnSegmentEnd: true, // pair with intervalMs: 0 for one request per phrase
  connection: {
    http: { chunking: { intervalMs: 0, overlapMs: 500, maxInFlight: 1, timeoutMs: 30_000 } },
  },
});

ctrl.onTranscript((r) => console.log('final:', r.text));

await recorder.start();
await ctrl.connect();
```

Notes
- Use the async model (`stt-async-v3`) for HTTP batch. The realtime model (`stt-rt-v3`) is for WebSocket.
- Batch jobs incur additional latency (upload + processing). For live UX prefer WS.
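The upload → create job → poll → fetch transcript flow can be sketched against an abstract client, which also shows where the extra batch latency comes from. `BatchClient`, `transcribeBatch`, and the status names are assumptions for illustration; each client method stands in for one REST call and none of this is the provider's actual code.

```typescript
// Hypothetical client interface; each method stands in for one REST call.
type JobStatus = 'queued' | 'processing' | 'completed' | 'error';

interface BatchClient {
  uploadFile(audio: Uint8Array): Promise<string>; // resolves to a file id
  createTranscription(fileId: string, model: string): Promise<string>; // resolves to a job id
  getStatus(jobId: string): Promise<JobStatus>;
  getTranscript(jobId: string): Promise<string>;
}

type BatchOptions = {
  model?: string;
  pollMs?: number; // must be > 0
  timeoutMs?: number;
  wait?: (ms: number) => Promise<void>; // injectable for tests
};

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function transcribeBatch(
  client: BatchClient,
  audio: Uint8Array,
  options: BatchOptions = {},
): Promise<string> {
  const { model = 'stt-async-v3', pollMs = 1000, timeoutMs = 30_000, wait = sleep } = options;

  const fileId = await client.uploadFile(audio); // 1. upload
  const jobId = await client.createTranscription(fileId, model); // 2. create job

  // 3. poll until the job finishes; the budget counts time spent waiting between polls.
  for (let waited = 0; waited <= timeoutMs; waited += pollMs) {
    const status = await client.getStatus(jobId);
    if (status === 'completed') return client.getTranscript(jobId); // 4. fetch transcript
    if (status === 'error') throw new Error(`Transcription job ${jobId} failed`);
    await wait(pollMs);
  }
  throw new Error(`Transcription job ${jobId} timed out after ${timeoutMs} ms`);
}
```

Injecting `wait` keeps the polling loop testable without real timers; a production version would likely also clean up the uploaded file and job.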
Options (Soniox)
- `model: 'stt-rt-v3' | 'stt-async-v3'`: realtime vs async (batch REST).
- `sampleRate?: number`: preferred sample rate; default 16000.
- `channels?: 1 | 2`: channel count; default 1.
- `audioFormat?: 'pcm_s16le' | 'auto' | string`: initial config for WS; default `pcm_s16le`.
- `languageHints?: string[]`: optional list like `['en','es']`.
- `queueBudgetMs?: number`: drop-oldest send queue budget for WS; default 200 ms (clamped to [100..500]).
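To make the `queueBudgetMs` behaviour concrete, the following is a minimal sketch of a drop-oldest queue bounded by a time budget. `DropOldestQueue` and `Frame` are hypothetical names; the clamp range and default come from the option description, while the rest is an assumption about how such a queue might work.

```typescript
// Hypothetical sketch of a drop-oldest send queue bounded by a time budget.
type Frame = { durationMs: number; data: Uint8Array };

const clamp = (value: number, min: number, max: number) => Math.min(max, Math.max(min, value));

class DropOldestQueue {
  private frames: Frame[] = [];
  private queuedMs = 0;
  readonly budgetMs: number;

  constructor(queueBudgetMs = 200) {
    // Default 200 ms, clamped to [100..500] as documented for queueBudgetMs.
    this.budgetMs = clamp(queueBudgetMs, 100, 500);
  }

  // Enqueue a frame, then discard the oldest frames while the queue exceeds the budget.
  // Returns the number of frames dropped; the newest frame is always kept.
  push(frame: Frame): number {
    this.frames.push(frame);
    this.queuedMs += frame.durationMs;
    let dropped = 0;
    while (this.queuedMs > this.budgetMs && this.frames.length > 1) {
      const oldest = this.frames.shift()!;
      this.queuedMs -= oldest.durationMs;
      dropped += 1;
    }
    return dropped;
  }

  // Take the next frame to send, oldest first.
  shift(): Frame | undefined {
    const frame = this.frames.shift();
    if (frame) this.queuedMs -= frame.durationMs;
    return frame;
  }

  get size(): number {
    return this.frames.length;
  }
}
```

Dropping the oldest audio favours freshness over completeness: when the socket stalls, the queue never holds more than roughly one budget's worth of audio.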
Common provider options
- `auth`: `{ apiKey?; token?; getToken? }`
- `baseUrl`: string | builder per transport
- `headers`, `query`, `wsProtocols`
Errors & retries
- 401/403 → `AuthenticationError`
- 429 → `RateLimitError` (uses `Retry-After` when present)
- Other HTTP errors → `ProviderError`
- WS close with error JSON (`error_code`/`error_message`) is mapped to the proper error type.
- Controller: WS retry with backoff; HTTP flush timeout per request.
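The HTTP status mapping above can be sketched as a small function. The classes below are hypothetical stand-ins that reuse the documented error names; their fields, their inheritance, and the `parseRetryAfter` and `mapHttpError` helpers are assumptions, not the library's definitions.

```typescript
// Hypothetical error classes mirroring the documented names.
class ProviderError extends Error {
  status?: number;
  constructor(message: string, status?: number) {
    super(message);
    this.name = 'ProviderError';
    this.status = status;
  }
}

class AuthenticationError extends ProviderError {
  constructor(message: string, status?: number) {
    super(message, status);
    this.name = 'AuthenticationError';
  }
}

class RateLimitError extends ProviderError {
  retryAfterMs?: number;
  constructor(message: string, status?: number, retryAfterMs?: number) {
    super(message, status);
    this.name = 'RateLimitError';
    this.retryAfterMs = retryAfterMs;
  }
}

// Retry-After may be a number of seconds or an HTTP date.
function parseRetryAfter(value: string | null, now = Date.now()): number | undefined {
  if (!value) return undefined;
  const seconds = Number(value);
  if (Number.isFinite(seconds)) return Math.max(0, seconds * 1000);
  const date = Date.parse(value);
  return Number.isNaN(date) ? undefined : Math.max(0, date - now);
}

// Map an HTTP status to the error type from the list above.
function mapHttpError(status: number, message: string, retryAfter: string | null = null): ProviderError {
  if (status === 401 || status === 403) return new AuthenticationError(message, status);
  if (status === 429) return new RateLimitError(message, status, parseRetryAfter(retryAfter));
  return new ProviderError(message, status);
}
```

A caller can branch on `error.name`, as the handler in this section does, and use `retryAfterMs` to schedule the next attempt when it is defined.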
```ts
ctrl.onError((e) => {
  if (e.name === 'RateLimitError') console.warn('rate limited');
  else console.error('soniox error', e);
});
```

Models

```ts
import { SONIOX_REALTIME_MODELS, SONIOX_ASYNC_MODELS } from '@saraudio/soniox';

// ['stt-rt-v3'] and ['stt-async-v3']
console.log(SONIOX_REALTIME_MODELS, SONIOX_ASYNC_MODELS);
```

Pick `stt-rt-v3` for WS realtime or `stt-async-v3` for HTTP batch.
Practical tips
- Prefer WS for live captions/partials; use HTTP for async jobs or phrase-based UX with segment-only flushing (`flushOnSegmentEnd: true`, `intervalMs: 0`).
- Keep mono/16 kHz for low latency; the hook negotiates formats with the provider.
- For big files prefer server‑side batch pipelines (upload → job → webhook/poll → storage).
See also
- Getting Started → Quickstart (WebSocket), Quickstart (HTTP), Quickstart (Vue + WS)
- Concepts → Controller & Transport