Quickstart
The quickest way to try real-time transcription is via the web portal — no code required.
Using the Realtime API
The Realtime API streams audio over a WebSocket connection and returns transcript results as you speak. Unlike the Batch API, results arrive continuously — within milliseconds of the spoken words.
1. Create an API key
Create an API key in the portal. You'll use this key to access the API securely, so store it as a managed secret.
Enterprise customers may need to contact Support to get their API keys.
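Rather than hard-coding the key in your script, you can load it from the environment at startup. A minimal sketch; the variable name `SPEECHMATICS_API_KEY` is an assumption here, so match it to whatever your secret manager actually exports:

```python
import os

# Hypothetical variable name; use whichever name your secret manager provides.
API_KEY = os.environ.get("SPEECHMATICS_API_KEY", "")
```

A fallback of `""` keeps the import side-effect free; you can fail fast later if the key turns out to be empty.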
2. Install the library
Install using pip:
pip install speechmatics-rt pyaudio
pyaudio is required for microphone input in this quickstart.
Install using npm:
npm install @speechmatics/real-time-client @speechmatics/auth
3. Run the example
Replace YOUR_API_KEY with your key, then run the script.
import asyncio
from speechmatics.rt import (
    AudioEncoding, AudioFormat, AuthenticationError,
    Microphone, ServerMessageType, TranscriptResult,
    TranscriptionConfig, AsyncClient,
)

API_KEY = "YOUR_API_KEY"

# Set up config and format for transcription
audio_format = AudioFormat(
    encoding=AudioEncoding.PCM_S16LE,
    sample_rate=16000,
    chunk_size=4096,
)
config = TranscriptionConfig(
    language="en",
    max_delay=0.7,
)

async def main():
    # Set up microphone
    mic = Microphone(
        sample_rate=audio_format.sample_rate,
        chunk_size=audio_format.chunk_size,
    )
    if not mic.start():
        print("Mic not started; please install PyAudio")
        return

    try:
        async with AsyncClient(api_key=API_KEY) as client:
            # Handle ADD_TRANSCRIPT (final) messages
            @client.on(ServerMessageType.ADD_TRANSCRIPT)
            def handle_finals(msg):
                if final := TranscriptResult.from_message(msg).metadata.transcript:
                    print(f"[Final]: {final}")

            try:
                # Begin transcribing
                await client.start_session(
                    transcription_config=config,
                    audio_format=audio_format,
                )
                while True:
                    await client.send_audio(
                        await mic.read(chunk_size=audio_format.chunk_size)
                    )
            except KeyboardInterrupt:
                pass
            finally:
                mic.stop()
    except AuthenticationError as e:
        print(f"Auth error: {e}")

if __name__ == "__main__":
    asyncio.run(main())
Speak into your microphone. You should see output like:
[Final]: Hello, welcome to Speechmatics.
[Final]: This is a real-time transcription example.
Press Ctrl+C to stop.
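The send loop above reads fixed-size chunks from the microphone. If you want to feed audio from another source (a file or a network stream), the same chunking can be done with a small helper; `chunk_audio` below is not part of the SDK, just a plain-Python sketch:

```python
def chunk_audio(data: bytes, chunk_size: int = 4096):
    """Yield fixed-size chunks of raw audio, mirroring the microphone loop.

    The last chunk may be shorter than chunk_size.
    """
    for i in range(0, len(data), chunk_size):
        yield data[i:i + chunk_size]
```

Each yielded chunk can then be passed to `client.send_audio()` in place of the microphone read.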
import https from "node:https";
import { createSpeechmaticsJWT } from "@speechmatics/auth";
import { RealtimeClient } from "@speechmatics/real-time-client";

const apiKey = "YOUR_API_KEY";
const client = new RealtimeClient();
const streamURL = "https://media-ice.musicradio.com/LBCUKMP3";

async function transcribe() {
  // Print the transcript as we receive it
  client.addEventListener("receiveMessage", ({ data }) => {
    if (data.message === "AddTranscript") {
      for (const result of data.results) {
        if (result.type === "word") {
          process.stdout.write(" ");
        }
        process.stdout.write(`${result.alternatives?.[0]?.content}`);
        if (result.is_eos) {
          process.stdout.write("\n");
        }
      }
    } else if (data.message === "EndOfTranscript") {
      process.stdout.write("\n");
      process.exit(0);
    } else if (data.message === "Error") {
      process.stdout.write(`\n${JSON.stringify(data)}\n`);
      process.exit(1);
    }
  });

  const jwt = await createSpeechmaticsJWT({
    type: "rt",
    apiKey,
    ttl: 60, // 1 minute
  });

  await client.start(jwt, {
    transcription_config: {
      language: "en",
      operating_point: "enhanced",
      max_delay: 1.0,
      transcript_filtering_config: {
        remove_disfluencies: true,
      },
    },
  });

  const stream = https.get(streamURL, (response) => {
    // Forward each audio chunk to the client as it arrives
    response.on("data", (chunk) => {
      client.sendAudio(chunk);
    });
    response.on("end", () => {
      console.log("Stream ended");
      client.stopRecognition({ noTimeout: true });
    });
    response.on("error", (error) => {
      console.error("Stream error:", error);
      client.stopRecognition();
    });
  });

  stream.on("error", (error) => {
    console.error("Request error:", error);
    client.stopRecognition();
  });
}

transcribe();
This example transcribes a live radio stream. You should see a rolling transcript printed to the console.
Press Ctrl+C to stop.
Understanding the output
The API returns two types of transcript results. You can use either or both depending on your use case.
Finals represent the best transcription for a span of audio and are never updated once emitted. You can tune their latency with max_delay: lower values reduce delay at a slight cost to accuracy.
Partials are emitted immediately as audio arrives and may be revised as more context is processed. A common pattern is to display partials immediately, then replace them with finals as they arrive.
To receive partials, set enable_partials=True in your TranscriptionConfig and register a handler for ADD_PARTIAL_TRANSCRIPT:
config = TranscriptionConfig(
    language="en",
    max_delay=0.7,
    enable_partials=True,  # Enable partial transcripts
)

async with AsyncClient(api_key=API_KEY) as client:
    @client.on(ServerMessageType.ADD_PARTIAL_TRANSCRIPT)
    def handle_partials(msg):
        if partial := TranscriptResult.from_message(msg).metadata.transcript:
            print(f"[Partial]: {partial}")

    @client.on(ServerMessageType.ADD_TRANSCRIPT)
    def handle_finals(msg):
        if final := TranscriptResult.from_message(msg).metadata.transcript:
            print(f"[Final]: {final}")
With both handlers registered, you'll see partials arrive first, then be superseded by the final result:
[Partial]: Hello wel
[Partial]: Hello welcome to
[Final]: Hello, welcome to Speechmatics.
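The display pattern described above (show partials immediately, replace them when the final arrives) doesn't depend on the SDK, so it can be sketched with plain strings. The message texts below are hypothetical stand-ins for what the handlers receive:

```python
class TranscriptDisplay:
    """Sketch of the partial-then-final display pattern.

    Partials overwrite the pending line; a final commits it permanently.
    """

    def __init__(self):
        self.committed = []  # finalised lines, never revised
        self.pending = ""    # latest partial, subject to revision

    def on_partial(self, text):
        self.pending = text  # each partial replaces the previous one

    def on_final(self, text):
        self.committed.append(text)  # finals supersede the pending partial
        self.pending = ""

    def render(self):
        lines = list(self.committed)
        if self.pending:
            lines.append(self.pending + " ...")
        return "\n".join(lines)

display = TranscriptDisplay()
display.on_partial("Hello wel")
display.on_partial("Hello welcome to")
display.on_final("Hello, welcome to Speechmatics.")
print(display.render())
```

In a real UI you would call `on_partial` from the ADD_PARTIAL_TRANSCRIPT handler and `on_final` from the ADD_TRANSCRIPT handler, re-rendering after each message.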
await client.start(jwt, {
  transcription_config: {
    language: "en",
    enable_partials: true, // Enable partial transcripts
  },
});

client.addEventListener("receiveMessage", ({ data }) => {
  if (data.message === "AddPartialTranscript") {
    process.stdout.write(`[Partial]: ${data.metadata.transcript}\r`);
  } else if (data.message === "AddTranscript") {
    console.log(`[Final]: ${data.metadata.transcript}`);
  }
});
With both handlers registered, you'll see partials arrive first, then be superseded by the final result:
[Partial]: Hello wel
[Partial]: Hello welcome to
[Final]: Hello, welcome to Speechmatics.
Next steps
Now that you have real-time transcription working, explore these features to build more powerful applications.