### Install Gladia SDK

Source: https://github.com/gladiaio/docs/blob/main/chapters/migrations/from-assembly.mdx

Install the Gladia SDK for Python or TypeScript to enable realtime streaming.

```bash Python
pip install gladiaio-sdk
```

```bash TypeScript
npm i @gladiaio/sdk
```

--------------------------------

### Install Gladia SDK

Source: https://github.com/gladiaio/docs/blob/main/chapters/live-stt/quickstart.mdx

Install the Gladia SDK using npm for JavaScript or pip/uv for Python.

```sh
npm install @gladiaio/sdk
```

```sh
# Using pip
pip install gladiaio-sdk

# Using uv
uv add gladiaio-sdk
```

--------------------------------

### Install AssemblyAI SDK

Source: https://github.com/gladiaio/docs/blob/main/chapters/migrations/from-assembly.mdx

Install the AssemblyAI SDK for Python or TypeScript to enable realtime streaming.

```bash Python
pip install assemblyai
```

```bash TypeScript
npm i assemblyai
```

--------------------------------

### Install Gladia CLI Dependencies

Source: https://github.com/gladiaio/docs/blob/main/README.md

Installs the `mint` CLI (Mintlify, used to preview the Gladia documentation locally) globally using npm. Ensure Node.js and npm are installed.

```bash
npm i -g mint
```

--------------------------------

### Start Gladia Session (Python)

Source: https://github.com/gladiaio/docs/blob/main/chapters/migrations/from-assembly.mdx

Starts a live transcription session using the Gladia SDK. Requires a pre-configured gladia_config object.

```python
gladia_session = gladia_client.live_v2().start_session(gladia_config)
```

--------------------------------

### Gladia Configuration Example

Source: https://github.com/gladiaio/docs/blob/main/chapters/migrations/from-assembly.mdx

Example JSON configuration for Gladia's realtime STT, mapping parameters from AssemblyAI. This includes model, audio format, language, and processing options.

```json
{
  "model": "solaria-1",
  "encoding": "wav/pcm",
  "bit_depth": 16,
  "sample_rate": 16000,
  "channels": 1,
  "language_config": {
    "languages": ["en"],
    "code_switching": false
  },
  "messages_config": {
    "receive_partial_transcripts": true,
    "receive_final_transcripts": true
  },
  "endpointing": 0.8,
  "maximum_duration_without_endpointing": 30,
  "realtime_processing": {
    "custom_vocabulary": false,
    "custom_spelling": false
  }
}
```

--------------------------------

### Start Gladia Session (TypeScript)

Source: https://github.com/gladiaio/docs/blob/main/chapters/migrations/from-assembly.mdx

Starts a live transcription session using the Gladia SDK. Requires a pre-configured gladiaConfig object.

```typescript
const gladiaSession = gladiaClient.liveV2().startSession(gladiaConfig);
```

--------------------------------

### Example V2 Upload Response

Source: https://github.com/gladiaio/docs/blob/main/chapters/pre-recorded-stt/migration-from-v1.mdx

This is an example JSON response after successfully uploading an audio file using the /v2/upload endpoint.

```json
{
  "audio_url": "https://api.gladia.io/file/636c70f6-92c1-4026-a8b6-0dfe3ecf826f",
  "audio_metadata": {
    "id": "636c70f6-92c1-4026-a8b6-0dfe3ecf826f",
    "filename": "conversation.wav",
    "extension": "wav",
    "size": 99515383,
    "audio_duration": 4146.468542,
    "number_of_channels": 2
  }
}
```

--------------------------------

### Summarization Response Example

Source: https://context7.com/gladiaio/docs/llms.txt

Example response containing the generated text summary. Includes the summary text and execution time.

```json
{
  "summarization": {
    "success": true,
    "results": "- Discussed Q2 roadmap priorities\n- Billing integration moved to top of backlog\n- Weekly sync rescheduled to Tuesday",
    "exec_time": 1.51
  }
}
```

--------------------------------

### Chapterization Response Example

Source: https://context7.com/gladiaio/docs/llms.txt

Example response for chapterization, showing results with chapter headlines, summaries, gist, keywords, and time segments.

```json
{
  "chapterization": {
    "success": true,
    "results": [
      {
        "headline": "Embracing Hope: Past, Present, Future Interconnected",
        "summary": "In a world where minimalism is valued, yet excess is desired, hope remains. The past predicts the present.",
        "gist": "Hope ties past and future together",
        "keywords": ["hope", "present", "future", "minimalism"],
        "start": 0.0,
        "end": 19.84
      }
    ]
  }
}
```

--------------------------------

### Install Gladia n8n Node via npm

Source: https://github.com/gladiaio/docs/blob/main/chapters/integrations/n8n.mdx

Use this command to install the Gladia community node for n8n. This requires a self-hosted n8n instance.

```bash
npm install @gladiaio/n8n-nodes
```

--------------------------------

### Run Gladia Local Development Server

Source: https://github.com/gladiaio/docs/blob/main/README.md

Starts a local development server for Gladia documentation. This command should be run from the root of your documentation directory where docs.json is located.

```bash
mint dev
```

--------------------------------

### V1 Configuration for Endpointing and Duration

Source: https://github.com/gladiaio/docs/blob/main/chapters/live-stt/migration-from-v1.mdx

Example V1 configuration for endpointing in milliseconds and maximum audio duration in seconds.

```json
{
  "endpointing": 800,
  "maximum_audio_duration": 10
}
```

--------------------------------

### Full Configuration Migration Sample

Source: https://github.com/gladiaio/docs/blob/main/chapters/live-stt/migration-from-v1.mdx

This example shows a comprehensive migration of V1 configuration parameters to their V2 equivalents, including changes in language handling, endpointing, and processing options. The first block is the V1 configuration; the second is its V2 equivalent.

```json
{
  "encoding": "wav",
  "bit_depth": 8,
  "sample_rate": 48000,
  "model": "accurate",
  "endpointing": 800,
  "maximum_audio_duration": 10,
  "language_behaviour": "manual",
  "language": "english",
  "audio_enhancer": true,
  "word_timestamps": true
}
```

```json
{
  "encoding": "wav/pcm",
  "bit_depth": 8,
  "sample_rate": 48000,
  "endpointing": 0.8,
  "maximum_duration_without_endpointing": 10,
  "language_config": {
    "languages": ["en"]
  },
  "pre_processing": {
    "audio_enhancer": true
  },
  "realtime_processing": {
    "words_accurate_timestamps": true
  }
}
```

--------------------------------

### Start Transcription Session (Python)

Source: https://github.com/gladiaio/docs/blob/main/chapters/migrations/from-assembly.mdx

Initiates a live transcription session using the AssemblyAI SDK. Requires importing StreamingParameters.

```python
from assemblyai.streaming.v3 import StreamingParameters

assemblyClient.connect(
    StreamingParameters(
        sample_rate=16000,
        format_turns=True,
    )
)
```

--------------------------------

### Subtitle Export Response Example

Source: https://context7.com/gladiaio/docs/llms.txt

Example response showing generated subtitles in both SRT and VTT formats. Each format contains the subtitle content with timing information.

```json
{
  "subtitles": [
    {
      "format": "srt",
      "subtitles": "1\n00:00:00,210 --> 00:00:04,711\nHello, welcome to the earnings call.\n\n2\n00:00:05,100 --> 00:00:08,300\nThank you for joining today."
    },
    {
      "format": "vtt",
      "subtitles": "WEBVTT\n\n00:00:00.210 --> 00:00:04.711\nHello, welcome to the earnings call."
    }
  ]
}
```

--------------------------------

### Start Transcription Session (TypeScript)

Source: https://github.com/gladiaio/docs/blob/main/chapters/migrations/from-assembly.mdx

Initiates a live transcription session using the AssemblyAI SDK. Requires connecting the transcriber.

```typescript
const assemblySession = assemblyClient.streaming.transcriber({
  sampleRate: 16_000,
  formatTurns: true,
});

await assemblySession.connect();
```

--------------------------------

### Start Deepgram Live Transcription Session (TypeScript)

Source: https://github.com/gladiaio/docs/blob/main/chapters/migrations/from-deepgram.mdx

Establishes a live transcription connection with Deepgram using the provided configuration. The deepgramClient must be initialized.

```typescript
const deepgramConnection = deepgramClient.listen.live(deepgramConfig);
```

--------------------------------

### Audio-to-LLM Response

Source: https://context7.com/gladiaio/docs/llms.txt

This is an example of the response structure when using the Audio-to-LLM feature, showing results for each prompt.

```json
{
  "audio_to_llm": {
    "success": true,
    "results": [
      {
        "results": {
          "prompt": "Summarize the meeting as bullet points: main topics, decisions, and open questions.",
          "response": "- **Roadmap Q2**: Team aligned on shipping the billing integration first.\n- **Decision**: Weekly sync moved to Tuesday.\n- **Open question**: Whether to support SSO in v1 is still TBD."
        }
      },
      {
        "results": {
          "prompt": "Give a concise paragraph summarizing what this meeting was about and the outcome.",
          "response": "The group reviewed Q2 priorities, agreed to prioritize billing, and rescheduled the standing meeting."
        }
      },
      {
        "results": {
          "prompt": "List action items and follow-ups; include owners if mentioned.",
          "response": "- **Alex**: Finalize SSO requirements doc by Friday.\n- **Jamie**: Share billing API cutover checklist.\n- **Everyone**: Review the updated roadmap draft before next sync."
        }
      }
    ]
  }
}
```

--------------------------------

### V2 Configuration for Endpointing and Duration

Source: https://github.com/gladiaio/docs/blob/main/chapters/live-stt/migration-from-v1.mdx

Example V2 configuration for endpointing in seconds and maximum duration without endpointing. Note the unit change for endpointing and the parameter name change.

```json
{
  "endpointing": 0.8,
  "maximum_duration_without_endpointing": 10
}
```

--------------------------------

### Start Deepgram Live Transcription Session (Python)

Source: https://github.com/gladiaio/docs/blob/main/chapters/migrations/from-deepgram.mdx

Initiates a live transcription session with Deepgram using the provided configuration. Ensure the deepgram_client is initialized.

```python
with deepgram_client.listen.v1.connect(deepgram_config) as deepgram_connection:
    # Further code to add event handlers
    deepgram_connection.start_listening()
```

--------------------------------

### Get Live Session Result with cURL

Source: https://context7.com/gladiaio/docs/llms.txt

Retrieve the complete transcription and post-processing results for a finished live session using its ID. This example uses cURL for the HTTP GET request.

```bash
curl --request GET \
  --url https://api.gladia.io/v2/live/de70f43f-3041-46e0-892c-8e7f53800a22 \
  --header 'x-gladia-key: YOUR_GLADIA_API_KEY'
```

--------------------------------

### Start Session Message

Source: https://github.com/gladiaio/docs/blob/main/api-reference/v2/live/message/start-session.mdx

Defines the structure of the message payload used to start a new session.

```APIDOC
## Start Session Message

### Description
This message is used to initiate a new session.

### Payload Definition
This section would typically detail the fields within the `StartSessionMessage` payload, such as authentication tokens, user identifiers, or session configuration parameters.
The source only provides the `openapi-schema: StartSessionMessage` reference without further details, so the actual payload fields are not documented here.

### Example (illustrative only; the field names below are not taken from the schema)
```json
{
  "message_type": "start_session",
  "payload": {
    "user_id": "user123",
    "session_config": {
      "timeout": 300
    }
  }
}
```
```

--------------------------------

### Install Deepgram SDK (Python/TypeScript)

Source: https://github.com/gladiaio/docs/blob/main/chapters/migrations/from-deepgram.mdx

Install the Deepgram SDK for Python using pip or for TypeScript using npm.

```bash
pip install deepgram-sdk
```

```bash
npm i @deepgram/sdk
```

--------------------------------

### Initiate Real-time Session

Source: https://github.com/gladiaio/docs/blob/main/chapters/live-stt/quickstart.mdx

Initialize the Gladia client and configure the live session parameters. Ensure correct audio encoding, sample rate, bit depth, and channels are provided.

```typescript
import { GladiaClient } from "@gladiaio/sdk";

const gladiaClient = new GladiaClient({
  apiKey: "YOUR_GLADIA_API_KEY",
});

const gladiaConfig = {
  model: "solaria-1",
  encoding: "wav/pcm",
  sample_rate: 16000,
  bit_depth: 16,
  channels: 1,
  language_config: {
    languages: ["fr"],
    code_switching: false,
  },
};

const liveSession = gladiaClient.liveV2().startSession(gladiaConfig);
```

```python
# Our Python SDK supports sync/threaded and asyncio versions.
gladia_client = GladiaClient(api_key="YOUR_GLADIA_API_KEY")

# sync/threaded version
live_client = gladia_client.live_v2()
# asyncio version (use instead of the line above)
# live_client = gladia_client.live_v2_async()

init_request = LiveV2InitRequest(
    model="solaria-1",
    encoding="wav/pcm",
    sample_rate=16000,
    bit_depth=16,
    channels=1,
    language_config=LiveV2LanguageConfig(languages=["fr"], code_switching=False),
    messages_config=LiveV2MessagesConfig(receive_partial_transcripts=True),
)

live_session = live_client.start_session(init_request)
```

--------------------------------

### Initialize Gladia Client

Source: https://github.com/gladiaio/docs/blob/main/chapters/migrations/from-assembly.mdx

Initialize the Gladia client with your API key. This client manages your live transcription connection.

```python
from gladiaio_sdk import GladiaClient

gladia_client = GladiaClient(api_key="YOUR_GLADIA_API_KEY")
```

```typescript
import { GladiaClient } from "@gladiaio/sdk";

const gladiaClient = new GladiaClient({
  apiKey: process.env.GLADIA_API_KEY,
});
```

--------------------------------

### Initiate Live Session with Configuration

Source: https://github.com/gladiaio/docs/blob/main/chapters/live-stt/quickstart.mdx

Start a real-time transcription session by sending a POST request to the /v2/live endpoint. Specify audio parameters like encoding, sample rate, bit depth, and channels.

```javascript
const response = await fetch("https://api.gladia.io/v2/live", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "x-gladia-key": "YOUR_GLADIA_API_KEY",
  },
  body: JSON.stringify({
    encoding: "wav/pcm",
    sample_rate: 16000,
    bit_depth: 16,
    channels: 1,
  }),
});

if (!response.ok) {
  // Look at the error message
  // It might be a configuration issue
  console.error(
    `${response.status}: ${(await response.text()) || response.statusText}`
  );
  process.exit(response.status);
}

const { id, url } = await response.json();
```

```bash
# The JSON body sits inside single quotes, so it needs no line-continuation backslashes
curl --request POST \
  --url https://api.gladia.io/v2/live \
  --header 'Content-Type: application/json' \
  --header 'x-gladia-key: YOUR_GLADIA_API_KEY' \
  --data '{
    "encoding": "wav/pcm",
    "sample_rate": 16000,
    "bit_depth": 16,
    "channels": 1
  }'
```

--------------------------------

### Translation Response Example

Source: https://context7.com/gladiaio/docs/llms.txt

This is an example of the translation results section in a response. It shows translated full transcripts and utterances for specified languages.

```json
{
  "translation": {
    "success": true,
    "results": [
      {
        "languages": ["fr"],
        "full_transcript": "Diviser l'infini dans un temps où moins est plus...",
        "utterances": [{ "text": "Diviser l'infini...", "start": 0.2, "end": 1.56 }]
      }
    ]
  }
}
```

--------------------------------

### Connect to Live WebSocket and Stream Audio (SDK)

Source: https://context7.com/gladiaio/docs/llms.txt

This snippet demonstrates how to use the Gladia SDK to establish a live WebSocket connection, stream audio, and receive real-time transcripts.

```APIDOC
## Connect to Live WebSocket and Stream Audio (SDK)

This example shows how to use the Gladia SDK to initiate a live session, stream audio chunks, and handle incoming messages like transcripts.

### Language
JavaScript

### Code
```javascript
import { GladiaClient } from "@gladiaio/sdk";
import fs from "fs";

const gladiaClient = new GladiaClient({ apiKey: "YOUR_GLADIA_API_KEY" });

const liveSession = gladiaClient.liveV2().startSession({
  model: "solaria-1",
  encoding: "wav/pcm",
  sample_rate: 16000,
  bit_depth: 16,
  channels: 1,
  language_config: { languages: ["en"], code_switching: false },
  messages_config: { receive_partial_transcripts: true },
});

liveSession.on("started", () => console.log("Session started"));
liveSession.on("ended", (msg) => console.log("Session ended", msg));
liveSession.on("error", (err) => console.error("Error:", err));

liveSession.on("message", (message) => {
  if (message.type === "transcript" && message.data.is_final) {
    console.log(`[FINAL] ${message.data.utterance.text}`);
  }
  if (message.type === "transcript" && !message.data.is_final) {
    process.stdout.write(`\r[PARTIAL] ${message.data.utterance.text}`);
  }
});

// Stream audio file chunks
const audioBuffer = fs.readFileSync("./audio.wav");
const CHUNK_SIZE = 4096;
for (let i = 0; i < audioBuffer.length; i += CHUNK_SIZE) {
  liveSession.sendAudio(audioBuffer.subarray(i, i + CHUNK_SIZE));
}

liveSession.stopRecording();
```
```

--------------------------------

### Get Transcription Result

Source: https://github.com/gladiaio/docs/blob/main/chapters/pre-recorded-stt/quickstart.mdx

Poll the GET /v2/pre-recorded/:id endpoint or the result_url from the create response until the job status is 'done' to retrieve transcription results.

```APIDOC
## GET /v2/pre-recorded/:id

### Description
Retrieves the transcription result for a pre-recorded audio file. Poll this endpoint until the job status is 'done'.

### Method
GET

### Endpoint
`/v2/pre-recorded/:id`

### Parameters
#### Path Parameters
- **id** (string) - Required - The unique identifier of the pre-recorded job.

#### Headers
- **x-gladia-key** (string) - Required - Your Gladia API key.

### Request Example
```javascript
const id = "your_job_id"; // Replace with your job ID

const response = await fetch(
  `https://api.gladia.io/v2/pre-recorded/${id}`,
  {
    method: "GET",
    headers: {
      "x-gladia-key": "YOUR_GLADIA_API_KEY",
    },
  }
);

if (!response.ok) {
  console.error(
    `${response.status}: ${(await response.text()) || response.statusText}`
  );
  return;
}

const result = await response.json();
console.log(result);
```

### Response
#### Success Response (200)
- **status** (string) - The current status of the transcription job (e.g., 'processing', 'done').
- **result_url** (string) - A URL to retrieve the full transcription results once the job is done.
- Other fields may be present depending on the job status and completion.
```

--------------------------------

### Connect to Live WebSocket and Stream Audio with SDK

Source: https://context7.com/gladiaio/docs/llms.txt

Use the Gladia SDK to establish a live WebSocket connection, stream audio chunks, and receive real-time transcripts. Ensure the SDK is installed and your API key is configured.

```javascript
// JavaScript: full live session using the SDK
import { GladiaClient } from "@gladiaio/sdk";
import fs from "fs";

const gladiaClient = new GladiaClient({ apiKey: "YOUR_GLADIA_API_KEY" });

const liveSession = gladiaClient.liveV2().startSession({
  model: "solaria-1",
  encoding: "wav/pcm",
  sample_rate: 16000,
  bit_depth: 16,
  channels: 1,
  language_config: { languages: ["en"], code_switching: false },
  messages_config: { receive_partial_transcripts: true },
});

liveSession.on("started", () => console.log("Session started"));
liveSession.on("ended", (msg) => console.log("Session ended", msg));
liveSession.on("error", (err) => console.error("Error:", err));

liveSession.on("message", (message) => {
  if (message.type === "transcript" && message.data.is_final) {
    console.log(`[FINAL] ${message.data.utterance.text}`);
  }
  if (message.type === "transcript" && !message.data.is_final) {
    process.stdout.write(`\r[PARTIAL] ${message.data.utterance.text}`);
  }
});

// Stream audio file chunks
const audioBuffer = fs.readFileSync("./audio.wav");
const CHUNK_SIZE = 4096;
for (let i = 0; i < audioBuffer.length; i += CHUNK_SIZE) {
  liveSession.sendAudio(audioBuffer.subarray(i, i + CHUNK_SIZE));
}

liveSession.stopRecording();
```

--------------------------------

### Get Final Results

Source: https://github.com/gladiaio/docs/blob/main/chapters/live-stt/quickstart.mdx

Retrieve the complete transcription results and add-ons for a completed session by calling the GET /v2/live/:id endpoint. You need the session ID obtained from the initial request.

```APIDOC
## Get the final results

If you want to get the complete result, you can call the [`GET /v2/live/:id` endpoint](/api-reference/v2/live/get) with the `id` you received from the initial request.

### Method
GET

### Endpoint
/v2/live/:id

### Parameters
#### Path Parameters
- **id** (string) - Required - The unique identifier of the live transcription session.

#### Headers
- **x-gladia-key** (string) - Required - Your Gladia API key.

### Request Example (JavaScript)
```javascript
const response = await fetch(`https://api.gladia.io/v2/live/${id}`, {
  method: "GET",
  headers: {
    "x-gladia-key": "YOUR_GLADIA_API_KEY",
  },
});

if (!response.ok) {
  // Look at the error message
  // It might be a configuration issue
  console.error(
    `${response.status}: ${(await response.text()) || response.statusText}`
  );
  return;
}

const result = await response.json();
console.log(result);
```

### Request Example (cURL)
```bash
curl --request GET \
  --url https://api.gladia.io/v2/live/ID_OF_THE_SESSION \
  --header 'x-gladia-key: YOUR_GLADIA_API_KEY'
```

### Response
#### Success Response (200)
- **result** (object) - Contains the complete transcription results and add-ons for the session.
```

--------------------------------

### Initiate Live Session

Source: https://github.com/gladiaio/docs/blob/main/chapters/live-stt/quickstart.mdx

Initiate a live speech-to-text session by sending a POST request to the /v2/live endpoint with your audio configuration. This returns a session ID and a WebSocket URL.

```APIDOC
## Initiate your real-time session

First, call the [`POST /v2/live` endpoint](/api-reference/v2/live/init) and pass your configuration. It's important to correctly define the properties `encoding`, `sample_rate`, `bit_depth` and `channels` as we need them to parse your audio chunks.

### Method
POST

### Endpoint
https://api.gladia.io/v2/live

### Request Body
- **encoding** (string) - Required - The audio encoding format (e.g., "wav/pcm").
- **sample_rate** (integer) - Required - The audio sample rate (e.g., 16000).
- **bit_depth** (integer) - Required - The audio bit depth (e.g., 16).
- **channels** (integer) - Required - The number of audio channels (e.g., 1).

### Request Example (JavaScript)
```javascript
const response = await fetch("https://api.gladia.io/v2/live", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "x-gladia-key": "YOUR_GLADIA_API_KEY",
  },
  body: JSON.stringify({
    encoding: "wav/pcm",
    sample_rate: 16000,
    bit_depth: 16,
    channels: 1,
  }),
});

if (!response.ok) {
  console.error(
    `${response.status}: ${(await response.text()) || response.statusText}`
  );
  process.exit(response.status);
}

const { id, url } = await response.json();
```

### Request Example (cURL)
```bash cURL
# The JSON body sits inside single quotes, so it needs no line-continuation backslashes
curl --request POST \
  --url https://api.gladia.io/v2/live \
  --header 'Content-Type: application/json' \
  --header 'x-gladia-key: YOUR_GLADIA_API_KEY' \
  --data '{
    "encoding": "wav/pcm",
    "sample_rate": 16000,
    "bit_depth": 16,
    "channels": 1
  }'
```

### Response
#### Success Response (200)
- **id** (string) - The unique identifier for the live session.
- **url** (string) - The WebSocket URL to connect to for real-time data.

### Response Example
```json
{
  "id": "636c70f6-92c1-4026-a8b6-0dfe3ecf826f",
  "url": "wss://api.gladia.io/v2/live?token=636c70f6-92c1-4026-a8b6-0dfe3ecf826f"
}
```
```

--------------------------------

### Get Pre-recorded Transcription Result

Source: https://github.com/gladiaio/docs/blob/main/chapters/pre-recorded-stt/quickstart.mdx

Poll the GET /v2/pre-recorded/:id endpoint to retrieve transcription results. Ensure your API key is included in the headers. Handle potential errors by checking the response status.

```javascript
const response = await fetch(
  `https://api.gladia.io/v2/pre-recorded/${id}`,
  {
    method: "GET",
    headers: {
      "x-gladia-key": "YOUR_GLADIA_API_KEY",
    },
  }
);

if (!response.ok) {
  console.error(
    `${response.status}: ${(await response.text()) || response.statusText}`
  );
  return;
}

const result = await response.json();
console.log(result);
```

--------------------------------

### Speech Start Message Payload

Source: https://github.com/gladiaio/docs/blob/main/api-reference/v2/live/message/speech-start.mdx

This snippet shows the structure of the `SpeechStartMessage` payload.

```APIDOC
## Speech Start Message Payload

### Description
This message is used to initiate a speech-to-text transcription session.

### Schema
```json
{
  "type": "object",
  "properties": {
    "type": {
      "type": "string",
      "enum": ["speech_start"]
    },
    "language": {
      "type": "string",
      "description": "The language of the audio to be transcribed. Use BCP-47 format."
    },
    "model": {
      "type": "string",
      "description": "The transcription model to use. Defaults to 'faster-whisper-large-v3'."
    },
    "webhook_url": {
      "type": "string",
      "description": "Optional URL to receive transcription results via webhook."
    },
    "metadata": {
      "type": "object",
      "description": "Optional custom metadata to associate with the transcription."
    }
  },
  "required": ["type", "language"]
}
```
```

--------------------------------

### Utterance with Detected Language

Source: https://context7.com/gladiaio/docs/llms.txt

Example of a response utterance showing the detected language and timestamps.

```json
{
  "text": "Bonjour tout le monde.",
  "language": "fr",
  "start": 5.10,
  "end": 6.80
}
```

--------------------------------

### Initiate V1 WebSocket Connection

Source: https://github.com/gladiaio/docs/blob/main/chapters/live-stt/migration-from-v1.mdx

Connect to the V1 WebSocket URL and send configuration. Use this for existing V1 implementations.

```javascript
import WebSocket from 'ws';

const socket = new WebSocket('wss://api.gladia.io/audio/text/audio-transcription');

socket.addEventListener("open", function() {
  // Send configuration
  socket.send(JSON.stringify({
    'x_gladia_key': 'YOUR_GLADIA_API_KEY',
    // ...config properties
  }))

  // Start sending audio chunks
});
```

--------------------------------

### Initialize AssemblyAI Streaming Client

Source: https://github.com/gladiaio/docs/blob/main/chapters/migrations/from-assembly.mdx

Initialize the AssemblyAI streaming client with your API key and host. Ensure the API key is securely stored.

```python
from assemblyai.streaming.v3 import StreamingClient, StreamingClientOptions

api_key = ""

assemblyClient = StreamingClient(
    StreamingClientOptions(
        api_key=api_key,
        api_host="streaming.assemblyai.com",
    )
)
```

```typescript
import { AssemblyAI } from "assemblyai";

const assemblyClient = new AssemblyAI({
  apiKey: process.env.ASSEMBLYAI_API_KEY!,
});
```

--------------------------------

### Get Live Transcription

Source: https://github.com/gladiaio/docs/blob/main/api-reference/v2/live/get.mdx

Fetches the status, parameters, and result of a live transcription by its ID.

```APIDOC
## GET /v2/live/{id}

### Description
Retrieves the status, parameters, and result of a live transcription.

### Method
GET

### Endpoint
/v2/live/{id}

### Parameters
#### Path Parameters
- **id** (string) - Required - The unique identifier of the live transcription.
```

--------------------------------

### Pre-Recorded Transcription using Python SDK

Source: https://context7.com/gladiaio/docs/llms.txt

Initiate a pre-recorded transcription with a single call using the Gladia Python SDK. This example demonstrates how to configure language detection, diarization, translation, and summarization.

```python
from gladiaio_sdk import GladiaClient

gladia_client = GladiaClient(api_key="YOUR_GLADIA_API_KEY").prerecorded()

transcription = gladia_client.transcribe(
    "https://api.gladia.io/file/636c70f6-92c1-4026-a8b6-0dfe3ecf826f",
    {
        "language_config": {"languages": ["en", "fr"], "code_switching": False},
        "diarization": True,
        "diarization_config": {"min_speakers": 1, "max_speakers": 5},
        "translation": True,
        "translation_config": {"model": "enhanced", "target_languages": ["es"]},
        "summarization": True,
        "summarization_config": {"type": "bullet_points"},
    },
)

print(transcription)
```

--------------------------------

### Initialize Live Transcription with Multiple Channels

Source: https://github.com/gladiaio/docs/blob/main/chapters/limits-and-specifications/multiple-channels.mdx

Specify the channel count in your init request for live audio streams. This tells Gladia how many channels to expect and process separately.

```json
{
  "channels": 2
}
```

--------------------------------

### Get Pre-recorded Transcription Result

Source: https://github.com/gladiaio/docs/blob/main/api-reference/v2/pre-recorded/get.mdx

Fetches the status, parameters, and result of a pre-recorded transcription using its ID.

```APIDOC
## GET /v2/pre-recorded/{id}

### Description
Retrieves the status, parameters, and result of a pre-recorded transcription job.

### Method
GET

### Endpoint
/v2/pre-recorded/{id}

### Parameters
#### Path Parameters
- **id** (string) - Required - The unique identifier of the pre-recorded transcription job.
```

--------------------------------

### Get Final Results

Source: https://github.com/gladiaio/docs/blob/main/chapters/live-stt/quickstart.mdx

Retrieve the complete transcription and analysis results for a completed session using the session ID.

```APIDOC
## Get the final results

If you want to get the complete result, you can call the [`GET /v2/live/:id` endpoint](/api-reference/v2/live/get) with the `id` you received from the initial request.

### Method
GET

### Endpoint
/v2/live/:id

### Parameters
#### Path Parameters
- **id** (string) - Required - The unique identifier of the live session.

### Request Example (JavaScript)
```javascript
const response = await fetch(`https://api.gladia.io/v2/live/${sessionId}`, {
  method: 'GET',
  headers: {
    'x-gladia-key': 'YOUR_GLADIA_API_KEY',
  },
});

if (!response.ok) {
  console.error(`${response.status}: ${(await response.text()) || response.statusText}`);
  return;
}

const result = await response.json();
console.log(result);
```

### Request Example (Python)
```python
import os
import requests

session_id = "ID_OF_THE_SESSION"
api_key = os.environ.get("GLADIA_API_KEY") or "YOUR_GLADIA_API_KEY"

response = requests.get(
    f"https://api.gladia.io/v2/live/{session_id}",
    headers={"x-gladia-key": api_key},
)

if not response.ok:
    print(f"{response.status_code}: {response.text or response.reason}")
else:
    print(response.json())
```

### Request Example (cURL)
```bash cURL
curl --request GET \
  --url https://api.gladia.io/v2/live/ID_OF_THE_SESSION \
  --header 'x-gladia-key: YOUR_GLADIA_API_KEY'
```

### Response
#### Success Response (200)
- **result** (object) - Contains the complete transcription and analysis results.
```

--------------------------------

### Upload Audio File (Python)

Source: https://github.com/gladiaio/docs/blob/main/chapters/pre-recorded-stt/quickstart.mdx

Upload a local audio file using the Gladia SDK for Python. The response contains an audio_url to be used in subsequent steps. Requires your API key.

```python
from gladiaio_sdk import GladiaClient

gladia_client = GladiaClient(api_key="YOUR_GLADIA_API_KEY").prerecorded()

upload_response = gladia_client.upload_file("YOUR_LOCAL_PATH")
```

--------------------------------

### Get Transcription Result

Source: https://github.com/gladiaio/docs/blob/main/api-reference/v2/transcription/get.mdx

Fetches the status, parameters, and result of a transcription using its ID. Note: This endpoint is deprecated.

```APIDOC
## GET /v2/transcription/{id}

### Description
Retrieves the status, parameters, and result of a transcription.

### Method
GET

### Endpoint
/v2/transcription/{id}

### Parameters
#### Path Parameters
- **id** (string) - Required - The unique identifier of the transcription.
```

--------------------------------

### Connect to Live WebSocket and Stream Audio (Raw WebSocket)

Source: https://context7.com/gladiaio/docs/llms.txt

This snippet shows how to establish a raw WebSocket connection to Gladia's API, send audio data, and receive transcription results without using the SDK.

```APIDOC
## Connect to Live WebSocket and Stream Audio (Raw WebSocket)

This example demonstrates establishing a direct WebSocket connection to the Gladia API, sending audio data as binary frames or base64 encoded JSON, and processing incoming messages.

### Language
JavaScript

### Code
```javascript
import WebSocket from "ws";
import fs from "fs";

const { id, url } = await fetch("https://api.gladia.io/v2/live", {
  method: "POST",
  headers: { "Content-Type": "application/json", "x-gladia-key": "YOUR_GLADIA_API_KEY" },
  body: JSON.stringify({ encoding: "wav/pcm", sample_rate: 16000, bit_depth: 16, channels: 1 }),
}).then((r) => r.json());

// Audio to stream (same placeholder file as in the SDK example)
const audioBuffer = fs.readFileSync("./audio.wav");

const socket = new WebSocket(url);

socket.on("open", () => {
  // Send binary audio chunks
  socket.send(audioBuffer);

  // Or send JSON with base64 (alternative to the binary frame above)
  // socket.send(JSON.stringify({ type: "audio_chunk", data: { chunk: audioBuffer.toString("base64") } }));

  // Stop recording when done; sent here because the socket must be open first
  socket.send(JSON.stringify({ type: "stop_recording" }));
});

// With the "ws" package, the listener receives the message data directly
socket.on("message", (data) => {
  const message = JSON.parse(data.toString());
  if (message.type === "transcript" && message.data.is_final) {
    console.log(message.data.utterance.text);
  }
});
```
```

--------------------------------

### Create a Pre-recorded Transcription Job with SDK (Python)

Source: https://github.com/gladiaio/docs/blob/main/chapters/pre-recorded-stt/quickstart.mdx

Use the Gladia Python SDK to initiate a transcription job for
pre-recorded audio. Configure language detection, custom vocabulary, and other options.

```python
from gladiaio_sdk import GladiaClient

gladia_client = GladiaClient(api_key="YOUR_GLADIA_API_KEY").prerecorded()
job = gladia_client.create(
    {
        "audio_url": "YOUR_AUDIO_URL",
        "language_config": {
            "languages": ["en", "fr"],
            "code_switching": True,
        },
        "custom_vocabulary": True,
        "custom_vocabulary_config": {
            "vocabulary": ["Gladia", "Solaria", "Salesforce"],
        },
    }
)
```

--------------------------------

### Utterance with Word-Level Timestamps

Source: https://context7.com/gladiaio/docs/llms.txt

This JSON structure shows an utterance with word-level start and end timestamps, along with confidence scores for each word.

```json
{
  "utterances": [
    {
      "text": "Split infinity in a time when less is more.",
      "start": 0.21,
      "end": 3.50,
      "words": [
        { "word": "Split", "start": 0.21, "end": 0.69, "confidence": 1.0 },
        { "word": " infinity", "start": 0.91, "end": 1.55, "confidence": 0.95 },
        { "word": " in", "start": 1.60, "end": 1.80, "confidence": 0.99 }
      ]
    }
  ]
}
```

--------------------------------

### Summarize Live Audio (Post-processing)

Source: https://context7.com/gladiaio/docs/llms.txt

Enable summarization for live sessions via post-processing. Ensure `receive_post_processing_events` is enabled to retrieve the summary after the session ends. Use 'concise' for a brief summary.

```json
{
  "post_processing": {
    "summarization": true,
    "summarization_config": {
      "type": "concise"
    }
  },
  "messages_config": {
    "receive_post_processing_events": true
  }
}
```

--------------------------------

### Semantic Sentences Response

Source: https://context7.com/gladiaio/docs/llms.txt

Example response containing semantic sentences, including start/end times, confidence, language, speaker, and channel.
```json
{
  "sentences": {
    "success": true,
    "results": [
      {
        "sentence": "Amy, it says you are trained in technology.",
        "start": 0.47,
        "end": 2.46,
        "confidence": 0.95,
        "language": "en",
        "speaker": 0,
        "channel": 0
      }
    ]
  }
}
```

--------------------------------

### Get Transcription Result

Source: https://github.com/gladiaio/docs/blob/main/chapters/pre-recorded-stt/migration-from-v1.mdx

Retrieve the transcription result for a pre-recorded audio file. The migration guide includes this step by importing a shared MDX snippet rather than inlining the code.

```javascript
import GetTranscriptionResult from '/snippets/get-transcription-result.mdx';
```

--------------------------------

### Initiate Live (Real-Time) Session

Source: https://context7.com/gladiaio/docs/llms.txt

Create a real-time transcription WebSocket session. The response includes a unique `url` to open the WebSocket and a session `id` for later retrieval. Configure audio format, language, and feature options here before connecting. A single session cannot exceed 3 hours.

```APIDOC
## POST /v2/live

### Description
Create a real-time transcription WebSocket session. The response includes a unique `url` to open the WebSocket and a session `id` for later retrieval. Configure audio format, language, and feature options here before connecting. A single session cannot exceed 3 hours.

### Method
POST

### Endpoint
`/v2/live`

### Parameters
#### Request Body
- **model** (string) - Optional - The transcription model to use (e.g., "solaria-1").
- **encoding** (string) - Required - The audio encoding format (e.g., "wav/pcm").
- **sample_rate** (integer) - Required - The audio sample rate.
- **bit_depth** (integer) - Required - The audio bit depth.
- **channels** (integer) - Required - The number of audio channels.
- **language_config** (object) - Configuration for language detection and processing.
  - **languages** (array of strings) - Required - A list of languages to detect.
  - **code_switching** (boolean) - Optional - Whether to enable code switching.
- **realtime_processing** (object) - Configuration for real-time processing features.
  - **translation** (boolean) - Optional - Whether to enable translation.
  - **translation_config** (object) - Configuration for translation.
    - **target_languages** (array of strings) - Required - A list of target languages for translation.
    - **model** (string) - Optional - The translation model to use (e.g., "base").
  - **sentiment_analysis** (boolean) - Optional - Whether to enable sentiment analysis.
- **messages_config** (object) - Configuration for message types to receive.
  - **receive_partial_transcripts** (boolean) - Optional - Whether to receive partial transcripts.
  - **receive_final_transcripts** (boolean) - Optional - Whether to receive final transcripts.
  - **receive_realtime_processing_events** (boolean) - Optional - Whether to receive real-time processing events.

### Request Example
```bash
# cURL
curl --request POST \
  --url https://api.gladia.io/v2/live \
  --header 'Content-Type: application/json' \
  --header 'x-gladia-key: YOUR_GLADIA_API_KEY' \
  --data '{
    "model": "solaria-1",
    "encoding": "wav/pcm",
    "sample_rate": 16000,
    "bit_depth": 16,
    "channels": 1,
    "language_config": {
      "languages": ["en"],
      "code_switching": false
    },
    "realtime_processing": {
      "translation": true,
      "translation_config": {
        "target_languages": ["fr"],
        "model": "base"
      },
      "sentiment_analysis": true
    },
    "messages_config": {
      "receive_partial_transcripts": true,
      "receive_final_transcripts": true,
      "receive_realtime_processing_events": true
    }
  }'
```

### Response
#### Success Response (200)
- **id** (string) - The unique identifier for the live session.
- **url** (string) - The WebSocket URL to connect to for the live session.
#### Response Example
```json
{
  "id": "de70f43f-3041-46e0-892c-8e7f53800a22",
  "url": "wss://api.gladia.io/v2/live?token=de70f43f-3041-46e0-892c-8e7f53800a22"
}
```
```

--------------------------------

### Enable Chapterization for Live Audio

Source: https://github.com/gladiaio/docs/blob/main/chapters/audio-intelligence/chapterization.mdx

For live audio, enable chapterization within 'post_processing' and ensure 'receive_post_processing_events' is true.

```json
{
  "post_processing": {
    "chapterization": true
  },
  "messages_config": {
    "receive_post_processing_events": true
  }
}
```

--------------------------------

### Connect to WebSocket and Handle Messages

Source: https://github.com/gladiaio/docs/blob/main/chapters/live-stt/quickstart.mdx

Connect to the WebSocket to receive transcription events. Implement handlers for 'message', 'started', 'ended', and 'error' events.

```typescript
liveSession.on("message", (message) => {
  // Handle messages from the API
});

liveSession.on("started", (message) => {
  // Handle start session message
});

liveSession.on("ended", (message) => {
  // Handle end session message
});

liveSession.on("error", (message) => {
  // Handle error message
});
```

```python
from gladiaio_sdk import (
    LiveV2WebSocketMessage,
    LiveV2InitResponse,
    LiveV2EndedMessage,
)


@live_session.on("message")
def on_message(message: LiveV2WebSocketMessage) -> None:
    # Handle messages from the API
    pass


@live_session.on("error")
def on_error(error: Exception) -> None:
    # Handle error message
    print(f"Live session error: {error}")


@live_session.once("started")
def on_started(_response: LiveV2InitResponse):
    # Handle start session
    print("Session started. Listening…")


@live_session.once("ended")
def on_ended(_ended: LiveV2EndedMessage):
    # Handle end session
    print("Session ended.")
```

--------------------------------

### Enable Chapterization for Pre-recorded Audio

Source: https://github.com/gladiaio/docs/blob/main/chapters/audio-intelligence/chapterization.mdx

Set the 'chapterization' flag to true in your request configuration for pre-recorded audio.

```json
{
  "chapterization": true
}
```

--------------------------------

### Get Live Session Result

Source: https://context7.com/gladiaio/docs/llms.txt

Retrieve the complete transcription and post-processing results for a finished live session using its unique session ID.

```APIDOC
## GET /v2/live/:id

### Description
Retrieve the complete transcription result and post-processing outputs for a finished live session using the session ID from the init response.

### Method
GET

### Endpoint
`/v2/live/:id`

### Parameters
#### Path Parameters
- **id** (string) - Required - The unique identifier for the live session.

### Request Example
#### cURL
```bash
curl --request GET \
  --url https://api.gladia.io/v2/live/de70f43f-3041-46e0-892c-8e7f53800a22 \
  --header 'x-gladia-key: YOUR_GLADIA_API_KEY'
```

#### Python
```python
import requests

session_id = "de70f43f-3041-46e0-892c-8e7f53800a22"
response = requests.get(
    f"https://api.gladia.io/v2/live/{session_id}",
    headers={"x-gladia-key": "YOUR_GLADIA_API_KEY"},
)
response.raise_for_status()
print(response.json())
```

### Response
#### Success Response (200)
- **result** (object) - Contains the complete transcription and analysis results.
```

--------------------------------

### Create a Transcription Job (SDK)

Source: https://github.com/gladiaio/docs/blob/main/chapters/pre-recorded-stt/quickstart.mdx

Use the Gladia SDK to create a transcription job by passing the audio URL and transcription options.

```APIDOC
## Create a Transcription Job (SDK)

This method uses the Gladia SDK to initiate a transcription job for a pre-recorded audio file.
### Method Signature
```javascript
gladiaClient.preRecorded().createUntyped(options)
```

### Parameters
- **options** (object) - Required - Configuration for the transcription job.
  - **audio_url** (string) - Required - The URL of the audio file to transcribe.
  - **language_config** (object) - Optional - Configuration for language detection and switching.
    - **languages** (array of strings) - Optional - A list of languages to detect. If empty, Gladia will attempt to detect the language.
    - **code_switching** (boolean) - Optional - Enables or disables code switching detection.
  - **custom_vocabulary** (boolean) - Optional - Enables or disables custom vocabulary.
  - **custom_vocabulary_config** (object) - Optional - Configuration for custom vocabulary.
    - **vocabulary** (array of strings) - Required if `custom_vocabulary` is true - A list of custom words or phrases.

### Request Example (JavaScript)
```javascript
import { GladiaClient } from "@gladiaio/sdk";

const gladiaClient = new GladiaClient({ apiKey: "YOUR_GLADIA_API_KEY" });

const job = await gladiaClient.preRecorded().createUntyped({
  audio_url: "YOUR_AUDIO_URL",
  language_config: {
    languages: ["en", "fr"],
    code_switching: true,
  },
  custom_vocabulary: true,
  custom_vocabulary_config: {
    vocabulary: ["Gladia", "Solaria", "Salesforce"],
  },
});
```

### Request Example (Python)
```python
from gladiaio_sdk import GladiaClient

gladia_client = GladiaClient(api_key="YOUR_GLADIA_API_KEY").prerecorded()
job = gladia_client.create(
    {
        "audio_url": "YOUR_AUDIO_URL",
        "language_config": {
            "languages": ["en", "fr"],
            "code_switching": True,
        },
        "custom_vocabulary": True,
        "custom_vocabulary_config": {
            "vocabulary": ["Gladia", "Solaria", "Salesforce"],
        },
    }
)
```
```
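
The parameter list above couples two fields: `vocabulary` is required whenever `custom_vocabulary` is true. A minimal sketch of one way to keep them in sync when assembling the options dict is shown here. `build_job_options` is a hypothetical helper written for illustration, not part of the Gladia SDK, and its choice to omit `language_config` when nothing is specified is an assumption rather than documented behavior.

```python
from typing import Any, Dict, List, Optional


def build_job_options(
    audio_url: str,
    languages: Optional[List[str]] = None,
    code_switching: bool = False,
    vocabulary: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build an options dict shaped like the parameters documented above.

    Hypothetical helper for illustration; it is not part of the Gladia SDK.
    """
    if not audio_url:
        raise ValueError("audio_url is required")

    options: Dict[str, Any] = {"audio_url": audio_url}

    # language_config is optional, so include it only when something is set
    # (assumption: omitting it leaves language handling to the service).
    if languages or code_switching:
        options["language_config"] = {
            "languages": list(languages or []),
            "code_switching": code_switching,
        }

    # vocabulary is required whenever custom_vocabulary is true, so the flag
    # and its config are always set together.
    if vocabulary:
        options["custom_vocabulary"] = True
        options["custom_vocabulary_config"] = {"vocabulary": list(vocabulary)}

    return options


options = build_job_options(
    "YOUR_AUDIO_URL",
    languages=["en", "fr"],
    code_switching=True,
    vocabulary=["Gladia", "Solaria", "Salesforce"],
)
```

The resulting dict has the same shape as the one passed to `gladia_client.create(...)` in the Python request example, so it could be handed to that call directly.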