### Install AssemblyAI SDK

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/README.md

Commands to install the SDK using various package managers.

```bash
npm install assemblyai
```

```bash
yarn add assemblyai
```

```bash
pnpm add assemblyai
```

```bash
bun add assemblyai
```

--------------------------------

### Install Dependencies with npm

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/samples/streaming-stt-from-mic/README.md

Run this command to install the necessary Node.js dependencies for the project.

```bash
npm install
```

--------------------------------

### Run the Transcription Sample

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/samples/streaming-stt-from-mic/README.md

Execute this command to start the real-time audio transcription sample application.

```bash
npm run start
```

--------------------------------

### Install AssemblyAI Node.js SDK

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/CLAUDE.md

Install the SDK using npm. Ensure your Node.js version is 18 or higher.

```bash
npm install assemblyai
```

--------------------------------

### Get Transcript Sentences and Paragraphs

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Retrieve transcript content segmented into sentences or paragraphs, including start and end timing information.

```javascript
import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: "YOUR_API_KEY",
});

const transcriptId = "your-transcript-id";

// Get sentences
const { sentences } = await client.transcripts.sentences(transcriptId);
for (const sentence of sentences) {
  console.log(`[${sentence.start}ms - ${sentence.end}ms]: ${sentence.text}`);
}

// Get paragraphs
const { paragraphs } = await client.transcripts.paragraphs(transcriptId);
for (const paragraph of paragraphs) {
  console.log(`Paragraph: ${paragraph.text}\n`);
}

// Output:
// [0ms - 3500ms]: Welcome to today's episode.
// [3600ms - 8200ms]: We have a special guest joining us.
//
// Paragraph: Welcome to today's episode. We have a special guest joining us...
```

--------------------------------

### Get Sentences and Paragraphs

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Retrieve the transcript broken down into sentences or paragraphs with timing information.

```APIDOC
## Get Sentences and Paragraphs

### Description
Retrieve the transcript broken down into sentences or paragraphs with timing information.

### Method
`GET`

### Endpoint
`/v2/transcript/{transcript_id}/sentences`
`/v2/transcript/{transcript_id}/paragraphs`

### Parameters
#### Path Parameters
- **transcript_id** (string) - Required - The ID of the transcript.

### Response
#### Success Response (200)
**Sentences:**
- **sentences** (array) - An array of sentence objects.
  - **start** (integer) - The start time of the sentence in milliseconds.
  - **end** (integer) - The end time of the sentence in milliseconds.
  - **text** (string) - The text of the sentence.

**Paragraphs:**
- **paragraphs** (array) - An array of paragraph objects.
  - **text** (string) - The text of the paragraph.

#### Response Example (Sentences)
```json
{
  "sentences": [
    {
      "start": 0,
      "end": 3500,
      "text": "Welcome to today's episode."
    },
    {
      "start": 3600,
      "end": 8200,
      "text": "We have a special guest joining us."
    }
  ]
}
```

#### Response Example (Paragraphs)
```json
{
  "paragraphs": [
    {
      "text": "Welcome to today's episode. We have a special guest joining us..."
    }
  ]
}
```
```

--------------------------------

### Get Transcript

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Retrieve an existing transcript by ID to check its current status or access the transcription results.

```APIDOC
## Get Transcript

### Description
Retrieve an existing transcript by ID to check its current status or access the transcription results.

### Method
`GET`

### Endpoint
`/v2/transcript/{transcript_id}`

### Parameters
#### Path Parameters
- **transcript_id** (string) - Required - The ID of the transcript to retrieve.

### Response
#### Success Response (200)
- **status** (string) - The current status of the transcript (e.g., "completed", "error").
- **audio_url** (string) - The URL of the audio file that was transcribed.
- **audio_duration** (number) - The duration of the audio file in seconds.
- **text** (string) - The full transcribed text (if status is "completed").
- **words** (array) - An array of word objects with timing information (if available).
- **confidence** (number) - The overall confidence score of the transcription (if available).
- **error** (string) - An error message if the status is "error".

#### Response Example
```json
{
  "status": "completed",
  "audio_url": "https://assembly.ai/conference.mp3",
  "audio_duration": 120.5,
  "text": "This is the transcribed text.",
  "words": [
    {
      "text": "This",
      "start": 100,
      "end": 200
    }
  ],
  "confidence": 0.95
}
```
```

--------------------------------

### Initialize AssemblyAI Client

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/README.md

Import the module and instantiate the client with an API key.

```javascript
import { AssemblyAI } from "assemblyai";

const baseUrl = "https://api.assemblyai.com";

const client = new AssemblyAI({
  apiKey: "YOUR_API_KEY",
  baseUrl: baseUrl,
});
```

--------------------------------

### Initialize AssemblyAI Client

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Create an AssemblyAI client instance using your API key. The `baseUrl` is optional and defaults to the AssemblyAI API endpoint.

```javascript
import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: "YOUR_API_KEY",
  baseUrl: "https://api.assemblyai.com", // optional, defaults to this
});
```

--------------------------------

### Load SDK via CDN

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/README.md

HTML script tags for loading the SDK from UNPKG.

```html
```

--------------------------------

### Create a Streaming Transcriber with Node.js SDK

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/README.md

Initialize a streaming transcriber with specific audio configurations. This is the first step for real-time transcription.

```typescript
const transcriber = client.streaming.transcriber({
  speechModel: "u3-rt-pro",
  sampleRate: 16_000,
  formatTurns: true,
});
```

--------------------------------

### Initialize AssemblyAI Client

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/CLAUDE.md

Initialize the AssemblyAI client with your API key. Never expose API keys client-side; use temporary auth tokens for browser streaming.

```typescript
const client = new AssemblyAI({
  apiKey: process.env.ASSEMBLYAI_API_KEY,
});
```

--------------------------------

### Configure AssemblyAI API Key

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/samples/streaming-stt-from-mic/README.md

Set your AssemblyAI API key as an environment variable or in a .env file. Replace '[YOUR_ASSEMBLYAI_API_KEY]' with your actual API key.

```plaintext
ASSEMBLYAI_API_KEY=[YOUR_ASSEMBLYAI_API_KEY]
```

--------------------------------

### Connect to Streaming Server

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/README.md

Establish a connection to the streaming audio server using the configured transcriber instance. This must be called after event configuration.

```typescript
await transcriber.connect();
```

--------------------------------

### Initialize StreamingTranscriber via CDN

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/README.md

Create a transcriber object using the global assemblyai variable provided by the CDN script.

```javascript
const { StreamingTranscriber } = assemblyai;
const transcriber = new StreamingTranscriber({
  token: "[GENERATE TEMPORARY AUTH TOKEN IN YOUR API]",
  ...
});
```

--------------------------------

### Initialize StreamingTranscriber in the browser

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/docs/compat.md

Use a temporary token to initialize the StreamingTranscriber instance in a browser environment.

```js
import { StreamingTranscriber } from "assemblyai";
// or the following if you're using UMD
// const { StreamingTranscriber } = assemblyai;

const token = getToken(); // getToken is a function for you to implement

const rt = new StreamingTranscriber({
  token: token,
});
```

--------------------------------

### Initialize Streaming Transcriber with Token

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/README.md

Instantiate the StreamingTranscriber client using a provided authentication token, typically retrieved from a server.

```typescript
import { StreamingTranscriber } from "assemblyai";

// TODO: implement getToken to retrieve token from server
const token = await getToken();

const transcriber = new StreamingTranscriber({
  token,
});
```

--------------------------------

### Real-time Streaming Transcription

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Set up a real-time streaming transcriber using WebSockets for live audio. Configure event handlers for session status and transcriptions.

```javascript
import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: "YOUR_API_KEY",
});

// Create streaming transcriber
const transcriber = client.streaming.transcriber({
  sampleRate: 16000,
  speechModel: "u3-rt-pro",
  formatTurns: true,
  encoding: "pcm_s16le", // or "pcm_mulaw"
});

// Configure event handlers
transcriber.on("open", ({ id, expires_at }) => {
  console.log("Session opened:", id);
  console.log("Expires at:", new Date(expires_at * 1000));
});

transcriber.on("turn", ({ transcript, end_of_turn, words }) => {
  if (end_of_turn) {
    console.log("Final:", transcript);
  } else {
    console.log("Partial:", transcript);
  }
});

transcriber.on("error", (error) => {
  console.error("Streaming error:", error.message);
});

transcriber.on("close", (code, reason) => {
  console.log("Session closed:", code, reason);
});

// Connect to AssemblyAI
await transcriber.connect();

// Send audio chunks (from microphone, file, etc.)
// transcriber.sendAudio(audioChunk);

// Or pipe a stream
// audioStream.pipeTo(transcriber.stream());

// Update configuration mid-stream
transcriber.updateConfiguration({
  end_of_turn_confidence_threshold: 0.8,
  min_turn_silence: 500,
});

// Force end of current turn
transcriber.forceEndpoint();

// Close when done
await transcriber.close();
```

--------------------------------

### Upload File API

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Upload an audio file to AssemblyAI and receive a URL for transcription using the upload method.

```APIDOC
## POST /v2/upload

### Description
Upload an audio file to AssemblyAI and receive a URL for transcription.

### Method
POST

### Endpoint
`/v2/upload`

### Parameters
#### Request Body
- **file** (File, Buffer, Stream) - Required - The audio file to upload.

### Response
#### Success Response (200)
- **upload_url** (string) - The URL where the file has been uploaded.

#### Response Example
```json
{
  "upload_url": "https://aai-prod-files.s3.us-east-2.amazonaws.com/your-file-id.mp3"
}
```
```

--------------------------------

### Browser Usage with AssemblyAI SDK CDN

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Load the AssemblyAI SDK directly in the browser using a CDN script tag for client-side streaming applications. Choose between the full SDK or a smaller streaming-only version.

```html
```

--------------------------------

### Real-time Streaming Transcription

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/CLAUDE.md

Set up a real-time streaming transcription session. Use the `u3-rt-pro` speech model for streaming. The `turn` event provides transcribed text as it becomes available.

```typescript
const transcriber = client.streaming.transcriber({
  speechModel: "u3-rt-pro",
  sampleRate: 16_000,
});

// The turn payload carries the text in its `transcript` field,
// as in the other streaming examples in this document.
transcriber.on("turn", (turn) => {
  console.log(turn.transcript);
});

await transcriber.connect();
// Send audio chunks: transcriber.sendAudio(chunk)
await transcriber.close();
```

--------------------------------

### LeMUR Custom Task with Transcript IDs

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Run custom prompts against transcript content, referenced by transcript ID, for flexible LLM-powered analysis. The prompt can request detailed analysis and a specific output format such as JSON.

```javascript
import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: "YOUR_API_KEY",
});

const response = await client.lemur.task({
  transcript_ids: ["transcript-id"],
  prompt: `Analyze this customer call transcript and provide:
    1. A brief summary (2-3 sentences)
    2. Customer sentiment score (1-10)
    3. Key topics discussed
    4. Any follow-up actions needed
    5. Potential upsell opportunities

    Format the response as JSON.`,
  context: "This is a sales call with a potential enterprise customer.",
  final_model: "anthropic/claude-3-5-sonnet",
  temperature: 0.2,
  max_output_size: 1000,
});

console.log("Analysis:\n", response.response);
```

--------------------------------

### Upload Audio File

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Upload audio files to AssemblyAI for transcription. Supports uploads from file paths, buffers, and streams.

```javascript
import { AssemblyAI } from "assemblyai";
import fs from "fs";

const client = new AssemblyAI({
  apiKey: "YOUR_API_KEY",
});

// Upload from file path
const uploadUrl = await client.files.upload("./recording.mp3");
console.log("Uploaded to:", uploadUrl);

// Upload from buffer
const buffer = fs.readFileSync("./audio.wav");
const uploadUrl2 = await client.files.upload(buffer);

// Upload from stream
const stream = fs.createReadStream("./podcast.mp3");
const uploadUrl3 = await client.files.upload(stream);

// Use uploaded URL for transcription
const transcript = await client.transcripts.transcribe({
  audio_url: uploadUrl,
});
```

--------------------------------

### List and paginate transcripts

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/README.md

Retrieve a list of transcripts or iterate through all pages using pagination URLs.

```javascript
const page = await client.transcripts.list();
```

```typescript
let previousPageUrl: string | null = null;
do {
  const page = await client.transcripts.list(previousPageUrl);
  previousPageUrl = page.page_details.prev_url;
} while (previousPageUrl !== null);
```

--------------------------------

### Configure Streaming Transcriber Events

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/README.md

Set up event listeners for the streaming transcriber to handle session openings, closures, transcriptions, and errors.

```typescript
transcriber.on("open", ({ id, expires_at }) =>
  console.log('Session ID:', id, 'Expires at:', expires_at));
transcriber.on("close", (code: number, reason: string) =>
  console.log('Closed', code, reason));
transcriber.on("turn", ({ transcript }) =>
  console.log('Transcript:', transcript));
transcriber.on("error", (error: Error) =>
  console.error('Error', error));
```

--------------------------------

### Create Temporary Token for Streaming with Node.js SDK

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/README.md

Generate a temporary authentication token on the server for secure client-side streaming. The token is valid only for the number of seconds given in `expires_in_seconds`.

```typescript
const token = await client.streaming.createTemporaryToken({
  expires_in_seconds: 60,
});
```

--------------------------------

### Transcribe Local Audio Files

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Transcribe audio from local files by providing a file path, a readable stream, or a buffer. The SDK handles the file upload automatically before transcription.

```javascript
import { AssemblyAI } from "assemblyai";
import fs from "fs";

const client = new AssemblyAI({
  apiKey: "YOUR_API_KEY",
});

// From file path
const transcript = await client.transcripts.transcribe({
  audio: "./interview.mp3",
  speech_models: ["universal-3-pro", "universal-2"],
  language_detection: true,
});

// From readable stream
const stream = fs.createReadStream("./podcast.wav");
const transcriptFromStream = await client.transcripts.transcribe({
  audio: stream,
  speech_models: ["universal-3-pro", "universal-2"],
  language_detection: true,
});

// From buffer
const buffer = fs.readFileSync("./recording.mp3");
const transcriptFromBuffer = await client.transcripts.transcribe({
  audio: buffer,
  speech_models: ["universal-3-pro", "universal-2"],
  language_detection: true,
});

console.log("Transcript:", transcript.text);
```

--------------------------------

### Transcribe Local Audio File

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/CLAUDE.md

Transcribe an audio file from your local file system. The `.transcribe()` method waits for completion.

```typescript
const transcript = await client.transcripts.transcribe({
  audio: "./recording.mp3",
});
```

--------------------------------

### Transcribe a local audio file

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/README.md

Upload and transcribe a local file path using the transcribe or submit methods.

```javascript
// Upload a file via local path and transcribe
let transcript = await client.transcripts.transcribe({
  audio: "./news.mp4",
  speech_models: ["universal-3-pro", "universal-2"],
  language_detection: true,
});
```

```javascript
let transcript = await client.transcripts.submit({
  audio: "./news.mp4",
  speech_models: ["universal-3-pro", "universal-2"],
  language_detection: true,
});
```

--------------------------------

### Create Temporary Token for Browser Streaming API

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Generate a temporary authentication token for client-side streaming without exposing your API key.

```APIDOC
## POST /v2/realtime/token

### Description
Generate a temporary authentication token for client-side streaming.

### Method
POST

### Endpoint
`/v2/realtime/token`

### Parameters
#### Request Body
- **expires_in_seconds** (integer) - Required - The duration in seconds for which the token will be valid.
- **max_session_duration_seconds** (integer) - Optional - The maximum duration in seconds for a streaming session.

### Response
#### Success Response (200)
- **token** (string) - The temporary authentication token.

#### Response Example
```json
{
  "token": "your-temporary-token"
}
```
```

--------------------------------

### Generate LeMUR Summary

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Use this to generate an AI-powered summary of one or more transcripts. Requires transcript IDs and can be configured with context, answer format, and model parameters.

```javascript
import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: "YOUR_API_KEY",
});

// First, transcribe audio
const transcript = await client.transcripts.transcribe({
  audio: "https://assembly.ai/meeting.mp3",
});

// Generate summary
const summary = await client.lemur.summary({
  transcript_ids: [transcript.id],
  context: "This is a team meeting about Q4 planning.",
  answer_format: "bullet points",
  final_model: "anthropic/claude-3-5-sonnet",
  temperature: 0,
  max_output_size: 2000,
});

console.log("Summary:", summary.response);
console.log("Request ID:", summary.request_id);
console.log("Tokens used:", summary.usage.input_tokens, "in,", summary.usage.output_tokens, "out");

// Output:
// Summary:
// - Team discussed Q4 revenue targets of $2M
// - Marketing plans to launch new campaign in October
// - Engineering to complete platform upgrade by November
// Request ID: 5e1b27c2-691f-4414-8bc5-f14678442f9e
// Tokens used: 1500 in, 150 out
```

--------------------------------

### Transcribe Audio File from URL

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Transcribe an audio file directly from a URL. This method queues the job and automatically polls for completion, returning the full transcript.

```javascript
import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: "YOUR_API_KEY",
});

// Transcribe from URL
const transcript = await client.transcripts.transcribe({
  audio: "https://assembly.ai/sports_injuries.mp3",
  speech_models: ["universal-3-pro", "universal-2"],
  language_detection: true,
});

console.log("Transcript ID:", transcript.id);
console.log("Status:", transcript.status);
console.log("Text:", transcript.text);
```

--------------------------------

### Streaming Transcription API

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Create a real-time streaming transcriber for live audio transcription using WebSocket connections.

```APIDOC
## WebSocket Endpoint for Streaming Transcription

### Description
Establish a WebSocket connection for real-time audio transcription.

### Method
WebSocket

### Endpoint
`wss://api.assemblyai.com/v2/realtime/ws`

### Parameters
#### Query Parameters
- **sample_rate** (integer) - Required - The audio sample rate (e.g., 16000).
- **speech_model** (string) - Optional - The speech model to use (e.g., "u3-rt-pro").
- **format_turns** (boolean) - Optional - Whether to format turns in the output.
- **encoding** (string) - Optional - The audio encoding format (e.g., "pcm_s16le", "pcm_mulaw").
- **token** (string) - Required for browser streaming - A temporary token generated by `createTemporaryToken`.

### Request Body (Audio Chunks)
Binary audio data chunks are sent over the WebSocket connection.

### Response
#### Events
- **open**: Emitted when the session is opened. Contains `id` and `expires_at`.
- **turn**: Emitted when a turn of speech is detected. Contains `transcript`, `end_of_turn`, and `words`.
- **error**: Emitted when an error occurs. Contains an `error` object with a `message`.
- **close**: Emitted when the session is closed. Contains `code` and `reason`.

### Configuration Updates
Send JSON messages to the WebSocket to update configuration mid-stream:
```json
{
  "end_of_turn_confidence_threshold": 0.8,
  "min_turn_silence": 500
}
```

### Actions
Send JSON messages to the WebSocket to perform actions:
```json
{
  "type": "ForceEndpoint"
}
```
```

--------------------------------

### Transcribe Audio from URL

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/CLAUDE.md

Transcribe audio from a given URL. This method polls until the transcription is complete. Supports specifying multiple speech models for fallback ordering and enabling speaker labels.

```typescript
const transcript = await client.transcripts.transcribe({
  audio: "https://example.com/audio.mp3",
  speech_models: ["universal-3-pro", "universal-2"],
  speaker_labels: true,
});

console.log(transcript.text);
// utterances can be absent on the transcript, so fall back to an empty list
for (const utterance of transcript.utterances ?? []) {
  console.log(`Speaker ${utterance.speaker}: ${utterance.text}`);
}
```

--------------------------------

### List and Paginate Transcripts

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Retrieve a paginated list of transcripts with filtering options. Use the `prev_url` in `page_details` to iterate through older records.

```javascript
import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: "YOUR_API_KEY",
});

// Get first page
const page = await client.transcripts.list({
  limit: 10,
  status: "completed",
});

console.log("Transcripts:", page.transcripts.length);
for (const t of page.transcripts) {
  console.log(`- ${t.id}: ${t.status} (${t.created})`);
}

// Paginate through all transcripts
let previousPageUrl = page.page_details.prev_url;
while (previousPageUrl !== null) {
  const olderPage = await client.transcripts.list(previousPageUrl);
  console.log("Older transcripts:", olderPage.transcripts.length);
  previousPageUrl = olderPage.page_details.prev_url;
}
```

--------------------------------

### Generate a temporary streaming token

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/docs/compat.md

Generate a temporary token on the server to avoid exposing the API key in client-side applications.

```js
import { AssemblyAI } from "assemblyai";

// Ideally, to avoid embedding your API key client side,
// you generate this token on the server, and pass it to the client via an API.
const client = new AssemblyAI({ apiKey: "YOUR_API_KEY" });
const token = await client.streaming.createTemporaryToken({
  expires_in_seconds: 60,
});
```

--------------------------------

### Retrieve Transcript Details

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Fetch an existing transcript by its ID to check status, audio metadata, and transcription results.

```javascript
import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: "YOUR_API_KEY",
});

const transcript = await client.transcripts.get("your-transcript-id");

console.log("Status:", transcript.status);
console.log("Audio URL:", transcript.audio_url);
console.log("Duration:", transcript.audio_duration, "seconds");

if (transcript.status === "completed") {
  console.log("Text:", transcript.text);
  console.log("Words:", transcript.words?.length);
  console.log("Confidence:", transcript.confidence);
}

if (transcript.status === "error") {
  console.log("Error:", transcript.error);
}
```

--------------------------------

### Transcribe audio from a URL

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/README.md

Transcribe a remote audio file using the transcribe method, which polls until completion, or the submit method for non-blocking submission.

```javascript
// Transcribe file at remote URL
const audioFile = "https://assembly.ai/sports_injuries.mp3";

const params = {
  audio: audioFile,
  speech_models: ["universal-3-pro", "universal-2"],
  language_detection: true,
};

const run = async () => {
  const transcript = await client.transcripts.transcribe(params);
  console.log(transcript.text);
};

run();
```

```javascript
let transcript = await client.transcripts.submit({
  audio: "https://assembly.ai/espn.m4a",
  speech_models: ["universal-3-pro", "universal-2"],
  language_detection: true,
});
```

--------------------------------

### Generate subtitles

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/README.md

Export transcript data as SRT or VTT subtitle formats with optional character limits.

```javascript
const charsPerCaption = 32;
let srt = await client.transcripts.subtitles(transcript.id, "srt");
srt = await client.transcripts.subtitles(transcript.id, "srt", charsPerCaption);

let vtt = await client.transcripts.subtitles(transcript.id, "vtt");
vtt = await client.transcripts.subtitles(transcript.id, "vtt", charsPerCaption);
```

--------------------------------

### Retrieve Subtitles

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/CLAUDE.md

Fetch transcription results in subtitle formats (SRT or VTT) using the transcript ID.

```typescript
const srt = await client.transcripts.subtitles(id, "srt");
const vtt = await client.transcripts.subtitles(id, "vtt");
```

--------------------------------

### LeMUR Custom Task with Input Text

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Perform custom analysis on raw text input using the LeMUR task endpoint. Useful for analyzing short text snippets or when a full transcript is not available.

```javascript
// You can also use input_text instead of transcript_ids
const responseFromText = await client.lemur.task({
  input_text: "Speaker A: Hello, I'm calling about your enterprise plan...",
  prompt: "Summarize this conversation in one paragraph.",
});
```

--------------------------------

### Define global types in assemblyai.d.ts

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/docs/reference-types-from-js.md

Create this file to map the AssemblyAI module types to the global assemblyai variable.

```typescript
import AssemblyAIModule from "assemblyai";

declare global {
  const assemblyai: typeof AssemblyAIModule;
}
```

--------------------------------

### Reference types in JavaScript

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/docs/reference-types-from-js.md

Use a triple-slash reference directive at the top of your script file to enable IDE type support.

```js
///
const { RealtimeTranscriber } = assemblyai;
...
```

--------------------------------

### Enable Speaker Diarization

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/README.md

Enable speaker labels to identify individual speakers in the transcript output.

```javascript
import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: "",
});

const audioFile = "https://assembly.ai/wildfires.mp3";

const params = {
  audio: audioFile,
  speech_models: ["universal-3-pro", "universal-2"],
  language_detection: true,
  speaker_labels: true,
};

const run = async () => {
  const transcript = await client.transcripts.transcribe(params);

  // Fall back to an empty list if no utterances were returned
  for (const utterance of transcript.utterances ?? []) {
    console.log(`Speaker ${utterance.speaker}: ${utterance.text}`);
  }
};

run();
```

--------------------------------

### Enable Speaker Diarization in Transcriptions

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Use speaker_labels to identify individual speakers in audio.
You can provide an expected speaker count or use speaker_options for range-based constraints.

```javascript
import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: "YOUR_API_KEY",
});

const transcript = await client.transcripts.transcribe({
  audio: "https://assembly.ai/wildfires.mp3",
  speech_models: ["universal-3-pro", "universal-2"],
  language_detection: true,
  speaker_labels: true,
  speakers_expected: 2, // Optional hint for expected number of speakers
});

// Access speaker-labeled utterances
if (transcript.utterances) {
  for (const utterance of transcript.utterances) {
    console.log(`Speaker ${utterance.speaker}: ${utterance.text}`);
  }
}

// Output:
// Speaker A: Welcome to the show today.
// Speaker B: Thank you for having me.
// Speaker A: Let's talk about the recent wildfires...

// Advanced: Use speaker_options for precise control
const transcriptAdvanced = await client.transcripts.transcribe({
  audio: "https://assembly.ai/conference.mp3",
  speaker_labels: true,
  speaker_options: {
    min_speakers_expected: 2,
    max_speakers_expected: 10,
  },
});
```

--------------------------------

### Extract sentences and paragraphs

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/README.md

Retrieve structured text segments from a completed transcript.

```javascript
// Both calls return a response object; destructure the array from it
const { sentences } = await client.transcripts.sentences(transcript.id);
const { paragraphs } = await client.transcripts.paragraphs(transcript.id);

for (const paragraph of paragraphs) {
  console.log(paragraph.text);
}

for (const sentence of sentences) {
  console.log(sentence.text);
}
```

--------------------------------

### Generate Subtitles

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Export transcript as SRT or VTT subtitle format for video captioning. You can also specify the maximum number of characters per caption.

```APIDOC
## Generate Subtitles

### Description
Export transcript as SRT or VTT subtitle format for video captioning.

### Method
`GET`

### Endpoint
`/v2/transcript/{transcript_id}/subtitles/{format}`

### Parameters
#### Path Parameters
- **transcript_id** (string) - Required - The ID of the transcript.
- **format** (string) - Required - The subtitle format (e.g., "srt", "vtt").

#### Query Parameters
- **chars_per_caption** (integer) - Optional - Maximum number of characters per caption.

### Response
#### Success Response (200)
- **subtitle content** (string) - The subtitle content in the requested format (SRT or VTT).

#### Response Example (SRT)
```srt
1
00:00:00,000 --> 00:00:03,500
Welcome to today's episode.

2
00:00:03,600 --> 00:00:08,200
We have a special guest joining us.
```
```

--------------------------------

### Extract LeMUR Action Items

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Extract action items and tasks from meeting transcripts. Can specify the answer format, such as a numbered list with assignees.

```javascript
import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: "YOUR_API_KEY",
});

const response = await client.lemur.actionItems({
  transcript_ids: ["meeting-transcript-id"],
  context: "This is a project kickoff meeting with the engineering team.",
  answer_format: "numbered list with assignees",
  final_model: "anthropic/claude-3-5-sonnet",
});

console.log("Action Items:\n", response.response);
console.log("Request ID:", response.request_id);

// Output:
// Action Items:
// 1. Set up development environment - John (Due: Friday)
// 2. Create project repository and CI/CD pipeline - Sarah (Due: Monday)
// 3. Draft technical specification document - Mike (Due: Next Wednesday)
// 4. Schedule follow-up meeting for architecture review - Jane (Due: Tomorrow)
```

--------------------------------

### Transcribe with Multiple Features

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/CLAUDE.md

Enable multiple audio intelligence features during transcription, such as sentiment analysis, entity detection, auto chapters, and language detection. The `speech_models` parameter accepts an array for fallback.

```typescript
const transcript = await client.transcripts.transcribe({
  audio: audioUrl,
  speech_models: ["universal-3-pro", "universal-2"],
  speaker_labels: true,
  sentiment_analysis: true,
  entity_detection: true,
  auto_chapters: true,
  language_detection: true,
});
```

--------------------------------

### Send Audio Stream to Transcriber

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/README.md

Pipe an audio stream directly to the streaming transcriber for efficient processing of larger audio sources.

```typescript
audioStream.pipeTo(transcriber.stream());
```

--------------------------------

### Search for Words in Transcript

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Use this to find occurrences of specific words or phrases within a transcript. Requires a transcript ID.

```javascript
import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: "YOUR_API_KEY",
});

const result = await client.transcripts.wordSearch("your-transcript-id", [
  "climate",
  "wildfire",
  "temperature",
]);

console.log("Total matches:", result.total_count);
for (const match of result.matches) {
  console.log(`"${match.text}" found ${match.count} times`);
  for (const timestamp of match.timestamps) {
    console.log(`  - at ${timestamp.start}ms to ${timestamp.end}ms`);
  }
}
```

--------------------------------

### Retrieve PII-Redacted Audio with Node.js SDK

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Retrieve the PII-redacted audio file URL after transcribing with PII redaction enabled. The lookup requires the transcript ID; the PII policies and redacted audio quality are set in the transcription request.

```javascript
import { AssemblyAI } from "assemblyai";
import fs from "fs";

const client = new AssemblyAI({
  apiKey: "YOUR_API_KEY",
});

// Transcribe with PII redaction
const transcript = await client.transcripts.transcribe({
  audio: "https://example.com/customer-call.mp3",
  redact_pii: true,
  redact_pii_policies: ["person_name", "phone_number", "email_address"],
  redact_pii_audio: true,
  redact_pii_audio_quality: "mp3",
});

// Get redacted audio URL
const redactedAudio = await client.transcripts.redactedAudio(transcript.id);
console.log("Status:", redactedAudio.status);
console.log("Redacted audio URL:", redactedAudio.redacted_audio_url);

// Download redacted audio file
const audioFile = await client.transcripts.redactedAudioFile(transcript.id);
const buffer = await audioFile.arrayBuffer();
fs.writeFileSync("redacted-call.mp3", Buffer.from(buffer));
console.log("Redacted audio saved to redacted-call.mp3");
```

--------------------------------

### List Transcripts

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Retrieve a paginated list of your transcripts with optional filtering by status and limiting the number of results per page.

```APIDOC
## List Transcripts

### Description
Retrieve a paginated list of your transcripts with optional filtering.

### Method
`GET`

### Endpoint
`/v2/transcripts`

### Parameters
#### Query Parameters
- **limit** (integer) - Optional - The maximum number of transcripts to return per page.
- **status** (string) - Optional - Filter transcripts by status (e.g., "completed", "processing", "error").
- **before_id** (string) - Optional - Returns transcripts created before the specified transcript ID.
- **after_id** (string) - Optional - Returns transcripts created after the specified transcript ID.

### Response
#### Success Response (200)
- **transcripts** (array) - An array of transcript objects.
  - **id** (string) - The ID of the transcript.
  - **status** (string) - The status of the transcript.
  - **created** (string) - The creation timestamp of the transcript.
- **page_details** (object) - Information about the current page of results.
  - **locale** (string) - The locale of the results.
  - **next_url** (string) - URL for the next page of results.
  - **prev_url** (string) - URL for the previous page of results.

#### Response Example
```json
{
  "transcripts": [
    {
      "id": "your-transcript-id-1",
      "status": "completed",
      "created": "2023-01-01T10:00:00Z"
    },
    {
      "id": "your-transcript-id-2",
      "status": "processing",
      "created": "2023-01-01T10:05:00Z"
    }
  ],
  "page_details": {
    "locale": "en_US",
    "next_url": "/v2/transcripts?limit=10&after_id=your-transcript-id-2",
    "prev_url": null
  }
}
```
```

--------------------------------

### LeMUR Question and Answer

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Ask specific questions about transcript content and receive structured answers. Supports multiple questions with different answer formats and options.

```javascript
import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: "YOUR_API_KEY",
});

const response = await client.lemur.questionAnswer({
  transcript_ids: ["transcript-id-1", "transcript-id-2"],
  context: "These are customer support calls.",
  questions: [
    {
      question: "What were the main customer complaints?",
      answer_format: "bullet points",
    },
    {
      question: "Were any refunds requested?",
      answer_options: ["Yes", "No", "Unclear"],
    },
    {
      question: "What was the customer sentiment?",
      context: "Consider the tone and language used.",
    },
  ],
  final_model: "anthropic/claude-3-5-sonnet",
});

for (const qa of response.response) {
  console.log(`Q: ${qa.question}`);
  console.log(`A: ${qa.answer}\n`);
}
// Output:
// Q: What were the main customer complaints?
// A: - Delayed shipping times
//    - Product quality issues
//    - Difficulty reaching support
//
// Q: Were any refunds requested?
// A: Yes
//
// Q: What was the customer sentiment?
// A: The customer was frustrated but remained polite throughout the conversation.
```

--------------------------------

### Submit Transcription Job Without Polling

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Queue a transcription job without waiting for it to complete. You can poll for the status and result later using `waitUntilReady` with custom polling options.

```javascript
import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: "YOUR_API_KEY",
});

// Submit without waiting
let transcript = await client.transcripts.submit({
  audio: "https://assembly.ai/espn.m4a",
  speech_models: ["universal-3-pro", "universal-2"],
  language_detection: true,
});

console.log("Queued transcript ID:", transcript.id);
console.log("Initial status:", transcript.status); // "queued" or "processing"

// Poll until ready with custom options
transcript = await client.transcripts.waitUntilReady(transcript.id, {
  pollingInterval: 1000, // Check every 1 second
  pollingTimeout: 300000, // Timeout after 5 minutes
});

console.log("Final status:", transcript.status); // "completed" or "error"
console.log("Text:", transcript.text);
```

--------------------------------

### Send Audio Chunks to Transcriber

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/README.md

Send audio data to the streaming transcriber in small chunks as they become available. This is suitable for real-time audio processing.

```typescript
// Pseudo code for getting audio: getAudio stands in for your own audio source
getAudio((chunk) => {
  transcriber.sendAudio(chunk);
});
```

--------------------------------

### Speaker Diarization

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Enable speaker labels to identify who said what in multi-speaker audio recordings. You can provide an optional hint for the expected number of speakers or use `speaker_options` for more precise control.

```APIDOC
## Speaker Diarization

### Description
Enable speaker labels to identify who said what in multi-speaker audio recordings.

### Method
`POST` (implicitly via `client.transcripts.transcribe`)

### Endpoint
`/v2/transcript`

### Parameters
#### Request Body
- **audio** (string) - Required - URL of the audio file.
- **speech_models** (array of strings) - Optional - List of speech models to use.
- **language_detection** (boolean) - Optional - Enable language detection.
- **speaker_labels** (boolean) - Required - Enable speaker diarization.
- **speakers_expected** (integer) - Optional - Hint for the expected number of speakers.
- **speaker_options** (object) - Optional - For precise control over speaker diarization.
  - **min_speakers_expected** (integer) - Optional - Minimum number of speakers expected.
  - **max_speakers_expected** (integer) - Optional - Maximum number of speakers expected.

### Request Example
```json
{
  "audio": "https://assembly.ai/wildfires.mp3",
  "speech_models": ["universal-3-pro", "universal-2"],
  "language_detection": true,
  "speaker_labels": true,
  "speakers_expected": 2
}
```

### Response
#### Success Response (200)
- **utterances** (array) - Contains speaker-labeled utterances.
  - **speaker** (string) - The label of the speaker (e.g., "A", "B").
  - **text** (string) - The transcribed text for the utterance.

#### Response Example
```json
{
  "utterances": [
    { "speaker": "A", "text": "Welcome to the show today." },
    { "speaker": "B", "text": "Thank you for having me." },
    { "speaker": "A", "text": "Let's talk about the recent wildfires..." }
  ]
}
```
```

--------------------------------

### Close Streaming Connection

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/README.md

Gracefully close the connection to the streaming server when audio transmission is complete or no longer needed.

```typescript
await transcriber.close();
```

--------------------------------

### Retrieve LeMUR Response

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Retrieve a previously generated LeMUR response using its request ID. This is useful for accessing results without re-running the analysis.

```javascript
import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: "YOUR_API_KEY",
});

// Retrieve a previous LeMUR response
const previousResponse = await client.lemur.getResponse("5e1b27c2-691f-4414-8bc5-f14678442f9e");

console.log("Retrieved response:", previousResponse.response);
console.log("Usage:", previousResponse.usage);
```

--------------------------------

### Word Search API

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Search for specific words or phrases within a transcript using the `wordSearch` method.

```APIDOC
## GET /v2/transcript/{transcript_id}/word-search

### Description
Search for specific words or phrases within a transcript.

### Method
GET

### Endpoint
`/v2/transcript/{transcript_id}/word-search`

### Parameters
#### Path Parameters
- **transcript_id** (string) - Required - The ID of the transcript to search within.

#### Query Parameters
- **words** (array of strings) - Required - The words or phrases to search for, sent as a comma-separated list.

### Request Example
`GET /v2/transcript/{transcript_id}/word-search?words=climate,wildfire,temperature`

### Response
#### Success Response (200)
- **total_count** (integer) - The total number of matches found.
- **matches** (array of objects) - A list of matches, each containing:
  - **text** (string) - The matched word or phrase.
  - **count** (integer) - The number of times the word or phrase appears.
  - **timestamps** (array of objects) - A list of timestamps where the word or phrase was found, each containing:
    - **start** (integer) - The start time in milliseconds.
    - **end** (integer) - The end time in milliseconds.

#### Response Example
```json
{
  "total_count": 5,
  "matches": [
    {
      "text": "wildfire",
      "count": 3,
      "timestamps": [
        { "start": 12500, "end": 13200 },
        { "start": 45600, "end": 46100 },
        { "start": 89000, "end": 89800 }
      ]
    }
  ]
}
```
```

--------------------------------

### Retrieve and poll transcript status

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/README.md

Fetch a transcript object or wait for completion using polling intervals and timeouts.

```javascript
// transcriptId is the ID of a previously created transcript.
const transcript = await client.transcripts.get(transcriptId);
```

```javascript
// transcriptId is the ID of a previously submitted transcript.
const transcript = await client.transcripts.waitUntilReady(transcriptId, {
  // How frequently the transcript is polled in ms. Defaults to 3000.
  pollingInterval: 1000,
  // How long to wait in ms until the "Polling timeout" error is thrown. Defaults to infinite (-1).
  pollingTimeout: 5000,
});
```

--------------------------------

### Delete a Transcript with Node.js SDK

Source: https://github.com/assemblyai/assemblyai-node-sdk/blob/main/README.md

Use this snippet to delete a specific transcript by its ID. Ensure you have the transcript ID and an authenticated client instance.

```javascript
const res = await client.transcripts.delete(transcript.id);
```

--------------------------------

### Delete LeMUR Request Data with Node.js SDK

Source: https://context7.com/assemblyai/assemblyai-node-sdk/llms.txt

Use this snippet to delete data associated with a LeMUR request for privacy compliance. Ensure you have your API key and the correct request ID.

```javascript
import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: "YOUR_API_KEY",
});

const result = await client.lemur.purgeRequestData("5e1b27c2-691f-4414-8bc5-f14678442f9e");

console.log("Request ID deleted:", result.request_id);
console.log("Deleted:", result.deleted);
// Output:
// Request ID deleted: 5e1b27c2-691f-4414-8bc5-f14678442f9e
// Deleted: true
```
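
--------------------------------

### Format Millisecond Timestamps (illustrative sketch)

The word search and sentence responses in this reference report `start` and `end` as integer milliseconds. The sketch below shows one way to turn those offsets into the `HH:MM:SS,mmm` form used in the SRT example. `formatTimestamp` and `describeMatches` are hypothetical helpers written for illustration, not part of the AssemblyAI SDK, and the sample object only mirrors the shape of the word search response example.

```javascript
// Hypothetical helper (not part of the AssemblyAI SDK): format a millisecond
// offset as an SRT-style timestamp, HH:MM:SS,mmm.
function formatTimestamp(ms) {
  if (!Number.isInteger(ms) || ms < 0) {
    throw new RangeError("ms must be a non-negative integer");
  }
  const hours = Math.floor(ms / 3_600_000);
  const minutes = Math.floor((ms % 3_600_000) / 60_000);
  const seconds = Math.floor((ms % 60_000) / 1000);
  const millis = ms % 1000;
  const pad = (n, width) => String(n).padStart(width, "0");
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)},${pad(millis, 3)}`;
}

// Hypothetical helper: flatten a word-search-shaped result into one readable
// line per occurrence.
function describeMatches(result) {
  const lines = [];
  for (const match of result.matches) {
    for (const { start, end } of match.timestamps) {
      lines.push(`${match.text}: ${formatTimestamp(start)} --> ${formatTimestamp(end)}`);
    }
  }
  return lines;
}

// Sample data shaped like the word search response example (assumed, not fetched).
const sample = {
  total_count: 1,
  matches: [
    { text: "wildfire", count: 1, timestamps: [{ start: 12500, end: 13200 }] },
  ],
};

console.log(describeMatches(sample).join("\n"));
// Output:
// wildfire: 00:00:12,500 --> 00:00:13,200
```

In practice you would pass the object returned by `client.transcripts.wordSearch` instead of `sample`. Keeping the formatter a pure function makes it easy to reuse for sentence timings as well.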