AssemblyAI Python SDK
https://github.com/assemblyai/assemblyai-python-sdk
A Python SDK for AssemblyAI's AI models, enabling transcription and understanding of audio and ...
Tokens: 21,897 | Snippets: 102 | Trust Score: 9.3 | Last update: 4 months ago

# AssemblyAI Python SDK

The AssemblyAI Python SDK provides a comprehensive interface for transcribing and understanding audio using AI models. This SDK offers access to AssemblyAI's speech-to-text API, audio intelligence features, and the LeMUR (Leveraging Large Language Models to Understand Recognized Speech) framework for advanced audio analysis. The SDK supports both synchronous and asynchronous operations, real-time streaming transcription, and a wide range of audio intelligence features including sentiment analysis, entity detection, content safety, and PII redaction.

The SDK is built with a focus on developer experience, offering simple interfaces for common tasks while providing granular control for advanced use cases. It handles file uploads, transcription polling, and subtitle generation, and integrates seamlessly with large language models for post-processing audio transcripts. All operations are configurable through a comprehensive settings system, and the SDK supports both batch and streaming transcription workflows.

## API Reference

### Transcribe Audio Files

Transcribe audio files from URLs, local paths, or binary data with automatic file upload handling.

```python
import assemblyai as aai

# Set API key
aai.settings.api_key = "your-api-key-here"

# Create transcriber
transcriber = aai.Transcriber()

# Transcribe from URL
transcript = transcriber.transcribe("https://example.com/audio.mp3")
print(transcript.text)
print(f"Confidence: {transcript.confidence}")
print(f"Duration: {transcript.audio_duration} seconds")

# Transcribe local file
transcript = transcriber.transcribe("./local-audio.wav")

# Transcribe binary data
with open("audio.mp3", "rb") as f:
    audio_data = f.read()
transcript = transcriber.transcribe(audio_data)

# Handle transcription errors
if transcript.status == aai.TranscriptStatus.error:
    print(f"Transcription failed: {transcript.error}")
else:
    print(transcript.text)
```

### Configure Transcription Options

Customize transcription with language detection, speaker labels, and formatting options.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

# Configure transcription settings
config = aai.TranscriptionConfig(
    language_code="en",
    speaker_labels=True,
    speakers_expected=2,
    punctuate=True,
    format_text=True,
    dual_channel=False,
    webhook_url="https://your-server.com/webhook",
    word_boost=["custom", "vocabulary"],
    boost_param="high"
)

# Use configuration
transcriber = aai.Transcriber(config=config)
transcript = transcriber.transcribe("https://example.com/meeting.mp3")

# Access speaker-labeled utterances
for utterance in transcript.utterances:
    print(f"Speaker {utterance.speaker}: {utterance.text}")
    print(f"Confidence: {utterance.confidence}")
    print(f"Start: {utterance.start}ms, End: {utterance.end}ms")
```

### Submit Transcription Jobs

Submit transcription jobs without waiting for completion, for use in asynchronous workflows.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()

# Submit job without polling
transcript = transcriber.submit("https://example.com/audio.mp3")
print(f"Submitted transcript with ID: {transcript.id}")
print(f"Status: {transcript.status}")

# Later, wait for completion
transcript = transcript.wait_for_completion()
print(transcript.text)

# Or retrieve existing transcript by ID
transcript = aai.Transcript.get_by_id("transcript-id-here")
print(transcript.text)

# Use async variant
future = transcriber.transcribe_async("https://example.com/audio.mp3")
# Do other work...
transcript = future.result()
print(transcript.text)
```

### Transcribe Multiple Files

Process multiple audio files concurrently with automatic batching and error handling.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()

# Transcribe multiple files
audio_files = [
    "https://example.com/audio1.mp3",
    "https://example.com/audio2.mp3",
    "./local-audio.wav",
]

# Returns TranscriptGroup
transcript_group = transcriber.transcribe_group(audio_files)

# Iterate over completed transcripts
for transcript in transcript_group:
    print(f"ID: {transcript.id}")
    print(f"Text: {transcript.text}")
    print(f"Status: {transcript.status}")

# Handle failures separately
transcript_group, failures = transcriber.transcribe_group(
    audio_files,
    return_failures=True
)
for error in failures:
    print(f"Failed: {error}")

# Check overall status
print(f"Group status: {transcript_group.status}")
```

### Export Subtitles

Generate SRT and VTT subtitle files from transcripts for video players.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/video-audio.mp3")

# Export to SRT format
srt_subtitles = transcript.export_subtitles_srt()
print(srt_subtitles)

# Export to VTT format
vtt_subtitles = transcript.export_subtitles_vtt()
print(vtt_subtitles)

# Control caption length
srt_subtitles = transcript.export_subtitles_srt(chars_per_caption=32)

# Save to files
with open("subtitles.srt", "w") as f:
    f.write(srt_subtitles)
with open("subtitles.vtt", "w") as f:
    f.write(vtt_subtitles)
```

### Search and Navigate Transcripts

Search for words, and retrieve sentences and paragraphs for structured navigation.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/audio.mp3")

# Search for specific words or phrases
matches = transcript.word_search(["price", "product", "discount"])
for match in matches:
    print(f"Found '{match.text}' {match.count} times")
    for timestamp in match.timestamps:
        print(f"  At {timestamp}ms")

# Get sentences
sentences = transcript.get_sentences()
for sentence in sentences:
    print(f"{sentence.text}")
    print(f"Start: {sentence.start}ms, End: {sentence.end}ms")
    print(f"Confidence: {sentence.confidence}")

# Get paragraphs
paragraphs = transcript.get_paragraphs()
for paragraph in paragraphs:
    print(f"{paragraph.text}")
    print(f"Start: {paragraph.start}ms, End: {paragraph.end}ms")
```

### Custom Spelling Dictionary

Ensure proper spelling of domain-specific terms with custom dictionaries.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

# Define custom spellings
config = aai.TranscriptionConfig()
config.set_custom_spelling({
    "Kubernetes": ["k8s", "kates"],
    "PostgreSQL": ["postgres", "post gres"],
    "API": ["A P I"],
    "LaTeX": ["lay tech", "latex"]
})

transcriber = aai.Transcriber()
transcript = transcriber.transcribe(
    "https://example.com/tech-talk.mp3",
    config=config
)
print(transcript.text)
# Output will use correct spellings: "Kubernetes", "PostgreSQL", etc.
```

### PII Redaction

Automatically detect and redact personally identifiable information from transcripts and audio.
```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

# Configure PII redaction
config = aai.TranscriptionConfig()
config.set_redact_pii(
    policies=[
        aai.PIIRedactionPolicy.person_name,
        aai.PIIRedactionPolicy.phone_number,
        aai.PIIRedactionPolicy.email_address,
        aai.PIIRedactionPolicy.credit_card_number,
        aai.PIIRedactionPolicy.ssn,
        aai.PIIRedactionPolicy.location,
    ],
    substitution=aai.PIISubstitutionPolicy.hash
)

transcriber = aai.Transcriber()
transcript = transcriber.transcribe(
    "https://example.com/customer-call.mp3",
    config=config
)
print(transcript.text)
# Output: "My name is ###### and my phone is ##########"

# Redact the audio file too
config_with_audio = aai.TranscriptionConfig(
    redact_pii=True,
    redact_pii_policies=[aai.PIIRedactionPolicy.person_name],
    redact_pii_audio=True
)
transcript = transcriber.transcribe("https://example.com/audio.mp3", config=config_with_audio)

# Get redacted audio URL
redacted_url = transcript.get_redacted_audio_url()
print(f"Redacted audio: {redacted_url}")

# Save redacted audio locally
transcript.save_redacted_audio("redacted_audio.mp3")
```

### Content Safety Detection

Detect sensitive and inappropriate content in audio with confidence scores.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

# Enable content safety detection
config = aai.TranscriptionConfig(
    content_safety=True,
    content_safety_confidence=75  # Only include labels with >75% confidence
)

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/audio.mp3", config=config)

# Get flagged content with timestamps
for result in transcript.content_safety.results:
    print(f"Text: {result.text}")
    print(f"Timestamp: {result.timestamp.start} - {result.timestamp.end}")
    for label in result.labels:
        print(f"  Category: {label.label}")
        print(f"  Confidence: {label.confidence}")
        print(f"  Severity: {label.severity}")

# Get overall summary
for category, confidence in transcript.content_safety.summary.items():
    print(f"{confidence * 100}% confident audio contains {category}")

# Get severity summary
for category, severity in transcript.content_safety.severity_score_summary.items():
    print(f"{category}:")
    print(f"  Low severity: {severity.low * 100}%")
    print(f"  Medium severity: {severity.medium * 100}%")
    print(f"  High severity: {severity.high * 100}%")
```

### Sentiment Analysis

Analyze sentiment of sentences in transcripts with confidence scores.
```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

# Enable sentiment analysis
config = aai.TranscriptionConfig(
    sentiment_analysis=True,
    speaker_labels=True  # Optional: include speaker info
)

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/review.mp3", config=config)

# Analyze sentiment per sentence
for sentiment in transcript.sentiment_analysis:
    print(f"Text: {sentiment.text}")
    print(f"Sentiment: {sentiment.sentiment}")  # POSITIVE, NEUTRAL, or NEGATIVE
    print(f"Confidence: {sentiment.confidence}")
    print(f"Timestamp: {sentiment.start}ms - {sentiment.end}ms")
    if hasattr(sentiment, 'speaker'):
        print(f"Speaker: {sentiment.speaker}")

# Calculate overall sentiment
positive_count = sum(1 for s in transcript.sentiment_analysis if s.sentiment == aai.SentimentType.positive)
negative_count = sum(1 for s in transcript.sentiment_analysis if s.sentiment == aai.SentimentType.negative)
neutral_count = sum(1 for s in transcript.sentiment_analysis if s.sentiment == aai.SentimentType.neutral)

print(f"Positive: {positive_count}, Negative: {negative_count}, Neutral: {neutral_count}")
```

### Entity Detection

Identify and extract named entities like people, organizations, and locations.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

# Enable entity detection
config = aai.TranscriptionConfig(entity_detection=True)

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/news.mp3", config=config)

# Extract entities
for entity in transcript.entities:
    print(f"Entity: {entity.text}")
    print(f"Type: {entity.entity_type}")  # person_name, location, organization, etc.
    print(f"Timestamp: {entity.start}ms - {entity.end}ms")

# Filter by entity type
people = [e for e in transcript.entities if e.entity_type == aai.EntityType.person_name]
locations = [e for e in transcript.entities if e.entity_type == aai.EntityType.location]
organizations = [e for e in transcript.entities if e.entity_type == aai.EntityType.organization]

print(f"Found {len(people)} people, {len(locations)} locations, {len(organizations)} organizations")
```

### Auto Chapters

Automatically segment audio into time-stamped chapters, each with a headline and summary.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

# Enable auto chapters
config = aai.TranscriptionConfig(auto_chapters=True)

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/podcast.mp3", config=config)

# Get chapters with summaries
for chapter in transcript.chapters:
    print(f"Chapter: {chapter.headline}")
    print(f"Summary: {chapter.summary}")
    print(f"Gist: {chapter.gist}")
    print(f"Time: {chapter.start}ms - {chapter.end}ms")
    print("---")

# Create chapter navigation
chapter_times = [(ch.start, ch.headline) for ch in transcript.chapters]
for start_time, headline in chapter_times:
    print(f"{start_time / 1000:.2f}s - {headline}")
```

### Auto Highlights

Extract key phrases and important moments from audio content.
```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

# Enable auto highlights
config = aai.TranscriptionConfig(auto_highlights=True)

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/presentation.mp3", config=config)

# Get highlighted phrases ranked by importance
for highlight in transcript.auto_highlights.results:
    print(f"Highlight: {highlight.text}")
    print(f"Rank: {highlight.rank}")  # Relevancy score
    print(f"Count: {highlight.count}")  # Number of occurrences
    # Get all timestamps where this phrase appears
    for timestamp in highlight.timestamps:
        print(f"  At {timestamp.start}ms - {timestamp.end}ms")

# Get top 5 highlights
top_highlights = sorted(
    transcript.auto_highlights.results,
    key=lambda h: h.rank,
    reverse=True
)[:5]
for h in top_highlights:
    print(f"- {h.text} (rank: {h.rank})")
```

### Topic Detection (IAB Categories)

Automatically classify audio content into topics and categories.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

# Enable IAB category detection
config = aai.TranscriptionConfig(iab_categories=True)

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/podcast.mp3", config=config)

# Get topic segments
for result in transcript.iab_categories.results:
    print(f"Text: {result.text}")
    print(f"Timestamp: {result.timestamp.start}ms - {result.timestamp.end}ms")
    for label in result.labels:
        print(f"  Topic: {label.label}")
        print(f"  Relevance: {label.relevance}")

# Get overall topic summary
print("\nOverall Topics:")
for topic, relevance in transcript.iab_categories.summary.items():
    print(f"{topic}: {relevance * 100:.1f}% relevant")

# Get top 5 topics
sorted_topics = sorted(
    transcript.iab_categories.summary.items(),
    key=lambda x: x[1],
    reverse=True
)[:5]
for topic, relevance in sorted_topics:
    print(f"- {topic} ({relevance * 100:.1f}%)")
```

### Summarization

Generate concise summaries of audio content with customizable formats.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

# Enable summarization with default settings (bullets, informative)
config = aai.TranscriptionConfig(summarization=True)

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/meeting.mp3", config=config)

print("Summary:")
print(transcript.summary)

# Customize summary type and model
config = aai.TranscriptionConfig(
    summarization=True,
    summary_model=aai.SummarizationModel.catchy,
    summary_type=aai.SummarizationType.headline
)
transcript = transcriber.transcribe("https://example.com/podcast.mp3", config=config)
print(f"Headline: {transcript.summary}")

# Try different summary models
for model in [aai.SummarizationModel.informative,
              aai.SummarizationModel.conversational,
              aai.SummarizationModel.catchy]:
    config = aai.TranscriptionConfig(
        summarization=True,
        summary_model=model,
        summary_type=aai.SummarizationType.bullets
    )
    transcript = transcriber.transcribe("https://example.com/audio.mp3", config=config)
    print(f"{model}: {transcript.summary}")
```

### LeMUR Task (Custom Prompts)

Use custom prompts to analyze transcripts with large language models.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/earnings-call.mp3")

# Run custom prompt
result = transcript.lemur.task(
    prompt="Extract the key financial metrics mentioned in this earnings call. "
           "Format as a JSON object with revenue, profit, and growth_rate.",
    final_model=aai.LemurModel.claude3_5_sonnet,
    temperature=0.0
)

print(result.response)
print(f"Request ID: {result.request_id}")
print(f"Usage: {result.usage.input_tokens} input, {result.usage.output_tokens} output")

# Provide additional context
result = transcript.lemur.task(
    prompt="What are the main technical challenges discussed?",
    context="This is a technical architecture review meeting for a cloud platform.",
    final_model=aai.LemurModel.claude3_5_sonnet
)
print(result.response)
```

### LeMUR Question & Answer

Ask structured questions about audio transcripts with automated Q&A.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/customer-call.mp3")

# Ask multiple questions
questions = [
    aai.LemurQuestion(question="What product was the customer interested in?"),
    aai.LemurQuestion(question="What was the customer's main concern?"),
    aai.LemurQuestion(question="Was the issue resolved?"),
    aai.LemurQuestion(question="What is the customer's price range?"),
]

result = transcript.lemur.question(
    questions=questions,
    context="This is a sales call with a potential customer.",
    final_model=aai.LemurModel.claude3_5_sonnet
)

# Process answers
for qa in result.response:
    print(f"Q: {qa.question}")
    print(f"A: {qa.answer}")
    print("---")

print(f"Usage: {result.usage.input_tokens} input tokens, {result.usage.output_tokens} output tokens")
```

### LeMUR Summary

Generate structured summaries with context-aware formatting.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/meeting.mp3")

# Generate summary
result = transcript.lemur.summarize(
    context="This is a product planning meeting for Q1 2024.",
    answer_format="TLDR with 3-5 bullet points",
    final_model=aai.LemurModel.claude3_5_sonnet
)

print("Summary:")
print(result.response)
print(f"Request ID: {result.request_id}")

# Try different formats
result = transcript.lemur.summarize(
    context="Executive team meeting",
    answer_format="One paragraph executive summary, followed by key decisions made",
    final_model=aai.LemurModel.claude3_5_sonnet,
    temperature=0.3
)
print(result.response)
```

### LeMUR Action Items

Extract actionable tasks and follow-ups from transcripts.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/project-meeting.mp3")

# Extract action items
result = transcript.lemur.action_items(
    context="Sprint planning meeting for the engineering team",
    answer_format="List each action item with the responsible person and deadline if mentioned",
    final_model=aai.LemurModel.claude3_5_sonnet
)

print("Action Items:")
print(result.response)

# Process multiple transcripts
transcript_group = transcriber.transcribe_group([
    "https://example.com/meeting1.mp3",
    "https://example.com/meeting2.mp3",
])

result = transcript_group.lemur.action_items(
    context="Weekly standup meetings from the past week",
    answer_format="Group by person, then list their action items"
)
print(result.response)
```

### LeMUR with Custom Input Text

Process formatted transcript text with speaker labels for better context.
```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

# Transcribe with speaker labels
config = aai.TranscriptionConfig(speaker_labels=True)
transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/interview.mp3", config=config)

# Format transcript with speaker labels
formatted_text = ""
for utterance in transcript.utterances:
    formatted_text += f"Speaker {utterance.speaker}:\n{utterance.text}\n\n"

# Use formatted text with LeMUR
result = aai.Lemur().task(
    prompt="Analyze the conversation dynamics. Who was asking most questions? "
           "Who was providing most information? Summarize the interaction style.",
    input_text=formatted_text,
    final_model=aai.LemurModel.claude3_5_sonnet
)
print(result.response)

# Alternative: use custom formatting
custom_text = ""
for i, utterance in enumerate(transcript.utterances, 1):
    custom_text += f"[{i}] Speaker {utterance.speaker} ({utterance.start}ms): {utterance.text}\n"

result = aai.Lemur().task(
    prompt="Create a timeline of the discussion with key points",
    input_text=custom_text
)
print(result.response)
```

### LeMUR Data Management

Retrieve previous LeMUR results and purge sensitive data.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/confidential.mp3")

# Create LeMUR request
result = transcript.lemur.summarize(
    context="Confidential financial discussion",
    answer_format="Executive summary"
)
print(result.response)
request_id = result.request_id

# Later, retrieve the same result without re-processing
lemur = aai.Lemur()
retrieved_result = lemur.get_response_data(request_id)
print(f"Retrieved: {retrieved_result.response}")

# Purge sensitive data when done
purge_result = aai.Lemur.purge_request_data(request_id)
print(f"Purged: {purge_result.deleted}")

# Verify deletion
try:
    lemur.get_response_data(request_id)
except aai.LemurError as e:
    print(f"Data successfully purged: {e}")
```

### Real-time Streaming Transcription

Transcribe audio in real-time from microphones or live streams.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

# Define callbacks
def on_open(session_opened: aai.RealtimeSessionOpened):
    print(f"Session opened with ID: {session_opened.session_id}")

def on_data(transcript: aai.RealtimeTranscript):
    if isinstance(transcript, aai.RealtimeFinalTranscript):
        print(f"Final: {transcript.text}")
    else:
        print(f"Partial: {transcript.text}", end="\r")

def on_error(error: aai.RealtimeError):
    print(f"Error: {error}")

def on_close():
    print("Connection closed")

# Create real-time transcriber
transcriber = aai.RealtimeTranscriber(
    sample_rate=16_000,
    on_data=on_data,
    on_error=on_error,
    on_open=on_open,
    on_close=on_close,
    encoding=aai.AudioEncoding.pcm_s16le,
    end_utterance_silence_threshold=1000
)

# Connect and stream
transcriber.connect()

# Stream from microphone
microphone_stream = aai.extras.MicrophoneStream(sample_rate=16_000)
transcriber.stream(microphone_stream)

# Or stream from file
file_stream = aai.extras.stream_file("audio.wav", sample_rate=16_000)
transcriber.stream(file_stream)

# Close connection
transcriber.close()
```

### Streaming with Temporary Tokens

Use temporary authentication tokens for client-side streaming applications.
```python
import assemblyai as aai

# Server-side: Create temporary token
aai.settings.api_key = "your-api-key-here"

token = aai.RealtimeTranscriber.create_temporary_token(
    expires_in=3600  # 1 hour
)
print(f"Token: {token}")
# Send token to client...

# Client-side: Use temporary token (no API key needed)
def on_data(transcript):
    print(transcript.text)

def on_error(error):
    print(f"Error: {error}")

transcriber = aai.RealtimeTranscriber(
    token=token,  # Use token instead of API key
    sample_rate=16_000,
    on_data=on_data,
    on_error=on_error,
)

transcriber.connect()
microphone_stream = aai.extras.MicrophoneStream(sample_rate=16_000)
transcriber.stream(microphone_stream)
transcriber.close()
```

### List and Filter Transcripts

Retrieve and paginate through previously created transcripts.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()

# Get first page of transcripts
page = transcriber.list_transcripts()
print(f"Page details: {page.page_details}")
print(f"Found {len(page.transcripts)} transcripts")

for item in page.transcripts:
    print(f"ID: {item.id}")
    print(f"Status: {item.status}")
    print(f"Created: {item.created}")

# Filter by status
params = aai.ListTranscriptParameters(
    limit=10,
    status=aai.TranscriptStatus.completed,
)
page = transcriber.list_transcripts(params)

# Paginate through all transcripts
all_transcripts = []
params = aai.ListTranscriptParameters(limit=100)
page = transcriber.list_transcripts(params)
all_transcripts.extend(page.transcripts)

while page.page_details.before_id_of_prev_url is not None:
    params.before_id = page.page_details.before_id_of_prev_url
    page = transcriber.list_transcripts(params)
    all_transcripts.extend(page.transcripts)

print(f"Total transcripts: {len(all_transcripts)}")
```

### Delete Transcripts

Remove transcripts and their associated data from AssemblyAI's servers.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/audio.mp3")
print(f"Created transcript: {transcript.id}")

# Delete by ID
deleted = aai.Transcript.delete_by_id(transcript.id)
print(f"Deleted: {deleted.id}")

# Async deletion
future = aai.Transcript.delete_by_id_async(transcript.id)
deleted = future.result()
print(f"Deleted: {deleted.id}")

# Bulk deletion
params = aai.ListTranscriptParameters(
    status=aai.TranscriptStatus.completed,
    limit=100
)
page = transcriber.list_transcripts(params)

for item in page.transcripts:
    if item.created < "2024-01-01":  # Delete old transcripts
        aai.Transcript.delete_by_id(item.id)
        print(f"Deleted old transcript: {item.id}")
```

### Upload Files

Upload audio files separately from transcription for reuse or archiving.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()

# Upload local file
upload_url = transcriber.upload_file("./audio-file.mp3")
print(f"Uploaded to: {upload_url}")

# Use uploaded URL for transcription
transcript = transcriber.transcribe(upload_url)
print(transcript.text)

# Upload binary data
with open("audio.wav", "rb") as f:
    audio_data = f.read()
upload_url = transcriber.upload_file(audio_data)

# Reuse uploaded URL
transcript1 = transcriber.transcribe(upload_url)
transcript2 = transcriber.transcribe(
    upload_url,
    config=aai.TranscriptionConfig(sentiment_analysis=True)
)

# Async upload
future = transcriber.upload_file_async("./large-file.mp3")
# Do other work...
upload_url = future.result()
print(f"Async upload complete: {upload_url}")
```

### Configure Global Settings

Set default values for API key, timeouts, and polling intervals.

```python
import assemblyai as aai

# Configure via settings object
aai.settings.api_key = "your-api-key-here"
aai.settings.http_timeout = 60.0      # Increase timeout to 60 seconds
aai.settings.polling_interval = 5.0   # Poll every 5 seconds
aai.settings.base_url = "https://api.assemblyai.com"

# Or configure via environment variables
# export ASSEMBLYAI_API_KEY=your-api-key-here
# export ASSEMBLYAI_HTTP_TIMEOUT=60.0
# export ASSEMBLYAI_POLLING_INTERVAL=5.0

# Use custom client
custom_settings = aai.Settings(
    api_key="custom-api-key",
    http_timeout=120.0,
    polling_interval=10.0
)
client = aai.Client(settings=custom_settings)
transcriber = aai.Transcriber(client=client)

# Access last HTTP response
transcript = transcriber.transcribe("https://example.com/audio.mp3")
response = client.last_response
print(f"Status code: {response.status_code}")
print(f"Headers: {response.headers}")
```

### Error Handling

Handle various error scenarios with proper exception catching.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()

try:
    transcript = transcriber.transcribe("https://example.com/invalid-audio.mp3")
    if transcript.status == aai.TranscriptStatus.error:
        print(f"Transcription failed: {transcript.error}")
    else:
        print(transcript.text)
except aai.TranscriptError as e:
    print(f"Transcription error: {e}")
    print(f"Status code: {e.status_code}")
except aai.AssemblyAIError as e:
    print(f"API error: {e}")
    print(f"Status code: {e.status_code}")

# Handle LeMUR errors
try:
    result = transcript.lemur.task("Analyze this")
except aai.LemurError as e:
    print(f"LeMUR error: {e}")

# Handle redacted audio errors
try:
    config = aai.TranscriptionConfig(redact_pii=True, redact_pii_audio=True)
    transcript = transcriber.transcribe("https://example.com/audio.mp3", config=config)
    redacted_url = transcript.get_redacted_audio_url()
except aai.RedactedAudioIncompleteError:
    print("Redacted audio still processing")
except aai.RedactedAudioExpiredError:
    print("Redacted audio has expired")
except aai.RedactedAudioUnavailableError:
    print("Redacted audio not available")

# Access HTTP response for debugging
client = aai.Client.get_default()
try:
    transcript = transcriber.transcribe("https://example.com/audio.mp3")
except Exception as e:
    print(f"Error: {e}")
    print(f"Response status: {client.last_response.status_code}")
    print(f"Response body: {client.last_response.text}")
```

## Summary

The AssemblyAI Python SDK provides a comprehensive solution for audio transcription and understanding, serving use cases from simple speech-to-text conversion to advanced audio intelligence and LLM-powered analysis. The SDK excels in media production workflows for generating subtitles and captions, customer service analytics for sentiment analysis and call summarization, content moderation for detecting sensitive material, and compliance applications requiring PII redaction. Its real-time streaming capabilities enable live transcription for meetings, broadcasts, and voice interfaces, while batch processing supports large-scale audio analysis pipelines.

Integration patterns follow a simple initialization flow: set the API key, create a transcriber instance, configure transcription options, and submit audio for processing.
The SDK supports both synchronous operations for interactive applications and asynchronous workflows for background processing. For real-time applications, the streaming client connects via WebSocket with customizable callbacks. Advanced use cases leverage LeMUR to combine transcript data with large language models for summarization, question answering, and custom analysis tasks. The SDK handles file management, error handling, and polling automatically, providing a robust foundation for building audio-powered applications with minimal boilerplate code.
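
As a quick orientation, here is a minimal sketch of that flow, assembled only from the calls shown in the sections above. The environment-variable lookup and the example URL are placeholders for illustration, not part of the SDK itself.

```python
import os

import assemblyai as aai

# Placeholder: the docs above show ASSEMBLYAI_API_KEY as a supported env variable
aai.settings.api_key = os.environ["ASSEMBLYAI_API_KEY"]

# Configure once and reuse the transcriber for every file
config = aai.TranscriptionConfig(speaker_labels=True, sentiment_analysis=True)
transcriber = aai.Transcriber(config=config)

# Submit audio and wait for the result (polling is handled by the SDK)
transcript = transcriber.transcribe("https://example.com/audio.mp3")
if transcript.status == aai.TranscriptStatus.error:
    raise RuntimeError(transcript.error)
print(transcript.text)

# Optional LLM post-processing via LeMUR, as covered earlier
summary = transcript.lemur.summarize(answer_format="3 bullet points")
print(summary.response)
```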