AssemblyAI Python SDK
https://github.com/assemblyai/assemblyai-python-sdk
A Python SDK for AssemblyAI's AI models, enabling transcription and understanding of audio and ...
Tokens: 21,897 | Snippets: 102 | Trust Score: 9.3 | Last update: 4 months ago

# AssemblyAI Python SDK

The AssemblyAI Python SDK provides a comprehensive interface for transcribing and understanding audio using AI models. This SDK offers access to AssemblyAI's speech-to-text API, audio intelligence features, and the LeMUR (Leveraging Large Language Models to Understand Recognized Speech) framework for advanced audio analysis. The SDK supports both synchronous and asynchronous operations, real-time streaming transcription, and a wide range of audio intelligence features including sentiment analysis, entity detection, content safety, and PII redaction.

The SDK is built with a focus on developer experience, offering simple interfaces for common tasks while providing granular control for advanced use cases. It handles file uploads, transcription polling, and subtitle generation, and integrates seamlessly with large language models for post-processing audio transcripts. All operations are configurable through a comprehensive settings system, and the SDK supports both batch and streaming transcription workflows.

## API Reference

### Transcribe Audio Files

Transcribe audio files from URLs, local paths, or binary data with automatic file upload handling.

```python
import assemblyai as aai

# Set API key
aai.settings.api_key = "your-api-key-here"

# Create transcriber
transcriber = aai.Transcriber()

# Transcribe from URL
transcript = transcriber.transcribe("https://example.com/audio.mp3")
print(transcript.text)
print(f"Confidence: {transcript.confidence}")
print(f"Duration: {transcript.audio_duration} seconds")

# Transcribe local file
transcript = transcriber.transcribe("./local-audio.wav")

# Transcribe binary data
with open("audio.mp3", "rb") as f:
    audio_data = f.read()
transcript = transcriber.transcribe(audio_data)

# Handle transcription errors
if transcript.status == aai.TranscriptStatus.error:
    print(f"Transcription failed: {transcript.error}")
else:
    print(transcript.text)
```

### Configure Transcription Options

Customize transcription with language detection, speaker labels, and formatting options.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

# Configure transcription settings
config = aai.TranscriptionConfig(
    language_code="en",
    speaker_labels=True,
    speakers_expected=2,
    punctuate=True,
    format_text=True,
    dual_channel=False,
    webhook_url="https://your-server.com/webhook",
    word_boost=["custom", "vocabulary"],
    boost_param="high"
)

# Use configuration
transcriber = aai.Transcriber(config=config)
transcript = transcriber.transcribe("https://example.com/meeting.mp3")

# Access speaker-labeled utterances
for utterance in transcript.utterances:
    print(f"Speaker {utterance.speaker}: {utterance.text}")
    print(f"Confidence: {utterance.confidence}")
    print(f"Start: {utterance.start}ms, End: {utterance.end}ms")
```

### Submit Transcription Jobs

Submit transcription jobs without waiting for completion, for use in asynchronous workflows.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()

# Submit job without polling
transcript = transcriber.submit("https://example.com/audio.mp3")
print(f"Submitted transcript with ID: {transcript.id}")
print(f"Status: {transcript.status}")

# Later, wait for completion
transcript = transcript.wait_for_completion()
print(transcript.text)

# Or retrieve existing transcript by ID
transcript = aai.Transcript.get_by_id("transcript-id-here")
print(transcript.text)

# Use async variant
future = transcriber.transcribe_async("https://example.com/audio.mp3")
# Do other work...
transcript = future.result()
print(transcript.text)
```

### Transcribe Multiple Files

Process multiple audio files concurrently with automatic batching and error handling.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()

# Transcribe multiple files
audio_files = [
    "https://example.com/audio1.mp3",
    "https://example.com/audio2.mp3",
    "./local-audio.wav",
]

# Returns TranscriptGroup
transcript_group = transcriber.transcribe_group(audio_files)

# Iterate over completed transcripts
for transcript in transcript_group:
    print(f"ID: {transcript.id}")
    print(f"Text: {transcript.text}")
    print(f"Status: {transcript.status}")

# Handle failures separately
transcript_group, failures = transcriber.transcribe_group(
    audio_files,
    return_failures=True
)
for error in failures:
    print(f"Failed: {error}")

# Check overall status
print(f"Group status: {transcript_group.status}")
```

### Export Subtitles

Generate SRT and VTT subtitle files from transcripts for video players.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/video-audio.mp3")

# Export to SRT format
srt_subtitles = transcript.export_subtitles_srt()
print(srt_subtitles)

# Export to VTT format
vtt_subtitles = transcript.export_subtitles_vtt()
print(vtt_subtitles)

# Control caption length
srt_subtitles = transcript.export_subtitles_srt(chars_per_caption=32)

# Save to files
with open("subtitles.srt", "w") as f:
    f.write(srt_subtitles)
with open("subtitles.vtt", "w") as f:
    f.write(vtt_subtitles)
```

### Search and Navigate Transcripts

Search for words, and retrieve sentences and paragraphs for structured navigation.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/audio.mp3")

# Search for specific words or phrases
matches = transcript.word_search(["price", "product", "discount"])
for match in matches:
    print(f"Found '{match.text}' {match.count} times")
    for timestamp in match.timestamps:
        print(f"  At {timestamp}ms")

# Get sentences
sentences = transcript.get_sentences()
for sentence in sentences:
    print(f"{sentence.text}")
    print(f"Start: {sentence.start}ms, End: {sentence.end}ms")
    print(f"Confidence: {sentence.confidence}")

# Get paragraphs
paragraphs = transcript.get_paragraphs()
for paragraph in paragraphs:
    print(f"{paragraph.text}")
    print(f"Start: {paragraph.start}ms, End: {paragraph.end}ms")
```

### Custom Spelling Dictionary

Ensure proper spelling of domain-specific terms with custom dictionaries.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

# Define custom spellings
config = aai.TranscriptionConfig()
config.set_custom_spelling({
    "Kubernetes": ["k8s", "kates"],
    "PostgreSQL": ["postgres", "post gres"],
    "API": ["A P I"],
    "LaTeX": ["lay tech", "latex"]
})

transcriber = aai.Transcriber()
transcript = transcriber.transcribe(
    "https://example.com/tech-talk.mp3",
    config=config
)
print(transcript.text)
# Output will use correct spellings: "Kubernetes", "PostgreSQL", etc.
```

### PII Redaction

Automatically detect and redact personally identifiable information from transcripts and audio.
```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

# Configure PII redaction
config = aai.TranscriptionConfig()
config.set_redact_pii(
    policies=[
        aai.PIIRedactionPolicy.person_name,
        aai.PIIRedactionPolicy.phone_number,
        aai.PIIRedactionPolicy.email_address,
        aai.PIIRedactionPolicy.credit_card_number,
        aai.PIIRedactionPolicy.ssn,
        aai.PIIRedactionPolicy.location,
    ],
    substitution=aai.PIISubstitutionPolicy.hash
)

transcriber = aai.Transcriber()
transcript = transcriber.transcribe(
    "https://example.com/customer-call.mp3",
    config=config
)
print(transcript.text)
# Output: "My name is ###### and my phone is ##########"

# Redact the audio file too
config_with_audio = aai.TranscriptionConfig(
    redact_pii=True,
    redact_pii_policies=[aai.PIIRedactionPolicy.person_name],
    redact_pii_audio=True
)
transcript = transcriber.transcribe("https://example.com/audio.mp3", config=config_with_audio)

# Get redacted audio URL
redacted_url = transcript.get_redacted_audio_url()
print(f"Redacted audio: {redacted_url}")

# Save redacted audio locally
transcript.save_redacted_audio("redacted_audio.mp3")
```

### Content Safety Detection

Detect sensitive and inappropriate content in audio with confidence scores.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

# Enable content safety detection
config = aai.TranscriptionConfig(
    content_safety=True,
    content_safety_confidence=75  # Only include labels with >75% confidence
)

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/audio.mp3", config=config)

# Get flagged content with timestamps
for result in transcript.content_safety.results:
    print(f"Text: {result.text}")
    print(f"Timestamp: {result.timestamp.start} - {result.timestamp.end}")
    for label in result.labels:
        print(f"  Category: {label.label}")
        print(f"  Confidence: {label.confidence}")
        print(f"  Severity: {label.severity}")

# Get overall summary
for category, confidence in transcript.content_safety.summary.items():
    print(f"{confidence * 100}% confident audio contains {category}")

# Get severity summary
for category, severity in transcript.content_safety.severity_score_summary.items():
    print(f"{category}:")
    print(f"  Low severity: {severity.low * 100}%")
    print(f"  Medium severity: {severity.medium * 100}%")
    print(f"  High severity: {severity.high * 100}%")
```

### Sentiment Analysis

Analyze sentiment of sentences in transcripts with confidence scores.
```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

# Enable sentiment analysis
config = aai.TranscriptionConfig(
    sentiment_analysis=True,
    speaker_labels=True  # Optional: include speaker info
)

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/review.mp3", config=config)

# Analyze sentiment per sentence
for sentiment in transcript.sentiment_analysis:
    print(f"Text: {sentiment.text}")
    print(f"Sentiment: {sentiment.sentiment}")  # POSITIVE, NEUTRAL, or NEGATIVE
    print(f"Confidence: {sentiment.confidence}")
    print(f"Timestamp: {sentiment.start}ms - {sentiment.end}ms")
    if hasattr(sentiment, 'speaker'):
        print(f"Speaker: {sentiment.speaker}")

# Calculate overall sentiment
positive_count = sum(1 for s in transcript.sentiment_analysis if s.sentiment == aai.SentimentType.positive)
negative_count = sum(1 for s in transcript.sentiment_analysis if s.sentiment == aai.SentimentType.negative)
neutral_count = sum(1 for s in transcript.sentiment_analysis if s.sentiment == aai.SentimentType.neutral)

print(f"Positive: {positive_count}, Negative: {negative_count}, Neutral: {neutral_count}")
```

### Entity Detection

Identify and extract named entities like people, organizations, and locations.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

# Enable entity detection
config = aai.TranscriptionConfig(entity_detection=True)

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/news.mp3", config=config)

# Extract entities
for entity in transcript.entities:
    print(f"Entity: {entity.text}")
    print(f"Type: {entity.entity_type}")  # person_name, location, organization, etc.
    print(f"Timestamp: {entity.start}ms - {entity.end}ms")

# Filter by entity type
people = [e for e in transcript.entities if e.entity_type == aai.EntityType.person_name]
locations = [e for e in transcript.entities if e.entity_type == aai.EntityType.location]
organizations = [e for e in transcript.entities if e.entity_type == aai.EntityType.organization]

print(f"Found {len(people)} people, {len(locations)} locations, {len(organizations)} organizations")
```

### Auto Chapters

Automatically segment audio into time-stamped chapters, each with a headline and summary.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

# Enable auto chapters
config = aai.TranscriptionConfig(auto_chapters=True)

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/podcast.mp3", config=config)

# Get chapters with summaries
for chapter in transcript.chapters:
    print(f"Chapter: {chapter.headline}")
    print(f"Summary: {chapter.summary}")
    print(f"Gist: {chapter.gist}")
    print(f"Time: {chapter.start}ms - {chapter.end}ms")
    print("---")

# Create chapter navigation
chapter_times = [(ch.start, ch.headline) for ch in transcript.chapters]
for start_time, headline in chapter_times:
    print(f"{start_time / 1000:.2f}s - {headline}")
```

### Auto Highlights

Extract key phrases and important moments from audio content.
```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

# Enable auto highlights
config = aai.TranscriptionConfig(auto_highlights=True)

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/presentation.mp3", config=config)

# Get highlighted phrases ranked by importance
for highlight in transcript.auto_highlights.results:
    print(f"Highlight: {highlight.text}")
    print(f"Rank: {highlight.rank}")  # Relevancy score
    print(f"Count: {highlight.count}")  # Number of occurrences
    # Get all timestamps where this phrase appears
    for timestamp in highlight.timestamps:
        print(f"  At {timestamp.start}ms - {timestamp.end}ms")

# Get top 5 highlights
top_highlights = sorted(
    transcript.auto_highlights.results,
    key=lambda h: h.rank,
    reverse=True
)[:5]
for h in top_highlights:
    print(f"- {h.text} (rank: {h.rank})")
```

### Topic Detection (IAB Categories)

Automatically classify audio content into topics and categories.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

# Enable IAB category detection
config = aai.TranscriptionConfig(iab_categories=True)

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/podcast.mp3", config=config)

# Get topic segments
for result in transcript.iab_categories.results:
    print(f"Text: {result.text}")
    print(f"Timestamp: {result.timestamp.start}ms - {result.timestamp.end}ms")
    for label in result.labels:
        print(f"  Topic: {label.label}")
        print(f"  Relevance: {label.relevance}")

# Get overall topic summary
print("\nOverall Topics:")
for topic, relevance in transcript.iab_categories.summary.items():
    print(f"{topic}: {relevance * 100:.1f}% relevant")

# Get top 5 topics
sorted_topics = sorted(
    transcript.iab_categories.summary.items(),
    key=lambda x: x[1],
    reverse=True
)[:5]
for topic, relevance in sorted_topics:
    print(f"- {topic} ({relevance * 100:.1f}%)")
```

### Summarization

Generate concise summaries of audio content with customizable formats.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

# Enable summarization with default settings (bullets, informative)
config = aai.TranscriptionConfig(summarization=True)

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/meeting.mp3", config=config)

print("Summary:")
print(transcript.summary)

# Customize summary type and model
config = aai.TranscriptionConfig(
    summarization=True,
    summary_model=aai.SummarizationModel.catchy,
    summary_type=aai.SummarizationType.headline
)
transcript = transcriber.transcribe("https://example.com/podcast.mp3", config=config)
print(f"Headline: {transcript.summary}")

# Try different summary models
for model in [aai.SummarizationModel.informative,
              aai.SummarizationModel.conversational,
              aai.SummarizationModel.catchy]:
    config = aai.TranscriptionConfig(
        summarization=True,
        summary_model=model,
        summary_type=aai.SummarizationType.bullets
    )
    transcript = transcriber.transcribe("https://example.com/audio.mp3", config=config)
    print(f"{model}: {transcript.summary}")
```

### LeMUR Task (Custom Prompts)

Use custom prompts to analyze transcripts with large language models.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/earnings-call.mp3")

# Run custom prompt
result = transcript.lemur.task(
    prompt="Extract the key financial metrics mentioned in this earnings call. "
           "Format as a JSON object with revenue, profit, and growth_rate.",
    final_model=aai.LemurModel.claude3_5_sonnet,
    temperature=0.0
)

print(result.response)
print(f"Request ID: {result.request_id}")
print(f"Usage: {result.usage.input_tokens} input, {result.usage.output_tokens} output")

# Provide additional context
result = transcript.lemur.task(
    prompt="What are the main technical challenges discussed?",
    context="This is a technical architecture review meeting for a cloud platform.",
    final_model=aai.LemurModel.claude3_5_sonnet
)
print(result.response)
```

### LeMUR Question & Answer

Ask structured questions about audio transcripts with automated Q&A.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/customer-call.mp3")

# Ask multiple questions
questions = [
    aai.LemurQuestion(question="What product was the customer interested in?"),
    aai.LemurQuestion(question="What was the customer's main concern?"),
    aai.LemurQuestion(question="Was the issue resolved?"),
    aai.LemurQuestion(question="What is the customer's price range?"),
]

result = transcript.lemur.question(
    questions=questions,
    context="This is a sales call with a potential customer.",
    final_model=aai.LemurModel.claude3_5_sonnet
)

# Process answers
for qa in result.response:
    print(f"Q: {qa.question}")
    print(f"A: {qa.answer}")
    print("---")

print(f"Usage: {result.usage.input_tokens} input tokens, {result.usage.output_tokens} output tokens")
```

### LeMUR Summary

Generate structured summaries with context-aware formatting.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/meeting.mp3")

# Generate summary
result = transcript.lemur.summarize(
    context="This is a product planning meeting for Q1 2024.",
    answer_format="TLDR with 3-5 bullet points",
    final_model=aai.LemurModel.claude3_5_sonnet
)

print("Summary:")
print(result.response)
print(f"Request ID: {result.request_id}")

# Try different formats
result = transcript.lemur.summarize(
    context="Executive team meeting",
    answer_format="One paragraph executive summary, followed by key decisions made",
    final_model=aai.LemurModel.claude3_5_sonnet,
    temperature=0.3
)
print(result.response)
```

### LeMUR Action Items

Extract actionable tasks and follow-ups from transcripts.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/project-meeting.mp3")

# Extract action items
result = transcript.lemur.action_items(
    context="Sprint planning meeting for the engineering team",
    answer_format="List each action item with the responsible person and deadline if mentioned",
    final_model=aai.LemurModel.claude3_5_sonnet
)

print("Action Items:")
print(result.response)

# Process multiple transcripts
transcript_group = transcriber.transcribe_group([
    "https://example.com/meeting1.mp3",
    "https://example.com/meeting2.mp3",
])

result = transcript_group.lemur.action_items(
    context="Weekly standup meetings from the past week",
    answer_format="Group by person, then list their action items"
)
print(result.response)
```

### LeMUR with Custom Input Text

Process formatted transcript text with speaker labels for better context.
```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

# Transcribe with speaker labels
config = aai.TranscriptionConfig(speaker_labels=True)
transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/interview.mp3", config=config)

# Format transcript with speaker labels
formatted_text = ""
for utterance in transcript.utterances:
    formatted_text += f"Speaker {utterance.speaker}:\n{utterance.text}\n\n"

# Use formatted text with LeMUR
result = aai.Lemur().task(
    prompt="Analyze the conversation dynamics. Who was asking most questions? "
           "Who was providing most information? Summarize the interaction style.",
    input_text=formatted_text,
    final_model=aai.LemurModel.claude3_5_sonnet
)
print(result.response)

# Alternative: use custom formatting
custom_text = ""
for i, utterance in enumerate(transcript.utterances, 1):
    custom_text += f"[{i}] Speaker {utterance.speaker} ({utterance.start}ms): {utterance.text}\n"

result = aai.Lemur().task(
    prompt="Create a timeline of the discussion with key points",
    input_text=custom_text
)
print(result.response)
```

### LeMUR Data Management

Retrieve previous LeMUR results and purge sensitive data.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/confidential.mp3")

# Create LeMUR request
result = transcript.lemur.summarize(
    context="Confidential financial discussion",
    answer_format="Executive summary"
)
print(result.response)
request_id = result.request_id

# Later, retrieve the same result without re-processing
lemur = aai.Lemur()
retrieved_result = lemur.get_response_data(request_id)
print(f"Retrieved: {retrieved_result.response}")

# Purge sensitive data when done
purge_result = aai.Lemur.purge_request_data(request_id)
print(f"Purged: {purge_result.deleted}")

# Verify deletion
try:
    lemur.get_response_data(request_id)
except aai.LemurError as e:
    print(f"Data successfully purged: {e}")
```

### Real-time Streaming Transcription

Transcribe audio in real-time from microphones or live streams.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

# Define callbacks
def on_open(session_opened: aai.RealtimeSessionOpened):
    print(f"Session opened with ID: {session_opened.session_id}")

def on_data(transcript: aai.RealtimeTranscript):
    if isinstance(transcript, aai.RealtimeFinalTranscript):
        print(f"Final: {transcript.text}")
    else:
        print(f"Partial: {transcript.text}", end="\r")

def on_error(error: aai.RealtimeError):
    print(f"Error: {error}")

def on_close():
    print("Connection closed")

# Create real-time transcriber
transcriber = aai.RealtimeTranscriber(
    sample_rate=16_000,
    on_data=on_data,
    on_error=on_error,
    on_open=on_open,
    on_close=on_close,
    encoding=aai.AudioEncoding.pcm_s16le,
    end_utterance_silence_threshold=1000
)

# Connect and stream
transcriber.connect()

# Stream from microphone
microphone_stream = aai.extras.MicrophoneStream(sample_rate=16_000)
transcriber.stream(microphone_stream)

# Or stream from file
file_stream = aai.extras.stream_file("audio.wav", sample_rate=16_000)
transcriber.stream(file_stream)

# Close connection
transcriber.close()
```

### Streaming with Temporary Tokens

Use temporary authentication tokens for client-side streaming applications.
```python
import assemblyai as aai

# Server-side: Create temporary token
aai.settings.api_key = "your-api-key-here"

token = aai.RealtimeTranscriber.create_temporary_token(
    expires_in=3600  # 1 hour
)
print(f"Token: {token}")
# Send token to client...

# Client-side: Use temporary token (no API key needed)
def on_data(transcript):
    print(transcript.text)

def on_error(error):
    print(f"Error: {error}")

transcriber = aai.RealtimeTranscriber(
    token=token,  # Use token instead of API key
    sample_rate=16_000,
    on_data=on_data,
    on_error=on_error,
)

transcriber.connect()
microphone_stream = aai.extras.MicrophoneStream(sample_rate=16_000)
transcriber.stream(microphone_stream)
transcriber.close()
```

### List and Filter Transcripts

Retrieve and paginate through previously created transcripts.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()

# Get first page of transcripts
page = transcriber.list_transcripts()
print(f"Page details: {page.page_details}")
print(f"Found {len(page.transcripts)} transcripts")

for item in page.transcripts:
    print(f"ID: {item.id}")
    print(f"Status: {item.status}")
    print(f"Created: {item.created}")

# Filter by status
params = aai.ListTranscriptParameters(
    limit=10,
    status=aai.TranscriptStatus.completed,
)
page = transcriber.list_transcripts(params)

# Paginate through all transcripts
all_transcripts = []
params = aai.ListTranscriptParameters(limit=100)
page = transcriber.list_transcripts(params)
all_transcripts.extend(page.transcripts)

while page.page_details.before_id_of_prev_url is not None:
    params.before_id = page.page_details.before_id_of_prev_url
    page = transcriber.list_transcripts(params)
    all_transcripts.extend(page.transcripts)

print(f"Total transcripts: {len(all_transcripts)}")
```

### Delete Transcripts

Remove transcripts and their associated data from AssemblyAI's servers.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/audio.mp3")
print(f"Created transcript: {transcript.id}")

# Delete by ID
deleted = aai.Transcript.delete_by_id(transcript.id)
print(f"Deleted: {deleted.id}")

# Async deletion
future = aai.Transcript.delete_by_id_async(transcript.id)
deleted = future.result()
print(f"Deleted: {deleted.id}")

# Bulk deletion
params = aai.ListTranscriptParameters(
    status=aai.TranscriptStatus.completed,
    limit=100
)
page = transcriber.list_transcripts(params)

for item in page.transcripts:
    if item.created < "2024-01-01":  # Delete old transcripts
        aai.Transcript.delete_by_id(item.id)
        print(f"Deleted old transcript: {item.id}")
```

### Upload Files

Upload audio files separately from transcription for reuse or archiving.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()

# Upload local file
upload_url = transcriber.upload_file("./audio-file.mp3")
print(f"Uploaded to: {upload_url}")

# Use uploaded URL for transcription
transcript = transcriber.transcribe(upload_url)
print(transcript.text)

# Upload binary data
with open("audio.wav", "rb") as f:
    audio_data = f.read()
upload_url = transcriber.upload_file(audio_data)

# Reuse uploaded URL
transcript1 = transcriber.transcribe(upload_url)
transcript2 = transcriber.transcribe(
    upload_url,
    config=aai.TranscriptionConfig(sentiment_analysis=True)
)

# Async upload
future = transcriber.upload_file_async("./large-file.mp3")
# Do other work...
upload_url = future.result()
print(f"Async upload complete: {upload_url}")
```

### Configure Global Settings

Set default values for API key, timeouts, and polling intervals.

```python
import assemblyai as aai

# Configure via settings object
aai.settings.api_key = "your-api-key-here"
aai.settings.http_timeout = 60.0      # Increase timeout to 60 seconds
aai.settings.polling_interval = 5.0   # Poll every 5 seconds
aai.settings.base_url = "https://api.assemblyai.com"

# Or configure via environment variables
# export ASSEMBLYAI_API_KEY=your-api-key-here
# export ASSEMBLYAI_HTTP_TIMEOUT=60.0
# export ASSEMBLYAI_POLLING_INTERVAL=5.0

# Use custom client
custom_settings = aai.Settings(
    api_key="custom-api-key",
    http_timeout=120.0,
    polling_interval=10.0
)
client = aai.Client(settings=custom_settings)
transcriber = aai.Transcriber(client=client)

# Access last HTTP response
transcript = transcriber.transcribe("https://example.com/audio.mp3")
response = client.last_response
print(f"Status code: {response.status_code}")
print(f"Headers: {response.headers}")
```

### Error Handling

Handle various error scenarios with proper exception catching.

```python
import assemblyai as aai

aai.settings.api_key = "your-api-key-here"

transcriber = aai.Transcriber()

try:
    transcript = transcriber.transcribe("https://example.com/invalid-audio.mp3")
    if transcript.status == aai.TranscriptStatus.error:
        print(f"Transcription failed: {transcript.error}")
    else:
        print(transcript.text)
except aai.TranscriptError as e:
    print(f"Transcription error: {e}")
    print(f"Status code: {e.status_code}")
except aai.AssemblyAIError as e:
    print(f"API error: {e}")
    print(f"Status code: {e.status_code}")

# Handle LeMUR errors
try:
    result = transcript.lemur.task("Analyze this")
except aai.LemurError as e:
    print(f"LeMUR error: {e}")

# Handle redacted audio errors
try:
    config = aai.TranscriptionConfig(redact_pii=True, redact_pii_audio=True)
    transcript = transcriber.transcribe("https://example.com/audio.mp3", config=config)
    redacted_url = transcript.get_redacted_audio_url()
except aai.RedactedAudioIncompleteError:
    print("Redacted audio still processing")
except aai.RedactedAudioExpiredError:
    print("Redacted audio has expired")
except aai.RedactedAudioUnavailableError:
    print("Redacted audio not available")

# Access HTTP response for debugging
client = aai.Client.get_default()
try:
    transcript = transcriber.transcribe("https://example.com/audio.mp3")
except Exception as e:
    print(f"Error: {e}")
    print(f"Response status: {client.last_response.status_code}")
    print(f"Response body: {client.last_response.text}")
```

## Summary

The AssemblyAI Python SDK provides a comprehensive solution for audio transcription and understanding, serving use cases from simple speech-to-text conversion to advanced audio intelligence and LLM-powered analysis. The SDK excels in media production workflows for generating subtitles and captions, customer service analytics for sentiment analysis and call summarization, content moderation for detecting sensitive material, and compliance applications requiring PII redaction. Its real-time streaming capabilities enable live transcription for meetings, broadcasts, and voice interfaces, while batch processing supports large-scale audio analysis pipelines.

Integration patterns follow a simple initialization flow: set the API key, create a transcriber instance, configure transcription options, and submit audio for processing.
The SDK supports both synchronous operations for interactive applications and asynchronous workflows for background processing. For real-time applications, the streaming client connects via WebSocket with customizable callbacks. Advanced use cases leverage LeMUR to combine transcript data with large language models for summarization, question answering, and custom analysis tasks. The SDK handles file management, error handling, and polling automatically, providing a robust foundation for building audio-powered applications with minimal boilerplate code.
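
As a quick orientation, here is a minimal sketch of that flow, assembled only from the calls shown in the sections above. The environment-variable lookup and the example URL are placeholders for illustration, not part of the SDK itself.

```python
import os

import assemblyai as aai

# Placeholder: the docs above show ASSEMBLYAI_API_KEY as a supported env variable
aai.settings.api_key = os.environ["ASSEMBLYAI_API_KEY"]

# Configure once and reuse the transcriber for every file
config = aai.TranscriptionConfig(speaker_labels=True, sentiment_analysis=True)
transcriber = aai.Transcriber(config=config)

# Submit audio and wait for the result (polling is handled by the SDK)
transcript = transcriber.transcribe("https://example.com/audio.mp3")
if transcript.status == aai.TranscriptStatus.error:
    raise RuntimeError(transcript.error)
print(transcript.text)

# Optional LLM post-processing via LeMUR, as covered earlier
summary = transcript.lemur.summarize(answer_format="3 bullet points")
print(summary.response)
```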