ElevenLabs Python
https://github.com/elevenlabs/elevenlabs-python
The official Python SDK for ElevenLabs, enabling developers to integrate lifelike AI voices into …
Tokens: 134,948 · Snippets: 629 · Trust Score: 9 · Updated: 3 months ago
Benchmark: 79.27
Context Summary (auto-generated)
# ElevenLabs Python SDK

The ElevenLabs Python SDK provides a comprehensive interface to ElevenLabs' AI voice generation and audio processing services. It lets developers convert text to lifelike speech, clone voices, manage audio projects, and build interactive conversational AI agents. The SDK is auto-generated from ElevenLabs' API definition using Fern, ensuring consistency with the REST API while providing a Pythonic interface with full type hints and async support.

The SDK covers multiple audio processing capabilities: text-to-speech conversion with customizable voices and settings, speech-to-speech transformation, voice cloning through instant voice cloning (IVC), audio dubbing for multilingual content, music generation, and real-time conversational AI agents. It offers both synchronous and asynchronous client implementations, streaming support for low-latency audio generation, and helper functions for audio playback and file management.

## Text-to-Speech Conversion

Convert text to natural-sounding speech using AI voices.

```python
from elevenlabs.client import ElevenLabs
from elevenlabs import play, save

client = ElevenLabs(api_key="your_api_key")

# Basic text-to-speech conversion
audio = client.text_to_speech.convert(
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    text="The first move is what sets everything in motion.",
    model_id="eleven_multilingual_v2",
    output_format="mp3_44100_128",
)

# Play the audio
play(audio)

# Save the audio to a file
save(audio, "output.mp3")
```

## Streaming Text-to-Speech

Stream audio in real time as it is generated, for lower latency.
```python
from elevenlabs.client import ElevenLabs
from elevenlabs import stream

client = ElevenLabs(api_key="your_api_key")

# Stream audio with low latency
audio_stream = client.text_to_speech.stream(
    text="This audio is being streamed in real-time",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_multilingual_v2",
)

# Option 1: Stream directly to an audio player
stream(audio_stream)

# Option 2: Process chunks manually
for chunk in audio_stream:
    if isinstance(chunk, bytes):
        # Process each audio chunk
        print(f"Received {len(chunk)} bytes")
```

## Voice Management

List, search, and manage available voices.

```python
from elevenlabs.client import ElevenLabs

client = ElevenLabs(api_key="your_api_key")

# List all available voices
response = client.voices.get_all()
for voice in response.voices:
    print(f"Voice: {voice.name}, ID: {voice.voice_id}")

# Search voices with filters
voices = client.voices.search()
print(voices.voices)

# Get a voice's settings
settings = client.voices.get_settings("voice_id_here")
print(f"Stability: {settings.stability}, Similarity: {settings.similarity_boost}")
```

## Voice Cloning

Create custom voice clones from audio samples.

```python
from elevenlabs.client import ElevenLabs

client = ElevenLabs(api_key="your_api_key")

# Clone a voice from audio samples (instant voice cloning)
voice = client.voices.ivc.create(
    name="Alex",
    description="An old American male voice with a slight hoarseness",
    files=["./sample_0.mp3", "./sample_1.mp3", "./sample_2.mp3"],
)
print(f"Created voice: {voice.voice_id}")

# Use the cloned voice
audio = client.text_to_speech.convert(
    voice_id=voice.voice_id,
    text="This is my cloned voice speaking",
    model_id="eleven_multilingual_v2",
)
```

## Async Client Usage

Use asynchronous operations for concurrent API calls.
```python
import asyncio

from elevenlabs.client import AsyncElevenLabs

async def generate_speech():
    client = AsyncElevenLabs(api_key="your_api_key")

    # Async text-to-speech
    audio = await client.text_to_speech.convert(
        voice_id="JBFqnCBsd6RMkjVDRZzb",
        text="Async speech generation",
        model_id="eleven_flash_v2_5",
    )

    # List models asynchronously
    models = await client.models.list()
    for model in models:
        print(f"Model: {model.model_id}")

    return audio

# Run the async function
asyncio.run(generate_speech())
```

## Conversational AI Agent

Build interactive AI agents with real-time audio capabilities.

```python
from elevenlabs.client import ElevenLabs
from elevenlabs.conversational_ai.conversation import Conversation, ClientTools
from elevenlabs.conversational_ai.default_audio_interface import DefaultAudioInterface

client = ElevenLabs(api_key="your_api_key")

# Create an audio interface for real-time I/O
audio_interface = DefaultAudioInterface()

# Register custom tools the agent can call
client_tools = ClientTools()

def get_weather(params):
    location = params.get("location", "Unknown")
    return f"Weather in {location}: Sunny, 72°F"

def calculate_sum(params):
    numbers = params.get("numbers", [])
    return sum(numbers)

client_tools.register("get_weather", get_weather, is_async=False)
client_tools.register("calculate_sum", calculate_sum, is_async=False)

# Create and start the conversation
conversation = Conversation(
    client=client,
    agent_id="your-agent-id",
    requires_auth=True,
    audio_interface=audio_interface,
    client_tools=client_tools,
)
conversation.start_session()

# The conversation runs until you call:
# conversation.end_session()
```

## Custom Event Loop for Conversational AI

Use a custom event loop for advanced async management.
```python
import asyncio

from elevenlabs.client import ElevenLabs
from elevenlabs.conversational_ai.conversation import Conversation, ClientTools
from elevenlabs.conversational_ai.default_audio_interface import DefaultAudioInterface

async def main():
    client = ElevenLabs(api_key="your_api_key")
    audio_interface = DefaultAudioInterface()

    # Get the current event loop
    custom_loop = asyncio.get_running_loop()

    # Create ClientTools bound to the custom loop
    client_tools = ClientTools(loop=custom_loop)

    # Register an async tool
    async def fetch_data(params):
        url = params.get("url")
        # Your async HTTP request logic
        await asyncio.sleep(1)  # Simulate an async operation
        return {"data": "fetched from " + url}

    client_tools.register("fetch_data", fetch_data, is_async=True)

    # Create the conversation with custom tools
    conversation = Conversation(
        client=client,
        agent_id="your-agent-id",
        requires_auth=True,
        audio_interface=audio_interface,
        client_tools=client_tools,
    )
    conversation.start_session()

asyncio.run(main())
```

## Speech-to-Text Conversion

Transcribe audio files to text.

```python
from elevenlabs.client import ElevenLabs

client = ElevenLabs(api_key="your_api_key")

# Transcribe an audio file
with open("audio.mp3", "rb") as audio_file:
    result = client.speech_to_text.convert(
        file=audio_file,
        model_id="scribe_v1",  # ElevenLabs' speech-to-text model
    )

print(result.text)
```

## Audio Dubbing

Dub audio content into different languages.
```python
from elevenlabs.client import ElevenLabs

client = ElevenLabs(api_key="your_api_key")

# Create a dubbing project
with open("source_video.mp4", "rb") as video_file:
    dubbing = client.dubbing.create(
        file=video_file,
        target_lang="es",  # Spanish
        source_lang="en",
        mode="automatic",
    )

print(f"Dubbing ID: {dubbing.dubbing_id}")

# Get the dubbing status
status = client.dubbing.get(dubbing_id=dubbing.dubbing_id)
print(f"Status: {status.status}")

# Download the dubbed audio when ready
if status.status == "dubbed":
    audio = client.dubbing.get_file(
        dubbing_id=dubbing.dubbing_id,
        language_code="es",
    )
```

## Music Generation

Generate music from text prompts.

```python
from elevenlabs.client import ElevenLabs
from elevenlabs import save

client = ElevenLabs(api_key="your_api_key")

# Generate music from a prompt
music = client.music.compose(
    prompt="upbeat electronic dance music with synthesizers",
    duration=30,
    output_format="mp3_44100_128",
)

# Save the generated music
save(music, "generated_music.mp3")
```

## Audio Isolation

Isolate vocals or remove background noise from audio.

```python
from elevenlabs.client import ElevenLabs
from elevenlabs import save

client = ElevenLabs(api_key="your_api_key")

# Isolate vocals from mixed audio
with open("mixed_audio.mp3", "rb") as audio_file:
    isolated = client.audio_isolation.convert(
        audio=audio_file,
        file_format="mp3",
    )

save(isolated, "isolated_vocals.mp3")
```

## Custom Voice Settings

Fine-tune voice parameters for speech generation.
```python
from elevenlabs.client import ElevenLabs
from elevenlabs import VoiceSettings

client = ElevenLabs(api_key="your_api_key")

# Create custom voice settings
voice_settings = VoiceSettings(
    stability=0.5,          # 0.0 to 1.0
    similarity_boost=0.75,  # 0.0 to 1.0
    style=0.3,              # 0.0 to 1.0
    use_speaker_boost=True,
)

# Use the custom settings
audio = client.text_to_speech.convert(
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    text="Custom voice settings for fine-tuned output",
    model_id="eleven_multilingual_v2",
    voice_settings=voice_settings,
)
```

## Pronunciation Dictionary

Use pronunciation dictionaries for consistent pronunciation of specific terms.

```python
from elevenlabs.client import ElevenLabs
from elevenlabs import PronunciationDictionaryVersionLocator

client = ElevenLabs(api_key="your_api_key")

# Create a pronunciation dictionary from a .pls rules file
with open("pronunciation_rules.pls", "rb") as rules_file:
    dictionary = client.pronunciation_dictionaries.create_from_file(
        name="technical_terms",
        file=rules_file,
    )

# Use the dictionary in text-to-speech
audio = client.text_to_speech.convert(
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    text="API and SQL are technical terms",
    model_id="eleven_multilingual_v2",
    pronunciation_dictionary_locators=[
        PronunciationDictionaryVersionLocator(
            pronunciation_dictionary_id=dictionary.id,
            version_id=dictionary.version_id,
        )
    ],
)
```

## History Management

Retrieve and manage speech generation history.
```python
from elevenlabs.client import ElevenLabs
from elevenlabs import save

client = ElevenLabs(api_key="your_api_key")

# Get the generation history
history = client.history.list()
for item in history.history:
    print(f"Text: {item.text}, Created: {item.date_unix}")

# Download audio from history
if history.history:
    audio = client.history.get_audio(
        history_item_id=history.history[0].history_item_id
    )
    save(audio, "history_audio.mp3")

    # Delete the history item
    client.history.delete(history_item_id=history.history[0].history_item_id)
```

## Optimized Streaming for Low Latency

Use latency-optimization settings for real-time applications.

```python
from elevenlabs.client import ElevenLabs
from elevenlabs import stream

client = ElevenLabs(api_key="your_api_key")

# Use Eleven Flash v2.5 with maximum latency optimization
audio_stream = client.text_to_speech.stream(
    text="Ultra-low latency streaming for real-time applications",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_flash_v2_5",
    optimize_streaming_latency=4,  # Maximum latency optimization
)

stream(audio_stream)
```

## Error Handling

Handle API errors gracefully with proper exception handling.

```python
from elevenlabs.client import ElevenLabs
from elevenlabs import UnauthorizedError, NotFoundError, BadRequestError

client = ElevenLabs(api_key="your_api_key")

try:
    audio = client.text_to_speech.convert(
        voice_id="invalid_voice_id",
        text="Testing error handling",
        model_id="eleven_multilingual_v2",
    )
except UnauthorizedError:
    print("Invalid API key")
except NotFoundError:
    print("Voice not found")
except BadRequestError as e:
    print(f"Invalid request: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
```

## Summary

The ElevenLabs Python SDK provides a powerful toolkit for integrating AI-powered voice generation and audio processing into Python applications.
Primary use cases include building voice-enabled applications with natural-sounding speech synthesis, creating multilingual content through voice cloning and dubbing, developing interactive conversational AI agents with real-time audio capabilities, and processing audio for content creation workflows such as music generation and vocal isolation. The SDK is particularly well-suited to accessibility applications, podcast production, gaming voice-over, customer service automation, and educational content creation.

Integration follows a straightforward pattern: create an `ElevenLabs` or `AsyncElevenLabs` client instance with API credentials, then access specific functionality through namespace properties like `text_to_speech`, `voices`, `conversational_ai`, and `dubbing`. The SDK supports both fire-and-forget operations for batch processing and streaming operations for real-time applications. Helper functions like `play()`, `save()`, and `stream()` simplify audio handling, while comprehensive type hints and async support enable robust application development. The SDK automatically handles authentication, request formatting, and response parsing, so developers can focus on application logic rather than API implementation details.
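As a small illustration of the streaming model described above: the SDK's streaming endpoints yield raw byte chunks, which an application typically buffers or forwards. The helper below is not part of the SDK, and the stream it consumes is simulated rather than produced by `client.text_to_speech.stream`; it simply sketches how chunked audio can be collected into a single payload.

```python
from typing import Iterator

def collect_audio(chunks: Iterator[bytes]) -> bytes:
    """Accumulate streamed audio chunks into one bytes buffer,
    skipping any non-bytes items the stream may yield."""
    return b"".join(chunk for chunk in chunks if isinstance(chunk, bytes))

# Simulated stream; a real one would come from client.text_to_speech.stream(...)
fake_stream = iter([b"ID3", b"\x00\x01", b"\x02\x03\x04"])

audio = collect_audio(fake_stream)
print(len(audio))  # 8
```

The same pattern works for writing chunks incrementally to a file or socket when holding the full audio in memory is undesirable.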