### Install Python Dependencies

Source: https://github.com/livekit-examples/multi-agent-python/blob/main/README.md

Installs project dependencies into a virtual environment. Ensure you have Python 3 installed.

```console
cd multi-agent-python
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

--------------------------------

### Configure Environment Variables

Source: https://github.com/livekit-examples/multi-agent-python/blob/main/README.md

Sets up the API keys and LiveKit connection details the agent needs. Either copy `.env.example` to `.env.local` and fill in your credentials, or let the LiveKit CLI populate the environment for you:

```bash
lk app env
```

--------------------------------

### Start a Multi-Model Voice Session with AgentSession

Source: https://context7.com/livekit-examples/multi-agent-python/llms.txt

Orchestrates the STT -> LLM -> TTS pipeline for a LiveKit room. Connects to the room and starts the agent session with the specified plugins and shared userdata.

```python
from dataclasses import dataclass, field
from typing import Optional

from livekit.agents import AgentSession, JobContext, WorkerOptions, cli, metrics
from livekit.agents.voice import MetricsCollectedEvent
from livekit.agents.voice.room_io import RoomOutputOptions
from livekit.plugins import deepgram, openai, silero


@dataclass
class StoryData:
    characters: list = field(default_factory=list)
    locations: list = field(default_factory=list)
    theme: Optional[str] = None


async def entrypoint(ctx: JobContext):
    await ctx.connect()  # connect to the LiveKit room

    session = AgentSession[StoryData](
        vad=ctx.proc.userdata["vad"],          # Silero VAD for end-of-speech detection
        llm=openai.LLM(model="gpt-4o-mini"),   # LLM for reasoning and tool calls
        stt=deepgram.STT(model="nova-3"),      # STT for transcription
        tts=openai.TTS(voice="ash"),           # TTS for audio output
        userdata=StoryData(),                  # shared state across agents
    )

    usage_collector = metrics.UsageCollector()

    @session.on("metrics_collected")
    def _on_metrics_collected(ev: MetricsCollectedEvent):
        metrics.log_metrics(ev.metrics)
        usage_collector.collect(ev.metrics)

    async def log_usage():
        summary = usage_collector.get_summary()
        print(f"Total usage: {summary}")

    ctx.add_shutdown_callback(log_usage)

    # LeadEditorAgent is the Agent subclass defined in the on_enter snippet
    await session.start(
        agent=LeadEditorAgent(),
        room=ctx.room,
        room_output_options=RoomOutputOptions(transcription_enabled=True),
    )
```

--------------------------------

### AgentSession: Start a multi-model voice session

Source: https://context7.com/livekit-examples/multi-agent-python/llms.txt

`AgentSession` orchestrates the STT → LLM → TTS pipeline for a connected LiveKit room. It accepts plugin instances for each model stage, optional VAD, and typed `userdata` shared across all agents in the session.

```APIDOC
## AgentSession: Start a multi-model voice session

### Description
`AgentSession` orchestrates the STT → LLM → TTS pipeline for a connected LiveKit room. It accepts plugin instances for each model stage, optional VAD, and typed `userdata` shared across all agents in the session.

### Method Signature
```python
AgentSession[StoryData](
    vad: Any,
    llm: Any,
    stt: Any,
    tts: Any,
    userdata: Any
)
```

### Parameters
- **vad** (Any) - Voice activity detector used for end-of-speech detection (e.g., Silero VAD).
- **llm** (Any) - LLM for reasoning and tool calls (e.g., `openai.LLM`).
- **stt** (Any) - STT for transcription (e.g., `deepgram.STT`).
- **tts** (Any) - TTS for audio output (e.g., `openai.TTS`).
- **userdata** (Any) - Shared state across agents (e.g., `StoryData()`).

### Example Usage
```python
from dataclasses import dataclass, field
from typing import Optional

from livekit.agents import AgentSession, JobContext, WorkerOptions, cli, metrics
from livekit.agents.voice import MetricsCollectedEvent
from livekit.agents.voice.room_io import RoomOutputOptions
from livekit.plugins import deepgram, openai, silero


@dataclass
class StoryData:
    characters: list = field(default_factory=list)
    locations: list = field(default_factory=list)
    theme: Optional[str] = None


async def entrypoint(ctx: JobContext):
    await ctx.connect()  # connect to the LiveKit room

    session = AgentSession[StoryData](
        vad=ctx.proc.userdata["vad"],          # Silero VAD for end-of-speech detection
        llm=openai.LLM(model="gpt-4o-mini"),   # LLM for reasoning and tool calls
        stt=deepgram.STT(model="nova-3"),      # STT for transcription
        tts=openai.TTS(voice="ash"),           # TTS for audio output
        userdata=StoryData(),                  # shared state across agents
    )

    usage_collector = metrics.UsageCollector()

    @session.on("metrics_collected")
    def _on_metrics_collected(ev: MetricsCollectedEvent):
        metrics.log_metrics(ev.metrics)
        usage_collector.collect(ev.metrics)

    async def log_usage():
        summary = usage_collector.get_summary()
        print(f"Total usage: {summary}")

    ctx.add_shutdown_callback(log_usage)

    # LeadEditorAgent is the Agent subclass defined in the on_enter snippet
    await session.start(
        agent=LeadEditorAgent(),
        room=ctx.room,
        room_output_options=RoomOutputOptions(transcription_enabled=True),
    )
```
```

--------------------------------

### Environment Configuration and Worker Bootstrap

Source: https://context7.com/livekit-examples/multi-agent-python/llms.txt

Configure the worker using environment variables from `.env.local` and launch it with `cli.run_app`. Pass the `dev` argument to run in development mode with hot-reloading.

```bash
# 1. Copy and fill in credentials
cp .env.example .env.local

# .env.local contents:
# LIVEKIT_URL=wss://your-project.livekit.cloud
# LIVEKIT_API_KEY=APIxxxxxxxxxxxxxxx
# LIVEKIT_API_SECRET=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# OPENAI_API_KEY=sk-...
# DEEPGRAM_API_KEY=...

# 2. Install dependencies
python3 -m venv venv
source venv/bin/activate  # Windows (PowerShell): venv\Scripts\Activate.ps1
pip install -r requirements.txt

# 3. Run in development mode (reloads on code changes)
python3 main.py dev

# Expected output:
# INFO:livekit.agents.worker:Starting worker...
# INFO:livekit.agents.worker:Connected to LiveKit server
# INFO:multi-agent:added character to the story: Alice
```

--------------------------------

### Pre-load Models with prewarm() and JobProcess

Source: https://context7.com/livekit-examples/multi-agent-python/llms.txt

The `prewarm` function runs once when a worker process starts and loads heavy models into `proc.userdata`, so they are immediately available to every session handled by that process. Access the pre-loaded models via `ctx.proc.userdata` in the `entrypoint`.

```python
from livekit.agents import JobContext, JobProcess, WorkerOptions, cli
from livekit.plugins import silero


def prewarm(proc: JobProcess):
    # Load Silero VAD model once at process startup; reused for every session
    proc.userdata["vad"] = silero.VAD.load()


async def entrypoint(ctx: JobContext):
    # Access pre-loaded VAD from process userdata
    vad = ctx.proc.userdata["vad"]
    ...


if __name__ == "__main__":
    cli.run_app(
        WorkerOptions(
            entrypoint_fnc=entrypoint,
            prewarm_fnc=prewarm,  # registered here
        )
    )
```

--------------------------------

### Define an Agent with on_enter for Opening Reply

Source: https://context7.com/livekit-examples/multi-agent-python/llms.txt

Subclass `Agent` and implement `on_enter` to provide an initial greeting or statement when the agent becomes active in the session.

```python
from livekit.agents import Agent


class LeadEditorAgent(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions=(
                "You are the lead editor at a publishing house. "
                "Ask the user about their story idea, then hand off to the right specialist."
            ),
        )

    async def on_enter(self):
        # Generate the opening reply as soon as this agent becomes active
        self.session.generate_reply()
```

--------------------------------

### Agent subclass with on_enter: Define an agent and trigger its opening reply

Source: https://context7.com/livekit-examples/multi-agent-python/llms.txt

Subclassing `Agent` and implementing `on_enter` allows the agent to immediately generate a greeting or opening statement when it becomes active in the session.

```APIDOC
## Agent subclass with `on_enter`: Define an agent and trigger its opening reply

### Description
Subclassing `Agent` and implementing `on_enter` allows the agent to immediately generate a greeting or opening statement when it becomes active in the session.

### Class Definition
```python
from livekit.agents import Agent


class LeadEditorAgent(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions=(
                "You are the lead editor at a publishing house. "
                "Ask the user about their story idea, then hand off to the right specialist."
            ),
        )

    async def on_enter(self):
        # Generate the opening reply as soon as this agent becomes active
        self.session.generate_reply()
```

### Method
- **on_enter()**: An async method that is called when the agent becomes active in the session. It can be used to initiate the agent's behavior, such as generating an opening reply.
```

--------------------------------

### Run the Agent

Source: https://github.com/livekit-examples/multi-agent-python/blob/main/README.md

Executes the main agent script in development mode. A compatible frontend application is required to interact with the agent.

```bash
python3 main.py dev
```

--------------------------------

### Agent Handoff with Tuple Return

Source: https://context7.com/livekit-examples/multi-agent-python/llms.txt

Return a tuple of `(new_agent_instance, message_string)` from a `@function_tool` to transfer session control to another agent. The new agent receives the chat history when it is passed in through `chat_ctx`.

```python
from typing import Optional

from livekit.agents import Agent, ChatContext, RunContext
from livekit.agents.llm import function_tool
from livekit.plugins import openai

# StoryData is the userdata dataclass defined in the AgentSession snippet


class SpecialistEditorAgent(Agent):
    def __init__(self, specialty: str, chat_ctx: Optional[ChatContext] = None):
        super().__init__(
            instructions=f"You specialize in {specialty}. Help the user develop their idea.",
            tts=openai.TTS(voice="echo"),  # each agent can use a different voice/model
            chat_ctx=chat_ctx,
        )

    async def on_enter(self):
        self.session.generate_reply()


class LeadEditorAgent(Agent):
    def __init__(self):
        super().__init__(instructions="Route the user to the right specialist.")

    @function_tool
    async def detected_childrens_book(self, context: RunContext[StoryData]):
        """Called when the story is identified as a children's book."""
        specialist = SpecialistEditorAgent(
            "children's books",
            # pass full history to specialist (note: _chat_ctx is a private attribute)
            chat_ctx=context.session._chat_ctx,
        )
        # Returning (agent, message) triggers the handoff
        return specialist, "Let's switch to the children's book editor."

    @function_tool
    async def detected_novel(self, context: RunContext[StoryData]):
        """Called when the story is identified as a novel."""
        specialist = SpecialistEditorAgent(
            "novels",
            chat_ctx=context.session._chat_ctx,
        )
        return specialist, "Let's switch to the fiction editor."
```

--------------------------------

### Expose Agent Methods as LLM-Callable Tools with @function_tool

Source: https://context7.com/livekit-examples/multi-agent-python/llms.txt

Decorate async methods with `@function_tool` to register them as callable tools for the LLM. The method receives a `RunContext` for accessing shared userdata and the session.

```python
from dataclasses import dataclass
from typing import Optional

from livekit.agents import Agent, RunContext
from livekit.agents.llm import function_tool


@dataclass
class StoryData:
    theme: Optional[str] = None  # unset until the user provides a theme


class LeadEditorAgent(Agent):
    def __init__(self):
        super().__init__(instructions="Gather story details from the user.")

    @function_tool
    async def theme_introduction(
        self,
        context: RunContext[StoryData],
        theme: str,
    ):
        """Called when the user has provided a theme.

        Args:
            theme: The name of the theme
        """
        context.userdata.theme = theme
        print(f"Theme set to: {theme}")
        # The LLM will continue the conversation after this tool returns
```

--------------------------------

### `@function_tool`: Expose agent methods as LLM-callable tools

Source: https://context7.com/livekit-examples/multi-agent-python/llms.txt

Decorating an async method with `@function_tool` registers it as a callable tool the LLM can invoke during conversation. The method receives a typed `RunContext` giving access to shared `userdata` and the active session.

```APIDOC
## `@function_tool`: Expose agent methods as LLM-callable tools

### Description
Decorating an async method with `@function_tool` registers it as a callable tool the LLM can invoke during conversation. The method receives a typed `RunContext` giving access to shared `userdata` and the active session.

### Method Signature
```python
@function_tool
async def method_name(
    self,
    context: RunContext[UserData],
    param1: type,
    param2: type,
    ...
)
```

### Parameters
- **context** (`RunContext[UserData]`) - Provides access to shared `userdata` and the active session.
- **param1, param2, ...** (type) - Parameters defined by the tool's signature, which the LLM will populate.

### Example Usage
```python
from dataclasses import dataclass
from typing import Optional

from livekit.agents import Agent, RunContext
from livekit.agents.llm import function_tool


@dataclass
class StoryData:
    theme: Optional[str] = None  # unset until the user provides a theme


class LeadEditorAgent(Agent):
    def __init__(self):
        super().__init__(instructions="Gather story details from the user.")

    @function_tool
    async def theme_introduction(
        self,
        context: RunContext[StoryData],
        theme: str,
    ):
        """Called when the user has provided a theme.

        Args:
            theme: The name of the theme
        """
        context.userdata.theme = theme
        print(f"Theme set to: {theme}")
        # The LLM will continue the conversation after this tool returns
```
```

--------------------------------

### Control Session Flow with interrupt() and generate_reply()

Source: https://context7.com/livekit-examples/multi-agent-python/llms.txt

Use `session.interrupt()` to stop audio playback mid-stream. `session.generate_reply()` triggers an LLM response and accepts custom instructions as well as a flag that controls whether the user may interrupt it.

```python
from livekit.agents import Agent, RunContext
from livekit.agents.job import get_job_context
from livekit.agents.llm import function_tool
from livekit.api import DeleteRoomRequest

# StoryData is the userdata dataclass defined in the AgentSession snippet


class SpecialistEditorAgent(Agent):
    @function_tool
    async def story_finished(self, context: RunContext[StoryData]):
        """Called when the story outline is complete and the session should end."""
        # Stop any in-progress TTS playback immediately
        self.session.interrupt()

        # Generate a farewell/feedback message; disallow user interruption during playback
        await self.session.generate_reply(
            instructions="Give brief but honest feedback on the story idea.",
            allow_interruptions=False,
        )

        # Delete the LiveKit room to cleanly terminate the session
        job_ctx = get_job_context()
        await job_ctx.api.room.delete_room(
            DeleteRoomRequest(room=job_ctx.room.name)
        )
```
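
The order of operations in `story_finished` (interrupt, then a non-interruptible reply, then room deletion) can be exercised without a LiveKit server. The following is a minimal sketch under stated assumptions: `FakeSession`, `finish_story`, and the `delete_room` callback are hypothetical stand-ins written for this illustration and are not part of `livekit-agents`; they only record which calls were made and in what order.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FakeSession:
    """Hypothetical stand-in for AgentSession; it only records calls."""

    calls: list = field(default_factory=list)

    def interrupt(self) -> None:
        # The real session stops TTS playback; here we just log the call
        self.calls.append("interrupt")

    async def generate_reply(
        self, *, instructions: str, allow_interruptions: bool = True
    ) -> None:
        # The real session asks the LLM for a reply; here we log the flag
        self.calls.append(f"reply(allow_interruptions={allow_interruptions})")


async def finish_story(session: FakeSession, delete_room) -> None:
    """Mirror the three steps of story_finished against the fake session."""
    session.interrupt()
    await session.generate_reply(
        instructions="Give brief but honest feedback on the story idea.",
        allow_interruptions=False,
    )
    await delete_room()


async def main() -> list:
    session = FakeSession()

    async def delete_room() -> None:
        # Stand-in for job_ctx.api.room.delete_room(...)
        session.calls.append("delete_room")

    await finish_story(session, delete_room)
    return session.calls


calls = asyncio.run(main())
print(calls)
```

Keeping the teardown sequence in a small coroutine that takes its collaborators as arguments is one way to unit-test the flow; in the real agent the same steps run inside the `@function_tool` method shown above.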