### Install LM Studio Python SDK

Source: https://github.com/lmstudio-ai/lmstudio-python/blob/main/README.md

Installs the LM Studio Python SDK from PyPI using pip. This is the primary installation method for the SDK, which provides the `lmstudio` package.

```console
pip install lmstudio
```

--------------------------------

### Basic Text Completion with LM Studio

Source: https://github.com/lmstudio-ai/lmstudio-python/blob/main/README.md

Basic text completion using the synchronous Client API from the `lmstudio` package. Requires an already loaded LLM instance and handles websocket connections.

```python
import lmstudio as lms

model = lms.llm()
model.complete("Once upon a time,")
```

--------------------------------

### Clone LM Studio Repository

Source: https://github.com/lmstudio-ai/lmstudio-python/blob/main/README.md

Clones the LM Studio Python SDK repository with git. Recursive submodule initialization is required for development.

```console
git clone https://github.com/lmstudio-ai/lmstudio-python
cd lmstudio-python
```

```console
git submodule update --init --recursive
```

--------------------------------

### Repository Development Commands

Source: https://github.com/lmstudio-ai/lmstudio-python/blob/main/README.md

tox commands for running the repository checks and synchronizing with the SDK schema. Used for development environment setup and updates.

```console
tox -m check
```

```console
tox -e sync-sdk-schema
```

--------------------------------

### Custom Callbacks for Prediction Progress Tracking

Source: https://context7.com/lmstudio-ai/lmstudio-python/llms.txt

This example shows how to implement custom callbacks to monitor and react to different stages of an LLM prediction. It covers callbacks for receiving the first token, processing prediction fragments, tracking prompt progress, and handling round starts and ends, which allows real-time feedback and dynamic behavior during generation.

```python
import lmstudio as lms
import time

# Define callbacks
def on_first_token(round_num: int):
    print(f"[Round {round_num}] First token received")

def on_fragment(fragment: lms.LlmPredictionFragment, round_num: int):
    # Print each token as it arrives
    print(fragment.content, end="", flush=True)

def on_prompt_progress(progress: float, round_num: int):
    # Progress is 0.0 to 1.0
    if progress == 1.0:
        print(f"[Round {round_num}] Prompt processing complete")

def on_round_start(round_num: int):
    print(f"\n{'='*50}")
    print(f"Starting round {round_num}")
    print('='*50)

def on_round_end(round_num: int):
    print(f"\nRound {round_num} completed")

# Initialize
model = lms.llm()

# Non-streaming with callbacks
start_time = time.time()
result = model.complete(
    "Write a haiku about programming",
    config=lms.LlmPredictionConfig(temperature=0.8),
    on_first_token=lambda: print("Generating..."),
    on_prediction_fragment=lambda f: print(f.content, end="", flush=True)
)
elapsed = time.time() - start_time
print(f"\n\nCompleted in {elapsed:.2f}s")
print(f"Total tokens: {result.stats.get('total_tokens', 'N/A')}")

# Agent with full callback suite
chat = lms.Chat()

def my_tool(x: int) -> int:
    """Multiply by 2."""
    return x * 2

result = model.act(
    "What is 42 multiplied by 2?",
    tools=[my_tool],
    on_message=chat.append,
    on_first_token=on_first_token,
    on_prediction_fragment=on_fragment,
    on_prompt_processing_progress=on_prompt_progress,
    on_round_start=on_round_start,
    on_round_end=on_round_end,
    on_prediction_completed=lambda result: print(f"\nPrediction stats: {result}")
)
print(f"\nTotal rounds: {result.rounds}")
print(f"Total time: {result.total_time_seconds:.2f}s")
```

--------------------------------

### LM Studio Plugin Development with ToolsProvider

Source: https://context7.com/lmstudio-ai/lmstudio-python/llms.txt

This section details how to develop custom plugins for LM Studio.
It includes defining configuration schemas (per-chat and global), implementing tools that can be used by the LLM, and structuring the plugin files (manifest.json and Python script). The example demonstrates creating tools for fetching weather and calculating distances.

```json
# plugin_dir/manifest.json
{
  "name": "My Custom Plugin",
  "version": "1.0.0",
  "description": "Provides custom tools",
  "hooks": {
    "toolsProvider": "./src/plugin.py:create_tools_provider"
  }
}
```

```python
# plugin_dir/src/plugin.py
from lmstudio.plugin import ToolsProviderController, BaseConfigSchema, config_field
import requests

# Define configuration schema
class ConfigSchema(BaseConfigSchema):
    """Per-chat configuration."""
    api_key: str = config_field(
        label="API Key",
        hint="Your API key for the service",
        default=""
    )

class GlobalConfigSchema(BaseConfigSchema):
    """Global plugin configuration."""
    timeout: int = config_field(
        label="Request Timeout",
        hint="Timeout in seconds",
        default=30
    )

def create_tools_provider():
    """Create the tools provider hook."""
    controller = ToolsProviderController[ConfigSchema, GlobalConfigSchema]()

    @controller.tool()
    def fetch_weather(city: str) -> str:
        """Fetch weather information for a city."""
        # Access configuration
        config = controller.chat_config
        global_config = controller.global_config

        # Make API call (example)
        try:
            response = requests.get(
                f"https://api.weather.com/v1/{city}",
                headers={"Authorization": f"Bearer {config.api_key}"},
                timeout=global_config.timeout
            )
            return response.json()["description"]
        except Exception as e:
            return f"Error fetching weather: {e}"

    @controller.tool()
    def calculate_distance(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
        """Calculate distance between two coordinates in kilometers."""
        from math import radians, sin, cos, sqrt, atan2

        R = 6371  # Earth radius in km
        lat1, lon1, lat2, lon2 = map(radians, [lat1, lon1, lat2, lon2])
        dlat = lat2 - lat1
        dlon = lon2 - lon1
        a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
        c = 2 * atan2(sqrt(a), sqrt(1-a))
        return R * c

    return controller

# Run plugin
if __name__ == "__main__":
    from lmstudio.plugin import run_plugin
    run_plugin(plugin_dir=".")
```

--------------------------------

### LM Studio Client Usage with Error Handling in Python

Source: https://context7.com/lmstudio-ai/lmstudio-python/llms.txt

This snippet demonstrates using a context manager for LM Studio client initialization, model loading with fallbacks, and prediction with error handling for timeouts and failures. It requires the LM Studio server running locally and the `lmstudio` package installed. Inputs include prompts and configuration; outputs are chat responses or error messages. Limitations include dependency on local server availability and potential timeouts under high load.

```python
import lmstudio as lms
from lmstudio import (
    LMStudioError,
    LMStudioClientError,
    LMStudioTimeoutError,
    LMStudioPredictionError,
)

# Use context manager for cleanup
try:
    with lms.Client() as client:
        try:
            # Try to get specific model
            model = client.llm.model("my-preferred-model")
        except LMStudioError:
            # Fallback to any loaded model
            loaded = client.list_loaded_models(namespace="llm")
            if loaded:
                model = client.llm.model(loaded[0].identifier)
            else:
                # Load default
                model = client.llm.load_new_instance("default-model")

        # Handle prediction errors
        chat = lms.Chat()
        try:
            result = model.respond(
                "Explain quantum entanglement",
                config=lms.LlmPredictionConfig(
                    temperature=0.7,
                    max_tokens=1000
                )
            )
            chat.add_assistant_response(result)
            print(result.content)
        except LMStudioTimeoutError:
            print("Request timed out - model may be overloaded")
        except LMStudioPredictionError as e:
            print(f"Prediction failed: {e}")
            chat.add_assistant_response(
                lms.AssistantResponse(content="I apologize, I encountered an error.")
            )
except LMStudioClientError as e:
    print(f"Failed to connect to LM Studio: {e}")
    print("Make sure LM Studio is running locally")
except LMStudioError as e:
    print(f"SDK error: {e}")
```

--------------------------------

### Chat Interface Development

Source: https://github.com/lmstudio-ai/lmstudio-python/blob/main/README.md

Generates chat responses using the `Chat` helper to manage chat history and include it in response prediction requests. Supports multi-turn conversations, with each request carrying the accumulated history.

```python
import lmstudio as lms

EXAMPLE_MESSAGES = (
    "My hovercraft is full of eels!",
    "I will not buy this record, it is scratched."
)

model = lms.llm()
chat = lms.Chat("You are a helpful shopkeeper assisting a foreign traveller")
for message in EXAMPLE_MESSAGES:
    chat.add_user_message(message)
    print(f"Customer: {message}")
    response = model.respond(chat)
    chat.add_assistant_response(response)
    print(f"Shopkeeper: {response.content}")
```

--------------------------------

### Tool Use Error Handling in LM Studio Python SDK

Source: https://context7.com/lmstudio-ai/lmstudio-python/llms.txt

This example defines a tool function that can fail, a custom error handler for tool calls, and a call to `model.act` for agent-like interactions with tools and retry logic. It depends on the `lmstudio` package and a loaded model. Inputs are user requests and tool functions; outputs are processed results or error messages. Limitations include the maximum number of prediction rounds and the handling of invalid tool requests.

```python
import lmstudio as lms

# Tool use error handling
def risky_tool(value: int) -> int:
    """A tool that might fail."""
    if value < 0:
        raise ValueError("Value must be positive")
    return value * 2

def handle_tool_error(error: lms.LMStudioPredictionError, request) -> str:
    """Handle tool call failures."""
    if request:
        return f"Tool '{request.tool_name}' failed: {error}. Please try a different approach."
    return "A tool call failed. Please rephrase your request."

model = lms.llm()
result = model.act(
    "Process the value -5",
    tools=[risky_tool],
    handle_invalid_tool_request=handle_tool_error,
    max_prediction_rounds=3
)
```

--------------------------------

### Model Management and Configuration with LM Studio

Source: https://context7.com/lmstudio-ai/lmstudio-python/llms.txt

Demonstrates how to load, configure, and manage LLM model instances using the lmstudio-python library. This includes listing downloaded and loaded models, specifying custom configuration such as context length and GPU layers, using the tokenization utilities, and applying prompt templates.

```python
import lmstudio as lms

# Create explicit client
client = lms.Client(api_host="localhost:1234")

# List available models
downloaded = client.list_downloaded_models()
print("Downloaded models:")
for model in downloaded:
    print(f"  - {model.path}")

loaded = client.list_loaded_models()
print(f"\nCurrently loaded: {len(loaded)} models")

# Load model with custom configuration
model = client.llm.load_new_instance(
    model_key="qwen2.5-7b-instruct",
    config=lms.LlmLoadModelConfig(
        context_length=8192,
        gpu_split_strategy="layers",
        max_gpu_layers=32
    ),
    ttl=300,  # 5 minutes (the idle TTL is given in seconds)
    on_load_progress=lambda progress: print(f"Loading: {progress*100:.1f}%")
)

# Get model information
info = model.get_info()
print(f"Model: {info['identifier']}")
print(f"Context length: {model.get_context_length()}")

# Tokenization utilities
text = "Hello, world!"
tokens = model.tokenize(text)
count = model.count_tokens(text)
print(f"Text: '{text}'")
print(f"Tokens: {tokens}")
print(f"Count: {count}")

# Batch tokenization
texts = ["First text", "Second text", "Third text"]
token_lists = model.tokenize(texts)
for text, tokens in zip(texts, token_lists):
    print(f"{text}: {len(tokens)} tokens")

# Apply model's prompt template
chat = lms.Chat("You are helpful")
chat.add_user_message("Hello!")
formatted = model.apply_prompt_template(chat)
print(f"Formatted prompt:\n{formatted}")

# Unload when done
model.unload()
```

--------------------------------

### Pretty Print JSON and Make Multiple Requests

Source: https://context7.com/lmstudio-ai/lmstudio-python/llms.txt

Demonstrates how to pretty print JSON output and how to make multiple requests to an LLM, collecting the parsed responses into a list of structured objects. Assumes an existing `movie` object for JSON printing, a `model` object with a `respond` method, and a `BookInfo` response schema.

```python
import json

# Pretty print full JSON
print(json.dumps(movie, indent=2))

# Multiple requests
books = []
for title in ["The Hobbit", "1984", "Pride and Prejudice"]:
    result = model.respond(f"Tell me about {title}", response_format=BookInfo)
    books.append(result.parsed)

for book in books:
    print(f"{book['title']} by {book['author']} ({book['year']})")
```

--------------------------------

### Simple Text Completion with LM Studio Python SDK

Source: https://context7.com/lmstudio-ai/lmstudio-python/llms.txt

Generates text completions from prompts using the synchronous API of the LM Studio Python SDK. It requires the `lmstudio` package and connects to a local LM Studio instance. `complete` accepts a prompt string and an optional configuration object for prediction parameters.

```python
import lmstudio as lms

# Connect to local LM Studio and get a model
model = lms.llm()

# Generate completion
result = model.complete("Once upon a time in a distant land,")
print(result.content)

# With configuration
result = model.complete(
    "Explain quantum computing",
    config=lms.LlmPredictionConfig(
        temperature=0.7,
        top_p=0.9,
        max_tokens=500
    )
)
print(result.content)
```

--------------------------------

### Agent Tool Use with LM Studio Python SDK

Source: https://context7.com/lmstudio-ai/lmstudio-python/llms.txt

Creates agents that can call Python functions to solve multi-step tasks using the LM Studio Python SDK. It requires the 'lmstudio' and 'math' libraries. Tools are defined as Python functions with type hints; the agent executes them automatically and can issue parallel tool calls.

```python
import lmstudio as lms
import math

# Define tools as Python functions with type hints
def add(a: int, b: int) -> int:
    """Add two numbers together."""
    return a + b

def multiply(a: float, b: float) -> float:
    """Multiply two numbers."""
    return a * b

def is_prime(n: int) -> bool:
    """Check if a number is prime."""
    if n < 2:
        return False
    for i in range(2, int(math.sqrt(n)) + 1):
        if n % i == 0:
            return False
    return True

# Initialize
chat = lms.Chat()
model = lms.llm("qwen2.5-7b-instruct-1m")

# Run multi-round agent with automatic tool execution
result = model.act(
    "Is the result of (123 + 456) multiplied by 2 a prime number? Think step by step.",
    tools=[add, multiply, is_prime],
    max_prediction_rounds=10,
    max_parallel_tool_calls=2,  # Allow parallel execution
    on_message=chat.append,  # Track all messages
    on_round_start=lambda round_num: print(f"\n=== Round {round_num} ==="),
    on_round_end=lambda round_num: print(f"=== Round {round_num} complete ===\n")
)

print(f"\nCompleted in {result.rounds} rounds ({result.total_time_seconds:.2f}s)")
print("\nFull conversation:")
print(chat)
```

--------------------------------

### Asynchronous Concurrent LLM Operations

Source: https://context7.com/lmstudio-ai/lmstudio-python/llms.txt

Uses Python's async/await for concurrent LLM operations. This snippet shows how to initialize an async client, execute multiple completion tasks concurrently using `asyncio.gather`, and handle streaming responses.

```python
import asyncio
import lmstudio as lms

async def main():
    # Must use context manager for async client (structured concurrency)
    async with lms.AsyncClient() as client:
        # Get model handle
        model = await client.llm.model("qwen2.5-7b-instruct-1m")

        # Define tasks
        questions = [
            "What is the capital of France?",
            "Explain photosynthesis in one sentence.",
            "What is the largest planet?",
            "Who wrote Romeo and Juliet?",
            "What is the speed of light?"
        ]

        # Execute concurrently
        results = await asyncio.gather(
            *[model.complete(q) for q in questions]
        )

        # Process results
        for question, result in zip(questions, results):
            print(f"Q: {question}")
            print(f"A: {result.content}\n")

        # Streaming with async
        chat = lms.Chat("You are concise")
        async for fragment in model.respond_stream(chat):
            print(fragment.content, end="", flush=True)

# Run async application
asyncio.run(main())
```

--------------------------------

### LM Studio SDK Error Handling and Timeout Configuration

Source: https://context7.com/lmstudio-ai/lmstudio-python/llms.txt

This snippet illustrates best practices for error handling within the LM Studio Python SDK.
It shows how to import specific exception classes like LMStudioError, LMStudioClientError, LMStudioTimeoutError, and LMStudioPredictionError. Additionally, it demonstrates how to configure the synchronous API timeout globally using `set_sync_api_timeout`.

```python
import lmstudio as lms
from lmstudio import (
    LMStudioError,
    LMStudioClientError,
    LMStudioTimeoutError,
    LMStudioPredictionError
)

# Configure timeout
lms.set_sync_api_timeout(120.0)  # 2 minutes
```

--------------------------------

### Structured JSON Responses with LM Studio Python SDK

Source: https://context7.com/lmstudio-ai/lmstudio-python/llms.txt

Parses model outputs into type-safe Python objects using schema definitions with the LM Studio Python SDK. It requires the 'lmstudio' and 'json' libraries. Response schemas are defined using classes inheriting from 'lms.BaseModel', enabling structured data access.

```python
import lmstudio as lms
import json

# Define response schema
class BookInfo(lms.BaseModel):
    """Structured information about a book."""
    title: str
    author: str
    year: int
    genres: list[str]
    summary: str

class MovieInfo(lms.BaseModel):
    """Structured information about a movie."""
    title: str
    director: str
    year: int
    cast: list[str]
    rating: float

# Get model
model = lms.llm()

# Request structured response
result = model.respond(
    "Tell me about The Lord of the Rings: The Fellowship of the Ring",
    response_format=MovieInfo
)

# Access parsed data with type safety
movie = result.parsed
print(f"Title: {movie['title']}")
print(f"Director: {movie['director']}")
print(f"Year: {movie['year']}")
print(f"Cast: {', '.join(movie['cast'])}")
print(f"Rating: {movie['rating']}/10")
```

--------------------------------

### Image Handling in Chat for Multimodal Models

Source: https://context7.com/lmstudio-ai/lmstudio-python/llms.txt

Illustrates how to include images in chat messages for multimodal LLM interactions.
This involves preparing an image file, adding it to a chat message, and receiving a response from a vision-capable model. The conversation can then be continued with text-only messages.

```python
import lmstudio as lms

# Initialize
client = lms.Client()
model = lms.llm()  # Use vision-capable model
chat = lms.Chat("You are an image analysis assistant")

# Prepare image
image_handle = client.prepare_image(
    src="/path/to/image.jpg",
    name="photo.jpg"
)

# Add message with image
chat.add_user_message([
    "What do you see in this image?",
    image_handle
])

# Get response
result = model.respond(chat)
print(result.content)

# Continue conversation with text only
chat.add_assistant_response(result)
chat.add_user_message("Can you describe the colors?")
result = model.respond(chat)
print(result.content)
```

--------------------------------

### Streaming Chat with History using LM Studio Python SDK

Source: https://context7.com/lmstudio-ai/lmstudio-python/llms.txt

Builds interactive chatbots with streaming responses and conversation history management using the LM Studio Python SDK. It requires the 'lmstudio' and 'json' libraries. The chat history is managed via a 'Chat' object, and responses are streamed with a callback that automatically appends messages to the history.

```python
import lmstudio as lms

# Initialize chat with system prompt
chat = lms.Chat("You are a helpful AI assistant specializing in Python programming")
model = lms.llm()

# Interactive loop
while True:
    user_input = input("You: ")
    if not user_input:
        break

    # Add user message to history
    chat.add_user_message(user_input)

    # Stream response with callback to auto-append to history
    stream = model.respond_stream(chat, on_message=chat.append)
    print("Bot: ", end="", flush=True)
    for fragment in stream:
        print(fragment.content, end="", flush=True)
    print()  # New line after response

# Export conversation
import json
print(json.dumps(chat.to_dict(), indent=2))
```

--------------------------------

### Embedding Generation with LM Studio

Source: https://context7.com/lmstudio-ai/lmstudio-python/llms.txt

Shows how to generate vector embeddings for text using embedding models provided by lmstudio-python. This includes generating embeddings for single text strings and for batches of strings, and calculating cosine similarity between embeddings using NumPy.

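
As a rough illustration of the cosine similarity step mentioned above, the following sketch computes it with only the standard library and ranks a few vectors against a query. The helper `rank_by_similarity` and the toy vectors are made up for this example; they are not part of the SDK, and real embeddings would have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def rank_by_similarity(query, candidates):
    """Hypothetical helper: (index, score) pairs, most similar first."""
    scores = [(i, cosine_similarity(query, vec)) for i, vec in enumerate(candidates)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Toy 3-dimensional vectors standing in for real embeddings (made-up values)
query = [1.0, 0.0, 0.0]
candidates = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 0.0],  # same direction, different length
    [1.0, 1.0, 0.0],  # 45 degrees from the query
]
ranking = rank_by_similarity(query, candidates)
print([index for index, _ in ranking])
```

Because cosine similarity ignores vector length, the second candidate ranks first even though its magnitude differs from the query. The SDK example for this section performs the same calculation with NumPy.
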
```python
import lmstudio as lms
import numpy as np

# Get embedding model
embedding_model = lms.embedding_model("nomic-embed-text-v1.5")

# Single text embedding
text = "The quick brown fox jumps over the lazy dog"
embedding = embedding_model.embed(text)
print(f"Embedding dimension: {len(embedding)}")
print(f"First 5 values: {embedding[:5]}")

# Batch embeddings (more efficient)
texts = [
    "Machine learning is fascinating",
    "Deep learning uses neural networks",
    "Natural language processing enables AI to understand text"
]
embeddings = embedding_model.embed_batch(texts)
print(f"Generated {len(embeddings)} embeddings")

# Calculate similarity
def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Compare embeddings
sim_0_1 = cosine_similarity(embeddings[0], embeddings[1])
sim_0_2 = cosine_similarity(embeddings[0], embeddings[2])
print(f"Similarity [0-1]: {sim_0_1:.4f}")
print(f"Similarity [0-2]: {sim_0_2:.4f}")
```

--------------------------------

### Tokenization for Embeddings

Source: https://context7.com/lmstudio-ai/lmstudio-python/llms.txt

This snippet demonstrates how to tokenize text and count the number of tokens using an embedding model. Token counts show how much of a model's input limit a given text consumes.

```python
tokens = embedding_model.tokenize(text)
count = embedding_model.count_tokens(text)
print(f"Tokens: {count}")
```
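
One possible use of token counts is grouping texts into batches that stay under a token budget before embedding them. The sketch below is only an illustration: `batch_by_token_budget` and `fake_count` are hypothetical names, and the whitespace-based counter is a stand-in so the example runs without a model. It does not reflect how a real tokenizer splits text.

```python
from typing import Callable, Iterable

def batch_by_token_budget(
    texts: Iterable[str],
    count_tokens: Callable[[str], int],
    max_tokens: int,
) -> list[list[str]]:
    """Group texts into batches whose summed token counts stay within max_tokens."""
    batches: list[list[str]] = []
    current: list[str] = []
    used = 0
    for text in texts:
        n = count_tokens(text)
        if n > max_tokens:
            raise ValueError(f"text needs {n} tokens, budget is {max_tokens}")
        # Start a new batch when adding this text would exceed the budget
        if current and used + n > max_tokens:
            batches.append(current)
            current, used = [], 0
        current.append(text)
        used += n
    if current:
        batches.append(current)
    return batches

# Stand-in counter (whitespace split); NOT how a real model tokenizes.
def fake_count(text: str) -> int:
    return len(text.split())

texts = ["one two three", "four five", "six", "seven eight nine ten"]
batches = batch_by_token_budget(texts, fake_count, max_tokens=5)
print(batches)
```

With a loaded embedding model, a counter such as the `embedding_model.count_tokens` method from the snippet above could presumably be passed in place of `fake_count`; that substitution is untested here.
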