### List Models API Source: https://github.com/stima-tech/docs/blob/main/docs/getting-started/quick-start.md Retrieve a list of available models. ```APIDOC ## GET /v1/models ### Description This endpoint retrieves a list of all available models that can be used with the API. ### Method GET ### Endpoint /v1/models ### Parameters None ### Request Example (No request body or specific parameters needed for this GET request) ### Response #### Success Response (200) - **data** (array) - A list of model objects. - **id** (string) - The ID of the model. - **object** (string) - The type of object, usually 'model'. - **created** (integer) - Unix timestamp of when the model was created. - **owned_by** (string) - The entity that owns the model. #### Response Example ```json { "data": [ { "id": "gpt-4o", "object": "model", "created": 1677652288, "owned_by": "openai" }, { "id": "text-embedding-3-small", "object": "model", "created": 1677652288, "owned_by": "openai" } ], "object": "list" } ``` ``` -------------------------------- ### Prompt Optimization Example (Python) Source: https://github.com/stima-tech/docs/blob/main/docs/billing/payg.md Demonstrates how to optimize prompts for cost reduction in Python. The example contrasts a long, verbose prompt with a concise one, highlighting the potential savings by reducing input token usage. ```python # Expensive: Long, verbose prompt prompt = """ I would like you to help me with the following task. Please read the text below carefully and provide a detailed summary of the main points... """ # Cheaper: Concise prompt prompt = "Summarize the key points:" ``` -------------------------------- ### Generating Responses with System Instructions (Python) Source: https://github.com/stima-tech/docs/blob/main/docs/references/responses.md This Python example shows how to provide system-level instructions to the model when generating a response. 
The `instructions` parameter guides the model's behavior; here it tells the model to act as a helpful coding assistant and to always include code examples.

```python
response = client.responses.create(
    model="gpt-4o",
    instructions="You are a helpful coding assistant. Always provide code examples.",
    input="How do I read a file in Python?"
)
```

--------------------------------

### Image Generation API

Source: https://github.com/stima-tech/docs/blob/main/docs/getting-started/quick-start.md

Generate images from text prompts.

```APIDOC
## POST /v1/images/generations

### Description
This endpoint generates images based on a provided text description (prompt).

### Method
POST

### Endpoint
/v1/images/generations

### Parameters
#### Request Body
- **prompt** (string) - Required - A description of the desired image.
- **n** (integer) - Optional - The number of images to generate (default: 1).
- **size** (string) - Optional - The size of the generated images (e.g., '1024x1024', '512x512', '256x256').

### Request Example
```json
{
  "prompt": "A photorealistic cat wearing a tiny hat",
  "n": 1,
  "size": "1024x1024"
}
```

### Response
#### Success Response (200)
- **data** (array) - A list of image objects.
  - **url** (string) - The URL of the generated image.
- **created** (integer) - Unix timestamp of when the response was created.

#### Response Example
```json
{
  "data": [
    { "url": "https://api.apertis.ai/v1/images/generations/image1.png" }
  ],
  "created": 1677652288
}
```
```

--------------------------------

### Optimize Prompt Token Usage (Python)

Source: https://github.com/stima-tech/docs/blob/main/docs/billing/quota-management.md

Illustrates how to reduce token consumption by writing more concise prompts. The example compares a verbose prompt with a direct one for the same summarization task.

```python
# Inefficient (high token usage)
# f-string so that {long_text} is interpolated, as in the efficient version
prompt = f"""
I would like you to please help me with the following task.
I need you to summarize the following text for me.
Please make sure the summary is comprehensive and detailed. Here is the text: {long_text} """ # Efficient (lower token usage) prompt = f"Summarize:\n{long_text}" ``` -------------------------------- ### Concise System Message Example (Python) Source: https://github.com/stima-tech/docs/blob/main/docs/billing/quota-management.md Demonstrates the effective use of concise system messages in conversational AI. It contrasts a brief, focused system message with an overly verbose one, highlighting the importance of brevity for managing token usage. ```python # Good - concise system message messages = [ {"role": "system", "content": "You are a helpful coding assistant."}, {"role": "user", "content": "..."} ] # Avoid - overly detailed system message messages = [ {"role": "system", "content": "You are an extremely helpful assistant..."}, # 500+ tokens {"role": "user", "content": "..."} ] ``` -------------------------------- ### Text to Speech API Source: https://github.com/stima-tech/docs/blob/main/docs/getting-started/quick-start.md Convert text into spoken audio. ```APIDOC ## POST /v1/audio/speech ### Description This endpoint converts input text into spoken audio in various languages and voices. ### Method POST ### Endpoint /v1/audio/speech ### Parameters #### Request Body - **model** (string) - Required - The audio model to use (e.g., 'tts-1'). - **input** (string) - Required - The text to synthesize into speech. - **voice** (string) - Optional - The voice to use for the synthesis (e.g., 'alloy', 'echo', 'fable'). - **response_format** (string) - Optional - The format of the audio output (e.g., 'mp3', 'opus', 'aac'). ### Request Example ```json { "model": "tts-1", "input": "Hello world, this is a test.", "voice": "alloy" } ``` ### Response #### Success Response (200) - **audio_content** (binary) - The generated audio content in the specified format. 
#### Response Example (Binary audio data would be returned here, not a JSON object) ``` -------------------------------- ### Make API Call with Node.js Source: https://github.com/stima-tech/docs/blob/main/docs/getting-started/quick-start.md This Node.js snippet shows how to make a chat completion API call using the OpenAI SDK. First, install the SDK using 'npm install openai'. Configure the client with your API key and the Apertis API base URL. The code sends a user message and logs the assistant's reply. ```javascript import OpenAI from 'openai'; const client = new OpenAI({ apiKey: 'sk-your-api-key', baseURL: 'https://api.apertis.ai/v1' }); async function main() { const response = await client.chat.completions.create({ model: 'gpt-4o', messages: [ { role: 'user', content: 'Hello! What can you do?' } ] }); console.log(response.choices[0].message.content); } main(); ``` -------------------------------- ### Speech to Text API Source: https://github.com/stima-tech/docs/blob/main/docs/getting-started/quick-start.md Transcribe spoken audio into text. ```APIDOC ## POST /v1/audio/transcriptions ### Description This endpoint transcribes spoken audio into written text. ### Method POST ### Endpoint /v1/audio/transcriptions ### Parameters #### Request Body - **file** (file) - Required - The audio file to transcribe. - **model** (string) - Required - The audio model to use (e.g., 'whisper-1'). - **language** (string) - Optional - The language of the input audio (ISO-639-1 format). ### Request Example (This endpoint typically uses multipart/form-data for file uploads) ``` --boundary Content-Disposition: form-data; name="file"; filename="audio.mp3" Content-Type: audio/mpeg --boundary Content-Disposition: form-data; name="model" whisper-1 --boundary-- ``` ### Response #### Success Response (200) - **text** (string) - The transcribed text. #### Response Example ```json { "text": "Hello world, this is a transcription test." 
} ``` ``` -------------------------------- ### Image Analysis with Python Source: https://github.com/stima-tech/docs/blob/main/docs/getting-started/quick-start.md This Python example demonstrates image analysis using the Apertis API. The 'messages' payload includes both text and an image URL, allowing the model to interpret the content of the image and respond to questions about it. ```python response = client.chat.completions.create( model="gpt-4o", messages=[ { "role": "user", "content": [ {"type": "text", "text": "What's in this image?"}, {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}} ] } ] ) ``` -------------------------------- ### Make API Call with Python Source: https://github.com/stima-tech/docs/blob/main/docs/getting-started/quick-start.md This Python snippet demonstrates how to make a chat completion API call using the OpenAI SDK. Ensure you have the 'openai' package installed. You need to provide your API key and the base URL for the Apertis API. The code sends a user message and prints the assistant's response. ```python from openai import OpenAI client = OpenAI( api_key="sk-your-api-key", base_url="https://api.apertis.ai/v1" ) response = client.chat.completions.create( model="gpt-4o", messages=[ {"role": "user", "content": "Hello! What can you do?"} ] ) print(response.choices[0].message.content) ``` -------------------------------- ### Chat Completions API Source: https://github.com/stima-tech/docs/blob/main/docs/getting-started/quick-start.md Engage in conversational interactions using various models. ```APIDOC ## POST /v1/chat/completions ### Description This endpoint provides chat-based interactions, allowing you to send messages and receive responses from AI models. ### Method POST ### Endpoint /v1/chat/completions ### Parameters #### Request Body - **model** (string) - Required - The ID of the model to use for chat completion. 
- **messages** (array) - Required - A list of message objects representing the conversation history. - **role** (string) - Required - The role of the author of the message ('system', 'user', or 'assistant'). - **content** (string) - Required - The content of the message. - **temperature** (number) - Optional - Controls randomness. Lower values make output more deterministic. - **max_tokens** (integer) - Optional - The maximum number of tokens to generate in the response. ### Request Example ```json { "model": "gpt-4o", "messages": [ {"role": "user", "content": "Hello!"} ] } ``` ### Response #### Success Response (200) - **choices** (array) - A list of completion choices. - **message** (object) - **role** (string) - The role of the author ('assistant'). - **content** (string) - The generated message content. - **finish_reason** (string) - The reason the model stopped generating tokens. - **created** (integer) - Unix timestamp of when the response was created. - **model** (string) - The model used for the completion. - **usage** (object) - Usage statistics. - **prompt_tokens** (integer) - **completion_tokens** (integer) - **total_tokens** (integer) #### Response Example ```json { "choices": [ { "message": { "role": "assistant", "content": "Hello there! How can I help you today?" }, "finish_reason": "stop" } ], "created": 1677652288, "model": "gpt-4o", "usage": { "prompt_tokens": 10, "completion_tokens": 15, "total_tokens": 25 } } ``` ``` -------------------------------- ### Implementing Exponential Backoff (Python) Source: https://github.com/stima-tech/docs/blob/main/docs/billing/rate-limits.md Python example demonstrating an exponential backoff strategy with jitter for handling rate limit errors. 
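The delay schedule behind this strategy can be sketched on its own, before looking at the full retry loop. The helper below is only an illustration: the name `backoff_delays` and its `base`, `jitter`, and `rng` parameters are assumptions made for this sketch, not part of the Apertis docs or the OpenAI SDK.

```python
import random


def backoff_delays(max_retries=5, base=2.0, jitter=1.0, rng=random.random):
    """Return the wait (in seconds) before each retry.

    Hypothetical helper: attempt k waits base**k plus a random jitter in
    [0, jitter). The last attempt is not followed by a wait, so there are
    max_retries - 1 delays.
    """
    return [base ** attempt + jitter * rng() for attempt in range(max_retries - 1)]


# With the jitter forced to zero the pure exponential schedule is visible.
print(backoff_delays(rng=lambda: 0.0))  # [1.0, 2.0, 4.0, 8.0]
```

The jitter term spreads out retries from clients that were rate limited at the same moment, so they do not all retry in lockstep.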
```APIDOC
### Retry Strategy

Implement exponential backoff for rate limit errors:

```python
import time
import random
from openai import OpenAI, RateLimitError

client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://api.apertis.ai/v1"
)

def make_request_with_retry(messages, max_retries=5):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="gpt-4o",
                messages=messages
            )
        except RateLimitError:
            # Give up after the final attempt
            if attempt == max_retries - 1:
                raise
            # Exponential backoff with jitter
            wait_time = (2 ** attempt) + random.uniform(0, 1)
            print(f"Rate limited. Waiting {wait_time:.2f}s...")
            time.sleep(wait_time)
```
```

--------------------------------

### Best Practice: Request Queuing

Source: https://github.com/stima-tech/docs/blob/main/docs/billing/rate-limits.md

Example of a request queue that keeps the API call rate under a per-minute limit.

```APIDOC
### 2. Implement Request Queuing

Use a queue to manage request rate:

```python
import asyncio
import time  # needed for time.time() below
from collections import deque

class RateLimitedQueue:
    def __init__(self, max_requests_per_minute=50):
        self.max_rpm = max_requests_per_minute
        self.queue = deque()
        self.request_times = deque()

    async def add_request(self, request_func):
        # Wait if at rate limit
        while len(self.request_times) >= self.max_rpm:
            oldest = self.request_times[0]
            wait_time = 60 - (time.time() - oldest)
            if wait_time > 0:
                await asyncio.sleep(wait_time)
            self.request_times.popleft()

        # Execute request
        self.request_times.append(time.time())
        return await request_func()
```
```

--------------------------------

### Multi-turn Conversations with Python

Source: https://github.com/stima-tech/docs/blob/main/docs/getting-started/quick-start.md

This Python example illustrates how to manage multi-turn conversations using the Apertis API. A list of messages with system, user, and assistant roles is passed to the API on every call so the model keeps the context of earlier turns.
```python messages = [ {"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "The capital of France is Paris."}, {"role": "user", "content": "What's the population?"} ] response = client.chat.completions.create( model="gpt-4o", messages=messages ) ``` -------------------------------- ### Implementing Exponential Backoff (Node.js) Source: https://github.com/stima-tech/docs/blob/main/docs/billing/rate-limits.md Node.js example demonstrating an exponential backoff strategy with jitter for handling rate limit errors. ```APIDOC ### Node.js Example ```javascript import OpenAI from 'openai'; const client = new OpenAI({ apiKey: 'sk-your-api-key', baseURL: 'https://api.apertis.ai/v1' }); async function makeRequestWithRetry(messages, maxRetries = 5) { for (let attempt = 0; attempt < maxRetries; attempt++) { try { return await client.chat.completions.create({ model: 'gpt-4o', messages }); } catch (error) { if (error.status !== 429 || attempt === maxRetries - 1) { throw error; } // Exponential backoff with jitter const waitTime = Math.pow(2, attempt) + Math.random(); console.log(`Rate limited. Waiting ${waitTime.toFixed(2)}s...`); await new Promise(r => setTimeout(r, waitTime * 1000)); } } } ``` ``` -------------------------------- ### Implement Response Caching with Hashing (Python) Source: https://github.com/stima-tech/docs/blob/main/docs/billing/quota-management.md This Python example shows how to implement a caching mechanism for API responses using `functools.lru_cache` and `hashlib`. It generates a hash of the prompt to use as a cache key, reducing redundant API calls for identical inputs. 
```python
import hashlib
from functools import lru_cache

# Maps prompt hash -> prompt text so the cached function can recover the prompt
_prompts = {}

@lru_cache(maxsize=1000)
def _cached_completion(prompt_hash, model):
    return client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": _prompts[prompt_hash]}]
    )

def get_cached_response(prompt, model):
    # Create hash for caching
    prompt_hash = hashlib.md5(prompt.encode()).hexdigest()
    _prompts[prompt_hash] = prompt
    return _cached_completion(prompt_hash, model)

response = get_cached_response(prompt, "gpt-4o")
```

--------------------------------

### Adapt System Prompts from Anthropic to Apertis Format

Source: https://github.com/stima-tech/docs/blob/main/docs/help/migration-guides.md

This example shows how system prompts differ when migrating from Anthropic's API to Apertis, which uses the OpenAI format. Anthropic takes a dedicated `system` parameter, while Apertis expects the system prompt as a message with the role `system`.

```python
# Anthropic format (Anthropic SDK client; max_tokens is required there):
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": "Hello!"}]
)

# Apertis format (OpenAI SDK client):
response = client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
)
```

--------------------------------

### Install LangChain and Apertis Integration

Source: https://github.com/stima-tech/docs/blob/main/i18n/en/docusaurus-plugin-content-docs/current/installation/langchain.md

Installs LangChain and the `langchain-openai` integration package, which is used to reach OpenAI-compatible APIs such as Apertis.

```bash
pip install langchain
pip install langchain-openai
```

--------------------------------

### Migrate LangChain ChatOpenAI to Apertis

Source: https://github.com/stima-tech/docs/blob/main/docs/help/migration-guides.md

This example shows how to update a LangChain configuration to use Apertis instead of the default OpenAI endpoint. The `ChatOpenAI` parameters `openai_api_key` and `openai_api_base` are set to point to Apertis.
```python from langchain_openai import ChatOpenAI # Before llm = ChatOpenAI(model="gpt-4") # After llm = ChatOpenAI( model="gpt-4o", openai_api_key="sk-apertis-key", openai_api_base="https://api.apertis.ai/v1" ) ``` -------------------------------- ### Migrate from Google AI Studio to Apertis Python Client Source: https://github.com/stima-tech/docs/blob/main/docs/help/migration-guides.md This Python example illustrates migrating from Google AI Studio's SDK to the Apertis client. Key changes include using the OpenAI client, providing an API key and base URL, and adjusting how the response text is accessed. ```python import google.generativeai as genai genai.configure(api_key="google-api-key") model = genai.GenerativeModel("gemini-1.5-pro") response = model.generate_content("Hello!") text = response.text ``` ```python from openai import OpenAI client = OpenAI( api_key="sk-apertis-key", base_url="https://api.apertis.ai/v1" ) response = client.chat.completions.create( model="gemini-1.5-pro", messages=[{"role": "user", "content": "Hello!"}] ) text = response.choices[0].message.content ``` -------------------------------- ### Initialize OpenAI Client with Environment Variable Source: https://github.com/stima-tech/docs/blob/main/docs/getting-started/quick-start.md This Python code initializes the OpenAI client using an API key stored in an environment variable. It ensures that the API key is not hardcoded, enhancing security. The `base_url` is set to the Apertis AI API endpoint. ```python import os from openai import OpenAI client = OpenAI( api_key=os.environ.get("APERTIS_API_KEY"), base_url="https://api.apertis.ai/v1" ) ``` -------------------------------- ### Authentication Header Example (Bash) Source: https://github.com/stima-tech/docs/blob/main/docs/help/error-codes.md Example of how to include the API key in the Authorization header for API requests. This is essential for authenticating requests. 
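The same header can be built in Python with only the standard library. This is a sketch, not an official client: the helper name `build_models_request` is made up for illustration, and it targets the `/v1/models` endpoint under the `https://api.apertis.ai/v1` base URL used elsewhere in these docs. The request object is only constructed here, not sent.

```python
import urllib.request


def build_models_request(api_key, base_url="https://api.apertis.ai/v1"):
    """Build (but do not send) a GET /models request carrying a Bearer token.

    Hypothetical helper for illustration only.
    """
    return urllib.request.Request(
        f"{base_url}/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


req = build_models_request("sk-your-api-key")
print(req.full_url)
print(req.get_header("Authorization"))
```

Passing the request to `urllib.request.urlopen(req)` would perform the call. With curl, the equivalent is the `-H` flag: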
```bash -H "Authorization: Bearer sk-your-api-key" ``` -------------------------------- ### Install and Launch Crush Terminal Source: https://context7.com/stima-tech/docs/llms.txt Provides installation commands for the Crush terminal AI agent using both Homebrew (macOS) and NPM (Cross-Platform). After installation, the 'crush' command launches the terminal interface. ```bash # Homebrew (macOS) brew install charmbracelet/tap/crush # NPM (Cross-Platform) npm install -g @charmland/crush # Launch crush ``` -------------------------------- ### Test Apertis Integration with Python SDK Source: https://github.com/stima-tech/docs/blob/main/docs/help/migration-guides.md This Python code snippet demonstrates how to initialize and test an integration with the Apertis API using the OpenAI SDK. It retrieves API credentials from environment variables and makes a sample chat completion request. This verifies the setup and basic functionality. ```python from openai import OpenAI import os client = OpenAI( api_key=os.environ.get("APERTIS_API_KEY"), base_url=os.environ.get("APERTIS_BASE_URL") ) # Test with a simple request response = client.chat.completions.create( model="gpt-4o", messages=[{"role": "user", "content": "Hello!"}] ) print(response.choices[0].message.content) ``` -------------------------------- ### Install Crush using NPM (Cross-Platform) Source: https://github.com/stima-tech/docs/blob/main/docs/installation/crush.md Installs the Crush AI coding agent globally across different operating systems using NPM. This method requires Node.js and npm to be installed. ```bash npm install -g @charmland/crush ``` -------------------------------- ### Try Different Models with Python Source: https://github.com/stima-tech/docs/blob/main/docs/getting-started/quick-start.md This Python code demonstrates how to use the Apertis API client to make requests to different AI models like GPT-4o, Claude 3.5 Sonnet, and Gemini Pro. 
You simply change the 'model' parameter in the `client.chat.completions.create` call. ```python # OpenAI GPT-4o response = client.chat.completions.create( model="gpt-4o", messages=[{"role": "user", "content": "Explain quantum computing"}] ) # Anthropic Claude 3.5 Sonnet response = client.chat.completions.create( model="claude-3-5-sonnet-20241022", messages=[{"role": "user", "content": "Explain quantum computing"}] ) # Google Gemini Pro response = client.chat.completions.create( model="gemini-1.5-pro", messages=[{"role": "user", "content": "Explain quantum computing"}] ) ``` -------------------------------- ### Install Crush using Yay (Arch Linux) Source: https://github.com/stima-tech/docs/blob/main/docs/installation/crush.md Installs the Crush AI coding agent on Arch Linux using the `yay` AUR helper. This command fetches and installs the package from the Arch User Repository. ```bash yay -S crush-bin ``` -------------------------------- ### Python: Use Subscription Token for API Requests Source: https://github.com/stima-tech/docs/blob/main/docs/billing/subscription-plans.md Demonstrates how to initialize the OpenAI client with a dedicated subscription token and make an API call. This token is linked to your subscription and its quota automatically syncs and resets with your billing cycle. It is managed separately from regular API tokens. 
```python from openai import OpenAI client = OpenAI( api_key="sk-sub-your-subscription-key", base_url="https://api.apertis.ai/v1" ) # Quota is tracked against your subscription response = client.chat.completions.create( model="gpt-4o", messages=[{"role": "user", "content": "Hello!"}] ) ``` -------------------------------- ### Configure Continue Dev with LLM Models and Custom Commands (JSON) Source: https://github.com/stima-tech/docs/blob/main/i18n/en/docusaurus-plugin-content-docs/current/installation/continue.md This JSON configuration file sets up Continue Dev with various Large Language Models (LLMs) including Claude, GPT, and Gemini. It specifies API endpoints, model identifiers, and provider details. Additionally, it defines custom commands, such as 'test' for generating unit tests, and configures tab autocomplete and telemetry settings. Users need to replace placeholder API keys with their actual keys. ```json { "models": [ { "model": "claude-3-5-sonnet-20241022", "apiBase": "https://api.apertis.ai/v1", "title": "Claude 3.5", "apiKey": "sk-xxxxxxxxxxxxxxxxxxxxx", "provider": "openai", "description": "Explain in details" }, { "model": "claude-3-5-haiku-20241022", "apiBase": "https://api.apertis.ai/v1", "title": "Claude 3.5", "apiKey": "sk-xxxxxxxxxxxxxxxxxxxxx", "provider": "openai", "description": "Explain in details" }, { "model": "claude-3-5-sonnet-20240620", "apiBase": "https://api.apertis.ai/v1", "title": "Claude 3.5", "apiKey": "sk-xxxxxxxxxxxxxxxxxxxxx", "provider": "openai", "description": "Explain in details" }, { "model": "gpt-4o", "apiBase": "https://api.apertis.ai/v1", "title": "GPT-4o", "apiKey": "sk-xxxxxxxxxxxxxxxxxxxxx", "provider": "openai", "description": "Explain in details" }, { "model": "gpt-4-turbo", "apiBase": "https://api.apertis.ai/v1", "title": "GPT-4-Turbo", "apiKey": "sk-xxxxxxxxxxxxxxxxxxxxx", "provider": "openai", "description": "Explain in details" }, { "model": "gpt-3.5-turbo", "apiBase": "https://api.apertis.ai/v1", "title": 
"GPT-3.5-Turbo", "apiKey": "sk-xxxxxxxxxxxxxxxxxxxxx", "provider": "openai", "description": "Explain in details" }, { "model": "gemini-1.5-pro-latest", "apiBase": "https://api.apertis.ai/v1", "title": "gemini-1.5-pro-latest", "apiKey": "sk-xxxxxxxxxxxxxxxxxxxxx", "provider": "openai", "description": "Explain in details" }, { "model": "gemini-1.5-flash-latest", "apiBase": "https://api.apertis.ai/v1", "title": "gemini-1.5-flash-latest", "apiKey": "sk-xxxxxxxxxxxxxxxxxxxxx", "provider": "openai", "description": "Explain in details" } ], "customCommands": [ { "name": "test", "prompt": "{{{ input }}}\n\nWrite a comprehensive set of unit tests for the selected code. It should setup, run tests that check for correctness including important edge cases, and teardown. Ensure that the tests are complete and sophisticated. Give the tests just as chat output, don't edit any file.", "description": "Write unit tests for highlighted code" } ], "allowAnonymousTelemetry": true, "embeddingsProvider": { "provider": "free-trial" }, "tabAutocompleteModel": { "model": "gpt-4o", "apiBase": "https://api.apertis.ai/v1", "title": "GPT-4o", "apiKey": "sk-xxxxxxxxxxxxxxxxxxxxx", "provider": "openai" }, "tabAutocompleteOptions": { "useCopyBuffer": false, "maxPromptTokens": 400, "prefixPercentage": 0.5 }, "reranker": { "name": "free-trial" } } ``` -------------------------------- ### Python Example for Apertis API Client Source: https://github.com/stima-tech/docs/blob/main/docs/authentication/api-keys.md Shows how to initialize and use the OpenAI Python client to interact with the Apertis API. This example requires the 'openai' library and uses your API key and base URL for configuration. The response from the chat completion is printed to the console. 
```python from openai import OpenAI client = OpenAI( api_key="sk-your-api-key", base_url="https://api.apertis.ai/v1" ) response = client.chat.completions.create( model="gpt-4o", messages=[{"role": "user", "content": "Hello!"}] ) print(response.choices[0].message.content) ``` -------------------------------- ### Install OpenCode CLI with Apertis API Support Source: https://context7.com/stima-tech/docs/llms.txt Set up the OpenCode CLI tool for terminal-based AI coding assistance using Apertis API models. Installation can be done via a curl script or by using npm. ```bash # Installation curl -fsSL https://opencode.ai/install | bash # or npm install -g opencode-ai ``` -------------------------------- ### Text Embeddings API Source: https://github.com/stima-tech/docs/blob/main/docs/getting-started/quick-start.md Generate text embeddings using the specified model. ```APIDOC ## POST /v1/embeddings ### Description This endpoint generates vector representations (embeddings) for a given piece of text using a specified model. ### Method POST ### Endpoint /v1/embeddings ### Parameters #### Request Body - **model** (string) - Required - The ID of the model to use for embedding. - **input** (string or array of strings) - Required - The input text(s) to embed. ### Request Example ```json { "model": "text-embedding-3-small", "input": "Hello, world!" } ``` ### Response #### Success Response (200) - **data** (array) - A list of embedding objects. - **embedding** (array of floats) - The generated embedding vector. - **index** (integer) - The index of the input text in the request. - **model** (string) - The model used for embedding. - **object** (string) - The type of object returned, usually 'list'. - **usage** (object) - Usage statistics for the request. - **prompt_tokens** (integer) - The number of tokens in the prompt. - **total_tokens** (integer) - The total number of tokens processed. #### Response Example ```json { "data": [ { "embedding": [ 0.0023123, -0.0045678, // ... 
other dimensions ], "index": 0 } ], "model": "text-embedding-3-small", "object": "list", "usage": { "prompt_tokens": 1, "total_tokens": 1 } } ``` ``` -------------------------------- ### Node.js Example for Apertis API Client Source: https://github.com/stima-tech/docs/blob/main/docs/authentication/api-keys.md Provides a Node.js example using the 'openai' package to communicate with the Apertis API. It demonstrates setting up the client with your API key and base URL, making a chat completion request, and logging the response. ```javascript import OpenAI from 'openai'; const client = new OpenAI({ apiKey: 'sk-your-api-key', baseURL: 'https://api.apertis.ai/v1' }); const response = await client.chat.completions.create({ model: 'gpt-4o', messages: [{ role: 'user', content: 'Hello!' }] }); console.log(response.choices[0].message.content); ``` -------------------------------- ### Handling HTTP 429 Rate Limit Errors Source: https://github.com/stima-tech/docs/blob/main/docs/billing/rate-limits.md Explains how the API responds when rate limits are exceeded and provides an example error JSON. ```APIDOC ## Handling Rate Limits ### HTTP 429 Response When you exceed the rate limit, the API returns: ```json { "error": { "message": "Rate limit exceeded. Please wait before making more requests.", "type": "rate_limit_error", "code": "rate_limit_exceeded" } } ``` ``` -------------------------------- ### Import OpenAI SDK in Node.js (ES Modules vs CommonJS) Source: https://github.com/stima-tech/docs/blob/main/docs/help/troubleshooting.md Provides examples of how to import the OpenAI SDK in Node.js, demonstrating both the ES Modules syntax (`import`) and the CommonJS syntax (`require`). This is important for ensuring correct module resolution in different Node.js project configurations. 
```javascript // ES Modules import OpenAI from 'openai'; // CommonJS const OpenAI = require('openai'); ``` -------------------------------- ### List Available Models Request (HTTP) Source: https://github.com/stima-tech/docs/blob/main/docs/help/error-codes.md An example HTTP GET request to list available models. This can be used to resolve 'model_not_found' errors. ```http GET /v1/models ``` -------------------------------- ### Set Environment Variable for API Key Source: https://github.com/stima-tech/docs/blob/main/docs/getting-started/quick-start.md This snippet shows how to set the Apertis API key as an environment variable using bash. This is the recommended practice for production environments to avoid hardcoding sensitive credentials. ```bash # Set environment variable export APERTIS_API_KEY="sk-your-api-key" ``` -------------------------------- ### Chat Completions API Endpoint Source: https://github.com/stima-tech/docs/blob/main/docs/getting-started/quick-start.md This endpoint allows you to interact with various AI models to generate text completions for chat-based applications. You can specify the model, messages, and other parameters to tailor the AI's response. ```APIDOC ## POST /v1/chat/completions ### Description Generates chat completions for a given set of messages using a specified AI model. ### Method POST ### Endpoint `/v1/chat/completions` ### Parameters #### Request Body - **model** (string) - Required - The AI model to use for completion (e.g., `gpt-4o`, `claude-3-5-sonnet-20241022`). - **messages** (array) - Required - A list of message objects representing the conversation history. Each object should have `role` (e.g., `system`, `user`, `assistant`) and `content`. - **stream** (boolean) - Optional - If set to `true`, the response will be streamed in chunks. ### Request Example ```json { "model": "gpt-4o", "messages": [ {"role": "user", "content": "Hello! 
What can you do?"} ] } ``` ### Response #### Success Response (200) - **id** (string) - Unique identifier for this completion. - **object** (string) - The type of object, typically `chat.completion`. - **created** (integer) - Unix timestamp of when the completion was created. - **model** (string) - The model used for the completion. - **choices** (array) - A list of completion choices. - **index** (integer) - The index of the choice. - **message** (object) - The message from the assistant. - **role** (string) - The role of the message sender, typically `assistant`. - **content** (string) - The AI's response content. - **finish_reason** (string) - The reason the model stopped generating tokens (e.g., `stop`, `length`). - **usage** (object) - Information about token usage. - **prompt_tokens** (integer) - Number of tokens in the prompt. - **completion_tokens** (integer) - Number of tokens in the completion. - **total_tokens** (integer) - Total tokens used. #### Response Example ```json { "id": "chatcmpl-abc123", "object": "chat.completion", "created": 1703894400, "model": "gpt-4o", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "Hello! I'm an AI assistant. I can help you with..." }, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 12, "completion_tokens": 45, "total_tokens": 57 } } ``` ``` -------------------------------- ### Python Example: Interact with Apertis LLM using LangChain Source: https://github.com/stima-tech/docs/blob/main/i18n/en/docusaurus-plugin-content-docs/current/installation/langchain.md Demonstrates how to configure and use LangChain with Apertis's API for LLM interactions. It includes functions for single responses and streaming output, along with examples of different prompt types and model configurations. 
```python
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage

CONFIG = {
    "api_key": "APERTIS_API_KEY",  # Replace with your Apertis API key
    "base_url": "https://api.apertis.ai/v1",
    "model": "gpt-4o-mini",
    "temperature": 0.7,
    "request_timeout": 30,
}


def get_llm(**kwargs):
    # Merge per-call overrides (e.g. model, temperature) into the defaults
    config = CONFIG.copy()
    config.update(kwargs)
    return ChatOpenAI(**config)


def ask(message, **kwargs):
    llm = get_llm(**kwargs)
    try:
        response = llm.invoke(message)
        return response.content
    except Exception:
        # Fall back to streaming if the single-shot call fails
        response = ""
        for chunk in llm.stream(message):
            response += chunk.content
        return response


def ask_stream(message, **kwargs):
    llm = get_llm(**kwargs)
    for chunk in llm.stream(message):
        print(chunk.content, end="", flush=True)
    print()


if __name__ == "__main__":
    response = ask("Hi, introduce yourself")
    print(f"Response: {response}\n")

    messages = [
        SystemMessage(content="You are a Python expert"),
        HumanMessage(content="What is LangChain?")
    ]
    response = ask(messages)
    print(f"Expert Response: {response}\n")

    creative_response = ask("Write a poem", temperature=0.9)
    print(f"Response: {creative_response}\n")

    print("Streaming: ", end="")
    ask_stream("Explain AI")

    # Switch model
    fast_response = ask("1+1=?", model="grok-4-fast")
    print(f"\nResponse: {fast_response}")
```

--------------------------------

### Stream Long Responses with Python

Source: https://github.com/stima-tech/docs/blob/main/docs/billing/rate-limits.md

Use streaming to improve the perceived performance of long responses from the API. This Python example sets `stream=True` and iterates through the response chunks, printing each one as it arrives instead of waiting for the entire response.

```python
# Assumes `client` is an OpenAI client configured with the Apertis base URL
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a long story"}],
    stream=True
)

for chunk in response:
    # delta.content can be None (e.g. on the final chunk), so skip empty deltas
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

--------------------------------

### Use LlamaIndex with Apertis API for LLM Completions

Source: https://github.com/stima-tech/docs/blob/main/i18n/en/docusaurus-plugin-content-docs/current/installation/llamaindex.md

Demonstrates how to initialize and use the OpenAI LLM interface from LlamaIndex to interact with the Apertis API. This requires an API key from Apertis and specifies the model and API base URL. The output is the completion generated by the language model.

```python
from llama_index.llms.openai import OpenAI

llm = OpenAI(
    model="gpt-4o",
    api_key="sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",  # Please replace with your API Key
    api_base="https://api.apertis.ai/v1"
)

ret = llm.complete("Donald Trump is ")
print(ret)
```

--------------------------------

### Generate Text Embeddings with Python

Source: https://github.com/stima-tech/docs/blob/main/docs/getting-started/quick-start.md

This snippet demonstrates how to generate text embeddings using the Apertis AI API client in Python. It requires the `openai` library and an API key. The example extracts the embedding vector from the response and prints its dimension.

```python
# Assumes `client` is an OpenAI client configured with the Apertis base URL
response = client.embeddings.create(
    model="text-embedding-3-small",
    input="Hello, world!"
)

embedding = response.data[0].embedding
print(f"Embedding dimension: {len(embedding)}")
```

--------------------------------

### Handle API Errors with Python

Source: https://github.com/stima-tech/docs/blob/main/docs/getting-started/quick-start.md

This Python snippet demonstrates error handling when interacting with the Apertis AI API. It catches `RateLimitError` first and then the more general `APIError`, printing an informative message in each case. It requires the `openai` library.

```python
from openai import OpenAI, APIError, RateLimitError

client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://api.apertis.ai/v1"
)

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}]
    )
    print(response.choices[0].message.content)
except RateLimitError:
    # RateLimitError is a subclass of APIError, so it must be caught first
    print("Rate limited! Please wait and retry.")
except APIError as e:
    print(f"API error: {e}")
```

--------------------------------

### Migrate from LiteLLM to Apertis Python Client

Source: https://github.com/stima-tech/docs/blob/main/docs/help/migration-guides.md

This Python snippet shows how to migrate from LiteLLM to the Apertis client. The core change is replacing the `litellm.completion` call with the standard `openai.OpenAI` client initialization and `chat.completions.create` method.

```python
# Before: LiteLLM
from litellm import completion

response = completion(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)
```

```python
# After: OpenAI client pointed at the Apertis base URL
from openai import OpenAI

client = OpenAI(
    api_key="sk-apertis-key",
    base_url="https://api.apertis.ai/v1"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
```

--------------------------------

### Enable Streaming Responses with Python

Source: https://github.com/stima-tech/docs/blob/main/docs/getting-started/quick-start.md

This Python snippet shows how to enable streaming for API responses. By setting `stream=True` in the `client.chat.completions.create` method, you receive the response in chunks, allowing for real-time output. The code iterates through the chunks and prints the content.

```python
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a short poem"}],
    stream=True  # Enable streaming
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

--------------------------------

### Install Crush using Nix

Source: https://github.com/stima-tech/docs/blob/main/docs/installation/crush.md

Installs the Crush AI coding agent using the Nix package manager. This command fetches the tool directly from a GitHub repository, ensuring reproducible builds.

```bash
nix run github:numtide/nix-ai-tools#crush
```
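
--------------------------------

### Retry Rate-Limited Requests with Exponential Backoff (Python)

The error-handling example earlier in this section prints "Please wait and retry" when a request is rate limited, but does not show the retry itself. The sketch below is one possible way to do it, not something taken from the Apertis documentation. The names `call_with_backoff`, `TransientError`, and `make_flaky` are hypothetical; `TransientError` stands in for a retryable exception such as `openai.RateLimitError`, and the delay schedule (1s, 2s, 4s, ...) is an assumption rather than a documented recommendation.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a retryable error such as openai.RateLimitError."""


def call_with_backoff(fn, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on TransientError wait base_delay * 2**attempt, then retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)


def make_flaky(failures):
    """Return a function that raises TransientError `failures` times, then succeeds."""
    state = {"calls": 0}

    def flaky():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TransientError("rate limited")
        return "ok"

    return flaky


if __name__ == "__main__":
    delays = []  # record the delays instead of actually sleeping
    result = call_with_backoff(make_flaky(2), sleep=delays.append)
    print(result, delays)  # prints: ok [1.0, 2.0]
```

In real use, `fn` could wrap the `client.chat.completions.create` call, and the `except` clause could catch `openai.RateLimitError` instead of the stand-in. Passing `sleep` as a parameter keeps the helper testable without real waiting.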