### Adding and Running Examples

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/CONTRIBUTING.md

Steps to add a new Python example file, make it executable, and run it against the API.

```python
# add an example to examples/.py

#!/usr/bin/env -S rye run python
…
```

```sh
$ chmod +x examples/.py

# run the example against your api
$ ./examples/.py
```

--------------------------------

### Install from Git

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/CONTRIBUTING.md

Command to install the SDK directly from a Git repository using pip.

```sh
$ pip install git+ssh://git@github.com/Cerebras/cerebras-cloud-sdk-python.git
```

--------------------------------

### Build and Install from Source

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/CONTRIBUTING.md

Commands to build the Python package distributable files (wheel) and install it locally.

```sh
$ rye build
# or
$ python -m build
```

```sh
$ pip install ./path-to-wheel-file.whl
```

--------------------------------

### Setup Environment with Rye

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/CONTRIBUTING.md

Commands to bootstrap the project using Rye, sync dependencies, and activate the virtual environment for running Python scripts.

```sh
$ ./scripts/bootstrap
```

```sh
$ rye sync --all-features
```

```sh
# Activate the virtual environment - https://docs.python.org/3/library/venv.html#how-venvs-work
$ source .venv/bin/activate

# now you can omit the `rye run` prefix
$ python script.py
```

--------------------------------

### Mock Server Setup for Tests

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/CONTRIBUTING.md

Instructions to set up a mock server using Prism against an OpenAPI specification, which is often required for running tests.
```sh
# you will need npm installed
$ npx prism mock path/to/your/openapi.yml
```

--------------------------------

### Setup Environment without Rye

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/CONTRIBUTING.md

Instructions for setting up the project using standard pip, including installing dependencies from a lock file.

```sh
$ pip install -r requirements-dev.lock
```

--------------------------------

### Install and Initialize Async Client

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/README.md

Installs the Cerebras Cloud SDK with aiohttp support and initializes an asynchronous client with an API key. This client is used to make API calls to the Cerebras Cloud.

```bash
pip install 'cerebras_cloud_sdk[aiohttp] @ git+ssh://git@github.com/Cerebras/cerebras-cloud-sdk-python-private.git'
```

```python
import asyncio
from cerebras.cloud.sdk import DefaultAioHttpClient
from cerebras.cloud.sdk import AsyncCerebras


async def main() -> None:
    async with AsyncCerebras(
        api_key="My API Key",
        http_client=DefaultAioHttpClient(),
    ) as client:
        chat_completion = await client.chat.completions.create(
            messages=[
                {
                    "role": "user",
                    "content": "Why is fast inference important?",
                }
            ],
            model="llama3.1-8b",
        )


asyncio.run(main())
```

--------------------------------

### Install Cerebras Python SDK

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/README.md

Installs the Cerebras Python SDK using pip. This is the first step to using the SDK for interacting with the Cerebras REST API.

```sh
pip install cerebras_cloud_sdk
```

--------------------------------

### Publishing to PyPI

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/CONTRIBUTING.md

Information on publishing packages to PyPI, either through a GitHub workflow or manually.
**Publish with a GitHub workflow**

You can release to package managers by using [the `Publish PyPI` GitHub action](https://www.github.com/Cerebras/cerebras-cloud-sdk-python/actions/workflows/publish-pypi.yml).

**Publish manually**

If you need to manually release a package, you can run the `bin/publish-pypi` script with a `PYPI_TOKEN` set on the environment.

--------------------------------

### Get Cerebras Cloud SDK Version

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/README.md

This snippet demonstrates how to import the Cerebras Cloud SDK and print its currently installed version at runtime. This is useful for debugging or verifying an upgrade.

```python
import cerebras.cloud.sdk

print(cerebras.cloud.sdk.__version__)
```

--------------------------------

### Linting and Formatting

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/CONTRIBUTING.md

Commands to run linting and formatting checks using Ruff and Black.

```sh
$ ./scripts/lint
```

```sh
$ ./scripts/format
```

--------------------------------

### Running Tests

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/CONTRIBUTING.md

Command to execute the project's test suite.

```sh
$ ./scripts/test
```

--------------------------------

### Enable aiohttp for Async Client

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/README.md

Instructions to enable aiohttp as the HTTP backend for the asynchronous Cerebras client to improve concurrency performance. This requires installing the aiohttp package.

```sh
pip install aiohttp
```

--------------------------------

### Using TypedDict for Nested Parameters

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/README.md

Illustrates how to use TypedDict for nested request parameters in the Cerebras SDK. This example shows passing a dictionary for `stream_options` to the `chat.completions.create` method and accessing the response.
```python
from cerebras.cloud.sdk import Cerebras

client = Cerebras()

chat_completion = client.chat.completions.create(
    messages=[
        {
            "content": "content",
            "role": "system",
        }
    ],
    model="model",
    stream_options={},
)
print(chat_completion.stream_options)
```

--------------------------------

### Distinguishing Null vs. Missing Fields in API Responses

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/README.md

Provides a Python code example demonstrating how to differentiate between a field explicitly set to `null` and a field that is entirely missing from an API response using the `.model_fields_set` attribute.

```python
if response.my_field is None:
    if 'my_field' not in response.model_fields_set:
        print('Got json like {}, without a "my_field" key present at all.')
    else:
        print('Got json like {"my_field": null}.')
```

--------------------------------

### Per-Request HTTP Client Customization

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/README.md

Shows how to customize the HTTP client on a per-request basis using the `with_options()` method, allowing different configurations for specific calls without altering the main client instance.

```python
from cerebras.cloud.sdk import Cerebras, DefaultHttpxClient
import httpx

client = Cerebras()

client.with_options(http_client=DefaultHttpxClient(proxy="http://my.test.proxy.example.com"))
```

--------------------------------

### Configuring the HTTP Client

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/README.md

Demonstrates how to override the default httpx client to customize aspects like proxies, transports, and other advanced configurations. This allows fine-grained control over network requests.
```python
import httpx
from cerebras.cloud.sdk import Cerebras, DefaultHttpxClient

client = Cerebras(
    # Or use the `CEREBRAS_BASE_URL` env var
    base_url="http://my.test.server.example.com:8083",
    http_client=DefaultHttpxClient(
        proxy="http://my.test.proxy.example.com",
        transport=httpx.HTTPTransport(local_address="0.0.0.0"),
    ),
)
```

--------------------------------

### Managing HTTP Client Resources with Context Manager

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/README.md

Explains how to manage the lifecycle of the HTTP client, ensuring resources are properly closed. Using a context manager (`with Cerebras() as client:`) guarantees the client is closed upon exiting the block.

```python
from cerebras.cloud.sdk import Cerebras

with Cerebras() as client:
    # make requests here
    pass

# HTTP client is now closed
```

--------------------------------

### Text Completion (Synchronous)

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/README.md

Demonstrates how to perform a text completion request using the synchronous Cerebras client. It initializes the client and provides a prompt, max tokens, and model for text generation.

```python
import os
from cerebras.cloud.sdk import Cerebras

client = Cerebras(
    api_key=os.environ.get("CEREBRAS_API_KEY"),  # This is the default and can be omitted
)

completion = client.completions.create(
    prompt="It was a dark and stormy ",
    max_tokens=100,
    model="llama3.1-8b",
)
print(completion)
```

--------------------------------

### Enabling Logging in Cerebras SDK

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/README.md

Shows how to enable verbose logging for the Cerebras Python SDK by setting the `CEREBRAS_LOG` environment variable to 'info' or 'debug'.
```shell
$ export CEREBRAS_LOG=info
```

```shell
$ export CEREBRAS_LOG=debug
```

--------------------------------

### Text Completions API

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/api.md

Manages text generation tasks. The `create` method sends a request to the `/v1/completions` endpoint with parameters and returns a `Completion` object.

```APIDOC
POST /v1/completions
client.completions.create(**params) -> Completion

Description: Creates a text completion.

Parameters:
  params: A dictionary of parameters for text completion (e.g., prompt, model).

Returns:
  Completion: An object containing the text completion response.
```

--------------------------------

### Synchronous Text Completion Streaming

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/README.md

Demonstrates how to stream text completion responses using the synchronous Cerebras client. It iterates over the stream to print text chunks. The API key is fetched from an environment variable.

```python
import os
from cerebras.cloud.sdk import Cerebras

client = Cerebras(
    # This is the default and can be omitted
    api_key=os.environ.get("CEREBRAS_API_KEY"),
)

stream = client.completions.create(
    prompt="It was a dark and stormy ",
    max_tokens=100,
    model="llama3.1-8b",
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].text or "", end="")
```

--------------------------------

### Accessing Raw Response Data

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/README.md

Demonstrates how to access the raw HTTP response, including headers, and parse the content. This is useful for inspecting low-level response details or handling custom headers.
```python
from cerebras.cloud.sdk import Cerebras

client = Cerebras()

response = client.chat.completions.with_raw_response.create(
    messages=[
        {
            "role": "user",
            "content": "Why is fast inference important?",
        }
    ],
    model="llama3.1-8b",
)
print(response.headers.get('X-My-Header'))

completion = response.parse()  # get the object that `chat.completions.create()` would have returned
print(completion)
```

--------------------------------

### Chat Completion (Synchronous)

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/README.md

Demonstrates how to perform a chat completion request using the synchronous Cerebras client. It initializes the client with an API key and sends a user message to a specified model.

```python
import os
from cerebras.cloud.sdk import Cerebras

client = Cerebras(
    api_key=os.environ.get("CEREBRAS_API_KEY"),  # This is the default and can be omitted
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Why is fast inference important?",
        }
    ],
    model="llama3.1-8b",
)
print(chat_completion)
```

--------------------------------

### Chat Completion (Asynchronous)

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/README.md

Demonstrates how to perform a chat completion request using the asynchronous Cerebras client. It initializes the client and uses `async`/`await` for non-blocking API calls.

```python
import os
import asyncio
from cerebras.cloud.sdk import AsyncCerebras

client = AsyncCerebras(
    api_key=os.environ.get("CEREBRAS_API_KEY"),  # This is the default and can be omitted
)


async def main() -> None:
    chat_completion = await client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": "Why is fast inference important?",
            }
        ],
        model="llama3.1-8b",
    )
    print(chat_completion)


asyncio.run(main())
```

--------------------------------

### Chat Completions API

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/api.md

Handles chat-based interactions.
The `create` method sends a request to the `/v1/chat/completions` endpoint with specified parameters and returns a `ChatCompletion` object.

```APIDOC
POST /v1/chat/completions
client.chat.completions.create(**params) -> ChatCompletion

Description: Creates a chat completion.

Parameters:
  params: A dictionary of parameters for chat completion (e.g., messages, model).

Returns:
  ChatCompletion: An object containing the chat completion response.
```

--------------------------------

### Synchronous Chat Completion Streaming

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/README.md

Demonstrates how to stream chat completion responses using the synchronous Cerebras client. It iterates over the stream to print content chunks as they arrive. The API key is fetched from an environment variable.

```python
import os
from cerebras.cloud.sdk import Cerebras

client = Cerebras(
    # This is the default and can be omitted
    api_key=os.environ.get("CEREBRAS_API_KEY"),
)

stream = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Why is fast inference important?",
        }
    ],
    model="llama3.1-8b",
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
```

--------------------------------

### Configuring Timeouts for Cerebras SDK Requests

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/README.md

Explains how to set request timeouts for the Cerebras Python SDK, both globally and per request. It covers using a float for simple timeouts and `httpx.Timeout` for more granular control over connect, read, and write timeouts.
```python
from cerebras.cloud.sdk import Cerebras
import httpx

# Configure the default for all requests:
client = Cerebras(
    # 20 seconds (default is 1 minute)
    timeout=20.0,
)

# More granular control:
client = Cerebras(
    timeout=httpx.Timeout(60.0, read=5.0, write=10.0, connect=2.0),
)

# Override per-request:
client.with_options(timeout=5.0).chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Why is fast inference important?",
        }
    ],
    model="llama3.1-8b",
)
```

--------------------------------

### Streaming Response Data

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/README.md

Shows how to stream response bodies using `.with_streaming_response` for efficient handling of large responses. It requires a context manager and allows reading the response content incrementally.

```python
from cerebras.cloud.sdk import Cerebras

client = Cerebras()

with client.chat.completions.with_streaming_response.create(
    messages=[
        {
            "role": "user",
            "content": "Why is fast inference important?",
        }
    ],
    model="llama3.1-8b",
) as response:
    print(response.headers.get("X-My-Header"))

    for line in response.iter_lines():
        print(line)
```

--------------------------------

### Asynchronous Chat Completion Streaming

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/README.md

Demonstrates how to stream chat completion responses using the asynchronous Cerebras client. It uses an async for loop to process streamed chunks. The API key is fetched from an environment variable.
```python
import os
import asyncio
from cerebras.cloud.sdk import AsyncCerebras

client = AsyncCerebras(
    # This is the default and can be omitted
    api_key=os.environ.get("CEREBRAS_API_KEY"),
)


async def main() -> None:
    stream = await client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": "Why is fast inference important?",
            }
        ],
        model="llama3.1-8b",
        stream=True,
    )
    async for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="")


asyncio.run(main())
```

--------------------------------

### Handling API Errors with Cerebras SDK

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/README.md

Demonstrates how to catch and handle various API errors, including connection issues, rate limiting, and general status errors, using the Cerebras Python SDK. It shows how to access specific error details like status codes and responses.

```python
import cerebras.cloud.sdk
from cerebras.cloud.sdk import Cerebras

client = Cerebras()

try:
    client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": "This should cause an error!",
            }
        ],
        model="some-model-that-doesnt-exist",
    )
except cerebras.cloud.sdk.APIConnectionError as e:
    print("The server could not be reached")
    print(e.__cause__)  # an underlying Exception, likely raised within httpx.
except cerebras.cloud.sdk.RateLimitError as e:
    print("A 429 status code was received; we should back off a bit.")
except cerebras.cloud.sdk.APIStatusError as e:
    print("Another non-200-range status code was received")
    print(e.status_code)
    print(e.response)
```

--------------------------------

### Making Undocumented Endpoint Requests

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/README.md

Illustrates how to make requests to undocumented API endpoints using `client.post` (or other HTTP verbs). This method respects client options like retries and allows specifying the response casting.
```python
import httpx
from cerebras.cloud.sdk import Cerebras

client = Cerebras()

response = client.post(
    "/foo",
    cast_to=httpx.Response,
    body={"my_param": True},
)

print(response.headers.get("x-foo"))
```

--------------------------------

### Models API

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/api.md

Provides functionality to retrieve and list available models. The `retrieve` method fetches details for a specific model by its ID via the `/v1/models/{model_id}` endpoint, while the `list` method retrieves all available models from the `/v1/models` endpoint.

```APIDOC
GET /v1/models/{model_id}
client.models.retrieve(model_id) -> ModelRetrieveResponse

Description: Retrieves a specific model by its ID.

Parameters:
  model_id: The unique identifier of the model.

Returns:
  ModelRetrieveResponse: An object containing the details of the specified model.
```

```APIDOC
GET /v1/models
client.models.list() -> ModelListResponse

Description: Lists all available models.

Parameters: None

Returns:
  ModelListResponse: An object containing a list of available models.
```

--------------------------------

### Configuring Retries for Cerebras SDK Requests

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/README.md

Illustrates how to manage automatic retries for requests made with the Cerebras Python SDK. It shows how to disable retries globally or configure them on a per-request basis using the `max_retries` option.
```python
from cerebras.cloud.sdk import Cerebras

# Configure the default for all requests:
client = Cerebras(
    # default is 2
    max_retries=0,
)

# Or, configure per-request:
client.with_options(max_retries=5).chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Why is fast inference important?",
        }
    ],
    model="llama3.1-8b",
)
```

--------------------------------

### Set Cerebras API Key

Source: https://github.com/cerebras/cerebras-cloud-sdk-python/blob/main/README.md

Sets the Cerebras API key as an environment variable. This key is required for authenticating requests to the Cerebras REST API.

```sh
export CEREBRAS_API_KEY="your-api-key-here"
```
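The snippets throughout this collection read the key with `os.environ.get("CEREBRAS_API_KEY")`, which silently yields `None` when the variable is unset. A minimal sketch of a fail-fast variant (the `require_api_key` helper is hypothetical, not part of the SDK):

```python
import os


def require_api_key() -> str:
    """Return the Cerebras API key from the environment, or fail fast.

    Hypothetical helper: the SDK itself already falls back to the
    CEREBRAS_API_KEY variable when `api_key` is omitted; this just
    surfaces a clearer error message before any client is constructed.
    """
    key = os.environ.get("CEREBRAS_API_KEY")
    if not key:
        raise RuntimeError(
            "CEREBRAS_API_KEY is not set; run `export CEREBRAS_API_KEY=...` first."
        )
    return key
```

Passing the result explicitly, e.g. `Cerebras(api_key=require_api_key())`, makes a missing key produce this message rather than the SDK's generic authentication error.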