### Example Tool Calls

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/07_chat.ipynb

Provides an example list of ToolCall objects, each representing a call to the 'simple_add' tool with different arguments.

```python
tool_calls = [ToolCall(id='123', name='simple_add', arguments={'a': 3, 'b': 5}, server=False, extra={'type': 'function'}), 
              ToolCall(id='456', name='simple_add', arguments={'a': 10, 'b': 20}, server=False, extra={'type': 'function'})]
```
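
One way calls like these might be executed is to look up each tool name in a table of local functions and pass the arguments as keyword arguments. The sketch below is an illustration under assumptions: the `ToolCall` dataclass is a stand-in that only mirrors the fields shown above, and `simple_add`, `TOOLS`, and `run_tool_calls` are hypothetical names, not part of the library's documented API.

```python
from dataclasses import dataclass, field

# Stand-in for the library's ToolCall; it only mirrors the fields shown above.
@dataclass
class ToolCall:
    id: str
    name: str
    arguments: dict
    server: bool = False
    extra: dict = field(default_factory=dict)

def simple_add(a: int, b: int) -> int:
    """Hypothetical tool body: add two integers."""
    return a + b

# Hypothetical lookup table from tool name to a local callable.
TOOLS = {'simple_add': simple_add}

def run_tool_calls(calls):
    """Run each client-side call and return a dict of call id -> result."""
    return {c.id: TOOLS[c.name](**c.arguments) for c in calls if not c.server}

tool_calls = [ToolCall(id='123', name='simple_add', arguments={'a': 3, 'b': 5}, extra={'type': 'function'}),
              ToolCall(id='456', name='simple_add', arguments={'a': 10, 'b': 20}, extra={'type': 'function'})]
results = run_tool_calls(tool_calls)
print(results)
```

Keying the results by call id keeps each result attributable to the call that produced it, which matters when the same tool is called more than once.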

--------------------------------

### Model Output Example

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/07_chat.ipynb

Shows the Markdown label displayed for the model 'models/gemini-3.1-pro-preview'. Only the bold model-name header is captured here; the model's reply itself is not included.

```text
Result:
Markdown(**models/gemini-3.1-pro-preview:**)

```

--------------------------------

### Gemini Client Content Generation Example

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/05_gemini.ipynb

Demonstrates generating content through the Gemini client object (`gem_cli`) with a specific model and input, then wrapping the response with `mk_completion`. This is a non-streaming example used for testing.

```python
inp = [{"role": "user", "parts": [{"text": "Hi how are you?"}]}]
resp = await gem_cli.models.generate_content(model=mn, contents=inp)
comp = mk_completion(resp, mn, api_name, vnd_nm)
comp
```

--------------------------------

### Chat with Tool Usage Example

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/07_chat.ipynb

Demonstrates a chat interaction where the assistant uses a tool to perform a calculation. The example shows the message flow from user to assistant, tool use, tool result, and final assistant response.

```markdown
Result:
Markdown(**Msg**

- role: `user`

<contents>

**Part** (`text`)

What is 5 + 7? Use the tool to calculate it

<details markdown='1'>

- data: `None`

</details>

</contents>

**Msg**

- role: `assistant`

<contents>

**Part** (`tool_use`)


<details markdown='1'>

- data: `{'type': 'function_call', 'status': 'completed', 'call_id': 'call_B51xXWFg10YkHqkJVMODTx4i', 'id': 'fc_0e2920bf05b71dc30069f311b880888191a42435c7c037419e', 'name': 'async_add', 'arguments': {'a': 5, 'b': 7}, 'server': False}`

</details>

</contents>

**Msg**

- role: `tool`

<contents>

**Part** (`tool_result`)

12

<details markdown='1'>

- data: `{'id': 'fc_0e2920bf05b71dc30069f311d949c88192a50f746f6e2da30d', 'name': 'async_add', 'arguments': {'a': 5, 'b': 7}, 'server': False}`

</details>

</contents>

**Msg**

- role: `user`

<contents>

**Part** (`text`)

You have used all your tool calls for this turn. Please summarize your findings. If you did not comp...

<details markdown='1'>

- data: `None`

</details>

</contents>

**Msg**

- role: `assistant`

<contents>

**Part** (`text`)

I used the tool to calculate 5 + 7, and the result is 12. If you have more calculations or questions, feel free to ask!

<details markdown='1'>

- data: `{'type': 'output_text', 'logprobs': [], 'text': 'I used the tool to calculate 5 + 7, and the result is 12. If you have more calculations or questions, feel free to ask!', 'citations': []}`

</details>

</contents>
```

--------------------------------

### Example: Streaming Completion with OpenAI

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/02_oai_responses.ipynb

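Requests a completion from the OpenAI Responses API for a given model and input, then normalizes it with `mk_completion`. As written, the call does not pass `stream=True`, so the response arrives as a single object rather than as a stream.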

```python
mn,inp = 'gpt-4o-mini','Hi!'
resp = await oai_cli.responses.create_response(model=mn,input=inp)
comp = mk_completion(resp, mn, api_name, vnd_nm); comp
```

--------------------------------

### Import LiteLLM

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/06_acomplete.ipynb

Import the LiteLLM library to start using its functionalities.

```python
import litellm
```

--------------------------------

### Commented Out Qwen API Client Initialization

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/03_oai_chat.ipynb

A commented-out example showing how to initialize an OpenAPIClient for the Qwen API with its specific endpoint.

```python
# qwen_cli = OpenAPIClient(oai_spec, headers={"Authorization": f"Bearer {os.environ['QWEN_API_KEY']}"})
# for op in qwen_cli.ops: op.base_url = 'https://dashscope.aliyuncs.com/compatible-mode/v1'

```

--------------------------------

### Example Completion Object Creation

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/00_types.ipynb

Demonstrates how to create a Completion object with multiple parts, including 'thinking' and 'text' types, simulating a complex LLM response.

```python
parts = [
    Part(type=PartType.thinking, text="First, let me consider the question..."),
    Part(type=PartType.text, text="The answer involves two parts. "),
    Part(type=PartType.thinking, text="Now for the second part, I need to..."),
    Part(type=PartType.text, text="And here's the conclusion."),
]
Completion(model='model', message=Msg(role="assistant", content=parts))
```
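
When thinking and text parts are interleaved like this, a consumer often wants only the user-visible text. The following is a minimal sketch of that filtering, assuming stand-in `PartType` and `Part` definitions that mirror only the two fields used above; `visible_text` is a hypothetical helper, not a library function.

```python
from dataclasses import dataclass
from enum import Enum

# Stand-ins for the library types; only the members used here are defined.
class PartType(str, Enum):
    text = 'text'
    thinking = 'thinking'

@dataclass
class Part:
    type: PartType
    text: str

def visible_text(parts):
    """Join the text of all `text` parts in order, skipping `thinking` parts."""
    return ''.join(p.text for p in parts if p.type == PartType.text)

parts = [
    Part(type=PartType.thinking, text="First, let me consider the question..."),
    Part(type=PartType.text, text="The answer involves two parts. "),
    Part(type=PartType.thinking, text="Now for the second part, I need to..."),
    Part(type=PartType.text, text="And here's the conclusion."),
]
answer = visible_text(parts)
print(answer)
```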

--------------------------------

### Streaming Cache Test Setup

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/06_acomplete.ipynb

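Repeats the cache test in streaming mode with `acomplete` as a sanity check: the first request writes the cache and the second reads it. The snippet relies on `msg1`, `msg3`, and `comp1` from the non-streaming cache test setup.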

```python
# Streaming - as a sanity check
it1 = await acomplete([msg1], model='claude-sonnet-4-20250514', max_tokens=64, stream=True)  # writes cache
it2 = await acomplete([msg1, comp1.message, msg3], model='claude-sonnet-4-20250514', max_tokens=64, stream=True)  # reads cache
async for comp1 in it1: pass
async for comp2 in it2: pass
```

--------------------------------

### Example: Usage with Code Execution Tool

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/05_gemini.ipynb

Illustrates a Gemini API call requesting code execution and normalizes the usage metadata. Note that code execution tool use is not explicitly logged in the normalized output in this example.

```python
resp = await gem_cli.models.generate_content(model=mn, contents=[{"role": "user", "parts": [{"text": "Calculate the first 10 fibonacci numbers using code"}]}], tools=[{"codeExecution": {}}])
norm_usage(resp)
```

--------------------------------

### Model Output: GPT-4o-search-preview to GPT-4o-search-preview

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/06_acomplete.ipynb

Example output from gpt-4o-search-preview when interacting with itself, detailing current weather conditions.

```text
Output:
  gpt-4o-search-preview -> gpt-4o-search-preview: As of 11:16 PM local time on Friday, June 12, 2026, in Brisbane, Australia, the current weather cond…
```

--------------------------------

### Model Output: GPT-4o-search-preview to GPT-4o-mini

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/06_acomplete.ipynb

Example output from gpt-4o-search-preview when interacting with gpt-4o-mini, providing a weather update.

```text
Output:
  gpt-4o-search-preview -> gpt-4o-mini : As of 11:16 PM local time on Friday, June 12, 2026, in Brisbane, Australia, the weather is mostly cl…
```

--------------------------------

### PartAccum Example: Merging Tool Calls

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/01_streaming.ipynb

Demonstrates initializing PartAccum with a ToolCall and then merging parts, excluding tool calls.

```python
pa = PartAccum({0: ToolCall(id='toolu_01GF7HEH9s63gdAYL5dbcSj5', name='python', arguments={}, server=False, extra={'caller': {'type': 'direct'}})}, [])
pa.get_merged(False)
```

--------------------------------

### Example: Basic Text Generation Usage

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/05_gemini.ipynb

Demonstrates a basic text generation call to the Gemini API and normalizes its usage metadata.

```python
resp = await gem_cli.models.generate_content(model=mn, contents=[{"role": "user", "parts": [{"text": "hi!"}]}])
norm_usage(resp)
```
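
The Gemini completion outputs recorded elsewhere in this document pair raw `usageMetadata` keys with normalized `Usage` fields (for example `promptTokenCount` with `prompt_tokens` and `thoughtsTokenCount` with `reasoning_tokens`). The sketch below illustrates that kind of mapping on a plain dict. It is an assumption inferred from those outputs: `norm_gemini_usage` is a hypothetical name, and the real `norm_usage` takes the response object and may handle more fields.

```python
def norm_gemini_usage(meta: dict) -> dict:
    """Map Gemini-style usageMetadata keys onto normalized names (assumed mapping)."""
    return {
        'prompt_tokens': meta.get('promptTokenCount', 0),
        'completion_tokens': meta.get('candidatesTokenCount', 0),
        'total_tokens': meta.get('totalTokenCount', 0),
        'reasoning_tokens': meta.get('thoughtsTokenCount', 0),
    }

# Values copied from a Gemini output shown in this document.
meta = {'promptTokenCount': 3, 'candidatesTokenCount': 9,
        'totalTokenCount': 35, 'thoughtsTokenCount': 23}
usage = norm_gemini_usage(meta)
print(usage)
```

In that recorded output the total equals prompt plus completion plus reasoning tokens, so the normalized dict can be sanity-checked with a simple sum.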

--------------------------------

### Registering a Test API

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/00_types.ipynb

Demonstrates how to register a new API endpoint with the `api_registry`. This example registers a 'test' API with a simple function `f`.

```python
def f(): print('test')
api_registry.register('test',**{'f':f})
```
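
To make the call shape concrete, here is a minimal sketch of a registry that accepts `register(name, **funcs)`. The `Registry` class and its `get` method are hypothetical stand-ins written for illustration; the library's actual `api_registry` implementation is not shown in the source and may differ. Unlike the original `f`, this version returns its string so the result can be checked.

```python
class Registry:
    """Hypothetical registry: maps an API name to a dict of named functions."""
    def __init__(self):
        self._apis = {}

    def register(self, name, **funcs):
        # Store the keyword arguments as that API's function table.
        self._apis[name] = funcs

    def get(self, name):
        return self._apis[name]

api_registry = Registry()  # stand-in, not the library object

def f():
    return 'test'

api_registry.register('test', **{'f': f})
out = api_registry.get('test')['f']()
print(out)
```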

--------------------------------

### Example of Prepending System Message

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/03_oai_chat.ipynb

Demonstrates prepending a system message to user messages for OpenAI Chat. Shows the resulting message list structure.

```python
sp = 'You are a pirate. Always respond in pirate speak. Keep it to one sentence.'
msg1 = mk_user_msg('What are you?')
msgs = denorm_msgs([msg1])
msgs = denorm_system(sp, msgs); msgs
```
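
In the OpenAI Chat Completions message format, a system prompt is a message with role `system` placed before the conversation. The sketch below shows that prepend step on plain dicts. `prepend_system` is a hypothetical helper written for illustration; it is not the library's `denorm_system`, whose implementation is not shown here.

```python
def prepend_system(system_prompt, msgs):
    """Return a new list with a system message first; the input list is not mutated."""
    return [{'role': 'system', 'content': system_prompt}, *msgs]

sp = 'You are a pirate. Always respond in pirate speak. Keep it to one sentence.'
user_msgs = [{'role': 'user', 'content': 'What are you?'}]
chat_msgs = prepend_system(sp, user_msgs)
print([m['role'] for m in chat_msgs])
```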

--------------------------------

### Example of Part Instantiation with Long Data

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/00_types.ipynb

Demonstrates creating a `Part` instance with a long text and a long dictionary value to showcase the truncation in its Markdown representation.

```python
Part(PartType.text, 'Hello world!'*1000, data={'long':"10"*1000})
```
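
The truncation mentioned above can be pictured with a small helper that cuts a value's string form at a fixed length and appends an ellipsis. This is only a sketch: `trunc` is a hypothetical name, and the 100-character limit is an assumption rather than a documented setting.

```python
def trunc(value, limit=100):
    """Return str(value) cut to `limit` characters, with '...' appended if it was cut."""
    s = str(value)
    return s if len(s) <= limit else s[:limit] + '...'

long_text = 'Hello world!' * 1000
long_data = {'long': '10' * 1000}

short_text = trunc(long_text)
short_data = {k: trunc(v) for k, v in long_data.items()}
print(short_text)
```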

--------------------------------

### Create OpenAI Response with System Instructions

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/02_oai_responses.ipynb

Demonstrates creating an OpenAI response with a system prompt and streaming the output. Requires `oai_cli`, `denorm_msgs`, `denorm_system`, `mk_user_msg`, `acollect_stream`, and the vendor name `vnd_nm` to be set up beforehand.

```python
sp = 'You are a pirate. Always respond in pirate speak. Keep it to one sentence.'
msg1 = mk_user_msg('What are you?')
resp = await oai_cli.responses.create_response(model='gpt-4o-mini', input=denorm_msgs([msg1]), instructions=denorm_system(sp), stream=True)
async for comp in acollect_stream(resp, vendor_name=vnd_nm): pass
comp
```

--------------------------------

### Get Model Information

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/00_types.ipynb

Retrieves detailed information about a specific model from a given vendor. This is a commented-out example.

```python
# get_model_info('claude-fable-5', 'anthropic')
```

```python
# get_model_info('MiniMax-M3', 'minimax')
```

--------------------------------

### Cache Test Setup with LiteLLM

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/06_acomplete.ipynb

Sets up a cache test scenario using `litellm.completion` with long text input and a summarization request, followed by a follow-up question.

```python
big_msgs1 = [{"role":"user","content":[
    {"type":"text","text":big_text,"cache_control":{"type":"ephemeral"}},
    {"type":"text","text":"Summarize"}]}]
litecomp1 = litellm.completion(model="anthropic/claude-sonnet-4-20250514", messages=big_msgs1, max_tokens=64)

big_msgs2 = big_msgs1 + [
    {"role":"assistant","content":litecomp1.choices[0].message.content},
    {"role":"user","content":"Now in French"}]
litecomp2 = litellm.completion(model="anthropic/claude-sonnet-4-20250514", messages=big_msgs2, max_tokens=64)
```

--------------------------------

### Get Model Info for Codex Models

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/00_types.ipynb

Retrieves detailed information for specific Codex models. These are examples of using `get_model_info` with different model identifiers.

```python
get_model_info(codex53spark, 'codex')
```

```python
get_model_info(codex55, 'codex')
```

--------------------------------

### Direct Text Completion Output (Part 1)

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/06_acomplete.ipynb

This example shows the first part of a direct text completion output. It is a fragment of a larger response.

```text
Hello

```

--------------------------------

### Example Conversation with mk_msgs

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/07_chat.ipynb

Demonstrates how to use `mk_msgs` to create a simple chat conversation with four messages. The output shows the resulting list of `Msg` objects.

```python
msgs = mk_msgs(['Hey!',"Hi there!","How are you?","I'm doing fine and you?"])
msgs
```
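
A plausible reading of this call is that the strings are assigned alternating roles, starting with the user. The sketch below illustrates that idea with plain dicts. It is an assumption: `alternate_roles` is a hypothetical helper, and the real `mk_msgs` returns `Msg` objects whose exact construction is not shown in the source.

```python
def alternate_roles(texts):
    """Assign 'user' to even positions and 'assistant' to odd positions."""
    roles = ('user', 'assistant')
    return [{'role': roles[i % 2], 'content': t} for i, t in enumerate(texts)]

conversation = alternate_roles(['Hey!', "Hi there!", "How are you?", "I'm doing fine and you?"])
for m in conversation:
    print(m['role'], '->', m['content'])
```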

--------------------------------

### ToolCall Example

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/00_types.ipynb

An example of a ToolCall object, which represents a function call made by an LLM.

```python
ToolCall(id='oxwvx1fm', name='simple_add', arguments={'b': 547982745, 'a': 5478954793}, server=False, extra={'thoughtSignature': 'EscDCsQDAQw51scPHdv+D5BX7JWdLzz3Bv8tsKFRuAJe2UkTFZ+NZKzNsLtmQBiia+/r4HJEUptq1zQB0q9HToX0qzCUqyNAbDLY76KxMeW9jpsnUvh6ZjPM5sDD7fAafF7cjdApNMsihPqIZBAZjAlFPcp1c/50MObH5f1q7hO7fgDS4iSJ3Q3FfbAYWnJ4nlA2peVMu/6WFcKZh1wcZCIuN6iFCj6nhH+6RKkaFRaM0b6XCmpti6qldSeZx+qtHmo+lzr1tct4Gz/CITDI7gRJ3qfLYV2u45jOhKzdd1t6gQ39XLJ93j0xd0AwpzcdZLbHWqwWJCQ43nNzhJ7IQTAWOSyPgKDnlAMHq2PTEoXBYkMBApCZ1x+HncBzt77kQrTTe7sWGVmD5boVnYAIFPFGXOULP5tDZ+nog+Fg8NV10vaFKlHVf+VDzFnVWxT259LN12ykGtBilfpTXiKCV12RAZwhuL7vXXHrsBGg5HNVImcXqgMvwf/rtQlJeop+9bEcAiU48hMFMzumOrCmmHD3HgxpYLW7T3vtDmbNdKCDqVtIwO4Rp5HE6GudRWmq8iC2UnyQglUXoXVnxIZW7eYYDsGAYrYgZ1A='})
```

--------------------------------

### Prepare and Create Response with Tools

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/02_oai_responses.ipynb

Sets up the model name, input prompt, and tools for an OpenAI API request, then creates a completion response. This is a common setup for making API calls that involve tool usage.

```python
mn,inp,tools = 'gpt-4o-mini','What is the weather in Istanbul today?',[{"type": "web_search_preview"}]
resp = await oai_cli.responses.create_response(model=mn,input=inp,tools=tools)
comp = mk_completion(resp, mn, api_name, vnd_nm)
```

--------------------------------

### Basic Chat Completion Example

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/07_chat.ipynb

Illustrates a simple chat completion response from a Gemini model. This shows the structure of a successful message and usage statistics.

```text
Result:
Completion(model='models/gemini-3-flash-preview', message=Msg(role='assistant', content=[Part(type=<PartType.text: 'text'>, text='Hello! How can I help you today?', data={'citations': []})]), finish_reason=<finish_reason.stop: 'stop'>, usage=Usage(prompt_tokens=3, completion_tokens=9, total_tokens=77, cached_tokens=0, cache_creation_tokens=0, reasoning_tokens=65, raw={'promptTokenCount': 3, 'candidatesTokenCount': 9, 'totalTokenCount': 77, 'promptTokensDetails': [{'modality': 'TEXT', 'tokenCount': 3}], 'thoughtsTokenCount': 65}), tool_calls=[], api_name='gemini', vendor_name='gemini', raw={'deltas': [Delta(text='Hello! How can I', thinking='', refusal='', tool_calls=[], citations=[], server_tool_result=None, finish_reason=None, usage=Usage(prompt_tokens=3, completion_tokens=5, total_tokens=73, cached_tokens=0, cache_creation_tokens=0, reasoning_tokens=65, raw={'promptTokenCount': 3, 'candidatesTokenCount': 5, 'totalTokenCount': 73, 'promptTokensDetails': [{'modality': 'TEXT', 'tokenCount': 3}], 'thoughtsTokenCount': 65}), raw={'candidates': [{'content': {'parts': [{'text': 'Hello! 
How can I'}], 'role': 'model'}, 'index': 0}], 'usageMetadata': {'promptTokenCount': 3, 'candidatesTokenCount': 5, 'totalTokenCount': 73, 'promptTokensDetails': [{'modality': 'TEXT', 'tokenCount': 3}], 'thoughtsTokenCount': 65}, 'modelVersion': 'gemini-3-flash-preview', 'responseId': 'gRLzacDtIt6A3boP9vCWsAg'}), Delta(text=' help you today?', thinking='', refusal='', tool_calls=[], citations=[], server_tool_result=None, finish_reason=None, usage=Usage(prompt_tokens=3, completion_tokens=9, total_tokens=77, cached_tokens=0, cache_creation_tokens=0, reasoning_tokens=65, raw={'promptTokenCount': 3, 'candidatesTokenCount': 9, 'totalTokenCount': 77, 'promptTokensDetails': [{'modality': 'TEXT', 'tokenCount': 3}], 'thoughtsTokenCount': 65}), raw={'candidates': [{'content': {'parts': [{'text': ' help you today?'}], 'role': 'model'}, 'index': 0}], 'usageMetadata': {'promptTokenCount': 3, 'candidatesTokenCount': 9, 'totalTokenCount': 77, 'promptTokensDetails': [{'modality': 'TEXT', 'tokenCount': 3}], 'thoughtsTokenCount': 65}, 'modelVersion': 'gemini-3-flash-preview', 'responseId': 'gRLzacDtIt6A3boP9vCWsAg'}), Delta(text='', thinking='', refusal='', tool_calls=[], citations=[], server_tool_result=None, finish_reason=<finish_reason.stop: 'stop'>, usage=Usage(prompt_tokens=3, completion_tokens=9, total_tokens=77, cached_tokens=0, cache_creation_tokens=0, reasoning_tokens=65, raw={'promptTokenCount': 3, 'candidatesTokenCount': 9, 'totalTokenCount': 77, 'promptTokensDetails': [{'modality': 'TEXT', 'tokenCount': 3}], 'thoughtsTokenCount': 65}), raw={'candidates': [{'content': {'parts': [{'text': '', 'thoughtSignature': 
'EtQCCtECAQw51sfRXDOcN9iujtIsIdL/3Hqy9Ppa7GABoJXxMd00zUZUs4rcHJs815F1BZP0RlKbRrxtSACPJBb5ypxaKrzijIymPV7n9FynodoT6/B7wJquuHXD6rIvPy9/nssqrWAcBA5fJOdjXRtfM3tMLhIcl6Np3L87f6KeOwgS/npqJLikxKJxHFukl1cRw2COc3gqKfksPcAwPydBUcegmji3elck26EZmqzqO8+jCETceWkThUCxmg9jM9oWI3JmmrOFSKZ9/IcIFf4kuz/xeFxzPbdh/PQW1GMHndLy/PErTkRIwu5HtcZQYAZWcwqB3ob6ulYi0NdDWl9Y1SeMCa911GpG1W3iOro46AZcpe/+eEj16TFCqReGU6nD2MSHx9iNcGhTu919tAW5BGw1sKfZV5PFltMBAzQTRvplLakXAwsdxE/jPheVo6PZj9VtXQ=='}], 'role': 'model'}, 'finishReason': 'STOP', 'index': 0}], 'usageMetadata': {'promptTokenCount': 3, 'candidatesTokenCount': 9, 'totalTokenCount': 77, 'promptTokensDetails': [{'modality': 'TEXT', 'tokenCount': 3}], 'thoughtsTokenCount': 65}, 'modelVersion': 'gemini-3-flash-preview', 'responseId': 'gRLzacDtIt6A3boP9vCWsAg'})]})
```

--------------------------------

### Initialize AsyncChat with Different Configurations

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/07_chat.ipynb

Demonstrates how to initialize the AsyncChat class with various configurations, including auto-inferred models, known vendors, and explicit API settings.

```python
# Auto inferred
c = AsyncChat("claude-opus-4-5")
c = AsyncChat("models/gemini-3-pro-preview")
c = AsyncChat("gpt-4.1")
# Known Vendor
c = AsyncChat("gpt-5.4", vendor_name="codex")
# Explicit
c = AsyncChat("gpt-oss-20b", api_name='openai_chat', api_key='...', base_url='https://openrouter.ai/api/v1')
```
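
How the "auto inferred" case works internally is not shown in the source. One simple way such inference could be done is prefix matching on the model name, sketched below. Everything here is hypothetical: the `PREFIXES` table, the vendor strings (in particular `'openai'`), and `infer_vendor` are illustrative assumptions, not the library's rules.

```python
# Hypothetical prefix table; the library's real inference rules may differ.
PREFIXES = {
    'claude': 'anthropic',
    'models/gemini': 'gemini',
    'gpt': 'openai',
}

def infer_vendor(model):
    """Return the vendor for the first matching prefix, or None if nothing matches."""
    for prefix, vendor in PREFIXES.items():
        if model.startswith(prefix):
            return vendor
    return None

guesses = {m: infer_vendor(m) for m in
           ['claude-opus-4-5', 'models/gemini-3-pro-preview', 'gpt-4.1', 'gpt-oss-20b']}
print(guesses)
```

Under this heuristic `gpt-oss-20b` also matches the `gpt` prefix, which shows why a name-based guess may need an explicit override such as the `api_name` and `base_url` arguments in the last example above.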

--------------------------------

### Chat Model Completion Output Example

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/07_chat.ipynb

An example of a chat model's completion output, including reasoning, descriptive text, and usage statistics for an image analysis task.

```text
Result:
Completion(model='accounts/fireworks/models/kimi-k2p6', message=Msg(role='assistant', content=[Part(type='thinking', text="The user wants me to identify what's in the image. Looking at the image, I can see a small puppy with brown and white fur lying on green grass. Next to the puppy are some purple flowers (likely asters or similar small purple flowers). The puppy appears to be a Cavalier King Charles Spaniel puppy, given its distinctive coloring - white face with brown ears and markings, large dark eyes, and that specific puppy-like appearance. It's lying down with its front paws extended on the grass.\n\nI should describe the image clearly and accurately. The image shows:\n- A puppy (Cavalier King Charles Spaniel)\n- Brown and white coloring\n- Lying on green grass\n- Purple flowers nearby (to the left of the puppy)\n- Cute, looking at camera\n\nLet me provide a friendly, descriptive answer.", data=None), Part(type='text', text="This image shows an adorable **Cavalier King Charles Spaniel puppy** lying on green grass. The puppy has the breed's characteristic **brown and white coat**, with large dark eyes, floppy brown ears, and a white blaze down the center of its face. It's resting with its front paws stretched out, looking directly at the camera. \n\nNext to the puppy (on the left side of the image) is a cluster of **small purple flowers**
—likely asters or daisy-like blooms—growing in the grass. The overall scene is very cute and gives off a sweet, summery, outdoor vibe! 🐶🌸", data={'citations': []})]), finish_reason='stop', usage=Usage(prompt_tokens=107, completion_tokens=300, total_tokens=407, cached_tokens=0, cache_creation_tokens=0, reasoning_tokens=0, raw={'prompt_tokens': 107, 'total_tokens': 407, 'completion_tokens': 300, 'prompt_tokens_details': {'cached_tokens': 0}}), tool_calls=[], api_name='openai_chat', vendor_name='fireworks_ai', raw={'id': 'chatcmpl-1cde9440db084c7aaeb92b30e84a41b8', 'object': 'chat.completion', 'created': 1777533873, 'model': 'accounts/fireworks/models/kimi-k2p6', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': "This image shows an adorable **Cavalier King Charles Spaniel puppy** lying on green grass. The puppy has the breed's characteristic **brown and white coat**, with large dark eyes, floppy brown ears, and a white blaze down the center of its face. It's resting with its front paws stretched out, looking directly at the camera. \n\nNext to the puppy (on the left side of the image) is a cluster of **small purple flowers**
—likely asters or daisy-like blooms—growing in the grass. The overall scene is very cute and gives off a sweet, summery, outdoor vibe! 🐶🌸", 'reasoning_content': "The user wants me to identify what's in the image. Looking at the image, I can see a small puppy with brown and white fur lying on green grass. Next to the puppy are some purple flowers (likely asters or similar small purple flowers). The puppy appears to be a Cavalier King Charles Spaniel puppy, given its distinctive coloring - white face with brown ears and markings, large dark eyes, and that specific puppy-like appearance. It's lying down with its front paws extended on the grass.\n\nI should describe the image clearly and accurately. The image shows:\n- A puppy (Cavalier King Charles Spaniel)\n- Brown and white coloring\n- Lying on green grass\n- Purple flowers nearby (to the left of the puppy)\n- Cute, looking at camera\n\nLet me provide a friendly, descriptive answer."}, 'finish_reason': 'stop', 'token_ids': None}], 'usage': {'prompt_tokens': 107, 'total_tokens': 407, 'completion_tokens': 300, 'prompt_tokens_details': {'cached_tokens': 0}}, 'prompt_token_ids': [163587, 2482, 163601, 45702, 1573, 306, 566, 4082, 30, 163602, 4017, 163603, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163605, 163604, 198, 163586, 163588, 69702, 163601, 163606]})
```

--------------------------------

### FastLLM Setup and Helper Functions

Source: https://github.com/answerdotai/fastllm/blob/main/README.md

Imports necessary types and functions from fastllm for chat completions. Includes helper functions for creating user messages and streaming responses, which print thinking processes and text as they arrive.

```python
from fastllm.types import Msg, Part, PartType, Completion
from fastllm.acomplete import acomplete, mk_tool_res_msg
import asyncio, json

# Helpers
def user(text): return Msg(role='user', content=[Part(type=PartType.text, text=text)])

async def stream(msgs, model, **kw):
    """Stream a response, printing text/thinking as it arrives. Returns the final Completion."""
    cnt, max_think = 0, 10
    async for o in await acomplete(msgs, model, stream=True, **kw):
        if not isinstance(o, Completion):
            if o.get('thinking') and cnt < max_think: print('🧠', end='', flush=True)
            if txt := o.get('text'): print(txt, end='', flush=True)
            cnt += 1
    print()
    return o
```
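
The consumption pattern in `stream` (iterate an async stream, act on text chunks, keep the last item as the result) can be tried without any API access. The sketch below uses a fake async generator in place of a real streaming call. `fake_stream` and `consume` are hypothetical names, and the dict shapes are assumptions for illustration.

```python
import asyncio

async def fake_stream():
    """Hypothetical stand-in for a streaming call: yields delta dicts, then a final item."""
    for delta in ({'thinking': 'hmm'}, {'text': 'Hello'}, {'text': '! How can I help you today?'}):
        yield delta
    yield {'final': 'Hello! How can I help you today?'}

async def consume(stream):
    """Collect text chunks as they arrive and return (joined text, last item seen)."""
    chunks, last = [], None
    async for item in stream:
        last = item
        if txt := item.get('text'):
            chunks.append(txt)
    return ''.join(chunks), last

text, last = asyncio.run(consume(fake_stream()))
print(text)
```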

--------------------------------

### Gemini Completion Example

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/06_acomplete.ipynb

Demonstrates a successful completion from the Gemini API, including model name, message content, finish reason, and usage statistics.

```text
Result:
Completion(model='models/gemini-3-flash-preview', message=Msg(role='assistant', content=[Part(type=<PartType.text: 'text'>, text='Hello! How can I help you today?', data={'citations': []})]), finish_reason=<finish_reason.stop: 'stop'>, usage=Usage(prompt_tokens=3, completion_tokens=9, total_tokens=35, cached_tokens=0, cache_creation_tokens=0, reasoning_tokens=23, raw={'promptTokenCount': 3, 'candidatesTokenCount': 9, 'totalTokenCount': 35, 'promptTokensDetails': [{'modality': 'TEXT', 'tokenCount': 3}], 'thoughtsTokenCount': 23, 'serviceTier': 'standard'}), tool_calls=[], api_name='gemini', vendor_name='gemini', raw={'deltas': [Delta(text='Hello! How can I help you today?', thinking='', refusal='', tool_calls=[], citations=[], server_tool_result=None, finish_reason=None, usage=Usage(prompt_tokens=3, completion_tokens=9, total_tokens=35, cached_tokens=0, cache_creation_tokens=0, reasoning_tokens=23, raw={'promptTokenCount': 3, 'candidatesTokenCount': 9, 'totalTokenCount': 35, 'promptTokensDetails': [{'modality': 'TEXT', 'tokenCount': 3}], 'thoughtsTokenCount': 23, 'serviceTier': 'standard'}), raw={'candidates': [{'content': {'parts': [{'text': 'Hello! 
How can I help you today?'}], 'role': 'model'}, 'index': 0}], 'usageMetadata': {'promptTokenCount': 3, 'candidatesTokenCount': 9, 'totalTokenCount': 35, 'promptTokensDetails': [{'modality': 'TEXT', 'tokenCount': 3}], 'thoughtsTokenCount': 23, 'serviceTier': 'standard'}, 'modelVersion': 'gemini-3-flash-preview', 'responseId': 'zgMsau-6BuO2_uMP4erKgQc'}), Delta(text='', thinking='', refusal='', tool_calls=[], citations=[], server_tool_result=None, finish_reason=<finish_reason.stop: 'stop'>, usage=Usage(prompt_tokens=3, completion_tokens=9, total_tokens=35, cached_tokens=0, cache_creation_tokens=0, reasoning_tokens=23, raw={'promptTokenCount': 3, 'candidatesTokenCount': 9, 'totalTokenCount': 35, 'promptTokensDetails': [{'modality': 'TEXT', 'tokenCount': 3}], 'thoughtsTokenCount': 23, 'serviceTier': 'standard'}), raw={'candidates': [{'content': {'parts': [{'text': '', 'thoughtSignature': 'ErkBCrYBAQw51sf6XKqYlAH0hfjkKYIf2UGH2zQzCbpusz4xPgjpm8sfiwdjf3sXmr4Ii0wXe5/JEaUY/gx6M2+GkSZj+D+bV12cYzLNZe0H2Jv27iPgCCH3/gNLkwz6sNcaxM3SdQ8ldXf/7Mj5gVBTedYL9LB8XkQPoF6jayDn5/lpR5iYmPXcd9NCgWRT73OoRbsbKqg1LLJcHclabsBZkCqQU8/PqDZrPPzP8KAaUgCS1sMzQPIqI54='}], 'role': 'model'}, 'finishReason': 'STOP', 'index': 0}], 'usageMetadata': {'promptTokenCount': 3, 'candidatesTokenCount': 9, 'totalTokenCount': 35, 'promptTokensDetails': [{'modality': 'TEXT', 'tokenCount': 3}], 'thoughtsTokenCount': 23, 'serviceTier': 'standard'}, 'modelVersion': 'gemini-3-flash-preview', 'responseId': 'zgMsau-6BuO2_uMP4erKgQc'})]})
```

--------------------------------

### Direct Text Completion Output (Part 2)

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/06_acomplete.ipynb

This example shows the second part of a direct text completion output. It continues the response from the previous snippet.

```text
! How can I help you today?

```

--------------------------------

### Content Generation with Google Search Tool

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/05_gemini.ipynb

This example demonstrates how to generate content by leveraging the Google Search tool. Provide an empty dictionary for the 'googleSearch' tool to enable its use.

```python
resp = await gem_cli.models.generate_content(model=mn, contents=[{"role": "user", "parts": [{"text": "What is the weather in Istanbul today?"}]}], tools=[{"googleSearch": {}}])
comp = mk_completion(resp, mn, api_name, vnd_nm)
comp
```

--------------------------------

### Example Usage of run_fence_tool

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/07_chat.ipynb

Exercises `run_fence_tool` with a Python fence and a Bash fence. The tests check that the result extracted by `_result_re` is `3` for `print(1+2)` and `bash: ls` for the Bash command `ls`.

```python
out = await run_fence_tool('py', 'print(1+2)', _ns)
test_eq(_result_re.search(out).group(1), '3')

out = await run_fence_tool('bash', 'ls', _ns)
test_eq(_result_re.search(out).group(1), 'bash: ls')
```

--------------------------------

### Anthropic API Response Example

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/04_anthropic.ipynb

An example of a structured response from the Anthropic API, including message content, usage statistics, and raw API details. This shows the output after processing a streaming request.

```text
Result:
Completion(model=None, message=Msg(role='assistant', content=[Part(type=<PartType.text: 'text'>, text='I can see a very small red square or rectangle in the image. The image appears to be mostly white/transparent with just this small red geometric shape visible in what looks like the upper left area. The red element is quite small and appears to be a simple solid color shape.', data={'citations': []})]), finish_reason=<finish_reason.stop: 'stop'>, usage=Usage(prompt_tokens=77, completion_tokens=59, total_tokens=136, cached_tokens=0, cache_creation_tokens=0, reasoning_tokens=0, raw={'input_tokens': 77, 'cache_creation_input_tokens': 0, 'cache_read_input_tokens': 0, 'output_tokens': 59}), tool_calls=[], api_name='anthropic', vendor_name=None, raw={'deltas': [Delta(text=None, thinking=None, refusal='', tool_calls=[], citations=None, server_tool_result=None, finish_reason=None, usage=None, raw={'type': 'message_start', 'message': {'model': 'claude-sonnet-4-20250514', 'id': 'msg_01Jj7S7fWhFgBBXoFS6ACd6F', 'type': 'message', 'role': 'assistant', 'content': [], 'stop_reason': None, 'stop_sequence': None, 'stop_details': None, 'usage': {'input_tokens': 77, 'cache_creation_input_tokens': 0, 'cache_read_input_tokens': 0, 'cache_creation': {'ephemeral_5m_input_tokens': 0, 'ephemeral_1h_input_tokens': 0}, 'output_tokens': 2, 'service_tier': 'standard', 'inference_geo': 'not_available'}}}), Delta(text=None, thinking=None, refusal='', tool_calls=[], citations=None, server_tool_result=None, finish_reason=None, usage=None, raw={'type': 'content_block_start', 'index': 0, 'content_block': {'type': 'text', 'text': ''}}), Delta(text=None, thinking=None, refusal='', tool_calls=[], citations=None, server_tool_result=None, finish_reason=None, usage=None, raw={'type': 'ping'}), Delta(text='I can', thinking=None, refusal='', tool_calls=[], citations=None, server_tool_result=None, finish_reason=None, usage=None, raw={'type': 'content_block_delta', 'index': 0, 'delta': {'type': 'text_delta', 
'text': 'I can'}}), Delta(text=' see a very small red square or rectangle in the image. The image', thinking=None, refusal='', tool_calls=[], citations=None, server_tool_result=None, finish_reason=None, usage=None, raw={'type': 'content_block_delta', 'index': 0, 'delta': {'type': 'text_delta', 'text': ' see a very small red square or rectangle in the image. The image'}}), Delta(text=' appears to be mostly white/transparent with just this small', thinking=None, refusal='', tool_calls=[], citations=None, server_tool_result=None, finish_reason=None, usage=None, raw={'type': 'content_block_delta', 'index': 0, 'delta': {'type': 'text_delta', 'text': ' appears to be mostly white/transparent with just this small'}}), Delta(text=' red geometric shape visible in what looks like the upper left area. The red element is', thinking=None, refusal='', tool_calls=[], citations=None, server_tool_result=None, finish_reason=None, usage=None, raw={'type': 'content_block_delta', 'index': 0, 'delta': {'type': 'text_delta', 'text': ' red geometric shape visible in what looks like the upper left area. 
The red element is'}}), Delta(text=' quite small and appears to be a simple solid color shape.', thinking=None, refusal='', tool_calls=[], citations=None, server_tool_result=None, finish_reason=None, usage=None, raw={'type': 'content_block_delta', 'index': 0, 'delta': {'type': 'text_delta', 'text': ' quite small and appears to be a simple solid color shape.'}}), Delta(text=None, thinking=None, refusal='', tool_calls=[], citations=None, server_tool_result=None, finish_reason=None, usage=None, raw={'type': 'content_block_stop', 'index': 0}), Delta(text='', thinking='', refusal='', tool_calls=[], citations=[], server_tool_result=None, finish_reason=<finish_reason.stop: 'stop'>, usage=Usage(prompt_tokens=77, completion_tokens=59, total_tokens=136, cached_tokens=0, cache_creation_tokens=0, reasoning_tokens=0, raw={'input_tokens': 77, 'cache_creation_input_tokens': 0, 'cache_read_input_tokens': 0, 'output_tokens': 59}), raw={'type': 'message_delta', 'delta': {'stop_reason': 'end_turn', 'stop_sequence': None, 'stop_details': None}, 'usage': {'input_tokens': 77, 'cache_creation_input_tokens': 0, 'cache_read_input_tokens': 0, 'output_tokens': 59}})]})
```
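The dump above is a list of `Delta` chunks: text arrives in fragments, and the final chunk carries `finish_reason` and `usage`. The following is a minimal sketch of reassembling a reply from such a stream. The `Delta` dataclass and the `collect` helper are hypothetical stand-ins that mirror only three of the field names visible in the output; they are not fastllm's actual classes.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

@dataclass
class Delta:
    """Hypothetical stand-in: mirrors a few field names seen in the dump."""
    text: Optional[str] = None
    finish_reason: Optional[str] = None
    usage: Optional[dict] = None

def collect(deltas: Iterable[Delta]) -> Tuple[str, Optional[str], Optional[dict]]:
    """Join text fragments and keep the last finish_reason and usage seen."""
    parts, finish, usage = [], None, None
    for d in deltas:
        if d.text:                      # skips both None and '' chunks
            parts.append(d.text)
        if d.finish_reason is not None:
            finish = d.finish_reason
        if d.usage is not None:
            usage = d.usage
    return ''.join(parts), finish, usage

# Illustrative stream, shortened from the output above.
stream = [
    Delta(text='I can'),
    Delta(text=' see a very small red square.'),
    Delta(text=None),  # e.g. a content_block_stop event with no text
    Delta(text='', finish_reason='stop', usage={'total_tokens': 136}),
]
text, finish, usage = collect(stream)
print(text)
```

Chunks with `text=None` or an empty string are skipped, so control events such as `content_block_stop` do not affect the joined text.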

--------------------------------

### Initiate Chat with Web Search Options

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/07_chat.ipynb

Starts a chat turn through the shortcut `c`, passing a model (`gpt54m`), a tool list, and `web_search_options`. The options are passed as an empty dict here, so no search settings are overridden. This is useful when a conversation needs real-time information.

```python
await c(smsg, m=gpt54m, tools=[toolsc], web_search_options={})
```

--------------------------------

### Unified Chat Interface Example

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/07_chat.ipynb

Demonstrates the unified chat interface by calling `acomplete` with the same user message against several models (Gemini, Claude, and GPT). Only the model name changes between calls, which shows how easily providers can be swapped.

```python
ms = ["models/gemini-3.1-pro-preview", "models/gemini-3-flash-preview", "claude-sonnet-4-6", "gpt-4.1"]
msgs = [Msg(role='user', content=[Part(type=PartType.text, text='Hi there!', data={"cache_control": {"type": "ephemeral"}})])]
for m in ms:
    display(Markdown(f'**{m}:**'))
    display(await acomplete(msgs, m))
```

--------------------------------

### Cache Test Setup with Long Text and Summarization

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/06_acomplete.ipynb

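Sets up a prompt-caching test with `acomplete`: a long text part marked with `cache_control` plus a summarization request, followed by a second call that repeats the same prefix and adds a follow-up question. Per the inline comments, the first call writes the cache and the second reads it.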

```python
cc = {"cache_control": {"type": "ephemeral"}}
big_text = 'The quick brown fox jumps over the lazy dog. ' * 200
msg1 = Msg('user', content=[Part('text', big_text, data=cc), Part('text', 'Summarize')])
comp1 = await acomplete([msg1], model='claude-sonnet-4-20250514', max_tokens=64)  # writes cache
msg3 = Msg('user', content=[Part('text', 'Now in French')])
comp2 = await acomplete([msg1, comp1.message, msg3], model='claude-sonnet-4-20250514', max_tokens=64)  # reads cache
```
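One way to check that the two calls behaved as the comments claim is to inspect the cache counters on each completion's usage. The sketch below is an assumption-laden illustration: `Usage` is a hypothetical stand-in that copies the field names `cached_tokens` and `cache_creation_tokens` from fastllm's usage output, `cache_state` is not a fastllm function, and the token counts are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Usage:
    """Hypothetical stand-in; field names copied from fastllm's usage output."""
    prompt_tokens: int = 0
    cached_tokens: int = 0
    cache_creation_tokens: int = 0

def cache_state(u: Usage) -> str:
    """Classify a call as a cache 'read', 'write', or 'miss'.

    A call that both reads and extends the cache is reported as 'read'.
    """
    if u.cached_tokens > 0:
        return 'read'
    if u.cache_creation_tokens > 0:
        return 'write'
    return 'miss'

# Illustrative numbers only, not measured values.
first = Usage(prompt_tokens=2010, cache_creation_tokens=2000)
second = Usage(prompt_tokens=2080, cached_tokens=2000)
print(cache_state(first), cache_state(second))
```

If the real objects expose the same fields, the same check could be applied to `comp1.usage` and `comp2.usage`; that attribute path is an assumption here.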

--------------------------------

### Audio and Text Input Completion (Pro Model)

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/06_acomplete.ipynb

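Compares `acomplete` with LiteLLM's `completion` on combined audio and text input using the 'pro' model. `acomplete` receives `audio_b64` as is, while the LiteLLM call passes only the part after the first comma (`audio_b64.split(',', 1)[1]`) with `"format": "wav"`. The last line returns both costs side by side; the `test_close` assertion on them is commented out.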

```python
msg = Msg('user', content=[Part(PartType.input_audio, audio_b64), Part('text', 'What is this audio saying?')])
comp = await acomplete([msg], model=pro_mn, temperature=0.0)
litecomp = litellm.completion(model=lpro_mn, messages=[{"role":"user","content":[{"type":"input_audio","input_audio":{"data":audio_b64.split(',', 1)[1],"format":"wav"}},{"type":"text","text":"What is this audio saying?"}]}], temperature=0.0)
# test_close(litellm.completion_cost(completion_response=litecomp), comp.cost, 1e-3)
litellm.completion_cost(completion_response=litecomp), comp.cost
```
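The `split(',', 1)[1]` call suggests that `audio_b64` is a data URL of the form `data:audio/wav;base64,<payload>`, and that LiteLLM wants only the payload. That is an inference from the snippet rather than something stated in it. Under that assumption, a small helper (hypothetical, not part of fastllm or LiteLLM) makes the split explicit and fails loudly on unexpected input:

```python
import base64
from typing import Tuple

def split_data_url(data_url: str) -> Tuple[str, str]:
    """Split 'data:<mime>;base64,<payload>' into (header, payload)."""
    header, sep, payload = data_url.partition(',')
    if not sep or not header.startswith('data:'):
        raise ValueError('expected a data URL such as "data:audio/wav;base64,..."')
    return header, payload

# Illustrative bytes standing in for real WAV data.
raw = b'RIFF....WAVEfmt '
audio_b64 = 'data:audio/wav;base64,' + base64.b64encode(raw).decode()

header, payload = split_data_url(audio_b64)
print(header)
```

Using `partition` instead of indexing the result of `split` avoids an `IndexError` when the string has no comma and lets the helper raise a clearer error instead.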

--------------------------------

### Get Model Name

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/07_chat.ipynb

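Indexes the `ms` list of model names to get its third entry. If `ms` is the list defined in the Unified Chat Interface example, this evaluates to `'claude-sonnet-4-6'`.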

```python
ms[2]
```

--------------------------------

### Example: Usage with URL Context Tool

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/05_gemini.ipynb

Demonstrates a Gemini API call using the URL context tool and normalizes the usage metadata, which includes significant token counts for tool use.

```python
resp = await gem_cli.models.generate_content(model=mn, contents=[{"role": "user", "parts": [{"text": "What is solveit? https://solve.it.com/"}]}], tools=[{"urlContext": {}}])
norm_usage(resp)
```
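To illustrate what normalizing usage can involve, the sketch below maps a Gemini-style `usageMetadata` dict onto flat token counts. It is not fastllm's `norm_usage`: the function name `normalize_gemini_usage`, the output keys, and the choice to fold tool-use prompt tokens into `prompt_tokens` and thinking tokens into `completion_tokens` are all assumptions made for this example. The input keys follow the camelCase field names of the Gemini REST API, and the numbers are invented.

```python
from typing import Dict, Optional

def normalize_gemini_usage(meta: Optional[Dict[str, int]]) -> Dict[str, int]:
    """Hypothetical normalizer for a Gemini-style usageMetadata dict."""
    meta = meta or {}
    get = lambda key: meta.get(key) or 0        # treat missing or None as 0
    tool = get('toolUsePromptTokenCount')       # tokens fetched by tools such as URL context
    reasoning = get('thoughtsTokenCount')
    prompt = get('promptTokenCount') + tool     # assumption: count tool input as prompt
    completion = get('candidatesTokenCount') + reasoning
    return {
        'prompt_tokens': prompt,
        'completion_tokens': completion,
        'total_tokens': get('totalTokenCount') or prompt + completion,
        'cached_tokens': get('cachedContentTokenCount'),
        'reasoning_tokens': reasoning,
        'tool_use_tokens': tool,
    }

# Illustrative values: a short question whose URL fetch dominates the count.
sample = {
    'promptTokenCount': 12,
    'candidatesTokenCount': 40,
    'toolUsePromptTokenCount': 3000,
    'thoughtsTokenCount': 100,
    'totalTokenCount': 3152,
}
usage = normalize_gemini_usage(sample)
print(usage)
```

In this invented sample the tool-use tokens are far larger than the prompt itself, which is the kind of imbalance the description above refers to.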

--------------------------------

### Retrieve Model Information

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/00_types.ipynb

Retrieves information for a specific model and vendor. This is a commented-out example.

```python
# get_model_info('kimi-k2.7-code', 'moonshot')
```

--------------------------------

### Clear Chat History

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/07_chat.ipynb

Resets the conversation history so that the next message starts a new conversation.

```python
chat.clear_history()
print("Chat history cleared.")
```

--------------------------------

### Search Tool Call Example

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/05_gemini.ipynb

Demonstrates using the Gemini API with a Google Search tool. It shows how to call the API for a search query and then normalize the tool calls, which in this case returns an empty list.

```python
resp = await gem_cli.models.generate_content(model=mn, contents=[{"role": "user", "parts": [{"text": "What is the weather in Istanbul today?"}]}], tools=[{"googleSearch": {}}])
norm_tool_calls(resp)
```

--------------------------------

### Initiate Chat with a Message

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/07_chat.ipynb

Initiates a chat conversation with a single message using the shortcut function `c`.

```python
await c(msg)
```

--------------------------------

### Model Output: Claude-Sonnet-4-6 to Claude-Sonnet-4-6

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/06_acomplete.ipynb

Example streamed output for the `claude-sonnet-4-6 -> claude-sonnet-4-6` pairing: a weather report for Brisbane. The output is truncated.

```text
Output:
  claude-sonnet-4-6 -> claude-sonnet-4-6: Here
is the current weather in **Brisbane, Queensland, Australia** for today
, **Friday, June 12, 202…
```

--------------------------------

### Streaming Output - How

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/06_acomplete.ipynb

Demonstrates the streaming output of the word 'How'.

```text
Output:
 How
```

--------------------------------

### Model Output: Claude-Sonnet-4-6 to GPT-4o-search-preview

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/06_acomplete.ipynb

Example streamed output for the `claude-sonnet-4-6 -> gpt-4o-search-preview` pairing, detailing the weather in Brisbane. The text arrives in small chunks, each printed on its own line, and the output is truncated.

```text
Output:
  claude-sonnet-4-6 -> gpt-4o-search-preview: Here
is
the
current
weather
in
**
Br
isbane
,
Australia
**
for
today
,
**
Friday
,
June
12
,
202
6
**
:

## Wea…
```

--------------------------------

### Model Output: GPT-4o-search-preview to Gemini-3-flash-preview

Source: https://github.com/answerdotai/fastllm/blob/main/nbs/06_acomplete.ipynb

Example streamed output for the `gpt-4o-search-preview -> models/gemini-3-flash-preview` pairing, describing the weather in Brisbane. The output is truncated.

```text
Output:
  gpt-4o-search-preview -> models/gemini-3-flash-preview: As of 11:0
0 PM local time on Friday, June 12, 2026, in Brisbane, Australia, the weather is clear
and…
```

--------------------------------

### Using System Prompts with Different Providers

Source: https://github.com/answerdotai/fastllm/blob/main/README.md

Demonstrates passing the same system prompt to a Claude model and a Gemini model through FastLLM's `stream`. The variable `mtok` must already be defined; it is passed as `max_tokens`.

```python
sys = "You are a pirate chef. Always respond in pirate speak and mention food."

print("Claude: ", end='')
r = await stream([user("What should I do today?")], model='claude-sonnet-4-20250514', system=sys, max_tokens=mtok)

print("Gemini: ", end='')
r = await stream([user("What should I do today?")], model='models/gemini-3-flash-preview', system=sys, max_tokens=mtok)
```