### Install Google GenAI SDK for Go

Source: https://ai.google.dev/gemini-api/docs/libraries

Use `go get` to install the official Google GenAI SDK for Go, enabling access to the Gemini API.

```Go
go get google.golang.org/genai
```

--------------------------------

### Describe Audio Content using Gemini Interactions API

Source: https://ai.google.dev/gemini-api/docs/generate-content/audio

This example uploads an audio file and then creates an interaction that returns a text description of its content. Replace `path/to/sample.mp3` with your actual audio file path and, in the REST example, replace `YOUR_FILE_URI` with the URI of the uploaded file.

```python
from google import genai

client = genai.Client()

# Upload the audio file first; the returned object carries its URI and MIME type.
uploaded_file = client.files.upload(file="path/to/sample.mp3")

interaction = client.interactions.create(
    model="gemini-3.5-flash",
    input=[
        {"type": "text", "text": "Describe this audio clip"},
        {
            "type": "audio",
            "uri": uploaded_file.uri,
            "mime_type": uploaded_file.mime_type
        }
    ]
)
print(interaction.output_text)
```

```javascript
import { GoogleGenAI } from "@google/genai";

const client = new GoogleGenAI({});

// Upload the audio file first; the returned object carries its URI and MIME type.
const uploadedFile = await client.files.upload({
  file: "path/to/sample.mp3",
  config: { mimeType: "audio/mp3" }
});

const interaction = await client.interactions.create({
  model: "gemini-3.5-flash",
  input: [
    {type: "text", text: "Describe this audio clip"},
    {
      type: "audio",
      uri: uploadedFile.uri,
      mime_type: uploadedFile.mimeType
    }
  ]
});
console.log(interaction.output_text);
```

```bash
# First upload the file, then use the URI:
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gemini-3.5-flash",
    "input": [
      {"type": "text", "text": "Describe this audio clip"},
      {
        "type": "audio",
        "uri": "YOUR_FILE_URI",
        "mime_type": "audio/mp3"
      }
    ]
  }'
```

--------------------------------

### Install gemini-live-api-dev Skill with Context7

Source:
https://ai.google.dev/gemini-api/docs/coding-agents

Install the gemini-live-api-dev skill using Context7 for building real-time conversational AI applications.

```bash
npx ctx7 skills install /google-gemini/gemini-skills gemini-live-api-dev
```

--------------------------------

### Install gemini-api-dev Skill with Context7

Source: https://ai.google.dev/gemini-api/docs/coding-agents

Install the gemini-api-dev skill using Context7 for users already within the Context7 ecosystem.

```bash
npx ctx7 skills install /google-gemini/gemini-skills gemini-api-dev
```

--------------------------------

### Install gemini-interactions-api Skill with skills.sh

Source: https://ai.google.dev/gemini-api/docs/coding-agents

Install the gemini-interactions-api skill for building agentic applications with the Interactions API using skills.sh.

```bash
npx skills add google-gemini/gemini-skills --skill gemini-interactions-api --global
```

--------------------------------

### Generate Video with Reference Images (Node.js and Go)

Source: https://ai.google.dev/gemini-api/docs/veo

Use reference images to guide the Veo video generation process. This example shows how to configure image assets, initiate video generation, poll for the operation's completion, and download the resulting video file.

```Node.js
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({});

const prompt = "The video opens with a medium, eye-level shot of a beautiful woman with dark hair and warm brown eyes. She wears a magnificent, high-fashion flamingo dress with layers of pink and fuchsia feathers, complemented by whimsical pink, heart-shaped sunglasses. She walks with serene confidence through the crystal-clear, shallow turquoise water of a sun-drenched lagoon. The camera slowly pulls back to a medium-wide shot, revealing the breathtaking scene as the dress's long train glides and floats gracefully on the water's surface behind her. The cinematic, dreamlike atmosphere is enhanced by the vibrant colors of the dress against the serene, minimalist landscape, capturing a moment of pure elegance and high-fashion fantasy.";

// dressImage, glassesImage, womanImage generated separately with Nano Banana
// and available as objects like { imageBytes: "...", mimeType: "image/png" }
const dressReference = {
  image: dressImage,
  referenceType: "asset",
};
const sunglassesReference = {
  image: glassesImage,
  referenceType: "asset",
};
const womanReference = {
  image: womanImage,
  referenceType: "asset",
};

let operation = await ai.models.generateVideos({
  model: "veo-3.1-generate-preview",
  prompt: prompt,
  config: {
    referenceImages: [
      dressReference,
      sunglassesReference,
      womanReference,
    ],
  },
});

// Poll the operation status until the video is ready.
while (!operation.done) {
  console.log("Waiting for video generation to complete...");
  await new Promise((resolve) => setTimeout(resolve, 10000));
  operation = await ai.operations.getVideosOperation({
    operation: operation,
  });
}

// Download the video, waiting for the download to finish before reporting success.
await ai.files.download({
  file: operation.response.generatedVideos[0].video,
  downloadPath: "veo3.1_with_reference_images.mp4",
});
console.log(`Generated video saved to veo3.1_with_reference_images.mp4`);
```

```Go
package main

import (
	"context"
	"log"
	"os"
	"time"

	"google.golang.org/genai"
)

func main() {
	ctx := context.Background()
	client, err := genai.NewClient(ctx, nil)
	if err != nil {
		log.Fatal(err)
	}

	prompt := `The video opens with a medium, eye-level shot of a beautiful woman with dark hair and warm brown eyes. She wears a magnificent, high-fashion flamingo dress with layers of pink and fuchsia feathers, complemented by whimsical pink, heart-shaped sunglasses. She walks with serene confidence through the crystal-clear, shallow turquoise water of a sun-drenched lagoon. The camera slowly pulls back to a medium-wide shot, revealing the breathtaking scene as the dress's long train glides and floats gracefully on the water's surface behind her. The cinematic, dreamlike atmosphere is enhanced by the vibrant colors of the dress against the serene, minimalist landscape, capturing a moment of pure elegance and high-fashion fantasy.`

	// dressImage, glassesImage, womanImage generated separately with Nano Banana
	// and available as *genai.Image objects.
	var dressImage, glassesImage, womanImage *genai.Image

	dressReference := &genai.VideoGenerationReferenceImage{
		Image:         dressImage,
		ReferenceType: "asset",
	}
	sunglassesReference := &genai.VideoGenerationReferenceImage{
		Image:         glassesImage,
		ReferenceType: "asset",
	}
	womanReference := &genai.VideoGenerationReferenceImage{
		Image:         womanImage,
		ReferenceType: "asset",
	}

	operation, err := client.Models.GenerateVideos(
		ctx,
		"veo-3.1-generate-preview",
		prompt,
		nil, // image
		&genai.GenerateVideosConfig{
			ReferenceImages: []*genai.VideoGenerationReferenceImage{
				dressReference,
				sunglassesReference,
				womanReference,
			},
		},
	)
	if err != nil {
		log.Fatal(err)
	}

	// Poll the operation status until the video is ready.
	for !operation.Done {
		log.Println("Waiting for video generation to complete...")
		time.Sleep(10 * time.Second)
		operation, err = client.Operations.GetVideosOperation(ctx, operation, nil)
		if err != nil {
			log.Fatal(err)
		}
	}

	// Download the video.
	video := operation.Response.GeneratedVideos[0]
	client.Files.Download(ctx, video.Video, nil)
	fname := "veo3.1_with_reference_images.mp4"
	if err := os.WriteFile(fname, video.Video.VideoBytes, 0644); err != nil {
		log.Fatal(err)
	}
	log.Printf("Generated video saved to %s\n", fname)
}
```

--------------------------------

### Install gemini-interactions-api Skill with Context7

Source: https://ai.google.dev/gemini-api/docs/coding-agents

Install the gemini-interactions-api skill using Context7 for building agentic applications with the Interactions API.
```bash
npx ctx7 skills install /google-gemini/gemini-skills gemini-interactions-api
```

--------------------------------

### Install gemini-live-api-dev Skill with skills.sh

Source: https://ai.google.dev/gemini-api/docs/coding-agents

Install the gemini-live-api-dev skill for building real-time conversational AI applications with Gemini Live API using skills.sh.

```bash
npx skills add google-gemini/gemini-skills --skill gemini-live-api-dev --global
```

--------------------------------

### Scene Examples

Source: https://ai.google.dev/gemini-api/docs/speech-generation

These examples set the stage for the speech generation, describing the physical environment, mood, and environmental details to establish tone and guide the acting performance.

```Prompt
## THE SCENE: The London Studio
It is 10:00 PM in a glass-walled studio overlooking the moonlit London skyline, but inside, it is blindingly bright. The red "ON AIR" tally light is blazing. Jaz is standing up, not sitting, bouncing on the balls of their heels to the rhythm of a thumping backing track. Their hands fly across the faders on a massive mixing desk. It is a chaotic, caffeine-fueled cockpit designed to wake up an entire nation.
```

```Prompt
## THE SCENE: Homegrown Studio
A meticulously sound-treated bedroom in a suburban home. The space is deadened by plush velvet curtains and a heavy rug, but there is a distinct "proximity effect."
```

--------------------------------

### Initialize a new Node.js project

Source: https://ai.google.dev/gemini-api/docs/vercel-ai-sdk-example

Create a new directory and initialize a Node.js project using npm, pnpm, or yarn.
```npm
mkdir market-trend-app
cd market-trend-app
npm init -y
```

```pnpm
mkdir market-trend-app
cd market-trend-app
pnpm init
```

```yarn
mkdir market-trend-app
cd market-trend-app
yarn init -y
```

--------------------------------

### Partial Input Completion with Example and Response Prefix

Source: https://ai.google.dev/gemini-api/docs/prompting-strategies

Use an example and a response prefix to guide the model in generating a concise JSON output, omitting unmentioned items.

````text
Valid fields are cheeseburger, hamburger, fries, and drink.
Order: Give me a cheeseburger and fries
Output:
```
{
  "cheeseburger": 1,
  "fries": 1
}
```
Order: I want two burgers, a drink, and fries.
Output:
````

```json
{
  "hamburger": 2,
  "drink": 1,
  "fries": 1
}
```

--------------------------------

### Guide Gemini Output with Few-Shot Examples

Source: https://ai.google.dev/gemini-api/docs/generate-content/files

Use few-shot examples in the prompt to steer the model towards a desired output format and content, such as extracting only specific information (e.g., city, not country).

```text
city: Rome, landmark: the Colosseum.
```

```text
city: Beijing, landmark: Forbidden City
```

```text
city: Rio de Janeiro, landmark: Christ the Redeemer statue
```

--------------------------------

### Few-shot Prompt for Concise Explanation Selection

Source: https://ai.google.dev/gemini-api/docs/prompting-strategies

This prompt uses two examples to guide the model to prefer concise explanations, demonstrating how few-shot examples can influence the style and length of the model's response.

```text
Below are some examples showing a question, explanation, and answer format:

Question: Why is the sky blue?
Explanation1: The sky appears blue because of Rayleigh scattering, which causes shorter blue wavelengths of light to be scattered more easily than longer red wavelengths, making the sky look blue.
Explanation2: Due to Rayleigh scattering effect.
Answer: Explanation2

Question: What is the cause of earthquakes?
Explanation1: Sudden release of energy in the Earth's crust.
Explanation2: Earthquakes happen when tectonic plates suddenly slip or break apart, causing a release of energy that creates seismic waves that can shake the ground and cause damage.
Answer: Explanation1

Now, Answer the following question given the example formats above:

Question: How is snow formed?
Explanation1: Snow is formed when water vapor in the air freezes into ice crystals in the atmosphere, which can combine and grow into snowflakes as they fall through the atmosphere and accumulate on the ground.
Explanation2: Water vapor freezes into ice crystals forming snow.
Answer:
```

--------------------------------

### Start Temporal Development Server

Source: https://ai.google.dev/gemini-api/docs/temporal-example

Starts a local Temporal development server for testing and development.

```bash
temporal server start-dev
```

--------------------------------

### Install Old Gemini API Go SDK

Source: https://ai.google.dev/gemini-api/docs/migrate

Use this command to install the previous version of the Gemini API Go SDK.

```Go
go get github.com/google/generative-ai-go
```

--------------------------------

### List, Get, and Delete File Search Documents

Source: https://ai.google.dev/gemini-api/docs/file-search

Use these examples to perform common management operations on individual documents within a File Search store, including retrieving a list of documents, getting details for a specific document, and deleting a document.
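Before the SDK and REST snippets, a minimal sketch of how the resource names and REST paths in those snippets fit together. The helper names (`store_name`, `document_name`, `rest_url`) are hypothetical and not part of any SDK; only the `fileSearchStores/.../documents/...` pattern, the `v1beta` base URL, and the `force=true` query parameter are taken from the examples in this entry. Authentication is deliberately left out.

```python
# Hypothetical helpers: they only assemble strings and make no network calls.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def store_name(store_id: str) -> str:
    """Resource name of a File Search store, used as `parent` when listing."""
    return f"fileSearchStores/{store_id}"


def document_name(store_id: str, document_id: str) -> str:
    """Resource name of one document, used as `name` for get and delete."""
    return f"{store_name(store_id)}/documents/{document_id}"


def rest_url(resource: str, force: bool = False) -> str:
    """REST URL for a resource path; `force` mirrors the delete example."""
    url = f"{BASE_URL}/{resource}"
    if force:
        url += "?force=true"
    return url


list_url = rest_url(f"{store_name('my-file_search-store-123')}/documents")
doc = document_name("my-file_search-store-123", "my_doc")
delete_url = rest_url(doc, force=True)

print(list_url)
print(doc)
print(delete_url)
```

The same `doc` string is what the Python and JavaScript calls pass as `name`, and the store name is what they pass as `parent`.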
```Python
for document_in_store in client.file_search_stores.documents.list(parent='fileSearchStores/my-file_search-store-123'):
    print(document_in_store)

file_search_document = client.file_search_stores.documents.get(name='fileSearchStores/my-file_search-store-123/documents/my_doc')
print(file_search_document)

client.file_search_stores.documents.delete(name='fileSearchStores/my-file_search-store-123/documents/my_doc', config={'force': True})
```

```JavaScript
const documents = await ai.fileSearchStores.documents.list({
  parent: 'fileSearchStores/my-file_search-store-123'
});
for await (const doc of documents) {
  console.log(doc);
}

const fileSearchDocument = await ai.fileSearchStores.documents.get({
  name: 'fileSearchStores/my-file_search-store-123/documents/my_doc'
});

await ai.fileSearchStores.documents.delete({
  name: 'fileSearchStores/my-file_search-store-123/documents/my_doc'
});
```

```REST
curl "https://generativelanguage.googleapis.com/v1beta/fileSearchStores/my-file_search-store-123/documents?key=${GEMINI_API_KEY}"

curl "https://generativelanguage.googleapis.com/v1beta/fileSearchStores/my-file_search-store-123/documents/my_doc?key=${GEMINI_API_KEY}"

curl -X DELETE "https://generativelanguage.googleapis.com/v1beta/fileSearchStores/my-file_search-store-123/documents/my_doc?key=${GEMINI_API_KEY}&force=true"
```

--------------------------------

### Establish Live API Connection

Source: https://ai.google.dev/gemini-api/docs/live-api/capabilities

Demonstrates how to establish a real-time connection with the Live API, configuring the session for audio responses.

```Python
import asyncio
from google import genai

client = genai.Client()

model = "gemini-3.1-flash-live-preview"
config = {"response_modalities": ["AUDIO"]}

async def main():
    async with client.aio.live.connect(model=model, config=config) as session:
        print("Session started")
        # Send content...

if __name__ == "__main__":
    asyncio.run(main())
```

```JavaScript
import { GoogleGenAI, Modality } from '@google/genai';

const ai = new GoogleGenAI({});
const model = 'gemini-3.1-flash-live-preview';
const config = { responseModalities: [Modality.AUDIO] };

async function main() {
  const session = await ai.live.connect({
    model: model,
    callbacks: {
      onopen: function () {
        console.debug('Opened');
      },
      onmessage: function (message) {
        console.debug(message);
      },
      onerror: function (e) {
        console.debug('Error:', e.message);
      },
      onclose: function (e) {
        console.debug('Close:', e.reason);
      },
    },
    config: config,
  });

  console.debug("Session started");
  // Send content...

  session.close();
}

main();
```

--------------------------------

### Get File Metadata in Python

Source: https://ai.google.dev/gemini-api/docs/files

Retrieve metadata for an uploaded file using its name. This example first uploads a file and then fetches its details.

```python
from google import genai

client = genai.Client()

myfile = client.files.upload(file='path/to/sample.mp3')
file_name = myfile.name
myfile = client.files.get(name=file_name)
print(myfile)
```

--------------------------------

### Configure Agent Environment with `environment` Parameter

Source: https://ai.google.dev/gemini-api/docs/agent-environment

Demonstrates the three ways to use the `environment` parameter: provisioning a fresh remote sandbox, reusing an existing sandbox by ID, and creating a new sandbox with specified sources like a Git repository.
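As a rough illustration, the three shapes that the `environment` value takes can be produced by one small function. This is a sketch under stated assumptions: `build_environment` and its arguments are hypothetical names, not part of the SDK, and `env-123` is a placeholder ID. Only the three value shapes (`"remote"`, an environment ID string, and a `{"type": "remote", "sources": [...]}` object) come from the official examples that follow.

```python
from typing import Optional, Union


def build_environment(
    environment_id: Optional[str] = None,
    sources: Optional[list] = None,
) -> Union[str, dict]:
    """Return a value for the `environment` parameter (hypothetical helper)."""
    if environment_id and sources:
        # A restriction of this helper only, to keep the three cases distinct.
        raise ValueError("Pass environment_id or sources, not both.")
    if environment_id:
        # Reuse an existing sandbox: pass its ID as a plain string.
        return environment_id
    if sources:
        # New sandbox pre-populated with sources such as a Git repository.
        return {"type": "remote", "sources": sources}
    # Fresh, empty remote sandbox.
    return "remote"


fresh = build_environment()
reused = build_environment(environment_id="env-123")  # placeholder ID
with_repo = build_environment(
    sources=[
        {
            "type": "repository",
            "source": "https://github.com/octocat/Spoon-Knife",
            "target": "/workspace/spoon-knife",
        }
    ]
)

print(fresh)
print(reused)
print(with_repo)
```

Each returned value would be passed as the `environment` argument of an interaction request, as the examples that follow do directly.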
```Python
from google import genai

client = genai.Client()

# Fresh sandbox
interaction = client.interactions.create(
    agent="antigravity-preview-05-2026",
    input="Write a hello world script.",
    environment="remote",
)

# Reuse an existing sandbox
interaction_2 = client.interactions.create(
    agent="antigravity-preview-05-2026",
    input="Modify the script to accept a name argument.",
    environment=interaction.environment_id,
    previous_interaction_id=interaction.id,
)

# New sandbox with sources
interaction_3 = client.interactions.create(
    agent="antigravity-preview-05-2026",
    input="List all files and summarize the project.",
    environment={
        "type": "remote",
        "sources": [
            {
                "type": "repository",
                "source": "https://github.com/octocat/Spoon-Knife",
                "target": "/workspace/spoon-knife",
            }
        ],
    },
)
print(interaction.output_text)
```

```JavaScript
import { GoogleGenAI } from "@google/genai";

const client = new GoogleGenAI({});

// Fresh sandbox
const interaction = await client.interactions.create({
  agent: "antigravity-preview-05-2026",
  input: "Write a hello world script.",
  environment: "remote",
});

// Reuse an existing sandbox
const interaction2 = await client.interactions.create({
  agent: "antigravity-preview-05-2026",
  input: "Modify the script to accept a name argument.",
  environment: interaction.environment_id,
  previous_interaction_id: interaction.id,
});

// New sandbox with sources
const interaction3 = await client.interactions.create({
  agent: "antigravity-preview-05-2026",
  input: "List all files and summarize the project.",
  environment: {
    type: "remote",
    sources: [
      {
        type: "repository",
        source: "https://github.com/octocat/Spoon-Knife",
        target: "/workspace/spoon-knife",
      },
    ],
  },
});
console.log(interaction.output_text);
```

```REST
# Fresh sandbox
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -d '{
    "agent": "antigravity-preview-05-2026",
    "input": [{"type": "text", "text": "Write a hello world script."}],
    "environment": "remote"
  }'

# Reuse an existing sandbox (replace $ENV_ID and $INTERACTION_ID with values from the previous response)
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -d "{
    \"agent\": \"antigravity-preview-05-2026\",
    \"input\": [{\"type\": \"text\", \"text\": \"Modify the script to accept a name argument.\"}],
    \"environment\": \"$ENV_ID\",
    \"previous_interaction_id\": \"$INTERACTION_ID\"
  }"

# New sandbox with sources
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -d '{
    "agent": "antigravity-preview-05-2026",
    "input": [{"type": "text", "text": "List all files and summarize the project."}],
    "environment": {
      "type": "remote",
      "sources": [
        {
          "type": "repository",
          "source": "https://github.com/octocat/Spoon-Knife",
          "target": "/workspace/spoon-knife"
        }
      ]
    }
  }'
```

--------------------------------

### Few-shot Response for Concise Explanation Selection

Source: https://ai.google.dev/gemini-api/docs/prompting-strategies

The model's response to the few-shot prompt, showing it successfully chose the shorter explanation as guided by the provided examples.

```text
Answer: Explanation2
```

--------------------------------

### Grounding Gemini with Google Search Tool

Source: https://ai.google.dev/gemini-api/docs/google-search

Use the `google_search` tool with the Gemini model to get real-time information and reduce hallucinations. This example queries for a recent event.
```Python
from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3.5-flash",
    input="Who won the euro 2024?",
    tools=[{"type": "google_search"}]
)
print(interaction.output_text)
```

```JavaScript
import { GoogleGenAI } from "@google/genai";

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
  model: "gemini-3.5-flash",
  input: "Who won the euro 2024?",
  tools: [{ type: "google_search" }]
});
console.log(interaction.output_text);
```

```REST
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3.5-flash",
    "input": "Who won the euro 2024?",
    "tools": [{"type": "google_search"}]
  }'
```

--------------------------------

### Initialize Live Translate Client and Process Audio Stream (JavaScript SDK)

Source: https://ai.google.dev/gemini-api/docs/live-api/live-translate

This example illustrates connecting to the Live API using the @google/genai SDK in JavaScript. It configures the client for translation and defines callbacks to handle incoming translated audio and transcription messages.
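The message-handling logic of the `onmessage` callback can also be read on its own, independent of the SDK. The sketch below, in Python, assumes the server message has already been decoded into a plain dict with the same camelCase field names the JavaScript example reads (`serverContent`, `inputTranscription`, `outputTranscription`, `modelTurn.parts[].inlineData.data`); the function name `summarize_server_message` and the sample message are hypothetical. It decodes the base64 payload before measuring it, so the reported size is in bytes rather than base64 characters.

```python
import base64


def summarize_server_message(message: dict) -> dict:
    """Extract transcripts and the decoded audio size from one message dict."""
    content = message.get("serverContent") or {}
    summary = {"input_transcript": None, "output_transcript": None, "audio_bytes": 0}

    if content.get("inputTranscription"):
        summary["input_transcript"] = content["inputTranscription"].get("text")
    if content.get("outputTranscription"):
        summary["output_transcript"] = content["outputTranscription"].get("text")

    # Audio chunks arrive base64 encoded; decode them to count real bytes.
    for part in (content.get("modelTurn") or {}).get("parts", []):
        inline = part.get("inlineData")
        if inline and inline.get("data"):
            summary["audio_bytes"] += len(base64.b64decode(inline["data"]))
    return summary


# Hypothetical sample message carrying a transcript and a 4-byte audio chunk.
sample = {
    "serverContent": {
        "outputTranscription": {"text": "Witaj"},
        "modelTurn": {
            "parts": [
                {"inlineData": {"data": base64.b64encode(b"\x00\x01\x02\x03").decode("ascii")}}
            ]
        },
    }
}

result = summarize_server_message(sample)
print(result)
```

A message without `serverContent` yields the empty summary, which mirrors the optional chaining used in the JavaScript callback.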
```javascript
import { GoogleGenAI, Modality } from '@google/genai';

const ai = new GoogleGenAI({});
const model = 'gemini-3.5-live-translate-preview';

const config = {
  responseModalities: [Modality.AUDIO],
  inputAudioTranscription: {},
  outputAudioTranscription: {},
  translationConfig: {
    targetLanguageCode: 'pl',
    echoTargetLanguage: true
  }
};

async function main() {
  const session = await ai.live.connect({
    model: model,
    config: config,
    callbacks: {
      onopen: () => console.debug('Opened'),
      onmessage: (message) => {
        const content = message.serverContent;
        if (content?.inputTranscription) {
          console.log('Input transcript:', content.inputTranscription.text);
        }
        if (content?.outputTranscription) {
          console.log('Output transcript:', content.outputTranscription.text);
        }
        if (content?.modelTurn?.parts) {
          for (const part of content.modelTurn.parts) {
            if (part.inlineData) {
              const audioData = part.inlineData.data;
              // Play or process the translated audio chunk (base64 encoded)
              console.debug(`Received audio chunk (${audioData.length} bytes)`);
            }
          }
        }
      },
      onerror: (e) => console.debug('Error:', e.message),
      onclose: (e) => console.debug('Close:', e.reason),
    },
  });

  console.debug("Session started with translation");
}

main();
```

--------------------------------

### Start Chat with History and Send Message (Old SDK, Go)

Source: https://ai.google.dev/gemini-api/docs/migrate

This snippet demonstrates initializing a chat with predefined history and sending a message using the older Go SDK. An API key is required for client creation.

```Go
ctx := context.Background()
client, err := genai.NewClient(ctx, option.WithAPIKey("GEMINI_API_KEY"))
if err != nil {
	log.Fatal(err)
}
defer client.Close()

model := client.GenerativeModel("gemini-3.5-flash")
cs := model.StartChat()

cs.History = []*genai.Content{
	{
		Parts: []genai.Part{
			genai.Text("Hello, I have 2 dogs in my house."),
		},
		Role: "user",
	},
	{
		Parts: []genai.Part{
			genai.Text("Great to meet you. What would you like to know?"),
		},
		Role: "model",
	},
}

res, err := cs.SendMessage(ctx, genai.Text("How many paws are in my house?"))
if err != nil {
	log.Fatal(err)
}
printResponse(res) // utility for printing the response
```

--------------------------------

### Configure structured output with Interactions API in JavaScript

Source: https://ai.google.dev/gemini-api/docs/migrate-to-interactions

This JavaScript example shows how to define a JSON schema using the top-level `response_format` array when creating an interaction to get structured output.

```JavaScript
import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
  model: 'gemini-3.5-flash',
  input: 'Give me a recipe for chocolate chip cookies.',
  response_format: [
    {
      type: 'text',
      mime_type: 'application/json',
      schema: {
        type: 'object',
        properties: {
          recipe_name: { type: 'string' },
          ingredients: { type: 'array', items: { type: 'string' } }
        },
        required: ['recipe_name', 'ingredients']
      }
    }
  ]
});
console.log(interaction.output_text);
```

--------------------------------

### Live Translate Session Setup Configuration (JSON)

Source: https://ai.google.dev/gemini-api/docs/live-api/live-translate

Example JSON structure for configuring a Live Translate session, including model, response modalities, input/output audio transcription, and translation target language.

```json
"setup": {
  "model": "models/gemini-3.5-live-translate-preview",
  "generationConfig": {
    "responseModalities": [
      "AUDIO"
    ],
    "inputAudioTranscription": {},
    "outputAudioTranscription": {},
    "translationConfig": {
      "targetLanguageCode": "pl",
      "echoTargetLanguage": true
    }
  }
}
```

--------------------------------

### Install gemini-api-dev Skill with skills.sh

Source: https://ai.google.dev/gemini-api/docs/coding-agents

Use this command to install the foundational gemini-api-dev skill, providing general-purpose Gemini development best practices, using skills.sh.
```bash
npx skills add google-gemini/gemini-skills --skill gemini-api-dev --global
```

--------------------------------

### Create Client with New Go SDK

Source: https://ai.google.dev/gemini-api/docs/migrate

This snippet demonstrates creating a client for the new Go GenAI SDK, configuring it with the Gemini API backend.

```go
client, err := genai.NewClient(ctx, &genai.ClientConfig{
	Backend: genai.BackendGeminiAPI,
})
```

--------------------------------

### Tool Definitions for Career Coach Example (JSON)

Source: https://ai.google.dev/gemini-api/docs/live-api/best-practices

This JSON array defines a set of tools for a career coach application, including their names, descriptions, parameters, and specific invocation conditions to guide their use in a conversational AI.

```json
[
  {
    "name": "create_client_profile",
    "description": "Creates a new client profile with their personal details. Returns a unique client ID. \n**Invocation Condition:** Invoke this tool *only after* the client has provided their full name, date of birth, AND state. This should only be called once at the beginning of the 'Intake' step.",
    "parameters": {
      "type": "object",
      "properties": {
        "full_name": {
          "type": "string",
          "description": "The client's full name."
        },
        "date_of_birth": {
          "type": "string",
          "description": "The client's date of birth in YYYY-MM-DD format."
        },
        "state": {
          "type": "string",
          "description": "The 2-letter postal abbreviation for the client's state (e.g., 'NY', 'CA')."
        }
      },
      "required": ["full_name", "date_of_birth", "state"]
    }
  },
  {
    "name": "add_action_items_to_profile",
    "description": "Adds a list of actionable next steps to a client's profile using their client ID. \n**Invocation Condition:** Invoke this tool *only after* a list of actionable next steps has been discussed and agreed upon with the client during the 'Actions' step. Requires the `client_id` obtained from the start of the session.",
    "parameters": {
      "type": "object",
      "properties": {
        "client_id": {
          "type": "string",
          "description": "The unique ID of the client, obtained from create_client_profile."
        },
        "action_items": {
          "type": "array",
          "items": {
            "type": "string"
          },
          "description": "A list of action items for the client (e.g., ['Update resume', 'Research three companies'])."
        }
      },
      "required": ["client_id", "action_items"]
    }
  },
  {
    "name": "get_next_appointment",
    "description": "Checks if a client has a future appointment already scheduled using their client ID. Returns the appointment details or null. \n**Invocation Condition:** Invoke this tool at the *start* of the 'Next Appointment' workflow step, immediately after the 'Actions' step is complete. This is used to check if an appointment *already exists*.",
    "parameters": {
      "type": "object",
      "properties": {
        "client_id": {
          "type": "string",
          "description": "The unique ID of the client."
        }
      },
      "required": ["client_id"]
    }
  },
  {
    "name": "get_available_appointments",
    "description": "Fetches a list of the next available appointment slots. \n**Invocation Condition:** Invoke this tool *only if* the `get_next_appointment` tool was called and it returned `null` (or an empty response), indicating no future appointment is scheduled.",
    "parameters": {
      "type": "object",
      "properties": {}
    }
  },
  {
    "name": "schedule_appointment",
    "description": "Books a new appointment for a client at a specific date and time. \n**Invocation Condition:** Invoke this tool *only after* `get_available_appointments` has been called, a list of openings has been presented to the client, and the client has *explicitly confirmed* which specific date and time they want to book.",
    "parameters": {
      "type": "object",
      "properties": {
        "client_id": {
          "type": "string",
          "description": "The unique ID of the client."
        },
        "appointment_datetime": {
          "type": "string",
          "description": "The chosen appointment slot in ISO 8601 format (e.g., '2025-10-30T14:30:00')."
        }
      },
      "required": ["client_id", "appointment_datetime"]
    }
  }
]
```

--------------------------------

### Generate Real-time Music with Lyria RealTime

Source: https://ai.google.dev/gemini-api/docs/realtime-music-generation

Use these examples to connect to the Lyria RealTime model, configure music generation parameters, and start streaming audio. Ensure you handle incoming audio chunks in a background task or callback.

```python
import asyncio
from google import genai
from google.genai import types

client = genai.Client(http_options={'api_version': 'v1alpha'})

async def main():
    async def receive_audio(session):
        """Example background task to process incoming audio."""
        while True:
            async for message in session.receive():
                audio_data = message.server_content.audio_chunks[0].data
                # Process audio...
                await asyncio.sleep(10**-12)

    async with (
        client.aio.live.music.connect(model='models/lyria-realtime-exp') as session,
        asyncio.TaskGroup() as tg,
    ):
        # Set up task to receive server messages.
        tg.create_task(receive_audio(session))

        # Send initial prompts and config
        await session.set_weighted_prompts(
            prompts=[
                types.WeightedPrompt(text='minimal techno', weight=1.0),
            ]
        )
        await session.set_music_generation_config(
            config=types.LiveMusicGenerationConfig(bpm=90, temperature=1.0)
        )

        # Start streaming music
        await session.play()

if __name__ == "__main__":
    asyncio.run(main())
```

```javascript
import { GoogleGenAI } from "@google/genai";
import Speaker from "speaker";
import { Buffer } from "buffer";

const client = new GoogleGenAI({
  // Read the key from the environment instead of an undefined identifier.
  apiKey: process.env.GEMINI_API_KEY,
  apiVersion: "v1alpha",
});

async function main() {
  const speaker = new Speaker({
    channels: 2,       // stereo
    bitDepth: 16,      // 16-bit PCM
    sampleRate: 44100, // 44.1 kHz
  });

  const session = await client.live.music.connect({
    model: "models/lyria-realtime-exp",
    callbacks: {
      onmessage: (message) => {
        if (message.serverContent?.audioChunks) {
          for (const chunk of message.serverContent.audioChunks) {
            const audioBuffer = Buffer.from(chunk.data, "base64");
            speaker.write(audioBuffer);
          }
        }
      },
      onerror: (error) => console.error("music session error:", error),
      onclose: () => console.log("Lyria RealTime stream closed."),
    },
  });

  await session.setWeightedPrompts({
    weightedPrompts: [
      { text: "Minimal techno with deep bass, sparse percussion, and atmospheric synths", weight: 1.0 },
    ],
  });

  await session.setMusicGenerationConfig({
    musicGenerationConfig: {
      bpm: 90,
      temperature: 1.0,
      audioFormat: "pcm16", // important so we know format
      sampleRateHz: 44100,
    },
  });

  await session.play();
}

main().catch(console.error);
```

--------------------------------

### Create Client with Old Go SDK

Source: https://ai.google.dev/gemini-api/docs/migrate

This snippet demonstrates creating a client for the older Go Generative AI SDK using an API key option.
```go client, err := genai.NewClient(ctx, option.WithAPIKey("GEMINI_API_KEY")) ``` -------------------------------- ### Generating a robot trajectory for object movement (Python) Source: https://ai.google.dev/gemini-api/docs/robotics-overview This example demonstrates how to request a trajectory for moving a specific object, like a red pen, to a target location. The prompt specifies the start and end points, and the model generates intermediate points. ```Python from google import genai from google.genai import types client = genai.Client() # Load your image and set up your prompt with open('path/to/image-with-objects.jpg', 'rb') as f: image_bytes = f.read() points_data = [] prompt = """ Place a point on the red pen, then 15 points for the trajectory of moving the red pen to the top of the organizer on the left. The points should be labeled by order of the trajectory, from '0' (start point at left hand) to (final point) The answer should follow the json format: [{"point": , "label": }, ...]. The points are in [y, x] format normalized to 0-1000. """ image_response = client.models.generate_content( model="gemini-robotics-er-1.6-preview", contents=[ types.Part.from_bytes( data=image_bytes, mime_type='image/jpeg', ), prompt ], config = types.GenerateContentConfig( temperature=1.0, ) ) print(image_response.text) ``` -------------------------------- ### Configure AI Agent with Safety System Instructions Source: https://ai.google.dev/gemini-api/docs/generate-content/computer-use This snippet demonstrates how to set up an AI agent using the `google.genai` client with detailed system instructions for safety. It defines rules for user confirmation on sensitive actions and default actuation for others, then initializes an interaction with a specified model and computer_use tools. ```python from google import genai client = genai.Client() system_instruction = """ ## **RULE 1: Seek User Confirmation (USER_CONFIRMATION)** This is your first and most important check. 
If the next required action falls into any of the following categories, you MUST stop immediately, and seek the user's explicit permission.

**Procedure for Seeking Confirmation:**

* **For Consequential Actions:** Perform all preparatory steps (e.g., navigating, filling out forms, typing a message). You will ask for confirmation **AFTER** all necessary information is entered on the screen, but **BEFORE** you perform the final, irreversible action (e.g., before clicking "Send", "Submit", "Confirm Purchase", "Share").
* **For Prohibited Actions:** If the action is strictly forbidden (e.g., accepting legal terms, solving a CAPTCHA), you must first inform the user about the required action and ask for their confirmation to proceed.

**USER_CONFIRMATION Categories:**

* **Consent and Agreements:** You are FORBIDDEN from accepting, selecting, or agreeing to any of the following on the user's behalf. You must ask the user to confirm before performing these actions.
    * Terms of Service
    * Privacy Policies
    * Cookie consent banners
    * End User License Agreements (EULAs)
    * Any other legally significant contracts or agreements.
* **Robot Detection:** You MUST NEVER attempt to solve or bypass the following. You must ask the user to confirm before performing these actions.
    * CAPTCHAs (of any kind)
    * Any other anti-robot or human-verification mechanisms, even if you are capable.
* **Financial Transactions:**
    * Completing any purchase.
    * Managing or moving money (e.g., transfers, payments).
    * Purchasing regulated goods or participating in gambling.
* **Sending Communications:**
    * Sending emails.
    * Sending messages on any platform (e.g., social media, chat apps).
    * Posting content on social media or forums.
* **Accessing or Modifying Sensitive Information:**
    * Health, financial, or government records (e.g., medical history, tax forms, passport status).
    * Revealing or modifying sensitive personal identifiers (e.g., SSN, bank account number, credit card number).
* **User Data Management:**
    * Accessing, downloading, or saving files from the web.
    * Sharing or sending files/data to any third party.
    * Transferring user data between systems.
* **Browser Data Usage:**
    * Accessing or managing Chrome browsing history, bookmarks, autofill data, or saved passwords.
* **Security and Identity:**
    * Logging into any user account.
    * Any action that involves misrepresentation or impersonation (e.g., creating a fan account, posting as someone else).
* **Insurmountable Obstacles:** If you are technically unable to interact with a user interface element or are stuck in a loop you cannot resolve, ask the user to take over.

---

## **RULE 2: Default Behavior (ACTUATE)**

If an action does **NOT** fall under the conditions for `USER_CONFIRMATION`, your default behavior is to **Actuate**.

**Actuation Means:** You MUST proactively perform all necessary steps to move the user's request forward. Continue to actuate until you either complete the non-consequential task or encounter a condition defined in Rule 1.

* **Example 1:** If asked to send money, you will navigate to the payment portal, enter the recipient's details, and enter the amount. You will then **STOP** as per Rule 1 and ask for confirmation before clicking the final "Send" button.
* **Example 2:** If asked to post a message, you will navigate to the site, open the post composition window, and write the full message. You will then **STOP** as per Rule 1 and ask for confirmation before clicking the final "Post" button.

After the user has confirmed, remember to get the user's latest screen before continuing to perform actions.

# Final Response Guidelines:
Write final response to the user in the following cases:
- User confirmation
- When the task is complete or you have enough information to respond to the user
"""

interaction = client.interactions.create(
    model="gemini-3.5-flash",
    system_instruction=system_instruction,
    input="Prepare a draft but do not send.",
    tools=[
        {
            "type": "computer_use",
            "environment": "browser"
        }
    ]
)
```

--------------------------------

### Start and poll for Gemini Deep Research Agent results

Source: https://ai.google.dev/gemini-api/docs/deep-research

This example demonstrates how to start a research task with the Gemini Deep Research Agent in the background and then poll until it completes or fails. It requires setting the `background` option to true (`background=True` in Python) and repeatedly checking the interaction status until it is `completed` or `failed`.

```Python
import time
from google import genai

client = genai.Client()

# Start the research task in the background.
interaction = client.interactions.create(
    input="Research the history of Google TPUs.",
    agent="deep-research-preview-04-2026",
    background=True,
)

print(f"Research started: {interaction.id}")

# Poll every 10 seconds until the task completes or fails.
while True:
    interaction = client.interactions.get(interaction.id)
    if interaction.status == "completed":
        print(interaction.steps[-1].content[0].text)
        break
    elif interaction.status == "failed":
        print(f"Research failed: {interaction.error}")
        break
    time.sleep(10)
```

```JavaScript
import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

// Start the research task in the background.
const interaction = await client.interactions.create({
    input: 'Research the history of Google TPUs.',
    agent: 'deep-research-preview-04-2026',
    background: true
});

console.log(`Research started: ${interaction.id}`);

// Poll every 10 seconds until the task completes or fails.
while (true) {
    const result = await client.interactions.get(interaction.id);
    if (result.status === 'completed') {
        console.log(result.steps.at(-1).content[0].text);
        break;
    } else if (result.status === 'failed') {
        console.log(`Research failed: ${result.error}`);
        break;
    }
    await new Promise(resolve => setTimeout(resolve, 10000));
}
```

```REST
# 1. Start the research task
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "input": "Research the history of Google TPUs.",
    "agent": "deep-research-preview-04-2026",
    "background": true
}'

# 2. Poll for results (Replace INTERACTION_ID)
# curl -X GET "https://generativelanguage.googleapis.com/v1beta/interactions/INTERACTION_ID" \
# -H "x-goog-api-key: $GEMINI_API_KEY"
```

--------------------------------

### Identify Objects in Image with Gemini Robotics-ER 1.6 (Python)

Source: https://ai.google.dev/gemini-api/docs/robotics-overview

Sends an image and a prompt to the `gemini-robotics-er-1.6-preview` model using `generateContent` to get a list of identified objects with 2D points and labels. Requires `google.genai` client setup and image loading.

```python
from google import genai
from google.genai import types

PROMPT = """
    Point to no more than 10 items in the image. The label returned
    should be an identifying name for the object detected.
    The answer should follow the json format:
    [{"point": <point>, "label": <label1>}, ...].
    The points are in [y, x] format normalized to 0-1000.
    """

client = genai.Client()

# Load your image
with open("my-image.png", 'rb') as f:
    image_bytes = f.read()

image_response = client.models.generate_content(
    model="gemini-robotics-er-1.6-preview",
    contents=[
        types.Part.from_bytes(
            data=image_bytes,
            mime_type='image/png',
        ),
        PROMPT
    ],
    config=types.GenerateContentConfig(
        temperature=1.0,
        # A thinking budget of 0 is passed, as in the original snippet.
        thinking_config=types.ThinkingConfig(thinking_budget=0)
    )
)

print(image_response.text)
```

--------------------------------

### Generate Product Mockup Image with JavaScript

Source: https://ai.google.dev/gemini-api/docs/image-generation

This JavaScript example demonstrates how to use the `@google/genai` library to send an image generation prompt and save the resulting image to a file.

```JavaScript
import { GoogleGenAI } from "@google/genai";
import * as fs from "node:fs";

async function main() {
  const ai = new GoogleGenAI({});

  const interaction = await ai.interactions.create({
    model: "gemini-3.1-flash-image",
    input: "A high-resolution, studio-lit product photograph of a minimalist ceramic coffee mug in matte black, presented on a polished concrete surface. The lighting is a three-point softbox setup designed to create soft, diffused highlights and eliminate harsh shadows. The camera angle is a slightly elevated 45-degree shot to showcase its clean lines. Ultra-realistic, with sharp focus on the steam rising from the coffee. Square image.",
  });

  // Walk the returned steps: print text blocks, save image blocks to disk.
  for (const step of interaction.steps) {
    if (step.type === "model_output") {
      for (const contentBlock of step.content) {
        if (contentBlock.type === "text") {
          console.log(contentBlock.text);
        } else if (contentBlock.type === "image") {
          // Image data arrives base64-encoded.
          const buffer = Buffer.from(contentBlock.data, "base64");
          fs.writeFileSync("product_mockup.png", buffer);
        }
      }
    }
  }
}

main();
```
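
The step-walking logic in the JavaScript example above can also be sketched in Python without any network call. This is a minimal sketch, not SDK code: `save_image_blocks` is a hypothetical helper, and it assumes the steps are plain dicts carrying the same `type`, `content`, `text`, and `data` fields that the example reads. The objects returned by the real SDK may be shaped differently.

```python
import base64
import tempfile
from pathlib import Path


def save_image_blocks(steps, out_dir, stem="product_mockup"):
    """Collect text blocks and write base64 image blocks to out_dir.

    `steps` is assumed to be a list of dicts mirroring the fields read in
    the JavaScript example (a hypothetical stand-in for interaction.steps).
    Returns a (texts, paths) tuple.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    texts, paths = [], []
    for step in steps:
        # Only model output steps carry the generated content.
        if step.get("type") != "model_output":
            continue
        for block in step.get("content", []):
            if block.get("type") == "text":
                texts.append(block["text"])
            elif block.get("type") == "image":
                # Number the files so several images do not overwrite each other.
                path = out_dir / f"{stem}_{len(paths)}.png"
                path.write_bytes(base64.b64decode(block["data"]))
                paths.append(path)
    return texts, paths


if __name__ == "__main__":
    # Placeholder bytes stand in for real image data in this demo.
    fake_image = base64.b64encode(b"not-a-real-png").decode("ascii")
    demo_steps = [
        {"type": "model_output", "content": [
            {"type": "text", "text": "Here is your mockup."},
            {"type": "image", "data": fake_image},
        ]},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        texts, paths = save_image_blocks(demo_steps, tmp)
        print(texts, [p.name for p in paths])
```

Unlike the JavaScript loop, which writes every image to the same `product_mockup.png`, this sketch numbers the output files, which matters only if a response contains more than one image block.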