### Example: Describe Local Image

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/gemini-image-describer/SKILL.md

Example of how to use the describe script with a local image file path. Ensure GEMINI_API_KEY is set.

```sh
GEMINI_API_KEY=your_api_key node .agents/skills/gemini-image-describer/scripts/describe.js "./photos/screenshot.png"
```

--------------------------------

### Agent Browser Recording Commands

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/video-recording.md

Provides examples of the core commands for managing video recordings: starting with a specified file, stopping the current recording, and restarting with a new file.

```bash
# Start recording to file
agent-browser record start ./output.webm

# Stop current recording
agent-browser record stop

# Restart with new file (stops current + starts new)
agent-browser record restart ./take2.webm
```

--------------------------------

### Profiler Commands: Start and Stop

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/profiling.md

Demonstrates starting profiling with default or custom categories and stopping to save the trace.

```bash
# Start profiling with default categories
agent-browser profiler start

# Start with custom trace categories
agent-browser profiler start --categories "devtools.timeline,v8.execute,blink.user_timing"

# Stop profiling and save to file
agent-browser profiler stop ./trace.json
```

--------------------------------

### Example: Describe Remote Image

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/gemini-image-describer/SKILL.md

Example of how to use the describe script with a remote image URL. Ensure GEMINI_API_KEY is set.

```sh
GEMINI_API_KEY=your_api_key node .agents/skills/gemini-image-describer/scripts/describe.js "https://example.com/photo.jpg"
```

--------------------------------

### Example: Describe Image from Data URI

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/gemini-image-describer/SKILL.md

Example of how to use the describe script with an image provided as a Base64 encoded data URI. Ensure GEMINI_API_KEY is set.

```sh
GEMINI_API_KEY=your_api_key node .agents/skills/gemini-image-describer/scripts/describe.js "data:image/png;base64,iVBOR..."
```

--------------------------------

### Execute Transcribe Script

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/gemini-audio-transcriber/SKILL.md

Run the pre-built script to transcribe an audio file. No npm install or extra setup is required. Ensure the script is executed from the repository root.

```shell
node .agents/skills/gemini-audio-transcriber/scripts/transcribe.js
```

--------------------------------

### Basic Video Recording Workflow

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/video-recording.md

Demonstrates the fundamental sequence of starting a recording, performing browser actions, and stopping the recording to save the video.

```bash
# Start recording
agent-browser record start ./demo.webm

# Perform actions
agent-browser open https://example.com
agent-browser snapshot -i
agent-browser click @e1
agent-browser fill @e2 "test input"

# Stop and save
agent-browser record stop
```

--------------------------------

### Snapshot Output Format Example

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/snapshot-refs.md

An example of the structured output from the `snapshot` command, showing element IDs, tags, attributes, and text content.

```text
Page: Example Site - Home
URL: https://example.com

@e1 [header]
  @e2 [nav]
    @e3 [a] "Home"
    @e4 [a] "Products"
    @e5 [a] "About"
  @e6 [button] "Sign In"

@e7 [main]
  @e8 [h1] "Welcome"
  @e9 [form]
    @e10 [input type="email"] placeholder="Email"
    @e11 [input type="password"] placeholder="Password"
    @e12 [button type="submit"] "Log In"

@e13 [footer]
  @e14 [a] "Privacy Policy"
```

--------------------------------

### Using Descriptive Filenames for Recordings

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/video-recording.md

Shows how to use descriptive and context-rich filenames when starting recordings to easily identify their content later.

```bash
# Include context in filename
agent-browser record start ./recordings/login-flow-2024-01-15.webm
agent-browser record start ./recordings/checkout-test-run-42.webm
```

--------------------------------

### Basic Profiling Workflow

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/profiling.md

Start profiling, perform actions, and then stop and save the trace to a JSON file.

```bash
# Start profiling
agent-browser profiler start

# Perform actions
agent-browser navigate https://example.com
agent-browser click "#button"
agent-browser wait 1000

# Stop and save
agent-browser profiler stop ./trace.json
```

--------------------------------

### Install agent-browser CLI

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/SKILL.md

Ensures the agent-browser CLI is installed globally. If not found, it installs the package. It also includes steps to ensure a compatible browser is available, downloading one if necessary.

```bash
if ! \
command -v agent-browser >/dev/null 2>&1; then
  npm install --global agent-browser
fi
agent-browser open about:blank && agent-browser close || agent-browser install
```

--------------------------------

### Correct Workflow: Snapshot Before Interaction

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/snapshot-refs.md

Shows the correct sequence of opening a page, taking a snapshot to get refs, and then using those refs for interaction.

```bash
# CORRECT
agent-browser open https://example.com
agent-browser snapshot -i  # Get refs first
agent-browser click @e1    # Use ref

# WRONG
agent-browser open https://example.com
agent-browser click @e1    # Ref doesn't exist yet!
```

--------------------------------

### Fetch Webpage as Markdown

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/felo-web-fetch/SKILL.md

Example of fetching a webpage and converting its content to Markdown format using the script.

```bash
node .agents/skills/felo-web-fetch/scripts/run_web_fetch.mjs --url "https://example.com" --output-format markdown
```

--------------------------------

### Debug: Trace Recording

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/commands.md

Use 'trace start' to begin recording a trace and 'trace stop' to save it to a file.

```bash
agent-browser trace start  # Start recording trace
```

```bash
agent-browser trace stop trace.zip  # Stop and save trace
```

--------------------------------

### Execute Image Describer Script

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/gemini-image-describer/SKILL.md

Run the pre-built Node.js script to describe an image. No npm install is required. Provide the image path or URL as an argument.

```sh
node .agents/skills/gemini-image-describer/scripts/describe.js
```

--------------------------------

### Start Chrome with Remote Debugging

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/authentication.md

Starts a Chrome instance with the remote debugging port enabled. This is the first step to reusing existing browser sessions for authentication.

```bash
# macOS
"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" --remote-debugging-port=9222

# Linux
google-chrome --remote-debugging-port=9222

# Windows
"C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe" --remote-debugging-port=9222
```

--------------------------------

### Set Custom Install Location

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/commands.md

Set the AGENT_BROWSER_HOME environment variable to specify a custom installation location for agent-browser.

```bash
AGENT_BROWSER_HOME="/path/to/agent-browser"  # Custom install location
```

--------------------------------

### Core Browser Automation Workflow

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/SKILL.md

Demonstrates the fundamental sequence of opening a URL, taking an initial snapshot to get element references, interacting with elements, and then re-snapshotting to capture changes.

```bash
agent-browser open https://example.com/form
agent-browser snapshot -i
# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit"

agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i  # MUST re-snapshot after navigation/DOM changes
```

--------------------------------

### Execute Felo Web Fetch CLI

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/felo-web-fetch/SKILL.md

Use the packaged CLI after installation for fetching webpage content.
Supports short forms for common options.

```bash
felo web-fetch -u "https://example.com" [options]
```

--------------------------------

### Transcribe Local File

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/gemini-audio-transcriber/SKILL.md

Example of transcribing a local audio file. Set the GEMINI_API_KEY environment variable before execution. The script handles local file paths.

```shell
GEMINI_API_KEY=your_api_key node .agents/skills/gemini-audio-transcriber/scripts/transcribe.js "./recordings/meeting.m4a"
```

--------------------------------

### Dry Run Output Example

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/gemini-audio-transcriber/SKILL.md

Example JSON output when the dry run mode is enabled, showing parsed input details without API interaction.

```json
{
  "source": "remote-url",
  "mimeType": "audio/mpeg",
  "localPath": null,
  "uriPreview": "data:audio/mpeg;base64,..."
}
```

--------------------------------

### Install Deep Researcher Skill

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/gemini-deep-researcher/README.md

To install the Deep Researcher skill, type '/skills' in a Telegram chat to view the list of available skills and select this one.

```bash
/skills
```

--------------------------------

### Transcribe Remote URL

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/gemini-audio-transcriber/SKILL.md

Example of transcribing an audio file from a remote URL. Set the GEMINI_API_KEY environment variable before execution.

```shell
GEMINI_API_KEY=your_api_key node .agents/skills/gemini-audio-transcriber/scripts/transcribe.js "https://example.com/audio/meeting.mp3"
```

--------------------------------

### Debug: Profiling

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/commands.md

Use 'profiler start' to begin Chrome DevTools profiling and 'profiler stop' to save the profile.

```bash
agent-browser profiler start  # Start Chrome DevTools profiling
```

```bash
agent-browser profiler stop trace.json  # Stop and save profile
```

--------------------------------

### Rebuild Script from Source

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/gemini-image-describer/SKILL.md

Steps to rebuild the describe script from its source file using Bun. This involves navigating to the skill's directory, installing dependencies, and running the build command.

```sh
cd .agents/skills/gemini-image-describer
bun install
bun build src/describe.js --outfile scripts/describe.js --target node --minify
```

--------------------------------

### Chrome Trace Event Format Example

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/profiling.md

An example of the JSON output format used for Chrome trace events, including traceEvents and metadata.

```json
{
  "traceEvents": [
    { "cat": "devtools.timeline", "name": "RunTask", "ph": "X", "ts": 12345, "dur": 100, ... },
    ...
  ],
  "metadata": {
    "clock-domain": "LINUX_CLOCK_MONOTONIC"
  }
}
```

--------------------------------

### Common Ref Patterns

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/snapshot-refs.md

Provides examples of common HTML elements and their corresponding ref notations, covering buttons, inputs, links, selects, and more.

```text
@e1 [button] "Submit"
@e2 [input type="email"]
@e3 [input type="password"]
@e4 [a href="/page"] "Link Text"
@e5 [select]
@e6 [textarea] placeholder="Message"
@e7 [div class="modal"]
@e8 [img alt="Logo"]
@e9 [checkbox] checked
@e10 [radio] selected
```

--------------------------------

### Short-Lived Sessions in CI/CD

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/authentication.md

Example of performing actions in CI/CD without persisting state, ensuring short-lived sessions.

```bash
# Don't persist state in CI
agent-browser open https://app.example.com/login
# ... login and perform actions ...
agent-browser close  # Session ends, nothing persisted
```

--------------------------------

### User Request for New Canvas

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/felo-superAgent/SKILL.md

Example of a user requesting a new canvas for a different project. In subsequent calls, omit `--live-doc-id` as a new one is created automatically.

```text
User: "Open a new canvas for a different project"
```

--------------------------------

### User Request Specifying Style Directly

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/felo-superAgent/SKILL.md

Example of a user requesting content generation with a specific style. The `--skill-id` parameter is used to specify the desired skill.

```text
User: "Write a tweet about AI trends using the 'darioamodei' style"
```

--------------------------------

### Generate Trigger Eval Queries

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/skill-creator/SKILL.md

Create a JSON array of evaluation queries, mixing 'should_trigger' and 'should_not_trigger' examples. Queries should be realistic, detailed, and include variations in phrasing, typos, and context.

```json
[
  {"query": "the user prompt", "should_trigger": true},
  {"query": "another prompt", "should_trigger": false}
]
```

--------------------------------

### Analysis JSON Schema

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/skill-creator/references/schemas.md

Example JSON output from the post-hoc analyzer, providing a summary of the comparison, winner's strengths, loser's weaknesses, instruction following scores, and suggestions for improvement.

```json
{
  "comparison_summary": {
    "winner": "A",
    "winner_skill": "path/to/winner/skill",
    "loser_skill": "path/to/loser/skill",
    "comparator_reasoning": "Brief summary of why comparator chose winner"
  },
  "winner_strengths": [
    "Clear step-by-step instructions for handling multi-page documents",
    "Included validation script that caught formatting errors"
  ],
  "loser_weaknesses": [
    "Vague instruction 'process the document appropriately' led to inconsistent behavior",
    "No script for validation, agent had to improvise"
  ],
  "instruction_following": {
    "winner": {
      "score": 9,
      "issues": ["Minor: skipped optional logging step"]
    },
    "loser": {
      "score": 6,
      "issues": [
        "Did not use the skill's formatting template",
        "Invented own approach instead of following step 3"
      ]
    }
  },
  "improvement_suggestions": [
    {
      "priority": "high",
      "category": "instructions",
      "suggestion": "Replace 'process the document appropriately' with explicit steps",
      "expected_impact": "Would eliminate ambiguity that caused inconsistent behavior"
    }
  ],
  "transcript_insights": {
    "winner_execution_pattern": "Read skill -> Followed 5-step process -> Used validation script",
    "loser_execution_pattern": "Read skill -> Unclear on approach -> Tried 3 different methods"
  }
}
```

--------------------------------

### Data Extraction with agent-browser

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/SKILL.md

Demonstrates how to extract text content from a specific element on a web page after opening the page and taking a snapshot to get the element's reference.

```bash
agent-browser open https://example.com/products
agent-browser snapshot -i
agent-browser get text @e5
```

--------------------------------

### Display Help Information

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/commands.md

Use the --help or -h option to display general help information. Use `<command> --help` for detailed command-specific help.

```bash
agent-browser --help  # Show help (-h)
```

```bash
agent-browser <command> --help  # Show detailed help for a command
```

--------------------------------

### Generating Documentation Videos

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/video-recording.md

Captures a step-by-step workflow for creating video documentation. Includes pauses and waits to ensure clarity for viewers.

```bash
#!/bin/bash
# Record workflow for documentation

agent-browser record start ./docs/how-to-login.webm

agent-browser open https://app.example.com/login
agent-browser wait 1000  # Pause for visibility

agent-browser snapshot -i
agent-browser fill @e1 "demo@example.com"
agent-browser wait 500

agent-browser fill @e2 "password"
agent-browser wait 500

agent-browser click @e3
agent-browser wait --load networkidle
agent-browser wait 1000  # Show result

agent-browser record stop
```

--------------------------------

### Combining Video Recording with Screenshots

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/video-recording.md

Shows how to simultaneously record video and capture screenshots at key moments during automation to provide both a visual flow and detailed snapshots.

```bash
# Record video AND capture key frames
agent-browser record start ./flow.webm

agent-browser open https://example.com
agent-browser screenshot ./screenshots/step1-homepage.png

agent-browser click @e1
agent-browser screenshot ./screenshots/step2-after-click.png

agent-browser record stop
```

--------------------------------

### Deep Researcher Prompt Examples

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/gemini-deep-researcher/README.md

These are example prompts you can use to ask the Deep Researcher skill to generate a report on a specific topic. Provide clear and specific descriptions for better results.

```text
Research "Taiwan's semiconductor industry supply chain" for me and produce a complete analysis report.
Please conduct an in-depth investigation of the current state and future trends of the AI chip market.
I want to understand the impact of quantum computing on modern cryptography; write a report for me.
Research the competitive landscape of the global electric vehicle market and include references.
Compare the pros and cons of the three major cloud platforms (AWS, GCP, Azure) and write it up as a report.
```

--------------------------------

### Command Chaining with &&

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/SKILL.md

Illustrates how to chain multiple agent-browser commands using '&&' for sequential execution without needing intermediate output. This is useful for simple, linear automation tasks.

```bash
agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser screenshot page.png
```

--------------------------------

### Interacting with Elements Using Refs

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/snapshot-refs.md

Demonstrates how to use refs obtained from a snapshot to perform actions like clicking buttons and filling input fields.

```bash
# Click the "Sign In" button
agent-browser click @e6

# Fill email input
agent-browser fill @e10 "user@example.com"

# Fill password
agent-browser fill @e11 "password123"

# Submit the form
agent-browser click @e12
```

--------------------------------

### Basic Proxy Configuration via Environment Variable

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/proxy-support.md

Configure HTTP and HTTPS proxies using environment variables for agent-browser to use.

```bash
export HTTP_PROXY="http://proxy.example.com:8080"
agent-browser open https://example.com
```

```bash
export HTTPS_PROXY="https://proxy.example.com:8080"
agent-browser open https://example.com
```

```bash
export HTTP_PROXY="http://proxy.example.com:8080"
export HTTPS_PROXY="http://proxy.example.com:8080"
agent-browser open https://example.com
```

--------------------------------

### Enable JSON Output

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/commands.md

Use the --json option to get command output in JSON format, facilitating programmatic parsing.

```bash
agent-browser --json ...
```

--------------------------------

### Handle Two-Factor Authentication (2FA)

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/authentication.md

Logs in with credentials and waits for manual 2FA completion in the browser, then saves the state.

```bash
# Login with credentials
agent-browser open https://app.example.com/login --headed  # Show browser
agent-browser snapshot -i
agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3

# Wait for user to complete 2FA manually
echo "Complete 2FA in the browser window..."
agent-browser wait --url "**/dashboard" --timeout 120000

# Save state after 2FA
agent-browser state save ./2fa-state.json
```

--------------------------------

### Authentication State Persistence with agent-browser

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/SKILL.md

Shows how to log into a website, save the authentication state to a file, and then load that state in a subsequent session to bypass the login process.

```bash
# Login and save state
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "$USERNAME"
agent-browser fill @e2 "$PASSWORD"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json

# Reuse in future sessions
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard
```

--------------------------------

### Use Environment Variables for Credentials

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/authentication.md

Demonstrates using environment variables for username and password during login.

```bash
agent-browser fill @e1 "$APP_USERNAME"
agent-browser fill @e2 "$APP_PASSWORD"
```

--------------------------------

### A/B Testing Sessions Pattern

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/session-management.md

This bash script demonstrates how to run A/B tests by opening different variants of an application in separate sessions and comparing their results via screenshots.

```bash
# Test different user experiences
agent-browser --session variant-a open "https://app.com?variant=a"
agent-browser --session variant-b open "https://app.com?variant=b"

# Compare
agent-browser --session variant-a screenshot /tmp/variant-a.png
agent-browser --session variant-b screenshot /tmp/variant-b.png
```

--------------------------------

### Handling Recording in Error Cases with Cleanup

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/video-recording.md

Demonstrates a robust approach to managing video recordings during automation, ensuring that recordings are stopped and the browser is closed even if errors occur.

```bash
#!/bin/bash
set -e

cleanup() {
  agent-browser record stop 2>/dev/null || true
  agent-browser close 2>/dev/null || true
}
trap cleanup EXIT

agent-browser record start ./automation.webm
# ...
# automation steps ...
```

--------------------------------

### Basic Proxy Configuration via CLI

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/proxy-support.md

Set the proxy directly using the --proxy flag when running agent-browser commands.

```bash
agent-browser --proxy "http://proxy.example.com:8080" open https://example.com
```

--------------------------------

### Default Session Usage

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/session-management.md

When the --session flag is omitted, commands operate within the default browser session. This example shows commands using the same default session.

```bash
# These use the same default session
agent-browser open https://example.com
agent-browser snapshot -i
agent-browser close  # Closes default session
```

--------------------------------

### Execute SuperAgent Script - New Conversation with Skill and Brand Style

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/felo-superAgent/SKILL.md

This command initiates a new conversation with a specific skill and applies a predefined brand style. The `--ext` parameter includes detailed style requirements. Always use `--json` for structured output.

```bash
node .agents/skills/felo-superAgent/scripts/run_superagent.mjs \
  --query "Write a tweet about the latest AI trends" \
  --live-doc-id "LIVE_DOC_ID" \
  --skill-id twitter-writer \
  --ext '{"brand_style_requirement":"Style name: darioamodei\nStyle labels: Thoughtful long-form essays\nStyle DNA: # Dario Amodei (@DarioAmodei) Tweet Writing Style DNA\n\n## Style Overview\nDario writes like a serious intellectual...(full content)"}' \
  --accept-language en \
  --json
```

--------------------------------

### Default Text Output Format

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/felo-superAgent/SKILL.md

Example of the default text output for the TWITTER category, showing style name, labels, and DNA. Fields with null or empty values are omitted.

```text
Style name: darioamodei
Style labels: Thoughtful long-form essays
Style DNA: # Dario Amodei (@DarioAmodei) Tweet Writing Style DNA
...(full styleDna content)
```

--------------------------------

### Load Authentication State and Open Page

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/authentication.md

Loads a previously saved authentication state and opens a specific URL. Useful for skipping login.

```bash
# Load saved auth state
agent-browser state load ./auth-state.json

# Navigate directly to protected page
agent-browser open https://app.example.com/dashboard

# Verify authenticated
agent-browser snapshot -i
```

--------------------------------

### Run in Headed Mode

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/commands.md

Use the --headed option to display the browser window instead of running in headless mode.

```bash
agent-browser --headed ...
```

--------------------------------

### Basic and Interactive Snapshot Commands

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/snapshot-refs.md

Use the `snapshot` command to capture the page structure.
The `-i` flag provides an interactive snapshot, recommended for assigning refs.

```bash
# Basic snapshot (shows page structure)
agent-browser snapshot

# Interactive snapshot (-i flag) - RECOMMENDED
agent-browser snapshot -i
```

--------------------------------

### Tool Support - Presentation Generation

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/felo-superAgent/SKILL.md

Details on the `generate_ppt` tool, including its output format.

```text
Tool: `generate_ppt`
Output: PPT titles and status
```

--------------------------------

### Rebuild Transcriber Script

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/gemini-audio-transcriber/SKILL.md

Instructions to rebuild the transcriber script after modifying its source code. This involves navigating to the skill's directory, installing dependencies with bun, and building the JavaScript file.

```shell
cd .agents/skills/gemini-audio-transcriber
bun install
bun build src/transcribe.js --outfile scripts/transcribe.js --target node --minify
```

--------------------------------

### Specify Cloud Browser Provider

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/commands.md

Use the -p or --provider option to select a cloud browser provider.

```bash
agent-browser -p ...
```

--------------------------------

### Deep Researcher Output Structure Example

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/gemini-deep-researcher/README.md

The Deep Researcher skill outputs a Markdown report structured into sections like Summary, Background, Key Findings, Analysis, Conclusion & Recommendations, and References.

```markdown
## Summary
This report examines the key position of Taiwan's semiconductor industry in the global supply chain...

## Background
Taiwan's semiconductor industry got its start in the 1980s and, after decades of development...

## Key Findings
1. TSMC holds more than 50% of the global wafer foundry market...
2. 
Advanced process nodes (3 nm and below) are concentrated in Taiwan...

## Analysis
From a geopolitical perspective, Taiwan's semiconductor industry faces...

## Conclusion & Recommendations
Continued investment in advanced process R&D is recommended, along with diversifying production sites...

## References
- [1] Bloomberg, "TSMC's Global Dominance...", 2024
- [2] Industrial Technology Research Institute, "Semiconductor Industry Annual Report", 2024
```

--------------------------------

### Handle OAuth/SSO Redirects

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/authentication.md

Initiates an OAuth flow and handles redirects, waiting for specific URL patterns and filling credentials.

```bash
# Start OAuth flow
agent-browser open https://app.example.com/auth/google

# Handle redirects automatically
agent-browser wait --url "**/accounts.google.com**"
agent-browser snapshot -i

# Fill Google credentials
agent-browser fill @e1 "user@gmail.com"
agent-browser click @e2  # Next button
agent-browser wait 2000
agent-browser snapshot -i
agent-browser fill @e3 "password"
agent-browser click @e4  # Sign in

# Wait for redirect back
agent-browser wait --url "**/app.example.com"
agent-browser state save ./oauth-state.json
```

--------------------------------

### Comparison JSON Schema

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/skill-creator/references/schemas.md

Example JSON output from the blind comparator, detailing the comparison results between two outputs (A and B). Includes winner, reasoning, rubric scores, output quality, and expectation results.

```json
{
  "winner": "A",
  "reasoning": "Output A provides a complete solution with proper formatting and all required fields. 
Output B is missing the date field and has formatting inconsistencies.",
  "rubric": {
    "A": {
      "content": {
        "correctness": 5,
        "completeness": 5,
        "accuracy": 4
      },
      "structure": {
        "organization": 4,
        "formatting": 5,
        "usability": 4
      },
      "content_score": 4.7,
      "structure_score": 4.3,
      "overall_score": 9.0
    },
    "B": {
      "content": {
        "correctness": 3,
        "completeness": 2,
        "accuracy": 3
      },
      "structure": {
        "organization": 3,
        "formatting": 2,
        "usability": 3
      },
      "content_score": 2.7,
      "structure_score": 2.7,
      "overall_score": 5.4
    }
  },
  "output_quality": {
    "A": {
      "score": 9,
      "strengths": ["Complete solution", "Well-formatted", "All fields present"],
      "weaknesses": ["Minor style inconsistency in header"]
    },
    "B": {
      "score": 5,
      "strengths": ["Readable output", "Correct basic structure"],
      "weaknesses": ["Missing date field", "Formatting inconsistencies", "Partial data extraction"]
    }
  },
  "expectation_results": {
    "A": {
      "passed": 4,
      "total": 5,
      "pass_rate": 0.80,
      "details": [
        {"text": "Output includes name", "passed": true}
      ]
    },
    "B": {
      "passed": 3,
      "total": 5,
      "pass_rate": 0.60,
      "details": [
        {"text": "Output includes name", "passed": true}
      ]
    }
  }
}
```

--------------------------------

### Load Browser Extensions

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/commands.md

Use the --extension option to load browser extensions. This option can be repeated for multiple extensions.

```bash
agent-browser --extension ...
```

--------------------------------

### Specify Custom Browser Executable

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/commands.md

Use the --executable-path option to point to a custom browser executable.

```bash
agent-browser --executable-path

```

--------------------------------

### Configure Proxy Server

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/commands.md

Use the --proxy option to specify a proxy server URL and --proxy-bypass to list hosts to bypass the proxy.

```bash
agent-browser --proxy ...
```

```bash
agent-browser --proxy-bypass
```

--------------------------------

### Execute Script and Save Output

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/gemini-image-describer/SKILL.md

Instructions for the agent to execute the describe script and save its output to a file. This includes creating the artifact directory and redirecting script output.

```sh
mkdir -p artifacts/{issue-comment-id}
node .agents/skills/gemini-image-describer/scripts/describe.js "" > artifacts/{issue-comment-id}/result.md
```

--------------------------------

### Display Version Information

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/commands.md

Use the --version or -V option to display the agent-browser tool's version.

```bash
agent-browser --version  # Show version (-V)
```

--------------------------------

### Debug: Show Browser Window

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/commands.md

Use the --headed option with the 'open' command to display the browser window for debugging.

```bash
agent-browser --headed open example.com  # Show browser window
```

--------------------------------

### Essential agent-browser Commands

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/SKILL.md

A comprehensive list of agent-browser commands for navigation, element snapshotting, interaction, waiting, capturing screenshots, and retrieving information from web pages.

```bash
# Navigation
agent-browser open                        # Open page
agent-browser close                       # Close browser (always close when done!)

# Snapshot (get element refs)
agent-browser snapshot -i                 # Interactive elements with @refs

# Interaction (use @refs from snapshot)
agent-browser click @e1                   # Click element
agent-browser fill @e2 "text"             # Clear field and type
agent-browser type @e2 "text"             # Type without clearing
agent-browser select @e1 "option"         # Select dropdown
agent-browser check @e1                   # Toggle checkbox
agent-browser press Enter                 # Press key

# Wait
agent-browser wait --load networkidle     # Wait for network idle
agent-browser wait @e1                    # Wait for element
agent-browser wait --url "**/dashboard"   # Wait for URL pattern
agent-browser wait --text "Welcome"       # Wait for text to appear

# Capture
agent-browser screenshot output.png       # Screenshot
agent-browser screenshot --full           # Full page screenshot
agent-browser screenshot --annotate       # Screenshot with numbered element labels

# Get information
agent-browser get text @e1                # Get element text
agent-browser get url                     # Get current URL
agent-browser get title                   # Get page title

# Viewport
agent-browser set viewport 1920 1080      # Set viewport size
```

--------------------------------

### Snapshotting with Iframes

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/snapshot-refs.md

Shows how agent-browser automatically inlines iframe content into snapshots and allows direct interaction with iframe elements using their refs.
```bash
agent-browser snapshot -i
# @e1 [heading] "Checkout"
# @e2 [Iframe] "payment-frame"
# @e3 [input] "Card number"
# @e4 [input] "Expiry"
# @e5 [button] "Pay"
# @e6 [button] "Cancel"

# Interact with iframe elements directly using their refs
agent-browser fill @e3 "4111111111111111"
agent-browser fill @e4 "12/28"
agent-browser click @e5
```

--------------------------------

### Adding Pauses for Clarity in Recordings

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/video-recording.md

Illustrates the use of `agent-browser wait` to introduce pauses in the recording, allowing viewers to better observe the results of actions.

```bash
# Slow down for human viewing
agent-browser click @e1
agent-browser wait 500  # Let viewer see result
```

--------------------------------

### Execute SuperAgent with Style Requirement

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/felo-superAgent/SKILL.md

Command to execute the SuperAgent with a specific query and a detailed brand style requirement. The `--ext` parameter is used to pass the style name, labels, and DNA.

```bash
node .agents/skills/felo-superAgent/scripts/run_superagent.mjs \
  --query "Write a tweet about AI trends" \
  --live-doc-id "QPetunwpGnkKuZHStP7gwt" \
  --skill-id twitter-writer \
  --ext '{"brand_style_requirement":"Style name: darioamodei\nStyle labels: Thoughtful long-form essays\nStyle DNA: # Dario Amodei (@DarioAmodei) Tweet Writing Style DNA\n\n## Style Overview\nDario writes like a serious intellectual...(full content, do NOT truncate)"}' \
  --accept-language en \
  --json
```

--------------------------------

### Debug: Connect to CDP

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/commands.md

Use the 'connect' command as an alternative way to establish a connection to a CDP port.
```bash
agent-browser connect 9222  # Alternative: connect command
```

--------------------------------

### Fetch Article with Readability and Markdown

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/felo-web-fetch/SKILL.md

Extract article content using readability mode and output as Markdown.

```bash
node .agents/skills/felo-web-fetch/scripts/run_web_fetch.mjs --url "https://example.com/article" --with-readability true --output-format markdown
```

--------------------------------

### Execute Felo Web Fetch Script

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/felo-web-fetch/SKILL.md

Run the bundled script to fetch webpage content. Specify the URL and any desired options for extraction.

```bash
node .agents/skills/felo-web-fetch/scripts/run_web_fetch.mjs --url "https://example.com/article" [options]
```

--------------------------------

### API Workflow - Create Conversation

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/felo-superAgent/SKILL.md

Demonstrates the API calls for creating a new conversation or a follow-up to an existing one.

```http
New: POST /v2/conversations (requires `live_doc_short_id` in body)
Follow-up: POST /v2/conversations/{threadId}/follow_up
```

--------------------------------

### Execute Felo Web Fetch Skill via CLI

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/felo-web-fetch/SKILL.md

This snippet demonstrates how to execute the Felo Web Fetch skill using the provided Node.js script or the packaged CLI. It covers basic usage, output format selection, readability mode, CSS selector targeting, and passing cookies with a custom user-agent.

````APIDOC
## Execute Felo Web Fetch Skill via CLI

### Description

This section details how to use the Felo Web Fetch skill through its command-line interface. It includes examples for basic fetching, converting content to different formats, extracting specific elements using CSS selectors, and advanced options like readability mode, cookies, and custom user-agents.

### Usage

**Using the bundled script:**

```bash
node .agents/skills/felo-web-fetch/scripts/run_web_fetch.mjs --url "https://example.com/article" [options]
```

**Using the packaged CLI:**

```bash
felo web-fetch -u "https://example.com" [options]
```

### Parameters

#### Required Parameters

- `--url` (string): The URL of the webpage to fetch.

#### Core Optional Parameters

- `--output-format` (string): The desired output format. Options: `html`, `markdown`, `text`.
- `--crawl-mode` (string): The crawl mode. Options: `fast`, `fine`.
- `--target-selector` (string): A CSS selector to extract specific content.
- `--wait-for-selector` (string): A CSS selector to wait for before extraction.
- `--with-readability` (boolean): Enable readability mode to extract article text. Defaults to `false`.

#### Other Key Optional Parameters

- `--cookie` (string): A cookie to send with the request. Can be repeated.
- `--set-cookies-json` (string): A JSON array of cookies to set.
- `--user-agent` (string): A custom user-agent string.
- `--timeout` (integer): HTTP request timeout in seconds.
- `--request-timeout-ms` (integer): API payload timeout in milliseconds.
- `--with-links-summary` (boolean): Include a summary of links.
- `--with-images-summary` (boolean): Include a summary of images.
- `--with-images-readability` (boolean): Enable readability for images.
- `--with-images` (boolean): Include images.
- `--with-links` (boolean): Include links.
- `--ignore-empty-text-image` (boolean): Ignore empty text or image elements.
- `--with-cache` (boolean): Use caching. Defaults to `true`.
- `--with-stypes` (boolean): Include structured types.
- `--json` (boolean): Print the full JSON response.

### Examples

**Basic fetch as Markdown:**

```bash
node .agents/skills/felo-web-fetch/scripts/run_web_fetch.mjs --url "https://example.com" --output-format markdown
```

**Article extraction with readability:**

```bash
node .agents/skills/felo-web-fetch/scripts/run_web_fetch.mjs --url "https://example.com/article" --with-readability true --output-format markdown
```

**Extracting content using a CSS selector:**

```bash
node .agents/skills/felo-web-fetch/scripts/run_web_fetch.mjs --url "https://example.com" --target-selector "article.main" --output-format markdown
```

**Using cookies and custom user-agent:**

```bash
# The --user-agent value is a placeholder; substitute the string you need
node .agents/skills/felo-web-fetch/scripts/run_web_fetch.mjs --url "https://example.com/private" --cookie "session_id=abc123" --user-agent "MyAgent/1.0" --with-readability true --json
```

**Full JSON response:**

```bash
node .agents/skills/felo-web-fetch/scripts/run_web_fetch.mjs --url "https://example.com" --output-format text --json
```
````

--------------------------------

### Geo-Location Testing with Proxies

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/proxy-support.md

Script to test website accessibility and take screenshots from different geographical regions using a list of geo-located proxies.
```bash
#!/bin/bash
# Test site from different regions using geo-located proxies

PROXIES=(
  "http://us-proxy.example.com:8080"
  "http://eu-proxy.example.com:8080"
  "http://asia-proxy.example.com:8080"
)

for proxy in "${PROXIES[@]}"; do
  export HTTP_PROXY="$proxy"
  export HTTPS_PROXY="$proxy"

  # Take the host label after the scheme (e.g. "us-proxy"); a pattern anchored
  # at the start of the string would hit "http://" and match nothing.
  # Requires GNU grep for -P.
  region=$(echo "$proxy" | grep -oP '(?<=://)\w+-\w+')
  echo "Testing from: $region"

  agent-browser --session "$region" open https://example.com
  agent-browser --session "$region" screenshot "./screenshots/$region.png"
  agent-browser --session "$region" close
done
```

--------------------------------

### SOCKS Proxy Configuration

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/proxy-support.md

Configure SOCKS5 proxies, with or without authentication, using the ALL_PROXY environment variable.

```bash
export ALL_PROXY="socks5://proxy.example.com:1080"
agent-browser open https://example.com
```

```bash
export ALL_PROXY="socks5://user:pass@proxy.example.com:1080"
agent-browser open https://example.com
```

--------------------------------

### Select Specific Resources for Query

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/felo-superAgent/README.md

Use the --selected-resource-ids flag to specify which resources SuperAgent should use for a query.

```bash
node felo-superAgent/scripts/run_superagent.mjs \
  --query "Summarize these documents" \
  --live-doc-id "PvyKouzJirXjFdst4uKRK3" \
  --selected-resource-ids "res1,res2,res3"
```

--------------------------------

### Tool Support - HTML Generation

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/felo-superAgent/SKILL.md

Details on the `generate_html` tool, including its output format.
```text
Tool: `generate_html`
Output: HTML page titles and status
```

--------------------------------

### Debugging Failed Automation with Video

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/video-recording.md

Records automation sessions to help debug failures by providing a visual log of actions leading up to an error. Includes error handling to stop recording on failure.

```bash
#!/bin/bash
# Record automation for debugging
agent-browser record start ./debug-$(date +%Y%m%d-%H%M%S).webm

# Run your automation
agent-browser open https://app.example.com
agent-browser snapshot -i
agent-browser click @e1 || {
  echo "Click failed - check recording"
  agent-browser record stop
  exit 1
}

agent-browser record stop
```

--------------------------------

### Set Extension Paths

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/commands.md

Set the AGENT_BROWSER_EXTENSIONS environment variable to a comma-separated list of extension paths.

```bash
AGENT_BROWSER_EXTENSIONS="/ext1,/ext2"  # Comma-separated extension paths
```

--------------------------------

### Form Submission with agent-browser

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/SKILL.md

A common pattern for submitting web forms. It involves opening the form URL, taking a snapshot to identify form fields and the submit button, filling the fields, selecting options, and then submitting.

```bash
agent-browser open https://example.com/signup
agent-browser snapshot -i
agent-browser fill @e1 "Jane Doe"
agent-browser fill @e2 "jane@example.com"
agent-browser select @e3 "California"
agent-browser click @e5
agent-browser wait --load networkidle
```

--------------------------------

### Capture Full Page Screenshots

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/commands.md

Use the --full option to capture a screenshot of the entire web page.
```bash
agent-browser --full ...
```

--------------------------------

### Design a Logo with Brand Style

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/felo-superAgent/README.md

Fetch image styles and then use the SuperAgent script to design a logo, specifying brand style requirements.

```bash
# Fetch IMAGE styles
node felo-superAgent/scripts/run_style_library.mjs --category IMAGE --accept-language en

# New conversation with chosen style
node felo-superAgent/scripts/run_superagent.mjs \
  --query "Design a logo for my coffee shop called Bean & Brew" \
  --live-doc-id "PvyKouzJirXjFdst4uKRK3" \
  --skill-id logo-and-branding \
  --ext '{"brand_style_requirement":"Style name: Minimalist Modern\nStyle labels: clean, monochrome\nStyle DNA: ...(full content)\nCover file ID: file_333"}' \
  --accept-language en
```

--------------------------------

### Run Skill Optimization Loop

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/skill-creator/SKILL.md

Execute the optimization loop using a Python script. This process splits the eval set, evaluates the current description, and uses Claude to propose improvements. Use the model ID from your system prompt.

```bash
python -m scripts.run_loop \
  --eval-set \
  --skill-path \
  --model \
  --max-iterations 5 \
  --verbose
```

--------------------------------

### HTTP Basic Authentication

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/authentication.md

Sets username and password for HTTP Basic Authentication before navigating to a protected resource.
```bash
# Set credentials before navigation
agent-browser set credentials username password

# Navigate to protected resource
agent-browser open https://protected.example.com/api
```

--------------------------------

### Set HTTP Headers

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/commands.md

Use the --headers option to define HTTP headers that will be scoped to the URL's origin.

```bash
agent-browser --headers ...
```

--------------------------------

### Tool Support - Document Generation

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/felo-superAgent/SKILL.md

Details on the `generate_document` tool, including its output format.

```text
Tool: `generate_document`
Output: Document titles and status
```

--------------------------------

### Connect via Chrome DevTools Protocol

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/commands.md

Use the --cdp option to specify a port for connecting via the Chrome DevTools Protocol.

```bash
agent-browser --cdp ...
```

--------------------------------

### Set Cloud Browser Provider

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/commands.md

Set the AGENT_BROWSER_PROVIDER environment variable to specify the cloud browser provider.

```bash
AGENT_BROWSER_PROVIDER="browserbase"  # Cloud browser provider
```

--------------------------------

### Manage Browser State

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/commands.md

Save and load browser state, including cookies, storage, and authentication information.
```bash
agent-browser state save auth.json  # Save cookies, storage, auth state
```

```bash
agent-browser state load auth.json  # Restore saved state
```

--------------------------------

### Basic Login Flow Automation

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/agent-browser/references/authentication.md

Automates a typical login process by navigating to a login page, waiting for network idle, identifying form elements, filling credentials, submitting the form, and verifying successful login.

```bash
# Navigate to login page
agent-browser open https://app.example.com/login
agent-browser wait --load networkidle

# Get form elements
agent-browser snapshot -i
# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Sign In"

# Fill credentials
agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"

# Submit
agent-browser click @e3
agent-browser wait --load networkidle

# Verify login succeeded
agent-browser get url  # Should be dashboard, not login
```

--------------------------------

### Tool Support - Image Generation

Source: https://github.com/duotify/githubclawtoolkit/blob/main/skills/felo-superAgent/SKILL.md

Details on the `generate_images` tool, including its output format.

```text
Tool: `generate_images`
Output: Image URLs and titles
```
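
The verification step in the Basic Login Flow Automation example above ends with `agent-browser get url` and a comment that the result should be the dashboard, not the login page. A minimal sketch of turning that manual check into something a script can branch on follows. The helper name `is_logged_in` and the assumption that the login page URL contains `/login` are illustrative only and are not part of agent-browser.

```bash
# Hypothetical helper (not part of agent-browser): succeeds when the given URL
# does not look like a login page. Assumes the login URL contains "/login".
is_logged_in() {
  case "$1" in
    "")       return 1 ;;  # no URL was returned at all
    */login*) return 1 ;;  # still on (or redirected back to) the login page
    *)        return 0 ;;  # anything else is treated as logged in
  esac
}

# Example use with commands shown above (needs agent-browser installed):
#   agent-browser state load auth.json
#   agent-browser open https://app.example.com/dashboard
#   if is_logged_in "$(agent-browser get url)"; then
#     echo "Session restored"
#   else
#     echo "Saved state expired - run the login flow again"
#   fi
```

Matching on the URL is only a heuristic: an app that shows its login form on the same URL as the dashboard would need a different signal, such as `agent-browser wait --text` on a string that appears only after sign-in.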