### Install Dependencies (Windows)

Source: https://github.com/aigrator/logos/blob/main/README.md

Installs project dependencies on Windows using a PowerShell script. This is the recommended method for automatic setup.

```powershell
powershell ./setup_windows.ps1
```

--------------------------------

### Install Dependencies (Linux)

Source: https://github.com/aigrator/logos/blob/main/README.md

Installs project dependencies on Linux (Ubuntu/Debian) using a Bash script that automates the setup process.

```bash
bash ./setup_ubuntu.sh
```

--------------------------------

### Install Other Dependencies

Source: https://github.com/aigrator/logos/blob/main/README.md

Installs the remaining project dependencies from a requirements file. If you chose manual installation, run this command after installing PyTorch.

```bash
uv pip install -r requirements.txt
```

--------------------------------

### Launch Logos AI Assistant

Source: https://github.com/aigrator/logos/blob/main/README.md

Launches the main Python script for the Logos AI Assistant. This is the primary command to start the application.

```bash
python main.py
```

--------------------------------

### Run Logos AI Assistant Application (Python)

Source: https://context7.com/aigrator/logos/llms.txt

Provides commands to launch the Logos AI Assistant from the command line. It includes options for specifying a working directory for Terminal Mode and running the initial setup wizard.

```bash
# Launch Logos AI Assistant from command line
python main.py

# Launch with a specific working directory for Terminal Mode
python main.py /home/user/my_project  # Linux/macOS
python main.py C:\Users\User\Documents\my_project  # Windows

# Run the initial setup wizard only
python main.py --run-setup
```

--------------------------------

### Install PyTorch with CUDA

Source: https://github.com/aigrator/logos/blob/main/README.md

Installs PyTorch and Torchvision with CUDA support for GPU acceleration.
This command is intended for systems with compatible NVIDIA GPUs and pulls wheels from the `cu121` (CUDA 12.1) index.

```bash
uv pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
```

--------------------------------

### Launch Logos AI Assistant (Alias)

Source: https://github.com/aigrator/logos/blob/main/README.md

Launches the Logos AI Assistant using a system alias, typically created during automated setup. This provides a shorter command for convenience.

```bash
logos
```

--------------------------------

### Launch Application (launch_app) - Python

Source: https://context7.com/aigrator/logos/llms.txt

Launches an installed application by its name using cross-platform methods. The system handles the platform-specific execution details (Windows, macOS, Linux).

```python
# Tool call from LLM
{
    "name": "launch_app",
    "arguments": {
        "description": "Opening VS Code editor",
        "app_name": "Visual Studio Code"
    }
}

# The system uses platform-specific launch methods:
# - Windows: explorer.exe shell:Appsfolder\{appid}
# - macOS: open
# - Linux: xdg-open
```

--------------------------------

### Strategic Plan JSON Structure Example

Source: https://github.com/aigrator/logos/blob/main/prompts/vision/strategic_planning_system.txt

Demonstrates the required nested JSON structure for a strategic plan: each task has a description, a list of actions (each with a name and arguments, optionally including bounding-box coordinates), and an expected outcome. The planner's output is expected to follow this format so that tasks can be executed.

```json
{
  "type": "strategic_plan",
  "payload": [
    {
      "task_description": "Click the blue button",
      "actions": [
        {
          "name": "click",
          "arguments": {
            "description": "Click the blue button with text 'Click Me'",
            "x1": 832,
            "y1": 930,
            "x2": 1088,
            "y2": 1059
          }
        }
      ],
      "expected_outcome": "A new dialog or window should appear after clicking the button."
    },
    {
      "task_description": "Confirm the action",
      "actions": [
        {
          "name": "click",
          "arguments": {
            "description": "Click the 'OK' button in the confirmation dialog."
          }
        }
      ],
      "expected_outcome": "The confirmation dialog should close."
    }
  ]
}
```

--------------------------------

### Initialize and Run AppController (Python)

Source: https://context7.com/aigrator/logos/llms.txt

Demonstrates the initialization and execution of the AppController, the central component managing application state, mode switching, and coordination between UI, LLM workers, and handlers. It requires QApplication, SettingsManager, and ThreadPool.

```python
from PySide6.QtWidgets import QApplication
from controller import AppController
from settings_manager import SettingsManager
from thread_pool import ThreadPool

# Initialize the application
app = QApplication([])
settings_manager = SettingsManager()
thread_pool = ThreadPool()

# Create the controller with optional working directory
controller = AppController(
    app=app,
    settings_manager=settings_manager,
    thread_pool=thread_pool,
    local_server=None,
    working_directory="/path/to/project"
)

# Run the application
controller.run()
app.exec()
```

--------------------------------

### Application Launch and Maximize Workflow

Source: https://github.com/aigrator/logos/blob/main/prompts/vision/strategic_planning_system.txt

A two-step task pattern for launching and maximizing applications. This approach ensures the application is active while treating the maximization as a best-effort operation to prevent execution failures.

```json
[
  {
    "task_description": "Launch the application",
    "actions": [{"tool": "launch_app", "params": {"app_name": "TargetApp"}}],
    "expected_outcome": "The application window should appear on the screen."
  },
  {
    "task_description": "Maximize the application window",
    "actions": [{"tool": "hotkey", "params": {"keys": ["win", "up"]}}],
    "expected_outcome": "The application window should be active."
  }
]
```

--------------------------------

### Drag and Drop (drag_and_drop) - Python

Source: https://context7.com/aigrator/logos/llms.txt

Performs a drag and drop operation by specifying the start and end coordinates for the drag action. This is used for moving elements on the screen, like files into folders.

```python
# Tool call from LLM
{
    "name": "drag_and_drop",
    "arguments": {
        "description": "Dragging file to the folder",
        "start_x1": 100,
        "start_y1": 100,
        "start_x2": 150,
        "start_y2": 130,
        "end_x1": 300,
        "end_y1": 200,
        "end_x2": 350,
        "end_y2": 230
    }
}
```

--------------------------------

### SettingsManager Usage in Python

Source: https://github.com/aigrator/logos/blob/main/docs/system_architecture.md

This Python code illustrates the correct usage of the SettingsManager class within the project. It covers dependency injection, reading settings with defaults, writing single or multiple settings, and subscribing to setting changes via Qt signals for reactive updates. It emphasizes using a single instance and avoiding direct file manipulation.
```python
from src.settings_manager import SettingsManager

# --- Initialization (e.g., in main.py) ---
# settings_manager = SettingsManager()

# --- Dependency Injection ---
# class MyClass:
#     def __init__(self, settings_manager: SettingsManager, controller):
#         self.settings_manager = settings_manager
#         self.controller = controller
#         self.settings_manager.settings_changed.connect(self.on_settings_changed)

# --- Reading Settings ---
# active_provider = self.settings_manager.get('active_provider', SettingsManager.DEFAULTS['active_provider'])
# cache_dir = self.settings_manager.get('cache_dir', SettingsManager.DEFAULTS['cache_dir'])

# --- Writing Settings ---
# Single update
# self.settings_manager.set('current_model', 'gpt-4o')
# Batch update
# self.settings_manager.set_many({'language': 'en', 'current_model': 'gpt-4o'})

# --- Subscribing to Changes ---
# def on_settings_changed(self, new_settings: dict):
#     cache_dir = new_settings.get('cache_dir', SettingsManager.DEFAULTS['cache_dir'])
#     print(f"Cache directory updated to: {cache_dir}")

# --- Unsubscribing (on cleanup) ---
# def closeEvent(self, event):
#     self.settings_manager.settings_changed.disconnect(self.on_settings_changed)
#     super().closeEvent(event)
```

--------------------------------

### Implement Generic LLM Provider Class

Source: https://github.com/aigrator/logos/blob/main/docs/adding_new_llm_provider_guide.md

Defines the skeleton of the GenericProvider class: the static `get_provider_id` method and the constructor that initializes the requests session. The static configuration methods for the UI schema and default settings are not shown in this excerpt.
```python
import requests

# Import path as used in the BaseLLMProvider example elsewhere in these docs
from src.llm.base_provider import BaseLLMProvider


class GenericProvider(BaseLLMProvider):
    @staticmethod
    def get_provider_id() -> str:
        return "generic"

    def __init__(self, provider_config: dict):
        super().__init__(provider_config)
        self.session = requests.Session()
        self.api_url = "https://api.generic-provider.com/v1/chat"
```

--------------------------------

### Manage Application Settings (Python)

Source: https://context7.com/aigrator/logos/llms.txt

Illustrates how to use the SettingsManager to handle application configurations, including reading, writing, and subscribing to changes. It supports secure API key storage in the system keyring.

```python
from settings_manager import SettingsManager

# Initialize settings manager
settings_manager = SettingsManager('settings.json')

# Read settings
active_provider = settings_manager.get('active_provider', SettingsManager.DEFAULTS['active_provider'])
webp_quality = settings_manager.get('webp_quality', 80)

# Write single setting
settings_manager.set('theme', 'dark')

# Write multiple settings at once (single signal emission)
settings_manager.set_many({
    'language': 'en',
    'current_model': 'gemini-2.5-flash'
})

# Subscribe to settings changes
def on_settings_changed(new_settings: dict):
    cache_dir = new_settings.get('cache_dir', SettingsManager.DEFAULTS['cache_dir'])
    # Apply updated settings

settings_manager.settings_changed.connect(on_settings_changed)

# Get all settings
all_settings = settings_manager.all()
```

--------------------------------

### Configure System Settings via settings.json

Source: https://context7.com/aigrator/logos/llms.txt

The settings.json file manages global application state, including active LLM providers, keyboard shortcuts, and vision processing thresholds.
```json
{
  "active_provider": "google_gemini",
  "llm_providers": {
    "google_gemini": {
      "model": "gemini-2.5-flash",
      "require_json_output": true
    },
    "openai": {
      "model": "gpt-4o",
      "require_json_output": true
    },
    "anthropic": {
      "model": "claude-3-5-sonnet-20241022",
      "require_json_output": true
    }
  },
  "hotkeys": {
    "toggle_mode": "Ctrl+Shift+A",
    "pause_resume": "Ctrl+Alt+P",
    "stop_execution": "Ctrl+Alt+S",
    "cancel_request": "Ctrl+Shift+X"
  },
  "theme": "dark",
  "omniparser": {
    "ocr_languages": ["en"],
    "box_threshold": 0.35
  },
  "vision": {
    "post_action_delay_ms": 1000,
    "max_correction_attempts": 3
  }
}
```

--------------------------------

### BaseLLMProvider Implementation

Source: https://context7.com/aigrator/logos/llms.txt

The BaseLLMProvider abstract class defines the interface for custom LLM integrations. It requires implementation of provider identification, UI schema generation, and execution methods for standard and streaming responses.

```python
from src.llm.base_provider import BaseLLMProvider
from typing import Any

class CustomProvider(BaseLLMProvider):
    @staticmethod
    def get_provider_id() -> str:
        return "custom_provider"

    @staticmethod
    def get_display_name() -> str:
        return "Custom AI Provider"

    @staticmethod
    def get_default_settings() -> dict[str, Any]:
        return {
            'model': 'custom-model-v1',
            'api_key': '',
            'require_json_output': True
        }

    @staticmethod
    def get_ui_schema() -> dict[str, Any]:
        return {
            'api_key': {'type': 'password', 'label': 'API Key'},
            'model': {'type': 'dropdown', 'label': 'Model', 'options_source': 'models.json'}
        }

    def _execute_get_response(self, prompt, attachments=None, system_prompt=None,
                              history=None, schema_name=None) -> dict[str, Any]:
        response = self._call_api(prompt, system_prompt)
        return {
            "reasoning": response.get("reasoning", ""),
            "tool_calls": response.get("tool_calls")
        }

    def stream_response(self, prompt, attachments=None, system_prompt=None,
                        history=None, callbacks=None):
        for chunk in self._stream_api(prompt, system_prompt):
            yield chunk
```
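The example above depends on the project's own `BaseLLMProvider`, so it cannot run in isolation. The following self-contained sketch may help show the shape of that contract. `ProviderSketch`, `EchoProvider`, and the in-memory `_call_api` are stand-ins invented for illustration and are not the project's classes; only the method names and the `reasoning`/`tool_calls` response keys are taken from the example above.

```python
from abc import ABC, abstractmethod
from typing import Any, Iterator, Optional


class ProviderSketch(ABC):
    """Hypothetical stand-in for BaseLLMProvider (not the project's real class)."""

    def __init__(self, provider_config: dict[str, Any]):
        self.config = provider_config

    @staticmethod
    @abstractmethod
    def get_provider_id() -> str:
        """Return a unique string identifying the provider."""

    @abstractmethod
    def _execute_get_response(
        self, prompt: str, system_prompt: Optional[str] = None
    ) -> dict[str, Any]:
        """Return a dict with 'reasoning' and 'tool_calls' keys."""

    @abstractmethod
    def stream_response(
        self, prompt: str, system_prompt: Optional[str] = None
    ) -> Iterator[str]:
        """Yield response chunks as they become available."""


class EchoProvider(ProviderSketch):
    """Toy provider that echoes the prompt instead of calling a remote API."""

    @staticmethod
    def get_provider_id() -> str:
        return "echo"

    def _call_api(self, prompt, system_prompt):
        # Assumption: a real provider would issue an HTTP request here.
        return {"reasoning": f"echo: {prompt}", "tool_calls": None}

    def _execute_get_response(self, prompt, system_prompt=None):
        response = self._call_api(prompt, system_prompt)
        return {
            "reasoning": response.get("reasoning", ""),
            "tool_calls": response.get("tool_calls"),
        }

    def stream_response(self, prompt, system_prompt=None):
        # Yield one whitespace-separated word at a time to mimic streaming.
        for word in prompt.split():
            yield word


provider = EchoProvider({"model": "echo-v1"})
result = provider._execute_get_response("list files")
chunks = list(provider.stream_response("list files"))
```

Marking the base methods abstract means a subclass that forgets one of them fails at instantiation time rather than at first use, which is one plausible reason to structure a provider interface this way.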
--------------------------------

### Web Utility Tools

Source: https://context7.com/aigrator/logos/llms.txt

Tools for interacting with the web, including searching via DuckDuckGo, fetching page content, and downloading files to local storage.

```json
{
  "name": "web_search",
  "arguments": {
    "query": "Python asyncio best practices 2024"
  }
}
```

```json
{
  "name": "web_fetchs",
  "arguments": {
    "urls": ["https://docs.python.org/3/library/asyncio.html"],
    "prompt": "Extract the main concepts about asyncio"
  }
}
```

```json
{
  "name": "download_file",
  "arguments": {
    "url": "https://example.com/data.csv",
    "destination": "downloads/data.csv"
  }
}
```

--------------------------------

### Execute API Requests with Error Handling

Source: https://github.com/aigrator/logos/blob/main/docs/adding_new_llm_provider_guide.md

Demonstrates how to perform POST requests to an LLM endpoint and re-raise failures as project exceptions: HTTP 4xx responses become FatalError, while other HTTP errors and network failures become RetryableError.

```python
from typing import Any, Dict

import requests

# FatalError and RetryableError are the project's exception types;
# import them from wherever the project defines them.


# Method of the provider class (shown without the enclosing class)
def _execute_get_response(self, prompt: str, **kwargs) -> Dict[str, Any]:
    try:
        response = self.session.post(self.api_url, json={"prompt": prompt}, timeout=120)
        response.raise_for_status()
        return response.json()
    except requests.exceptions.HTTPError as e:
        # 4xx: the request itself is wrong, so retrying will not help
        if 400 <= e.response.status_code < 500:
            raise FatalError(f"Client error: {e.response.text}") from e
        raise RetryableError(f"Server error: {e.response.text}") from e
    except requests.exceptions.RequestException as e:
        raise RetryableError(f"Network error: {e}") from e
```

--------------------------------

### Hybrid Search Orchestration in TerminalHandler

Source: https://github.com/aigrator/logos/blob/main/docs/hybrid_web_search_guide.md

The _handle_web_request method acts as the orchestrator: it first attempts a native provider search (Plan A) and falls back to the local computer search (Plan B) when the provider fails or has no native search capability.
```python
def _handle_web_request(self, query):
    # Plan A: Attempt native provider search
    result = self.provider.web_search(query)
    if result is None:
        # Plan B: Fallback to local ddgs search
        result = self.computer.web_search(query)
    return result
```

--------------------------------

### Write to File (write_file) - Python

Source: https://context7.com/aigrator/logos/llms.txt

Writes specified content to a given file path. If the file does not exist, it will be created. This is fundamental for saving data or configuration.

```python
# Tool call from LLM
{
    "name": "write_file",
    "arguments": {
        "path": "output.txt",
        "content": "Hello, World!\nThis is a test file."
    }
}

# Response: "Successfully wrote to output.txt"
```

--------------------------------

### Find Files by Pattern (find_files) - Python

Source: https://context7.com/aigrator/logos/llms.txt

Searches for files matching a given pattern (e.g., '*.py') within a specified directory. This helps in locating specific types of files within a project.

```python
# Tool call from LLM
{
    "name": "find_files",
    "arguments": {
        "pattern": "*.py",
        "directory": "src/"
    }
}
```

--------------------------------

### Launch Logos AI Assistant with Working Directory (Linux/macOS)

Source: https://github.com/aigrator/logos/blob/main/README.md

Launches the Logos AI Assistant and sets a specific directory as the working directory for terminal mode. File operations and commands will be relative to this path.

```bash
python main.py /home/user/my_project
```

--------------------------------

### POST /hotkey

Source: https://context7.com/aigrator/logos/llms.txt

Simulates keyboard shortcut combinations to trigger system or application actions.

```APIDOC
## POST /hotkey

### Description
Presses a combination of keys simultaneously to perform actions like saving files or copying text.

### Method
POST

### Endpoint
/hotkey

### Request Body
- **description** (string) - Required - A brief explanation of the action.
- **keys** (array) - Required - A list of keys to press (e.g., ["ctrl", "s"]).

### Request Example
{
  "description": "Pressing Ctrl+S to save the document",
  "keys": ["ctrl", "s"]
}

### Response
#### Success Response (200)
- **status** (string) - Confirmation of the key press execution.
```

--------------------------------

### POST /write_file

Source: https://context7.com/aigrator/logos/llms.txt

Writes text content to a specified file path, creating the file if it does not exist.

```APIDOC
## POST /write_file

### Description
Updates or creates a file with the provided content.

### Method
POST

### Endpoint
/write_file

### Request Body
- **path** (string) - Required - The destination file path.
- **content** (string) - Required - The text content to write.

### Request Example
{
  "path": "output.txt",
  "content": "Hello, World!"
}

### Response
#### Success Response (200)
- **message** (string) - Confirmation message indicating success.
```

--------------------------------

### Launch Logos AI Assistant with Working Directory (Windows)

Source: https://github.com/aigrator/logos/blob/main/README.md

Launches the Logos AI Assistant on Windows and sets a specific directory as the working directory for terminal mode. File operations and commands will be relative to this path.

```powershell
python main.py C:\Users\User\Documents\my_project
```

--------------------------------

### POST /execute_shell

Source: https://context7.com/aigrator/logos/llms.txt

Executes system-level shell commands and returns the standard output, error, and exit code.

```APIDOC
## POST /execute_shell

### Description
Runs a command in the system's default shell.

### Method
POST

### Endpoint
/execute_shell

### Request Body
- **command** (string) - Required - The shell command to execute.

### Request Example
{
  "command": "git status"
}

### Response
#### Success Response (200)
- **stdout** (string) - The standard output of the command.
- **stderr** (string) - The standard error output.
- **exit_code** (integer) - The command exit status code.
```

--------------------------------

### Search File Content with Regex (search_file_content) - Python

Source: https://context7.com/aigrator/logos/llms.txt

Searches for content within files using regular expressions, so a single pattern can be matched across multiple files.

--------------------------------

### Type Text into Input Field (Vision Mode - Python)

Source: https://context7.com/aigrator/logos/llms.txt

Illustrates the 'write_text' tool for Vision Mode, enabling the AI assistant to input text into UI fields. It supports both clipboard and direct key simulation methods, with 'direct' being the default on Linux for better compatibility.

```python
# Tool call from LLM
{
    "name": "write_text",
    "arguments": {
        "description": "Typing the filename 'mydocument.txt'",
        "text": "mydocument.txt",
        "method": "clipboard"  # Optional: "clipboard" (fast) or "direct" (compatible)
    }
}

# The method defaults to:
# - "direct" on Linux (better compatibility)
```

--------------------------------

### Native Google Search Implementation

Source: https://github.com/aigrator/logos/blob/main/docs/hybrid_web_search_guide.md

The GoogleProvider utilizes the Gemini API's built-in GoogleSearchRetrieval tool to perform efficient, integrated web searches.

```python
import google.generativeai as genai

class GoogleProvider:
    def web_search(self, query):
        tool = genai.Tool(google_search_retrieval=genai.GoogleSearchRetrieval())
        # Execution logic for native search (elided in the source); it produces `response`
        return response
```

--------------------------------

### Local Fallback Search via Computer Class

Source: https://github.com/aigrator/logos/blob/main/docs/hybrid_web_search_guide.md

The Computer class handles the fallback search by retrieving the required ddgs API key from the settings manager and performing the search request.
```python
class Computer:
    def web_search(self, query):
        api_key = self.settings_manager.get("ddgs_api_key")
        # Perform HTTP request to ddgs using api_key (elided in the source); it produces `search_results`
        return search_results
```

--------------------------------

### List Directory Contents (list_directory) - Python

Source: https://context7.com/aigrator/logos/llms.txt

Lists the files and subdirectories within a specified directory path. This is useful for exploring the file system and understanding the project structure.

```python
# Tool call from LLM
{
    "name": "list_directory",
    "arguments": {
        "path": "src/"
    }
}
```

--------------------------------

### Perform Single Click Action (Vision Mode - Python)

Source: https://context7.com/aigrator/logos/llms.txt

Defines and implements the 'click' tool for Vision Mode, enabling the AI assistant to perform a single mouse click at specified coordinates. It includes both the LLM tool definition format and the internal Python implementation using the Computer class.

```python
# Tool definition used by the LLM
{
    "name": "click",
    "arguments": {
        "description": "Click the Save button",
        "x1": 100,  # Top-left x-coordinate
        "y1": 100,  # Top-left y-coordinate
        "x2": 200,  # Bottom-right x-coordinate
        "y2": 150,  # Bottom-right y-coordinate
        "button": "left"  # Optional: "left", "middle", or "right"
    }
}

# Internal implementation
from src.computer.computer import Computer

computer = Computer(settings_manager, mode="vision")
computer.click(
    description="Click the Save button",
    x1=100, y1=100, x2=200, y2=150,
    button="left"
)
```

--------------------------------

### Wait for UI Element (wait_element) - Python

Source: https://context7.com/aigrator/logos/llms.txt

Pauses execution until a specific UI element or condition becomes present on the screen. This is crucial for synchronizing actions with dynamic web pages or application states.
```python
# Tool call from LLM
{
    "name": "wait_element",
    "arguments": {
        "description": "Wait for the loading spinner to disappear",
        "expected_outcome": "The page should finish loading"
    }
}
```

--------------------------------

### Execute System Actions with Computer Class

Source: https://context7.com/aigrator/logos/llms.txt

The Computer class acts as the execution engine, dynamically loading tools based on the operational mode (vision or terminal). It supports GUI automation like clicking and typing, as well as shell command execution and file system manipulation.

```python
from src.computer.computer import Computer
from settings_manager import SettingsManager

settings_manager = SettingsManager()

# Create Computer instance for Vision Mode
vision_computer = Computer(settings_manager, mode="vision")

# Execute vision tools
vision_computer.click(description="Click Save", x1=100, y1=100, x2=200, y2=150)
vision_computer.write_text(description="Enter filename", text="document.txt")
vision_computer.hotkey(description="Save file", keys=["ctrl", "s"])

# Create Computer instance for Terminal Mode
terminal_computer = Computer(
    settings_manager,
    mode="terminal",
    working_directory="/home/user/project"
)

# Execute terminal tools
result = terminal_computer.execute_shell(command="ls -la")
print(f"Exit code: {result['exit_code']}")
print(f"Output: {result['stdout']}")

content = terminal_computer.read_files(paths=["README.md", "setup.py"])
print(content)

terminal_computer.write_file(path="output.log", content="Task completed")

# Search files
matches = terminal_computer.search_file_content(
    pattern="import.*torch",
    directory="src/"
)
```

--------------------------------

### File System Operations

Source: https://context7.com/aigrator/logos/llms.txt

A collection of JSON-based tool calls for common file system tasks including searching, replacing text, managing directories, and performing file operations like copy, move, and delete.
```json
{
  "name": "search_file_content",
  "arguments": {
    "pattern": "def main\\(",
    "directory": "src/"
  }
}
```

```json
{
  "name": "replace",
  "arguments": {
    "file_path": "config.py",
    "old_string": "DEBUG = True",
    "new_string": "DEBUG = False"
  }
}
```

```json
{
  "name": "replace_many",
  "arguments": {
    "file_path": "settings.py",
    "replacements": [
      {"old": "localhost", "new": "production.server.com"},
      {"old": "DEBUG = True", "new": "DEBUG = False"}
    ]
  }
}
```

```json
{
  "name": "create_directory",
  "arguments": {
    "path": "new_folder/subfolder"
  }
}
```

```json
{
  "name": "delete_files",
  "arguments": {
    "paths": ["temp.txt", "cache/"]
  }
}
```

```json
{
  "name": "copy_files",
  "arguments": {
    "source": "original.txt",
    "destination": "backup/original_copy.txt"
  }
}
```

```json
{
  "name": "move_files",
  "arguments": {
    "source": "old_location/file.txt",
    "destination": "new_location/file.txt"
  }
}
```

```json
{
  "name": "path_exists",
  "arguments": {
    "path": "config.json"
  }
}
```
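As a rough illustration of what a `replace_many` call asks for, the sketch below applies a list of `old`/`new` pairs to a file in order. The function name `apply_replacements` and its literal, sequential matching are assumptions made for illustration; the source does not describe how Logos implements the tool.

```python
import tempfile
from pathlib import Path


def apply_replacements(file_path: str, replacements: list[dict[str, str]]) -> int:
    """Apply literal old/new replacements in order; return how many pairs matched."""
    path = Path(file_path)
    text = path.read_text(encoding="utf-8")
    matched = 0
    for pair in replacements:
        # Each pair sees the text as already modified by the earlier pairs.
        if pair["old"] in text:
            text = text.replace(pair["old"], pair["new"])
            matched += 1
    path.write_text(text, encoding="utf-8")
    return matched


# Usage with a throwaway file that mirrors the settings.py example above
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "settings.py"
    target.write_text("HOST = 'localhost'\nDEBUG = True\n", encoding="utf-8")
    count = apply_replacements(str(target), [
        {"old": "localhost", "new": "production.server.com"},
        {"old": "DEBUG = True", "new": "DEBUG = False"},
    ])
    updated = target.read_text(encoding="utf-8")
```

Returning the number of matched pairs gives the caller a cheap way to detect that an expected string was absent, which a silent no-op replacement would hide.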