# mgrep

mgrep is a semantic grep tool built by Mixedbread that enables natural-language search across codebases, documents, PDFs, and images. Unlike traditional grep, which requires exact pattern matching, mgrep understands the meaning behind your queries and finds relevant content even when exact keywords don't match. It indexes your files into a cloud-backed Mixedbread Store and uses state-of-the-art semantic retrieval with reranking for precise results.

The tool is designed for both humans and AI coding agents, reducing token usage by roughly a factor of two while maintaining search quality. It features background indexing via file watchers, respects `.gitignore` patterns, and integrates with popular coding agents like Claude Code, OpenCode, Codex, and Factory Droid. mgrep supports multimodal search including code, text, PDFs, and images, with audio and video support coming soon.

## Installation

Install mgrep globally via npm, pnpm, or bun.

```bash
npm install -g @mixedbread/mgrep
```

## CLI Commands

### mgrep login

Authenticates with the Mixedbread platform using a device-based OAuth flow. Opens a browser for authorization and stores the token locally for subsequent commands.

```bash
# Interactive browser-based login
mgrep login

# Alternative: API key authentication for CI/CD environments
export MXBAI_API_KEY=your_api_key_here
mgrep search "query"  # Uses API key automatically
```

### mgrep logout

Logs out from the Mixedbread platform and clears stored authentication tokens.

```bash
mgrep logout
# Output: ✅ Successfully logged out
```

### mgrep watch

Indexes files in the current directory and continuously monitors for changes. Uploads files to a Mixedbread Store and keeps it synchronized via file watchers. Respects `.gitignore` and optional `.mgrepignore` files.

```bash
# Index current repository and watch for changes
cd /path/to/project
mgrep watch

# Dry run to preview what would be indexed
mgrep watch --dry-run

# With custom file limits
mgrep watch --max-file-size 5242880 --max-file-count 5000

# Example output:
# ✓ Initial sync complete (150/150) • uploaded 23
# Watching for file changes in /path/to/project
# change: /path/to/project/src/index.ts
```

### mgrep search

Performs semantic search across indexed files. This is the default command and can be invoked simply as `mgrep "query"`.

```bash
# Basic semantic search in current directory
mgrep "where do we set up authentication?"

# Search in a specific directory
mgrep "How are chunks defined?" src/models

# Limit results to top 5 matches
mgrep -m 5 "database connection handling"

# Show file content in results
mgrep -c "error handling patterns"

# Generate a summarized answer based on search results
mgrep -a "How is rate limiting implemented?"

# Sync files before searching (useful for fresh projects)
mgrep -s "user validation logic"

# Disable reranking for faster but less precise results
mgrep --no-rerank "config parser"

# Search PDFs and documents
mgrep "What is the conclusion of the paper?" research-paper.pdf

# Example output:
# ./src/lib/auth.ts:45-67 (95.23% match)
# ./src/middleware/session.ts:12-28 (89.15% match)
# ./src/routes/login.ts:33-51 (85.67% match)
```

### mgrep search --web

Searches the web alongside local files using Mixedbread's web search capability.

```bash
# Web search with local results
mgrep --web "best practices for error handling in TypeScript"

# Web search with summarized answer
mgrep --web --answer "How do I integrate a JavaScript runtime into Deno?"

# Note: web search queries the `mixedbread/web` store and merges results
# based on relevance with your local indexed files.
```

### mgrep install-claude-code

Installs the mgrep plugin for the Claude Code agent.
Requires authentication first and Claude Code version 2.0.36 or higher.

```bash
# Install mgrep plugin for Claude Code
mgrep install-claude-code

# Output:
# Successfully added the mixedbread-ai/mgrep plugin to the marketplace
# Successfully installed the mgrep plugin

# Then start Claude Code in your project
cd /path/to/project
claude
```

### mgrep uninstall-claude-code

Removes the mgrep plugin from Claude Code.

```bash
mgrep uninstall-claude-code
```

### mgrep install-opencode

Installs mgrep integration for the OpenCode agent.

```bash
mgrep install-opencode
```

### mgrep install-codex

Installs mgrep integration for the Codex agent.

```bash
mgrep install-codex
```

### mgrep install-droid

Installs mgrep hooks and skills for the Factory Droid agent.

```bash
mgrep install-droid
```

## Configuration

### Configuration File (.mgreprc.yaml)

Create a `.mgreprc.yaml` or `.mgreprc.yml` in your project root for local configuration, or `~/.config/mgrep/config.yaml` for global settings.

```yaml
# .mgreprc.yaml

# Maximum file size in bytes to upload (default: 1MB = 1048576)
maxFileSize: 5242880

# Maximum number of files to sync per operation (default: 1000)
maxFileCount: 5000
```

### Environment Variables

Configure mgrep behavior via environment variables, which is especially useful for CI/CD pipelines.

```bash
# Authentication
export MXBAI_API_KEY=your_api_key_here  # Bypass browser login
export MXBAI_STORE=my-custom-store      # Override default store name

# Search options
export MGREP_MAX_COUNT=25  # Default max results
export MGREP_CONTENT=1     # Always show content
export MGREP_ANSWER=1      # Always generate answers
export MGREP_WEB=1         # Include web results
export MGREP_SYNC=1        # Sync before searching
export MGREP_DRY_RUN=1     # Enable dry run mode
export MGREP_RERANK=0      # Disable reranking

# Sync options
export MGREP_MAX_FILE_SIZE=5242880  # Max file size (5MB)
export MGREP_MAX_FILE_COUNT=5000    # Max files per sync

# Example usage with environment variables
export MGREP_MAX_COUNT=20
export MGREP_CONTENT=1
export MGREP_ANSWER=1
mgrep "search query"
```

### .mgrepignore File

Create a `.mgrepignore` file in your project root to exclude additional files from indexing. Uses the same syntax as `.gitignore`.

```gitignore
# .mgrepignore
node_modules/
dist/
*.log
.env
coverage/
```

## Store Interface

The Store interface provides programmatic access to Mixedbread's vector store operations.

```typescript
import { MixedbreadStore } from "./lib/store.js";

// Several return types below lost their type arguments in this excerpt;
// `unknown` is a placeholder, not the library's actual type.
interface Store {
  // List files in a store with optional path filtering
  listFiles(
    storeId: string,
    options?: { pathPrefix?: string }
  ): AsyncGenerator<unknown>;

  // Upload a file to the store
  uploadFile(
    storeId: string,
    file: File | ReadableStream,
    options: {
      external_id: string;
      overwrite?: boolean;
      metadata?: { path: string; hash: string; mtime?: number };
    }
  ): Promise<unknown>;

  // Delete a file by external ID
  deleteFile(storeId: string, externalId: string): Promise<unknown>;

  // Semantic search across stores
  search(
    storeIds: string[],
    query: string,
    top_k?: number,
    search_options?: { rerank?: boolean },
    filters?: SearchFilter
  ): Promise<{ data: ChunkType[] }>;

  // Question answering with citations
  ask(
    storeIds: string[],
    question: string,
    top_k?: number,
    search_options?: { rerank?: boolean },
    filters?: SearchFilter
  ): Promise<{ answer: string; sources: ChunkType[] }>;

  // Retrieve store metadata
  retrieve(storeId: string): Promise<unknown>;

  // Create a new store
  create(options: { name: string; description?: string }): Promise<unknown>;

  // Get store info including pending/in-progress counts
  getInfo(storeId: string): Promise<unknown>;
}
```

## loadConfig Function

Loads mgrep configuration with precedence: CLI flags > environment variables > local config > global config > defaults.

```typescript
import {
  exceedsMaxFileSize,
  formatFileSize,
  loadConfig,
  type MgrepConfig,
} from "./lib/config.js";

// Load configuration for a specific directory
const config: MgrepConfig = loadConfig("/path/to/project", {
  maxFileSize: 2097152, // CLI override: 2MB
  maxFileCount: 2000, // CLI override
});

// Configuration interface, shown for reference (already imported above):
// interface MgrepConfig {
//   maxFileSize: number;  // Default: 1048576 (1MB)
//   maxFileCount: number; // Default: 1000
// }

// Example: Check if file exceeds max size
if (exceedsMaxFileSize("/path/to/large-file.pdf", config.maxFileSize)) {
  console.log(`File exceeds limit of ${formatFileSize(config.maxFileSize)}`);
}
```

## Summary

mgrep is ideal for developers and AI coding agents who need to navigate large codebases efficiently. Its primary use cases include semantic code exploration, feature discovery, onboarding to new projects, searching documentation and PDFs, and enabling AI agents to find relevant code without exhaustive pattern guessing. The tool complements traditional grep by providing intent-based search while grep handles exact pattern matching and symbol tracing.

Integration with coding agents is seamless through dedicated install commands. For Claude Code users, running `mgrep install-claude-code` followed by starting Claude in a project directory enables automatic background indexing and semantic search capabilities. The tool can also be used standalone via the CLI or programmatically through the Store interface for custom integrations. Configuration is flexible through YAML files, environment variables, or CLI flags, making it suitable for both local development and CI/CD pipelines.
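
As a rough illustration of the precedence order given for `loadConfig` (CLI flags > environment variables > local config > global config > defaults), the sketch below merges configuration layers from lowest to highest priority. It is not mgrep's implementation: `resolveConfig`, `envLayer`, `definedOnly`, and `parsePositiveInt` are hypothetical names, and ignoring invalid environment values is an assumption. Only the default values and the `MGREP_MAX_FILE_SIZE` / `MGREP_MAX_FILE_COUNT` variable names come from the sections above.

```typescript
// Hypothetical sketch of layered config resolution; not mgrep's actual code.
interface MgrepConfig {
  maxFileSize: number;
  maxFileCount: number;
}

// A layer may leave any setting unset.
type Layer = {
  maxFileSize?: number | undefined;
  maxFileCount?: number | undefined;
};

// Defaults as documented: 1MB and 1000 files.
const DEFAULTS: MgrepConfig = { maxFileSize: 1048576, maxFileCount: 1000 };

// Drop undefined entries so an unset value never masks a lower layer.
function definedOnly(layer: Layer): Partial<MgrepConfig> {
  const out: Partial<MgrepConfig> = {};
  if (layer.maxFileSize !== undefined) out.maxFileSize = layer.maxFileSize;
  if (layer.maxFileCount !== undefined) out.maxFileCount = layer.maxFileCount;
  return out;
}

// Parse a positive integer, or return undefined for missing/invalid input
// (assumption: invalid values are ignored rather than raising an error).
function parsePositiveInt(value: string | undefined): number | undefined {
  if (value === undefined) return undefined;
  const n = Number(value);
  return Number.isInteger(n) && n > 0 ? n : undefined;
}

// Build a layer from environment-style key/value pairs.
function envLayer(env: Record<string, string | undefined>): Layer {
  return {
    maxFileSize: parsePositiveInt(env["MGREP_MAX_FILE_SIZE"]),
    maxFileCount: parsePositiveInt(env["MGREP_MAX_FILE_COUNT"]),
  };
}

// Later spreads win, so layers are listed from lowest to highest precedence.
function resolveConfig(layers: {
  global?: Layer;
  local?: Layer;
  env?: Record<string, string | undefined>;
  cli?: Layer;
}): MgrepConfig {
  return {
    ...DEFAULTS,
    ...definedOnly(layers.global ?? {}),
    ...definedOnly(layers.local ?? {}),
    ...definedOnly(envLayer(layers.env ?? {})),
    ...definedOnly(layers.cli ?? {}),
  };
}

const resolved = resolveConfig({
  global: { maxFileCount: 3000 },
  local: { maxFileSize: 5242880 },
  env: { MGREP_MAX_FILE_COUNT: "5000" },
  cli: { maxFileSize: 2097152 },
});
console.log(resolved);
```

Here the CLI value for `maxFileSize` overrides the local config file, and the environment value for `maxFileCount` overrides the global config file. Passing the environment as a plain object rather than reading `process.env` directly keeps the sketch easy to test.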