# groq-rag

groq-rag is an extended TypeScript SDK for Groq that adds RAG (Retrieval-Augmented Generation), web browsing, and autonomous agent capabilities on top of the official groq-sdk. It provides 100% API compatibility with the official Groq SDK while adding powerful features for building intelligent AI applications. The library enables developers to create chatbots and agents that can search the web, fetch URLs, query knowledge bases, and reason through complex tasks using Groq's fast LLM inference.

The core functionality centers around the GroqRAG client class, which wraps the official Groq SDK and provides access to three main modules: RAG (document ingestion and semantic retrieval), Web (URL fetching and web search), and Agents (autonomous tool-using AI agents). The library supports multiple vector store backends, embedding providers, and search providers, making it flexible for both development and production use cases.

## GroqRAG Client Initialization

The main entry point for all functionality. Creates a client instance that provides access to RAG, web, and agent features while maintaining full compatibility with the official Groq SDK.

```typescript
import GroqRAG from 'groq-rag';

const client = new GroqRAG({
  apiKey: process.env.GROQ_API_KEY, // Optional: defaults to GROQ_API_KEY env var
  baseURL: 'https://api.groq.com', // Optional: custom API endpoint
  timeout: 30000, // Optional: request timeout in ms
  maxRetries: 2, // Optional: retry attempts
});

// Access underlying Groq SDK for direct API usage
const groqSdk = client.client;

// Standard chat completion (Groq SDK passthrough)
const response = await client.complete({
  model: 'llama-3.3-70b-versatile',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'What is the capital of France?' },
  ],
});

console.log(response.choices[0].message.content);
// Output: "The capital of France is Paris."
```

## RAG Initialization

Initialize the RAG system with configurable vector stores, embedding providers, and chunking strategies. Required before using any RAG features.

```typescript
import GroqRAG from 'groq-rag';

const client = new GroqRAG();

// Initialize with in-memory vector store (default)
await client.initRAG({
  embedding: {
    provider: 'groq', // 'groq' | 'openai'
    model: 'text-embedding-3-small', // For OpenAI
    dimensions: 1536, // Embedding dimensions
  },
  vectorStore: {
    provider: 'memory', // 'memory' | 'chroma'
    connectionString: 'http://localhost:8000', // For ChromaDB
    indexName: 'my-collection',
  },
  chunking: {
    strategy: 'recursive', // 'recursive' | 'fixed' | 'sentence' | 'paragraph'
    chunkSize: 1000,
    chunkOverlap: 200,
  },
});

// Get retriever instance
const retriever = await client.getRetriever();
console.log('RAG initialized with', await retriever.count(), 'documents');
```

## Adding Documents to RAG

Add documents to the knowledge base for semantic search and retrieval. Supports single documents, bulk uploads, and URL content ingestion.

```typescript
import GroqRAG from 'groq-rag';

const client = new GroqRAG();
await client.initRAG();

// Add single document with metadata
const docId = await client.rag.addDocument(
  `Our company's refund policy allows returns within 30 days of purchase.
  Items must be in original condition with tags attached.
  Refunds are processed within 5-7 business days.`,
  { source: 'refund-policy', category: 'policies', version: '2.0' }
);

// Add multiple documents at once
const docIds = await client.rag.addDocuments([
  { content: 'Shipping takes 5-7 business days.', metadata: { source: 'shipping' } },
  { content: 'Express shipping available for $12.99.', metadata: { source: 'shipping' } },
]);

// Add URL content directly to knowledge base
const urlDocId = await client.rag.addUrl('https://example.com/docs', {
  source: 'external-docs',
  fetchedBy: 'system',
});

// Check document count
const count = await client.rag.count();
console.log(`Knowledge base contains ${count} document chunks`);

// Clear all documents
await client.rag.clear();
```

## RAG-Augmented Chat

Chat with automatic context retrieval from the knowledge base. The system queries relevant documents and includes them in the LLM context.

```typescript
import GroqRAG from 'groq-rag';

const client = new GroqRAG();
await client.initRAG();

// Add knowledge to the base
await client.rag.addDocument(
  'Our premium plan costs $29/month and includes unlimited API calls.',
  { source: 'pricing' }
);
await client.rag.addDocument(
  'The basic plan is $9/month with 1000 API calls included.',
  { source: 'pricing' }
);

// Chat with RAG augmentation
const response = await client.chat.withRAG({
  messages: [
    {
      role: 'user',
      content: 'What pricing plans do you offer?',
    },
  ],
  model: 'llama-3.3-70b-versatile',
  topK: 5, // Number of documents to retrieve
  minScore: 0.5, // Minimum similarity score (0-1)
  includeMetadata: true, // Include metadata in context
  temperature: 0.7,
  maxTokens: 1024,
  systemPrompt: 'You are a helpful sales assistant.', // Optional custom prompt
});

console.log('Answer:', response.content);
console.log('Token usage:', response.usage);

// Access retrieved sources
for (const source of response.sources) {
  console.log(`- Score: ${source.score.toFixed(3)}`);
  console.log(`  Content: ${source.document.content}`);
  console.log(`  Metadata:`, source.document.metadata);
}
```

## Querying the Knowledge Base

Directly query the RAG system for semantic search without LLM completion. Useful for building custom retrieval pipelines.

```typescript
import GroqRAG from 'groq-rag';

const client = new GroqRAG();
await client.initRAG();

// Metadata is attached so that it can appear in the formatted context below
await client.rag.addDocument('Python is great for data science.', { source: 'docs' });
await client.rag.addDocument('JavaScript dominates web development.', { source: 'docs' });
await client.rag.addDocument('Rust provides memory safety without garbage collection.', { source: 'docs' });

// Semantic search query
const results = await client.rag.query('What programming language is best for web?', {
  topK: 3,
  minScore: 0.3,
});

for (const result of results) {
  console.log(`Score: ${result.score}`);
  console.log(`Content: ${result.document.content}`);
  console.log(`ID: ${result.document.id}`);
}

// Get pre-formatted context string for custom LLM prompts
const context = await client.rag.getContext('web development languages', {
  topK: 2,
  includeMetadata: true,
});

console.log('Formatted context:\n', context);
// Output:
// [Source 1 | source: docs]
// JavaScript dominates web development.
// ---
// [Source 2 | source: docs]
// Python is great for data science.
```

## Web Search Chat

Augment chat responses with live web search results. Uses DuckDuckGo by default (free, no API key required).
```typescript
import GroqRAG from 'groq-rag';

const client = new GroqRAG();

// Chat with web search augmentation
const response = await client.chat.withWebSearch({
  messages: [
    { role: 'user', content: 'What are the latest developments in AI?' },
  ],
  model: 'llama-3.3-70b-versatile',
  searchQuery: 'AI artificial intelligence news 2024', // Optional: custom search query
  maxResults: 5,
  maxSnippetLength: 200, // Truncate snippets (optional)
  maxTotalContentLength: 2000, // Limit total context size (optional)
});

console.log('Response:', response.content);
console.log('\nSearch Sources:');

for (const source of response.sources) {
  console.log(`[${source.position}] ${source.title}`);
  console.log(`  URL: ${source.url}`);
  console.log(`  Snippet: ${source.snippet.substring(0, 100)}...`);
}
```

## URL Content Chat

Chat about the content of a specific web page. Fetches, parses, and includes the page content in the LLM context.

```typescript
import GroqRAG from 'groq-rag';

const client = new GroqRAG();

// Chat about a specific URL
const response = await client.chat.withUrl({
  messages: [
    { role: 'user', content: 'Summarize this article in 3 bullet points.' },
  ],
  url: 'https://example.com/blog/article',
  model: 'llama-3.3-70b-versatile',
  maxContentLength: 5000, // Optional: limit fetched content
  maxTokens: 1000, // Optional: token-based limit (~4 chars/token)
});

console.log('Summary:', response.content);
console.log('Page title:', response.source.title);
console.log('Fetched at:', response.source.fetchedAt);
console.log('Metadata:', response.source.metadata);
```

## Web Fetching Module

Fetch and parse web pages to clean markdown. Supports content extraction, link extraction, and image extraction.
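
Content limits such as `maxContentLength` and the token-based `maxTokens` limit rest on this document's stated heuristic of roughly 4 characters per token. The sketch below is one way to picture that arithmetic. `approxTokens` and `truncateToApproxTokens` are hypothetical names used only for illustration; they are not taken from the library's source.

```typescript
// Hypothetical helpers illustrating the "~4 chars/token" heuristic.
const CHARS_PER_TOKEN = 4; // assumption: taken from the heuristic quoted in this document

// Estimate tokens by dividing the character count and rounding up.
function approxTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Cut text so its estimated token count does not exceed maxTokens.
function truncateToApproxTokens(text: string, maxTokens: number): string {
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  return text.length <= maxChars ? text : text.slice(0, maxChars);
}

const pageText = 'a'.repeat(10000);
const limitedText = truncateToApproxTokens(pageText, 1000);

console.log(approxTokens('Hello, world!')); // 4 (13 characters / 4, rounded up)
console.log(limitedText.length); // 4000
```

A character-based estimate is cheap and needs no tokenizer, at the cost of being only approximate for code or non-English text.
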
```typescript
import GroqRAG from 'groq-rag';

const client = new GroqRAG();

// Fetch a single URL
const result = await client.web.fetch('https://example.com', {
  timeout: 30000,
  includeLinks: true,
  includeImages: true,
  maxContentLength: 10000, // Optional content limit
});

console.log('Title:', result.title);
console.log('Content:', result.content);
console.log('Markdown:', result.markdown);
console.log('Metadata:', result.metadata);
console.log('Links:', result.links?.slice(0, 5));
console.log('Images:', result.images?.slice(0, 3));
console.log('Fetched at:', result.fetchedAt);

// Fetch multiple URLs in parallel
const urls = [
  'https://example.com/page1',
  'https://example.com/page2',
  'https://example.com/page3',
];
const results = await client.web.fetchMany(urls);

for (const [index, res] of results.entries()) {
  if (res instanceof Error) {
    console.log(`Failed to fetch ${urls[index]}: ${res.message}`);
  } else {
    console.log(`${res.title}: ${res.content.substring(0, 100)}...`);
  }
}

// Get just the markdown content
const markdown = await client.web.fetchMarkdown('https://example.com/docs');
console.log(markdown);
```

## Web Search Module

Search the web using multiple providers. DuckDuckGo is free and requires no API key.
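
Search calls in this document accept `maxSnippetLength` and `maxTotalContentLength`. How the library applies them is not shown here, so the following sketch assumes one plausible policy: truncate each snippet first, then keep results until the total budget is used up. The `Snippet` type and `applySnippetLimits` function are hypothetical.

```typescript
// Hypothetical result shape; field names mirror the search results printed in this document.
interface Snippet {
  title: string;
  url: string;
  snippet: string;
}

function applySnippetLimits(
  results: Snippet[],
  maxSnippetLength: number,
  maxTotalContentLength: number,
): Snippet[] {
  const kept: Snippet[] = [];
  let total = 0;
  for (const r of results) {
    // Truncate each snippet individually first.
    const snippet = r.snippet.slice(0, maxSnippetLength);
    // Stop once adding another snippet would exceed the overall budget.
    if (total + snippet.length > maxTotalContentLength) break;
    kept.push({ ...r, snippet });
    total += snippet.length;
  }
  return kept;
}

const sampleResults: Snippet[] = [
  { title: 'A', url: 'https://example.com/a', snippet: 'x'.repeat(300) },
  { title: 'B', url: 'https://example.com/b', snippet: 'y'.repeat(300) },
  { title: 'C', url: 'https://example.com/c', snippet: 'z'.repeat(300) },
];

const trimmedResults = applySnippetLimits(sampleResults, 150, 350);
console.log(trimmedResults.map((r) => r.snippet.length)); // two snippets of 150 characters each
```

Capping both the per-snippet and the total length keeps the prompt size predictable regardless of how verbose a search provider's snippets are.
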
```typescript
import GroqRAG, { createSearchProvider } from 'groq-rag';

const client = new GroqRAG();

// Search using default DuckDuckGo
const results = await client.web.search('TypeScript best practices', {
  maxResults: 10,
  safeSearch: true,
  maxSnippetLength: 150, // Optional: truncate snippets
  maxTotalContentLength: 2000, // Optional: limit total content
});

for (const result of results) {
  console.log(`[${result.position}] ${result.title}`);
  console.log(`URL: ${result.url}`);
  console.log(`Snippet: ${result.snippet}\n`);
}

// Create custom search providers
const braveSearch = createSearchProvider({
  provider: 'brave',
  apiKey: process.env.BRAVE_API_KEY,
});

const serperSearch = createSearchProvider({
  provider: 'serper',
  apiKey: process.env.SERPER_API_KEY,
});

const braveResults = await braveSearch.search('AI news', { maxResults: 5 });
```

## Creating Agents with Built-in Tools

Create autonomous agents that can use tools to accomplish tasks. Built-in tools include web search, URL fetching, calculator, datetime, and RAG queries.

```typescript
import GroqRAG from 'groq-rag';

const client = new GroqRAG();
await client.initRAG();

// Add knowledge for RAG tool
await client.rag.addDocument(
  'Our company headquarters is at 123 Main Street, San Francisco.',
  { source: 'company-info' }
);

// Create agent with all built-in tools
const agent = await client.createAgentWithBuiltins({
  name: 'Research Assistant',
  model: 'llama-3.3-70b-versatile',
  systemPrompt: `You are a helpful research assistant. Use your tools to:
- Search the web for current information
- Fetch and read specific web pages
- Query the knowledge base for internal data
- Perform calculations
Always cite your sources.`,
  maxIterations: 10, // Max tool-use cycles
  verbose: true, // Log agent reasoning
});

// Run the agent
const result = await agent.run(
  'What is the current stock price of Apple and where is our office located?'
);

console.log('Answer:', result.output);
console.log('Total tokens used:', result.totalTokens);

// Review tool calls made
for (const tool of result.toolCalls) {
  console.log(`Tool: ${tool.name}`);
  console.log(`Execution time: ${tool.executionTime}ms`);
  console.log(`Result:`, tool.result);
}

// Review reasoning steps
for (const step of result.steps) {
  if (step.action) {
    console.log(`Action: ${step.action}(${JSON.stringify(step.actionInput)})`);
    console.log(`Observation: ${step.observation}`);
  } else if (step.isFinal) {
    console.log(`Final thought: ${step.thought}`);
  }
}
```

## Streaming Agent Execution

Run agents with streaming output for real-time feedback. Useful for building interactive chat interfaces.

```typescript
import GroqRAG from 'groq-rag';

const client = new GroqRAG();

const agent = await client.createAgentWithBuiltins({
  model: 'llama-3.3-70b-versatile',
  maxIterations: 5,
});

console.log('Agent starting...\n');

// Stream agent execution
for await (const event of agent.runStream('Search for SpaceX news and summarize')) {
  switch (event.type) {
    case 'thought':
      console.log('[Thinking]', event.data);
      break;
    case 'tool_call': {
      const call = event.data as { name: string; arguments: string };
      console.log(`\n[Tool Call] ${call.name}(${call.arguments})`);
      break;
    }
    case 'tool_result': {
      const result = event.data as { name: string; result: unknown };
      console.log(`[Tool Result] ${result.name}: Done`);
      break;
    }
    case 'content':
      process.stdout.write(event.data as string); // Stream content tokens
      break;
    case 'done': {
      const final = event.data as { output: string; toolCalls: unknown[] };
      console.log('\n\n[Complete] Tools used:', final.toolCalls.length);
      break;
    }
  }
}

// Manage conversation history
agent.clearHistory(); // Reset for new conversation
const history = agent.getHistory(); // Get message history
```

## Creating Custom Tools

Define custom tools for agents to use. Tools have a name, description, parameter schema, and execute function.
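
To make the four parts of a tool concrete, here is a minimal, self-contained sketch of a tool registry that looks a tool up by name, checks the parameters its schema marks as required, and wraps the outcome as `{ name, result }`. `SketchTool` and `SketchToolExecutor` are hypothetical stand-ins written for this illustration; the real `ToolDefinition` and executor in groq-rag may be shaped differently.

```typescript
// Hypothetical minimal types; the real ToolDefinition in groq-rag may differ.
interface SketchTool {
  name: string;
  description: string;
  parameters: { type: 'object'; properties: Record<string, unknown>; required?: string[] };
  execute: (params: Record<string, unknown>) => Promise<unknown>;
}

class SketchToolExecutor {
  private tools = new Map<string, SketchTool>();

  register(tool: SketchTool): void {
    this.tools.set(tool.name, tool);
  }

  async execute(
    name: string,
    params: Record<string, unknown>,
  ): Promise<{ name: string; result?: unknown; error?: string }> {
    const tool = this.tools.get(name);
    if (!tool) return { name, error: `Unknown tool: ${name}` };
    // Reject calls that omit a parameter the schema marks as required.
    const missing = (tool.parameters.required ?? []).filter((key) => !(key in params));
    if (missing.length > 0) return { name, error: `Missing: ${missing.join(', ')}` };
    try {
      return { name, result: await tool.execute(params) };
    } catch (err) {
      // Report tool failures as data so a caller can pass them back to the model.
      return { name, error: err instanceof Error ? err.message : String(err) };
    }
  }
}

const echoTool: SketchTool = {
  name: 'echo',
  description: 'Return the given text unchanged',
  parameters: { type: 'object', properties: { text: { type: 'string' } }, required: ['text'] },
  execute: async (params) => ({ text: params.text }),
};

const sketchExecutor = new SketchToolExecutor();
sketchExecutor.register(echoTool);
sketchExecutor.execute('echo', { text: 'hi' }).then((r) => console.log(r));
```

Returning errors as values rather than throwing is a design choice in this sketch: it lets an agent loop treat a failed tool call as just another observation.
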
```typescript
import GroqRAG, { ToolDefinition, createToolExecutor } from 'groq-rag';

// Define a custom tool
const weatherTool: ToolDefinition = {
  name: 'get_weather',
  description: 'Get current weather for a location',
  parameters: {
    type: 'object',
    properties: {
      location: {
        type: 'string',
        description: 'City name or coordinates',
      },
      units: {
        type: 'string',
        description: 'Temperature units: celsius or fahrenheit',
      },
    },
    required: ['location'],
  },
  execute: async (params) => {
    const { location, units = 'celsius' } = params as {
      location: string;
      units?: string;
    };
    // Your weather API call here
    return {
      location,
      temperature: 22,
      units,
      conditions: 'sunny',
      humidity: 45,
    };
  },
};

// Use with agent
const client = new GroqRAG();
const agent = client.createAgent({
  model: 'llama-3.3-70b-versatile',
  tools: [weatherTool],
});

// Or add tools dynamically
agent.addTool(weatherTool);

// Use tool executor directly
const executor = createToolExecutor();
executor.register(weatherTool);

const result = await executor.execute('get_weather', { location: 'San Francisco' });
console.log(result);
// { name: 'get_weather', result: { location: 'San Francisco', temperature: 22, ... } }
```

## Built-in Tool Functions

Access individual built-in tools for custom agent configurations.
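
When assembling a custom agent from individual tools, it can help to picture the tool-use loop that `maxIterations` bounds. The sketch below is an assumption about the general pattern, not groq-rag's actual agent code: a `decide` function stands in for the LLM, and `ModelStep`, `SimpleTool`, and `runLoop` are hypothetical names.

```typescript
// Hypothetical shapes; the real agent loop in groq-rag is not shown in this document.
type ModelStep =
  | { kind: 'tool'; tool: string; input: string }
  | { kind: 'final'; output: string };

type SimpleTool = (input: string) => string;

function runLoop(
  decide: (observations: string[]) => ModelStep,
  tools: Record<string, SimpleTool>,
  maxIterations: number,
): { output: string; toolCalls: string[] } {
  const observations: string[] = [];
  const toolCalls: string[] = [];
  for (let i = 0; i < maxIterations; i++) {
    const step = decide(observations);
    if (step.kind === 'final') return { output: step.output, toolCalls };
    const tool = tools[step.tool];
    // Feed the tool result (or an error note) back to the model as an observation.
    observations.push(tool ? tool(step.input) : `Unknown tool: ${step.tool}`);
    toolCalls.push(step.tool);
  }
  // The cap prevents a model that never produces a final answer from looping forever.
  return { output: 'Stopped: iteration limit reached', toolCalls };
}

// A scripted stand-in for the LLM: call the calculator once, then answer.
const scriptedModel = (observations: string[]): ModelStep =>
  observations.length === 0
    ? { kind: 'tool', tool: 'calculator', input: '2+3' }
    : { kind: 'final', output: `The answer is ${observations[0]}` };

const outcome = runLoop(scriptedModel, { calculator: () => '5' }, 5);
console.log(outcome.output); // The answer is 5
console.log(outcome.toolCalls.length); // 1
```
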
```typescript
import GroqRAG, {
  createWebSearchTool,
  createFetchUrlTool,
  createCalculatorTool,
  createDateTimeTool,
  createRAGQueryTool,
  getBuiltinTools,
} from 'groq-rag';

// Get all built-in tools (except RAG which needs retriever)
const tools = getBuiltinTools();
// Returns: [web_search, fetch_url, calculator, get_datetime]

// Create individual tools
const searchTool = createWebSearchTool(); // Uses DuckDuckGo
const fetchTool = createFetchUrlTool();
const calcTool = createCalculatorTool();
const dateTool = createDateTimeTool();

// Calculator tool example
const calcResult = await calcTool.execute({ expression: 'sqrt(16) + pow(2, 3)' });
console.log(calcResult); // { expression: '...', result: 12 }

// DateTime tool example
const dateResult = await dateTool.execute({ timezone: 'America/New_York' });
console.log(dateResult);
// { datetime: '1/15/2024, 2:30:00 PM', timezone: 'America/New_York', unix: 1705347000 }

// For RAG tool, pass a retriever instance
const client = new GroqRAG();
await client.initRAG();
const retriever = await client.getRetriever();
const ragTool = createRAGQueryTool(retriever);
```

## Text Chunking Utilities

Chunk text for RAG ingestion using various strategies. Useful for processing large documents.

```typescript
import { TextChunker, chunkText } from 'groq-rag';

const longDocument = `
Chapter 1: Introduction
This is a long document that needs to be split into chunks...

Chapter 2: Main Content
More content here that continues across multiple paragraphs...
`;

// Quick function for one-off chunking
const chunks = chunkText(longDocument, 'doc-123', {
  strategy: 'recursive', // 'recursive' | 'fixed' | 'sentence' | 'paragraph'
  chunkSize: 500,
  chunkOverlap: 100,
});

for (const chunk of chunks) {
  console.log(`Chunk ${chunk.id}:`);
  console.log(`  Length: ${chunk.content.length} chars`);
  console.log(`  Content: ${chunk.content.substring(0, 50)}...`);
}

// Or use the TextChunker class for multiple documents
const chunker = new TextChunker({
  strategy: 'paragraph',
  chunkSize: 1000,
  chunkOverlap: 200,
  separators: ['\n\n', '\n', '. ', ' ', ''],
});

// document1 and document2 are placeholders for your own document text
const document1 = longDocument;
const document2 = 'Another document to split into chunks...';

const doc1Chunks = chunker.chunk(document1, 'doc-1');
const doc2Chunks = chunker.chunk(document2, 'doc-2');
```

## Utility Functions

Helper functions for common operations like token estimation, text cleaning, and retry logic.

```typescript
import {
  estimateTokens,
  truncateToTokens,
  cleanText,
  extractUrls,
  cosineSimilarity,
  formatContext,
  generateId,
  sleep,
  retry,
  batch,
  safeJsonParse,
} from 'groq-rag';

// Token estimation (~4 chars per token)
const tokens = estimateTokens('Hello, world!');
console.log(`Estimated tokens: ${tokens}`); // ~4

// Truncate to token limit (longText is a placeholder for your own input)
const longText = 'Some long text. '.repeat(500);
const truncated = truncateToTokens(longText, 1000); // ~4000 chars

// Clean and normalize text
const cleaned = cleanText('  Multiple   spaces\n\n\n\nand lines  ');
console.log(cleaned); // "Multiple spaces\n\nand lines"

// Extract URLs from text
const urls = extractUrls('Visit https://example.com and http://test.org');
console.log(urls); // ['https://example.com', 'http://test.org']

// Vector similarity
const similarity = cosineSimilarity([1, 0, 0], [0.5, 0.5, 0]);
console.log(`Similarity: ${similarity}`);

// Format search results as context
const context = formatContext(
  [
    { content: 'Document 1 content', metadata: { source: 'file1' } },
    { content: 'Document 2 content', metadata: { source: 'file2' } },
  ],
  { includeMetadata: true, separator: '\n---\n' }
);

// Generate unique IDs
const id = generateId(); // '1705347000000-abc123'
// Retry with exponential backoff
// (fetchData is a placeholder for your own async operation)
const fetchData = async () => ({ ok: true });
const result = await retry(
  () => fetchData(),
  { maxRetries: 3, baseDelay: 1000, maxDelay: 10000 }
);

// Batch array processing
const items = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
const batches = batch(items, 3); // [[1,2,3], [4,5,6], [7,8,9], [10]]

// Safe JSON parsing with default
const data = safeJsonParse('{"key": "value"}', { default: true });
const fallback = safeJsonParse('invalid json', { default: true });
```

## Search Provider Configuration

Configure different search providers based on your needs and available API keys.

```typescript
import { createSearchProvider, DuckDuckGoSearch, BraveSearch, SerperSearch } from 'groq-rag';

// DuckDuckGo (free, no API key)
const duckduckgo = createSearchProvider({ provider: 'duckduckgo' });
// or: const duckduckgo = new DuckDuckGoSearch();

// Brave Search (requires API key from brave.com/search/api)
const brave = createSearchProvider({
  provider: 'brave',
  apiKey: process.env.BRAVE_API_KEY,
});
// or: const brave = new BraveSearch(process.env.BRAVE_API_KEY!);

// Serper.dev Google Search (requires API key from serper.dev)
const serper = createSearchProvider({
  provider: 'serper',
  apiKey: process.env.SERPER_API_KEY,
});
// or: const serper = new SerperSearch(process.env.SERPER_API_KEY!);

// Use any provider
const results = await brave.search('TypeScript tutorials', {
  maxResults: 5,
  safeSearch: true,
});
```

## Vector Store Configuration

Configure different vector store backends for RAG persistence.
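
As background for what a vector store does at query time, the following sketch ranks stored embeddings against a query by cosine similarity and returns the top matches. It is a simplified illustration under assumed types (`StoredVector`, `cosine`, and `searchTopK` are hypothetical), not the implementation of groq-rag's in-memory or ChromaDB stores.

```typescript
// Hypothetical types for illustration; not groq-rag's actual store implementation.
interface StoredVector {
  id: string;
  content: string;
  embedding: number[];
}

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every stored vector, drop low scores, and keep the best topK.
function searchTopK(vectors: StoredVector[], query: number[], topK: number, minScore = 0) {
  return vectors
    .map((doc) => ({ document: doc, score: cosine(doc.embedding, query) }))
    .filter((hit) => hit.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

const storedVectors: StoredVector[] = [
  { id: 'a', content: 'JavaScript dominates web development.', embedding: [1, 0, 0] },
  { id: 'b', content: 'Python is great for data science.', embedding: [0, 1, 0] },
  { id: 'c', content: 'Rust provides memory safety.', embedding: [0.5, 0.5, 0] },
];

const hits = searchTopK(storedVectors, [1, 0, 0], 2);
console.log(hits.map((h) => h.document.id)); // ids 'a' then 'c'
```

A linear scan like this is fine for small in-memory collections; persistent backends typically use an index instead of comparing against every vector.
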
```typescript
import GroqRAG, { createVectorStore, MemoryVectorStore, ChromaVectorStore } from 'groq-rag';

const client = new GroqRAG();

// In-memory store (default, no persistence)
await client.initRAG({
  vectorStore: { provider: 'memory' },
});

// ChromaDB for production (persistent)
await client.initRAG({
  vectorStore: {
    provider: 'chroma',
    connectionString: 'http://localhost:8000',
    indexName: 'my-knowledge-base',
  },
});

// Create stores directly
const memoryStore = createVectorStore({ provider: 'memory' });
const chromaStore = createVectorStore({
  provider: 'chroma',
  connectionString: 'http://localhost:8000',
});

// Vector store operations
// (documentsWithEmbeddings and queryEmbedding are placeholders for data
// you produce with an embedding provider)
await memoryStore.add(documentsWithEmbeddings);
const results = await memoryStore.search(queryEmbedding, { topK: 5 });
await memoryStore.delete(['doc-id-1', 'doc-id-2']);
const count = await memoryStore.count();
await memoryStore.clear();
```

## Embedding Provider Configuration

Configure embedding providers for vector generation. Groq embeddings are deterministic pseudo-embeddings for testing.
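
To illustrate what "deterministic pseudo-embedding" can mean, the sketch below hashes each word into a fixed number of buckets and normalizes the result, so the same text always maps to the same vector without calling any model. This is an assumption made for illustration: `pseudoEmbed` is a hypothetical function, and the library's Groq embedding provider may work quite differently.

```typescript
// Hypothetical illustration of a deterministic pseudo-embedding;
// groq-rag's own Groq embedding provider may work differently.
function pseudoEmbed(text: string, dimensions = 8): number[] {
  const vector = new Array<number>(dimensions).fill(0);
  for (const word of text.toLowerCase().split(/\W+/).filter(Boolean)) {
    // FNV-1a style hash of the word picks which dimension to increment.
    let hash = 2166136261;
    for (let i = 0; i < word.length; i++) {
      hash ^= word.charCodeAt(i);
      hash = Math.imul(hash, 16777619) >>> 0;
    }
    vector[hash % dimensions] += 1;
  }
  // Normalize to unit length so cosine similarity reduces to a dot product.
  const norm = Math.sqrt(vector.reduce((sum, v) => sum + v * v, 0));
  return norm === 0 ? vector : vector.map((v) => v / norm);
}

const firstRun = pseudoEmbed('Hello world');
const secondRun = pseudoEmbed('Hello world');
console.log(firstRun.every((v, i) => v === secondRun[i])); // true
```

Vectors like these capture word overlap rather than meaning, which is why such embeddings suit tests and demos while a learned model is the better fit for production retrieval.
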
```typescript
import GroqRAG, { createEmbeddingProvider, GroqEmbeddings, OpenAIEmbeddings } from 'groq-rag';

const client = new GroqRAG();

// Groq embeddings (default, deterministic, free)
await client.initRAG({
  embedding: { provider: 'groq' },
});

// OpenAI embeddings (production quality)
await client.initRAG({
  embedding: {
    provider: 'openai',
    apiKey: process.env.OPENAI_API_KEY,
    model: 'text-embedding-3-small',
    dimensions: 1536,
  },
});

// Create embeddings directly
// (client.client is the underlying Groq SDK instance)
const groqEmbed = createEmbeddingProvider({ provider: 'groq' }, client.client);
const openaiEmbed = createEmbeddingProvider({
  provider: 'openai',
  apiKey: process.env.OPENAI_API_KEY,
});

// Generate embeddings
const { embedding, tokenCount } = await openaiEmbed.embed('Hello world');
const batchResults = await openaiEmbed.embedBatch(['Text 1', 'Text 2', 'Text 3']);
console.log(`Dimensions: ${openaiEmbed.dimensions}`);
```

## Summary

groq-rag is designed for building AI applications that need more than basic chat completion. Common use cases include customer support chatbots with RAG-powered knowledge bases, research assistants that can search the web and synthesize information, document Q&A systems for enterprise knowledge management, and autonomous agents that can perform multi-step tasks using tools. The library's streaming support makes it suitable for building responsive user interfaces that show real-time agent reasoning.

Integration is straightforward since groq-rag is a drop-in replacement for the official groq-sdk. Existing Groq SDK code works unchanged, and developers can incrementally adopt RAG, web, or agent features as needed. The modular architecture allows using just the components you need, whether that's the web fetcher for content extraction, the vector store for semantic search, or the full agent system for autonomous task completion. All components are fully typed with TypeScript and work together through the unified GroqRAG client interface.