# groq-rag

groq-rag is an extended TypeScript SDK for Groq that adds RAG (Retrieval-Augmented Generation), web browsing, and autonomous agent capabilities on top of the official groq-sdk. It provides 100% API compatibility with the official Groq SDK, so these features are additions rather than replacements. With it, developers can build chatbots and agents that search the web, fetch URLs, query knowledge bases, and reason through multi-step tasks using Groq's fast LLM inference.

The core functionality centers around the GroqRAG client class, which wraps the official Groq SDK and provides access to three main modules: RAG (document ingestion and semantic retrieval), Web (URL fetching and web search), and Agents (autonomous tool-using AI agents). The library supports multiple vector store backends, embedding providers, and search providers, making it flexible for both development and production use cases.

## GroqRAG Client Initialization

The main entry point for all functionality. Creates a client instance that provides access to RAG, web, and agent features while maintaining full compatibility with the official Groq SDK.

```typescript
import GroqRAG from 'groq-rag';

const client = new GroqRAG({
  apiKey: process.env.GROQ_API_KEY,  // Optional: defaults to GROQ_API_KEY env var
  baseURL: 'https://api.groq.com',   // Optional: custom API endpoint
  timeout: 30000,                     // Optional: request timeout in ms
  maxRetries: 2,                      // Optional: retry attempts
});

// Access underlying Groq SDK for direct API usage
const groqSdk = client.client;

// Standard chat completion (Groq SDK passthrough)
const response = await client.complete({
  model: 'llama-3.3-70b-versatile',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'What is the capital of France?' },
  ],
});

console.log(response.choices[0].message.content);
// Output: "The capital of France is Paris."
```

## RAG Initialization

Initialize the RAG system with configurable vector stores, embedding providers, and chunking strategies. Required before using any RAG features.

```typescript
import GroqRAG from 'groq-rag';

const client = new GroqRAG();

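// Initialize with the in-memory vector store (default).
// `model` is marked for the OpenAI provider and `connectionString` for ChromaDB;
// they appear here to show the available options.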
await client.initRAG({
  embedding: {
    provider: 'groq',           // 'groq' | 'openai'
    model: 'text-embedding-3-small',  // For OpenAI
    dimensions: 1536,           // Embedding dimensions
  },
  vectorStore: {
    provider: 'memory',         // 'memory' | 'chroma'
    connectionString: 'http://localhost:8000',  // For ChromaDB
    indexName: 'my-collection',
  },
  chunking: {
    strategy: 'recursive',      // 'recursive' | 'fixed' | 'sentence' | 'paragraph'
    chunkSize: 1000,
    chunkOverlap: 200,
  },
});

// Get retriever instance
const retriever = await client.getRetriever();
console.log('RAG initialized with', await retriever.count(), 'documents');
```

## Adding Documents to RAG

Add documents to the knowledge base for semantic search and retrieval. Supports single documents, bulk uploads, and URL content ingestion.

```typescript
import GroqRAG from 'groq-rag';

const client = new GroqRAG();
await client.initRAG();

// Add single document with metadata
const docId = await client.rag.addDocument(
  `Our company's refund policy allows returns within 30 days of purchase.
  Items must be in original condition with tags attached.
  Refunds are processed within 5-7 business days.`,
  { source: 'refund-policy', category: 'policies', version: '2.0' }
);

// Add multiple documents at once
const docIds = await client.rag.addDocuments([
  { content: 'Shipping takes 5-7 business days.', metadata: { source: 'shipping' } },
  { content: 'Express shipping available for $12.99.', metadata: { source: 'shipping' } },
]);

// Add URL content directly to knowledge base
const urlDocId = await client.rag.addUrl('https://example.com/docs', {
  source: 'external-docs',
  fetchedBy: 'system',
});

// Check document count
const count = await client.rag.count();
console.log(`Knowledge base contains ${count} document chunks`);

// Clear all documents
await client.rag.clear();
```

## RAG-Augmented Chat

Chat with automatic context retrieval from the knowledge base. The system queries relevant documents and includes them in the LLM context.

```typescript
import GroqRAG from 'groq-rag';

const client = new GroqRAG();
await client.initRAG();

// Add knowledge to the base
await client.rag.addDocument(
  'Our premium plan costs $29/month and includes unlimited API calls.',
  { source: 'pricing' }
);
await client.rag.addDocument(
  'The basic plan is $9/month with 1000 API calls included.',
  { source: 'pricing' }
);

// Chat with RAG augmentation
const response = await client.chat.withRAG({
  messages: [
    { role: 'user', content: 'What pricing plans do you offer?' },
  ],
  model: 'llama-3.3-70b-versatile',
  topK: 5,                    // Number of documents to retrieve
  minScore: 0.5,              // Minimum similarity score (0-1)
  includeMetadata: true,      // Include metadata in context
  temperature: 0.7,
  maxTokens: 1024,
  systemPrompt: 'You are a helpful sales assistant.',  // Optional custom prompt
});

console.log('Answer:', response.content);
console.log('Token usage:', response.usage);

// Access retrieved sources
for (const source of response.sources) {
  console.log(`- Score: ${source.score.toFixed(3)}`);
  console.log(`  Content: ${source.document.content}`);
  console.log(`  Metadata:`, source.document.metadata);
}
```
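
As a rough illustration of what "including retrieved documents in the LLM context" can mean, the sketch below filters scored sources and prepends them to the conversation as a system message. It is not groq-rag's internal implementation: `buildRagMessages`, the `RetrievedSource` type, and the prompt wording are assumptions made for this example.

```typescript
// Hypothetical types for illustration; not exported by groq-rag.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface RetrievedSource {
  content: string;
  score: number; // similarity in the range 0-1
}

// Keep sources at or above minScore, take the topK best, and prepend them
// to the conversation as a single system message.
function buildRagMessages(
  messages: ChatMessage[],
  sources: RetrievedSource[],
  options: { topK: number; minScore: number; systemPrompt: string },
): ChatMessage[] {
  const selected = sources
    .filter((s) => s.score >= options.minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, options.topK);

  const context = selected
    .map((s, i) => `[Source ${i + 1}]\n${s.content}`)
    .join('\n---\n');

  const system: ChatMessage = {
    role: 'system',
    content: `${options.systemPrompt}\n\nContext:\n${context}`,
  };
  return [system, ...messages];
}

// Scores below are made up for the example.
const ragMessages = buildRagMessages(
  [{ role: 'user', content: 'What pricing plans do you offer?' }],
  [
    { content: 'The basic plan is $9/month.', score: 0.62 },
    { content: 'Our premium plan costs $29/month.', score: 0.81 },
    { content: 'Shipping takes 5-7 business days.', score: 0.2 },
  ],
  { topK: 5, minScore: 0.5, systemPrompt: 'You are a helpful sales assistant.' },
);
console.log(ragMessages[0]?.content);
```

With these inputs the shipping snippet falls below `minScore` and is left out, and the premium plan is listed before the basic plan because it has the higher score.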

## Querying the Knowledge Base

Directly query the RAG system for semantic search without LLM completion. Useful for building custom retrieval pipelines.

```typescript
import GroqRAG from 'groq-rag';

const client = new GroqRAG();
await client.initRAG();

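// Metadata is attached so that it appears in the formatted context below
await client.rag.addDocument('Python is great for data science.', { source: 'docs' });
await client.rag.addDocument('JavaScript dominates web development.', { source: 'docs' });
await client.rag.addDocument('Rust provides memory safety without garbage collection.', { source: 'docs' });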

// Semantic search query
const results = await client.rag.query('What programming language is best for web?', {
  topK: 3,
  minScore: 0.3,
});

for (const result of results) {
  console.log(`Score: ${result.score}`);
  console.log(`Content: ${result.document.content}`);
  console.log(`ID: ${result.document.id}`);
}

// Get pre-formatted context string for custom LLM prompts
const context = await client.rag.getContext('web development languages', {
  topK: 2,
  includeMetadata: true,
});

console.log('Formatted context:\n', context);
// Output:
// [Source 1 | source: docs]
// JavaScript dominates web development.
// ---
// [Source 2 | source: docs]
// Python is great for data science.
```
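
To make `topK` and `minScore` concrete, here is a minimal sketch of similarity ranking over a handful of vectors. It assumes cosine similarity as the scoring function and uses made-up 3-dimensional embeddings; `cosine`, `rank`, and `StoredDoc` are hypothetical names, and the library's own retriever may work differently.

```typescript
// Cosine similarity between two vectors; returns 0 if either has zero length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Hypothetical document shape for this sketch.
interface StoredDoc {
  id: string;
  content: string;
  embedding: number[];
}

// Score every document, drop those below minScore, return the topK best.
function rank(query: number[], docs: StoredDoc[], topK: number, minScore: number) {
  return docs
    .map((doc) => ({ document: doc, score: cosine(query, doc.embedding) }))
    .filter((r) => r.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Toy embeddings, invented for the example.
const toyDocs: StoredDoc[] = [
  { id: 'py', content: 'Python is great for data science.', embedding: [1, 0, 0] },
  { id: 'js', content: 'JavaScript dominates web development.', embedding: [0, 1, 0] },
  { id: 'rs', content: 'Rust provides memory safety.', embedding: [0, 0, 1] },
];

const hits = rank([0.1, 0.9, 0], toyDocs, 3, 0.3);
for (const hit of hits) {
  console.log(hit.document.id, hit.score.toFixed(3));
}
```

Only the JavaScript document clears the 0.3 threshold here, so fewer than `topK` results come back. A higher `minScore` trades recall for precision in the same way.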

## Web Search Chat

Augment chat responses with live web search results. Uses DuckDuckGo by default (free, no API key required).

```typescript
import GroqRAG from 'groq-rag';

const client = new GroqRAG();

// Chat with web search augmentation
const response = await client.chat.withWebSearch({
  messages: [
    { role: 'user', content: 'What are the latest developments in AI?' },
  ],
  model: 'llama-3.3-70b-versatile',
  searchQuery: 'AI artificial intelligence news 2024',  // Optional: custom search query
  maxResults: 5,
  maxSnippetLength: 200,       // Truncate snippets (optional)
  maxTotalContentLength: 2000, // Limit total context size (optional)
});

console.log('Response:', response.content);
console.log('\nSearch Sources:');
for (const source of response.sources) {
  console.log(`[${source.position}] ${source.title}`);
  console.log(`    URL: ${source.url}`);
  console.log(`    Snippet: ${source.snippet.substring(0, 100)}...`);
}
```

## URL Content Chat

Chat about the content of a specific web page. Fetches, parses, and includes the page content in the LLM context.

```typescript
import GroqRAG from 'groq-rag';

const client = new GroqRAG();

// Chat about a specific URL
const response = await client.chat.withUrl({
  messages: [
    { role: 'user', content: 'Summarize this article in 3 bullet points.' },
  ],
  url: 'https://example.com/blog/article',
  model: 'llama-3.3-70b-versatile',
  maxContentLength: 5000,  // Optional: limit fetched content
  maxTokens: 1000,         // Optional: token-based limit (~4 chars/token)
});

console.log('Summary:', response.content);
console.log('Page title:', response.source.title);
console.log('Fetched at:', response.source.fetchedAt);
console.log('Metadata:', response.source.metadata);
```
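
The `~4 chars/token` figure in the comment above is a rule of thumb, not a real tokenizer. A minimal sketch of how such a budget could be applied to fetched page text follows. The function names are hypothetical, and this is not necessarily how groq-rag's `estimateTokens` or `truncateToTokens` are implemented.

```typescript
// Heuristic taken from the comment above; real tokenizers vary by model and language.
const CHARS_PER_TOKEN = 4;

// Estimate the token count of a string, rounding up.
function estimateTokensSketch(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Cut the text so that its estimated token count stays within maxTokens.
function truncateToTokensSketch(text: string, maxTokens: number): string {
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  return text.length <= maxChars ? text : text.slice(0, maxChars);
}

const pageText = 'word '.repeat(2000); // 10,000 characters of placeholder content
const trimmed = truncateToTokensSketch(pageText, 1000);
console.log(estimateTokensSketch(pageText), estimateTokensSketch(trimmed));
```

A character-based cut can split a word or sentence in half. Cutting at the nearest paragraph or sentence boundary is a common refinement when the truncated text is shown to a model.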

## Web Fetching Module

Fetch web pages and parse them into clean Markdown. Supports content extraction, link extraction, and image extraction.

```typescript
import GroqRAG from 'groq-rag';

const client = new GroqRAG();

// Fetch a single URL
const result = await client.web.fetch('https://example.com', {
  timeout: 30000,
  includeLinks: true,
  includeImages: true,
  maxContentLength: 10000,  // Optional content limit
});

console.log('Title:', result.title);
console.log('Content:', result.content);
console.log('Markdown:', result.markdown);
console.log('Metadata:', result.metadata);
console.log('Links:', result.links?.slice(0, 5));
console.log('Images:', result.images?.slice(0, 3));
console.log('Fetched at:', result.fetchedAt);

// Fetch multiple URLs in parallel
const urls = [
  'https://example.com/page1',
  'https://example.com/page2',
  'https://example.com/page3',
];
const results = await client.web.fetchMany(urls);

for (const [index, res] of results.entries()) {
  if (res instanceof Error) {
    console.log(`Failed to fetch ${urls[index]}: ${res.message}`);
  } else {
    console.log(`${res.title}: ${res.content.substring(0, 100)}...`);
  }
}

// Get just the markdown content
const markdown = await client.web.fetchMarkdown('https://example.com/docs');
console.log(markdown);
```
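
The loop above checks each entry with `instanceof Error`, which suggests that one failed URL does not reject the whole batch. One way to get that behaviour in plain TypeScript is sketched below. `settleAll` and the stubbed fetchers are hypothetical, involve no network access, and are not taken from groq-rag's source.

```typescript
// Run all tasks concurrently. A failed task yields an Error in its slot
// instead of rejecting the whole batch, and result order matches input order.
async function settleAll<T>(tasks: Array<() => Promise<T>>): Promise<Array<T | Error>> {
  return Promise.all(
    tasks.map(async (task): Promise<T | Error> => {
      try {
        return await task();
      } catch (err) {
        return err instanceof Error ? err : new Error(String(err));
      }
    }),
  );
}

// Stubbed "fetchers" standing in for real HTTP requests.
const fakePages: Array<() => Promise<string>> = [
  async () => 'page one',
  async () => {
    throw new Error('404 Not Found');
  },
  async () => 'page three',
];

const settledPages = settleAll(fakePages);
settledPages.then((pages) => {
  pages.forEach((page, index) => {
    if (page instanceof Error) {
      console.log(`Task ${index} failed: ${page.message}`);
    } else {
      console.log(`Task ${index} ok: ${page}`);
    }
  });
});
```

Returning errors as values keeps partial results usable, at the cost of requiring callers to check every entry before reading it.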

## Web Search Module

Search the web using multiple providers. DuckDuckGo is free and requires no API key.

```typescript
import GroqRAG, { createSearchProvider } from 'groq-rag';

const client = new GroqRAG();

// Search using default DuckDuckGo
const results = await client.web.search('TypeScript best practices', {
  maxResults: 10,
  safeSearch: true,
  maxSnippetLength: 150,       // Optional: truncate snippets
  maxTotalContentLength: 2000, // Optional: limit total content
});

for (const result of results) {
  console.log(`[${result.position}] ${result.title}`);
  console.log(`URL: ${result.url}`);
  console.log(`Snippet: ${result.snippet}\n`);
}

// Create custom search providers
const braveSearch = createSearchProvider({
  provider: 'brave',
  apiKey: process.env.BRAVE_API_KEY,
});

const serperSearch = createSearchProvider({
  provider: 'serper',
  apiKey: process.env.SERPER_API_KEY,
});

const braveResults = await braveSearch.search('AI news', { maxResults: 5 });
```

## Creating Agents with Built-in Tools

Create autonomous agents that can use tools to accomplish tasks. Built-in tools include web search, URL fetching, calculator, datetime, and RAG queries.

```typescript
import GroqRAG from 'groq-rag';

const client = new GroqRAG();
await client.initRAG();

// Add knowledge for RAG tool
await client.rag.addDocument(
  'Our company headquarters is at 123 Main Street, San Francisco.',
  { source: 'company-info' }
);

// Create agent with all built-in tools
const agent = await client.createAgentWithBuiltins({
  name: 'Research Assistant',
  model: 'llama-3.3-70b-versatile',
  systemPrompt: `You are a helpful research assistant. Use your tools to:
    - Search the web for current information
    - Fetch and read specific web pages
    - Query the knowledge base for internal data
    - Perform calculations
    Always cite your sources.`,
  maxIterations: 10,  // Max tool-use cycles
  verbose: true,      // Log agent reasoning
});

// Run the agent
const result = await agent.run(
  'What is the current stock price of Apple and where is our office located?'
);

console.log('Answer:', result.output);
console.log('Total tokens used:', result.totalTokens);

// Review tool calls made
for (const tool of result.toolCalls) {
  console.log(`Tool: ${tool.name}`);
  console.log(`Execution time: ${tool.executionTime}ms`);
  console.log(`Result:`, tool.result);
}

// Review reasoning steps
for (const step of result.steps) {
  if (step.action) {
    console.log(`Action: ${step.action}(${JSON.stringify(step.actionInput)})`);
    console.log(`Observation: ${step.observation}`);
  } else if (step.isFinal) {
    console.log(`Final thought: ${step.thought}`);
  }
}
```
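
The `maxIterations` option and the `steps` output suggest a loop in which the model either requests a tool or gives a final answer. A minimal sketch of such a loop, using a stubbed model instead of a real LLM call, might look like this. All names here (`runAgentLoop`, `ModelTurn`, `stubModel`) are hypothetical and are not part of groq-rag.

```typescript
// Hypothetical shapes for illustration only.
type ModelTurn =
  | { kind: 'tool'; tool: string; input: Record<string, unknown> }
  | { kind: 'final'; output: string };

type Tool = (input: Record<string, unknown>) => Promise<unknown>;
type Model = (transcript: string[]) => Promise<ModelTurn>;

// Ask the model what to do, run the requested tool, feed the observation
// back, and stop on a final answer or when the iteration budget runs out.
async function runAgentLoop(
  model: Model,
  tools: Map<string, Tool>,
  task: string,
  maxIterations: number,
): Promise<{ output: string; toolCalls: string[] }> {
  const transcript = [`Task: ${task}`];
  const toolCalls: string[] = [];

  for (let i = 0; i < maxIterations; i++) {
    const turn = await model(transcript);
    if (turn.kind === 'final') {
      return { output: turn.output, toolCalls };
    }
    const tool = tools.get(turn.tool);
    const observation =
      tool === undefined
        ? `Unknown tool: ${turn.tool}`
        : JSON.stringify(await tool(turn.input));
    toolCalls.push(turn.tool);
    transcript.push(`Action: ${turn.tool}`, `Observation: ${observation}`);
  }
  return { output: 'Stopped: iteration limit reached.', toolCalls };
}

// Stub model: requests the calculator once, then echoes the last observation.
const stubModel: Model = async (transcript) =>
  transcript.some((line) => line.startsWith('Observation:'))
    ? { kind: 'final', output: transcript[transcript.length - 1] ?? '' }
    : { kind: 'tool', tool: 'calculator', input: { a: 2, b: 3 } };

const stubTools = new Map<string, Tool>();
stubTools.set('calculator', async (input) => Number(input['a']) + Number(input['b']));

const agentRun = runAgentLoop(stubModel, stubTools, 'Add 2 and 3', 10);
agentRun.then((run) => console.log(run.output, run.toolCalls));
```

The iteration cap is what keeps a confused model from calling tools forever, which is why a low `maxIterations` is a reasonable default while experimenting.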

## Streaming Agent Execution

Run agents with streaming output for real-time feedback. Useful for building interactive chat interfaces.

```typescript
import GroqRAG from 'groq-rag';

const client = new GroqRAG();

const agent = await client.createAgentWithBuiltins({
  model: 'llama-3.3-70b-versatile',
  maxIterations: 5,
});

console.log('Agent starting...\n');

// Stream agent execution
for await (const event of agent.runStream('Search for SpaceX news and summarize')) {
  switch (event.type) {
    case 'thought':
      console.log('[Thinking]', event.data);
      break;
    case 'tool_call': {
      const call = event.data as { name: string; arguments: string };
      console.log(`\n[Tool Call] ${call.name}(${call.arguments})`);
      break;
    }
    case 'tool_result': {
      const result = event.data as { name: string; result: unknown };
      console.log(`[Tool Result] ${result.name}: Done`);
      break;
    }
    case 'content':
      process.stdout.write(event.data as string);  // Stream content tokens
      break;
    case 'done': {
      const final = event.data as { output: string; toolCalls: unknown[] };
      console.log('\n\n[Complete] Tools used:', final.toolCalls.length);
      break;
    }
  }
}

// Manage conversation history
const history = agent.getHistory();  // Get message history
agent.clearHistory();  // Reset for new conversation
```

## Creating Custom Tools

Define custom tools for agents to use. Tools have a name, description, parameter schema, and execute function.

```typescript
import GroqRAG, { type ToolDefinition, createToolExecutor } from 'groq-rag';

// Define a custom tool
const weatherTool: ToolDefinition = {
  name: 'get_weather',
  description: 'Get current weather for a location',
  parameters: {
    type: 'object',
    properties: {
      location: {
        type: 'string',
        description: 'City name or coordinates',
      },
      units: {
        type: 'string',
        description: 'Temperature units: celsius or fahrenheit',
      },
    },
    required: ['location'],
  },
  execute: async (params) => {
    const { location, units = 'celsius' } = params as {
      location: string;
      units?: string;
    };
    // Your weather API call here
    return {
      location,
      temperature: 22,
      units,
      conditions: 'sunny',
      humidity: 45,
    };
  },
};

// Use with agent
const client = new GroqRAG();
const agent = client.createAgent({
  model: 'llama-3.3-70b-versatile',
  tools: [weatherTool],
});

// Or add tools dynamically after creation (an alternative to the `tools` option above)
agent.addTool(weatherTool);

// Use tool executor directly
const executor = createToolExecutor();
executor.register(weatherTool);

const result = await executor.execute('get_weather', { location: 'San Francisco' });
console.log(result);
// { name: 'get_weather', result: { location: 'San Francisco', temperature: 22, ... } }
```
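
For readers wondering what a tool executor does, the sketch below shows a small registry that looks tools up by name and returns failures as values rather than throwing. The `{ name, result }` shape follows the example output above. The `error` field, the class name, and the handling of unknown tools are assumptions for this sketch, not documented groq-rag behaviour.

```typescript
// Hypothetical tool shape, reduced to what the registry needs.
interface SketchTool {
  name: string;
  description: string;
  execute: (params: Record<string, unknown>) => Promise<unknown>;
}

interface ToolOutcome {
  name: string;
  result?: unknown;
  error?: string; // assumed field; the real executor may report errors differently
}

class ToolRegistrySketch {
  private readonly tools = new Map<string, SketchTool>();

  register(tool: SketchTool): void {
    this.tools.set(tool.name, tool);
  }

  // Look the tool up by name and run it; failures become an `error` string.
  async execute(name: string, params: Record<string, unknown>): Promise<ToolOutcome> {
    const tool = this.tools.get(name);
    if (tool === undefined) {
      return { name, error: `Unknown tool: ${name}` };
    }
    try {
      return { name, result: await tool.execute(params) };
    } catch (err) {
      return { name, error: err instanceof Error ? err.message : String(err) };
    }
  }
}

const registry = new ToolRegistrySketch();
registry.register({
  name: 'echo',
  description: 'Returns its input unchanged',
  execute: async (params) => params,
});
registry.register({
  name: 'boom',
  description: 'Always fails',
  execute: async () => {
    throw new Error('upstream unavailable');
  },
});

const registryDemo = Promise.all([
  registry.execute('echo', { location: 'San Francisco' }),
  registry.execute('boom', {}),
  registry.execute('missing', {}),
]);
registryDemo.then((outcomes) => console.log(JSON.stringify(outcomes)));
```

Reporting errors as data lets an agent pass the failure message back to the model as an observation instead of aborting the whole run.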

## Built-in Tool Functions

Access individual built-in tools for custom agent configurations.

```typescript
import GroqRAG, {
  createWebSearchTool,
  createFetchUrlTool,
  createCalculatorTool,
  createDateTimeTool,
  createRAGQueryTool,
  getBuiltinTools,
} from 'groq-rag';

// Get all built-in tools (except RAG which needs retriever)
const tools = getBuiltinTools();
// Returns: [web_search, fetch_url, calculator, get_datetime]

// Create individual tools with custom config
const searchTool = createWebSearchTool();  // Uses DuckDuckGo
const fetchTool = createFetchUrlTool();
const calcTool = createCalculatorTool();
const dateTool = createDateTimeTool();

// Calculator tool example
const calcResult = await calcTool.execute({ expression: 'sqrt(16) + pow(2, 3)' });
console.log(calcResult);  // { expression: '...', result: 12 }

// DateTime tool example
const dateResult = await dateTool.execute({ timezone: 'America/New_York' });
console.log(dateResult);
// { datetime: '1/15/2024, 2:30:00 PM', timezone: 'America/New_York', unix: 1705347000 }

// For RAG tool, pass a retriever instance
const client = new GroqRAG();
await client.initRAG();
const retriever = await client.getRetriever();
const ragTool = createRAGQueryTool(retriever);
```

## Text Chunking Utilities

Chunk text for RAG ingestion using various strategies. Useful for processing large documents.

```typescript
import { TextChunker, chunkText } from 'groq-rag';

const longDocument = `
  Chapter 1: Introduction
  This is a long document that needs to be split into chunks...

  Chapter 2: Main Content
  More content here that continues across multiple paragraphs...
`;

// Quick function for one-off chunking
const chunks = chunkText(longDocument, 'doc-123', {
  strategy: 'recursive',  // 'recursive' | 'fixed' | 'sentence' | 'paragraph'
  chunkSize: 500,
  chunkOverlap: 100,
});

for (const chunk of chunks) {
  console.log(`Chunk ${chunk.id}:`);
  console.log(`  Length: ${chunk.content.length} chars`);
  console.log(`  Content: ${chunk.content.substring(0, 50)}...`);
}

// Or use the TextChunker class for multiple documents
const chunker = new TextChunker({
  strategy: 'paragraph',
  chunkSize: 1000,
  chunkOverlap: 200,
  separators: ['\n\n', '\n', '. ', ' ', ''],
});

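// document1 and document2 are placeholders for your own document strings
const document1 = longDocument;
const document2 = 'Another document to split into chunks...';
const doc1Chunks = chunker.chunk(document1, 'doc-1');
const doc2Chunks = chunker.chunk(document2, 'doc-2');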
```
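
To show how `chunkSize` and `chunkOverlap` interact, here is a minimal sketch of the simplest case: fixed-size windows over the character stream. `chunkFixed` and `ChunkSketch` are hypothetical names, and the library's `'fixed'` strategy may differ in details such as ID format and boundary handling.

```typescript
// Hypothetical chunk shape for this sketch.
interface ChunkSketch {
  id: string;
  content: string;
  start: number; // character offset in the source text
}

// Slide a window of chunkSize characters, advancing by chunkSize - chunkOverlap,
// so that consecutive chunks share chunkOverlap characters.
function chunkFixed(
  text: string,
  docId: string,
  chunkSize: number,
  chunkOverlap: number,
): ChunkSketch[] {
  if (chunkSize <= 0 || chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error('Require chunkSize > 0 and 0 <= chunkOverlap < chunkSize');
  }
  const step = chunkSize - chunkOverlap;
  const chunks: ChunkSketch[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push({
      id: `${docId}-${chunks.length}`,
      content: text.slice(start, start + chunkSize),
      start,
    });
    if (start + chunkSize >= text.length) break; // this chunk reached the end
  }
  return chunks;
}

const alphabet = 'abcdefghijklmnopqrstuvwxyz';
const fixedChunks = chunkFixed(alphabet, 'doc-123', 10, 4);
for (const chunk of fixedChunks) {
  console.log(chunk.id, chunk.start, chunk.content);
}
// doc-123-0 0 abcdefghij
// doc-123-1 6 ghijklmnop
// doc-123-2 12 mnopqrstuv
// doc-123-3 18 stuvwxyz
```

Each chunk repeats the last four characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when the overlap is large enough.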

## Utility Functions

Helper functions for common operations like token estimation, text cleaning, and retry logic.

```typescript
import {
  estimateTokens,
  truncateToTokens,
  cleanText,
  extractUrls,
  cosineSimilarity,
  formatContext,
  generateId,
  sleep,
  retry,
  batch,
  safeJsonParse,
} from 'groq-rag';

// Token estimation (~4 chars per token)
const tokens = estimateTokens('Hello, world!');
console.log(`Estimated tokens: ${tokens}`);  // ~4

// Truncate to token limit (longText is a placeholder for your own string)
const longText = 'lorem ipsum '.repeat(1000);
const truncated = truncateToTokens(longText, 1000);  // ~4000 chars

// Clean and normalize text
const cleaned = cleanText('  Multiple   spaces\n\n\n\nand lines  ');
console.log(cleaned);  // "Multiple spaces\n\nand lines"

// Extract URLs from text
const urls = extractUrls('Visit https://example.com and http://test.org');
console.log(urls);  // ['https://example.com', 'http://test.org']

// Vector similarity
const similarity = cosineSimilarity([1, 0, 0], [0.5, 0.5, 0]);
console.log(`Similarity: ${similarity}`);  // Cosine of these two vectors is about 0.707

// Format search results as context
const context = formatContext(
  [
    { content: 'Document 1 content', metadata: { source: 'file1' } },
    { content: 'Document 2 content', metadata: { source: 'file2' } },
  ],
  { includeMetadata: true, separator: '\n---\n' }
);

// Generate unique IDs
const id = generateId();  // '1705347000000-abc123'

// Retry with exponential backoff (fetchData is a placeholder for your own async function)
const result = await retry(
  () => fetchData(),
  { maxRetries: 3, baseDelay: 1000, maxDelay: 10000 }
);

// Batch array processing
const items = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
const batches = batch(items, 3);  // [[1,2,3], [4,5,6], [7,8,9], [10]]

// Safe JSON parsing with default
const data = safeJsonParse('{"key": "value"}', { default: true });
const fallback = safeJsonParse('invalid json', { default: true });
```
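
The `retry` options above (`maxRetries`, `baseDelay`, `maxDelay`) map onto a common exponential backoff scheme. The sketch below is one plausible reading, with the delay doubling after each failure and capped at `maxDelay`. It is not the library's implementation; `retrySketch`, `backoffDelay`, and the absence of jitter are assumptions.

```typescript
interface RetryOptions {
  maxRetries: number; // retries after the first attempt
  baseDelay: number;  // ms before the first retry
  maxDelay: number;   // upper bound on any single delay, in ms
}

const wait = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Delay before retry number `attempt` (0-based): baseDelay * 2^attempt, capped.
function backoffDelay(attempt: number, opts: RetryOptions): number {
  return Math.min(opts.baseDelay * Math.pow(2, attempt), opts.maxDelay);
}

// Call fn until it succeeds or the retry budget is spent, then rethrow the last error.
async function retrySketch<T>(fn: () => Promise<T>, opts: RetryOptions): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= opts.maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < opts.maxRetries) {
        await wait(backoffDelay(attempt, opts));
      }
    }
  }
  throw lastError;
}

// A function that fails twice and then succeeds.
let flakyCalls = 0;
const flaky = async (): Promise<string> => {
  flakyCalls += 1;
  if (flakyCalls < 3) throw new Error(`attempt ${flakyCalls} failed`);
  return 'ok';
};

const retryDemo = retrySketch(flaky, { maxRetries: 3, baseDelay: 10, maxDelay: 50 });
retryDemo.then((value) => console.log(value, 'after', flakyCalls, 'calls'));
```

With `baseDelay: 1000` and `maxDelay: 10000`, this scheme waits 1 s, 2 s, 4 s, and 8 s, then stays at 10 s. Production code often adds random jitter so that many clients do not retry in lockstep.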

## Search Provider Configuration

Configure different search providers based on your needs and available API keys.

```typescript
import { createSearchProvider, DuckDuckGoSearch, BraveSearch, SerperSearch } from 'groq-rag';

// DuckDuckGo (free, no API key)
const duckduckgo = createSearchProvider({ provider: 'duckduckgo' });
// or: const duckduckgo = new DuckDuckGoSearch();

// Brave Search (requires API key from brave.com/search/api)
const brave = createSearchProvider({
  provider: 'brave',
  apiKey: process.env.BRAVE_API_KEY,
});
// or: const brave = new BraveSearch(process.env.BRAVE_API_KEY!);

// Serper.dev Google Search (requires API key from serper.dev)
const serper = createSearchProvider({
  provider: 'serper',
  apiKey: process.env.SERPER_API_KEY,
});
// or: const serper = new SerperSearch(process.env.SERPER_API_KEY!);

// Use any provider
const results = await brave.search('TypeScript tutorials', {
  maxResults: 5,
  safeSearch: true,
});
```

## Vector Store Configuration

Configure different vector store backends for RAG persistence.

```typescript
import GroqRAG, { createVectorStore, MemoryVectorStore, ChromaVectorStore } from 'groq-rag';

const client = new GroqRAG();

// In-memory store (default, no persistence)
await client.initRAG({
  vectorStore: { provider: 'memory' },
});

// ChromaDB for production (persistent)
await client.initRAG({
  vectorStore: {
    provider: 'chroma',
    connectionString: 'http://localhost:8000',
    indexName: 'my-knowledge-base',
  },
});

// Create stores directly
const memoryStore = createVectorStore({ provider: 'memory' });
const chromaStore = createVectorStore({
  provider: 'chroma',
  connectionString: 'http://localhost:8000',
});

// Vector store operations
// (documentsWithEmbeddings and queryEmbedding are placeholders for your own data)
await memoryStore.add(documentsWithEmbeddings);
const results = await memoryStore.search(queryEmbedding, { topK: 5 });
await memoryStore.delete(['doc-id-1', 'doc-id-2']);
const count = await memoryStore.count();
await memoryStore.clear();
```

## Embedding Provider Configuration

Configure embedding providers for vector generation. The `groq` provider produces deterministic pseudo-embeddings intended for testing; use the `openai` provider for production-quality retrieval.

```typescript
import GroqRAG, { createEmbeddingProvider, GroqEmbeddings, OpenAIEmbeddings } from 'groq-rag';

const client = new GroqRAG();

// Groq embeddings (default, deterministic, free)
await client.initRAG({
  embedding: { provider: 'groq' },
});

// OpenAI embeddings (production quality)
await client.initRAG({
  embedding: {
    provider: 'openai',
    apiKey: process.env.OPENAI_API_KEY,
    model: 'text-embedding-3-small',
    dimensions: 1536,
  },
});

// Create embeddings directly
const groqEmbed = createEmbeddingProvider({ provider: 'groq' }, client.client);  // Underlying Groq SDK client
const openaiEmbed = createEmbeddingProvider({
  provider: 'openai',
  apiKey: process.env.OPENAI_API_KEY,
});

// Generate embeddings
const { embedding, tokenCount } = await openaiEmbed.embed('Hello world');
const batchResults = await openaiEmbed.embedBatch(['Text 1', 'Text 2', 'Text 3']);
console.log(`Dimensions: ${openaiEmbed.dimensions}`);
```
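
To illustrate what a "deterministic pseudo-embedding" can look like, the sketch below hashes each word into a fixed number of buckets and normalizes the result. This is an assumption-laden illustration of the general idea, not the algorithm groq-rag uses. `pseudoEmbed` and `hashWord` are hypothetical names.

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units (a simplification of byte-wise FNV-1a).
function hashWord(word: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < word.length; i++) {
    hash ^= word.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Count words per hash bucket, then scale the vector to unit length.
// The same text always maps to the same vector, and no model call is involved.
function pseudoEmbed(text: string, dimensions: number): number[] {
  const vector = new Array<number>(dimensions).fill(0);
  const words = text.toLowerCase().split(/\W+/).filter((w) => w.length > 0);
  for (const word of words) {
    const bucket = hashWord(word) % dimensions;
    vector[bucket] = (vector[bucket] ?? 0) + 1;
  }
  const norm = Math.sqrt(vector.reduce((sum, v) => sum + v * v, 0));
  return norm === 0 ? vector : vector.map((v) => v / norm);
}

const first = pseudoEmbed('Hello world', 16);
const second = pseudoEmbed('Hello world', 16);
console.log(JSON.stringify(first) === JSON.stringify(second)); // same input, same vector
```

Vectors like these capture word overlap but not meaning, so "car" and "automobile" would look unrelated. That limitation is the reason to prefer a learned embedding model for real retrieval quality.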

## Summary

groq-rag is designed for AI applications that need more than basic chat completion. Common use cases include customer support chatbots backed by a RAG knowledge base, research assistants that search the web and synthesize information, document Q&A systems for enterprise knowledge management, and autonomous agents that perform multi-step tasks using tools. Streaming support lets user interfaces show agent reasoning as it happens.

Because groq-rag is a drop-in replacement for the official groq-sdk, existing Groq SDK code works unchanged, and developers can adopt RAG, web, or agent features incrementally. The modular architecture lets you use only the components you need: the web fetcher for content extraction, the vector store for semantic search, or the full agent system for autonomous task completion. All components are typed with TypeScript and are accessible through the unified GroqRAG client interface.