### Install and Start Express Server

Source: https://github.com/vakra-dev/reader/blob/main/examples/production/README.md

Commands to install dependencies and start the Express Server example for Reader.

```bash
cd express-server && npm install && npm start
```

--------------------------------

### Install and Start Browser Pool Scaling

Source: https://github.com/vakra-dev/reader/blob/main/examples/production/README.md

Commands to install dependencies and start the Browser Pool Scaling example for Reader.

```bash
cd browser-pool-scaling && npm install && npm start
```

--------------------------------

### Install and Start Job Queue (BullMQ)

Source: https://github.com/vakra-dev/reader/blob/main/examples/production/README.md

Commands to install dependencies and start the API server and worker process for the Job Queue example using BullMQ.

```bash
cd job-queue-bullmq && npm install
npm run start   # API server (terminal 1)
npm run worker  # Worker process (run in a separate terminal)
```

--------------------------------

### Run Production Server Example

Source: https://github.com/vakra-dev/reader/blob/main/CONTRIBUTING.md

This command starts the production-ready Express.js server example. Run it from the `examples/` folder.

```bash
npx tsx production/express-server/src/index.ts
```

--------------------------------

### Install Reader and Hero Core Dependencies

Source: https://github.com/vakra-dev/reader/blob/main/docs/deployment/production-server.md

Installs the necessary packages for Vakra Reader, Express, and the shared Hero Core components. This is the initial setup step for the production server.

```bash
npm install @vakra-dev/reader express
npm install @ulixee/hero-core @ulixee/net  # For shared Core
```

--------------------------------

### Provider Examples

Source: https://github.com/vakra-dev/reader/blob/main/docs/guides/proxy-configuration.md

Code examples for configuring proxies with popular providers like IPRoyal, Bright Data, Oxylabs, and SmartProxy.

```APIDOC
## Provider Examples

### Description
Illustrative proxy configurations for various popular proxy providers.

### Method
N/A (Configuration Snippets)

### Endpoint
N/A (Configuration Snippets)

### Parameters
#### IPRoyal Example
- **type**: "residential"
- **host**: "geo.iproyal.com"
- **port**: 12321
- **username**: "customer-username"
- **password**: "password"
- **country**: "us"

#### Bright Data (Luminati) Example
- **type**: "residential"
- **host**: "brd.superproxy.io"
- **port**: 22225
- **username**: "customer-zone-residential"
- **password**: "password"
- **country**: "us"

#### Oxylabs Example
- **type**: "residential"
- **host**: "pr.oxylabs.io"
- **port**: 7777
- **username**: "customer-username"
- **password**: "password"
- **country**: "us"

#### SmartProxy Example
- **type**: "residential"
- **host**: "gate.smartproxy.com"
- **port**: 7000
- **username**: "user"
- **password**: "pass"
- **country**: "us"

### Request Example (IPRoyal)
```json
{
  "type": "residential",
  "host": "geo.iproyal.com",
  "port": 12321,
  "username": "customer-username",
  "password": "password",
  "country": "us"
}
```
```

--------------------------------

### Install BullMQ and Dependencies

Source: https://github.com/vakra-dev/reader/blob/main/docs/deployment/job-queues.md

Installs the necessary packages for BullMQ, ioredis, and the @vakra-dev/reader library using npm.

```bash
npm install bullmq ioredis @vakra-dev/reader
```

--------------------------------

### Install Dependencies with npm

Source: https://github.com/vakra-dev/reader/blob/main/examples/production/job-queue-bullmq/README.md

Installs the necessary Node.js packages for the BullMQ job queue example. Ensure you are in the correct directory before running.

```bash
cd examples/production/job-queue-bullmq
npm install
```

--------------------------------

### BrowserPool Configuration Example (TypeScript)

Source: https://github.com/vakra-dev/reader/blob/main/docs/guides/browser-pool.md

Provides a comprehensive example of configuring a BrowserPool with various options such as size, retirement policies, and queue limits.

```typescript
import { BrowserPool } from "@vakra-dev/reader";

const pool = new BrowserPool({
  size: 5,                    // Number of browser instances
  retireAfterPages: 100,      // Recycle after N pages
  retireAfterMinutes: 30,     // Recycle after N minutes
  maxQueueSize: 100,          // Max pending requests
  healthCheckIntervalMs: 300000, // Health check interval (5 min)
});
```
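
To make the retirement options concrete, the following is a minimal sketch of how a pool could decide when to recycle an instance. It is not Reader's implementation: the `InstanceStats` shape and the `shouldRetire` helper are hypothetical, and only the option names `retireAfterPages` and `retireAfterMinutes` come from the configuration above.

```typescript
// Hypothetical illustration; not part of @vakra-dev/reader.
interface RetirePolicy {
  retireAfterPages: number;   // same meaning as the pool option above
  retireAfterMinutes: number; // same meaning as the pool option above
}

interface InstanceStats {
  pagesServed: number; // pages this browser instance has handled so far
  createdAtMs: number; // creation time, in epoch milliseconds
}

// Returns true once either the page limit or the age limit is reached.
function shouldRetire(stats: InstanceStats, policy: RetirePolicy, nowMs: number): boolean {
  const ageMinutes = (nowMs - stats.createdAtMs) / 60000;
  return (
    stats.pagesServed >= policy.retireAfterPages ||
    ageMinutes >= policy.retireAfterMinutes
  );
}

const retirePolicy: RetirePolicy = { retireAfterPages: 100, retireAfterMinutes: 30 };

// A young instance with few pages stays in the pool.
console.log(shouldRetire({ pagesServed: 10, createdAtMs: 0 }, retirePolicy, 5 * 60000)); // false
// Reaching the page limit triggers retirement.
console.log(shouldRetire({ pagesServed: 100, createdAtMs: 0 }, retirePolicy, 5 * 60000)); // true
// Reaching the age limit triggers retirement as well.
console.log(shouldRetire({ pagesServed: 10, createdAtMs: 0 }, retirePolicy, 31 * 60000)); // true
```

Checking both limits with a logical OR means whichever threshold is hit first recycles the browser, which is one plausible reading of the two options.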

--------------------------------

### Bright Data Provider Configuration

Source: https://github.com/vakra-dev/reader/blob/main/docs/guides/proxy-configuration.md

Example configuration for using Bright Data (formerly Luminati) as a residential proxy provider. This setup details the necessary parameters for connecting to their service.

```typescript
proxy: {
  type: "residential",
  host: "brd.superproxy.io",
  port: 22225,
  username: "customer-zone-residential",
  password: "password",
  country: "us",
}
```

--------------------------------

### Start API Server with npm

Source: https://github.com/vakra-dev/reader/blob/main/examples/production/job-queue-bullmq/README.md

Starts the API server that handles job submissions and status checks. This command assumes Node.js is installed and dependencies are met.

```bash
npm run start
```

--------------------------------

### AWS Lambda Container Dockerfile Setup

Source: https://github.com/vakra-dev/reader/blob/main/docs/deployment/serverless.md

Sets up a Dockerfile for AWS Lambda to include Chrome and its dependencies. It installs necessary packages, copies application code, and defines the entry point for the Lambda function. This allows running Chrome within a containerized Lambda environment.

```dockerfile
FROM public.ecr.aws/lambda/nodejs:20

# Install Chrome dependencies
RUN yum install -y \
    chromium \
    nss \
    freetype \
    freetype-devel \
    fontconfig \
    pango \
    --skip-broken

ENV CHROME_PATH=/usr/bin/chromium-browser
ENV FONTCONFIG_PATH=/etc/fonts

COPY package*.json ./
RUN npm ci --omit=dev

COPY . .

CMD ["dist/handler.handler"]
```

--------------------------------

### Basic Dockerfile for Reader Application

Source: https://github.com/vakra-dev/reader/blob/main/docs/deployment/docker.md

A foundational Dockerfile to build and run the Reader application. It installs Node.js, necessary Chrome dependencies, copies application files, installs production dependencies, and exposes the application port.

```dockerfile
# Dockerfile
FROM node:22-slim

# Install Chrome dependencies
RUN apt-get update && apt-get install -y \
    chromium \
    fonts-liberation \
    libasound2 \
    libatk-bridge2.0-0 \
    libatk1.0-0 \
    libcups2 \
    libdbus-1-3 \
    libdrm2 \
    libgbm1 \
    libgtk-3-0 \
    libnspr4 \
    libnss3 \
    libxcomposite1 \
    libxdamage1 \
    libxrandr2 \
    xdg-utils \
    --no-install-recommends \
    && rm -rf /var/lib/apt/lists/*

# Set Chrome path for Hero
ENV CHROME_PATH=/usr/bin/chromium

WORKDIR /app

# Copy package files
COPY package*.json ./

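# Install production dependencies only
RUN npm ci --omit=dev

# Copy application
COPY . .

# Build if TypeScript. Errors are suppressed, and dev dependencies were
# omitted above, so this step is skipped silently if build tooling is missing.
RUN npm run build 2>/dev/null || true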

EXPOSE 3000

CMD ["node", "dist/server.js"]

```

--------------------------------

### Start Application with PM2

Source: https://github.com/vakra-dev/reader/blob/main/docs/deployment/production-server.md

This command initiates the application using PM2, applying the configuration defined in the `ecosystem.config.js` file. PM2 will manage the application lifecycle, including clustering and restarts.

```bash
pm2 start ecosystem.config.js
```

--------------------------------

### Datacenter Proxy Configuration

Source: https://github.com/vakra-dev/reader/blob/main/docs/guides/proxy-configuration.md

Example configuration for using datacenter proxies.

```APIDOC
## Datacenter Proxies

### Description
Configuration object for datacenter proxies.

### Method
N/A (Configuration Snippet)

### Endpoint
N/A (Configuration Snippet)

### Parameters
#### Request Body (within `proxy` object)
- **type** (string) - Must be "datacenter".
- **host** (string) - Proxy server hostname.
- **port** (number) - Proxy server port.
- **username** (string) - Authentication username.
- **password** (string) - Authentication password.

### Request Example
```json
{
  "type": "datacenter",
  "host": "proxy.example.com",
  "port": 8080,
  "username": "username",
  "password": "password"
}
```
```

--------------------------------

### Docker Compose Quick Start (Bash)

Source: https://github.com/vakra-dev/reader/blob/main/examples/deployment/docker/README.md

This command initiates the Reader Docker container using Docker Compose. It's the quickest way to get the Reader REST API server running locally.

```bash
cd examples/deployment/docker
docker-compose up -d
```

--------------------------------

### Run AI Integration Examples

Source: https://github.com/vakra-dev/reader/blob/main/CONTRIBUTING.md

These commands run the AI integration examples, specifically the one that uses OpenAI for summarization. They require the `OPENAI_API_KEY` environment variable to be set.

```bash
# AI integration examples (requires API keys)
export OPENAI_API_KEY="sk-..."
npx tsx ai-tools/openai-summary.ts https://example.com
```

--------------------------------

### Run Basic Examples

Source: https://github.com/vakra-dev/reader/blob/main/CONTRIBUTING.md

These commands execute basic examples from the 'examples/' folder, covering simple scraping, batch scraping, and website crawling.

```bash
cd examples
npm install

# Basic examples
npx tsx basic/basic-scrape.ts
npx tsx basic/batch-scrape.ts
npx tsx basic/crawl-website.ts
```

--------------------------------

### Basic Docker Compose Setup for Reader

Source: https://github.com/vakra-dev/reader/blob/main/docs/deployment/docker.md

A simple Docker Compose configuration to define and run the Reader service. It specifies the build context, port mapping, environment variables, and restart policy.

```yaml
# docker-compose.yml
version: "3.8"

services:
  reader:
    build: .
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - LOG_LEVEL=info
    restart: unless-stopped
    deploy:
      resources:
        limits:
          memory: 2G

```

--------------------------------

### SmartProxy Provider Configuration

Source: https://github.com/vakra-dev/reader/blob/main/docs/guides/proxy-configuration.md

Example configuration for using SmartProxy residential proxies. This demonstrates how to set the host, port, username, password, and country for the proxy connection.

```typescript
proxy: {
  type: "residential",
  host: "gate.smartproxy.com",
  port: 7000,
  username: "user",
  password: "pass",
  country: "us",
}
```

--------------------------------

### Manually Installing Chromium on Ubuntu/Debian

Source: https://github.com/vakra-dev/reader/blob/main/docs/troubleshooting.md

Install the Chromium browser manually on Ubuntu or Debian systems using the apt package manager. The package is named `chromium-browser` on Ubuntu; on Debian it is named `chromium`.

```bash
sudo apt-get update
sudo apt-get install -y chromium-browser
```

--------------------------------

### Verify Development Setup (Bash)

Source: https://github.com/vakra-dev/reader/blob/main/CONTRIBUTING.md

Commands to verify the development environment setup by running type checking and building the project. These commands ensure Node.js and npm are correctly configured.

```bash
npm run typecheck
npm run build
```

--------------------------------

### Docker Compose Setup with Redis for Reader

Source: https://github.com/vakra-dev/reader/blob/main/docs/deployment/docker.md

A Docker Compose configuration for a multi-service setup including an API service (Reader), a worker service, and a Redis instance for job queues. It defines dependencies, environment variables, and resource limits.

```yaml
# docker-compose.yml
version: "3.8"

services:
  api:
    build:
      context: .
      dockerfile: Dockerfile.api
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - REDIS_HOST=redis
      - REDIS_PORT=6379
    depends_on:
      - redis
    restart: unless-stopped

  worker:
    build:
      context: .
      dockerfile: Dockerfile.worker
    environment:
      - NODE_ENV=production
      - REDIS_HOST=redis
      - REDIS_PORT=6379
    depends_on:
      - redis
    deploy:
      replicas: 3
      resources:
        limits:
          memory: 2G
    restart: unless-stopped

  redis:
    image: redis:7-alpine
    volumes:
      - redis-data:/data
    restart: unless-stopped

volumes:
  redis-data:

```

--------------------------------

### Datacenter Proxy Configuration

Source: https://github.com/vakra-dev/reader/blob/main/docs/guides/proxy-configuration.md

Example configuration for using a datacenter proxy with the Reader client. Datacenter proxies are fast and cheap but can be easily detected.

```typescript
proxy: {
  type: "datacenter",
  host: "proxy.example.com",
  port: 8080,
  username: "username",
  password: "password"
}
```

--------------------------------

### Reduce Cold Starts with Connection Warm-up

Source: https://github.com/vakra-dev/reader/blob/main/docs/deployment/serverless.md

This TypeScript code snippet shows a pattern for reducing cold starts by keeping a connection warm. It initializes the connection only once and reuses the promise for subsequent calls.

```typescript
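// Keep connection warm: cache the promise at module scope so that
// warm invocations reuse it instead of reconnecting.
let connectionPromise: Promise<any> | undefined;

function getConnection() {
  if (!connectionPromise) {
    // initializeConnection() is your own setup function (not shown here)
    connectionPromise = initializeConnection();
  }
  return connectionPromise;
}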
```
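
The pattern can be exercised on its own with a stand-in for the setup work. In this sketch `initializeConnection` is a hypothetical placeholder that only counts how often it runs, and the reset-on-failure step is an addition that the original snippet does not include.

```typescript
// Hypothetical stand-in for the real setup work; counts how often it runs.
let initCount = 0;
async function initializeConnection(): Promise<{ id: number }> {
  initCount += 1;
  return { id: initCount };
}

// Cached at module scope so it survives across warm invocations.
let connectionPromise: Promise<{ id: number }> | undefined;

function getConnection(): Promise<{ id: number }> {
  if (!connectionPromise) {
    connectionPromise = initializeConnection().catch((err) => {
      // Drop the cached promise on failure so the next call can retry.
      connectionPromise = undefined;
      throw err;
    });
  }
  return connectionPromise;
}

const firstCall = getConnection();
const secondCall = getConnection();

console.log(firstCall === secondCall); // true: both callers share one promise
console.log(initCount);                // 1: setup ran only once
```

Caching the promise rather than the resolved value means that concurrent callers arriving during initialization also share the same in-flight setup.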

--------------------------------

### Docker Compose Management Commands

Source: https://github.com/vakra-dev/reader/blob/main/docs/deployment/docker.md

Essential commands for managing Docker Compose services, including starting, scaling, viewing logs, and stopping all services defined in the docker-compose.yml file.

```bash
# Start all services
docker-compose up -d

# Scale workers
docker-compose up -d --scale worker=5

# View logs
docker-compose logs -f worker

# Stop services
docker-compose down

```

--------------------------------

### Initialize BrowserPool (TypeScript)

Source: https://github.com/vakra-dev/reader/blob/main/docs/guides/browser-pool.md

Shows the basic steps to initialize a BrowserPool instance with a specified size. Initialization includes creating Hero instances and starting background health checking.

```typescript
import { BrowserPool } from "@vakra-dev/reader";

const pool = new BrowserPool({ size: 5 });
await pool.initialize();
```

--------------------------------

### Install Reader from npm

Source: https://github.com/vakra-dev/reader/blob/main/docs/getting-started.md

Installs the Reader package using npm. This is the recommended method for most users. Ensure Node.js and npm are installed and up to date.

```bash
npm install @vakra-dev/reader
```

--------------------------------

### Troubleshoot Chrome Startup (Bash)

Source: https://github.com/vakra-dev/reader/blob/main/docs/deployment/docker.md

These bash commands are used to troubleshoot Chrome startup issues within a Docker container. The first checks the Chrome installation version, and the second performs a manual headless test.

```bash
# Check Chrome installation
docker exec -it container_name chromium --version

# Test Chrome manually
docker exec -it container_name chromium --headless --no-sandbox --dump-dom https://example.com
```

--------------------------------

### Start API and Worker with npm

Source: https://github.com/vakra-dev/reader/blob/main/examples/production/job-queue-bullmq/README.md

Starts both the API server and the worker process simultaneously for development purposes. This is a convenient command for local testing.

```bash
npm run dev
```

--------------------------------

### Install Reader from Source

Source: https://github.com/vakra-dev/reader/blob/main/docs/getting-started.md

Installs the Reader package by cloning the source repository. This method is useful for developers who want to contribute to the project or use the latest unreleased features. It requires Git, Node.js, and npm.

```bash
git clone https://github.com/vakra-dev/reader.git
cd reader
npm install
npm run build
```

--------------------------------

### Request Output Formats via CLI

Source: https://github.com/vakra-dev/reader/blob/main/docs/guides/output-formats.md

Shows how to use the Reader CLI to scrape a URL and specify output formats. Examples include requesting a single format and multiple comma-separated formats.

```bash
# Single format
npx reader scrape https://example.com -f markdown

# Multiple formats
npx reader scrape https://example.com -f markdown,text,json
```

--------------------------------

### Start Redis Server with Docker

Source: https://github.com/vakra-dev/reader/blob/main/examples/production/job-queue-bullmq/README.md

Starts a Redis server instance using a Docker container. This is a prerequisite for the BullMQ job queue.

```bash
docker run -d -p 6379:6379 redis:alpine
```

--------------------------------

### Vercel Configuration for Functions

Source: https://github.com/vakra-dev/reader/blob/main/docs/deployment/serverless.md

A `vercel.json` configuration file specifying settings for Vercel serverless functions. This example sets the memory to 1024MB and the maximum duration to 60 seconds for the `api/scrape.ts` function.

```json
{
  "functions": {
    "api/scrape.ts": {
      "memory": 1024,
      "maxDuration": 60
    }
  }
}
```

--------------------------------

### Run Vakra Reader Server

Source: https://github.com/vakra-dev/reader/blob/main/docs/deployment/production-server.md

Executes the TypeScript server file using tsx, which allows running TypeScript directly without prior compilation. This command starts the production server.

```bash
npx tsx server.ts
```

--------------------------------

### Monitoring Network Resources with Hero

Source: https://github.com/vakra-dev/reader/blob/main/docs/guides/cloudflare-bypass.md

Illustrates how to monitor network resources within a browser instance managed by Hero. This example specifically logs Cloudflare-related resources encountered during navigation. It utilizes the `on('resource')` event handler provided by Hero.

```typescript
await pool.withBrowser(async (hero) => {
  hero.on("resource", (resource) => {
    if (resource.url.includes("cdn-cgi")) {
      console.log("Cloudflare resource:", resource.url);
    }
  });

  await hero.goto("https://protected-site.com");
});
```

--------------------------------

### Test Vakra Reader API Endpoints with cURL

Source: https://github.com/vakra-dev/reader/blob/main/docs/deployment/production-server.md

Provides example cURL commands to test the `/scrape` and `/crawl` endpoints of the Vakra Reader API server. These demonstrate how to send JSON payloads for scraping specific URLs and initiating crawls.

```bash
# Scrape
curl -X POST http://localhost:3000/scrape \
  -H "Content-Type: application/json" \
  -d '{"urls": ["https://example.com"], "formats": ["markdown"]}'

# Crawl
curl -X POST http://localhost:3000/crawl \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "depth": 2, "scrape": true}'
```
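
The same two requests can be issued from TypeScript. The sketch below only builds the request objects so that it runs without a server; `buildJsonPost` is a hypothetical helper, and the base URL assumes the server from the cURL examples is listening on `localhost:3000`.

```typescript
// Assumes the server from the cURL examples is listening on localhost:3000.
const BASE_URL = "http://localhost:3000";

interface JsonRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Builds the same JSON POST request that the cURL commands send.
function buildJsonPost(path: string, payload: unknown): JsonRequest {
  return {
    url: `${BASE_URL}${path}`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
    },
  };
}

const scrapeRequest = buildJsonPost("/scrape", {
  urls: ["https://example.com"],
  formats: ["markdown"],
});

const crawlRequest = buildJsonPost("/crawl", {
  url: "https://example.com",
  depth: 2,
  scrape: true,
});

console.log(scrapeRequest.url, scrapeRequest.init.body);
console.log(crawlRequest.url, crawlRequest.init.body);

// To send a request (needs a runtime with a global fetch, e.g. Node.js 18+):
// const response = await fetch(scrapeRequest.url, scrapeRequest.init);
// console.log(response.status, await response.json());
```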

--------------------------------

### Use Case: Search Indexing with Text Format and TypeScript

Source: https://github.com/vakra-dev/reader/blob/main/docs/guides/output-formats.md

Illustrates a practical use case for the plain text output format: indexing content for search engines. This TypeScript example shows scraping content as text and then adding it to a hypothetical search index.

```typescript
import { ReaderClient } from "@vakra-dev/reader";

const reader = new ReaderClient();
const result = await reader.scrape({
  urls: ["https://example.com"],
  formats: ["text"],
});

// Index plain text (`searchIndex` stands in for your own search client)
await searchIndex.add({
  url: result.data[0].metadata.baseUrl,
  content: result.data[0].text,
});

await reader.close();
```
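
For local experiments, a search index with an `add` method can be approximated in memory. The class below is a hypothetical stand-in, not a Reader API: it builds a simple inverted index over lowercase alphanumeric tokens and returns the URLs of documents that contain every query term.

```typescript
interface IndexedDocument {
  url: string;
  content: string;
}

// Splits text into lowercase alphanumeric tokens.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Minimal in-memory inverted index: token -> set of URLs containing it.
class InMemorySearchIndex {
  private readonly postings = new Map<string, Set<string>>();

  add(doc: IndexedDocument): void {
    for (const token of tokenize(doc.content)) {
      let urls = this.postings.get(token);
      if (!urls) {
        urls = new Set<string>();
        this.postings.set(token, urls);
      }
      urls.add(doc.url);
    }
  }

  // Returns the URLs of documents that contain every token in the query.
  search(query: string): string[] {
    const tokens = tokenize(query);
    if (tokens.length === 0) return [];
    const sets = tokens.map((t) => this.postings.get(t) ?? new Set<string>());
    const [first, ...rest] = sets;
    return Array.from(first).filter((url) => rest.every((set) => set.has(url)));
  }
}

const localIndex = new InMemorySearchIndex();
localIndex.add({ url: "https://example.com", content: "Example Domain for illustrative examples" });
localIndex.add({ url: "https://example.org", content: "Another example page" });

console.log(localIndex.search("example domain")); // [ 'https://example.com' ]
console.log(localIndex.search("missing"));        // []
```

Plain text suits this kind of indexing because no markup has to be stripped before tokenizing.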

--------------------------------

### Manual Acquire and Release of Browser Instance (TypeScript)

Source: https://github.com/vakra-dev/reader/blob/main/docs/guides/browser-pool.md

Provides an example of manually acquiring and releasing a browser instance from the pool. This advanced method requires careful handling of exceptions to ensure the browser is always released.

```typescript
const hero = await pool.acquire();
try {
  await hero.goto("https://example.com");
  // ... do work
} finally {
  await pool.release(hero);
}
```
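
The acquire/release discipline can be wrapped once in a helper so that callers cannot forget the `finally` step. The sketch below uses a hypothetical `ResourcePool` interface and an in-memory fake pool; the real `BrowserPool` API may differ.

```typescript
// Hypothetical minimal pool shape; the real BrowserPool API may differ.
interface ResourcePool<T> {
  acquire(): Promise<T>;
  release(resource: T): Promise<void>;
}

// Runs `work` with an acquired resource and always releases it afterwards,
// whether `work` resolves or throws.
async function withResource<T, R>(
  pool: ResourcePool<T>,
  work: (resource: T) => Promise<R>
): Promise<R> {
  const resource = await pool.acquire();
  try {
    return await work(resource);
  } finally {
    await pool.release(resource);
  }
}

// Tiny in-memory pool used only to demonstrate the ordering of events.
const poolEvents: string[] = [];
const fakePool: ResourcePool<string> = {
  acquire: async () => {
    poolEvents.push("acquire");
    return "browser-1";
  },
  release: async (resource) => {
    poolEvents.push(`release:${resource}`);
  },
};

// The work function fails, yet the resource is released before the error surfaces.
const demoRun = withResource(fakePool, async () => {
  throw new Error("navigation failed");
}).catch((err: Error) => {
  poolEvents.push(`error:${err.message}`);
});

demoRun.then(() => console.log(poolEvents));
```

Because the release happens in `finally`, the error still reaches the caller, but only after the instance has gone back to the pool.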

--------------------------------

### Vercel Environment Variable Setup

Source: https://github.com/vakra-dev/reader/blob/main/docs/deployment/serverless.md

A bash command to add the `BROWSERLESS_URL` environment variable using the Vercel CLI. This is used to configure the connection string for the remote browser service within Vercel Functions.

```bash
vercel env add BROWSERLESS_URL
```

--------------------------------

### Clone Repository and Install Dependencies (Bash)

Source: https://github.com/vakra-dev/reader/blob/main/CONTRIBUTING.md

Steps to clone your fork of the Reader repository from GitHub (replace `YOUR_USERNAME` with your GitHub username) and install project dependencies using npm. This is a prerequisite for local development.

```bash
git clone https://github.com/YOUR_USERNAME/reader.git
cd reader
npm install
```

--------------------------------

### Install supermarkdown Package (Bash)

Source: https://github.com/vakra-dev/reader/blob/main/README.md

Provides commands for installing the supermarkdown package, used for HTML to Markdown conversion, via npm for Node.js projects.

```bash
# npm
npm install @vakra-dev/supermarkdown
```

--------------------------------

### Start Worker with npm

Source: https://github.com/vakra-dev/reader/blob/main/examples/production/job-queue-bullmq/README.md

Starts a worker process that consumes and processes jobs from the queue. This should typically be run in a separate terminal from the API server.

```bash
npm run worker
```

--------------------------------

### Verify CLI Installation

Source: https://github.com/vakra-dev/reader/blob/main/docs/getting-started.md

Tests the command-line interface (CLI) of the Reader package by scraping a sample URL. This command should output the content of example.com in markdown format, confirming the CLI is working correctly.

```bash
npx reader scrape https://example.com
```

--------------------------------

### Manually Installing Chromium on macOS

Source: https://github.com/vakra-dev/reader/blob/main/docs/troubleshooting.md

Install the Chromium browser on macOS using the Homebrew package manager.

```bash
brew install --cask chromium
```

--------------------------------

### Resolving Chrome/Chromium Not Found Error

Source: https://github.com/vakra-dev/reader/blob/main/docs/troubleshooting.md

Troubleshoot the 'Could not find Chrome installation' error. Solutions involve letting Reader download Chrome, manual installation on Ubuntu/Debian or macOS, or pointing to an existing Chrome installation via an environment variable.

```bash
# Clear cache and retry download
rm -rf ~/.cache/ulixee
npx reader scrape https://example.com

# Manual install (Ubuntu/Debian)
sudo apt-get update
sudo apt-get install -y chromium-browser

# Manual install (macOS)
brew install --cask chromium

# Point to existing Chrome
export CHROME_PATH=/usr/bin/chromium-browser
# or on macOS
export CHROME_PATH="/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
```

--------------------------------

### Build and Run Reader Docker Image

Source: https://github.com/vakra-dev/reader/blob/main/docs/deployment/docker.md

Commands to build a Docker image for the Reader application and then run it as a container, mapping the application's port.

```bash
# Build image
docker build -t reader .

# Run container
docker run -p 3000:3000 reader

```

--------------------------------

### Async Scraping with BullMQ and Express.js (TypeScript)

Source: https://github.com/vakra-dev/reader/blob/main/docs/deployment/job-queues.md

This TypeScript code sets up an Express.js server with API endpoints for initiating and monitoring scraping jobs using BullMQ. It defines a worker that processes scraping tasks, leveraging Hero Core for browser automation. The server handles POST requests to start scraping and GET requests to check job status and retrieve results.

```typescript
// complete-example.ts
import { Queue, Worker, Job } from "bullmq";
import express from "express";
import HeroCore from "@ulixee/hero-core";
import { scrape, ScrapeResult } from "@vakra-dev/reader";

const app = express();
app.use(express.json());

// Redis connection
const connection = { host: "localhost", port: 6379 };

// Queue
const scrapeQueue = new Queue("scrape", { connection });

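// Shared Hero Core (started in start() below)
let heroCore: HeroCore;

// Worker
// Note: createConnection() is not defined in this snippet. It should return a
// connection to the shared heroCore instance; supply it from your Core setup.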
const worker = new Worker<any, ScrapeResult>(
  "scrape",
  async (job: Job) => {
    const result = await scrape({
      ...job.data,
      connectionToCore: await createConnection(),
    });
    return result;
  },
  { connection, concurrency: 3 }
);

// API endpoints
app.post("/scrape/async", async (req, res) => {
  const job = await scrapeQueue.add("scrape", req.body);
  res.json({ jobId: job.id });
});

app.get("/scrape/:jobId", async (req, res) => {
  const job = await scrapeQueue.getJob(req.params.jobId);
  if (!job) return res.status(404).json({ error: "Not found" });

  const state = await job.getState();
  res.json({
    state,
    progress: job.progress,
    result: state === "completed" ? job.returnvalue : null,
  });
});

// Start
async function start() {
  heroCore = new HeroCore();
  await heroCore.start();

  app.listen(3000, () => console.log("Server running"));
}

start();

```

--------------------------------

### Residential Proxy Configuration

Source: https://github.com/vakra-dev/reader/blob/main/docs/guides/proxy-configuration.md

Example configuration for using residential proxies, including country targeting.

```APIDOC
## Residential Proxies

### Description
Configuration object for residential proxies, allowing for country-specific targeting.

### Method
N/A (Configuration Snippet)

### Endpoint
N/A (Configuration Snippet)

### Parameters
#### Request Body (within `proxy` object)
- **type** (string) - Must be "residential".
- **host** (string) - Proxy server hostname.
- **port** (number) - Proxy server port.
- **username** (string) - Authentication username.
- **password** (string) - Authentication password.
- **country** (string) - Optional - Country code for geo-targeting (e.g., "us", "uk").

### Request Example
```json
{
  "type": "residential",
  "host": "proxy.example.com",
  "port": 8080,
  "username": "username",
  "password": "password",
  "country": "us"
}
```
```

--------------------------------

### Rotate Proxies Manually

Source: https://github.com/vakra-dev/reader/blob/main/docs/guides/proxy-configuration.md

Demonstrates how to cycle through a predefined list of proxies for sequential requests. This helps distribute load and avoid IP-based blocking. It requires a list of proxy configurations and a counter to track the current proxy.

```typescript
import { ReaderClient } from "@vakra-dev/reader";

const proxies = [
  { host: "proxy1.example.com", port: 8080 },
  { host: "proxy2.example.com", port: 8080 },
  { host: "proxy3.example.com", port: 8080 },
];

let proxyIndex = 0;
const reader = new ReaderClient();

async function scrapeWithRotation(url: string) {
  const proxy = proxies[proxyIndex % proxies.length];
  proxyIndex++;

  return await reader.scrape({
    urls: [url],
    proxy: {
      ...proxy,
      username: "username",
      password: "password",
    },
  });
}

// Don't forget to close when done
// await reader.close();
```
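
The rotation step can also be isolated into a small reusable function. The sketch below is a hypothetical helper, independent of Reader, that hands out endpoints in round-robin order and rejects an empty list up front.

```typescript
interface ProxyEndpoint {
  host: string;
  port: number;
}

// Returns a function that hands out endpoints in round-robin order.
function createRotator(endpoints: ProxyEndpoint[]): () => ProxyEndpoint {
  if (endpoints.length === 0) {
    throw new Error("At least one proxy endpoint is required");
  }
  let index = 0;
  return () => {
    const endpoint = endpoints[index];
    index = (index + 1) % endpoints.length; // wrap so the counter stays bounded
    return endpoint;
  };
}

const nextProxy = createRotator([
  { host: "proxy1.example.com", port: 8080 },
  { host: "proxy2.example.com", port: 8080 },
  { host: "proxy3.example.com", port: 8080 },
]);

// Four picks from three endpoints wrap back to the first one.
const pickedHosts = [nextProxy(), nextProxy(), nextProxy(), nextProxy()].map((p) => p.host);
console.log(pickedHosts);
```

Keeping the counter inside a closure avoids a shared module-level variable, which makes it easier to run several independent rotations side by side.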

--------------------------------

### Scaling Strategies

Source: https://github.com/vakra-dev/reader/blob/main/docs/deployment/production-server.md

Documentation on how to scale the Vakra Reader service horizontally and manage memory limits using PM2.

```APIDOC
## Scaling Strategies

### Horizontal Scaling
Run multiple server instances behind a load balancer.

```bash
# Start multiple instances
PORT=3001 npx tsx server.ts &
PORT=3002 npx tsx server.ts &
PORT=3003 npx tsx server.ts &
```

### PM2 Cluster Mode
Use PM2 to manage and scale Node.js applications.

```javascript
// ecosystem.config.js
module.exports = {
  apps: [{
    name: "reader",
    script: "server.ts",
    interpreter: "npx",
    interpreter_args: "tsx",
    instances: "max",
    exec_mode: "cluster",
    env: {
      NODE_ENV: "production",
      PORT: 3000,
    },
  }],
};
```

```bash
pm2 start ecosystem.config.js
```

### Memory Limits
Configure memory limits for Node.js processes using PM2.

```javascript
// ecosystem.config.js
module.exports = {
  apps: [{
    name: "reader",
    script: "server.ts",
    max_memory_restart: "2G",
    node_args: "--max-old-space-size=2048",
  }],
};
```
```

--------------------------------

### GET /job/:id

Source: https://github.com/vakra-dev/reader/blob/main/docs/deployment/job-queues.md

Retrieves the current status and progress of a specific scraping job using its ID.

```APIDOC
## GET /job/:id

### Description
Retrieves the status and progress of a scraping job identified by its ID.

### Method
GET

### Endpoint
/job/:id

### Parameters
#### Path Parameters
- **id** (string) - Required - The ID of the job to retrieve.

### Response
#### Success Response (200)
- **id** (string) - The job ID.
- **state** (string) - The current state of the job (e.g., "waiting", "active", "completed", "failed").
- **progress** (number) - The completion progress of the job (0-100).
- **data** (object) - The original data submitted for the job.
- **result** (any) - The return value of the job if completed.
- **failedReason** (string) - The reason for failure if the job failed.

#### Response Example
```json
{
  "id": "some-unique-job-id",
  "state": "active",
  "progress": 50,
  "data": {
    "urls": ["https://example.com"],
    "formats": ["markdown"]
  },
  "result": null,
  "failedReason": null
}
```
```

--------------------------------

### Setting Chrome Path Environment Variable

Source: https://github.com/vakra-dev/reader/blob/main/docs/troubleshooting.md

Configure the CHROME_PATH environment variable to point Reader to a specific Chrome or Chromium installation.

```bash
# For Linux
export CHROME_PATH=/usr/bin/chromium-browser

# For macOS
export CHROME_PATH="/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
```

--------------------------------

### CLI Scrape with Proxy

Source: https://github.com/vakra-dev/reader/blob/main/docs/guides/proxy-configuration.md

Demonstrates how to perform a scrape operation using the Reader CLI, specifying a proxy server directly in the command line arguments.

```bash
npx reader scrape https://example.com --proxy http://user:pass@host:port
```
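
When the proxy settings already exist as a configuration object, the URL for `--proxy` can be derived from it. The helper below is a hypothetical sketch: it assumes an `http://` scheme, percent-encodes the credentials, and leaves out `type` and `country` because the URL form shown above has no place for them.

```typescript
// Field names follow the proxy configuration objects used in this guide.
interface ProxyConfig {
  host: string;
  port: number;
  username?: string;
  password?: string;
}

// Builds an http:// proxy URL, percent-encoding the credentials so that
// characters such as "@" or ":" in a password do not break the URL.
function toProxyUrl({ host, port, username, password }: ProxyConfig): string {
  const auth =
    username !== undefined && password !== undefined
      ? `${encodeURIComponent(username)}:${encodeURIComponent(password)}@`
      : "";
  return `http://${auth}${host}:${port}`;
}

console.log(
  toProxyUrl({ host: "proxy.example.com", port: 8080, username: "username", password: "p@ss" })
); // http://username:p%40ss@proxy.example.com:8080

console.log(toProxyUrl({ host: "proxy.example.com", port: 8080 }));
// http://proxy.example.com:8080
```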

--------------------------------

### GET /job/:id/result

Source: https://github.com/vakra-dev/reader/blob/main/docs/deployment/job-queues.md

Retrieves the final result of a completed scraping job. Returns a 202 Accepted status if the job is not yet completed.

```APIDOC
## GET /job/:id/result

### Description
Retrieves the final result of a completed scraping job. If the job is still in progress, it returns the current status and progress.

### Method
GET

### Endpoint
/job/:id/result

### Parameters
#### Path Parameters
- **id** (string) - Required - The ID of the job whose result is requested.

### Response
#### Success Response (200)
- The response body will contain the scraped data in the format(s) requested when the job was enqueued.

#### Accepted Response (202)
- Returned if the job is not yet completed. Contains the current status and progress.
  - **status** (string) - The current state of the job.
  - **progress** (number) - The completion progress of the job (0-100).

#### Response Example (Completed Job)
```json
{
  "content": "# Scraped Content\nThis is the scraped markdown content."
}
```

#### Response Example (Job in Progress)
```json
{
  "status": "active",
  "progress": 75
}
```
```
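
A client typically polls this endpoint until it stops returning 202. The following is a minimal sketch of such a loop and is not part of Reader itself: `waitForJobResult`, `HttpGet`, the polling interval, and the attempt limit are hypothetical names and values chosen for illustration. Only the 200/202 behavior and the `status` and `progress` fields come from the endpoint description above.

```typescript
// Fields of the 202 body, as documented for this endpoint.
interface PendingJob {
  status: string;
  progress: number;
}

// Minimal response surface, so a real fetch Response or a test stub both fit.
interface HttpResponse {
  status: number;
  json(): Promise<unknown>;
}

// Hypothetical transport type: performs a GET and resolves with the response.
type HttpGet = (url: string) => Promise<HttpResponse>;

async function waitForJobResult(
  get: HttpGet,
  baseUrl: string,
  jobId: string,
  intervalMs = 1000, // assumed polling interval
  maxAttempts = 60, // assumed upper bound on polls
): Promise<unknown> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await get(`${baseUrl}/job/${encodeURIComponent(jobId)}/result`);

    // 200: the job is complete and the body holds the scraped data.
    if (res.status === 200) return res.json();

    // Anything other than 202 is outside the documented behavior.
    if (res.status !== 202) {
      throw new Error(`Unexpected status ${res.status} for job ${jobId}`);
    }

    // 202: the job is still running; report progress and wait before retrying.
    const pending = (await res.json()) as PendingJob;
    console.log(`job ${jobId}: ${pending.status} (${pending.progress}%)`);
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Job ${jobId} did not complete after ${maxAttempts} attempts`);
}
```

With the global `fetch` available in Node 18 or later, a call could look like `await waitForJobResult((url) => fetch(url), "http://localhost:3000", jobId)`, where the base URL is an assumption about where the API server listens.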

--------------------------------

### CLI Usage with Proxy

Source: https://github.com/vakra-dev/reader/blob/main/docs/guides/proxy-configuration.md

Demonstrates how to use the Reader CLI to scrape a URL with a specified proxy.

```APIDOC
## CLI Usage

### Description
Command-line interface command to scrape a URL using a proxy.

### Method
N/A (CLI Command)

### Endpoint
N/A (CLI Command)

### Parameters
- **scrape**: Command to initiate scraping.
- **[URL]**: The URL to scrape (e.g., `https://example.com`).
- **--proxy**: Optional flag to specify the proxy URL (e.g., `http://user:pass@host:port`).

### Request Example
```bash
npx reader scrape https://example.com --proxy http://user:pass@host:port
```

### Response
Output will be the scraped content or an error message.
```

--------------------------------

### Reader Client Initialization with Proxy URL

Source: https://github.com/vakra-dev/reader/blob/main/docs/guides/proxy-configuration.md

Initialize the Reader client and configure a proxy using a direct URL string.

```APIDOC
## POST /vakra-dev/reader

### Description
Initializes the Reader client with proxy configuration via a URL.

### Method
POST

### Endpoint
/vakra-dev/reader

### Parameters
#### Request Body
- **urls** (array<string>) - Required - List of URLs to scrape.
- **proxy** (object) - Optional - Proxy configuration.
  - **url** (string) - Required - Full proxy URL (e.g., "http://username:password@proxy.example.com:8080").

### Request Example
```json
{
  "urls": ["https://example.com"],
  "proxy": {
    "url": "http://username:password@proxy.example.com:8080"
  }
}
```

### Response
#### Success Response (200)
- **data** (array<object>) - Array of scraped data.
  - **metadata** (object) - Metadata about the scrape.
    - **baseUrl** (string) - The base URL that was scraped.
    - **proxy** (object) - Information about the proxy used.
      - **host** (string) - Proxy host.
      - **port** (number) - Proxy port.
      - **country** (string) - Optional - Country code if geo-targeting was used.

#### Response Example
```json
{
  "data": [
    {
      "metadata": {
        "baseUrl": "https://example.com",
        "proxy": {
          "host": "proxy.example.com",
          "port": 8080
        }
      }
    }
  ]
}
```
```
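
The URL form carries the same connection details as separate host, port, and credential fields. As a rough illustration of that correspondence, the sketch below splits a proxy URL into its parts with the standard `URL` class. `parseProxyUrl` and `ProxyParts` are hypothetical names, not part of Reader, and the default ports for `http` and `https` are the sketch's own assumption.

```typescript
// Hypothetical shape for the pieces of a proxy URL.
interface ProxyParts {
  host: string;
  port: number;
  username: string | undefined;
  password: string | undefined;
}

function parseProxyUrl(proxyUrl: string): ProxyParts {
  const u = new URL(proxyUrl); // throws on malformed input

  // URL.port is "" when the port is omitted or equals the scheme default.
  const defaultPorts: Record<string, number> = { "http:": 80, "https:": 443 };
  const port = u.port !== "" ? Number(u.port) : defaultPorts[u.protocol];
  if (port === undefined) {
    throw new Error(`No port given and no default known for ${u.protocol}`);
  }

  return {
    host: u.hostname,
    port,
    // Credentials are percent-encoded inside a URL, so decode them.
    username: u.username ? decodeURIComponent(u.username) : undefined,
    password: u.password ? decodeURIComponent(u.password) : undefined,
  };
}

console.log(parseProxyUrl("http://username:password@proxy.example.com:8080"));
```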

--------------------------------

### GET /health

Source: https://github.com/vakra-dev/reader/blob/main/docs/deployment/production-server.md

Provides health status and performance metrics for the reader service, including active and total requests, failed requests, and queue status.

```APIDOC
## GET /health

### Description
Provides health status and performance metrics for the reader service, including active and total requests, failed requests, and queue status.

### Method
GET

### Endpoint
/health

### Response
#### Success Response (200)
- **status** (string) - The overall status of the service ('ok').
- **heroCore** (string) - Status of the heroCore service ('running' or 'stopped').
- **stats** (object) - Performance statistics:
  - **activeRequests** (number) - Number of currently active requests.
  - **totalRequests** (number) - Total number of requests processed.
  - **failedRequests** (number) - Total number of failed requests (status code >= 500).
  - **queueSize** (number) - Current size of the request queue.
  - **queuePending** (number) - Number of requests pending in the queue.

#### Response Example
```json
{
  "status": "ok",
  "heroCore": "running",
  "stats": {
    "activeRequests": 5,
    "totalRequests": 1000,
    "failedRequests": 10,
    "queueSize": 2,
    "queuePending": 1
  }
}
```
```
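
A monitoring script can reduce this payload to a single pass/fail signal. The sketch below is one way to do that under stated assumptions: `evaluateHealth` and the 5% failure-rate threshold are hypothetical and not part of Reader, while the `HealthResponse` fields mirror the response documented above.

```typescript
// Mirrors the documented /health response body.
interface HealthResponse {
  status: string;
  heroCore: string;
  stats: {
    activeRequests: number;
    totalRequests: number;
    failedRequests: number;
    queueSize: number;
    queuePending: number;
  };
}

interface HealthVerdict {
  healthy: boolean;
  failureRate: number;
  reasons: string[];
}

// maxFailureRate is an assumed threshold; tune it for the deployment.
function evaluateHealth(h: HealthResponse, maxFailureRate = 0.05): HealthVerdict {
  const reasons: string[] = [];
  if (h.status !== "ok") reasons.push(`status is "${h.status}"`);
  if (h.heroCore !== "running") reasons.push(`heroCore is "${h.heroCore}"`);

  // Avoid dividing by zero before any request has been processed.
  const failureRate =
    h.stats.totalRequests > 0 ? h.stats.failedRequests / h.stats.totalRequests : 0;
  if (failureRate > maxFailureRate) {
    reasons.push(`failure rate ${failureRate} exceeds ${maxFailureRate}`);
  }

  return { healthy: reasons.length === 0, failureRate, reasons };
}
```

For the example response above, 10 failures out of 1000 requests is a 1% failure rate, so the verdict is healthy under the assumed 5% threshold.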

--------------------------------

### Oxylabs Provider Configuration

Source: https://github.com/vakra-dev/reader/blob/main/docs/guides/proxy-configuration.md

Example configuration for integrating Oxylabs residential proxies with the Reader client. This snippet shows the specific host, port, and authentication details required.

```typescript
proxy: {
  type: "residential",
  host: "pr.oxylabs.io",
  port: 7777,
  username: "customer-username",
  password: "password",
  country: "us",
}
```

--------------------------------

### Manual Docker Image Build (Bash)

Source: https://github.com/vakra-dev/reader/blob/main/examples/deployment/docker/README.md

Builds the Docker image for the Reader project. Run it from the repository root, since the build context is `.`; it uses the Dockerfile in `examples/deployment/docker` and tags the image `reader`.

```bash
docker build -t reader -f examples/deployment/docker/Dockerfile .
```

--------------------------------

### IPRoyal Provider Configuration

Source: https://github.com/vakra-dev/reader/blob/main/docs/guides/proxy-configuration.md

Example of configuring the Reader client to use IPRoyal as a residential proxy provider. This includes specifying the host, port, username, password, and target country.

```typescript
proxy: {
  type: "residential",
  host: "geo.iproyal.com",
  port: 12321,
  username: "customer-username",
  password: "password",
  country: "us",
}
```

--------------------------------

### Crawl Request Flow Example

Source: https://github.com/vakra-dev/reader/blob/main/docs/architecture.md

Illustrates the step-by-step process of a crawl request, from initialization to result return. It details the BFS loop, page fetching, link extraction and filtering, rate limiting, and optional scraping.

```text
crawl({ url: "https://example.com", depth: 2, scrape: true })
  │
  ├─► Crawler.crawl()
  │     │
  │     ├─► Initialize queue with seed URL at depth 0
  │     │
  │     ├─► BFS loop (while queue not empty && pages < maxPages):
  │     │     │
  │     │     ├─► Dequeue next URL
  │     │     │
  │     │     ├─► Fetch page with Hero
  │     │     │
  │     │     ├─► Extract links via regex
  │     │     │
  │     │     ├─► Filter links:
  │     │     │     ├─► Same domain only
  │     │     │     ├─► Match includePatterns
  │     │     │     └─► Exclude excludePatterns
  │     │     │
  │     │     ├─► Add new links to queue with depth + 1
  │     │     │
  │     │     ├─► Rate limit (delay between requests)
  │     │     │
  │     │     └─► Add to discovered URLs
  │     │
  │     ├─► If scrape=true:
  │     │     └─► scrape({ urls: discoveredUrls })
  │     │
  │     └─► Return CrawlResult { urls[], scraped?, metadata }
  │
  └─► Result returned to caller
```
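
The BFS loop in the diagram can be sketched in a few dozen lines. This is an illustrative reimplementation, not Reader's actual `Crawler`: `crawlSketch`, `CrawlSketchOptions`, and `FetchHtml` are hypothetical names, the page fetch is injected as a plain function standing in for Hero, and the `href` regex is a simplification.

```typescript
// Hypothetical options; only the flow follows the diagram above.
interface CrawlSketchOptions {
  url: string;
  depth: number; // maximum link depth to follow from the seed
  maxPages: number; // stop after this many pages
  delayMs: number; // pause between requests
  includePatterns?: RegExp[];
  excludePatterns?: RegExp[];
}

// Stand-in for "Fetch page with Hero": resolves with the page HTML.
type FetchHtml = (url: string) => Promise<string>;

async function crawlSketch(opts: CrawlSketchOptions, fetchHtml: FetchHtml): Promise<string[]> {
  const seed = new URL(opts.url);
  const queue: Array<{ url: string; depth: number }> = [{ url: seed.href, depth: 0 }];
  const seen = new Set<string>([seed.href]);
  const discovered: string[] = [];

  while (queue.length > 0 && discovered.length < opts.maxPages) {
    const { url, depth } = queue.shift()!; // FIFO dequeue gives breadth-first order
    const html = await fetchHtml(url);

    if (depth < opts.depth) {
      const hrefPattern = /href="([^"]+)"/g; // extract links via regex
      let match: RegExpExecArray | null;
      while ((match = hrefPattern.exec(html)) !== null) {
        let link: URL;
        try {
          link = new URL(match[1], url); // resolve relative links against the page
        } catch {
          continue; // skip malformed hrefs
        }
        link.hash = "";
        const href = link.href;
        const included =
          !opts.includePatterns?.length || opts.includePatterns.some((p) => p.test(href));
        const excluded = opts.excludePatterns?.some((p) => p.test(href)) ?? false;
        if (link.hostname !== seed.hostname || !included || excluded || seen.has(href)) continue;
        seen.add(href);
        queue.push({ url: href, depth: depth + 1 });
      }
    }

    // Rate limit: pause before the next request.
    if (opts.delayMs > 0) await new Promise<void>((r) => setTimeout(r, opts.delayMs));
    discovered.push(url);
  }
  return discovered;
}
```

Tracking visited URLs in a `Set` keeps a page that is linked from several places from being queued twice, which the diagram implies with "Add new links".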

--------------------------------

### Reader Environment Variables Configuration

Source: https://github.com/vakra-dev/reader/blob/main/docs/deployment/docker.md

Example of environment variables that can be configured for the Reader service in Docker Compose. They set the Node environment, port, log level, and Chrome path, along with a concurrent-request limit and a request timeout in milliseconds.

```yaml
services:
  reader:
    environment:
      - NODE_ENV=production
      - PORT=3000
      - LOG_LEVEL=info
      - CHROME_PATH=/usr/bin/chromium
      - MAX_CONCURRENT_REQUESTS=10
      - REQUEST_TIMEOUT_MS=60000
```
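
On the application side, variables like these are usually read once at startup and validated. The sketch below shows one way a server could do that; it is an assumption about usage, not Reader's own code. `loadConfig`, `ReaderEnvConfig`, and the fallback values (copied from the Compose example above) are hypothetical.

```typescript
// Hypothetical typed view of the variables from the Compose example.
interface ReaderEnvConfig {
  nodeEnv: string;
  port: number;
  logLevel: string;
  chromePath: string | undefined;
  maxConcurrentRequests: number;
  requestTimeoutMs: number;
}

type Env = Record<string, string | undefined>;

// Reads a positive integer variable, falling back when it is unset or empty.
function readInt(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

// Fallbacks mirror the example values; they are not documented Reader defaults.
function loadConfig(env: Env): ReaderEnvConfig {
  return {
    nodeEnv: env.NODE_ENV ?? "development",
    port: readInt(env, "PORT", 3000),
    logLevel: env.LOG_LEVEL ?? "info",
    chromePath: env.CHROME_PATH,
    maxConcurrentRequests: readInt(env, "MAX_CONCURRENT_REQUESTS", 10),
    requestTimeoutMs: readInt(env, "REQUEST_TIMEOUT_MS", 60000),
  };
}
```

In a Node process this would be called as `loadConfig(process.env)`. Failing fast on a malformed value surfaces a typo in the Compose file at startup rather than at the first request.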

--------------------------------

### Residential Proxy Configuration

Source: https://github.com/vakra-dev/reader/blob/main/docs/guides/proxy-configuration.md

Example configuration for using a residential proxy with the Reader client. Residential proxies use real IP addresses, making them harder to detect and suitable for sensitive scraping.

```typescript
proxy: {
  type: "residential",
  host: "proxy.example.com",
  port: 8080,
  username: "username",
  password: "password",
  country: "us",
}
```

--------------------------------

### Set Up Bull Board Queue Dashboard (TypeScript)

Source: https://github.com/vakra-dev/reader/blob/main/docs/deployment/job-queues.md

Integrates Bull Board to provide a web-based dashboard for monitoring and managing BullMQ queues. It allows visualization of job statuses, queue metrics, and manual job operations.

```typescript
import { createBullBoard } from "@bull-board/api";
import { BullMQAdapter } from "@bull-board/api/bullMQAdapter";
import { ExpressAdapter } from "@bull-board/express";

// Assumes an Express `app` and a BullMQ `scrapeQueue` are already defined elsewhere.
const serverAdapter = new ExpressAdapter();
serverAdapter.setBasePath("/admin/queues");

createBullBoard({
  queues: [new BullMQAdapter(scrapeQueue)],
  serverAdapter,
});

app.use("/admin/queues", serverAdapter.getRouter());
```

--------------------------------

### Get BrowserPool Statistics (TypeScript)

Source: https://github.com/vakra-dev/reader/blob/main/docs/guides/browser-pool.md

Shows how to retrieve statistics about the current state of the browser pool, including the total number of instances, available instances, instances in use, and queue size.

```typescript
// `pool` is an existing BrowserPool instance created elsewhere.
const stats = pool.getStats();
console.log(stats);
// {
//   total: 5,
//   available: 3,
//   inUse: 2,
//   queueSize: 0,
//   totalAcquired: 150,
//   totalRecycled: 3
// }
```
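
These counters are easier to alert on once reduced to a utilization figure. The sketch below is a hypothetical helper, not part of Reader: `summarizePool` and its notion of "saturated" are assumptions, and `PoolStats` simply mirrors the fields printed above.

```typescript
// Mirrors the fields shown in the getStats() output above.
interface PoolStats {
  total: number;
  available: number;
  inUse: number;
  queueSize: number;
  totalAcquired: number;
  totalRecycled: number;
}

interface PoolSummary {
  utilization: number; // share of instances currently in use, 0 to 1
  saturated: boolean; // no free instance while callers are waiting
}

function summarizePool(stats: PoolStats): PoolSummary {
  return {
    utilization: stats.total > 0 ? stats.inUse / stats.total : 0,
    saturated: stats.available === 0 && stats.queueSize > 0,
  };
}

console.log(
  summarizePool({ total: 5, available: 3, inUse: 2, queueSize: 0, totalAcquired: 150, totalRecycled: 3 }),
);
```

A pool that stays saturated for long stretches is a hint that its size is too small for the request rate.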

--------------------------------

### Reader Client Initialization with Structured Proxy Config

Source: https://github.com/vakra-dev/reader/blob/main/docs/guides/proxy-configuration.md

Initialize the Reader client and configure a proxy using a structured object with type, host, port, and credentials.

```APIDOC
## POST /vakra-dev/reader

### Description
Initializes the Reader client with proxy configuration using a structured object.

### Method
POST

### Endpoint
/vakra-dev/reader

### Parameters
#### Request Body
- **urls** (array<string>) - Required - List of URLs to scrape.
- **proxy** (object) - Optional - Proxy configuration.
  - **type** (string) - Required - Proxy type (e.g., "residential", "datacenter").
  - **host** (string) - Required - Proxy server hostname.
  - **port** (number) - Required - Proxy server port.
  - **username** (string) - Optional - Authentication username.
  - **password** (string) - Optional - Authentication password.
  - **country** (string) - Optional - Country code for geo-targeting (e.g., "us").

### Request Example
```json
{
  "urls": ["https://example.com"],
  "proxy": {
    "type": "residential",
    "host": "proxy.example.com",
    "port": 8080,
    "username": "username",
    "password": "password",
    "country": "us"
  }
}
```

### Response
#### Success Response (200)
- **data** (array<object>) - Array of scraped data.
  - **metadata** (object) - Metadata about the scrape.
    - **baseUrl** (string) - The base URL that was scraped.
    - **proxy** (object) - Information about the proxy used.
      - **host** (string) - Proxy host.
      - **port** (number) - Proxy port.
      - **country** (string) - Optional - Country code if geo-targeting was used.

#### Response Example
```json
{
  "data": [
    {
      "metadata": {
        "baseUrl": "https://example.com",
        "proxy": {
          "host": "proxy.example.com",
          "port": 8080,
          "country": "us"
        }
      }
    }
  ]
}
```
```

--------------------------------

### Horizontal Scaling with Multiple Instances

Source: https://github.com/vakra-dev/reader/blob/main/docs/deployment/production-server.md

Scales the application horizontally by starting several instances on different ports as background processes. A load balancer is required to distribute traffic across the instances.

```bash
# Start multiple instances
PORT=3001 npx tsx server.ts &
PORT=3002 npx tsx server.ts &
PORT=3003 npx tsx server.ts &
```
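
The simplest distribution strategy a load balancer can apply to these instances is round-robin. The sketch below illustrates only the selection logic, using the three ports from the script above; `RoundRobin` is a hypothetical class, and a real deployment would normally delegate this to a dedicated load balancer or reverse proxy.

```typescript
// Cycles through a fixed list of upstream targets in order.
class RoundRobin {
  private readonly targets: string[];
  private index = 0;

  constructor(targets: string[]) {
    if (targets.length === 0) throw new Error("at least one target is required");
    this.targets = [...targets]; // copy so later caller mutations have no effect
  }

  // Returns the next target and advances, wrapping around at the end.
  next(): string {
    const target = this.targets[this.index];
    this.index = (this.index + 1) % this.targets.length;
    return target;
  }
}

// Hostname is an assumption; ports match the instances started above.
const upstreams = new RoundRobin([
  "http://localhost:3001",
  "http://localhost:3002",
  "http://localhost:3003",
]);

for (let i = 0; i < 4; i++) {
  console.log(upstreams.next());
}
```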

--------------------------------

### Initialize and Use Shared Hero Core (TypeScript)

Source: https://github.com/vakra-dev/reader/blob/main/README.md

Demonstrates how to initialize a shared Hero Core instance for production environments to reuse browser instances across requests. It includes setting up connections and using the scrape function.

```typescript
import HeroCore from "@ulixee/hero-core";
import { TransportBridge } from "@ulixee/net";
import { ConnectionToHeroCore } from "@ulixee/hero";
import { scrape } from "@vakra-dev/reader";

// Initialize once at startup
const heroCore = new HeroCore();
await heroCore.start();

// Create connection for each request
function createConnection() {
  const bridge = new TransportBridge();
  heroCore.addConnection(bridge.transportToClient);
  return new ConnectionToHeroCore(bridge.transportToCore);
}

// Use in requests
const result = await scrape({
  urls: ["https://example.com"],
  connectionToCore: createConnection(),
});

// Shutdown on exit
await heroCore.close();
```