### Install Dependencies and Start Services

Source: https://docs.firecrawl.dev/contributing/guide

Commands to install project dependencies and launch the API server.

```bash
cd apps/api
pnpm install
```

```bash
redis-server
```

```bash
pnpm start
```

--------------------------------

### Run the Brand Style Guide Generator

Source: https://docs.firecrawl.dev/developer-guides/cookbooks/brand-style-guide-generator-cookbook

Execute this command in your terminal to start the brand style guide generation process. Ensure you have the necessary dependencies installed.

```bash
npm start
```

--------------------------------

### Install Firecrawl Go SDK

Source: https://docs.firecrawl.dev/sdks/go

Command to install the Firecrawl Go SDK via go get.

```bash
go get github.com/firecrawl/firecrawl/apps/go-sdk
```

--------------------------------

### Initialize and Use Firecrawl Client

Source: https://docs.firecrawl.dev/sdks/go

Basic setup for the Firecrawl client and examples of scraping a single page and crawling a website.

```go
package main

import (
	"context"
	"fmt"
	"log"

	firecrawl "github.com/firecrawl/firecrawl/apps/go-sdk"
	"github.com/firecrawl/firecrawl/apps/go-sdk/option"
)

func main() {
	// Create a client (reads FIRECRAWL_API_KEY from environment)
	client, err := firecrawl.NewClient()
	if err != nil {
		log.Fatal(err)
	}

	// Or provide the API key directly
	client, err = firecrawl.NewClient(
		option.WithAPIKey("fc-your-api-key"),
	)
	if err != nil {
		log.Fatal(err)
	}

	ctx := context.Background()

	// Scrape a single page
	doc, err := client.Scrape(ctx, "https://firecrawl.dev", &firecrawl.ScrapeOptions{
		Formats: []string{"markdown"},
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(doc.Markdown)

	// Crawl a website
	job, err := client.Crawl(ctx, "https://firecrawl.dev", &firecrawl.CrawlOptions{
		Limit: firecrawl.Int(5),
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Crawled pages: %d\n", len(job.Data))
}
```

--------------------------------

### Initialize and Use Firecrawl Client

Source: https://docs.firecrawl.dev/sdks/ruby

Basic setup and usage example for scraping and crawling.

```ruby
require "firecrawl"

client = Firecrawl::Client.from_env

doc = client.scrape(
  "https://firecrawl.dev",
  Firecrawl::Models::ScrapeOptions.new(formats: ["markdown"])
)

job = client.crawl(
  "https://firecrawl.dev",
  Firecrawl::Models::CrawlOptions.new(limit: 5)
)

puts doc.markdown
puts "Crawled pages: #{job.data&.size || 0}"
```

--------------------------------

### Setup Cloudflare Worker Project

Source: https://docs.firecrawl.dev/quickstarts/cloudflare-workers

Initialize a new Cloudflare Worker project and install the Firecrawl SDK.

```bash
npm create cloudflare@latest my-scraper
cd my-scraper
npm install firecrawl
```

--------------------------------

### Setup Firecrawl CLI and Build Skills

Source: https://docs.firecrawl.dev/llms-full.txt

Installs only the CLI and build skills for Firecrawl. Use this if you do not need workflow skills or want a more targeted installation.

```bash
firecrawl setup skills
```

--------------------------------

### Manual Firecrawl CLI Setup

Source: https://docs.firecrawl.dev/llms-full.txt

Installs the Firecrawl CLI globally, initializes skills, and logs in. It also shows how to set the API key directly via an environment variable if skipping the browser login.

```bash
npm install -g firecrawl-cli
firecrawl init skills
firecrawl login --browser

# Or, skip the browser and provide your API key directly:
export FIRECRAWL_API_KEY="fc-YOUR-API-KEY"
```

--------------------------------

### Quick Start: Scrape, Interact, and Stop

Source: https://docs.firecrawl.dev/features/interact

This example demonstrates the complete workflow of scraping a page, interacting with it using AI prompts, and then stopping the interaction session. It covers Python, Node.js, cURL, and CLI usage.

```Python
from firecrawl import Firecrawl

app = Firecrawl(
    # No API key needed to get started — add one for higher rate limits:
    # api_key="fc-YOUR-API-KEY",
)

# 1. Scrape Amazon's homepage
result = app.scrape("https://www.amazon.com", formats=["markdown"])
scrape_id = result.metadata.scrape_id

# 2. Interact — search for a product and get its price
app.interact(scrape_id, prompt="Search for iPhone 16 Pro Max")
response = app.interact(scrape_id, prompt="Click on the first result and tell me the price")
print(response.output)

# 3. Stop the session
app.stop_interaction(scrape_id)
```

```Node.js
import { Firecrawl } from 'firecrawl';

const app = new Firecrawl({
  // No API key needed to get started — add one for higher rate limits:
  // apiKey: 'fc-YOUR-API-KEY',
});

// 1. Scrape Amazon's homepage
const result = await app.scrape('https://www.amazon.com', { formats: ['markdown'] });
const scrapeId = result.metadata?.scrapeId;

// 2. Interact — search for a product and get its price
await app.interact(scrapeId, { prompt: 'Search for iPhone 16 Pro Max' });
const response = await app.interact(scrapeId, { prompt: 'Click on the first result and tell me the price' });
console.log(response.output);

// 3. Stop the session
await app.stopInteraction(scrapeId);
```

```cURL
# 1. Scrape Amazon's homepage
# No API key needed to get started — add -H "Authorization: Bearer $FIRECRAWL_API_KEY" for higher rate limits:
RESPONSE=$(curl -s -X POST "https://api.firecrawl.dev/v2/scrape" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://www.amazon.com", "formats": ["markdown"]}')

SCRAPE_ID=$(echo $RESPONSE | jq -r '.data.metadata.scrapeId')

# 2. Interact — search for a product and get its price
curl -s -X POST "https://api.firecrawl.dev/v2/scrape/$SCRAPE_ID/interact" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Search for iPhone 16 Pro Max"}'

curl -s -X POST "https://api.firecrawl.dev/v2/scrape/$SCRAPE_ID/interact" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Click on the first result and tell me the price"}'

# 3. Stop the session
curl -s -X DELETE "https://api.firecrawl.dev/v2/scrape/$SCRAPE_ID/interact"
```

```CLI
# 1. Scrape Amazon's homepage (scrape ID is saved automatically)
firecrawl scrape https://www.amazon.com

# 2. Interact — search for a product and get its price
firecrawl interact "Search for iPhone 16 Pro Max"
firecrawl interact "Click on the first result and tell me the price"

# 3. Stop the session
firecrawl interact stop
```

--------------------------------

### AWS Lambda Setup and Deployment

Source: https://docs.firecrawl.dev/quickstarts/aws-lambda

Commands to initialize a Node.js project, install Firecrawl, package the function with its dependencies, and deploy it to AWS Lambda using the AWS CLI. Includes setting environment variables and function configuration.

```bash
mkdir firecrawl-lambda && cd firecrawl-lambda
npm init -y
npm install firecrawl
```

```bash
zip -r function.zip index.mjs node_modules/

aws lambda create-function \
  --function-name firecrawl-scraper \
  --runtime nodejs20.x \
  --handler index.handler \
  --zip-file fileb://function.zip \
  --role arn:aws:iam::YOUR_ACCOUNT:role/lambda-role \
  --environment Variables="{FIRECRAWL_API_KEY=fc-YOUR-API-KEY}" \
  --timeout 60
```

--------------------------------

### Initialize Firecrawl CLI with All Skills and Browser Authentication

Source: https://docs.firecrawl.dev/ai-onboarding

Installs all Firecrawl skill segments (CLI, build, workflows) and opens the browser for authentication. Use this command for a full setup.

```bash
npx -y firecrawl-cli@latest init --all --browser
```

--------------------------------

### Scrape Documentation Example

Source: https://docs.firecrawl.dev/quickstarts/cursor

Use this command in Cursor Chat to scrape documentation for a specific topic and get an explanation.

```text
Scrape the React hooks documentation and explain useEffect
```

--------------------------------

### Run the Flask Development Server

Source: https://docs.firecrawl.dev/llms-full.txt

Start the Flask development server to make your API endpoints accessible. This command assumes you have Flask installed and configured.

```bash
flask run
```

--------------------------------

### Setup Cloudflare Worker Project

Source: https://docs.firecrawl.dev/llms-full.txt

Initialize a new Cloudflare Worker project and install the Firecrawl SDK. Ensure your API key is added as a secret.

```bash
npm create cloudflare@latest my-scraper
cd my-scraper
npm install firecrawl
```

```bash
wrangler secret put FIRECRAWL_API_KEY
```

--------------------------------

### Batch Process Multiple Websites

Source: https://docs.firecrawl.dev/developer-guides/cookbooks/brand-style-guide-generator-cookbook

Implement batch processing to generate brand style guides for multiple websites sequentially.
This example iterates through a list of URLs, scrapes each one for branding information, and includes a placeholder for PDF generation.

```typescript
const websites = [
  "https://stripe.com",
  "https://linear.app",
  "https://vercel.com"
];

for (const site of websites) {
  const { branding } = await fc.scrape(site, { formats: ["branding"] }) as any;
  // Generate PDF for each site...
}
```

--------------------------------

### Install Dependencies

Source: https://docs.firecrawl.dev/quickstarts/fastapi

Install the required packages for the FastAPI application.

```bash
pip install fastapi uvicorn firecrawl-py
```

--------------------------------

### Quick Start with Firecrawl Tools

Source: https://docs.firecrawl.dev/developer-guides/llm-sdks-and-frameworks/vercel-ai-sdk

This example demonstrates how to use the bundled FirecrawlTools() with the Vercel AI SDK to perform a sequence of actions: interact with a website, search for related information, scrape relevant pages, and summarize the findings. It uses the `generateText` function and specifies a stop condition.

```typescript
import { generateText, stepCountIs } from 'ai';
import { FirecrawlTools } from 'firecrawl-aisdk';

const { text } = await generateText({
  model: 'anthropic/claude-sonnet-4-5',
  tools: FirecrawlTools(),
  stopWhen: stepCountIs(30),
  prompt: `
    1. Use interact on Hacker News to identify the top story
    2. Search for other perspectives on the same topic
    3. Scrape the most relevant pages you found
    4. Summarize everything you found
  `,
});
```

--------------------------------

### Batch Scrape with Webhook Configuration

Source: https://docs.firecrawl.dev/features/batch-scrape

This example demonstrates how to initiate a batch scrape operation and configure a webhook to receive notifications. The webhook can be set up to receive events such as when the batch starts, when a single page is scraped, or when the entire batch is completed.

```APIDOC
## POST /v2/batch/scrape

### Description
Initiates a batch scraping job and configures a webhook to receive real-time notifications about the job's progress and results.

### Method
POST

### Endpoint
https://api.firecrawl.dev/v2/batch/scrape

### Request Body
- **urls** (array[string]) - Required - A list of URLs to scrape.
- **webhook** (object) - Optional - Configuration for the webhook.
  - **url** (string) - Required - The URL where webhook notifications will be sent.
  - **metadata** (object) - Optional - Custom key-value pairs to be included in the webhook payload.
  - **events** (array[string]) - Optional - A list of events to trigger webhook notifications. Possible values: `started`, `page`, `completed`, `failed`.

### Request Example
```json
{
  "urls": [
    "https://example.com/page1",
    "https://example.com/page2",
    "https://example.com/page3"
  ],
  "webhook": {
    "url": "https://your-domain.com/webhook",
    "metadata": {
      "any_key": "any_value"
    },
    "events": ["started", "page", "completed"]
  }
}
```

### Response
#### Success Response (200)
- **id** (string) - The ID of the batch scrape job.
- **status** (string) - The status of the batch scrape job.

#### Response Example
```json
{
  "id": "batch-job-id",
  "status": "processing"
}
```

### Event Types
- `batch_scrape.started`: The batch scrape job has begun.
- `batch_scrape.page`: A single URL was successfully scraped.
- `batch_scrape.completed`: All URLs have been processed.
- `batch_scrape.failed`: The batch scrape encountered an error.

### Webhook Payload Structure
```json
{
  "success": true,
  "type": "batch_scrape.page",
  "id": "batch-job-id",
  "data": [...],
  "metadata": {},
  "error": null
}
```

### Verifying Webhook Signatures
To ensure webhook authenticity, verify the `X-Firecrawl-Signature` header using HMAC-SHA256 with your webhook secret. Compare the computed signature with the provided header using a timing-safe function. Never process a webhook without verifying its signature first.
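
The verification step described above can be sketched with the Python standard library. This is a minimal illustration, not the official implementation: the helper name `verify_signature` is hypothetical, and the code assumes the header carries a hex digest that may be prefixed with `sha256=`. Check the exact header format against a real delivery before relying on it.

```python
import hashlib
import hmac


def verify_signature(raw_body: bytes, signature_header: str, secret: str) -> bool:
    """Return True when the header matches the HMAC-SHA256 of the raw body."""
    # Assumption: the header value is a hex digest, optionally prefixed with "sha256=".
    received = signature_header.strip()
    if received.startswith("sha256="):
        received = received[len("sha256="):]

    # Compute the expected digest over the raw request bytes, keyed by the webhook secret.
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()

    # compare_digest takes time independent of where the two values first differ.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

The digest is computed over the raw request bytes, so the check should probably run before the body is parsed as JSON; re-serializing a parsed payload may not reproduce the same bytes.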
```

--------------------------------

### Install Firecrawl PHP SDK

Source: https://docs.firecrawl.dev/sdks/php

Install the official PHP SDK using Composer.

```bash
composer require firecrawl/firecrawl-sdk
```

--------------------------------

### Install Firecrawl Ruby SDK

Source: https://docs.firecrawl.dev/sdks/ruby

Installation methods using Bundler or direct gem installation.

```ruby
gem "firecrawl-sdk", "~> 1.0"
```

```bash
bundle install
```

```bash
gem install firecrawl-sdk
```

--------------------------------

### Installation

Source: https://docs.firecrawl.dev/sdks/node

Install the Firecrawl Node SDK using npm and import the Firecrawl class.

```APIDOC
## Installation

Install the SDK with npm:

```js Node
// npm install firecrawl
import { Firecrawl } from 'firecrawl';

const firecrawl = new Firecrawl({
  // No API key needed to get started — add one for higher rate limits:
  // apiKey: "fc-YOUR-API-KEY",
});
```
```

--------------------------------

### Get Agent Job Status

Source: https://docs.firecrawl.dev/features/agent

This example shows how to retrieve the status of an agent job using its Job ID. This is useful for polling when jobs are started asynchronously.

```APIDOC
## GET /v2/agent/{jobId}

### Description
Retrieves the current status and results (if available) of an agent job using its unique Job ID.

### Method
GET

### Endpoint
/v2/agent/{jobId}

### Parameters
#### Path Parameters
- **jobId** (string) - Required - The unique identifier of the agent job.

### Response
#### Success Response (200)
- **success** (boolean) - Indicates if the operation was successful.
- **status** (string) - The current state of the job (e.g., `processing`, `completed`, `failed`, `cancelled`).
- **data** (object) - The extracted data, present only if the job is `completed`.
- **expiresAt** (string) - The timestamp when the job results will expire.
- **creditsUsed** (integer) - The number of credits consumed by the job.
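
Because the job runs asynchronously, a client would typically call this endpoint repeatedly until `status` leaves `processing`. The following is a minimal sketch of such a loop. `wait_for_job` and `fetch_status` are hypothetical names, not part of any Firecrawl SDK: `fetch_status` stands in for whatever function performs the GET request and returns the parsed JSON body, and the interval and timeout defaults are illustrative.

```python
import time
from typing import Callable

# Terminal states, taken from the status values listed above.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(
    fetch_status: Callable[[], dict],
    poll_interval: float = 2.0,
    timeout: float = 120.0,
) -> dict:
    """Call fetch_status until the job reaches a terminal state or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        response = fetch_status()
        if response.get("status") in TERMINAL_STATUSES:
            return response
        if time.monotonic() >= deadline:
            raise TimeoutError("agent job did not finish before the timeout")
        # Sleep between requests so the endpoint is not polled in a tight loop.
        time.sleep(poll_interval)
```

Passing the request logic in as a callable keeps the loop independent of the HTTP client, which also makes it easy to exercise with canned responses.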
#### Response Example (Pending)
```json
{
  "success": true,
  "status": "processing",
  "expiresAt": "2024-12-15T00:00:00.000Z"
}
```

#### Response Example (Completed)
```json
{
  "success": true,
  "status": "completed",
  "data": {
    "founders": [
      { "name": "Eric Ciarla", "role": "Co-founder" },
      { "name": "Nicolas Camara", "role": "Co-founder" },
      { "name": "Caleb Peffer", "role": "Co-founder" }
    ]
  },
  "expiresAt": "2024-12-15T00:00:00.000Z",
  "creditsUsed": 15
}
```
```

--------------------------------

### Start Development Server

Source: https://docs.firecrawl.dev/developer-guides/cookbooks/ai-research-assistant-cookbook

Run this command to start the development server and test the basic chat functionality in your browser.

```bash
npm run dev
```

--------------------------------

### Run FastAPI Server

Source: https://docs.firecrawl.dev/quickstarts/fastapi

Start the development server using uvicorn.

```bash
uvicorn main:app --reload
```

--------------------------------

### Initialize Node.js Project

Source: https://docs.firecrawl.dev/developer-guides/cookbooks/brand-style-guide-generator-cookbook

Create a new project directory and initialize it with npm. This sets up the basic structure for your Node.js application.

```bash
mkdir brand-style-guide-generator && cd brand-style-guide-generator
npm init -y
```

--------------------------------

### Start Agent Job Asynchronously

Source: https://docs.firecrawl.dev/features/agent

This example demonstrates how to start an agent job asynchronously using `start_agent` (Python) or `startAgent` (Node.js). This returns a Job ID that can be used to poll for status.

```APIDOC
## POST /v2/agent (Start Job)

### Description
Starts an agent job asynchronously, returning a Job ID immediately. This allows for background processing, with status checks performed separately.

### Method
POST

### Endpoint
/v2/agent

### Parameters
#### Request Body
- **prompt** (string) - Required - The instruction for the agent to perform.
### Request Example
```json
{
  "prompt": "Find the founders of Firecrawl"
}
```

### Response
#### Success Response (200)
- **id** (string) - The unique identifier for the agent job.

#### Response Example
```json
{
  "id": "job-12345"
}
```
```

--------------------------------

### Check Python Extraction Job Status

Source: https://docs.firecrawl.dev/features/extract

Use this method after starting an extraction job to get its current status and results. The SDK waits for completion by default if not using start methods.

```python
from firecrawl import Firecrawl

firecrawl = Firecrawl(
    api_key="fc-YOUR_API_KEY"
)

# Start an extraction job first
extract_job = firecrawl.start_extract([
    'https://docs.firecrawl.dev/*',
    'https://firecrawl.dev/'
], prompt="Extract the company mission and features from these pages.")

# Get the status of the extraction job
job_status = firecrawl.get_extract_status(extract_job.id)

print(job_status)
# Example output:
# id=None
# status='completed'
# expires_at=datetime.datetime(...)
# success=True
# data=[{ ... }]
# error=None
# warning=None
# sources=None
```

--------------------------------

### Install Firecrawl and Google Generative AI SDK

Source: https://docs.firecrawl.dev/developer-guides/llm-sdks-and-frameworks/gemini

Install the necessary npm packages for Firecrawl and Google's Generative AI. This is the initial setup step for using Gemini with Firecrawl.

```bash
npm install firecrawl @google/genai
```

--------------------------------

### Troubleshoot Supabase Configuration Errors

Source: https://docs.firecrawl.dev/contributing/self-host

Example log output when the Supabase client is not configured.

```bash
[YYYY-MM-DDTHH:MM:SS.SSSz]ERROR - Attempted to access Supabase client when it's not configured.
[YYYY-MM-DDTHH:MM:SS.SSSz]ERROR - Error inserting scrape event: Error: Supabase client is not configured.
```

--------------------------------

### Configure Environment Variables

Source: https://docs.firecrawl.dev/contributing/guide

Commands to initialize the .env file and the minimal configuration required for local development.

```bash
cp apps/api/.env.example apps/api/.env
```

```bash
# ===== Required =====
NUM_WORKERS_PER_QUEUE=8
PORT=3002
HOST=0.0.0.0
REDIS_URL=redis://localhost:6379
REDIS_RATE_LIMIT_URL=redis://localhost:6379

## To turn on DB authentication, you need to set up supabase.
USE_DB_AUTHENTICATION=false

## PostgreSQL connection for queuing — change if credentials, host, or DB differ
NUQ_DATABASE_URL=postgres://postgres:postgres@localhost:5433/postgres

# ===== Optional =====
# SUPABASE_ANON_TOKEN=
# SUPABASE_URL=
# SUPABASE_SERVICE_TOKEN=
# TEST_API_KEY= # Set if you've configured authentication and want to test with a real API key
# OPENAI_API_KEY= # Required for LLM-dependent features (image alt generation, etc.)
# BULL_AUTH_KEY=@
# PLAYWRIGHT_MICROSERVICE_URL= # Set to run a Playwright fallback
# LLAMAPARSE_API_KEY= # Set to parse PDFs with LlamaParse
# SLACK_WEBHOOK_URL= # Set to send Slack server health status messages
# POSTHOG_API_KEY= # Set to send PostHog events like job logs
# POSTHOG_HOST= # Set to send PostHog events like job logs
```

--------------------------------

### Scrape with Markdown, Branding, and Screenshot Formats

Source: https://docs.firecrawl.dev/features/scrape

Combine markdown, branding, and screenshot formats to get comprehensive page data. No API key is needed to get started, but one is recommended for higher rate limits.

```python
from firecrawl import Firecrawl

firecrawl = Firecrawl(
    # No API key needed to get started — add one for higher rate limits:
    # api_key='fc-YOUR_API_KEY',
)

result = firecrawl.scrape(
    url='https://firecrawl.dev',
    formats=['markdown', 'branding', 'screenshot']
)

# Attribute access, matching the other Python scrape examples in this document
print(result.markdown)
print(result.branding)
print(result.screenshot)
```

```javascript
import { Firecrawl } from 'firecrawl';

const firecrawl = new Firecrawl({
  // No API key needed to get started — add one for higher rate limits:
  // apiKey: "fc-YOUR-API-KEY",
});

const result = await firecrawl.scrape('https://firecrawl.dev', {
  formats: ['markdown', 'branding', 'screenshot']
});

console.log(result.markdown);
console.log(result.branding);
console.log(result.screenshot);
```

```bash
# No API key needed to get started — add -H "Authorization: Bearer $FIRECRAWL_API_KEY" for higher rate limits:
curl -s -X POST "https://api.firecrawl.dev/v2/scrape" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://firecrawl.dev",
    "formats": ["markdown", "branding", "screenshot"]
  }'
```

--------------------------------

### Installation

Source: https://docs.firecrawl.dev/sdks/python

Install the Firecrawl Python SDK using pip and initialize the Firecrawl client. An API key is optional for basic usage but recommended for higher rate limits.

```APIDOC
## Installation

To install the Firecrawl Python SDK, you can use pip:

```python
# pip install firecrawl-py
from firecrawl import Firecrawl

firecrawl = Firecrawl(
    # No API key needed to get started — add one for higher rate limits:
    # api_key="fc-YOUR-API-KEY",
)
```
```

--------------------------------

### Install and Initialize Firecrawl Python SDK

Source: https://docs.firecrawl.dev/sdks/python

Install the SDK using pip and initialize the Firecrawl client. An API key can be provided for higher rate limits, but is not required for basic usage.

```python
# pip install firecrawl-py
from firecrawl import Firecrawl

firecrawl = Firecrawl(
    # No API key needed to get started — add one for higher rate limits:
    # api_key="fc-YOUR-API-KEY",
)
```

--------------------------------

### Install and Initialize Firecrawl Node SDK

Source: https://docs.firecrawl.dev/sdks/node

Install the SDK using npm and initialize the Firecrawl client. An API key can be provided during initialization or set as an environment variable for higher rate limits.

```javascript
// npm install firecrawl
import { Firecrawl } from 'firecrawl';

const firecrawl = new Firecrawl({
  // No API key needed to get started — add one for higher rate limits:
  // apiKey: "fc-YOUR-API-KEY",
});
```

--------------------------------

### Setup Firecrawl Workflow Skills

Source: https://docs.firecrawl.dev/llms-full.txt

Installs only the workflow skills for Firecrawl. Use this if you specifically need workflow capabilities and have already set up other skills.

```bash
firecrawl setup workflows
```

--------------------------------

### Scrape a Single Page with Node.js

Source: https://docs.firecrawl.dev/advanced-scraping-guide

Use the /scrape endpoint to get clean markdown content from a single URL. Ensure you have installed the `firecrawl` package.

```JavaScript
// npm install firecrawl
// The import path matches the installed package name, as in the other Node examples
import { Firecrawl } from 'firecrawl';

const firecrawl = new Firecrawl({ apiKey: 'fc-YOUR-API-KEY' });

const doc = await firecrawl.scrape('https://firecrawl.dev');

console.log(doc.markdown);
```

--------------------------------

### Installation and Configuration

Source: https://docs.firecrawl.dev/sdks/elixir

Instructions on how to add the Firecrawl Elixir SDK to your project and configure your API key.

```APIDOC
## Installation

Add `firecrawl` to your list of dependencies in `mix.exs` and configure your API key:

```elixir
# Add to mix.exs
{:firecrawl, "~> 1.0"}

# Then configure your API key in config.exs
config :firecrawl, api_key: "fc-YOUR-API-KEY"
```

Or pass the API key per-request:

```elixir
Firecrawl.scrape_and_extract_from_url([url: "https://example.com"], api_key: "fc-YOUR-API-KEY")
```
```

--------------------------------

### Scrape a Single Page with Python

Source: https://docs.firecrawl.dev/advanced-scraping-guide

Use the /scrape endpoint to get clean markdown content from a single URL. Ensure you have installed the firecrawl-py library.

```Python
# pip install firecrawl-py
from firecrawl import Firecrawl

firecrawl = Firecrawl(api_key="fc-YOUR-API-KEY")

doc = firecrawl.scrape("https://firecrawl.dev")

print(doc.markdown)
```

--------------------------------

### Basic SDK Usage

Source: https://docs.firecrawl.dev/sdks/php

Initialize the client and perform basic scrape and crawl operations.

```php
use Firecrawl\Client\FirecrawlClient;
use Firecrawl\Models\CrawlOptions;
use Firecrawl\Models\ScrapeOptions;

$client = FirecrawlClient::fromEnv();

$doc = $client->scrape(
    'https://firecrawl.dev',
    ScrapeOptions::with(formats: ['markdown'])
);

$crawl = $client->crawl(
    'https://firecrawl.dev',
    CrawlOptions::with(limit: 5)
);

echo $doc->getMarkdown();
echo 'Crawled pages: ' . count($crawl->getData());
```

--------------------------------

### Initialize Node.js Project

Source: https://docs.firecrawl.dev/llms-full.txt

Create a new project directory and initialize it with npm. Update package.json to enable ES modules for TypeScript.

```bash
mkdir brand-style-guide-generator && cd brand-style-guide-generator
npm init -y
```

```json
{
  "name": "brand-style-guide-generator",
  "version": "1.0.0",
  "type": "module",
  "scripts": {
    "start": "npx tsx index.ts"
  }
}
```

--------------------------------

### Check Node.js Extraction Job Status

Source: https://docs.firecrawl.dev/features/extract

Initiate an extraction and then poll for its status using the Node.js SDK. This example demonstrates starting a job and checking its completion.

```javascript
import { Firecrawl } from 'firecrawl';

const firecrawl = new Firecrawl({ apiKey: "fc-YOUR-API-KEY" });

const started = await firecrawl.startExtract({
  urls: ['https://docs.firecrawl.dev'],
  prompt: 'Extract title',
  schema: {
    type: 'object',
    properties: { title: { type: 'string' } },
    required: ['title']
  },
});

if (started.id) {
  const done = await firecrawl.getExtractStatus(started.id);
  console.log(done.status, done.data);
}
```

--------------------------------

### Scrape URL with Firecrawl MCP

Source: https://docs.firecrawl.dev/quickstarts/antigravity

Example prompt to scrape a given URL and extract all linked guides using Firecrawl's scrape tool via Antigravity.

```text
Scrape https://docs.firecrawl.dev/ai-onboarding and list every linked guide.
```

--------------------------------

### Initiate Crawl with Webhook

Source: https://docs.firecrawl.dev/features/crawl

This example demonstrates how to initiate a crawl and configure a webhook to receive notifications for different crawl events. The webhook will send data to your specified URL.

```APIDOC
## POST https://api.firecrawl.dev/v2/crawl

### Description
Initiates a web crawl and optionally configures a webhook to receive real-time notifications about the crawl's progress, including started, page, and completed events.

### Method
POST

### Endpoint
https://api.firecrawl.dev/v2/crawl

### Parameters
#### Request Body
- **url** (string) - Required - The URL to start crawling from.
- **limit** (integer) - Optional - The maximum number of pages to crawl.
- **webhook** (object) - Optional - Configuration for receiving webhook notifications.
  - **url** (string) - Required - The URL where webhook notifications will be sent.
  - **metadata** (object) - Optional - Custom metadata to be included in webhook payloads.
  - **events** (array of strings) - Optional - A list of events to receive notifications for (e.g., `started`, `page`, `completed`, `failed`).

### Request Example
```json
{
  "url": "https://docs.firecrawl.dev",
  "limit": 100,
  "webhook": {
    "url": "https://your-domain.com/webhook",
    "metadata": {
      "any_key": "any_value"
    },
    "events": ["started", "page", "completed"]
  }
}
```

### Response
#### Success Response (200)
- **success** (boolean) - Indicates if the request was successful.
- **type** (string) - The type of event payload (e.g., `crawl.page`).
- **id** (string) - The unique identifier for the crawl job.
- **data** (array) - Page data for 'page' events.
- **metadata** (object) - Your custom metadata.
- **error** (null) - Error object, null if no error.

#### Response Example
```json
{
  "success": true,
  "type": "crawl.page",
  "id": "crawl-job-id",
  "data": [],
  "metadata": {},
  "error": null
}
```

### Verifying Webhook Signatures
Always verify the `X-Firecrawl-Signature` header using HMAC-SHA256 and your webhook secret to ensure authenticity.
```

--------------------------------

### Extracting Data with a Prompt

Source: https://docs.firecrawl.dev/features/extract

This example demonstrates how to use the `/extract` endpoint to get structured data by providing a prompt and a schema. It's useful for research or when specific URLs are not available.

```APIDOC
## Extracting without URLs

The `/extract` endpoint now supports extracting structured data using a prompt without needing specific URLs. This is useful for research or when exact URLs are unknown. Currently in Alpha.

```python Python
from firecrawl import Firecrawl
from pydantic import BaseModel

firecrawl = Firecrawl(api_key="fc-YOUR-API-KEY")

class ExtractSchema(BaseModel):
    company_mission: str

# Define the prompt for extraction
prompt = "Extract the company mission from Firecrawl's website."

# Perform the extraction
scrape_result = firecrawl.extract(prompt=prompt, schema=ExtractSchema)

print(scrape_result)
```

```js Node
import { z } from "zod";

// Define schema to extract contents into
const schema = z.object({
  company_mission: z.string(),
});

const scrapeResult = await firecrawl.extract([], {
  prompt: "Extract the company mission from Firecrawl's website.",
  schema: schema
});

console.log(scrapeResult);
```

```bash cURL
curl -X POST https://api.firecrawl.dev/v2/extract \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -d '{
    "urls": [],
    "prompt": "Extract the company mission from the Firecrawl website.",
    "schema": {
      "type": "object",
      "properties": {
        "company_mission": {
          "type": "string"
        }
      },
      "required": ["company_mission"]
    }
  }'
```
```

--------------------------------

### Setup Remote MCP Server with Firecrawl

Source: https://docs.firecrawl.dev/developer-guides/llm-sdks-and-frameworks/google-adk

Configure an ADK agent to use a remote Firecrawl MCP server. Ensure you have a Firecrawl API key and have installed the Google ADK.

```python
from google.adk.agents.llm_agent import Agent
from google.adk.tools.mcp_tool.mcp_session_manager import StreamableHTTPServerParams
from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset

FIRECRAWL_API_KEY = "YOUR-API-KEY"

root_agent = Agent(
    model="gemini-2.5-pro",
    name="firecrawl_agent",
    description='A helpful assistant for scraping websites with Firecrawl',
    instruction='Help the user search for website content',
    tools=[
        MCPToolset(
            connection_params=StreamableHTTPServerParams(
                url=f"https://mcp.firecrawl.dev/{FIRECRAWL_API_KEY}/v2/mcp",
            ),
        )
    ],
)
```

--------------------------------

### Basic Usage Example

Source: https://docs.firecrawl.dev/sdks/dotnet

A quick example demonstrating how to initialize the Firecrawl client and perform basic scrape and crawl operations.

```APIDOC
## Basic Usage Example

This example shows how to initialize the `FirecrawlClient` and perform a scrape and crawl operation.

```csharp
using Firecrawl;
using Firecrawl.Models;

var client = new FirecrawlClient("fc-your-api-key");

// Scrape a single page
var doc = await client.ScrapeAsync("https://firecrawl.dev",
    new ScrapeOptions { Formats = new List<string> { "markdown" } });

// Crawl a website
var job = await client.CrawlAsync("https://firecrawl.dev",
    new CrawlOptions { Limit = 5 });

Console.WriteLine(doc.Markdown);
Console.WriteLine($"Crawled pages: {job.Data?.Count ?? 0}");
```
```

--------------------------------

### CLI Example

Source: https://docs.firecrawl.dev/features/interact

Illustrates the equivalent operations using the Firecrawl CLI, showing how to scrape with a profile, interact, stop sessions, and manage read-only profiles.

```APIDOC
## `firecrawl interact` command

### Description
Allows direct interaction with the browser session using natural language prompts. This command is used to guide the scraping process by instructing the AI to perform specific actions within the web page.

### Usage

`firecrawl interact ""`

## `firecrawl interact stop` command

### Description

Terminates the current interaction session. It's crucial to run this command to ensure that any changes made during the session (like form submissions or state modifications) are saved to the persistent profile, if configured to do so.
```

--------------------------------

### Run and Test Application

Source: https://docs.firecrawl.dev/quickstarts/aspnet-core

Commands to execute the application and perform tests.

```bash
dotnet run
```

--------------------------------

### Crawl Multiple Pages from GitHub

Source: https://docs.firecrawl.dev/developer-guides/common-sites/github

Crawl multiple pages starting from a given URL within a repository or documentation site. This example crawls a React wiki and scrapes results in markdown format.

```typescript
import { Firecrawl } from 'firecrawl';

const firecrawl = new Firecrawl({ apiKey: process.env.FIRECRAWL_API_KEY });

const crawlResult = await firecrawl.crawl('https://github.com/facebook/react/wiki', {
  limit: 10,
  scrapeOptions: {
    formats: ['markdown']
  }
});

console.log(crawlResult.data);
```

--------------------------------

### Get Documentation with Claude Code

Source: https://docs.firecrawl.dev/quickstarts/claude-code

This is an example prompt to find and scrape specific documentation using Claude Code. Claude will leverage Firecrawl to locate and extract information from the Stripe API docs.

```text
Find and scrape the Stripe API docs for payment intents
```

--------------------------------

### Python and Node.js Batch Scraping (Async/Sync)

Source: https://docs.firecrawl.dev/features/batch-scrape

Demonstrates how to initiate a batch scrape job asynchronously to get a job ID, or synchronously to wait for all results. Includes examples for both Python and Node.js SDKs.
```python
from firecrawl import Firecrawl

firecrawl = Firecrawl(api_key="fc-YOUR-API-KEY")

# Asynchronous: starts the batch and returns a job ID immediately
start = firecrawl.start_batch_scrape([
    "https://firecrawl.dev",
    "https://docs.firecrawl.dev",
], formats=["markdown"])

status = firecrawl.get_batch_scrape_status(start.id)

# Or synchronous: starts the batch and waits for completion
job = firecrawl.batch_scrape([
    "https://firecrawl.dev",
    "https://docs.firecrawl.dev",
], formats=["markdown"], poll_interval=2, wait_timeout=120)

print(job.status, job.completed, job.total)
```

```javascript
import { Firecrawl } from 'firecrawl';

const firecrawl = new Firecrawl({ apiKey: "fc-YOUR-API-KEY" });

// Asynchronous: starts the batch and returns a job ID immediately
const { id } = await firecrawl.startBatchScrape([
  'https://firecrawl.dev',
  'https://docs.firecrawl.dev'
], {
  options: { formats: ['markdown'] },
});

const status = await firecrawl.getBatchScrapeStatus(id);

// Or synchronous: starts the batch and waits for completion
const job = await firecrawl.batchScrape([
  'https://firecrawl.dev',
  'https://docs.firecrawl.dev'
], {
  options: { formats: ['markdown'] },
  pollInterval: 2,
  timeout: 120
});

console.log(job.status, job.completed, job.total);
```

--------------------------------

### Use Default FirecrawlTools with generateText

Source: https://docs.firecrawl.dev/developer-guides/llm-sdks-and-frameworks/vercel-ai-sdk

Initialize FirecrawlTools to get default tools and a system prompt for `generateText`. This setup allows the AI to answer questions with citations by leveraging web search and scraping.
```typescript
import { generateText, stepCountIs } from 'ai';
import { FirecrawlTools } from 'firecrawl-aisdk';

const tools = FirecrawlTools();

const { text } = await generateText({
  model: 'anthropic/claude-sonnet-4-5',
  system: `${tools.systemPrompt}\n\nAnswer with citations when possible.`,
  tools,
  stopWhen: stepCountIs(20),
  prompt: 'Find the current Firecrawl pricing page and explain the available plans.',
});
```

--------------------------------

### Test Firecrawl Laravel Integration with cURL

Source: https://docs.firecrawl.dev/quickstarts/laravel

Use cURL commands to test the search, scrape, and interact API endpoints after starting the Laravel development server. These examples demonstrate how to send JSON payloads.

```bash
# Start the development server (it keeps running, so use a separate terminal for the requests below)
php artisan serve

# Search the web
curl -X POST http://localhost:8000/api/search \
  -H "Content-Type: application/json" \
  -d '{"query": "firecrawl web scraping"}'

# Scrape a page
curl -X POST http://localhost:8000/api/scrape \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

# Interact with a page
curl -X POST http://localhost:8000/api/interact \
  -H "Content-Type: application/json" \
  -d '{"url": "https://www.amazon.com", "prompt": "Search for iPhone 16 Pro Max", "followUp": "Click on the first result and tell me the price"}'
```

--------------------------------

### Launch and Use Browser Session with Profile (CLI)

Source: https://docs.firecrawl.dev/features/browser

Launch a browser session with a profile using the CLI. Changes are saved by default. Use `--no-save-changes` for read-only mode or combine with commands like `open`.
```bash
# Launch with a profile (saves changes by default)
firecrawl browser launch-session --profile my-profile

# Launch with a profile in read-only mode
firecrawl browser launch-session --profile my-profile --no-save-changes

# Shorthand: launch with profile + execute in one step
firecrawl browser --profile my-profile "open https://example.com"
```

--------------------------------

### JSON Schema for Structured Output

Source: https://docs.firecrawl.dev/features/llm-extract

Define the desired structured output using JSON Schema. This example specifies properties for product name, installation type with an enum, and flow rate, including null handling.

```json
{
  "type": "object",
  "properties": {
    "product_name": {
      "type": ["string", "null"],
      "description": "Full descriptive product name as shown on the page. Return null if not found."
    },
    "installation_type": {
      "type": ["string", "null"],
      "description": "Installation type from the Specifications section. Return null if not found.",
      "enum": ["Deck-mount", "Wall-mount", "Countertop", "Drop-in", "Undermount"]
    },
    "flow_rate_gpm": {
      "type": ["string", "null"],
      "description": "Flow rate in GPM from the Specifications section. Return null if not found."
    }
  }
}
```

--------------------------------

### Scrape and Extract JSON Data with Node.js

Source: https://docs.firecrawl.dev/features/llm-extract

This Node.js example shows how to scrape a URL and extract data into a Zod schema. Install the 'firecrawl' and 'zod' packages. An API key can be added for increased rate limits.
```javascript
import { Firecrawl } from "firecrawl";
import { z } from "zod";

const app = new Firecrawl({
  // No API key needed to get started; add one for higher rate limits:
  // apiKey: "fc-YOUR_API_KEY",
});

// Define schema to extract contents into
const schema = z.object({
  company_mission: z.string(),
  supports_sso: z.boolean(),
  is_open_source: z.boolean(),
  is_in_yc: z.boolean()
});

const result = await app.scrape("https://firecrawl.dev", {
  formats: [{
    type: "json",
    schema: schema
  }],
});

console.log(result);
```
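
Since the extracted values are produced by a model, it can be worth checking the returned object against the expected fields before using it downstream. The following is a minimal, SDK-independent Python sketch of that check, using only the standard library. It assumes the extracted JSON has already been pulled out of the response as a plain dict (where it lives on the response object is not shown here); `EXPECTED_FIELDS`, `validate_extracted`, and the `sample` payload are hypothetical names and data made up for illustration, not part of the Firecrawl API.

```python
# Expected field types, mirroring the four fields of the Zod schema above.
EXPECTED_FIELDS = {
    "company_mission": str,
    "supports_sso": bool,
    "is_open_source": bool,
    "is_in_yc": bool,
}


def validate_extracted(data):
    """Return a list of problems; an empty list means the payload matches."""
    if not isinstance(data, dict):
        return ["payload is not a JSON object"]
    problems = []
    for name, expected_type in EXPECTED_FIELDS.items():
        if name not in data:
            problems.append(f"missing field: {name}")
        elif not isinstance(data[name], expected_type):
            problems.append(
                f"wrong type for {name}: expected {expected_type.__name__}"
            )
    return problems


# Hypothetical payload for illustration only; this is not real API output.
sample = {
    "company_mission": "Example mission statement",
    "supports_sso": True,
    "is_open_source": True,
    "is_in_yc": "yes",  # deliberately the wrong type to show a failure
}

print(validate_extracted(sample))
```

Returning a list of problems rather than raising on the first one makes it easier to log every mismatch in a single pass; in a Node.js project the same check could instead be done by parsing the result with the Zod schema itself.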