### Start a Crawlee project with CLI

Source: https://crawlee.dev/js/docs/quick-start

Navigate to your project directory and start the crawler using the npm start script.

```bash
cd my-crawler && npm start
```

--------------------------------

### Start the crawler project

Source: https://crawlee.dev/js/docs/3.16/quick-start

Navigate to the created project directory and start the crawler using npm.

```bash
cd my-crawler && npm start
```

--------------------------------

### Install @apify/tsconfig

Source: https://crawlee.dev/js/docs/3.16/guides/typescript-project

Install the recommended TypeScript configuration preset from Apify.

```bash
npm install --save-dev @apify/tsconfig
```

--------------------------------

### Install Impit HTTP Client Package

Source: https://crawlee.dev/js/docs/3.16/guides/http-clients

Install the necessary package for using the `ImpitHttpClient` with Crawlee.

```bash
npm i @crawlee/impit-client
```

--------------------------------

### Example Dockerfile for Node.js/JavaScript Actor

Source: https://crawlee.dev/js/docs/3.16/guides/docker-images

A standard Dockerfile for Node.js actors. It optimizes build times by copying package files first and installs only necessary dependencies.

```dockerfile
# Specify the base Docker image. You can read more about
# the available images at https://crawlee.dev/js/docs/guides/docker-images
# You can also use any other image from Docker Hub.
FROM apify/actor-node:20

# Copy just package.json and package-lock.json
# to speed up the build using Docker layer cache.
COPY package*.json ./

# Install NPM packages, skip optional and development dependencies to
# keep the image small. Avoid logging too much and print the dependency
# tree for debugging
RUN npm --quiet set progress=false \
    && npm install --omit=dev --omit=optional \
    && echo "Installed NPM packages:" \
    && (npm list --omit=dev --all || true) \
    && echo "Node.js version:" \
    && node --version \
    && echo "NPM version:" \
    && npm --version

# Next, copy the remaining files and directories with the source code.
# Since we do this after NPM install, quick build will be really fast
# for most source file changes.
COPY . ./

# Run the image.
CMD npm start --silent
```

--------------------------------

### Basic HTTP Crawler Setup and Run

Source: https://crawlee.dev/js/docs/examples/http-crawler

Sets up and runs an HttpCrawler with basic configurations for concurrency, retries, timeouts, and request limits. It defines handlers for processing successful requests and handling failed ones, then initiates the crawl with a list of starting URLs.

```javascript
import { HttpCrawler, log, LogLevel } from 'crawlee';

log.setLevel(LogLevel.DEBUG);

const crawler = new HttpCrawler({
    minConcurrency: 10,
    maxConcurrency: 50,
    maxRequestRetries: 1,
    requestHandlerTimeoutSecs: 30,
    maxRequestsPerCrawl: 10,

    async requestHandler({ pushData, request, body }) {
        log.debug(`Processing ${request.url}...`);

        await pushData({
            url: request.url,
            body,
        });
    },

    failedRequestHandler({ request }) {
        log.debug(`Request ${request.url} failed twice.`);
    },
});

await crawler.run(['https://crawlee.dev']);

log.debug('Crawler finished.');

```
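
The relationship between `maxRequestRetries` and `failedRequestHandler` can be illustrated without Crawlee. The sketch below is a plain JavaScript simulation under simplifying assumptions: `runWithRetries` is a hypothetical helper, not a Crawlee API, and the real crawler adds queueing and timeouts on top of this idea.

```javascript
// Hypothetical helper (not part of Crawlee): call `handler` up to
// 1 + maxRequestRetries times, then call `onFailed` if every attempt threw.
function runWithRetries(handler, { maxRequestRetries, onFailed }) {
    let attempts = 0;
    let lastError;
    while (attempts <= maxRequestRetries) {
        attempts += 1;
        try {
            return { ok: true, attempts, value: handler(attempts) };
        } catch (error) {
            lastError = error;
        }
    }
    onFailed(lastError, attempts);
    return { ok: false, attempts };
}

// With maxRequestRetries: 1, a handler that always throws is attempted twice
// before the failure callback runs once.
const failedAttempts = [];
const failedResult = runWithRetries(
    () => { throw new Error('boom'); },
    { maxRequestRetries: 1, onFailed: (error, attempts) => failedAttempts.push(attempts) },
);
console.log(failedResult.attempts, failedAttempts.length);
```

This matches the "failed twice" log message in the example above: one initial attempt plus one retry.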

--------------------------------

### Quick Start: Using Proxy URLs

Source: https://crawlee.dev/js/docs/guides/proxy-management

Initialize ProxyConfiguration with a list of proxy URLs to start using them immediately. Crawlee will rotate through these proxies.

```javascript
import { ProxyConfiguration } from 'crawlee';

const proxyConfiguration = new ProxyConfiguration({
    proxyUrls: [
        'http://proxy-1.com',
        'http://proxy-2.com',
    ]
});
const proxyUrl = await proxyConfiguration.newUrl();
```
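
To make "rotation" concrete, here is a minimal sketch of round-robin selection in plain JavaScript. It assumes the simplest possible strategy; `createRoundRobin` is a hypothetical helper, and `ProxyConfiguration` may choose URLs differently (for example, in relation to sessions).

```javascript
// Hypothetical helper, not part of Crawlee: hands out URLs in round-robin order.
function createRoundRobin(urls) {
    if (urls.length === 0) {
        throw new Error('At least one proxy URL is required');
    }
    let index = 0;
    return function nextUrl() {
        const url = urls[index % urls.length];
        index += 1;
        return url;
    };
}

const nextProxyUrl = createRoundRobin(['http://proxy-1.com', 'http://proxy-2.com']);

// Successive calls alternate between the two URLs.
console.log(nextProxyUrl());
console.log(nextProxyUrl());
console.log(nextProxyUrl());
```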

--------------------------------

### Install Apify CLI and Log In

Source: https://crawlee.dev/js/docs/deployment/apify-platform

Install the Apify CLI globally and log in to your Apify account using your API token. This is a prerequisite for using the CLI to manage your Apify resources.

```bash
npm install -g apify-cli  
apify login -t YOUR_API_TOKEN  
```

--------------------------------

### Quick Start with Proxy URLs

Source: https://crawlee.dev/js/docs/3.16/guides/proxy-management

Initialize ProxyConfiguration with a list of proxy URLs to enable automatic rotation. This is a quick way to start using your own proxy servers.

```javascript
import { ProxyConfiguration } from 'crawlee';

const proxyConfiguration = new ProxyConfiguration({
    proxyUrls: [
        'http://proxy-1.com',
        'http://proxy-2.com',
    ]
});

const proxyUrl = await proxyConfiguration.newUrl();

```

--------------------------------

### Basic Dockerfile for Crawlee with Playwright

Source: https://crawlee.dev/js/docs/guides/docker-images

This Dockerfile installs NPM dependencies, copies project files, and sets the start command. It's suitable for projects using Playwright.

```docker
# Specify the base Docker image. You can read more about
# the available images at https://crawlee.dev/js/docs/guides/docker-images
# You can also use any other image from Docker Hub.
FROM apify/actor-node-playwright-chrome:20

# Copy just package.json and package-lock.json
# to speed up the build using Docker layer cache.
COPY --chown=myuser package*.json ./

# Install NPM packages, skip optional and development dependencies to
# keep the image small. Avoid logging too much and print the dependency
# tree for debugging
RUN npm --quiet set progress=false \
    && npm install --omit=dev --omit=optional \
    && echo "Installed NPM packages:" \
    && (npm list --omit=dev --all || true) \
    && echo "Node.js version:" \
    && node --version \
    && echo "NPM version:" \
    && npm --version

# Next, copy the remaining files and directories with the source code.
# Since we do this after NPM install, quick build will be really fast
# for most source file changes.
COPY --chown=myuser . .


# Run the image.
CMD npm run start:prod --silent
```

--------------------------------

### Install SDK v1 with Puppeteer

Source: https://crawlee.dev/js/docs/upgrading/upgrading-to-v1

Install Apify SDK v1 along with the Puppeteer package for browser automation.

```bash
npm install apify puppeteer
```

--------------------------------

### Basic CheerioCrawler Setup

Source: https://crawlee.dev/js/docs/3.16/introduction/first-crawler

Demonstrates the fundamental setup of a CheerioCrawler, including importing necessary classes, opening a RequestQueue, adding an initial request, and defining a request handler to extract page titles.

```javascript
import { RequestQueue, CheerioCrawler } from 'crawlee';

const requestQueue = await RequestQueue.open();
await requestQueue.addRequest({ url: 'https://crawlee.dev' });

const crawler = new CheerioCrawler({
    requestQueue,
    async requestHandler({ $, request }) {
        const title = $('title').text();
        console.log(`The title of "${request.url}" is: ${title}.`);
    }
});

await crawler.run();
```

--------------------------------

### Install @sparticuz/chromium and Zip Dependencies

Source: https://crawlee.dev/js/docs/deployment/aws-browsers

Install the @sparticuz/chromium package and zip the node_modules folder for use as a Lambda Layer.

```bash
# Install the package
npm i -S @sparticuz/chromium

# Zip the dependencies
zip -r dependencies.zip ./node_modules
```

--------------------------------

### Install Apify CLI

Source: https://crawlee.dev/js/docs/3.16/deployment/apify-platform

Install the Apify CLI globally to manage your Apify account and projects from your local machine.

```bash
npm install -g apify-cli
```

--------------------------------

### Install SDK v1 with Playwright

Source: https://crawlee.dev/js/docs/3.16/upgrading/upgrading-to-v1

Install the SDK v1 and the Playwright package to leverage Playwright's browser automation capabilities.

```bash
npm install apify playwright
```

--------------------------------

### Install Crawlee Meta-Package

Source: https://crawlee.dev/js/docs/3.16/upgrading/upgrading-to-v3

Install the main `crawlee` package which re-exports most of the `@crawlee/*` packages, including all crawler classes.

```bash
npm install crawlee  
```

--------------------------------

### Install SDK v1 with Puppeteer

Source: https://crawlee.dev/js/docs/3.16/upgrading/upgrading-to-v1

Install the SDK v1 along with the Puppeteer package to maintain compatibility with previous versions.

```bash
npm install apify puppeteer
```

--------------------------------

### Install SDK v1 with Playwright

Source: https://crawlee.dev/js/docs/upgrading/upgrading-to-v1

Install Apify SDK v1 along with the Playwright package for browser automation.

```bash
npm install apify playwright
```

--------------------------------

### Install Crawlee with Playwright

Source: https://crawlee.dev/js/docs/quick-start

Install Crawlee along with Playwright for headless browser automation. Playwright is installed separately to reduce the core library size.

```bash
npm install crawlee playwright
```

--------------------------------

### Install TypeScript Compiler

Source: https://crawlee.dev/js/docs/3.16/guides/typescript-project

Install the TypeScript compiler as a development dependency using npm.

```bash
npm install --save-dev typescript
```

--------------------------------

### Install Specific Crawlee Package (Cheerio)

Source: https://crawlee.dev/js/docs/3.16/upgrading/upgrading-to-v3

Install only the `@crawlee/cheerio` package if you only need Cheerio support, reducing the number of dependencies.

```bash
npm install @crawlee/cheerio  
```

--------------------------------

### Install Crawlee and Playwright

Source: https://crawlee.dev/js/docs/3.16/quick-start

Install Crawlee along with Playwright for browser automation. Playwright is not bundled with Crawlee.

```bash
npm install crawlee playwright
```

--------------------------------

### Install Apify CLI Globally

Source: https://crawlee.dev/js/docs/3.16/introduction/deployment

Install the Apify CLI globally to manage authentication and deployment for all your Crawlee/Apify projects.

```bash
npm install -g apify-cli  
```

--------------------------------

### Install Crawlee and Puppeteer

Source: https://crawlee.dev/js/docs/3.16/quick-start

Install Crawlee along with Puppeteer for browser automation. Puppeteer is not bundled with Crawlee.

```bash
npm install crawlee puppeteer
```

--------------------------------

### Install Node.js Type Declarations

Source: https://crawlee.dev/js/docs/3.16/guides/typescript-project

Install type declarations for Node.js to enable type-checking for Node.js features.

```bash
npm install --save-dev @types/node
```

--------------------------------

### Handle Start Page with Crawlee.js

Source: https://crawlee.dev/js/docs/introduction/scraping

Enqueues category links from the initial start page. This is the entry point for navigating the website structure.

```javascript
        } else {
            // This means we're on the start page, with no label.
            // On this page, we just want to enqueue all the category pages.

            await page.waitForSelector('.collection-block-item');
            await enqueueLinks({
                selector: '.collection-block-item',
                label: 'CATEGORY',
            });
        }
```

--------------------------------

### Install Apify SDK

Source: https://crawlee.dev/js/docs/3.16/introduction/deployment

Install the Apify SDK as a dependency for your Node.js project to interact with Apify Platform's cloud products like RequestQueue and Dataset.

```bash
npm install apify  
```

--------------------------------

### Development Start Script with ts-node-esm

Source: https://crawlee.dev/js/docs/3.16/guides/typescript-project

Configure an NPM script that runs the project during development with ts-node-esm. The `-T` flag (short for `--transpileOnly`) skips type checking so the script starts faster.

```json
{
    "scripts": {
        "start:dev": "ts-node-esm -T src/main.ts"
    }
}
```

--------------------------------

### Basic PlaywrightCrawler Setup

Source: https://crawlee.dev/js/docs/3.16/deployment/gcp-browsers

Initial setup for a PlaywrightCrawler with a router and configuration. This is a foundational step before integrating with an HTTP server for Cloud Run deployment.

```javascript
import { Configuration, PlaywrightCrawler } from 'crawlee';
import { router } from './routes.js';

const startUrls = ['https://crawlee.dev'];

const crawler = new PlaywrightCrawler({
    requestHandler: router,
}, new Configuration({
    persistStorage: false,
}));

await crawler.run(startUrls);
```

--------------------------------

### Basic StagehandCrawler Example

Source: https://crawlee.dev/js/docs/3.16/guides/stagehand-crawler-guide

This example demonstrates how to initialize and run a StagehandCrawler to extract data from a website. It shows how to configure the crawler with AI model options and implement a request handler for extracting page titles, interacting with navigation, and gathering structured data.

```typescript
import { StagehandCrawler } from '@crawlee/stagehand';

import { z } from 'zod';

const crawler = new StagehandCrawler({
    stagehandOptions: {
        env: 'LOCAL',
        model: 'openai/gpt-4.1-mini',
        verbose: 1,
    },
    async requestHandler({ page, request, log, pushData }) {
        log.info(`Processing ${request.url}`);

        // Use AI to extract the page title
        const title = await page.extract('Get the main heading of the page', z.string());

        // Use AI to click on a navigation element
        await page.act('Click on the Documentation link');

        // Extract structured data after navigation
        const navItems = await page.extract('Get all sidebar navigation items', z.array(z.string()));

        log.info(`Found ${navItems.length} navigation items`);

        await pushData({
            url: request.url,
            title,
            navItems,
        });
    },
});

await crawler.run(['https://crawlee.dev']);

```

--------------------------------

### Puppeteer Recursive Crawl Example

Source: https://crawlee.dev/js/docs/3.16/examples/puppeteer-recursive-crawl

Use this snippet to perform a recursive crawl of a website with PuppeteerCrawler. It starts by adding initial requests and then recursively crawls links matching a glob pattern. Ensure you have Puppeteer installed.

```javascript
import { PuppeteerCrawler } from 'crawlee';

const crawler = new PuppeteerCrawler({
    async requestHandler({ request, page, enqueueLinks, log }) {
        const title = await page.title();
        log.info(`Title of ${request.url}: ${title}`);

        await enqueueLinks({
            globs: ['http?(s)://www.iana.org/**'],
        });
    },
    maxRequestsPerCrawl: 10,
});

await crawler.addRequests(['https://www.iana.org/']);

await crawler.run();

```
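
To see which links a pattern like `http?(s)://www.iana.org/**` is meant to accept, the sketch below approximates it with a regular expression. The regex reflects an assumption about the glob's intent (an optional `s`, then any path under the host); it is not Crawlee's own matching code, and `sameSiteLinks` is a hypothetical helper.

```javascript
// Approximation of the glob 'http?(s)://www.iana.org/**' as a RegExp.
const ianaPattern = /^https?:\/\/www\.iana\.org\/.*$/;

// Hypothetical helper: keep only URLs that the pattern accepts.
function sameSiteLinks(urls) {
    return urls.filter((url) => ianaPattern.test(url));
}

const candidates = [
    'https://www.iana.org/domains',
    'http://www.iana.org/',
    'https://example.com/www.iana.org/',
    'https://iana.org/about',
];

// Only the first two candidates are on www.iana.org over http or https.
console.log(sameSiteLinks(candidates));
```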

--------------------------------

### Complete Crawlee Example with Data Saving

Source: https://crawlee.dev/js/docs/3.16/introduction/saving-data

A full PlaywrightCrawler example that scrapes product details from a website and saves them using `Dataset.pushData()`. It handles pagination and different page types (start, category, detail).

```javascript
import { PlaywrightCrawler, Dataset } from 'crawlee';

const crawler = new PlaywrightCrawler({
    requestHandler: async ({ page, request, enqueueLinks }) => {
        console.log(`Processing: ${request.url}`);

        if (request.label === 'DETAIL') {
            const urlPart = request.url.split('/').slice(-1); // ['sennheiser-mke-440-professional-stereo-shotgun-microphone-mke-440']
            const manufacturer = urlPart[0].split('-')[0]; // 'sennheiser'

            const title = await page.locator('.product-meta h1').textContent();
            const sku = await page.locator('span.product-meta__sku-number').textContent();

            const priceElement =
                page
                    .locator('span.price')
                    .filter({
                        hasText: '$',
                    })
                    .first();

            const currentPriceString = await priceElement.textContent();
            const rawPrice = currentPriceString?.split('$')[1];
            const price = Number(rawPrice?.replaceAll(',', ''));

            const inStockElement =
                page
                    .locator('span.product-form__inventory')
                    .filter({
                        hasText: 'In stock',
                    })
                    .first();

            const inStock = (await inStockElement.count()) > 0;

            const results = {
                url: request.url,
                manufacturer,
                title,
                sku,
                currentPrice: price,
                availableInStock: inStock,
            };

            await Dataset.pushData(results);
        } else if (request.label === 'CATEGORY') {
            // We are now on a category page. We can use this to paginate through and enqueue all products,
            // as well as any subsequent pages we find

            await page.waitForSelector('.product-item > a');
            await enqueueLinks({
                selector: '.product-item > a',
                label: 'DETAIL', // <= note the different label
            });

            // Now we need to find the "Next" button and enqueue the next page of results (if it exists)
            const nextButton = await page.$('a.pagination__next');
            if (nextButton) {
                await enqueueLinks({
                    selector: 'a.pagination__next',
                    label: 'CATEGORY', // <= note the same label
                });
            }
        } else {
            // This means we're on the start page, with no label.
            // On this page, we just want to enqueue all the category pages.

            await page.waitForSelector('.collection-block-item');
            await enqueueLinks({
                selector: '.collection-block-item',
                label: 'CATEGORY',
            });
        }
    },

    // Let's limit our crawls to make our tests shorter and safer.
    maxRequestsPerCrawl: 50,
});

await crawler.run(['https://warehouse-theme-metal.myshopify.com/collections']);

```
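
The string handling inside the `DETAIL` branch can be pulled into pure functions and checked without a browser. This is a sketch only: the helper names `manufacturerFromUrl` and `parsePrice` are hypothetical, and the sample URL path and price text are made up for illustration.

```javascript
// Take the last URL path segment and return its first dash-separated word.
function manufacturerFromUrl(url) {
    const lastSegment = url.split('/').slice(-1)[0];
    return lastSegment.split('-')[0];
}

// Extract the number after the first '$' and drop thousands separators.
// Returns NaN when the text is missing or contains no '$'.
function parsePrice(text) {
    const rawPrice = text?.split('$')[1];
    return Number(rawPrice?.replaceAll(',', ''));
}

const sampleUrl = 'https://example.com/products/sennheiser-mke-440-professional-stereo-shotgun-microphone-mke-440';
console.log(manufacturerFromUrl(sampleUrl));
console.log(parsePrice('Sale price$1,398.00'));
```

Keeping this logic in small functions makes it easy to test selectors and parsing separately.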

--------------------------------

### Faster Request Addition with CheerioCrawler

Source: https://crawlee.dev/js/docs/3.16/introduction/first-crawler

Shows a more concise way to start a CheerioCrawler by passing URLs directly to the crawler.run() method, simplifying the setup by implicitly managing the RequestQueue.

```javascript
import { CheerioCrawler } from 'crawlee';

const crawler = new CheerioCrawler({
    async requestHandler({ $, request }) {
        const title = $('title').text();
        console.log(`The title of "${request.url}" is: ${title}.`);
    }
});

await crawler.run(['https://crawlee.dev']);
```

--------------------------------

### Crawl Single URL with got-scraping

Source: https://crawlee.dev/js/docs/3.16/examples/crawl-single-url

Use this snippet to fetch the HTML of a web page using the got-scraping package. Ensure the 'got-scraping' package is installed. The URL is hard-coded in this example.

```javascript
import { gotScraping } from 'got-scraping';

// Get the HTML of a web page
const { body } = await gotScraping({ url: 'https://www.example.com' });
console.log(body);
```

--------------------------------

### Initialize Project with Apify CLI

Source: https://crawlee.dev/js/docs/introduction/deployment

Use this command to initialize your project for Apify. It creates a .actor folder and an actor.json file for platform configuration.

```bash
apify init
```

--------------------------------

### Crawler Setup with Router

Source: https://crawlee.dev/js/docs/3.16/introduction/refactoring

Sets up a PlaywrightCrawler using a router instance for request handling. This replaces traditional if-clause logic for better organization. Ensure Crawlee is installed and imported.

```javascript
import { PlaywrightCrawler, log } from 'crawlee';
import { router } from './routes.mjs';

log.setLevel(log.LEVELS.DEBUG);

log.debug('Setting up crawler.');
const crawler = new PlaywrightCrawler({
    requestHandler: router,
});

await crawler.run(['https://warehouse-theme-metal.myshopify.com/collections']);
```
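
The idea behind a router, dispatching on `request.label` instead of chaining `if` clauses, can be shown with a small stand-in. `createLabelRouter` below is a hypothetical toy written for this illustration, not Crawlee's router; it only mirrors the concept of one handler per label plus a default.

```javascript
// Hypothetical toy router: one handler per label, with a fallback.
function createLabelRouter() {
    const handlers = new Map();
    let defaultHandler = () => {
        throw new Error('No default handler registered');
    };

    // The router is itself a function, so it can be used wherever
    // a single request handler is expected.
    const router = (context) => {
        const handler = handlers.get(context.request.label) ?? defaultHandler;
        return handler(context);
    };
    router.addHandler = (label, handler) => { handlers.set(label, handler); };
    router.addDefaultHandler = (handler) => { defaultHandler = handler; };
    return router;
}

const toyRouter = createLabelRouter();
toyRouter.addDefaultHandler(() => 'start page');
toyRouter.addHandler('CATEGORY', () => 'category page');
toyRouter.addHandler('DETAIL', ({ request }) => `detail page: ${request.url}`);

console.log(toyRouter({ request: { url: 'https://example.com/p/1', label: 'DETAIL' } }));
console.log(toyRouter({ request: { url: 'https://example.com/' } }));
```

Each page type gets its own function, so adding a new label does not touch the existing handlers.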

--------------------------------

### Basic HTTP Server Setup

Source: https://crawlee.dev/js/docs/guides/running-in-web-server

Sets up a basic Node.js HTTP server that listens on port 3000 and logs incoming requests. This serves as the foundation for handling web requests.

```typescript
import { createServer } from 'http';
import { log } from 'crawlee';

const server = createServer(async (req, res) => {
    log.info(`Request received: ${req.method} ${req.url}`);

    res.writeHead(200, { 'Content-Type': 'text/plain' });
    // We will return the page title here later instead
    res.end('Hello World\n');
});

server.listen(3000, () => {
    log.info('Server is listening for user requests');
});

```
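
Before the server can return a page title, it needs to know which page the caller wants. One possible approach, shown here as an assumption rather than the guide's own method, is to read the target from a `?url=` query parameter. `targetUrlFromRequest` is a hypothetical helper.

```javascript
// Hypothetical helper: read the page to crawl from a `?url=` query parameter.
function targetUrlFromRequest(requestUrl) {
    // In Node's http module, req.url holds only the path and query string,
    // so a base URL is needed to parse it. The base value itself is arbitrary.
    const parsed = new URL(requestUrl, 'http://localhost');
    return parsed.searchParams.get('url');
}

console.log(targetUrlFromRequest('/?url=https%3A%2F%2Fcrawlee.dev'));
console.log(targetUrlFromRequest('/'));
```

Inside the server callback this would be called as `targetUrlFromRequest(req.url)`, with a missing parameter (a `null` result) handled as a client error.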

--------------------------------

### PuppeteerCrawler Recursive Crawl Example

Source: https://crawlee.dev/js/docs/examples/puppeteer-recursive-crawl

Sets up and runs a PuppeteerCrawler to recursively crawl a website starting from a given URL. It enqueues links matching a specific glob pattern and limits the number of requests.

```javascript
import { PuppeteerCrawler } from 'crawlee';

const crawler = new PuppeteerCrawler({
    async requestHandler({ request, page, enqueueLinks, log }) {
        const title = await page.title();
        log.info(`Title of ${request.url}: ${title}`);

        await enqueueLinks({
            globs: ['http?(s)://www.iana.org/**'],
        });
    },
    maxRequestsPerCrawl: 10,
});

await crawler.addRequests(['https://www.iana.org/']);

await crawler.run();

```

--------------------------------

### Puppeteer Crawler Setup and Execution

Source: https://crawlee.dev/js/docs/3.16/examples/puppeteer-crawler

Sets up and runs a PuppeteerCrawler to scrape Hacker News. It configures crawler options, defines request and failure handlers, and starts the crawl from a given URL. Results are stored in the default dataset.

```typescript
import { PuppeteerCrawler } from 'crawlee';

const crawler = new PuppeteerCrawler({
    launchContext: {
        launchOptions: {
            headless: true,
        },
    },
    maxRequestsPerCrawl: 50,
    async requestHandler({ pushData, request, page, enqueueLinks, log }) {
        log.info(`Processing ${request.url}...`);

        const data = await page.$$eval('.athing', ($posts) => {
            const scrapedData: { title?: string; rank?: string; href?: string }[] = [];

            $posts.forEach(($post) => {
                scrapedData.push({
                    title: $post.querySelector<HTMLElement>('.title a')?.innerText,
                    rank: $post.querySelector<HTMLElement>('.rank')?.innerText,
                    href: $post.querySelector<HTMLAnchorElement>('.title a')?.href,
                });
            });

            return scrapedData;
        });

        await pushData(data);

        const infos = await enqueueLinks({
            selector: '.morelink',
        });

        if (infos.processedRequests.length === 0) log.info(`${request.url} is the last page!`);
    },
    failedRequestHandler({ request, log }) {
        log.error(`Request ${request.url} failed too many times.`);
    },
});

await crawler.addRequests(['https://news.ycombinator.com/']);

await crawler.run();

console.log('Crawler finished.');

```

--------------------------------

### Migrating to Actor.init() and Actor.exit()

Source: https://crawlee.dev/js/docs/3.16/upgrading/upgrading-to-v3

Shows the equivalent of using `Actor.main()` by directly calling `Actor.init()` and `Actor.exit()`. This pattern is useful when you need more control over the initialization and exit process.

```javascript
import { Actor } from 'apify';
  
await Actor.init();
// your code
await Actor.exit('Crawling finished!');

```

--------------------------------

### Publish Project to Apify Platform

Source: https://crawlee.dev/js/docs/introduction/deployment

Run this command to package your project, upload it to Apify, and start a Docker build. You will receive a link to your new Actor upon completion.

```bash
apify push
```

--------------------------------

### Basic HTTP Server Setup

Source: https://crawlee.dev/js/docs/3.16/guides/running-in-web-server

Sets up a basic Node.js HTTP server using the built-in 'http' module. This server listens for incoming requests and logs them. It's the foundation for handling web requests that will be passed to the crawler.

```javascript
import { createServer } from 'http';
import { log } from 'crawlee';

const server = createServer(async (req, res) => {
    log.info(`Request received: ${req.method} ${req.url}`);

    res.writeHead(200, { 'Content-Type': 'text/plain' });
    // We will return the page title here later instead
    res.end('Hello World\n');
});

server.listen(3000, () => {
    log.info('Server is listening for user requests');
});

```

--------------------------------

### Using Actor.main() for Initialization and Exit

Source: https://crawlee.dev/js/docs/3.16/upgrading/upgrading-to-v3

Illustrates the simplified approach to managing crawler lifecycle using `Actor.main()`, which handles initialization and exit automatically. This is the recommended way for most use cases.

```javascript
import { Actor } from 'apify';
  
await Actor.main(async () => {
    // your code
}, { statusMessage: 'Crawling finished!' });

```

--------------------------------

### Creating a Custom HTTP Client

Source: https://crawlee.dev/js/docs/guides/custom-http-client

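Demonstrates passing a custom `request` function to a `BasicCrawler` so that requests go through your own HTTP logic. Other snippets in this collection supply an HTTP client through the `httpClient` crawler option, so check the option name against the API reference for your Crawlee version.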

```javascript
import { BasicCrawler } from 'crawlee';

const crawler = new BasicCrawler({
    request: async ({ url, method, headers, body, response, options }) => {
        // Custom logic here
        console.log(`Requesting ${url}`);
        // You can use a library like 'axios' or 'node-fetch' here
        // For example:
        // const axios = require('axios');
        // const response = await axios.request({ url, method, headers, data: body, ...options });
        // return response.data; // Or the full response object

        // For simplicity, we'll just return a placeholder response
        return {
            body: `<html><body>Hello from custom client for ${url}</body></html>`,
            statusCode: 200,
            headers: { 'content-type': 'text/html' },
        };
    },
});

await crawler.run(['http://example.com']);
```

--------------------------------

### Initialize Project for Apify

Source: https://crawlee.dev/js/docs/3.16/introduction/deployment

Initialize your Crawlee project for deployment on the Apify Platform using the Apify CLI. This creates an `.actor` folder with `actor.json` for platform-specific configuration.

```bash
apify init  
```

--------------------------------

### HTML Link Example

Source: https://crawlee.dev/js/docs/3.16/introduction/adding-urls

Example of an anchor `<a>` element with an `href` attribute, the kind of element Crawlee looks for by default when finding links.

```html
<a href="https://crawlee.dev/js/docs/introduction">This is a link to Crawlee introduction</a>  
```

--------------------------------

### Crawl Output Example

Source: https://crawlee.dev/js/docs/3.16/introduction/setting-up

Example log messages observed in the terminal during a crawl of the Crawlee website.

```log
INFO  PlaywrightCrawler: Starting the crawl  
INFO  PlaywrightCrawler: Title of https://crawlee.dev/ is 'Crawlee · Build reliable crawlers. Fast. | Crawlee'  
INFO  PlaywrightCrawler: Title of https://crawlee.dev/js/docs/examples is 'Examples | Crawlee'  
INFO  PlaywrightCrawler: Title of https://crawlee.dev/js/api/core is '@crawlee/core | API | Crawlee'  
INFO  PlaywrightCrawler: Title of https://crawlee.dev/js/api/core/changelog is 'Changelog | API | Crawlee'  
INFO  PlaywrightCrawler: Title of https://crawlee.dev/js/docs/quick-start is 'Quick Start | Crawlee'  
```

--------------------------------

### Use Pre-release Node.js Version

Source: https://crawlee.dev/js/docs/3.16/guides/docker-images

Demonstrates how to use a pre-release version of a Node.js image, typically denoted by a 'beta' suffix. This is useful for testing upcoming changes.

```dockerfile
# Without library version.
FROM apify/actor-node:24-beta
```

--------------------------------

### Configure Crawlee with crawlee.json

Source: https://crawlee.dev/js/docs/3.16/guides/configuration

Specify `ConfigurationOptions` in a `crawlee.json` file at the project root to set global configuration. This example sets the state persistence interval and log level.

```json
{
  "persistStateIntervalMillis": 10000,
  "logLevel": "DEBUG"
}
```

--------------------------------

### Push Project to Apify Platform

Source: https://crawlee.dev/js/docs/3.16/introduction/deployment

Use this command to archive your project, upload it to the Apify Platform, and start a Docker build. After completion, you will receive a link to your new Actor.

```bash
apify push  
```

--------------------------------

### Run Crawler

Source: https://crawlee.dev/js/docs/3.16/guides/request-storage

This call starts the crawler. Make sure the crawler is fully configured before calling it.

```javascript
await crawler.run();
```

--------------------------------

### Install Playwright Extra and Stealth Plugin

Source: https://crawlee.dev/js/docs/3.16/examples/crawler-plugins

Install the necessary packages for using Playwright Extra and its stealth plugin.

```bash
npm install playwright-extra puppeteer-extra-plugin-stealth  
```

--------------------------------

### Install Puppeteer Extra and Stealth Plugin

Source: https://crawlee.dev/js/docs/3.16/examples/crawler-plugins

Install the necessary packages for using Puppeteer Extra and its stealth plugin.

```bash
npm install puppeteer-extra puppeteer-extra-plugin-stealth  
```

--------------------------------

### Configure Proxy with Impit HTTP Client

Source: https://crawlee.dev/js/docs/3.16/guides/impit-http-client

Demonstrates how to set up proxy configurations for requests made using the Impit HTTP Client within a CheerioCrawler. Ensure the `ProxyConfiguration` is passed to the crawler.

```javascript
import { CheerioCrawler, ProxyConfiguration } from 'crawlee';
import { ImpitHttpClient, Browser } from '@crawlee/impit-client';

const proxyConfiguration = new ProxyConfiguration({
    proxyUrls: ['http://proxy1.example.com:8080', 'http://proxy2.example.com:8080'],
});

const crawler = new CheerioCrawler({
    httpClient: new ImpitHttpClient({ browser: Browser.Chrome }),
    proxyConfiguration,
    async requestHandler({ $, request }) {
        console.log(`Scraped ${request.url}`);
    },
});

```