### Install Dependencies for Custom Metric Container

Source: https://api-docs.lumar.io/docs/custom-metrics

After bootstrapping your container project, install its dependencies with your preferred package manager. Run the command from the root directory of the bootstrapped project.

```bash
npm install
```

--------------------------------

### Project Setup and Dependency Installation

Source: https://api-docs.lumar.io/docs/graphql-clients/typescript-esm

Steps to set up a new TypeScript project, install necessary dependencies, and download the GraphQL API schema.

```APIDOC
## Project Setup and Dependency Installation

### Description
This section outlines the commands required to initialize a new TypeScript project, add essential dependencies for GraphQL operations, and download the Lumar API schema.

### Setup TypeScript project
```bash
# Initialize new project
npm init -y
npm pkg set type="module"

# Add dependencies
npm add graphql @graphql-codegen/cli @graphql-codegen/typescript-operations @graphql-codegen/typescript-graphql-request graphql-request typescript ts-node

# Download the latest GraphQL API schema
curl https://api.lumar.io/schema.graphql > schema.graphql

# Generate TypeScript config
npm exec -- tsc --init --module node16 --moduleResolution node16
```
```

--------------------------------

### Install Dependencies and Fetch GraphQL Schema

Source: https://api-docs.lumar.io/docs/graphql-clients/python

Installs the `gql` Python library and fetches the latest GraphQL schema from the Lumar API. Requires Python, pip, and network access. The package spec is quoted so that shells such as zsh do not expand the brackets.

```shell
$ pip install "gql[all]"
$ curl https://api.lumar.io/schema.graphql > schema.graphql
```

--------------------------------

### Install Dependencies and Fetch Schema

Source: https://api-docs.lumar.io/docs/graphql-clients/python

Install the necessary Python dependencies and download the latest GraphQL schema from the Lumar API.
```APIDOC
## Install Dependencies and Fetch Schema

### Description
Install the `gql` Python client and fetch the GraphQL schema definition.

### Command
```bash
$ pip install "gql[all]"
$ curl https://api.lumar.io/schema.graphql > schema.graphql
```
```

--------------------------------

### Handler Options

Source: https://api-docs.lumar.io/docs/custom-metrics

Details the available options for configuring individual handlers (request, preCrawl, postCrawl).

```APIDOC
## Handler Options

### Description
Each handler entry is validated by `CrawlCustomMetricContainerHandlerSchema`. The following keys are available for configuration:

### Available Options
* **`handler`** (string) - Required. The name of the exported function in the entrypoint module.
* **`entrypoint`** (string) - Path to the TypeScript/JavaScript file that exports the handler. If omitted, the CLI uses the container-level `entrypoint`.
* **`timeoutMs`** (number) - Per-handler timeout in milliseconds. Raise it to allow longer Puppeteer operations, or lower it to fail faster.
* **`metricsTypeName`** (string) - The name of the TypeScript interface describing this handler’s return shape. Must be exported from the file referenced by `metricsTypePath` (or the entrypoint if no path override is provided).
* **`metricsTypeNames`** (object) - A record used by multi-output handlers (`outputType: "multi-output"`) to map each logical metric key to a TypeScript type (e.g., `{ product: "IProductMetrics" }`).
* **`metricsTypePath`** (string) - The file that exports the type(s) referenced by `metricsTypeName` or `metricsTypeNames`. Defaults to the handler’s entrypoint.
* **`metricsMetadata`** (object) - Optional metadata overrides scoped to this handler. Follows the structure documented in "Providing extra metadata for custom metrics for UI".
* **`externalPackages`** (array) - An array of native dependencies (e.g., `["sharp"]`) that must be installed alongside the bundle for runtime extraction to load them.
* **`tableType`** (string) - The target storage table when the handler emits a single record type. Choose from `CustomMetricContainerTableType` (e.g., `"dc:crawler:project_metrics:item"` for crawl-level data).
* **`tableTypes`** (object) - A record that allows multi-output handlers to map individual metric keys to specific table types (e.g., `{ summary: "dc:crawler:project_metrics:item" }`).
* **`outputType`** (string) - Options are:
  * `"single-output"` (default): The handler returns one object per URL.
  * `"multi-output"`: Signals that the handler returns multiple named metric objects and requires `metricsTypeNames` / `metricsSchemas` hints.
* **`metricsSchema`** (string) - Path to a JSON Schema file describing the handler’s output. Useful for publishing type-safe data from plain JavaScript handlers.
* **`metricsSchemas`** (object) - A record of JSON Schema paths for multi-output handlers (one schema per metric key).
* **`skipForSpr`** (boolean) - Only valid for `preCrawl` and `postCrawl` handlers. When `true`, the handler is not executed during Single Page Requester (SPR) runs, allowing for lightweight requests.

> **Tip:** Handler-level options take precedence over legacy top-level fields. You can migrate gradually by moving one handler at a time into the `handlers` object.
```

--------------------------------

### Build and Upload CustomMetricContainer

Source: https://api-docs.lumar.io/docs/custom-metrics

Builds and uploads a new version of your CustomMetricContainer. This process prepares your container for release and linking with Lumar projects.

```bash
npm run build
npm run upload
```

--------------------------------

### Example: Getting Paginated Projects

Source: https://api-docs.lumar.io/docs/graphql/pagination

Demonstrates how to fetch the first page of projects using a GraphQL query and shows the expected response and cURL command.
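Before the worked request and response, here is a minimal sketch of the loop that the `hasNextPage` / `endCursor` fields make possible. It is an illustration under stated assumptions, not Lumar client code: `fetchPage`, `collectAll`, and the in-memory `fakePages` are hypothetical names, and a real `fetchPage` would send the query with `after` set to the previous `endCursor`.

```typescript
// Shape of one page, mirroring the `pageInfo` and `nodes` fields in the query.
interface Page<T> {
  pageInfo: { hasNextPage: boolean; endCursor: string | null };
  nodes: T[];
}

// Hypothetical callback: runs the query with `after: cursor` and returns one page.
type FetchPage<T> = (after: string | null) => Promise<Page<T>>;

// Follows cursors until the API reports that no further page exists.
async function collectAll<T>(fetchPage: FetchPage<T>): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | null = null;
  for (;;) {
    const page: Page<T> = await fetchPage(cursor);
    all.push(...page.nodes);
    // Stop on the last page, or if the server gives no cursor to continue from.
    if (!page.pageInfo.hasNextPage || page.pageInfo.endCursor === null) break;
    cursor = page.pageInfo.endCursor;
  }
  return all;
}

// In-memory stand-in for the API so the sketch runs without network access.
const fakePages: Record<string, Page<{ name: string }>> = {
  start: { pageInfo: { hasNextPage: true, endCursor: "Mw" }, nodes: [{ name: "Cool Project" }] },
  Mw: { pageInfo: { hasNextPage: false, endCursor: "Ng" }, nodes: [{ name: "Some Project" }] },
};

const fakeFetch: FetchPage<{ name: string }> = async (after) => {
  const page = fakePages[after ?? "start"];
  if (!page) throw new Error(`unknown cursor: ${after}`);
  return page;
};

collectAll(fakeFetch).then((projects) => console.log(projects.map((p) => p.name)));
```

Keeping the transport behind a callback means the same loop works with cURL-style `fetch` calls or a generated GraphQL client.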
```APIDOC
## Example - getting paginated projects

We run the following query:

### Query
```graphql
query firstPage {
  me {
    accounts(first: 1) {
      nodes {
        projects(first: 3) {
          pageInfo {
            hasNextPage
            endCursor
          }
          nodes {
            name
          }
        }
      }
    }
  }
}
```

### Response
```json
{
  "data": {
    "me": {
      "accounts": {
        "nodes": [
          {
            "projects": {
              "pageInfo": { "hasNextPage": true, "endCursor": "Mw" },
              "nodes": [
                { "name": "Cool Project" },
                { "name": "Deep Crawling" },
                { "name": "Pro Ject" }
              ]
            }
          }
        ]
      }
    }
  }
}
```

### cURL
```bash
curl -X POST -H "Content-Type: application/json" -H "apollographql-client-name: docs-example-client" -H "apollographql-client-version: 1.0.0" -H "x-auth-token: YOUR_API_SESSION_TOKEN" --data '{"query":"query firstPage { me { accounts(first: 1) { nodes { projects(first: 3) { pageInfo { hasNextPage endCursor } nodes { name } } } } } }"}' https://api.lumar.io/graphql
```
```

--------------------------------

### Setup TypeScript Project and Install Dependencies

Source: https://api-docs.lumar.io/docs/graphql-clients/typescript

Initializes a new Node.js project, adds necessary dependencies for GraphQL code generation and requests, downloads the API schema, and generates a TypeScript configuration file.

```bash
# Initialize new project
$ npm init -y

# Add dependencies
$ npm add graphql @graphql-codegen/cli @graphql-codegen/typescript-operations @graphql-codegen/typescript-graphql-request graphql-request typescript ts-node

# Download the latest GraphQL API schema
$ curl https://api.lumar.io/schema.graphql > schema.graphql

# Generate TypeScript config
$ npm exec -- tsc --init
```

--------------------------------

### Start Lumar Protect Build on Linux

Source: https://api-docs.lumar.io/docs/protect/ci/ci-cli-tools

Example command to start a Lumar Protect build on a Linux system using the Deepcrawl Automate CLI. Requires the test suite ID and user key credentials.
```bash
./deepcrawl-test-linux --testSuiteId=TEST_SUITE_ID --userKeyId=USER_KEY_ID --userKeySecret=USER_KEY_SECRET
```

--------------------------------

### Start Lumar Protect Build on Windows

Source: https://api-docs.lumar.io/docs/protect/ci/ci-cli-tools

Example command to start a Lumar Protect build on a Windows system using the Deepcrawl Automate CLI. Requires the test suite ID and user key credentials.

```batch
deepcrawl-test-win.exe --testSuiteId=TEST_SUITE_ID --userKeyId=USER_KEY_ID --userKeySecret=USER_KEY_SECRET
```

--------------------------------

### Example: Getting the Second Page of Projects

Source: https://api-docs.lumar.io/docs/graphql/pagination

Shows how to retrieve the subsequent page of projects by using the `endCursor` from the previous response as the `after` parameter. Includes query, response, and cURL.

```APIDOC
## Example - getting the second page of projects

We can then get the second page with the query:

### Query
```graphql
query secondPage {
  me {
    accounts(first: 1) {
      nodes {
        projects(first: 3, after: "Mw") {
          pageInfo {
            hasNextPage
            endCursor
          }
          nodes {
            name
          }
        }
      }
    }
  }
}
```

### Response
```json
{
  "data": {
    "me": {
      "accounts": {
        "nodes": [
          {
            "projects": {
              "pageInfo": { "hasNextPage": true, "endCursor": "Ng" },
              "nodes": [
                { "name": "Dev.io: Main" },
                { "name": "Dev.io: Blog" },
                { "name": "Some Project" }
              ]
            }
          }
        ]
      }
    }
  }
}
```

### cURL
```bash
curl -X POST -H "Content-Type: application/json" -H "apollographql-client-name: docs-example-client" -H "apollographql-client-version: 1.0.0" -H "x-auth-token: YOUR_API_SESSION_TOKEN" --data '{"query":"query secondPage { me { accounts(first: 1) { nodes { projects(first: 3, after: \"Mw\") { pageInfo { hasNextPage endCursor } nodes { name } } } } } }"}' https://api.lumar.io/graphql
```
```

--------------------------------

### Test Custom Metrics with Single Page Requester

Source: https://api-docs.lumar.io/docs/custom-metrics

Tests custom metrics by making a request to a specific URL for a given project.
This allows you to preview the output of your custom metrics before a full crawl. ```bash npm run oreo project request-custom-metrics -- --projectId 123456 --url http://example.com/ ``` -------------------------------- ### Login to CLI Programmatically Source: https://api-docs.lumar.io/docs/custom-metrics Demonstrates how to log in to the Lumar CLI programmatically using environment variables for account ID, API key ID, and API key secret. This is essential for automated workflows in CI/CD. ```bash npm run oreo login -- --id ${{ secrets.API_KEY_ID }} --secret ${{ secrets.API_KEY_SECRET }} --accountId ${{ secrets.ACCOUNT_ID }} ``` -------------------------------- ### Create And Run Build Source: https://api-docs.lumar.io/docs/category/mutations Creates a build and immediately runs a crawl for it. ```APIDOC ## POST /api/build/create_and_run ### Description Creates a build and immediately runs a crawl for it. ### Method POST ### Endpoint /api/build/create_and_run ### Parameters #### Request Body - **project_id** (string) - Required - The ID of the project for which to create the build. ### Request Example ```json { "project_id": "proj_abc123" } ``` ### Response #### Success Response (200) - **build_id** (string) - The ID of the newly created build. - **message** (string) - Success message indicating the build was created and run. #### Response Example ```json { "build_id": "build_def456", "message": "Build created and crawl initiated successfully." } ``` ``` -------------------------------- ### TypeScript/JavaScript Metric Extraction Handler Source: https://api-docs.lumar.io/docs/custom-metrics Defines the interface for the metric's output and the handler function that extracts data. This example uses Puppeteer to get the current page URL. The handler takes an input object and context, returning a promise that resolves to the specified metrics interface. 
```typescript
import type {
  MetricScriptBasicOutput,
  MetricScriptHandler,
} from "@deepcrawl/custom-metric-types";

// Shape of the metrics this handler returns for each URL.
export interface IMetrics extends MetricScriptBasicOutput {
  url: string;
}

// `input.page` is the Puppeteer page for the URL being processed.
export const handler: MetricScriptHandler = async (input, _context) => {
  return {
    url: input.page.url(),
  };
};
```

--------------------------------

### TypeScript Configuration for Native Dependencies

Source: https://api-docs.lumar.io/docs/custom-metrics

This TypeScript configuration lists `sharp` in the `externalPackages` array, marking it as a native dependency of the custom metrics. Make sure `sharp` is also listed in your `package.json` dependencies so that it is installed correctly.

```typescript
import type { IContainerConfigData } from "@deepcrawl/oreo";

const config: IContainerConfigData = {
  externalPackages: ["sharp"],
};

export default config;
```

--------------------------------

### Full .oreorc.ts Configuration Example

Source: https://api-docs.lumar.io/docs/custom-metrics

Demonstrates a comprehensive `.oreorc.ts` configuration file, including top-level fields such as `id`, `secretsTypeName`, `paramsTypeName`, `allowedRenderingResources`, and `navigationTimeoutMs`, plus handler configurations for the request and postCrawl phases.

```typescript
import type { IContainerConfigData } from "@deepcrawl/oreo";

const config: IContainerConfigData = {
  id: "ccc_123",
  secretsTypeName: "ContainerSecrets",
  secretsTypePath: "src/secrets.ts",
  paramsTypeName: "RunParams",
  paramsTypePath: "src/params.ts",
  allowedRenderingResources: ["Image"],
  navigationTimeoutMs: 120000,
  handlers: {
    request: {
      entrypoint: "src/index.ts",
      handler: "handler",
      metricsTypeName: "IMetrics",
    },
    postCrawl: {
      entrypoint: "src/post-crawl.ts",
      handler: "postCrawlHandler",
      skipForSpr: true,
    },
  },
};

export default config;
```

--------------------------------

### Project Setup and Dependencies

Source: https://api-docs.lumar.io/docs/graphql-clients/typescript

Commands to initialize a new TypeScript project, add necessary dependencies, download the GraphQL schema, and generate a TypeScript configuration file.
```APIDOC
## Project Setup and Dependencies

### Description
Commands to initialize a new TypeScript project, add necessary dependencies, download the GraphQL schema, and generate a TypeScript configuration file.

### Method
Shell Commands

### Endpoint
N/A

### Parameters
N/A

### Request Example
```bash
# Initialize new project
$ npm init -y

# Add dependencies
$ npm add graphql @graphql-codegen/cli @graphql-codegen/typescript-operations @graphql-codegen/typescript-graphql-request graphql-request typescript ts-node

# Download the latest GraphQL API schema
$ curl https://api.lumar.io/schema.graphql > schema.graphql

# Generate TypeScript config
$ npm exec -- tsc --init
```

### Response
N/A

### Response Example
N/A
```

--------------------------------

### Setup TypeScript Project with Dependencies

Source: https://api-docs.lumar.io/docs/graphql-clients/typescript-esm

Initializes a new Node.js project with TypeScript support and installs necessary dependencies for GraphQL code generation and API requests. This includes graphql-codegen, graphql-request, and TypeScript itself. It also downloads the API schema and initializes the TypeScript configuration.

```bash
# Initialize new project
$ npm init -y
$ npm pkg set type="module"

# Add dependencies
$ npm add graphql @graphql-codegen/cli @graphql-codegen/typescript-operations @graphql-codegen/typescript-graphql-request graphql-request typescript ts-node

# Download the latest GraphQL API schema
$ curl https://api.lumar.io/schema.graphql > schema.graphql

# Generate TypeScript config
$ npm exec -- tsc --init --module node16 --moduleResolution node16
```

--------------------------------

### Start Lumar Protect Build on MacOS

Source: https://api-docs.lumar.io/docs/protect/ci/ci-cli-tools

Example command to start a Lumar Protect build on a MacOS system using the Deepcrawl Automate CLI. Requires the test suite ID and user key credentials.
```bash
./deepcrawl-test-macos --testSuiteId=TEST_SUITE_ID --userKeyId=USER_KEY_ID --userKeySecret=USER_KEY_SECRET
```

--------------------------------

### Create SEO Project cURL Example

Source: https://api-docs.lumar.io/docs/graphql/create-project

This cURL command demonstrates how to send a request to create an SEO project. It includes the endpoint, HTTP method, headers, and the JSON payload containing the mutation and variables. The `ProjectDetails` fragment lists the fields to return; add any other fields you want to retrieve there. The query is sent on a single line, so it must not contain `#` comments, which would comment out the closing brace.

```bash
curl -X POST -H "Content-Type: application/json" -H "apollographql-client-name: docs-example-client" -H "apollographql-client-version: 1.0.0" -H "x-auth-token: YOUR_API_SESSION_TOKEN" --data '{"query":"mutation CreateSEOProject($input: CreateProjectInput!) { createSEOProject(input: $input) { project { ...ProjectDetails } } } fragment ProjectDetails on Project { id name primaryDomain }","variables":{"input":{"accountId":"TjAwN0FjY291bnQ3MTU","name":"www.lumar.io SEO Project","primaryDomain":"https://www.lumar.io/"}}}' https://api.lumar.io/graphql
```

--------------------------------

### Set Up Environment Variables for Authentication

Source: https://api-docs.lumar.io/docs/graphql-clients/python

Configure environment variables with your Lumar API User Key ID and Secret for authentication.

```APIDOC
## Setup ENV variables for authenticating with Lumar API

### Description
Create user keys for API authentication, then expose them through the following environment variables.

### Environment Variables
```bash
export DEEPCRAWL_SECRET_ID="YourSecretId"
export DEEPCRAWL_SECRET="YourSecret"
```
```

--------------------------------

### Create a Crawl in Lumar

Source: https://api-docs.lumar.io/docs/custom-metrics

Initiates a crawl for a specified project. Once a CustomMetricContainer is linked, custom metrics will be extracted during this crawl.
```bash
npm run oreo crawl create -- --projectId 123456
```

--------------------------------

### Create Basic Project cURL Example

Source: https://api-docs.lumar.io/docs/graphql/create-project

This cURL command executes the `createBasicProject` mutation. It includes the necessary headers and the JSON payload for the mutation and its variables. The `ProjectDetails` fragment lists the fields to return; add any other fields you want to retrieve there. The query is sent on a single line, so it must not contain `#` comments, which would comment out the closing brace.

```bash
curl -X POST -H "Content-Type: application/json" -H "apollographql-client-name: docs-example-client" -H "apollographql-client-version: 1.0.0" -H "x-auth-token: YOUR_API_SESSION_TOKEN" --data '{"query":"mutation CreateBasicProject($input: CreateProjectInput!) { createBasicProject(input: $input) { project { ...ProjectDetails } } } fragment ProjectDetails on Project { id name primaryDomain }","variables":{"input":{"accountId":"TjAwN0FjY291bnQ3MTU","name":"www.lumar.io Basic Project","primaryDomain":"https://www.lumar.io/"}}}' https://api.lumar.io/graphql
```

--------------------------------

### Create Logzio Project Mutation

Source: https://api-docs.lumar.io/docs/schema/inputs/create-logzio-project-query-input

This section details the `createLogzioProjectQuery` mutation, which is used to create a new project in Logzio. It outlines the required input parameters and their types.

```APIDOC
## POST /graphql

### Description
This mutation creates a new project in Logzio. It requires a detailed input object specifying various configuration parameters for the project.

### Method
POST

### Endpoint
/graphql

### Parameters
#### Request Body
- **query** (String!) - The GraphQL query string for creating the project.
- **variables** (JSONObject) - A JSON object containing the variables for the mutation, including the `CreateLogzioProjectQueryInput`.

### Request Example
```json
{
  "query": "mutation CreateLogzioProject($input: CreateLogzioProjectQueryInput!)
{ createLogzioProject(input: $input) { id name } }", "variables": { "input": { "aiUaRegexp": "string", "dateRange": 0, "desktopUaRegexp": "string", "enabled": true, "logzioConnectionId": "objectId", "maxRows": 0, "mobileUaRegexp": "string", "pathFieldName": "string", "projectId": "objectId", "useLastCrawlDate": true, "userAgentFieldName": "string" } } } ``` ### Response #### Success Response (200) - **data** (JSONObject) - The response data containing the created project's ID and name. - **createLogzioProject** (JSONObject) - **id** (ID!) - The unique identifier of the newly created project. - **name** (String!) - The name of the newly created project. #### Response Example ```json { "data": { "createLogzioProject": { "id": "60d5ecf1a1b2c3d4e5f6a7b8", "name": "New Logzio Project" } } } ``` ``` -------------------------------- ### Set Project-Level Secret (GraphQL) Source: https://api-docs.lumar.io/docs/custom-metrics Configure secrets at the project level using the `setCustomMetricContainerProjectSecret` mutation. This allows for project-specific overrides of container-level secrets. ```APIDOC ## POST /graphql ### Description Sets a secret at the project level for a custom metric container. This secret overrides any container-level secret with the same name for the specified project. ### Method POST ### Endpoint https://api.lumar.io/graphql ### Parameters #### Query Parameters None #### Request Body - **query** (string) - Required - The GraphQL mutation string. - **variables** (object) - Required - The variables for the mutation. - **input** (object) - Required - Input object for the mutation. - **projectId** (ID) - Required - The ID of the project. - **customMetricContainerId** (ID) - Required - The ID of the custom metric container. - **name** (string) - Required - The name of the secret. - **value** (string) - Required - The value of the secret. 
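The override rule in the description above can be pictured as a simple merge. This is a minimal sketch of the documented precedence only, not Lumar's implementation: `SecretMap` and `resolveSecrets` are hypothetical names, and the values are placeholders.

```typescript
// Illustrative only: models the documented precedence, not Lumar's implementation.
type SecretMap = Record<string, string>;

// Container-level secrets act as defaults; a project-level secret with the
// same name replaces the container-level value for that project.
function resolveSecrets(containerSecrets: SecretMap, projectSecrets: SecretMap): SecretMap {
  return { ...containerSecrets, ...projectSecrets };
}

const effective = resolveSecrets(
  { OPENAI_APIKEY: "container-key", OTHER_TOKEN: "shared-value" },
  { OPENAI_APIKEY: "project-key" },
);

console.log(effective);
```

Secrets that exist only at the container level pass through unchanged, so a project needs to set only the names it wants to override.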
### Request Example ```json { "query": "mutation setCustomMetricContainerProjectSecret( $input: SetCustomMetricContainerProjectSecretInput! ) { setCustomMetricContainerProjectSecret(input: $input) { customMetricContainerProjectSecret { name } } }", "variables": { "input": { "projectId": 1, "customMetricContainerId": 1, "name": "OPENAI_APIKEY", "value": "MY API SECRET KEY" } } } ``` ### Response #### Success Response (200) - **data** (object) - **setCustomMetricContainerProjectSecret** (object) - **customMetricContainerProjectSecret** (object) - **name** (string) - The name of the secret that was set. #### Response Example ```json { "data": { "setCustomMetricContainerProjectSecret": { "customMetricContainerProjectSecret": { "name": "OPENAI_APIKEY" } } } } ``` ``` -------------------------------- ### Set Container-Level Secret (GraphQL) Source: https://api-docs.lumar.io/docs/custom-metrics Use the `setCustomMetricContainerSecret` mutation to configure secrets at the container level. These secrets are inherited by all linked projects by default. ```APIDOC ## POST /graphql ### Description Sets a secret at the container level for a custom metric container. This secret will be available as an environment variable within the container and will be inherited by all linked projects. ### Method POST ### Endpoint https://api.lumar.io/graphql ### Parameters #### Query Parameters None #### Request Body - **query** (string) - Required - The GraphQL mutation string. - **variables** (object) - Required - The variables for the mutation. - **input** (object) - Required - Input object for the mutation. - **customMetricContainerId** (ID) - Required - The ID of the custom metric container. - **name** (string) - Required - The name of the secret. - **value** (string) - Required - The value of the secret. ### Request Example ```json { "query": "mutation setCustomMetricContainerSecret( $input: SetCustomMetricContainerSecretInput! 
) { setCustomMetricContainerSecret(input: $input) { customMetricContainerSecret { name } } }", "variables": { "input": { "customMetricContainerId": 1, "name": "OPENAI_APIKEY", "value": "MY API SECRET KEY" } } } ``` ### Response #### Success Response (200) - **data** (object) - **setCustomMetricContainerSecret** (object) - **customMetricContainerSecret** (object) - **name** (string) - The name of the secret that was set. #### Response Example ```json { "data": { "setCustomMetricContainerSecret": { "customMetricContainerSecret": { "name": "OPENAI_APIKEY" } } } } ``` ``` -------------------------------- ### Create Test Suite - cURL Example Source: https://api-docs.lumar.io/docs/protect/test-suites/create-test-suites This cURL command demonstrates how to execute the CreateTestSuite GraphQL mutation. It includes necessary headers for authentication and content type, along with the mutation payload. Ensure you replace 'YOUR_API_SESSION_TOKEN' with your actual authentication token. ```curl curl -X POST -H "Content-Type: application/json" -H "apollographql-client-name: docs-example-client" -H "apollographql-client-version: 1.0.0" -H "x-auth-token: YOUR_API_SESSION_TOKEN" --data '{"query":"mutation CreateTestSuite($accountId: ObjectID!, $name: String!, $sitePrimary: String!) { createTestSuite( input: { accountId: $accountId, name: $name, sitePrimary: $sitePrimary } ) { testSuite { id name sitePrimary } } }"}' https://api.lumar.io/graphql ``` -------------------------------- ### Native Dependencies Configuration Source: https://api-docs.lumar.io/docs/custom-metrics This section explains how to include native dependencies in your custom metrics by listing them in the `externalPackages` array within your configuration files. ```APIDOC ## Native Dependencies Configuration ### Description You can use native dependencies in your custom metrics by including them in the `externalPackages` array in the `.oreorc.json` or `.oreorc.ts` file. 
You also need to have them in your `package.json` dependencies so Deepcrawl can install the correct version.

### Configuration Files
- `.oreorc.json`
- `.oreorc.ts`

### JSON Configuration Example
```json
{
  "externalPackages": ["sharp"]
}
```

### TypeScript Configuration Example
```typescript
import type { IContainerConfigData } from "@deepcrawl/oreo";

const config: IContainerConfigData = {
  externalPackages: ["sharp"],
};

export default config;
```
```

--------------------------------

### Project Creation Input Parameters

Source: https://api-docs.lumar.io/docs/schema/inputs/create-project-input

This section outlines the various input parameters available when creating a new website project. These parameters allow for detailed configuration of crawling behavior, data extraction, and other project-specific settings.

```APIDOC
## Project Creation Input Parameters

### Description
This section details the input parameters for the `CreateProject` mutation, covering crawling configurations, data handling, and general project settings.

### Method
POST

### Endpoint
`/graphql`

### Parameters
#### Request Body
- **block3rdPartyCookies** (Boolean!) - Non-null. Whether to block third-party cookies during crawling.
- **compareToCrawl** (CompareToCrawlType!) - Non-null. Specifies the crawl comparison type.
- **crawlDisallowedUrls1stLevel** (Boolean!) - Non-null. Whether to crawl disallowed URLs on the first level.
- **crawlHyperlinksExternal** (Boolean!) - Non-null. Whether to crawl external hyperlinks.
- **crawlHyperlinksInternal** (Boolean!) - Non-null. Whether to crawl internal hyperlinks.
- **crawlImagesExternal** (Boolean!) - Non-null. Whether to crawl external images.
- **crawlImagesInternal** (Boolean!) - Non-null. Whether to crawl internal images.
- **crawlNofollowHyperlinks** (Boolean!) - Non-null. Whether to crawl hyperlinks with `nofollow` attribute.
- **crawlNonHtml** (Boolean!) - Non-null. Whether to crawl non-HTML files.
- **crawlNotIncluded1stLevel** (Boolean!)
- Non-null. Whether to crawl URLs not explicitly included on the first level. - **crawlRedirectsExternal** (Boolean!) - Non-null. Whether to follow external redirects. - **crawlRedirectsInternal** (Boolean!) - Non-null. Whether to follow internal redirects. - **crawlRelAmphtmlExternal** (Boolean!) - Non-null. Whether to crawl external `amphtml` links. - **crawlRelAmphtmlInternal** (Boolean!) - Non-null. Whether to crawl internal `amphtml` links. - **crawlRelCanonicalsExternal** (Boolean!) - Non-null. Whether to crawl external canonical links. - **crawlRelCanonicalsInternal** (Boolean!) - Non-null. Whether to crawl internal canonical links. - **crawlRelHreflangsExternal** (Boolean!) - Non-null. Whether to crawl external `hreflang` links. - **crawlRelHreflangsInternal** (Boolean!) - Non-null. Whether to crawl internal `hreflang` links. - **crawlRelMobileExternal** (Boolean!) - Non-null. Whether to crawl external mobile links. - **crawlRelMobileInternal** (Boolean!) - Non-null. Whether to crawl internal mobile links. - **crawlRelNextPrevExternal** (Boolean!) - Non-null. Whether to crawl external next/prev links. - **crawlRelNextPrevInternal** (Boolean!) - Non-null. Whether to crawl internal next/prev links. - **crawlRobotsTxtNoindex** (Boolean!) - Non-null. Whether to respect `noindex` directives in `robots.txt`. - **crawlScriptsExternal** (Boolean!) - Non-null. Whether to crawl external scripts. - **crawlScriptsInternal** (Boolean!) - Non-null. Whether to crawl internal scripts. - **crawlStylesheetsExternal** (Boolean!) - Non-null. Whether to crawl external stylesheets. - **crawlStylesheetsInternal** (Boolean!) - Non-null. Whether to crawl internal stylesheets. - **crawlTestSite** (Boolean!) - Non-null. Whether to perform a test crawl of the site. - **crawlTypes** ([CrawlType!]!) - Non-null. List of crawl types to perform. - **customDns** ([CustomDnsSettingInput!]!) - Non-null. Custom DNS settings. - **customExtractions** ([CustomExtractionSettingInput!]!) 
- Non-null. Custom extraction rules. - **customRequestHeaders** ([CustomRequestHeaderInput!]!) - Non-null. Custom request headers. - **dataLayerName** (String) - The name of the data layer. - **dataOnlyCrawlTypes** ([CrawlType!]) - List of crawl types for data-only crawls. - **discoverSitemapsInRobotsTxt** (Boolean!) - Non-null. Whether to discover sitemaps in `robots.txt`. - **duplicatePrecision** (Float!) - Non-null. Precision level for detecting duplicate content. - **emptyPageThreshold** (Int!) - Non-null. Threshold for considering a page empty. - **enableKeyValueStore** (Boolean!) - Non-null. Whether to enable the key-value store. - **excludeUrlPatterns** ([String!]!) - Non-null. List of URL patterns to exclude from crawling. - **excludedDatasources** ([DatasourceCode!]) - List of data sources to exclude. - **failureRateLimitEnabled** (Boolean!) - Non-null. Whether failure rate limiting is enabled. - **failureRateLookbackWindow** (Int!) - Non-null. Lookback window for failure rate calculation. - **failureRateThreshold** (Float!) - Non-null. Threshold for failure rate. - **flattenIframes** (Boolean!) - Non-null. Whether to flatten iframe content. - **flattenShadowDom** (Boolean!) - Non-null. Whether to flatten shadow DOM content. - **gaDateRange** (Int!) - Non-null. Date range for Google Analytics data. - **ignoreInvalidSSLCertificate** (Boolean!) - Non-null. Whether to ignore invalid SSL certificates. - **ignoreRobotsForNavigationRequests** (Boolean!) - Non-null. Whether to ignore `robots.txt` for navigation requests. - **ignoreXRobots** (Boolean!) - Non-null. Whether to ignore `X-Robots-Tag` headers. - **includeBestPractices** (Boolean!) - Non-null. Whether to include best practices checks. - **includeHttpAndHttps** (Boolean!) - Non-null. Whether to include both HTTP and HTTPS versions of URLs. - **includeSubdomains** (Boolean!) - Non-null. Whether to include subdomains in crawling. - **includeUrlPatterns** ([String!]!) - Non-null. 
List of URL patterns to include in crawling. - **industryCode** (String) - The industry code for the project. - **limitLevelsMax** (Int) - Maximum crawl depth. - **limitPagesMax** (Int) - Maximum number of pages to crawl. - **locationCode** (LocationCode!) - Non-null. The location code for the project. - **logSummaryRequestsHigh** (Int!) - Non-null. High threshold for logging summary requests. - **logSummaryRequestsLow** (Int!) - Non-null. Low threshold for logging summary requests. - **maxBodyContentLength** (Int!) - Non-null. Maximum body content length to process. ### Request Example ```json { "query": "mutation CreateProject($input: CreateProjectInput!) { createProject(input: $input) { id name } }", "variables": { "input": { "block3rdPartyCookies": true, "compareToCrawl": "ALL_PAGES", "crawlDisallowedUrls1stLevel": false, "crawlHyperlinksExternal": true, "crawlHyperlinksInternal": true, "crawlImagesExternal": false, "crawlImagesInternal": true, "crawlNofollowHyperlinks": false, "crawlNonHtml": false, "crawlNotIncluded1stLevel": false, "crawlRedirectsExternal": true, "crawlRedirectsInternal": true, "crawlRelAmphtmlExternal": false, "crawlRelAmphtmlInternal": false, "crawlRelCanonicalsExternal": true, "crawlRelCanonicalsInternal": true, "crawlRelHreflangsExternal": false, "crawlRelHreflangsInternal": false, "crawlRelMobileExternal": false, "crawlRelMobileInternal": false, "crawlRelNextPrevExternal": false, "crawlRelNextPrevInternal": false, "crawlRobotsTxtNoindex": true, "crawlScriptsExternal": false, "crawlScriptsInternal": false, "crawlStylesheetsExternal": false, "crawlStylesheetsInternal": false, "crawlTestSite": false, "crawlTypes": ["ALL_PAGES"], "customDns": [], "customExtractions": [], "customRequestHeaders": [], "dataLayerName": "dataLayer", "dataOnlyCrawlTypes": [], "discoverSitemapsInRobotsTxt": true, "duplicatePrecision": 0.9, "emptyPageThreshold": 100, "enableKeyValueStore": true, "excludeUrlPatterns": [], "excludedDatasources": [], 
"failureRateLimitEnabled": false, "failureRateLookbackWindow": 60, "failureRateThreshold": 0.1, "flattenIframes": false, "flattenShadowDom": false, "gaDateRange": 30, "ignoreInvalidSSLCertificate": false, "ignoreRobotsForNavigationRequests": false, "ignoreXRobots": false, "includeBestPractices": true, "includeHttpAndHttps": false, "includeSubdomains": false, "includeUrlPatterns": ["https://example.com/*"], "industryCode": "tech", "limitLevelsMax": 10, "limitPagesMax": 1000, "locationCode": "US", "logSummaryRequestsHigh": 500, "logSummaryRequestsLow": 100, "maxBodyContentLength": 1048576 } } } ``` ### Response #### Success Response (200) - **id** (String) - The unique identifier of the created project. - **name** (String) - The name of the created project. #### Response Example ```json { "data": { "createProject": { "id": "proj_12345", "name": "Example Website Project" } } } ``` ``` -------------------------------- ### Handler Configuration with Secrets and Params Source: https://api-docs.lumar.io/docs/custom-metrics This section explains how to pair secrets and parameters types with handler configurations to enable type-safe access to runtime data. ```APIDOC ## Handler Configuration with Secrets and Params ### Description Pairing `secretsTypeName` / `secretsTypePath` with `paramsTypeName` / `paramsTypePath` allows the CLI to identify the exact TypeScript interfaces for secrets and runtime parameters your handlers expect. Secrets become available at runtime via `process.env`, and parameters are exposed on each handler invocation through `context.params`. Wiring these interfaces into `MetricScriptHandler` generics provides full IntelliSense for both container input (`IRequestContainerInput`) and the parameter structure. 
### Example Configuration
```typescript
import type { IContainerConfigData } from "@deepcrawl/oreo";

const config: IContainerConfigData = {
  // Interfaces describing the secrets and runtime params the handlers expect.
  secretsTypeName: "MySecrets",
  secretsTypePath: "src/metrics-types.ts",
  paramsTypeName: "MyParams",
  paramsTypePath: "src/metrics-types.ts",
  handlers: {
    request: {
      entrypoint: "src/index.ts",
      handler: "myHandler",
      metricsTypeName: "MyMetrics",
    },
  },
};
```

### TypeScript Interfaces Example
```typescript
import type {
  IRequestContainerInput,
  MetricScriptHandler,
  MetricScriptParamsType,
  MetricScriptSecretsType,
} from "@deepcrawl/custom-metric-types";
import { MetricScriptBasicOutput } from "@deepcrawl/custom-metric-types";

export interface MySecrets extends MetricScriptSecretsType {
  OPENAI_API_KEY: string | null | undefined;
}

export interface MyParams extends MetricScriptParamsType {
  extractionRegex: string | null | undefined;
}

export interface MyMetrics extends MetricScriptBasicOutput {
  /**
   * @title Page Title
   * @description The title of the page.
   */
  pageTitle: string;
}

export const myHandler: MetricScriptHandler = (input, context) => {
  // Secrets are exposed through process.env at runtime.
  const openAiKey = process.env["OPENAI_API_KEY"];

  // Only emit metrics for document resources during the request phase.
  if (input.phase === "request" && input.resourceType === "document") {
    return {
      pageTitle: document.title,
    };
  }

  return undefined;
};
```
```

--------------------------------

### Create Google Search Console Configuration

Source: https://api-docs.lumar.io/docs/schema/inputs/create-google-search-console-configuration-input

Creates a new Google Search Console configuration for a given project. This endpoint allows detailed specification of search query filters, date ranges, and property settings.

```APIDOC
## POST /websites/api-docs_lumar_io/createGoogleSearchConsoleConfiguration

### Description
Creates a new Google Search Console configuration with specified parameters.

### Method
POST

### Endpoint
/websites/api-docs_lumar_io/createGoogleSearchConsoleConfiguration

### Parameters
#### Request Body
- **country** (String) - Optional - The country to filter search results by.
- **excludeQueries** (Array[String]) - Optional - A list of search queries to exclude.
- **includeQueries** (Array[String]) - Optional - A list of search queries to include.
- **lastNDays** (Int!) - Required - The number of most recent days of data to include.
- **minClicks** (Int!) - Required - The minimum number of clicks for a search query to be included.
- **projectId** (ObjectID!) - Required - The ID of the project to associate this configuration with.
- **searchQueriesMinClicks** (Int!) - Required - Minimum clicks for search queries.
- **searchType** (GoogleSearchConsoleSearchType!) - Required - The type of search to perform (e.g., `web`, `images`).
- **useSearchConsolePropertyDomainsAsStartUrls** (Boolean!) - Required - Whether to use Search Console property domains as start URLs.

### Request Example
```json
{
  "country": "USA",
  "excludeQueries": ["example query 1", "another query"],
  "includeQueries": ["specific query"],
  "lastNDays": 30,
  "minClicks": 100,
  "projectId": "60d5ec49f2c3a4001f8f1b3d",
  "searchQueriesMinClicks": 50,
  "searchType": "web",
  "useSearchConsolePropertyDomainsAsStartUrls": true
}
```

### Response
#### Success Response (200)
- **success** (Boolean) - Indicates whether the operation was successful.
- **message** (String) - A message describing the result of the operation.

#### Response Example
```json
{
  "success": true,
  "message": "Google Search Console configuration created successfully."
}
```
```

--------------------------------

### Set Project-Level Secret (CLI)

Source: https://api-docs.lumar.io/docs/custom-metrics

Override container-level secrets for a specific project using the CLI command. This is useful when a project requires a unique secret value.

```APIDOC
## CLI Command

### Description
Sets a secret at the project level using the `oreo` CLI.
This will override any container-level secret with the same name for the specified project.

### Command
```bash
npm run oreo metric secret set -- --name <name> --projectId <projectId> --value <value>
```

### Parameters
- **--name** (string) - Required - The name of the secret.
- **--projectId** (ID) - Required - The ID of the project.
- **--value** (string) - Required - The value of the secret.

### Example
```bash
npm run oreo metric secret set -- --name OPENAI_APIKEY --projectId 123456 --value "mySecretKey"
```
```

--------------------------------

### Create Basic Project

Source: https://api-docs.lumar.io/docs/graphql/create-project

Creates a Basic project that generates crawl metrics and allows for custom metrics and reports.

```APIDOC
## POST /api/graphql

### Description
Creates a Basic project, suitable for generating crawl metrics and adding extensions for custom metrics and reports.

### Method
POST

### Endpoint
/api/graphql

### Parameters
#### Request Body
- **input** (CreateProjectInput!) - Required - Input object for creating a project.
  - **accountId** (String!) - Required - The account ID in which to create the project.
  - **name** (String!) - Required - The name of the project.
  - **primaryDomain** (String!) - Required - The primary domain of the project.

### Request Example
```json
{
  "query": "mutation CreateBasicProject($input: CreateProjectInput!) { createBasicProject(input: $input) { project { ...ProjectDetails } } } fragment ProjectDetails on Project { id name primaryDomain }",
  "variables": {
    "input": {
      "accountId": "TjAwN0FjY291bnQ3MTU",
      "name": "www.lumar.io Basic Project",
      "primaryDomain": "https://www.lumar.io/"
    }
  }
}
```

### Response
#### Success Response (200)
- **data.createBasicProject.project.id** (String) - The unique identifier for the created project.
- **data.createBasicProject.project.name** (String) - The name of the created project.
- **data.createBasicProject.project.primaryDomain** (String) - The primary domain of the created project.

#### Response Example
```json
{
  "data": {
    "createBasicProject": {
      "project": {
        "id": "TjAwN1Byb2plY3Q2MTMz",
        "name": "www.lumar.io Basic Project",
        "primaryDomain": "https://www.lumar.io/"
      }
    }
  }
}
```
```

--------------------------------

### Create Basic Project Mutation and Variables

Source: https://api-docs.lumar.io/docs/graphql/create-project

This snippet shows the GraphQL mutation for creating a basic project and the corresponding variables object. It requires `accountId`, `name`, and `primaryDomain`. Optional fields can be explored in `CreateProjectInput`.

```graphql
mutation CreateBasicProject($input: CreateProjectInput!) {
  createBasicProject(input: $input) {
    project {
      ...ProjectDetails
    }
  }
}

fragment ProjectDetails on Project {
  id
  name
  primaryDomain
  # ...other fields you want to retrieve
}
```

```json
{
  "input": {
    "accountId": "TjAwN0FjY291bnQ3MTU",
    "name": "www.lumar.io Basic Project",
    "primaryDomain": "https://www.lumar.io/"
  }
}
```
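
The mutation and variables above can be combined into a single JSON request body, and the documented response shape can be unpacked in a type-safe way. The following is a minimal sketch, not an official client: the helper names (`buildCreateBasicProjectBody`, `extractProject`) and the local interfaces are hypothetical, and the sketch assumes the server returns the standard GraphQL `data` / `errors` envelope. It deliberately stops short of sending the request, because the endpoint URL and authentication headers are not specified in this section.

```typescript
// Hypothetical helpers for illustration only; not part of any Lumar SDK.

// The three fields the variables example above lists as required.
interface CreateBasicProjectInput {
  accountId: string;
  name: string;
  primaryDomain: string;
}

// Fields selected by the ProjectDetails fragment.
interface ProjectDetails {
  id: string;
  name: string;
  primaryDomain: string;
}

// Assumed standard GraphQL response envelope: `data` and/or `errors`.
interface CreateBasicProjectResponse {
  data?: { createBasicProject?: { project?: ProjectDetails } } | null;
  errors?: { message: string }[];
}

const CREATE_BASIC_PROJECT_MUTATION = `
mutation CreateBasicProject($input: CreateProjectInput!) {
  createBasicProject(input: $input) {
    project {
      ...ProjectDetails
    }
  }
}

fragment ProjectDetails on Project {
  id
  name
  primaryDomain
}
`;

// Serialises the mutation and its variables into one JSON request body.
function buildCreateBasicProjectBody(input: CreateBasicProjectInput): string {
  return JSON.stringify({
    query: CREATE_BASIC_PROJECT_MUTATION,
    variables: { input },
  });
}

// Returns the created project, or throws if the response reports errors
// or does not contain a project.
function extractProject(response: CreateBasicProjectResponse): ProjectDetails {
  if (response.errors && response.errors.length > 0) {
    throw new Error(response.errors.map((e) => e.message).join("; "));
  }
  const project = response.data?.createBasicProject?.project;
  if (!project) {
    throw new Error("Response did not contain a project");
  }
  return project;
}

// Usage with the example values from this section.
const body = buildCreateBasicProjectBody({
  accountId: "TjAwN0FjY291bnQ3MTU",
  name: "www.lumar.io Basic Project",
  primaryDomain: "https://www.lumar.io/",
});

const project = extractProject({
  data: {
    createBasicProject: {
      project: {
        id: "TjAwN1Byb2plY3Q2MTMz",
        name: "www.lumar.io Basic Project",
        primaryDomain: "https://www.lumar.io/",
      },
    },
  },
});

console.log(body.length > 0, project.id);
```

The resulting `body` string can be sent as the payload of an HTTP POST with a `Content-Type: application/json` header using any HTTP client. Keeping the request construction and response parsing separate from the transport makes both easy to unit test without network access.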