### Getting Started Options

Source: https://github.com/capevace/data-wizard-docs/blob/main/introduction.mdx

Choose between Data Wizard Cloud for an easy, hosted experience or self-hosting with a Docker Container for full control and integration capabilities.

```html
<CardGroup cols={2}>
    <Card title="Try it now with Data Wizard Cloud" icon="cloud" href="./cloud">
        The easiest way to use Data Wizard! A hosted, ready-to-use version that requires no installation.
    </Card>

    <Card title="Self-Host with Docker Container" icon="docker" href="./quick-start">
        For full control and integration capabilities. Install and run Data Wizard locally or on your own infrastructure using Docker. Ideal for developers.
    </Card>
</CardGroup>
```

--------------------------------

### Deploy Data Wizard with Docker Compose

Source: https://github.com/capevace/data-wizard-docs/blob/main/quick-start.mdx

Defines and deploys the Data Wizard service using Docker Compose, configuring ports, volumes, and environment variables for a complete setup.

```yaml
version: '3.8'

services:
  data-wizard:
    # Compose uses `container_name`, not `name`, to set a fixed container name
    container_name: data-wizard
    image: mateffy/data-wizard:latest
    ports:
      - "9090:80"
      - "4430:443"
      - "4430:443/udp"
    volumes:
      - data_wizard_storage:/app/storage
      - data_wizard_sqlite_data:/app/database
      - data_wizard_caddy_data:/data
      - data_wizard_caddy_config:/config
    environment:
      - APP_KEY=base64:[REPLACE_WITH_KEY]

# Named volumes are declared at the top level, not under `services`
volumes:
  data_wizard_storage:
  data_wizard_sqlite_data:
  data_wizard_caddy_data:
  data_wizard_caddy_config:
```

--------------------------------

### Install Mintlify CLI

Source: https://github.com/capevace/data-wizard-docs/blob/main/README.md

Installs the Mintlify Command Line Interface globally using npm. This tool is essential for previewing documentation changes locally.

```bash
npm i -g mintlify
```

--------------------------------

### Run Data Wizard with Docker

Source: https://github.com/capevace/data-wizard-docs/blob/main/quick-start.mdx

Launches the Data Wizard Docker container, mapping necessary ports and volumes for persistent storage and setting the essential APP_KEY environment variable.

```bash
docker run \
  --name data-wizard \
  -p 9090:80 \
  -p 4430:443 \
  -p 4430:443/udp \
  -v data_wizard_storage:/app/storage \
  -v data_wizard_sqlite_data:/app/database \
  -v data_wizard_caddy_data:/data \
  -v data_wizard_caddy_config:/config \
  -e APP_KEY=base64:[REPLACE_WITH_KEY] \
  mateffy/data-wizard:latest
```

--------------------------------

### Start Local Development Server

Source: https://github.com/capevace/data-wizard-docs/blob/main/README.md

Starts the Mintlify local development server. This command should be run from the root directory of your documentation project, where the 'mint.json' file is located.

```bash
mintlify dev
```

--------------------------------

### Generate APP_KEY

Source: https://github.com/capevace/data-wizard-docs/blob/main/quick-start.mdx

Generates a random, base64-encoded 32-byte APP_KEY, which Data Wizard requires for its security. Make sure to include the `-base64` flag.

```bash
openssl rand -base64 32
```
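Where `openssl` is not available, a key of the same shape can be produced with Python's standard library. This is a minimal sketch, assuming Data Wizard only needs 32 random bytes in base64 form with the `base64:` prefix shown in the Docker examples; the function name `generate_app_key` is hypothetical.

```python
import base64
import secrets


def generate_app_key() -> str:
    """Return 32 random bytes, base64-encoded and prefixed with 'base64:'."""
    raw = secrets.token_bytes(32)  # cryptographically secure random bytes
    return "base64:" + base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    # Paste the printed value into the APP_KEY environment variable.
    print(generate_app_key())
```

`secrets` is used rather than `random` because the key is security-sensitive.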

--------------------------------

### More Snippet Example

Source: https://github.com/capevace/data-wizard-docs/blob/main/examples/real-estate-properties-from-exposes.mdx

Imports the shared `snippets/more.mdx` snippet and renders it in the page with `<More/>`.

```markdown
import More from '/snippets/more.mdx'

<More/>
```

--------------------------------

### Product Data Extraction from Brochures Example

Source: https://github.com/capevace/data-wizard-docs/blob/main/introduction.mdx

Shows how to extract product names and prices from online brochures. This example is valuable for market research, competitor analysis, and catalog management.

```mdx
import More from '/snippets/more.mdx';

<More />
```

--------------------------------

### Example Product Brochure Data Output

Source: https://github.com/capevace/data-wizard-docs/blob/main/examples/products-from-brochures.mdx

An example of the JSON output generated by the product brochure extractor, showing extracted product names and prices.

```json
[
    {
        "name": "Bottle of red wine",
        "original_price": 14.99,
        "discounted_price": 9.99
    },
    {
        "name": "Bottle of white wine",
        "original_price": 12.99,
        "discounted_price": 6.99
    }
]
```
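Downstream code might post-process this output, for example to compute the discount on each product. The following is a small sketch that uses the sample values above; the helper name `discount_percent` is hypothetical and not part of Data Wizard.

```python
import json

# Sample output copied from the example above.
SAMPLE = """
[
    {"name": "Bottle of red wine", "original_price": 14.99, "discounted_price": 9.99},
    {"name": "Bottle of white wine", "original_price": 12.99, "discounted_price": 6.99}
]
"""


def discount_percent(product: dict):
    """Return the discount in percent (one decimal), or None if not discounted."""
    discounted = product.get("discounted_price")
    if discounted is None:
        return None
    original = product["original_price"]
    return round((original - discounted) / original * 100, 1)


products = json.loads(SAMPLE)
for product in products:
    print(product["name"], discount_percent(product))
```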

--------------------------------

### Real Estate Expose Data Extraction Example

Source: https://github.com/capevace/data-wizard-docs/blob/main/introduction.mdx

Provides an example of extracting structured data from real estate exposes. This includes property details, pricing information, and location data, useful for real estate data management.

```mdx
import More from '/snippets/more.mdx';

<More />
```

--------------------------------

### More LLM Information

Source: https://github.com/capevace/data-wizard-docs/blob/main/configure-llm.mdx

Imports the shared `snippets/more.mdx` snippet into the LLM configuration page and renders it with `<More />`.

```mdx
import More from '/snippets/more.mdx';

<More />
```

--------------------------------

### Implement Strategy Logic

Source: https://github.com/capevace/data-wizard-docs/blob/main/custom-strategies.mdx

Provides examples of implementing custom extraction logic within the `run()` method of a custom strategy. The first example shows a basic implementation, while the second demonstrates validating invoice data by checking the total against the sum of line items.

```php
use Mateffy\Magic\Extraction\Strategies\Extractor;

class MyCustomStrategy extends Extractor
{
    public function run(array $artifacts): array
    {
        // Implement your strategy here
    }
}
```

```php
// Assumption: ParallelStrategy lives in the same namespace as Extractor.
use Mateffy\Magic\Extraction\Strategies\ParallelStrategy;
use Mateffy\Magic\Exceptions\JsonSchemaValidationError;

class ValidatedInvoiceStrategy extends ParallelStrategy
{
    public function run(array $artifacts): array
    {
        $data = parent::run($artifacts);

        // Validate an invoice total:

        $total = 0;

        foreach ($data['line_items'] as $item) {
            $total += $item['amount'];
        }

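        // Compare with a small tolerance: float sums rarely match exactly,
        // and `!==` would also reject an int total against a float sum.
        if (abs($total - $data['total']) > 0.005) {
            // No leading backslash: the class is imported via `use` above.
            throw new JsonSchemaValidationError(
                'Invoice total does not match the sum of line items'
            );
        }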

        // The returned data is now
        // guaranteed to be valid invoice data.
        return $data;
    }
}
```
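The validation step in the second example is independent of PHP and can be sketched on its own. The Python version below is illustrative only: it assumes the same `line_items`, `amount`, and `total` keys as the PHP code, and `InvoiceValidationError` is a hypothetical stand-in for the library's exception.

```python
import math


class InvoiceValidationError(ValueError):
    """Hypothetical stand-in for a schema validation error."""


def validate_invoice_total(data: dict, tolerance: float = 0.005) -> dict:
    """Return `data` unchanged if the line items add up to the total."""
    line_sum = sum(item["amount"] for item in data["line_items"])

    # Compare with a tolerance, since float sums are rarely exact.
    if not math.isclose(line_sum, data["total"], abs_tol=tolerance):
        raise InvoiceValidationError(
            "Invoice total does not match the sum of line items"
        )
    return data
```

The half-cent tolerance is a design choice for two-decimal currencies; amounts stored as integer cents could be compared exactly instead.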

--------------------------------

### Learn how to extract some data

Source: https://github.com/capevace/data-wizard-docs/blob/main/snippets/more.mdx

Step by step guide to extract data from documents using Data Wizard.

```markdown
Step by step guide to extract data from documents using Data Wizard.
```

--------------------------------

### Example contents.json Structure

Source: https://github.com/capevace/data-wizard-docs/blob/main/preprocessing.mdx

Illustrates the structure of the `contents.json` file, detailing how document content is organized into Slices, including text, images, and page information with their respective properties.

```json
[
  {
    "page": 1,
    "type": "text",
    "text": "This is the text on the first page of the document. Lorem ipsum dolor sit amet..."
  },
  {
    "page": 1,
    "type": "image",
    "mimetype": "image/jpeg",
    "path": "images/image1.jpg",
    "x": 455.0,
    "y": 28.55999755859375,
    "width": 88.32000732421875,
    "height": 688.3200073242188
  },
  {
    "page": 1,
    "type": "page-image",
    "mimetype": "image/jpeg",
    "path": "pages/page1.jpg"
  },
  {
    "page": 1,
    "type": "page-image-marked",
    "mimetype": "image/jpeg",
    "path": "pages_marked/page1.jpg"
  }
]
```
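A consumer of `contents.json` will usually want to separate the slices by type or collect the text of a page. The sketch below shows one way to do that with the fields from the example above; the helper names are hypothetical, and the sample is shortened.

```python
import json
from collections import defaultdict

# Shortened version of the example above.
SAMPLE_CONTENTS = """
[
  {"page": 1, "type": "text", "text": "This is the text on the first page."},
  {"page": 1, "type": "image", "mimetype": "image/jpeg", "path": "images/image1.jpg"},
  {"page": 1, "type": "page-image", "mimetype": "image/jpeg", "path": "pages/page1.jpg"},
  {"page": 1, "type": "page-image-marked", "mimetype": "image/jpeg", "path": "pages_marked/page1.jpg"}
]
"""


def group_slices_by_type(slices: list) -> dict:
    """Map each slice type (text, image, ...) to the slices of that type."""
    grouped = defaultdict(list)
    for item in slices:
        grouped[item["type"]].append(item)
    return dict(grouped)


def page_text(slices: list, page: int) -> str:
    """Join the text of all text slices on the given page."""
    return "\n".join(
        item["text"]
        for item in slices
        if item["type"] == "text" and item["page"] == page
    )


slices = json.loads(SAMPLE_CONTENTS)
grouped = group_slices_by_type(slices)
```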

--------------------------------

### Start Extraction

Source: https://github.com/capevace/data-wizard-docs/blob/main/extracting-data.mdx

Initiate data extraction from a bucket or an extractor. View progress and final data in raw JSON or GUI format.

```markdown
## Run inside Data Wizard

<Steps>
    <Step title="Start an Extraction through a Bucket or an Extractor">
        All of the data that Data Wizard generates is viewable on the run page. This includes in-progress data as it's being generated, as well as the final extracted data.
        You can view the data both as raw JSON or in the GUI derived from the JSON schema.

        <p align="center">
            <Frame caption="Launch button in bucket and extractor views">
                <img alt="Launch from Extractor" src="/images/screenshots/extractors/start-menu.png" />
            </Frame>

        </p>

        <Tabs>
            <Tab title="Launch from Extractor">
                ![Launch from Extractor](./images/screenshots/extractors/start.png)
            </Tab>
            <Tab title="Launch from Bucket">
                ![Launch from Bucket](./images/screenshots/buckets/start.png)
            </Tab>
        </Tabs>
    </Step>
    <Step title="View the extracted data in the GUI or as JSON">
        All of the data that Data Wizard generates is viewable on the run page. This includes in-progress data as it's being generated, as well as the final extracted data.
        You can view the data both as raw JSON or in the GUI derived from the JSON schema.

        <Tabs>
            <Tab title="View Data in GUI">
                You can use the built-in UI to create and configure your extractor.

                ![View Data in GUI](./images/screenshots/run/run-gui.png)
            </Tab>
            <Tab title="View Data as JSON">
                You can use the built-in UI to create and configure your extractor.
                ![View Data as JSON](./images/screenshots/run/run-json.png)
            </Tab>
        </Tabs>
    </Step>

    <Step title="Inspect each extraction step">
        You can inspect each step of the extraction process to see how the AI is interpreting your instructions and what data it is returning.

        <Tabs>
            <Tab title="LLM Prompt and Responses">
                You can use the built-in UI to create and configure your extractor.

                ![LLM Prompt and Responses](./images/screenshots/run/run-chat-1.png)
            </Tab>
            <Tab title="Supports Images and Tool Calls">
                You can use the built-in UI to create and configure your extractor.
                ![Supports Images and Tool Calls](./images/screenshots/run/run-chat-2.png)
            </Tab>
        </Tabs>
    </Step>
    <Step title="Download the Data as JSON, XML or CSV">
       asd
    </Step>
    <Step title="Restart the extraction with different parameters">
       asd
    </Step>

    <Step title="Modify the data using AI">
       asd
    </Step>

    <Step title="Analyze extraction costs">
       asd
    </Step>

</Steps>
```

--------------------------------

### Customer Feedback JSON Output Example

Source: https://github.com/capevace/data-wizard-docs/blob/main/examples/customer-feedback-to-data.mdx

An example of the structured JSON data produced by the customer feedback extractor. This format captures key details like submission date, customer information, feedback text, rating, and suggestions.

```json
{
    "formType": "Customer Feedback",
    "submissionDate": "2024-09-08",
    "customerName": "Jane Doe",
    "email": "jane.doe@example.com",
    "feedbackText": "The service was excellent, and the staff were very friendly. I especially appreciated the quick response time to my inquiry.",
    "rating": 5,
    "suggestions": "Perhaps offer more variety in your product catalog."
}
```

--------------------------------

### Customer Feedback to JSON Conversion Example

Source: https://github.com/capevace/data-wizard-docs/blob/main/introduction.mdx

Illustrates transforming customer feedback, whether handwritten or printed, into structured JSON format. This facilitates easier analysis and aids in service improvement initiatives.

```mdx
import More from '/snippets/more.mdx';

<More />
```

--------------------------------

### More Snippet

Source: https://github.com/capevace/data-wizard-docs/blob/main/extractors.mdx

Imports the shared `snippets/more.mdx` snippet into the extractors page and renders it with `<More />`.

```markdown
import More from '/snippets/more.mdx';

<More />
```

--------------------------------

### Tax Form Data Example

Source: https://github.com/capevace/data-wizard-docs/blob/main/examples/data-from-paper-tax-forms.mdx

Example of structured JSON output for processed tax form data, including taxpayer information, financial figures, and due dates.

```json
{
    "formType": "Tax Form 1040",
    "taxYear": 2023,
    "taxpayerID": "12-3456789",
    "filingStatus": "Single",
    "income": 75000.00,
    "deductions": 12000.00,
    "taxLiability": 15000.00,
    "paymentDueDate": "2024-04-15"
}
```

--------------------------------

### Run Data Wizard with Docker

Source: https://github.com/capevace/data-wizard-docs/blob/main/deployment.mdx

Starts the Data Wizard Docker container, mapping necessary ports and volumes for persistent storage and configuration. It includes options for HTTP/HTTPS access, data persistence, and setting the essential APP_KEY environment variable.

```bash
docker run \
  --name data-wizard \
  -p 9090:80 \
  -p 4430:443 \
  -p 4430:443/udp \
  -v data_wizard_storage:/app/storage \
  -v data_wizard_sqlite_data:/app/database \
  -v data_wizard_caddy_data:/data \
  -v data_wizard_caddy_config:/config \
  -e APP_KEY=[REPLACE_WITH_APP_KEY] \
  mateffy/data-wizard:latest
```

--------------------------------

### Example Real Estate Data Output

Source: https://github.com/capevace/data-wizard-docs/blob/main/examples/real-estate-properties-from-exposes.mdx

This JSON structure represents the expected output after extracting data from a real estate exposé. It includes property details, unit information, and artifact IDs for images and floorplans.

```json
{
    "name": "Modern Apartment in City Center",
    "address": "12 Example Street",
    "description_text": "Spacious apartment with modern amenities in a vibrant city center location.",
    "units": [
        {
            "usages": [
                "living"
            ],
            "label": "Apartment 1",
            "floor": "2nd Floor",
            "rent_per_m2": 15.50,
            "images": [
                "artifact:images/image1.png",
                "artifact:images/image2.png"
            ],
            "floorplans": [
                 "artifact:images/image7.png"
            ]
        },
        {
            "usages": [
                "living"
            ],
            "label": "Apartment 2",
            "floor": "3rd Floor",
            "rent_per_m2": 16.00,
            "images": [],
            "floorplans": []
        }
    ],
    "images": [
        "artifact:images/image9.png"
    ],
    "floorplans": []
}
```
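To fetch the referenced files, a consumer first needs the list of artifact references in the output. The following sketch collects them from the top level and from each unit. It assumes that the `artifact:` prefix is followed by a path, as the sample values suggest; the function name is hypothetical.

```python
ARTIFACT_PREFIX = "artifact:"


def collect_artifact_paths(estate: dict) -> list:
    """Return all image and floorplan paths, top level first, then per unit."""
    refs = list(estate.get("images", [])) + list(estate.get("floorplans", []))
    for unit in estate.get("units", []):
        refs += unit.get("images", [])
        refs += unit.get("floorplans", [])

    # Strip the prefix; ignore anything that is not an artifact reference.
    return [
        ref[len(ARTIFACT_PREFIX):]
        for ref in refs
        if ref.startswith(ARTIFACT_PREFIX)
    ]


# Shortened version of the example above.
sample = {
    "units": [
        {
            "images": ["artifact:images/image1.png", "artifact:images/image2.png"],
            "floorplans": ["artifact:images/image7.png"],
        },
        {"images": [], "floorplans": []},
    ],
    "images": ["artifact:images/image9.png"],
    "floorplans": [],
}
print(collect_artifact_paths(sample))
```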

--------------------------------

### Invoice Data Extraction Example

Source: https://github.com/capevace/data-wizard-docs/blob/main/introduction.mdx

Demonstrates extracting structured data from scanned invoices. This includes key information such as invoice numbers, dates, line items, and total amounts. It's useful for automating invoice processing.

```mdx
import More from '/snippets/more.mdx';

<More />
```

--------------------------------

### Tax Forms Data Extraction Example

Source: https://github.com/capevace/data-wizard-docs/blob/main/introduction.mdx

Details the process of extracting structured data from paper tax forms. It covers personal information, income details, deductions, and credits, streamlining tax document processing.

```mdx
import More from '/snippets/more.mdx';

<More />
```

--------------------------------

### Example JSON Schema for Product Extraction

Source: https://github.com/capevace/data-wizard-docs/blob/main/extractors.mdx

This JSON schema defines the structure for extracting product data from supermarket brochures. It includes properties for product name, original price, and discounted price, with validation rules and UI hints.

```json
{
  "type": "object",
  "required": ["products"],
  "properties": {
    "products": {
      "type": "array",
      "magic_ui": "table",
      "items": {
        "type": "object",
        "required": ["name", "original_price"],
        "properties": {
          "name": {
            "type": "string",
            "maxLength": 255,
            "description": "The name of the product."
          },
          "original_price": {
            "type": "number",
            "minimum": 0,
            "multipleOf": 0.01,
            "description": "The original price of the product."
          },
          "discounted_price": {
            "type": ["number", "null"],
            "minimum": 0,
            "multipleOf": 0.01,
            "description": "The discounted price for all customers. Prices only applying to customers with a membership card should not be included here."
          }
        }
      }
    }
  }
}
```
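To make the validation rules concrete, the sketch below checks a document against the same constraints by hand. It is not a full JSON Schema validator and is not how Data Wizard validates internally; the function names are hypothetical, and the `magic_ui` hint is ignored because it only affects display.

```python
import math
from decimal import Decimal


def is_valid_price(value) -> bool:
    """Check `type: number`, `minimum: 0`, and `multipleOf: 0.01`."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    if isinstance(value, float) and not math.isfinite(value):
        return False
    amount = Decimal(str(value))  # str() avoids binary float noise
    return amount >= 0 and amount % Decimal("0.01") == 0


def validate_product(product) -> list:
    """Return a list of error messages for one product (empty if valid)."""
    if not isinstance(product, dict):
        return ["must be an object"]
    errors = []
    name = product.get("name")
    if not isinstance(name, str) or len(name) > 255:
        errors.append("name must be a string of at most 255 characters")
    if not is_valid_price(product.get("original_price")):
        errors.append("original_price must be a non-negative multiple of 0.01")
    discounted = product.get("discounted_price")  # optional, may be null
    if discounted is not None and not is_valid_price(discounted):
        errors.append("discounted_price must be null or a non-negative multiple of 0.01")
    return errors


def validate_document(document: dict) -> list:
    """Validate the top-level object with its required `products` array."""
    products = document.get("products")
    if not isinstance(products, list):
        return ["products must be an array"]
    errors = []
    for index, product in enumerate(products):
        errors += [f"products[{index}]: {e}" for e in validate_product(product)]
    return errors
```

`Decimal` is used for the `multipleOf` check because `14.99 % 0.01` is not exactly zero in binary floating point.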

--------------------------------

### Re-install Dependencies

Source: https://github.com/capevace/data-wizard-docs/blob/main/README.md

Re-installs project dependencies, often used to resolve issues when 'mintlify dev' is not running correctly.

```bash
mintlify install
```

--------------------------------

### Get Buckets API

Source: https://github.com/capevace/data-wizard-docs/blob/main/endpoint/get.mdx

Retrieves the list of all available buckets, which can then be used for further operations such as data extraction or management.

```APIDOC
GET /api/buckets

Description:
  Retrieves a list of all available buckets.

Parameters:
  None

Responses:
  200 OK:
    Description: A list of buckets.
    Content:
      application/json:
        Schema:
          type: array
          items:
            type: object
            properties:
              id:
                type: string
                description: The unique identifier for the bucket.
              name:
                type: string
                description: The name of the bucket.
              createdAt:
                type: string
                format: date-time
                description: The timestamp when the bucket was created.
  500 Internal Server Error:
    Description: An unexpected error occurred on the server.
```
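A client for this endpoint could look roughly like the sketch below. It assumes a JSON array shaped as in the schema above; the base URL, the sample payload, and both function names are hypothetical, and any authentication the server requires is not shown.

```python
import json
import urllib.request
from datetime import datetime


def build_buckets_request(base_url: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for the bucket list."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/buckets",
        headers={"Accept": "application/json"},
    )


def parse_buckets(body: str) -> list:
    """Parse a 200 response body into dicts with a datetime `created_at`."""
    buckets = json.loads(body)
    if not isinstance(buckets, list):
        raise ValueError("expected a JSON array of buckets")
    return [
        {
            "id": bucket["id"],
            "name": bucket["name"],
            # Normalise a trailing 'Z' so older Python versions can parse it.
            "created_at": datetime.fromisoformat(
                bucket["createdAt"].replace("Z", "+00:00")
            ),
        }
        for bucket in buckets
    ]


# Made-up payload for illustration only.
sample_body = '[{"id": "b1", "name": "Invoices", "createdAt": "2024-09-08T10:00:00Z"}]'
print(parse_buckets(sample_body))
```

Sending the request would be `urllib.request.urlopen(build_buckets_request(...))`; a 500 response surfaces as `urllib.error.HTTPError`.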

--------------------------------

### Data Wizard Extraction Workflow

Source: https://github.com/capevace/data-wizard-docs/blob/main/extracting-data.mdx

This snippet outlines the key steps for setting up and running a data extraction task with Data Wizard, from creating an extractor to running it inside Data Wizard or your own application.

```markdown
## Prepare your extraction task

Before we can extract some data, you'll need to tell the wizard what data you want to extract and how to extract it.

<Steps>
    <Step title="Create an extractor for your extraction task">
        You can just describe the shape of data you want to extract, and an AI will generate an initial draft for you.
        ![Create extractor](./images/screenshots/setup/quick-create-extractor.png)

        <br /><br />
    </Step>
    <Step title="Refine your JSON Schema and output instructions">
        Edit the generated schema to your liking and add other instructions for the AI to follow. Read more in the [Extractors](./extractors) section.

        ![Define JSON Schema](./images/screenshots/setup/edit-extractor.png)

        <Card title="Extractors" icon="laptop-code" href="./extractors">
            Extractors are the core configuration objects in Data Wizard
        </Card>

        <br /><br />
    </Step>
    <Step title="Select the LLM to use">
        You can select from a large number of LLMs thanks to the [LLM Magic](https://github.com/Capevace/llm-magic) PHP package.
        You will need to add your API keys in the LLM settings before you can use them in an extractor. Find out more in the [LLM Provider Configuration](./configure-llm) section.

        ![Select the LLM](./images/screenshots/setup/select-model.png)

        <Card title="LLM Provider Configuration" icon="sliders" href="./configure-llm">
            Configure your Large Language Model (LLM) API provider in Data Wizard to connect to leading LLMs like OpenAI, Anthropic, Google AI, Mistral AI, and more.
        </Card>

        <br /><br />
    </Step>
    <Step title="Select the extraction strategy to use">
        There are multiple [built-in strategies](./strategies) to choose from, or you can create your own custom strategy.

        ![Select the extraction strategy](./images/screenshots/setup/select-strategy.png)

        <Card title="Extraction Strategies" icon="code-branch" href="./strategies">
            Learn about the built-in and custom extraction strategies available in Data Wizard.
        </Card>

        <br /><br />
    </Step>
    <Step title="Run the extractor">
        After you have configured your extractor, you can run it to extract data from your documents.

        You can either use the built-in UI to do this, or you can integrate the feature into an existing application using the iFrame and HTTP API.

        <CardGroup cols={2}>
          <Card title="Run inside Data Wizard" icon="hat-wizard" href="#run-inside-data-wizard">
            Via Data Wizard's backend UI
          </Card>
          <Card title="Run inside your own application" icon="server" href="#run-inside-your-own-application">
            Via the embedded iFrame UI
          </Card>
        </CardGroup>
    </Step>
</Steps>
```

--------------------------------

### Data Extraction Workflow

Source: https://github.com/capevace/data-wizard-docs/blob/main/introduction.mdx

Illustrates the general workflow of Data Wizard, showing how input data, LLM configuration, and output format interact to produce extracted and validated JSON data.

```mermaid
graph TB
    Input[Input data] -- PDF / Word files --> Extraction[Data Wizard]
    LLM[LLM Config] -- Prompt & Strategy --> Extraction
    Output[Output format] -- JSON Schema --> Extraction
    Extraction --> Results[Extracted and validated JSON data]
```

--------------------------------

### LLM Provider Configuration

Source: https://github.com/capevace/data-wizard-docs/blob/main/snippets/more.mdx

Set up your Large Language Model API keys.

```markdown
Set up your Large Language Model API keys.
```

--------------------------------

### GraphQL API Endpoint and Example Query

Source: https://github.com/capevace/data-wizard-docs/blob/main/apis.mdx

Data Wizard exposes a GraphQL endpoint for flexible data querying. This section provides the endpoint URL and an example of a GraphQL query to retrieve saved extractors.

```graphql
# Endpoint: https://YOUR_DATA_WIZARD_URL/api/graphql

query {
  savedExtractors {
    collection {
      id
      label
    }
  }
}
```
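The query can be sent from any HTTP client. The sketch below builds the request with Python's standard library, assuming the endpoint follows the common GraphQL-over-HTTP convention of a JSON `POST` body with a `query` field. That convention, the function name, and the absence of authentication headers are assumptions, not documented behaviour.

```python
import json
import urllib.request

SAVED_EXTRACTORS_QUERY = """
query {
  savedExtractors {
    collection {
      id
      label
    }
  }
}
"""


def build_graphql_request(base_url: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the GraphQL query."""
    payload = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/graphql",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


request = build_graphql_request("https://YOUR_DATA_WIZARD_URL", SAVED_EXTRACTORS_QUERY)
# urllib.request.urlopen(request) would send it; not done here.
```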

--------------------------------

### List All Extractions

Source: https://github.com/capevace/data-wizard-docs/blob/main/endpoint/list.mdx

Retrieves a list of all available extractions via the API. This endpoint is used to get an overview of all extraction jobs or configurations within the system.

```APIDOC
GET /api/v1/extractions

Description:
  Lists all extractions.

Response:
  200 OK:
    content:
      application/json:
        schema:
          type: array
          items:
            type: object
            properties:
              id:
                type: string
                description: Unique identifier for the extraction.
              name:
                type: string
                description: Name of the extraction.
              status:
                type: string
                description: Current status of the extraction (e.g., 'completed', 'running', 'failed').
              createdAt:
                type: string
                format: date-time
                description: Timestamp when the extraction was created.
              updatedAt:
                type: string
                format: date-time
                description: Timestamp when the extraction was last updated.
```
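Once the list is retrieved, a typical next step is to summarise it by status. The sketch below works on already-parsed JSON in the shape described above; the sample records and function names are made up for illustration.

```python
from collections import Counter


def count_by_status(extractions: list) -> Counter:
    """Count extractions per status value."""
    return Counter(item["status"] for item in extractions)


def with_status(extractions: list, status: str) -> list:
    """Return the names of all extractions that have the given status."""
    return [item["name"] for item in extractions if item["status"] == status]


# Made-up records for illustration only.
sample_extractions = [
    {"id": "1", "name": "Invoices March", "status": "completed"},
    {"id": "2", "name": "Brochures", "status": "running"},
    {"id": "3", "name": "Tax forms", "status": "failed"},
    {"id": "4", "name": "Invoices April", "status": "completed"},
]
print(count_by_status(sample_extractions))
```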

--------------------------------

### Strategies

Source: https://github.com/capevace/data-wizard-docs/blob/main/snippets/more.mdx

Understand different data processing strategies.

```markdown
Understand different data processing strategies.
```

--------------------------------

### Data Wizard Integration into Application

Source: https://github.com/capevace/data-wizard-docs/blob/main/introduction.mdx

Details how Data Wizard can be integrated into an application, showing the flow from a software UI embedding Data Wizard's UI, through file processing and LLM extraction, to final data delivery.

```mermaid
graph TB
    A[Example software UI] -- Embeds an iFrame --> B[Data Wizard Embedded UI]
    B -- Upload files --> F[Data Wizard Core]
    F -- Extract text and images from files --> G[Artifacts]
    G --> H[Extraction Strategy]
    H -.-> I[LLM]
    I -.-> H
    H -- JSON data --> F
    F <-."Streaming results into\nautomatic UI".-> B
    B -- Final data via download,\nJavaScript API or webhook --> A
```

--------------------------------

### Simple Invoice JSON Output

Source: https://github.com/capevace/data-wizard-docs/blob/main/examples/paper-invoice-to-structured-data.mdx

Example JSON output from the invoice extractor, detailing invoice number, dates, seller and buyer information, line items with quantities and prices, total amounts, and payment details.

```json
{
    "invoiceNumber": "INV-2022-001",
    "issueDate": "2022-01-01",
    "currency": "EUR",
    "seller": {
        "name": "ACME Inc.",
        "address": "123 Main St.",
        "postalCode": "12345",
        "city": "Springfield",
        "country": "US",
        "vatNumber": "US123456789"
    },
    "buyer": {
        "customerNumber": "CUST-123",
        "name": "Buyer Corp.",
        "address": "456 Elm St.",
        "postalCode": "54321",
        "city": "Shelbyville",
        "country": "US"
    },
    "lineItems": [
        {
            "position": 1,
            "description": "Product A",
            "unitPrice": 100.0,
            "quantity": 2,
            "vatRate": 19.0,
            "netAmount": 200.0
        },
        {
            "position": 2,
            "description": "Product B",
            "unitPrice": 50.0,
            "quantity": 3,
            "vatRate": 19.0,
            "netAmount": 150.0
        }
    ],
    "totalAmounts": {
        "netTotal": 350.0,
        "taxTotal": 66.5,
        "grossTotal": 416.5,
        "dueTotal": 416.5
    },
    "paymentDetails": {
        "paymentTerms": "Net 30 days",
        "paymentMethod": "SEPA_TRANSFER",
        "iban": "DE89370400440532013000"
    }
}
```
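The figures in this example are internally consistent, which can be checked mechanically. The sketch below assumes that `vatRate` is a percentage, that `netAmount` excludes tax, and that `grossTotal` is `netTotal + taxTotal`. These relationships are inferred from the sample values rather than stated by the documentation, and the function name is hypothetical.

```python
import math


def check_invoice_totals(invoice: dict, tolerance: float = 0.005) -> list:
    """Return a list of inconsistencies found in the invoice (empty if none)."""
    problems = []
    items = invoice["lineItems"]
    totals = invoice["totalAmounts"]

    def close(a: float, b: float) -> bool:
        return math.isclose(a, b, abs_tol=tolerance)

    # Each line: unitPrice * quantity should equal netAmount.
    for item in items:
        if not close(item["unitPrice"] * item["quantity"], item["netAmount"]):
            problems.append(f"line {item['position']}: netAmount mismatch")

    net = sum(item["netAmount"] for item in items)
    tax = sum(item["netAmount"] * item["vatRate"] / 100 for item in items)

    if not close(net, totals["netTotal"]):
        problems.append("netTotal mismatch")
    if not close(tax, totals["taxTotal"]):
        problems.append("taxTotal mismatch")
    if not close(totals["netTotal"] + totals["taxTotal"], totals["grossTotal"]):
        problems.append("grossTotal mismatch")
    return problems


# Values copied from the example above (unrelated fields omitted).
sample_invoice = {
    "lineItems": [
        {"position": 1, "unitPrice": 100.0, "quantity": 2, "vatRate": 19.0, "netAmount": 200.0},
        {"position": 2, "unitPrice": 50.0, "quantity": 3, "vatRate": 19.0, "netAmount": 150.0},
    ],
    "totalAmounts": {"netTotal": 350.0, "taxTotal": 66.5, "grossTotal": 416.5, "dueTotal": 416.5},
}
print(check_invoice_totals(sample_invoice))
```

For the sample, the line items give 200 + 150 = 350 net and 38 + 28.5 = 66.5 tax, so the gross total is 416.5 and no problems are reported.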

--------------------------------

### Invoice Data Structure Example

Source: https://github.com/capevace/data-wizard-docs/blob/main/examples/paper-invoice-to-structured-data.mdx

This JSON snippet illustrates the expected structure for invoice data, including details about the seller, buyer, line items, and total amounts. It serves as a schema for data extraction and validation.

```json
{
  "invoiceNumber": {
    "type": "string",
    "description": "Unique invoice identifier"
  },
  "issueDate": {
    "type": "string",
    "description": "Date the invoice was issued"
  },
  "currency": {
    "type": "string",
    "description": "Currency code (e.g., EUR, USD)"
  },
  "seller": {
    "type": "object",
    "description": "Information about the seller",
    "properties": {
      "name": {
        "type": "string",
        "description": "Seller's name"
      },
      "address": {
        "type": "string",
        "description": "Seller's address"
      }
    },
    "required": [
      "name",
      "address"
    ]
  },
  "buyer": {
    "type": "object",
    "description": "Information about the buyer",
    "properties": {
      "name": {
        "type": "string",
        "description": "Buyer's name"
      },
      "address": {
        "type": "string",
        "description": "Buyer's address"
      }
    },
    "required": [
      "name",
      "address"
    ]
  },
  "lineItems": {
    "type": "array",
    "description": "List of items or services on the invoice",
    "items": {
      "type": "object",
      "properties": {
        "description": {
          "type": "string",
          "description": "Description of the item/service"
        },
        "quantity": {
          "type": "number",
          "description": "Quantity of the item/service"
        },
        "unitPrice": {
          "type": "number",
          "description": "Price per unit"
        },
        "totalPrice": {
          "type": "number",
          "description": "Total price for the line item"
        }
      },
      "required": [
        "description",
        "quantity",
        "unitPrice",
        "totalPrice"
      ]
    }
  },
  "totalAmounts": {
    "type": "object",
    "description": "Summary of all amounts",
    "properties": {
      "netTotal": {
        "type": "number",
        "description": "Total amount before tax"
      },
      "taxTotal": {
        "type": "number",
        "description": "Total tax amount"
      },
      "grossTotal": {
        "type": "number",
        "description": "Total amount including tax"
      },
      "dueTotal": {
        "type": "number",
        "description": "Total amount due"
      }
    },
    "required": [
      "netTotal",
      "taxTotal",
      "grossTotal",
      "dueTotal"
    ]
  },
  "paymentDetails": {
    "type": "object",
    "description": "Payment information",
    "properties": {
      "paymentTerms": {
        "type": "string",
        "description": "Payment terms"
      },
      "paymentMethod": {
        "type": "string",
        "description": "Payment method",
        "enum": [
          "SEPA_TRANSFER",
          "CREDIT_CARD",
          "PAYPAL"
        ]
      },
      "iban": {
        "type": "string",
        "description": "IBAN for bank transfer"
      }
    },
    "required": [
      "paymentTerms",
      "paymentMethod"
    ]
  }
}
```

--------------------------------

### Add Smart Import Feature to SaaS

Source: https://github.com/capevace/data-wizard-docs/blob/main/introduction.mdx

Enable a 'smart import' feature in your SaaS application by embedding Data Wizard via iFrames or its REST/GraphQL API. Users can upload documents, and extracted data is streamed back in real-time.

```html
<Card title="SaaS company that wants to add smart import feature" icon="upload" iconSize={200}>
    You can offer a "smart import" feature in your SaaS application, allowing users to upload documents and automatically populate your application with the extracted data.

    <br/>

    **Use Case:** You are a SaaS provider for CRM, accounting, or inventory management software and want to offer your users a "smart import" feature.

    <br/>

    **Solution:** Embed Data Wizard directly into your SaaS application using iFrames or the REST/GraphQL API. Provide a seamless user experience by integrating data extraction directly into your workflow. Users can upload documents within your application, and Data Wizard will stream the extracted data back in real-time.
</Card>
```

--------------------------------

### Create Custom Strategy

Source: https://github.com/capevace/data-wizard-docs/blob/main/custom-strategies.mdx

Shows three ways to define a custom strategy: extend the `Extractor` base class, extend an existing strategy such as `SequentialStrategy`, or implement the `Strategy` interface from scratch.

```php
use Mateffy\Magic\Extraction\Strategies\Strategy;
use Mateffy\Magic\Extraction\Strategies\Extractor;

// Create a custom strategy
class MyCustomStrategy extends Extractor {}

// Extend an existing strategy
class MyCustomizedStrategy extends SequentialStrategy {}

// Or completely custom by doing everything yourself
class MyCompletelyCustomStrategy implements Strategy {}
```

--------------------------------

### Customized iFrame Embedding

Source: https://github.com/capevace/data-wizard-docs/blob/main/integrate.mdx

This example demonstrates how to embed the Data Wizard iFrame with custom dimensions and styling. It includes width, height, frameborder, and inline styles for better integration into the host application's layout.

```html
<div style="width:800px; height:600px;">
  <iframe
    src="https://data-wizard.ai/embed/123456...?signature=..."
    width="100%"
    height="100%"
    frameborder="0"
    style="border: 1px solid #ccc;"
  ></iframe>
</div>
```
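
When the embed is added at runtime rather than written into the page, the same markup can be generated by a small helper. This is a minimal sketch and not part of the Data Wizard docs: `renderEmbed`, `escapeAttr`, and the option names are hypothetical, and the embed URL is whatever signed URL your Data Wizard instance hands you.

```javascript
// Hypothetical helper: escapes characters that could break out of an
// HTML attribute value.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical helper: builds the wrapper <div> and <iframe> markup from the
// example above. The defaults mirror the values used there.
function renderEmbed(
  embedUrl,
  { width = '800px', height = '600px', border = '1px solid #ccc' } = {}
) {
  return [
    `<div style="width:${escapeAttr(width)}; height:${escapeAttr(height)};">`,
    '  <iframe',
    `    src="${escapeAttr(embedUrl)}"`,
    '    width="100%"',
    '    height="100%"',
    '    frameborder="0"',
    `    style="border: ${escapeAttr(border)};"`,
    '  ></iframe>',
    '</div>',
  ].join('\n');
}
```

The iframe fills its wrapper, so only the wrapper's width and height need to change to resize the embed.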

--------------------------------

### Competitor Analysis and Market Research

Source: https://github.com/capevace/data-wizard-docs/blob/main/introduction.mdx

Gather product and pricing information from competitor brochures, websites, or advertisements for market research. Data Wizard automates the extraction of this data for efficient analysis.

```html
<Card title="Performing some competitor analysis & market research" icon="chart-line" iconSize={200}>
    You can gather product and pricing information from competitor brochures, websites, or advertisements for market research and competitive analysis.

    <br/>

    **Use Case:** You need to gather product and pricing information from competitor brochures, websites, or advertisements for market research and competitive analysis.

    <br/>

    **Solution:** Use Data Wizard to automatically extract product details, pricing, and other relevant information from publicly available documents. Gain valuable market insights quickly and efficiently, without manual data scraping and entry.
</Card>
```

--------------------------------

### Run in Own Application

Source: https://github.com/capevace/data-wizard-docs/blob/main/extracting-data.mdx

Section stub for running extractions inside your own application. The body is pulled in from the shared `/snippets/more.mdx` snippet.

```markdown
## Run inside your own application

import More from '/snippets/more.mdx';

<More />
```

--------------------------------

### Update and Restart Data Wizard Docker Container

Source: https://github.com/capevace/data-wizard-docs/blob/main/deployment.mdx

Commands to update the Data Wizard Docker image to the latest version, stop the current container, remove it, and then start a new container with the updated image. It's recommended to back up data before performing updates.

```bash
docker pull mateffy/data-wizard:latest
docker stop data-wizard
docker rm data-wizard
docker run --name data-wizard -p 9090:80 -p 4430:443 -p 4430:443/udp -v data_wizard_storage:/app/storage -v data_wizard_sqlite_data:/app/database -v data_wizard_caddy_data:/data -v data_wizard_caddy_config:/config -e APP_KEY=[REPLACE_WITH_APP_KEY] mateffy/data-wizard:latest
```

--------------------------------

### Programmatic Data Extraction Workflow Overview

Source: https://github.com/capevace/data-wizard-docs/blob/main/apis.mdx

This Mermaid diagram illustrates the programmatic data extraction workflow using the HTTP or GraphQL API. It covers file upload, extraction runs, and receiving notifications.

```mermaid
graph TB
    subgraph User-Driven File Upload
        A[Create a Bucket POST /api/buckets] --> B[User Uploads Files via Embeddable URL];
        B --> D[Redirect to Extractor URL or Embed iFrame];
    end

    subgraph Programmatic File Upload
        C[Create a Bucket POST /api/buckets] --> C1[Upload Files];
    end

    D --> E[Start Extraction Run];
    C1 --> E;
    E --> F[Webhook Notifications];
    E --> G[Poll API for Updates];
    F --> H[Receive Data];
    G --> H;
```
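
The "Poll API for Updates" branch of the diagram can be sketched as a generic polling loop. The endpoint for reading a run's status does not appear in this section, so the request itself is injected as `fetchStatus`. That function, the `status: 'finished'` value, and the option names are all assumptions for illustration, not the documented Data Wizard API.

```javascript
// Polls until the run reports completion or the attempt budget is exhausted.
// `fetchStatus` is a caller-supplied async function (hypothetical) that queries
// the API and resolves to an object such as { status, data }.
async function pollUntilFinished(
  fetchStatus,
  {
    intervalMs = 2000,
    maxAttempts = 30,
    // Assumed completion check; adapt it to the real response shape.
    isFinished = (result) => result.status === 'finished',
  } = {}
) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await fetchStatus();
    if (isFinished(result)) {
      return result;
    }
    // Wait before the next attempt, but not after the final one.
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Run did not finish after ${maxAttempts} attempts`);
}
```

Webhook notifications avoid this loop entirely. Polling is the simpler choice when the caller cannot expose a public endpoint.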

--------------------------------

### Configure LLM API Keys via Environment Variables (Docker)

Source: https://github.com/capevace/data-wizard-docs/blob/main/configure-llm.mdx

This snippet demonstrates how to configure LLM API keys for Data Wizard when running in a Docker container using environment variables. It shows examples for both a direct Docker run command and a Docker Compose file. Ensure you replace placeholder values with your actual API keys and application key.

```bash
docker run \
  -p 9090:80 \
  -e OPENAI_API_TOKEN="<YOUR_OPENAI_API_KEY>" \
  mateffy/data-wizard:latest
```

```yaml
services:
  data-wizard:
    image: mateffy/data-wizard:latest
    ports:
      - ...
    volumes:
      - ...
    environment:
      - APP_KEY=<REPLACE_WITH_APP_KEY>
      - OPENAI_API_TOKEN=<YOUR_OPENAI_API_KEY>
```
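
Because forgotten placeholders are an easy mistake, a deployment script could check the environment before starting the container. This is a minimal sketch: `findUnsetKeys` is a hypothetical helper, and the placeholder patterns only cover the `<...>` and `[REPLACE_WITH_...]` styles used in these docs.

```javascript
// Hypothetical preflight check: returns the names of required variables that
// are missing, empty, or still set to a documentation placeholder.
function findUnsetKeys(env, required = ['APP_KEY', 'OPENAI_API_TOKEN']) {
  const looksLikePlaceholder = (value) =>
    /^<.*>$/.test(value) || /\[REPLACE_WITH_/.test(value);

  return required.filter(
    (name) => !env[name] || looksLikePlaceholder(env[name])
  );
}
```

In Node.js, calling `findUnsetKeys(process.env)` before launching the container surfaces any variables that still need real values.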

--------------------------------

### iFrame Theme Customization

Source: https://github.com/capevace/data-wizard-docs/blob/main/integrate.mdx

Demonstrates how to customize the Data Wizard iFrame's theme, either initially via a URL parameter or dynamically via the JavaScript `postMessage` API.

```javascript
// Set initial theme via URL parameter
// Example: <iframe src="/data-wizard?theme=light"></iframe>

// Dynamically change theme using postMessage
const wizardFrame = document.getElementById('data-wizard-iframe'); // Assuming you have an iframe with this ID

if (wizardFrame) {
  wizardFrame.contentWindow.postMessage({
    event: 'set_theme',
    theme: 'dark'
  }, '*'); // '*' skips the origin check; pass the Data Wizard origin in production
}
```
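
Both approaches can be wrapped in small helpers. This is a sketch under stated assumptions: `withTheme` and `setTheme` are hypothetical names, the embed URL is assumed to be absolute, and only the `theme` parameter and the `set_theme` event come from the example above.

```javascript
// Sets the `theme` query parameter on an absolute embed URL while leaving
// other parameters (such as a signature) untouched.
function withTheme(embedUrl, theme) {
  const url = new URL(embedUrl);
  url.searchParams.set('theme', theme);
  return url.toString();
}

// Sends the `set_theme` message to an explicit target origin instead of '*',
// so the message is only delivered if the iframe is on that origin.
function setTheme(frame, theme, targetOrigin) {
  frame.contentWindow.postMessage({ event: 'set_theme', theme }, targetOrigin);
}
```

A typical call is `setTheme(wizardFrame, 'dark', new URL(wizardFrame.src).origin)`, which derives the target origin from the iframe's own `src`.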

--------------------------------

### AI-Powered Data Extraction for SaaS Platforms

Source: https://github.com/capevace/data-wizard-docs/blob/main/introduction.mdx

Utilize Data Wizard as the core data extraction engine for platforms requiring robust and adaptable data extraction. Its modular architecture and LLM abstraction layer allow for easy switching between LLM providers and customization.

```html
<Card title="AI-powered data extraction as a core feature for SaaS" icon="gear" iconSize={200}>
    You can use Data Wizard as the core data extraction engine for your platform, supporting a wide range of document types and extraction tasks.

    <br/>

    **Use Case:** You are building a document processing or data analysis platform and need robust, adaptable data extraction capabilities.

    <br/>

    **Solution:** Use Data Wizard as the core data extraction engine for your platform using the REST/GraphQL API. Its modular architecture and LLM abstraction layer allow you to easily switch between different LLM providers, customize extraction strategies, and adapt to evolving LLM technologies.
</Card>
```

--------------------------------

### Custom Strategies Link

Source: https://github.com/capevace/data-wizard-docs/blob/main/strategies.mdx

Provides a link to learn how to build custom strategies for more control over the extraction process.

```markdown
<Card title="Learn how to build custom strategies for more control" icon="gear" iconSize={200} horizontal href="./custom-strategies">
    You can create custom strategies to tailor the extraction process to your specific needs. Custom strategies allow you to define how the document is processed, how the LLM is interacted with, and how the results are merged.
</Card>
```

--------------------------------

### Register Custom Strategy

Source: https://github.com/capevace/data-wizard-docs/blob/main/custom-strategies.mdx

Shows how to register a custom strategy with Data Wizard by calling `Magic::registerStrategy()` in the `boot()` method of your service provider. This makes the custom strategy available in the UI.

```php
use Illuminate\Support\ServiceProvider;
use Mateffy\Magic\Magic;

class AppServiceProvider extends ServiceProvider
{
    public function boot()
    {
        Magic::registerStrategy('my-custom-strategy', MyCustomStrategy::class);
    }
}
```