### Example Pipeline with Multiple Steps

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_PIPELINE.md

A comprehensive example demonstrating a pipeline with environment variables, logging, replication, conditional command execution, and an HTTP notification step.

```yaml
env:
  TARGET_SCHEMA: "analytics"

steps:
  - type: log
    message: "Pipeline started at {timestamp.datetime}"

  - type: replication
    path: replications/main_replication.yaml
    env:
      SLING_THREADS: 8
    id: main_replication

  - type: command
    command: "grep -q 'ERROR' /var/log/app.log"
    if: "false" # This step will be skipped

  - type: http
    url: "https://hooks.slack.com/services/..."
    method: "POST"
    payload: |
      {
        "text": "Pipeline completed successfully for schema: {env.TARGET_SCHEMA}"
      }
    if: state.main_replication.status == "success"
```

--------------------------------

### Install Sling CLI using Scoop on Windows

Source: https://github.com/slingdata-io/sling-cli/blob/main/README.md

Add the Sling repository to Scoop and install the Sling CLI on Windows. Verify the installation with `sling -h`.

```powershell
scoop bucket add sling https://github.com/slingdata-io/scoop-sling.git
scoop install sling

# You're good to go!
sling -h
```

--------------------------------

### Getting Connection Documentation with MCP

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION.md

Demonstrates how to use the 'connection' tool with the 'docs' action to retrieve comprehensive documentation for connections.

```json
{
  "tool": "connection",
  "action": "docs",
  "input": {}
}
```

--------------------------------

### Basic File Listing Example

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION_FILE.md

Demonstrates listing only files within a specific directory in an S3 connection.

```json
{
  "action": "list",
  "input": {
    "connection": "MY_S3",
    "path": "data/csv_files/",
    "recursive": false,
    "only": "files"
  }
}
```

--------------------------------

### Extend Default Setup and Teardown Sequences

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_API_SPEC.md

Use `+setup`, `setup+`, `+teardown`, and `teardown+` modifiers to prepend or append custom setup and teardown calls relative to default sequences.

```yaml
defaults:
  setup:
    - request:
        url: "{base_url}/auth/refresh"

endpoints:
  my_endpoint:
    +setup:  # Runs BEFORE default setup
      - request:
          url: "{base_url}/pre-check"
    teardown+:  # Runs AFTER default teardown
      - request:
          url: "{base_url}/cleanup"
    request:
      url: "{base_url}/data"
```
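The ordering implied by the modifiers can be sketched in Python. This is only an illustration of the prepend/append semantics described above, not Sling's implementation; `merge_sequence` and the shortened URL strings are hypothetical.

```python
def merge_sequence(default_calls, endpoint_cfg, name):
    """Combine a default sequence with an endpoint's prepend/append modifiers."""
    before = endpoint_cfg.get(f"+{name}", [])  # "+setup": runs before the default
    after = endpoint_cfg.get(f"{name}+", [])   # "setup+": runs after the default
    return [*before, *default_calls, *after]


# Hypothetical stand-ins for the request definitions in the YAML above
defaults = {"setup": ["/auth/refresh"], "teardown": []}
my_endpoint = {"+setup": ["/pre-check"], "teardown+": ["/cleanup"]}

setup_order = merge_sequence(defaults["setup"], my_endpoint, "setup")
teardown_order = merge_sequence(defaults["teardown"], my_endpoint, "teardown")

print(setup_order)     # ['/pre-check', '/auth/refresh']
print(teardown_order)  # ['/cleanup']
```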

--------------------------------

### Install Sling CLI Binary on Linux

Source: https://github.com/slingdata-io/sling-cli/blob/main/README.md

Download the latest Sling CLI binary for Linux (amd64), extract it, and clean up the archive. Check the installation with `sling -h`.

```bash
curl -LO 'https://github.com/slingdata-io/sling-cli/releases/latest/download/sling_linux_amd64.tar.gz' \
  && tar xzf sling_linux_amd64.tar.gz \
  && rm -f sling_linux_amd64.tar.gz

# You're good to go!
sling -h
```

--------------------------------

### Install Sling CLI via Python Wrapper

Source: https://github.com/slingdata-io/sling-cli/blob/main/README.md

Install the Sling CLI using pip. This makes the `sling` command available on your system.

```bash
pip install sling
```

--------------------------------

### HTTP and Command Hook Examples

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_REPLICATION.md

Illustrates using an 'http' hook for notifications, a 'command' hook for cleanup, and a 'query' hook for recording status. Ensure that required environment variables such as SLACK_TOKEN are set.

```yaml
hooks:
  start:
    # Notification
    - type: http
      url: 'https://slack.com/api/chat.postMessage'
      method: POST
      headers:
        Authorization: 'Bearer ${SLACK_TOKEN}'
      body: |
        {
          "channel": "#data-pipeline",
          "text": "Starting data replication"
        }
  
  end:
    # Cleanup
    - type: command
      command: 'rm -f /tmp/temp_files_*'
    
    # Update status
    - type: query
      connection: METADATA_DB
      sql: |
        INSERT INTO replication_log (job_id, status, completed_at)
        VALUES ('repl_001', 'completed', NOW())
```

--------------------------------

### Full Refresh Replication Example

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_REPLICATION.md

Configures a full-refresh replication from a source to a target, with options for object naming, column casing, and chunking for large tables.

```yaml
source: POSTGRES_PROD
target: SNOWFLAKE_DW

defaults:
  mode: full-refresh
  object: 'analytics.{stream_schema}_{stream_table}'
  target_options:
    column_casing: snake

streams:
  # Single table
  public.customers:
  
  # All sales tables
  sales.*:
    object: 'sales_data.{stream_table}'
  
  # Large table with chunking
  public.transactions:
    source_options:
      chunk_count: 10

env:
  SLING_THREADS: 5
  SLING_LOADED_AT_COLUMN: true
```

--------------------------------

### Install Sling CLI using Homebrew on Mac

Source: https://github.com/slingdata-io/sling-cli/blob/main/README.md

Use Homebrew to install the Sling CLI on macOS. After installation, verify by running `sling -h`.

```shell
brew install slingdata-io/sling/sling

# You're good to go!
sling -h
```

--------------------------------

### Local to Cloud Backup Example

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION_FILE.md

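Back up local files to cloud storage recursively. The double slash in `LOCAL//home/user/documents/` appears because the connection name and its `/` separator are followed by an absolute path that itself starts with `/`.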

```json
{
  "action": "copy",
  "input": {
    "source_location": "LOCAL//home/user/documents/",
    "target_location": "BACKUP_S3/daily-backup/documents/",
    "recursive": true
  }
}
```
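A minimal Python sketch of why the double slash appears, assuming a location is a connection name followed by `/` and then the path. `split_location` is a hypothetical helper for illustration, not part of Sling.

```python
def split_location(location: str) -> tuple[str, str]:
    """Split 'CONNECTION/path' at the first slash into (connection, path)."""
    connection, _, path = location.partition("/")
    return connection, path


# An absolute local path keeps its leading slash, hence "LOCAL//home/..."
print(split_location("LOCAL//home/user/documents/"))
# ('LOCAL', '/home/user/documents/')

# A bucket-relative path has no leading slash
print(split_location("BACKUP_S3/daily-backup/documents/"))
# ('BACKUP_S3', 'daily-backup/documents/')
```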

--------------------------------

### Single File Copy Example

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION_FILE.md

Demonstrates copying a single file from a local path to an S3 bucket. Ensure the source and target locations are correctly formatted.

```json
{
  "action": "copy",
  "input": {
    "source_location": "local/path/to/source.csv",
    "target_location": "s3/bucket/folder/destination.csv",
    "recursive": false
  }
}
```

--------------------------------

### Pattern Matching Examples for Streams

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_REPLICATION.md

Combine schema and table wildcards with custom configurations for specific tables or patterns. The example sets the mode and object names and disables individual streams.

```yaml
source: POSTGRES
target: SNOWFLAKE

defaults:
  mode: incremental
  object: 'warehouse.{stream_schema}_{stream_table}'

streams:
  # All user-related tables
  public.user_*:
    primary_key: [user_id]
  
  # All tables except sensitive ones
  public.*:
  public.passwords:
    disabled: true
  public.credit_cards:
    disabled: true
  
  # Specific tables with custom config
  public.large_table:
    source_options:
      chunk_size: 12h
```
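The effect of combining wildcards with `disabled: true` can be approximated in Python with `fnmatch`. This is a rough sketch under the assumption that a disabled stream is dropped from whatever a wildcard matched; the table list and `expand_streams` are hypothetical, and Sling's own matching rules may differ.

```python
from fnmatch import fnmatchcase


def expand_streams(stream_cfg, source_tables):
    """Expand wildcard stream names, then drop streams marked disabled."""
    selected = set()
    for pattern in stream_cfg:
        if "*" in pattern:
            selected.update(t for t in source_tables if fnmatchcase(t, pattern))
        else:
            selected.add(pattern)
    disabled = {name for name, cfg in stream_cfg.items() if cfg and cfg.get("disabled")}
    return sorted(selected - disabled)


# Hypothetical tables present in the source database
tables = [
    "public.user_profiles", "public.orders", "public.passwords",
    "public.credit_cards", "public.large_table",
]
streams = {
    "public.user_*": {"primary_key": ["user_id"]},
    "public.*": None,
    "public.passwords": {"disabled": True},
    "public.credit_cards": {"disabled": True},
    "public.large_table": {"source_options": {"chunk_size": "12h"}},
}

replicated = expand_streams(streams, tables)
print(replicated)
# ['public.large_table', 'public.orders', 'public.user_profiles']
```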

--------------------------------

### Incremental Replication Example

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_REPLICATION.md

Sets up incremental replication, defining primary and update keys, and handling source/target options like empty values and new columns. Supports custom SQL for incremental logic.

```yaml
source: MYSQL_APP
target: POSTGRES_DW

defaults:
  mode: incremental
  object: 'warehouse.{stream_table}'
  primary_key: [id]
  update_key: updated_at
  
  source_options:
    empty_as_null: false
  
  target_options:
    column_casing: snake
    add_new_columns: true

streams:
  # Standard incremental
  app.users:
  app.orders:
    primary_key: [order_id]
  
  # Append-only (no primary key)
  app.events:
    primary_key: []
    update_key: created_at
  
  # Custom SQL with incremental
  app.user_summary:
    sql: |
      SELECT 
        user_id,
        COUNT(*) as order_count,
        MAX(created_at) as last_order_date
      FROM orders 
      WHERE updated_at > coalesce({incremental_value}, '1900-01-01')
      GROUP BY user_id
    update_key: last_order_date

env:
  SLING_THREADS: 3
  SLING_RETRIES: 2
```

--------------------------------

### Full Sling API Spec Structure with Authentication

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_API_SPEC.md

Illustrates a comprehensive Sling API specification, including queues, detailed authentication methods (basic and OAuth2), and endpoint configurations. This example covers more advanced setup options.

```yaml
# Name of the API
name: "Example API"

# Description of what the API does
description: "This API provides access to example data"

# Queues pass data between endpoints (write-then-read, temporary storage)
queues:
  - user_ids

# Authentication configuration for accessing the API
authentication:
  # Type of authentication: "basic", "oauth2", "aws-sigv4", "sequence", or empty for none
  type: "basic"

  # Basic authentication credentials
  username: "{secrets.username}"
  password: "{secrets.password}"

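  # Alternative: OAuth2 Client Credentials Flow (most common for API integrations).
  # Commented out because a YAML mapping cannot repeat the `type` key;
  # use these keys instead of the basic credentials above.
  # type: "oauth2"
  # flow: "client_credentials"
  # client_id: "{secrets.oauth_client_id}"
  # client_secret: "{secrets.oauth_client_secret}"
  # authentication_url: "https://api.example.com/oauth/token"
  # scopes: ["read:data", "write:data"]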

  expires: 3600  # Re-auth interval in seconds; automatic before each request if expired
  refresh_on_expire: true  # Auto-refresh OAuth2 tokens (requires refresh_token)
```

--------------------------------

### Install Sling Python Package

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_PYTHON.md

Install the Sling Python package using pip. Include the `[arrow]` extra for Apache Arrow support for high-performance streaming.

```bash
pip install sling
```

```bash
pip install sling[arrow]
```

--------------------------------

### File Discovery Workflow - Browse Root

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION_FILE.md

Start a file discovery process by listing the contents of the root directory of a specified connection. This is the first step in exploring storage.

```json
{
  "action": "list",
  "input": {
    "connection": "MY_STORAGE",
    "path": ""
  }
}
```

--------------------------------

### Directory Copy Example

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION_FILE.md

Use this to copy an entire directory recursively from S3 to a local path. The `recursive` parameter must be set to `true`.

```json
{
  "action": "copy",
  "input": {
    "source_location": "s3/bucket/source_folder/",
    "target_location": "local/backup/target_folder/",
    "recursive": true
  }
}
```

--------------------------------

### Replication-Level Hooks

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_REPLICATION.md

Define commands or SQL queries to execute at the start and end of a replication process. Supports command-line execution and database queries.

```yaml
hooks:
  start:
    - type: command
      command: 'echo "Starting replication at $(date)"'
    
    - type: query
      connection: MY_DB
      sql: 'UPDATE job_status SET status = "running" WHERE job_id = "repl_001"'
  
  end:
    - type: command
      command: 'echo "Replication completed at $(date)"'
```

--------------------------------

### Simple Database-to-Database Replication

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_REPLICATION.md

Example of a basic database-to-database replication. Configures source and target connections, default object naming, and specifies tables to replicate, including exclusions.

```yaml
source: MY_POSTGRES
target: MY_SNOWFLAKE

defaults:
  mode: full-refresh
  object: 'warehouse.{stream_schema}_{stream_table}'

streams:
  # Single table
  public.customers:
    
  # All tables in schema
  public.*:
  
  # Exclude specific table
  public.sensitive_data:
    disabled: true

env:
  SLING_THREADS: 5
```

--------------------------------

### Simple File-to-Database Replication

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_REPLICATION.md

Example of replicating data from files to a database. Specifies source and target, default object naming for files, and source options for CSV format.

```yaml
source: LOCAL
target: POSTGRES

defaults:
  mode: full-refresh
  object: 'staging.{stream_file_name}'
  source_options:
    format: csv
    header: true

streams:
  'file://data/customers.csv':
  'file://data/products.csv':
  'data/*.csv':  # All CSV files in directory

env:
  SLING_THREADS: 3
```

--------------------------------

### Get Database Documentation

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION_DATABASE.md

Use the `docs` action to retrieve comprehensive documentation for a specific database. This action requires a Pro token and is subject to rate limits.

```json
{
  "action": "docs",
  "input": {}
}
```

--------------------------------

### State Variable Rendering Order Example

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_API_SPEC.md

Demonstrates the automatic topological sort for state variable dependencies. Ensure variables are resolved before requests are made. Circular dependencies will cause errors.

```yaml
state:
  base_url: "https://api.com"              # No dependencies
  users_url: "{state.base_url}/users"      # Depends on base_url
  full_url: "{state.users_url}?limit=100"  # Depends on users_url
# Renders: base_url → users_url → full_url

# ❌ Circular dependency error:
state:
  var_a: "{state.var_b}"  # A → B
  var_b: "{state.var_a}"  # B → A (circular!)
```
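The dependency ordering above can be reproduced with Python's standard `graphlib`. This is an illustrative sketch of topological sorting over `{state.*}` references, not Sling's own resolver; `render_order` and `render` are hypothetical names.

```python
import re
from graphlib import CycleError, TopologicalSorter

STATE_REF = re.compile(r"\{state\.(\w+)\}")


def render_order(state):
    """Return variable names ordered so dependencies come first."""
    graph = {name: set(STATE_REF.findall(str(value))) for name, value in state.items()}
    return list(TopologicalSorter(graph).static_order())


def render(state):
    """Substitute {state.x} references in dependency order."""
    resolved = {}
    for name in render_order(state):
        resolved[name] = STATE_REF.sub(lambda m: resolved[m.group(1)], str(state[name]))
    return resolved


state = {
    "full_url": "{state.users_url}?limit=100",
    "users_url": "{state.base_url}/users",
    "base_url": "https://api.com",
}
print(render_order(state))       # ['base_url', 'users_url', 'full_url']
print(render(state)["full_url"])  # https://api.com/users?limit=100

try:
    render_order({"var_a": "{state.var_b}", "var_b": "{state.var_a}"})
    cycle_detected = False
except CycleError:
    cycle_detected = True  # circular references cannot be ordered
```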

--------------------------------

### Date Parsing Examples

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_API_SPEC.md

Demonstrates how to parse date strings into time objects using 'auto' detection or a specified strftime format. Ensure the input string matches the provided format for successful parsing.

```yaml
# Auto-detect format
- expression: date_parse("05/15/2022", "auto")
  output: state.parsed_auto
```

```yaml
# Specify strftime format
- expression: date_parse("15-May-2022 10:30", "%d-%b-%Y %H:%M")
  output: state.parsed_specific
```
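For readers checking a format string, the same strftime directives can be tried in Python. This sketch assumes an English locale for `%b`. The `parse_auto` helper and its candidate list are hypothetical stand-ins for "auto" detection, which in Sling may recognize different formats.

```python
from datetime import datetime

# Explicit format: the input must match the directives exactly
parsed_specific = datetime.strptime("15-May-2022 10:30", "%d-%b-%Y %H:%M")
print(parsed_specific.isoformat())  # 2022-05-15T10:30:00


def parse_auto(text, candidates=("%Y-%m-%d", "%m/%d/%Y", "%d-%b-%Y %H:%M")):
    """Try each candidate format in turn (a stand-in for auto-detection)."""
    for fmt in candidates:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"no candidate format matches {text!r}")


parsed_auto = parse_auto("05/15/2022")
print(parsed_auto.date())  # 2022-05-15
```

A string that does not match the given format raises `ValueError` in Python, which mirrors the note above about matching the input to the format.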

--------------------------------

### Date Formatting Examples

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_API_SPEC.md

Shows how to format a time object into a string using a specified strftime format. This is useful for creating human-readable dates or preparing dates for API requests.

```yaml
# Format a date object
- expression: date_format(state.parsed_auto, "%Y%m%d")
  output: state.formatted_compact
```

```yaml
# Format for API parameter (ISO 8601 with timezone)
request:
  parameters:
    updated_since: "{date_format(date_add(now(), -1, 'hour'), '%Y-%m-%dT%H:%M:%SZ')}"
```
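The two format strings can be checked against Python's `strftime`, which uses the same directives. This is a sketch with a fixed timestamp so the output is reproducible; `updated_since` is a hypothetical helper mirroring the "one hour ago" expression.

```python
from datetime import datetime, timedelta, timezone


def updated_since(now):
    """Format the instant one hour before `now` as ISO 8601 with a literal Z."""
    return (now - timedelta(hours=1)).strftime("%Y-%m-%dT%H:%M:%SZ")


fixed_now = datetime(2022, 5, 15, 10, 30, tzinfo=timezone.utc)

compact = fixed_now.strftime("%Y%m%d")
print(compact)                   # 20220515
print(updated_since(fixed_now))  # 2022-05-15T09:30:00Z
```

The trailing `Z` in the format is a literal character, so the result is only a correct UTC timestamp when the time being formatted is already in UTC.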

--------------------------------

### Multi-Schema Replication Example

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_REPLICATION.md

Demonstrates replicating data across multiple schemas, including schema mapping and table exclusions. Configures target options like column casing and sort keys.

```yaml
source: ORACLE_ERP
target: REDSHIFT_DW

defaults:
  mode: incremental
  primary_key: [id]
  update_key: last_modified
  object: '{stream_schema}_data.{stream_table}'
  
  target_options:
    column_casing: lower
    table_keys:
      sort: [id, last_modified]

streams:
  # Finance schema
  finance.*:
    target_options:
      table_keys:
        sort: [account_id, transaction_date]
  
  # HR schema  
  hr.*:
    object: 'human_resources.{stream_table}'
  
  # Sensitive table exclusions
  hr.salaries:
    disabled: true
  finance.audit_log:
    disabled: true

env:
  SLING_THREADS: 8
```

--------------------------------

### Simple Data Load and Query Pipeline

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_PIPELINE.md

An example pipeline that first loads data from a CSV file into a PostgreSQL table and then queries the first 5 rows, conditional on the load success.

```yaml
steps:
  - type: run
    source: "file://data/users.csv"
    target: "MY_POSTGRES.public.users"
    mode: "full-refresh"
    id: load_data

  - type: query
    connection: "MY_POSTGRES"
    query: "SELECT * FROM public.users LIMIT 5;"
    if: state.load_data.status == "success"
```

--------------------------------

### Loop Over a List of Files

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_PIPELINE.md

Iterate over a list of files using a 'group' step. This example processes each CSV file, loading its content into a staging table.

```yaml
- type: group
  loop:
    - "customers.csv"
    - "products.csv"
    - "orders.csv"
  steps:
    - type: log
      message: "Processing file {loop.index + 1}: {loop.value}"
    - type: run
      source: "file://data/{loop.value}"
      target: "STAGING.{loop.value.split('.')[0]}"
      mode: "truncate"
```
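To see what each iteration renders to, the loop can be traced in plain Python. This sketch assumes `loop.index` is zero-based, as the `+ 1` in the log message suggests; `render_iteration` is a hypothetical helper.

```python
files = ["customers.csv", "products.csv", "orders.csv"]


def render_iteration(index, value):
    """Render the templated fields of one loop iteration."""
    return {
        "message": f"Processing file {index + 1}: {value}",
        "source": f"file://data/{value}",
        "target": f"STAGING.{value.split('.')[0]}",  # strip the extension
    }


iterations = [render_iteration(i, v) for i, v in enumerate(files)]
for step in iterations:
    print(step["message"], "->", step["target"])
# Processing file 1: customers.csv -> STAGING.customers
# Processing file 2: products.csv -> STAGING.products
# Processing file 3: orders.csv -> STAGING.orders
```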

--------------------------------

### Multi-Cloud Copy Example

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION_FILE.md

Copy files between different cloud providers, such as AWS S3 and Google Cloud Storage. The action handles the underlying transfer.

```json
{
  "action": "copy",
  "input": {
    "source_location": "AWS_S3/data/export.csv",
    "target_location": "GCS_BUCKET/imports/data.csv"
  }
}
```

--------------------------------

### Authorization Code Flow (Interactive Mode)

Source: https://github.com/slingdata-io/sling-cli/blob/main/core/dbio/api/OAUTH2_EXAMPLES.md

Intended for CLI applications. Leave `redirect_uri` empty so that Sling automatically starts a local server and opens your browser for authentication.

```yaml
authentication:
  type: oauth2
  flow: authorization_code
  client_id: "${secrets.OAUTH_CLIENT_ID}"
  client_secret: "${secrets.OAUTH_CLIENT_SECRET}"
  authentication_url: "https://api.example.com/oauth/token"
  # redirect_uri: ""  # Leave empty for automatic localhost server
  scopes:
    - "read:data"
```

--------------------------------

### Dynamic Endpoints Generation

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_API_SPEC.md

This example shows how to dynamically generate API endpoints at runtime based on discovery or catalog APIs. It first fetches a list of items (e.g., tables) and then iterates over this list to create individual endpoints, each templated with data from the iterated item.

```yaml
dynamic_endpoints:
  - setup:  # Optional: fetch list of items
      - request:
          url: "{state.base_url}/catalog"
        response:
          processors:
            - expression: 'jmespath(response.json, "tables")'
              output: "state.table_list"

    iterate: "state.table_list"  # Or JSON literal: '["a","b","c"]'
    into: "state.current_table"  # Must be state.*

    endpoint:  # Template with {state.current_table} access
      name: "{state.current_table.name}"
      description: "Table: {state.current_table.description}"
      state:
        table_id: "{state.current_table.id}"
      request:
        url: "{state.base_url}/data/{state.table_id}"
      response:
        records:
          jmespath: "rows[]"
          primary_key: ["id"]
```

--------------------------------

### Backfill Historical Data Example

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_REPLICATION.md

Configures backfill replication for historical data with specified date ranges and chunking strategies. Adjust thread count for conservative historical loads.

```yaml
source: POSTGRES_OLD
target: BIGQUERY_DW

defaults:
  mode: backfill
  object: 'historical.{stream_table}'
  update_key: created_date
  
  source_options:
    range: '2020-01-01,2023-12-31'
    chunk_size: 30d  # 30-day chunks

streams:
  transactions:
    primary_key: [transaction_id]
  
  user_activity:
    source_options:
      chunk_size: 7d  # Smaller chunks for large table

env:
  SLING_THREADS: 2  # Conservative for historical loads
```

--------------------------------

### CSV Import Example

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_REPLICATION.md

Configures importing CSV files into a database, specifying source options like format, header, encoding, and handling of empty values. Defines column types for specific files.

```yaml
source: LOCAL
target: POSTGRES

defaults:
  mode: full-refresh
  object: 'staging.{stream_file_name}'
  
  source_options:
    format: csv
    header: true
    encoding: utf8
    empty_as_null: true
    skip_blank_lines: true

streams:
  'data/customers.csv':
    columns:
      customer_id: bigint
      email: string(255)
      created_at: datetime
  
  'data/products.csv':
    target_options:
      column_casing: snake
  
  # All CSV files in directory
  'imports/*.csv':

env:
  SLING_THREADS: 3
```

--------------------------------

### Start MCP Server with Sling CLI

Source: https://context7.com/slingdata-io/sling-cli/llms.txt

Launch the Model Context Protocol (MCP) server using the 'sling serve mcp' command. This server supports stdio transport and is compatible with MCP clients.

```bash
# Start the MCP server (stdio transport)
sling serve mcp
```

--------------------------------

### Replication Hooks Configuration

Source: https://context7.com/slingdata-io/sling-cli/llms.txt

Define shell commands, SQL queries, or HTTP calls to execute before or after the replication or individual streams. Supports 'start' and 'end' hooks at the replication level, and 'pre' and 'post' hooks under `defaults` or on individual streams.

```yaml
source: MY_PG
target: MY_SNOWFLAKE

hooks:
  start:
    - type: query
      connection: MY_SNOWFLAKE
      sql: "UPDATE job_log SET status='running', started_at=NOW() WHERE job='daily_sync'"
    - type: http
      url: "https://hooks.slack.com/services/T00/B00/XXXX"
      method: POST
      body: '{"text": "Sling replication started"}'

  end:
    - type: query
      connection: MY_SNOWFLAKE
      sql: "UPDATE job_log SET status='done', ended_at=NOW() WHERE job='daily_sync'"
    - type: command
      command: 'echo "Replication finished at $(date)"'

defaults:
  mode: incremental
  update_key: updated_at
  primary_key: [id]
  hooks:
    pre:
      - type: query
        connection: MY_SNOWFLAKE
        sql: "CREATE SCHEMA IF NOT EXISTS analytics"
    post:
      - type: query
        connection: MY_SNOWFLAKE
        sql: "ANALYZE {object_name}"

streams:
  public.users:
  public.orders:
    hooks:
      post:
        - type: command
          command: 'python validate_orders.py'
```

--------------------------------

### Backup and Sync Pattern - Inspect Source

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION_FILE.md

Inspect the source directory recursively to understand its size and contents before creating a backup. This helps in planning and verification.

```json
{
  "action": "inspect",
  "input": {
    "connection": "LOCAL",
    "path": "/important/data/",
    "recursive": true
  }
}
```

--------------------------------

### List All Connections

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION.md

Retrieve a list of all configured connections across different sources like environment files, dbt profiles, and environment variables.

```json
{
  "action": "list",
  "input": {}
}
```

--------------------------------

### Compile Sling CLI from Source on Linux/Mac

Source: https://github.com/slingdata-io/sling-cli/blob/main/README.md

Clone the Sling CLI repository, navigate into the directory, and build the project using the provided script. Run `./sling --help` to verify.

```bash
git clone https://github.com/slingdata-io/sling-cli.git
cd sling-cli
bash scripts/build.sh

./sling --help
```

--------------------------------

### Test Sling Connectivity

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/DEBUG.md

Tests connectivity for Sling connections. Use this to confirm that a connection is set up correctly before executing queries.

```bash
./sling conns test
```

--------------------------------

### Data Migration Workflow - List Source Files

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION_FILE.md

Begin a data migration by listing all files recursively from the source storage. This provides an overview of the data to be moved.

```json
{
  "action": "list",
  "input": {
    "connection": "OLD_STORAGE",
    "path": "legacy_data/",
    "recursive": true
  }
}
```

--------------------------------

### Get Value by Path

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_API_SPEC.md

Retrieve a value from a nested object using dot notation with `get_path()`. The path can include array indices.

```sling
get_path(response.json, "user.profile.email")
```
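The lookup behaves like a walk through nested dictionaries and lists. The Python sketch below is only an analogy: it assumes array indices are written either as `items[0]` or `items.0`, which may not match the exact syntax Sling accepts, and the sample data is hypothetical.

```python
import re


def get_path(obj, path):
    """Walk a nested dict/list structure following a dotted path."""
    # Normalize "items[0].id" to "items.0.id" so every segment is dot-separated
    normalized = re.sub(r"\[(\d+)\]", r".\1", path)
    current = obj
    for part in normalized.split("."):
        if isinstance(current, list):
            current = current[int(part)]  # list segment: numeric index
        else:
            current = current[part]       # dict segment: key lookup
    return current


data = {
    "user": {"profile": {"email": "a@example.com"}},
    "items": [{"id": 7}, {"id": 9}],
}
print(get_path(data, "user.profile.email"))  # a@example.com
print(get_path(data, "items[1].id"))         # 9
```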

--------------------------------

### Load Replication Config from File in Go

Source: https://context7.com/slingdata-io/sling-cli/llms.txt

Loads a replication configuration from a YAML file, compiles it to resolve wildcards and apply defaults, and iterates through the generated tasks. Ensure 'replication.yaml' exists and is correctly formatted.

```go
package main

import (
    "github.com/slingdata-io/sling-cli/core/sling"
    "github.com/flarco/g"
)

func main() {
    // Load replication config from YAML file
    replication, err := sling.LoadReplicationConfigFromFile("replication.yaml")
    if err != nil {
        g.LogFatal(err)
    }

    // Compile: resolve wildcards, apply defaults, build task list
    err = replication.Compile(nil) // pass nil or a *sling.Config overwrite
    if err != nil {
        g.LogFatal(err)
    }

    g.Info("Will run %d streams", len(replication.Tasks))

    // Iterate compiled tasks
    for _, task := range replication.Tasks {
        g.Info("Stream: %s -> %s", task.Source.Stream, task.Target.Object)
    }
}
```

--------------------------------

### Inspect Files or Directories Recursively

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION_FILE.md

Get metadata and statistics for files or directories. Set `recursive` to `true` to include all contents of a directory.

```json
{
  "action": "inspect",
  "input": {
    "connection": "MY_S3",
    "path": "data/large_dataset/",
    "recursive": true
  }
}
```

--------------------------------

### Load Replication Configuration from YAML

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_PYTHON.md

Initialize a Replication object by specifying a file path to a YAML configuration file.

```python
from sling import Replication

# Load the replication definition from a YAML file and execute it
replication = Replication(file_path="replication.yaml")
replication.run()
```

--------------------------------

### Conditional Execution with 'if'

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_PIPELINE.md

Use the 'if' key to conditionally execute a step based on an expression. This example checks if the day of the week is a weekend.

```yaml
steps:
  - type: command
    command: "date +%u" # Returns 1-7 (Mon-Sun)
    id: "day_of_week"

  - type: replication
    path: "weekend_job.yaml"
    if: "cast(state.day_of_week.output.stdout, 'int') > 5" # Only run on Sat or Sun
```
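The condition can be sanity-checked in Python, where `isoweekday()` uses the same 1-7 (Mon-Sun) numbering as `date +%u`. This is a sketch of the comparison only; `is_weekend` is a hypothetical helper.

```python
from datetime import date


def is_weekend(stdout: str) -> bool:
    """Mirror the pipeline condition: cast the command output to int, compare to 5."""
    return int(stdout) > 5  # int() tolerates the trailing newline from stdout


saturday = date(2024, 1, 6)
monday = date(2024, 1, 8)

print(saturday.isoweekday(), is_weekend(f"{saturday.isoweekday()}\n"))  # 6 True
print(monday.isoweekday(), is_weekend(f"{monday.isoweekday()}\n"))      # 1 False
```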

--------------------------------

### Explore Tables within a Schema

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION_DATABASE.md

Get a list of tables within a specific schema. This action helps in navigating and understanding the structure of the database.

```json
{
  "action": "get_schemata",
  "input": {
    "connection": "MY_DB",
    "level": "table",
    "schema_name": "production"
  }
}
```

--------------------------------

### Basic MCP Tool Usage

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION.md

Illustrates the general JSON structure for interacting with MCP tools, specifying the tool, action, and input parameters.

```json
{
  "tool": "connection",
  "action": "action_name",
  "input": {
    "parameter1": "value1",
    "parameter2": "value2"
  }
}
```

--------------------------------

### Get Schemata with Schema Level Detail

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION_DATABASE.md

List all available schemas within a database connection. This provides a high-level overview of the database organization.

```json
{
  "action": "get_schemata",
  "input": {
    "connection": "MY_POSTGRES",
    "level": "schema"
  }
}
```

--------------------------------

### Iterate Over ID Chunks

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_API_SPEC.md

Example of iterating over batches of IDs from a queue using the `chunk` function. The batch is joined into a comma-separated string for the API parameter.

```yaml
iterate:
  # Use the chunk() function to process IDs in batches of 50
  over: "chunk(queue.variant_ids, 50)"
  into: "state.variant_id_batch" # state.variant_id_batch will be an array/list
  concurrency: 5
request:
  parameters:
    # Join the batch of IDs into a comma-separated string for the API parameter
    ids: '{join(state.variant_id_batch,",")}'
```
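The batching and joining steps can be pictured with a short Python sketch. It uses 120 made-up IDs and a local `chunk` generator as a stand-in for the spec function of the same name.

```python
def chunk(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


variant_ids = list(range(1, 121))  # hypothetical queue contents
batches = list(chunk(variant_ids, 50))

# One request per batch, with the IDs joined into a comma-separated parameter
ids_params = [",".join(str(i) for i in batch) for batch in batches]

print([len(b) for b in batches])  # [50, 50, 20]
print(ids_params[2][:11])         # 101,102,103
```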

--------------------------------

### Iterate Over Date Range

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_API_SPEC.md

Example of iterating over a date range using the `range` function. The current date is stored in `state.current_day` and formatted for the request parameters.

```yaml
iterate:
  over: >
    range(
      date_trunc(date_add(now(), -7, "day"), "day"), # Start date: 7 days ago
      date_trunc(now(), "day"),                      # End date: today
      "1d"                                           # Step: 1 day
    )
  into: state.current_day
  concurrency: 10
request:
  parameters:
    date: '{date_format(state.current_day, "%Y-%m-%d")}'
    # ... other params ...
```
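The values produced by the range can be approximated in Python. This sketch assumes the end date is included in the iteration, which the source does not state. `daily_range` is a hypothetical helper and the timestamp is fixed for reproducibility.

```python
from datetime import datetime, timedelta


def daily_range(now, days_back=7):
    """Yield midnight timestamps from `days_back` days ago through today."""
    end = now.replace(hour=0, minute=0, second=0, microsecond=0)  # truncate to day
    current = end - timedelta(days=days_back)
    while current <= end:  # assumption: the end date is inclusive
        yield current
        current += timedelta(days=1)


fixed_now = datetime(2024, 3, 10, 15, 45)
date_params = [day.strftime("%Y-%m-%d") for day in daily_range(fixed_now)]

print(date_params[0], date_params[-1], len(date_params))
# 2024-03-03 2024-03-10 8
```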

--------------------------------

### Endpoint Definition Example

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_API_SPEC.md

Defines an API endpoint within the Sling configuration. The key used for the endpoint (e.g., 'users') serves as its internal name.

```yaml
endpoints:
  # The key 'users' is the effective name
  users:
    description: "Retrieve users from the API"
    # ... other endpoint config ...

  # The key 'get_details' is the effective name
  get_details:
    description: "Get item details"
    # ... other endpoint config ...
```

--------------------------------

### Run Replication with Connection Test

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_REPLICATION.md

Run a replication job once the connections it uses have been tested. The snippet shows only the 'run' action, which takes the path to the replication file.

```json
{
  "action": "run",
  "input": {
    "file_path": "/path/to/replication.yaml"
  }
}
```

--------------------------------

### Execute Replication

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_REPLICATION.md

Use the 'run' action to execute a replication job based on a configuration file. Options include selecting specific streams, setting a working directory, specifying a backfill range, overriding the mode, and passing environment variables.

```json
{
  "action": "run",
  "input": {
    "file_path": "/path/to/replication.yaml",
    "select_streams": ["specific_table"],
    "working_dir": "/project/directory",
    "range": "2024-01-01,2024-01-31",
    "mode": "incremental",
    "env": {
      "CUSTOM_VAR": "value",
      "SLING_THREADS": "5"
    }
  }
}
```

--------------------------------

### Sling MCP Database Tool Actions

Source: https://context7.com/slingdata-io/sling-cli/llms.txt

Query databases in a read-only manner. Supports querying data, retrieving schemata, and getting table columns.

```json
{
  "action": "query",
  "input": {
    "connection": "MY_PG",
    "query": "SELECT status, COUNT(*) FROM orders GROUP BY status",
    "description": "Get order counts by status",
    "limit": 100
  }
}
```

```json
{
  "action": "get_schemata",
  "input": {
    "connection": "MY_PG",
    "level": "column",
    "schema_name": "public",
    "table_names": ["users", "orders"]
  }
}
```

```json
{
  "action": "get_columns",
  "input": {
    "connection": "MY_PG",
    "table_name": "public.orders"
  }
}
```

--------------------------------

### Compile Sling CLI from Source on Windows (PowerShell)

Source: https://github.com/slingdata-io/sling-cli/blob/main/README.md

Clone the Sling CLI repository, change directory, and build the project using the PowerShell script. Execute `.\sling --help` to confirm.

```powershell
git clone https://github.com/slingdata-io/sling-cli.git
cd sling-cli

.\scripts\build.ps1

.\sling --help
```

--------------------------------

### Get Schemata with Specific Tables and Levels

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION_DATABASE.md

Retrieve schema information for specific tables and levels of detail. This allows for targeted exploration of your database structure.

```json
{
  "action": "get_schemata",
  "input": {
    "connection": "MY_POSTGRES",
    "level": "table",
    "schema_name": "public",
    "table_names": ["users", "orders"]
  }
}
```

--------------------------------

### Get Schemata with Table Level Detail

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION_DATABASE.md

Retrieve a list of tables within a specific schema. This is useful for understanding the structure of your database at the table level.

```json
{
  "action": "get_schemata",
  "input": {
    "connection": "MY_POSTGRES",
    "level": "table",
    "schema_name": "public"
  }
}
```

--------------------------------

### Sync State Validation: Valid Configuration

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_API_SPEC.md

Ensures that each key in the `sync` array has a corresponding processor that writes to `state.<key>`. This example shows a valid configuration.

```yaml
endpoints:
  valid:
    sync: ["last_id"]
    response:
      processors:
        - expression: "record.id"
          output: "state.last_id"  # ✅ Matches sync key
          aggregation: "last"
```
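
The rule can be sketched as a small check in Python. This is a hypothetical helper (`missing_sync_outputs`) operating on a plain dictionary that mirrors the YAML above; it is not Sling's own validator.

```python
def missing_sync_outputs(endpoint: dict) -> list:
    """Return the sync keys that no processor writes to as state.<key>."""
    sync_keys = endpoint.get("sync", [])
    processors = endpoint.get("response", {}).get("processors", [])
    outputs = {processor.get("output") for processor in processors}
    return [key for key in sync_keys if f"state.{key}" not in outputs]


# Dictionary form of the valid endpoint shown above.
valid_endpoint = {
    "sync": ["last_id"],
    "response": {
        "processors": [
            {"expression": "record.id", "output": "state.last_id", "aggregation": "last"}
        ]
    },
}

print(missing_sync_outputs(valid_endpoint))  # prints []
```

An empty list means every sync key has a matching processor output.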

--------------------------------

### Backfill with Incremental Fallback using Context Variables

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_API_SPEC.md

An endpoint configuration that supports backfilling historical data and falls back to incremental updates. The iteration range is built from `context.range_start`, `context.range_end`, and `sync.last_date`, with `coalesce` picking the first available value in priority order.

```yaml
endpoints:
  events:
    sync: [last_date]
    iterate:
      over: >
        range(
          coalesce(context.range_start, sync.last_date, "2024-01-01"),  # Priority order
          coalesce(context.range_end, date_format(now(), "%Y-%m-%d")),
          "1d"
        )
      into: "state.current_date"
    response:
      processors:
        - expression: "state.current_date"
          output: "state.last_date"
          aggregation: "maximum"

# Replication config for backfill:
# source_options:
#   range: '2024-01-01,2024-01-31'  # Sets context.range_start/end
```
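
The priority order can be illustrated with a short Python sketch. The function names `coalesce` and `resolve_range` are hypothetical stand-ins, and the sketch treats only `None` as missing; how Sling handles empty strings is not covered here.

```python
from datetime import date


def coalesce(*values):
    """Return the first argument that is not None, or None if all are."""
    for value in values:
        if value is not None:
            return value
    return None


def resolve_range(context: dict, sync: dict, today: date):
    """Pick start and end: backfill range first, then sync state, then defaults."""
    start = coalesce(context.get("range_start"), sync.get("last_date"), "2024-01-01")
    end = coalesce(context.get("range_end"), today.strftime("%Y-%m-%d"))
    return start, end


today = date(2024, 3, 15)
# Backfill run: the explicit range wins.
print(resolve_range({"range_start": "2024-01-01", "range_end": "2024-01-31"}, {}, today))
# Incremental run: no range given, so the stored sync value is used.
print(resolve_range({}, {"last_date": "2024-03-10"}, today))
```

The first call models a backfill with `source_options.range` set; the second models a regular incremental run that resumes from the stored `last_date`.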

--------------------------------

### Default Table Keys Configuration

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_REPLICATION.md

Define default table keys for primary, index, cluster, and partition columns, with specific examples for different database systems.

```yaml
defaults:
  target_options:
    table_keys:
      primary: [id]
      index: [customer_id, created_at]
      cluster: [region]  # BigQuery/Snowflake
      partition: [date_column]  # PostgreSQL/ClickHouse
```

--------------------------------

### File Discovery Workflow - Inspect Files

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION_FILE.md

Inspect specific files identified during the discovery process to get their metadata. This is useful for understanding file properties before further action.

```json
{
  "action": "inspect",
  "input": {
    "connection": "MY_STORAGE",
    "path": "data/2024/large_dataset.parquet"
  }
}
```

--------------------------------

### Backup and Sync Pattern - Create Backup

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION_FILE.md

Create a timestamped backup of local data to cloud storage recursively. This ensures that backups are organized by date.

```json
{
  "action": "copy",
  "input": {
    "source_location": "LOCAL//important/data/",
    "target_location": "BACKUP_S3/backups/2024-08-24/data/",
    "recursive": true
  }
}
```
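
If the backup date should not be hard-coded, the request can be generated. The sketch below builds the same payload with the date filled in; `backup_request` is a hypothetical helper, and the `backups/<date>/data/` layout is simply copied from the example above.

```python
import json
from datetime import date


def backup_request(source: str, backup_conn: str, day: date) -> dict:
    """Build a recursive copy request whose target folder is named after the date."""
    return {
        "action": "copy",
        "input": {
            "source_location": source,
            "target_location": f"{backup_conn}/backups/{day.isoformat()}/data/",
            "recursive": True,
        },
    }


request = backup_request("LOCAL//important/data/", "BACKUP_S3", date.today())
print(json.dumps(request, indent=2))
```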

--------------------------------

### Directory Statistics

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION_FILE.md

Get recursive statistics for a local directory, including total size, file count, and directory count. This is useful for understanding directory contents.

```json
{
  "action": "inspect",
  "input": {
    "connection": "LOCAL_FS",
    "path": "/var/log/",
    "recursive": true
  }
}
```

--------------------------------

### Backup and Sync Pattern - Verify Backup

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION_FILE.md

Verify the integrity and size of the created backup by inspecting the backup location recursively. This confirms the backup process completed successfully.

```json
{
  "action": "inspect",
  "input": {
    "connection": "BACKUP_S3",
    "path": "backups/2024-08-24/",
    "recursive": true
  }
}
```

--------------------------------

### Aggregate Data with IF Condition

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_API_SPEC.md

Example of a processor gated by an `if` condition. The condition is evaluated before the expression: when it returns false, the processor is skipped for that record; when it returns true, the expression result is aggregated into `state.total_amount`.

```yaml
processors:
  # Example using aggregation with IF condition
  - expression: "record.amount"
    # Optional: Control whether processor is evaluated with IF condition
    # - Evaluated BEFORE the main expression
    # - Must return a boolean value (true/false)
    # - If false, this processor is COMPLETELY SKIPPED for the current record
    # - Has access to full state map: record, state, response, env, secrets
    # - Common use: Filter records, skip nulls, conditional logic
    if: '!is_null(record.amount) && record.amount > 0'
    # Target output for aggregation must be 'state.<variable>'
    output: "state.total_amount"
    # Aggregation type: maximum, minimum, collect, first, last (default: none)
    aggregation: "maximum"
```
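
The evaluation order described in the comments can be sketched in Python. This is a simplified model, not Sling's engine: `apply_processor` is a hypothetical function, and the condition and expression are passed as plain callables instead of expression strings.

```python
def apply_processor(records, state, key, condition, expression, aggregation):
    """Fold expression(record) into state[key] for records that pass condition."""
    for record in records:
        if not condition(record):
            continue  # the processor is skipped entirely for this record
        value = expression(record)
        current = state.get(key)
        if aggregation == "maximum":
            state[key] = value if current is None else max(current, value)
        elif aggregation == "minimum":
            state[key] = value if current is None else min(current, value)
        elif aggregation == "collect":
            state.setdefault(key, []).append(value)
        elif aggregation == "first":
            state.setdefault(key, value)
        elif aggregation == "last":
            state[key] = value
        else:
            raise ValueError(f"unsupported aggregation: {aggregation}")
    return state


records = [{"amount": 40}, {"amount": None}, {"amount": -5}, {"amount": 75}]
state = apply_processor(
    records,
    {},
    "total_amount",
    condition=lambda r: r["amount"] is not None and r["amount"] > 0,
    expression=lambda r: r["amount"],
    aggregation="maximum",
)
print(state)  # prints {'total_amount': 75}
```

The null and negative amounts never reach the expression. With `maximum`, the key ends up holding the largest qualifying amount, not a sum.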

--------------------------------

### Using 'store' for Cross-Step Communication

Source: https://context7.com/slingdata-io/sling-cli/llms.txt

Utilize the 'store' step to save values and access them in subsequent steps using the 'store.<key>' syntax, enabling inter-step data sharing.

```yaml
steps:
  # Use store values for cross-step communication
  - type: store
    key: target_env
    value: "production"

  - type: log
    message: "Deploying to {store.target_env}"
    if: "store.target_env == 'production'"
```
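
As a rough mental model, the `{store.<key>}` placeholder behaves like a template lookup against a shared dictionary. The Python sketch below shows that idea; `render` is a hypothetical helper and does not reflect how Sling resolves variables internally.

```python
import re


def render(message: str, store: dict) -> str:
    """Replace each {store.<key>} placeholder with the stored value."""
    return re.sub(
        r"\{store\.(\w+)\}",
        lambda match: str(store[match.group(1)]),
        message,
    )


store = {}
store["target_env"] = "production"  # what the 'store' step does

if store["target_env"] == "production":  # the 'if' on the log step
    print(render("Deploying to {store.target_env}", store))
```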

--------------------------------

### Structuring Configuration with Defaults

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_REPLICATION.md

Define common replication patterns in a 'defaults' section to reduce repetition. Override defaults only when necessary for specific streams.

```yaml
# Define common patterns in defaults
defaults:
  mode: incremental
  object: 'warehouse.{stream_schema}_{stream_table}'
  primary_key: [id]
  update_key: updated_at
  
  target_options:
    column_casing: snake
    add_new_columns: true

# Override only when necessary
streams:
  public.users:           # Uses all defaults
  
  public.logs:           # Override for specific needs
    mode: full-refresh
    primary_key: []
```
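
The effect of defaults and overrides can be pictured as a dictionary overlay. The Python sketch below is an approximation: `resolve_stream` is a hypothetical function, and the shallow merge is an assumption (Sling may merge nested sections such as `target_options` differently).

```python
def resolve_stream(defaults: dict, overrides, stream_name: str) -> dict:
    """Overlay stream settings on the defaults and expand the object template."""
    config = {**defaults, **(overrides or {})}
    schema, table = stream_name.split(".", 1)
    config["object"] = (
        config["object"]
        .replace("{stream_schema}", schema)
        .replace("{stream_table}", table)
    )
    return config


defaults = {
    "mode": "incremental",
    "object": "warehouse.{stream_schema}_{stream_table}",
    "primary_key": ["id"],
    "update_key": "updated_at",
}
streams = {
    "public.users": None,  # uses all defaults
    "public.logs": {"mode": "full-refresh", "primary_key": []},
}

for name, overrides in streams.items():
    print(name, resolve_stream(defaults, overrides, name))
```

Here `public.users` keeps every default, while `public.logs` changes only `mode` and `primary_key` and still inherits `update_key` and the object template.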

--------------------------------

### Daily Data Warehouse Load Pipeline

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_PIPELINE.md

An example pipeline that performs a data warehouse load, runs dbt models, and sends Slack notifications for success or failure.

```yaml
env:
  SLACK_WEBHOOK_URL: "..."

steps:
  - type: log
    message: "Starting daily data warehouse load"

  - type: replication
    path: "replications/pg_to_sflake.yaml"
    mode: "incremental"
    id: dw_load

  - type: command
    command: "dbt run --select my_models"
    working_dir: "/path/to/dbt/project"
    if: state.dw_load.status == "success"

  - type: http
    url: "{env.SLACK_WEBHOOK_URL}"
    method: "POST"
    payload: '{"text": "Daily DW load completed successfully!"}'
    if: state.dw_load.status == "success"

  - type: http
    url: "{env.SLACK_WEBHOOK_URL}"
    method: "POST"
    payload: '{"text": "ERROR: Daily DW load failed!"}'
    if: state.dw_load.status == "error"
```
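
The branching in this pipeline can be traced with a small Python simulation. It is a sketch under assumptions: `run_pipeline` and the step names are hypothetical, conditions are plain callables, and a failed step is assumed not to abort the run so that the error notification can fire.

```python
def run_pipeline(steps, run_step):
    """Run steps in order, skipping any whose 'if' condition is false."""
    state = {}
    executed = []
    for step in steps:
        condition = step.get("if")
        if condition is not None and not condition(state):
            continue  # condition is false, so the step is skipped
        status = run_step(step)
        executed.append(step["name"])
        if "id" in step:
            # Later steps read this as state.<id>.status
            state[step["id"]] = {"status": status}
    return executed


STEPS = [
    {"name": "load", "id": "dw_load"},
    {"name": "dbt", "if": lambda s: s["dw_load"]["status"] == "success"},
    {"name": "notify_ok", "if": lambda s: s["dw_load"]["status"] == "success"},
    {"name": "notify_error", "if": lambda s: s["dw_load"]["status"] == "error"},
]

print(run_pipeline(STEPS, lambda step: "success"))
print(run_pipeline(STEPS, lambda step: "error" if step["name"] == "load" else "success"))
```

The first run executes the load, dbt, and success notification; the second runs only the load and the error notification.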

--------------------------------

### Configure Source and Target Options

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_PYTHON.md

Customize source and target behavior using `SourceOptions` and `TargetOptions`, specifying formats, delimiters, compression, and file settings.

```python
src_opts = SourceOptions(
    format=Format.CSV,
    delimiter="|",
    header=True,
    null_if="NULL",
)

tgt_opts = TargetOptions(
    format=Format.PARQUET,
    compression=Compression.ZSTD,
    file_max_rows=100000,
    column_casing="snake"
)

sling = Sling(
    src_stream="file://input.csv",
    src_options=src_opts,
    tgt_object="file://output.parquet",
    tgt_options=tgt_opts
)
sling.run()

```

--------------------------------

### MySQL: Get Database Size by Schema

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION_DATABASE.md

Calculate and display the size of each database schema in megabytes for a MySQL instance, providing a storage overview. Requires a MySQL connection.

```json
{
  "action": "query",
  "input": {
    "connection": "MY_MYSQL",
    "query": "SELECT table_schema, ROUND(SUM(data_length + index_length) / 1024 / 1024, 2) AS 'DB Size in MB' FROM information_schema.tables GROUP BY table_schema",
    "description": "Get size of each database schema in MB for storage overview"
  }
}
```

--------------------------------

### Programmatically Configure and Execute Sling Task in Go

Source: https://context7.com/slingdata-io/sling-cli/llms.txt

Builds a Sling configuration object programmatically, prepares it for execution by resolving connections and validating settings, determines the task type, and then executes the task. This is useful for dynamic task creation.

```go
package main

import (
    "github.com/slingdata-io/sling-cli/core/sling"
    "github.com/flarco/g"
)

func main() {
    // Build config programmatically
    cfg := &sling.Config{
        Source: sling.Source{
            Conn:   "MY_POSTGRES",
            Stream: "public.transactions",
            UpdateKey: "updated_at",
            PrimaryKeyI: []string{"transaction_id"},
            Options: &sling.SourceOptions{
                Limit: g.Int(10000),
            },
        },
        Target: sling.Target{
            Conn:   "MY_SNOWFLAKE",
            Object: "analytics.transactions",
            Options: &sling.TargetOptions{
                AddNewColumns:  g.Bool(true),
                UseBulk:        g.Bool(true),
            },
        },
        Mode: sling.IncrementalMode,
        Env: map[string]string{
            "SLING_THREADS": "5",
        },
    }

    // Prepare resolves connections, validates config, sets defaults
    err := cfg.Prepare()
    if err != nil {
        g.LogFatal(err, "could not prepare config")
    }

    // Determine task type: DbToDb, FileToDB, DbToFile, ApiToDB, etc.
    taskType, err := cfg.DetermineType()
    if err != nil {
        g.LogFatal(err)
    }
    g.Info("Task type: %s", taskType) // Task type: db-db

    // Execute
    task := sling.NewTask("my-exec-id", cfg)
    if err = task.Execute(); err != nil {
        g.LogFatal(err)
    }
    g.Info("Done. Rows: %d", task.GetCount())
}
```

--------------------------------

### Sample Data from a Table

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION_DATABASE.md

Preview a small number of rows from a table to inspect data values and format. This is a common step in basic data exploration.

```json
{
  "action": "query",
  "input": {
    "connection": "MY_DB",
    "query": "SELECT * FROM production.users LIMIT 3",
    "description": "Sample 3 rows from production.users to inspect data values and format"
  }
}
```

--------------------------------

### Discover Available Streams

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_REPLICATION.md

Use the discover action to list available streams from a source connection, optionally filtering by a pattern.

```json
{
  "action": "discover",
  "input": {
    "connection": "MY_SOURCE_DB",
    "pattern": "schema.*"
  }
}
```

--------------------------------

### Advanced Transformations with Staged Logic

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_REPLICATION.md

Define complex business logic using staged transformations. This example shows data cleansing, computed metrics, segmentation, and risk scoring.

```yaml
streams:
  customer_analytics:
    transforms:
      # Stage 1: Data cleansing
      - email: 'lower(trim_space(value))'
        phone: 'replace(value, "[^0-9+]", "")'
        name: 'trim_space(value)'
      
      # Stage 2: Computed metrics
      - days_since_signup: 'date_diff(now(), record.created_at, "day")'
        lifetime_value: 'coalesce(record.total_spent, 0) * 1.2'
      
      # Stage 3: Segmentation
      - customer_segment: |
          record.lifetime_value >= 10000 ? "enterprise" : (
            record.lifetime_value >= 1000 ? "professional" : (
              record.days_since_signup <= 30 ? "new" : "standard"
            )
          )
      
      # Stage 4: Risk scoring
      - risk_score: |
          (record.failed_payments * 0.3) + 
          (record.days_since_last_login * 0.1) + 
          (record.support_tickets * 0.2)
```
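
The key property of staged transforms is that a later stage can read fields produced by an earlier one. The Python sketch below models that for the first three stages; `run_stages`, `segment`, and `build_stages` are hypothetical names, and the functions only approximate the Sling expressions above.

```python
from datetime import date


def segment(lifetime_value: float, days_since_signup: int) -> str:
    """Mirror of the nested ternary in Stage 3."""
    if lifetime_value >= 10000:
        return "enterprise"
    if lifetime_value >= 1000:
        return "professional"
    return "new" if days_since_signup <= 30 else "standard"


def run_stages(record: dict, stages: list) -> dict:
    """Apply each stage in order; later stages see fields set by earlier ones."""
    for stage in stages:
        updates = {field: compute(record) for field, compute in stage.items()}
        record = {**record, **updates}
    return record


def build_stages(today: date) -> list:
    return [
        # Stage 1: data cleansing
        {"email": lambda r: r["email"].strip().lower(),
         "name": lambda r: r["name"].strip()},
        # Stage 2: computed metrics
        {"days_since_signup": lambda r: (today - r["created_at"]).days,
         "lifetime_value": lambda r: (r.get("total_spent") or 0) * 1.2},
        # Stage 3: segmentation, using the Stage 2 outputs
        {"customer_segment": lambda r: segment(r["lifetime_value"], r["days_since_signup"])},
    ]


raw = {"email": "  Ada@Example.COM ", "name": " Ada ",
       "created_at": date(2024, 5, 20), "total_spent": None}
print(run_stages(raw, build_stages(date(2024, 6, 1))))
```

Because `customer_segment` sits in its own stage, it can rely on `lifetime_value` and `days_since_signup` already being present on the record.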

--------------------------------

### Build Sling Binary

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/DEBUG.md

Builds the sling binary in the cmd/sling directory. This is a prerequisite for debugging.

```bash
cd cmd/sling
go build .
```

--------------------------------

### Enable Debug Logging for Run Action

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_REPLICATION.md

Run a replication with the DEBUG environment variable set to true to enable detailed debug logging.

```json
{
  "action": "run",
  "input": {
    "file_path": "/path/to/replication.yaml",
    "env": {
      "DEBUG": "true"
    }
  }
}
```

--------------------------------

### Query Data Distribution by Status

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION_DATABASE.md

Analyze the distribution of data across different categories, such as order statuses, to understand data breakdown. This example groups by status and limits results.

```json
{
  "action": "query",
  "input": {
    "connection": "MY_DB",
    "query": "SELECT status, COUNT(*) as count FROM sales.orders GROUP BY status ORDER BY count DESC LIMIT 10",
    "description": "Analyze order status distribution to understand the breakdown of order states"
  }
}
```

--------------------------------

### get_user_details

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_API_SPEC.md

Retrieve details for each user ID from the queue. This endpoint iterates over user IDs provided in the 'user_ids' queue, making concurrent GET requests to fetch user details.

````APIDOC
## GET /users/{state.current_user_id}

### Description
Retrieve details for each user ID from the queue. This endpoint iterates over user IDs provided in the 'user_ids' queue, making concurrent GET requests to fetch user details.

### Method
GET

### Endpoint
{state.base_url}/users/{state.current_user_id}

### Parameters
#### Path Parameters
- **state.current_user_id** (string) - Required - The ID of the user to retrieve details for, iterated from the 'user_ids' queue.

### Request Example
```json
{
  "example": "No request body for GET request"
}
```

### Response
#### Success Response (200)
- **user** (object) - Contains the user details.
  - **id** (string) - The unique identifier for the user.

#### Response Example
```json
{
  "user": {
    "id": "123e4567-e89b-12d3-a456-426614174000",
    "name": "John Doe",
    "email": "john.doe@example.com"
  }
}
```

### Iteration Configuration
- **over**: "queue.user_ids"
- **into**: "state.current_user_id"
- **concurrency**: 10
````
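
The iteration settings (a queue of IDs, one request per ID, at most 10 in flight) resemble a bounded worker pool. The Python sketch below shows that pattern with the standard library; `fetch_all` and `fake_fetch` are hypothetical, and the fetch function is a stand-in that makes no network call.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(user_ids, fetch_one, concurrency=10):
    """Call fetch_one for every queued ID using at most `concurrency` workers."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # pool.map returns results in the same order as the input IDs.
        return list(pool.map(fetch_one, user_ids))


def fake_fetch(user_id):
    """Stand-in for GET {base_url}/users/{user_id}; returns a canned response."""
    return {"user": {"id": user_id}}


queue = ["u1", "u2", "u3"]
for response in fetch_all(queue, fake_fetch):
    print(response["user"]["id"])
```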

--------------------------------

### Query Total Records in a Table

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION_DATABASE.md

Use this action to get the total row count of a table to understand data volume. Requires a valid connection name and SQL query.

```json
{
  "action": "query",
  "input": {
    "connection": "MY_DB",
    "query": "SELECT COUNT(*) as total_records FROM sales.orders",
    "description": "Get total row count of sales.orders to understand data volume"
  }
}
```

--------------------------------

### Run Specific Test Numbers/Ranges with Go Test

Source: https://github.com/slingdata-io/sling-cli/blob/main/README.md

Shows how to pass a test selector after the `--` separator when running `go test`: a comma-separated list of test numbers, a range, or a number followed by `+` for that test and the subsequent ones.

```bash
go test -v -run TestCLI -- "1,2,3"
go test -v -run TestSuiteFileS3 -- "1-5"
go test -v -run TestCLI -- "3+"
```
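
The three selector forms can be read as follows. This Python sketch is one possible interpretation, not the parser used by the Sling test suite: `select_tests` is a hypothetical function, and reading `3+` as "test 3 and every later test" is an assumption.

```python
def select_tests(selector: str, total: int) -> list:
    """Expand a selector such as "1,2,3", "1-5" or "3+" into test numbers."""
    selected = set()
    for part in selector.split(","):
        part = part.strip()
        if part.endswith("+"):
            # "N+" means test N and everything after it.
            selected.update(range(int(part[:-1]), total + 1))
        elif "-" in part:
            low, high = part.split("-", 1)
            selected.update(range(int(low), int(high) + 1))
        else:
            selected.add(int(part))
    # Keep only numbers that exist in a suite of `total` tests.
    return sorted(n for n in selected if 1 <= n <= total)


print(select_tests("1,2,3", 10))
print(select_tests("1-5", 10))
print(select_tests("3+", 6))
```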

--------------------------------

### Get Schema Names

Source: https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/resource/llm_CONNECTION_DATABASE.md

Retrieve a simple list of all schema names available in the specified database connection. This is a quick way to see available schemas without detailed metadata.

```json
{
  "action": "get_schemas",
  "input": {
    "connection": "MY_POSTGRES"
  }
}
```