### Install Development Dependencies

Source: https://github.com/cre-dev/xml2db/blob/main/README.md

Install additional development dependencies for testing and documentation, after cloning the repository.

```bash
pip install -e ".[tests,docs]"
```

--------------------------------

### Install xml2db Package

Source: https://github.com/cre-dev/xml2db/blob/main/README.md

Install the xml2db package using pip. It is recommended to do this within a virtual environment.

```bash
pip install xml2db
```

--------------------------------

### Install Project with Dev Dependencies

Source: https://github.com/cre-dev/xml2db/blob/main/CLAUDE.md

Installs the project in editable mode with the `tests` and `docs` extras, plus the `duckdb_engine` and `pytz` packages.

```bash
pip install -e ".[tests,docs]" duckdb_engine pytz
```

--------------------------------

### Install xml2db in Editable Mode

Source: https://github.com/cre-dev/xml2db/blob/main/docs/getting_started.md

For development, clone the repository and install xml2db in editable mode with development dependencies.

```bash
pip install -e ".[docs,tests]"
```

--------------------------------

### Load XML into Database

Source: https://github.com/cre-dev/xml2db/blob/main/README.md

Use this snippet to create a data model from an XSD, parse an XML file, and insert its content into a relational database. Ensure you have the necessary database driver installed.

```python
from xml2db import DataModel

# Create a data model of tables with relations based on the XSD file
data_model = DataModel(
    xsd_file="path/to/file.xsd", 
    connection_string="postgresql+psycopg2://testuser:testuser@localhost:5432/testdb",
)
# Parse an XML file based on this XSD
document = data_model.parse_xml(
    xml_file="path/to/file.xml"
)
# Insert the document content into the database
document.insert_into_target_tables()
```

--------------------------------

### Add Custom SQLAlchemy Index

Source: https://github.com/cre-dev/xml2db/blob/main/docs/configuring.md

Pass extra arguments to SQLAlchemy's Table constructor to customize indexes. This example demonstrates adding a custom index on multiple columns for a specific table.

```python
import sqlalchemy

model_config = {
    "tables": {
        "my_table": {
            "extra_args": sqlalchemy.Index("my_index", "my_column1", "my_column2"),
        }
    }
}
```

--------------------------------

### Run a Specific Test by Name

Source: https://github.com/cre-dev/xml2db/blob/main/CLAUDE.md

Executes a specific test identified by its name, for example, 'test_iterative_recursive_parsing'.

```bash
pytest -k "test_iterative_recursive_parsing"
```

--------------------------------

### Serve Documentation Locally

Source: https://github.com/cre-dev/xml2db/blob/main/CLAUDE.md

Builds and serves the project's documentation locally using MkDocs, allowing for previewing changes.

```bash
mkdocs serve
```

--------------------------------

### Loading XML into a Database

Source: https://github.com/cre-dev/xml2db/blob/main/docs/index.md

This snippet demonstrates how to initialize a DataModel from an XSD file and a database connection string, parse an XML file, and load the data into the database tables.

```python
from xml2db import DataModel

# Create a DataModel object from an XSD file
data_model = DataModel(
    xsd_file="path/to/file.xsd", 
    connection_string="postgresql+psycopg2://testuser:testuser@localhost:5432/testdb",
)

# Parse an XML file based on this XSD schema
document = data_model.parse_xml(xml_file="path/to/file.xml")

# Load data into the database, creating target tables if need be
document.insert_into_target_tables()
```

--------------------------------

### Create a DataModel object

Source: https://github.com/cre-dev/xml2db/blob/main/docs/getting_started.md

Create a DataModel object by providing the path to an XSD file, the target database schema name, and a SQLAlchemy connection string. An optional model configuration can also be provided.

```python
from xml2db import DataModel

data_model = DataModel(
    xsd_file="path/to/file.xsd",
    db_schema="source_data", # the name of the database target schema
    connection_string="postgresql+psycopg2://testuser:testuser@localhost:5432/testdb",
    model_config={},
)
```

--------------------------------

### Regenerate Snapshot Tests

Source: https://github.com/cre-dev/xml2db/blob/main/CLAUDE.md

Navigates to the sample models directory and runs the 'models.py' script to regenerate snapshot files for model outputs.

```bash
cd tests/sample_models && python models.py
```

--------------------------------

### Run Tests Against a Real Database

Source: https://github.com/cre-dev/xml2db/blob/main/CLAUDE.md

Executes tests against a persistent database using a provided connection string, here shown with PostgreSQL and psycopg2.

```bash
DB_STRING="postgresql+psycopg2://user:pass@localhost/testdb" pytest
```

--------------------------------

### Run Conversion Tests Only

Source: https://github.com/cre-dev/xml2db/blob/main/README.md

Run only the conversion tests that do not require a database connection. This is useful for quick checks.

```bash
pytest -m "not dbtest"
```

--------------------------------

### Run a Specific Test File

Source: https://github.com/cre-dev/xml2db/blob/main/CLAUDE.md

Executes tests located within a particular file, such as 'tests/test_conversions.py'.

```bash
pytest tests/test_conversions.py
```

--------------------------------

### n-n Relationship Modeling with Third Table

Source: https://github.com/cre-dev/xml2db/blob/main/docs/how_it_works.md

Demonstrates how n-n relationships are represented in a SQL model using an additional table to hold the relationship.

```mermaid
erDiagram
          CONTRACT ||--|{ CONTRACT_DELIVERY_PROFILE : is_in
          CONTRACT_DELIVERY_PROFILE }|--|| DELIVERY_PROFILE : involves
```
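
As a rough illustration of the diagram above, the sketch below builds the three tables in an in-memory SQLite database with Python's standard `sqlite3` module and resolves the n-n relationship through the third table. The table and column names (`contract`, `delivery_profile`, `contract_delivery_profile`, `pk_*`, `fk_*`) and the sample rows are assumptions made for this example; they are not the names or data xml2db generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE contract (pk_contract INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE delivery_profile (pk_delivery_profile INTEGER PRIMARY KEY, days TEXT);
    -- The third table holds one row per (contract, delivery profile) pair.
    CREATE TABLE contract_delivery_profile (
        fk_contract INTEGER REFERENCES contract (pk_contract),
        fk_delivery_profile INTEGER REFERENCES delivery_profile (pk_delivery_profile)
    );
    INSERT INTO contract VALUES (1, 'A'), (2, 'B');
    INSERT INTO delivery_profile VALUES (10, 'MO-FR'), (11, 'SA-SU');
    INSERT INTO contract_delivery_profile VALUES (1, 10), (1, 11), (2, 10);
    """
)


def profiles_for(contract_name):
    """Return the delivery profiles linked to a contract via the third table."""
    rows = conn.execute(
        """
        SELECT dp.days
        FROM contract AS c
        JOIN contract_delivery_profile AS cdp ON cdp.fk_contract = c.pk_contract
        JOIN delivery_profile AS dp ON dp.pk_delivery_profile = cdp.fk_delivery_profile
        WHERE c.name = ?
        ORDER BY dp.pk_delivery_profile
        """,
        (contract_name,),
    ).fetchall()
    return [days for (days,) in rows]
```

Here the `MO-FR` profile is shared by both contracts, and contract `A` has two profiles, which is exactly the case a plain foreign key on either side could not represent.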

--------------------------------

### Run All Tests

Source: https://github.com/cre-dev/xml2db/blob/main/README.md

Execute all tests for the xml2db package using the pytest command.

```bash
python -m pytest
```

--------------------------------

### Data Loading Flowchart

Source: https://github.com/cre-dev/xml2db/blob/main/docs/api/overview.md

Visual representation of the data loading process from an XML file into database tables, detailing the functions involved in lower-level steps. Useful for understanding advanced data transformation and loading scenarios.

```mermaid
flowchart TB
    subgraph "<a href='../data_model/#xml2db.model.DataModel.parse_xml' style='color:var(--md-code-fg-color)'>DataModel.parse_xml</a>"
        direction TB
        A[XML file]-- "<a href='../xml_converter/#xml2db.xml_converter.XMLConverter.parse_xml' style='color:var(--md-code-fg-color)'>XMLConverter.parse_xml</a>" -->B[Document tree]
        B-- "<a href='../document/#xml2db.document.Document.doc_tree_to_flat_data' style='color:var(--md-code-fg-color)'>Document.doc_tree_to_flat_data</a>" -->C[Flat data model]
    end
    C -.- D
    subgraph "<a href='../document/#xml2db.document.Document.insert_into_target_tables' style='color:var(--md-code-fg-color)'>Document.insert_into_target_tables</a>"
        direction TB
        D[Flat data model]-- "<a href='../document/#xml2db.document.Document.insert_into_temp_tables' style='color:var(--md-code-fg-color)'>Document.insert_into_temp_tables</a>" -->E[Temporary tables]
        E-- "<a href='../document/#xml2db.document.Document.merge_into_target_tables' style='color:var(--md-code-fg-color)'>Document.merge_into_target_tables</a>" -->F[Target tables]
    end
```

--------------------------------

### Data Model Visualization

Source: https://github.com/cre-dev/xml2db/blob/main/docs/index.md

This Mermaid diagram illustrates the structure of a data model extracted from an XSD file, showing tables and their relationships.

```mermaid
erDiagram
    Unavailability_MarketDocument ||--o{ TimeSeries : "TimeSeries*"
    Unavailability_MarketDocument ||--|{ Reason : "Reason*"
    Unavailability_MarketDocument {
        string mRID
        string revisionNumber
        NMTOKEN type
        NMTOKEN process_processType
        dateTime createdDateTime
        string sender_MarketParticipant_mRID
        NMTOKEN sender_MarketParticipant_marketRole_type
        string receiver_MarketParticipant_mRID
        NMTOKEN receiver_MarketParticipant_marketRole_type
        string unavailability_Time_Period_timeInterval_start
        string unavailability_Time_Period_timeInterval_end
        NMTOKEN docStatus_value
    }
    TimeSeries ||--o{ Available_Period : "Available_Period*"
    TimeSeries ||--o{ Available_Period : "WindPowerFeedin_Period*"
    TimeSeries ||--o{ Asset_RegisteredResource : "Asset_RegisteredResource*"
    TimeSeries ||--o{ Reason : "Reason*"
    TimeSeries {
        string mRID
        NMTOKEN businessType
        string biddingZone_Domain_mRID
        string in_Domain_mRID
        string out_Domain_mRID
        date start_DateAndOrTime_date
        time start_DateAndOrTime_time
        date end_DateAndOrTime_date
        time end_DateAndOrTime_time
        NMTOKEN quantity_Measure_Unit_name
        NMTOKEN curveType
        string production_RegisteredResource_mRID
        string production_RegisteredResource_name
        string production_RegisteredResource_location_name
        NMTOKEN production_RegisteredResource_pSRType_psrType
        string production_RegisteredResource_pSRType_powerSystemResources_mRID
        string production_RegisteredResource_pSRType_powerSystemResources_name
        float production_RegisteredResource_pSRType_powerSystemResources_nominalP
    }
    Available_Period ||--|{ Point : "Point*"
    Available_Period {
        string timeInterval_start
        string timeInterval_end
        duration resolution
    }
    Point {
        integer position
        decimal quantity
    }
    Asset_RegisteredResource {
        string mRID
        string name
        NMTOKEN asset_PSRType_psrType
        string location_name
    }
    Reason {
        NMTOKEN code
        string text
    }
```

--------------------------------

### Model Configuration General Structure

Source: https://github.com/cre-dev/xml2db/blob/main/docs/configuring.md

This dictionary structure outlines the general configuration options available for the xml2db data model. It shows top-level model settings and nested table-specific configurations.

```python
{
    "document_tree_hook": None,
    "document_tree_node_hook": None,
    "row_numbers": False,
    "as_columnstore": False,
    "metadata_columns": None,
    "tables": {
        "table1": {
            "reuse": True,
            "choice_transform": False,
            "as_columnstore": False,
            "fields": {
                "my_column": {
                    "type": None #default type
                }
            },
            "extra_args": [],
        }
    }
}
```
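
When many tables need the same override, the nested dictionary can be assembled programmatically. The helper below is a minimal sketch that only uses the `tables`, `reuse`, and `choice_transform` keys shown above; the function name `build_model_config` and its parameters are hypothetical and not part of xml2db.

```python
def build_model_config(no_reuse=(), no_choice_transform=()):
    """Build a model_config dict (hypothetical helper, not an xml2db API).

    no_reuse: table names for which "reuse" is set to False.
    no_choice_transform: table names for which "choice_transform" is set to False.
    """
    tables = {}
    for table in no_reuse:
        tables.setdefault(table, {})["reuse"] = False
    for table in no_choice_transform:
        tables.setdefault(table, {})["choice_transform"] = False
    return {"tables": tables}


model_config = build_model_config(
    no_reuse=["table1", "table2"],
    no_choice_transform=["table1"],
)
```

The resulting dictionary can then be passed as the `model_config` argument of `DataModel`.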

--------------------------------

### Configure Joining for Simple Types

Source: https://github.com/cre-dev/xml2db/blob/main/docs/configuring.md

Configure how simple type elements with specific XSD types and maximum occurrences are joined into a single column. This setting is currently always applied and cannot be disabled.

```python
model_config = {
   "tables": {
       "my_table_name": {
           "fields": {
               "my_field_name": {
                   "transform": "join"
               }
           }
       }
   }
}
```
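
To make the effect of `"join"` concrete, the toy function below collapses the repeated values of a simple element into one string, the way a single joined column would hold them. The comma separator, the empty-list handling, and the name `join_values` are assumptions for illustration only; the exact stored format should be checked against the xml2db documentation.

```python
def join_values(values, sep=","):
    """Collapse repeated simple-type values into a single column value."""
    if not values:
        return None  # no occurrence of the element: leave the column empty
    return sep.join(str(v) for v in values)


joined = join_values(["MO", "TU", "WE"])
```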

--------------------------------

### Write source tree and target tree to a file

Source: https://github.com/cre-dev/xml2db/blob/main/docs/getting_started.md

Generate text-based tree representations of the raw XML schema (source tree) and the simplified schema (target tree). This shows element names, data types, and cardinality.

```python
with open("source_tree.txt", "w") as f:
    f.write(data_model.source_tree)

with open("target_tree.txt", "w") as f:
    f.write(data_model.target_tree)
```

--------------------------------

### Parse an XML file

Source: https://github.com/cre-dev/xml2db/blob/main/docs/getting_started.md

Parses a single XML file and inserts its content into the target tables. The `data_model` object must be created before parsing.

```python
document = data_model.parse_xml(
    xml_file="path/to/file.xml",
)
document.insert_into_target_tables()
```

--------------------------------

### XML Schema Data Types and Database Mapping

Source: https://github.com/cre-dev/xml2db/blob/main/tests/sample_models/table1/table1_erd_version1.md

Illustrates the mapping of various XML data types to their database schema equivalents. This is useful for understanding how to represent complex XML structures in a relational database.

```plaintext
decimal undisclosedVolume_value
string undisclosedVolume_unit
string orderDuration_duration
dateTime orderDuration_expirationDateTime
priceIntervalQuantityDetails {
    date intervalStartDate
    date intervalEndDate
    string daysOfTheWeek
    time-N intervalStartTime
    time-N intervalEndTime
    decimal quantity
    string unit
    decimal priceTimeIntervalQuantity_value
    string priceTimeIntervalQuantity_currency
}
optionDetails {
    string optionStyle
    string optionType
    date-N optionExerciseDate
    decimal optionStrikePrice_value
    string optionStrikePrice_currency
}
fixingIndex {
    string indexName
    decimal indexValue
}
deliveryProfile {
    date loadDeliveryStartDate
    date loadDeliveryEndDate
    string-N daysOfTheWeek
    time-N loadDeliveryStartTime
    time-N loadDeliveryEndTime
}
contractTradingHours {
    time startTime
    time endTime
    date date
}
```

--------------------------------

### Load multiple XML files in one database operation

Source: https://github.com/cre-dev/xml2db/blob/main/docs/getting_started.md

Accumulates data from multiple XML files in memory before inserting into the database in a single batch. This optimizes performance for numerous small files. Metadata can be passed for each file.

```python
# `files` is assumed to be a non-empty iterable of XML file paths defined by the caller
flat_data = None
for xml_file in files:
    document = data_model.parse_xml(
        xml_file=xml_file,
        metadata={"input_file_path": xml_file},
        flat_data=flat_data,
    )
    flat_data = document.data
document.insert_into_target_tables()
```
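
Because everything is accumulated in memory, a very long file list may need to be split into groups, with one insert per group. The sketch below wraps the loop above in that way; `batched`, `load_in_batches`, and the `size` parameter are hypothetical names introduced here, while the `parse_xml` and `insert_into_target_tables` calls are the ones shown in the snippet above.

```python
from itertools import islice


def batched(paths, size):
    """Yield successive lists of at most `size` paths."""
    it = iter(paths)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def load_in_batches(data_model, paths, size):
    """Parse files in groups of `size`, inserting once per group."""
    for chunk in batched(paths, size):
        flat_data = None
        document = None
        for xml_file in chunk:
            document = data_model.parse_xml(
                xml_file=xml_file,
                metadata={"input_file_path": xml_file},
                flat_data=flat_data,
            )
            flat_data = document.data
        # Each chunk is non-empty, so `document` is always set here.
        document.insert_into_target_tables()
```

A suitable `size` depends on file size and available memory, so it is left to the caller.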

--------------------------------

### Run All Tests with DB Integration

Source: https://github.com/cre-dev/xml2db/blob/main/CLAUDE.md

Executes all tests, including those requiring database integration, using in-memory DuckDB. Sets the timezone to Europe/Paris.

```bash
TZ="Europe/Paris" DB_STRING="duckdb:///:memory:" python -m pytest
```

--------------------------------

### 1-n Relationship Conversion

Source: https://github.com/cre-dev/xml2db/blob/main/docs/how_it_works.md

Shows how a 1-n relationship becomes n-n after node reuse: once identical child nodes are stored only once, the same child row can belong to several parents.

```mermaid
erDiagram
          CONTRACT ||--|{ DELIVERY_PROFILE : delivers
          UNIQUE_CONTRACT }|--|{ UNIQUE_DELIVERY_PROFILE : delivers
```

--------------------------------

### Multiprocessing XML Loading with Database Lock

Source: https://github.com/cre-dev/xml2db/blob/main/docs/api/overview.md

Demonstrates parallel XML parsing across multiple processes, with database I/O serialised using a multiprocessing lock. This approach ensures data integrity for various database backends.

```python
import multiprocessing
from xml2db import DataModel


def load_one_file(xml_path, xsd_path, connection_string, lock):
    # Each process creates its own DataModel with a unique temp_prefix.
    model = DataModel(
        xsd_file=xsd_path,
        connection_string=connection_string,
    )
    # XML parsing is CPU-bound and runs in parallel across all processes.
    doc = model.parse_xml(xml_path)

    # Serialise all database I/O across processes.
    with lock:
        doc.insert_into_target_tables()
        model.engine.dispose()


if __name__ == "__main__":
    xsd_path = "schema.xsd"
    connection_string = "duckdb:///data.duckdb"
    xml_files = ["file1.xml", "file2.xml", "file3.xml"]

    lock = multiprocessing.Lock()
    processes = [
        multiprocessing.Process(
            target=load_one_file,
            args=(xml_path, xsd_path, connection_string, lock),
        )
        for xml_path in xml_files
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
        if p.exitcode != 0:
            raise RuntimeError(f"Worker failed with exit code {p.exitcode}")
```

--------------------------------

### 1-1 Relationship Conversion to n-1

Source: https://github.com/cre-dev/xml2db/blob/main/docs/how_it_works.md

Illustrates a 1-1 relationship converted to n-1 after node reuse, where UNIQUE_TRADE holds a foreign key to UNIQUE_CONTRACT.

```mermaid
erDiagram
          TRADE ||--|| CONTRACT : concerns
          UNIQUE_TRADE }|--|| UNIQUE_CONTRACT : concerns
```
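
The effect of node reuse on this relationship can be mimicked with plain dictionaries. The sketch below deduplicates nested contracts by a hash of their content, so several trades end up pointing at one contract row. This section does not show how xml2db identifies identical nodes, so the content-hash approach, the function names, and the sample data are all assumptions.

```python
import hashlib
import json


def content_key(node):
    """Stable hash of a node's content (a dict of simple values)."""
    payload = json.dumps(node, sort_keys=True).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


def deduplicate_contracts(trades):
    """Return (unique_trades, unique_contracts) from nested trade dicts."""
    contracts = {}  # content hash -> contract row
    unique_trades = []
    for trade in trades:
        key = content_key(trade["contract"])
        contracts.setdefault(key, trade["contract"])
        # The trade keeps only a reference to its contract: the n-1 side.
        unique_trades.append({"id": trade["id"], "fk_contract": key})
    return unique_trades, list(contracts.values())


trades = [
    {"id": 1, "contract": {"name": "base", "unit": "MWh"}},
    {"id": 2, "contract": {"unit": "MWh", "name": "base"}},
    {"id": 3, "contract": {"name": "peak", "unit": "MWh"}},
]
unique_trades, unique_contracts = deduplicate_contracts(trades)
```

Trades 1 and 2 carry identical contracts, so three trades reference only two contract rows.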

--------------------------------

### Write Entity Relationship Diagram to a file

Source: https://github.com/cre-dev/xml2db/blob/main/docs/getting_started.md

Generate a Mermaid-compatible Entity Relationship Diagram (ERD) for the data model and write it to a markdown file. This helps visualize tables and relationships.

```python
with open("target_data_model_erd.md", "w") as f:
    f.write(data_model.get_entity_rel_diagram())
```

--------------------------------

### Extract data back to XML

Source: https://github.com/cre-dev/xml2db/blob/main/docs/getting_started.md

Extracts data from the database based on a WHERE clause and saves it to an XML file. Primarily used for testing and round-trip validation.

```python
document = data_model.extract_from_database(
    root_select_where="xml2db_input_file_path='path/to/file.xml'",
)
document.to_xml("extracted_file.xml")
```
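
For round-trip validation, the extracted file then has to be compared with the original. One possible approach, sketched below, is to canonicalize both documents with the standard library (`xml.etree.ElementTree.canonicalize`, Python 3.8+) so that attribute order and indentation do not cause false differences. The helper name `same_xml` is hypothetical, and this is not necessarily how xml2db's own tests compare documents.

```python
from xml.etree.ElementTree import canonicalize


def same_xml(original, extracted):
    """Compare two XML strings after C14N canonicalization.

    strip_text=True removes leading/trailing whitespace in text content,
    so pretty-printing differences are ignored.
    """
    return canonicalize(original, strip_text=True) == canonicalize(
        extracted, strip_text=True
    )


compact = '<a b="1" c="2"><x>1</x></a>'
pretty = '<a c="2" b="1">\n  <x>1</x>\n</a>'
is_same = same_xml(compact, pretty)
```

To compare files rather than strings, `canonicalize` also accepts a `from_file` keyword argument in place of the XML string.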

--------------------------------

### Disable Choice Group Simplification

Source: https://github.com/cre-dev/xml2db/blob/main/docs/configuring.md

Use this configuration to prevent xml2db from simplifying choice groups with more than two options of the same data type. This is useful when you want to retain the original structure of choice groups.

```python
model_config = {
    "tables": {
        "my_table_name": {
            "choice_transform": False
        }
    }
}
```

--------------------------------

### Force Elevation of Complex Child Elements

Source: https://github.com/cre-dev/xml2db/blob/main/docs/configuring.md

Force the elevation of a complex child element to its parent, even if it has more than 5 fields. This can help simplify the data model by pulling child fields up to the parent level.

```python
model_config = {
    "tables": {
        "contract": {
            "fields": {
                "docStatus": {
                    "transform": "elevate"
                }
            }
        }
    }
}
```
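
Conceptually, elevation replaces a nested child with prefixed columns on the parent. The toy function below shows the idea on a plain dictionary. The `parent_child` naming follows column names such as `docStatus_value` that appear in the data model diagram in this document, but the function `elevate`, the separator default, and the sample values are assumptions.

```python
def elevate(parent, field, sep="_"):
    """Return a copy of `parent` with the nested dict `field` pulled up."""
    flat = {k: v for k, v in parent.items() if k != field}
    for child_key, value in parent.get(field, {}).items():
        flat[f"{field}{sep}{child_key}"] = value
    return flat


contract = {"mRID": "c-1", "docStatus": {"value": "A05"}}
flat_contract = elevate(contract, "docStatus")
```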

--------------------------------

### Disable Deduplication for a Table

Source: https://github.com/cre-dev/xml2db/blob/main/docs/configuring.md

Opt out of the default element deduplication behavior for a specific table. This can simplify the data model and potentially speed up queries if elements are mostly unique, at the cost of increased storage.

```python
model_config = {
    "tables": {
        "my_table": {"reuse": False}
    }
}
```

--------------------------------

### Override Default Column Type Mapping

Source: https://github.com/cre-dev/xml2db/blob/main/docs/configuring.md

Customize the SQLAlchemy data type for a specific column in your model configuration. This is useful when the default mapping does not meet your database requirements.

```python
import xml2db
from sqlalchemy.dialects import mssql

model_config = {
    "tables": {
        "my_table": {
            "fields": {
                "my_column": {
                    "type": mssql.BIGINT
                }
            }
        },
    },
}

data_model = xml2db.DataModel(
    xsd_file="path/to/file.xsd", db_schema="my_schema", model_config=model_config
)
```
