### Full Docker Compose Setup

Source: https://github.com/datajoint/datajoint-python/blob/master/CONTRIBUTING.md

Build and start the DataJoint test environment using Docker Compose.

```bash
docker compose --profile test up djtest --build
```

--------------------------------

### Alternative Development Setup with pip

Source: https://github.com/datajoint/datajoint-python/blob/master/CONTRIBUTING.md

Install the project in editable mode with test dependencies and run pytest.

```bash
pip install -e ".[test]"
pytest tests/
```

--------------------------------

### Install and Run Pre-commit Hooks

Source: https://github.com/datajoint/datajoint-python/blob/master/CONTRIBUTING.md

Install pre-commit hooks for the first time and run them manually across all files.

```bash
pixi run pre-commit install              # First time only
pixi run pre-commit run --all-files      # Run manually
```

--------------------------------

### Quick Start with pixi

Source: https://github.com/datajoint/datajoint-python/blob/master/CONTRIBUTING.md

Clone the repository and run tests or pre-commit hooks using pixi for dependency management.

```bash
git clone https://github.com/datajoint/datajoint-python.git
cd datajoint-python

# Run tests (containers managed automatically)
pixi run test

# Run with coverage
pixi run test-cov

# Run pre-commit hooks
pixi run pre-commit run --all-files
```

--------------------------------

### NumPy-style Docstring Example

Source: https://github.com/datajoint/datajoint-python/blob/master/CONTRIBUTING.md

Example of a public API docstring following NumPy style, including parameters, return values, raises, and examples.

```python
def insert(self, rows, *, replace=False):
    """
    Insert rows into the table.

    Parameters
    ----------
    rows : iterable
        Rows to insert. Each row can be a dict, numpy record, or sequence.
    replace : bool, optional
        If True, replace existing rows with matching keys. Default is False.

    Returns
    -------
    None

    Raises
    ------
    DuplicateError
        When inserting a duplicate key without ``replace=True``.

    Examples
    --------
    >>> Mouse.insert1({"mouse_id": 1, "dob": "2024-01-15"})
    """
```

--------------------------------

### Install PostgreSQL Driver

Source: https://github.com/datajoint/datajoint-python/blob/master/CONTRIBUTING.md

Install the PostgreSQL driver for DataJoint support.

```bash
pip install -e ".[postgres]"    # Installs psycopg2-binary
```

--------------------------------

### Install datajoint with pip

Source: https://github.com/datajoint/datajoint-python/blob/master/README.md

Use pip to install the datajoint package.

```bash
pip install datajoint
```

--------------------------------

### Install datajoint with Conda

Source: https://github.com/datajoint/datajoint-python/blob/master/README.md

Use Conda to install the datajoint package from the conda-forge channel.

```bash
conda install -c conda-forge datajoint
```

--------------------------------

### Running External Containers for Debugging

Source: https://github.com/datajoint/datajoint-python/blob/master/CONTRIBUTING.md

Start external MySQL, PostgreSQL, and MinIO containers using docker compose, run tests with DJ_USE_EXTERNAL_CONTAINERS=1, and then stop the containers.

```bash
# MySQL + MinIO
docker compose up -d db minio
DJ_USE_EXTERNAL_CONTAINERS=1 pixi run test
docker compose down

# MySQL + PostgreSQL + MinIO
docker compose up -d db postgres minio
DJ_USE_EXTERNAL_CONTAINERS=1 pixi run test
docker compose down
```

--------------------------------

### Conda-Forge `meta.yaml` Dependencies

Source: https://github.com/datajoint/datajoint-python/blob/master/RELEASE_MEMO.md

Example of the `requirements` section in a conda-forge `meta.yaml` file, detailing the host and run dependencies for the datajoint package. Ensure these match the `pyproject.toml`.

```yaml
requirements:
  host:
    - python {{ python_min }}
    - pip
    - setuptools >=62.0
  run:
    - python >={{ python_min }}
    - numpy
    - pandas
    - pymysql >=1.0
    - minio
    - packaging
    # ... etc
```

--------------------------------

### List and Get DataJoint Codecs

Source: https://context7.com/datajoint/datajoint-python/llms.txt

List all registered DataJoint codecs and retrieve a specific codec object by its name.

```python
print(dj.list_codecs())
# ['blob', 'npy', 'attach', 'filepath', 'hash', 'schema', 'graph']

codec_obj = dj.get_codec('graph')
```

--------------------------------

### Fetch Data as PyArrow Table

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Convert query results into a PyArrow Table. Requires the 'arrow' extra to be installed (`pip install datajoint[arrow]`).

```python
arrow_tbl = Subject.to_arrow()
```

--------------------------------

### Fetch Data as Pandas DataFrame

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Convert query results into a pandas DataFrame with primary keys as the index. Requires pandas to be installed.

```python
df = Subject.to_pandas()
# DataFrame with index=['subject_id'], columns=['species','dob']
```

--------------------------------

### Fetch Data as Polars DataFrame

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Retrieve data as a polars DataFrame. Requires the 'polars' extra to be installed (`pip install datajoint[polars]`).

```python
pl_df = Subject.to_polars(order_by='subject_id')
```

--------------------------------

### Get Table Description and Alter Definition

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Retrieve the DataJoint DDL string for a table using `describe()` and alter the table definition interactively or non-interactively.

```python
# Table description (DataJoint DDL)
print(Analysis.describe())

# Alter table definition (adds/modifies columns)
Analysis.alter()                # interactive prompt
Analysis.alter(prompt=False)    # immediate
```

--------------------------------

### Get SHA256 Hash for Conda Package (Bash)

Source: https://github.com/datajoint/datajoint-python/blob/master/RELEASE_MEMO.md

Bash command to retrieve the SHA256 hash for a Python package's source distribution from PyPI. It uses `curl` to fetch the JSON data and `jq` to parse and extract the specific hash.

```bash
curl -sL https://pypi.org/pypi/datajoint/2.1.0/json | jq -r '.urls[] | select(.packagetype=="sdist") | .digests.sha256'
```
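The same selection can be sketched in Python for cases where `jq` is not available. This is a minimal sketch, assuming the release metadata has already been downloaded and parsed into a dict with the same `urls[].packagetype` and `urls[].digests.sha256` fields that the `jq` filter above reads. The helper name `sdist_sha256` and the placeholder hashes are illustrative and not part of the release process.

```python
def sdist_sha256(release_json):
    """Return the sha256 digest of the sdist entry in parsed PyPI release metadata."""
    for entry in release_json["urls"]:
        if entry["packagetype"] == "sdist":
            return entry["digests"]["sha256"]
    raise LookupError("no sdist entry found in release metadata")


# Placeholder payload with the same shape as the JSON the curl command returns.
# The digest strings are dummies, not real hashes.
sample = {
    "urls": [
        {"packagetype": "bdist_wheel", "digests": {"sha256": "<WHEEL_HASH_PLACEHOLDER>"}},
        {"packagetype": "sdist", "digests": {"sha256": "<SDIST_HASH_PLACEHOLDER>"}},
    ]
}

digest = sdist_sha256(sample)
print(digest)
```

In practice the dict would come from `json.load()` on the response body of the URL used in the `curl` command.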

--------------------------------

### Establishing Database Connections with dj.conn()

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Shows how to establish and manage database connections using the singleton `dj.conn()` function or the explicit `dj.Connection` class. Covers connecting with default credentials, explicit credentials, forcing a reconnect, using TLS, and managing transactions.

```python
import datajoint as dj

# Singleton connection — credentials from dj.config or env vars
connection = dj.conn()

# Explicit credentials
connection = dj.conn(host="localhost", user="root", password="secret")

# Force re-connect (e.g., after credential rotation)
connection = dj.conn(reset=True)

# With TLS
connection = dj.conn(host="secure-db.org", use_tls=True)

# Direct Connection object (bypasses singleton)
conn = dj.Connection("localhost", "root", "secret", port=3306)

# Using within a transaction
with conn.transaction:
    SomeTable.insert([{'id': 1, 'value': 42}])
    OtherTable.insert([{'id': 1, 'result': 'ok'}])
# Rolls back automatically on exception

# List all accessible schemas
schemas = dj.list_schemas()
print(schemas)  # ['my_lab', 'shared_data', ...]

```
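The "rolls back automatically on exception" behaviour of a transaction block can be illustrated without a DataJoint server. The sketch below uses the standard-library `sqlite3` module as a stand-in, not DataJoint itself: its connection object is also a context manager that commits on success and rolls back when the block raises. The table names and the helper `insert_pair` are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE some_table (id INTEGER PRIMARY KEY, value INTEGER)")
con.execute("CREATE TABLE other_table (id INTEGER PRIMARY KEY, result TEXT NOT NULL)")
con.commit()


def insert_pair(con, row_id, value, result):
    """Insert into both tables atomically: both rows land, or neither does."""
    with con:  # commits on success, rolls back if the block raises
        con.execute("INSERT INTO some_table VALUES (?, ?)", (row_id, value))
        con.execute("INSERT INTO other_table VALUES (?, ?)", (row_id, result))


insert_pair(con, 1, 42, "ok")

try:
    # The second insert violates NOT NULL, so the first insert is rolled back too.
    insert_pair(con, 2, 7, None)
except sqlite3.IntegrityError:
    pass

count_some = con.execute("SELECT COUNT(*) FROM some_table").fetchone()[0]
count_other = con.execute("SELECT COUNT(*) FROM other_table").fetchone()[0]
print(count_some, count_other)
```

Only the first pair survives, because the failed second call leaves no partial row behind in `some_table`.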

--------------------------------

### Multi-Tenant Instances with dj.Instance

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Illustrates the creation and usage of `dj.Instance` for thread-safe, isolated contexts, essential for multi-tenant applications or when `DJ_THREAD_SAFE=true`. Shows how to create an instance with specific configurations and bind schemas to it.

```python
import datajoint as dj

# Create a fully isolated instance
inst = dj.Instance(
    host="db.example.org",
    user="alice",
    password="secret",
    backend="postgresql",
    safemode=False,              # keyword config overrides
)

# Access config and connection
inst.config.display.limit = 50
print(inst.connection)

# Create schema bound to this instance
schema = inst.Schema("lab_alice")

@schema
class Session(dj.Manual):
    definition = '''
    session_id : int
    --- 
    date       : date
    '''

# Access an existing table without defining a class
tbl = inst.FreeTable("lab_alice.session")
print(len(tbl))

```

--------------------------------

### Configuration Management with dj.config

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Demonstrates reading, modifying, and transiently overriding DataJoint configuration settings. Configuration can be loaded from environment variables, a .secrets/ directory, or a datajoint.json file. It also shows how to configure object stores like S3.

```python
import datajoint as dj

# Read settings
print(dj.config.database.host)    # "localhost"
print(dj.config.database.backend) # "mysql"
print(dj.config.safemode)         # True

# Modify at runtime (attribute access)
dj.config.database.host = "db.example.org"

# Dict-style access with dot-notation keys
dj.config['database.user'] = "alice"
dj.config['loglevel'] = "DEBUG"

# Transient override (automatically restored)
with dj.config.override(safemode=False, database__host="staging-db"):
    SomeTable.delete()          # runs without confirmation prompt

# Generate a project template (creates datajoint.json + .secrets/)
dj.config.save_template()       # minimal template
dj.config.save_template("full-config.json", minimal=False)

# Load from explicit file
dj.config.load("my-project.json")

# Object store configuration (S3 example)
dj.config.stores['main'] = {
    "protocol": "s3",
    "endpoint": "s3.amazonaws.com",
    "bucket": "my-bucket",
    "access_key": "AKIAIOSFODNN7EXAMPLE",
    "secret_key": "wJalrXUtnFEMI/K7MDENG",
    "location": "my-project/data",
}
dj.config.stores['default'] = "main"

```
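The transient-override pattern can be sketched with a plain nested dict. This is a toy model under stated assumptions, not DataJoint's implementation: the function `override` and the dict layout are hypothetical, and the only convention borrowed from the snippet above is that a double underscore in a keyword (`database__host`) addresses a nested setting.

```python
from contextlib import contextmanager


def _resolve(config, name):
    """Split 'a__b__c' into the dict that holds the leaf and the leaf key."""
    *parents, leaf = name.split("__")
    node = config
    for part in parents:
        node = node[part]
    return node, leaf


@contextmanager
def override(config, **overrides):
    """Temporarily replace existing settings; restore the old values on exit."""
    saved = {}
    try:
        for name, value in overrides.items():
            node, leaf = _resolve(config, name)
            saved[name] = node[leaf]
            node[leaf] = value
        yield config
    finally:
        # Runs on normal exit and on exceptions, so the override never leaks.
        for name, old_value in saved.items():
            node, leaf = _resolve(config, name)
            node[leaf] = old_value


config = {"safemode": True, "database": {"host": "localhost"}}

with override(config, safemode=False, database__host="staging-db"):
    inside = (config["safemode"], config["database"]["host"])

after = (config["safemode"], config["database"]["host"])
print(inside, after)
```

Restoring in a `finally` clause is what makes the override safe to use around operations that may raise.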

--------------------------------

### AutoPopulate — `populate` and `make`

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Drives automated computation. `populate()` calls `make(key)` for each primary key in `key_source` not yet present in the table. Supports parallel processing, progress display, error suppression, and distributed job queues.

```python
import datajoint as dj

schema = dj.Schema('pipeline')

@schema
class Subject(dj.Manual):
    definition = '''
    subject_id : int
    ---
    name : varchar(64)
    '''

@schema
class Analysis(dj.Computed):
    definition = '''
    -> Subject
    ---
    result : float
    '''
    def make(self, key):
        import time, random
        time.sleep(0.1)  # simulate computation
        result = random.gauss(0, 1)
        self.insert1(dict(key, result=result))

# Basic populate (direct mode)
Analysis.populate()

# With progress bar
Analysis.populate(display_progress=True)

# Only populate specific subjects
Analysis.populate(Subject & "subject_id < 100")

# Stop after N calls
Analysis.populate(max_calls=10)

# Suppress errors, collect failures
status = Analysis.populate(suppress_errors=True)
print(status['success_count'])
print(status['error_list'])    # list of (key, error_message) tuples

# Parallel (multi-process)
Analysis.populate(processes=4, display_progress=True)

# Check progress
remaining, total = Analysis.progress(display=True)

# Distributed mode with job table
# (creates ~~analysis job table automatically)
Analysis.populate(reserve_jobs=True)   # reserve + run
Analysis.jobs.refresh()                # populate job queue for others to claim

# Tripartite make for long computations (fetch outside transaction)
@schema
class HeavyAnalysis(dj.Computed):
    definition = '''
    -> Subject
    ---
    result : longblob
    '''

    def make_fetch(self, key):
        return (Subject & key).to_dicts(),      # trailing comma: returns a 1-tuple

    def make_compute(self, key, subjects):
        import time
        time.sleep(60)    # long computation outside transaction
        return sum(s['subject_id'] for s in subjects),   # 1-tuple again

    def make_insert(self, key, total):
        self.insert1(dict(key, result=total))
```
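The core loop described above, calling `make(key)` for every key in `key_source` that is not yet in the table, can be modelled in a few lines of plain Python. This is a toy sketch, not DataJoint's implementation: the table is a dict, `populate` is a free function, and transactions, job reservation, and multiprocessing are left out. Only the `max_calls` and `suppress_errors` options and the `success_count` / `error_list` keys are taken from the example above.

```python
def populate(key_source, table, make, *, max_calls=None, suppress_errors=False):
    """Call make(key, table) for each key in key_source that is missing from table."""
    pending = [key for key in key_source if key not in table]
    if max_calls is not None:
        pending = pending[:max_calls]

    success_count = 0
    error_list = []
    for key in pending:
        try:
            make(key, table)
            success_count += 1
        except Exception as exc:
            if not suppress_errors:
                raise
            error_list.append((key, str(exc)))  # (key, error_message) tuple
    return {"success_count": success_count, "error_list": error_list}


def make(key, table):
    """Toy computation: fails for key 3, otherwise stores a derived value."""
    if key == 3:
        raise ValueError("bad input")
    table[key] = key * 2.0


table = {1: 2.0}  # key 1 was computed earlier and is skipped
status = populate([1, 2, 3, 4], table, make, suppress_errors=True)
print(status)
print(table)
```

Because failed keys are never inserted, a later call to `populate` picks them up again, which is why re-running after fixing the cause of an error is enough.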

--------------------------------

### AutoPopulate with Progress Bar

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Run .populate() with the display_progress=True option to show a progress bar, which is helpful for long-running computations.

```python
Analysis.populate(display_progress=True)
```

--------------------------------

### Define DataJoint Schema and Manual Table

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Set up a DataJoint schema and define a manual table. Manual tables require data to be inserted explicitly.

```python
import datajoint as dj

schema = dj.Schema('pipeline')

@schema
class Subject(dj.Manual):
    definition = '''
    subject_id : int
    ---
    name : varchar(64)
    '''
```

--------------------------------

### Basic AutoPopulate Execution

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Trigger the population of a computed table using .populate(). This method computes entries for keys in the key_source that are not yet in the table.

```python
Analysis.populate()
```

--------------------------------

### Delete Data with Cascade Options

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Use `delete()` for cascading deletes with transaction previews and optional confirmation. `delete_quick()` offers a non-cascading fast delete. Part table integrity can be managed with `part_integrity` options.

```python
import datajoint as dj

# Delete with cascade (prompts if safemode=True)
n = (Subject & {'subject_id': 1}).delete()
print(f"Deleted {n} rows")
```

```python
# Delete without prompt
(Subject & "dob < '2023-01-01'").delete(prompt=False)
```

```python
# Non-cascading fast delete (fails if dependents exist)
(Subject & {'subject_id': 99}).delete_quick()
```

```python
# Part table integrity options
Analysis.Stats.delete(part_integrity='ignore')    # allow deleting parts directly
Analysis.Stats.delete(part_integrity='cascade')   # also delete master rows
```

```python
# Drop a table (cascading drop)
DeprecatedTable.drop()             # prompts for confirmation
DeprecatedTable.drop(prompt=False) # immediate drop
```

```python
# Part table drop
Analysis.Stats.drop(part_integrity='ignore')
```

```python
# Schema drop (all tables)
schema.drop(prompt=False)
```

--------------------------------

### Parallel AutoPopulate Execution

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Enable parallel processing for .populate() by specifying the number of processes. This can significantly speed up computations on multi-core machines.

```python
Analysis.populate(processes=4, display_progress=True)
```

--------------------------------

### Ordered Fetch for Previewing

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Use dj.Top() within a relation to fetch a limited number of rows based on a specified order, useful for previews or sampling.

```python
first5 = (Subject & dj.Top(limit=5, order_by='subject_id')).to_dicts()
```

--------------------------------

### Visualize Table Dependencies with `dj.Diagram`

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Use `dj.Diagram` to visualize the dependency graph of tables. It supports set operators for subgraph selection and cascade preview for delete/drop operations. Drawing requires matplotlib and pygraphviz.

```python
import datajoint as dj

# Diagram of a single table (no display; just the graph)
diag = dj.Diagram(Analysis)
```

```python
# Expand n levels up (ancestors) or down (descendants)
diag_up2   = dj.Diagram(Analysis) - 2   # 2 levels of parents
diag_down1 = dj.Diagram(Analysis) + 1   # 1 level of children
diag_both  = dj.Diagram(Analysis) - 1 + 1
```

```python
# Diagram of an entire schema
diag_schema = dj.Diagram(schema)
```

```python
# Set operators
combined = dj.Diagram(Analysis) + dj.Diagram(Subject)
diff = dj.Diagram(schema) - dj.Diagram(DeprecatedTable)
```

```python
# Draw (requires matplotlib + pygraphviz)
dj.Diagram(schema).draw()
```

```python
# Get row counts per table in the diagram
diag.counts()
```

```python
# Temporarily change layout direction
with dj.config.override(display__diagram_direction="TB"):
    dj.Diagram(schema).draw()
```

```python
# Preview cascade impact of a delete without executing
preview_diag = dj.Diagram.cascade(Subject & "subject_id < 5")
for ft in preview_diag:
    print(ft.full_table_name, len(ft))
```

--------------------------------

### Define Database Schema and Tables

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Binds table classes to a database schema. The schema creates the database if needed and declares tables from definition strings.

```python
import datajoint as dj

schema = dj.Schema('my_pipeline')

@schema
class Subject(dj.Manual):
    definition = '''
    subject_id   : varchar(12)   # unique subject identifier
    ---
    species      : enum('mouse','rat','human')
    date_of_birth: date
    notes=''     : varchar(2048)
    '''

@schema
class Session(dj.Manual):
    definition = '''
    -> Subject
    session_id   : smallint
    ---
    session_date : date
    experimenter : varchar(64)
    '''
```

--------------------------------

### Lineage Tracking and Job Management

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Manage lineage tracking for DataJoint schemas, including rebuilding the lineage table and checking its existence. Also demonstrates refreshing and inspecting jobs for auto-populated tables.

```python
# Lineage tracking
print(schema.lineage)            # dict mapping attr -> origin
schema.rebuild_lineage()         # rebuild ~lineage table
print(schema.lineage_table_exists)

# Job management for auto-populated tables
Analysis.jobs.refresh()
pending_keys = Analysis.jobs.pending.to_dicts()
print(Analysis.jobs.errors.to_dicts())   # see error records
```

--------------------------------

### Distributed AutoPopulate with Job Table

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Configure .populate() for distributed processing by setting reserve_jobs=True. This creates and manages a job queue for distributed workers.

```python
# (creates ~~analysis job table automatically)
Analysis.populate(reserve_jobs=True)   # reserve + run
Analysis.jobs.refresh()                # populate job queue for others to claim
```

--------------------------------

### Tripartite make for Long Computations

Source: https://context7.com/datajoint/datajoint-python/llms.txt

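Implement a tripartite make pattern for long computations by defining `make_fetch`, `make_compute`, and `make_insert` methods. Splitting the work this way keeps the long-running computation outside the database transaction. Each stage returns a tuple whose elements become the extra arguments of the next stage.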

```python
@schema
class HeavyAnalysis(dj.Computed):
    definition = '''
    -> Subject
    ---
    result : longblob
    '''

    def make_fetch(self, key):
        return (Subject & key).to_dicts(),      # returns tuple

    def make_compute(self, key, subjects):
        import time
        time.sleep(60)    # long computation outside transaction
        return sum(s['subject_id'] for s in subjects),

    def make_insert(self, key, total):
        self.insert1(dict(key, result=total))
```
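The reason each stage returns a tuple becomes clearer when the three methods are chained by hand. The driver below is a hypothetical sketch, not DataJoint's populate machinery: `run_tripartite` and `ToyHeavyAnalysis` are made-up names, and transaction handling is not modelled. It only assumes that the tuple returned by one stage is unpacked into the next stage's positional arguments, which matches the method signatures above.

```python
def run_tripartite(table, key):
    """Chain the three stages, unpacking each returned tuple into the next call."""
    fetched = table.make_fetch(key)               # e.g. (subjects,)
    computed = table.make_compute(key, *fetched)  # e.g. (total,)
    table.make_insert(key, *computed)


class ToyHeavyAnalysis:
    """In-memory stand-in for a computed table with the three-part make."""

    def __init__(self, subjects):
        self.subjects = subjects
        self.rows = []

    def make_fetch(self, key):
        matching = [s for s in self.subjects if s["subject_id"] == key["subject_id"]]
        return (matching,)

    def make_compute(self, key, subjects):
        return (sum(s["subject_id"] for s in subjects),)

    def make_insert(self, key, total):
        self.rows.append(dict(key, result=total))


toy = ToyHeavyAnalysis([{"subject_id": 7}, {"subject_id": 8}])
run_tripartite(toy, {"subject_id": 7})
print(toy.rows)
```

Returning a bare value instead of a tuple would break the `*` unpacking, which is why the trailing commas in the original methods matter.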

--------------------------------

### Running DataJoint Tests

Source: https://github.com/datajoint/datajoint-python/blob/master/CONTRIBUTING.md

Execute various test suites using pixi, including all tests, coverage, unit tests, specific integration files, or tests filtered by backend.

```bash
pixi run test                                    # All tests (both backends)
pixi run test-cov                                # With coverage
pixi run -e test pytest tests/unit/              # Unit tests only
pixi run -e test pytest tests/integration/test_blob.py -v  # Specific file
pixi run -e test pytest -m mysql                 # MySQL tests only
pixi run -e test pytest -m postgresql            # PostgreSQL tests only
```

--------------------------------

### Projection: Select, Rename, and Compute Attributes

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Project specific attributes, rename them, or compute new ones using SQL-like expressions. Use '...' to include all existing attributes along with new computed ones.

```python
pk_only = Subject.proj()
selected = Subject.proj('species', 'dob')
renamed = Subject.proj(birth_date='dob')
computed = Subject.proj(age_days="DATEDIFF(NOW(), dob)")
all_plus_computed = Subject.proj(..., age_days="DATEDIFF(NOW(), dob)")
```

--------------------------------

### Release Notes Markdown Format

Source: https://github.com/datajoint/datajoint-python/blob/master/RELEASE_MEMO.md

Use this markdown structure to organize release notes, categorizing changes into BREAKING, Added, Changed, Fixed, etc. Link to PRs/issues for detailed information.

```markdown
## What's Changed

### BREAKING CHANGES
- **`fetch()` removed** — Use `to_dicts()`, `to_pandas()`, or `to_arrays()` instead (#123)

### Added
- New `to_polars()` method for Polars DataFrame output (#456)
- Support for custom codecs via `@codec` decorator (#789)

### Changed
- Improved query performance for complex joins (2-3x faster)
- Default connection timeout increased to 30s

### Fixed
- Fixed incorrect NULL handling in aggregations (#234)

### Full Changelog
https://github.com/datajoint/datajoint-python/compare/v2.0.0...v2.1.0
```

--------------------------------

### Schema Introspection and Iteration

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Utilities for inspecting the schema, such as listing tables, checking for table existence, retrieving tables, and iterating through tables in dependency order.

```python
# Schema introspection
print(schema.list_tables())           # ['subject', 'session', ...]
print('Subject' in schema)            # True
tbl = schema.get_table('Subject')     # FreeTable
tbl2 = schema['session']              # bracket-notation alias

# Iterate all tables in dependency order
for table in schema:
    print(table.full_table_name, len(table))
```

--------------------------------

### Download Path for Attachments/Filepaths

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Specify a download path for attachments or filepaths using dj.config.override. This ensures downloaded files are stored in the designated location.

```python
with dj.config.override(download_path='/tmp/downloads'):
    data = AttachmentTable.to_dicts()
```

--------------------------------

### macOS Docker Host Configuration

Source: https://github.com/datajoint/datajoint-python/blob/master/CONTRIBUTING.md

Set the DOCKER_HOST environment variable for macOS Docker Desktop users if tests fail to connect.

```bash
export DOCKER_HOST=unix://$HOME/.docker/run/docker.sock
```

--------------------------------

### Trigger Documentation Build

Source: https://github.com/datajoint/datajoint-python/blob/master/RELEASE_MEMO.md

Manually trigger a documentation rebuild for datajoint-docs using the GitHub CLI. This is useful after updating docstrings in the datajoint-python repository.

```bash
gh workflow run development.yml --repo datajoint/datajoint-docs
```

--------------------------------

### Fetch Data as List of Dictionaries

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Retrieve data as a list of Python dictionaries using .to_dicts(). Supports ordering, limiting, and offsetting results.

```python
rows = Subject.to_dicts()
# [{'subject_id': 1, 'species': 'mouse', 'dob': datetime.date(2024, 1, 15)}, ...]

rows = Subject.to_dicts(order_by='dob DESC', limit=10, offset=20)
```
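How ordering, `limit`, and `offset` combine can be shown on an in-memory list. This is a sketch of the usual SQL-style semantics (sort first, skip `offset` rows, then take `limit` rows), which the `to_dicts` call above is assumed to follow. The helper `order_limit_offset` is hypothetical and not part of DataJoint.

```python
from datetime import date


def order_limit_offset(rows, *, key, descending=False, limit=None, offset=0):
    """Sort rows by one attribute, skip `offset` rows, then keep at most `limit` rows."""
    ordered = sorted(rows, key=lambda row: row[key], reverse=descending)
    end = None if limit is None else offset + limit
    return ordered[offset:end]


# Five toy subjects born on the first of January through May 2024.
rows = [{"subject_id": i, "dob": date(2024, i, 1)} for i in range(1, 6)]

page = order_limit_offset(rows, key="dob", descending=True, limit=2, offset=1)
page_ids = [row["subject_id"] for row in page]
print(page_ids)
```

Sorted newest first the ids are 5, 4, 3, 2, 1, so skipping one row and taking two yields subjects 4 and 3.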

--------------------------------

### AutoPopulate for Specific Keys

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Limit the population process to a subset of keys by providing a relation. This allows targeted computation for specific data entries.

```python
Analysis.populate(Subject & "subject_id < 100")
```

--------------------------------

### Define Table Tiers

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Demonstrates the definition of various table tiers: Lookup, Manual, Imported, Computed, and Part. Each tier has specific characteristics and SQL name prefixes.

```python
import datajoint as dj

schema = dj.Schema('tiers_demo')

@schema
class Species(dj.Lookup):
    """Static reference data — auto-populated from `contents`."""
    definition = '''
    species : varchar(24)
    '''
    contents = [['mouse'], ['rat'], ['human']]

@schema
class Subject(dj.Manual):
    definition = '''
    subject_id : int auto_increment
    ---
    -> Species
    dob : date
    '''

@schema
class RawData(dj.Imported):
    """Populated from external files; make() called once per Subject."""
    definition = '''
    -> Subject
    ---
    raw_signal : <blob>  # serialised numpy array
    '''
    def make(self, key):
        import numpy as np
        signal = np.load(f"/data/{key['subject_id']}.npy")
        self.insert1(dict(key, raw_signal=signal))

@schema
class Analysis(dj.Computed):
    definition = '''
    -> RawData
    ---
    mean_value : float
    std_value  : float
    '''
    class Stats(dj.Part):
        definition = '''
        -> Analysis
        bin_idx : int
        ---
        bin_mean : float
        '''
    def make(self, key):
        import numpy as np
        signal = (RawData & key).fetch1('raw_signal')
        self.insert1(dict(key, mean_value=float(np.mean(signal)),
                               std_value=float(np.std(signal))))
        bins = np.array_split(signal, 10)
        self.Stats.insert([dict(key, bin_idx=i, bin_mean=float(b.mean()))
                           for i, b in enumerate(bins)])
```

--------------------------------

### Conda-Forge `meta.yaml` Configuration

Source: https://github.com/datajoint/datajoint-python/blob/master/RELEASE_MEMO.md

Configuration snippet for the `recipe/meta.yaml` file in a conda-forge feedstock. It specifies the package version, source URL, and SHA256 hash for the distribution.

```yaml
{% set version = "2.1.0" %}

package:
  name: datajoint
  version: {{ version }}

source:
  url: https://pypi.org/packages/source/d/datajoint/datajoint-{{ version }}.tar.gz
  sha256: <NEW_SHA256_HASH>

build:
  number: 0  # Reset to 0 for new version
```

--------------------------------

### Check AutoPopulate Progress

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Monitor the progress of a population job using .progress(). It returns the number of remaining and total entries to process.

```python
remaining, total = Analysis.progress(display=True)
```

--------------------------------

### Fetch Data as NumPy Structured Array

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Retrieve data as a NumPy structured array. Individual columns can be accessed by their names.

```python
arr = Subject.to_arrays()
print(arr['species'])    # array(['mouse', 'rat', ...])
```

--------------------------------

### Access Tables with FreeTable (Singleton Connection)

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Access a DataJoint table directly by its full name without defining a class, using a singleton connection. Retrieve data as dictionaries.

```python
import datajoint as dj

# Using singleton connection
tbl = dj.FreeTable("my_schema.subject")
print(tbl.to_dicts(limit=3))
```

--------------------------------

### Insert Data from File or DataFrame

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Inserts data from a CSV file or a pandas DataFrame. For DataFrames, the index is automatically detected as the primary key.

```python
from pathlib import Path

import pandas as pd

# Insert from CSV file
Subject.insert(Path('subjects.csv'))

# Insert from pandas DataFrame
df = pd.DataFrame({'subject_id': [4, 5], 'species': ['mouse', 'rat'],
                    'dob': ['2024-04-01', '2024-05-01']})
Subject.insert(df)

# Round-trip: fetch → modify → re-insert
df = Subject.to_pandas()        # PK becomes index
df.index  # subject_id
Subject.insert_dataframe(df, skip_duplicates=True)   # auto-detects index as PK
```

--------------------------------

### Fetch Specific Columns as Separate Arrays

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Fetch specified columns as separate NumPy arrays. The order of returned arrays corresponds to the order of column names provided.

```python
species_arr, dob_arr = Subject.to_arrays('species', 'dob')
```

--------------------------------

### Top/Limit/Offset for Data Retrieval

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Retrieve a limited number of rows based on ordering. dj.Top() is used within a relation to apply limit and ordering.

```python
top5 = Subject & dj.Top(limit=5, order_by='dob DESC')
```

--------------------------------

### Navigate Table Dependencies

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Use graph-traversal methods on a DataJoint Table object to inspect relationships, including parents, children, ancestors, and descendants.

```python
import datajoint as dj

# Navigate the dependency graph
print(Analysis.parents())                    # list of parent table names
print(Analysis.children(as_objects=True))   # list of FreeTable objects
print(Analysis.ancestors())                 # all upstream tables (topological order)
print(Analysis.descendants())               # all downstream tables

# Part tables of a master
print(Analysis.parts())
print(Analysis.parts(as_objects=True))
```
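The four traversal notions (parents, children, ancestors, descendants) and the idea of a topological order can be illustrated on a small hand-written graph. This is a standalone sketch using the standard-library `graphlib` module. The `parents` mapping and the helper functions are hypothetical and only mirror the terminology of the methods above, not their return types.

```python
from graphlib import TopologicalSorter

# Each table maps to the set of tables it references directly (its parents).
parents = {
    "Subject": set(),
    "Session": {"Subject"},
    "RawData": {"Subject"},
    "Analysis": {"RawData"},
}


def children(table):
    """Tables that reference `table` directly."""
    return {child for child, ps in parents.items() if table in ps}


def _closure(table, step):
    """Collect everything reachable from `table` by repeatedly applying `step`."""
    seen, stack = set(), list(step(table))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(step(current))
    return seen


def ancestors(table):
    return _closure(table, lambda t: parents[t])


def descendants(table):
    return _closure(table, children)


# A topological order lists every table after all of its parents.
order = list(TopologicalSorter(parents).static_order())

print(ancestors("Analysis"))
print(descendants("Subject"))
print(order)
```

Any valid topological order works for creating or populating tables upstream-first. Reversing it gives a safe order for deletes.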

--------------------------------

### Chunked Insert for Large Datasets

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Performs bulk inserts in batches of `chunk_size` rows to limit memory pressure for large datasets. Pass the rows as an iterator so they are not all held in memory at once.

```python
# Chunked insert for large datasets
Subject.insert(large_rows_iter, chunk_size=10_000)
```
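The batching idea behind `chunk_size` can be sketched with `itertools.islice`. This is a generic illustration of splitting an iterator into fixed-size batches, not the code DataJoint runs internally. The generator name `chunked` is an assumption made for this example.

```python
from itertools import islice


def chunked(rows, chunk_size):
    """Yield lists of at most `chunk_size` rows without materialising all rows."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


# A generator stands in for a large row source that is produced lazily.
large_rows_iter = ({"subject_id": i} for i in range(25))

sizes = [len(chunk) for chunk in chunked(large_rows_iter, 10)]
print(sizes)
```

Only one chunk is held in memory at a time, so peak memory depends on `chunk_size` rather than on the total number of rows.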

--------------------------------

### Schema-Addressed Storage with NpyRef

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Define a table to store numpy arrays in the default object store using the '<npy@>' type. Use NpyRef to create a lightweight handle to a stored numpy array and fetch its data.

```python
@schema
class LargeData(dj.Manual):
    definition = '''
    data_id : int
    ---    
    matrix : <npy@>      # numpy array in default object store
    '''

# NpyRef — lightweight handle to a stored numpy array
ref = dj.NpyRef(schema="my_schema", table="large_data", field="matrix",
                key={'data_id': 1})
arr = ref.fetch()     # downloads only on access
```

--------------------------------

### Aggregation with DataJoint

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Aggregate data using the .aggr() method. Specify the table to join with and the aggregation function. Use exclude_nonmatching=True to only include entries with at least one match in the joined table.

```python
from datajoint import U
session_count = Subject.aggr(Session, n='count(*)')    # all subjects, n=0 if none
# Only subjects with at least one session:
active_subjects = Subject.aggr(Session, n='count(*)', exclude_nonmatching=True)
```

--------------------------------

### Extend DataJoint Type System with Codecs

Source: https://context7.com/datajoint/datajoint-python/llms.txt

The `dj.Codec` system allows extending DataJoint's type system with custom encode/decode logic. Codec subclasses auto-register on definition and can be used in table definitions with `<codec_name>` syntax.

```python
import datajoint as dj
import numpy as np
import networkx as nx
```

--------------------------------

### Fetching Data

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Modern fetch API for retrieving data as Python objects, pandas DataFrames, polars DataFrames, PyArrow Tables, numpy arrays, or primary keys. Supports ordering, limiting, and offsetting results. fetch1 retrieves exactly one row.

`to_dicts`, `to_pandas`, `to_polars`, `to_arrow`, `to_arrays`, `keys`, and `fetch1` make up the modern fetch API (DataJoint 2.0). They return decoded Python objects, including numpy arrays, custom codec values, and downloaded file paths for attachment/filepath types.

```python
import datajoint as dj

# List of dicts (recommended)
rows = Subject.to_dicts()
# [{'subject_id': 1, 'species': 'mouse', 'dob': datetime.date(2024, 1, 15)}, ...]

# With ordering, limit, offset
rows = Subject.to_dicts(order_by='dob DESC', limit=10, offset=20)

# pandas DataFrame (PK as index)
df = Subject.to_pandas()
# DataFrame with index=['subject_id'], columns=['species','dob']

# polars DataFrame (requires: pip install datajoint[polars])
pl_df = Subject.to_polars(order_by='subject_id')

# PyArrow Table (requires: pip install datajoint[arrow])
arrow_tbl = Subject.to_arrow()

# numpy structured array
arr = Subject.to_arrays()
print(arr['species'])    # array(['mouse', 'rat', ...])

# Specific columns as separate arrays (tuple)
species_arr, dob_arr = Subject.to_arrays('species', 'dob')

# Primary keys only
keys = Subject.keys()
# [{'subject_id': 1}, {'subject_id': 2}, ...]

# Fetch exactly one row (raises if 0 or >1 matches)
row = (Subject & {'subject_id': 1}).fetch1()
# {'subject_id': 1, 'species': 'mouse', 'dob': datetime.date(2024,1,15)}

species = (Subject & {'subject_id': 1}).fetch1('species')
# 'mouse'

sp, dob = (Subject & {'subject_id': 1}).fetch1('species', 'dob')

# Ordered fetch for previewing
first5 = (Subject & dj.Top(limit=5, order_by='subject_id')).to_dicts()

# Download path for attachments/filepaths
with dj.config.override(download_path='/tmp/downloads'):
    data = AttachmentTable.to_dicts()
```

--------------------------------

### AutoPopulate with Maximum Calls

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Control the number of make calls during population using max_calls. This is useful for testing or limiting the scope of a population run.

```python
Analysis.populate(max_calls=10)
```

--------------------------------

### Union of Relations

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Combine results from multiple relations using the '+' operator. This is useful for creating a union of datasets that share a common structure.

```python
all_subjects = (Subject & "species='mouse'") + (Subject & "species='rat'")
```

--------------------------------

### Define DataJoint Computed Table with make Method

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Define a computed table that automatically populates its data using the make method. The make method is called for each new primary key.

```python
import random
import time


@schema
class Analysis(dj.Computed):
    definition = '''
    -> Subject
    ---
    result : float
    '''

    def make(self, key):
        # Called once per Subject key that has no Analysis row yet
        time.sleep(0.1)  # simulate computation
        result = random.gauss(0, 1)
        self.insert1(dict(key, result=result))
```

--------------------------------

### Extract Version from Release Name (Bash)

Source: https://github.com/datajoint/datajoint-python/blob/master/RELEASE_MEMO.md

This bash command uses grep with a Perl-compatible regular expression to extract a version string (e.g., X.Y.Z) from a GitHub release name. It's used in the `post_draft_release_published.yaml` workflow.

```bash
VERSION=$(echo "${{ github.event.release.name }}" | grep -oP '\d+\.\d+\.\d+')
```
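To check the pattern locally without running the workflow, the same regular expression can be tried in Python. This is a sketch only: `extract_version` is a hypothetical helper, the release names are made-up inputs, and `re.search` returns just the first match, whereas `grep -o` prints every match.

```python
import re
from typing import Optional

# Same pattern as the grep call: three dot-separated groups of digits
VERSION_PATTERN = re.compile(r"\d+\.\d+\.\d+")


def extract_version(release_name: str) -> Optional[str]:
    """Return the first X.Y.Z substring, or None if there is none."""
    match = VERSION_PATTERN.search(release_name)
    return match.group(0) if match else None


print(extract_version("Release 0.14.3"))  # 0.14.3
print(extract_version("Draft release"))   # None
```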

--------------------------------

### Define and Use a Custom Graph Codec

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Define a custom codec for serializing and deserializing networkx graphs. Use the custom codec in a DataJoint table definition and insert/fetch data.

```python
import datajoint as dj
import networkx as nx


class GraphCodec(dj.Codec):
    name = "graph"

    def get_dtype(self, is_store: bool) -> str:
        return "<blob>"          # serializes to blob

    def encode(self, graph, *, key=None, store_name=None):
        return {'nodes': list(graph.nodes()), 'edges': list(graph.edges())}

    def decode(self, stored, *, key=None):
        G = nx.Graph()
        G.add_nodes_from(stored['nodes'])
        G.add_edges_from(stored['edges'])
        return G

    def validate(self, value):
        if not isinstance(value, nx.Graph):
            raise TypeError(f"Expected networkx.Graph, got {type(value).__name__}")

@schema
class Connectivity(dj.Manual):
    definition = '''
    conn_id : int
    ---
    graph_data : <graph>
    '''

G = nx.path_graph(5)
Connectivity.insert1({'conn_id': 1, 'graph_data': G})
row = Connectivity.fetch1()
assert len(row['graph_data'].nodes) == 5
```

--------------------------------

### Fetch Exactly One Row

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Fetch a single row using .fetch1(). This method raises an error if zero or more than one row matches the query.

```python
row = (Subject & {'subject_id': 1}).fetch1()
# {'subject_id': 1, 'species': 'mouse', 'dob': datetime.date(2024,1,15)}

species = (Subject & {'subject_id': 1}).fetch1('species')
# 'mouse'

sp, dob = (Subject & {'subject_id': 1}).fetch1('species', 'dob')
```
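The exactly-one contract can be sketched without a database. `fetch_exactly_one` is a hypothetical stand-in, and `ValueError` is used here only for illustration; DataJoint raises its own exception type.

```python
def fetch_exactly_one(rows):
    """Return the single row in rows, or raise if there are zero or several."""
    rows = list(rows)
    if len(rows) != 1:
        raise ValueError(f"expected exactly one row, found {len(rows)}")
    return rows[0]


subjects = [
    {"subject_id": 1, "species": "mouse"},
    {"subject_id": 2, "species": "rat"},
]

# Restricting by the full primary key leaves exactly one candidate
row = fetch_exactly_one(s for s in subjects if s["subject_id"] == 1)
print(row)  # {'subject_id': 1, 'species': 'mouse'}
```

Failing loudly on zero or multiple matches catches restrictions that are looser or tighter than intended.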

--------------------------------

### Drop Entire Schema

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Drops an entire database schema. The call prompts for confirmation unless `safemode=False` is explicitly set.

```python
# Drop entire schema (prompts unless safemode=False)
schema.drop()
```

--------------------------------

### AutoPopulate with Error Suppression

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Use suppress_errors=True to prevent population from stopping on errors. The status dictionary returned contains success counts and a list of errors.

```python
status = Analysis.populate(suppress_errors=True)
print(status['success_count'])
print(status['error_list'])    # list of (key, error_message) tuples
```
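The control flow implied by `suppress_errors` and `max_calls` can be sketched as a plain loop. This is not DataJoint's populate logic: `populate_sketch` and its `make` callback are hypothetical, and counting failed calls toward `max_calls` is an assumption of the sketch. The returned dict reuses the `success_count` and `error_list` keys shown above.

```python
def populate_sketch(keys, make, *, max_calls=None, suppress_errors=False):
    """Call make(key) for each key, optionally capping calls and collecting errors."""
    success_count = 0
    error_list = []
    for calls, key in enumerate(keys):
        if max_calls is not None and calls >= max_calls:
            break
        try:
            make(key)
        except Exception as error:
            if not suppress_errors:
                raise  # default behaviour: stop at the first failure
            error_list.append((key, str(error)))
        else:
            success_count += 1
    return {"success_count": success_count, "error_list": error_list}


def make(key):
    # Hypothetical computation that fails for one key
    if key["subject_id"] == 2:
        raise RuntimeError("bad input")


keys = [{"subject_id": i} for i in range(1, 5)]
status = populate_sketch(keys, make, max_calls=3, suppress_errors=True)
print(status["success_count"])  # 2
print(status["error_list"])     # [({'subject_id': 2}, 'bad input')]
```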

--------------------------------

### Introspect Schemas with `virtual_schema`

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Use `dj.virtual_schema` to introspect an existing database schema and auto-generate Python table classes. This is useful when working with data produced by a different codebase. `dj.VirtualModule` offers lower-level access.

```python
import datajoint as dj

# Access an existing schema without Python class definitions
lab = dj.virtual_schema('my_lab_schema')
```

```python
# Table classes are auto-generated as module attributes
df = lab.Subject.to_pandas()
sessions = lab.Session & "experimenter='alice'"
```

```python
# Iterate tables
for name in lab.schema.list_tables():
    print(name, len(getattr(lab, name.replace('_', ' ').title().replace(' ', ''))))
```

```python
# VirtualModule (lower-level)
lab2 = dj.VirtualModule('lab', 'my_lab_schema', connection=dj.conn())
lab2.Session.to_dicts(limit=5)
```

```python
# Schema bracket access and get_table lookup
schema['Subject'].to_dicts(limit=3)
schema.get_table('session').to_dicts()
```

```python
# list_schemas helper
all_schemas = dj.list_schemas()
```
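The name conversion inside the table-iteration loop above is easier to read as a helper. `to_class_name` is a hypothetical name; it wraps exactly the same string operations as the one-liner and makes no claim about how DataJoint maps table names internally.

```python
def to_class_name(table_name: str) -> str:
    """Convert a snake_case table name to CamelCase, as in the loop above."""
    return table_name.replace("_", " ").title().replace(" ", "")


print(to_class_name("raw_data"))    # RawData
print(to_class_name("__analysis"))  # Analysis
```

Leading underscores are dropped because they become spaces before `title()` is applied.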

--------------------------------

### Fetch Primary Keys Only

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Retrieve only the primary keys of the relation as a list of dictionaries. This is efficient for operations that only require identifiers.

```python
keys = Subject.keys()
# [{'subject_id': 1}, {'subject_id': 2}, ...]
```

--------------------------------

### Search Conda-Forge Package (Bash)

Source: https://github.com/datajoint/datajoint-python/blob/master/RELEASE_MEMO.md

Bash command to search for the datajoint package within the conda-forge channel. This is used for verification after a release.

```bash
conda search datajoint -c conda-forge
```

--------------------------------

### NOT Wrapper for Conditions

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Use dj.Not() to negate a condition, effectively selecting rows that do not meet the specified criteria. This is useful for exclusion filtering.

```python
not_condition = dj.Not(Subject & "species='mouse'")
others = Subject & not_condition
```

--------------------------------

### Join Operations in DataJoint

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Perform natural joins on shared primary keys or extend relations with attributes from another table. '*' denotes a natural join, while .extend() performs a left join.

```python
joined = Session * RawData              # inner join
extended = Session.extend(Subject)      # left join — add Subject attrs to Session
```
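The difference between an inner and a left join can be shown on plain lists of dicts. This is a sketch of the relational idea only: `natural_join` and the sample rows are hypothetical, and filling missing attributes with `None` stands in for SQL `NULL`.

```python
def natural_join(left, right, how="inner"):
    """Join rows on all attribute names the two sides share."""
    right_attrs = set().union(*(r.keys() for r in right))
    joined = []
    for l_row in left:
        matches = [
            r for r in right
            if all(l_row[k] == r[k] for k in l_row.keys() & r.keys())
        ]
        if matches:
            joined.extend({**l_row, **r} for r in matches)
        elif how == "left":
            # Keep the unmatched left row, padding right-only attributes
            padding = {k: None for k in right_attrs - set(l_row)}
            joined.append({**l_row, **padding})
    return joined


sessions = [{"subject_id": 1, "session_id": 1}, {"subject_id": 2, "session_id": 1}]
subjects = [{"subject_id": 1, "species": "mouse"}]

inner = natural_join(sessions, subjects)
left = natural_join(sessions, subjects, how="left")
print(len(inner), len(left))  # 1 2
```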

--------------------------------

### Query Operators: Join and Semi-join

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Restricts one table by another. `A & B` (semi-join) keeps the rows of `A` that have matching entries in `B`; `A - B` (anti-join) keeps the rows of `A` that have none.

```python
# Restriction by another table (semi-join)
with_data = Subject & RawData           # subjects that have raw data
without_data = Subject - RawData        # subjects missing raw data
```
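Both operators can be sketched with one helper over lists of dicts. `semijoin` and the sample rows are hypothetical, and matching on all shared attribute names is a simplification of what the library does.

```python
def semijoin(left, right, anti=False):
    """Keep left rows with (or, if anti, without) a match in right."""
    def has_match(row):
        return any(
            all(row[k] == r[k] for k in row.keys() & r.keys())
            for r in right
        )
    return [row for row in left if has_match(row) != anti]


subjects = [{"subject_id": 1}, {"subject_id": 2}, {"subject_id": 3}]
raw_data = [{"subject_id": 1, "scan": "a"}, {"subject_id": 3, "scan": "b"}]

with_data = semijoin(subjects, raw_data)
without_data = semijoin(subjects, raw_data, anti=True)
print(with_data)     # [{'subject_id': 1}, {'subject_id': 3}]
print(without_data)  # [{'subject_id': 2}]
```

Unlike a join, the result keeps only the attributes of the left operand.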

--------------------------------

### Insert Multiple Rows

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Inserts multiple rows into a table using a list of dictionaries. Supports skipping duplicate primary keys or replacing existing rows.

```python
# Multiple rows (list of dicts)
Subject.insert([
    {'subject_id': 2, 'species': 'rat',   'dob': '2024-02-01'},
    {'subject_id': 3, 'species': 'mouse', 'dob': '2024-03-10'},
])

# Skip duplicate primary keys silently
Subject.insert(rows, skip_duplicates=True)

# Replace existing rows
Subject.insert(rows, replace=True)
```
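The effect of the two flags can be pictured with a dict keyed by primary key. This is a sketch of the semantics only: `insert_sketch` is hypothetical, `KeyError` stands in for the library's own duplicate-key error, and giving `replace` precedence when both flags are set is an arbitrary choice of the sketch.

```python
def insert_sketch(table, rows, *, skip_duplicates=False, replace=False,
                  primary_key=("subject_id",)):
    """Insert rows into a dict keyed by primary-key tuple."""
    for row in rows:
        pk = tuple(row[k] for k in primary_key)
        if pk not in table or replace:
            table[pk] = dict(row)      # new row, or overwrite on replace
        elif skip_duplicates:
            continue                   # silently keep the existing row
        else:
            raise KeyError(f"duplicate primary key {pk}")


table = {}
insert_sketch(table, [{"subject_id": 2, "species": "rat"}])
insert_sketch(table, [{"subject_id": 2, "species": "mouse"}], skip_duplicates=True)
print(table[(2,)]["species"])  # rat
insert_sketch(table, [{"subject_id": 2, "species": "mouse"}], replace=True)
print(table[(2,)]["species"])  # mouse
```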

--------------------------------

### Validate Data Before Insertion

Source: https://context7.com/datajoint/datajoint-python/llms.txt

The `validate()` method checks field existence, types, null constraints, and more without database interaction. It returns a validation result object that can be used to insert data or report errors.

```python
import datajoint as dj

rows = [
    {'subject_id': 10, 'species': 'mouse', 'dob': '2024-01-01'},
    {'subject_id': 11, 'species': 'INVALID_SPECIES'},   # missing dob
    {'subject_id': 12, 'unknown_field': 'x'},            # unknown field
]

result = Subject.validate(rows)
if result:
    Subject.insert(rows)
else:
    print(result.summary())
    # Validation failed: 2 error(s) in 3 rows
    #   Row 1 in field 'dob': Required field 'dob' is missing
    #   Row 2 in field 'unknown_field': Field 'unknown_field' not in table heading
```

```python
# Access individual errors
for row_idx, field_name, msg in result.errors:
    print(f"  Row {row_idx}, {field_name}: {msg}")
```

```python
# Raise on failure
result.raise_if_invalid()
```

```python
# Ignore extra fields during validation
result = Subject.validate(rows, ignore_extra_fields=True)
```
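A reduced version of such a check, covering only field presence, might look like the following. `validate_rows`, the `HEADING` tuple, and the sample rows are hypothetical; the real `validate()` also checks types and other constraints, so its messages and error counts will differ.

```python
HEADING = ("subject_id", "species", "dob")  # assumed table heading


def validate_rows(rows, heading=HEADING, ignore_extra_fields=False):
    """Return (row_idx, field_name, message) tuples for presence errors."""
    errors = []
    for idx, row in enumerate(rows):
        for field in heading:
            if field not in row:
                errors.append((idx, field, f"Required field '{field}' is missing"))
        if not ignore_extra_fields:
            for field in row:
                if field not in heading:
                    errors.append((idx, field, f"Field '{field}' not in table heading"))
    return errors


rows = [
    {"subject_id": 10, "species": "mouse", "dob": "2024-01-01"},
    {"subject_id": 11, "species": "rat"},                              # missing dob
    {"subject_id": 12, "species": "mouse", "dob": "2024-03-01", "cage": "x"},
]

for row_idx, field_name, msg in validate_rows(rows):
    print(f"Row {row_idx}, {field_name}: {msg}")
```

Collecting every error before reporting lets a caller fix a whole batch in one pass instead of failing on the first bad row.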

--------------------------------

### Restriction by List of Dicts (OR Logic)

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Use a list of dictionaries to apply OR logic for filtering relations. Each dictionary represents a condition.

```python
some = Subject & [{'subject_id': 1}, {'subject_id': 3}]
```
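The OR-of-ANDs logic can be sketched as follows: a row is kept if it satisfies every attribute in at least one dictionary. `restrict_any` and the sample rows are hypothetical and only illustrate the matching rule.

```python
def restrict_any(rows, conditions):
    """Keep rows that match all key/value pairs of at least one condition."""
    return [
        row for row in rows
        if any(all(row.get(k) == v for k, v in cond.items()) for cond in conditions)
    ]


subjects = [
    {"subject_id": 1, "species": "mouse"},
    {"subject_id": 2, "species": "rat"},
    {"subject_id": 3, "species": "mouse"},
]

some = restrict_any(subjects, [{"subject_id": 1}, {"subject_id": 3}])
print([s["subject_id"] for s in some])  # [1, 3]
```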

--------------------------------

### Insert Single Row

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Inserts a single row into a table using a dictionary.

```python
# Single row
Subject.insert1({'subject_id': 1, 'species': 'mouse', 'dob': '2024-01-15'})
```

--------------------------------

### Access Tables with FreeTable (Explicit Connection)

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Access a DataJoint table directly by its full name using an explicit connection object. Supports standard query operators and conversion to pandas DataFrames.

```python
# Using explicit connection
conn = dj.conn()
tbl = dj.FreeTable(conn, "`my_schema`.`subject`")

# FreeTable supports all query operators
filtered = tbl & "species='mouse'"
df = filtered.to_pandas()

# Inspect structure
print(tbl.heading.names)
print(tbl.primary_key)
print(tbl.describe())           # DataJoint DDL string

# Dependency navigation
parents = tbl.parents()
children = tbl.children(as_objects=True)
ancestors = tbl.ancestors()
descendants = tbl.descendants()
parts = tbl.parts(as_objects=True)
```

--------------------------------

### Query Operators: Restriction

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Restricts a table with `&` (keep matching rows) and `-` (exclude matching rows). Chained `&` restrictions combine with AND. Conditions can be SQL-style strings or dictionaries of attribute values.

```python
# Restriction: & (AND), - (NOT)
mice = Subject & "species='mouse'"
young = Subject & "dob > '2024-01-01'"
mice_and_young = Subject & "species='mouse'" & "dob > '2024-01-01'"
not_mice = Subject - "species='mouse'"

# Restriction by dict
one = Subject & {'subject_id': 1}
```

--------------------------------

### Update Single Row

Source: https://context7.com/datajoint/datajoint-python/llms.txt

Updates a single existing row in a table. All primary key fields are required in the dictionary.

```python
# Update a single existing row (all PK fields required)
Subject.update1({'subject_id': 1, 'dob': '2024-01-20'})
```
