### Complete Python Integration Example for sqlite-ml

Source: https://context7.com/rclement/sqlite-ml/llms.txt

This Python code demonstrates a full workflow using the sqlite-ml library, including initialization, schema setup, function registration, and loading a dataset. It sets up a connection to an SQLite database, initializes the SQML object, and registers its functions. The example then loads the 'iris' dataset and prints information about the loaded data.

```python
import sqlite3
import json
from sqlite_ml.sqml import SQML

# Initialize
conn = sqlite3.connect("ml_database.db")
conn.row_factory = sqlite3.Row  # Enable dict-like row access

sqml = SQML()
sqml.setup_schema(conn)
sqml.register_functions(conn)

# Load dataset
result = conn.execute("SELECT sqml_load_dataset('iris')").fetchone()[0]
dataset_info = json.loads(result)
print(f"Loaded {dataset_info['size']} samples into {dataset_info['table']}")
```

--------------------------------

### Setup Python SQLite3 Extension with sqlite-ml

Source: https://github.com/rclement/sqlite-ml/blob/main/README.md

Demonstrates how to initialize the sqlite-ml extension within a Python SQLite3 connection. This involves creating a connection, instantiating the SQML class, setting up the schema, and registering the necessary functions for using sqlite-ml's SQL capabilities.

```python
import sqlite3

from sqlite_ml.sqml import SQML

# get a `sqlite3.Connection` object with read-write permissions
conn = sqlite3.connect(":memory:")

# setup sqlite-ml extension
sqml = SQML()
sqml.setup_schema(conn)
sqml.register_functions(conn)

# execute sqlite-ml functions
conn.execute("SELECT sqml_python_version();").fetchone()[0]
```

--------------------------------

### Initialize sqlite-ml Extension in Python

Source: https://context7.com/rclement/sqlite-ml/llms.txt

Initializes the sqlite-ml extension by setting up necessary schema tables and registering SQL functions for ML operations. It requires an active SQLite connection and demonstrates how to verify the Python and scikit-learn versions used by the extension.

```python
import sqlite3
from sqlite_ml.sqml import SQML

# Create SQLite connection (in-memory or file-based)
conn = sqlite3.connect(":memory:")
# For persistent storage:
# conn = sqlite3.connect("my_database.db")

# Initialize sqlite-ml extension
sqml = SQML()
sqml.setup_schema(conn)  # Creates sqml_experiments, sqml_runs, sqml_models, sqml_metrics, sqml_deployments tables
sqml.register_functions(conn)  # Registers sqml_* SQL functions

# Verify installation
python_version = conn.execute("SELECT sqml_python_version();").fetchone()[0]
sklearn_version = conn.execute("SELECT sqml_sklearn_version();").fetchone()[0]
print(f"Python: {python_version}")
print(f"scikit-learn: {sklearn_version}")
```

--------------------------------

### Load Sample Datasets with sqml_load_dataset

Source: https://context7.com/rclement/sqlite-ml/llms.txt

Loads predefined scikit-learn datasets into SQLite tables for testing and experimentation. Users can specify a custom table name or load all available datasets. The function returns metadata about the loaded dataset, including table name, feature names, target names, and the number of records.

```sql
-- Load the Iris dataset into default table 'dataset_iris'
SELECT sqml_load_dataset('iris') AS dataset;
-- Returns: {"table": "dataset_iris", "feature_names": ["sepal length (cm)", "sepal width (cm)", "petal length (cm)", "petal width (cm)"], "target_names": ["setosa", "versicolor", "virginica"], "size": 150}

-- Load with custom table name
SELECT sqml_load_dataset('diabetes', 'my_diabetes_data') AS dataset;
-- Returns: {"table": "my_diabetes_data", "feature_names": ["age", "sex", "bmi", "bp", "s1", "s2", "s3", "s4", "s5", "s6"], "target_names": [], "size": 442}

-- Load all available datasets
SELECT sqml_load_dataset('iris');
SELECT sqml_load_dataset('digits');
SELECT sqml_load_dataset('wine');
SELECT sqml_load_dataset('breast_cancer');
SELECT sqml_load_dataset('diabetes');
```

--------------------------------

### Load Dataset using SQL Function

Source: https://github.com/rclement/sqlite-ml/blob/main/README.md

Shows how to load a dataset into an SQLite table using the `sqml_load_dataset` SQL function. This function supports loading standard datasets from libraries like Scikit-Learn, making data readily available for ML tasks within the database.

```sql
SELECT sqml_load_dataset('iris') AS dataset;
```

--------------------------------

### Train Classification Model using SQL Function

Source: https://github.com/rclement/sqlite-ml/blob/main/README.md

Illustrates training a machine learning model for classification using the `sqml_train` SQL function. This function takes parameters such as experiment name, prediction type, algorithm, dataset table, and target column to train and deploy a model directly via SQL.

```sql
SELECT sqml_train(
  'Iris prediction',
  'classification',
  'logistic_regression',
  'dataset_iris',
  'target'
) AS training;
```

--------------------------------

### Query sqlite-ml Database Schema for ML Pipeline State

Source: https://context7.com/rclement/sqlite-ml/llms.txt

sqlite-ml creates internal tables to track experiments, training runs, models, metrics, and deployments. These tables can be queried to inspect the state of your machine learning pipeline. The provided SQL queries demonstrate how to view experiments, training runs, detailed run information, model metrics, and active deployments.

```sql
-- View all experiments
SELECT * FROM sqml_experiments;
-- Returns: id | created_at | updated_at | name | prediction_type

-- View training runs overview (aggregated view)
SELECT * FROM sqml_runs_overview;
-- Returns: run_id | experiment | start_time | status | algorithm | score

-- View detailed run information
SELECT
  sqml_runs.id,
  sqml_experiments.name AS experiment,
  sqml_runs.algorithm,
  sqml_runs.dataset,
  sqml_runs.target,
  sqml_runs.test_size,
  sqml_runs.status
FROM sqml_runs
JOIN sqml_experiments ON sqml_experiments.id = sqml_runs.experiment_id;

-- View model metrics (classification: accuracy, f1, precision, recall; regression: r2, mae, rmse)
SELECT
  sqml_experiments.name AS experiment,
  sqml_metrics.name AS metric,
  sqml_metrics.value
FROM sqml_metrics
JOIN sqml_models ON sqml_models.id = sqml_metrics.model_id
JOIN sqml_runs ON sqml_runs.id = sqml_models.run_id
JOIN sqml_experiments ON sqml_experiments.id = sqml_runs.experiment_id;

-- View active deployments
SELECT
  sqml_experiments.name AS experiment,
  sqml_deployments.id AS deployment_id,
  sqml_deployments.active,
  sqml_deployments.created_at
FROM sqml_deployments
JOIN sqml_experiments ON sqml_experiments.id = sqml_deployments.experiment_id
WHERE sqml_deployments.active = TRUE;
```

--------------------------------

### Train ML Models with sqml_train

Source: https://context7.com/rclement/sqlite-ml/llms.txt

Trains machine learning models using specified algorithms and datasets. The function automatically deploys the model if it performs better than existing ones for the same experiment. It supports both classification and regression tasks and allows customization of test size and data splitting strategies.

```sql
-- Train a logistic regression classifier
SELECT sqml_train(
  'Iris prediction',
  'classification',
  'logistic_regression',
  'dataset_iris',
  'target'
) AS training;
-- Returns: {"experiment_name": "Iris prediction", "prediction_type": "classification", "algorithm": "logistic_regression", "deployed": true, "score": 0.9473684210526315}

-- Train with custom test size and split strategy
SELECT sqml_train(
  'Diabetes prediction',
  'regression',
  'random_forest',
  'dataset_diabetes',
  'target',
  0.3,
  'shuffle'
) AS training;
-- Returns: {"experiment_name": "Diabetes prediction", "prediction_type": "regression", "algorithm": "random_forest", "deployed": true, "score": 0.42}

-- Train multiple algorithms for the same experiment (best model auto-deploys)
SELECT sqml_train('Iris prediction', 'classification', 'svc', 'dataset_iris', 'target');
SELECT sqml_train('Iris prediction', 'classification', 'random_forest', 'dataset_iris', 'target');
SELECT sqml_train('Iris prediction', 'classification', 'knn', 'dataset_iris', 'target');

-- Classification algorithms: dummy, logistic_regression, sgd, ridge, ridge_cv, decision_tree,
--                            ada_boost, bagging, gradient_boosting, random_forest, knn, mlp, svc
-- Regression algorithms: dummy, linear_regression, sgd, ridge, ridge_cv, elastic_net, elastic_net_cv,
--                        lasso, lasso_cv, decision_tree, ada_boost, bagging, gradient_boosting,
--                        random_forest, knn, mlp, svr
```

--------------------------------

### Check Deployed Model with sqlite-ml

Source: https://context7.com/rclement/sqlite-ml/llms.txt

This snippet demonstrates how to query the database to determine which machine learning model is currently active and deployed. It joins several tables (`sqml_deployments`, `sqml_models`, `sqml_runs`, `sqml_metrics`) to find the algorithm and score of the active model.

```python
deployed = conn.execute("""
    SELECT sqml_runs.algorithm, sqml_metrics.value AS score
    FROM sqml_deployments
    JOIN sqml_models ON sqml_models.id = sqml_deployments.model_id
    JOIN sqml_runs ON sqml_runs.id = sqml_models.run_id
    JOIN sqml_metrics ON sqml_metrics.model_id = sqml_models.id
    WHERE sqml_deployments.active = TRUE AND sqml_metrics.name = 'score'
""").fetchone()
print(f"Currently deployed: {deployed['algorithm']} with score {deployed['score']:.4f}")
```

--------------------------------

### Database Schema and Model Management

Source: https://context7.com/rclement/sqlite-ml/llms.txt

Query internal tables to inspect your ML pipeline state, including experiments, training runs, models, metrics, and deployments.

```APIDOC
## Database Schema and Model Management

### Description
sqlite-ml creates internal tables to track experiments, training runs, models, metrics, and deployments. Query these tables to inspect your ML pipeline state.

### Method
SQL Queries

### Endpoint
N/A (SQL Queries)

### Parameters
None

### Request Example
```sql
-- View all experiments
SELECT * FROM sqml_experiments;

-- View training runs overview (aggregated view)
SELECT * FROM sqml_runs_overview;

-- View detailed run information
SELECT
  sqml_runs.id,
  sqml_experiments.name AS experiment,
  sqml_runs.algorithm,
  sqml_runs.dataset,
  sqml_runs.target,
  sqml_runs.test_size,
  sqml_runs.status
FROM sqml_runs
JOIN sqml_experiments ON sqml_experiments.id = sqml_runs.experiment_id;

-- View model metrics (classification: accuracy, f1, precision, recall; regression: r2, mae, rmse)
SELECT
  sqml_experiments.name AS experiment,
  sqml_metrics.name AS metric,
  sqml_metrics.value
FROM sqml_metrics
JOIN sqml_models ON sqml_models.id = sqml_metrics.model_id
JOIN sqml_runs ON sqml_runs.id = sqml_models.run_id
JOIN sqml_experiments ON sqml_experiments.id = sqml_runs.experiment_id;

-- View active deployments
SELECT
  sqml_experiments.name AS experiment,
  sqml_deployments.id AS deployment_id,
  sqml_deployments.active,
  sqml_deployments.created_at
FROM sqml_deployments
JOIN sqml_experiments ON sqml_experiments.id = sqml_deployments.experiment_id
WHERE sqml_deployments.active = TRUE;
```

### Response
#### Success Response (200)
- Varies based on the query. Returns rows and columns from the specified internal tables.

#### Response Example
```json
-- Example for `SELECT * FROM sqml_experiments;`
[
  {
    "id": 1,
    "created_at": "2023-10-27 10:00:00",
    "updated_at": "2023-10-27 10:00:00",
    "name": "Iris prediction",
    "prediction_type": "classification"
  }
]
```
```

--------------------------------

### Train Multiple ML Models with sqlite-ml

Source: https://context7.com/rclement/sqlite-ml/llms.txt

This snippet demonstrates how to train multiple machine learning models (logistic regression, SVC, random forest, KNN) using the `sqml_train` function in sqlite-ml. It iterates through a list of algorithms, trains each one on the 'flower_classifier' model, and prints the score and deployment status for each.

```python
algorithms = ['logistic_regression', 'svc', 'random_forest', 'knn']
for algo in algorithms:
    result = conn.execute(f"""
        SELECT sqml_train('flower_classifier', 'classification', '{algo}', 'dataset_iris', 'target')
    """).fetchone()[0]
    training = json.loads(result)
    print(f"{algo}: score={training['score']:.4f}, deployed={training['deployed']}")
```

--------------------------------

### Train a New Classification Model using sqml_train (SVC)

Source: https://github.com/rclement/sqlite-ml/blob/main/README.md

This SQL snippet demonstrates how to train a new classification model using the `sqml_train` function. It specifies the model name, prediction type ('classification'), algorithm ('svc'), dataset table ('dataset_iris'), and the target column ('target'). The function returns metadata about the training process, including the model's score and deployment status.

```sql
SELECT sqml_train(
  'Iris prediction',
  'classification',
  'svc',
  'dataset_iris',
  'target'
) AS training;
```

--------------------------------

### Perform Batch Predictions using sqml_predict_batch

Source: https://github.com/rclement/sqlite-ml/blob/main/README.md

This SQL snippet shows how to efficiently make predictions for multiple rows in the `dataset_iris` table using `sqml_predict_batch`. It joins the dataset with the batch predictions and identifies rows where the prediction does not match the actual target label. This is useful for evaluating model accuracy on a larger scale.

```sql
SELECT
  dataset_iris.*
  batch.value AS prediction,
  dataset_iris.target = batch.value AS match
FROM
  dataset_iris
  JOIN json_each (
    (
      SELECT
        sqml_predict_batch(
          'Iris prediction',
          json_group_array(
            json_object(
              'sepal length (cm)', [sepal length (cm)],
              'sepal width (cm)', [sepal width (cm)],
              'petal length (cm)', [petal length (cm)],
              'petal width (cm)', [petal width (cm)]
            )
          )
        )
      FROM
        dataset_iris
    )
  ) batch ON (batch.rowid + 1) = dataset_iris.rowid
WHERE match = FALSE;
```

--------------------------------

### Predict Target Label for a Single Row using sqml_predict

Source: https://github.com/rclement/sqlite-ml/blob/main/README.md

This SQL snippet demonstrates how to predict the target label for a single row in the `dataset_iris` table using the `sqml_predict` function. It takes the model name and a JSON object of features as input and returns the predicted label. The output includes all original columns plus the prediction.

```sql
SELECT
  dataset_iris.*
  sqml_predict(
    'Iris prediction',
    json_object(
      'sepal length (cm)', [sepal length (cm)],
      'sepal width (cm)', [sepal width (cm)],
      'petal length (cm)', [petal length (cm)],
      'petal width (cm)', [petal width (cm)]
    )
  ) AS prediction
FROM dataset_iris
LIMIT 1;
```

--------------------------------

### Make Predictions with sqlite-ml

Source: https://context7.com/rclement/sqlite-ml/llms.txt

This code snippet shows how to use the `sqml_predict` function to generate predictions from a trained model within SQLite. It selects the target variable and predicts based on input features from the 'dataset_iris' table, then calculates the accuracy of these predictions.

```python
predictions = conn.execute("""
    SELECT
        target,
        sqml_predict('flower_classifier', json_object(
            'sepal length (cm)', [sepal length (cm)],
            'sepal width (cm)', [sepal width (cm)],
            'petal length (cm)', [petal length (cm)],
            'petal width (cm)', [petal width (cm)]
        )) AS prediction
    FROM dataset_iris
""").fetchall()

accuracy = sum(1 for p in predictions if p['target'] == p['prediction']) / len(predictions)
print(f"Accuracy on full dataset: {accuracy:.4f}")
```

--------------------------------

### sqml_predict_batch - Batch Predictions

Source: https://context7.com/rclement/sqlite-ml/llms.txt

Performs predictions on multiple data points efficiently in a single call. Returns a JSON array of predictions, ideal for large datasets.

```APIDOC
## sqml_predict_batch - Batch Predictions

### Description
Performs predictions on multiple data points efficiently in a single call. Returns a JSON array of predictions, ideal for large datasets.

### Method
SQL Function

### Endpoint
N/A (SQL Function)

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Request Example
```sql
-- Batch predict for entire dataset
SELECT sqml_predict_batch(
  'Iris prediction',
  json_group_array(
    json_object(
      'sepal length (cm)', [sepal length (cm)],
      'sepal width (cm)', [sepal width (cm)],
      'petal length (cm)', [petal length (cm)],
      'petal width (cm)', [petal width (cm)]
    )
  )
) AS predictions
FROM dataset_iris;
```

### Response
#### Success Response (200)
- **predictions** (JSON array) - An array of predicted values for each data point.

#### Response Example
```json
{
  "predictions": [0.0, 0.0, 0.0, ..., 2.0, 2.0]
}
```
```

--------------------------------

### sqml_predict - Single Row Prediction

Source: https://context7.com/rclement/sqlite-ml/llms.txt

Performs a prediction on a single data point using the currently deployed model for an experiment. Returns a float value representing the predicted class or regression value.

```APIDOC
## sqml_predict - Single Row Prediction

### Description
Performs a prediction on a single data point using the currently deployed model for an experiment. Returns a float value representing the predicted class or regression value.

### Method
SQL Function

### Endpoint
N/A (SQL Function)

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Request Example
```sql
-- Predict for a single row using JSON object features
SELECT sqml_predict(
  'Iris prediction',
  json_object(
    'sepal length (cm)', 5.1,
    'sepal width (cm)', 3.5,
    'petal length (cm)', 1.4,
    'petal width (cm)', 0.2
  )
) AS prediction;
```

### Response
#### Success Response (200)
- **prediction** (float) - The predicted class or regression value.

#### Response Example
```json
{
  "prediction": 0.0
}
```
```

--------------------------------

### Perform Single Row Prediction with sqml_predict

Source: https://context7.com/rclement/sqlite-ml/llms.txt

The `sqml_predict` function performs a prediction on a single data point using the currently deployed model for an experiment. It returns a float value representing the predicted class or regression value. It can accept features as a JSON object or by referencing columns from a table row.

```sql
-- Predict for a single row using JSON object features
SELECT sqml_predict(
  'Iris prediction',
  json_object(
    'sepal length (cm)', 5.1,
    'sepal width (cm)', 3.5,
    'petal length (cm)', 1.4,
    'petal width (cm)', 0.2
  )
) AS prediction;
-- Returns: 0.0 (setosa class)

-- Predict using values from a table row
SELECT
  dataset_iris.*
  sqml_predict(
    'Iris prediction',
    json_object(
      'sepal length (cm)', [sepal length (cm)],
      'sepal width (cm)', [sepal width (cm)],
      'petal length (cm)', [petal length (cm)],
      'petal width (cm)', [petal width (cm)]
    )
  ) AS prediction
FROM dataset_iris
LIMIT 1;
-- Returns: sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) | target | prediction
--          5.1               | 3.5              | 1.4               | 0.2              | 0.0    | 0.0

-- Predict for all rows and compare with actual target
SELECT
  *,
  target = prediction AS match
FROM (
  SELECT
    *,
    sqml_predict(
      'Iris prediction',
      json_object(
        'sepal length (cm)', [sepal length (cm)],
        'sepal width (cm)', [sepal width (cm)],
        'petal length (cm)', [petal length (cm)],
        'petal width (cm)', [petal width (cm)]
      )
    ) AS prediction
  FROM dataset_iris
)
ORDER BY match;
```

=== COMPLETE CONTENT === This response contains all available snippets from this library. No additional content exists. Do not make further requests.