### Complete Python Integration Example for sqlite-ml Source: https://context7.com/rclement/sqlite-ml/llms.txt This Python code demonstrates a full workflow using the sqlite-ml library, including initialization, schema setup, function registration, and loading a dataset. It sets up a connection to an SQLite database, initializes the SQML object, and registers its functions. The example then loads the 'iris' dataset and prints information about the loaded data. ```python import sqlite3 import json from sqlite_ml.sqml import SQML # Initialize conn = sqlite3.connect("ml_database.db") conn.row_factory = sqlite3.Row # Enable dict-like row access sqml = SQML() sqml.setup_schema(conn) sqml.register_functions(conn) # Load dataset result = conn.execute("SELECT sqml_load_dataset('iris')").fetchone()[0] dataset_info = json.loads(result) print(f"Loaded {dataset_info['size']} samples into {dataset_info['table']}") ``` -------------------------------- ### Setup Python SQLite3 Extension with sqlite-ml Source: https://github.com/rclement/sqlite-ml/blob/main/README.md Demonstrates how to initialize the sqlite-ml extension within a Python SQLite3 connection. This involves creating a connection, instantiating the SQML class, setting up the schema, and registering the necessary functions for using sqlite-ml's SQL capabilities. ```python import sqlite3 from sqlite_ml.sqml import SQML # get a `sqlite3.Connection` object with read-write permissions conn = sqlite3.connect(":memory:") # setup sqlite-ml extension sqml = SQML() sqml.setup_schema(conn) sqml.register_functions(conn) # execute sqlite-ml functions conn.execute("SELECT sqml_python_version();").fetchone()[0] ``` -------------------------------- ### Initialize sqlite-ml Extension in Python Source: https://context7.com/rclement/sqlite-ml/llms.txt Initializes the sqlite-ml extension by setting up necessary schema tables and registering SQL functions for ML operations. It requires an active SQLite connection and demonstrates how to verify the Python and scikit-learn versions used by the extension. ```python import sqlite3 from sqlite_ml.sqml import SQML # Create SQLite connection (in-memory or file-based) conn = sqlite3.connect(":memory:") # For persistent storage: # conn = sqlite3.connect("my_database.db") # Initialize sqlite-ml extension sqml = SQML() sqml.setup_schema(conn) # Creates sqml_experiments, sqml_runs, sqml_models, sqml_metrics, sqml_deployments tables sqml.register_functions(conn) # Registers sqml_* SQL functions # Verify installation python_version = conn.execute("SELECT sqml_python_version();").fetchone()[0] sklearn_version = conn.execute("SELECT sqml_sklearn_version();").fetchone()[0] print(f"Python: {python_version}") print(f"scikit-learn: {sklearn_version}") ``` -------------------------------- ### Load Sample Datasets with sqml_load_dataset Source: https://context7.com/rclement/sqlite-ml/llms.txt Loads predefined scikit-learn datasets into SQLite tables for testing and experimentation. Users can specify a custom table name or load all available datasets. The function returns metadata about the loaded dataset, including table name, feature names, target names, and the number of records. ```sql -- Load the Iris dataset into default table 'dataset_iris' SELECT sqml_load_dataset('iris') AS dataset; -- Returns: {"table": "dataset_iris", "feature_names": ["sepal length (cm)", "sepal width (cm)", "petal length (cm)", "petal width (cm)"], "target_names": ["setosa", "versicolor", "virginica"], "size": 150} -- Load with custom table name SELECT sqml_load_dataset('diabetes', 'my_diabetes_data') AS dataset; -- Returns: {"table": "my_diabetes_data", "feature_names": ["age", "sex", "bmi", "bp", "s1", "s2", "s3", "s4", "s5", "s6"], "target_names": [], "size": 442} -- Load all available datasets SELECT sqml_load_dataset('iris'); SELECT sqml_load_dataset('digits'); SELECT sqml_load_dataset('wine'); SELECT sqml_load_dataset('breast_cancer'); SELECT sqml_load_dataset('diabetes'); ``` -------------------------------- ### Load Dataset using SQL Function Source: https://github.com/rclement/sqlite-ml/blob/main/README.md Shows how to load a dataset into an SQLite table using the `sqml_load_dataset` SQL function. This function supports loading standard datasets from libraries like Scikit-Learn, making data readily available for ML tasks within the database. ```sql SELECT sqml_load_dataset('iris') AS dataset; ``` -------------------------------- ### Train Classification Model using SQL Function Source: https://github.com/rclement/sqlite-ml/blob/main/README.md Illustrates training a machine learning model for classification using the `sqml_train` SQL function. This function takes parameters such as experiment name, prediction type, algorithm, dataset table, and target column to train and deploy a model directly via SQL. ```sql SELECT sqml_train( 'Iris prediction', 'classification', 'logistic_regression', 'dataset_iris', 'target' ) AS training; ``` -------------------------------- ### Query sqlite-ml Database Schema for ML Pipeline State Source: https://context7.com/rclement/sqlite-ml/llms.txt sqlite-ml creates internal tables to track experiments, training runs, models, metrics, and deployments. These tables can be queried to inspect the state of your machine learning pipeline. The provided SQL queries demonstrate how to view experiments, training runs, detailed run information, model metrics, and active deployments. ```sql -- View all experiments SELECT * FROM sqml_experiments; -- Returns: id | created_at | updated_at | name | prediction_type -- View training runs overview (aggregated view) SELECT * FROM sqml_runs_overview; -- Returns: run_id | experiment | start_time | status | algorithm | score -- View detailed run information SELECT sqml_runs.id, sqml_experiments.name AS experiment, sqml_runs.algorithm, sqml_runs.dataset, sqml_runs.target, sqml_runs.test_size, sqml_runs.status FROM sqml_runs JOIN sqml_experiments ON sqml_experiments.id = sqml_runs.experiment_id; -- View model metrics (classification: accuracy, f1, precision, recall; regression: r2, mae, rmse) SELECT sqml_experiments.name AS experiment, sqml_metrics.name AS metric, sqml_metrics.value FROM sqml_metrics JOIN sqml_models ON sqml_models.id = sqml_metrics.model_id JOIN sqml_runs ON sqml_runs.id = sqml_models.run_id JOIN sqml_experiments ON sqml_experiments.id = sqml_runs.experiment_id; -- View active deployments SELECT sqml_experiments.name AS experiment, sqml_deployments.id AS deployment_id, sqml_deployments.active, sqml_deployments.created_at FROM sqml_deployments JOIN sqml_experiments ON sqml_experiments.id = sqml_deployments.experiment_id WHERE sqml_deployments.active = TRUE; ``` -------------------------------- ### Train ML Models with sqml_train Source: https://context7.com/rclement/sqlite-ml/llms.txt Trains machine learning models using specified algorithms and datasets. The function automatically deploys the model if it performs better than existing ones for the same experiment. It supports both classification and regression tasks and allows customization of test size and data splitting strategies. ```sql -- Train a logistic regression classifier SELECT sqml_train( 'Iris prediction', 'classification', 'logistic_regression', 'dataset_iris', 'target' ) AS training; -- Returns: {"experiment_name": "Iris prediction", "prediction_type": "classification", "algorithm": "logistic_regression", "deployed": true, "score": 0.9473684210526315} -- Train with custom test size and split strategy SELECT sqml_train( 'Diabetes prediction', 'regression', 'random_forest', 'dataset_diabetes', 'target', 0.3, 'shuffle' ) AS training; -- Returns: {"experiment_name": "Diabetes prediction", "prediction_type": "regression", "algorithm": "random_forest", "deployed": true, "score": 0.42} -- Train multiple algorithms for the same experiment (best model auto-deploys) SELECT sqml_train('Iris prediction', 'classification', 'svc', 'dataset_iris', 'target'); SELECT sqml_train('Iris prediction', 'classification', 'random_forest', 'dataset_iris', 'target'); SELECT sqml_train('Iris prediction', 'classification', 'knn', 'dataset_iris', 'target'); -- Classification algorithms: dummy, logistic_regression, sgd, ridge, ridge_cv, decision_tree, -- ada_boost, bagging, gradient_boosting, random_forest, knn, mlp, svc -- Regression algorithms: dummy, linear_regression, sgd, ridge, ridge_cv, elastic_net, elastic_net_cv, -- lasso, lasso_cv, decision_tree, ada_boost, bagging, gradient_boosting, -- random_forest, knn, mlp, svr ``` -------------------------------- ### Check Deployed Model with sqlite-ml Source: https://context7.com/rclement/sqlite-ml/llms.txt This snippet demonstrates how to query the database to determine which machine learning model is currently active and deployed. It joins several tables (`sqml_deployments`, `sqml_models`, `sqml_runs`, `sqml_metrics`) to find the algorithm and score of the active model. ```python deployed = conn.execute(""" SELECT sqml_runs.algorithm, sqml_metrics.value AS score FROM sqml_deployments JOIN sqml_models ON sqml_models.id = sqml_deployments.model_id JOIN sqml_runs ON sqml_runs.id = sqml_models.run_id JOIN sqml_metrics ON sqml_metrics.model_id = sqml_models.id WHERE sqml_deployments.active = TRUE AND sqml_metrics.name = 'score' """).fetchone() print(f"Currently deployed: {deployed['algorithm']} with score {deployed['score']:.4f}") ``` -------------------------------- ### Database Schema and Model Management Source: https://context7.com/rclement/sqlite-ml/llms.txt Query internal tables to inspect your ML pipeline state, including experiments, training runs, models, metrics, and deployments. ```APIDOC ## Database Schema and Model Management ### Description sqlite-ml creates internal tables to track experiments, training runs, models, metrics, and deployments. Query these tables to inspect your ML pipeline state. ### Method SQL Queries ### Endpoint N/A (SQL Queries) ### Parameters None ### Request Example ```sql -- View all experiments SELECT * FROM sqml_experiments; -- View training runs overview (aggregated view) SELECT * FROM sqml_runs_overview; -- View detailed run information SELECT sqml_runs.id, sqml_experiments.name AS experiment, sqml_runs.algorithm, sqml_runs.dataset, sqml_runs.target, sqml_runs.test_size, sqml_runs.status FROM sqml_runs JOIN sqml_experiments ON sqml_experiments.id = sqml_runs.experiment_id; -- View model metrics (classification: accuracy, f1, precision, recall; regression: r2, mae, rmse) SELECT sqml_experiments.name AS experiment, sqml_metrics.name AS metric, sqml_metrics.value FROM sqml_metrics JOIN sqml_models ON sqml_models.id = sqml_metrics.model_id JOIN sqml_runs ON sqml_runs.id = sqml_models.run_id JOIN sqml_experiments ON sqml_experiments.id = sqml_runs.experiment_id; -- View active deployments SELECT sqml_experiments.name AS experiment, sqml_deployments.id AS deployment_id, sqml_deployments.active, sqml_deployments.created_at FROM sqml_deployments JOIN sqml_experiments ON sqml_experiments.id = sqml_deployments.experiment_id WHERE sqml_deployments.active = TRUE; ``` ### Response #### Success Response (200) - Varies based on the query. Returns rows and columns from the specified internal tables. #### Response Example ```json -- Example for `SELECT * FROM sqml_experiments;` [ { "id": 1, "created_at": "2023-10-27 10:00:00", "updated_at": "2023-10-27 10:00:00", "name": "Iris prediction", "prediction_type": "classification" } ] ``` ``` -------------------------------- ### Train Multiple ML Models with sqlite-ml Source: https://context7.com/rclement/sqlite-ml/llms.txt This snippet demonstrates how to train multiple machine learning models (logistic regression, SVC, random forest, KNN) using the `sqml_train` function in sqlite-ml. It iterates through a list of algorithms, trains each one on the 'flower_classifier' model, and prints the score and deployment status for each. ```python algorithms = ['logistic_regression', 'svc', 'random_forest', 'knn'] for algo in algorithms: result = conn.execute(f""" SELECT sqml_train('flower_classifier', 'classification', '{algo}', 'dataset_iris', 'target') """).fetchone()[0] training = json.loads(result) print(f"{algo}: score={training['score']:.4f}, deployed={training['deployed']}") ``` -------------------------------- ### Train a New Classification Model using sqml_train (SVC) Source: https://github.com/rclement/sqlite-ml/blob/main/README.md This SQL snippet demonstrates how to train a new classification model using the `sqml_train` function. It specifies the model name, prediction type ('classification'), algorithm ('svc'), dataset table ('dataset_iris'), and the target column ('target'). The function returns metadata about the training process, including the model's score and deployment status. ```sql SELECT sqml_train( 'Iris prediction', 'classification', 'svc', 'dataset_iris', 'target' ) AS training; ``` -------------------------------- ### Perform Batch Predictions using sqml_predict_batch Source: https://github.com/rclement/sqlite-ml/blob/main/README.md This SQL snippet shows how to efficiently make predictions for multiple rows in the `dataset_iris` table using `sqml_predict_batch`. It joins the dataset with the batch predictions and identifies rows where the prediction does not match the actual target label. This is useful for evaluating model accuracy on a larger scale. ```sql SELECT dataset_iris.* batch.value AS prediction, dataset_iris.target = batch.value AS match FROM dataset_iris JOIN json_each ( ( SELECT sqml_predict_batch( 'Iris prediction', json_group_array( json_object( 'sepal length (cm)', [sepal length (cm)], 'sepal width (cm)', [sepal width (cm)], 'petal length (cm)', [petal length (cm)], 'petal width (cm)', [petal width (cm)] ) ) ) FROM dataset_iris ) ) batch ON (batch.rowid + 1) = dataset_iris.rowid WHERE match = FALSE; ``` -------------------------------- ### Predict Target Label for a Single Row using sqml_predict Source: https://github.com/rclement/sqlite-ml/blob/main/README.md This SQL snippet demonstrates how to predict the target label for a single row in the `dataset_iris` table using the `sqml_predict` function. It takes the model name and a JSON object of features as input and returns the predicted label. The output includes all original columns plus the prediction. ```sql SELECT dataset_iris.* sqml_predict( 'Iris prediction', json_object( 'sepal length (cm)', [sepal length (cm)], 'sepal width (cm)', [sepal width (cm)], 'petal length (cm)', [petal length (cm)], 'petal width (cm)', [petal width (cm)] ) ) AS prediction FROM dataset_iris LIMIT 1; ``` -------------------------------- ### Make Predictions with sqlite-ml Source: https://context7.com/rclement/sqlite-ml/llms.txt This code snippet shows how to use the `sqml_predict` function to generate predictions from a trained model within SQLite. It selects the target variable and predicts based on input features from the 'dataset_iris' table, then calculates the accuracy of these predictions. ```python predictions = conn.execute(""" SELECT target, sqml_predict('flower_classifier', json_object( 'sepal length (cm)', [sepal length (cm)], 'sepal width (cm)', [sepal width (cm)], 'petal length (cm)', [petal length (cm)], 'petal width (cm)', [petal width (cm)] )) AS prediction FROM dataset_iris """).fetchall() accuracy = sum(1 for p in predictions if p['target'] == p['prediction']) / len(predictions) print(f"Accuracy on full dataset: {accuracy:.4f}") ``` -------------------------------- ### sqml_predict_batch - Batch Predictions Source: https://context7.com/rclement/sqlite-ml/llms.txt Performs predictions on multiple data points efficiently in a single call. Returns a JSON array of predictions, ideal for large datasets. ```APIDOC ## sqml_predict_batch - Batch Predictions ### Description Performs predictions on multiple data points efficiently in a single call. Returns a JSON array of predictions, ideal for large datasets. ### Method SQL Function ### Endpoint N/A (SQL Function) ### Parameters #### Path Parameters None #### Query Parameters None #### Request Body None ### Request Example ```sql -- Batch predict for entire dataset SELECT sqml_predict_batch( 'Iris prediction', json_group_array( json_object( 'sepal length (cm)', [sepal length (cm)], 'sepal width (cm)', [sepal width (cm)], 'petal length (cm)', [petal length (cm)], 'petal width (cm)', [petal width (cm)] ) ) ) AS predictions FROM dataset_iris; ``` ### Response #### Success Response (200) - **predictions** (JSON array) - An array of predicted values for each data point. #### Response Example ```json { "predictions": [0.0, 0.0, 0.0, ..., 2.0, 2.0] } ``` ``` -------------------------------- ### sqml_predict - Single Row Prediction Source: https://context7.com/rclement/sqlite-ml/llms.txt Performs a prediction on a single data point using the currently deployed model for an experiment. Returns a float value representing the predicted class or regression value. ```APIDOC ## sqml_predict - Single Row Prediction ### Description Performs a prediction on a single data point using the currently deployed model for an experiment. Returns a float value representing the predicted class or regression value. ### Method SQL Function ### Endpoint N/A (SQL Function) ### Parameters #### Path Parameters None #### Query Parameters None #### Request Body None ### Request Example ```sql -- Predict for a single row using JSON object features SELECT sqml_predict( 'Iris prediction', json_object( 'sepal length (cm)', 5.1, 'sepal width (cm)', 3.5, 'petal length (cm)', 1.4, 'petal width (cm)', 0.2 ) ) AS prediction; ``` ### Response #### Success Response (200) - **prediction** (float) - The predicted class or regression value. #### Response Example ```json { "prediction": 0.0 } ``` ``` -------------------------------- ### Perform Single Row Prediction with sqml_predict Source: https://context7.com/rclement/sqlite-ml/llms.txt The `sqml_predict` function performs a prediction on a single data point using the currently deployed model for an experiment. It returns a float value representing the predicted class or regression value. It can accept features as a JSON object or by referencing columns from a table row. ```sql -- Predict for a single row using JSON object features SELECT sqml_predict( 'Iris prediction', json_object( 'sepal length (cm)', 5.1, 'sepal width (cm)', 3.5, 'petal length (cm)', 1.4, 'petal width (cm)', 0.2 ) ) AS prediction; -- Returns: 0.0 (setosa class) -- Predict using values from a table row SELECT dataset_iris.* sqml_predict( 'Iris prediction', json_object( 'sepal length (cm)', [sepal length (cm)], 'sepal width (cm)', [sepal width (cm)], 'petal length (cm)', [petal length (cm)], 'petal width (cm)', [petal width (cm)] ) ) AS prediction FROM dataset_iris LIMIT 1; -- Returns: sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) | target | prediction -- 5.1 | 3.5 | 1.4 | 0.2 | 0.0 | 0.0 -- Predict for all rows and compare with actual target SELECT *, target = prediction AS match FROM ( SELECT *, sqml_predict( 'Iris prediction', json_object( 'sepal length (cm)', [sepal length (cm)], 'sepal width (cm)', [sepal width (cm)], 'petal length (cm)', [petal length (cm)], 'petal width (cm)', [petal width (cm)] ) ) AS prediction FROM dataset_iris ) ORDER BY match; ``` === COMPLETE CONTENT === This response contains all available snippets from this library. No additional content exists. Do not make further requests.