### Install modAL Python Package

Source: https://github.com/modal-python/modal/blob/master/README.md

Provides commands for installing the modAL Python library using pip. It includes installation from PyPI and directly from the GitHub repository.

```bash
pip install modAL-python
```

```bash
pip install git+https://github.com/modAL-python/modAL.git
```

--------------------------------

### Install modAL Python from Source

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/overview/Installation.rst

Installs the modAL Python library directly from its GitHub repository. This method is useful for developers who want to use the latest unreleased features or contribute to the project. Requires Git and pip.

```shell
pip install git+https://github.com/modAL-python/modAL.git
```

--------------------------------

### Install modAL Python via Pip

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/overview/Installation.rst

Installs the modAL Python library from the Python Package Index (PyPI). This is the standard and recommended method for most users. Ensure you have pip installed and a compatible Python environment.

```shell
pip install modAL-python
```

--------------------------------

### Initialize ActiveLearner with Initial Training Data

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/models/ActiveLearner.rst

Shows how to initialize an ActiveLearner with pre-existing training data. The initial samples and their corresponding labels are passed using the X_training and y_training arguments, allowing the learner to start with a trained estimator.

```python
from modAL.models import ActiveLearner
from modAL.uncertainty import uncertainty_sampling
from sklearn.ensemble import RandomForestClassifier

# Assuming X_training and y_training are defined
learner = ActiveLearner(
    estimator=RandomForestClassifier(),
    query_strategy=uncertainty_sampling,
    X_training=X_training, y_training=y_training
)
```

--------------------------------

### Initialize Active Learner with Gaussian Process

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/active_regression.ipynb

Sets up the ActiveLearner with a GaussianProcessRegressor estimator and the custom query strategy. It initializes the learner with a small subset of the data to start the learning process.

```Python
n_initial = 5
initial_idx = np.random.choice(range(len(X)), size=n_initial, replace=False)
X_training, y_training = X[initial_idx], y[initial_idx]

kernel = RBF(length_scale=1.0, length_scale_bounds=(1e-2, 1e3)) \
         + WhiteKernel(noise_level=1, noise_level_bounds=(1e-10, 1e+1))

regressor = ActiveLearner(
    estimator=GaussianProcessRegressor(kernel=kernel),
    query_strategy=GP_regression_std,
    X_training=X_training.reshape(-1, 1), y_training=y_training.reshape(-1, 1)
)
```

--------------------------------

### Load and Visualize Iris Dataset

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/query_by_committee.ipynb

Loads the iris dataset using scikit-learn and visualizes it in 2D using PCA. This step prepares the data for the active learning experiment and provides a visual understanding of the data distribution.

```python
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline
from sklearn.decomposition import PCA
from sklearn.datasets import load_iris

# loading the iris dataset
iris = load_iris()

# visualizing the classes
with plt.style.context('seaborn-white'):
    plt.figure(figsize=(7, 7))
    pca = PCA(n_components=2).fit_transform(iris['data'])
    plt.scatter(x=pca[:, 0], y=pca[:, 1], c=iris['target'], cmap='viridis', s=50)
    plt.title('The iris dataset')
    plt.show()
```

--------------------------------

### Initialize Committee Members

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/query_by_committee.ipynb

Initializes multiple `ActiveLearner` instances, each with a `RandomForestClassifier`, to form the committee. It randomly selects initial training data for each learner and removes it from the pool.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from modAL.models import ActiveLearner, Committee

# initializing Committee members
n_members = 2
learner_list = list()

for member_idx in range(n_members):
    # initial training data
    n_initial = 2
    train_idx = np.random.choice(range(X_pool.shape[0]), size=n_initial, replace=False)
    X_train = X_pool[train_idx]
    y_train = y_pool[train_idx]

    # creating a reduced copy of the data with the known instances removed
    X_pool = np.delete(X_pool, train_idx, axis=0)
    y_pool = np.delete(y_pool, train_idx)

    # initializing learner
    learner = ActiveLearner(
        estimator=RandomForestClassifier(),
        X_training=X_train, y_training=y_train
    )
    learner_list.append(learner)

# assembling the committee
committee = Committee(learner_list=learner_list)
```

--------------------------------

### Prepare Data for Active Regression (Python)

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/overview/modAL-in-a-nutshell.rst

Provides an example of preparing data for an active regression task, specifically learning a noisy sine function.
It involves generating synthetic data with added noise using NumPy.

```python
import numpy as np

X = np.random.choice(np.linspace(0, 20, 10000), size=200, replace=False).reshape(-1, 1)
y = np.sin(X) + np.random.normal(scale=0.3, size=X.shape)
```

--------------------------------

### Visualize Committee Initial Predictions

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/query_by_committee.ipynb

Visualizes the initial predictions made by each individual learner within the committee on the PCA-transformed iris dataset. This helps to see the diversity of initial hypotheses.

```python
with plt.style.context('seaborn-white'):
    plt.figure(figsize=(n_members*7, 7))
    for learner_idx, learner in enumerate(committee):
        plt.subplot(1, n_members, learner_idx + 1)
        plt.scatter(x=pca[:, 0], y=pca[:, 1], c=learner.predict(iris['data']), cmap='viridis', s=50)
        plt.title('Learner no. %d initial predictions' % (learner_idx + 1))
    plt.show()
```

--------------------------------

### Create and Visualize Dataset (Python)

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/bootstrapping_and_bagging.ipynb

Generates a synthetic dataset representing three black disks on a white background using NumPy and itertools. The dataset is then visualized using Matplotlib, displaying the image with the shapes to be learned.

```Python
import numpy as np
from itertools import product

# creating the dataset
im_width = 500
im_height = 500
data = np.zeros((im_height, im_width))

# each disk is coded as a triple (x, y, r), where x and y are the centers and r is the radius
disks = [(150, 150, 80), (200, 380, 50), (360, 200, 100)]
for i, j in product(range(im_width), range(im_height)):
    for x, y, r in disks:
        if (x-i)**2 + (y-j)**2 < r**2:
            data[i, j] = 1
```

```Python
import matplotlib.pyplot as plt

# visualizing the dataset
with plt.style.context('seaborn-white'):
    plt.figure(figsize=(7, 7))
    plt.imshow(data)
    plt.title('The shapes to learn')
    plt.show()
```

--------------------------------

### Initialize Learners with Bootstrapping (Python)

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/bootstrapping_and_bagging.ipynb

Prepares the data pool and initializes multiple ActiveLearner instances with bootstrapping enabled. Each learner is trained on a bootstrapped subset of the data, and they are stored in a list for later aggregation.

```Python
from sklearn.neighbors import KNeighborsClassifier
from modAL.models import ActiveLearner

# create the pool from the image
X_pool = np.transpose(
    [np.tile(np.asarray(range(data.shape[0])), data.shape[1]),
     np.repeat(np.asarray(range(data.shape[1])), data.shape[0])]
)
# map the intensity values against the grid
y_pool = np.asarray([data[P[0], P[1]] for P in X_pool])

# initial training data
initial_idx = np.random.choice(range(len(X_pool)), size=500)

# initializing the learners
n_learners = 3
learner_list = []
for _ in range(n_learners):
    learner = ActiveLearner(
        estimator=KNeighborsClassifier(n_neighbors=10),
        X_training=X_pool[initial_idx], y_training=y_pool[initial_idx],
        bootstrap_init=True
    )
    learner_list.append(learner)
```

--------------------------------

### Define GP Regression Standard Deviation Query Strategy (Python)

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/overview/modAL-in-a-nutshell.rst

This Python function defines a query strategy for Gaussian Process regression. It takes a regressor object and a pool of examples (X) as input, predicts the standard deviation for each example, and returns the index of the instance with the highest standard deviation. This is commonly used in active learning to select the most uncertain sample for labeling.

```python
def GP_regression_std(regressor, X):
    _, std = regressor.predict(X, return_std=True)
    return np.argmax(std)
```

--------------------------------

### Initialize ActiveLearner with RandomForestClassifier (Python)

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/overview/modAL-in-a-nutshell.rst

Demonstrates the basic initialization of an ActiveLearner using a scikit-learn RandomForestClassifier. It shows how to set up the initial training data and prepare for querying new instances from a pool.

```python
from modAL.models import ActiveLearner
from sklearn.ensemble import RandomForestClassifier

# initializing the learner
learner = ActiveLearner(
    estimator=RandomForestClassifier(),
    X_training=X_training, y_training=y_training
)

# query for labels
query_idx, query_inst = learner.query(X_pool)

# ...obtaining new labels from the Oracle...

# supply label for queried instance
learner.teach(X_pool[query_idx], y_new)
```

--------------------------------

### Get Maximum Value from Optimizer

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/models/BayesianOptimizer.rst

Retrieves the best maximum value found so far, together with its corresponding input, from the BayesianOptimizer instance.

```python
X_max, y_max = optimizer.get_max()
```

--------------------------------

### Initialize ActiveLearner with Default Strategy

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/models/ActiveLearner.rst

Demonstrates the basic initialization of an ActiveLearner object. It requires a scikit-learn estimator and optionally accepts a query strategy function. The default strategy is maximum uncertainty sampling.

```python
from modAL.models import ActiveLearner
from modAL.uncertainty import uncertainty_sampling
from sklearn.ensemble import RandomForestClassifier

learner = ActiveLearner(
    estimator=RandomForestClassifier(),
    query_strategy=uncertainty_sampling
)
```

--------------------------------

### Committee Iteration and Length

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/models/Committee.rst

Shows how to iterate through the learners within a Committee object and how to get the total number of learners using the len() function.

```python
# Iterate through each learner in the committee
for learner in committee:
    # Perform actions with the individual learner
    pass

# Get the number of learners in the committee
num_learners = len(committee)
```

--------------------------------

### Basic Active Learner Initialization and Usage

Source: https://github.com/modal-python/modal/blob/master/README.md

Demonstrates initializing an ActiveLearner with a scikit-learn estimator and performing basic query and teach operations. This snippet shows how to set up a learner with a RandomForestClassifier and interact with it by querying for new instances and teaching it with new labels.

```python
from modAL.models import ActiveLearner
from sklearn.ensemble import RandomForestClassifier

# initializing the learner
learner = ActiveLearner(
    estimator=RandomForestClassifier(),
    X_training=X_training, y_training=y_training
)

# query for labels
query_idx, query_inst = learner.query(X_pool)

# ...obtaining new labels from the Oracle...

# supply label for queried instance
learner.teach(X_pool[query_idx], y_new)
```

--------------------------------

### Visualize Performance Improvement

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/query_by_committee.ipynb

Plots the performance history of the committee over the active learning queries. This visualization shows how the committee's accuracy improves as it queries more data points.

```python
with plt.style.context('seaborn-white'):
    plt.figure(figsize=(7, 7))
    plt.plot(range(n_queries + 1), performance_history)
    plt.xlabel('Number of queries')
    plt.ylabel('Accuracy')
    plt.title('Query by Committee performance history')
    plt.show()
```

--------------------------------

### Initialize Active Learner with Gaussian Process

Source: https://github.com/modal-python/modal/blob/master/README.md

Initializes the ActiveLearner with a GaussianProcessRegressor estimator and the custom GP query strategy.
It sets up the initial training data and the regressor configuration.

```Python
from modAL.models import ActiveLearner
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import WhiteKernel, RBF

n_initial = 5
initial_idx = np.random.choice(range(len(X)), size=n_initial, replace=False)
X_training, y_training = X[initial_idx], y[initial_idx]

kernel = RBF(length_scale=1.0, length_scale_bounds=(1e-2, 1e3)) \
         + WhiteKernel(noise_level=1, noise_level_bounds=(1e-10, 1e+1))

regressor = ActiveLearner(
    estimator=GaussianProcessRegressor(kernel=kernel),
    query_strategy=GP_regression_std,
    X_training=X_training.reshape(-1, 1), y_training=y_training.reshape(-1, 1)
)
```

--------------------------------

### Python Interpreter Output Example

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/query_strategies/Disagreement-sampling.rst

Illustrative output from a Python interpreter showing numerical data arrays, potentially representing predictions or disagreement scores, used in active learning scenarios.

```python
>>> # Example data arrays
>>> data_array_1
[0.27549995, 0.23005799, 0.69397192]
>>> data_array_2
[0.69314718, 0.34053564, 0.22380466]
>>> data_array_3
[0.04613903, 0.02914912, 0.15686827]
>>> data_array_4
[0.70556709, 0.40546511, 0.17201121]
>>> # Example max disagreement calculation
>>> max_disagreement
[0.80234647, 0.69397192, 0.69314718, 0.15686827, 0.70556709]
```

--------------------------------

### Initialize ActiveLearner with Batch Sampling (Python)

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/ranked_batch_mode.ipynb

This snippet demonstrates initializing the ActiveLearner model from the modAL library. It configures the learner with a KNeighborsClassifier estimator and a custom batch sampling strategy that retrieves a fixed number of instances per query. Key dependencies include modAL.models.ActiveLearner and functools.partial.

```Python
from functools import partial

from modAL.batch import uncertainty_batch_sampling
from modAL.models import ActiveLearner

# Pre-set our batch sampling to retrieve 3 samples at a time.
BATCH_SIZE = 3
preset_batch = partial(uncertainty_batch_sampling, n_instances=BATCH_SIZE)

# Specify our active learning model.
learner = ActiveLearner(
    estimator=knn,
    X_training=X_train,
    y_training=y_train,
    query_strategy=preset_batch
)
```

--------------------------------

### Prepare Data Pool

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/query_by_committee.ipynb

Creates a deep copy of the iris dataset to serve as the pool of unlabeled data for the active learning process. This ensures the original dataset remains intact.

```python
from copy import deepcopy

# generate the pool
X_pool = deepcopy(iris['data'])
y_pool = deepcopy(iris['target'])
```

--------------------------------

### Import Libraries for Active Learning

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/interactive_labeling.ipynb

Imports necessary libraries for active learning, including modAL components, scikit-learn for dataset and classifier, and IPython/matplotlib for visualization and interaction.

```python
import numpy as np
from modAL.models import ActiveLearner
from modAL.uncertainty import uncertainty_sampling
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from IPython import display
from matplotlib import pyplot as plt
%matplotlib inline
```

--------------------------------

### Set random seed for reproducibility

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/ranked_batch_mode.ipynb

Sets the random number generator seed for NumPy to ensure reproducible results across different runs of the script. This is a common practice in machine learning examples.

```python
import numpy as np

# Set our RNG for reproducibility.
RANDOM_STATE_SEED = 123
np.random.seed(RANDOM_STATE_SEED)
```

--------------------------------

### Prepare Dataset and Initial Training/Pool Sets

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/interactive_labeling.ipynb

Loads the handwritten digits dataset, splits it into training and testing sets, and then partitions the training data into an initial labeled set and a pool of unlabeled data for active learning.

```python
# n_initial (the size of the initial labeled set) is assumed to be defined earlier
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y)

initial_idx = np.random.choice(range(len(X_train)), size=n_initial, replace=False)

X_initial, y_initial = X_train[initial_idx], y_train[initial_idx]
X_pool, y_pool = np.delete(X_train, initial_idx, axis=0), np.delete(y_train, initial_idx, axis=0)
```

--------------------------------

### Committee Initial Prediction and Accuracy

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/query_by_committee.ipynb

Calculates and visualizes the committee's initial prediction accuracy and the consensus prediction on the iris dataset. The committee's prediction is an aggregation of its members' predictions.

```python
unqueried_score = committee.score(iris['data'], iris['target'])

with plt.style.context('seaborn-white'):
    plt.figure(figsize=(7, 7))
    prediction = committee.predict(iris['data'])
    plt.scatter(x=pca[:, 0], y=pca[:, 1], c=prediction, cmap='viridis', s=50)
    plt.title('Committee initial predictions, accuracy = %1.3f' % unqueried_score)
    plt.show()
```

--------------------------------

### Initialize Bayesian Optimizer

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/bayesian_optimization.ipynb

Initializes the BayesianOptimizer from modAL. It takes the regressor, initial training data, and the query strategy (Expected Improvement maximization) as parameters.

```Python
# initializing the optimizer
optimizer = BayesianOptimizer(
    estimator=regressor,
    X_training=X_initial, y_training=y_initial,
    query_strategy=max_EI
)
```

--------------------------------

### Active Learning Loop with Query by Committee

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/query_by_committee.ipynb

Executes the active learning loop for a specified number of queries. In each iteration, the committee queries an instance where its members disagree the most, and then learns from the newly labeled instance.

```python
performance_history = [unqueried_score]

# query by committee
n_queries = 20
for idx in range(n_queries):
    query_idx, query_instance = committee.query(X_pool)
    committee.teach(
        X=X_pool[query_idx].reshape(1, -1),
        y=y_pool[query_idx].reshape(1, )
    )
    performance_history.append(committee.score(iris['data'], iris['target']))
    # remove queried instance from pool
    X_pool = np.delete(X_pool, query_idx, axis=0)
    y_pool = np.delete(y_pool, query_idx)
```

--------------------------------

### Initialize Active Learner

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/interactive_labeling.ipynb

Initializes the ActiveLearner with a RandomForestClassifier as the estimator and uncertainty_sampling as the query strategy. It uses the prepared initial training data.

```python
learner = ActiveLearner(
    estimator=RandomForestClassifier(),
    query_strategy=uncertainty_sampling,
    X_training=X_initial, y_training=y_initial
)
```

--------------------------------

### Assemble and Visualize Committee Model (Python)

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/bootstrapping_and_bagging.ipynb

Bundles the initialized ActiveLearner instances into a modAL Committee model. The predictions of each individual learner within the committee are then visualized, followed by the committee's consensus predictions.

```Python
# assembling the Committee
committee = Committee(learner_list)

# visualizing every learner in the Committee
with plt.style.context('seaborn-white'):
    plt.figure(figsize=(7*n_learners, 7))
    for learner_idx, learner in enumerate(committee):
        plt.subplot(1, n_learners, learner_idx+1)
        plt.imshow(learner.predict(X_pool).reshape(im_height, im_width))
        plt.title('Learner no. %d' % (learner_idx + 1))
    plt.show()

# visualizing the Committee's predictions
with plt.style.context('seaborn-white'):
    plt.figure(figsize=(7, 7))
    plt.imshow(committee.predict(X_pool).reshape(im_height, im_width))
    plt.title('Committee consensus predictions')
    plt.show()
```

--------------------------------

### Set Random Seed for Reproducibility

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/query_by_committee.ipynb

Sets the NumPy random number generator seed to ensure reproducible results across different runs of the active learning process. This is a common practice for debugging and consistent experimentation.

```python
import numpy as np

# Set our RNG seed for reproducibility.
RANDOM_STATE_SEED = 1
np.random.seed(RANDOM_STATE_SEED)
```

--------------------------------

### Query for Labels using .query()

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/models/ActiveLearner.rst

Illustrates how to use the .query() method to select the most informative unlabeled samples for labeling. This method calls the specified query strategy function and returns the indices and samples chosen by the strategy.

```python
# Assuming learner is an initialized ActiveLearner object
# and X is a dataset of unlabeled samples
query_idx, query_sample = learner.query(X)

# After obtaining new labels (query_label) from an Oracle:
# learner.teach(query_sample, query_label)
```

--------------------------------

### Initialize Gaussian Process Regressor

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/bayesian_optimization.ipynb

Sets up the Gaussian Process Regressor, a key component in Bayesian optimization for modeling the objective function and its uncertainty. An initial training set is provided, and a Matern kernel is defined.

```Python
# assembling initial training set
X_initial, y_initial = X[150].reshape(1, -1), y[150].reshape(1, -1)

# defining the kernel for the Gaussian process
kernel = Matern(length_scale=1.0)
regressor = GaussianProcessRegressor(kernel=kernel)
```

--------------------------------

### Visualize Committee Predictions

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/query_by_committee.ipynb

This code visualizes the aggregated predictions of a committee of learners. It plots the committee's predictions against PCA components, also displaying the overall accuracy. Dependencies include Matplotlib (`plt`) and a committee object.

```python
# visualizing the Committee's predictions
with plt.style.context('seaborn-white'):
    plt.figure(figsize=(7, 7))
    prediction = committee.predict(iris['data'])
    plt.scatter(x=pca[:, 0], y=pca[:, 1], c=prediction, cmap='viridis', s=50)
    plt.title('Committee predictions after %d queries, accuracy = %1.3f'
              % (n_queries, committee.score(iris['data'], iris['target'])))
    plt.show()
```

--------------------------------

### Initialize ActiveLearner with Gaussian Process Regressor (Python)

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/overview/modAL-in-a-nutshell.rst

This Python code snippet demonstrates how to initialize an ActiveLearner from the modAL library. It sets up a Gaussian Process Regressor with a specified kernel (RBF + WhiteKernel) and uses the previously defined GP_regression_std function as the query strategy. It also prepares initial training data by randomly selecting a subset of the available data.

```python
from modAL.models import ActiveLearner
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import WhiteKernel, RBF

n_initial = 5
initial_idx = np.random.choice(range(len(X)), size=n_initial, replace=False)
X_training, y_training = X[initial_idx], y[initial_idx]

kernel = RBF(length_scale=1.0, length_scale_bounds=(1e-2, 1e3)) \
         + WhiteKernel(noise_level=1, noise_level_bounds=(1e-10, 1e+1))

regressor = ActiveLearner(
    estimator=GaussianProcessRegressor(kernel=kernel),
    query_strategy=GP_regression_std,
    X_training=X_training.reshape(-1, 1), y_training=y_training.reshape(-1, 1)
)
```

--------------------------------

### Define Gaussian Process Query Strategy

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/active_regression.ipynb

Implements a custom query strategy for Gaussian processes. It selects the instance with the highest prediction uncertainty (standard deviation) from the regressor, guiding the active learning process.

```Python
def GP_regression_std(regressor, X):
    _, std = regressor.predict(X, return_std=True)
    return np.argmax(std)
```

--------------------------------

### Clone modAL Repository

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/overview/Contributing.rst

Clones the modAL repository from GitHub to your local machine. This is the initial step for contributing to the project.

```bash
$ git clone git@github.com:username/modAL.git
```

--------------------------------

### Predict and Check Correctness with ActiveLearner (Python)

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/ranked_batch_mode.ipynb

This code snippet shows how to use the predict method of an ActiveLearner model to get predictions on raw data. It then compares these predictions against the true labels (y_raw) to determine correctness, preparing data for visualization.

```Python
# Isolate the data we'll need for plotting.
predictions = learner.predict(X_raw)
is_correct = (predictions == y_raw)

predictions
```

--------------------------------

### Partition dataset into labeled and unlabeled pools

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/ranked_batch_mode.ipynb

Splits the Iris dataset into a small labeled training set ($\mathcal{L}$) and a larger unlabeled pool ($\mathcal{U}$). Three random examples are selected for the training set, and the rest form the pool.

```python
# Isolate our examples for our labeled dataset.
n_labeled_examples = X_raw.shape[0]
# np.random.randint excludes `high`, so the largest index drawn is n_labeled_examples - 1.
training_indices = np.random.randint(low=0, high=n_labeled_examples, size=3)

X_train = X_raw[training_indices]
y_train = y_raw[training_indices]

# Isolate the non-training examples we'll be querying.
X_pool = np.delete(X_raw, training_indices, axis=0)
y_pool = np.delete(y_raw, training_indices, axis=0)
```

--------------------------------

### Perform Active Learning Queries (Python)

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/overview/modAL-in-a-nutshell.rst

This Python code illustrates the active learning loop using the initialized ActiveLearner. It iterates for a specified number of queries, using the learner's query method to select the next instance to label and the teach method to update the model with the new labeled data. This process iteratively improves the regressor's accuracy.

```python
# active learning
n_queries = 10
for idx in range(n_queries):
    query_idx, query_instance = regressor.query(X)
    regressor.teach(X[query_idx].reshape(1, -1), y[query_idx].reshape(1, -1))
```

--------------------------------

### Partition Data into Labeled and Unlabeled Pools

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/pool-based_sampling.ipynb

Splits the Iris dataset into a small labeled training set ($\mathcal{L}$) and a larger unlabeled pool ($\mathcal{U}$). It randomly selects a few examples for training and uses the rest for the pool.

```python
# Isolate our examples for our labeled dataset.
n_labeled_examples = X_raw.shape[0]
# np.random.randint excludes `high`, so the largest index drawn is n_labeled_examples - 1.
training_indices = np.random.randint(low=0, high=n_labeled_examples, size=3)

X_train = X_raw[training_indices]
y_train = y_raw[training_indices]

# Isolate the non-training examples we'll be querying.
X_pool = np.delete(X_raw, training_indices, axis=0)
y_pool = np.delete(y_raw, training_indices, axis=0)
```

--------------------------------

### Visualize Learner Predictions

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/query_by_committee.ipynb

This snippet visualizes the final predictions made by individual learners within a committee.
It uses Matplotlib to create subplots for each learner, displaying predictions against PCA components. Dependencies include Matplotlib (`plt`) and potentially a committee object and PCA data.

```python
# visualizing the final predictions per learner
with plt.style.context('seaborn-white'):
    plt.figure(figsize=(n_members*7, 7))
    for learner_idx, learner in enumerate(committee):
        plt.subplot(1, n_members, learner_idx + 1)
        plt.scatter(x=pca[:, 0], y=pca[:, 1], c=learner.predict(iris['data']), cmap='viridis', s=50)
        plt.title('Learner no. %d predictions after %d queries' % (learner_idx + 1, n_queries))
```

--------------------------------

### Clone modAL Repository

Source: https://github.com/modal-python/modal/blob/master/CONTRIBUTING.md

Clones the modAL repository from GitHub to your local machine.

```bash
git clone git@github.com:username/modAL.git
```

--------------------------------

### Plot Performance History

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/query_by_committee.ipynb

This snippet visualizes the performance history of an incremental classification model. It plots accuracy over query iterations using Matplotlib, including scatter points for individual data points and formatted axes for clarity. Dependencies include Matplotlib (`plt`, `mpl.ticker`) and a `performance_history` list.

```python
# Plot our performance over time.
fig, ax = plt.subplots(figsize=(8.5, 6), dpi=130)

ax.plot(performance_history)
ax.scatter(range(len(performance_history)), performance_history, s=13)

ax.xaxis.set_major_locator(mpl.ticker.MaxNLocator(nbins=5, integer=True))
ax.yaxis.set_major_locator(mpl.ticker.MaxNLocator(nbins=10))
ax.yaxis.set_major_formatter(mpl.ticker.PercentFormatter(xmax=1))

ax.set_ylim(bottom=0, top=1)
ax.grid(True)

ax.set_title('Incremental classification accuracy')
ax.set_xlabel('Query iteration')
ax.set_ylabel('Classification Accuracy')

plt.show()
```

--------------------------------

### Retrain ActiveLearner using .fit()

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/models/ActiveLearner.rst

Details the .fit() method, which allows the ActiveLearner to forget all previously seen data and retrain the estimator from scratch using the provided new data. This is useful for resetting the learner's state.

```python
# Assuming learner is an initialized ActiveLearner object
# and X, y are new samples and labels
learner.fit(X, y)
```
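
The contrast between adding data with `.teach()` and resetting with `.fit()` can be sketched with a toy stand-in. `ToyLearner` below is a hypothetical class written only for illustration; it is not modAL code and does not reproduce modAL's internals. It assumes that teaching appends new samples to what the learner has already seen, while fitting discards the stored data first, and it uses a majority-label rule in place of a real estimator.

```python
from collections import Counter


class ToyLearner:
    """Hypothetical stand-in that mimics only the bookkeeping of teach/fit."""

    def __init__(self, X_training, y_training):
        self.X_training = list(X_training)
        self.y_training = list(y_training)
        self._refit()

    def _refit(self):
        # "Training" here is just remembering the most common label.
        self.majority_label = Counter(self.y_training).most_common(1)[0][0]

    def teach(self, X, y):
        # Append the new samples to everything seen so far, then retrain.
        self.X_training.extend(X)
        self.y_training.extend(y)
        self._refit()

    def fit(self, X, y):
        # Discard everything seen so far and retrain on X, y alone.
        self.X_training = list(X)
        self.y_training = list(y)
        self._refit()


learner = ToyLearner(X_training=[[0.0], [0.1], [0.2]], y_training=[0, 0, 0])

# Teaching keeps the three original samples and adds one more.
learner.teach([[0.9]], [1])
n_after_teach = len(learner.X_training)
label_after_teach = learner.majority_label

# Fitting forgets the earlier samples and keeps only the two new ones.
learner.fit([[0.8], [0.9]], [1, 1])
n_after_fit = len(learner.X_training)
label_after_fit = learner.majority_label
```

In this sketch the stored training set grows from three to four samples after `teach` and shrinks to two after `fit`, which is the "forget and retrain" behavior the description above attributes to `.fit()`.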