### Install modAL Python Package

Source: https://github.com/modal-python/modal/blob/master/README.md

Provides commands for installing the modAL Python library using pip. It includes installation from PyPI and directly from the GitHub repository.

```bash
pip install modAL-python
```

```bash
pip install git+https://github.com/modAL-python/modAL.git
```

--------------------------------

### Install modAL Python from Source

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/overview/Installation.rst

Installs the modAL Python library directly from its GitHub repository. This method is useful for developers who want to use the latest unreleased features or contribute to the project. Requires Git and pip.

```shell
pip install git+https://github.com/modAL-python/modAL.git
```

--------------------------------

### Install modAL Python via Pip

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/overview/Installation.rst

Installs the modAL Python library from the Python Package Index (PyPI). This is the standard and recommended method for most users. Ensure you have pip installed and a compatible Python environment.

```shell
pip install modAL-python
```

--------------------------------

### Initialize ActiveLearner with Initial Training Data

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/models/ActiveLearner.rst

Shows how to initialize an ActiveLearner with pre-existing training data. The initial samples and their corresponding labels are passed using X_training and y_training arguments, allowing the learner to start with a trained estimator.

```python
from modAL.models import ActiveLearner
from modAL.uncertainty import uncertainty_sampling
from sklearn.ensemble import RandomForestClassifier

# Assuming X_training and y_training are defined
learner = ActiveLearner(
    estimator=RandomForestClassifier(),
    query_strategy=uncertainty_sampling,
    X_training=X_training, y_training=y_training
)
```

--------------------------------

### Initialize Active Learner with Gaussian Process

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/active_regression.ipynb

Sets up the ActiveLearner with a GaussianProcessRegressor estimator and the custom query strategy. It initializes the learner with a small subset of the data to start the learning process.

```Python
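import numpy as np
from modAL.models import ActiveLearner
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import WhiteKernel, RBF

n_initial = 5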
initial_idx = np.random.choice(range(len(X)), size=n_initial, replace=False)
X_training, y_training = X[initial_idx], y[initial_idx]

kernel = RBF(length_scale=1.0, length_scale_bounds=(1e-2, 1e3)) \
         + WhiteKernel(noise_level=1, noise_level_bounds=(1e-10, 1e+1))

regressor = ActiveLearner(
    estimator=GaussianProcessRegressor(kernel=kernel),
    query_strategy=GP_regression_std,
    X_training=X_training.reshape(-1, 1), y_training=y_training.reshape(-1, 1)
)
```

--------------------------------

### Load and Visualize Iris Dataset

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/query_by_committee.ipynb

Loads the iris dataset using scikit-learn and visualizes it in 2D using PCA. This step prepares the data for the active learning experiment and provides a visual understanding of the data distribution.

```python
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline
from sklearn.decomposition import PCA
from sklearn.datasets import load_iris

# loading the iris dataset
iris = load_iris()

# visualizing the classes
# note: Matplotlib 3.6+ renamed this style to 'seaborn-v0_8-white'
with plt.style.context('seaborn-white'):
    plt.figure(figsize=(7, 7))
    pca = PCA(n_components=2).fit_transform(iris['data'])
    plt.scatter(x=pca[:, 0], y=pca[:, 1], c=iris['target'], cmap='viridis', s=50)
    plt.title('The iris dataset')
    plt.show()
```

--------------------------------

### Initialize Committee Members

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/query_by_committee.ipynb

Initializes multiple `ActiveLearner` instances, each with a `RandomForestClassifier`, to form the committee. It randomly selects initial training data for each learner and removes it from the pool.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from modAL.models import ActiveLearner, Committee

# initializing Committee members
n_members = 2
learner_list = list()

for member_idx in range(n_members):
    # initial training data
    n_initial = 2
    train_idx = np.random.choice(range(X_pool.shape[0]), size=n_initial, replace=False)
    X_train = X_pool[train_idx]
    y_train = y_pool[train_idx]

    # creating a reduced copy of the data with the known instances removed
    X_pool = np.delete(X_pool, train_idx, axis=0)
    y_pool = np.delete(y_pool, train_idx)

    # initializing learner
    learner = ActiveLearner(
        estimator=RandomForestClassifier(),
        X_training=X_train, y_training=y_train
    )
    learner_list.append(learner)

# assembling the committee
committee = Committee(learner_list=learner_list)
```

--------------------------------

### Prepare Data for Active Regression (Python)

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/overview/modAL-in-a-nutshell.rst

Provides an example of preparing data for an active regression task, specifically learning a noisy sine function. It involves generating synthetic data with added noise using NumPy.

```python
import numpy as np

X = np.random.choice(np.linspace(0, 20, 10000), size=200, replace=False).reshape(-1, 1)
y = np.sin(X) + np.random.normal(scale=0.3, size=X.shape)
```

--------------------------------

### Visualize Committee Initial Predictions

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/query_by_committee.ipynb

Visualizes the initial predictions made by each individual learner within the committee on the PCA-transformed iris dataset. This helps to see the diversity of initial hypotheses.

```python
with plt.style.context('seaborn-white'):
    plt.figure(figsize=(n_members*7, 7))
    for learner_idx, learner in enumerate(committee):
        plt.subplot(1, n_members, learner_idx + 1)
        plt.scatter(x=pca[:, 0], y=pca[:, 1], c=learner.predict(iris['data']), cmap='viridis', s=50)
        plt.title('Learner no. %d initial predictions' % (learner_idx + 1))
    plt.show()
```

--------------------------------

### Create and Visualize Dataset (Python)

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/bootstrapping_and_bagging.ipynb

Generates a synthetic dataset representing three black disks on a white background using NumPy and itertools. The dataset is then visualized using Matplotlib, displaying the image with the shapes to be learned.

```Python
import numpy as np
from itertools import product

# creating the dataset
im_width = 500
im_height = 500
data = np.zeros((im_height, im_width))
# each disk is coded as a triple (x, y, r), where x and y are the centers and r is the radius
disks = [(150, 150, 80), (200, 380, 50), (360, 200, 100)]
for i, j in product(range(im_width), range(im_height)):
    for x, y, r in disks:
        if (x-i)**2 + (y-j)**2 < r**2:
            data[i, j] = 1
```

```Python
import matplotlib.pyplot as plt

# visualizing the dataset
with plt.style.context('seaborn-white'):
    plt.figure(figsize=(7, 7))
    plt.imshow(data)
    plt.title('The shapes to learn')
    plt.show()
```

--------------------------------

### Initialize Learners with Bootstrapping (Python)

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/bootstrapping_and_bagging.ipynb

Prepares the data pool and initializes multiple ActiveLearner instances with bootstrapping enabled. Each learner is trained on a bootstrapped subset of the data, and they are stored in a list for later aggregation.

```Python
from sklearn.neighbors import KNeighborsClassifier
from modAL.models import ActiveLearner

# create the pool from the image
X_pool = np.transpose(
    [np.tile(np.asarray(range(data.shape[0])), data.shape[1]),
     np.repeat(np.asarray(range(data.shape[1])), data.shape[0])]
)
# map the intensity values against the grid
y_pool = np.asarray([data[P[0], P[1]] for P in X_pool])

# initial training data
initial_idx = np.random.choice(range(len(X_pool)), size=500)

# initializing the learners
n_learners = 3
learner_list = []
for _ in range(n_learners):
    learner = ActiveLearner(
        estimator=KNeighborsClassifier(n_neighbors=10),
        X_training=X_pool[initial_idx], y_training=y_pool[initial_idx],
        bootstrap_init=True
    )
    learner_list.append(learner)
```
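The idea behind bootstrapping can be sketched with NumPy alone: each learner gets a resample of the initial set, drawn with replacement and of the same size. This is an illustration of the general technique, not modAL's internal code; the array names and sizes below are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical initial training set: 10 samples with 2 features each.
X_init = np.arange(20).reshape(10, 2)
y_init = np.arange(10) % 2

n_learners = 3
bootstrapped_sets = []
for _ in range(n_learners):
    # Draw as many indices as there are samples, with replacement.
    idx = rng.integers(low=0, high=len(X_init), size=len(X_init))
    bootstrapped_sets.append((X_init[idx], y_init[idx]))

for X_b, y_b in bootstrapped_sets:
    # Each resample has the original size but usually contains repeated rows.
    print(X_b.shape, len(np.unique(X_b, axis=0)))
```

Because every learner sees a slightly different resample, the fitted estimators differ even though they share the same initial indices.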

--------------------------------

### Define GP Regression Standard Deviation Query Strategy (Python)

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/overview/modAL-in-a-nutshell.rst

This Python function defines a query strategy for Gaussian Process regression. It takes a regressor object and a pool of examples (X) as input, predicts the standard deviation for each example, and returns the index of the instance with the highest standard deviation. This is commonly used in active learning to select the most uncertain sample for labeling.

```python
def GP_regression_std(regressor, X):
    _, std = regressor.predict(X, return_std=True)
    return np.argmax(std)
```
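To see which sample the strategy picks, the sketch below swaps the fitted Gaussian process for a hypothetical stand-in. `StubRegressor` and its distance-based uncertainty are assumptions made for illustration and are not part of modAL or scikit-learn; only the `predict(X, return_std=True)` call shape mirrors the snippet above.

```python
import numpy as np


class StubRegressor:
    """Hypothetical stand-in: uncertainty grows with distance to the nearest training point."""

    def __init__(self, X_train):
        self.X_train = np.asarray(X_train, dtype=float).reshape(-1, 1)

    def predict(self, X, return_std=False):
        X = np.asarray(X, dtype=float).reshape(-1, 1)
        # pairwise distances between pool points and training points
        dist = np.abs(X - self.X_train.T)
        std = dist.min(axis=1)
        mean = np.zeros(len(X))
        return (mean, std) if return_std else mean


def GP_regression_std(regressor, X):
    _, std = regressor.predict(X, return_std=True)
    return np.argmax(std)


X_pool = np.array([[0.0], [1.0], [2.0], [5.0], [9.0]])
stub = StubRegressor(X_train=[0.0, 2.0])
query_idx = GP_regression_std(stub, X_pool)
print(query_idx, X_pool[query_idx])
```

The point farthest from any training sample has the largest stand-in uncertainty, so it is the one queried.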

--------------------------------

### Initialize ActiveLearner with RandomForestClassifier (Python)

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/overview/modAL-in-a-nutshell.rst

Demonstrates the basic initialization of an ActiveLearner using a scikit-learn RandomForestClassifier. It shows how to set up the initial training data and prepare for querying new instances from a pool.

```python
from modAL.models import ActiveLearner
from sklearn.ensemble import RandomForestClassifier

# initializing the learner
learner = ActiveLearner(
    estimator=RandomForestClassifier(),
    X_training=X_training, y_training=y_training
)

# query for labels
query_idx, query_inst = learner.query(X_pool)

# ...obtaining new labels from the Oracle...

# supply label for queried instance
learner.teach(X_pool[query_idx], y_new)
```

--------------------------------

### Get Maximum Value from Optimizer

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/models/BayesianOptimizer.rst

Retrieves the current best found maximum value and its corresponding input from the BayesianOptimizer instance.

```python
X_max, y_max = optimizer.get_max()
```

--------------------------------

### Initialize ActiveLearner with Default Strategy

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/models/ActiveLearner.rst

Demonstrates the basic initialization of an ActiveLearner object. It requires a scikit-learn estimator and optionally accepts a query strategy function. The default strategy is maximum uncertainty sampling.

```python
from modAL.models import ActiveLearner
from modAL.uncertainty import uncertainty_sampling
from sklearn.ensemble import RandomForestClassifier

learner = ActiveLearner(
    estimator=RandomForestClassifier(),
    query_strategy=uncertainty_sampling
)
```
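As a rough illustration of maximum uncertainty sampling, the sketch below scores each sample by one minus its highest class probability and queries the largest score. The probability values are made up for the example, and the calculation is a plain NumPy sketch rather than a call into modAL.

```python
import numpy as np

# Hypothetical class-probability predictions for four unlabeled samples (rows sum to 1).
proba = np.array([
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.60, 0.30, 0.10],
    [0.50, 0.50, 0.00],
])

# Uncertainty of each sample: one minus the probability of the most likely class.
uncertainty = 1.0 - proba.max(axis=1)

# Maximum uncertainty sampling queries the sample with the largest uncertainty.
query_idx = int(np.argmax(uncertainty))
print(uncertainty, query_idx)
```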

--------------------------------

### Committee Iteration and Length

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/models/Committee.rst

Shows how to iterate through the learners within a Committee object and how to get the total number of learners using the len() function.

```python
# Iterate through each learner in the committee
for learner in committee:
    # Perform actions with the individual learner
    pass

# Get the number of learners in the committee
num_learners = len(committee)
```
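Iteration and `len()` rely on Python's `__iter__` and `__len__` protocols. The sketch below shows those protocols on a hypothetical `TinyCommittee` container; it is an assumption made for illustration, and modAL's own `Committee` may be implemented differently.

```python
class TinyCommittee:
    """Hypothetical container, shown only to illustrate the iteration and len() protocols."""

    def __init__(self, learner_list):
        self.learner_list = list(learner_list)

    def __iter__(self):
        # lets `for learner in committee` walk the members in order
        return iter(self.learner_list)

    def __len__(self):
        # lets `len(committee)` report the number of members
        return len(self.learner_list)


committee = TinyCommittee(["learner_a", "learner_b", "learner_c"])
members = [learner for learner in committee]
num_learners = len(committee)
print(members, num_learners)
```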

--------------------------------

### Basic Active Learner Initialization and Usage

Source: https://github.com/modal-python/modal/blob/master/README.md

Demonstrates initializing an ActiveLearner with a scikit-learn estimator and performing basic query and teach operations. This snippet shows how to set up a learner with a RandomForestClassifier and interact with it by querying for new instances and teaching it with new labels.

```python
from modAL.models import ActiveLearner
from sklearn.ensemble import RandomForestClassifier

# initializing the learner
learner = ActiveLearner(
    estimator=RandomForestClassifier(),
    X_training=X_training, y_training=y_training
)

# query for labels
query_idx, query_inst = learner.query(X_pool)

# ...obtaining new labels from the Oracle...

# supply label for queried instance
learner.teach(X_pool[query_idx], y_new)
```

--------------------------------

### Visualize Performance Improvement

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/query_by_committee.ipynb

Plots the performance history of the committee over the active learning queries. This visualization shows how the committee's accuracy improves as it queries more data points.

```python
with plt.style.context('seaborn-white'):
    plt.figure(figsize=(7, 7))
    plt.plot(range(n_queries + 1), performance_history)
    plt.xlabel('Number of queries')
    plt.ylabel('Accuracy')
    plt.title('Query by Committee performance history')
    plt.show()
```

--------------------------------

### Initialize Active Learner with Gaussian Process

Source: https://github.com/modal-python/modal/blob/master/README.md

Initializes the ActiveLearner with a GaussianProcessRegressor estimator and the custom GP query strategy. It sets up the initial training data and the regressor configuration.

```Python
from modAL.models import ActiveLearner
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import WhiteKernel, RBF

n_initial = 5
initial_idx = np.random.choice(range(len(X)), size=n_initial, replace=False)
X_training, y_training = X[initial_idx], y[initial_idx]

kernel = RBF(length_scale=1.0, length_scale_bounds=(1e-2, 1e3)) \
         + WhiteKernel(noise_level=1, noise_level_bounds=(1e-10, 1e+1))

regressor = ActiveLearner(
    estimator=GaussianProcessRegressor(kernel=kernel),
    query_strategy=GP_regression_std,
    X_training=X_training.reshape(-1, 1), y_training=y_training.reshape(-1, 1)
)
```

--------------------------------

### Python Interpreter Output Example

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/query_strategies/Disagreement-sampling.rst

Illustrative output from a Python interpreter showing numerical data arrays, potentially representing predictions or disagreement scores, used in active learning scenarios.

```python
>>> # Example data arrays
>>> data_array_1
[0.27549995,  0.23005799,  0.69397192]
>>> data_array_2
[0.69314718,  0.34053564,  0.22380466]
>>> data_array_3
[0.04613903,  0.02914912,  0.15686827]
>>> data_array_4
[0.70556709,  0.40546511,  0.17201121]

>>> # Example max disagreement calculation
>>> max_disagreement
[0.80234647,  0.69397192,  0.69314718,  0.15686827,  0.70556709]
```

--------------------------------

### Initialize ActiveLearner with Batch Sampling (Python)

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/ranked_batch_mode.ipynb

Initializes an ActiveLearner with a KNeighborsClassifier estimator and a batch query strategy. `functools.partial` pre-sets the `n_instances` argument of `uncertainty_batch_sampling`, so every query returns a fixed number of instances.

```Python
from functools import partial
from modAL.batch import uncertainty_batch_sampling
from modAL.models import ActiveLearner

# Pre-set our batch sampling to retrieve 3 samples at a time.
BATCH_SIZE = 3
preset_batch = partial(uncertainty_batch_sampling, n_instances=BATCH_SIZE)

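# Specify our active learning model.
# `knn` is assumed to be a scikit-learn classifier defined earlier,
# e.g. KNeighborsClassifier(n_neighbors=3).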
learner = ActiveLearner(
  estimator=knn,

  X_training=X_train,
  y_training=y_train,

  query_strategy=preset_batch
)
```
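The role of `functools.partial` here can be shown in isolation. In the sketch below, `top_k_uncertain` is a hypothetical toy strategy (not a modAL function) that returns the indices of the highest scores; `partial` fixes its `n_instances` argument so callers no longer pass it.

```python
from functools import partial

import numpy as np


def top_k_uncertain(uncertainty, n_instances=1):
    """Hypothetical toy strategy: indices of the n_instances largest scores, most uncertain first."""
    order = np.argsort(uncertainty)[::-1]
    return order[:n_instances]


BATCH_SIZE = 3
preset_batch = partial(top_k_uncertain, n_instances=BATCH_SIZE)

scores = np.array([0.10, 0.80, 0.30, 0.95, 0.50])
batch_idx = preset_batch(scores)
print(batch_idx)
```

Pre-binding the batch size keeps the learner's query call unchanged while the strategy returns several instances at once.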

--------------------------------

### Prepare Data Pool

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/query_by_committee.ipynb

Creates a deep copy of the iris dataset to serve as the pool of unlabeled data for the active learning process. This ensures the original dataset remains intact.

```python
from copy import deepcopy

# generate the pool
X_pool = deepcopy(iris['data'])
y_pool = deepcopy(iris['target'])
```
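A small sketch of why the copy protects the original data: shrinking or modifying the pool never touches the source array. The 5×2 array below is a hypothetical stand-in for `iris['data']`.

```python
from copy import deepcopy

import numpy as np

# Hypothetical stand-in for iris['data']: 5 samples, 2 features.
original = np.arange(10, dtype=float).reshape(5, 2)

X_pool = deepcopy(original)

# Removing a queried row returns a new, smaller array ...
X_pool = np.delete(X_pool, 2, axis=0)
# ... and in-place writes to the pool do not reach the original either.
X_pool[0, 0] = -1.0

print(original.shape, X_pool.shape, original[0, 0])
```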

--------------------------------

### Import Libraries for Active Learning

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/interactive_labeling.ipynb

Imports necessary libraries for active learning, including modAL components, scikit-learn for dataset and classifier, and IPython/matplotlib for visualization and interaction.

```python
import numpy as np

from modAL.models import ActiveLearner
from modAL.uncertainty import uncertainty_sampling

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

from IPython import display
from matplotlib import pyplot as plt
%matplotlib inline
```

--------------------------------

### Set random seed for reproducibility

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/ranked_batch_mode.ipynb

Sets the random number generator seed for NumPy to ensure reproducible results across different runs of the script. This is a common practice in machine learning examples.

```python
import numpy as np

# Set our RNG for reproducibility.
RANDOM_STATE_SEED = 123
np.random.seed(RANDOM_STATE_SEED)
```
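A quick way to check the effect of seeding is to repeat a random draw after re-seeding; the two draws should match. The pool size of 150 is an assumption made for the example.

```python
import numpy as np

RANDOM_STATE_SEED = 123

np.random.seed(RANDOM_STATE_SEED)
first_draw = np.random.choice(range(150), size=3, replace=False)

# Re-seeding restarts the generator, so the same call yields the same indices.
np.random.seed(RANDOM_STATE_SEED)
second_draw = np.random.choice(range(150), size=3, replace=False)

print(first_draw, second_draw)
```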

--------------------------------

### Prepare Dataset and Initial Training/Pool Sets

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/interactive_labeling.ipynb

Loads the handwritten digits dataset, splits it into training and testing sets, and then partitions the training data into an initial labeled set and a pool of unlabeled data for active learning.

```python
n_initial = 100  # size of the initial labeled set (assumed value; adjust as needed)
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y)

initial_idx = np.random.choice(range(len(X_train)), size=n_initial, replace=False)

X_initial, y_initial = X_train[initial_idx], y_train[initial_idx]
X_pool, y_pool = np.delete(X_train, initial_idx, axis=0), np.delete(y_train, initial_idx, axis=0)
```

--------------------------------

### Committee Initial Prediction and Accuracy

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/query_by_committee.ipynb

Calculates and visualizes the committee's initial prediction accuracy and the consensus prediction on the iris dataset. The committee's prediction is an aggregation of its members' predictions.

```python
unqueried_score = committee.score(iris['data'], iris['target'])

with plt.style.context('seaborn-white'):
    plt.figure(figsize=(7, 7))
    prediction = committee.predict(iris['data'])
    plt.scatter(x=pca[:, 0], y=pca[:, 1], c=prediction, cmap='viridis', s=50)
    plt.title('Committee initial predictions, accuracy = %1.3f' % unqueried_score)
    plt.show()
```

--------------------------------

### Initialize Bayesian Optimizer

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/bayesian_optimization.ipynb

Initializes the BayesianOptimizer from modAL. It takes the regressor, initial training data, and the query strategy (Expected Improvement maximization) as parameters.

```Python
from modAL.models import BayesianOptimizer
from modAL.acquisition import max_EI

# initializing the optimizer
optimizer = BayesianOptimizer(
    estimator=regressor,
    X_training=X_initial, y_training=y_initial,
    query_strategy=max_EI
)
```
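For intuition about Expected Improvement, the sketch below evaluates the textbook closed form for maximization under a Gaussian posterior. The helper `expected_improvement`, its `xi` trade-off parameter, and the candidate values are assumptions made for illustration; modAL's `max_EI` may parameterize the criterion differently.

```python
import math

import numpy as np


def expected_improvement(mean, std, best_y, xi=0.0):
    """Textbook EI for maximization under a Gaussian posterior (illustrative helper, assumes std > 0)."""
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    improvement = mean - best_y - xi
    z = improvement / std
    # standard normal pdf and cdf, written with the math module to avoid extra dependencies
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    return improvement * cdf + std * pdf


# Hypothetical posterior mean and standard deviation at three candidate points.
mean = np.array([1.0, 1.0, 0.5])
std = np.array([1.0, 2.0, 1.0])
ei = expected_improvement(mean, std, best_y=1.0)
query_idx = int(np.argmax(ei))
print(ei, query_idx)
```

With equal predicted means, the candidate with the larger posterior standard deviation scores higher, which is how the criterion rewards exploration.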

--------------------------------

### Active Learning Loop with Query by Committee

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/query_by_committee.ipynb

Executes the active learning loop for a specified number of queries. In each iteration, the committee queries an instance where its members disagree the most, and then learns from the newly labeled instance.

```python
performance_history = [unqueried_score]

# query by committee
n_queries = 20
for idx in range(n_queries):
    query_idx, query_instance = committee.query(X_pool)
    committee.teach(
        X=X_pool[query_idx].reshape(1, -1),
        y=y_pool[query_idx].reshape(1, )
    )
    performance_history.append(committee.score(iris['data'], iris['target']))
    # remove queried instance from pool
    X_pool = np.delete(X_pool, query_idx, axis=0)
    y_pool = np.delete(y_pool, query_idx)
```
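One common way to quantify committee disagreement is vote entropy: the entropy of the distribution of class votes for each sample. The sketch below computes it with NumPy on made-up votes; the helper `vote_entropy` is written for this example and is not claimed to match modAL's internal implementation.

```python
import numpy as np


def vote_entropy(votes, n_classes):
    """Entropy of the class-vote distribution for each sample.

    votes: integer array of shape (n_samples, n_members) holding predicted labels.
    """
    votes = np.asarray(votes)
    n_samples, n_members = votes.shape
    entropy = np.zeros(n_samples)
    for c in range(n_classes):
        p = (votes == c).sum(axis=1) / n_members
        nonzero = p > 0
        # classes with no votes contribute nothing (0 * log 0 is taken as 0)
        entropy[nonzero] -= p[nonzero] * np.log(p[nonzero])
    return entropy


# Hypothetical votes of a three-member committee on four pool samples.
votes = np.array([
    [0, 0, 0],   # full agreement
    [0, 1, 0],   # two against one
    [0, 1, 2],   # complete disagreement
    [2, 2, 2],   # full agreement
])
disagreement = vote_entropy(votes, n_classes=3)
query_idx = int(np.argmax(disagreement))
print(disagreement, query_idx)
```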

--------------------------------

### Initialize Active Learner

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/interactive_labeling.ipynb

Initializes the ActiveLearner with a RandomForestClassifier as the estimator and uncertainty_sampling as the query strategy. It uses the prepared initial training data.

```python
learner = ActiveLearner(
    estimator=RandomForestClassifier(),
    query_strategy=uncertainty_sampling,
    X_training=X_initial, y_training=y_initial
)
```

--------------------------------

### Assemble and Visualize Committee Model (Python)

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/bootstrapping_and_bagging.ipynb

Bundles the initialized ActiveLearner instances into a modAL Committee model. The predictions of each individual learner within the committee are then visualized, followed by the committee's consensus predictions.

```Python
from modAL.models import Committee

# assembling the Committee
committee = Committee(learner_list)

# visualizing every learner in the Committee
with plt.style.context('seaborn-white'):
    plt.figure(figsize=(7*n_learners, 7))
    for learner_idx, learner in enumerate(committee):
        plt.subplot(1, n_learners, learner_idx+1)
        plt.imshow(learner.predict(X_pool).reshape(im_height, im_width))
        plt.title('Learner no. %d' % (learner_idx + 1))
    plt.show()

# visualizing the Committee's predictions
with plt.style.context('seaborn-white'):
    plt.figure(figsize=(7, 7))
    plt.imshow(committee.predict(X_pool).reshape(im_height, im_width))
    plt.title('Committee consensus predictions')
    plt.show()
```

--------------------------------

### Set Random Seed for Reproducibility

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/query_by_committee.ipynb

Sets the NumPy random number generator seed to ensure reproducible results across different runs of the active learning process. This is a common practice for debugging and consistent experimentation.

```python
import numpy as np

# Set our RNG seed for reproducibility.
RANDOM_STATE_SEED = 1
np.random.seed(RANDOM_STATE_SEED)
```

--------------------------------

### Query for Labels using .query()

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/models/ActiveLearner.rst

Illustrates how to use the .query() method to select the most informative unlabeled samples for labeling. This method calls the specified query strategy function and returns the indices and samples chosen by the strategy.

```python
# Assuming learner is an initialized ActiveLearner object
# and X is a dataset of unlabeled samples
query_idx, query_sample = learner.query(X)

# After obtaining new labels (query_label) from an Oracle:
# learner.teach(query_sample, query_label)
```

--------------------------------

### Initialize Gaussian Process Regressor

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/bayesian_optimization.ipynb

Sets up the Gaussian Process Regressor, a key component in Bayesian optimization for modeling the objective function and its uncertainty. An initial training set is provided, and a Matern kernel is defined.

```Python
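from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# assembling initial training set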
X_initial, y_initial = X[150].reshape(1, -1), y[150].reshape(1, -1)

# defining the kernel for the Gaussian process
kernel = Matern(length_scale=1.0)
regressor = GaussianProcessRegressor(kernel=kernel)
```

--------------------------------

### Visualize Committee Predictions

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/query_by_committee.ipynb

Visualizes the committee's aggregated predictions after the queries. It plots the predictions in the PCA-projected space and shows the overall accuracy in the title. Requires Matplotlib (`plt`), the `committee` object, and the projected data `pca`.

```python
# visualizing the Committee's predictions
with plt.style.context('seaborn-white'):
    plt.figure(figsize=(7, 7))
    prediction = committee.predict(iris['data'])
    plt.scatter(x=pca[:, 0], y=pca[:, 1], c=prediction, cmap='viridis', s=50)
    plt.title('Committee predictions after %d queries, accuracy = %1.3f'
              % (n_queries, committee.score(iris['data'], iris['target'])))
    plt.show()
```

--------------------------------

### Initialize ActiveLearner with Gaussian Process Regressor (Python)

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/overview/modAL-in-a-nutshell.rst

This Python code snippet demonstrates how to initialize an ActiveLearner from the modAL library. It sets up a Gaussian Process Regressor with a specified kernel (RBF + WhiteKernel) and uses the previously defined GP_regression_std function as the query strategy. It also prepares initial training data by randomly selecting a subset of the available data.

```python
from modAL.models import ActiveLearner
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import WhiteKernel, RBF

n_initial = 5
initial_idx = np.random.choice(range(len(X)), size=n_initial, replace=False)
X_training, y_training = X[initial_idx], y[initial_idx]

kernel = RBF(length_scale=1.0, length_scale_bounds=(1e-2, 1e3)) \
         + WhiteKernel(noise_level=1, noise_level_bounds=(1e-10, 1e+1))

regressor = ActiveLearner(
    estimator=GaussianProcessRegressor(kernel=kernel),
    query_strategy=GP_regression_std,
    X_training=X_training.reshape(-1, 1), y_training=y_training.reshape(-1, 1)
)
```

--------------------------------

### Define Gaussian Process Query Strategy

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/active_regression.ipynb

Implements a custom query strategy for Gaussian processes. It selects the instance with the highest prediction uncertainty (standard deviation) from the regressor, guiding the active learning process.

```Python
def GP_regression_std(regressor, X):
    _, std = regressor.predict(X, return_std=True)
    return np.argmax(std)
```

--------------------------------

### Clone modAL Repository

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/overview/Contributing.rst

Clones your fork of the modAL repository from GitHub to your local machine (replace `username` with your GitHub username). This is the initial step for contributing to the project.

```bash
git clone git@github.com:username/modAL.git
```

--------------------------------

### Predict and Check Correctness with ActiveLearner (Python)

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/ranked_batch_mode.ipynb

This code snippet shows how to use the predict method of an ActiveLearner model to get predictions on raw data. It then compares these predictions against the true labels (y_raw) to determine correctness, preparing data for visualization.

```Python
# Isolate the data we'll need for plotting.
predictions = learner.predict(X_raw)
is_correct = (predictions == y_raw)

predictions
```

--------------------------------

### Partition dataset into labeled and unlabeled pools

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/ranked_batch_mode.ipynb

Splits the Iris dataset into a small labeled training set ($\mathcal{L}$) and a larger unlabeled pool ($\mathcal{U}$). Three random examples are selected for the training set, and the rest form the pool.

```python
# Isolate our examples for our labeled dataset.
n_labeled_examples = X_raw.shape[0]
# randint's upper bound is exclusive, so high=n_labeled_examples keeps every index in range
training_indices = np.random.randint(low=0, high=n_labeled_examples, size=3)

X_train = X_raw[training_indices]
y_train = y_raw[training_indices]

# Isolate the non-training examples we'll be querying.
X_pool = np.delete(X_raw, training_indices, axis=0)
y_pool = np.delete(y_raw, training_indices, axis=0)
```
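`np.random.randint` draws with replacement, so the three indices can repeat. If distinct training examples are wanted, one option is to sample without replacement, as in the sketch below. The random arrays stand in for `X_raw` and `y_raw` and are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(123)

# Hypothetical stand-ins for X_raw and y_raw: 150 samples, 4 features, 3 classes.
X_raw = rng.normal(size=(150, 4))
y_raw = rng.integers(low=0, high=3, size=150)

# Sampling without replacement guarantees three distinct, in-range indices.
training_indices = rng.choice(X_raw.shape[0], size=3, replace=False)

X_train, y_train = X_raw[training_indices], y_raw[training_indices]
X_pool = np.delete(X_raw, training_indices, axis=0)
y_pool = np.delete(y_raw, training_indices, axis=0)

print(X_train.shape, X_pool.shape)
```

With distinct indices, the training set and the pool together account for every original sample exactly once.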

--------------------------------

### Perform Active Learning Queries (Python)

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/overview/modAL-in-a-nutshell.rst

This Python code illustrates the active learning loop using the initialized ActiveLearner. It iterates for a specified number of queries, using the learner's query method to select the next instance to label and the teach method to update the model with the new labeled data. This process iteratively improves the regressor's accuracy.

```python
# active learning
n_queries = 10
for idx in range(n_queries):
    query_idx, query_instance = regressor.query(X)
    regressor.teach(X[query_idx].reshape(1, -1), y[query_idx].reshape(1, -1))
```

--------------------------------

### Partition Data into Labeled and Unlabeled Pools

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/pool-based_sampling.ipynb

Splits the Iris dataset into a small labeled training set ($\mathcal{L}$) and a larger unlabeled pool ($\mathcal{U}$). It randomly selects a few examples for training and uses the rest for the pool.

```python
# Isolate our examples for our labeled dataset.
n_labeled_examples = X_raw.shape[0]
# randint's upper bound is exclusive, so high=n_labeled_examples keeps every index in range
training_indices = np.random.randint(low=0, high=n_labeled_examples, size=3)

X_train = X_raw[training_indices]
y_train = y_raw[training_indices]

# Isolate the non-training examples we'll be querying.
X_pool = np.delete(X_raw, training_indices, axis=0)
y_pool = np.delete(y_raw, training_indices, axis=0)
```

--------------------------------

### Visualize Learner Predictions

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/query_by_committee.ipynb

Visualizes the final predictions made by each learner in the committee. It uses Matplotlib to draw one subplot per learner, plotting predictions in the PCA-projected space. Requires Matplotlib (`plt`), the `committee` object, and the projected data `pca`.

```python
# visualizing the final predictions per learner
with plt.style.context('seaborn-white'):
    plt.figure(figsize=(n_members*7, 7))
    for learner_idx, learner in enumerate(committee):
        plt.subplot(1, n_members, learner_idx + 1)
        plt.scatter(x=pca[:, 0], y=pca[:, 1], c=learner.predict(iris['data']), cmap='viridis', s=50)
        plt.title('Learner no. %d predictions after %d queries' % (learner_idx + 1, n_queries))
    plt.show()
```

--------------------------------

### Clone modAL Repository

Source: https://github.com/modal-python/modal/blob/master/CONTRIBUTING.md

Clones your fork of the modAL repository from GitHub to your local machine (replace `username` with your GitHub username).

```bash
git clone git@github.com:username/modAL.git
```

--------------------------------

### Plot Performance History

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/examples/query_by_committee.ipynb

This snippet visualizes the performance history of an incremental classification model. It plots accuracy over query iterations using Matplotlib, including scatter points for individual data points and formatted axes for clarity. Dependencies include Matplotlib (`plt`, `mpl.ticker`) and a `performance_history` list.

```python
# Plot our performance over time.
fig, ax = plt.subplots(figsize=(8.5, 6), dpi=130)

ax.plot(performance_history)
ax.scatter(range(len(performance_history)), performance_history, s=13)

ax.xaxis.set_major_locator(mpl.ticker.MaxNLocator(nbins=5, integer=True))
ax.yaxis.set_major_locator(mpl.ticker.MaxNLocator(nbins=10))
ax.yaxis.set_major_formatter(mpl.ticker.PercentFormatter(xmax=1))

ax.set_ylim(bottom=0, top=1)
ax.grid(True)

ax.set_title('Incremental classification accuracy')
ax.set_xlabel('Query iteration')
ax.set_ylabel('Classification Accuracy')

plt.show()
```

--------------------------------

### Retrain ActiveLearner using .fit()

Source: https://github.com/modal-python/modal/blob/master/docs/source/content/models/ActiveLearner.rst

Details the .fit() method, which allows the ActiveLearner to forget all previously seen data and retrain the estimator from scratch using the provided new data. This is useful for resetting the learner's state.

```python
# Assuming learner is an initialized ActiveLearner object
# and X, y are new samples and labels
learner.fit(X, y)
```
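The difference between adding data and retraining from scratch can be sketched with a hypothetical `ToyLearner` that only records which samples it holds. The class is an assumption made for illustration and does not reproduce modAL's implementation; it mirrors only the behaviour described above, where `fit` forgets earlier data.

```python
import numpy as np


class ToyLearner:
    """Hypothetical learner that only tracks which data it has been given."""

    def __init__(self, X_training, y_training):
        self.X_training = np.asarray(X_training)
        self.y_training = np.asarray(y_training)

    def teach(self, X, y):
        # append the new samples to everything seen so far
        self.X_training = np.vstack([self.X_training, X])
        self.y_training = np.concatenate([self.y_training, y])

    def fit(self, X, y):
        # discard everything seen so far and keep only the new data
        self.X_training = np.asarray(X)
        self.y_training = np.asarray(y)


learner = ToyLearner(X_training=[[0.0], [1.0]], y_training=[0, 1])
learner.teach([[2.0]], [1])
n_after_teach = len(learner.X_training)

learner.fit([[5.0], [6.0]], [0, 0])
n_after_fit = len(learner.X_training)
print(n_after_teach, n_after_fit)
```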