### Run integration tests

Source: https://github.com/kno10/python-kmedoids/blob/main/docs/index.md

Validate the installation by running the integration tests using Python's unittest module.

```sh
python -m unittest discover tests
```

--------------------------------

### Install kmedoids with pip or conda

Source: https://github.com/kno10/python-kmedoids/blob/main/docs/index.md

Use pip or conda to install pre-built packages for various systems. For uncommon architectures, Rust may need to be installed first.

```sh
pip install kmedoids
```

```sh
conda install -c conda-forge kmedoids
```

--------------------------------

### Install kmedoids with pip

Source: https://github.com/kno10/python-kmedoids/blob/main/README.md

Install the kmedoids package using pip. This is the standard method for most users.

```sh
pip install kmedoids
```

--------------------------------

### Validate kmedoids installation

Source: https://github.com/kno10/python-kmedoids/blob/main/README.md

Run integration tests to validate the kmedoids installation. This requires numpy to be installed.

```sh
pip install numpy
python -m unittest discover tests
```

--------------------------------

### Install kmedoids with conda

Source: https://github.com/kno10/python-kmedoids/blob/main/README.md

Install the kmedoids package from the conda-forge channel. This is an alternative installation method.

```sh
conda install -c conda-forge kmedoids
```

--------------------------------

### Build kmedoids from source

Source: https://github.com/kno10/python-kmedoids/blob/main/README.md

Build and install the kmedoids package from source using maturin. This method is useful for development or when pre-built packages are unavailable. Ensure you have Rust/Cargo installed.

```sh
pip install maturin
git clone https://github.com/kno10/python-kmedoids.git
cd python-kmedoids
maturin develop --release
```

--------------------------------

### Compile kmedoids from source

Source: https://github.com/kno10/python-kmedoids/blob/main/docs/index.md

Compile the package from source using maturin; this requires Rust and Python 3. Activate a virtual environment before installing.

```sh
# activate your desired virtual environment first
pip install maturin
git clone https://github.com/kno10/python-kmedoids.git
cd python-kmedoids
# build and install the package:
maturin develop --release
```

--------------------------------

### Choose optimal number of clusters with DynMSC

Source: https://github.com/kno10/python-kmedoids/blob/main/docs/index.md

Use the DynMSC algorithm to find the optimal number of clusters (k) by optimizing the Medoid Silhouette score within a specified range. This example uses a subset of the MNIST dataset.

```python
import kmedoids, numpy
from sklearn.datasets import fetch_openml
from sklearn.metrics.pairwise import euclidean_distances
X, _ = fetch_openml('mnist_784', version=1, return_X_y=True, as_frame=False)
X = X[:10000]
diss = euclidean_distances(X)
kmin = 10
kmax = 20
dm = kmedoids.dynmsc(diss, kmax, kmin)
print("Optimal number of clusters according to the Medoid Silhouette:", dm.bestk)
print("Medoid Silhouette over range of k:", dm.losses)
print("Range of k:", dm.rangek)
```

--------------------------------

### Compare FastPAM1 and PAM with BUILD init

Source: https://context7.com/kno10/python-kmedoids/llms.txt

Compares the results of FastPAM1 and standard PAM when both use the 'build' initialization strategy. Asserts that their losses are identical and prints the resulting medoids and labels.

```python
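import numpy as np
import kmedoids

dist = np.array([
    [0,  2,  3,  4,  5],
    [2,  0,  6,  7,  8],
    [3,  6,  0,  9, 10],
    [4,  7,  9,  0, 11],
    [5,  8, 10, 11,  0]
], dtype=np.float64)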

fp1 = kmedoids.fastpam1(dist, medoids=2, init="build")
pam = kmedoids.pam(dist, medoids=2, init="build")

print("FastPAM1 loss:", fp1.loss)   # 9.0
print("PAM loss:     ", pam.loss)   # 9.0  (identical)
assert fp1.loss == pam.loss, "Results should match"
print("Medoids:", fp1.medoids)
print("Labels:", fp1.labels)
```

--------------------------------

### kmedoids.pam_build - PAM BUILD Phase Only

Source: https://context7.com/kno10/python-kmedoids/llms.txt

Runs only the greedy PAM BUILD initialization. This is useful for obtaining initial medoids that can be used as a warm-start for iterative algorithms like FasterPAM.

```python
import numpy as np
import kmedoids

dist = np.array([
    [0,  2,  3,  4,  5],
    [2,  0,  6,  7,  8],
    [3,  6,  0,  9, 10],
    [4,  7,  9,  0, 11],
    [5,  8, 10, 11,  0]
], dtype=np.float32)

# Get deterministic initial medoids from BUILD
build_result = kmedoids.pam_build(dist, k=2)
print("BUILD medoids:", build_result.medoids)   # [0 4]
print("BUILD loss:  ", build_result.loss)

# Use the BUILD medoids as a fixed start for FasterPAM
result = kmedoids.fasterpam(dist, medoids=build_result.medoids)
print("FasterPAM loss after BUILD init:", result.loss)

# Equivalent to using init="build" directly
result2 = kmedoids.fasterpam(dist, medoids=2, init="build")
assert result.loss == result2.loss
```
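To make the greedy selection concrete, the following pure-Python sketch reimplements the BUILD idea on the same matrix. It is an illustration only, not the library's implementation: `build_sketch` is a hypothetical helper name, and ties are broken by lowest index, which may differ from the library.

```python
def build_sketch(dist, k):
    """Greedy BUILD: pick the most central point, then repeatedly add the
    point that lowers the total distance to the nearest medoid the most."""
    n = len(dist)
    # First medoid: the point with the smallest sum of distances to all points
    first = min(range(n), key=lambda i: sum(dist[i]))
    medoids = [first]
    nearest = list(dist[first])  # distance of each point to its nearest medoid
    while len(medoids) < k:
        best, best_gain = None, -1.0
        for c in range(n):
            if c in medoids:
                continue
            # Reduction in total distance if candidate c became a medoid
            gain = sum(max(nearest[j] - dist[c][j], 0) for j in range(n))
            if gain > best_gain:
                best, best_gain = c, gain
        medoids.append(best)
        nearest = [min(nearest[j], dist[best][j]) for j in range(n)]
    return medoids, sum(nearest)


dist = [
    [0, 2, 3, 4, 5],
    [2, 0, 6, 7, 8],
    [3, 6, 0, 9, 10],
    [4, 7, 9, 0, 11],
    [5, 8, 10, 11, 0],
]
medoids, loss = build_sketch(dist, 2)
print(medoids, loss)  # [0, 4] 9
```

Point 0 is chosen first because its row sum (14) is the smallest; point 4 follows because it removes the largest remaining distance (5).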

--------------------------------

### kmedoids.pam_build

Source: https://context7.com/kno10/python-kmedoids/llms.txt

Runs only the greedy PAM BUILD initialization, which constructs k initial medoids by repeatedly adding the point that most reduces the total distance from all points to their nearest selected medoid. Returns a KMedoidsResult that can be used as a warm-start for iterative algorithms.

```APIDOC
## `kmedoids.pam_build` — PAM BUILD Phase Only

Runs only the greedy PAM BUILD initialization, which constructs k initial medoids by repeatedly adding the point that most reduces the total distance from all points to their nearest selected medoid. Returns a `KMedoidsResult` that can be used as a warm-start for iterative algorithms.

```python
import numpy as np
import kmedoids

dist = np.array([
    [0,  2,  3,  4,  5],
    [2,  0,  6,  7,  8],
    [3,  6,  0,  9, 10],
    [4,  7,  9,  0, 11],
    [5,  8, 10, 11,  0]
], dtype=np.float32)

# Get deterministic initial medoids from BUILD
build_result = kmedoids.pam_build(dist, k=2)
print("BUILD medoids:", build_result.medoids)   # [0 4]
print("BUILD loss:  ", build_result.loss)

# Use the BUILD medoids as a fixed start for FasterPAM
result = kmedoids.fasterpam(dist, medoids=build_result.medoids)
print("FasterPAM loss after BUILD init:", result.loss)

# Equivalent to using init="build" directly
result2 = kmedoids.fasterpam(dist, medoids=2, init="build")
assert result.loss == result2.loss
```
```

--------------------------------

### KMedoids with Precomputed Distances and Raw Features

Source: https://context7.com/kno10/python-kmedoids/llms.txt

Demonstrates initializing KMedoids with a precomputed distance matrix and with raw feature arrays using the 'euclidean' metric. Shows fitting the model and accessing labels, medoid indices, inertia, and cluster centers.

```python
import numpy as np
import kmedoids
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.metrics.pairwise import euclidean_distances

# --- With precomputed distance matrix ---
dist = np.array([
    [0,  2,  3,  4,  5],
    [2,  0,  6,  7,  8],
    [3,  6,  0,  9, 10],
    [4,  7,  9,  0, 11],
    [5,  8, 10, 11,  0]
], dtype=np.int32)

km = kmedoids.KMedoids(n_clusters=2, method='fasterpam', init='build', random_state=0)
km.fit(dist)
print("Labels:         ", km.labels_)
print("Medoid indices: ", km.medoid_indices_)
print("Inertia (loss): ", km.inertia_)

# Transform: returns distances to each medoid
dist_to_medoids = km.transform(dist)
print("Shape:", dist_to_medoids.shape)   # (5, 2)

# --- With raw features and euclidean metric (requires sklearn) ---
X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4]], dtype=np.float64)
km_euc = kmedoids.KMedoids(n_clusters=2, metric='euclidean', method='fasterpam', random_state=0)
km_euc.fit(X)
print("Euclidean labels:", km_euc.labels_)
print("Cluster centers: ", km_euc.cluster_centers_)  # actual data points

# --- DynMSC with automatic k selection ---
km_dyn = kmedoids.KMedoids(n_clusters=10, method='dynmsc', random_state=42)
km_dyn.fit(dist.astype(np.float32))
print("DynMSC labels:", km_dyn.labels_)

# --- fit_predict convenience method ---
labels = kmedoids.KMedoids(2, method='pam', init='build').fit_predict(dist)
print("fit_predict labels:", labels)
```
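Once a model is fitted, it is often convenient to list the members of each cluster. The sketch below assumes that each label is a position into the medoid index array (so `medoids[label]` is the point index of the assigned medoid); `clusters_by_medoid` is a hypothetical helper, and the inputs are hand-written values for the 5-point matrix rather than captured library output.

```python
def clusters_by_medoid(labels, medoids):
    """Group point indices under the index of their assigned medoid."""
    groups = {int(m): [] for m in medoids}
    for point, lab in enumerate(labels):
        groups[int(medoids[lab])].append(point)
    return groups


# Hand-written example: medoids are points 0 and 4, only point 4 is in the second cluster
groups = clusters_by_medoid([0, 0, 0, 0, 1], [0, 4])
print(groups)  # {0: [0, 1, 2, 3], 4: [4]}
```

The same helper should work with NumPy arrays such as `km.labels_` and `km.medoid_indices_`, under the assumption stated above.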

--------------------------------

### Classic PAM Clustering with kmedoids

Source: https://context7.com/kno10/python-kmedoids/llms.txt

Use the classic PAM algorithm for correctness baselines or legacy reproducibility. It uses BUILD initialization followed by SWAP optimization.

```python
import numpy as np
import kmedoids

dist = np.array([
    [0,  2,  3,  4,  5],
    [2,  0,  6,  7,  8],
    [3,  6,  0,  9, 10],
    [4,  7,  9,  0, 11],
    [5,  8, 10, 11,  0]
], dtype=np.int32)

# Default init="build" uses PAM BUILD phase before SWAP
result = kmedoids.pam(dist, medoids=2, max_iter=100, init="build")
print("Loss:", result.loss)          # 9
print("Medoids:", result.medoids)    # [0 4]
print("Labels:", result.labels)

# Compare speed and quality vs FasterPAM on larger data
import time
from sklearn.datasets import fetch_openml
from sklearn.metrics.pairwise import euclidean_distances

X, _ = fetch_openml('mnist_784', version=1, return_X_y=True, as_frame=False)
X = X[:5000].astype(np.float32)
diss = euclidean_distances(X).astype(np.float32)

t0 = time.time()
fp = kmedoids.fasterpam(diss, 10, random_state=0)
print(f"FasterPAM: {(time.time()-t0)*1000:.1f} ms, loss={fp.loss:.2f}")

t0 = time.time()
pam = kmedoids.pam(diss, 10, init="build")
print(f"PAM:       {(time.time()-t0)*1000:.1f} ms, loss={pam.loss:.2f}")
```
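The SWAP phase can be pictured with a deliberately naive pure-Python sketch: in every iteration, try each (medoid, non-medoid) exchange, apply the single best improving one, and stop when none improves the loss. This is not the library's code; `total_deviation` and `pam_swap_sketch` are hypothetical names, and real PAM implementations compute the change in loss incrementally instead of re-evaluating the full loss.

```python
from itertools import product


def total_deviation(dist, medoids):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))


def pam_swap_sketch(dist, medoids, max_iter=100):
    medoids = list(medoids)
    loss = total_deviation(dist, medoids)
    for _ in range(max_iter):
        best = (loss, None, None)
        # Evaluate every exchange of one medoid with one non-medoid
        for pos, cand in product(range(len(medoids)), range(len(dist))):
            if cand in medoids:
                continue
            trial = medoids[:pos] + [cand] + medoids[pos + 1:]
            trial_loss = total_deviation(dist, trial)
            if trial_loss < best[0]:
                best = (trial_loss, pos, cand)
        if best[1] is None:
            break  # no improving swap left: local optimum
        loss, pos, cand = best
        medoids[pos] = cand
    return medoids, loss


dist = [
    [0, 2, 3, 4, 5],
    [2, 0, 6, 7, 8],
    [3, 6, 0, 9, 10],
    [4, 7, 9, 0, 11],
    [5, 8, 10, 11, 0],
]
# Start from a deliberately poor choice to show the swaps at work
medoids, loss = pam_swap_sketch(dist, [1, 2])
print(medoids, loss)  # [0, 4] 9
```

Starting from medoids 1 and 2 (loss 17), the sketch swaps to 0 and 2 (loss 11) and then to 0 and 4 (loss 9), after which no exchange improves the loss.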

--------------------------------

### kmedoids.pam

Source: https://context7.com/kno10/python-kmedoids/llms.txt

Implements the classic Partitioning Around Medoids (PAM) algorithm. It includes the BUILD initialization phase followed by iterative SWAP optimization, serving as a baseline for correctness.

```APIDOC
## pam(dist, medoids, *, max_iter=300, init='build')

### Description
Performs k-medoids clustering using the classic PAM algorithm.

### Parameters
- **dist** (numpy.ndarray) - A square, symmetric distance/dissimilarity matrix.
- **medoids** (int) - The desired number of clusters (k).
- **max_iter** (int, optional) - Maximum number of iterations for the SWAP phase. Defaults to 300.
- **init** (str or list, optional) - Initialization method. Must be 'build' or a list of initial medoid indices. Defaults to 'build'.

### Returns
- **result** (object) - An object containing clustering results:
  - **loss** (float) - The total sum of distances from each point to its assigned medoid.
  - **labels** (numpy.ndarray) - An array giving the cluster number of each data point, i.e. the position of its assigned medoid within `medoids`.
  - **medoids** (numpy.ndarray) - An array containing the indices of the selected medoids.

### Request Example
```python
import numpy as np
import kmedoids

dist = np.array([
    [0,  2,  3,  4,  5],
    [2,  0,  6,  7,  8],
    [3,  6,  0,  9, 10],
    [4,  7,  9,  0, 11],
    [5,  8, 10, 11,  0]
], dtype=np.int32)

result = kmedoids.pam(dist, medoids=2, max_iter=100, init="build")
print("Loss:", result.loss)
print("Medoids:", result.medoids)
print("Labels:", result.labels)
```
```

--------------------------------

### Compare FasterPAM and PAM on MNIST dataset

Source: https://github.com/kno10/python-kmedoids/blob/main/README.md

Compares the runtime and loss of the FasterPAM and standard PAM algorithms on a subset of the MNIST dataset. Requires a precomputed distance matrix.

```python
import kmedoids, numpy, time
from sklearn.datasets import fetch_openml
from sklearn.metrics.pairwise import euclidean_distances
X, _ = fetch_openml('mnist_784', version=1, return_X_y=True, as_frame=False)
X = X[:10000]
diss = euclidean_distances(X)
start = time.time()
fp = kmedoids.fasterpam(diss, 100)
print("FasterPAM took: %.2f ms" % ((time.time() - start)*1000))
print("Loss with FasterPAM:", fp.loss)
start = time.time()
pam = kmedoids.pam(diss, 100)
print("PAM took: %.2f ms" % ((time.time() - start)*1000))
print("Loss with PAM:", pam.loss)
```

--------------------------------

### Compare FasterPAM and PAM on MNIST dataset

Source: https://github.com/kno10/python-kmedoids/blob/main/docs/index.md

Compare the performance and loss of FasterPAM and PAM algorithms on a subset of the MNIST dataset. Calculates Euclidean distances and measures execution time.

```python
import kmedoids
import numpy
import time
from sklearn.datasets import fetch_openml
from sklearn.metrics.pairwise import euclidean_distances
X, _ = fetch_openml('mnist_784', version=1, return_X_y=True, as_frame=False)
X = X[:10000]
diss = euclidean_distances(X)
start = time.time()
fp = kmedoids.fasterpam(diss, 100)
print("FasterPAM took: %.2f ms" % ((time.time() - start)*1000))
print("Loss with FasterPAM:", fp.loss)
start = time.time()
pam = kmedoids.pam(diss, 100)
print("PAM took: %.2f ms" % ((time.time() - start)*1000))
print("Loss with PAM:", pam.loss)
```

--------------------------------

### FastPAM1 Clustering with kmedoids

Source: https://context7.com/kno10/python-kmedoids/llms.txt

Use FastPAM1 as a drop-in replacement for classic PAM when exact PAM-equivalent behavior is needed at reduced computational cost. It finds each best swap O(k) times faster.

```python
import numpy as np
import kmedoids

dist = np.array([
    [0,  2,  3,  4,  5],
    [2,  0,  6,  7,  8],
    [3,  6,  0,  9, 10],
    [4,  7,  9,  0, 11],
    [5,  8, 10, 11,  0]
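], dtype=np.float64)

# Same swaps as classic PAM, but each best swap is found faster
result = kmedoids.fastpam1(dist, medoids=2, init="build")
print("Loss:", result.loss)
print("Medoids:", result.medoids)
print("Labels:", result.labels)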
```

--------------------------------

### Basic kmedoids clustering

Source: https://github.com/kno10/python-kmedoids/blob/main/README.md

Perform k-medoids clustering using the fasterpam algorithm with a precomputed distance matrix. The loss of the clustering is printed.

```python
import kmedoids
# distmatrix: a precomputed square dissimilarity matrix (e.g. a NumPy array)
c = kmedoids.fasterpam(distmatrix, 5)
print("Loss is:", c.loss)
```
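The snippet above leaves `distmatrix` undefined. As a minimal sketch of where such a matrix could come from, the following builds pairwise Euclidean distances for a handful of 2-D points using only the standard library. The points are illustrative, and converting the nested list with `numpy.asarray` before passing it to the library is an assumption based on the NumPy arrays used elsewhere in these examples.

```python
import math

# Five illustrative 2-D points forming two well-separated groups
points = [(1, 2), (1, 4), (1, 0), (10, 2), (10, 4)]

# Pairwise Euclidean distances: square, symmetric, zero on the diagonal
distmatrix = [[math.dist(p, q) for q in points] for p in points]

for row in distmatrix:
    print(["%.2f" % d for d in row])
```

For larger data sets, a vectorized routine such as scikit-learn's `euclidean_distances` (used in the MNIST examples) is the more practical choice.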

--------------------------------

### kmedoids.fastpam1

Source: https://context7.com/kno10/python-kmedoids/llms.txt

Implements the FastPAM1 algorithm, which performs the same sequence of swaps as classic PAM but computes each best swap more efficiently. It offers a performance improvement over PAM while maintaining equivalent swap behavior.

```APIDOC
## fastpam1(dist, medoids, *, max_iter=300, init='build')

### Description
Performs k-medoids clustering using the FastPAM1 algorithm.

### Parameters
- **dist** (numpy.ndarray) - A square, symmetric distance/dissimilarity matrix.
- **medoids** (int) - The desired number of clusters (k).
- **max_iter** (int, optional) - Maximum number of iterations for the SWAP phase. Defaults to 300.
- **init** (str or list, optional) - Initialization method. Can be 'random', 'build', or a list of initial medoid indices. Defaults to 'build'.

### Returns
- **result** (object) - An object containing clustering results:
  - **loss** (float) - The total sum of distances from each point to its assigned medoid.
  - **labels** (numpy.ndarray) - An array giving the cluster number of each data point, i.e. the position of its assigned medoid within `medoids`.
  - **medoids** (numpy.ndarray) - An array containing the indices of the selected medoids.

### Request Example
```python
import numpy as np
import kmedoids

dist = np.array([
    [0,  2,  3,  4,  5],
    [2,  0,  6,  7,  8],
    [3,  6,  0,  9, 10],
    [4,  7,  9,  0, 11],
    [5,  8, 10, 11,  0]
], dtype=np.float64)

result = kmedoids.fastpam1(dist, medoids=2)
print("Loss:", result.loss)
print("Medoids:", result.medoids)
print("Labels:", result.labels)
```
```

--------------------------------

### Scikit-learn compatible kmedoids clustering

Source: https://github.com/kno10/python-kmedoids/blob/main/README.md

Use the scikit-learn compatible API for k-medoids clustering with the fasterpam method. The inertia of the clustering is printed.

```python
import kmedoids
# distmatrix: a precomputed square dissimilarity matrix (e.g. a NumPy array)
km = kmedoids.KMedoids(5, method='fasterpam')
c = km.fit(distmatrix)
print("Loss is:", c.inertia_)
```

--------------------------------

### kmedoids.fasterpam

Source: https://context7.com/kno10/python-kmedoids/llms.txt

Implements the FasterPAM clustering algorithm, an accelerated variant of PAM that optimizes swap selection for improved performance. It supports multi-threading, various initialization strategies, and different data types.

```APIDOC
## fasterpam(dist, medoids, *, max_iter=300, init='random', random_state=None, n_cpu=None)

### Description
Performs k-medoids clustering using the FasterPAM algorithm.

### Parameters
- **dist** (numpy.ndarray) - A square, symmetric distance/dissimilarity matrix.
- **medoids** (int or array-like) - The desired number of clusters (k), or an array of initial medoid indices to start from.
- **max_iter** (int, optional) - Maximum number of iterations for the SWAP phase. Defaults to 300.
- **init** (str or list, optional) - Initialization method. Can be 'random', 'build', or a list of initial medoid indices. Defaults to 'random'.
- **random_state** (int, optional) - Seed for random number generation for reproducible results. Defaults to None.
- **n_cpu** (int, optional) - Number of CPU cores to use for parallel execution. Auto-detected if None.

### Returns
- **result** (object) - An object containing clustering results:
  - **loss** (float) - The total sum of distances from each point to its assigned medoid.
  - **labels** (numpy.ndarray) - An array giving the cluster number of each data point, i.e. the position of its assigned medoid within `medoids`.
  - **medoids** (numpy.ndarray) - An array containing the indices of the selected medoids.
  - **n_iter** (int) - The number of iterations performed.
  - **n_swap** (int) - The number of swaps performed during the optimization.

### Request Example
```python
import numpy as np
import kmedoids

dist = np.array([
    [0,  2,  3,  4,  5],
    [2,  0,  6,  7,  8],
    [3,  6,  0,  9, 10],
    [4,  7,  9,  0, 11],
    [5,  8, 10, 11,  0]
], dtype=np.float64)

result = kmedoids.fasterpam(dist, medoids=2, max_iter=100, init="random", random_state=42)

print("Loss:", result.loss)
print("Labels:", result.labels)
print("Medoids:", result.medoids)
```
```

--------------------------------

### FasterPAM Clustering with kmedoids

Source: https://context7.com/kno10/python-kmedoids/llms.txt

Use FasterPAM for accelerated k-medoids clustering. It supports multi-threading, several initialization strategies, and multiple numpy dtypes. Pass `random_state` for reproducible results.

```python
import numpy as np
import kmedoids

# Build a symmetric distance matrix (5 points)
dist = np.array([
    [0,  2,  3,  4,  5],
    [2,  0,  6,  7,  8],
    [3,  6,  0,  9, 10],
    [4,  7,  9,  0, 11],
    [5,  8, 10, 11,  0]
], dtype=np.float64)

# Cluster into k=2 groups, reproducible with a seed
result = kmedoids.fasterpam(dist, medoids=2, max_iter=100, init="random", random_state=42)

print("Loss (sum of distances to medoids):", result.loss)
# Loss (sum of distances to medoids): 9.0
print("Cluster labels:", result.labels)
# Cluster labels: e.g. [0 0 0 0 1]  (position of each point's medoid within result.medoids)
print("Medoid indices:", result.medoids)
# Medoid indices: e.g. [0 4]  (the order may differ with random initialization)
print("Iterations:", result.n_iter)
print("Swaps performed:", result.n_swap)

# Use PAM BUILD initialization for a deterministic, higher-quality starting point
result_build = kmedoids.fasterpam(dist, medoids=2, init="build")
print("Loss with BUILD init:", result_build.loss)

# Parallel execution on large matrices (auto-detected for n >= 1000)
large_dist = np.random.rand(2000, 2000).astype(np.float32)
large_dist = (large_dist + large_dist.T) / 2
np.fill_diagonal(large_dist, 0)
result_par = kmedoids.fasterpam(large_dist, medoids=10, n_cpu=4, random_state=0)
print("Parallel loss:", result_par.loss)
```
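The example above symmetrizes the random matrix and zeroes its diagonal before clustering. A small sanity check along those lines can catch malformed input early. The helper below is a hypothetical sketch (the name `check_dissimilarity` and the tolerance are assumptions, not part of the library) and works on nested lists as well as on 2-D NumPy arrays.

```python
def check_dissimilarity(dist, tol=1e-9):
    """Return a list of problems found; an empty list means the matrix looks usable."""
    n = len(dist)
    if any(len(row) != n for row in dist):
        return ["matrix is not square"]
    problems = []
    for i in range(n):
        if abs(dist[i][i]) > tol:
            problems.append(f"nonzero diagonal at {i}")
        for j in range(i + 1, n):
            if abs(dist[i][j] - dist[j][i]) > tol:
                problems.append(f"asymmetric entry at ({i}, {j})")
    return problems


good = [[0, 2, 3], [2, 0, 6], [3, 6, 0]]
bad = [[0, 1], [2, 0]]
print(check_dissimilarity(good))  # []
print(check_dissimilarity(bad))   # ['asymmetric entry at (0, 1)']
```

The double loop is quadratic in pure Python, so for matrices with thousands of rows a vectorized comparison would be preferable.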

--------------------------------

### PAMSIL Clustering with kmedoids

Source: https://context7.com/kno10/python-kmedoids/llms.txt

Employ PAMSIL to optimize the full (non-medoid) Silhouette criterion using the PAM SWAP framework. Note that this is generally slower than Medoid Silhouette variants.

```python
import numpy as np
import kmedoids

dist = np.array([
    [0,  2,  3,  4,  5],
    [2,  0,  6,  7,  8],
    [3,  6,  0,  9, 10],
    [4,  7,  9,  0, 11],
    [5,  8, 10, 11,  0]
], dtype=np.float32)

result = kmedoids.pamsil(dist, medoids=2, init="build")
print("Silhouette criterion loss:", result.loss)   # e.g. 0.3138
print("Medoids:", result.medoids)
print("Labels:", result.labels)
```

--------------------------------

### KMedoids Class - Scikit-learn Compatible API

Source: https://context7.com/kno10/python-kmedoids/llms.txt

Demonstrates the usage of the KMedoids class for clustering with both precomputed distance matrices and raw feature arrays, including fit, transform, and fit_predict methods. Also shows DynMSC for automatic k selection.

```APIDOC
## `kmedoids.KMedoids` — sklearn-Compatible API

A scikit-learn `BaseEstimator`/`ClusterMixin` wrapper supporting all clustering methods via a standard `fit`/`predict`/`transform`/`fit_predict` interface. Accepts precomputed distance matrices (`metric="precomputed"`, default) or raw feature arrays with any sklearn-supported metric. The `n_clusters` parameter doubles as the maximum k for `method="dynmsc"`.

```python
import numpy as np
import kmedoids
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.metrics.pairwise import euclidean_distances

# --- With precomputed distance matrix ---
dist = np.array([
    [0,  2,  3,  4,  5],
    [2,  0,  6,  7,  8],
    [3,  6,  0,  9, 10],
    [4,  7,  9,  0, 11],
    [5,  8, 10, 11,  0]
], dtype=np.int32)

km = kmedoids.KMedoids(n_clusters=2, method='fasterpam', init='build', random_state=0)
km.fit(dist)
print("Labels:         ", km.labels_)
print("Medoid indices: ", km.medoid_indices_)
print("Inertia (loss): ", km.inertia_)

# Transform: returns distances to each medoid
dist_to_medoids = km.transform(dist)
print("Shape:", dist_to_medoids.shape)   # (5, 2)

# --- With raw features and euclidean metric (requires sklearn) ---
X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4]], dtype=np.float64)
km_euc = kmedoids.KMedoids(n_clusters=2, metric='euclidean', method='fasterpam', random_state=0)
km_euc.fit(X)
print("Euclidean labels:", km_euc.labels_)
print("Cluster centers: ", km_euc.cluster_centers_)  # actual data points

# --- DynMSC with automatic k selection ---
km_dyn = kmedoids.KMedoids(n_clusters=10, method='dynmsc', random_state=42)
km_dyn.fit(dist.astype(np.float32))
print("DynMSC labels:", km_dyn.labels_)

# --- fit_predict convenience method ---
labels = kmedoids.KMedoids(2, method='pam', init='build').fit_predict(dist)
print("fit_predict labels:", labels)
```
```

--------------------------------

### kmedoids.alternating - Alternating k-Medoids

Source: https://context7.com/kno10/python-kmedoids/llms.txt

Implements a k-means-style k-medoids algorithm. It alternates between assigning points to nearest medoids and updating medoids. This is typically faster per iteration than PAM but may yield worse cluster quality.

```python
import numpy as np
import kmedoids

dist = np.array([
    [0,  2,  3,  4,  5],
    [2,  0,  6,  7,  8],
    [3,  6,  0,  9, 10],
    [4,  7,  9,  0, 11],
    [5,  8, 10, 11,  0]
], dtype=np.int32)

alt = kmedoids.alternating(dist, medoids=2, init="build")
fp  = kmedoids.fasterpam(dist, medoids=2, init="build")

print("Alternating loss:", alt.loss)   # May be higher than PAM
print("FasterPAM loss: ", fp.loss)     # Usually lower
print("Alternating medoids:", alt.medoids)
```
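The assign-then-update loop, and the way it can stall, can be shown with a small pure-Python sketch. It is not the library's implementation; `assign` and `alternating_sketch` are hypothetical names, and the starting medoids are chosen by hand to expose a poor local optimum.

```python
def assign(dist, medoids):
    """Label each point with the position of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]


def alternating_sketch(dist, medoids, max_iter=100):
    medoids = list(medoids)
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        updated = []
        for c in range(len(medoids)):
            members = [i for i, lab in enumerate(labels) if lab == c]
            # New medoid: the member with the smallest total distance to its cluster
            updated.append(min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if updated == medoids:
            break  # converged: no medoid changed
        medoids = updated
    labels = assign(dist, medoids)
    loss = sum(dist[i][medoids[lab]] for i, lab in enumerate(labels))
    return medoids, labels, loss


dist = [
    [0, 2, 3, 4, 5],
    [2, 0, 6, 7, 8],
    [3, 6, 0, 9, 10],
    [4, 7, 9, 0, 11],
    [5, 8, 10, 11, 0],
]
medoids, labels, loss = alternating_sketch(dist, [1, 2])
print(medoids, labels, loss)  # [0, 2] [0, 0, 1, 0, 0] 11

# Loss of the medoid pair (0, 4), which the alternating updates never reach from this start
better = sum(min(row[0], row[4]) for row in dist)
print(better)  # 9
```

From this start the procedure converges to medoids 0 and 2 with loss 11, because updates only ever look inside the current clusters; the pair 0 and 4 has loss 9 but would require moving a medoid to a point in a different cluster.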

--------------------------------

### kmedoids.fastermsc

Source: https://context7.com/kno10/python-kmedoids/llms.txt

FasterMSC directly optimizes the Average Medoid Silhouette by eagerly accepting any improving swap, using an O(k²) speedup over PAMMEDSIL. It finds clusterings with higher silhouette scores than PAM-family loss minimization, at the cost of needing float-typed dissimilarity matrices.

```APIDOC
## `kmedoids.fastermsc` — FasterMSC: Fast Medoid Silhouette Clustering

FasterMSC directly optimizes the Average Medoid Silhouette by eagerly accepting any improving swap, using an O(k²) speedup over PAMMEDSIL. It finds clusterings with higher silhouette scores than PAM-family loss minimization, at the cost of needing float-typed dissimilarity matrices.

```python
import numpy as np
import kmedoids

dist = np.array([
    [0,  2,  3,  4,  5],
    [2,  0,  6,  7,  8],
    [3,  6,  0,  9, 10],
    [4,  7,  9,  0, 11],
    [5,  8, 10, 11,  0]
], dtype=np.float32)

result = kmedoids.fastermsc(dist, medoids=2, init="build")
print("Avg Medoid Silhouette (loss):", result.loss)  # e.g. 0.8172
print("Medoids:", result.medoids)
print("Labels:", result.labels)
print("Iterations:", result.n_iter)
print("Swaps:", result.n_swap)
```
```

--------------------------------

### FastMSC Clustering with kmedoids

Source: https://context7.com/kno10/python-kmedoids/llms.txt

Use FastMSC for efficient clustering that provides a balance between PAMMEDSIL's accuracy and FasterMSC's speed. It uses the same swaps as PAMMEDSIL but is significantly faster.

```python
import numpy as np
import kmedoids

dist = np.array([
    [0,  2,  3,  4,  5],
    [2,  0,  6,  7,  8],
    [3,  6,  0,  9, 10],
    [4,  7,  9,  0, 11],
    [5,  8, 10, 11,  0]
], dtype=np.float32)

fmsc    = kmedoids.fastmsc(dist, medoids=2, init="build")
pammsil = kmedoids.pammedsil(dist, medoids=2, init="build")

print("FastMSC loss:   ", fmsc.loss)    # same result as PAMMEDSIL
print("PAMMEDSIL loss: ", pammsil.loss)
assert fmsc.loss == pammsil.loss
print("Medoids:", fmsc.medoids)
```

--------------------------------

### Find optimal number of clusters using DynMSC

Source: https://github.com/kno10/python-kmedoids/blob/main/README.md

Uses the DynMSC algorithm to find the optimal number of clusters based on the Medoid Silhouette index within a specified range. Requires a pre-computed distance matrix.

```python
import kmedoids, numpy
from sklearn.datasets import fetch_openml
from sklearn.metrics.pairwise import euclidean_distances
X, _ = fetch_openml('mnist_784', version=1, return_X_y=True, as_frame=False)
X = X[:10000]
diss = euclidean_distances(X)
kmin, kmax = 10, 20
dm = kmedoids.dynmsc(diss, kmax, kmin)
print("Optimal number of clusters according to the Medoid Silhouette:", dm.bestk)
print("Medoid Silhouette over range of k:", dm.losses)
print("Range of k:", dm.rangek)
```

--------------------------------

### kmedoids.fastermsc - Faster Medoid Silhouette Clustering

Source: https://context7.com/kno10/python-kmedoids/llms.txt

Optimizes the Average Medoid Silhouette score directly using an O(k²) speedup. This algorithm typically finds clusterings with higher silhouette scores than PAM-family loss minimization but requires float-typed dissimilarity matrices.

```python
import numpy as np
import kmedoids

dist = np.array([
    [0,  2,  3,  4,  5],
    [2,  0,  6,  7,  8],
    [3,  6,  0,  9, 10],
    [4,  7,  9,  0, 11],
    [5,  8, 10, 11,  0]
], dtype=np.float32)

result = kmedoids.fastermsc(dist, medoids=2, init="build")
print("Avg Medoid Silhouette (loss):", result.loss)  # e.g. 0.8172
print("Medoids:", result.medoids)
print("Labels:", result.labels)
print("Iterations:", result.n_iter)
print("Swaps:", result.n_swap)
```

--------------------------------

### PAMMEDSIL Clustering with kmedoids

Source: https://context7.com/kno10/python-kmedoids/llms.txt

Utilize PAMMEDSIL for clustering that directly optimizes the Medoid Silhouette criterion. This algorithm is preserved for reproducibility and comparison with faster methods.

```python
import numpy as np
import kmedoids

dist = np.array([
    [0,  2,  3,  4,  5],
    [2,  0,  6,  7,  8],
    [3,  6,  0,  9, 10],
    [4,  7,  9,  0, 11],
    [5,  8, 10, 11,  0]
], dtype=np.float32)

result = kmedoids.pammedsil(dist, medoids=2, init="build")
print("Medoid Silhouette:", result.loss)   # e.g. 0.8172
print("Medoids:", result.medoids)
print("Labels:", result.labels)
```

--------------------------------

### Silhouette Index Evaluation with kmedoids

Source: https://context7.com/kno10/python-kmedoids/llms.txt

Compute the average Silhouette score for a given clustering using a dissimilarity matrix. Supports parallel computation and retrieval of per-sample silhouette values.

```python
import numpy as np
import kmedoids

dist = np.array([
    [0,  2,  3,  4,  5],
    [2,  0,  6,  7,  8],
    [3,  6,  0,  9, 10],
    [4,  7,  9,  0, 11],
    [5,  8, 10, 11,  0]
], dtype=np.int32)

# Get clustering labels
result = kmedoids.pam(dist, medoids=2)
print("Labels:", result.labels)

# Compute average silhouette (scalar)
avg_sil, _ = kmedoids.silhouette(dist, result.labels)
print("Average Silhouette:", avg_sil)

# Get per-sample silhouette values
avg_sil, sample_sils = kmedoids.silhouette(dist, result.labels, samples=True, n_cpu=1)
print("Per-sample Silhouette:", sample_sils)
# e.g. [0.75, 0.60, 0.50, 0.45, 0.30]

# Parallel computation (samples=True requires n_cpu=1)
avg_par, _ = kmedoids.silhouette(dist.astype(np.float32), result.labels, n_cpu=2)
print("Parallel Silhouette:", avg_par)
```
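For reference, the Silhouette of a point compares its mean distance to the other members of its own cluster (a) with the smallest mean distance to any other cluster (b), as (b - a) / max(a, b). The pure-Python sketch below follows that textbook definition and assigns 0 to points in singleton clusters, a common convention; `silhouette_sketch` is a hypothetical name, and the library may handle edge cases differently.

```python
def silhouette_sketch(dist, labels):
    """Return (average silhouette, per-sample silhouettes) for a labeling."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    scores = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention: singleton clusters score 0
            continue
        # a: mean distance to the other members of the own cluster
        a = sum(dist[i][j] for j in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(dist[i][j] for j in members) / len(members)
                for other, members in clusters.items() if other != lab)
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return sum(scores) / len(scores), scores


dist = [
    [0, 2, 3, 4, 5],
    [2, 0, 6, 7, 8],
    [3, 6, 0, 9, 10],
    [4, 7, 9, 0, 11],
    [5, 8, 10, 11, 0],
]
# Hand-written labeling: points 0-3 in one cluster, point 4 alone
avg, per_sample = silhouette_sketch(dist, [0, 0, 0, 0, 1])
print(round(avg, 4))  # 0.3138
```

The sketch assumes at least two clusters; with a single cluster there is no "other" cluster to compare against.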

--------------------------------

### kmedoids.alternating

Source: https://context7.com/kno10/python-kmedoids/llms.txt

A k-means-style k-medoids algorithm that alternates between assigning each point to its nearest medoid and updating each medoid to the point in its cluster that minimizes the cluster's total intra-distance. Significantly faster per iteration than PAM-family algorithms but typically yields substantially worse cluster quality.

```APIDOC
## `kmedoids.alternating` — Alternating k-Medoids (k-Means Style)

A k-means-style k-medoids algorithm that alternates between assigning each point to its nearest medoid and updating each medoid to the point in its cluster that minimizes the cluster's total intra-distance. Significantly faster per iteration than PAM-family algorithms but typically yields substantially worse cluster quality.

```python
import numpy as np
import kmedoids

dist = np.array([
    [0,  2,  3,  4,  5],
    [2,  0,  6,  7,  8],
    [3,  6,  0,  9, 10],
    [4,  7,  9,  0, 11],
    [5,  8, 10, 11,  0]
], dtype=np.int32)

alt = kmedoids.alternating(dist, medoids=2, init="build")
fp  = kmedoids.fasterpam(dist, medoids=2, init="build")

print("Alternating loss:", alt.loss)   # May be higher than PAM
print("FasterPAM loss: ", fp.loss)     # Usually lower
print("Alternating medoids:", alt.medoids)
```
```

--------------------------------

### kmedoids.dynmsc - Automatic Cluster Count Selection

Source: https://context7.com/kno10/python-kmedoids/llms.txt

Automatically selects the optimal number of clusters (k) by running FasterMSC for a range of k values and choosing the one with the highest Average Medoid Silhouette score. Requires a float-typed dissimilarity matrix.

```python
import numpy as np
import kmedoids
from sklearn.datasets import fetch_openml
from sklearn.metrics.pairwise import euclidean_distances

X, _ = fetch_openml('mnist_784', version=1, return_X_y=True, as_frame=False)
X = X[:5000].astype(np.float32)
diss = euclidean_distances(X).astype(np.float32)

# Search for the best k in the range [5, 20]
# kmax should be 2-3x the number of clusters you expect
kmin, kmax = 5, 20
dm = kmedoids.dynmsc(diss, medoids=kmax, minimum_k=kmin, random_state=42)

print("Best k (auto-selected):", dm.bestk)
print("Best loss (Avg Medoid Silhouette):", dm.loss)
print("Labels for best k:", dm.labels[:10], " ...")
print("Medoids for best k:", dm.medoids)
print("Silhouette scores over range:", dict(zip(dm.rangek, dm.losses)))
# e.g. {5: 0.71, 6: 0.74, 7: 0.72, ..., 20: 0.51}

# Use via sklearn-compatible API
from kmedoids import KMedoids
km = KMedoids(n_clusters=kmax, method='dynmsc')
km.fit(diss)
print("sklearn dynmsc medoid indices:", km.medoid_indices_)
```
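The selection criterion itself (pick the k whose best clustering has the highest Average Medoid Silhouette) can be illustrated by exhaustive search on a tiny matrix. This sketch is emphatically not how DynMSC works internally; it enumerates every medoid set, which is only feasible for a handful of points. The function names are hypothetical, the medoid silhouette of a point is taken as 1 - d1/d2 for its nearest and second-nearest medoid distances, and the case d2 = 0 is scored as 1 by assumption.

```python
from itertools import combinations


def avg_medoid_silhouette(dist, medoids):
    """Average of 1 - d1/d2 over all points (d1, d2: two smallest medoid distances)."""
    total = 0.0
    for row in dist:
        d1, d2 = sorted(row[m] for m in medoids)[:2]
        total += 1.0 if d2 == 0 else 1.0 - d1 / d2
    return total / len(dist)


def best_k_bruteforce(dist, kmin, kmax):
    """For each k in [kmin, kmax] find the best medoid set, then the best k (kmin >= 2)."""
    per_k = {}
    for k in range(kmin, kmax + 1):
        best = max(combinations(range(len(dist)), k),
                   key=lambda ms: avg_medoid_silhouette(dist, ms))
        per_k[k] = (avg_medoid_silhouette(dist, best), best)
    bestk = max(per_k, key=lambda k: per_k[k][0])
    return bestk, per_k


dist = [
    [0, 2, 3, 4, 5],
    [2, 0, 6, 7, 8],
    [3, 6, 0, 9, 10],
    [4, 7, 9, 0, 11],
    [5, 8, 10, 11, 0],
]
bestk, per_k = best_k_bruteforce(dist, 2, 3)
print(bestk)             # 3
print(per_k[2][1])       # (0, 4)
print(per_k[3][1])       # (0, 3, 4)
```

On so few points the score keeps rising as k approaches the number of points, so the example mainly shows the mechanics of comparing scores across k rather than a meaningful choice.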

--------------------------------

### Medoid Silhouette Index Evaluation with kmedoids

Source: https://context7.com/kno10/python-kmedoids/llms.txt

Calculate the Average Medoid Silhouette, an efficient approximation of the full Silhouette score. This metric uses distances to medoid points for faster computation.

```python
import numpy as np
import kmedoids

dist = np.array([
    [0,  2,  3,  4,  5],
    [2,  0,  6,  7,  8],
    [3,  6,  0,  9, 10],
    [4,  7,  9,  0, 11],
    [5,  8, 10, 11,  0]
], dtype=np.float32)

# Cluster and evaluate
result = kmedoids.fasterpam(dist, medoids=2)
avg_msil, _ = kmedoids.medoid_silhouette(dist, result.medoids)
print("Avg Medoid Silhouette:", avg_msil)

# Get per-sample values
avg_msil, sample_msils = kmedoids.medoid_silhouette(dist, result.medoids, samples=True)
print("Per-sample Medoid Silhouette:", sample_msils)

# Compare full Silhouette vs Medoid Silhouette
avg_sil, _ = kmedoids.silhouette(dist.astype(np.int32), result.labels, n_cpu=1)
print(f"Full Silhouette: {avg_sil:.4f}  |  Medoid Silhouette: {avg_msil:.4f}")
```
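The reason the Medoid Silhouette is cheap to evaluate is that each point needs only its distances to the k medoids. A minimal pure-Python sketch under that definition: for each point, take the nearest and second-nearest medoid distances d1 and d2 and score 1 - d1/d2. `medoid_silhouette_sketch` is a hypothetical name, scoring d2 = 0 as 1 is an assumption, and the medoids are hand-picked rather than taken from a library run.

```python
def medoid_silhouette_sketch(dist, medoids):
    """Return (average, per-sample) medoid silhouette; needs at least two medoids."""
    scores = []
    for row in dist:
        # Two smallest distances from this point to any medoid
        d1, d2 = sorted(row[m] for m in medoids)[:2]
        scores.append(1.0 if d2 == 0 else 1.0 - d1 / d2)
    return sum(scores) / len(scores), scores


dist = [
    [0, 2, 3, 4, 5],
    [2, 0, 6, 7, 8],
    [3, 6, 0, 9, 10],
    [4, 7, 9, 0, 11],
    [5, 8, 10, 11, 0],
]
avg, per_sample = medoid_silhouette_sketch(dist, [0, 4])
print(per_sample[:3])  # [1.0, 0.75, 0.7]
```

Medoids always score 1 here because their nearest-medoid distance is 0; every other point scores higher the closer it lies to its own medoid relative to the next one.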

--------------------------------

### kmedoids.pamsil

Source: https://context7.com/kno10/python-kmedoids/llms.txt

PAMSIL clustering algorithm. This algorithm optimizes the full (non-medoid) Silhouette criterion using the PAM SWAP framework and is generally slower than Medoid Silhouette variants.

```APIDOC
## `kmedoids.pamsil` — PAMSIL Clustering

PAMSIL (Van der Laan, Pollard & Bryan, 2003) optimizes the full (non-medoid) Silhouette criterion using the PAM SWAP framework. It is generally slower than the Medoid Silhouette variants; use `fastermsc`/`fastmsc` or the standard `fasterpam` for typical clustering workloads.

```python
import numpy as np
import kmedoids

dist = np.array([
    [0,  2,  3,  4,  5],
    [2,  0,  6,  7,  8],
    [3,  6,  0,  9, 10],
    [4,  7,  9,  0, 11],
    [5,  8, 10, 11,  0]
], dtype=np.float32)

result = kmedoids.pamsil(dist, medoids=2, init="build")
print("Silhouette criterion loss:", result.loss)   # e.g. 0.3138
print("Medoids:", result.medoids)
print("Labels:", result.labels)
```
```

--------------------------------

### Result Objects: KMedoidsResult and DynkResult

Source: https://context7.com/kno10/python-kmedoids/llms.txt

Details the fields available in KMedoidsResult (returned by fixed-k clustering) and DynkResult (returned by dynmsc, including automatic k-selection metadata).

```APIDOC
## Result Objects: `KMedoidsResult` and `DynkResult`

`KMedoidsResult` is returned by all fixed-k clustering functions; `DynkResult` is returned by `dynmsc` and extends it with automatic k-selection metadata (`bestk`, `losses`, `rangek`).

```python
import numpy as np
import kmedoids

dist = np.array([
    [0,  2,  3,  4,  5],
    [2,  0,  6,  7,  8],
    [3,  6,  0,  9, 10],
    [4,  7,  9,  0, 11],
    [5,  8, 10, 11,  0]
], dtype=np.float32)

# KMedoidsResult fields
r = kmedoids.fasterpam(dist, 2, random_state=0)
print(type(r))           # <class 'kmedoids.KMedoidsResult'>
print(r.loss)            # float: total sum of distances to medoids
print(r.labels)          # ndarray[int]: cluster index for each point
print(r.medoids)         # ndarray[int]: indices of medoid points
print(r.n_iter)          # int: number of SWAP iterations performed
print(r.n_swap)          # int: total number of swaps accepted

# DynkResult fields (superset of KMedoidsResult)
dm = kmedoids.dynmsc(dist, 3, minimum_k=2, random_state=0)
print(type(dm))          # <class 'kmedoids.DynkResult'>
print(dm.bestk)          # int: optimal k by Avg Medoid Silhouette
print(dm.losses)         # ndarray: Avg Medoid Silhouette for each k in rangek
print(dm.rangek)         # range object: range(minimum_k, kmax+1)
print(dm.loss)           # float: Avg Medoid Silhouette for the best k
print(dm.labels)         # ndarray: cluster labels for the best k clustering
print(dm.medoids)        # ndarray: medoid indices for the best k clustering
```
```