### Install VectorVFS with pip

Source: https://github.com/perone/vectorvfs/blob/main/docs/source/installation.md

Use this command to install the VectorVFS package from PyPI. Ensure pip is available in your environment.

```bash
pip install vectorvfs
```

--------------------------------

### Display vfs Script Help

Source: https://github.com/perone/vectorvfs/blob/main/docs/source/installation.md

After installation, the `vfs` command-line tool is available. Run it with `--help` to display the help message.

```bash
vfs --help
```

--------------------------------

### Run commands with uv

Source: https://github.com/perone/vectorvfs/blob/main/codex.md

Use `uv run` to execute a command inside the project's virtual environment. Replace `[command]` with the command you want to run.

```bash
uv run [command]
```

--------------------------------

### Store and Retrieve Tensors with VFSStore

Source: https://context7.com/perone/vectorvfs/llms.txt

Serialize and store PyTorch tensors into file metadata using VFSStore.

```python
import torch
from pathlib import Path
from vectorvfs.vfsstore import VFSStore, XAttrFile

# Initialize VFSStore for a file
file_path = Path("/path/to/image.jpg")
xattr_file = XAttrFile(file_path)
vfs_store = VFSStore(xattr_file)

# Create a sample embedding tensor (1024 dimensions, half precision)
embedding = torch.randn(1, 1024, dtype=torch.float16)

# Write tensor to the file's extended attributes
bytes_written = vfs_store.write_tensor(embedding)
print(f"Stored embedding: {bytes_written} bytes")
# Output: Stored embedding: 2114 bytes

# Read the tensor back from extended attributes
retrieved_embedding = vfs_store.read_tensor()
print(f"Retrieved tensor shape: {retrieved_embedding.shape}")
print(f"Tensor dtype: {retrieved_embedding.dtype}")
# Output: Retrieved tensor shape: torch.Size([1, 1024])
# Output: Tensor dtype: torch.float16

# Verify the stored and retrieved tensors match
assert torch.allclose(embedding, retrieved_embedding)
```
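The size reported above is easier to interpret with a little arithmetic: a half-precision value occupies 2 bytes, so a 1024-dimensional embedding needs 2048 bytes of raw payload, and anything beyond that is serialization overhead. The following is a minimal sketch using only the standard library's `struct` module. The helper names `pack_float16` and `unpack_float16` are hypothetical, and this is not the format VFSStore itself uses.

```python
import struct

def pack_float16(values):
    """Pack floats as little-endian IEEE 754 half-precision bytes."""
    return struct.pack(f"<{len(values)}e", *values)

def unpack_float16(payload):
    """Inverse of pack_float16: 2 bytes per value."""
    count = len(payload) // 2
    return list(struct.unpack(f"<{count}e", payload))

# A 1024-dimensional vector; the first three values are exactly
# representable in half precision, the rest are zero padding.
embedding = [0.5, -1.25, 2.0] + [0.0] * 1021
payload = pack_float16(embedding)
print(len(payload))  # 2048

restored = unpack_float16(payload)
print(restored[:3])
```

Payloads of this size matter because extended attributes are typically limited to a few kilobytes per value on common Linux filesystems, which is one reason to store embeddings in half precision.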

--------------------------------

### Search Files via CLI

Source: https://context7.com/perone/vectorvfs/llms.txt

Perform semantic searches across directories using the vfs search command.

```bash
# Basic search - find images matching "cat" in a folder
vfs search cat /my_folder

# Recursive search through all subdirectories
vfs search -r "orange tabby cat" /photos

# Limit results to top 3 matches
vfs search -n 3 "sunset over ocean" /vacation_photos

# Force re-indexing of all files (ignores cached embeddings)
vfs search -f "mountain landscape" /nature_photos

# Combined options: recursive search, top 5 results, force reindex
vfs search -r -n 5 -f "happy dog playing" /pet_photos
```

--------------------------------

### VFSStore Class

Source: https://context7.com/perone/vectorvfs/llms.txt

Handles high-level tensor storage and retrieval using the user.vectorvfs extended attribute.

```APIDOC
## VFSStore Class

### Description
Provides high-level tensor storage and retrieval operations, handling serialization of PyTorch tensors to/from bytes.

### Methods
- **write_tensor(tensor)**: Serializes and stores a PyTorch tensor in the 'user.vectorvfs' attribute.
- **read_tensor()**: Retrieves and deserializes the tensor from the 'user.vectorvfs' attribute.
```

--------------------------------

### Perform Semantic Search on Images

Source: https://context7.com/perone/vectorvfs/llms.txt

Indexes images in a directory and performs semantic search using text queries. Requires PerceptionEncoder and VFSStore. Can force re-indexing if needed.

```python
import torch
import torch.nn.functional as F
from pathlib import Path
from heapq import heappush, heappushpop
from dataclasses import dataclass, field

from vectorvfs.encoders import PerceptionEncoder
from vectorvfs.vfsstore import VFSStore, XAttrFile
from vectorvfs.utils import pillow_image_extensions

@dataclass(order=True)
class SearchResult:
    similarity: float
    path: Path = field(compare=False)

def semantic_search(query: str, directory: Path, top_k: int = 5,
                    recursive: bool = False, force_reindex: bool = False):
    """
    Perform semantic search across images in a directory.

    Args:
        query: Text query to search for
        directory: Directory to search in
        top_k: Number of results to return
        recursive: Whether to search subdirectories
        force_reindex: Whether to force re-embedding of all files

    Returns:
        List of (path, similarity_score) tuples
    """
    # Initialize encoder
    encoder = PerceptionEncoder()

    # Encode the search query
    query_features = encoder.encode_text(query)
    query_features = F.normalize(query_features)

    # Get supported image extensions
    supported_extensions = pillow_image_extensions()

    # Iterate over files
    if recursive:
        files = directory.rglob("*")
    else:
        files = directory.iterdir()

    # Min-heap for top-k results
    results_heap = []

    for file_path in files:
        if not file_path.is_file():
            continue
        if file_path.suffix.lower() not in supported_extensions:
            continue

        # Set up VFS store for this file
        xattr_file = XAttrFile(file_path)
        vfs_store = VFSStore(xattr_file)

        # Check if already indexed
        existing_attrs = xattr_file.list()
        needs_indexing = "user.vectorvfs" not in existing_attrs or force_reindex

        if needs_indexing:
            try:
                # Generate and store embedding
                features = encoder.encode_vision(file_path)
                features = F.normalize(features)
                features = features.to(torch.float16)
                vfs_store.write_tensor(features)
                print(f"Indexed: {file_path.name}")
            except Exception as e:
                print(f"Failed to index {file_path.name}: {e}")
                continue
        else:
            # Load cached embedding
            features = vfs_store.read_tensor()

        # Compute similarity
        features = features.to(torch.float32)
        similarity = (features @ query_features.T).item()

        # Maintain top-k heap
        result = SearchResult(similarity=similarity, path=file_path)
        if len(results_heap) < top_k:
            heappush(results_heap, result)
        else:
            heappushpop(results_heap, result)

    # Return sorted results (highest similarity first)
    return [(r.path, r.similarity) for r in sorted(results_heap, reverse=True)]

# Example usage
if __name__ == "__main__":
    search_dir = Path("/home/user/photos")
    results = semantic_search(
        query="happy golden retriever playing in park",
        directory=search_dir,
        top_k=5,
        recursive=True,
        force_reindex=False
    )

    print("\nSearch Results:")
    for path, score in results:
        print(f"  {path.name}: {score:.4f}")
```
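The top-k bookkeeping in the function above can be studied in isolation. The sketch below uses plain `(score, name)` tuples instead of `SearchResult` objects and made-up scores. The helper name `top_k` is hypothetical. It keeps a min-heap of at most `k` items, so the smallest retained score sits at the root and is evicted whenever a better candidate arrives.

```python
from heapq import heappush, heappushpop

def top_k(scored_items, k):
    """Return the k highest-scoring (name, score) pairs, best first."""
    heap = []  # min-heap: heap[0] is the weakest result kept so far
    for name, score in scored_items:
        item = (score, name)
        if len(heap) < k:
            heappush(heap, item)
        else:
            # Push the new item, then drop the smallest of the k + 1.
            heappushpop(heap, item)
    return [(name, score) for score, name in sorted(heap, reverse=True)]

# Illustrative scores only, not real model output.
scores = [("a.jpg", 0.12), ("b.jpg", 0.31), ("c.jpg", 0.05), ("d.jpg", 0.27)]
best = top_k(scores, 2)
print(best)  # [('b.jpg', 0.31), ('d.jpg', 0.27)]
```

Each file costs O(log k) heap work, so memory stays bounded by `k` no matter how many files the directory contains.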

--------------------------------

### Encode and compute similarity with DualEncoder

Source: https://context7.com/perone/vectorvfs/llms.txt

Demonstrates how to encode images and text, normalize the resulting features, and compute a cosine similarity score.

```python
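import torch.nn.functional as F
from pathlib import Path
from vectorvfs.encoders import PerceptionEncoder

# Initialize the encoder used throughout this example
encoder = PerceptionEncoder()

# Encode an image
image_path = Path("/path/to/cat.jpg")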
image_features = encoder.encode_vision(image_path)
print(f"Image embedding shape: {image_features.shape}")
# Output: Image embedding shape: torch.Size([1, 1024])

# Encode text query
text_features = encoder.encode_text("a cute orange cat")
print(f"Text embedding shape: {text_features.shape}")
# Output: Text embedding shape: torch.Size([1, 1024])

# Normalize features for cosine similarity
image_features = F.normalize(image_features)
text_features = F.normalize(text_features)

# Compute similarity score
similarity = (image_features @ text_features.T).item()
print(f"Similarity score: {similarity:.4f}")
# Output: Similarity score: 0.3245

# Get logit scale for softmax computation
logit_scale = encoder.logit_scale()
print(f"Logit scale: {logit_scale.item():.2f}")
```
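To make the arithmetic behind the similarity score concrete, here is a small sketch in plain Python with toy 2-dimensional vectors. It normalizes each vector to unit length, takes the dot product (which is then the cosine similarity), and finally applies a softmax over several candidate texts. Multiplying the similarities by the logit scale before the softmax is an assumption based on the common CLIP-style convention, and the value 100.0 is only illustrative. The helper names are hypothetical.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def cosine_similarity(a, b):
    """Dot product of the two unit-normalized vectors."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))

def softmax(scores, scale=1.0):
    """Numerically stable softmax over scaled scores."""
    scaled = [scale * s for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Toy embeddings: the first text points the same way as the image,
# the second is orthogonal to it.
image = [3.0, 4.0]
texts = [[6.0, 8.0], [4.0, -3.0]]

similarities = [cosine_similarity(image, t) for t in texts]
probabilities = softmax(similarities, scale=100.0)
print(similarities)
print(probabilities)
```

Because cosine similarities lie in [-1, 1], an unscaled softmax would produce nearly uniform probabilities. A large scale sharpens the distribution toward the best match.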

--------------------------------

### XAttrFile Class

Source: https://context7.com/perone/vectorvfs/llms.txt

Provides low-level access to Linux extended attributes on individual files.

```APIDOC
## XAttrFile Class

### Description
Wraps OS-level xattr operations for reading, writing, listing, and removing extended attributes on a file.

### Methods
- **list()**: Returns a list of existing extended attribute keys.
- **write(key, value)**: Writes binary data to a specific extended attribute key.
- **read(key)**: Reads binary data from a specific extended attribute key.
- **remove(key)**: Removes an extended attribute key from the file.
```
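For orientation, the four operations listed above map naturally onto functions in Python's standard `os` module, which are available on Linux only. The sketch below is an assumption about what such a wrapper could call, not a description of how XAttrFile is actually implemented. The helper name `xattr_roundtrip` and the key `user.sketch_key` are hypothetical. It returns `None` when the platform or filesystem does not support user extended attributes.

```python
import os
import tempfile

def xattr_roundtrip(path, key, value):
    """Write, list, read, and remove one extended attribute.

    Returns the bytes read back, or None if xattrs are unsupported here.
    """
    if not hasattr(os, "setxattr"):  # the os.*xattr functions are Linux-only
        return None
    try:
        os.setxattr(path, key, value)      # write
        if key not in os.listxattr(path):  # list
            return None
        data = os.getxattr(path, key)      # read
        os.removexattr(path, key)          # remove
        return data
    except OSError:
        # Raised when the filesystem rejects user.* attributes.
        return None

with tempfile.NamedTemporaryFile() as tmp:
    result = xattr_roundtrip(tmp.name, "user.sketch_key", b"hello")
    print(result)
```

Unprivileged processes can generally only set attributes in the `user.` namespace, which is why the keys in this document all begin with that prefix.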

--------------------------------

### Implement a custom DualEncoder

Source: https://context7.com/perone/vectorvfs/llms.txt

Provides a template for creating custom encoders by inheriting from the DualEncoder abstract base class.

```python
from pathlib import Path
import torch
from vectorvfs.encoders import DualEncoder

class CustomEncoder(DualEncoder):
    """Example custom encoder implementation."""

    def __init__(self, model_path: str):
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        # Load your custom model here
        self.model = self._load_model(model_path)

    def _load_model(self, path):
        # Custom model loading logic
        pass

    def encode_vision(self, file: Path) -> torch.Tensor:
        """Encode image file to tensor."""
        # Custom image encoding logic
        # Must return shape [1, embedding_dim]
        return torch.randn(1, 1024)

    def encode_text(self, text: str) -> torch.Tensor:
        """Encode text string to tensor."""
        # Custom text encoding logic
        # Must return shape [1, embedding_dim]
        return torch.randn(1, 1024)

    def logit_scale(self) -> torch.Tensor:
        """Return logit scale for similarity computation."""
        return torch.tensor(100.0)

# Use the custom encoder
encoder = CustomEncoder("/path/to/model")
features = encoder.encode_vision(Path("/path/to/image.jpg"))
```

--------------------------------

### Basic vfs search command

Source: https://github.com/perone/vectorvfs/blob/main/docs/source/usage.md

Use this command to search a folder for files that semantically match a query term. The tool automatically embeds supported files or loads their existing embeddings.

```bash
vfs search cat /my_folder
```

--------------------------------

### Encode Data with PerceptionEncoder

Source: https://context7.com/perone/vectorvfs/llms.txt

Initialize the PerceptionEncoder to transform images and text into a shared embedding space.

```python
import torch
import torch.nn.functional as F
from pathlib import Path
from vectorvfs.encoders import PerceptionEncoder

# Initialize the encoder (uses PE-Core-L14-336 model by default)
encoder = PerceptionEncoder()
print(f"Model: {encoder.model_name}")
print(f"Device: {encoder.device}")
# Output: Model: PE-Core-L14-336
# Output: Device: cuda (or cpu)
```

--------------------------------

### Manage Extended Attributes with XAttrFile

Source: https://context7.com/perone/vectorvfs/llms.txt

Use XAttrFile to read, write, and remove Linux extended attributes on individual files.

```python
from pathlib import Path
from vectorvfs.vfsstore import XAttrFile

# Initialize XAttrFile for a specific file
file_path = Path("/path/to/image.jpg")
xattr_file = XAttrFile(file_path)

# List all extended attributes on the file
attributes = xattr_file.list()
print(f"Existing attributes: {attributes}")
# Output: Existing attributes: ['user.vectorvfs', 'user.custom']

# Write custom data as an extended attribute
custom_data = b"my custom metadata"
xattr_file.write("user.my_custom_key", custom_data)

# Read an extended attribute value
data = xattr_file.read("user.my_custom_key")
print(f"Retrieved data: {data}")
# Output: Retrieved data: b'my custom metadata'

# Remove an extended attribute
xattr_file.remove("user.my_custom_key")

# Handle non-existent attributes with error handling
try:
    xattr_file.read("user.nonexistent")
except OSError as e:
    print(f"Attribute not found: {e}")
```

--------------------------------

### Measure Operation Time with PerfCounter

Source: https://context7.com/perone/vectorvfs/llms.txt

`PerfCounter` is a utility for measuring elapsed time, usable as a context manager or as a decorator. Import it from `vectorvfs.utils`.

```python
from vectorvfs.utils import PerfCounter

# Use as a context manager
with PerfCounter() as timer:
    # Perform some operation
    result = sum(range(1000000))

print(f"Operation took {timer.elapsed:.4f} seconds")
# Output: Operation took 0.0234 seconds

# Use as a decorator
@PerfCounter()
def expensive_operation():
    return [x**2 for x in range(100000)]

expensive_operation()
```
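A timer that works both as a context manager and as a decorator can be built from the standard library alone. The sketch below shows one way to do it with `contextlib.ContextDecorator` and `time.perf_counter`. The class name `SketchTimer` is hypothetical, and this is not PerfCounter's actual implementation.

```python
import time
from contextlib import ContextDecorator

class SketchTimer(ContextDecorator):
    """Record wall-clock time spent inside a with-block or a decorated call."""

    def __enter__(self):
        self.elapsed = 0.0
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.elapsed = time.perf_counter() - self._start
        return False  # never swallow exceptions from the timed code

# Context-manager form
with SketchTimer() as timer:
    total = sum(range(100_000))
print(f"Operation took {timer.elapsed:.4f} seconds")

# Decorator form: keep a reference to the instance to read the timing later
call_timer = SketchTimer()

@call_timer
def squares():
    return [x ** 2 for x in range(1_000)]

values = squares()
print(f"Call took {call_timer.elapsed:.4f} seconds")
```

`time.perf_counter` is monotonic and has the highest available resolution, which makes it a better fit for short measurements than `time.time`.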

--------------------------------

### Limit search results with vfs search

Source: https://github.com/perone/vectorvfs/blob/main/docs/source/usage.md

To display only the top N most similar files, use the `-n` flag followed by the desired number.

```bash
vfs search -n 3 cat /my_folder
```

--------------------------------

### PerceptionEncoder Class

Source: https://context7.com/perone/vectorvfs/llms.txt

Implements Meta's Perception Encoder for encoding images and text into a shared 1024-dimensional embedding space.

```APIDOC
## PerceptionEncoder Class

### Description
Provides methods for encoding both vision (images) and text inputs for semantic similarity computation using the PE-Core-L14-336 model.
```

--------------------------------

### CLI: vfs search

Source: https://context7.com/perone/vectorvfs/llms.txt

Performs semantic search across files in a directory, using cached embeddings where they exist and indexing new files otherwise.

```APIDOC
## CLI: vfs search

### Description
Performs semantic search across files in a directory, automatically indexing files that haven't been processed and using cached embeddings for previously indexed files.

### Usage
`vfs search [options] <query> <directory>`

### Parameters
#### Options
- **-r** (flag) - Optional - Recursive search through all subdirectories.
- **-n** (integer) - Optional - Limit results to the top N matches.
- **-f** (flag) - Optional - Force re-indexing of all files, ignoring cached embeddings.
```
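When calling the CLI from a script, it can help to assemble the argument list programmatically. The sketch below builds a command from the options documented above (`-r`, `-n`, `-f`). The helper name `build_vfs_search_command` is hypothetical and not part of VectorVFS. It only constructs the argument list and does not run anything.

```python
import shlex

def build_vfs_search_command(query, directory, recursive=False,
                             top_n=None, force=False):
    """Return the argv list for `vfs search [options] <query> <directory>`."""
    cmd = ["vfs", "search"]
    if recursive:
        cmd.append("-r")
    if top_n is not None:
        cmd += ["-n", str(top_n)]
    if force:
        cmd.append("-f")
    cmd += [query, str(directory)]
    return cmd

cmd = build_vfs_search_command("orange tabby cat", "/photos",
                               recursive=True, top_n=3)
print(shlex.join(cmd))  # vfs search -r -n 3 'orange tabby cat' /photos
```

Passing the resulting list to `subprocess.run(cmd, check=True)` avoids shell quoting problems with multi-word queries, since each list element is delivered to the program as a single argument.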

--------------------------------

### Force re-indexing with vfs search

Source: https://github.com/perone/vectorvfs/blob/main/docs/source/usage.md

To make VectorVFS re-index files, use the `-f` flag with the search command. This is useful when files have changed and existing embeddings may be outdated.

```bash
vfs search -f cat /my_folder
```
