LanceDB (https://github.com/lancedb/lancedb)

Developer-friendly, embedded retrieval engine for multimodal AI. Search More; Manage Less.

Tokens: 66,401 · Snippets: 621 · Trust Score: 9.2 · Updated: 1 month ago
Context Summary (auto-generated)
# LanceDB

LanceDB is a multimodal AI lakehouse and serverless vector database designed for fast, scalable, and production-ready vector search. Built on the Lance columnar format, it enables storing, indexing, and searching over petabytes of multimodal data and vectors with millisecond latency. LanceDB supports vector similarity search, full-text search, hybrid search, and traditional SQL filtering, with native APIs for Python, JavaScript/TypeScript, Rust, and Java. The database can run locally against a filesystem, on cloud storage (S3, GCS, Azure), or connect to LanceDB Cloud for managed deployments. Key features include zero-copy access, automatic versioning, GPU-accelerated index building, and seamless integrations with LangChain, LlamaIndex, Apache Arrow, Pandas, Polars, and DuckDB. LanceDB is 100% open source with no vendor lock-in.

## Connection and Database Operations

Connect to a LanceDB database to create, open, and manage tables. Connections can be local (filesystem), remote (cloud storage), or to LanceDB Cloud with API authentication.

### Python - Connect to Database

Establish a connection to a local or remote LanceDB database. Supports local paths, S3, GCS, Azure storage, and LanceDB Cloud.
```python
import lancedb
from datetime import timedelta

# Connect to local database
db = lancedb.connect("~/.lancedb")

# Connect to S3 with storage options
db = lancedb.connect(
    "s3://my-bucket/lancedb",
    storage_options={
        "aws_access_key_id": "YOUR_ACCESS_KEY",
        "aws_secret_access_key": "YOUR_SECRET_KEY",
        "aws_region": "us-east-1"
    }
)

# Connect to LanceDB Cloud
db = lancedb.connect(
    "db://my_database",
    api_key="ldb_xxxxxxxxxxxxx",
    region="us-east-1",
    client_config={"retry_config": {"retries": 5}}
)

# Connect with read consistency for multi-process access
db = lancedb.connect(
    "~/.lancedb",
    read_consistency_interval=timedelta(seconds=0)  # Strong consistency
)

# Async connection
db = await lancedb.connect_async("~/.lancedb")
```

### JavaScript/TypeScript - Connect to Database

Connect to LanceDB from Node.js applications with support for local and cloud storage.

```typescript
import * as lancedb from "@lancedb/lancedb";

// Connect to local database
const db = await lancedb.connect("/path/to/database");

// Connect with storage options (a distinct binding to avoid redeclaring `db`)
const s3Db = await lancedb.connect({
  uri: "s3://my-bucket/lancedb",
  storageOptions: {
    awsAccessKeyId: "YOUR_ACCESS_KEY",
    awsSecretAccessKey: "YOUR_SECRET_KEY",
    awsRegion: "us-east-1",
    timeout: "60s"
  }
});

// List all tables
const tableNames = await db.tableNames();
console.log("Tables:", tableNames);

// Close connection when done (optional)
db.close();
```

### Rust - Connect to Database

Connect to LanceDB using the Rust SDK with async/await support.

```rust
use lancedb::{connect, Result};

#[tokio::main]
async fn main() -> Result<()> {
    // Connect to local database
    let db = connect("data/sample-lancedb").execute().await?;

    // List all tables
    let tables = db.table_names().execute().await?;
    println!("Tables: {:?}", tables);

    Ok(())
}
```

## Table Operations

Create, open, and manage tables with support for schema definition, data insertion, updates, and deletions.
### Python - Create Table

Create a new table with data from dictionaries, pandas DataFrames, or PyArrow tables.

```python
import lancedb
import pyarrow as pa

db = lancedb.connect("~/.lancedb")

# Create table from list of dictionaries
data = [
    {"vector": [1.1, 1.2], "item": "foo", "price": 10.0},
    {"vector": [0.2, 1.8], "item": "bar", "price": 20.0}
]
table = db.create_table("products", data)

# Create table with custom PyArrow schema
schema = pa.schema([
    pa.field("vector", pa.list_(pa.float32(), 128)),
    pa.field("text", pa.utf8()),
    pa.field("category", pa.utf8())
])
table = db.create_table("documents", schema=schema)

# Create table with overwrite mode
table = db.create_table("products", data, mode="overwrite")

# Create table with exist_ok (open if exists)
table = db.create_table("products", data, exist_ok=True)

# Open existing table
table = db.open_table("products")

# Access table using dictionary syntax
table = db["products"]

# Drop table
db.drop_table("products")
```

### JavaScript/TypeScript - Create Table

Create and manage tables in JavaScript with Arrow schema support.
```typescript
import * as lancedb from "@lancedb/lancedb";
import { Schema, Field, Float32, FixedSizeList, Utf8 } from "apache-arrow";

const db = await lancedb.connect("/tmp/lancedb");

// Create table from array of objects
const data = [
  { vector: [1.1, 1.2], item: "foo", price: 10.0 },
  { vector: [0.2, 1.8], item: "bar", price: 20.0 }
];
const table = await db.createTable("products", data);

// Create empty table with schema
const schema = new Schema([
  new Field("vector", new FixedSizeList(128, new Field("item", new Float32()))),
  new Field("text", new Utf8())
]);
const emptyTable = await db.createEmptyTable("documents", schema);

// Open existing table
const existingTable = await db.openTable("products");

// Get table info
console.log("Name:", existingTable.name);
console.log("Row count:", await existingTable.countRows());
const tableSchema = await existingTable.schema();

// Drop table
await db.dropTable("products");
```

### Python - Add, Update, Delete Data

Perform CRUD operations on table data with filtering support.

```python
import lancedb

db = lancedb.connect("~/.lancedb")
table = db.open_table("products")

# Add new records
new_data = [
    {"vector": [1.3, 1.4], "item": "fizz", "price": 100.0},
    {"vector": [9.5, 56.2], "item": "buzz", "price": 200.0}
]
table.add(new_data)

# Update records with filter
table.update(
    where="item = 'fizz'",
    values={"price": 150.0}
)

# Delete records with filter
table.delete("price > 100")

# Merge insert (upsert) operation
table.merge_insert("item") \
    .when_matched_update_all() \
    .when_not_matched_insert_all() \
    .execute([
        {"vector": [1.3, 1.4], "item": "fizz", "price": 175.0},
        {"vector": [2.0, 3.0], "item": "new_item", "price": 50.0}
    ])

# Count rows
count = table.count_rows()
print(f"Total rows: {count}")

# Count with filter
filtered_count = table.count_rows("price > 50")
```

### JavaScript/TypeScript - Add, Update, Delete Data

Perform data operations in JavaScript with async/await.
```typescript
import * as lancedb from "@lancedb/lancedb";

const db = await lancedb.connect("/tmp/lancedb");
const table = await db.openTable("products");

// Add new records
await table.add([
  { vector: [1.3, 1.4], item: "fizz", price: 100.0 },
  { vector: [9.5, 56.2], item: "buzz", price: 200.0 }
]);

// Update records
await table.update({
  where: "item = 'fizz'",
  values: { price: 150.0 }
});

// Delete records
await table.delete("price > 100");

// Merge insert (upsert)
await table
  .mergeInsert("item")
  .whenMatchedUpdateAll()
  .whenNotMatchedInsertAll()
  .execute([
    { vector: [1.3, 1.4], item: "fizz", price: 175.0 }
  ]);

// Count rows
const count = await table.countRows();
const filteredCount = await table.countRows("price > 50");
```

## Vector Search

Perform approximate nearest neighbor (ANN) vector search with filtering and distance metrics.

### Python - Vector Search

Search for similar vectors with optional filtering and distance configuration.

```python
import lancedb

db = lancedb.connect("~/.lancedb")
table = db.open_table("products")

# Basic vector search
results = table.search([1.0, 2.0]) \
    .limit(10) \
    .to_pandas()

# Vector search with filter (pre-filter)
results = table.search([1.0, 2.0]) \
    .where("price > 50") \
    .limit(10) \
    .to_pandas()

# Vector search with post-filter
results = table.search([1.0, 2.0]) \
    .where("price > 50", prefilter=False) \
    .limit(10) \
    .to_pandas()

# Specify distance metric
results = table.search([1.0, 2.0]) \
    .metric("cosine") \
    .limit(10) \
    .to_pandas()

# Search specific vector column
results = table.search([1.0, 2.0]) \
    .vector_column_name("embedding") \
    .limit(10) \
    .to_arrow()

# Configure search parameters for indexed tables
results = table.search([1.0, 2.0]) \
    .nprobes(20) \
    .refine_factor(10) \
    .limit(10) \
    .to_pandas()

# Select specific columns
results = table.search([1.0, 2.0]) \
    .select(["item", "price"]) \
    .limit(10) \
    .to_pandas()

# Include row IDs
results = table.search([1.0, 2.0]) \
    .with_row_id(True) \
    .limit(10) \
    .to_pandas()

# Async vector search
results = await table.vector_search([1.0, 2.0]).limit(10).to_pandas()
```

### JavaScript/TypeScript - Vector Search

Perform vector similarity search in JavaScript applications.

```typescript
import * as lancedb from "@lancedb/lancedb";

const db = await lancedb.connect("/tmp/lancedb");
const table = await db.openTable("products");

// Basic vector search
const results = await table
  .vectorSearch([1.0, 2.0])
  .limit(10)
  .toArray();

// Vector search with filter
const filteredResults = await table
  .vectorSearch([1.0, 2.0])
  .where("price > 50")
  .limit(10)
  .toArray();

// Search with distance metric
const cosineResults = await table
  .vectorSearch([1.0, 2.0])
  .distanceType("cosine")
  .limit(10)
  .toArray();

// Select specific columns
const selectedResults = await table
  .vectorSearch([1.0, 2.0])
  .select(["item", "price"])
  .limit(10)
  .toArray();

// Return as Arrow table
const arrowResults = await table
  .vectorSearch([1.0, 2.0])
  .limit(10)
  .toArrow();

// Configure search parameters
const indexedResults = await table
  .vectorSearch([1.0, 2.0])
  .nprobes(20)
  .refineFactor(10)
  .limit(10)
  .toArray();
```

### Rust - Vector Search

Execute vector search queries in Rust with streaming results.

```rust
use lancedb::{connect, Result};
use futures::TryStreamExt;

#[tokio::main]
async fn main() -> Result<()> {
    let db = connect("data/sample-lancedb").execute().await?;
    let table = db.open_table("products").execute().await?;

    // Vector search with limit
    let query_vector: Vec<f32> = vec![1.0, 2.0];
    let results = table
        .query()
        .nearest_to(&query_vector)?
        .limit(10)
        .execute()
        .await?
        .try_collect::<Vec<_>>()
        .await?;

    for batch in results {
        println!("Found {} rows", batch.num_rows());
    }

    Ok(())
}
```

## Full-Text Search

Perform full-text search on string columns with support for match queries, phrase queries, and boolean combinations.

### Python - Full-Text Search

Execute full-text search queries after creating an FTS index.
```python
import lancedb
from lancedb.index import FTS
from lancedb.query import MatchQuery, PhraseQuery, BooleanQuery, Occur

db = lancedb.connect("~/.lancedb")

# Create table with text data
data = [
    {"text": "Machine learning is a subset of artificial intelligence", "category": "tech"},
    {"text": "Deep learning neural networks process data", "category": "tech"},
    {"text": "Natural language processing enables text understanding", "category": "nlp"}
]
table = db.create_table("documents", data)

# Create full-text search index
table.create_fts_index("text", config=FTS(
    with_position=True,       # Enable phrase queries
    base_tokenizer="simple",  # Options: simple, whitespace, raw
    language="English",
    stem=True,
    remove_stop_words=True,
    lower_case=True
))

# Basic full-text search
results = table.search("machine learning", query_type="fts") \
    .limit(10) \
    .to_pandas()

# Match query with options
results = table.search(
    MatchQuery("neural networks", "text", fuzziness=1, boost=2.0),
    query_type="fts"
).limit(10).to_pandas()

# Phrase query (requires with_position=True)
results = table.search(
    PhraseQuery("deep learning", "text"),
    query_type="fts"
).limit(10).to_pandas()

# Boolean query combining multiple conditions
boolean_query = BooleanQuery([
    (Occur.MUST, MatchQuery("machine", "text")),
    (Occur.SHOULD, MatchQuery("learning", "text")),
    (Occur.MUST_NOT, MatchQuery("neural", "text"))
])
results = table.search(boolean_query, query_type="fts").limit(10).to_pandas()

# Combine with filter
results = table.search("artificial intelligence", query_type="fts") \
    .where("category = 'tech'") \
    .limit(10) \
    .to_pandas()
```

### JavaScript/TypeScript - Full-Text Search

Perform full-text search in JavaScript with index configuration.
```typescript
import * as lancedb from "@lancedb/lancedb";
import { Index } from "@lancedb/lancedb";

const db = await lancedb.connect("/tmp/lancedb");
const table = await db.openTable("documents");

// Create FTS index
await table.createIndex("text", {
  config: Index.fts({
    withPosition: true,
    baseTokenizer: "simple",
    language: "English",
    stem: true,
    removeStopWords: true
  })
});

// Basic full-text search
const results = await table
  .search("machine learning", "fts")
  .limit(10)
  .toArray();

// Full-text search with filter
const filteredResults = await table
  .search("neural networks", "fts")
  .where("category = 'tech'")
  .limit(10)
  .toArray();
```

## Hybrid Search

Combine vector similarity search with full-text search for improved relevance using reranking strategies.

### Python - Hybrid Search

Execute hybrid queries that combine vector and text search with configurable reranking.

```python
import lancedb
from lancedb.rerankers import RRFReranker, LinearCombinationReranker

db = lancedb.connect("~/.lancedb")
table = db.open_table("documents")  # Table with both vector and text columns

# Ensure both vector and FTS indices exist
table.create_index("vector")    # Vector index
table.create_fts_index("text")  # Full-text index

# Basic hybrid search (vector + FTS)
results = table.search([1.0, 2.0, 3.0], query_type="hybrid") \
    .text("machine learning") \
    .limit(10) \
    .to_pandas()

# Hybrid search with RRF reranking
reranker = RRFReranker()
results = table.search([1.0, 2.0, 3.0], query_type="hybrid") \
    .text("machine learning") \
    .rerank(reranker) \
    .limit(10) \
    .to_pandas()

# Hybrid search with linear combination reranking
reranker = LinearCombinationReranker(weight=0.7)  # 70% vector, 30% FTS
results = table.search([1.0, 2.0, 3.0], query_type="hybrid") \
    .text("machine learning") \
    .rerank(reranker) \
    .limit(10) \
    .to_pandas()

# Hybrid search with filter
results = table.search([1.0, 2.0, 3.0], query_type="hybrid") \
    .text("deep learning") \
    .where("category = 'tech'") \
    .limit(10) \
    .to_pandas()
```

## Indexing

Create vector and scalar indices to accelerate search performance on large datasets.

### Python - Create Vector Index

Create various types of vector indices optimized for different use cases.

```python
import lancedb
from lancedb.index import IvfPq, HnswPq, HnswSq, IvfFlat

db = lancedb.connect("~/.lancedb")
table = db.open_table("products")

# Create IVF-PQ index (good for large datasets)
table.create_index(
    "vector",
    config=IvfPq(
        distance_type="l2",  # Options: l2, cosine, dot
        num_partitions=256,  # Number of IVF partitions
        num_sub_vectors=16,  # PQ sub-vectors
        num_bits=8,          # Bits per sub-vector (4 or 8)
        max_iterations=50,   # K-means iterations
        sample_rate=256      # Training sample rate
    )
)

# Create HNSW-PQ index (faster search, higher memory)
table.create_index(
    "vector",
    config=HnswPq(
        distance_type="cosine",
        num_partitions=1,    # Usually 1 for HNSW
        num_sub_vectors=16,
        m=20,                # HNSW connections per node
        ef_construction=300  # Build-time candidate list size
    )
)

# Create HNSW-SQ index (scalar quantization)
table.create_index(
    "vector",
    config=HnswSq(
        distance_type="l2",
        m=20,
        ef_construction=300
    )
)

# Create IVF-Flat index (no compression, highest accuracy)
table.create_index(
    "vector",
    config=IvfFlat(
        distance_type="l2",
        num_partitions=256
    )
)

# List indices
indices = table.list_indices()
for idx in indices:
    print(f"Index: {idx}")
```

### Python - Create Scalar Index

Create indices on scalar columns to accelerate filtering operations.
```python
import lancedb
from lancedb.index import BTree, Bitmap, LabelList, FTS

db = lancedb.connect("~/.lancedb")
table = db.open_table("products")

# BTree index for high-cardinality columns
table.create_scalar_index("price", index_type="BTREE")

# Bitmap index for low-cardinality columns
table.create_scalar_index("category", index_type="BITMAP")

# LabelList index for list columns (tags, categories)
table.create_scalar_index("tags", index_type="LABEL_LIST")

# Full-text search index
table.create_fts_index(
    "description",
    config=FTS(
        with_position=True,
        language="English",
        stem=True,
        remove_stop_words=True
    )
)
```

### JavaScript/TypeScript - Create Index

Create vector and scalar indices in JavaScript applications.

```typescript
import * as lancedb from "@lancedb/lancedb";
import { Index } from "@lancedb/lancedb";

const db = await lancedb.connect("/tmp/lancedb");
const table = await db.openTable("products");

// Create IVF-PQ vector index
await table.createIndex("vector", {
  config: Index.ivfPq({
    distanceType: "l2",
    numPartitions: 256,
    numSubVectors: 16
  })
});

// Create HNSW-PQ vector index
await table.createIndex("vector", {
  config: Index.hnswPq({
    distanceType: "cosine",
    numPartitions: 1,
    m: 20,
    efConstruction: 300
  })
});

// Create scalar index
await table.createIndex("price");

// Create FTS index
await table.createIndex("text", {
  config: Index.fts({
    withPosition: true,
    language: "English"
  })
});

// List indices
const indices = await table.listIndices();
console.log("Indices:", indices);

// Drop index
await table.dropIndex("vector_idx");
```

## Embeddings

Automatically generate vector embeddings using built-in embedding functions from various providers.

### Python - Embedding Functions

Use embedding functions to automatically vectorize text and other data types.
```python
import lancedb
from lancedb.embeddings import get_registry
from lancedb.pydantic import LanceModel, Vector

db = lancedb.connect("~/.lancedb")

# Get embedding function from registry
registry = get_registry()

# Sentence Transformers (local)
sentence_transformer = registry.get("sentence-transformers").create(
    name="all-MiniLM-L6-v2"
)

# OpenAI embeddings
openai_embedding = registry.get("openai").create(
    name="text-embedding-3-small",
    api_key="your-api-key"
)

# Define schema with embedding function using Pydantic
class Document(LanceModel):
    text: str = sentence_transformer.SourceField()
    vector: Vector(sentence_transformer.ndims()) = sentence_transformer.VectorField()
    category: str

# Create table with automatic embedding
table = db.create_table("documents", schema=Document)

# Add data - embeddings generated automatically
table.add([
    {"text": "Machine learning is fascinating", "category": "tech"},
    {"text": "Neural networks process information", "category": "tech"}
])

# Search with text query - automatically embedded
results = table.search("artificial intelligence").limit(10).to_pandas()

# Available embedding providers:
# - sentence-transformers (local)
# - openai
# - cohere
# - huggingface
# - ollama
# - gemini
# - voyageai
# - bedrock
```

### JavaScript/TypeScript - Embedding Functions

Configure embedding functions in JavaScript for automatic vectorization.
```typescript
import * as lancedb from "@lancedb/lancedb";
import { getRegistry, LanceSchema } from "@lancedb/lancedb/embedding";

const db = await lancedb.connect("/tmp/lancedb");

// Get embedding function from registry
const registry = getRegistry();
const embeddingFunc = await registry
  .get("openai")
  .create({ model: "text-embedding-3-small" });

// Define schema with embedding
const schema = LanceSchema({
  text: embeddingFunc.sourceField(),
  vector: embeddingFunc.vectorField(),
  category: "string"
});

// Create table with embedding function
const table = await db.createTable("documents", [], { schema });

// Add data - embeddings generated automatically
await table.add([
  { text: "Machine learning is fascinating", category: "tech" },
  { text: "Neural networks process information", category: "tech" }
]);

// Search with automatic query embedding
const results = await table.search("artificial intelligence").limit(10).toArray();
```

## Schema Operations

Modify table schemas by adding, altering, or dropping columns.

### Python - Schema Evolution

Evolve table schema with column operations.

```python
import lancedb
import pyarrow as pa

db = lancedb.connect("~/.lancedb")
table = db.open_table("products")

# Add columns with SQL expressions
table.add_columns({
    "double_price": "cast((price * 2) as float)",
    "is_expensive": "price > 100"
})

# Alter column (rename, change nullability, change type)
table.alter_columns([
    {
        "path": "double_price",
        "rename": "price_doubled",
        "nullable": True
    }
])

# Drop columns
table.drop_columns(["price_doubled", "is_expensive"])

# Get current schema
schema = table.schema
print(schema)
```

### JavaScript/TypeScript - Schema Evolution

Modify table structure in JavaScript applications.
```typescript
import * as lancedb from "@lancedb/lancedb";
import { Field, Float32 } from "apache-arrow";

const db = await lancedb.connect("/tmp/lancedb");
const table = await db.openTable("products");

// Add columns with SQL expressions
await table.addColumns([
  { name: "double_price", valueSql: "cast((price * 2) as float)" }
]);

// Add column with Arrow Field (initialized to null)
await table.addColumns(new Field("new_column", new Float32(), true));

// Alter columns
await table.alterColumns([
  { path: "double_price", rename: "price_doubled", nullable: true }
]);

// Drop columns
await table.dropColumns(["price_doubled"]);

// Get schema
const schema = await table.schema();
```

## Table Versioning

Access and manage table versions for time-travel queries and data recovery.

### Python - Version Management

Navigate table history and restore previous versions.

```python
import lancedb

db = lancedb.connect("~/.lancedb")
table = db.open_table("products")

# Get current version
version = table.version
print(f"Current version: {version}")

# List all versions
versions = table.list_versions()
for v in versions:
    print(f"Version {v['version']}: created at {v['timestamp']}")

# Checkout specific version (time-travel read)
table.checkout(5)
old_data = table.to_pandas()

# Return to latest version
table.checkout_latest()

# Restore to a previous version (creates new version)
table.restore(5)

# Create a tag for a version
table.create_tag("production-v1")

# Checkout by tag
table.checkout("production-v1")

# Compact old versions to save space
stats = table.compact_files()
print(f"Compaction stats: {stats}")

# Clean up old versions
table.cleanup_old_versions(older_than_days=7)
```

### JavaScript/TypeScript - Version Management

Manage table versions in JavaScript with checkout and restore operations.
```typescript
import * as lancedb from "@lancedb/lancedb";

const db = await lancedb.connect("/tmp/lancedb");
const table = await db.openTable("products");

// Get current version
const version = await table.version();
console.log(`Current version: ${version}`);

// Checkout specific version (time-travel)
await table.checkout(5);
const oldData = await table.query().toArrow();

// Return to latest
await table.checkoutLatest();

// Restore to previous version
await table.restore(5);

// Table optimization
const compactionStats = await table.optimize({
  cleanupOlderThan: new Date(Date.now() - 7 * 24 * 60 * 60 * 1000) // 7 days ago
});
```

## Query Builder

Build complex queries with filtering, projection, and pagination.

### Python - Query Builder

Construct and execute SQL-like queries on table data.

```python
import lancedb

db = lancedb.connect("~/.lancedb")
table = db.open_table("products")

# Basic query with filter
results = table.search() \
    .where("price > 50 AND category = 'electronics'") \
    .limit(100) \
    .to_pandas()

# Query with column selection
results = table.search() \
    .select(["item", "price", "category"]) \
    .where("price > 50") \
    .limit(100) \
    .to_pandas()

# Query with computed columns
results = table.search() \
    .select(["item", "price"]) \
    .select({"discounted": "price * 0.9"}) \
    .limit(100) \
    .to_pandas()

# Query with offset for pagination
page_size = 50
page_number = 2
results = table.search() \
    .offset(page_number * page_size) \
    .limit(page_size) \
    .to_pandas()

# Get all data as Arrow table
all_data = table.to_arrow()

# Get all data as Pandas DataFrame
all_data = table.to_pandas()

# Async query execution
results = await table.query().where("price > 50").limit(100).to_pandas()
```

### JavaScript/TypeScript - Query Builder

Build and execute queries in JavaScript with type-safe results.
```typescript
import * as lancedb from "@lancedb/lancedb";

const db = await lancedb.connect("/tmp/lancedb");
const table = await db.openTable("products");

// Basic query with filter
const results = await table
  .query()
  .where("price > 50 AND category = 'electronics'")
  .limit(100)
  .toArray();

// Query with column selection
const selectedResults = await table
  .query()
  .select(["item", "price", "category"])
  .where("price > 50")
  .limit(100)
  .toArray();

// Query with computed columns
const computedResults = await table
  .query()
  .select({ item: "item", discounted: "price * 0.9" })
  .limit(100)
  .toArrow();

// Pagination with offset
const pageSize = 50;
const pageNumber = 2;
const pagedResults = await table
  .query()
  .offset(pageNumber * pageSize)
  .limit(pageSize)
  .toArray();
```

## Storage Options

Configure storage backends for local, cloud, and enterprise deployments.

### Python - Storage Configuration

Configure connection options for various storage backends.

```python
import lancedb

# AWS S3 configuration
db = lancedb.connect(
    "s3://my-bucket/lancedb",
    storage_options={
        "aws_access_key_id": "YOUR_ACCESS_KEY",
        "aws_secret_access_key": "YOUR_SECRET_KEY",
        "aws_region": "us-east-1",
        "aws_session_token": "SESSION_TOKEN",  # Optional for temporary credentials
    }
)

# Google Cloud Storage configuration
db = lancedb.connect(
    "gs://my-bucket/lancedb",
    storage_options={
        "google_service_account": "/path/to/service-account.json",
        # Or use google_service_account_key for inline JSON
    }
)

# Azure Blob Storage configuration
db = lancedb.connect(
    "az://my-container/lancedb",
    storage_options={
        "azure_storage_account_name": "account_name",
        "azure_storage_account_key": "account_key",
        # Or use azure_client_id, azure_client_secret, azure_tenant_id
    }
)

# General storage options
db = lancedb.connect(
    "s3://my-bucket/lancedb",
    storage_options={
        "timeout": "60s",
        "connect_timeout": "30s",
        "new_table_data_storage_version": "stable",
        "new_table_enable_stable_row_ids": "true",
        "new_table_enable_v2_manifest_paths": "true"
    }
)
```

LanceDB serves as a comprehensive solution for building AI/ML applications that require fast vector search, multimodal data management, and scalable storage. Its primary use cases include semantic search engines, recommendation systems, RAG (Retrieval Augmented Generation) pipelines, image similarity search, and any application requiring efficient similarity matching across large datasets. The database excels at combining traditional filtering with vector search, making it ideal for production systems that need both structured queries and semantic understanding.

Integration patterns typically involve connecting LanceDB as the vector store in larger AI pipelines. For RAG applications, LanceDB integrates seamlessly with LangChain and LlamaIndex frameworks, providing the retrieval layer for LLM-powered applications. The embedding function system allows automatic vectorization of text, images, and other modalities, simplifying the ingestion pipeline.

For analytics workloads, LanceDB's Apache Arrow foundation enables zero-copy data exchange with Pandas, Polars, and DuckDB, making it easy to combine vector search results with traditional data analysis. The versioning system supports ML experimentation workflows, allowing teams to track dataset versions alongside model versions.