### Install Packages with `uv` (Shell) Source: https://github.com/langchain-ai/langchain-postgres/blob/main/DEVELOPMENT.md This command installs the project's packages in editable mode and synchronizes dependencies, including those for testing. It requires the virtual environment to be activated. ```shell uv sync --group test ``` -------------------------------- ### Install langchain-postgres and dependencies Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore_how_to.ipynb Installs the `langchain-postgres` library and other required dependencies for the notebook. It uses pip to upgrade and install packages quietly. ```python %pip install --upgrade --quiet langchain-postgres # This notebook also requires the following dependencies %pip install --upgrade --quiet langchain-core langchain-cohere sqlalchemy ``` -------------------------------- ### Start PostgreSQL/PGVector Docker Container (Shell) Source: https://github.com/langchain-ai/langchain-postgres/blob/main/DEVELOPMENT.md This command launches a PostgreSQL database with the PGVector extension enabled using Docker. It configures the container with specific user, password, and database name, and maps the container's port 5432 to the host's port 6024. The `log_statement=all` flag ensures all SQL statements are logged. ```shell docker run --rm -it --name pgvector-container \ -e POSTGRES_USER=langchain \ -e POSTGRES_PASSWORD=langchain \ -e POSTGRES_DB=langchain_test \ -p 6024:5432 pgvector/pgvector:pg16 \ postgres -c log_statement=all ``` -------------------------------- ### Install Dependencies and Initialize PGVector Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/vectorstore.ipynb Installs the required LangChain Cohere integration and demonstrates how to initialize the PGVector store using a connection string and embedding model. ```python !pip install --quiet -U langchain_cohere from langchain_cohere import CohereEmbeddings from langchain_core.documents import Document from langchain_postgres.vectorstores import PGVector # See docker command above to launch a postgres instance with pgvector enabled. connection = "postgresql+psycopg://langchain:langchain@localhost:6024/langchain" collection_name = "my_docs" embeddings = CohereEmbeddings() vectorstore = PGVector( embeddings=embeddings, collection_name=collection_name, connection=connection, use_jsonb=True, ) ``` -------------------------------- ### Initialize PGVectorStore with Hybrid Search Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore_how_to.ipynb Demonstrates how to create a new PostgreSQL table with hybrid search capabilities and initialize the PGVectorStore instance using the provided hybrid search configuration. ```python from langchain_postgres import PGVectorStore TABLE_NAME = "hybrid_search_products" await pg_engine.ainit_vectorstore_table( table_name=TABLE_NAME, vector_size=VECTOR_SIZE, id_column="product_id", content_column="description", embedding_column="embed", metadata_columns=["name", "category", "price_usd", "quantity", "sku", "image_url"], metadata_json_column="metadata", hybrid_search_config=hybrid_search_config, store_metadata=True, ) vs_hybrid = await PGVectorStore.create( pg_engine, table_name=TABLE_NAME, embedding_service=embedding, id_column="product_id", content_column="description", embedding_column="embed", metadata_columns=["name", "category", "price_usd", "quantity", "sku", "image_url"], metadata_json_column="metadata", hybrid_search_config=hybrid_search_config, ) docs = await custom_store.asimilarity_search("products", k=5) ``` -------------------------------- ### Clone Repository and Set Up Virtual Environment (Shell) Source: https://github.com/langchain-ai/langchain-postgres/blob/main/DEVELOPMENT.md This snippet clones the langchain-postgres repository, navigates into the directory, creates a Python virtual environment using `uv`, and activates it. It's the initial step for setting up the development environment. ```shell git clone https://github.com/langchain-ai/langchain-postgres cd langchain-postgres uv venv --python=3.13 source .venv/bin/activate ``` -------------------------------- ### Install langchain-postgres Package (Bash) Source: https://github.com/langchain-ai/langchain-postgres/blob/main/README.md Provides the command to install the langchain-postgres package using pip. This is a prerequisite for using the package's functionalities. ```bash pip install -U langchain-postgres ``` -------------------------------- ### Initialize Vectorstore Table with PGEngine Source: https://context7.com/langchain-ai/langchain-postgres/llms.txt Creates a PostgreSQL table for storing document embeddings. Allows customization of schema, including metadata, vector dimensions, and hybrid search support. Includes basic, advanced, and async initialization examples. ```python from langchain_postgres import PGEngine, Column CONNECTION_STRING = "postgresql+asyncpg://langchain:langchain@localhost:6024/langchain" engine = PGEngine.from_connection_string(url=CONNECTION_STRING) # Basic table creation engine.init_vectorstore_table( table_name="my_documents", vector_size=1536, # Must match your embedding model's dimension ) # Advanced table with custom metadata columns engine.init_vectorstore_table( table_name="products", vector_size=768, schema_name="public", content_column="content", embedding_column="embedding", metadata_columns=[ Column("category", "TEXT"), Column("price", "FLOAT"), Column("in_stock", "BOOLEAN"), ], id_column=Column("product_id", "VARCHAR(50)", nullable=False), overwrite_existing=False, # Set True to drop and recreate ) # Async version await engine.ainit_vectorstore_table( table_name="async_documents", vector_size=1536, ) ``` -------------------------------- ### Add documents and texts to vector store Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore_how_to.ipynb Demonstrates how to ingest data into the vector store using either structured Document objects or raw text lists with associated metadata and IDs. ```python import uuid from langchain_core.documents import Document docs = [ Document(id=str(uuid.uuid4()), page_content="Red Apple", metadata={"description": "red", "content": "1", "category": "fruit"}), Document(id=str(uuid.uuid4()), page_content="Banana Cavendish", metadata={"description": "yellow", "content": "2", "category": "fruit"}), Document(id=str(uuid.uuid4()), page_content="Orange Navel", metadata={"description": "orange", "content": "3", "category": "fruit"}), ] await store.aadd_documents(docs) all_texts = ["Apples and oranges", "Cars and airplanes", "Pineapple", "Train", "Banana"] metadatas = [{"len": len(t)} for t in all_texts] ids = [str(uuid.uuid4()) for _ in all_texts] await store.aadd_texts(all_texts, metadatas=metadatas, ids=ids) ``` -------------------------------- ### Create PGEngine from SQLAlchemy engine Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore_how_to.ipynb Initializes a `PGEngine` using an existing SQLAlchemy `AsyncEngine` object. This allows for more control over the SQLAlchemy engine configuration before passing it to `PGEngine`. ```python from sqlalchemy.ext.asyncio import create_async_engine # Create an SQLAlchemy Async Engine engine = create_async_engine( CONNECTION_STRING, ) pg_engine = PGEngine.from_engine(engine=engine) ``` -------------------------------- ### Manage vector indexes Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore_how_to.ipynb Apply, re-index, and drop vector indexes like HNSW and IVFFlat to optimize search performance. ```python from langchain_postgres.v2.indexes import HNSWIndex, IVFFlatIndex index = IVFFlatIndex(name="my-ivfflat", lists=120) await store.aapply_vector_index(index) await store.areindex("my-hnsw-index") await store.adrop_vector_index("my-hnsw-index") ``` -------------------------------- ### Customize PGVectorStore Table Schema Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore_how_to.ipynb Demonstrates how to initialize a vector store table with custom ID types, specific metadata columns, and how to map an existing database table to the PGVectorStore interface. ```python from langchain_postgres import Column # Custom ID column example await pg_engine.ainit_vectorstore_table( ..., id_column=Column(name="langchain_id", data_type="INTEGER") ) # Initialize with custom metadata columns await pg_engine.ainit_vectorstore_table( table_name="vectorstore_custom", vector_size=VECTOR_SIZE, metadata_columns=[Column("len", "INTEGER")], ) # Connect to existing table custom_store = await PGVectorStore.create( engine=pg_engine, table_name="products", embedding_service=embedding, id_column="product_id", content_column="description", embedding_column="embed", metadata_columns=["name", "category", "price_usd", "quantity", "sku", "image_url"], metadata_json_column="metadata", ) ``` -------------------------------- ### Perform Metadata-Filtered Similarity Search Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore_how_to.ipynb Shows how to add documents with metadata to the store and perform a similarity search using dictionary-based filters with supported operators like $gte. ```python import uuid docs = [ Document( id=str(uuid.uuid4()), page_content="Red Apple", metadata={"description": "red", "content": "1", "category": "fruit"}, ), Document( id=str(uuid.uuid4()), page_content="Banana Cavendish", metadata={"description": "yellow", "content": "2", "category": "fruit"}, ) ] await custom_store.aadd_documents(docs) # Filtered search docs = await custom_store.asimilarity_search(query, filter={"content": {"$gte": 1}}) ``` -------------------------------- ### Create PGEngine from connection string (asyncpg) Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore_how_to.ipynb Initializes a `PGEngine` using a PostgreSQL connection string with the `asyncpg` driver. The `PGEngine` manages a connection pool for efficient database interactions. The connection string specifies the driver, user, password, host, port, and database. ```python # See docker command above to launch a Postgres instance with pgvector enabled. CONNECTION_STRING = ( f"postgresql+asyncpg://{POSTGRES_USER}:{POSTGRES_PASSWORD}@{POSTGRES_HOST}" f":{POSTGRES_PORT}/{POSTGRES_DB}" ) # To use psycopg3 driver, set your connection string to `postgresql+psycopg://` from langchain_postgres import PGEngine pg_engine = PGEngine.from_connection_string(url=CONNECTION_STRING) ``` -------------------------------- ### Perform similarity searches Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore_how_to.ipynb Execute similarity searches using natural language queries or pre-computed vector embeddings. ```python query = "I'd like a fruit." docs = await store.asimilarity_search(query) query_vector = embedding.embed_query(query) docs = await store.asimilarity_search_by_vector(query_vector, k=2) ``` -------------------------------- ### Install langchain-postgres Package Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore.ipynb Installs or upgrades the `langchain-postgres` Python package, which provides the necessary tools to interact with PostgreSQL as a vector store. ```python %pip install --upgrade --quiet langchain-postgres ``` -------------------------------- ### Initialize Vector Store Tables with PGEngine Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/migrate_pgvector_to_pgvectorstore.ipynb Demonstrates how to create database tables for vector storage using PGEngine. Includes examples for specifying custom schemas, metadata columns, and non-UUID identifiers. ```python SCHEMA_NAME="my_schema" await engine.ainit_vectorstore_table( table_name=TABLE_NAME, vector_size=768, schema_name=SCHEMA_NAME, # Default: "public" ) ``` ```python metadata_columns = [ Column(f"col_0_{collection_name}", "VARCHAR"), Column(f"col_1_{collection_name}", "VARCHAR"), ] engine.init_vectorstore_table( table_name="destination_table", vector_size=VECTOR_SIZE, metadata_columns=metadata_columns, id_column=Column("langchain_id", "VARCHAR"), ) ``` ```python VECTOR_SIZE = 768 for collection_name in all_collection_names: engine.init_vectorstore_table( table_name=collection_name, vector_size=VECTOR_SIZE, ) ``` -------------------------------- ### Initialize and Use PGVectorStore for Document Search (Python) Source: https://github.com/langchain-ai/langchain-postgres/blob/main/README.md Demonstrates how to initialize a PGVectorStore with a PostgreSQL engine, add documents, and perform similarity searches. It requires langchain-core and langchain-postgres packages, and a PostgreSQL database connection. The example uses DeterministicFakeEmbedding for demonstration purposes. ```python from langchain_core.documents import Document from langchain_core.embeddings import DeterministicFakeEmbedding from langchain_postgres import PGEngine, PGVectorStore # Replace the connection string with your own Postgres connection string CONNECTION_STRING = "postgresql+psycopg://langchain:langchain@localhost:6024/langchain" engine = PGEngine.from_connection_string(url=CONNECTION_STRING) # Replace the vector size with your own vector size VECTOR_SIZE = 768 embedding = DeterministicFakeEmbedding(size=VECTOR_SIZE) TABLE_NAME = "my_doc_collection" engine.init_vectorstore_table( table_name=TABLE_NAME, vector_size=VECTOR_SIZE, ) store = PGVectorStore.create_sync( engine=engine, table_name=TABLE_NAME, embedding_service=embedding, ) docs = [ Document(page_content="Apples and oranges"), Document(page_content="Cars and airplanes"), Document(page_content="Train") ] store.add_documents(docs) query = "I'd like a fruit." docs = store.similarity_search(query) print(docs) ``` -------------------------------- ### Configure PostgreSQL connection parameters Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore_how_to.ipynb Sets the connection parameters for a PostgreSQL instance, including user, password, host, port, and database name. These values are used to construct the connection string for the PGEngine. ```python # @title Set your values or use the defaults to connect to Docker { display-mode: "form" } POSTGRES_USER = "langchain" # @param {type: "string"} POSTGRES_PASSWORD = "langchain" # @param {type: "string"} POSTGRES_HOST = "localhost" # @param {type: "string"} POSTGRES_PORT = "6024" # @param {type: "string"} POSTGRES_DB = "langchain" # @param {type: "string"} TABLE_NAME = "vectorstore" # @param {type: "string"} VECTOR_SIZE = 1024 # @param {type: "int"} ``` -------------------------------- ### Install LangChain-Postgres dependencies Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/migrate_pgvector_to_pgvectorstore.ipynb Installs the necessary Python packages including langchain-postgres, langchain-core, and SQLAlchemy to enable vector store functionality. ```python %pip install --upgrade --quiet langchain-postgres langchain-core SQLAlchemy ``` -------------------------------- ### Initialize a default PGVectorStore Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore_how_to.ipynb Initializes the `PGVectorStore` using the provided `PGEngine`, table name, and embedding service. This uses the default table schema for storing vectors. ```python from langchain_postgres import PGVectorStore store = await PGVectorStore.create( engine=pg_engine, table_name=TABLE_NAME, # schema_name=SCHEMA_NAME, embedding_service=embedding, ) ``` -------------------------------- ### Install langchain-openai Package for Embeddings Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore.ipynb Installs or upgrades the `langchain-openai` package, which is required to use OpenAI's embedding models with LangChain. This allows for the conversion of text into vector embeddings. ```python %pip install --upgrade --quiet langchain-openai ``` -------------------------------- ### Retrieve and delete documents Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore_how_to.ipynb Methods for fetching documents via filters and removing them using specific IDs or complex metadata criteria. ```python documents_with_apple = await store.aget(where_document={"$ilike": "%apple%"}, include="documents") paginated_ids = await store.aget(limit=3, offset=3) await store.adelete([ids[1]]) await store.adelete(filter={"source": "documentation"}) await store.adelete(filter={"$and": [{"category": "obsolete"}, {"year": {"$lt": 2020}}]}) await store.adelete(ids=["id1", "id2"], filter={"status": "archived"}) ``` -------------------------------- ### Initialize a PGVectorStore table Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore_how_to.ipynb Initializes a table in the PostgreSQL database with the necessary schema for storing vectors. This method is called on the `PGEngine` object and requires the table name and vector size. ```python await pg_engine.ainit_vectorstore_table( table_name=TABLE_NAME, vector_size=VECTOR_SIZE, ) ``` -------------------------------- ### Add Documents and Perform Hybrid Search in Langchain-Postgres Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore_how_to.ipynb Demonstrates adding documents to a hybrid search vector store and performing a similarity search. Assumes `vs_hybrid` and `docs` are pre-defined. The `aadd_documents` method adds data, and `asimilarity_search` retrieves relevant documents. ```python await vs_hybrid.aadd_documents(docs) # Use hybrid search hybrid_docs = await vs_hybrid.asimilarity_search("products", k=5) print(hybrid_docs) ``` -------------------------------- ### Initialize a PGVectorStore table with a custom schema Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore_how_to.ipynb Initializes a table for the PGVectorStore, allowing specification of a custom schema name in addition to the table name and vector size. Defaults to the 'public' schema if not provided. ```python SCHEMA_NAME="my_schema" await pg_engine.ainit_vectorstore_table( table_name=TABLE_NAME, vector_size=768, schema_name=SCHEMA_NAME, # Default: "public" ) ``` -------------------------------- ### Initialize a Cohere Embeddings instance Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore_how_to.ipynb Creates an instance of the `CohereEmbeddings` class, which is used for generating vector embeddings. This requires the `langchain-cohere` library and specifies the embedding model to use. ```python from langchain_cohere import CohereEmbeddings embedding = CohereEmbeddings(model="embed-english-v3.0") ``` -------------------------------- ### Apply Vector Index to Vectorstore Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore.ipynb Applies a vector index to the vectorstore to speed up search queries. This example uses the IVFFlatIndex, a common indexing strategy for approximate nearest neighbor search. ```python from langchain_postgres.v2.indexes import IVFFlatIndex index = IVFFlatIndex() # Add an index using a default index name await vectorstore.aapply_vector_index(index) ``` -------------------------------- ### Initialize PGVectorStore with Hybrid Search Config for Existing Table Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore_how_to.ipynb Initializes a `PGVectorStore` instance for an existing table, enabling hybrid search even without a pre-existing `tsv_column`. It uses `HybridSearchConfig` to define language, fusion function, and parameters. The `create` method connects to the database and configures columns for vector storage. ```python from langchain_postgres import PGVectorStore from langchain_postgres.vectorstore import HybridSearchConfig from langchain_core.runnables.base import reciprocal_rank_fusion # Set the existing table name TABLE_NAME = "products" # SCHEMA_NAME = "my_schema" hybrid_search_config = HybridSearchConfig( tsv_lang="pg_catalog.english", fusion_function=reciprocal_rank_fusion, fusion_function_parameters={ "rrf_k": 60, "fetch_top_k": 10, }, ) # Initialize PGVectorStore with the hybrid search config custom_hybrid_store = await PGVectorStore.create( pg_engine, table_name=TABLE_NAME, # schema_name=SCHEMA_NAME, embedding_service=embedding, # Connect to existing VectorStore by customizing below column names id_column="product_id", content_column="description", embedding_column="embed", metadata_columns=["name", "category", "price_usd", "quantity", "sku", "image_url"], metadata_json_column="metadata", hybrid_search_config=hybrid_search_config, ) # Use hybrid search hybrid_docs = await custom_hybrid_store.asimilarity_search("products", k=5) print(hybrid_docs) ``` -------------------------------- ### Run PostgreSQL with pgvector Extension using Docker Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore.ipynb This command starts a PostgreSQL database instance with the pgvector extension enabled, suitable for local development with LangChain. It maps the container's port 5432 to the host's port 6024 and sets up basic user and database credentials. ```shell docker run --name pgvector-container -e POSTGRES_USER=langchain -e POSTGRES_PASSWORD=langchain -e POSTGRES_DB=langchain -p 6024:5432 -d pgvector/pgvector:pg16 ``` -------------------------------- ### Run Test Cases with Pytest (Shell) Source: https://github.com/langchain-ai/langchain-postgres/blob/main/DEVELOPMENT.md This snippet sets the `POSTGRES_PORT` environment variable to 6024 and then executes the project's test suite using `pytest`. The `-vvv` flag provides verbose output during test execution. ```shell export POSTGRES_PORT=6024 pytest -vvv ``` -------------------------------- ### Configure Hybrid Search with HybridSearchConfig Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore_how_to.ipynb Defines the configuration for hybrid search, specifying the TSV column, language, and the fusion function used to combine semantic and keyword search results. ```python from langchain_postgres.v2.hybrid_search_config import ( HybridSearchConfig, reciprocal_rank_fusion, ) hybrid_search_config = HybridSearchConfig( tsv_column="hybrid_description", tsv_lang="pg_catalog.english", fusion_function=reciprocal_rank_fusion, fusion_function_parameters={ "rrf_k": 60, "fetch_top_k": 10, }, ) ``` -------------------------------- ### Apply Hybrid Search to Specific Queries in Langchain-Postgres Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore_how_to.ipynb Shows how to apply hybrid search to individual queries by passing the `hybrid_search_config` directly to the `asimilarity_search` method, rather than during vector store initialization. This allows for selective use of hybrid search. ```python # Use hybrid search hybrid_docs = await custom_store.asimilarity_search( "products", k=5, hybrid_search_config=hybrid_search_config ) print(hybrid_docs) ``` -------------------------------- ### Create PGEngine from Connection String Source: https://context7.com/langchain-ai/langchain-postgres/llms.txt Initializes a database connection engine using a PostgreSQL connection string. Supports both asyncpg (recommended for async) and psycopg3 drivers. Ensure to close the engine when done. ```python from langchain_postgres import PGEngine # Using asyncpg driver (recommended for async operations) CONNECTION_STRING = "postgresql+asyncpg://langchain:langchain@localhost:6024/langchain" engine = PGEngine.from_connection_string(url=CONNECTION_STRING) # Using psycopg3 driver (for sync/async hybrid operations) CONNECTION_STRING = "postgresql+psycopg://langchain:langchain@localhost:6024/langchain" engine = PGEngine.from_connection_string(url=CONNECTION_STRING) # Close the engine when done await engine.close() ``` -------------------------------- ### Initialize PGVectorStore and Perform Basic Similarity Search Source: https://context7.com/langchain-ai/langchain-postgres/llms.txt Initializes the PGVectorStore with an engine, table name, embedding service, and metadata columns. Then, performs an asynchronous similarity search for documents related to a query. ```python from langchain_postgres import PGEngine, PGVectorStore from langchain_openai import OpenAIEmbeddings engine = PGEngine.from_connection_string( url="postgresql+asyncpg://langchain:langchain@localhost:6024/langchain" ) embedding = OpenAIEmbeddings(model="text-embedding-3-small") vectorstore = await PGVectorStore.create( engine=engine, table_name="products", embedding_service=embedding, metadata_columns=["category", "price"], ) # Basic similarity search results = await vectorstore.asimilarity_search( query="I want something healthy to eat", k=5, # Number of results to return ) for doc in results: print(f"Content: {doc.page_content}") print(f"Metadata: {doc.metadata} ") ``` -------------------------------- ### Create PG Engine Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/migrate_pgvector_to_pgvectorstore.md Initializes a PostgreSQL engine for interacting with your database. Requires a connection string. All synchronous methods have asynchronous counterparts. ```python from langchain_postgres import PGEngine # Replace these variable values engine = PGEngine.from_connection_string(url=CONNECTION_STRING) ``` -------------------------------- ### Initialize PGVectorStore and Perform MMR Search Source: https://context7.com/langchain-ai/langchain-postgres/llms.txt Initializes the PGVectorStore and performs an asynchronous Maximal Marginal Relevance (MMR) search to retrieve diverse yet relevant documents. ```python from langchain_postgres import PGEngine, PGVectorStore from langchain_openai import OpenAIEmbeddings engine = PGEngine.from_connection_string( url="postgresql+asyncpg://langchain:langchain@localhost:6024/langchain" ) embedding = OpenAIEmbeddings(model="text-embedding-3-small") vectorstore = await PGVectorStore.create( engine=engine, table_name="articles", embedding_service=embedding, ) # MMR search for diverse results results = await vectorstore.amax_marginal_relevance_search( query="machine learning applications", k=5, # Number of documents to return fetch_k=20, # Number of documents to fetch before filtering lambda_mult=0.5, # 0=max diversity, 1=max relevance ) for doc in results: print(f"Content: {doc.page_content[:100]}...") print(f"Metadata: {doc.metadata} ") ``` -------------------------------- ### Initialize PGVectorStore and Perform Similarity Search with Scores Source: https://context7.com/langchain-ai/langchain-postgres/llms.txt Initializes the PGVectorStore and performs an asynchronous similarity search, returning both the documents and their corresponding similarity scores. ```python from langchain_postgres import PGEngine, PGVectorStore from langchain_openai import OpenAIEmbeddings engine = PGEngine.from_connection_string( url="postgresql+asyncpg://langchain:langchain@localhost:6024/langchain" ) embedding = OpenAIEmbeddings(model="text-embedding-3-small") vectorstore = await PGVectorStore.create( engine=engine, table_name="products", embedding_service=embedding, ) # Search with scores results = await vectorstore.asimilarity_search_with_score( query="organic vegetables", k=5, ) for doc, score in results: print(f"Score: {score:.4f}") print(f"Content: {doc.page_content}") print(f"ID: {doc.id} ") ``` -------------------------------- ### Create and Apply Vector Indexes in PostgreSQL Source: https://context7.com/langchain-ai/langchain-postgres/llms.txt Demonstrates creating HNSW and IVFFlat vector indexes in a PostgreSQL database using Langchain. HNSW is recommended for most use cases, while IVFFlat is better for very large datasets. Ensure the PGEngine and PGVectorStore are properly initialized with your database connection and embedding service. ```python from langchain_postgres import PGEngine, PGVectorStore from langchain_postgres.v2.indexes import ( HNSWIndex, HNSWQueryOptions, IVFFlatIndex, IVFFlatQueryOptions, DistanceStrategy, ) from langchain_openai import OpenAIEmbeddings engine = PGEngine.from_connection_string( url="postgresql+asyncpg://langchain:langchain@localhost:6024/langchain" ) embedding = OpenAIEmbeddings(model="text-embedding-3-small") vectorstore = await PGVectorStore.create( engine=engine, table_name="indexed_docs", embedding_service=embedding, ) # Apply HNSW index (recommended for most use cases) hnsw_index = HNSWIndex( name="my_hnsw_index", m=16, # Max connections per layer ef_construction=64, # Size of dynamic candidate list during construction distance_strategy=DistanceStrategy.COSINE_DISTANCE, ) await vectorstore.aapply_vector_index(hnsw_index) # Apply IVFFlat index (better for very large datasets) ivfflat_index = IVFFlatIndex( name="my_ivfflat_index", lists=100, # Number of clusters distance_strategy=DistanceStrategy.COSINE_DISTANCE, ) await vectorstore.aapply_vector_index(ivfflat_index) # Create vectorstore with query options for index vectorstore_with_options = await PGVectorStore.create( engine=engine, table_name="indexed_docs", embedding_service=embedding, index_query_options=HNSWQueryOptions(ef_search=100), ) # Check if index exists is_valid = await vectorstore.is_valid_index("my_hnsw_index") print(f"Index valid: {is_valid}") # Re-index after adding many documents await vectorstore.areindex(index_name="my_hnsw_index") # Drop index await vectorstore.adrop_vector_index(index_name="my_hnsw_index") ``` -------------------------------- ### Create PGVectorStore Instance Source: https://context7.com/langchain-ai/langchain-postgres/llms.txt Factory methods to create a PGVectorStore instance. Use `create` for asynchronous contexts and `create_sync` for synchronous code. Requires an engine, table name, and embedding service. ```python from langchain_postgres import PGEngine, PGVectorStore from langchain_openai import OpenAIEmbeddings CONNECTION_STRING = "postgresql+asyncpg://langchain:langchain@localhost:6024/langchain" engine = PGEngine.from_connection_string(url=CONNECTION_STRING) embedding = OpenAIEmbeddings(model="text-embedding-3-small") # Async creation vectorstore = await PGVectorStore.create( engine=engine, table_name="my_documents", embedding_service=embedding, metadata_columns=["category", "source"], # Columns to use for filtering ) # Sync creation vectorstore = PGVectorStore.create_sync( engine=engine, table_name="my_documents", embedding_service=embedding, ) ``` -------------------------------- ### Drop Vector Store Table in Langchain-Postgres Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore_how_to.ipynb Provides a code snippet to drop a vector store table from the PostgreSQL database. This action is irreversible and should be used with caution. It requires the table name to be specified. ```python # Drop the vector store table. await pg_engine.adrop_table(TABLE_NAME) ``` -------------------------------- ### PGEngine.init_vectorstore_table Source: https://context7.com/langchain-ai/langchain-postgres/llms.txt Creates a table for storing document embeddings with customizable schema. Supports async operations. ```APIDOC ## PGEngine.init_vectorstore_table ### Description Creates a table for storing document embeddings with customizable schema including metadata columns, vector dimensions, and optional hybrid search support. The table schema determines how documents and their embeddings are stored. ### Method `PGEngine.init_vectorstore_table` (sync) or `PGEngine.ainit_vectorstore_table` (async) ### Parameters #### Path Parameters - **table_name** (string) - Required - The name of the table to create. - **vector_size** (integer) - Required - The dimension of the vectors to be stored. - **schema_name** (string) - Optional - The schema for the table. Defaults to 'public'. - **content_column** (string) - Optional - The name of the content column. Defaults to 'document'. - **embedding_column** (string) - Optional - The name of the embedding column. Defaults to 'embedding'. - **metadata_columns** (list of Column objects) - Optional - List of custom metadata columns. - **id_column** (Column object) - Optional - Custom ID column definition. - **overwrite_existing** (boolean) - Optional - If true, drops and recreates the table. Defaults to false. ### Request Example ```python from langchain_postgres import PGEngine, Column CONNECTION_STRING = "postgresql+asyncpg://langchain:langchain@localhost:6024/langchain" engine = PGEngine.from_connection_string(url=CONNECTION_STRING) # Basic table creation engine.init_vectorstore_table( table_name="my_documents", vector_size=1536, # Must match your embedding model's dimension ) # Advanced table with custom metadata columns engine.init_vectorstore_table( table_name="products", vector_size=768, schema_name="public", content_column="content", embedding_column="embedding", metadata_columns=[ Column("category", "TEXT"), Column("price", "FLOAT"), Column("in_stock", "BOOLEAN"), ], id_column=Column("product_id", "VARCHAR(50)", nullable=False), overwrite_existing=False, # Set True to drop and recreate ) # Async version await engine.ainit_vectorstore_table( table_name="async_documents", vector_size=1536, ) ``` ### Response #### Success Response (200) - **None** - Indicates the table creation was successful. #### Response Example ```json { "status": "Table created successfully" } ``` ``` -------------------------------- ### Create Hybrid Search Index in Langchain-Postgres Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore_how_to.ipynb Demonstrates how to create a hybrid search index for a Postgres table that already has a `tsv_column`. The `aapply_hybrid_search_index` method is called on the vector store object to generate the index. ```python await vs_hybrid.aapply_hybrid_search_index() ``` -------------------------------- ### Initialize PGEngine connection pool Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/migrate_pgvector_to_pgvectorstore.ipynb Demonstrates how to create a PGEngine object using either a connection string or an existing SQLAlchemy AsyncEngine to manage database connection pooling. ```python from langchain_postgres import PGEngine from sqlalchemy.ext.asyncio import create_async_engine CONNECTION_STRING = f"postgresql+asyncpg://{POSTGRES_USER}:{POSTGRES_PASSWORD}@{POSTGRES_HOST}:{POSTGRES_PORT}/{POSTGRES_DB}" # Method 1: From connection string engine = PGEngine.from_connection_string(url=CONNECTION_STRING) # Method 2: From existing SQLAlchemy engine pool = create_async_engine(CONNECTION_STRING) engine = PGEngine.from_engine(engine=pool) ``` -------------------------------- ### Get Collection from Vectorstore with Filters Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore.ipynb Retrieves documents from the vectorstore, supporting filtering by document content and pagination parameters like limit and offset. It returns a dictionary containing documents or IDs based on the 'include' parameter. ```python documents_with_apple = await vectorstore.aget( where_document={"$ilike": "%apple%"}, include="documents" ) paginated_ids = await vectorstore.aget(limit=3, offset=3) print(documents_with_apple["documents"]) print(paginated_ids["ids"]) ``` -------------------------------- ### Similarity Search with IN Filter Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/pg_vectorstore.ipynb Conducts a similarity search using the '$in' operator to filter documents based on a metadata field matching any value within a specified list. This example searches for documents where the 'topic' is either 'food' or 'animals'. ```python await vectorstore.asimilarity_search( "apple", filter={"topic": {"$in": ["food", "animals"]}} ) ``` -------------------------------- ### Configure Hybrid Search in PGVectorStore Source: https://context7.com/langchain-ai/langchain-postgres/llms.txt Sets up PGVectorStore for hybrid search, combining vector similarity with full-text search using RRF or weighted sum. Requires configuring TSVector columns and fusion algorithms. ```python from langchain_postgres import PGEngine, PGVectorStore from langchain_postgres.v2.hybrid_search_config import ( HybridSearchConfig, reciprocal_rank_fusion, weighted_sum_ranking, ) from langchain_openai import OpenAIEmbeddings engine = PGEngine.from_connection_string( url="postgresql+asyncpg://langchain:langchain@localhost:6024/langchain" ) # Create table with hybrid search support hybrid_config = HybridSearchConfig( tsv_column="content_tsv", # TSVector column for full-text search tsv_lang="pg_catalog.english", fusion_function=reciprocal_rank_fusion, primary_top_k=10, # Vector search results secondary_top_k=10, # Keyword search results ) await engine.ainit_vectorstore_table( table_name="hybrid_docs", vector_size=1536, hybrid_search_config=hybrid_config, ) embedding = OpenAIEmbeddings(model="text-embedding-3-small") # Create vectorstore with hybrid search enabled vectorstore = await PGVectorStore.create( engine=engine, table_name="hybrid_docs", embedding_service=embedding, hybrid_search_config=HybridSearchConfig( fusion_function=reciprocal_rank_fusion, fusion_function_parameters={"rrf_k": 60}, ), ) # Hybrid search automatically uses both vector and keyword matching results = await vectorstore.asimilarity_search( query="machine learning neural networks", k=5, ) # Using weighted sum ranking instead vectorstore_weighted = await PGVectorStore.create( engine=engine, table_name="hybrid_docs", embedding_service=embedding, hybrid_search_config=HybridSearchConfig( fusion_function=weighted_sum_ranking, fusion_function_parameters={ "primary_results_weight": 0.7, "secondary_results_weight": 0.3, }, ), ) ``` -------------------------------- ### PGVectorStore Document Addition Source: https://context7.com/langchain-ai/langchain-postgres/llms.txt Demonstrates how to add documents to the PGVectorStore, both asynchronously and synchronously. ```APIDOC ## PGVectorStore Document Addition This section covers the methods for adding documents to the PGVectorStore. ### Method `aadd_documents` (async), `add_documents` (sync) ### Description Adds a list of `Document` objects to the vector store. The `id` field of each `Document` is used as the primary key. If `id` is not provided, a UUID will be generated. ### Request Body Example (for `aadd_documents`) ```json [ { "id": "some-uuid-1", "page_content": "Fresh organic apples from local farms", "metadata": {"category": "fruits", "price": 3.99, "source": "farm_direct"} }, { "id": "some-uuid-2", "page_content": "Premium grass-fed beef steaks", "metadata": {"category": "meat", "price": 24.99, "source": "butcher"} } ] ``` ### Response Example (for `aadd_documents`) ```json ["some-uuid-1", "some-uuid-2"] ``` ``` -------------------------------- ### Add and Search Documents in Postgres Vectorstore Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/vectorstore.ipynb This snippet demonstrates how to initialize a list of Document objects with metadata, insert them into the vectorstore using specific IDs, and perform a similarity search. Note that providing IDs during insertion will overwrite existing documents with matching IDs. ```python docs = [ Document( page_content="there are cats in the pond", metadata={"id": 1, "location": "pond", "topic": "animals"}, ), Document( page_content="ducks are also found in the pond", metadata={"id": 2, "location": "pond", "topic": "animals"}, ), Document( page_content="fresh apples are available at the market", metadata={"id": 3, "location": "market", "topic": "food"}, ), Document( page_content="the market also sells fresh oranges", metadata={"id": 4, "location": "market", "topic": "food"}, ), Document( page_content="the new art exhibit is fascinating", metadata={"id": 5, "location": "museum", "topic": "art"}, ), Document( page_content="a sculpture exhibit is also at the museum", metadata={"id": 6, "location": "museum", "topic": "art"}, ), Document( page_content="a new coffee shop opened on Main Street", metadata={"id": 7, "location": "Main Street", "topic": "food"}, ), Document( page_content="the book club meets at the library", metadata={"id": 8, "location": "library", "topic": "reading"}, ), Document( page_content="the library hosts a weekly story time for kids", metadata={"id": 9, "location": "library", "topic": "reading"}, ), Document( page_content="a cooking class for beginners is offered at the community center", metadata={"id": 10, "location": "community center", "topic": "classes"}, ), ] vectorstore.add_documents(docs, ids=[doc.metadata["id"] for doc in docs]) vectorstore.similarity_search("kitty", k=10) ``` -------------------------------- ### Create PGVectorStore Object Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/migrate_pgvector_to_pgvectorstore.md Initializes a PGVectorStore object to interact with the newly created table. It uses a fake embedding service for initialization, as actual embeddings are copied from the PGVector table. ```python from langchain_postgres import PGVectorStore from langchain_core.embeddings import FakeEmbeddings destination_vector_store = PGVectorStore.create_sync( engine, embedding_service=FakeEmbeddings(size=VECTOR_SIZE), table_name="destination_table", ) ``` ```python from langchain_postgres import PGVectorStore from langchain_core.embeddings import FakeEmbeddings destination_vector_store = PGVectorStore.create_sync( engine, embedding_service=FakeEmbeddings(size=VECTOR_SIZE), table_name="destination_table", metadata_columns=[col.name for col in metadata_columns], id_column="langchain_id", ) ``` -------------------------------- ### Create PGVectorStore Instance Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/migrate_pgvector_to_pgvectorstore.md Instantiates a PGVectorStore object to interface with existing table data. It requires an embedding service and mapping of database columns to the vector store schema. ```python from langchain_postgres import PGVectorStore from langchain_core.embeddings import FakeEmbeddings vector_store = PGVectorStore.create_sync( engine=engine, table_name="langchain_pg_embedding", embedding_service=FakeEmbeddings(size=VECTOR_SIZE), content_column="document", metadata_json_column="cmetadata", metadata_columns=["collection_id"], id_column="id", ) ``` -------------------------------- ### Persist Chat History in PostgreSQL (Python) Source: https://github.com/langchain-ai/langchain-postgres/blob/main/README.md Illustrates how to use PostgresChatMessageHistory to store and retrieve chat messages in a PostgreSQL database. It requires the psycopg library and a PostgreSQL connection. The example shows table creation, session initialization, and adding messages. ```python import uuid from langchain_core.messages import SystemMessage, AIMessage, HumanMessage from langchain_postgres import PostgresChatMessageHistory import psycopg # Establish a synchronous connection to the database # (or use psycopg.AsyncConnection for async) # Replace with your actual connection info conn_info = "dbname=langchain user=langchain password=langchain host=localhost port=5432" sync_connection = psycopg.connect(conn_info) # Create the table schema (only needs to be done once) table_name = "chat_history" PostgresChatMessageHistory.create_tables(sync_connection, table_name) session_id = str(uuid.uuid4()) # Initialize the chat history manager chat_history = PostgresChatMessageHistory( table_name, session_id, sync_connection=sync_connection ) # Example of adding messages chat_history.add_message(SystemMessage("You are a helpful assistant.")) chat_history.add_message(HumanMessage("Hi there!")) chat_history.add_message(AIMessage("Hello! How can I help you today?")) # Example of retrieving messages retrieved_messages = chat_history.messages print(retrieved_messages) # Close the connection when done sync_connection.close() ``` -------------------------------- ### PGVectorStore.create / PGVectorStore.create_sync Source: https://context7.com/langchain-ai/langchain-postgres/llms.txt Factory methods to create a vector store instance connected to an existing table. Supports async and sync operations. ```APIDOC ## PGVectorStore.create / PGVectorStore.create_sync ### Description Factory methods to create a vector store instance connected to an existing table. Use `create` for async contexts and `create_sync` for synchronous code. The vector store provides methods for adding, searching, and deleting documents. ### Method `PGVectorStore.create` (async) or `PGVectorStore.create_sync` (sync) ### Parameters #### Path Parameters - **engine** (PGEngine) - Required - The initialized PGEngine instance. - **table_name** (string) - Required - The name of the table containing the vector data. - **embedding_service** (Embeddings) - Required - The embedding model to use for generating embeddings. - **metadata_columns** (list of strings) - Optional - List of metadata column names to use for filtering. ### Request Example ```python from langchain_postgres import PGEngine, PGVectorStore from langchain_openai import OpenAIEmbeddings CONNECTION_STRING = "postgresql+asyncpg://langchain:langchain@localhost:6024/langchain" engine = PGEngine.from_connection_string(url=CONNECTION_STRING) embedding = OpenAIEmbeddings(model="text-embedding-3-small") # Async creation vectorstore = await PGVectorStore.create( engine=engine, table_name="my_documents", embedding_service=embedding, metadata_columns=["category", "source"], # Columns to use for filtering ) # Sync creation vectorstore = PGVectorStore.create_sync( engine=engine, table_name="my_documents", embedding_service=embedding, ) ``` ### Response #### Success Response (200) - **vectorstore** (PGVectorStore) - An initialized PGVectorStore instance. #### Response Example ```json { "vectorstore": "" } ``` ``` -------------------------------- ### Create Documents with Metadata Source: https://context7.com/langchain-ai/langchain-postgres/llms.txt Defines a list of Document objects, each with unique IDs, page content, and associated metadata. This is a prerequisite for adding documents to the vector store. ```python docs = [ Document( id=str(uuid.uuid4()), page_content="Fresh organic apples from local farms", metadata={"category": "fruits", "price": 3.99, "source": "farm_direct"}, ), Document( id=str(uuid.uuid4()), page_content="Premium grass-fed beef steaks", metadata={"category": "meat", "price": 24.99, "source": "butcher"}, ), Document( id=str(uuid.uuid4()), page_content="Artisan sourdough bread baked daily", metadata={"category": "bakery", "price": 6.50, "source": "local_bakery"}, ), ] ``` -------------------------------- ### PGEngine.from_connection_string Source: https://context7.com/langchain-ai/langchain-postgres/llms.txt Creates a database connection engine from a PostgreSQL connection string. Supports both asyncpg and psycopg3 drivers. ```APIDOC ## PGEngine.from_connection_string ### Description Creates a database connection engine from a PostgreSQL connection string. This is the entry point for all database operations and manages an async connection pool that can be shared across multiple vector stores. ### Method `PGEngine.from_connection_string` ### Parameters #### Query Parameters - **url** (string) - Required - The PostgreSQL connection URL. ### Request Example ```python from langchain_postgres import PGEngine # Using asyncpg driver (recommended for async operations) CONNECTION_STRING = "postgresql+asyncpg://langchain:langchain@localhost:6024/langchain" engine = PGEngine.from_connection_string(url=CONNECTION_STRING) # Using psycopg3 driver (for sync/async hybrid operations) CONNECTION_STRING = "postgresql+psycopg://langchain:langchain@localhost:6024/langchain" engine = PGEngine.from_connection_string(url=CONNECTION_STRING) # Close the engine when done await engine.close() ``` ### Response #### Success Response (200) - **engine** (PGEngine) - An initialized PGEngine instance. #### Response Example ```json { "engine": "" } ``` ``` -------------------------------- ### Initialize Vectorstore Table Source: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/migrate_pgvector_to_pgvectorstore.md Creates a new table in PostgreSQL to store vector data. You can specify the vector size and optionally define custom metadata and ID columns for filtering and custom identifiers. ```python # Vertex AI embeddings uses a vector size of 768. # Adjust this according to your embeddings service. VECTOR_SIZE = 768 engine.init_vectorstore_table( table_name="destination_table", vector_size=VECTOR_SIZE, ) ``` ```python from sqlalchemy import Column metadata_columns = [ Column(f"col_0_{collection_name}", "VARCHAR"), Column(f"col_1_{collection_name}", "VARCHAR"), ] engine.init_vectorstore_table( table_name="destination_table", vector_size=VECTOR_SIZE, metadata_columns=metadata_columns, id_column=Column("langchain_id", "VARCHAR"), ) ```