### Install Qdrant Client

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/customize/retrievers/external/qdrant/README.md

Install the necessary Qdrant client library using pip.

```bash
pip install qdrant-client
```

--------------------------------

### Install Weaviate Client

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/customize/retrievers/external/weaviate/README.md

Install the necessary Weaviate client package to interact with the Weaviate database.

```bash
pip install weaviate-client
```

--------------------------------

### Install Dependencies with uv

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/index.rst

Use this command to install all necessary dependencies for the project, including extras.

```bash
uv sync --all-extras
```

--------------------------------

### Base Configuration File Example (JSON)

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_kg_builder.rst

An example of a base configuration file in JSON format for setting up the pipeline runner.

```json
{
    "version_": 1,
    "template_": "SimpleKGPipeline",
    "neo4j_config": {},
    "llm_config": {},
    "embedder_config": {}
}
```

--------------------------------

### Base Configuration File Example (YAML)

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_kg_builder.rst

An example of a base configuration file in YAML format for setting up the pipeline runner.

```yaml
version_: 1
template_: SimpleKGPipeline
neo4j_config:
llm_config:
embedder_config:
```

--------------------------------

### Start Services Locally

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/customize/retrievers/external/qdrant/README.md

Run this command to spin up Neo4j and Qdrant containers for local development.
```bash
docker compose -f tests/e2e/docker-compose.yml up
```

--------------------------------

### Install Development Dependencies

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/CONTRIBUTING.md

Use uv to synchronize development dependencies. Ensure uv is installed first.

```shell
uv sync --group dev
```

--------------------------------

### Install with Optional Dependencies (OpenAI)

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/README.md

Install the package along with specific optional dependencies, such as OpenAI support, using this command.

```shell
pip install "neo4j-graphrag[openai]"
```

--------------------------------

### Install Project Dependencies with uv

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/README.md

Installs all project dependencies, including development dependencies, using the uv package manager. Ensure uv is installed on your system.

```bash
uv sync --group dev
```

--------------------------------

### LLM with System Instructions

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/README.md

This script demonstrates how to provide system-level instructions to guide the LLM's behavior.

```python
from neo4j_graphrag.llm import OpenAILLM

# Initialize the LLM (the OPENAI_API_KEY must be in the env vars)
llm = OpenAILLM(model_name="gpt-5")

# System-level instructions are passed alongside the prompt
system_instruction = "You are a helpful assistant that speaks like a pirate."

# Get a response following the system instructions
response = llm.invoke("Tell me about yourself.", system_instruction=system_instruction)
print(response.content)
```

--------------------------------

### Install Pinecone Client

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/customize/retrievers/external/pinecone/README.md

Install the pinecone-client package using pip to enable the Pinecone retriever functionality.
```bash
pip install pinecone-client
```

--------------------------------

### Start E2E Test Services with Docker Compose

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/README.md

Set up the necessary services (Neo4j, Weaviate) for end-to-end tests using Docker Compose. This command starts the default stack.

```bash
docker compose -f tests/e2e/docker-compose.yml up
```

--------------------------------

### Quickstart: GraphRAG Query

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_rag.rst

Connect to a Neo4j database, initialize a retriever and an LLM, and perform a GraphRAG query.

```python
from neo4j import GraphDatabase
from neo4j_graphrag.retrievers import VectorRetriever
from neo4j_graphrag.llm import OpenAILLM
from neo4j_graphrag.generation import GraphRAG
from neo4j_graphrag.embeddings import OpenAIEmbeddings

# 1. Neo4j driver
URI = "neo4j://localhost:7687"
AUTH = ("neo4j", "password")

INDEX_NAME = "index-name"

# Connect to Neo4j database
driver = GraphDatabase.driver(URI, auth=AUTH)

# 2. Retriever
# Create Embedder object, needed to convert the user question (text) to a vector
embedder = OpenAIEmbeddings(model="text-embedding-3-large")

# Initialize the retriever
retriever = VectorRetriever(driver, INDEX_NAME, embedder)

# 3. LLM
# Note: the OPENAI_API_KEY must be in the env vars
llm = OpenAILLM(model_name="gpt-5", model_params={"temperature": 0})

# Initialize the RAG pipeline
rag = GraphRAG(retriever=retriever, llm=llm)

# Query the graph
query_text = "How do I do similarity search in Neo4j?"
response = rag.search(query_text=query_text, retriever_config={"top_k": 5})
print(response.answer)
```

--------------------------------

### Install All Extra Packages

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/README.md

Install all extra packages required to run all tests. This command ensures all necessary dependencies are available for testing.
```bash
uv sync --all-extras
```

--------------------------------

### Install GraphRAG with Optional Dependencies

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/index.rst

Install extra dependencies for specific LLM providers or vector databases. For example, install with OpenAI support using: `pip install "neo4j-graphrag[openai]"`.

```bash
pip install "neo4j-graphrag[openai]"
```

--------------------------------

### Build KG Pipeline from PDF

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/README.md

End-to-end example of constructing a knowledge graph pipeline with explicit components using PDF input.

```python
from neo4j_graphrag.graphrag.pipeline.kg_builder import KGBuilder
from neo4j_graphrag.graphrag.pipeline.pipeline import Pipeline
from neo4j_graphrag.graphrag.pipeline.components.loaders import PDFLoader
from neo4j_graphrag.graphrag.pipeline.components.splitters import RecursiveCharacterTextSplitter
from neo4j_graphrag.graphrag.pipeline.components.schema_builders import AutomaticSchemaExtraction
from neo4j_graphrag.graphrag.pipeline.components.extractors import LLMEntityRelationExtractor
from neo4j_graphrag.graphrag.pipeline.components.writers import Neo4jWriter

# Define pipeline components
loader = PDFLoader(file_path="./sample.pdf")
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
schema_builder = AutomaticSchemaExtraction()
entity_extractor = LLMEntityRelationExtractor()
writer = Neo4jWriter()

# Create KGBuilder instance
kg_builder = KGBuilder(
    loader=loader,
    text_splitter=text_splitter,
    schema_builder=schema_builder,
    entity_extractor=entity_extractor,
    writer=writer,
)

# Create and run the pipeline
pipeline = Pipeline(steps=[kg_builder])
pipeline.run()

print("Knowledge graph built successfully from PDF!")
```

--------------------------------

### Initialize QdrantNeo4jRetriever

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_rag.rst

Configure the QdrantNeo4jRetriever for vector searches in Qdrant, mapping to Neo4j nodes. Requires Qdrant Python client installation.

```python
from qdrant_client import QdrantClient
from neo4j_graphrag.retrievers import QdrantNeo4jRetriever

client = QdrantClient(...)  # construct the Qdrant client instance

retriever = QdrantNeo4jRetriever(
    driver=driver,
    client=client,
    collection_name="my-collection",
    using="my-vector",
    id_property_external="neo4j_id",  # The payload field that contains identifier to a corresponding Neo4j node id property
    id_property_neo4j="id",
    embedder=embedder,
    node_label_neo4j="Document",  # optional
)
```

--------------------------------

### Run Unit Tests with uv

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/index.rst

Run only the unit tests for the project after installing dependencies.

```bash
uv run pytest tests/unit
```

--------------------------------

### VertexAI LLM Configuration

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_rag.rst

Instantiate the VertexAILLM class to use Google VertexAI. Ensure 'google-cloud-aiplatform' is installed.

```python
from neo4j_graphrag.llm import VertexAILLM
from vertexai.generative_models import GenerationConfig

generation_config = GenerationConfig(temperature=0.0)
llm = VertexAILLM(
    model_name="gemini-2.5-flash", generation_config=generation_config
)
llm.invoke("say something")
```

--------------------------------

### Build KG Pipeline from Text

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/README.md

End-to-end example of constructing a knowledge graph pipeline with explicit components using text input.
```python
from neo4j_graphrag.graphrag.pipeline.kg_builder import KGBuilder
from neo4j_graphrag.graphrag.pipeline.pipeline import Pipeline
from neo4j_graphrag.graphrag.pipeline.components.loaders import TextLoader
from neo4j_graphrag.graphrag.pipeline.components.splitters import RecursiveCharacterTextSplitter
from neo4j_graphrag.graphrag.pipeline.components.schema_builders import AutomaticSchemaExtraction
from neo4j_graphrag.graphrag.pipeline.components.extractors import LLMEntityRelationExtractor
from neo4j_graphrag.graphrag.pipeline.components.writers import Neo4jWriter

# Define pipeline components
loader = TextLoader(text="This is a sample document about Neo4j.")
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
schema_builder = AutomaticSchemaExtraction()
entity_extractor = LLMEntityRelationExtractor()
writer = Neo4jWriter()

# Create KGBuilder instance
kg_builder = KGBuilder(
    loader=loader,
    text_splitter=text_splitter,
    schema_builder=schema_builder,
    entity_extractor=entity_extractor,
    writer=writer,
)

# Create and run the pipeline
pipeline = Pipeline(steps=[kg_builder])
pipeline.run()

print("Knowledge graph built successfully!")
```

--------------------------------

### Install Latest Stable Version

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/README.md

Use this command to install the most recent stable release of the neo4j-graphrag package.

```shell
pip install neo4j-graphrag
```

--------------------------------

### Install Pre-commit Hook

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/CONTRIBUTING.md

Install the pre-commit hook to automatically check code formatting before each commit. This helps maintain code style consistency.

```shell
uv run pre-commit install
```

--------------------------------

### Initialize GraphRAG with ToolsRetriever

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_rag.rst

Initializes the GraphRAG pipeline using a ToolsRetriever.
Ensure necessary LLM provider packages are installed, e.g., `pip install "neo4j-graphrag[openai]"`.

```python
from neo4j_graphrag.generation import GraphRAG

# Assume llm and tools_retriever are already initialized

# Initialize GraphRAG pipeline with ToolsRetriever
rag = GraphRAG(retriever=tools_retriever, llm=llm)

# Query the pipeline - the LLM will automatically select appropriate tools
query_text = "What movies did Tom Hanks act in and what are their plots?"
response = rag.search(query_text=query_text, retriever_config={"top_k": 5})
print(response.answer)
```

--------------------------------

### Initialize PineconeNeo4jRetriever

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_rag.rst

Set up the PineconeNeo4jRetriever for vector searches using Pinecone, linking to Neo4j nodes. Requires Pinecone Python client installation.

```python
from pinecone import Pinecone
from neo4j_graphrag.retrievers import PineconeNeo4jRetriever

client = Pinecone()  # ... create your Pinecone client

retriever = PineconeNeo4jRetriever(
    driver=driver,
    client=client,
    index_name="Movies",
    id_property_neo4j="id",
    embedder=embedder,
    node_label_neo4j="Document",  # optional
)
```

--------------------------------

### Install Neo4j GraphRAG

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/index.rst

Install the latest stable version of the package using pip. It is recommended to use a virtual environment.

```bash
pip install neo4j-graphrag
```

--------------------------------

### Start E2E Test Services with Neo4j 2026 Docker Compose

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/README.md

Start the end-to-end test services using a specific Docker Compose file for Neo4j 2026. This is required for tests using the Cypher 25 SEARCH clause. Ensure the default stack is stopped first.
```bash
docker compose -f tests/e2e/docker-compose.neo4j2026.yml up -d
```

--------------------------------

### Process Multiple Documents for KG

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/README.md

Example demonstrating how to process multiple documents to build a knowledge graph, including entity resolution.

```python
from neo4j_graphrag.graphrag.pipeline.kg_builder import KGBuilder
from neo4j_graphrag.graphrag.pipeline.pipeline import Pipeline
from neo4j_graphrag.graphrag.pipeline.components.loaders import TextLoader
from neo4j_graphrag.graphrag.pipeline.components.splitters import RecursiveCharacterTextSplitter
from neo4j_graphrag.graphrag.pipeline.components.schema_builders import AutomaticSchemaExtraction
from neo4j_graphrag.graphrag.pipeline.components.extractors import LLMEntityRelationExtractor
from neo4j_graphrag.graphrag.pipeline.components.resolvers import FuzzyMatchEntityResolver
from neo4j_graphrag.graphrag.pipeline.components.writers import Neo4jWriter

# Define pipeline components
documents = {
    "doc1": "This document is about apples.",
    "doc2": "This document is about oranges."
}
loader = TextLoader(text=documents)
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
schema_builder = AutomaticSchemaExtraction()
entity_extractor = LLMEntityRelationExtractor()
entity_resolver = FuzzyMatchEntityResolver()
writer = Neo4jWriter()

# Create KGBuilder instance
kg_builder = KGBuilder(
    loader=loader,
    text_splitter=text_splitter,
    schema_builder=schema_builder,
    entity_extractor=entity_extractor,
    entity_resolver=entity_resolver,
    writer=writer,
)

# Create and run the pipeline
pipeline = Pipeline(steps=[kg_builder])
pipeline.run()

print("Knowledge graph built successfully from multiple documents!")
```

--------------------------------

### Cypher Template Retriever Tool Example

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/README.md

This script demonstrates how to use a Cypher template to create a tool for retrieval.

```python
from neo4j_graphrag.tools.cypher_template_to_tool import CypherTemplateToTool

# Define a Cypher template
cypher_template = "MATCH (n:Person {{name: $name}}) RETURN n"

# Create a tool from the Cypher template
tool = CypherTemplateToTool(cypher_template=cypher_template)

# Example usage of the tool (assuming you have a way to call it)
# result = tool.run(name="Alice")
# print(result)
```

--------------------------------

### LangChain Compatibility Example

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/README.md

Demonstrates compatibility with LangChain by using a LangChain component within the GraphRAG pipeline.

```python
from neo4j_graphrag.graphrag.answer.langchain_compatiblity import LangchainCompatibility

# Example usage (assuming LangchainCompatibility is a callable or class)
# lc_compat = LangchainCompatibility()
# result = lc_compat.run(...)
```

--------------------------------

### Initialize WeaviateNeo4jRetriever

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_rag.rst

Instantiate the WeaviateNeo4jRetriever to perform vector searches in Weaviate and map results to Neo4j nodes. Requires Weaviate Python client installation.

```python
from weaviate.connect.helpers import connect_to_local
from neo4j_graphrag.retrievers import WeaviateNeo4jRetriever

client = connect_to_local()

retriever = WeaviateNeo4jRetriever(
    driver=driver,
    client=client,
    embedder=embedder,
    collection="Movies",
    id_property_external="neo4j_id",
    id_property_neo4j="id",
    node_label_neo4j="Document",  # optional
)
```

--------------------------------

### Initialize Text2CypherRetriever

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_rag.rst

Connect to Neo4j, configure an LLM, and initialize the Text2CypherRetriever with an optional schema and examples. This retriever generates Cypher queries from natural language.

```python
from neo4j import GraphDatabase
from neo4j_graphrag.retrievers import Text2CypherRetriever
from neo4j_graphrag.llm import OpenAILLM

URI = "neo4j://localhost:7687"
AUTH = ("neo4j", "password")

# Connect to Neo4j database
driver = GraphDatabase.driver(URI, auth=AUTH)

# Create LLM object
llm = OpenAILLM(model_name="gpt-5")

# (Optional) Specify your own Neo4j schema
neo4j_schema = """
Node properties:
Person {name: STRING, born: INTEGER}
Movie {tagline: STRING, title: STRING, released: INTEGER}
Relationship properties:
ACTED_IN {roles: LIST}
REVIEWED {summary: STRING, rating: INTEGER}
The relationships:
(:Person)-[:ACTED_IN]->(:Movie)
(:Person)-[:DIRECTED]->(:Movie)
(:Person)-[:PRODUCED]->(:Movie)
(:Person)-[:WROTE]->(:Movie)
(:Person)-[:FOLLOWS]->(:Person)
(:Person)-[:REVIEWED]->(:Movie)
"""

# (Optional) Provide user input/query pairs for the LLM to use as examples
examples = [
    "USER INPUT: 'Which actors starred in the Matrix?' QUERY: MATCH (p:Person)-[:ACTED_IN]->(m:Movie) WHERE m.title = 'The Matrix' RETURN p.name"
]

# Initialize the retriever
retriever = Text2CypherRetriever(
    driver=driver,
    llm=llm,  # type: ignore
    neo4j_schema=neo4j_schema,
    examples=examples,
)

# Generate a Cypher query using the LLM, send it to the Neo4j database, and return the results
query_text = "Which movies did Hugo Weaving star in?"
print(retriever.search(query_text=query_text))
```

--------------------------------

### Implement a Custom DataLoader

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_kg_builder.rst

Example of creating a custom data loader by inheriting from the DataLoader interface and implementing the run method to return a LoadedDocument.

```python
from pathlib import Path
from typing import Dict, Optional

from neo4j_graphrag.experimental.components.data_loader import DataLoader
from neo4j_graphrag.experimental.components.types import DocumentInfo, LoadedDocument


class MyDataLoader(DataLoader):
    async def run(self, filepath: Path, metadata: Optional[Dict[str, str]] = None) -> LoadedDocument:
        # process file in `filepath`
        return LoadedDocument(
            text="text",
            document_info=DocumentInfo(
                path=str(filepath),
                metadata=metadata,
            )
        )
```

--------------------------------

### Using Mistral AI LLM

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_rag.rst

Instantiate the MistralAILLM class to use Mistral AI models. Ensure the `mistralai` package is installed. The API key can be provided directly or set via environment variables.
```python
from neo4j_graphrag.llm import MistralAILLM

llm = MistralAILLM(
    model_name="mistral-small-latest",
    api_key=api_key,  # can also set `MISTRAL_API_KEY` in env vars
)
llm.invoke("say something")
```

--------------------------------

### Export Lexical Graph to Another Pipeline

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/README.md

Example showing how to export lexical graph creation into a separate pipeline for further processing.

```python
from neo4j_graphrag.graphrag.pipeline.text_to_lexical_graph import TextToLexicalGraph
from neo4j_graphrag.graphrag.pipeline.pipeline import Pipeline
from neo4j_graphrag.graphrag.pipeline.components.loaders import TextLoader
from neo4j_graphrag.graphrag.pipeline.components.splitters import RecursiveCharacterTextSplitter
from neo4j_graphrag.graphrag.pipeline.components.writers import Neo4jWriter

# Define pipeline components
loader = TextLoader(text="This document contains information about Neo4j.")
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
writer = Neo4jWriter()

# Create TextToLexicalGraph instance
text_to_lexical_graph = TextToLexicalGraph(loader=loader, text_splitter=text_splitter, writer=writer)

# Create and run the pipeline
pipeline = Pipeline(steps=[text_to_lexical_graph])
pipeline.run()

print("Lexical graph created and exported successfully!")
```

--------------------------------

### Customize LLM Tool Selection with System Instruction

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_rag.rst

Provide a custom system instruction to guide the LLM's tool selection logic. This is useful for tailoring tool usage to specific domains or complex query types.

```python
custom_instruction = """
You are a specialized assistant for movie database queries.
Select tools based on query type: use vector_search for plot similarity,
cypher_search for specific actor/director queries, and both for complex requests.
"""

tools_retriever = ToolsRetriever(
    driver=driver,
    llm=llm,
    tools=[vector_tool, cypher_tool],
    system_instruction=custom_instruction,
)
```

--------------------------------

### Run Unit Tests

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/README.md

Install project dependencies and run unit tests locally. This command executes the pytest suite located in the tests/unit directory.

```bash
uv run pytest tests/unit
```

--------------------------------

### Anthropic LLM Configuration

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_rag.rst

Instantiate the AnthropicLLM class to use Anthropic models. Ensure 'anthropic' is installed and API key is provided.

```python
from neo4j_graphrag.llm import AnthropicLLM

llm = AnthropicLLM(
    model_name="claude-3-opus-20240229",
    model_params={"max_tokens": 1000},  # max_tokens must be specified
    api_key=api_key,  # can also set `ANTHROPIC_API_KEY` in env vars
)
llm.invoke("say something")
```

--------------------------------

### Connect Components in a Pipeline

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_pipeline.rst

Assemble components into a pipeline, connecting their inputs and outputs. This example connects two `ComponentAdd` instances, feeding the result of the first into the second.

```python
import asyncio
from neo4j_graphrag.experimental.pipeline import Pipeline

# ComponentAdd is the custom component from "Define a Custom Component for Pipeline"
pipe = Pipeline()
pipe.add_component(ComponentAdd(), "a")
pipe.add_component(ComponentAdd(), "b")

# Feed the result of "a" into the `number2` input of "b"
pipe.connect("a", "b", input_config={"number2": "a.result"})
asyncio.run(pipe.run({"a": {"number1": 10, "number2": 1}, "b": {"number1": 4}}))
```

--------------------------------

### Azure OpenAI LLM Configuration

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_rag.rst

Use Azure OpenAI by switching to the AzureOpenAILLM class. Ensure the 'openai' Python package is installed.
```python
from neo4j_graphrag.llm import AzureOpenAILLM

llm = AzureOpenAILLM(
    model_name="gpt-5",
    azure_endpoint="https://example-endpoint.openai.azure.com/",  # update with your endpoint
    api_version="2024-06-01",  # update appropriate version
    api_key="...",  # api_key is optional and can also be set with the AZURE_OPENAI_API_KEY env var
)
llm.invoke("say something")
```

--------------------------------

### Initialize SchemaBuilder

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_kg_builder.rst

Initializes the SchemaBuilder, which is used to define node and relationship types for grounding LLMs. This example shows the basic initialization.

```python
from neo4j_graphrag.experimental.components.schema import (
    SchemaBuilder,
    NodeType,
    PropertyType,
    RelationshipType,
)

schema_builder = SchemaBuilder()

await schema_builder.run(
    node_types=[
        NodeType(
            label="Person",
            properties=[
```

--------------------------------

### Build Knowledge Graph with SimpleKGPipeline

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/README.md

Constructs a knowledge graph from text using the SimpleKGPipeline. Requires APOC core library and OpenAI embeddings. Ensure 'openai' package is installed.
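
The schema handed to `SimpleKGPipeline` in this example is built from three plain Python collections (`node_types`, `relationship_types`, `patterns`). A quick consistency check before spending LLM calls can catch a misspelled label. The helper below is a minimal sketch; `check_patterns` is a hypothetical name and is not part of neo4j-graphrag.

```python
from typing import List, Sequence, Tuple

# A pattern is (source node type, relationship type, target node type)
Pattern = Tuple[str, str, str]


def check_patterns(
    node_types: Sequence[str],
    relationship_types: Sequence[str],
    patterns: Sequence[Pattern],
) -> List[str]:
    """Return human-readable problems; an empty list means the schema is consistent.

    Hypothetical helper for illustration only, not a library function.
    """
    problems: List[str] = []
    for source, relation, target in patterns:
        if source not in node_types:
            problems.append(f"unknown source node type: {source}")
        if relation not in relationship_types:
            problems.append(f"unknown relationship type: {relation}")
        if target not in node_types:
            problems.append(f"unknown target node type: {target}")
    return problems


# Same values as the SimpleKGPipeline example in this section
node_types = ["Person", "House", "Planet"]
relationship_types = ["PARENT_OF", "HEIR_OF", "RULES"]
patterns = [
    ("Person", "PARENT_OF", "Person"),
    ("Person", "HEIR_OF", "House"),
    ("House", "RULES", "Planet"),
]

problems = check_patterns(node_types, relationship_types, patterns)
print(problems or "schema is consistent")
```

Returning a list of problems rather than raising on the first one lets every typo be reported in a single pass.
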
```python
import asyncio

from neo4j import GraphDatabase
from neo4j_graphrag.embeddings import OpenAIEmbeddings
from neo4j_graphrag.experimental.pipeline.kg_builder import SimpleKGPipeline
from neo4j_graphrag.llm import OpenAILLM

NEO4J_URI = "neo4j://localhost:7687"
NEO4J_USERNAME = "neo4j"
NEO4J_PASSWORD = "password"

# Connect to the Neo4j database
driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USERNAME, NEO4J_PASSWORD))

# List the entities and relations the LLM should look for in the text
node_types = ["Person", "House", "Planet"]
relationship_types = ["PARENT_OF", "HEIR_OF", "RULES"]
patterns = [
    ("Person", "PARENT_OF", "Person"),
    ("Person", "HEIR_OF", "House"),
    ("House", "RULES", "Planet"),
]

# Create an Embedder object
embedder = OpenAIEmbeddings(model="text-embedding-3-large")

# Instantiate the LLM
llm = OpenAILLM(
    model_name="gpt-5",
    model_params={
        "max_tokens": 2000,
        "response_format": {"type": "json_object"},
        "temperature": 0,
    },
)

# Instantiate the SimpleKGPipeline
kg_builder = SimpleKGPipeline(
    llm=llm,
    driver=driver,
    embedder=embedder,
    schema={
        "node_types": node_types,
        "relationship_types": relationship_types,
        "patterns": patterns,
    },
    on_error="IGNORE",
    from_file=False,
)

# Run the pipeline on a piece of text
text = (
    "The son of Duke Leto Atreides and the Lady Jessica, Paul is the heir of House "
    "Atreides, an aristocratic family that rules the planet Caladan."
)
asyncio.run(kg_builder.run_async(text=text))
driver.close()
```

--------------------------------

### Configure Neo4j Driver in YAML Config

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_kg_builder.rst

Example of configuring Neo4j connection parameters (URI, user, password) within a YAML configuration file.
```yaml
neo4j_config:
  params_:
```

--------------------------------

### Using Cohere LLM

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_rag.rst

Instantiate the CohereLLM class to integrate with Cohere models. The `cohere` Python package must be installed. The API key can be provided directly or set via environment variables.

```python
from neo4j_graphrag.llm import CohereLLM

llm = CohereLLM(
    model_name="command-r",
    api_key=api_key,  # can also set `CO_API_KEY` in env vars
)
llm.invoke("say something")
```

--------------------------------

### Configure Neo4j Driver in JSON Config

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_kg_builder.rst

Example of configuring Neo4j connection parameters (URI, user, password) within a JSON configuration file.

```json
{
    "neo4j_config": {
        "params_": {
            "uri": "bolt://...",
            "user": "neo4j",
            "password": "password"
        }
    }
}
```

--------------------------------

### Creating a Custom LLM with LLMBase

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_rag.rst

Developers can create custom LLM integrations by subclassing `LLMBase`. This example demonstrates a custom Ollama LLM implementation handling both string and message list inputs.
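
The string-versus-message-list handling is the part of such an integration that is easiest to get wrong, so it can help to isolate it. The following is a library-independent sketch: `to_messages` and the `Message` alias are hypothetical names introduced here, and messages are assumed to be plain dicts with `role` and `content` keys.

```python
from typing import Dict, List, Union

# Assumed message shape: {"role": ..., "content": ...}
Message = Dict[str, str]


def to_messages(user_input: Union[str, List[Message]]) -> List[Message]:
    """Normalize either accepted input form into a list of chat messages.

    Hypothetical helper for illustration; not part of neo4j-graphrag.
    """
    if isinstance(user_input, str):
        # A bare string is treated as a single user turn
        return [{"role": "user", "content": user_input}]
    # Copy the list so the caller's history is never mutated
    return list(user_input)


single = to_messages("say something")
history = [
    {"role": "system", "content": "Answer briefly."},
    {"role": "user", "content": "say something"},
]
multi = to_messages(history)
print(single)
print(len(multi))
```

Copying the incoming list keeps later appends inside the LLM wrapper from leaking back into the caller's message history.
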
```python
from typing import Any, List, Union

import ollama

from neo4j_graphrag.generation import GraphRAG
from neo4j_graphrag.llm import LLMBase, LLMResponse
from neo4j_graphrag.types import LLMMessage


class MyOllamaLLM(LLMBase):

    def invoke(
        self,
        input: Union[str, List[LLMMessage]],
        message_history=None,
        system_instruction=None,
        response_format=None,
        **kwargs: Any,
    ) -> LLMResponse:
        # Accept either a plain prompt or a ready-made list of messages
        if isinstance(input, str):
            messages = [{"role": "user", "content": input}]
        else:
            messages = list(input)
        response = ollama.chat(model=self.model_name, messages=messages)
        return LLMResponse(content=response["message"]["content"])

    async def ainvoke(
        self,
        input: Union[str, List[LLMMessage]],
        message_history=None,
        system_instruction=None,
        response_format=None,
        **kwargs: Any,
    ) -> LLMResponse:
        return self.invoke(input)  # TODO: implement with ollama.AsyncClient


# retriever = ...

llm = MyOllamaLLM("llama3:8b")

rag = GraphRAG(retriever=retriever, llm=llm)
query_text = "How do I do similarity search in Neo4j?"
response = rag.search(query_text=query_text, retriever_config={"top_k": 5})
print(response.answer)
```

--------------------------------

### Build Sphinx Documentation

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/README.md

Run this command from the project root to build the HTML documentation.

```bash
make -C docs html
```

--------------------------------

### Define a Custom Component for Pipeline

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_pipeline.rst

Create a custom component by subclassing `Component` and defining a `run` method with input and output data models. This example defines a component to add two integers.
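
To see the data flow without installing the library, the same contract (an async `run` that takes typed inputs and returns a typed result) can be mimicked with a dataclass and a coroutine. This is only a stand-in sketch: `IntResult`, `add`, and `main` are hypothetical names, and the chaining is done by hand instead of by a `Pipeline`.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class IntResult:
    """Stand-in for a component's output data model."""

    result: int


async def add(number1: int, number2: int = 1) -> IntResult:
    """Stand-in for a component's async run method."""
    return IntResult(result=number1 + number2)


async def main() -> int:
    # First step: 10 + 1
    first = await add(number1=10, number2=1)
    # Second step: 4 + (result of the first step)
    second = await add(number1=4, number2=first.result)
    return second.result


total = asyncio.run(main())
print(total)  # 15
```

With the inputs used in the "Connect Components in a Pipeline" example (`number1=10, number2=1` for the first step and `number1=4` for the second), the second step receives 11 and produces 15.
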
```python
from neo4j_graphrag.experimental.pipeline import Component, DataModel


class IntResultModel(DataModel):
    result: int


class ComponentAdd(Component):
    async def run(self, number1: int, number2: int = 1) -> IntResultModel:
        return IntResultModel(result=number1 + number2)
```

--------------------------------

### Build KG Pipeline from Config File with URL

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/README.md

This script demonstrates building a knowledge graph pipeline using a configuration file and a PDF URL.

```python
from neo4j_graphrag.graph_rag.rag_graph import RagGraph

# Initialize the RagGraph with a Neo4j connection, config file, and PDF URL
rag_graph = RagGraph(
    uri="bolt://localhost:7687",
    username="neo4j",
    password="password",
    config_file="./config.yaml",
    pdf_url="http://example.com/document.pdf",
)

# Build the knowledge graph using the configuration and URL
rag_graph.build_graph_from_config()

print("Knowledge graph built successfully using config file and PDF URL!")
```

--------------------------------

### Build Pipeline from Config File

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/README.md

Demonstrates how to construct a knowledge graph pipeline by loading its configuration from files.

```python
from neo4j_graphrag.graphrag.pipeline.pipeline import Pipeline

# Load pipeline from configuration files
pipeline = Pipeline.from_config_files(config_path="./pipeline_config.yaml")
pipeline.run()

print("Pipeline built and run successfully from config files!")
```

--------------------------------

### Build KG Pipeline from Config File

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/README.md

This script shows how to build a knowledge graph pipeline using a configuration file.
```python
from neo4j_graphrag.graph_rag.rag_graph import RagGraph

# Initialize the RagGraph with a Neo4j connection and a config file
rag_graph = RagGraph(
    uri="bolt://localhost:7687",
    username="neo4j",
    password="password",
    config_file="./config.yaml",
)

# Build the knowledge graph using the configuration
rag_graph.build_graph_from_config()

print("Knowledge graph built successfully using config file!")
```

--------------------------------

### Initialize SimpleKGPipeline

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_kg_builder.rst

Instantiate the `SimpleKGPipeline` for building a KG from a file or text. Ensure you provide valid LLM, Neo4j driver, and embedder interfaces. Set `from_file` to `True` for file paths or `False` for direct text input.

```python
from neo4j_graphrag.experimental.pipeline.kg_builder import SimpleKGPipeline

kg_builder = SimpleKGPipeline(
    llm=llm,  # an LLMInterface for Entity and Relation extraction
    driver=neo4j_driver,  # a neo4j driver to write results to graph
    embedder=embedder,  # an Embedder for chunks
    from_file=True,  # set to False if parsing an already extracted text
)
await kg_builder.run_async(file_path=str(file_path))
# await kg_builder.run_async(text="my text")  # if using from_file=False
```

--------------------------------

### Populate Pinecone and Neo4j Databases

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/customize/retrievers/external/pinecone/README.md

Run this command from the project root to write test data to both Pinecone and Neo4j databases. Ensure NEO4J_AUTH, NEO4J_URL, and PC_API_KEY are updated in the script.

```bash
uv run python tests/e2e/pinecone_e2e/populate_dbs.py
```

--------------------------------

### Use Custom Prompt for Answering

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/README.md

Example of how to use a custom prompt when generating answers with the GraphRAG system.
```python
from neo4j_graphrag.graphrag.answer.custom_prompt import CustomPrompt

# Example usage (assuming CustomPrompt is a callable or class)
# custom_prompt = CustomPrompt(prompt="Your custom prompt here")
# result = custom_prompt.run(...)
```

--------------------------------

### Run Pipeline from Configuration File

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_kg_builder.rst

Initialize and run a pipeline using a configuration file (JSON or YAML) with the PipelineRunner.

```python
from neo4j_graphrag.experimental.pipeline.config.runner import PipelineRunner

file_path = "my_config.json"

pipeline = PipelineRunner.from_config_file(file_path)
await pipeline.run({"text": "my text"})
```

--------------------------------

### Run a Knowledge Graph Builder Component Individually

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_kg_builder.rst

Demonstrates how to instantiate and run a single component, such as PdfLoader, independently using asyncio.

```python
import asyncio

from neo4j_graphrag.experimental.components.data_loader import PdfLoader

my_component = PdfLoader()
asyncio.run(my_component.run("my_file.pdf"))
```

--------------------------------

### Add Event Listener for Pipeline Progress

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/README.md

Example of adding an event listener to a pipeline to receive notifications about its progress.

```python
from neo4j_graphrag.graphrag.pipeline.pipeline import Pipeline
from neo4j_graphrag.graphrag.pipeline.listeners import PipelineListener


class MyPipelineListener(PipelineListener):
    def on_pipeline_start(self, pipeline_id):
        print(f"Pipeline {pipeline_id} started.")

    def on_pipeline_end(self, pipeline_id):
        print(f"Pipeline {pipeline_id} ended.")


# Assume pipeline is already defined
# pipeline = Pipeline(...)
# pipeline.add_listener(MyPipelineListener())
# pipeline.run()
```

--------------------------------

### Set up Pipeline with Event Callback

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_pipeline.rst

Initialize a pipeline with a callback function to handle events. The callback can process various event types, including pipeline-level events.

```python
import asyncio
import logging

from neo4j_graphrag.experimental.pipeline import Pipeline
from neo4j_graphrag.experimental.pipeline.notification import Event

logger = logging.getLogger(__name__)
logging.basicConfig()
logger.setLevel(logging.WARNING)


async def event_handler(event: Event) -> None:
    """Function can do anything about the event,
    here we're just logging it if it's a pipeline-level event.
    """
    if event.event_type.is_pipeline_event:
        logger.warning(event)


pipeline = Pipeline(
    callback=event_handler,
)
# ... add components, connect them as usual

await pipeline.run(...)
```

--------------------------------

### Write Data to Neo4j and Qdrant

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/customize/retrievers/external/qdrant/README.md

Execute this command from the project root to populate both Neo4j and Qdrant databases.

```bash
uv run python -m examples.customize.retrievers.external.qdrant.populate_dbs
```

--------------------------------

### Search Weaviate by Vector

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/customize/retrievers/external/weaviate/README.md

Execute a vector search against Weaviate. This example demonstrates searching using pre-computed vector embeddings.

```bash
# search by vector
uv run python -m examples.customize.retrievers.external.weaviate.vector_search
```

--------------------------------

### Initialize HybridCypherRetriever

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_rag.rst

Set up the HybridCypherRetriever for combined vector and full-text searches, followed by a Cypher query for graph traversal. Requires specifying index names and a retrieval query.

```python
from neo4j_graphrag.retrievers import HybridCypherRetriever

INDEX_NAME = "embedding-name"
FULLTEXT_INDEX_NAME = "fulltext-index-name"

# driver and embedder are assumed to be created beforehand
retriever = HybridCypherRetriever(
    driver,
    INDEX_NAME,
    FULLTEXT_INDEX_NAME,
    retrieval_query="MATCH (node)-[:AUTHORED_BY]->(author:Author) RETURN author.name",
    embedder=embedder,
)
```

--------------------------------

### Initialize SimpleKGPipeline with Extra Configurations

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_kg_builder.rst

Configure the EntityAndRelationExtractor component with custom prompt templates, lexical graph configurations, and error handling.

```python
kg_builder = SimpleKGPipeline(
    # ...
    prompt_template="",
    lexical_graph_config=my_config,
    on_error="RAISE",
    # ...
)
```

--------------------------------

### Customize LLMEntityRelationExtractor Prompt

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_kg_builder.rst

Provide a custom prompt string to the LLMEntityRelationExtractor. The prompt can utilize variables like {text}, {schema}, and {examples}.

```python
extractor = LLMEntityRelationExtractor(
    llm=...,  # placeholder: pass your LLM instance here
    prompt="Extract entities from {text}",
)
```

--------------------------------

### Tool Calling with OpenAI

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/README.md

This script demonstrates how to enable tool calling capabilities with OpenAI models.

```python
from neo4j_graphrag.llms.openai import OpenAIWrapper


# Define a tool (e.g., a function to get current weather)
def get_weather(location: str) -> str:
    return f"The weather in {location} is sunny."


# Initialize OpenAI wrapper and register the tool
llm = OpenAIWrapper(api_key="YOUR_OPENAI_API_KEY", tools=[get_weather])

# Ask a question that requires using the tool
response = llm.invoke("What's the weather like in London?")
print(response)
```

--------------------------------

### Initialize ToolsRetriever with Vector and Text2Cypher Retrievers

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_rag.rst

Connect to Neo4j, configure LLM and embedder, create VectorRetriever and Text2CypherRetriever, convert them to tools, and initialize ToolsRetriever. This retriever intelligently selects and executes appropriate tools based on user queries.

```python
from neo4j import GraphDatabase
from neo4j_graphrag.retrievers import ToolsRetriever, VectorRetriever, Text2CypherRetriever
from neo4j_graphrag.llm import OpenAILLM
from neo4j_graphrag.embeddings import OpenAIEmbeddings

URI = "neo4j://localhost:7687"
AUTH = ("neo4j", "password")

# Connect to Neo4j database
driver = GraphDatabase.driver(URI, auth=AUTH)

# Create LLM object
llm = OpenAILLM(model_name="gpt-5")

# Create embedder
embedder = OpenAIEmbeddings(model="text-embedding-3-large")

# Create individual retrievers to use as tools
vector_retriever = VectorRetriever(
    driver=driver,
    index_name="embedding-index",
    embedder=embedder,
)

text2cypher_retriever = Text2CypherRetriever(
    driver=driver,
    llm=llm,
)

# Convert retrievers to tools
vector_tool = vector_retriever.convert_to_tool(
    name="vector_search",
    description="Search for similar documents using vector similarity",
)

cypher_tool = text2cypher_retriever.convert_to_tool(
    name="cypher_search",
    description="Generate and execute Cypher queries for structured data retrieval",
)

# Initialize ToolsRetriever with the tools
tools_retriever = ToolsRetriever(
    driver=driver,
    llm=llm,
    tools=[vector_tool, cypher_tool],
)

# Use the retriever - the LLM will automatically select appropriate tools
result = tools_retriever.search("What movies did Tom Hanks act in and what are their plots?")
```

--------------------------------

### Write Data to Weaviate and Neo4j

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/customize/retrievers/external/weaviate/README.md

Run this script from the project root to populate both Neo4j and Weaviate databases with data.

```bash
# -m expects a dotted module name, not a file path
uv run python -m tests.e2e.weaviate_e2e.populate_dbs
```

--------------------------------

### Tool Calling with VertexAI

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/README.md

This script demonstrates enabling tool calling with Google's Vertex AI models.

```python
from neo4j_graphrag.llms.vertexai import VertexAIWrapper


# Define a tool (e.g., a function to search for information)
def search_info(query: str) -> str:
    return f"Search results for {query}: ..."


# Initialize VertexAI wrapper and register the tool
llm = VertexAIWrapper(project="your-gcp-project", location="us-central1", tools=[search_info])

# Ask a question that requires using the tool
response = llm.invoke("Find information about Neo4j GraphRAG.")
print(response)
```

--------------------------------

### Initialize LLMEntityRelationExtractor

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_kg_builder.rst

Instantiate the `LLMEntityRelationExtractor` with an LLM, such as `OpenAILLM`. Ensure the LLM adheres to the `LLMInterface`. This example uses a JSON object response format.

```python
from neo4j_graphrag.experimental.components.entity_relation_extractor import (
    LLMEntityRelationExtractor,
)
from neo4j_graphrag.experimental.components.types import (
    TextChunks,
    TextChunk
)
from neo4j_graphrag.llm import OpenAILLM

extractor = LLMEntityRelationExtractor(
    llm=OpenAILLM(
        model_name="gpt-5",
        model_params={
            "max_tokens": 1000,
            "response_format": {"type": "json_object"},
        },
    )
)
await extractor.run(chunks=TextChunks(chunks=[TextChunk(text="some text", index=0)]))
```

--------------------------------

### Tool Calling with Ollama

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/README.md

This script demonstrates enabling tool calling with Ollama for local models.

```python
from neo4j_graphrag.llms.ollama import OllamaWrapper


# Define a tool (e.g., a calculator function)
def add(a: int, b: int) -> int:
    return a + b


# Initialize Ollama wrapper and register the tool
llm = OllamaWrapper(model="llama2", tools=[add])

# Ask a question that requires using the tool
response = llm.invoke("What is 5 plus 7?")
print(response)
```

--------------------------------

### Build Knowledge Graph from Text

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/README.md

This script demonstrates an end-to-end pipeline for building a knowledge graph from a plain text file.

```python
from neo4j_graphrag.graph_rag.rag_graph import RagGraph

# Initialize the RagGraph with a Neo4j connection (replace with your connection details)
rag_graph = RagGraph(uri="bolt://localhost:7687", username="neo4j", password="password")

# Build the knowledge graph from a text file
rag_graph.build_graph_from_text(text_path="./example.txt")

print("Knowledge graph built successfully!")
```

--------------------------------

### Subclass EntityRelationExtractor for Custom Logic

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_kg_builder.rst

Subclass the EntityRelationExtractor to implement custom logic for running the extraction process. This example demonstrates returning a predefined Neo4jGraph.

```python
from typing import Any  # needed for the **kwargs annotation below

from pydantic import validate_call
from neo4j_graphrag.experimental.components.entity_relation_extractor import EntityRelationExtractor
from neo4j_graphrag.experimental.components.types import (
    Neo4jGraph,
    Neo4jNode,
    Neo4jRelationship,
    TextChunks,
)


class MyExtractor(EntityRelationExtractor):
    @validate_call
    async def run(self, chunks: TextChunks, **kwargs: Any) -> Neo4jGraph:
        return Neo4jGraph(
            nodes=[
                Neo4jNode(id="0", label="Person", properties={"name": "A. Einstein"}),
                Neo4jNode(id="1", label="Concept", properties={"name": "Theory of relativity"}),
            ],
            relationships=[
                Neo4jRelationship(type="PROPOSED_BY", start_node_id="1", end_node_id="0", properties={"year": 1915})
            ],
        )
```

--------------------------------

### Using Ollama LLM

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_rag.rst

Instantiate the OllamaLLM class to query local models via Ollama. Assumes Ollama is running on the default address. Optional parameters for model configuration and host can be provided.

```python
from neo4j_graphrag.llm import OllamaLLM

llm = OllamaLLM(
    model_name="orca-mini",
    # model_params={"options": {"temperature": 0}, "format": "json"},
    # host="...",  # when using a remote server
)
llm.invoke("say something")
```

--------------------------------

### Configure LLM with API Key from Environment Variable (YAML)

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_kg_builder.rst

YAML configuration for setting up an LLM, including fetching the API key from an environment variable. The `resolver_` should be 'ENV' and `var_` should specify the environment variable name.

```yaml
llm_config:
  class_: OpenAILLM
  params_:
    model_name: gpt-5
    api_key:
      resolver_: ENV
      var_: OPENAI_API_KEY
    model_params:
      temperature: 0
      max_tokens: 2000
```

--------------------------------

### Filter Nodes During Resolution

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_kg_builder.rst

Configure an entity resolver with a `filter_query` to exclude specific nodes from the resolution process, for example, nodes already marked with a ':Resolved' label.

```python
from neo4j_graphrag.experimental.components.resolver import (
    SinglePropertyExactMatchResolver,
)

filter_query = "WHERE NOT entity:Resolved"
resolver = SinglePropertyExactMatchResolver(driver, filter_query=filter_query)
res = await resolver.run()
```

--------------------------------

### Run All Tests with uv

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/index.rst

Execute all tests in the project using the uv command.

```bash
uv run pytest
```

--------------------------------

### Implement a Custom Text Splitter

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_kg_builder.rst

Example of creating a custom text splitter by inheriting from the TextSplitter interface. This involves defining the run method to return TextChunks.

```python
from neo4j_graphrag.experimental.components.text_splitters.base import TextSplitter
from neo4j_graphrag.experimental.components.types import TextChunks, TextChunk


class MyTextSplitter(TextSplitter):
    async def run(self, text: str) -> TextChunks:
        # Placeholder logic (an assumption, the source snippet is cut off here):
        # return the whole input as a single chunk
        return TextChunks(chunks=[TextChunk(text=text, index=0)])
```

--------------------------------

### Integrating LangChain LLM with GraphRAG

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_rag.rst

Use LangChain's LLM implementations with GraphRAG by instantiating a compatible LangChain model and passing it to the GraphRAG constructor. This example uses ChatOllama.

```python
from neo4j_graphrag.generation import GraphRAG
from langchain_community.chat_models import ChatOllama

# retriever = ...

llm = ChatOllama(model="llama3:8b")
rag = GraphRAG(retriever=retriever, llm=llm)
query_text = "How do I do similarity search in Neo4j?"
response = rag.search(query_text=query_text, retriever_config={"top_k": 5})
print(response.answer)
```

--------------------------------

### Initialize Text2CypherRetriever

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_rag.rst

Configure the Text2CypherRetriever to generate and execute Cypher queries for information retrieval, using potentially different LLMs for query generation and answer generation.

```python
from neo4j import GraphDatabase
from neo4j_graphrag.retrievers import Text2CypherRetriever
from neo4j_graphrag.llm import OpenAILLM

URI = "neo4j://localhost:7687"
```

--------------------------------

### Implementing a Custom Rate Limit Handler

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/docs/source/user_guide_rag.rst

Create a custom rate limit handler by inheriting from `RateLimitHandler` and implementing the desired retry logic. This example shows the base class for custom implementations.

```python
from neo4j_graphrag.llm import AnthropicLLM
from neo4j_graphrag.utils.rate_limit import RateLimitHandler


class CustomRateLimitHandler(RateLimitHandler):
    pass


# Example usage (assuming AnthropicLLM is configured to use this handler)
# llm = AnthropicLLM(model_name="claude-3-opus", rate_limit_handler=CustomRateLimitHandler())
```

--------------------------------

### Build Knowledge Graph from PDF

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/README.md

This script demonstrates an end-to-end pipeline for building a knowledge graph from a PDF document.

```python
from neo4j_graphrag.graph_rag.rag_graph import RagGraph

# Initialize the RagGraph with a Neo4j connection (replace with your connection details)
rag_graph = RagGraph(uri="bolt://localhost:7687", username="neo4j", password="password")

# Build the knowledge graph from a PDF file
rag_graph.build_graph_from_pdf(pdf_path="./example.pdf")

print("Knowledge graph built successfully!")
```

--------------------------------

### MistralAI LLM Integration

Source: https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/README.md

This script demonstrates integrating with Mistral AI language models.

```python
from neo4j_graphrag.llms.mistralai import MistralAIWrapper

# Initialize MistralAI wrapper with your API key
llm = MistralAIWrapper(api_key="YOUR_MISTRAL_API_KEY")

# Use the LLM for text generation
response = llm.invoke("Generate a short story about a robot.")
print(response)
```
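
--------------------------------

### Sketch: Retry Logic with Exponential Backoff

The custom rate limit handler entry above leaves the "desired retry logic" unspecified, and hosted LLM calls such as the Mistral AI one can fail transiently when a provider throttles requests. The following is a minimal, library-independent sketch of one common choice, exponential backoff with jitter. It uses only the standard library. `RateLimitError`, `retry_with_backoff`, and `flaky_invoke` are hypothetical names introduced for illustration and are not part of neo4j-graphrag; how such logic would be wired into a `RateLimitHandler` subclass is not shown in the source and is not assumed here.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate limit (HTTP 429) error."""


def retry_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on RateLimitError with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Delay doubles on each attempt (0.5s, 1s, 2s, ...), capped at max_delay
            delay = min(base_delay * 2 ** (attempt - 1), max_delay)
            # A small random jitter keeps concurrent clients from retrying in lockstep
            sleep(delay + random.uniform(0, delay / 10))
    raise ValueError("max_attempts must be at least 1")


# Usage with a fake LLM call that is rate limited twice before succeeding
calls = {"count": 0}


def flaky_invoke() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RateLimitError("429: too many requests")
    return "ok"


delays = []
# Record the computed delays instead of actually sleeping
answer = retry_with_backoff(flaky_invoke, sleep=delays.append)
```

Passing `sleep` as a parameter is a design choice that makes the backoff schedule testable without real waiting; in production code the default `time.sleep` would apply.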