### Set up Haystack for development

Source: https://docs.haystack.deepset.ai/docs/intro/installation

Clones the official Haystack repository from GitHub, navigates into the cloned directory, and upgrades pip. This prepares your local environment for making changes and contributing to the Haystack codebase.

```Shell
# Clone the repo
git clone https://github.com/deepset-ai/haystack.git

# Move into the cloned folder
cd haystack

# Upgrade pip
pip install --upgrade pip
```

--------------------------------

### Install Haystack with Development Dependencies

Source: https://docs.haystack.deepset.ai/docs/intro/installation

This command installs the Haystack library from its source in editable mode, which is useful for development and contributing to the project. The '.[dev]' part ensures that all development-related dependencies are also installed.

```bash
pip install -e '.[dev]'
```

--------------------------------

### Install Haystack-ai with pip

Source: https://docs.haystack.deepset.ai/docs/intro/installation

Installs the Haystack-ai package using pip, the Python package installer. This is the recommended method for basic Haystack usage.

```Shell
pip install haystack-ai
```

--------------------------------

### Use AzureAISearchHybridRetriever Independently

Source: https://docs.haystack.deepset.ai/docs/intro/azureaisearchhybridretriever

Demonstrates how to initialize `AzureAISearchDocumentStore`, write sample documents, and then use `AzureAISearchHybridRetriever` with a query and a dummy embedding for standalone retrieval. This example showcases basic setup and execution of the retriever.
```Python
from haystack import Document
from haystack_integrations.components.retrievers.azure_ai_search import AzureAISearchHybridRetriever
from haystack_integrations.document_stores.azure_ai_search import AzureAISearchDocumentStore

document_store = AzureAISearchDocumentStore(index_name="haystack_docs")

documents = [
    Document(content="There are over 7,000 languages spoken around the world today."),
    Document(content="Elephants have been observed to behave in a way that indicates a high level of self-awareness, such as recognizing themselves in mirrors."),
    Document(content="In certain parts of the world, like the Maldives, Puerto Rico, and San Diego, you can witness the phenomenon of bioluminescent waves."),
]
document_store.write_documents(documents=documents)

retriever = AzureAISearchHybridRetriever(document_store=document_store)

# fake embeddings to keep the example simple
retriever.run(query="How many languages are spoken around the world today?", query_embedding=[0.1]*384)
```

--------------------------------

### Haystack 1.x Indexing Pipeline Example

Source: https://docs.haystack.deepset.ai/docs/intro/migration

This Python code demonstrates how to construct an indexing pipeline in Haystack 1.x. It initializes a document store, adds nodes for file classification, text conversion, preprocessing, and document writing, then executes the pipeline with specified file paths and metadata.
```Python
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes.file_classifier import FileTypeClassifier
from haystack.nodes.file_converter import TextConverter
from haystack.nodes.preprocessor import PreProcessor
from haystack.pipelines import Pipeline

# Initialize a DocumentStore
document_store = InMemoryDocumentStore()

# Indexing Pipeline
indexing_pipeline = Pipeline()

# Makes sure the file is a TXT file (FileTypeClassifier node)
classifier = FileTypeClassifier()
indexing_pipeline.add_node(classifier, name="Classifier", inputs=["File"])

# Converts a file into text and performs basic cleaning (TextConverter node)
text_converter = TextConverter(remove_numeric_tables=True)
indexing_pipeline.add_node(text_converter, name="Text_converter", inputs=["Classifier.output_1"])

# Pre-processes the text by performing splits and adding metadata to the text (Preprocessor node)
preprocessor = PreProcessor(
    clean_whitespace=True,
    clean_empty_lines=True,
    split_length=100,
    split_overlap=50,
    split_respect_sentence_boundary=True,
)
indexing_pipeline.add_node(preprocessor, name="Preprocessor", inputs=["Text_converter"])

# - Writes the resulting documents into the document store
indexing_pipeline.add_node(document_store, name="Document_Store", inputs=["Preprocessor"])

# Then we run it with the documents and their metadata as input
result = indexing_pipeline.run(file_paths=file_paths, meta=files_metadata)
```

--------------------------------

### Install Sentence Transformers for Embeddings

Source: https://docs.haystack.deepset.ai/docs/intro/astraretriever

This command optionally installs the `sentence-transformers` library, which is used for generating document and text embeddings in the provided Haystack example. It is required to run the example pipeline.
```Shell
pip install sentence-transformers
```

--------------------------------

### Haystack 2.x Extractive QA Query Pipeline

Source: https://docs.haystack.deepset.ai/docs/intro/migration

This Python snippet illustrates building and running an extractive Question Answering pipeline with Haystack 2.x. It sets up an InMemoryDocumentStore, writes sample documents, and integrates InMemoryBM25Retriever and ExtractiveReader components into a new Pipeline. The example concludes by running a query with data parameters for the retriever and reader.

```Python
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack import Document, Pipeline
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.readers import ExtractiveReader

document_store = InMemoryDocumentStore()
document_store.write_documents([
    Document(content="Paris is the capital of France."),
    Document(content="Berlin is the capital of Germany."),
    Document(content="Rome is the capital of Italy."),
    Document(content="Madrid is the capital of Spain."),
])

retriever = InMemoryBM25Retriever(document_store)
reader = ExtractiveReader(model="deepset/roberta-base-squad2")

extractive_qa_pipeline = Pipeline()
extractive_qa_pipeline.add_component("retriever", retriever)
extractive_qa_pipeline.add_component("reader", reader)
extractive_qa_pipeline.connect("retriever", "reader")

query = "What is the capital of France?"
result = extractive_qa_pipeline.run(data={
    "retriever": {"query": query, "top_k": 3},
    "reader": {"query": query, "top_k": 2}
})
```

--------------------------------

### Install Weaviate Haystack Integration

Source: https://docs.haystack.deepset.ai/docs/intro/weaviatedocumentstore

Instructions to install the Weaviate Haystack integration package using pip.
```Shell
pip install weaviate-haystack
```

--------------------------------

### Install Haystack Experimental Package

Source: https://docs.haystack.deepset.ai/docs/intro/experimental-package

Command to install or upgrade the `haystack-experimental` package using pip. This ensures you get the latest experimental features.

```Shell
pip install -U haystack-experimental
```

--------------------------------

### Connect QdrantDocumentStore to Qdrant Cloud

Source: https://docs.haystack.deepset.ai/docs/intro/qdrant-document-store

This example shows how to connect `QdrantDocumentStore` to a Qdrant Cloud instance using a URL, index name, embedding dimension, and API key. It also includes writing sample documents and counting them, similar to the in-memory setup. The sample embeddings are sized to match `embedding_dim`, since vectors of a different length do not fit the index.

```Python
from haystack.dataclasses.document import Document
from haystack_integrations.document_stores.qdrant import QdrantDocumentStore
from haystack.utils import Secret

document_store = QdrantDocumentStore(
    url="https://XXXXXXXXX.us-east4-0.gcp.cloud.qdrant.io:6333",
    index="your_index_name",
    embedding_dim=1024,  # based on the embedding model
    recreate_index=True,  # enable only to recreate the index and not connect to the existing one
    api_key=Secret.from_token("YOUR_TOKEN")
)

# placeholder embeddings; their length must equal embedding_dim
document_store.write_documents([
    Document(content="This is first", embedding=[0.0]*1024),
    Document(content="This is second", embedding=[0.1]*1024)
])
print(document_store.count_documents())
```

--------------------------------

### Haystack Indexing Pipeline Setup

Source: https://docs.haystack.deepset.ai/docs/intro/migration

This Python snippet demonstrates how to build a Haystack 2.x indexing pipeline. It initializes an InMemoryDocumentStore and adds components like FileTypeRouter, TextFileToDocument, DocumentCleaner, DocumentSplitter, and DocumentWriter to process and store documents. The components are connected sequentially to define the data flow.
```Python
from haystack import Pipeline
from haystack.components.converters import TextFileToDocument
from haystack.components.preprocessors import DocumentCleaner, DocumentSplitter
from haystack.components.routers import FileTypeRouter
from haystack.components.writers import DocumentWriter
from haystack.document_stores.in_memory import InMemoryDocumentStore

document_store = InMemoryDocumentStore()
indexing_pipeline = Pipeline()

# Route only plain-text files to the converter
classifier = FileTypeRouter(mime_types=["text/plain"])
indexing_pipeline.add_component("file_type_router", classifier)

text_converter = TextFileToDocument()
indexing_pipeline.add_component("text_converter", text_converter)

cleaner = DocumentCleaner(
    remove_empty_lines=True,
    remove_extra_whitespaces=True,
)
indexing_pipeline.add_component("cleaner", cleaner)

preprocessor = DocumentSplitter(
    split_by="passage",
    split_length=100,
    split_overlap=50
)
indexing_pipeline.add_component("preprocessor", preprocessor)
indexing_pipeline.add_component("writer", DocumentWriter(document_store))

indexing_pipeline.connect("file_type_router.text/plain", "text_converter")
indexing_pipeline.connect("text_converter", "cleaner")
indexing_pipeline.connect("cleaner", "preprocessor")
indexing_pipeline.connect("preprocessor", "writer")

# file_paths is the list of files to index; define it before running
result = indexing_pipeline.run({"file_type_router": {"sources": file_paths}})
```

--------------------------------

### Setting up and Running a Haystack Chat Pipeline

Source: https://docs.haystack.deepset.ai/docs/intro/azureopenaichatgenerator

This Python snippet illustrates the setup of a Haystack pipeline. It initializes ChatPromptBuilder and AzureOpenAIChatGenerator, adds them as components, and connects the prompt builder's output to the LLM's input. The example then shows how to run the pipeline with templated messages, including a system prompt and a user prompt with a dynamic variable.
```python
from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import AzureOpenAIChatGenerator
from haystack.dataclasses import ChatMessage

prompt_builder = ChatPromptBuilder()
llm = AzureOpenAIChatGenerator()

pipe = Pipeline()
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)
pipe.connect("prompt_builder.prompt", "llm.messages")

location = "Berlin"
messages = [
    ChatMessage.from_system("Always respond in German even if some input data is in other languages."),
    ChatMessage.from_user("Tell me about {{location}}"),
]
pipe.run(data={"prompt_builder": {"template_variables": {"location": location}, "template": messages}})
```

--------------------------------

### Install Haystack GitHub Integration

Source: https://docs.haystack.deepset.ai/docs/intro/githubrepoforker

Instructions to install the GitHub integration package for Haystack using pip.

```Shell
pip install github-haystack
```

--------------------------------

### Install GitHub Haystack Integration

Source: https://docs.haystack.deepset.ai/docs/intro/githubissueviewer

Instructions to install the necessary Python package for the GitHub integration using pip.

```Shell
pip install github-haystack
```

--------------------------------

### Install fastembed-haystack

Source: https://docs.haystack.deepset.ai/docs/intro/fastembedsparsedocumentembedder

Instructions to install the fastembed-haystack package using pip.

```Shell
pip install fastembed-haystack
```

--------------------------------

### Install stackit-haystack package

Source: https://docs.haystack.deepset.ai/docs/intro/stackitdocumentembedder

Instructions to install the `stackit-haystack` package using pip and set the `STACKIT_API_KEY` environment variable for authentication.

```Shell
pip install stackit-haystack
```

--------------------------------

### Install STACKIT Haystack Integration

Source: https://docs.haystack.deepset.ai/docs/intro/stackitchatgenerator

Instructions to install the `stackit-haystack` package, which provides the `STACKITChatGenerator` component, using pip.
```Shell
pip install stackit-haystack
```

--------------------------------

### Haystack 1.x Extractive QA Query Pipeline

Source: https://docs.haystack.deepset.ai/docs/intro/migration

This Python example shows how to create and run an extractive Question Answering pipeline using Haystack 1.x. It initializes an InMemoryDocumentStore, populates it with sample documents, and configures a BM25Retriever and a FARMReader. The pipeline is then executed with a query and specific top_k parameters for retrieval and reading.

```Python
from haystack.document_stores import InMemoryDocumentStore
from haystack.pipelines import ExtractiveQAPipeline
from haystack import Document
from haystack.nodes import BM25Retriever
from haystack.nodes import FARMReader

document_store = InMemoryDocumentStore(use_bm25=True)
document_store.write_documents([
    Document(content="Paris is the capital of France."),
    Document(content="Berlin is the capital of Germany."),
    Document(content="Rome is the capital of Italy."),
    Document(content="Madrid is the capital of Spain."),
])

retriever = BM25Retriever(document_store=document_store)
reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2")
extractive_qa_pipeline = ExtractiveQAPipeline(reader, retriever)

query = "What is the capital of France?"
result = extractive_qa_pipeline.run(
    query=query,
    params={
        "Retriever": {"top_k": 10},
        "Reader": {"top_k": 5}
    }
)
```

--------------------------------

### Install OpenRouter Haystack Integration

Source: https://docs.haystack.deepset.ai/docs/intro/openrouterchatgenerator

Instructions to install the `openrouter-haystack` integration using pip.

```Shell
pip install openrouter-haystack
```

--------------------------------

### Install python-docx package

Source: https://docs.haystack.deepset.ai/docs/intro/docxtodocument

Instructions to install the necessary `python-docx` package using pip before using the DOCXToDocument converter.
```Shell
pip install python-docx
```

--------------------------------

### Perform basic chat completion with OpenAIChatGenerator

Source: https://docs.haystack.deepset.ai/docs/intro/openaichatgenerator

This example illustrates the basic usage of `OpenAIChatGenerator` to get a chat completion. It shows how to import `ChatMessage` and `OpenAIChatGenerator`, initialize the generator, and then call its `run()` method with a list of `ChatMessage` objects. Printing the result shows the structure of the response, including the generated assistant message.

```Python
from haystack.dataclasses import ChatMessage
from haystack.components.generators.chat import OpenAIChatGenerator

client = OpenAIChatGenerator()
response = client.run(
    [ChatMessage.from_user("What's Natural Language Processing? Be brief.")]
)
print(response)
```

--------------------------------

### Install qdrant-haystack for Pipeline Integration

Source: https://docs.haystack.deepset.ai/docs/intro/fastembedsparsedocumentembedder

Instructions to install the `qdrant-haystack` package, required for using QdrantDocumentStore in a Haystack pipeline with sparse embeddings.

```Shell
pip install qdrant-haystack
```

--------------------------------

### Install GitHub Haystack Integration

Source: https://docs.haystack.deepset.ai/docs/intro/githubissueviewertool

Installs the necessary GitHub integration package for Haystack to use the GitHubIssueViewerTool.

```Shell
pip install github-haystack
```

--------------------------------

### Haystack Component: PreProcessor

Source: https://docs.haystack.deepset.ai/docs/intro/migration

Cleans and splits documents. Example usage: Normalizing white spaces, getting rid of headers and footers, splitting documents into smaller ones.
```APIDOC
Haystack 1.x: PreProcessor
Haystack 2.x: PreProcessors (/docs/preprocessors)
```

--------------------------------

### Example: Run Chat with Website Pipeline

Source: https://docs.haystack.deepset.ai/docs/intro/pipeline-templates

Demonstrates how to initialize and run the `CHAT_WITH_WEBSITE` pipeline template with specific URLs and a query.

```python
from haystack import Pipeline, PredefinedPipeline

pipeline = Pipeline.from_template(PredefinedPipeline.CHAT_WITH_WEBSITE)
pipeline.run({"fetcher": {"urls": ["https://haystack.deepset.ai"]}, "prompt": {"query": "what is Haystack?"}})
```

--------------------------------

### FaithfulnessEvaluator Few-Shot Examples Format

Source: https://docs.haystack.deepset.ai/docs/intro/faithfulnessevaluator

Illustrates the expected dictionary format for the `examples` parameter, used to provide few-shot examples to the FaithfulnessEvaluator. Each example includes `inputs` (questions, contexts, predicted_answers) and `outputs` (statements, statement_scores).

```Python
[{
    "inputs": {
        "questions": "What is the capital of Italy?",
        "contexts": ["Rome is the capital of Italy."],
        "predicted_answers": "Rome is the capital of Italy with more than 4 million inhabitants.",
    },
    "outputs": {
        "statements": ["Rome is the capital of Italy.", "Rome has more than 4 million inhabitants."],
        "statement_scores": [1, 0],
    },
}]
```

--------------------------------

### Install Qdrant Haystack Integration for Pipelines

Source: https://docs.haystack.deepset.ai/docs/intro/fastembedsparsetextembedder

Instructions to install the `qdrant-haystack` package, which is required for using sparse embedding retrieval with `QdrantDocumentStore` in Haystack pipelines.

```Shell
pip install qdrant-haystack
```

--------------------------------

### Haystack Component: Retriever

Source: https://docs.haystack.deepset.ai/docs/intro/migration

Fetches relevant documents from the Document Store. Example usage: Coupling Retriever with a Reader in a query pipeline to speed up the search (the Reader only goes through the documents it gets from the Retriever).

```APIDOC
Haystack 1.x: Retriever
Haystack 2.x: Retrievers (/docs/retrievers)
```

--------------------------------

### Example: Run Generative QA Pipeline

Source: https://docs.haystack.deepset.ai/docs/intro/pipeline-templates

Demonstrates how to initialize and run the `GENERATIVE_QA` pipeline template with a specific question.

```python
from haystack import Pipeline, PredefinedPipeline

pipeline = Pipeline.from_template(PredefinedPipeline.GENERATIVE_QA)
pipeline.run({"prompt_builder": {"question": "Where is Rome?"}})
```

--------------------------------

### Install Qdrant Haystack Integration

Source: https://docs.haystack.deepset.ai/docs/intro/qdrant-document-store

Instructions to install the Qdrant integration for Haystack using pip, enabling the use of Qdrant as a document store.

```Shell
pip install qdrant-haystack
```

--------------------------------

### Install Elasticsearch with Docker Pull

Source: https://docs.haystack.deepset.ai/docs/intro/elasticsearchbm25retriever

Commands to pull and run an Elasticsearch Docker image, exposing port 9200 and disabling security for a single-node setup. This provides a quick way to get an Elasticsearch instance running for development or testing.

```Shell
docker pull docker.elastic.co/elasticsearch/elasticsearch:8.11.1
# Run the image pulled above (same registry path and tag)
docker run -p 9200:9200 -e "discovery.type=single-node" -e "ES_JAVA_OPTS=-Xms1024m -Xmx1024m" -e "xpack.security.enabled=false" docker.elastic.co/elasticsearch/elasticsearch:8.11.1
```

--------------------------------

### Initialize and Use QdrantEmbeddingRetriever Standalone

Source: https://docs.haystack.deepset.ai/docs/intro/qdrantretriever

Demonstrates how to set up a `QdrantDocumentStore` in memory and initialize `QdrantEmbeddingRetriever`. It shows a basic example of running the retriever with a placeholder query embedding.
```Python
from haystack_integrations.components.retrievers.qdrant import QdrantEmbeddingRetriever
from haystack_integrations.document_stores.qdrant import QdrantDocumentStore

document_store = QdrantDocumentStore(
    ":memory:",
    recreate_index=True,
    return_embedding=True,
    wait_result_from_api=True,
)
retriever = QdrantEmbeddingRetriever(document_store=document_store)

# using a fake vector to keep the example simple
retriever.run(query_embedding=[0.1]*768)
```

--------------------------------

### Haystack Pipeline Class Core Methods

Source: https://docs.haystack.deepset.ai/docs/intro/pipelines

Documentation for the fundamental methods of the Haystack `Pipeline` class, covering instantiation, component management, connection logic, execution, and serialization capabilities.

```APIDOC
Pipeline():
  Description: Creates a new Pipeline object.

Pipeline.add_component(name, component):
  Description: Adds a component to the pipeline without connecting it yet.
  Parameters:
    name: The unique name for the component within the pipeline.
    component: The component instance to add.

Pipeline.connect("producer_component.output_name", "consumer_component.input_name"):
  Description: Explicitly connects an output of a producer component to an input of a consumer component. Performs validation before running.
  Parameters:
    producer_component.output_name: The specific output of the producing component.
    consumer_component.input_name: The specific input of the consuming component.
  Validation Checks:
    - Components exist in the pipeline.
    - Components' outputs and inputs match and are explicitly indicated.
    - Components' types match.
    - For input types other than Variadic, checks if the input is already occupied.

Pipeline.run({"component_1": {"mandatory_inputs": value}, ...}):
  Description: Executes the pipeline by specifying the first component and its mandatory inputs. Optional inputs can be passed to other components.
  Parameters:
    component_1: The name of the first component to run.
    mandatory_inputs: A dictionary of mandatory inputs for the first component.
    (Optional) component_N: A dictionary of inputs for other components in the pipeline.

Pipeline.from_dict(data: dict):
  Description: Loads a Pipeline object from a dictionary representation, enabling deserialization.
  Parameters:
    data: A dictionary containing the serialized pipeline data.

Pipeline.to_dict():
  Description: Converts the Pipeline object into a dictionary format for serialization, including its components and connections.
  Returns: A dictionary representing the serialized pipeline.
```

--------------------------------

### Install Haystack-ai with conda

Source: https://docs.haystack.deepset.ai/docs/intro/installation

Configures conda channels to include haystack-ai_rc and then installs the Haystack-ai package using conda, an open-source package management system. This provides an alternative installation method.

```Shell
conda config --add channels conda-forge/label/haystack-ai_rc
conda install haystack-ai
```

--------------------------------

### Start OpenSearch with Docker Compose

Source: https://docs.haystack.deepset.ai/docs/intro/opensearch-document-store

This command uses `docker compose` to start an OpenSearch instance, assuming a `docker-compose.yml` file is present in the current directory. This is an alternative to manually running the Docker image.

```Shell
docker compose up
```

--------------------------------

### Initialize NvidiaDocumentEmbedder with NVIDIA API Key

Source: https://docs.haystack.deepset.ai/docs/intro/nvidiadocumentembedder

This snippet demonstrates how to initialize the `NvidiaDocumentEmbedder` component for use with the NVIDIA API catalog. It shows how to specify the `model` and `api_url`, and securely pass the `api_key` using `Secret.from_token`. The example includes warming up the embedder and running it on a sample document to get its embedding.
```Python
from haystack import Document
from haystack.utils.auth import Secret
from haystack_integrations.components.embedders.nvidia import NvidiaDocumentEmbedder

embedder = NvidiaDocumentEmbedder(
    model="NV-Embed-QA",
    api_url="https://ai.api.nvidia.com/v1/retrieval/nvidia",
    api_key=Secret.from_token(""),  # replace the empty string with your NVIDIA API key
)
embedder.warm_up()

# A document embedder takes a list of Documents and returns them with embeddings attached
documents = [Document(content="A transformer is a deep learning architecture")]
result = embedder.run(documents)
print(result["documents"][0].embedding)
print(result["meta"])
```

--------------------------------

### Building a Basic Haystack Pipeline with PromptBuilder

Source: https://docs.haystack.deepset.ai/docs/intro/promptbuilder

This snippet demonstrates how to set up a basic Haystack pipeline using `PromptBuilder` and `OpenAIGenerator`. It defines a prompt template, connects components, and runs the pipeline to answer a question based on provided documents.

```Python
from haystack import Document, Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator
from haystack.utils import Secret

documents = [
    Document(content="Joe lives in Berlin"),
    Document(content="Joe is a software engineer"),
]

prompt_template = """
    Given these documents, answer the question.\nDocuments:
    {% for doc in documents %}
        {{ doc.content }}
    {% endfor %}

    \nQuestion: {{query}}
    \nAnswer:
    """

p = Pipeline()
p.add_component(instance=PromptBuilder(template=prompt_template), name="prompt_builder")
p.add_component(instance=OpenAIGenerator(api_key=Secret.from_env_var("OPENAI_API_KEY")), name="llm")
p.connect("prompt_builder", "llm")

question = "Where does Joe live?"
result = p.run({"prompt_builder": {"documents": documents, "query": question}})
print(result)
```

--------------------------------

### Install Fastembed Haystack Integration

Source: https://docs.haystack.deepset.ai/docs/intro/fastembedranker

Instructions to install the `fastembed-haystack` package, which provides the `FastembedRanker` component for Haystack.
```Shell
pip install fastembed-haystack
```

--------------------------------

### Warm Up Haystack Components for Stand-Alone Use

Source: https://docs.haystack.deepset.ai/docs/intro/components

This example demonstrates the proper sequence for using resource-heavy Haystack components (e.g., embedders) when they are not part of a pipeline. It shows how to initialize the component, call its `warm_up()` method to load necessary models, and then execute its `run()` method.

```Python
from haystack import Document
from haystack.components.embedders import SentenceTransformersDocumentEmbedder

doc = Document(content="I love pizza!")

doc_embedder = SentenceTransformersDocumentEmbedder()  # First, initialize the component
doc_embedder.warm_up()  # Then, warm it up to load the model
result = doc_embedder.run([doc])  # And finally, run it
print(result['documents'][0].embedding)
```

--------------------------------

### Haystack Component: DocumentClassifier

Source: https://docs.haystack.deepset.ai/docs/intro/migration

Classifies documents by attaching metadata to them. Example usage: Labeling documents by their characteristic (for example, sentiment).

```APIDOC
Haystack 1.x: DocumentClassifier
Haystack 2.x: TransformersZeroShotDocumentClassifier (/docs/transformerszeroshotdocumentclassifier)
```

--------------------------------

### Set up Haystack Pipeline for Component Output Inspection

Source: https://docs.haystack.deepset.ai/docs/intro/debugging-pipelines

This Python snippet demonstrates the setup of a Haystack pipeline designed for chat-based interactions. It initializes a `ChatPromptBuilder` with a defined prompt template and connects it to an `OpenAIChatGenerator`. The pipeline is configured to process documents and a query to generate an answer, serving as a foundational example for inspecting component outputs during pipeline execution for debugging.
```Python
from haystack import Pipeline, Document
from haystack.utils import Secret
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.builders.chat_prompt_builder import ChatPromptBuilder
from haystack.dataclasses import ChatMessage

# Documents
documents = [Document(content="Joe lives in Berlin"), Document(content="Joe is a software engineer")]

# Define prompt template
prompt_template = [
    ChatMessage.from_system("You are a helpful assistant."),
    ChatMessage.from_user(
        "Given these documents, answer the question.\nDocuments:\n"
        "{% for doc in documents %}{{ doc.content }}{% endfor %}\n"
        "Question: {{query}}\nAnswer:"
    )
]

# Define pipeline
p = Pipeline()
p.add_component(instance=ChatPromptBuilder(template=prompt_template, required_variables={"query", "documents"}), name="prompt_builder")
p.add_component(instance=OpenAIChatGenerator(api_key=Secret.from_env_var("OPENAI_API_KEY")), name="llm")
p.connect("prompt_builder", "llm.messages")
```

--------------------------------

### Build a Basic RAG Pipeline with Haystack and OpenAI

Source: https://docs.haystack.deepset.ai/docs/intro/get_started

Demonstrates building a Retrieval Augmented Generation (RAG) pipeline using Haystack. It sets up an `InMemoryDocumentStore` with sample data, defines a chat prompt, and connects a retriever, prompt builder, and OpenAI generator to answer questions. Requires an `OPENAI_API_KEY`.
```Python
from haystack import Pipeline, Document
from haystack.utils import Secret
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.builders.chat_prompt_builder import ChatPromptBuilder
from haystack.dataclasses import ChatMessage

# Write documents to InMemoryDocumentStore
document_store = InMemoryDocumentStore()
document_store.write_documents([
    Document(content="My name is Jean and I live in Paris."),
    Document(content="My name is Mark and I live in Berlin."),
    Document(content="My name is Giorgio and I live in Rome.")
])

# Build a RAG pipeline
prompt_template = [
    ChatMessage.from_system("You are a helpful assistant."),
    ChatMessage.from_user(
        "Given these documents, answer the question.\n"
        "Documents:\n{% for doc in documents %}{{ doc.content }}{% endfor %}\n"
        "Question: {{question}}\n"
        "Answer:"
    )
]

# Define required variables explicitly
prompt_builder = ChatPromptBuilder(template=prompt_template, required_variables={"question", "documents"})

retriever = InMemoryBM25Retriever(document_store=document_store)
llm = OpenAIChatGenerator(api_key=Secret.from_env_var("OPENAI_API_KEY"))

rag_pipeline = Pipeline()
rag_pipeline.add_component("retriever", retriever)
rag_pipeline.add_component("prompt_builder", prompt_builder)
rag_pipeline.add_component("llm", llm)
rag_pipeline.connect("retriever", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder", "llm.messages")

# Ask a question
question = "Who lives in Paris?"
results = rag_pipeline.run(
    {
        "retriever": {"query": question},
        "prompt_builder": {"question": question}
    }
)

print(results["llm"]["replies"])
```

--------------------------------

### Perform Basic Image QA with VertexAIImageQA

Source: https://docs.haystack.deepset.ai/docs/intro/vertexaiimageqa

Demonstrates how to initialize `VertexAIImageQA` for basic image question answering. It shows loading an image from a file path using `ByteStream` and running a query to get a single reply.

```Python
from haystack.dataclasses.byte_stream import ByteStream
from haystack_integrations.components.generators.google_vertex import VertexAIImageQA

qa = VertexAIImageQA()

image = ByteStream.from_file_path("dog.jpg")

res = qa.run(image=image, question="What color is this dog")

print(res["replies"][0])
```

--------------------------------

### Haystack Component: FileClassifier

Source: https://docs.haystack.deepset.ai/docs/intro/migration

Distinguishes between text, PDF, Markdown, Docx, and HTML files. Example usage: Routing files to appropriate converters (for example, it routes PDF files to `PDFToTextConverter`).

```APIDOC
Haystack 1.x: FileClassifier
Haystack 2.x: FileTypeRouter (/docs/filetyperouter)
```

--------------------------------

### Install ComponentTool Dependencies

Source: https://docs.haystack.deepset.ai/docs/intro/componenttool

Installs the necessary Python packages `docstring-parser` and `jsonschema` required to use the `ComponentTool`.

```Shell
pip install docstring-parser jsonschema
```

--------------------------------

### Install Optimum Haystack Integration

Source: https://docs.haystack.deepset.ai/docs/intro/optimumdocumentembedder

Instructions on how to install the `optimum-haystack` package using pip, which is required to use the integration with Haystack.
```Shell
pip install optimum-haystack
```

--------------------------------

### Install Haystack AI

Source: https://docs.haystack.deepset.ai/docs/intro/get_started

Installs the minimal version of the Haystack AI library using pip. This is the first step to setting up a Haystack development environment.

```Shell
pip install haystack-ai
```

--------------------------------

### Haystack Component: Crawler

Source: https://docs.haystack.deepset.ai/docs/intro/migration

Scrapes text from websites. Example usage: To run searches on your website content.

```APIDOC
Haystack 1.x: Crawler
Haystack 2.x: Not Available
```

--------------------------------

### LLMEvaluator Example Data Format

Source: https://docs.haystack.deepset.ai/docs/intro/llmevaluator

Defines the expected dictionary structure for the `examples` parameter used in `LLMEvaluator` initialization. It illustrates how to structure inputs (questions, contexts) and outputs (statements, statement_scores) for few-shot learning.

```Python
[
    {
        "inputs": {
            "questions": "What is the capital of Italy?",
            "contexts": ["Rome is the capital of Italy."],
        },
        "outputs": {
            "statements": ["Rome is the capital of Italy.", "Rome has more than 4 million inhabitants."],
            "statement_scores": [1, 0],
        },
    }
]
```

--------------------------------

### Uninstalling Haystack 1.x and Installing Haystack 2.x Packages

Source: https://docs.haystack.deepset.ai/docs/intro/migration

This snippet demonstrates how to uninstall both Haystack 1.x (farm-haystack) and Haystack 2.x (haystack-ai) packages if they coexist, and then install only the new Haystack 2.x package. This is crucial as the two versions cannot run in the same Python environment.

```bash
pip uninstall -y farm-haystack haystack-ai
pip install haystack-ai
```

--------------------------------

### Haystack Component: EntityExtractor

Source: https://docs.haystack.deepset.ai/docs/intro/migration

Extracts predefined entities out of a piece of text. Example usage: Named entity extraction (NER).
```APIDOC Haystack 1.x: EntityExtractor Haystack 2.x: NamedEntityExtractor (/docs/namedentityextractor) ``` -------------------------------- ### Start OpenSearch with Docker Compose Source: https://docs.haystack.deepset.ai/docs/intro/opensearchembeddingretriever Alternative method to start an OpenSearch instance using a provided `docker-compose.yml` file. ```Shell docker compose up ``` -------------------------------- ### Haystack Component: QuestionGenerator Source: https://docs.haystack.deepset.ai/docs/intro/migration When given a document, it generates questions this document can answer. Example usage: Auto-suggested questions in your search app. ```APIDOC Haystack 1.x: QuestionGenerator Haystack 2.x: Prompt Builders (/docs/builders) with dedicated prompt, Generators (/docs/generators) ``` -------------------------------- ### Implement Haystack Pipeline for Hybrid Retrieval with Fastembed Source: https://docs.haystack.deepset.ai/docs/intro/qdranthybridretriever This comprehensive Python example demonstrates setting up a Haystack pipeline for hybrid retrieval. It initializes a QdrantDocumentStore, defines documents, and constructs both an indexing pipeline (using FastembedSparseDocumentEmbedder and FastembedDocumentEmbedder) and a querying pipeline (using FastembedSparseTextEmbedder, FastembedTextEmbedder, and QdrantHybridRetriever). The example shows how to connect components, run the indexing process, and execute a query to retrieve relevant documents based on hybrid embeddings. 
```Python from haystack import Document, Pipeline from haystack.components.writers import DocumentWriter from haystack_integrations.components.retrievers.qdrant import QdrantHybridRetriever from haystack_integrations.document_stores.qdrant import QdrantDocumentStore from haystack.document_stores.types import DuplicatePolicy from haystack_integrations.components.embedders.fastembed import ( FastembedTextEmbedder, FastembedDocumentEmbedder, FastembedSparseTextEmbedder, FastembedSparseDocumentEmbedder ) document_store = QdrantDocumentStore( ":memory:", recreate_index=True, use_sparse_embeddings=True, embedding_dim = 384 ) documents = [ Document(content="My name is Wolfgang and I live in Berlin"), Document(content="I saw a black horse running"), Document(content="Germany has many big cities"), Document(content="fastembed is supported by and maintained by Qdrant."), ] indexing = Pipeline() indexing.add_component("sparse_doc_embedder", FastembedSparseDocumentEmbedder(model="prithvida/Splade_PP_en_v1")) indexing.add_component("dense_doc_embedder", FastembedDocumentEmbedder(model="BAAI/bge-small-en-v1.5")) indexing.add_component("writer", DocumentWriter(document_store=document_store, policy=DuplicatePolicy.OVERWRITE)) indexing.connect("sparse_doc_embedder", "dense_doc_embedder") indexing.connect("dense_doc_embedder", "writer") indexing.run({"sparse_doc_embedder": {"documents": documents}}) querying = Pipeline() querying.add_component("sparse_text_embedder", FastembedSparseTextEmbedder(model="prithvida/Splade_PP_en_v1")) querying.add_component("dense_text_embedder", FastembedTextEmbedder( model="BAAI/bge-small-en-v1.5", prefix="Represent this sentence for searching relevant passages: ") ) querying.add_component("retriever", QdrantHybridRetriever(document_store=document_store)) querying.connect("sparse_text_embedder.sparse_embedding", "retriever.query_sparse_embedding") querying.connect("dense_text_embedder.embedding", "retriever.query_embedding") question = "Who supports 
fastembed?" results = querying.run( {"dense_text_embedder": {"text": question}, "sparse_text_embedder": {"text": question}} ) print(results["retriever"]["documents"][0]) # Document(id=..., # content: 'fastembed is supported by and maintained by Qdrant.', # score: 1.0) ``` -------------------------------- ### Install llama-cpp-haystack with cuBLAS backend Source: https://docs.haystack.deepset.ai/docs/intro/llamacppchatgenerator Commands to set up environment variables and install `llama-cpp-python` with cuBLAS support, followed by `llama-cpp-haystack`. ```Shell export GGML_CUDA=1 CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python pip install llama-cpp-haystack ``` -------------------------------- ### Install Custom Haystack Document Store from PyPI Source: https://docs.haystack.deepset.ai/docs/intro/creating-custom-document-stores Shows how to install a custom Haystack Document Store package that has been published on PyPI. This method is recommended for public distribution, ensuring versioning and easier installation. ```Shell pip install example-haystack ``` -------------------------------- ### Install Weaviate Haystack Integration Source: https://docs.haystack.deepset.ai/docs/intro/weaviatebm25retriever Instructions to install the necessary Python package for Weaviate integration with Haystack. ```Shell pip install weaviate-haystack ``` -------------------------------- ### Install Mistral Haystack Integration Source: https://docs.haystack.deepset.ai/docs/intro/mistraldocumentembedder Instructions to install the necessary Haystack integration for Mistral components using pip. ```Shell pip install mistral-haystack ``` -------------------------------- ### Haystack Component: QueryClassifier Source: https://docs.haystack.deepset.ai/docs/intro/migration Categorizes queries. Example usage: Distinguishing between keyword queries and natural language questions and routing them to the Retrievers that can handle them best. 
```APIDOC Haystack 1.x: QueryClassifier Haystack 2.x: TransformersZeroShotTextRouter (/docs/transformerszeroshottextrouter) TransformersTextRouter (/docs/transformerstextrouter) ``` -------------------------------- ### Initialize LlamaCppGenerator and generate text Source: https://docs.haystack.deepset.ai/docs/intro/llamacppgenerator Demonstrates how to initialize the `LlamaCppGenerator` with a GGUF model path, context size, batch size, and initial generation parameters, then warm it up and run it to generate text from a prompt. ```Python from haystack_integrations.components.generators.llama_cpp import LlamaCppGenerator generator = LlamaCppGenerator( model="/content/openchat-3.5-1210.Q3_K_S.gguf", n_ctx=512, n_batch=128, model_kwargs={"n_gpu_layers": -1}, generation_kwargs={"max_tokens": 128, "temperature": 0.1}, ) generator.warm_up() prompt = "Who is the best American actor?" result = generator.run(prompt) ``` -------------------------------- ### Haystack Component: Reader Source: https://docs.haystack.deepset.ai/docs/intro/migration Finds an answer by selecting a text span in documents. Example usage: In a query pipeline when you want to know the location of the answer. ```APIDOC Haystack 1.x: Reader Haystack 2.x: ExtractiveReader (/docs/extractivereader) ``` -------------------------------- ### Initialize OpenSearch with Docker: Pull and Run Source: https://docs.haystack.deepset.ai/docs/intro/opensearch-document-store This snippet demonstrates how to pull the official OpenSearch Docker image and run it as a single-node instance, exposing necessary ports for access. It also sets Java memory options for the OpenSearch process. 
```Shell docker pull opensearchproject/opensearch:2.11.0 docker run -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" -e "ES_JAVA_OPTS=-Xms1024m -Xmx1024m" opensearchproject/opensearch:2.11.0 ``` -------------------------------- ### Install OpenTelemetry Instrumentation CLI Source: https://docs.haystack.deepset.ai/docs/intro/tracing Installs the OpenTelemetry distribution package, which includes the `opentelemetry-instrument` command-line interface for automated application instrumentation. ```Shell pip install opentelemetry-distro ``` -------------------------------- ### Haystack Component: Ranker Source: https://docs.haystack.deepset.ai/docs/intro/migration Orders documents based on how relevant they are to the query. Example usage: In a query pipeline, after a keyword-based Retriever to rank the documents it returns. ```APIDOC Haystack 1.x: Ranker Haystack 2.x: Rankers (/docs/rankers) ``` -------------------------------- ### Install Langfuse Haystack Connector Source: https://docs.haystack.deepset.ai/docs/intro/tracing Installs the `langfuse-haystack` component, which allows you to easily trace your Haystack pipelines and visualize them in the Langfuse UI. ```Shell pip install langfuse-haystack ``` -------------------------------- ### Initialize Haystack Pipeline with RagasEvaluator for Answer Relevancy Source: https://docs.haystack.deepset.ai/docs/intro/ragasevaluator Example demonstrating the setup of a Haystack Pipeline and the initialization of the `RagasEvaluator` component, configured to use the `ANSWER_RELEVANCY` metric for evaluating generated responses. 
```python from haystack import Pipeline from haystack_integrations.components.evaluators.ragas import RagasEvaluator, RagasMetric pipeline = Pipeline() evaluator = RagasEvaluator( metric=RagasMetric.ANSWER_RELEVANCY, ) pipeline.add_component("evaluator", evaluator) ``` -------------------------------- ### Haystack Component: FileConverter Source: https://docs.haystack.deepset.ai/docs/intro/migration Cleans and splits documents in different formats. Example usage: In indexing pipelines, extracting text from a file and casting it into the Document class format. ```APIDOC Haystack 1.x: FileConverter Haystack 2.x: Converters (/docs/converters) ``` -------------------------------- ### Install Fastembed Haystack Integration Source: https://docs.haystack.deepset.ai/docs/intro/fastembedtextembedder Instructions to install the `fastembed-haystack` package using pip, enabling its use with Haystack. ```Bash pip install fastembed-haystack ``` -------------------------------- ### Install google-vertex-haystack package Source: https://docs.haystack.deepset.ai/docs/intro/vertexaicodegenerator This snippet shows how to install the `google-vertex-haystack` package using pip, which is a prerequisite for utilizing components like `VertexAIImageCaptioner` and `VertexAICodeGenerator`. ```Shell pip install google-vertex-haystack ``` -------------------------------- ### Integrate VertexAIGeminiChatGenerator in Haystack Pipeline Source: https://docs.haystack.deepset.ai/docs/intro/vertexaigeminichatgenerator Shows the initial setup for integrating `VertexAIGeminiChatGenerator` within a Haystack Pipeline, including necessary imports and component instantiation. This snippet is incomplete and serves as an introductory example. 
```Python from haystack.components.builders import ChatPromptBuilder from haystack.dataclasses import ChatMessage from haystack import Pipeline from haystack_integrations.components.generators.google_vertex import VertexAIGeminiChatGenerator ``` -------------------------------- ### Install GitHub Haystack Integration Source: https://docs.haystack.deepset.ai/docs/intro/githubprcreatortool Installs the necessary Python package for the GitHub integration, enabling the use of GitHub-related tools like `GitHubPRCreatorTool`. ```Shell pip install github-haystack ``` -------------------------------- ### Using LLMEvaluator Standalone Source: https://docs.haystack.deepset.ai/docs/intro/llmevaluator Demonstrates initializing and running the `LLMEvaluator` component independently to evaluate generated responses. The example shows how to define instructions, inputs, outputs, and provide few-shot examples, then execute the evaluator and print its results. ```Python from typing import List from haystack.components.evaluators import LLMEvaluator llm_evaluator = LLMEvaluator( instructions="Is this answer problematic for children?", inputs=[("responses", List[str])], outputs=["score"], examples=[ {"inputs": {"responses": "Damn, this is straight outta hell!!!"}, "outputs": {"score": 1}}, {"inputs": {"responses": "Football is the most popular sport."}, "outputs": {"score": 0}}, ], ) responses = [ "Football is the most popular sport with around 4 billion followers worldwide", "Python language was created by Guido van Rossum.", ] results = llm_evaluator.run(responses=responses) print(results) # {'results': [{'score': 0}, {'score': 0}]} ``` -------------------------------- ### Embed Documents with OptimumDocumentEmbedder (Standalone) Source: https://docs.haystack.deepset.ai/docs/intro/optimumdocumentembedder Demonstrates how to use the `OptimumDocumentEmbedder` component independently to generate embeddings for a single document. 
It shows the basic setup, warming up the embedder, and running it to get the embedding result. ```Python from haystack.dataclasses import Document from haystack_integrations.components.embedders.optimum import OptimumDocumentEmbedder doc = Document(content="I love pizza!") document_embedder = OptimumDocumentEmbedder(model="sentence-transformers/all-mpnet-base-v2") document_embedder.warm_up() result = document_embedder.run([doc]) print(result["documents"][0].embedding) # [0.017020374536514282, -0.023255806416273117, ...] ``` -------------------------------- ### Install GitHub Integration for Haystack Source: https://docs.haystack.deepset.ai/docs/intro/githubprcreator Instructions to install the 'github-haystack' package using pip. This is a prerequisite for using the GitHub integration components. ```Shell pip install github-haystack ``` -------------------------------- ### Install Nvidia Haystack Package Source: https://docs.haystack.deepset.ai/docs/intro/nvidiatextembedder Installs the necessary Python package for using Nvidia components in Haystack. ```Shell pip install nvidia-haystack ``` -------------------------------- ### Install llama-cpp-haystack with cuBLAS backend Source: https://docs.haystack.deepset.ai/docs/intro/llamacppgenerator Instructions to install `llama-cpp-python` with cuBLAS support and then `llama-cpp-haystack` by setting the necessary environment variables and CMAKE arguments. ```Bash export GGML_CUDA=1 CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python pip install llama-cpp-haystack ``` -------------------------------- ### Install GitHub Haystack Integration Source: https://docs.haystack.deepset.ai/docs/intro/githubissuecommentertool Provides the shell command to install the `github-haystack` package, which is required to use the `GitHubIssueCommenterTool` and other GitHub-related components in Haystack. 
```Shell pip install github-haystack ``` -------------------------------- ### Resolve Haystack Package Conflicts for Haystack AI Source: https://docs.haystack.deepset.ai/docs/intro/get_started Addresses potential conflicts when `farm-haystack` and `haystack-ai` are installed in the same environment. This snippet uninstalls both packages and then reinstalls only `haystack-ai` to ensure a clean setup. ```Bash pip uninstall -y farm-haystack haystack-ai pip install haystack-ai ``` -------------------------------- ### Basic Standalone Usage of SagemakerGenerator Source: https://docs.haystack.deepset.ai/docs/intro/sagemakergenerator Illustrates the basic standalone usage of `SagemakerGenerator`. It shows how to import the component, initialize it with a model, warm it up, and then run a text generation query, printing the response. ```Python from haystack_integrations.components.generators.amazon_sagemaker import SagemakerGenerator client = SagemakerGenerator(model="jumpstart-dft-hf-llm-falcon-7b-instruct-bf16") client.warm_up() response = client.run("Briefly explain what NLP is in one sentence.") print(response) # {'replies': ["Natural Language Processing (NLP) is a subfield of artificial intelligence and computational linguistics that focuses on the interaction between computers and human languages..."], 'metadata': [{}]} ``` -------------------------------- ### Uninstall conflicting Haystack packages Source: https://docs.haystack.deepset.ai/docs/intro/installation Removes both `farm-haystack` and `haystack-ai` packages from the Python environment to resolve potential conflicts, then reinstalls only `haystack-ai`. This step is crucial if you were previously using Haystack 1.x and are upgrading or switching versions. ```Bash pip uninstall -y farm-haystack haystack-ai pip install haystack-ai ```
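Before running the uninstall commands above, it can help to confirm whether both packages are actually present. The following is a minimal sketch that uses only the Python standard library (`importlib.metadata`); the helper names `installed_versions` and `has_conflict` are hypothetical and are not part of Haystack.

```python
from importlib import metadata

# Distribution names taken from the uninstall snippets above.
CONFLICTING = ("farm-haystack", "haystack-ai")


def installed_versions(names=CONFLICTING):
    """Return {distribution name: version} for each name that is installed."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # this distribution is not installed in the current environment
    return found


def has_conflict(found):
    """Return True when both the 1.x and 2.x distributions are present."""
    return all(name in found for name in CONFLICTING)


if __name__ == "__main__":
    found = installed_versions()
    print(found)
    if has_conflict(found):
        print("Both packages found: uninstall both, then reinstall haystack-ai.")
```

Run it with the same interpreter that pip installs into, otherwise the check may inspect a different environment.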