### APIServer with Route Configuration and Uvicorn Start Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/APIServer.md Demonstrates the complete setup for running the APIServer, including client initialization, route configuration, and starting the server with uvicorn. ```python from chatui.api import APIServer from chatui.chat_client import ChatClient import uvicorn client = ChatClient("http://localhost:8000", "llama-3.3-70b") api_server = APIServer(client) api_server.configure_routes() # Start server uvicorn.run(api_server, host="0.0.0.0", port=8080, workers=1) ``` -------------------------------- ### YAML Configuration Example Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/AppConfig.md Example of a YAML file structure for AppConfig settings. ```yaml serverUrl: http://localhost serverPort: "8000" serverPrefix: /rag-api/ modelName: llama-3.3-70b ``` -------------------------------- ### JSON Configuration Example Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/AppConfig.md Example of a JSON file structure for AppConfig settings. ```json { "serverUrl": "http://localhost", "serverPort": "8000", "serverPrefix": "/rag-api/", "modelName": "llama-3.3-70b" } ``` -------------------------------- ### APIServer Initialization Example Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/APIServer.md Instantiates the APIServer with a ChatClient and configures its routes. This is a prerequisite for running the server. 
```python from chatui.api import APIServer from chatui.chat_client import ChatClient client = ChatClient("http://localhost:8000", "llama-3.3-70b") app = APIServer(client) app.configure_routes() # Run with uvicorn # uvicorn chatui.api:app --host 0.0.0.0 --port 8080 ``` -------------------------------- ### Configuration Guide Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/README.md A comprehensive guide to application configuration, detailing AppConfig options, environment variables, and runtime configuration via GraphState. ```APIDOC ## Configuration ### Description Provides a detailed guide to configuring the Agentic RAG system. ### Configuration Options - AppConfig options - Environment variables - Runtime configuration via GraphState - Includes example scenarios for different configurations. ``` -------------------------------- ### Cloud-Only Configuration Scenario Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/configuration.md Example JSON configuration for a cloud-only setup. This scenario requires setting NVIDIA and Tavily API keys as environment variables. ```json { "serverUrl": "http://localhost", "serverPort": "8000", "modelName": "llama-3.3-70b" } ``` ```bash export NVIDIA_API_KEY="your_nvidia_key" export TAVILY_API_KEY="your_tavily_key" ``` -------------------------------- ### Instantiate CustomChatOpenAI (with GPU Check) Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/CustomChatOpenAI.md Example of instantiating CustomChatOpenAI with GPU type and count for compatibility checking. 
```python from chatui.utils.nim import CustomChatOpenAI # With GPU compatibility checking llm = CustomChatOpenAI( custom_endpoint="localhost", port="8000", model_name="meta/llama-3.3-70b-instruct", gpu_type="A100", gpu_count="2", temperature=0.5 ) ``` -------------------------------- ### Environment Variables Setup Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/README.md Sets up environment variables for the application. The first block sets the two mandatory keys, NVIDIA_API_KEY and TAVILY_API_KEY; the second block sets the optional variables. ```bash export NVIDIA_API_KEY="your_nvidia_key" export TAVILY_API_KEY="your_tavily_key" ``` ```bash export CHUNK_SIZE=250 export CHUNK_OVERLAP=0 export TAVILY_K=3 export RECURSION_LIMIT=10 export INTERNAL_API=no ``` -------------------------------- ### Instantiate CustomChatOpenAI (Basic) Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/CustomChatOpenAI.md Example of basic instantiation of CustomChatOpenAI with endpoint, port, model name, and temperature. ```python from chatui.utils.nim import CustomChatOpenAI # Basic usage llm = CustomChatOpenAI( custom_endpoint="agentic-rag-local-nim-1", port="8000", model_name="meta/llama-3.1-8b-instruct", temperature=0.7 ) ``` -------------------------------- ### Initialize GraphState for Usage Example Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/GraphState.md This snippet imports GraphState for use within the agentic RAG workflow. The import is the only setup shown; it is required before passing state through the graph. ```python from chatui.utils.graph import GraphState ``` -------------------------------- ### Sample Queries for Chatbot Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/build-page.md A list of pre-configured example questions that can be used as clickable buttons in the chat interface. 
```python "How do I add an integration in the CLI?" "How do I fix an inaccessible remote Location?" "What are the NVIDIA-provided default base environments?" "How do I create a support bundle for troubleshooting?" ``` -------------------------------- ### Custom Page with Different Client Example Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/build-page.md Demonstrates how to create a custom ChatClient and use it to build, customize, and launch a Gradio page. ```python from chatui import pages from chatui.chat_client import ChatClient # Create custom client client = ChatClient( server_url="http://custom.server:9000", model_name="custom-model" ) # Build page page = pages.converse.build_page(client) # Customize and launch page.queue(max_size=20) page.launch(server_name="0.0.0.0", server_port=8080) ``` -------------------------------- ### Set Verbosity to INFO (default) Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/entrypoint.md Example of running the application with the default INFO logging level. ```bash # Verbosity = 1 (INFO, default) python -m chatui ``` -------------------------------- ### Use Retriever to Get Documents Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/types.md Example of obtaining a retriever instance and invoking it to fetch documents based on a query. ```python from chatui.utils import database retriever = database.get_retriever() docs = retriever.invoke("How do I install AI Workbench?") ``` -------------------------------- ### Mixed Mode Configuration Scenario Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/configuration.md Example GraphState configuration for a mixed mode setup, utilizing both local NIM and cloud APIs. This involves setting specific flags and parameters for NIM usage and cloud model IDs. 
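Ahead of the project's own state example, here is a minimal sketch of how per-component flags like these could be resolved into a backend choice. `resolve_backend` is a hypothetical helper written for illustration and is not part of `chatui`. The key names follow the state example that comes next, and the `http://` scheme and `/v1` path are assumptions.

```python
from typing import Any, Dict


def resolve_backend(state: Dict[str, Any], component: str) -> Dict[str, str]:
    """Hypothetical helper: pick a local NIM or a cloud model for one component.

    `component` is a name such as "router" or "generator"; the key names
    mirror the GraphState example and are assumed, not verified.
    """
    if state.get(f"{component}_use_nim"):
        ip = state[f"nim_{component}_ip"]
        port = state[f"nim_{component}_port"]
        # Assumed URL shape for an OpenAI-compatible NIM endpoint
        return {"kind": "nim", "base_url": f"http://{ip}:{port}/v1"}
    # Otherwise fall back to the cloud model ID for this component
    return {"kind": "cloud", "model": state[f"{component}_model_id"]}
```

In a mixed-mode setup, each component is resolved independently, so the generator can point at a local NIM while the router keeps using the cloud API.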
```python state = { "generator_use_nim": True, # Use local NIM "nim_generator_ip": "localhost", "nim_generator_port": "8000", "router_use_nim": False, # Use cloud API "router_model_id": "meta/llama-3.3-70b-instruct", ... } ``` -------------------------------- ### Self-Hosted NIM Configuration Scenario Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/configuration.md Example JSON configuration for a self-hosted NIM setup. This scenario involves overriding GraphState with NIM service details. ```json { "serverUrl": "http://localhost", "serverPort": "8000", "modelName": "llama-3.1-8b" } ``` ```python state = { "router_use_nim": True, "nim_router_ip": "agentic-rag-local-nim-1", "nim_router_port": "8000", "nim_router_id": "meta/llama-3.1-8b-instruct", ... } ``` -------------------------------- ### Document Type Example Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/types.md Demonstrates how to create a LangChain Document object, which includes the main text content and associated metadata like the source URL. ```python from langchain.schema import Document doc = Document( page_content="AI Workbench is NVIDIA's IDE...", metadata={"source": "https://docs.nvidia.com/ai-workbench/"} ) ``` -------------------------------- ### Handle File Access Errors in Configuration Loading Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/errors.md Example demonstrating how to handle FileNotFoundError and PermissionError when loading configuration files. Verify file paths and permissions. 
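Before the project's own example, here is a minimal sketch of how a loader could turn these two exceptions into a `None` return value that the caller then checks. `load_config_or_none` is a hypothetical helper written for illustration. It is not the implementation of `AppConfig.from_file`, and it handles JSON only.

```python
import json
from typing import Any, Dict, Optional


def load_config_or_none(path: str) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: return parsed JSON, or None if the file is unreadable."""
    try:
        with open(path, "r", encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        print(f"Config file not found: {path}")
    except PermissionError:
        print(f"No permission to read config file: {path}")
    # Both failure modes collapse to None so the caller has one thing to check
    return None
```

Returning `None` rather than re-raising keeps the call site to a single `if config is None` check, which is the pattern the example that follows relies on.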
```python from chatui.configuration import AppConfig import sys config = AppConfig.from_file("/path/to/config.json") if config is None: print("Failed to load configuration (file not found or permission denied)") sys.exit(1) ``` -------------------------------- ### GraphState Initialization with Llama 3 Prompts Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/prompts.md Demonstrates how to initialize a GraphState with specific Llama 3 prompts for various agentic components. This setup is used when invoking the graph workflow. ```python from chatui.prompts import prompts_llama3 state = { "question": "...", "prompt_router": prompts_llama3.router_prompt, "prompt_retrieval": prompts_llama3.retrieval_prompt, "prompt_generator": prompts_llama3.generator_prompt, "prompt_hallucination": prompts_llama3.hallucination_prompt, "prompt_answer": prompts_llama3.answer_prompt, ... } result = app.invoke(state) ``` -------------------------------- ### Environment Variable Precedence Example Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/AppConfig.md Illustrates how environment variables override configuration file settings. This example shows setting environment variables before running a Python script that loads configuration. ```bash # Environment variable override export APP_SERVERURL="http://custom.server" export APP_SERVERPORT="9000" # Config file values are overridden by env vars python -m chatui --config /path/to/config.json # Result: server_url="http://custom.server", server_port="9000" ``` -------------------------------- ### Answer Grader Setup Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/code/langgraph_rag_agent_llama3_nvidia_nim.ipynb Initializes the LLM for the answer grader. This component will be used to assess the quality and correctness of the final generated answer. 
```python # LLM llm = ChatNVIDIA(model=model_id, temperature=0) ``` -------------------------------- ### Usage Example: Redirecting Stdout and Logging Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/logger.md Demonstrates how to redirect stdout to a file using the Logger class and shows that subsequent print statements are captured in the log file. Ensure the file path is valid. ```python import sys from chatui.utils import logger # Redirect stdout to file sys.stdout = logger.Logger("/path/to/output.log") # All print statements now go to file print("This message is logged to file") request_data = {"query": "example"} # Placeholder value for illustration print("Request processing: ", request_data) ``` -------------------------------- ### Retrieval Grader Setup Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/code/langgraph_rag_agent_llama3_nvidia_nim.ipynb Sets up a retrieval grader using LangChain, ChatNVIDIA, and JSON output parsing. It defines a prompt to assess document relevance to a user question. ```python from langchain.prompts import PromptTemplate from langchain_nvidia_ai_endpoints import ChatNVIDIA from langchain_core.output_parsers import JsonOutputParser model_id = "meta/llama3-70b-instruct" # LLM llm = ChatNVIDIA(model=model_id, temperature=0) prompt = PromptTemplate( template="""<|begin_of_text|><|start_header_id|>system<|end_header_id|> You are a grader assessing relevance of a retrieved document to a user question. If the document contains keywords related to the user question, grade it as relevant. It does not need to be a stringent test. The goal is to filter out erroneous retrievals. Give a binary 'yes' or 'no' score to indicate whether the document is relevant to the question. Provide the binary score as a JSON with a single key 'score' and no preamble or explanation. 
<|eot_id|><|start_header_id|>user<|end_header_id|> Here is the retrieved document: {document} Here is the user question: {question} <|eot_id|><|start_header_id|>assistant<|end_header_id|> """, input_variables=["question", "document"], ) retrieval_grader = prompt | llm | JsonOutputParser() question = "agent memory" docs = retriever.invoke(question) doc_txt = docs[1].page_content print(retrieval_grader.invoke({"question": question, "document": doc_txt})) ``` -------------------------------- ### URL Validation Example Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/database.md Demonstrates how to use the is_valid_url function to check if a given string is a properly formatted URL. It returns True for valid URLs and False otherwise. ```python from chatui.utils import database is_valid = database.is_valid_url("https://docs.nvidia.com/ai-workbench/") print(is_valid) # True is_valid = database.is_valid_url("invalid-url") print(is_valid) # False ``` -------------------------------- ### Safe URL Loading Example Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/database.md Shows how to use the safe_load function to load and parse content from a URL. It handles potential errors during loading and returns a list of Document objects or None if an error occurs. ```python from chatui.utils import database docs = database.safe_load("https://docs.nvidia.com/ai-workbench/") if docs: print(f"Loaded {len(docs)} documents") else: print("Failed to load URL") ``` -------------------------------- ### NVIDIA API Key Configuration Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/code/langgraph_rag_agent_llama3_nvidia_nim.ipynb Prompt the user for their NVIDIA API key if it's not already set as an environment variable. Validates that the key starts with 'nvapi-'. ```python # SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
# SPDX-License-Identifier: Apache-2.0 # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. import getpass import os if not os.environ.get("NVIDIA_API_KEY", "").startswith("nvapi-"): nvapi_key = getpass.getpass("Enter your NVIDIA API key: ") assert nvapi_key.startswith("nvapi-"), f"{nvapi_key[:5]}... is not a valid key" os.environ["NVIDIA_API_KEY"] = nvapi_key ``` -------------------------------- ### Hallucination Grader Setup Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/code/langgraph_rag_agent_llama3_nvidia_nim.ipynb Configures a hallucination grader using an NVIDIA NIM LLM and a prompt template. This grader assesses if the generated answer is supported by the provided documents. ```python # LLM llm = ChatNVIDIA(model=model_id, temperature=0) # Prompt prompt = PromptTemplate( template=""" <|begin_of_text|><|start_header_id|>system<|end_header_id|> You are a grader assessing whether an answer is grounded in / supported by a set of facts. Give a binary 'yes' or 'no' score to indicate whether the answer is grounded in / supported by a set of facts. Provide the binary score as a JSON with a single key 'score' and no preamble or explanation. 
<|eot_id|><|start_header_id|>user<|end_header_id|> Here are the facts: \n ------- \n {documents} \n ------- \n Here is the answer: {generation} <|eot_id|><|start_header_id|>assistant<|end_header_id|>""", input_variables=["generation", "documents"], ) hallucination_grader = prompt | llm | JsonOutputParser() hallucination_grader.invoke({"documents": docs, "generation": generation}) ``` -------------------------------- ### Integrate CustomChatOpenAI in a RAG Chain Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/CustomChatOpenAI.md Demonstrates initializing CustomChatOpenAI with a custom endpoint and integrating it into a Retrieval Augmented Generation (RAG) chain using LangChain. This setup allows for custom LLM backends in RAG pipelines. ```python from langchain.prompts import PromptTemplate from langchain_core.output_parsers import StrOutputParser from chatui.utils.nim import CustomChatOpenAI from langchain.schema import Document # Initialize LLM with NIM llm = CustomChatOpenAI( custom_endpoint="agentic-rag-local-nim-1", port="8000", model_name="meta/llama-3.1-8b-instruct", temperature=0.7 ) # Create RAG chain prompt = PromptTemplate( template="Context: {context}\n\nQuestion: {question}\n\nAnswer:", input_variables=["context", "question"] ) rag_chain = prompt | llm | StrOutputParser() # Execute chain documents = [ Document(page_content="AI Workbench is NVIDIA's IDE for AI..."), Document(page_content="It provides GPU acceleration...") ] context = "\n".join([doc.page_content for doc in documents]) response = rag_chain.invoke({ "context": context, "question": "What is AI Workbench?" }) print(response) ``` -------------------------------- ### Entrypoint Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/README.md Handles CLI and server bootstrap, including parsing command-line arguments, loading configuration, and initiating the server startup sequence. 
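As a rough illustration of the argument-parsing step, here is a minimal sketch built on the standard library's `argparse`, limited to the flags that appear in this reference (`--config`, `--help-config`, `-v`, `-q`, `--ssl-keyfile`, `--ssl-certfile`). `build_parser` and `parse_args` are hypothetical names, and the real entrypoint may use a different parser. How the `-v` and `-q` counts map to log levels is deliberately left out.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only the flags shown in this reference."""
    parser = argparse.ArgumentParser(prog="chatui")
    parser.add_argument("--config", default=None,
                        help="path to a JSON or YAML configuration file")
    parser.add_argument("--help-config", action="store_true",
                        help="print configuration format help and exit")
    # Repeatable flags: -vv yields verbose == 2, -q yields quiet == 1
    parser.add_argument("-v", dest="verbose", action="count", default=0)
    parser.add_argument("-q", dest="quiet", action="count", default=0)
    parser.add_argument("--ssl-keyfile", default=None)
    parser.add_argument("--ssl-certfile", default=None)
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse argv (defaults to sys.argv[1:] when None)."""
    return build_parser().parse_args(argv)
```

For example, `parse_args(["--config", "/path/to/config.json", "-vv"])` yields a namespace whose `config` is the given path and whose `verbose` count is 2.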
```APIDOC ## Entrypoint ### Description Serves as the main entry point for the application, handling both CLI and server startup. ### Responsibilities - Parses command-line arguments. - Loads application configuration. - Manages the server startup sequence. ``` -------------------------------- ### Run Application with Configuration File Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/entrypoint.md Shows how to specify a configuration file for the application. This allows for custom settings to be loaded. ```bash python -m chatui --config /path/to/config.json ``` -------------------------------- ### Print AppConfig Help Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/AppConfig.md Prints comprehensive configuration help documentation to a specified output. Use this to understand available configuration options and their defaults. ```python from chatui.configuration import AppConfig import sys AppConfig.print_help(sys.stdout.write) ``` -------------------------------- ### Display Configuration Help Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/entrypoint.md Use this command to print the format help for the configuration file and exit. ```bash python -m chatui --help-config ``` -------------------------------- ### Load AppConfig from File and Initialize ChatClient Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/AppConfig.md Demonstrates loading application configuration from a file and initializing a ChatClient. Assumes configuration is loaded via APP_CONFIG_FILE environment variable or a default path. 
```python from chatui.configuration import AppConfig from chatui.chat_client import ChatClient from chatui import pages import os # Load configuration config_file = os.environ.get("APP_CONFIG_FILE", "/dev/null") config = AppConfig.from_file(config_file) if not config: raise RuntimeError("Failed to load configuration") # Build API URL from config api_url = f"{config.server_url}:{config.server_port}" # Initialize chat client client = ChatClient(api_url, config.model_name) # Build UI pages blocks = pages.converse.build_page(client) ``` -------------------------------- ### AppConfig.print_help Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/AppConfig.md Prints comprehensive configuration help documentation, detailing each configuration field, its default value, type, and corresponding environment variable. ```APIDOC ## Class Method: print_help ### Description Prints comprehensive configuration help documentation. It iterates through dataclass fields and displays their JSON key name, default value, help text, type information, and environment variable name. ### Method Signature ```python @classmethod def print_help( cls, help_printer: Callable[[str], Any], env_parent: Optional[str] = None, json_parent: Optional[Tuple[str, ...]] = None, ) -> None ``` ### Parameters #### Arguments - **help_printer** (Callable) - Required - Function to write help text (e.g., `sys.stdout.write`) - **env_parent** (Optional[str]) - Optional - Used internally for recursion - **json_parent** (Optional[Tuple[str, ...]]) - Optional - Used internally for recursion ### Example ```python from chatui.configuration import AppConfig import sys AppConfig.print_help(sys.stdout.write) ``` ``` -------------------------------- ### Set Verbosity to DEBUG Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/entrypoint.md Example of setting the verbosity to DEBUG level using multiple `-v` flags. 
```bash # Verbosity = 2 (DEBUG) python -m chatui -vv ``` -------------------------------- ### Basic Application Execution Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/entrypoint.md Demonstrates the basic command to run the chat UI application from the command line. ```bash python -m chatui ``` -------------------------------- ### Run ChatUI Application Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/README.md Command-line instructions for running the chatui application with different options, including basic usage, configuration file, help, and debug logging. ```bash python -m chatui ``` ```bash python -m chatui --config /path/to/config.json ``` ```bash python -m chatui --help-config ``` ```bash python -m chatui -vv # Debug logging ``` -------------------------------- ### Set Verbosity to WARN only Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/entrypoint.md Example of setting the verbosity to WARN level by decreasing verbosity using the `-q` flag. ```bash # Verbosity = -1 (WARN only) python -m chatui -q ``` -------------------------------- ### Handle TavilyAPIError in Web Search Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/errors.md Example of catching TavilyAPIError during a web search operation. Use this to gracefully handle search failures. ```python from chatui.utils.graph import TavilyAPIError try: state = graph.web_search(state) except TavilyAPIError as e: print(f"Web search failed: {e}") # Handle gracefully: skip web search, use only docs ``` -------------------------------- ### Increase Graph Recursion Limit Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/errors.md Example of setting the RECURSION_LIMIT environment variable to manage LangGraph recursion depth. Use this to prevent GraphRecursionError. 
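Ahead of the shell example, here is a minimal sketch of how the variable might be consumed on the Python side. `read_recursion_limit` is a hypothetical helper, and the default of 10 is an assumption taken from the sample `export RECURSION_LIMIT=10` in the environment setup. The project's actual parsing may differ. LangGraph reads the limit from the `recursion_limit` key of the config passed to `invoke`.

```python
import os


def read_recursion_limit(default: int = 10) -> int:
    """Hypothetical helper: parse RECURSION_LIMIT, falling back to a default."""
    raw = os.environ.get("RECURSION_LIMIT", "")
    try:
        value = int(raw)
    except ValueError:
        return default  # Unset or not an integer
    # Reject zero and negative values, which cannot bound a graph run
    return value if value > 0 else default


# Config dict in the shape LangGraph expects, e.g. app.invoke(state, config=run_config)
run_config = {"recursion_limit": read_recursion_limit()}
```

Falling back to a default on bad input keeps a mistyped value from crashing startup, at the cost of silently ignoring it. Logging a warning in that branch would be a reasonable addition.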
```bash # Increase recursion limit export RECURSION_LIMIT=20 python -m chatui ``` -------------------------------- ### Initialize ChatClient Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/ChatClient.md Instantiate the ChatClient with the server URL and the desired model name. ```python from chatui.chat_client import ChatClient class ChatClient: def __init__(self, server_url: str, model_name: str) -> None: ... ``` -------------------------------- ### Configure Structured Logging Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/logger.md Sets up basic structured logging to a file named 'chatui.log' using a specified format and log level. This is configured via bootstrap_logging(). ```python logging.basicConfig( filename='chatui.log', format=_LOG_FMT, # "[PID] TIMESTAMP [LEVEL] - LOGGER - MESSAGE" level=log_level ) ``` -------------------------------- ### Handle RuntimeError in Configuration Loading Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/errors.md Example of catching RuntimeError when loading application configuration fails. Ensure configuration data is valid and accessible. ```python from chatui.configuration import AppConfig try: config = AppConfig.from_file("/invalid/path.json") if not config: raise RuntimeError("Failed to load configuration") except RuntimeError as e: print(f"Configuration error: {e}") ``` -------------------------------- ### Handle ValueError for GPU Incompatibility Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/errors.md Example of catching ValueError when initializing CustomChatOpenAI with incompatible GPU configurations. Verify GPU compatibility and configuration. 
```python from chatui.utils.nim import CustomChatOpenAI try: llm = CustomChatOpenAI( custom_endpoint="localhost", gpu_type="A100", gpu_count="2", model_name="meta/llama-3.3-70b-instruct" ) except ValueError as e: print(f"GPU incompatibility: {e}") ``` -------------------------------- ### Load Application Configuration Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/entrypoint.md Loads the application configuration from a specified file. Exits if configuration loading fails. ```python config = configuration.AppConfig.from_file(config_file) if not config: sys.exit(1) # Exit code 1 for config failure ``` -------------------------------- ### Set Verbosity to Higher DEBUG Level Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/entrypoint.md Example of setting the verbosity to a higher DEBUG level (e.g., 3) using multiple `-v` flags. ```bash # Verbosity = 3 (DEBUG) python -m chatui -vvv ``` -------------------------------- ### Initialize Chroma Vector Store Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/types.md Instantiate the Chroma vector store, specifying collection name, embedding function, and persistence directory. Defaults are provided for convenience. ```python from langchain_community.vectorstores import Chroma class Chroma(VectorStore): def __init__( self, collection_name: str = "rag-chroma", embedding_function: Embeddings = None, persist_directory: str = "/project/data" ): ... ``` -------------------------------- ### RAG Chain for Generation Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/code/langgraph_rag_agent_llama3_nvidia_nim.ipynb Implements the RAG chain for generating answers. It uses a prompt template to guide the LLM with retrieved context and formats the output. 
```python from langchain.prompts import PromptTemplate from langchain_core.output_parsers import StrOutputParser from langchain_nvidia_ai_endpoints import ChatNVIDIA # Prompt prompt = PromptTemplate( template="""<|begin_of_text|><|start_header_id|>system<|end_header_id|> You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise <|eot_id|><|start_header_id|>user<|end_header_id|> Question: {question} Context: {context} Answer: <|eot_id|><|start_header_id|>assistant<|end_header_id|>""", input_variables=["question", "context"], ) llm = ChatNVIDIA(model=model_id, temperature=0) # Post-processing def format_docs(docs): return "\n\n".join(doc.page_content for doc in docs) # Chain rag_chain = prompt | llm | StrOutputParser() # Run question = "agent memory" docs = retriever.invoke(question) generation = rag_chain.invoke({"context": docs, "question": question}) print(generation) ``` -------------------------------- ### Assets Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/README.md Handles UI theming, including loading the Kaizen theme and applying CSS styling. ```APIDOC ## Assets ### Description Manages static assets for UI theming. ### Features - Loads the Kaizen theme. - Applies CSS styling for the user interface. ``` -------------------------------- ### from_dict Class Method Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/AppConfig.md Creates an AppConfig instance from a dictionary. Environment variables can override values provided in the dictionary. ```APIDOC ## Class Method: from_dict ### Description Creates an AppConfig from a dictionary, with environment variable overrides. 
### Method Signature ```python @classmethod def from_dict(cls, data: Dict[str, Any]) -> AppConfig ``` ### Parameters #### Path Parameters - **data** (Dict[str, Any]) - Required - Configuration dictionary ### Returns - `AppConfig` — Configuration instance ### Raises - `RuntimeError` — If data is not a dictionary ### Process 1. Validates input is dictionary 2. Iterates through `cls.envvars()` (list of supported env vars) 3. For each env var set, parses and updates dictionary at configured path 4. Binds LoadMeta with CAMEL case key transformation 5. Creates instance from merged data ### Example ```python from chatui.configuration import AppConfig data = { "serverUrl": "http://api.local", "serverPort": "8080", "modelName": "mistral-mixtral" } config = AppConfig.from_dict(data) ``` ``` -------------------------------- ### Configure and Launch Gradio Application Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/entrypoint.md Enables queuing for the Gradio blocks and launches the application on a specified server address and port. The root path can also be configured. ```python blocks.queue(max_size=10) blocks.launch( server_name="0.0.0.0", server_port=8080, root_path=proxy_prefix ) ``` -------------------------------- ### Initialize ChatNVIDIA Model Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/types.md Set up the ChatNVIDIA model for interacting with NVIDIA cloud endpoints. Key parameters include the model ID, temperature, and API key for authentication. ```python from langchain_nvidia_ai_endpoints import ChatNVIDIA class ChatNVIDIA(BaseChatModel): def __init__( self, model: str, temperature: float = 0.7, api_key: str = None ): ... 
``` -------------------------------- ### Configure and Test NIM Container API Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/agentic-rag-docs/self-host.md Instructions to check NIM container logs and test its API endpoint using curl. The model is configured automatically on container start. ```bash # Check the container logs docker logs nim # Test the API endpoint curl -X POST http://localhost:/v1/chat/completions \ -H "Content-Type: application/json" \ -d '{ "messages": [ {"role": "user", "content": "Hello, NIM!"} ] }' ``` -------------------------------- ### Run API Server with Uvicorn Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/configuration.md Initialize the APIServer with a ChatClient instance and configure its routes. The server can then be run using uvicorn, specifying host, port, and worker count. ```python api_server = APIServer(client=chat_client) api_server.configure_routes() # Run with uvicorn # uvicorn chatui.api:app --host 0.0.0.0 --port 8080 --workers 1 ``` -------------------------------- ### AppConfig Loading Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/README.md Provides methods to load application configuration from a file or a dictionary. Also includes utilities for printing help and accessing environment variables. ```python AppConfig.from_file(filepath: str) ``` ```python AppConfig.from_dict(data: Dict) ``` ```python AppConfig.print_help(help_printer: Callable) ``` ```python AppConfig.envvars() ``` -------------------------------- ### Run Application with SSL Enabled Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/entrypoint.md Enables SSL for the application server by providing paths to the SSL key and certificate files. This is crucial for secure communication. 
```bash
python -m chatui --ssl-keyfile /path/to/key.pem --ssl-certfile /path/to/cert.pem
```

--------------------------------

### configure_routes

Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/APIServer.md

Configures all HTTP routes for the server, mounts Gradio applications, and sets up static file serving.

```APIDOC
## configure_routes

### Description
Configures all HTTP routes for the server, mounts Gradio applications, and sets up static file serving.

### Method
configure_routes

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Request Example
```python
from chatui.api import APIServer
from chatui.chat_client import ChatClient
import uvicorn

client = ChatClient("http://localhost:8000", "llama-3.3-70b")
api_server = APIServer(client)
api_server.configure_routes()

# Start server
uvicorn.run(api_server, host="0.0.0.0", port=8080, workers=1)
```

### Response
#### Success Response (200)
None

#### Response Example
None

### Routes Configured:
- GET `/` → Serves `converse.html` (main chat interface)
- GET `/converse` → Serves `converse.html` (chat page)
- GET `/kb` → Serves `kb.html` (knowledge base management page)
- POST `/content/call/` → Gradio chat interface endpoint
- POST `/content/kb/call/` → Gradio knowledge base endpoint
- GET `/*` → Static file serving from `/chatui/static/`

### Mounted Gradio Apps:
- Path: `/content/` → Chat interface (converse page)
- Path: `/content/kb/` → Knowledge base management (kb page)
```

--------------------------------

### Custom Router Prompt Variant

Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/prompts.md

A custom variant of the router prompt, tailored for questions specifically about AI Workbench. It guides the model to choose between web search or a knowledge base based on the nature of the query.

```text
Determine the best data source for this question about AI Workbench:

Web search: Use for recent news, updates, or trending AI/ML topics
Knowledge base: Use for AI Workbench documentation, configuration, and usage

Question: {question}
```

--------------------------------

### Create Chat API Client

Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/entrypoint.md

Initializes the chat client using the server URL, port, and model name from the loaded configuration.

```python
api_url = f"{config.server_url}:{config.server_port}"
client = chat_client.ChatClient(api_url, config.model_name)
```

--------------------------------

### Get Retriever from Chroma

Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/database.md

Creates and returns a LangChain retriever instance from the Chroma vector store. This retriever is used for querying the vector database and fetching relevant documents. It defaults to retrieving the top 4 results.

```python
from chatui.utils import database

retriever = database.get_retriever()
docs = retriever.invoke("How do I install AI Workbench?")
print(f"Retrieved {len(docs)} documents")
for doc in docs:
    print(doc.page_content)
```

--------------------------------

### from_file Class Method

Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/AppConfig.md

Loads application configuration from a specified JSON or YAML file. It automatically detects the file format and merges environment variables for overrides.

```APIDOC
## Class Method: from_file

### Description
Loads configuration from JSON or YAML file with automatic format detection.

### Method Signature
```python
@classmethod
def from_file(cls, filepath: str) -> Optional[AppConfig]
```

### Parameters
#### Path Parameters
- **filepath** (str) - Required - Path to config file (JSON or YAML format)

### Returns
- `Optional[AppConfig]` - AppConfig instance or None if file cannot be loaded

### Process
1. Attempts to open file with UTF-8 encoding
2. Reads and parses JSON or YAML
3. Merges environment variables (highest priority)
4. Creates AppConfig via `from_dict()`
5. Returns config or None on error

### Error Handling
- FileNotFoundError → logs error, returns None
- PermissionError → logs error, returns None
- JSON/YAML parse errors → logs detailed errors
- Missing required fields → logs error, returns None
- Invalid field values → logs error, returns None

### Example
```python
from chatui.configuration import AppConfig

# Load from file
config = AppConfig.from_file("/path/to/config.json")
if config:
    print(f"Server: {config.server_url}:{config.server_port}")
    print(f"Model: {config.model_name}")
else:
    print("Failed to load configuration")
```

### Supported Formats

JSON:
```json
{
  "serverUrl": "http://localhost",
  "serverPort": "8000",
  "serverPrefix": "/rag-api/",
  "modelName": "llama-3.3-70b"
}
```

YAML:
```yaml
serverUrl: http://localhost
serverPort: "8000"
serverPrefix: /rag-api/
modelName: llama-3.3-70b
```
```

--------------------------------

### APIServer Initialization

Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/README.md

Initializes the APIServer with a ChatClient instance. This server handles API requests.

```python
APIServer(client: ChatClient)
```

--------------------------------

### Main Application Execution Flow

Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/entrypoint.md

The main script entry point for the application. It orchestrates argument parsing, configuration loading, client initialization, UI building, and server launching.

```python
# Main script entry point
if __name__ == "__main__":
    # 1. Parse CLI arguments
    args = parse_args()

    # 2. Configure logging verbosity
    os.environ["APP_VERBOSITY"] = f"{args.verbose - args.quiet}"
    os.environ["APP_CONFIG_FILE"] = args.config

    # 3. Load config
    from chatui import api, chat_client, configuration, pages
    config_file = os.environ.get("APP_CONFIG_FILE", "/dev/null")
    config = configuration.AppConfig.from_file(config_file)
    if not config:
        sys.exit(1)

    # 4. Connect to backend
    api_url = f"{config.server_url}:{config.server_port}"
    client = chat_client.ChatClient(api_url, config.model_name)
    proxy_prefix = os.environ.get("PROXY_PREFIX")

    # 5. Build and launch UI
    blocks = pages.converse.build_page(client)
    blocks.queue(max_size=10)
    blocks.launch(
        server_name="0.0.0.0",
        server_port=8080,
        root_path=proxy_prefix
    )
```

--------------------------------

### Create AppConfig from Dictionary

Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/AppConfig.md

Creates an AppConfig instance from a dictionary. Environment variables are automatically merged to override dictionary values.

```python
from chatui.configuration import AppConfig

data = {
    "serverUrl": "http://api.local",
    "serverPort": "8080",
    "modelName": "mistral-mixtral"
}
config = AppConfig.from_dict(data)
```

--------------------------------

### Run Application on Custom Host and Port

Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/entrypoint.md

Specifies a custom hostname and port for the application server. Useful for avoiding conflicts or for specific network configurations.

```bash
python -m chatui --host localhost --port 9000
```

--------------------------------

### CustomChatOpenAI Constructor

Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/CustomChatOpenAI.md

Initializes the CustomChatOpenAI model.
Supports specifying the custom endpoint, port, model name, GPU details, and temperature. Raises ValueError for incompatible GPU configurations.

```python
def __init__(
    self,
    custom_endpoint: str,
    port: str = "8000",
    model_name: str = "meta/llama-3.1-8b-instruct",
    gpu_type: Optional[str] = None,
    gpu_count: Optional[str] = None,
    temperature: float = 0.0,
    **kwargs
) -> None
```

--------------------------------

### ChatClient Constructor

Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/ChatClient.md

Initializes the ChatClient with the server URL and model name.

```APIDOC
## Class: ChatClient

Client for communicating with the Agentic RAG backend API service. Handles document search requests, model inference, and document uploads.

### Constructor
```python
def __init__(self, server_url: str, model_name: str) -> None:
```

#### Parameters
- **server_url** (str) - Required - Base URL of the chat API server, e.g., `http://localhost:8000`
- **model_name** (str) - Required - Friendly name identifier of the LLM model being used
```

--------------------------------

### Load AppConfig from File

Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/AppConfig.md

Loads application configuration from a specified file (JSON or YAML). Environment variables are merged with the highest priority. Returns an AppConfig instance or None if loading fails.

```python
from chatui.configuration import AppConfig

# Load from file
config = AppConfig.from_file("/path/to/config.json")
if config:
    print(f"Server: {config.server_url}:{config.server_port}")
    print(f"Model: {config.model_name}")
else:
    print("Failed to load configuration")
```

--------------------------------

### Question Router Prompt Template

Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/code/langgraph_rag_agent_llama3_nvidia_nim.ipynb

Sets up a prompt template for a question router.
The router LLM decides whether to use the vector store or web search based on the user's question. It is configured to output a JSON object with a single 'datasource' key. Questions about LLM agents, prompt engineering, and adversarial attacks are routed to the vector store.

```python
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import PromptTemplate

# `llm` and `retriever` are assumed to be defined in earlier notebook cells.
prompt = PromptTemplate(
    template="""<|begin_of_text|><|start_header_id|>system<|end_header_id|> You are an expert at routing a
    user question to a vectorstore or web search. Use the vectorstore for questions on LLM agents,
    prompt engineering, and adversarial attacks. You do not need to be stringent with the keywords
    in the question related to these topics. Otherwise, use web-search. Give a binary choice 'web_search'
    or 'vectorstore' based on the question. Return a JSON with a single key 'datasource' and
    no preamble or explanation. Question to route: {question} <|eot_id|><|start_header_id|>assistant<|end_header_id|>""",
    input_variables=["question"],
)

question_router = prompt | llm | JsonOutputParser()
question = "llm agent memory"
docs = retriever.get_relevant_documents(question)
doc_txt = docs[1].page_content
print(question_router.invoke({"question": question}))
```

--------------------------------

### Retrieve AppConfig Environment Variables

Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/api-reference/AppConfig.md

Returns a list of valid environment variables and their corresponding configuration paths. Useful for understanding how environment variables map to configuration settings.

```python
from chatui.configuration import AppConfig

env_vars = AppConfig.envvars()
for env_name, path, field_type in env_vars:
    print(f"{env_name} → {path} (type: {field_type})")
```

--------------------------------

### AppConfig Loading

Source: https://github.com/nvidia/workbench-example-agentic-rag/blob/main/_autodocs/README.md

Provides methods to load application configuration from a file or a dictionary, and utilities for environment variables and help printing.

```APIDOC
## AppConfig

### Description
Provides methods to load application configuration from a file or a dictionary, and utilities for environment variables and help printing.

### Methods
- **from_file(filepath: str)**: Loads configuration from a JSON or YAML file; returns None if the file cannot be loaded.
- **from_dict(data: Dict)**: Loads configuration from a dictionary.
- **print_help(help_printer: Callable)**: Prints configuration help.
- **envvars()**: Returns the list of supported environment variables and their configuration paths.
```
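
The override step that `from_dict()` is described as performing (iterate over the supported environment variables, then write each one that is set into the dictionary at its configured path) can be sketched as follows. This is a minimal sketch under stated assumptions, not the project's implementation: `merge_env_overrides`, `EXAMPLE_ENVVARS`, and the `APP_*` variable names are hypothetical, and the `(env_name, path, parser)` tuple shape is only loosely modeled on the `envvars()` example.

```python
import copy
import os
from typing import Any, Callable, Dict, List, Mapping, Optional, Tuple

# Hypothetical stand-in for AppConfig.envvars(); real names and paths may differ.
EnvVarSpec = Tuple[str, Tuple[str, ...], Callable[[str], Any]]

EXAMPLE_ENVVARS: List[EnvVarSpec] = [
    ("APP_SERVERURL", ("serverUrl",), str),
    ("APP_SERVERPORT", ("serverPort",), str),
    ("APP_MODELNAME", ("modelName",), str),
]


def merge_env_overrides(
    data: Dict[str, Any],
    envvars: List[EnvVarSpec],
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, Any]:
    """Return a copy of `data` with every set environment variable applied."""
    if not isinstance(data, dict):
        raise RuntimeError("Configuration data must be a dictionary")
    environ = os.environ if environ is None else environ

    merged = copy.deepcopy(data)  # leave the caller's dictionary untouched
    for env_name, path, parse in envvars:
        raw = environ.get(env_name)
        if raw is None:
            continue  # variable not set: keep the value from `data`
        target = merged
        for key in path[:-1]:
            target = target.setdefault(key, {})  # walk or create nested dicts
        target[path[-1]] = parse(raw)  # environment value takes priority
    return merged


file_values = {"serverUrl": "http://localhost", "serverPort": "8000"}
merged = merge_env_overrides(file_values, EXAMPLE_ENVVARS, {"APP_SERVERPORT": "9000"})
print(merged["serverPort"])
```

Passing `environ` explicitly keeps the sketch testable without touching the real process environment; omitting it falls back to `os.environ`.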