### Install dify-dataset-sdk using pip

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

Installs the Dify Knowledge Base SDK using pip, the Python package installer. This is the primary way to get the library into your Python environment.

```bash
pip install dify-dataset-sdk
```

--------------------------------

### Clone Repository and Install Dependencies (Bash)

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

Commands to clone the Dify SDK repository from GitHub and install its development dependencies using pip. This sets up the local environment for development.

```bash
# Clone the repository
git clone https://github.com/LeekJay/dify-dataset-sdk.git
cd dify-dataset-sdk

# Install dependencies
pip install -e ".[dev]"
```

--------------------------------

### Advanced Retrieval Methods (Semantic, Hybrid, Full-Text)

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

Shows examples of performing different types of searches within a dataset: semantic search, hybrid search (combining semantic and full-text), and pure full-text search, with configurable parameters like top_k and score_threshold.

```python
# Semantic search
results = client.retrieve(
    dataset_id=dataset_id,
    query="How to implement authentication?",
    retrieval_config={
        "search_method": "semantic_search",
        "top_k": 5,
        "score_threshold": 0.7
    }
)

# Hybrid search (combining semantic and full-text)
results = client.retrieve(
    dataset_id=dataset_id,
    query="API documentation",
    retrieval_config={
        "search_method": "hybrid_search",
        "top_k": 10,
        "rerank_model": {
            "model": "rerank-multilingual-v2.0",
            "mode": "reranking_model"
        }
    }
)

# Full-text search
results = client.retrieve(
    dataset_id=dataset_id,
    query="database configuration",
    retrieval_config={"search_method": "full_text_search", "top_k": 5}
)
```

--------------------------------

### Initialize DifyDatasetClient and manage datasets

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

Demonstrates initializing the DifyDatasetClient with an API key and base URL. It shows how to create a new dataset, list existing datasets with pagination, and delete a dataset.

```python
from dify_dataset_sdk import DifyDatasetClient

# Initialize the client with API key and custom base URL
client = DifyDatasetClient(
    api_key="your-api-key",
    base_url="https://your-custom-dify-instance.com",
    timeout=60.0  # Custom timeout in seconds
)

# Create a new dataset (knowledge base)
dataset = client.create_dataset(
    name="My Knowledge Base",
    permission="only_me"
)

# Create a dataset with description
dataset = client.create_dataset(
    name="Technical Documentation",
    permission="only_me",
    description="Internal technical docs"
)

# List datasets with pagination
datasets = client.list_datasets(page=1, limit=20)

# Delete a dataset (ensure dataset_id is defined)
# client.delete_dataset(dataset_id)

# Close the client
client.close()
```

--------------------------------

### Run Tests with Pytest (Bash)

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

Instructions for running tests using the pytest framework.
Includes commands for running all tests, specific files, and with verbose output.

```bash
# Run all tests
pytest

# Run specific test file
python tests/test_all_39_apis.py

# Run with verbose output
pytest -v
```

--------------------------------

### Client Configuration

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

Details on how to configure and initialize the Dify Dataset Client.

```APIDOC
## Client Configuration

### Description
Initialize the Dify Dataset Client with your API key and optional parameters.

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Request Example
```python
DifyDatasetClient(
    api_key: str,      # Required: Your Dify API key
    base_url: str,     # Optional: API base URL (default: "https://api.dify.ai")
    timeout: float     # Optional: Request timeout in seconds (default: 30.0)
)
```

### Response
None
```

--------------------------------

### Create documents from text and files

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

Shows how to create documents within a dataset. Supports creating documents directly from plain text content or from local files like PDFs, with options for indexing techniques and processing rules.

```python
from dify_dataset_sdk import DifyDatasetClient

# Assume client is initialized and dataset_id is available
# client = DifyDatasetClient(api_key="your-api-key")
# dataset_id = dataset.id

# Create a document from text
doc_response = client.create_document_by_text(
    dataset_id=dataset_id,
    name="Sample Document",
    text="This is a sample document for the knowledge base.",
    indexing_technique="high_quality"
)

# Create document from text with custom processing mode
doc_response = client.create_document_by_text(
    dataset_id=dataset_id,
    name="API Documentation",
    text="Complete API documentation content...",
    indexing_technique="high_quality",
    process_rule_mode="automatic"
)

# Create document from a local file
doc_response = client.create_document_by_file(
    dataset_id=dataset_id,
    file_path="./documentation.pdf",
    indexing_technique="high_quality"
)
```

--------------------------------

### Initialize DifyDatasetClient (Python)

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

Configuration for the DifyDatasetClient, specifying the API key, optional base URL, and request timeout. This client is used to interact with the Dify API.

```python
DifyDatasetClient(
    api_key: str,      # Required: Your Dify API key
    base_url: str,     # Optional: API base URL (default: "https://api.dify.ai")
    timeout: float     # Optional: Request timeout in seconds (default: 30.0)
)
```

--------------------------------

### Manage Knowledge Tags and Bind Datasets

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

Demonstrates creating knowledge tags, binding datasets to these tags, listing all tags, retrieving tags for a specific dataset, and filtering datasets by tags using the Dify SDK.

```python
# Create knowledge tags
tag = client.create_knowledge_tag(name="Technical Documentation")
dept_tag = client.create_knowledge_tag(name="Engineering Department")

# Bind datasets to tags
client.bind_dataset_to_tag(dataset_id, [tag.id, dept_tag.id])

# List all knowledge tags
tags = client.list_knowledge_tags()

# Get tags for a specific dataset
dataset_tags = client.get_dataset_tags(dataset_id)

# Filter datasets by tags
filtered_datasets = client.list_datasets(tag_ids=[tag.id])
```

--------------------------------

### Health Monitoring

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

Provides an overview of how to monitor the SDK's performance and API health.

```APIDOC
## Health Monitoring

### Description
Monitor SDK performance and API health by tracking requests, errors, and response times.

### Example Usage
```python
import time  # needed for the response-time measurement below

class SDKMonitor:
    def __init__(self, client):
        self.client = client
        self.metrics = {"requests": 0, "errors": 0, "avg_response_time": 0}

    def health_check(self):
        try:
            start_time = time.time()
            self.client.list_datasets(limit=1)
            response_time = time.time() - start_time
            return {"status": "healthy", "response_time": response_time}
        except Exception as e:
            return {"status": "unhealthy", "error": str(e)}
```
```

--------------------------------

### Batch Document Upload using ThreadPoolExecutor

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

Shows how to process multiple documents in parallel using Python's `concurrent.futures.ThreadPoolExecutor`, uploading each file to a dataset with the specified indexing technique.

```python
from concurrent.futures import ThreadPoolExecutor

def upload_document(file_path):
    return client.create_document_by_file(
        dataset_id=dataset_id,
        file_path=file_path,
        indexing_technique="high_quality"
    )

# Parallel document upload
with ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(upload_document, file) for file in file_list]
    results = [future.result() for future in futures]
```

--------------------------------

### Knowledge Tag Management API

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

APIs for creating, listing, and binding knowledge tags to datasets.

```APIDOC
## POST /api/knowledge/tags

### Description
Creates a new knowledge tag.

### Method
POST

### Endpoint
/api/knowledge/tags

### Parameters
#### Request Body
- **name** (string) - Required - The name of the knowledge tag.

### Request Example
```json
{
  "name": "Technical Documentation"
}
```

### Response
#### Success Response (200)
- **id** (string) - The ID of the created tag.
- **name** (string) - The name of the created tag.

### Response Example
```json
{
  "id": "tag_123",
  "name": "Technical Documentation"
}
```

## POST /api/datasets/{dataset_id}/tags

### Description
Binds one or more knowledge tags to a specific dataset.

### Method
POST

### Endpoint
/api/datasets/{dataset_id}/tags

### Parameters
#### Path Parameters
- **dataset_id** (string) - Required - The ID of the dataset.

#### Request Body
- **tag_ids** (array of strings) - Required - A list of tag IDs to bind to the dataset.

### Request Example
```json
{
  "tag_ids": ["tag_123", "tag_456"]
}
```

### Response
#### Success Response (200)
- **message** (string) - Confirmation message.

### Response Example
```json
{
  "message": "Tags bound successfully."
}
```

## GET /api/knowledge/tags

### Description
Lists all available knowledge tags.
### Method
GET

### Endpoint
/api/knowledge/tags

### Response
#### Success Response (200)
- **tags** (array of objects) - A list of knowledge tags, each with 'id' and 'name'.

### Response Example
```json
{
  "tags": [
    {"id": "tag_123", "name": "Technical Documentation"},
    {"id": "tag_456", "name": "Engineering Department"}
  ]
}
```

## GET /api/datasets/{dataset_id}/tags

### Description
Retrieves all knowledge tags associated with a specific dataset.

### Method
GET

### Endpoint
/api/datasets/{dataset_id}/tags

### Parameters
#### Path Parameters
- **dataset_id** (string) - Required - The ID of the dataset.

### Response
#### Success Response (200)
- **tags** (array of objects) - A list of tags associated with the dataset, each with 'id' and 'name'.

### Response Example
```json
{
  "tags": [
    {"id": "tag_123", "name": "Technical Documentation"}
  ]
}
```

## GET /api/datasets

### Description
Lists datasets, with an option to filter by knowledge tags.

### Method
GET

### Endpoint
/api/datasets

### Parameters
#### Query Parameters
- **tag_ids** (array of strings) - Optional - Filters datasets by a list of tag IDs.

### Response
#### Success Response (200)
- **datasets** (array of objects) - A list of datasets matching the filter criteria.

### Response Example
```json
{
  "datasets": [
    {"id": "dataset_abc", "name": "Dataset A"},
    {"id": "dataset_def", "name": "Dataset B"}
  ]
}
```
```

--------------------------------

### Format and Check Code with Ruff (Bash)

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

Commands to format code and check for linting issues using the Ruff tool. These commands help maintain code quality and consistency.

```bash
# Format code
ruff format dify_dataset_sdk/

# Check and fix issues
ruff check --fix dify_dataset_sdk/

# Type checking
mypy dify_dataset_sdk/
```

--------------------------------

### Supported File Types

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

Lists the file types that the SDK supports for uploading data.

```APIDOC
## Supported File Types

### Description
The SDK supports uploading data from various file formats.

### File Types
- `txt` - Plain text files
- `md`, `markdown` - Markdown files
- `pdf` - PDF documents
- `html` - HTML files
- `xlsx` - Excel spreadsheets
- `docx` - Word documents
- `csv` - CSV files
```

--------------------------------

### Error Handling with Retry

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

Demonstrates how to implement automatic retry mechanisms for robust error handling.

```APIDOC
## Error Handling with Retry

### Description
Implement robust error handling with automatic retry using exponential backoff for network-related errors.

### Example Usage
```python
from dify_dataset_sdk.exceptions import DifyTimeoutError, DifyConnectionError
import time

def safe_operation_with_retry(operation, max_retries=3):
    for attempt in range(max_retries):
        try:
            return operation()
        except (DifyTimeoutError, DifyConnectionError) as e:
            if attempt < max_retries - 1:
                wait_time = 2 ** attempt  # Exponential backoff
                time.sleep(wait_time)
                continue
            raise e
```
```

--------------------------------

### Rate Limits

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

Information regarding API rate limits and the SDK's handling of them.

```APIDOC
## Rate Limits

### Description
Users must adhere to Dify's API rate limits. The SDK is designed with built-in error handling for rate limit responses.
```

--------------------------------

### Batch Processing API

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

Demonstrates efficient processing of multiple documents using parallel operations.

```APIDOC
## Parallel Document Upload

### Description
Uploads multiple documents to a dataset concurrently using a thread pool executor.

### Method
POST (implicitly via `create_document_by_file` calls)

### Endpoint
`/api/datasets/{dataset_id}/documents` (for `create_document_by_file`)

### Parameters (for `create_document_by_file`)
#### Path Parameters
- **dataset_id** (string) - Required - The ID of the dataset to upload to.

#### Request Body (for `create_document_by_file`)
- **file_path** (string) - Required - The path to the document file.
- **indexing_technique** (string) - Optional - The technique to use for indexing (e.g., "high_quality").

### Code Example
```python
from concurrent.futures import ThreadPoolExecutor

def upload_document(client, dataset_id, file_path):
    """Helper function to upload a single document."""
    try:
        return client.create_document_by_file(
            dataset_id=dataset_id,
            file_path=file_path,
            indexing_technique="high_quality"
        )
    except Exception as e:
        print(f"Error uploading {file_path}: {e}")
        return None

# Assume 'client' is an initialized Dify SDK client instance
# Assume 'dataset_id' is the target dataset ID
# Assume 'file_list' is a list of file paths to upload
file_list = ["/path/to/doc1.pdf", "/path/to/doc2.docx", "/path/to/doc3.txt"]

print("Starting parallel document upload...")
with ThreadPoolExecutor(max_workers=3) as executor:
    # Submit upload tasks to the executor
    futures = [executor.submit(upload_document, client, dataset_id, file) for file in file_list]

    # Collect results as tasks complete
    results = []
    for future in futures:
        result = future.result()
        if result:
            results.append(result)

print(f"Successfully uploaded {len(results)} documents.")
```

### Response (for `create_document_by_file`)
#### Success Response (200)
- **document_id** (string) - The ID of the uploaded document.
- **status** (string) - The status of the document upload.

### Response Example (for `create_document_by_file`)
```json
{
  "document_id": "doc_uuid_123",
  "status": "uploaded"
}
```
```

--------------------------------

### Manage Metadata Fields and Update Documents

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

Illustrates how to create custom metadata fields (e.g., 'category', 'priority') for datasets and subsequently update document metadata with specific values using the SDK.

```python
# Create metadata fields
category_field = client.create_metadata_field(
    dataset_id=dataset_id,
    field_type="string",
    name="category"
)

priority_field = client.create_metadata_field(
    dataset_id=dataset_id,
    field_type="number",
    name="priority"
)

# Update document metadata
metadata_operations = [
    {
        "document_id": document_id,
        "metadata_list": [
            {
                "id": category_field.id,
                "value": "technical",
                "name": "category"
            },
            {
                "id": priority_field.id,
                "value": "5",
                "name": "priority"
            }
        ]
    }
]

client.update_document_metadata(dataset_id, metadata_operations)
```

--------------------------------

### Progress Monitoring API

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

API for monitoring the status of document indexing within a dataset.

```APIDOC
## GET /api/datasets/{dataset_id}/indexing-status

### Description
Retrieves the indexing status for documents in a dataset.

### Method
GET

### Endpoint
/api/datasets/{dataset_id}/indexing-status

### Parameters
#### Path Parameters
- **dataset_id** (string) - Required - The ID of the dataset.
- **batch_id** (string) - Required - The ID of the indexing batch.

### Response
#### Success Response (200)
- **data** (array of objects) - Information about the indexing status.
  - **indexing_status** (string) - The current status of indexing (e.g., "completed", "processing").
  - **completed_segments** (integer) - The number of segments that have been processed.
  - **total_segments** (integer) - The total number of segments to process.

### Response Example
```json
{
  "data": [
    {
      "indexing_status": "completed",
      "completed_segments": 100,
      "total_segments": 100
    }
  ]
}
```
```

--------------------------------

### Comprehensive Error Handling with Dify SDK Exceptions

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

Demonstrates robust error handling by catching specific exception types provided by the Dify SDK, such as authentication errors, validation errors, not found errors, and general API errors, along with their attributes.

```python
from dify_dataset_sdk.exceptions import (
    DifyAPIError,
    DifyAuthenticationError,
    DifyValidationError,
    DifyNotFoundError,
    DifyConflictError,
    DifyServerError,
    DifyConnectionError,
    DifyTimeoutError
)

try:
    dataset = client.create_dataset(name="Test Dataset")
except DifyAuthenticationError:
    print("Invalid API key")
except DifyValidationError as e:
    print(f"Validation error: {e}")
except DifyConflictError as e:
    print(f"Conflict: {e}")  # e.g., duplicate dataset name
except DifyAPIError as e:
    print(f"API error: {e}")
    print(f"Status code: {e.status_code}")
    print(f"Error code: {e.error_code}")
```

--------------------------------

### Monitor Document Indexing Progress

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

Provides a code snippet to retrieve the indexing status of documents within a dataset and display progress information such as completed and total segments.

```python
# Monitor document indexing progress
status = client.get_document_indexing_status(dataset_id, batch_id)

if status.data:
    indexing_info = status.data[0]
    print(f"Status: {indexing_info.indexing_status}")
    print(f"Progress: {indexing_info.completed_segments}/{indexing_info.total_segments}")
```

--------------------------------

### Custom document processing rules

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

Illustrates how to define and use custom processing rules when creating documents from files. This allows for fine-grained control over text cleaning and segmentation, such as removing extra spaces or URLs, and setting segment size.

```python
from dify_dataset_sdk import DifyDatasetClient

# Assume client is initialized and dataset_id is available
# client = DifyDatasetClient(api_key="your-api-key")
# dataset_id = dataset.id

# Custom processing configuration
process_rule_config = {
    "rules": {
        "pre_processing_rules": [
            {"id": "remove_extra_spaces", "enabled": True},
            {"id": "remove_urls_emails", "enabled": True}
        ],
        "segmentation": {
            "separator": "###",
            "max_tokens": 500
        }
    }
}

doc_response = client.create_document_by_file(
    dataset_id=dataset_id,
    file_path="document.txt",
    process_rule_mode="custom",
    process_rule_config=process_rule_config
)
```

--------------------------------

### Manage document segments

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

Provides methods for segment management within a dataset and document. Includes creating multiple segments with content, answers, and keywords, listing existing segments, updating a segment's details, and deleting a specific segment.

```python
from dify_dataset_sdk import DifyDatasetClient

# Assume client is initialized, dataset_id, document_id, and segment_id are available
# client = DifyDatasetClient(api_key="your-api-key")
# dataset_id = dataset.id
# document_id = doc_response.id
# segment_id = segments[0].id

# Create segments
segments_data = [
    {
        "content": "First segment content",
        "answer": "Answer for first segment",
        "keywords": ["keyword1", "keyword2"]
    },
    {
        "content": "Second segment content",
        "answer": "Answer for second segment",
        "keywords": ["keyword3", "keyword4"]
    }
]

segments = client.create_segments(dataset_id, document_id, segments_data)

# List segments
segments = client.list_segments(dataset_id, document_id)

# Update a segment
client.update_segment(
    dataset_id=dataset_id,
    document_id=document_id,
    segment_id=segment_id,
    segment_data={
        "content": "Updated content",
        "keywords": ["updated", "keywords"],
        "enabled": True
    }
)

# Delete a segment
# client.delete_segment(dataset_id, document_id, segment_id)
```

--------------------------------

### Advanced Retrieval API

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

APIs for performing semantic, hybrid, and full-text searches on datasets.

```APIDOC
## POST /api/datasets/{dataset_id}/retrieve

### Description
Performs a retrieval operation on a dataset using specified search methods and configurations.

### Method
POST

### Endpoint
/api/datasets/{dataset_id}/retrieve

### Parameters
#### Path Parameters
- **dataset_id** (string) - Required - The ID of the dataset to search within.

#### Request Body
- **query** (string) - Required - The search query.
- **retrieval_config** (object) - Required - Configuration for the retrieval process.
  - **search_method** (string) - Required - The search method to use (e.g., "semantic_search", "hybrid_search", "full_text_search").
  - **top_k** (integer) - Optional - The number of results to return.
  - **score_threshold** (float) - Optional - The minimum score for results (for semantic search).
  - **rerank_model** (object) - Optional - Configuration for reranking results (for hybrid search).
    - **model** (string) - Required - The name of the rerank model.
    - **mode** (string) - Required - The mode of the rerank model.

### Request Example
```json
{
  "query": "How to implement authentication?",
  "retrieval_config": {
    "search_method": "semantic_search",
    "top_k": 5,
    "score_threshold": 0.7
  }
}
```

### Response
#### Success Response (200)
- **results** (array of objects) - The search results.

### Response Example
```json
{
  "results": [
    {
      "content": "Authentication can be implemented using OAuth 2.0...",
      "score": 0.92,
      "metadata": {}
    }
  ]
}
```
```

--------------------------------

### Error Handling

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

Details on the exception types provided by the SDK for robust error management.

```APIDOC
## Error Handling Overview

The Dify SDK provides a set of custom exception classes to handle various API-related errors gracefully.

### Exception Types
- **DifyAPIError**: Base class for all Dify API errors.
- **DifyAuthenticationError**: Raised for authentication failures (e.g., invalid API key).
- **DifyValidationError**: Raised when request data fails validation.
- **DifyNotFoundError**: Raised when a requested resource is not found.
- **DifyConflictError**: Raised when an operation conflicts with existing resources (e.g., duplicate names).
- **DifyServerError**: Raised for server-side errors.
- **DifyConnectionError**: Raised for network connection issues.
- **DifyTimeoutError**: Raised when an API request times out.

### Usage Example
```python
from dify_dataset_sdk.exceptions import (
    DifyAPIError,
    DifyAuthenticationError,
    DifyValidationError,
    DifyNotFoundError,
    DifyConflictError,
    DifyServerError,
    DifyConnectionError,
    DifyTimeoutError
)

try:
    # Attempt an API operation that might fail
    client.create_dataset(name="My Dataset")
except DifyAuthenticationError:
    print("Authentication failed. Please check your API key.")
except DifyValidationError as e:
    print(f"Validation error occurred: {e}")
except DifyConflictError as e:
    print(f"Conflict detected: {e}. Resource might already exist.")
except DifyNotFoundError as e:
    print(f"Resource not found: {e}")
except DifyServerError as e:
    print(f"Server error: {e}. Status code: {e.status_code}")
except DifyConnectionError:
    print("Could not connect to the Dify API. Check your network connection.")
except DifyTimeoutError:
    print("The request to the Dify API timed out.")
except DifyAPIError as e:
    print(f"An unexpected API error occurred: {e}")
    print(f"Error Code: {e.error_code}")
    print(f"Status Code: {e.status_code}")
except Exception as e:
    print(f"An unexpected error occurred: {e}")
```
```

--------------------------------

### Monitor SDK Performance and API Health (Python)

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

This class provides health monitoring for the SDK by tracking requests, errors, and average response times. The `health_check` method performs a simple API call to gauge the connection and response status.

```python
import time  # needed for the response-time measurement below

class SDKMonitor:
    def __init__(self, client):
        self.client = client
        self.metrics = {"requests": 0, "errors": 0, "avg_response_time": 0}

    def health_check(self):
        try:
            start_time = time.time()
            self.client.list_datasets(limit=1)
            response_time = time.time() - start_time
            return {"status": "healthy", "response_time": response_time}
        except Exception as e:
            return {"status": "unhealthy", "error": str(e)}
```

--------------------------------

### List documents in a dataset

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

Demonstrates how to retrieve a list of all documents within a specific dataset. It shows printing the total count of documents retrieved.

```python
from dify_dataset_sdk import DifyDatasetClient

# Assume client is initialized and dataset_id is available
# client = DifyDatasetClient(api_key="your-api-key")
# dataset_id = dataset.id

# List all documents
documents = client.list_documents(dataset_id)
print(f"Total documents: {documents.total}")
```

--------------------------------

### Metadata Management API

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

APIs for creating metadata fields and updating document metadata.

```APIDOC
## POST /api/datasets/{dataset_id}/metadata-fields

### Description
Creates a new metadata field for a dataset.

### Method
POST

### Endpoint
/api/datasets/{dataset_id}/metadata-fields

### Parameters
#### Path Parameters
- **dataset_id** (string) - Required - The ID of the dataset.

#### Request Body
- **name** (string) - Required - The name of the metadata field.
- **field_type** (string) - Required - The type of the metadata field (e.g., "string", "number").

### Request Example
```json
{
  "name": "category",
  "field_type": "string"
}
```

### Response
#### Success Response (200)
- **id** (string) - The ID of the created metadata field.
- **name** (string) - The name of the metadata field.
- **field_type** (string) - The type of the metadata field.

### Response Example
```json
{
  "id": "field_xyz",
  "name": "category",
  "field_type": "string"
}
```

## POST /api/datasets/{dataset_id}/documents/metadata

### Description
Updates the metadata for one or more documents within a dataset.

### Method
POST

### Endpoint
/api/datasets/{dataset_id}/documents/metadata

### Parameters
#### Path Parameters
- **dataset_id** (string) - Required - The ID of the dataset.

#### Request Body
- **metadata_operations** (array of objects) - Required - A list of operations to update document metadata.
  - **document_id** (string) - Required - The ID of the document to update.
  - **metadata_list** (array of objects) - Required - A list of metadata key-value pairs to apply.
    - **id** (string) - Required - The ID of the metadata field.
    - **value** (string) - Required - The value for the metadata field.
    - **name** (string) - Required - The name of the metadata field.

### Request Example
```json
{
  "metadata_operations": [
    {
      "document_id": "doc_789",
      "metadata_list": [
        {
          "id": "field_xyz",
          "value": "technical",
          "name": "category"
        }
      ]
    }
  ]
}
```

### Response
#### Success Response (200)
- **message** (string) - Confirmation message.

### Response Example
```json
{
  "message": "Document metadata updated successfully."
}
```
```

--------------------------------

### Implement Error Handling with Retry (Python)

Source: https://github.com/leekjay/dify-knowledge-sdk/blob/master/README.md

This function retries an operation that fails with a timeout or connection error. It uses exponential backoff, doubling the wait between attempts (1 s, then 2 s, and so on), and re-raises the last error once `max_retries` attempts are exhausted.

```python
from dify_dataset_sdk.exceptions import DifyTimeoutError, DifyConnectionError
import time

def safe_operation_with_retry(operation, max_retries=3):
    for attempt in range(max_retries):
        try:
            return operation()
        except (DifyTimeoutError, DifyConnectionError) as e:
            if attempt < max_retries - 1:
                wait_time = 2 ** attempt  # Exponential backoff
                time.sleep(wait_time)
                continue
            raise e
```
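
The retry-with-backoff pattern can be tried out without network access or the SDK installed. The following is a minimal sketch under stated assumptions, not part of the SDK: `TransientError`, `retry_with_backoff`, and `FlakyOperation` are hypothetical names standing in for `DifyTimeoutError`/`DifyConnectionError` and a real client call, and the `base_delay` and `sleep` parameters are additions made here so the delays can be recorded instead of waited out.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for DifyTimeoutError / DifyConnectionError."""


def retry_with_backoff(operation, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call operation(); on TransientError wait base_delay * 2**attempt, then retry."""
    for attempt in range(max_retries):
        try:
            return operation()
        except TransientError:
            if attempt == max_retries - 1:
                raise  # retries exhausted: surface the last error
            sleep(base_delay * 2 ** attempt)


class FlakyOperation:
    """Hypothetical callable that fails a fixed number of times, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise TransientError(f"simulated failure {self.calls}")
        return "ok"


# Record the requested delays instead of actually sleeping.
delays = []
flaky = FlakyOperation(failures=2)
result = retry_with_backoff(flaky, max_retries=3, sleep=delays.append)
print(result, flaky.calls, delays)
```

With a real client, the operation would typically be wrapped in a `lambda`, for example `lambda: client.list_datasets(limit=1)`, so that each retry issues a fresh request.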