### Install Vectorize Python Client

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/src/examples/notebooks/vectorize.ipynb

Installs or upgrades the Vectorize Python client library using pip from inside a notebook cell. This is the first step to using the client in your Python environment.

```python
!pip install vectorize-client --upgrade
```

--------------------------------

### Install Vectorize Client

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/scripts/ts_README.md

Installs the Vectorize Node.js client library using npm. Run this before importing the client in your project.

```sh
npm install @vectorize-io/vectorize-client
```

--------------------------------

### Install Vectorize Client

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/src/ts/README.md

Installs the Vectorize Node.js client library using npm. Run this before importing the client in your project.

```sh
npm install @vectorize-io/vectorize-client
```

--------------------------------

### Install Vectorize Client

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/scripts/python_README.md

Installs the Vectorize Python client library using pip. This is the first step to using the client in your Python projects.

```sh
pip install vectorize-client
```

--------------------------------

### Install Vectorize Client

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/src/python/README.md

Installs the Vectorize Python client library using pip. This is the first step to using the client in your Python projects.

```sh
pip install vectorize-client
```

--------------------------------

### Start File Upload to Connector

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/src/examples/notebooks/vectorize.ipynb

Initiates a file upload to a specific source connector. The request specifies the file name, content type, and metadata, and the response contains a pre-signed URL. The file is then sent to that URL with an HTTP PUT.

```python
import urllib3, json, os, mimetypes

http = urllib3.PoolManager()
file_path = "apple.pdf"
content_type, _ = mimetypes.guess_type(file_path)

uploads_api = v.UploadsApi(api)
metadata = {"created-from-api": True}
upload_response = uploads_api.start_file_upload_to_connector(
    org, source_connector_id,
    v.StartFileUploadToConnectorRequest(
        name=file_path.split("/")[-1],
        content_type=content_type,
        # add additional metadata that will be stored along with each chunk in the vector database
        metadata=json.dumps(metadata)))

with open(file_path, "rb") as f:
    response = http.request("PUT", upload_response.upload_url, body=f,
                            headers={"Content-Type": content_type,
                                     "Content-Length": str(os.path.getsize(file_path))})

if response.status != 200:
    print("Upload failed: ", response.data)
else:
    print("Upload successful")
```

--------------------------------

### Upload File via Files API

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/src/examples/notebooks/vectorize.ipynb

Uploads a file using the Files API, which is a more direct method than uploading to a connector. It starts the upload, gets a pre-signed URL, uploads the file content, and then starts an extraction on the uploaded file.

```python
import urllib3, os

files_api = v.FilesApi(api)
file_path = "apple.pdf"
content_type = "application/pdf"  # MIME type of the file being uploaded

start_file_upload_response = files_api.start_file_upload(org, start_file_upload_request=v.StartFileUploadRequest(
    content_type=content_type,
    name="My file.pdf",
))

# PUT the file content to the pre-signed URL returned above
http = urllib3.PoolManager()
with open(file_path, "rb") as f:
    response = http.request("PUT", start_file_upload_response.upload_url, body=f,
                            headers={"Content-Type": content_type,
                                     "Content-Length": str(os.path.getsize(file_path))})

if response.status != 200:
    print("Upload failed: ", response.data)
else:
    print("Upload successful")

# Start text extraction on the uploaded file
extraction_api = v.ExtractionApi(api)
response = extraction_api.start_extraction(org, start_extraction_request=v.StartExtractionRequest(
    file_id=start_file_upload_response.file_id
))
extraction_id = response.extraction_id
```

--------------------------------

### Get Vectorize Pipelines

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/scripts/python_README.md

Demonstrates how to list all available pipelines for a given organization using the Vectorize Python client. It requires an access token and organization ID.

```python
import vectorize_client as v

TOKEN = ''  # your Vectorize access token
ORG = ''    # your Vectorize organization ID

with v.ApiClient(v.Configuration(access_token=TOKEN)) as api:
    pipelines = v.PipelinesApi(api)
    response = pipelines.get_pipelines(ORG)
    print("Found " + str(len(response.data)) + " pipelines")
```

--------------------------------

### Get Built-in AI Platform and Vector Database Connectors

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/src/examples/notebooks/vectorize.ipynb

Retrieves the lists of available AI platform connectors and destination (vector database) connectors, then picks the ID of the first connector of type 'VECTORIZE' (the built-in one) from each list.

```python
ai_platforms = connectors_api.get_ai_platform_connectors(org)
builtin_ai_platform = [c.id for c in ai_platforms.ai_platform_connectors if c.type == "VECTORIZE"][0]

vector_databases = connectors_api.get_destination_connectors(org)
builtin_vector_db = [c.id for c in vector_databases.destination_connectors if c.type == "VECTORIZE"][0]
```

--------------------------------

### Get Vectorize Pipelines

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/src/python/README.md

Demonstrates how to list all available pipelines for a given organization using the Vectorize Python client. It requires an access token and organization ID.

```python
import vectorize_client as v

TOKEN = ''  # your Vectorize access token
ORG = ''    # your Vectorize organization ID

with v.ApiClient(v.Configuration(access_token=TOKEN)) as api:
    pipelines = v.PipelinesApi(api)
    response = pipelines.get_pipelines(ORG)
    print("Found " + str(len(response.data)) + " pipelines")
```

--------------------------------

### Generate Vectorize Clients

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/README.md

Installs project dependencies and generates the Vectorize client for TypeScript and Python. The clients are generated from the OpenAPI specification, one library per language.

```bash
npm install
npm run generate:ts
npm run generate:python
```

--------------------------------

### Start Deep Research on Pipeline Data

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/src/examples/notebooks/vectorize.ipynb

Initiates a deep research task on the data within a pipeline. It takes a query and an optional web search flag, and returns a research ID used to track the task's progress.

```python
pipelines_api = v.PipelinesApi(api)
response = pipelines_api.start_deep_research(org, pipeline_id, v.StartDeepResearchRequest(
    # make sure to include a relevant prompt here
    query="Generate a report on Apple RSE 2024",
    # optionally enable additional search on the web
    web_search=False,
))
research_id = response.research_id
```

--------------------------------

### Release Vectorize Clients

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/README.md

Installs project dependencies and releases the Vectorize client for TypeScript and Python to their respective package managers. Use these commands to publish updated client libraries.

```bash
npm install
npm run release:ts
npm run release:python
```

--------------------------------

### Monitor Extraction Progress and Get Results

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/src/examples/notebooks/vectorize.ipynb

Polls the status of an extraction task using its ID. Once the extraction is complete, it prints the extracted text content or an error message if the process failed.

```python
import time

while True:
    response = extraction_api.get_extraction_result(org, extraction_id)
    if response.ready:
        if response.data.success:
            print(response.data.text)
        else:
            print("Extraction failed: ", response.data.error)
        break
    print("not ready")
    time.sleep(5)  # pause between polls; the 5-second interval is an arbitrary choice
```

--------------------------------

### Monitor Deep Research Progress and Get Results

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/src/examples/notebooks/vectorize.ipynb

Repeatedly checks the status of a deep research task using its ID. Once the task is ready, it prints the research results in markdown format or an error message if it failed.

```python
import time

while True:
    response = pipelines.get_deep_research_result(org, pipeline_id, research_id)
    if response.ready:
        if response.data.success:
            print(response.data.markdown)
        else:
            print("Deep Research failed: ", response.data.error)
        break
    print("not ready")
    time.sleep(5)  # pause between polls; the 5-second interval is an arbitrary choice
```

--------------------------------

### Download Sample PDF File

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/src/examples/notebooks/vectorize.ipynb

Downloads a sample PDF file from Apple's newsroom using wget and saves it as `apple.pdf`. This file is used later for data ingestion and processing.

```shell
!wget -O apple.pdf https://www.apple.com/newsroom/pdfs/fy2024-q1/FY24_Q1_Consolidated_Financial_Statements.pdf
```

--------------------------------

### List Vectorize Pipelines

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/src/examples/notebooks/vectorize.ipynb

Retrieves and prints the names of all available pipelines for the specified organization using the Pipelines API. Demonstrates basic API interaction after initialization.

```python
response = pipelines.get_pipelines(org)
for pipeline in response.data:
    print("Pipeline: " + pipeline.name)
```

--------------------------------

### Create a Vectorize Pipeline

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/src/examples/notebooks/vectorize.ipynb

Creates a new data processing pipeline. It configures the pipeline with the previously created file upload source connector, the built-in vector database, and the built-in AI platform with specific chunking and embedding settings.

```python
response = pipelines.create_pipeline(org, v.PipelineConfigurationSchema(
    source_connectors=[v.SourceConnectorSchema(id=source_connector_id, type="FILE_UPLOAD", config={})],
    destination_connector=v.DestinationConnectorSchema(id=builtin_vector_db, type="VECTORIZE", config={}),
    ai_platform=v.AIPlatformSchema(id=builtin_ai_platform, type="VECTORIZE", config={
        "chunkSize": 600,
        "chunkingStrategy": "FIXED",
        "embeddingModel": "VECTORIZE_OPEN_AI_TEXT_EMBEDDING_3_LARGE",
    }),
    pipeline_name="My Pipeline From API",
    schedule=v.ScheduleSchema(type="manual")
))
pipeline_id = response.data.id
pipeline_id
```

--------------------------------

### Authenticate and Initialize Vectorize Client

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/src/examples/notebooks/vectorize.ipynb

Prompts the user for their Vectorize Organization ID and API Token, then initializes the Vectorize API client and Pipelines API. The token is read with getpass so it is not echoed to the screen.

```python
import getpass

org = input("Vectorize Organization ID:")
token = getpass.getpass("Vectorize Token:")
```

```python
import vectorize_client as v

api = v.ApiClient(v.Configuration(access_token=token, host="https://api.vectorize.io/v1"))
pipelines = v.PipelinesApi(api)
```

--------------------------------

### List Pipelines with Vectorize Client

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/scripts/ts_README.md

Demonstrates how to list all pipelines using the Vectorize Node.js client. It requires importing the necessary classes, configuring the API client with an access token, and passing the organization to the call.

```typescript
import {Configuration, PipelinesApi} from "@vectorize-io/vectorize-client";

// getPipelines is a method of PipelinesApi
const pipelinesApi = new PipelinesApi(new Configuration({
    accessToken: "token",
}));

// Awaited on the assumption that the generated client returns a Promise;
// top-level await requires an ES module.
const pipelines = await pipelinesApi.getPipelines({
    organization: "your-org"
});
console.log(pipelines);
```

--------------------------------

### List Pipelines with Vectorize Client

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/src/ts/README.md

Demonstrates how to list all pipelines using the Vectorize Node.js client. It requires importing the necessary classes, configuring the API client with an access token, and passing the organization to the call.

```typescript
import {Configuration, PipelinesApi} from "@vectorize-io/vectorize-client";

// getPipelines is a method of PipelinesApi
const pipelinesApi = new PipelinesApi(new Configuration({
    accessToken: "token",
}));

// Awaited on the assumption that the generated client returns a Promise;
// top-level await requires an ES module.
const pipelines = await pipelinesApi.getPipelines({
    organization: "your-org"
});
console.log(pipelines);
```

--------------------------------

### Create File Upload Source Connector

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/src/examples/notebooks/vectorize.ipynb

Creates a new source connector of type 'FILE_UPLOAD' named 'From API' for the organization. This connector is used to ingest files.

```python
connectors_api = v.ConnectorsApi(api)
response = connectors_api.create_source_connector(org, [{
    "type": "FILE_UPLOAD",
    "name": "From API"
}])
source_connector_id = response.connectors[0].id
source_connector_id
```

--------------------------------

### Retrieve Documents from Pipeline

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/src/examples/notebooks/vectorize.ipynb

Retrieves documents from a specified pipeline based on a question. It returns a list of documents, each with a relevancy score and its text content.

```python
response = pipelines.retrieve_documents(org, pipeline_id, v.RetrieveDocumentsRequest(
    question="Apple RSU activity",
    num_results=5,
))
print(response)
for doc in response.documents:
    print(str(doc.relevancy) + " - " + doc.text)
```

--------------------------------

### Import Vectorize Client Package

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/scripts/python_README.md

Imports the vectorize_client package into your Python script. This makes the client's functionality available for use.

```python
import vectorize_client
```

--------------------------------

### Import Vectorize Client Package

Source: https://github.com/vectorize-io/vectorize-clients/blob/main/src/python/README.md

Imports the vectorize_client package into your Python script. This makes the client's functionality available for use.

```python
import vectorize_client
```
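
--------------------------------

### Poll With a Delay and a Timeout (Illustrative Sketch)

The extraction and deep research snippets both poll until a response reports `ready`. The sketch below is not taken from the Vectorize repository. It shows one way such a loop could be wrapped in a reusable helper that waits between attempts and gives up after a deadline. The helper name `poll_until_ready`, its parameters, and the stub responses are all hypothetical; the only assumption about the API is that the response object has a `ready` attribute, as in the snippets above.

```python
import time
from types import SimpleNamespace


def poll_until_ready(fetch, interval=2.0, timeout=300.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until its result has a truthy .ready attribute.

    Hypothetical helper: fetch is any zero-argument callable that returns
    an object with a .ready attribute. Raises TimeoutError if the result
    is still not ready once `timeout` seconds have elapsed.
    """
    deadline = clock() + timeout
    while True:
        response = fetch()
        if response.ready:
            return response
        if clock() >= deadline:
            raise TimeoutError("result was not ready before the timeout")
        sleep(interval)  # wait before asking again


# Stand-in for a real API call: reports "not ready" twice, then ready.
_responses = iter([
    SimpleNamespace(ready=False, data=None),
    SimpleNamespace(ready=False, data=None),
    SimpleNamespace(ready=True, data=SimpleNamespace(success=True, text="done")),
])

result = poll_until_ready(lambda: next(_responses), interval=0.01, timeout=5)
print(result.data.text)  # prints "done"
```

With the notebook's objects in scope, the same helper could be called as `poll_until_ready(lambda: extraction_api.get_extraction_result(org, extraction_id))`, after which `result.data.success` would be checked as in the original loop. The interval and timeout defaults are arbitrary and should be tuned to how long the task is expected to run.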