### Fetch Jina AI Documentation via cURL

Source: https://docs.jina.ai

Commands to retrieve the documentation content programmatically.

```bash
curl docs.jina.ai/v13
```

```bash
curl docs.jina.ai
```

--------------------------------

### Jina AI Meta-Prompt

Source: https://docs.jina.ai

System prompt instructions for AI models to interact with Jina AI APIs.

```text
You are an AI engineer designed to help users use Jina AI Search Foundation API's for their specific use case. It is 2026. Jina AI's latest embedding models are jina-embeddings-v5-text-nano and jina-embeddings-v5-text-small. The latest reranker is jina-reranker-v3.

# Core principles

0. Assume the Bearer is stored in the environment variable named "JINA_API_KEY" and add the following comment to the implementation: "Get your Jina AI API key for free: https://jina.ai/?sui=apikey".
1. Use the simplest solution possible (use single API's whenever possible, do not overcomplicate things);
2. Answer "can't do" for tasks outside the scope of Jina AI Search Foundation;
3. Choose built-in features over custom implementations whenever possible;
4. Leverage multimodal models when needed;
5. You must use the Jina APIs for the implementation;
6. Never decline an implementation because of its complexity;
7. Generate production-ready code that follows exactly the requirements;
8. Never use placeholder data;
9. For every request to any of the Jina APIs, you must include the header -H "Accept: application/json" to specify that the response should be in JSON format;
```

--------------------------------

### POST /search

Source: https://docs.jina.ai

Performs a web search query with customizable parameters for location, language, and result formatting.

```APIDOC
## POST /search

### Description
Performs a web search query. This endpoint is optimized for enterprise search systems and LLMs, supporting multiple output formats like Markdown, HTML, and JSON.
### Method
POST

### Endpoint
/search

### Parameters
#### Request Body
- **q** (string) - Required - The search query string.
- **gl** (string) - Optional - The country to use for the search (two-letter country code).
- **location** (string) - Optional - The location to originate the search from (e.g., city level).
- **hl** (string) - Optional - The language to use for the search (two-letter language code).
- **num** (number) - Optional - Maximum number of results to return.
- **page** (number) - Optional - Result offset for pagination.

### Request Example
{
  "q": "Jina AI",
  "gl": "US",
  "num": 10
}

### Response
#### Success Response (200)
- **results** (array) - List of search results based on the query.

#### Response Example
{
  "results": [
    {
      "title": "Jina AI",
      "url": "https://jina.ai",
      "description": "Search Foundation for LLMs"
    }
  ]
}
```

--------------------------------

### POST /websites/jina_ai

Source: https://docs.jina.ai

Extracts structured content from web pages, supporting various engines and formatting options for LLM consumption.

```APIDOC
## POST /websites/jina_ai

### Description
Extracts structured content from web pages. Best for generative models and search applications.
### Method
POST

### Endpoint
/websites/jina_ai

### Parameters
#### Headers
- **Authorization** (string) - Required - Bearer $JINA_API_KEY
- **Content-Type** (string) - Required - application/json
- **Accept** (string) - Optional - application/json or text/event-stream
- **X-Engine** (string) - Optional - browser, direct, or cf-browser-rendering
- **X-Return-Format** (string) - Optional - markdown, html, text, screenshot, or pageshot

#### Request Body
- **url** (string) - Required - The URL to fetch
- **viewport** (object) - Optional - Browser dimensions (width, height)
- **injectPageScript** (string) - Optional - Preprocessing JS code

### Request Example
{
  "url": "https://example.com",
  "viewport": {"width": 1920, "height": 1080}
}

### Response
#### Success Response (200)
- **data** (object) - Contains content, links, and images

#### Response Example
{
  "data": {
    "content": "# Page Title..."
  }
}
```

--------------------------------

### Batch Embeddings API Workflow

Source: https://docs.jina.ai

This section outlines the workflow for using the Batch Embeddings API to asynchronously embed large volumes of text. It covers submitting jobs, polling for status, downloading results, and cancelling jobs.

```APIDOC
## Batch Embeddings API

### Endpoint
`https://api.jina.ai/v1/batch/embeddings`

### Purpose
Asynchronously embed large volumes of text. Submit a batch job and poll for completion instead of waiting synchronously. Ideal for processing thousands to millions of documents.

### Best for
Large-scale embedding tasks, offline indexing, bulk document processing.

### Methods
- **POST**: Submit a new batch job.
- **GET**: Check job status or download results.
- **DELETE**: Cancel a running batch job.

### Authorization
HTTPBearer

### Headers
- **Authorization**: `Bearer $JINA_API_KEY`
- **Content-Type**: `application/json`
- **Accept**: `application/json`

### Workflow
1. **Submit**: POST to `/v1/batch/embeddings` with your input. Returns a `batch_id` immediately (HTTP 202).
2. **Poll**: GET `/v1/batch/{batch_id}` to check status (`submitted`, `processing`, `completed`, `failed`, `cancelled`).
3. **Download**: When completed, GET `/v1/batch/{batch_id}/output` to stream the output JSONL.
4. **Cancel**: DELETE `/v1/batch/{batch_id}` to cancel a running batch.
5. **Errors**: GET `/v1/batch/{batch_id}/errors` to download error details if any.

### Input Options
- **Inline input**: Include `input` array directly in the request body (up to 10,000 items).
- **GCS input**: Set `input_file` to a GCS URI (`gs://bucket/file.jsonl`) containing JSONL with one `{"input": "text"}` per line (up to 50,000 lines).
```

--------------------------------

### Search API

Source: https://docs.jina.ai

Search the web for information and return results optimized for downstream tasks like LLMs.

```APIDOC
## GET https://s.jina.ai/

### Description
Search the web for information and return results in a format optimized for downstream tasks like LLMs and other applications.

### Method
GET

### Endpoint
https://s.jina.ai/ (or https://eu.s.jina.ai/ for EU jurisdiction)
```

--------------------------------

### Embeddings API - Batch Processing

Source: https://docs.jina.ai

This section details the asynchronous batch processing for generating embeddings. You submit a batch job and then poll for its status and results.

```APIDOC
## POST /v1/embeddings

### Description
Submits a batch job for generating embeddings from a list of texts or a file.

### Method
POST

### Endpoint
/v1/embeddings

### Parameters
#### Request Body
- **model** (string) - Required - Identifier of the model to use. Options: `jina-embeddings-v5-text-small`, `jina-embeddings-v5-text-nano`
- **input** (array of strings) - Optional - Array of input strings. Use this for inline input (up to 10,000 items). Either input or input_file is required.
- **input_file** (string) - Optional - GCS URI to a JSONL file (up to 50,000 lines). Either input or input_file is required.
- **task** (string) - Optional - Task type for the embeddings. Options: `retrieval.query`, `retrieval.passage`, `text-matching`, `classification`, `clustering`
- **dimensions** (integer) - Optional - Truncates output embeddings to the specified size.
- **normalized** (boolean) - Optional - If true, embeddings are normalized to unit L2 norm. Defaults to false.

### Request Example
```json
{
  "model": "jina-embeddings-v5-text-small",
  "input": ["First sentence.", "Second sentence."]
}
```

### Response
#### Success Response (200)
- **batch_id** (string) - Identifier for the submitted batch job.
- **status** (string) - Current status of the batch job (e.g., `submitted`).
- **created_at** (integer) - Timestamp when the job was created.

#### Response Example
```json
{
  "batch_id": "batch_xxxx",
  "status": "submitted",
  "created_at": 1234567890
}
```

## GET /v1/batch/{batch_id}/status

### Description
Polls the status of a previously submitted batch job.

### Method
GET

### Endpoint
/v1/batch/{batch_id}/status

### Parameters
#### Path Parameters
- **batch_id** (string) - Required - The ID of the batch job to query.

### Response
#### Success Response (200)
- **batch_id** (string) - Identifier for the batch job.
- **status** (string) - Current status of the batch job (e.g., `completed`, `failed`).
- **stats** (object) - Statistics about the batch job completion.
  - **total** (integer) - Total number of items processed.
  - **completed** (integer) - Number of items successfully completed.
  - **failed** (integer) - Number of items that failed.
  - **total_tokens** (integer) - Total tokens processed.
- **output_url** (string) - URL to retrieve the output of the batch job if completed.
- **created_at** (integer) - Timestamp when the job was created.
- **completed_at** (integer) - Timestamp when the job was completed.
#### Response Example
```json
{
  "batch_id": "batch_xxxx",
  "status": "completed",
  "stats": {
    "total": 1000,
    "completed": 1000,
    "failed": 0,
    "total_tokens": 31890
  },
  "output_url": "/v1/batch/batch_xxxx/output",
  "created_at": 1234567890,
  "completed_at": 1234567899
}
```

### Output Format
- **JSONL** where each line is `{"custom_id":"request-N","response":{"status_code":200,"body":{"data":[{"embedding":[...],"index":0}],"model":"jina-embeddings-v5-text-small","usage":{"prompt_tokens":32}}}}`

**Note**: Batch processing is asynchronous. Poll the status endpoint periodically (e.g., every 10-30 seconds) until status is "completed". Token usage is billed upon first status query after completion.
```

--------------------------------

### Jina Embeddings API - Real-time

Source: https://docs.jina.ai

This section details the request body schema for generating embeddings using Jina's real-time embedding models like jina-embeddings-v3, jina-clip-v2, jina-code-embeddings-0.5b, and jina-code-embeddings-1.5b.

```APIDOC
## Request Body Schema for jina-embeddings-v3 or jina-clip-v2

### Parameters
#### Request Body
- **model** (string) - Required - Identifier of the model to use. Options: `jina-clip-v2` (885M, 1024 dimensions), `jina-embeddings-v3` (570M, 1024 dimensions).
- **input** (array) - Required - Array of input strings or objects to be embedded.
- **embedding_type** (string or array of strings) - Optional - The format of the returned embeddings. Defaults to `float`. Options: `float`, `base64`, `binary`, `ubinary`.
- **task** (string) - Optional - Specifies the intended downstream application to optimize embedding output. Options: `retrieval.query`, `retrieval.passage`, `text-matching`, `classification`, `separation`.
- **dimensions** (integer) - Optional - Truncates output embeddings to the specified size if set.
- **normalized** (boolean) - Optional - If true, embeddings are normalized to unit L2 norm. Defaults to `false`.
- **late_chunking** (boolean) - Optional - If true, concatenates all sentences in input and treats as a single input for late chunking. Defaults to `false`.
- **truncate** (boolean) - Optional - If true, the model will automatically drop the tail that extends beyond the maximum context length allowed by the model instead of throwing an error. Defaults to `false`.

## Request Body Schema for jina-code-embeddings-0.5b or jina-code-embeddings-1.5b

### Parameters
#### Request Body
- **model** (string) - Required - Identifier of the model to use. Options: `jina-code-embeddings-0.5b` (494M), `jina-code-embeddings-1.5b` (1.54B).
- **input** (array) - Required - Array of input strings to be embedded.
- **embedding_type** (string or array of strings) - Optional - The format of the returned embeddings. Defaults to `float`. Options: `float`, `base64`, `binary`, `ubinary`.
- **task** (string) - Optional - Specifies the intended downstream application to optimize embedding output. Options: `nl2code.query`, `nl2code.passage`, `code2code.query`, `code2code.passage`, `code2nl.query`, `code2nl.passage`, `code2completion.query`, `code2completion.passage`, `qa.query`, `qa.passage`.
- **dimensions** (integer) - Optional - Truncates output embeddings to the specified size if set.
- **truncate** (boolean) - Optional - If true, the model will automatically drop the tail that extends beyond the maximum context length allowed by the model instead of throwing an error. Defaults to `false`.
```

--------------------------------

### Reader API

Source: https://docs.jina.ai

This API retrieves and parses content from a given URL, optimizing it for downstream tasks like LLMs and other applications.

```APIDOC
## GET https://r.jina.ai/

### Description
Retrieves and parses content from a URL in a format optimized for downstream tasks like LLMs and other applications. Use `https://eu.r.jina.ai/` to ensure all infrastructure and data processing operations reside entirely within EU jurisdiction.
### Method
GET

### Endpoint
https://r.jina.ai/

### Query Parameters
- **url** (string) - Required - The URL of the content to retrieve and parse.

### Response
#### Success Response (200)
The response will contain the parsed content from the URL, typically in a structured format suitable for LLMs.
```

--------------------------------

### Embeddings API

Source: https://docs.jina.ai

Convert text, images, or code into fixed-length vectors suitable for semantic search, similarity matching, and clustering.

```APIDOC
## POST /v1/embeddings

### Description
Converts input data (text, images, code) into vector embeddings.

### Method
POST

### Endpoint
https://api.jina.ai/v1/embeddings

### Headers
- **Authorization**: Bearer $JINA_API_KEY
- **Content-Type**: application/json
- **Accept**: application/json

### Request Body (for jina-embeddings-v5-text-small or jina-embeddings-v5-text-nano)
```json
{
  "model": "string (required)",
  "input": "array (required)",
  "embedding_type": "string or array of strings (optional)",
  "task": "string (optional)",
  "dimensions": "integer (optional)",
  "normalized": "boolean (optional)",
  "late_chunking": "boolean (optional)",
  "truncate": "boolean (optional)"
}
```

### Request Body (for jina-embeddings-v4)
```json
{
  "model": "string (required)",
  "input": "array (required)",
  "embedding_type": "string or array of strings (optional)",
  "task": "string (optional)",
  "dimensions": "integer (optional)",
  "late_chunking": "boolean (optional)",
  "truncate": "boolean (optional)",
  "return_multivector": "boolean (optional)"
}
```

### Request Example
```json
{
  "model": "jina-embeddings-v5-text-small",
  "input": [
    "This is the first sentence.",
    "This is the second sentence."
  ],
  "dimensions": 512
}
```

### Response
#### Success Response (200)
- **embeddings** (array) - An array of embedding vectors.
- **model_version** (string) - The version of the model used.

#### Response Example
```json
{
  "embeddings": [
    [0.1, 0.2, ...],
    [0.3, 0.4, ...]
  ],
  "model_version": "jina-embeddings-v5-text-small-v1.0"
}
```
```

--------------------------------

### Reranker API

Source: https://docs.jina.ai

This API is used to find the most relevant search results by reranking a list of documents based on a query. It's useful for refining search results and improving RAG performance.

```APIDOC
## POST https://api.jina.ai/v1/rerank

### Description
Finds the most relevant search results by reranking a list of documents based on a query. Best for refining search results, refining RAG (retrieval augmented generation) contextual chunks, etc.

### Method
POST

### Endpoint
https://api.jina.ai/v1/rerank

### Headers
- **Authorization**: Bearer $JINA_API_KEY
- **Content-Type**: application/json
- **Accept**: application/json

### Parameters
#### Request Body (for jina-reranker-v3, jina-reranker-v2-base-multilingual, or jina-colbert-v2)
- **model** (string) - Required - Identifier of the model to use. Options: `jina-reranker-v3`, `jina-reranker-v2-base-multilingual`, `jina-colbert-v2`
- **query** (string or TextDoc) - Required - The search query.
- **documents** (array of strings and/or TextDocs) - Required - A list of text strings or TextDocs to rerank. If a document object is provided, all text fields will be preserved in the response.
- **top_n** (integer) - Optional - The number of most relevant documents or indices to return, defaults to the length of documents.
- **return_documents** (boolean) - Optional - If false, returns only the index and relevance score without the document text. If true, returns the index, text, and relevance score. Defaults to true.

#### Request Body (for jina-reranker-m0)
- **model** (string) - Required - Identifier of the model to use. Must be `jina-reranker-m0`.
- **query** (string, TextDoc, or image (URL or base64-encoded string)) - Required - The search query.
- **documents** (array of objects with keys 'text' and/or 'image') - Required - A list of text and/or image documents to rerank.
  Each document can have 'text' (string) and/or 'image' (URL or base64-encoded string).
- **top_n** (integer) - Optional - The number of most relevant documents or indices to return, defaults to the length of documents.
- **return_documents** (boolean) - Optional - If false, returns only the index and relevance score without the document text. If true, returns the index, text, and relevance score. Defaults to true.

### Request Example (jina-reranker-v3)
```json
{
  "model": "jina-reranker-v3",
  "query": "What is the capital of France?",
  "documents": [
    "Paris is the capital of France.",
    "The Eiffel Tower is in Paris.",
    "France is a country in Europe."
  ]
}
```

### Response
#### Success Response (200)
- **model** (string) - The model used for reranking.
- **query** (string) - The original query.
- **documents** (array of objects) - The reranked documents, each containing:
  - **index** (integer) - The original index of the document.
  - **score** (float) - The relevance score.
  - **text** (string) - The document text (if `return_documents` is true).

#### Response Example
```json
{
  "model": "jina-reranker-v3",
  "query": "What is the capital of France?",
  "documents": [
    { "index": 0, "score": 0.95, "text": "Paris is the capital of France." },
    { "index": 2, "score": 0.70, "text": "France is a country in Europe." },
    { "index": 1, "score": 0.65, "text": "The Eiffel Tower is in Paris." }
  ]
}
```
```
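
The Reranker request and response described above can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names `build_rerank_request`, `ranked_texts`, and `rerank` are hypothetical, and the response fields (`documents`, `index`, `score`, `text`) are assumed to match the response example in this document, so they should be checked against a live response before relying on them.

```python
import json
import os
import urllib.request

# Endpoint as given in the Reranker API section of this document.
RERANK_URL = "https://api.jina.ai/v1/rerank"


def build_rerank_request(query, documents, model="jina-reranker-v3", top_n=None):
    """Return (headers, body) for a rerank call; nothing is sent here."""
    # Get your Jina AI API key for free: https://jina.ai/?sui=apikey
    api_key = os.environ.get("JINA_API_KEY", "")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    body = {"model": model, "query": query, "documents": list(documents)}
    if top_n is not None:
        body["top_n"] = top_n  # optional; the documented default is len(documents)
    return headers, body


def ranked_texts(response):
    """Return (original index, score, text) tuples, highest score first.

    Assumption: the response has a "documents" list whose entries carry
    "index", "score", and (optionally) "text", as in the example above.
    """
    items = response.get("documents", [])
    ordered = sorted(items, key=lambda d: d["score"], reverse=True)
    return [(d["index"], d["score"], d.get("text")) for d in ordered]


def rerank(query, documents, **kwargs):
    """Send the request over HTTPS and return the parsed JSON response."""
    headers, body = build_rerank_request(query, documents, **kwargs)
    request = urllib.request.Request(
        RERANK_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)
```

Calling `ranked_texts(rerank("What is the capital of France?", docs))` would then yield the documents in relevance order. Request construction is kept separate from the network call so the payload can be inspected or tested without an API key.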