### Install LLMInferenceService controller using Quick Install Script Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Installs only the LLMInferenceService controller using a provided script from GitHub releases. ```bash curl -sL "https://github.com/kserve/kserve/releases/download/v0.18.0/llmisvc-full-install-with-manifests.sh" | bash ``` -------------------------------- ### Install KServe Dependencies (Standard Mode) Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Installs the standard mode dependencies for KServe using a quick install script. ```bash # Option A: Standard Mode Dependencies curl -fsSL https://github.com/kserve/kserve/releases/download/v0.18.0/kserve-standard-mode-dependency-install.sh | bash ``` -------------------------------- ### CLI Tool Installation Example Source: https://kserve.github.io/website/docs/install/overview Examples of how to install CLI tools like Helm using the provided scripts. ```bash # Install Helm ./hack/setup/cli/install-helm.sh # Install with specific version HELM_VERSION=v3.13.0 ./hack/setup/cli/install-helm.sh ``` -------------------------------- ### Example Definition Source: https://kserve.github.io/website/docs/install/overview An example of a KServe installation definition file, specifying tools, components, and environment variables. ```yaml # quick-install/definitions/kserve-standard-mode-full-install.definition DESCRIPTION: Install KServe Standard Mode using Helm RELEASE: true TOOLS: - helm - kustomize - yq COMPONENTS: - name: cert-manager-helm - name: istio-helm - name: kserve-helm env: DEPLOYMENT_MODE: Standard ENABLE_KSERVE: true ENABLE_LLMISVC: false ``` -------------------------------- ### Install KServe (Standard Mode - Quick Install Script) Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Shell script to install KServe in standard mode using the quick install script. ```bash curl -sL "https://github.com/kserve/kserve/releases/download/v0.18.0/kserve-standard-mode-full-install-with-manifests.sh" | bash ``` -------------------------------- ### Verify git Installation Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Command to verify the installation of git. ```bash git --version ``` -------------------------------- ### Kustomize Standalone Overlay Example Source: https://kserve.github.io/website/docs/install/overview This example shows how to configure a standalone KServe installation using Kustomize overlays. It includes base resources and the KServe controller component. ```yaml namespace: kserve resources: - ../../../base components: - ../../../components/kserve ``` -------------------------------- ### Example Script Structure Source: https://kserve.github.io/website/docs/install/overview Demonstrates the structure of a typical installation script, including sourcing common utilities and performing installation steps. ```bash #!/bin/bash # Source common utilities SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" source "${SCRIPT_DIR}/../common.sh" install() { log_info "Installing cert-manager..." helm repo add jetstack https://charts.jetstack.io helm install cert-manager jetstack/cert-manager \ --namespace cert-manager \ --create-namespace \ --set crds.enabled=true wait_for_pods "cert-manager" "app.kubernetes.io/instance=cert-manager" 300 log_success "cert-manager installed successfully" } uninstall() { log_info "Uninstalling cert-manager..." helm uninstall cert-manager -n cert-manager kubectl delete namespace cert-manager --wait=true } # Main execution if [ "$UNINSTALL" = true ]; then uninstall; exit 0; fi install ``` -------------------------------- ### Install Standard Mode Dependencies using Quick Install Script Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Installs only the infrastructure dependencies for Standard Mode KServe using a quick install script. ```bash curl -fsSL https://github.com/kserve/kserve/releases/download/v0.18.0/kserve-standard-mode-dependency-install.sh | bash ``` -------------------------------- ### Install LLMIsvc Dependencies using Quick Install Script Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Installs only the infrastructure dependencies for LLMIsvc using a quick install script. ```bash curl -fsSL https://github.com/kserve/kserve/releases/download/v0.18.0/llmisvc-dependency-install.sh | bash ``` -------------------------------- ### Install LLMInferenceService Dependencies Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Installs the dependencies required for LLMInferenceService using a quick install script. ```bash # Install LLMInferenceService Dependencies curl -fsSL https://github.com/kserve/kserve/releases/download/v0.18.0/llmisvc-dependency-install.sh | bash ``` -------------------------------- ### Install Knative Mode Dependencies using Quick Install Script Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Installs only the infrastructure dependencies for Knative Mode KServe using a quick install script. ```bash curl -fsSL https://github.com/kserve/kserve/releases/download/v0.18.0/kserve-knative-mode-dependency-install.sh | bash ``` -------------------------------- ### Quick Install Script - LLMInferenceService Source: https://kserve.github.io/website/docs/developer-guide Runs the quick install script for LLMInferenceService. ```bash # For LLMInferenceService ./hack/kserve-install.sh --type localmodel --kustomize ``` -------------------------------- ### Install KServe Dependencies (Knative Mode) Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Installs the Knative mode dependencies for KServe using a quick install script. ```bash # OR # Option B: Knative Mode Dependencies curl -fsSL https://github.com/kserve/kserve/releases/download/v0.18.0/kserve-knative-mode-dependency-install.sh | bash ``` -------------------------------- ### Install Standard Dependencies using kserve-install.sh Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Installs only the infrastructure dependencies for Standard Mode KServe using the kserve-install.sh script. ```bash ./hack/kserve-install.sh --deps-only --standard ``` -------------------------------- ### Install KServe (Standard Mode - Kustomize) Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Command to install KServe in standard mode using Kustomize. ```bash ./hack/kserve-install.sh --kustomize --standard ``` -------------------------------- ### Configuration Composition Example Source: https://kserve.github.io/website/docs/model-serving/generative-inference/llmisvc/llmisvc-configuration This example demonstrates how to compose multiple LLMInferenceServiceConfig resources to define a complete LLMInferenceService. ```yaml # Config 1: Model configuration apiVersion: serving.kserve.io/v1alpha1 kind: LLMInferenceServiceConfig metadata: name: model-llama-3-8b namespace: kserve spec: model: uri: hf://meta-llama/Llama-3.1-8B-Instruct name: meta-llama/Llama-3.1-8B-Instruct --- # Config 2: Workload configuration apiVersion: serving.kserve.io/v1alpha1 kind: LLMInferenceServiceConfig metadata: name: workload-single-gpu namespace: kserve spec: replicas: 3 template: containers: - name: main resources: limits: nvidia.com/gpu: "1" --- # Config 3: Router configuration apiVersion: serving.kserve.io/v1alpha1 kind: LLMInferenceServiceConfig metadata: name: router-managed namespace: kserve spec: router: route: {} gateway: {} scheduler: {} --- # LLMInferenceService: Compose all configs apiVersion: serving.kserve.io/v1alpha1 kind: LLMInferenceService metadata: name: my-llama-service namespace: default spec: baseRefs: - name: model-llama-3-8b - name: workload-single-gpu - name: router-managed # Optional: Override specific fields replicas: 5 # Override workload-single-gpu replicas ``` -------------------------------- ### Install KServe Dependencies Source: https://kserve.github.io/website/docs/developer-guide Installs KServe dependencies locally. ```bash ./hack/kserve-install.sh -d --type kserve ``` -------------------------------- ### Helm Deployment Patch Example Source: https://kserve.github.io/website/docs/install/overview An example of a Helm-specific override patch for a deployment, demonstrating how to use values. ```yaml kind: Deployment metadata: name: kserve-controller-manager spec: template: spec: containers: - name: manager image: "{{ .Values.kserve.controller.image }}:{{ .Values.kserve.controller.tag }}" resources: {{ toYaml .Values.kserve.controller.resources | nindent 12 }} ``` -------------------------------- ### Complete Router Configuration Example Source: https://kserve.github.io/website/docs/model-serving/generative-inference/llmisvc/llmisvc-configuration Example of a complete router configuration, including gateway, route, and scheduler. ```yaml spec: router: gateway: {} # Gateway configuration route: {} # HTTPRoute configuration scheduler: {} # Scheduler configuration ``` -------------------------------- ### Gateway Instance Installation Source: https://kserve.github.io/website/docs/admin-guide/kubernetes-deployment-llmisvc Installs a Gateway instance. ```bash infra/gateway-api/manage.gateway-api-gw.sh ``` -------------------------------- ### Install KServe Resources using Helm Source: https://kserve.github.io/website/docs/admin-guide/serverless Installs the main KServe resources using Helm. ```bash helm install kserve oci://ghcr.io/kserve/charts/kserve-resources --version v0.18.0 ``` -------------------------------- ### Quick Install Script - Include KEDA Source: https://kserve.github.io/website/docs/developer-guide Installs KServe with KEDA (Kubernetes Event-driven Autoscaling). ```bash # To include KEDA (Kubernetes Event-driven Autoscaling) ./hack/kserve-install.sh -k ``` -------------------------------- ### Document Frontmatter Example Source: https://kserve.github.io/website/docs/developer-guide/contribution Example of the required frontmatter for Docusaurus documentation files. ```yaml --- title: "Document Title" description: "Brief description of the document content" --- ``` -------------------------------- ### Infrastructure Management Examples Source: https://kserve.github.io/website/docs/install/overview Examples demonstrating how to manage Kubernetes infrastructure components like Istio using the provided scripts. ```bash # Install Istio ./hack/setup/infra/manage.istio-helm.sh # Reinstall with custom args REINSTALL=true \ ISTIOD_EXTRA_ARGS="--set resources.limits.cpu=500m" \ ./hack/setup/infra/manage.istio-helm.sh # Uninstall UNINSTALL=true ./hack/setup/infra/manage.istio-helm.sh ``` -------------------------------- ### Quick Install Script - LLMInferenceService Dependencies Only Source: https://kserve.github.io/website/docs/developer-guide Installs only LLMInferenceService dependencies (without LLMInferenceService). ```bash # To install only LLMInferenceService dependencies (without LLMInferenceService) ./hack/kserve-install.sh -d --type llmisvc ``` -------------------------------- ### Verify helm Installation Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Command to verify the installation of helm. ```bash helm version ``` -------------------------------- ### Verify kubectl Installation Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Command to verify the installation of kubectl. ```bash kubectl version --client ``` -------------------------------- ### Deploy Kafka Source: https://kserve.github.io/website/docs/model-serving/predictive-inference/kafka Commands to install Kafka and Zookeeper using Helm. ```bash helm repo add bitnami https://charts.bitnami.com/bitnami helm install zookeeper bitnami/zookeeper --set replicaCount=1 --set auth.enabled=false --set allowAnonymousLogin=true \ --set persistance.enabled=false --version 11.0.0 helm install kafka bitnami/kafka --set zookeeper.enabled=false --set replicaCount=1 --set persistance.enabled=false \ --set logPersistance.enabled=false --set externalZookeeper.servers=zookeeper-headless.default.svc.cluster.local \ --version 21.0.0 ``` -------------------------------- ### Start Minikube Cluster Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Command to start a local Kubernetes cluster using Minikube. ```bash minikube start ``` -------------------------------- ### Install Kafka Event Source Source: https://kserve.github.io/website/docs/model-serving/predictive-inference/kafka Command to apply the Kafka Event Source. ```bash kubectl apply -f https://github.com/knative-sandbox/eventing-kafka/releases/download/knative-v1.9.1/source.yaml ``` -------------------------------- ### Helm Template Helper for Rendering Resources Source: https://kserve.github.io/website/docs/install/overview Example of a Helm template helper that merges base manifests with patches. ```go-template {{- include "kserve-common.renderMultiResourceWithPatches" (dict "baseFile" "files/kserve/resources.yaml" "patchGlob" "files/kserve/*-patch.yaml" "certName" "serving-cert" "context" .) -}} ``` -------------------------------- ### Install KServe (Knative Mode - Quick Install Script) Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Shell script to install KServe in Knative mode using the quick install script. ```bash curl -sL "https://github.com/kserve/kserve/releases/download/v0.18.0/kserve-knative-mode-full-install-with-manifests.sh" | bash ``` -------------------------------- ### Local Documentation Build Command Source: https://kserve.github.io/website/docs/developer-guide/contribution Commands to build and start the documentation locally for preview. ```bash cd website npm install npm start ``` -------------------------------- ### Install Knative Eventing Core Source: https://kserve.github.io/website/docs/model-serving/predictive-inference/kafka Commands to apply Knative Eventing CRDs and core components. ```bash kubectl apply -f https://github.com/knative/eventing/releases/download/knative-v1.9.7/eventing-crds.yaml kubectl apply -f https://github.com/knative/eventing/releases/download/knative-v1.9.7/eventing-core.yaml ``` -------------------------------- ### All-in-One Overlay Source: https://kserve.github.io/website/docs/install/overview An example of an all-in-one Kustomize overlay that includes the base and all three components for a full-featured deployment. ```yaml namespace: kserve resources: - ../../base components: - ../../components/kserve - ../../components/llmisvc - ../../components/localmodel ``` -------------------------------- ### Install KServe (Knative Mode - Kustomize) Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Command to install KServe in Knative mode using Kustomize. ```bash ./hack/kserve-install.sh --kustomize --knative ``` -------------------------------- ### Install KServe (Knative Mode - Helm) Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Command to install KServe in Knative mode using Helm. ```bash ./hack/kserve-install.sh --kserve-version v0.18.0 --type kserve --knative ``` -------------------------------- ### Install lgbserver Runtime Source: https://kserve.github.io/website/docs/model-serving/predictive-inference/frameworks/lightgbm Commands to install the 'lgbserver' runtime package using Uv. ```bash cd python/lgbserver uv sync ``` -------------------------------- ### Complete LLMInferenceService Configuration Example Source: https://kserve.github.io/website/docs/model-serving/generative-inference/llmisvc/llmisvc-configuration A comprehensive example combining model specification, parallelism, replicas, container templates, worker configuration, and router settings. ```yaml apiVersion: serving.kserve.io/v1alpha1 kind: LLMInferenceService metadata: name: llama-70b-production namespace: production spec: # Model specification model: uri: hf://meta-llama/Llama-2-70b-hf name: meta-llama/Llama-2-70b-hf criticality: High # Multi-node workload with data parallelism parallelism: tensor: 4 data: 8 dataLocal: 4 # Decode workload (main) replicas: 2 template: containers: - name: main image: vllm/vllm-openai:latest args: - "--model" - "/mnt/models" - "--tensor-parallel-size" - "4" resources: limits: nvidia.com/gpu: "4" rdma/roce: "1" # Worker pods worker: containers: - name: main image: vllm/vllm-openai:latest args: - "--model" - "/mnt/models" - "--tensor-parallel-size" - "4" resources: limits: nvidia.com/gpu: "4" rdma/roce: "1" # Router configuration router: gateway: {} route: http: spec: rules: - timeouts: request: "300s" backendRequest: "300s" scheduler: {} ``` -------------------------------- ### Install KServe (Standard Mode - Helm) Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Command to install KServe in standard mode using Helm. ```bash ./hack/kserve-install.sh --kserve-version v0.18.0 --type kserve --standard ``` -------------------------------- ### Install Knative Dependencies using kserve-install.sh Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Installs only the infrastructure dependencies for Knative Mode KServe using the kserve-install.sh script. ```bash ./hack/kserve-install.sh --deps-only --knative ``` -------------------------------- ### Quick Install Script - Standard Deployment Source: https://kserve.github.io/website/docs/developer-guide Clones the KServe repository and runs the quick install script for Standard deployment mode (without Knative, using Gateway API + Istio). ```bash # Clone the repository if you haven't already git clone https://github.com/kserve/kserve.git cd kserve # Run the quick install script with Standard deployment mode (without Knative, using Gateway API + Istio) ./hack/kserve-install.sh -r --kustomize ``` -------------------------------- ### Install via uv Source: https://kserve.github.io/website/docs/reference/controlplane-client/controlplane-client-sdk Checkout KServe GitHub repository and Install via uv. ```bash cd kserve/python/kserve uv sync ``` -------------------------------- ### Install LLMInferenceService using kserve-install.sh (Kustomize Mode) Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Installs the LLMInferenceService component using the kserve-install.sh script in Kustomize mode. ```bash # LLMIsvc ./hack/kserve-install.sh --kustomize --type llmisvc ``` -------------------------------- ### Install the pmmlserver runtime Source: https://kserve.github.io/website/docs/model-serving/predictive-inference/frameworks/pmml Installs the pmmlserver runtime using Uv. ```bash cd python/pmmlserver uv sync ``` -------------------------------- ### Install LLMInferenceService using kserve-install.sh (Helm Mode) Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Installs the LLMInferenceService component using the kserve-install.sh script in Helm mode. ```bash ./hack/kserve-install.sh --kserve-version v0.18.0 --type llmisvc ``` -------------------------------- ### Install KServe + LocalModel (Standard Mode - Kustomize) Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Command to install KServe with LocalModel support in standard mode using Kustomize. ```bash ./hack/kserve-install.sh --kustomize --standard --type kserve,localmodel ``` -------------------------------- ### Sample Storage Initializer Configuration Source: https://kserve.github.io/website/docs/model-serving/storage/providers/oci Example JSON output showing the configuration for the storage initializer, including Modelcar-specific settings. ```json { "image" : "kserve/storage-initializer:latest", "memoryRequest": "100Mi", "memoryLimit": "1Gi", "cpuRequest": "100m", "cpuLimit": "1", "caBundleConfigMapName": "", "caBundleVolumeMountPath": "/etc/ssl/custom-certs", "enableModelcar": true, "cpuModelcar": "10m", "memoryModelcar": "15Mi", "uidModelcar": 1010 } ``` -------------------------------- ### Install KServe + LocalModel (Knative Mode - Helm) Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Command to install KServe with LocalModel support in Knative mode using Helm. ```bash ./hack/kserve-install.sh --kserve-version v0.18.0 --type kserve,localmodel --knative ``` -------------------------------- ### Install paddleserver runtime Source: https://kserve.github.io/website/docs/model-serving/predictive-inference/frameworks/paddle Installs the paddleserver runtime package using Uv. ```bash cd python/paddleserver uv sync ``` -------------------------------- ### Install KServe + LocalModel (Standard Mode - Helm) Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Command to install KServe with LocalModel support in standard mode using Helm. ```bash ./hack/kserve-install.sh --kserve-version v0.18.0 --type kserve,localmodel --standard ``` -------------------------------- ### Install All KServe Components (Kustomize Mode - Knative) Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Installs all KServe components, including LLMInferenceService and LocalModel, in Knative mode using Kustomize. ```bash # All components (Knative mode) ./hack/kserve-install.sh --kustomize --knative --type kserve,llmisvc,localmodel ``` -------------------------------- ### Minikube setup with MetalLB Source: https://kserve.github.io/website/docs/model-serving/generative-inference/llmisvc/llmisvc-dependencies Commands to start a minikube cluster, enable the MetalLB addon, and configure an IP address pool for LoadBalancer services. ```bash # Start minikube with sufficient resources minikube start --cpus='12' --memory='16G' --kubernetes-version=v1.33.1 # Enable MetalLB addon minikube addons enable metallb # Configure IP address pool IP=$(minikube ip) PREFIX=${IP%.*} START=${PREFIX}.200 END=${PREFIX}.235 kubectl apply -f - < \ charts/kserve-resources/files/kserve/resources.yaml ``` -------------------------------- ### Kind cluster setup with cloud-provider-kind Source: https://kserve.github.io/website/docs/model-serving/generative-inference/llmisvc/llmisvc-dependencies Commands to create a kind cluster, install cloud-provider-kind, and run it in the background to simulate a LoadBalancer. ```bash # Create kind cluster kind create cluster -n "kserve-llm" # Install cloud-provider-kind go install sigs.k8s.io/cloud-provider-kind@latest # Run cloud-provider-kind in background cloud-provider-kind > /dev/null 2>&1 & ``` -------------------------------- ### Clone KServe Repository Source: https://kserve.github.io/website/docs/admin-guide/kubernetes-deployment-llmisvc Clones the KServe repository and navigates to the setup directory. ```bash git clone https://github.com/kserve/kserve.git cd kserve/hack/setup ``` -------------------------------- ### Run OpenAI SDK Example Source: https://kserve.github.io/website/docs/model-serving/generative-inference/sdk-integration Command to execute the Python script for the OpenAI SDK example. ```bash python3 sample_openai.py ``` -------------------------------- ### Install KServe CRDs, Controllers, and Cluster Resources Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide Applies the Custom Resource Definitions (CRDs), controllers, and cluster resources for KServe. ```bash # Install CRDs kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.18.0/kserve-crds.yaml # Install Controllers kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.18.0/kserve.yaml # Install ClusterServingRuntimes/LLMIsvcConfigs kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.18.0/kserve-cluster-resources.yaml ``` -------------------------------- ### Setup Minio Event Notification to Kafka Source: https://kserve.github.io/website/docs/model-serving/predictive-inference/kafka Command to configure Minio to publish events to Kafka. ```bash # Setup bucket event notification with Kafka mc admin config set myminio notify_kafka:1 tls_skip_verify="off" queue_dir="" queue_limit="0" sasl="off" sasl_password="" sasl_username="" tls_client_auth="0" tls="off" client_tls_cert="" client_tls_key="" brokers="kafka-headless.default.svc.cluster.local:9092" topic="mnist" version="" # Restart Minio mc admin service restart myminio ``` -------------------------------- ### Multi-Node Configuration Example Source: https://kserve.github.io/website/docs/model-serving/generative-inference/llmisvc/llmisvc-configuration Configuration for a multi-node deployment using LeaderWorkerSet, specifying tensor and data parallelism. ```yaml spec: replicas: 2 # Number of LeaderWorkerSet replicas parallelism: tensor: 4 # Tensor parallelism degree data: 8 # Total data parallel instances dataLocal: 4 # GPUs per node # Result: 8 / 4 = 2 LWS replicas (overrides replicas: 2 if different) template: # Leader pod spec containers: - name: main image: vllm/vllm-openai:latest args: - "--model" - "/mnt/models" - "--tensor-parallel-size" - "4" resources: limits: nvidia.com/gpu: "4" cpu: "16" memory: 128Gi worker: # Worker pod spec (triggers multi-node) containers: - name: main image: vllm/vllm-openai:latest args: - "--model" - "/mnt/models" - "--tensor-parallel-size" - "4" resources: limits: nvidia.com/gpu: "4" cpu: "16" memory: 128Gi ``` -------------------------------- ### Expected Output Source: https://kserve.github.io/website/docs/model-serving/predictive-inference/detect/art Example output after running 'kubectl get inferenceservice'. ```text NAME URL READY DEFAULT TRAFFIC CANARY TRAFFIC AGE artserver http://artserver.somecluster/v1/models/artserver True 100 40m ``` -------------------------------- ### Running the LangChain Example Source: https://kserve.github.io/website/docs/model-serving/generative-inference/sdk-integration Execute the Python script to observe both regular and streaming responses from a LangChain integration. ```bash python3 sample_langchain.py ``` -------------------------------- ### GatewayClass Installation Source: https://kserve.github.io/website/docs/admin-guide/kubernetes-deployment-llmisvc Installs the GatewayClass resource. ```bash infra/gateway-api/manage.gateway-api-gwclass.sh ``` -------------------------------- ### Quick Install Script - Standard Mode Dependencies Only Source: https://kserve.github.io/website/docs/developer-guide Installs only Standard mode dependencies (without KServe). ```bash # To install only Standard mode dependencies (without KServe) ./hack/kserve-install.sh -d --standard ``` -------------------------------- ### Option 1: Use Direct DeploymentStrategy (Recommended) Source: https://kserve.github.io/website/docs/model-serving/predictive-inference/rollout-strategies/rollout-strategy-standard Example of configuring a rollout strategy directly within the InferenceService specification using the `deploymentStrategy` field. ```yaml spec: predictor: deploymentStrategy: type: RollingUpdate rollingUpdate: maxUnavailable: "0" # Availability mode maxSurge: "25%" # Use the ratio value ```