### Install LLMInferenceService controller using Quick Install Script

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Installs only the LLMInferenceService controller using a provided script from GitHub releases.

```bash
curl -sL "https://github.com/kserve/kserve/releases/download/v0.18.0/llmisvc-full-install-with-manifests.sh" | bash
```

--------------------------------

### Install KServe Dependencies (Standard Mode)

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Installs the standard mode dependencies for KServe using a quick install script.

```bash
# Option A: Standard Mode Dependencies
curl -fsSL https://github.com/kserve/kserve/releases/download/v0.18.0/kserve-standard-mode-dependency-install.sh | bash
```

--------------------------------

### CLI Tool Installation Example

Source: https://kserve.github.io/website/docs/install/overview

Examples of how to install CLI tools like Helm using the provided scripts.

```bash
# Install Helm  
./hack/setup/cli/install-helm.sh  
  
# Install with specific version  
HELM_VERSION=v3.13.0 ./hack/setup/cli/install-helm.sh  

```

--------------------------------

### Example Definition

Source: https://kserve.github.io/website/docs/install/overview

An example of a KServe installation definition file, specifying tools, components, and environment variables.

```yaml
# quick-install/definitions/kserve-standard-mode-full-install.definition  
DESCRIPTION: Install KServe Standard Mode using Helm  
RELEASE: true  
  
TOOLS:  
  - helm  
  - kustomize  
  - yq  
  
COMPONENTS:  
  - name: cert-manager-helm  
  - name: istio-helm  
  - name: kserve-helm  
    env:  
      DEPLOYMENT_MODE: Standard  
      ENABLE_KSERVE: true  
      ENABLE_LLMISVC: false  

```

--------------------------------

### Install KServe (Standard Mode - Quick Install Script)

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Shell script to install KServe in standard mode using the quick install script.

```bash
curl -sL "https://github.com/kserve/kserve/releases/download/v0.18.0/kserve-standard-mode-full-install-with-manifests.sh" | bash
```

--------------------------------

### Verify git Installation

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Command to verify the installation of git.

```bash
git --version
```

--------------------------------

### Kustomize Standalone Overlay Example

Source: https://kserve.github.io/website/docs/install/overview

This example shows how to configure a standalone KServe installation using Kustomize overlays. It includes base resources and the KServe controller component.

```yaml
namespace: kserve
resources:
- ../../../base
components:
- ../../../components/kserve
```

--------------------------------

### Example Script Structure

Source: https://kserve.github.io/website/docs/install/overview

Demonstrates the structure of a typical installation script, including sourcing common utilities and performing installation steps.

```bash
#!/bin/bash  
# Source common utilities  
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"  
source "${SCRIPT_DIR}/../common.sh"  
  
install() {  
  log_info "Installing cert-manager..."  
  helm repo add jetstack https://charts.jetstack.io  
  helm install cert-manager jetstack/cert-manager \  
    --namespace cert-manager \  
    --create-namespace \  
    --set crds.enabled=true  
  
  wait_for_pods "cert-manager" "app.kubernetes.io/instance=cert-manager" 300  
  log_success "cert-manager installed successfully"  
}  
  
uninstall() {  
  log_info "Uninstalling cert-manager..."  
  helm uninstall cert-manager -n cert-manager  
  kubectl delete namespace cert-manager --wait=true  
}  
  
# Main execution  
if [ "$UNINSTALL" = true ]; then uninstall; exit 0; fi  
install  

```

--------------------------------

### Install Standard Mode Dependencies using Quick Install Script

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Installs only the infrastructure dependencies for Standard Mode KServe using a quick install script.

```bash
curl -fsSL https://github.com/kserve/kserve/releases/download/v0.18.0/kserve-standard-mode-dependency-install.sh | bash
```

--------------------------------

### Install LLMIsvc Dependencies using Quick Install Script

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Installs only the infrastructure dependencies for LLMIsvc using a quick install script.

```bash
curl -fsSL https://github.com/kserve/kserve/releases/download/v0.18.0/llmisvc-dependency-install.sh | bash
```

--------------------------------

### Install LLMInferenceService Dependencies

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Installs the dependencies required for LLMInferenceService using a quick install script.

```bash
# Install LLMInferenceService Dependencies
curl -fsSL https://github.com/kserve/kserve/releases/download/v0.18.0/llmisvc-dependency-install.sh | bash
```

--------------------------------

### Install Knative Mode Dependencies using Quick Install Script

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Installs only the infrastructure dependencies for Knative Mode KServe using a quick install script.

```bash
curl -fsSL https://github.com/kserve/kserve/releases/download/v0.18.0/kserve-knative-mode-dependency-install.sh | bash
```

--------------------------------

### Quick Install Script - LLMInferenceService

Source: https://kserve.github.io/website/docs/developer-guide

Runs the quick install script for LLMInferenceService.

```bash
# For LLMInferenceService  
./hack/kserve-install.sh --type localmodel --kustomize  

```

--------------------------------

### Install KServe Dependencies (Knative Mode)

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Installs the Knative mode dependencies for KServe using a quick install script.

```bash
# OR

# Option B: Knative Mode Dependencies
curl -fsSL https://github.com/kserve/kserve/releases/download/v0.18.0/kserve-knative-mode-dependency-install.sh | bash
```

--------------------------------

### Install Standard Dependencies using kserve-install.sh

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Installs only the infrastructure dependencies for Standard Mode KServe using the kserve-install.sh script.

```bash
./hack/kserve-install.sh --deps-only --standard
```

--------------------------------

### Install KServe (Standard Mode - Kustomize)

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Command to install KServe in standard mode using Kustomize.

```bash
./hack/kserve-install.sh --kustomize --standard
```

--------------------------------

### Configuration Composition Example

Source: https://kserve.github.io/website/docs/model-serving/generative-inference/llmisvc/llmisvc-configuration

This example demonstrates how to compose multiple LLMInferenceServiceConfig resources to define a complete LLMInferenceService.

```yaml
# Config 1: Model configuration  
apiVersion: serving.kserve.io/v1alpha1  
kind: LLMInferenceServiceConfig  
metadata:  
  name: model-llama-3-8b  
  namespace: kserve  
spec:  
  model:  
    uri: hf://meta-llama/Llama-3.1-8B-Instruct  
    name: meta-llama/Llama-3.1-8B-Instruct  
  
---
# Config 2: Workload configuration  
apiVersion: serving.kserve.io/v1alpha1  
kind: LLMInferenceServiceConfig  
metadata:  
  name: workload-single-gpu  
  namespace: kserve  
spec:  
  replicas: 3  
  template:  
    containers:  
      - name: main  
        resources:  
          limits:  
            nvidia.com/gpu: "1"  
  
---
# Config 3: Router configuration  
apiVersion: serving.kserve.io/v1alpha1  
kind: LLMInferenceServiceConfig  
metadata:  
  name: router-managed  
  namespace: kserve  
spec:  
  router:  
    route: {}  
    gateway: {}  
    scheduler: {}  
  
---
# LLMInferenceService: Compose all configs  
apiVersion: serving.kserve.io/v1alpha1  
kind: LLMInferenceService  
metadata:  
  name: my-llama-service  
  namespace: default  
spec:  
  baseRefs:  
    - name: model-llama-3-8b  
    - name: workload-single-gpu  
    - name: router-managed  
  # Optional: Override specific fields  
  replicas: 5  # Override workload-single-gpu replicas  

```

--------------------------------

### Install KServe Dependencies

Source: https://kserve.github.io/website/docs/developer-guide

Installs KServe dependencies locally.

```bash
./hack/kserve-install.sh -d --type kserve
```

--------------------------------

### Helm Deployment Patch Example

Source: https://kserve.github.io/website/docs/install/overview

An example of a Helm-specific override patch for a deployment, demonstrating how to use values.

```yaml
kind: Deployment
metadata:
  name: kserve-controller-manager
spec:
  template:
    spec:
      containers:
      - name: manager
        image: "{{ .Values.kserve.controller.image }}:{{ .Values.kserve.controller.tag }}"
        resources: {{ toYaml .Values.kserve.controller.resources | nindent 12 }}

```

--------------------------------

### Complete Router Configuration Example

Source: https://kserve.github.io/website/docs/model-serving/generative-inference/llmisvc/llmisvc-configuration

Example of a complete router configuration, including gateway, route, and scheduler.

```yaml
spec:  
  router:  
    gateway: {}     # Gateway configuration  
    route: {}       # HTTPRoute configuration  
    scheduler: {}   # Scheduler configuration  


```

--------------------------------

### Gateway Instance Installation

Source: https://kserve.github.io/website/docs/admin-guide/kubernetes-deployment-llmisvc

Installs a Gateway instance.

```bash
infra/gateway-api/manage.gateway-api-gw.sh  
```

--------------------------------

### Install KServe Resources using Helm

Source: https://kserve.github.io/website/docs/admin-guide/serverless

Installs the main KServe resources using Helm.

```bash
helm install kserve oci://ghcr.io/kserve/charts/kserve-resources --version v0.18.0  

```

--------------------------------

### Quick Install Script - Include KEDA

Source: https://kserve.github.io/website/docs/developer-guide

Installs KServe with KEDA (Kubernetes Event-driven Autoscaling).

```bash
# To include KEDA (Kubernetes Event-driven Autoscaling)  
./hack/kserve-install.sh -k  

```

--------------------------------

### Document Frontmatter Example

Source: https://kserve.github.io/website/docs/developer-guide/contribution

Example of the required frontmatter for Docusaurus documentation files.

```yaml
---
title: "Document Title"
description: "Brief description of the document content"
---

```

--------------------------------

### Infrastructure Management Examples

Source: https://kserve.github.io/website/docs/install/overview

Examples demonstrating how to manage Kubernetes infrastructure components like Istio using the provided scripts.

```bash
# Install Istio  
./hack/setup/infra/manage.istio-helm.sh  
  
# Reinstall with custom args  
REINSTALL=true \  
ISTIOD_EXTRA_ARGS="--set resources.limits.cpu=500m" \  
./hack/setup/infra/manage.istio-helm.sh  
  
# Uninstall  
UNINSTALL=true ./hack/setup/infra/manage.istio-helm.sh  

```

--------------------------------

### Quick Install Script - LLMInferenceService Dependencies Only

Source: https://kserve.github.io/website/docs/developer-guide

Installs only LLMInferenceService dependencies (without LLMInferenceService).

```bash
# To install only LLMInferenceService dependencies (without LLMInferenceService)  
./hack/kserve-install.sh -d --type llmisvc  

```

--------------------------------

### Verify helm Installation

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Command to verify the installation of helm.

```bash
helm version
```

--------------------------------

### Verify kubectl Installation

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Command to verify the installation of kubectl.

```bash
kubectl version --client
```

--------------------------------

### Deploy Kafka

Source: https://kserve.github.io/website/docs/model-serving/predictive-inference/kafka

Commands to install Kafka and Zookeeper using Helm.

```bash
helm repo add bitnami https://charts.bitnami.com/bitnami  
helm install zookeeper bitnami/zookeeper --set replicaCount=1 --set auth.enabled=false --set allowAnonymousLogin=true \ 
  --set persistance.enabled=false --version 11.0.0  
helm install kafka bitnami/kafka --set zookeeper.enabled=false --set replicaCount=1 --set persistance.enabled=false \ 
  --set logPersistance.enabled=false --set externalZookeeper.servers=zookeeper-headless.default.svc.cluster.local \ 
  --version 21.0.0  
```

--------------------------------

### Start Minikube Cluster

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Command to start a local Kubernetes cluster using Minikube.

```bash
minikube start
```

--------------------------------

### Install Kafka Event Source

Source: https://kserve.github.io/website/docs/model-serving/predictive-inference/kafka

Command to apply the Kafka Event Source.

```bash
kubectl apply -f https://github.com/knative-sandbox/eventing-kafka/releases/download/knative-v1.9.1/source.yaml  
```

--------------------------------

### Helm Template Helper for Rendering Resources

Source: https://kserve.github.io/website/docs/install/overview

Example of a Helm template helper that merges base manifests with patches.

```go-template
{{- include "kserve-common.renderMultiResourceWithPatches" (dict
  "baseFile" "files/kserve/resources.yaml"
  "patchGlob" "files/kserve/*-patch.yaml"
  "certName" "serving-cert"
  "context" .)
-}}

```

--------------------------------

### Install KServe (Knative Mode - Quick Install Script)

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Shell script to install KServe in Knative mode using the quick install script.

```bash
curl -sL "https://github.com/kserve/kserve/releases/download/v0.18.0/kserve-knative-mode-full-install-with-manifests.sh" | bash
```

--------------------------------

### Local Documentation Build Command

Source: https://kserve.github.io/website/docs/developer-guide/contribution

Commands to build and start the documentation locally for preview.

```bash
cd website
npm install
npm start

```

--------------------------------

### Install Knative Eventing Core

Source: https://kserve.github.io/website/docs/model-serving/predictive-inference/kafka

Commands to apply Knative Eventing CRDs and core components.

```bash
kubectl apply -f https://github.com/knative/eventing/releases/download/knative-v1.9.7/eventing-crds.yaml  
kubectl apply -f https://github.com/knative/eventing/releases/download/knative-v1.9.7/eventing-core.yaml  
```

--------------------------------

### All-in-One Overlay

Source: https://kserve.github.io/website/docs/install/overview

An example of an all-in-one Kustomize overlay that includes the base and all three components for a full-featured deployment.

```yaml
namespace: kserve
resources:
- ../../base
components:
- ../../components/kserve
- ../../components/llmisvc
- ../../components/localmodel

```

--------------------------------

### Install KServe (Knative Mode - Kustomize)

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Command to install KServe in Knative mode using Kustomize.

```bash
./hack/kserve-install.sh --kustomize --knative
```

--------------------------------

### Install KServe (Knative Mode - Helm)

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Command to install KServe in Knative mode using Helm.

```bash
./hack/kserve-install.sh --kserve-version v0.18.0 --type kserve --knative
```

--------------------------------

### Install lgbserver Runtime

Source: https://kserve.github.io/website/docs/model-serving/predictive-inference/frameworks/lightgbm

Commands to install the 'lgbserver' runtime package using Uv.

```bash
cd python/lgbserver
uv sync

```

--------------------------------

### Complete LLMInferenceService Configuration Example

Source: https://kserve.github.io/website/docs/model-serving/generative-inference/llmisvc/llmisvc-configuration

A comprehensive example combining model specification, parallelism, replicas, container templates, worker configuration, and router settings.

```yaml
apiVersion: serving.kserve.io/v1alpha1  
kind: LLMInferenceService  
metadata:  
  name: llama-70b-production  
  namespace: production  
spec:  
  # Model specification  
  model:  
    uri: hf://meta-llama/Llama-2-70b-hf  
    name: meta-llama/Llama-2-70b-hf  
    criticality: High  
  
  # Multi-node workload with data parallelism  
  parallelism:  
    tensor: 4  
    data: 8  
    dataLocal: 4  
  
  # Decode workload (main)  
  replicas: 2  
  template:  
    containers:  
      - name: main  
        image: vllm/vllm-openai:latest  
        args:  
          - "--model"  
          - "/mnt/models"  
          - "--tensor-parallel-size"  
          - "4"  
        resources:  
          limits:  
            nvidia.com/gpu: "4"  
            rdma/roce: "1"  
  
  # Worker pods  
  worker:  
    containers:  
      - name: main  
        image: vllm/vllm-openai:latest  
        args:  
          - "--model"  
          - "/mnt/models"  
          - "--tensor-parallel-size"  
          - "4"  
        resources:  
          limits:  
            nvidia.com/gpu: "4"  
            rdma/roce: "1"  
  
  # Router configuration  
  router:  
    gateway: {}  
    route:  
      http:  
        spec:  
          rules:  
            - timeouts:  
                request: "300s"  
                backendRequest: "300s"  
    scheduler: {}  

```

--------------------------------

### Install KServe (Standard Mode - Helm)

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Command to install KServe in standard mode using Helm.

```bash
./hack/kserve-install.sh --kserve-version v0.18.0 --type kserve --standard
```

--------------------------------

### Install Knative Dependencies using kserve-install.sh

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Installs only the infrastructure dependencies for Knative Mode KServe using the kserve-install.sh script.

```bash
./hack/kserve-install.sh --deps-only --knative
```

--------------------------------

### Quick Install Script - Standard Deployment

Source: https://kserve.github.io/website/docs/developer-guide

Clones the KServe repository and runs the quick install script for Standard deployment mode (without Knative, using Gateway API + Istio).

```bash
# Clone the repository if you haven't already  
git clone https://github.com/kserve/kserve.git  
cd kserve  
 
# Run the quick install script with Standard deployment mode (without Knative, using Gateway API + Istio)  
./hack/kserve-install.sh -r --kustomize  

```

--------------------------------

### Install via uv

Source: https://kserve.github.io/website/docs/reference/controlplane-client/controlplane-client-sdk

Checkout KServe GitHub repository and Install via uv.

```bash
cd kserve/python/kserve  
uv sync  

```

--------------------------------

### Install LLMInferenceService using kserve-install.sh (Kustomize Mode)

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Installs the LLMInferenceService component using the kserve-install.sh script in Kustomize mode.

```bash
# LLMIsvc
./hack/kserve-install.sh --kustomize --type llmisvc
```

--------------------------------

### Install the pmmlserver runtime

Source: https://kserve.github.io/website/docs/model-serving/predictive-inference/frameworks/pmml

Installs the pmmlserver runtime using Uv.

```bash
cd python/pmmlserver
uv sync
```

--------------------------------

### Install LLMInferenceService using kserve-install.sh (Helm Mode)

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Installs the LLMInferenceService component using the kserve-install.sh script in Helm mode.

```bash
./hack/kserve-install.sh --kserve-version v0.18.0 --type llmisvc
```

--------------------------------

### Install KServe + LocalModel (Standard Mode - Kustomize)

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Command to install KServe with LocalModel support in standard mode using Kustomize.

```bash
./hack/kserve-install.sh --kustomize --standard --type kserve,localmodel
```

--------------------------------

### Sample Storage Initializer Configuration

Source: https://kserve.github.io/website/docs/model-serving/storage/providers/oci

Example JSON output showing the configuration for the storage initializer, including Modelcar-specific settings.

```json
{
    "image" : "kserve/storage-initializer:latest",
    "memoryRequest": "100Mi",
    "memoryLimit": "1Gi",
    "cpuRequest": "100m",
    "cpuLimit": "1",
    "caBundleConfigMapName": "",
    "caBundleVolumeMountPath": "/etc/ssl/custom-certs",
    "enableModelcar": true,
    "cpuModelcar": "10m",
    "memoryModelcar": "15Mi",
    "uidModelcar": 1010
}

```

--------------------------------

### Install KServe + LocalModel (Knative Mode - Helm)

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Command to install KServe with LocalModel support in Knative mode using Helm.

```bash
./hack/kserve-install.sh --kserve-version v0.18.0 --type kserve,localmodel --knative
```

--------------------------------

### Install paddleserver runtime

Source: https://kserve.github.io/website/docs/model-serving/predictive-inference/frameworks/paddle

Installs the paddleserver runtime package using Uv.

```bash
cd python/paddleserver
uv sync
```

--------------------------------

### Install KServe + LocalModel (Standard Mode - Helm)

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Command to install KServe with LocalModel support in standard mode using Helm.

```bash
./hack/kserve-install.sh --kserve-version v0.18.0 --type kserve,localmodel --standard
```

--------------------------------

### Install All KServe Components (Kustomize Mode - Knative)

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Installs all KServe components, including LLMInferenceService and LocalModel, in Knative mode using Kustomize.

```bash
# All components (Knative mode)
./hack/kserve-install.sh --kustomize --knative --type kserve,llmisvc,localmodel
```

--------------------------------

### Minikube setup with MetalLB

Source: https://kserve.github.io/website/docs/model-serving/generative-inference/llmisvc/llmisvc-dependencies

Commands to start a minikube cluster, enable the MetalLB addon, and configure an IP address pool for LoadBalancer services.

```bash
# Start minikube with sufficient resources
minikube start --cpus='12' --memory='16G' --kubernetes-version=v1.33.1

# Enable MetalLB addon
minikube addons enable metallb

# Configure IP address pool
IP=$(minikube ip)
PREFIX=${IP%.*}
START=${PREFIX}.200
END=${PREFIX}.235

kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - ${START}-${END}
EOF
```

--------------------------------

### Install All KServe Components (Kustomize Mode - Standard)

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Installs all KServe components, including LLMInferenceService and LocalModel, in standard mode using Kustomize.

```bash
# All components (Standard mode)
./hack/kserve-install.sh --kustomize --standard --type kserve,llmisvc,localmodel
```

--------------------------------

### Install All KServe Components (Helm Mode - Knative)

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Installs all KServe components, including LLMInferenceService and LocalModel, in Knative mode using Helm.

```bash
# All components (Knative mode)
./hack/kserve-install.sh --kserve-version v0.18.0 --type kserve,llmisvc,localmodel --knative
```

--------------------------------

### Configure Minio Client

Source: https://kserve.github.io/website/docs/model-serving/predictive-inference/kafka

Command to add Minio host configuration.

```bash
mc config host add myminio http://127.0.0.1:9000 minio minio123  
```

--------------------------------

### Install All KServe Components (Helm Mode - Standard)

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Installs all KServe components, including LLMInferenceService and LocalModel, in standard mode using Helm.

```bash
# All components (Standard mode)
./hack/kserve-install.sh --kserve-version v0.18.0 --type kserve,llmisvc,localmodel --standard
```

--------------------------------

### Install KServe in Knative mode with LLMInferenceService and LocalModel

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Installs KServe, LLMInferenceService, and LocalModel components using the kserve-install.sh script in Knative mode.

```bash
./hack/kserve-install.sh --kustomize --knative --type kserve,localmodel
```

--------------------------------

### Generating Base Manifests with Kustomize

Source: https://kserve.github.io/website/docs/install/overview

Command to generate base manifests by running kustomize build.

```bash
kustomize build config/components/kserve > \
  charts/kserve-resources/files/kserve/resources.yaml

```

--------------------------------

### Kind cluster setup with cloud-provider-kind

Source: https://kserve.github.io/website/docs/model-serving/generative-inference/llmisvc/llmisvc-dependencies

Commands to create a kind cluster, install cloud-provider-kind, and run it in the background to simulate a LoadBalancer.

```bash
# Create kind cluster
kind create cluster -n "kserve-llm"

# Install cloud-provider-kind
go install sigs.k8s.io/cloud-provider-kind@latest

# Run cloud-provider-kind in background
cloud-provider-kind > /dev/null 2>&1 &
```

--------------------------------

### Clone KServe Repository

Source: https://kserve.github.io/website/docs/admin-guide/kubernetes-deployment-llmisvc

Clones the KServe repository and navigates to the setup directory.

```bash
git clone https://github.com/kserve/kserve.git  
cd kserve/hack/setup  
```

--------------------------------

### Run OpenAI SDK Example

Source: https://kserve.github.io/website/docs/model-serving/generative-inference/sdk-integration

Command to execute the Python script for the OpenAI SDK example.

```bash
python3 sample_openai.py  
```

--------------------------------

### Install KServe CRDs, Controllers, and Cluster Resources

Source: https://kserve.github.io/website/docs/getting-started/quickstart-guide

Applies the Custom Resource Definitions (CRDs), controllers, and cluster resources for KServe.

```bash
# Install CRDs
kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.18.0/kserve-crds.yaml
# Install Controllers
kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.18.0/kserve.yaml

# Install ClusterServingRuntimes/LLMIsvcConfigs
kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.18.0/kserve-cluster-resources.yaml
```

--------------------------------

### Setup Minio Event Notification to Kafka

Source: https://kserve.github.io/website/docs/model-serving/predictive-inference/kafka

Command to configure Minio to publish events to Kafka.

```bash
# Setup bucket event notification with Kafka  
mc admin config set myminio notify_kafka:1 tls_skip_verify="off" queue_dir="" queue_limit="0" sasl="off" sasl_password="" sasl_username="" tls_client_auth="0" tls="off" client_tls_cert="" client_tls_key="" brokers="kafka-headless.default.svc.cluster.local:9092" topic="mnist" version=""  
  
# Restart Minio  
mc admin service restart myminio  
  
```

--------------------------------

### Multi-Node Configuration Example

Source: https://kserve.github.io/website/docs/model-serving/generative-inference/llmisvc/llmisvc-configuration

Configuration for a multi-node deployment using LeaderWorkerSet, specifying tensor and data parallelism.

```yaml
spec:  
  replicas: 2  # Number of LeaderWorkerSet replicas  
  
  parallelism:  
    tensor: 4   # Tensor parallelism degree  
    data: 8     # Total data parallel instances  
    dataLocal: 4  # GPUs per node  
    # Result: 8 / 4 = 2 LWS replicas (overrides replicas: 2 if different)  
  
  template:     # Leader pod spec  
    containers:  
      - name: main  
        image: vllm/vllm-openai:latest  
        args:  
          - "--model"  
          - "/mnt/models"  
          - "--tensor-parallel-size"  
          - "4"  
        resources:  
          limits:  
            nvidia.com/gpu: "4"  
            cpu: "16"  
            memory: 128Gi  
  
  worker:       # Worker pod spec (triggers multi-node)  
    containers:  
      - name: main  
        image: vllm/vllm-openai:latest  
        args:  
          - "--model"  
          - "/mnt/models"  
          - "--tensor-parallel-size"  
          - "4"  
        resources:  
          limits:  
            nvidia.com/gpu: "4"  
            cpu: "16"  
            memory: 128Gi  


```

--------------------------------

### Expected Output

Source: https://kserve.github.io/website/docs/model-serving/predictive-inference/detect/art

Example output after running 'kubectl get inferenceservice'.

```text
NAME         URL                                               READY   DEFAULT TRAFFIC   CANARY TRAFFIC   AGE  
artserver   http://artserver.somecluster/v1/models/artserver   True    100                                40m  

```

--------------------------------

### Running the LangChain Example

Source: https://kserve.github.io/website/docs/model-serving/generative-inference/sdk-integration

Execute the Python script to observe both regular and streaming responses from a LangChain integration.

```bash
python3 sample_langchain.py  
```

--------------------------------

### GatewayClass Installation

Source: https://kserve.github.io/website/docs/admin-guide/kubernetes-deployment-llmisvc

Installs the GatewayClass resource.

```bash
infra/gateway-api/manage.gateway-api-gwclass.sh  
```

--------------------------------

### Quick Install Script - Standard Mode Dependencies Only

Source: https://kserve.github.io/website/docs/developer-guide

Installs only Standard mode dependencies (without KServe).

```bash
# To install only Standard mode dependencies (without KServe)  
./hack/kserve-install.sh -d --standard  

```

--------------------------------

### Option 1: Use Direct DeploymentStrategy (Recommended)

Source: https://kserve.github.io/website/docs/model-serving/predictive-inference/rollout-strategies/rollout-strategy-standard

Example of configuring a rollout strategy directly within the InferenceService specification using the `deploymentStrategy` field.

```yaml
spec:
  predictor:
    deploymentStrategy:
      type: RollingUpdate
      rollingUpdate:
        maxUnavailable: "0"    # Availability mode
        maxSurge: "25%"        # Use the ratio value
```