### Local Development Setup

Source: https://github.com/clickhouse/code-interpreter/blob/main/README.md

Use this command to set up the Code Interpreter service for local development using Docker Compose. Ensure Docker is installed and running.

```bash
docker-compose up --build
```

--------------------------------

### Start Minikube for Local Development

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/README.md

Starts a Minikube instance with 4 CPUs and 8192 MB of memory for local development.

```bash
minikube start --cpus=4 --memory=8192
```

--------------------------------

### Start Minikube and Deploy Code API

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/README.md

Commands to start the Minikube environment and deploy the codeapi Helm chart. The package-init job runs automatically upon deployment.

```bash
minikube start
```

```bash
helm install codeapi ./helm/codeapi -f ./helm/codeapi/values-local.yaml
```

```bash
kubectl port-forward svc/codeapi-api 3112:3112
```

--------------------------------

### Install Helm Chart and Dependencies

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/README.md

Installs the Code Interpreter Helm chart and its dependencies, using a local values file for configuration. Run `helm dependency update` first so the Redis dependency is downloaded.

```bash
cd helm/codeapi

# Download chart dependencies (Redis)
helm dependency update

# Deploy! Override internalServiceAuth.token for any shared/prod cluster.
# values-local.yaml supplies a TEST-ONLY executionManifest keypair; without it
# (or your own keypair) the install fails fast — see "Execution manifest
# signing keys" above.
helm install codeapi . -f values-local.yaml
```

--------------------------------

### Kubernetes Quick Start: Minikube Seccomp Profile Deployment

Source: https://github.com/clickhouse/code-interpreter/blob/main/seccomp/README.md

For Minikube, this process involves extracting the seccomp profile from a ConfigMap, copying it to the appropriate Kubelet directory on the node, and then enabling seccomp in the Helm values for your application.

```bash
# Extract profile from ConfigMap and copy to node
kubectl get configmap codeapi-seccomp-profile -o jsonpath='{.data.nsjail\.json}' > /tmp/nsjail.json
minikube cp /tmp/nsjail.json /var/lib/kubelet/seccomp/profiles/nsjail.json

# Enable seccomp in Helm values
helm upgrade codeapi ./helm/codeapi --set workerSandbox.seccomp.enabled=true
```

--------------------------------

### Build and Run Locally with Docker Compose

Source: https://github.com/clickhouse/code-interpreter/blob/main/api/README.md

Builds the Docker image and starts the code interpreter service using Docker Compose. Ensure you are in the codeapi root directory.

```bash
# Build and run locally with docker-compose, from the codeapi root
docker compose up --build
```

--------------------------------

### Run MinIO Docker Container

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/src/README.md

This command starts a MinIO server instance in a Docker container. It maps ports 9000 and 9001, sets root credentials via environment variables, and mounts a local directory for data storage.

```bash
docker run -p 9000:9000 -p 9001:9001 --name minio \
  -e "MINIO_ROOT_USER=AKIAIOSFODNN7EXAMPLE" \
  -e "MINIO_ROOT_PASSWORD=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY" \
  -v ~/minio/data:/data \
  quay.io/minio/minio server /data --console-address ":9001"
```

--------------------------------

### GET /api/v2/runtimes

Source: https://github.com/clickhouse/code-interpreter/blob/main/api/README.md

Retrieves a list of all language runtimes that are currently supported and available for code execution.

```APIDOC
## GET /api/v2/runtimes

### Description
List available language runtimes.

### Method
GET

### Endpoint
/api/v2/runtimes
```

--------------------------------

### Kubernetes DaemonSet for Seccomp Profile Installation

Source: https://github.com/clickhouse/code-interpreter/blob/main/seccomp/README.md

A DaemonSet can be used to automatically deploy the seccomp profile to the required location on each node in a production Kubernetes cluster. This DaemonSet runs an init container to copy the profile from a ConfigMap to the host's seccomp directory.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: seccomp-profile-installer
spec:
  selector:
    matchLabels:
      app: seccomp-installer
  template:
    metadata:
      labels:
        app: seccomp-installer
    spec:
      initContainers:
        - name: install-profile
          image: busybox:1.36.1
          command:
            - sh
            - -c
            - |
              mkdir -p /host/var/lib/kubelet/seccomp/profiles
              cp /profiles/nsjail.json /host/var/lib/kubelet/seccomp/profiles/
          volumeMounts:
            - name: host-seccomp
              mountPath: /host/var/lib/kubelet/seccomp
            - name: profiles
              mountPath: /profiles
          securityContext:
            privileged: true
      containers:
        - name: pause
          image: gcr.io/google_containers/pause:3.2
          resources:
            requests:
              cpu: 1m
              memory: 1Mi
      volumes:
        - name: host-seccomp
          hostPath:
            path: /var/lib/kubelet/seccomp
            type: DirectoryOrCreate
        - name: profiles
          configMap:
            name: codeapi-seccomp-profile
      tolerations:
        - operator: Exists
```

--------------------------------

### Generate and Use Ed25519 Signing Keys for Helm Chart

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/README.md

Generates an Ed25519 keypair and installs the Helm chart with the keys set. This is crucial for signing sandbox execute requests.

```bash
# Generate the Ed25519 private key (PEM)
openssl genpkey -algorithm ed25519 -out manifest-signing.pem

# Extract the public key (PEM)
openssl pkey -in manifest-signing.pem -pubout -out manifest-signing.pub.pem

# base64-encoded DER values for the chart
PRIVATE_KEY=$(openssl pkey -in manifest-signing.pem -outform DER | base64)
PUBLIC_KEY=$(openssl pkey -in manifest-signing.pem -pubout -outform DER | base64)

helm install codeapi . \
  --set executionManifest.privateKey="$PRIVATE_KEY" \
  --set executionManifest.publicKey="$PUBLIC_KEY"
```

--------------------------------

### List Runtimes API Endpoint

Source: https://github.com/clickhouse/code-interpreter/blob/main/api/README.md

The GET endpoint used to retrieve a list of all language runtimes currently available for code execution.

```http
GET /api/v2/runtimes
```

--------------------------------

### Docker Compose Seccomp Configuration

Source: https://github.com/clickhouse/code-interpreter/blob/main/seccomp/README.md

Integrate the custom seccomp profile into your Docker Compose setup by specifying it under the `security_opt` directive for the sandbox service.

```yaml
services:
  sandbox:
    security_opt:
      - seccomp=./seccomp/nsjail.json
```

--------------------------------

### POST /api/v2/execute

Source: https://github.com/clickhouse/code-interpreter/blob/main/api/README.md

Executes code within a sandboxed environment. This endpoint is used to run user-provided code and get the results.

```APIDOC
## POST /api/v2/execute

### Description
Execute code in a sandboxed environment.

### Method
POST

### Endpoint
/api/v2/execute
```

--------------------------------

### Track Execution Duration

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

Calculate the duration of an execution by subtracting the start time from the current time. Store this in the ExecutionState.

```typescript
// Track in ExecutionState
const duration = Date.now() - state.startTime;
```
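The calculation above can be wrapped in a small helper. This is a minimal sketch: the `TimedExecutionState` type, the `durationMs` field, and the `recordDuration` function are illustrative names that do not appear in the source, and the clock is injectable only so the arithmetic can be checked deterministically.

```typescript
// Hypothetical minimal shape; the real ExecutionState carries more fields.
interface TimedExecutionState {
  startTime: number;   // epoch milliseconds, as returned by Date.now()
  durationMs?: number; // assumed field name for the stored duration
}

// Returns a copy of the state with the elapsed time recorded.
// `now` defaults to Date.now and can be replaced in tests.
function recordDuration(
  state: TimedExecutionState,
  now: () => number = Date.now
): TimedExecutionState {
  // Clamp at zero in case the wall clock moved backwards.
  const durationMs = Math.max(0, now() - state.startTime);
  return { ...state, durationMs };
}

const finished = recordDuration({ startTime: Date.now() });
console.log(`execution took ${finished.durationMs} ms`);
```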

--------------------------------

### Kubernetes DaemonSet for AppArmor Profile Loading

Source: https://github.com/clickhouse/code-interpreter/blob/main/apparmor/README.md

A Kubernetes DaemonSet configuration to load AppArmor profiles onto nodes. It uses an Alpine container to install AppArmor, copy the profile, and parse it.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: apparmor-loader
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: apparmor-loader
  template:
    metadata:
      labels:
        app: apparmor-loader
    spec:
      hostPID: true
      containers:
        - name: loader
          image: alpine
          securityContext:
            privileged: true
          command: ["/bin/sh", "-c"]
          args:
            - |
              apk add --no-cache apparmor
              cat > /etc/apparmor.d/sandbox-nsjail << 'PROFILE'
              # Copy contents of sandbox-nsjail file here
              PROFILE
              apparmor_parser -r /etc/apparmor.d/sandbox-nsjail
              echo "Profile loaded"
              sleep infinity
          volumeMounts:
            - name: sys
              mountPath: /sys
            - name: apparmor
              mountPath: /etc/apparmor.d
      volumes:
        - name: sys
          hostPath:
            path: /sys
        - name: apparmor
          hostPath:
            path: /etc/apparmor.d
```

--------------------------------

### Health Check Endpoint Example

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

Example request and JSON response for the health check endpoint, which reports the status of each dependency (Redis, the Tool Call Server, the sandbox) and the number of active executions.

```text
GET /v1/exec/programmatic/health
{
  "redis": true,
  "tool_call_server": true,
  "sandbox": true,
  "active_executions": 5
}
```
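A client or monitoring script might interpret that response as follows. This is a sketch only: the `HealthStatus` type and `failingDependencies` function are hypothetical names, and the field list is taken from the sample response above rather than from a published schema.

```typescript
// Field names copied from the sample response; the type name is an assumption.
interface HealthStatus {
  redis: boolean;
  tool_call_server: boolean;
  sandbox: boolean;
  active_executions: number;
}

// Returns the names of dependencies reporting false; an empty array means all are healthy.
function failingDependencies(status: HealthStatus): string[] {
  const checks: Array<[string, boolean]> = [
    ["redis", status.redis],
    ["tool_call_server", status.tool_call_server],
    ["sandbox", status.sandbox],
  ];
  return checks.filter(([, ok]) => !ok).map(([name]) => name);
}

// Sample payload with one failing dependency, for illustration.
const sample: HealthStatus = JSON.parse(
  '{"redis": true, "tool_call_server": false, "sandbox": true, "active_executions": 5}'
);
console.log(failingDependencies(sample));
```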

--------------------------------

### Kubernetes Testing with Minikube

Source: https://github.com/clickhouse/code-interpreter/blob/main/apparmor/README.md

Steps to set up and test the sandbox in a Kubernetes environment using minikube and Helm.

```bash
# Start minikube
minikube start --cpus=4 --memory=8192

# Deploy with helm
./helm/setup-local.sh minikube

# Port forward and test
kubectl port-forward deploy/codeapi-worker-sandbox 2000:2000
./test-sandbox.sh
```

--------------------------------

### Sandbox Filesystem Layout

Source: https://github.com/clickhouse/code-interpreter/blob/main/api/README.md

Illustrates the directory structure within the code execution sandbox, including working directories, temporary storage, and read-only system mounts.

```text
/mnt/data/          Working directory (bind mount, writable)
  ├── *.py          User code files
  ├── *.js / *.ts   User code files
  └── ...           Downloaded files from file server
/tmp/               tmpfs (20MB, writable)
/usr/               Host /usr (read-only bind mount)
/bin -> /usr/bin    Symlink (merged-usr)
/lib -> /usr/lib    Symlink (merged-usr)
/lib64 -> /usr/lib64  Symlink (merged-usr)
/proc/              procfs (read-only, required by Bun)
/dev/null           Device node (read-only)
/dev/urandom        Device node (read-only)
/dev/zero           Device node (read-only)
/pkgs/              Language runtime packages (read-only bind mount)
```

--------------------------------

### Update Code API Images and Restart Deployments

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/README.md

Instructions for rebuilding Docker images and restarting deployments to apply code changes. Run `eval $(minikube docker-env)` first so the images are built against Minikube's Docker daemon rather than the host's.

```bash
eval $(minikube docker-env)
```

```bash
docker build -t codeapi-worker:latest -f service/Dockerfile.worker .
```

```bash
docker build -t codeapi-sandbox-runner:latest -f api/Dockerfile .
```

```bash
kubectl rollout restart deployment/codeapi-service-worker
```

```bash
kubectl rollout restart deployment/codeapi-sandbox-runner
```

--------------------------------

### Build Docker Image in Minikube

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/README.md

When pods are stuck in `ErrImageNeverPull`, ensure images are built within Minikube's Docker environment before deploying.

```bash
eval $(minikube docker-env)
docker build -t <image-name>:latest ...
kubectl rollout restart deployment/<deployment-name>
```

--------------------------------

### Address MinIO ImagePullBackOff in Production

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/README.md

For production environments, if MinIO experiences `ImagePullBackOff`, either enable `minio.useSimple=true` in your values to use the official image or specify a valid Bitnami image tag in your `values.yaml`.

```bash
# The Bitnami MinIO chart may reference unavailable image tags.
# For local dev, values-local.yaml uses minio.useSimple=true which
# deploys the official minio/minio:latest image instead.
#
# If you see this in production, either:
# 1. Use minio.useSimple=true
# 2. Or specify a valid Bitnami image tag in values.yaml
```

--------------------------------

### View Sandbox Runner Pod Logs

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/templates/NOTES.txt

Stream logs from the sandbox runner component pods. Use the -f flag to follow the logs in real-time.

```bash
kubectl logs -l app.kubernetes.io/component=sandbox-runner -f
```

--------------------------------

### View All Pods

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/templates/NOTES.txt

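List all pods associated with the current Helm release instance. This is useful for checking the status of all deployed components. `{{ .Release.Name }}` is a Helm template placeholder that Helm fills in when it renders NOTES.txt; substitute your release name (for example `codeapi`) when running the command by hand.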

```bash
kubectl get pods -l app.kubernetes.io/instance={{ .Release.Name }}
```

--------------------------------

### Build Docker Images within Minikube

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/README.md

Builds all necessary Docker images for the Code Interpreter project, ensuring they are tagged correctly and built using Minikube's Docker daemon.

```bash
# Point docker to minikube's daemon
eval $(minikube docker-env)

# Build all images, from the codeapi root
docker build -t codeapi-api:latest -f service/Dockerfile.api .
docker build -t codeapi-worker:latest -f service/Dockerfile.worker .
docker build -t codeapi-sandbox-runner:latest -f api/Dockerfile .
docker build -t codeapi-file-server:latest -f service/Dockerfile --target production .
docker build -t codeapi-tool-call-server:latest -f service/Dockerfile.tool-call-server --target production .
docker build -t codeapi-package-init:latest -f docker/Dockerfile.package-init .
```

--------------------------------

### Testing Seccomp Profile with Docker

Source: https://github.com/clickhouse/code-interpreter/blob/main/seccomp/README.md

After applying the seccomp profile, run the provided test script to verify its correct implementation. Failures, such as 'Operation not permitted' errors, indicate a syscall is being blocked and may require adjustment to the profile's whitelist.

```bash
./test-sandbox-docker.sh
```

--------------------------------

### Resolve 'runtime is unknown' Error

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/README.md

This error often indicates an issue with language packages. Verify the init job ran successfully and consider forcing a rebuild if necessary. Restart sandbox-runner pods after applying changes.

```bash
kubectl get jobs -l app.kubernetes.io/component=package-init
kubectl logs job/codeapi-package-init
```

```bash
helm upgrade codeapi . --set workerSandbox.packages.initJob.forceRebuild=true
kubectl rollout restart deployment/codeapi-sandbox-runner
```

--------------------------------

### Check Package Init Job Status

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/README.md

Retrieves the status of Kubernetes jobs related to package initialization and displays the logs for the package init job.

```bash
kubectl get jobs -l app.kubernetes.io/component=package-init
kubectl logs job/codeapi-package-init
```

--------------------------------

### Local Testing with Docker Compose

Source: https://github.com/clickhouse/code-interpreter/blob/main/apparmor/README.md

Commands to test the sandbox container with and without AppArmor profiles using Docker Compose. AppArmor requires a native Linux environment.

```bash
# Baseline (privileged: true)
docker compose -f docker-compose.local-dev.yml up -d
./test-sandbox.sh

# Capability-restricted (no privileged, explicit caps)
docker compose -f docker-compose.local-dev.yml -f your-capability-overlay.yml up -d
./test-sandbox.sh
```

--------------------------------

### Manually Populate Packages PVC

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/README.md

Manually populates the packages Persistent Volume Claim (PVC) by running a temporary pod, copying data, and then deleting the pod. This is an alternative to automatic package initialization.

```bash
# Flags must precede `--`; everything after it is passed to the container as its command.
# The volumes list belongs inside "spec", next to "containers".
kubectl run pvc-populator --image=alpine \
  --overrides='{"spec":{"containers":[{"name":"pvc-populator","image":"alpine","command":["sleep","3600"],"volumeMounts":[{"name":"packages","mountPath":"/packages"}]}],"volumes":[{"name":"packages","persistentVolumeClaim":{"claimName":"codeapi-packages"}}]}}' \
  --command -- sleep 3600

kubectl wait --for=condition=ready pod/pvc-populator --timeout=60s
kubectl cp ./data/pkgs/. pvc-populator:/packages/
kubectl delete pod pvc-populator
kubectl rollout restart deployment/codeapi-sandbox-runner
```

--------------------------------

### Execute Code

Source: https://github.com/clickhouse/code-interpreter/blob/main/api/README.md

Submits user code, language, and optional files/stdin to the sandbox for execution. The API returns stdout, stderr, exit code, and signal information.

```APIDOC
## POST /api/v2/execute

### Description
Submits user code, language, and optional files/stdin to the sandbox for execution. The API returns stdout, stderr, exit code, and signal information.

### Method
POST

### Endpoint
/api/v2/execute

### Request Body
- **code** (string) - Required - The user-provided code to execute.
- **language** (string) - Required - The programming language of the code (e.g., "python", "javascript").
- **files** (object) - Optional - A map of filenames to their content for files to be included in the sandbox.
- **stdin** (string) - Optional - Standard input to be provided to the executed code.

### Response
#### Success Response (200)
- **stdout** (string) - The standard output of the executed code.
- **stderr** (string) - The standard error of the executed code.
- **exit_code** (integer) - The exit code of the executed process.
- **signal** (integer) - The signal that terminated the process, if any.
```

--------------------------------

### Troubleshoot Pod CrashLoopBackOff

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/README.md

If a pod is stuck in `CrashLoopBackOff`, check its logs and describe the pod for detailed error information.

```bash
kubectl logs <pod-name> --previous
kubectl describe pod <pod-name>
```

--------------------------------

### Check Code API Pod Status and Logs

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/README.md

Commands to view all pods, retrieve logs from specific deployments, and describe a pod for debugging purposes.

```bash
kubectl get pods
```

```bash
kubectl logs deployment/codeapi-api
```

```bash
kubectl logs deployment/codeapi-service-worker
```

```bash
kubectl logs deployment/codeapi-sandbox-runner
```

```bash
kubectl describe pod <pod-name>
```

--------------------------------

### View API Pod Logs

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/templates/NOTES.txt

Stream logs from the API component pods. Use the -f flag to follow the logs in real-time.

```bash
kubectl logs -l app.kubernetes.io/component=api -f
```

--------------------------------

### Scale Execution Pods

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/templates/NOTES.txt

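Scale the sandbox runner deployment to increase execution capacity. This command directly adjusts the number of sandbox runner pods. Helm renders `{{ include "codeapi.fullname" . }}` to the release's full resource name in NOTES.txt.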

```bash
kubectl scale deployment/{{ include "codeapi.fullname" . }}-sandbox-runner --replicas=10
```

--------------------------------

### Scale Sandbox Execution Tier

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/README.md

Commands to scale the sandbox execution tier using kubectl or Helm upgrade. This adjusts the number of sandbox runner replicas.

```bash
kubectl scale deployment/codeapi-sandbox-runner --replicas=10
```

```bash
helm upgrade codeapi ./helm/codeapi -f ./helm/codeapi/values-local.yaml \
  --set workerSandbox.sandboxRunner.replicaCount=10
```

--------------------------------

### Scale API Pods

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/templates/NOTES.txt

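Scale the API deployment to increase HTTP capacity. This command directly adjusts the number of API pods. Helm renders `{{ include "codeapi.fullname" . }}` to the release's full resource name in NOTES.txt.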

```bash
kubectl scale deployment/{{ include "codeapi.fullname" . }}-api --replicas=5
```

--------------------------------

### Search for Cleanup Failures

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

Use grep to find log entries detailing errors that occurred during execution cleanup.

```bash
# Find cleanup failures
grep "Error during execution cleanup" logs/error-*.log
```

--------------------------------

### View Egress Gateway Pod Logs

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/templates/NOTES.txt

Stream logs from the egress gateway component pods. Use the -f flag to follow the logs in real-time.

```bash
kubectl logs -l app.kubernetes.io/component=egress-gateway -f
```

--------------------------------

### Set Execution State TTL

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

Configure the Time-To-Live for execution state in seconds. Defaults to 600 seconds (10 minutes).

```bash
# Execution state TTL (in seconds)
# Default: 600 (10 minutes)
EXECUTION_STATE_TTL=600
```
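A service reading this setting would typically validate it before use. The sketch below is an assumption about how that could look, not the project's actual parsing code: `parseTtlSeconds` is a hypothetical helper, and falling back to the default on invalid input is a design choice made here for illustration.

```typescript
// Parses a TTL given in seconds, falling back to the documented default of 600.
// Assumption: empty, non-integer, or non-positive values are treated as "unset".
function parseTtlSeconds(raw: string | undefined, fallbackSeconds = 600): number {
  if (raw === undefined || raw.trim() === "") return fallbackSeconds;
  const value = Number(raw);
  return Number.isInteger(value) && value > 0 ? value : fallbackSeconds;
}

// In Node this would be called as parseTtlSeconds(process.env.EXECUTION_STATE_TTL).
console.log(parseTtlSeconds("900"));     // a valid override
console.log(parseTtlSeconds(undefined)); // falls back to the default
```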

--------------------------------

### Configure API Authentication with JWT

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/README.md

Configure JWT verification for the API component by setting environment variables. This is required for production deployments outside of local mode.

```yaml
api:
  extraEnv:
    - name: CODEAPI_AUTH_PROVIDER
      value: librechat-jwt
    - name: CODEAPI_JWT_PUBLIC_KEY     # single PEM/base64-DER verifier key
      valueFrom:
        secretKeyRef:
          name: codeapi-jwt-verifier
          key: public-key
    - name: CODEAPI_JWT_KID
      value: my-key-id
```

--------------------------------

### Force Package Initialization Job Rebuild

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/README.md

Forces the package initialization job to rebuild language packages. This is useful after updates or when migrating package roots.

```bash
helm upgrade codeapi . --set workerSandbox.packages.initJob.forceRebuild=true
```

--------------------------------

### AppArmor Debug Mode (Complain)

Source: https://github.com/clickhouse/code-interpreter/blob/main/apparmor/README.md

Instructions to temporarily set an AppArmor profile to 'complain' mode for debugging. This logs potential denials without blocking actions.

```bash
# Set profile to complain mode (log but don't block)
aa-complain sandbox-nsjail

# Run tests, check dmesg for what would be denied
dmesg | grep -i apparmor

# Re-enable enforcement
aa-enforce sandbox-nsjail
```

--------------------------------

### Kubernetes Verification Commands

Source: https://github.com/clickhouse/code-interpreter/blob/main/apparmor/README.md

Commands to verify AppArmor profile deployment in a Kubernetes cluster. Checks node OS image and pod annotations.

```bash
# Check the node OS image (this does not confirm the profile is loaded; run aa-status on the node for that)
kubectl get nodes -o jsonpath='{.items[*].status.nodeInfo.osImage}'

# Check pod annotation
kubectl get pod -l app.kubernetes.io/component=worker-sandbox -o jsonpath='{.items[0].metadata.annotations}'
```

--------------------------------

### Port Forwarding for Local API Access

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/templates/NOTES.txt

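Use this command to forward a local port to the API service, then check the API at http://localhost:{{ .Values.api.service.port }}/v1/health. Helm fills in the `{{ ... }}` placeholders when it renders NOTES.txt; other commands in this document use port 3112 for this service.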

```bash
kubectl port-forward svc/{{ include "codeapi.fullname" . }}-api {{ .Values.api.service.port }}:{{ .Values.api.service.port }}

# Then access at:
#   http://localhost:{{ .Values.api.service.port }}/v1/health
```

--------------------------------

### Execute Python Code via API

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/README.md

Sends a POST request to the /v1/exec endpoint to execute Python code. Requires Content-Type and X-API-Key headers, and a JSON payload specifying the language and code.

```bash
curl -X POST http://localhost:3112/v1/exec \
  -H "Content-Type: application/json" \
  -H "X-API-Key: test-key" \
  -d '{"lang": "py", "code": "print(\"Hello from K8s!\")"}'
```
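The same request can be assembled programmatically, which avoids hand-escaping quotes in the code string. This is a sketch under assumptions: `buildExecRequest` and `ExecRequest` are hypothetical names, the URL, header names, and payload fields are copied from the curl command above, and the response format is not documented here so it is left unhandled.

```typescript
// Payload fields as used in the curl example.
interface ExecRequest {
  lang: string;
  code: string;
}

// Builds the URL and fetch-style options for POST /v1/exec.
// The defaults mirror the local port-forward setup and the test API key shown above.
function buildExecRequest(
  code: string,
  lang = "py",
  baseUrl = "http://localhost:3112",
  apiKey = "test-key"
) {
  const payload: ExecRequest = { lang, code };
  return {
    url: `${baseUrl}/v1/exec`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json", "X-API-Key": apiKey },
      body: JSON.stringify(payload), // JSON.stringify handles quote escaping
    },
  };
}

const request = buildExecRequest('print("Hello from K8s!")');
console.log(request.url, request.init.body);
```

In a runtime that provides `fetch` (for example a recent Node release or a browser), the result could be sent with `fetch(request.url, request.init)`.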

--------------------------------

### Configure Tool Call Server Retries

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

Set the number of retry attempts and the delay in milliseconds for the Tool Call Server.

```bash
# Tool Call Server retry settings
TOOL_CALL_SERVER_RETRY_ATTEMPTS=3
TOOL_CALL_SERVER_RETRY_DELAY=1000  # milliseconds
```

--------------------------------

### Port Forward API Service

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/README.md

Establishes a port forwarding connection from your local machine to the Code Interpreter API service running in Kubernetes.

```bash
# Port forward (in another terminal)
kubectl port-forward svc/codeapi-api 3112:3112
```

--------------------------------

### Redis-Based Execution State (After)

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

The new implementation utilizes Redis to store execution states, enabling multi-instance safety, persistence across restarts, and automatic TTL-based expiry.

```typescript
// Redis keys: exec_state:{execution_id}
interface ExecutionState {
  execution_id: string;
  session_id: string;
  userId: string;
  apiKeyId: string;
  startTime: number;
  jobCompleted?: boolean;
  jobResult?: t.ExecuteResult;
  jobError?: string;
}
```
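To make the key naming and TTL-based expiry concrete, the sketch below emulates them with an in-memory map so it runs without a Redis server. `InMemoryStateStore` and `StoredExecutionState` are hypothetical names; only the `exec_state:{execution_id}` key format and the field names come from the snippet above, and a real deployment would issue the equivalent Redis commands instead.

```typescript
// Subset of the ExecutionState fields shown above.
interface StoredExecutionState {
  execution_id: string;
  session_id: string;
  startTime: number;
  jobCompleted?: boolean;
}

// In-memory stand-in for a key-value store with per-key expiry.
class InMemoryStateStore {
  private entries = new Map<string, { json: string; expiresAt: number }>();
  private now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now; // injectable clock, so expiry can be tested without waiting
  }

  private key(executionId: string): string {
    return `exec_state:${executionId}`;
  }

  // Comparable in spirit to storing a serialized value with an expiry in seconds.
  save(state: StoredExecutionState, ttlSeconds: number): void {
    this.entries.set(this.key(state.execution_id), {
      json: JSON.stringify(state),
      expiresAt: this.now() + ttlSeconds * 1000,
    });
  }

  load(executionId: string): StoredExecutionState | undefined {
    const key = this.key(executionId);
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired entries are dropped on access
      return undefined;
    }
    return JSON.parse(entry.json) as StoredExecutionState;
  }
}

const store = new InMemoryStateStore();
store.save({ execution_id: "e1", session_id: "s1", startTime: Date.now() }, 600);
console.log(store.load("e1")?.session_id);
```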

--------------------------------

### Upgrade Helm Release for Scaling

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/templates/NOTES.txt

Upgrade the Helm release to adjust replica counts for service workers and sandbox runners. This is an alternative to direct scaling commands. `{{ .Release.Name }}` is filled in by Helm when it renders NOTES.txt.

```bash
helm upgrade {{ .Release.Name }} ./helm/codeapi \
    --set workerSandbox.serviceWorker.replicaCount=2 \
    --set workerSandbox.sandboxRunner.replicaCount=10
```

--------------------------------

### Search for Client Disconnects

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

Use grep to find log entries indicating a client has disconnected for a specific execution.

```bash
# Find client disconnects
grep "Client disconnected for execution" logs/combined-*.log
```

--------------------------------

### Search for Retry Attempts

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

Use grep to find log entries related to failed attempts during retries.

```bash
# Find retry attempts
grep "failed (attempt" logs/combined-*.log
```

--------------------------------

### In-Memory Execution State (Before)

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

The previous implementation used an in-memory Map to store active execution states, which was not suitable for multi-instance or restart scenarios.

```typescript
const activeExecutions = new Map<string, {...}>(); // In-memory only!
```

--------------------------------

### Test Tool Call Server Unavailability

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

This test checks how the system handles the Tool Call Server being unavailable. It expects the execution to fail gracefully with a 503 error after retries.

```bash
# Stop Tool Call Server
docker-compose stop tool_call_server

# Try execution (should fail gracefully)
./test-programmatic.sh simple

# Should: Return 503 error after retries
```

--------------------------------

### Replace Polling with Redis Pub/Sub

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

This snippet illustrates replacing a polling mechanism with Redis pub/sub for instant notifications of execution completion.

```typescript
// Instead of:
while (...) { await redis.get(...); await sleep(100); }

// Use:
await redis.subscribe(`exec:${execution_id}:complete`);
```
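The waiting side of that pattern can be sketched without a Redis server by using an in-memory stand-in. Everything here is illustrative: `InMemoryPubSub` and `waitForCompletion` are hypothetical names, the class is not the Redis client API, and only the `exec:{execution_id}:complete` channel name is taken from the snippet above.

```typescript
type Listener = (message: string) => void;

// In-memory stand-in for a pub/sub client (hypothetical; not a Redis API).
class InMemoryPubSub {
  private listeners = new Map<string, Set<Listener>>();

  // Registers a listener and returns a function that removes it.
  subscribe(channel: string, listener: Listener): () => void {
    const set = this.listeners.get(channel) ?? new Set<Listener>();
    set.add(listener);
    this.listeners.set(channel, set);
    return () => {
      set.delete(listener);
    };
  }

  // Delivers the message and returns the number of listeners notified.
  publish(channel: string, message: string): number {
    const set = this.listeners.get(channel);
    if (!set) return 0;
    const targets = Array.from(set); // copy: listeners may unsubscribe while notified
    for (const listener of targets) listener(message);
    return targets.length;
  }
}

// Resolves when a completion message arrives, or rejects after timeoutMs.
function waitForCompletion(
  bus: InMemoryPubSub,
  executionId: string,
  timeoutMs: number
): Promise<string> {
  return new Promise<string>((resolve, reject) => {
    const channel = `exec:${executionId}:complete`;
    const timer = setTimeout(() => {
      unsubscribe();
      reject(new Error(`Timed out waiting for ${channel}`));
    }, timeoutMs);
    const unsubscribe = bus.subscribe(channel, (message) => {
      clearTimeout(timer);
      unsubscribe();
      resolve(message);
    });
  });
}
```

One design point carries over to Redis itself: pub/sub messages are not stored, so a subscriber that joins after the message was published never sees it. Subscribing first and then checking the stored execution state once closes that gap.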

--------------------------------

### Checking AppArmor Status on Node

Source: https://github.com/clickhouse/code-interpreter/blob/main/apparmor/README.md

Commands to check the AppArmor status on a Kubernetes node and verify if the sandbox-nsjail profile is loaded and active.

```bash
# On node
cat /sys/module/apparmor/parameters/enabled  # Should be Y
aa-status | grep sandbox-nsjail

# Check denials
dmesg | grep -i apparmor
```

--------------------------------

### Test Code Execution with cURL

Source: https://github.com/clickhouse/code-interpreter/blob/main/api/README.md

Tests the execution endpoint of the code interpreter by sending a Python script to be executed. The output is piped to jq for pretty-printing.

```bash
# Test execution
curl -s http://localhost:2000/api/v2/execute \
  -H 'Content-Type: application/json' \
  -d '{"language":"python","version":"3.14.4","files":[{"content":"print(42)"}]}' | jq
```

--------------------------------

### Search for Stale Executions

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

Use grep to find log entries indicating the cleanup of stale executions.

```bash
# Find stale executions
grep "Cleaning up stale execution" logs/combined-*.log
```

--------------------------------

### Test Client Disconnect

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

This test simulates a client disconnecting abruptly during execution. It checks for proper cleanup of execution state and the Tool Call Server session. Verification involves checking Redis for leftover keys.

```bash
# Start execution and kill immediately
timeout 2s ./test-programmatic.sh simple || true

# Should: Clean up execution state and Tool Call Server session
# Verify: Check Redis for leftover keys
redis-cli keys "exec_state:*"
```

--------------------------------

### Retry Logic for Tool Call Server Communication

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

All requests to the Tool Call Server now include automatic retry logic with exponential backoff: up to 3 attempts, with delays of 1s and 2s between them. Retries are skipped for 4xx errors, since client errors will not succeed on a second try.

```typescript
async function retryToolCallServerRequest<T>(
  requestFn: () => Promise<T>,
  context: string
): Promise<T> {
  // 3 attempts
  // Delays: 1s, 2s (exponential backoff)
  // Skips retry on 4xx errors (client errors)
}

```
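The source shows only the signature and the intended behavior. One way that behavior could be implemented is sketched below; it is not the project's code. `retryWithBackoff`, `RetryOptions`, and `isClientError` are hypothetical names, and the sketch assumes that failed requests throw an error carrying a numeric `status` property.

```typescript
interface RetryOptions {
  attempts?: number;                     // total attempts, including the first
  baseDelayMs?: number;                  // delay before the second attempt
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

// Assumption: request errors expose the HTTP status as a numeric `status` field.
function isClientError(err: unknown): boolean {
  const status = (err as { status?: unknown } | null)?.status;
  return typeof status === "number" && status >= 400 && status < 500;
}

async function retryWithBackoff<T>(
  requestFn: () => Promise<T>,
  context: string,
  options: RetryOptions = {}
): Promise<T> {
  const attempts = options.attempts ?? 3;
  const baseDelayMs = options.baseDelayMs ?? 1000;
  const sleep =
    options.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await requestFn();
    } catch (err) {
      lastError = err;
      // Stop on client errors and after the final attempt.
      if (isClientError(err) || attempt === attempts) break;
      console.warn(`${context} failed (attempt ${attempt}/${attempts}), retrying`);
      await sleep(baseDelayMs * 2 ** (attempt - 1)); // 1000 ms, then 2000 ms
    }
  }
  throw lastError;
}
```

With the defaults, a request that keeps failing with a 5xx or network error is tried three times over roughly three seconds before the last error is rethrown.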

--------------------------------

### View Service Worker Pod Logs

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/templates/NOTES.txt

Stream logs from the service worker component pods. Use the -f flag to follow the logs in real-time.

```bash
kubectl logs -l app.kubernetes.io/component=service-worker -f
```

--------------------------------

### Helm Values for AppArmor Integration

Source: https://github.com/clickhouse/code-interpreter/blob/main/apparmor/README.md

Configuration snippet for Helm values to enable and specify the AppArmor profile for the worker sandbox.

```yaml
workerSandbox:
  securityHardening:
    enabled: true
    appArmorProfile: "sandbox-nsjail"
```

--------------------------------

### Implement Circuit Breaker for Tool Call Server

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

Wraps `toolCallServerRequest` in a circuit breaker so that requests to the Tool Call Server are temporarily stopped when it fails repeatedly.

```typescript
const circuitBreaker = new CircuitBreaker(toolCallServerRequest, {
  timeout: 5000,
  errorThresholdPercentage: 50,
  resetTimeout: 30000
});
```
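
The option names above resemble those of the opossum library, but the source does not say which library is used. To show the state machine behind the pattern, here is a deliberately simplified, self-contained sketch. It trips after a fixed number of consecutive failures instead of an error percentage and has no per-call timeout. The class `SimpleCircuitBreaker` and all of its members are hypothetical.

```typescript
type BreakerState = 'closed' | 'open' | 'half-open';

class SimpleCircuitBreaker<T> {
  private state: BreakerState = 'closed';
  private failures = 0;
  private openedAt = 0;
  private readonly action: () => Promise<T>;
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;
  private readonly now: () => number;

  constructor(
    action: () => Promise<T>,
    failureThreshold: number,
    resetTimeoutMs: number,
    now: () => number = () => Date.now() // injectable clock for tests
  ) {
    this.action = action;
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;
  }

  getState(): BreakerState {
    return this.state;
  }

  async fire(): Promise<T> {
    if (this.state === 'open') {
      if (this.now() - this.openedAt < this.resetTimeoutMs) {
        // Fail fast without touching the downstream server.
        throw new Error('Circuit open: request rejected');
      }
      this.state = 'half-open'; // reset timeout elapsed: allow one trial request
    }
    try {
      const result = await this.action();
      this.failures = 0;
      this.state = 'closed';
      return result;
    } catch (error) {
      this.failures++;
      // A failed trial request, or too many consecutive failures, opens the circuit.
      if (this.state === 'half-open' || this.failures >= this.failureThreshold) {
        this.state = 'open';
        this.openedAt = this.now();
      }
      throw error;
    }
  }
}
```

While the circuit is open, callers get an immediate error and the struggling server receives no traffic. After the reset timeout a single trial request decides whether the circuit closes again.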

--------------------------------

### Test Service Restart During Execution

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

This test restarts the service while an execution is in progress and verifies that the service either resumes from the state stored in Redis or fails gracefully.

```bash
# Start execution
./test-programmatic.sh simple &

# Restart service mid-execution
docker-compose restart service

# Should: Resume from Redis state or fail gracefully
```

--------------------------------

### Teardown Code API Helm Release

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/README.md

Commands to uninstall the Helm release, stop Minikube, or delete Minikube entirely for a full reset.

```bash
helm uninstall codeapi
```

```bash
minikube stop
```

```bash
minikube delete
```

--------------------------------

### Fix Connection Refused on Port 3112

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/README.md

If connections to port 3112 are refused, make sure a `kubectl port-forward` to the codeapi-api service is running.

```bash
kubectl port-forward svc/codeapi-api 3112:3112
```

--------------------------------

### Scheduled Cleanup of Stale Executions

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

A background job is scheduled to run every 5 minutes to identify and clean up stale executions that have exceeded their TTL and are not completed.

```typescript
setInterval(() => {
  cleanupStaleExecutions(); // Finds executions older than TTL
}, 5 * 60 * 1000);
```
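
The body of `cleanupStaleExecutions` is not shown in the source. The selection step it describes (older than the TTL and not completed) could look roughly like the pure function below. The sketch assumes that `startTime` is an epoch timestamp in milliseconds and that completion is tracked by a `completed` flag. Both are assumptions, as is the name `findStaleExecutions`.

```typescript
interface ExecutionState {
  execution_id: string;
  startTime: number;   // assumed to be epoch milliseconds
  completed?: boolean; // hypothetical flag; the real state shape is not shown
}

const EXECUTION_STATE_TTL = 600; // seconds, matching the Redis key TTL

// Returns the IDs of executions that are older than the TTL and not completed.
function findStaleExecutions(
  states: ExecutionState[],
  nowMs: number,
  ttlSeconds: number = EXECUTION_STATE_TTL
): string[] {
  return states
    .filter((state) => !state.completed && nowMs - state.startTime > ttlSeconds * 1000)
    .map((state) => state.execution_id);
}
```

Keeping the selection free of I/O makes it easy to unit test. The scheduled job would then pass each returned ID to the cleanup routine.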

--------------------------------

### Log Tool Call Server Retries

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

Log warning messages for Tool Call Server retries, including the attempt number, context, and the error message.

```typescript
logger.warn('Tool Call Server retry', {
  attempt,
  context,
  error: lastError.message
});
```

--------------------------------

### Log Cleanup Statistics

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

Log information about cleanup operations, including the number of stale items cleaned and active items remaining.

```typescript
logger.info('Cleanup completed', {
  stale_cleaned: cleaned,
  active_remaining: activeCount
});
```

--------------------------------

### Execute Code API Endpoint

Source: https://github.com/clickhouse/code-interpreter/blob/main/api/README.md

The POST endpoint for executing code within a sandboxed environment. This is the primary method for running user-provided scripts.

```http
POST /api/v2/execute
```
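
As a hedged illustration, the request could be issued from TypeScript as sketched below. The payload fields (`language`, `version`, `files[].content`) follow the curl example elsewhere in this document. The base URL, the `Content-Type` header, and the helper names `buildExecuteRequest` and `executeCode` are assumptions. Authentication and the response schema are not covered by the source, so the result is left untyped.

```typescript
// Payload fields as they appear in the curl example in this document.
interface ExecuteRequest {
  language: string;
  version: string;
  files: { content: string }[];
}

// Builds the URL and fetch options without sending anything.
function buildExecuteRequest(baseUrl: string, body: ExecuteRequest) {
  return {
    url: `${baseUrl}/api/v2/execute`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' }, // assumed header
      body: JSON.stringify(body),
    },
  };
}

async function executeCode(baseUrl: string, body: ExecuteRequest): Promise<unknown> {
  const { url, init } = buildExecuteRequest(baseUrl, body);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Execute request failed with HTTP ${response.status}`);
  }
  return response.json(); // schema unknown here, so it is returned as-is
}
```

A call might look like `executeCode('http://localhost:3112', { language: 'python', version: '3.14.4', files: [{ content: 'print(42)' }] })`, assuming the API is reachable at that address.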

--------------------------------

### Test API Health Endpoint

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/README.md

Tests the health endpoint of the Code Interpreter API to verify it is running and accessible via localhost.

```bash
# Test
curl http://localhost:3112/v1/health
```

--------------------------------

### Verify Horizontal Scaling with Logs

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/README.md

Checks which service-worker processed a job by viewing recent logs. This helps verify that jobs are distributed across all workers, each with a unique ID.

```bash
kubectl logs deployment/codeapi-service-worker --tail=5

# Each pod has a unique ID - jobs are distributed across all workers
```

--------------------------------

### Graceful Degradation with Retry Queue

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

This TypeScript code shows how to implement graceful degradation by queuing executions for retry when the Tool Call Server is down, returning a 202 Accepted status.

```typescript
if (toolCallServerDown) {
  await queueForRetry(execution_id, payload);
  return res.status(202).json({ message: 'Queued for execution' });
}
```

--------------------------------

### Worker Health and Readiness Check

Source: https://github.com/clickhouse/code-interpreter/blob/main/README.md

Verify the health and readiness of the Code Interpreter worker. These endpoints are crucial for ensuring the worker can process code execution requests.

```http
GET /health
```

```http
GET /ready
```

--------------------------------

### Define Monitoring Constants

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

Constants for the polling interval and maximum poll time (both in milliseconds) and the execution state TTL (in seconds).

```typescript
const POLL_INTERVAL = 100;           // ms - How often to poll for state changes
const MAX_POLL_TIME = 300000;        // ms - Max time to wait for execution
const EXECUTION_STATE_TTL = 600;     // seconds - Redis key TTL
```
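
To make the relationship between these values concrete: at most 300000 / 100 = 3000 polls happen per execution, and the 600 s TTL leaves a 300 s margin beyond the longest wait. The sketch below shows one way a bounded polling loop could use the constants. The helper `pollUntilDone` and its parameters are hypothetical, not the service's actual implementation.

```typescript
const POLL_INTERVAL = 100;       // ms
const MAX_POLL_TIME = 300000;    // ms
const EXECUTION_STATE_TTL = 600; // seconds

// Upper bound on polls per execution: 300000 / 100 = 3000.
const maxPolls = Math.ceil(MAX_POLL_TIME / POLL_INTERVAL);

// Time by which the Redis key outlives the longest possible wait: 300000 ms.
const ttlMarginMs = EXECUTION_STATE_TTL * 1000 - MAX_POLL_TIME;

async function pollUntilDone<T>(
  readState: () => Promise<T | null>,
  isDone: (state: T) => boolean,
  // Injectable so that tests can avoid real waiting.
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<T> {
  for (let poll = 0; poll < maxPolls; poll++) {
    const state = await readState();
    if (state !== null && isDone(state)) return state;
    await sleep(POLL_INTERVAL);
  }
  throw new Error(`Execution did not finish within ${MAX_POLL_TIME} ms`);
}
```

Counting polls instead of comparing wall-clock timestamps keeps the loop deterministic, at the cost of ignoring the time each read takes.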

--------------------------------

### Count Active Executions

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

Retrieve the number of active executions by counting keys matching the 'exec_state:*' pattern in Redis.

```typescript
// Count active executions
const keys = await connection.keys('exec_state:*');
const activeCount = keys.length;
```
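
The Redis `KEYS` command walks the whole keyspace in a single blocking call, so on a large instance a cursor-based `SCAN` is usually preferred. A minimal sketch of the same count using `SCAN` follows. It assumes a client whose `scan` method has the ioredis-style signature shown in the `ScanClient` interface. That interface and the name `countKeys` are illustrative and not taken from the source.

```typescript
// Minimal slice of a Redis client; the signature mirrors ioredis (assumption).
interface ScanClient {
  scan(
    cursor: string,
    matchToken: 'MATCH',
    pattern: string,
    countToken: 'COUNT',
    count: number
  ): Promise<[string, string[]]>;
}

async function countKeys(
  client: ScanClient,
  pattern: string,
  batchSize: number = 100
): Promise<number> {
  const seen = new Set<string>(); // SCAN may return the same key more than once
  let cursor = '0';
  do {
    const [nextCursor, keys] = await client.scan(cursor, 'MATCH', pattern, 'COUNT', batchSize);
    keys.forEach((key) => seen.add(key));
    cursor = nextCursor;
  } while (cursor !== '0'); // a cursor of "0" marks the end of the iteration
  return seen.size;
}
```

Called as `countKeys(connection, 'exec_state:*')`, this yields the same number as the `keys` version when the keyspace is stable. It is a point-in-time estimate if keys are being created or expiring during the scan.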

--------------------------------

### Unified Execution Cleanup Function

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

A centralized function `cleanupExecution` consolidates the logic for cleaning up execution state in Redis and terminating the Tool Call Server session. It is used across various handlers including client disconnect, completion, error handlers, and stale execution cleanup.

```typescript
async function cleanupExecution(execution_id: string): Promise<void> {
  await Promise.all([
    deleteExecutionState(execution_id),
    axios.delete(`${env.TOOL_CALL_SERVER_URL}/sessions/${execution_id}`)
  ]);
}
```

--------------------------------

### Test Stale Execution Cleanup

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

This test focuses on the cleanup of stale execution state in Redis. It involves manually creating stale state and then relying on a cleanup job to remove it after a delay.

```bash
# Manually create stale execution state
redis-cli SET "exec_state:test_old" '{"execution_id":"test_old","startTime":0}'

# Wait 5+ minutes or trigger cleanup manually
# Should: Cleanup job removes stale state
```

--------------------------------

### Health Check Code API

Source: https://github.com/clickhouse/code-interpreter/blob/main/helm/codeapi/README.md

Performs a health check on the Code API. Expects an 'OK' response.

```bash
curl http://localhost:3112/v1/health
# Expected: OK
```

--------------------------------

### Client Disconnect Handling

Source: https://github.com/clickhouse/code-interpreter/blob/main/service/ROBUSTNESS.md

This handler is triggered when a client disconnects mid-execution, ensuring that the job is removed from the queue, execution state in Redis is cleaned up, and the Tool Call Server session is terminated.

```typescript
req.on('close', async () => {
  // 1. Remove job from queue (if not started)
  await job.remove();

  // 2. Cleanup execution state in Redis
  // 3. Cleanup Tool Call Server session
  await cleanupExecution(execution_id);
});
```

--------------------------------

### API Health Check

Source: https://github.com/clickhouse/code-interpreter/blob/main/README.md

Check the health status of the Code Interpreter API gateway. This endpoint is used to verify if the API is operational.

```http
GET /v1/health
```
