NVIDIA NIM Health API
Health endpoints exposed by self-hosted NIM containers (/v1/health/live for liveness, /v1/health/ready for readiness) that serve as targets for Kubernetes liveness, readiness, and startup probes, plus a Prometheus /v1/metrics scrape endpoint for GPU utilization, request latency, and queue depth. The probes drive Kubernetes pod lifecycle and the metrics drive HPA scaling via the NIM Operator.
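
As a rough illustration of how a client might use the endpoints above outside Kubernetes, the following is a minimal sketch using only the Python standard library. The three paths come from the description above. Everything else is an assumption made for illustration: the helper names (probe, parse_metrics, snapshot), the base URL http://localhost:8000, the rule that any 2xx response counts as healthy, and the simplified parser for the Prometheus text format. It does not assume any particular NIM metric names.

```python
import urllib.request

# Paths taken from the description above.
LIVE_PATH = "/v1/health/live"
READY_PATH = "/v1/health/ready"
METRICS_PATH = "/v1/metrics"


def probe(base_url: str, path: str, timeout: float = 2.0) -> bool:
    """Return True if GET base_url + path answers with a 2xx status.

    Assumption: a 2xx response means healthy; anything else is a failure.
    """
    url = base_url.rstrip("/") + path
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # URLError, HTTPError (4xx/5xx), timeouts, and refused
        # connections are all subclasses of OSError.
        return False


def parse_metrics(text: str) -> dict[str, float]:
    """Parse Prometheus text exposition into {"name{labels}": value}.

    Simplified sketch: skips comment lines and ignores optional timestamps.
    """
    samples: dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or # HELP / # TYPE metadata
        if "}" in line:
            # Labels may contain spaces, so split after the closing brace.
            head, _, tail = line.rpartition("}")
            name, fields = head + "}", tail.split()
        else:
            name, *fields = line.split()
        if fields:
            samples[name] = float(fields[0])  # float() accepts NaN and +Inf
    return samples


def snapshot(base_url: str, timeout: float = 2.0) -> dict:
    """Collect liveness, readiness, and (if ready) metrics from one container."""
    result = {
        "live": probe(base_url, LIVE_PATH, timeout),
        "ready": probe(base_url, READY_PATH, timeout),
        "metrics": {},
    }
    if result["ready"]:
        url = base_url.rstrip("/") + METRICS_PATH
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            result["metrics"] = parse_metrics(resp.read().decode("utf-8"))
    return result
```

Calling snapshot("http://localhost:8000") against a running container would return the two health flags and whatever samples the metrics endpoint reports. The host and port here are assumptions and depend on how the container is published. In a cluster, the kubelet and Prometheus would normally perform these requests themselves, so a script like this is mainly useful for local smoke tests.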