
Monitoring

vCluster Platform provides full-stack observability across your entire multi-tenant fleet. For a single cluster, install a complete Prometheus and Grafana stack in one click. For fleet-wide visibility, workload metrics from every connected cluster can flow into a central Prometheus instance, automatically labeled by project, space, and tenant cluster.

Operators can also monitor the Platform's own health using built-in Prometheus metrics. For centralized log collection across all tenant clusters, the Central HostPath Mapper removes the need for per-cluster agents.

Workload metrics

vCluster Platform provides two approaches for monitoring tenant cluster workloads. One prioritizes ease of setup for a single cluster. The other aggregates metrics from multiple clusters into a single Prometheus instance, where they can be filtered and labeled by project, space, and tenant cluster.

If you want to	Use
Monitor a single cluster with a quick, one-click setup	Basic cluster monitoring
Aggregate metrics across multiple tenant clusters	Fleet monitoring
Filter or label metrics by project, space, or tenant cluster	Fleet monitoring

Basic cluster monitoring

For a single connected cluster, install the kube-prometheus-stack app from the Platform UI. This deploys Prometheus and Grafana to the cluster in one step.

1. Go to Infra > Clusters and select a cluster.
2. Navigate to the Apps tab.
3. Select the recommended kube-prometheus-stack app, then click Install.

After installation, you have a complete monitoring setup with Prometheus scraping cluster metrics and Grafana for visualization. If your tenant clusters run in Shared Nodes mode, see Prometheus node metrics on shared nodes to configure kubelet scraping.

Warning: Do not use kube-prometheus-stack if you want to aggregate metrics across multiple tenant clusters. Use the OpenTelemetry fleet monitoring approach below instead.

Fleet monitoring (multi-cluster)

For aggregating workload metrics across multiple tenant clusters, use the OpenTelemetry Collector with Prometheus. This approach deploys shared OpenTelemetry DaemonSets on each connected cluster. They push metrics to a central Prometheus instance, labeled by project, tenant cluster, and space.

Aggregating metrics with OpenTelemetry: step-by-step setup with Prometheus, OpenTelemetry Collector, and Grafana
Fleet monitoring with OpenTelemetry: advanced configuration including remote_write for both Shared Nodes and Private Nodes tenancy models
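
Once metrics from several clusters land in one Prometheus instance, the project, space, and tenant cluster labels are what make per-tenant queries possible. The following Python sketch builds PromQL query strings that filter and group by such labels. The label names used here (project, space, tenant_cluster) are assumptions for illustration only; check the labels your collector configuration actually attaches before reusing them. The helper functions are hypothetical and not part of any vCluster tooling.

```python
def build_selector(metric, **labels):
    """Return a PromQL selector with exact-match label filters.

    Label names passed in are illustrative; the real names depend on
    how the OpenTelemetry Collector is configured.
    """
    def escape(value):
        # Escape characters that are special inside a PromQL string literal.
        return (
            value.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
        )

    matchers = ",".join(
        f'{name}="{escape(value)}"' for name, value in sorted(labels.items())
    )
    return f"{metric}{{{matchers}}}" if matchers else metric


def rate_sum_by(selector, *group_labels, window="5m"):
    """Wrap a counter selector in rate() and sum it per group label."""
    grouping = ", ".join(group_labels)
    return f"sum by ({grouping}) (rate({selector}[{window}]))"


# CPU usage for one (hypothetical) project, broken down per tenant cluster.
selector = build_selector("container_cpu_usage_seconds_total", project="team-a")
query = rate_sum_by(selector, "tenant_cluster")
print(query)
```

The resulting string can be pasted into the Prometheus expression browser or a Grafana panel. Keeping the label filter separate from the aggregation makes it easy to reuse one selector across several dashboards.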

Platform health metrics

vCluster Platform exposes metrics in Prometheus format from its internal components, including the API gateway, integrated Kubernetes API server, controller manager, and Go runtime. These metrics cover request counts, latency, and error rates for platform operations.

Use a Prometheus ServiceMonitor to scrape these metrics automatically, or access the /metrics endpoint directly. For platform pod log level and output format, see Platform Process Logging.
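
For a quick check without a full Prometheus setup, the /metrics output can be read directly, since it uses the Prometheus text exposition format. The Python sketch below parses that format and totals a counter across its label sets. The sample payload is made-up data, and the URL in the usage note is a placeholder; how the endpoint is reached and whether it requires authentication are assumptions to verify against your installation.

```python
import urllib.request

# Made-up sample in the Prometheus text exposition format.
SAMPLE = """\
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 42
apiserver_request_total{code="200",verb="GET"} 10
apiserver_request_total{code="500",verb="GET"} 2
"""


def parse_samples(text):
    """Parse exposition text into (name, raw_labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "{" in line:
            name, rest = line.split("{", 1)
            raw_labels, value_part = rest.rsplit("}", 1)
        else:
            name, value_part = line.split(None, 1)
            raw_labels = ""
        # The value is the first field; an optional timestamp may follow.
        value = float(value_part.split()[0])
        samples.append((name, raw_labels, value))
    return samples


def total(samples, metric_name):
    """Sum a metric across all of its label combinations."""
    return sum(value for name, _, value in samples if name == metric_name)


def fetch_metrics(url, timeout=5.0):
    """Fetch raw metrics text from a URL (endpoint and auth are assumptions)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


samples = parse_samples(SAMPLE)
print(total(samples, "apiserver_request_total"))
```

Against a live endpoint, replace SAMPLE with something like fetch_metrics("https://platform.example.com/metrics"), where the host name is hypothetical. For continuous collection, a ServiceMonitor remains the better fit; this kind of script is only useful for spot checks.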

Log collection

To collect logs from workloads running inside tenant clusters, use the Central HostPath Mapper. It installs a single DaemonSet on the Control Plane Cluster. The DaemonSet handles log path remapping for all tenant clusters and removes the need for per-cluster logging agents.

Central HostPath Mapper

For platform audit logging and platform process log configuration, see Logging.
