Monitoring
vCluster Platform provides full-stack observability across your entire multi-tenant fleet. For a single cluster, install a complete Prometheus and Grafana stack in one click. For fleet-wide visibility, workload metrics from every connected cluster can flow into a central Prometheus instance, automatically labeled by project, space, and tenant cluster.
Operators can also monitor the Platform's own health using built-in Prometheus metrics. For centralized log collection across all tenant clusters, the Central HostPath Mapper removes the need for per-cluster agents.
Workload metrics
vCluster Platform provides two approaches for monitoring tenant cluster workloads. One is a quick setup for a single cluster. The other aggregates metrics from multiple clusters into a single Prometheus instance, with filtering and labeling by project, space, and tenant cluster.
| If you want to | Use |
|---|---|
| Monitor a single cluster with a quick, one-click setup | Basic cluster monitoring |
| Aggregate metrics across multiple tenant clusters | Fleet monitoring |
| Filter or label metrics by project, space, or tenant cluster | Fleet monitoring |
Basic cluster monitoring
For a single connected cluster, install the kube-prometheus-stack app from the Platform UI. This deploys Prometheus and Grafana to the cluster in one step.
- Go to Infra > Clusters and select a cluster.
- Navigate to the Apps tab.
- Select the recommended kube-prometheus-stack app and click Install.
After installation, you have a complete monitoring setup with Prometheus scraping cluster metrics and Grafana for visualization.
Do not use kube-prometheus-stack if you want to aggregate metrics across multiple tenant clusters. Use the OpenTelemetry fleet monitoring approach below instead.
Fleet monitoring (multi-cluster)
To aggregate workload metrics across multiple tenant clusters, use the OpenTelemetry Collector with Prometheus. This approach deploys shared OpenTelemetry DaemonSets on each connected cluster, which push metrics to a central Prometheus instance. Each metric is labeled by project, space, and tenant cluster.
- Aggregating metrics with OpenTelemetry — step-by-step setup with Prometheus, OpenTelemetry Collector, and Grafana
- Fleet monitoring with OpenTelemetry — advanced configuration including remote_write for both Shared Nodes and Private Nodes tenancy models
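As a rough illustration of how the per-tenant labels can be used once metrics reach the central Prometheus instance, the Python sketch below builds a PromQL selector and the matching URL for the Prometheus HTTP API query endpoint (`/api/v1/query`). The label names `project` and `space`, the metric name, and the base URL in the usage note are assumptions for illustration only; check which labels your collectors actually attach.

```python
from urllib.parse import urlencode


def escape_label_value(value: str) -> str:
    """Escape backslashes, double quotes, and newlines for a PromQL string."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def build_selector(metric: str, labels: dict[str, str]) -> str:
    """Return an instant-vector selector such as metric{a="x",b="y"}."""
    # Sort label names so the generated selector is deterministic.
    matchers = ",".join(
        f'{name}="{escape_label_value(value)}"'
        for name, value in sorted(labels.items())
    )
    return f"{metric}{{{matchers}}}" if matchers else metric


def build_query_url(base_url: str, promql: str) -> str:
    """Return the instant-query URL for a Prometheus server at base_url."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"
```

For example, `build_selector("container_cpu_usage_seconds_total", {"project": "team-a", "space": "dev"})` yields a selector restricted to one hypothetical project and space, which `build_query_url` then turns into a request URL for the central instance.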
Platform health metrics
vCluster Platform exposes Prometheus-conformant metrics from its internal components, including the API gateway, integrated Kubernetes API server, controller manager, and Go runtime. These metrics cover request counts, latency, and error rates for all platform operations.
Use a Prometheus ServiceMonitor to scrape these metrics automatically, or access the /metrics endpoint directly. The same documentation page also covers configuring Platform log levels and JSON log encoding.
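If you read the /metrics endpoint directly, the response is in the Prometheus text exposition format. The following is a minimal sketch, assuming the endpoint is reachable at a URL you supply and needs no additional authentication (both are assumptions, as is the metric name in the usage note). It sums all samples of one metric across its label combinations.

```python
from urllib.request import urlopen


def sum_metric(exposition_text: str, metric_name: str) -> float:
    """Sum every sample of metric_name in Prometheus text-format output."""
    total = 0.0
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        if "{" in line:
            # Labeled sample: name{labels} value [timestamp]
            name = line[: line.index("{")]
            fields = line[line.rindex("}") + 1 :].split()
        else:
            # Unlabeled sample: name value [timestamp]
            name, *fields = line.split()
        if name == metric_name and fields:
            total += float(fields[0])
    return total


def fetch_metrics(url: str, timeout: float = 5.0) -> str:
    """Download the raw metrics text from a caller-supplied URL."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For instance, `sum_metric(fetch_metrics(url), "apiserver_request_total")` would return the total request count across all label combinations, provided the Platform exposes a metric with that name. For continuous monitoring, the ServiceMonitor route is likely the better fit; a script like this is mainly useful for spot checks.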
Log collection
To collect logs from workloads running inside tenant clusters, use the Central HostPath Mapper. It installs a single DaemonSet on the Control Plane Cluster. The DaemonSet handles log path remapping for all tenant clusters and removes the need for per-cluster logging agents.