Use custom schedulers with Hybrid scheduling
This guide shows how to:
- Run a custom scheduler in the host cluster.
- Run a custom scheduler in the virtual cluster.
- Use vCluster Hybrid scheduling, so pods from the virtual cluster can be scheduled by schedulers from both the host and the virtual cluster.
How pods are scheduled with Hybrid scheduling enabled:
- Pods without spec.schedulerName use the default scheduler inside the virtual cluster.
- Pods whose spec.schedulerName matches a scheduler from sync.toHost.pods.hybridScheduling.hostSchedulers in your vCluster config are scheduled by that host scheduler.
- Pods whose spec.schedulerName matches a scheduler running inside the virtual cluster are scheduled by that virtual scheduler.
See Hybrid scheduling docs for more details.
The Hybrid scheduling feature used in this guide is an Enterprise feature available in vCluster v0.26.0 and newer. It requires shared host nodes, and host nodes must be synced into the virtual cluster. The vCluster config in this guide enables node syncing.
Prerequisites
Before you begin, ensure you have:
- kind,
- a container runtime that kind can use (e.g. Docker),
- kubectl,
- the vCluster CLI.
Create host and virtual cluster
Create kind host cluster:
kind create cluster --name scheduler-demo
Create a vCluster config file with Hybrid scheduling enabled and node syncing from the host:
cat > my-vcluster.yaml <<EOF
sync:
  toHost:
    pods:
      hybridScheduling:
        enabled: true
        hostSchedulers:
          - my-host-scheduler
  fromHost:
    nodes:
      enabled: true
EOF
Create the virtual cluster:
vcluster create -n my-vcluster my-vcluster --values my-vcluster.yaml
Set environment variables for the kube contexts to reduce copy/paste errors:
# Host context provided by kind
export HOST_CONTEXT=kind-scheduler-demo
# vCluster context usually follows this pattern; confirm via `kubectl config get-contexts`
export VCLUSTER_CONTEXT=vcluster_my-vcluster_my-vcluster_kind-scheduler-demo
Deploy custom host scheduler
Create custom-host-scheduler.yaml:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-scheduler
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: my-scheduler-as-kube-scheduler
subjects:
  - kind: ServiceAccount
    name: my-scheduler
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: system:kube-scheduler
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: my-scheduler-as-volume-scheduler
subjects:
  - kind: ServiceAccount
    name: my-scheduler
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: system:volume-scheduler
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: my-scheduler-extension-apiserver-authentication-reader
  namespace: kube-system
roleRef:
  kind: Role
  name: extension-apiserver-authentication-reader
  apiGroup: rbac.authorization.k8s.io
subjects:
  - kind: ServiceAccount
    name: my-scheduler
    namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-scheduler-config
  namespace: kube-system
data:
  my-scheduler-config.yaml: |
    apiVersion: kubescheduler.config.k8s.io/v1
    kind: KubeSchedulerConfiguration
    profiles:
      - schedulerName: my-host-scheduler
    leaderElection:
      leaderElect: false
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    component: my-kube-scheduler
    tier: control-plane
  name: my-kube-scheduler
  namespace: kube-system
spec:
  selector:
    matchLabels:
      component: my-kube-scheduler
      tier: control-plane
  replicas: 1
  template:
    metadata:
      labels:
        component: my-kube-scheduler
        tier: control-plane
    spec:
      serviceAccountName: my-scheduler
      containers:
        - command:
            - kube-scheduler
            - --config=/etc/kubernetes/my-scheduler/my-scheduler-config.yaml
          image: registry.k8s.io/kube-scheduler:v1.33.4
          imagePullPolicy: IfNotPresent
          name: kube-scheduler
          volumeMounts:
            - name: config-volume
              mountPath: /etc/kubernetes/my-scheduler
          securityContext:
            privileged: false
      volumes:
        - name: config-volume
          configMap:
            name: my-scheduler-config
Use a kube-scheduler image tag compatible with your cluster's Kubernetes version (the minor version should match). The example above uses tag v1.33.4 (the latest GA version at the time of writing); adjust it if your control plane is on a different minor version.
leaderElection is disabled for simplicity, since the Deployment runs a single replica. For a highly available setup, enable leader election and run multiple replicas.
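For a highly available variant, the scheduler configuration might look like the following sketch. The resourceName and resourceNamespace values are illustrative assumptions; the key point is that each custom scheduler needs its own lease so its replicas do not contend with the default scheduler's leader election.

```yaml
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
  - schedulerName: my-host-scheduler
leaderElection:
  leaderElect: true
  # Use a dedicated lease per scheduler deployment (illustrative values):
  resourceName: my-host-scheduler
  resourceNamespace: kube-system
```

With this configuration you would also set replicas to 2 or more in the Deployment; only the lease holder schedules pods, and a standby replica takes over if it fails.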
Apply manifests in the host cluster:
# deploy custom host scheduler
kubectl --context="${HOST_CONTEXT}" apply -f custom-host-scheduler.yaml
# check the status of the deployment
kubectl --context="${HOST_CONTEXT}" -n kube-system rollout status deploy/my-kube-scheduler
Deploy custom virtual scheduler
Create custom-virtual-scheduler.yaml (identical to the host manifest, except the profile uses schedulerName: my-virtual-scheduler):
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-scheduler
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: my-scheduler-as-kube-scheduler
subjects:
  - kind: ServiceAccount
    name: my-scheduler
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: system:kube-scheduler
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: my-scheduler-as-volume-scheduler
subjects:
  - kind: ServiceAccount
    name: my-scheduler
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: system:volume-scheduler
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: my-scheduler-extension-apiserver-authentication-reader
  namespace: kube-system
roleRef:
  kind: Role
  name: extension-apiserver-authentication-reader
  apiGroup: rbac.authorization.k8s.io
subjects:
  - kind: ServiceAccount
    name: my-scheduler
    namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-scheduler-config
  namespace: kube-system
data:
  my-scheduler-config.yaml: |
    apiVersion: kubescheduler.config.k8s.io/v1
    kind: KubeSchedulerConfiguration
    profiles:
      - schedulerName: my-virtual-scheduler
    leaderElection:
      leaderElect: false
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    component: my-kube-scheduler
    tier: control-plane
  name: my-kube-scheduler
  namespace: kube-system
spec:
  selector:
    matchLabels:
      component: my-kube-scheduler
      tier: control-plane
  replicas: 1
  template:
    metadata:
      labels:
        component: my-kube-scheduler
        tier: control-plane
    spec:
      serviceAccountName: my-scheduler
      containers:
        - command:
            - kube-scheduler
            - --config=/etc/kubernetes/my-scheduler/my-scheduler-config.yaml
          image: registry.k8s.io/kube-scheduler:v1.33.4
          imagePullPolicy: IfNotPresent
          name: kube-scheduler
          volumeMounts:
            - name: config-volume
              mountPath: /etc/kubernetes/my-scheduler
          securityContext:
            privileged: false
      volumes:
        - name: config-volume
          configMap:
            name: my-scheduler-config
Apply manifests in the virtual cluster:
# deploy custom virtual scheduler
kubectl --context="${VCLUSTER_CONTEXT}" apply -f custom-virtual-scheduler.yaml
# check the status of the deployment
kubectl --context="${VCLUSTER_CONTEXT}" -n kube-system rollout status deploy/my-kube-scheduler
Deploy pods that use different schedulers
Now create pods.yaml with pods that use different schedulers:
- one pod uses the default scheduler,
- one pod uses the custom host scheduler, and
- one pod uses the custom virtual scheduler.
apiVersion: v1
kind: Pod
metadata:
  name: pod-uses-default-scheduler
spec:
  containers:
    - name: pause
      image: registry.k8s.io/pause:3.9
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-uses-virtual-scheduler
spec:
  schedulerName: my-virtual-scheduler
  containers:
    - name: pause
      image: registry.k8s.io/pause:3.9
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-uses-host-scheduler
spec:
  schedulerName: my-host-scheduler
  containers:
    - name: pause
      image: registry.k8s.io/pause:3.9
Apply the manifest in the virtual cluster:
# deploy pods
kubectl --context="${VCLUSTER_CONTEXT}" apply -f pods.yaml
# wait until all created pods are running
kubectl --context="${VCLUSTER_CONTEXT}" get pods -w
Verify which scheduler handled each pod
First, check which scheduler each pod requests:
kubectl --context="${VCLUSTER_CONTEXT}" get pods -n default -o custom-columns=NAME:.metadata.name,SCHEDULER:.spec.schedulerName
NAME SCHEDULER
pod-uses-default-scheduler default-scheduler
pod-uses-host-scheduler my-host-scheduler
pod-uses-virtual-scheduler my-virtual-scheduler
Finally, confirm from events which scheduler actually scheduled each pod:
kubectl --context="${VCLUSTER_CONTEXT}" get events -n default --field-selector reason=Scheduled -o custom-columns=POD:.involvedObject.name,SCHEDULER:.reportingComponent
POD SCHEDULER
pod-uses-default-scheduler default-scheduler
pod-uses-host-scheduler my-host-scheduler
pod-uses-virtual-scheduler my-virtual-scheduler
Troubleshoot
- No nodes in the virtual cluster:
  - Ensure sync.fromHost.nodes.enabled is true and that your host cluster has Ready nodes.
- Pod stuck in Pending:
  - Check the scheduler's logs:
    - Host: kubectl --context="${HOST_CONTEXT}" -n kube-system logs deploy/my-kube-scheduler
    - Virtual: kubectl --context="${VCLUSTER_CONTEXT}" -n kube-system logs deploy/my-kube-scheduler
  - Ensure the pod's spec.schedulerName (if set) matches either a virtual scheduler name or a name listed under hostSchedulers.
- Scheduler name not set when checking events:
  - Get events with kubectl --context="${VCLUSTER_CONTEXT}" get events -n default -o yaml
  - Inspect .reportingComponent (for the core v1 API) or .reportingController (for the events.k8s.io/v1 API); the scheduler name (from KubeSchedulerConfiguration) should be set in these fields.
  - Inspect .reportingInstance, where the scheduler pod name should be set.
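A common cause of a pod stuck in Pending is a spec.schedulerName that matches neither list. Every host scheduler that virtual-cluster pods may reference must appear under hostSchedulers in the vCluster config; a sketch, assuming a hypothetical second scheduler named another-host-scheduler is already running in the host cluster:

```yaml
sync:
  toHost:
    pods:
      hybridScheduling:
        enabled: true
        hostSchedulers:
          # Each host scheduler that pods may reference via
          # spec.schedulerName must be listed here.
          - my-host-scheduler
          - another-host-scheduler  # hypothetical second scheduler
```

After updating the config, re-apply it to the virtual cluster (for example via vcluster create --upgrade with the new values file) so the new scheduler name is recognized.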
Cleanup
# Delete the virtual cluster
vcluster delete -n my-vcluster my-vcluster
# Delete the kind cluster
kind delete cluster --name scheduler-demo