Infrastructure for AI: Managing GPU Workloads with Kubernetes
Jan 20, 2026 | Online
Our Speakers

Saiyam Pathak
Principal Developer Advocate at Loft Labs
The AI revolution has made GPUs the most valuable resource in modern infrastructure — and the hardest to manage at scale.
Kubernetes is becoming the platform of choice for AI workloads, but production-grade GenAI comes with real challenges: GPU scarcity, poor utilization, multi-team access, and fragmented deployments across cloud and bare metal.
Saiyam Pathak, Head of Developer Relations at vCluster, is speaking at O'Reilly's Infrastructure & Ops Superstream: Infrastructure for AI on January 20, 2026.
In this session, he'll cover:
- GPU sharing strategies (time-slicing, MIG)
- Advanced scheduling with KAI Scheduler
- Secure multi-tenancy using vCluster
- How vLLM fits into real architectures for scalable inference
You'll walk away with a practical blueprint for delivering high-performance, cost-efficient, multi-tenant AI infrastructure across Kubernetes environments.
