Tech Blog by vCluster

AI Cloud Providers: The Race to the Bottom Has Already Started. McKinsey Just Told You How to Get Out.

Apr 29, 2026 | 5 min read

In 2010, there were dozens of companies selling virtual machines by the hour. They competed on price, on hardware specs, and on uptime SLAs. Most of them are gone. The ones that survived (AWS, Google, Azure) didn't survive by selling better VMs. They survived by building platforms that customers couldn't leave.

AI cloud providers are in that same moment right now. And most of them don't know it yet.

The Squeeze Is Already Happening

GPU specs are converging. H100 availability, once the defining competitive advantage of an AI cloud provider, is now table stakes. The hyperscalers are catching up on capacity. New entrants keep arriving. And the only lever most providers know how to pull is price.

That's a race you can't win, and McKinsey's recent analysis of the AI cloud market makes the math explicit. Gross margins on bare-metal-as-a-service (BMaaS) are typically 55 to 65 percent before depreciation. After labor, power costs, and depreciation, margins fall to 14 to 16 percent of revenue, lower than many non-tech retail businesses. If utilization slips below 80 percent, or if rental prices erode even slightly, returns flatline. With debt financing in the picture, the cushion disappears entirely.
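That sensitivity to utilization is easy to sketch with a back-of-envelope model. All figures below (fleet size, hourly rate, opex, depreciation) are illustrative assumptions for the sake of the calculation, not numbers from the McKinsey report:

```python
# Back-of-envelope BMaaS margin model. Every constant here is an
# illustrative assumption, not a figure from the McKinsey report.
def gross_margin(utilization, price_per_gpu_hr=2.50, gpus=1000,
                 opex_per_gpu_hr=0.90, annual_depreciation=9_000_000):
    """Annual margin after variable opex (power, labor) and depreciation."""
    hours = 24 * 365
    revenue = gpus * hours * utilization * price_per_gpu_hr
    opex = gpus * hours * utilization * opex_per_gpu_hr
    profit = revenue - opex - annual_depreciation
    return profit / revenue

for u in (0.90, 0.80, 0.70):
    print(f"utilization {u:.0%}: margin {gross_margin(u):.1%}")
```

Because depreciation is fixed while revenue scales with utilization, a ten-point drop in utilization takes a disproportionate bite out of margin, which is exactly the flatline dynamic the report describes.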

And then there's the chip cycle. McKinsey estimates that over a typical five-year depreciation horizon, the price of a GPU hour could decline by half or more. Based on current market signals, we think that timeline is optimistic. Our own research suggests the compression is playing out closer to two years, driven by the pace of new chip generations and the speed at which older hardware loses pricing power. Providers aren't just competing on price today. They're racing to recover capital on a much shorter clock than most financial models assume.
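The difference between those two timelines is starker than it sounds when you convert it to an annual rate. A quick compound-decline calculation (assuming a smooth exponential decay, which is a simplification) makes it concrete:

```python
# Implied annual price decline for a GPU hour that loses half its
# value over a given horizon, assuming smooth exponential decay.
def annual_decline(halving_years):
    return 1 - 0.5 ** (1 / halving_years)

print(f"halving over 5 years: {annual_decline(5):.0%} per year")
print(f"halving over 2 years: {annual_decline(2):.0%} per year")
```

A five-year halving implies roughly a 13 percent annual price decline; a two-year halving implies roughly 29 percent per year, which is the shorter capital-recovery clock described above.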

The report is direct: "The BMaaS model is inherently commoditized. It has limited differentiation, high spending intensity, and price-driven competition." AI cloud providers were born out of GPU scarcity. That scarcity is fading, and the business models built on it are fading with it.

What McKinsey Says Has to Happen Next

McKinsey identifies three paths forward for AI cloud providers that want to endure: carve out defensible positions in niche markets like sovereign compute and regulated verticals, build trusted relationships with AI start-ups early and grow with them as those companies scale into massive platforms, or consolidate. The report is clear that few will resolve the fundamental tension at scale, and those that do will be the ones that "turn early scarcity into long-term differentiation."

The investment thesis behind most AI cloud funding assumes exactly this transition: away from selling raw compute, toward AI-native software stacks, managed services, and training orchestration layers that create real customer stickiness.

The logic is right. But the execution gap is enormous.

What "Moving Up the Stack" Actually Means

"Moving up the stack" is the kind of strategic advice that sounds obvious until you have to act on it. What does it actually mean for an AI cloud operator?

It means your customer doesn't SSH into a bare metal node and install their own Kubernetes. It means they log into a portal, spin up a managed cluster, and get to work. It means they have self-service environments, RBAC, observability, and the tooling they already know from AWS and GCP, running on your hardware, under your brand.

It means managed Kubernetes.

This isn't a niche product request. It's the baseline expectation of any enterprise AI team that has evaluated your platform. They've used EKS. They've used GKE. They don't lower their expectations when they move to an AI-native cloud. They raise them, because they assume you've done for AI workloads what AWS did for general compute.

McKinsey explicitly calls out "managed machine learning services" and "developer tools" as the layers AI cloud providers must build. Managed Kubernetes is the foundation all of those layers sit on. You can't offer a managed training orchestration platform to a customer who is still self-managing their cluster infrastructure.

If you can't deliver that experience, enterprise customers go back to a hyperscaler. Not because the hyperscaler has better GPUs. Because the experience is what they know.

The Wholesale Trap

There's a specific version of this problem worth naming directly. McKinsey's data shows that for some AI cloud providers, more than half of their revenue comes from just one or two customers, typically hyperscalers buying wholesale capacity.

Selling capacity wholesale solves a utilization problem in the near term. But it comes with a structural cost that doesn't show up on this quarter's P&L: you hand over the customer relationship entirely.

When you sell wholesale, you get the revenue and none of the customer. You don't know what they're building. You can't offer them managed services. You can't upsell. You can't retain. And when your counterparty finds a cheaper source, you're out, with no direct customer base to fall back on.

McKinsey's path forward is clear on this point. The AI cloud providers building durable businesses are the ones establishing trusted relationships with end customers directly, specifically AI start-ups, early in those companies' lives and growing with them as they scale. That is not achieved by selling compute to a reseller. It's achieved by owning the customer experience.

The Build Trap

At this point, most AI cloud operators nod along and say: "We know. We're building it."

Building a managed Kubernetes platform from scratch requires 6 to 10 platform engineers, takes 6 to 12 months in practice (often longer), and costs over $1M in fully loaded engineering spend. And that's before Day 2 operations: updates, observability, compliance, tenant isolation, and lifecycle management across a fleet.

Meanwhile, a $10M GPU cluster generating $2 to $3 per GPU per hour is a clock that runs whether or not your platform is ready. Every quarter you spend building is millions in potential platform revenue that went to a competitor who moved faster, or customers who went back to AWS because the experience wasn't there.
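The opportunity cost of that build time is easy to estimate. The fleet size and utilization below are hypothetical assumptions (the $10M spend and $2 to $3 rate come from the paragraph above):

```python
# Rough revenue at stake per quarter of platform delay.
# Fleet size and utilization are hypothetical assumptions.
GPUS = 320               # what a ~$10M spend might buy, assumed
RATE = 2.50              # $/GPU-hour, midpoint of the $2-3 range
UTILIZATION = 0.80
HOURS_PER_QUARTER = 24 * 91

quarterly_revenue = GPUS * HOURS_PER_QUARTER * UTILIZATION * RATE
print(f"revenue at stake per quarter of delay: ${quarterly_revenue:,.0f}")
```

Even under these conservative assumptions, each quarter of delay puts seven figures of compute revenue on the table, before counting the platform revenue layered on top of it.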

CoreWeave spent years building their Kubernetes platform. AWS built EKS over a decade. The AI cloud providers entering the market today don't have that runway.

What the Fast Path Looks Like

Boost Run launched a production-grade managed Kubernetes service in under 45 days. Zero new platform engineering hires. Full tenant isolation, self-service cluster provisioning, and an EKS-like experience, on their own GPU infrastructure, under their own brand.

Lintasarta, now running Indonesia's leading AI cloud, went from decision to 170+ tenant clusters in production in 90 days.

These aren't stories about companies with massive engineering teams who got lucky. They're stories about operators who recognized that the platform layer is a solved problem, and stopped trying to solve it from scratch.

The Window Is Open, But Not Forever

McKinsey's framing is precise: AI cloud providers "were born out of GPU scarcity." That scarcity created the market, and its disappearance is what now threatens to end it. The providers that endure will be those that used the scarcity window to build something customers actually depend on, not just to rent hardware at a premium.

The AI cloud providers that win the next phase of this market aren't necessarily the ones with the most GPUs or the lowest prices. They're the ones who figure out, fast, that they're in the cloud business. Not the hardware rental business.

McKinsey mapped the exit. The question for every AI cloud provider right now isn't whether to build a managed platform. It's whether you can ship it fast enough to matter.

vCluster helps AI Cloud Providers launch a production-grade managed Kubernetes service in days, not quarters, without building from scratch. See how it works

Ready to take vCluster for a spin?

Deploy your first virtual cluster today.