vMetal – run a GPU cloud on bare metal without OpenStack

Author: teb510

by teb510 · Mar 19, 2026 · 12 points · 1 comment


AI Analysis

●● Solid · Solve My Problem · Niche Gem

Saves neoclouds months of engineering by turning bare metal racks into managed Kubernetes clusters.

Strengths

• Focus on GPU workloads avoids the general-purpose bloat of OpenStack or VMware.
• Integrates directly with AI schedulers like Run:ai, Ray, and Slurm out of the box.
• Automates networking and DNS alongside PXE booting for full lifecycle management.

Weaknesses

• The "Request Demo" gate prevents hands-on testing, limiting community adoption and feedback.
• Competes with mature, free open-source alternatives like MAAS and Tinkerbell.

Post Description

Hi everyone, we're looking for feedback on a new infrastructure project we launched called vMetal. It's a bare metal management platform for GPU clusters that handles machine discovery, PXE booting, and lifecycle management without the OpenStack complexity. It's built around Kubernetes-native workflows, so you can hand it off to teams or drop it into an existing platform. A lot of the infra platforms used for this today were designed 20 years ago (VMware, OpenStack, NVIDIA BCM, MAAS, etc.), while newer tools usually solve only a small piece of the stack. Neither group was built with modern GPU cluster ops in mind. In practice, most setups end up stitching things together or building custom provisioning pipelines.

With vMetal we took a different approach: treat physical machines like programmable infrastructure resources. Compared to tools like MAAS or Tinkerbell, vMetal is designed around a few ideas:

- Bare metal lifecycle automation: Automatically discover machines on the network, boot them, install OS images, and reprovision nodes as hardware moves between clusters or workloads. Built on Metal3 and Ironic.
- Built for GPU cluster ops: Supports environments where nodes frequently move between clusters, capacity pools, or tenant workloads.
- Direct Kubernetes integration: Provisioned machines can be attached directly to Kubernetes clusters as nodes or assigned to infrastructure pools.
- Works with Kubernetes multi-tenancy layers: Integrates with vCluster (virtual clusters) and vNode (node-level isolation) so machines can move from bare metal provisioning into multi-tenant Kubernetes environments.

We’ve shared a few other infrastructure projects here before (DevPod, vCluster), and the feedback from HN has been incredibly helpful. Curious how others here are handling bare metal provisioning today: MAAS, Ironic, Metal3, Tinkerbell, something custom?

Open to any feedback, positive or negative.

Similar Projects

Infrastructure · ●●● Banger

Cozystack v1.0 – an open-source cloud platform for bare metal

Package-based platform architecture using OCI artifacts — OpenStack for the Kubernetes era with CNCF backing.

Big Brain · Bold Bet

tym83

112mo ago

Hardware · ●● Solid

How Scaleway brought the first RISC-V servers to the cloud

First RISC-V cloud servers; impressive infra feat, but it's a blog post.

Big Brain · Bold Bet

enthusaist

501mo ago

Infrastructure · ●● Solid

Kheeper, a registry designed for bootable images

Purpose-built registry for bootc that handles bare-metal provisioning where legacy OCI fails.

Ship It · Niche Gem

areed

201mo ago

Infrastructure · ●● Solid

A100s may be $3.20/HR on AWS, vs. $2.40/HR on Vast.ai

Wraps a lot of nasty multi-cloud choreography into a single CLI: parallel provisioning across providers, staging/compressing datasets, and plumbing nodes from different clouds into one Kubernetes cluster with generated Helm templates and Karpenter hooks. The Hugging Face Spaces one-command deploy and built-in telemetry/ML integrations are smart touches, but the page leans heavy on integration laundry-listing — I want concrete guarantees around networking/egress, cost arbitration logic, and auth/billing boundaries before trusting it for production budgets.

Solve My Problem · Niche Gem

Facingsouth

103mo ago

Developer Tools · ●●● Banger

IOcomposer – AI-assisted IDE – nRF54 bare-metal (no Devicetree/Kconfig)

Bare-metal BLE firmware with vendor SDK indexing—no Device Trees, one config per MCU.

Wizardry · Solve My Problem

yokostuno

103mo ago

Developer Tools · ●● Solid

Provision Stateless GPU Compute with Claude Code's Remote Control

Claude talks to RunPod/Lambda/Vast, but needs working provider integrations to matter.

Big Brain · Niche Gem

Facingsouth

203mo ago