Bare Metal GPU Cloud (Velox)

Raw GPU and NPU performance. No hypervisor. No compromises.

Dedicated bare-metal GPU and NPU servers provisioned at cloud speed. No virtualization overhead, no shared tenancy — just the complete throughput of physical hardware, available on demand through a console or API. The definitive infrastructure choice when performance consistency is non-negotiable.

Bare Metal · No Hypervisor · Dedicated Single-Tenant · GPU + NPU · Cloud-like Provisioning

Why Velox

Most GPU clouds give you virtualized hardware wrapped in a hypervisor. Velox gives you the hardware.

Virtualization made sense when compute was general-purpose. For AI workloads, GPU memory bandwidth, NVLink topology, and PCIe throughput directly determine training and inference performance, so the hypervisor is overhead you pay for in every benchmark, every training run, and every inference latency measurement. Removing the hypervisor layer reclaims up to 20% of GPU performance. For large-scale training and high-throughput inference, that advantage is meaningful and it compounds.

Velox delivers dedicated, single-tenant GPU and NPU servers, provisioned with the speed and simplicity of a public cloud and without the performance penalty of virtualization. No noisy neighbors. No shared resources. No invisible overhead between your workload and the hardware running it.

18 PFLOPS

TF32 Tensor Core compute per node (NVIDIA HGX B200 × 8), with zero hypervisor overhead

Up to 20%

Performance reclaimed by removing the hypervisor layer vs. virtualized GPU cloud

No shared tenancy: every server is dedicated to a single customer
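To make the "up to 20%" figure concrete, here is a minimal sketch that converts a hypervisor overhead fraction into wall-clock time. It assumes one particular reading of the claim: a virtualized server delivers (1 - overhead) of bare-metal throughput. The function names and the 100-hour run are illustrative, not measurements.

```python
def bare_metal_hours(virtualized_hours: float, overhead: float) -> float:
    """Estimate bare-metal wall-clock time for a job timed on virtualized hardware.

    Assumption: virtualized throughput = (1 - overhead) * bare-metal throughput,
    so the same amount of work finishes in (1 - overhead) of the time.
    """
    if not 0.0 <= overhead < 1.0:
        raise ValueError("overhead must be in [0, 1)")
    return virtualized_hours * (1.0 - overhead)


def speedup(overhead: float) -> float:
    """Throughput ratio of bare metal to virtualized, under the same assumption."""
    if not 0.0 <= overhead < 1.0:
        raise ValueError("overhead must be in [0, 1)")
    return 1.0 / (1.0 - overhead)


if __name__ == "__main__":
    # Hypothetical 100-hour virtualized training run at the 20% upper bound.
    print(f"{bare_metal_hours(100.0, 0.20):.1f} hours on bare metal")
    print(f"{speedup(0.20):.2f}x throughput")
```

Under this reading, the 20% upper bound turns a 100-hour run into roughly 80 hours. Actual savings depend on the workload and on how much overhead the virtualized baseline really carries.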

What You Get

Bare-metal performance, cloud-native provisioning experience.

Velox is physical infrastructure as a service. What you provision is what you get — hardware, network, and OS, delivered with the operational experience of a public cloud.

Configuration

Hardware (GPU server or GPU cluster), network, and OS. Provisioned as dedicated single-tenant bare-metal servers with the operating system of your choice pre-installed.

Provisioning

On-demand allocation via console or API. Servers are available in minutes, not weeks. Capacity scales up and down without long-term capital commitment.
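As a rough illustration of API-driven provisioning, the sketch below builds (but does not send) an HTTP request for one server. The endpoint URL, the field names, and the `gpu-bare-metal` value are all hypothetical placeholders, since this page does not document the actual API; only the Python standard-library calls are real.

```python
import json
import urllib.request

# Hypothetical endpoint: a placeholder, not a documented Velox URL.
API_URL = "https://api.example.com/v1/servers"


def build_provision_request(token: str, gpu_count: int, os_image: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking for one bare-metal server."""
    if gpu_count <= 0:
        raise ValueError("gpu_count must be positive")
    payload = {
        "server_type": "gpu-bare-metal",  # hypothetical field and value
        "gpu_count": gpu_count,           # hypothetical field
        "os_image": os_image,             # hypothetical field
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_provision_request("YOUR_TOKEN", gpu_count=8, os_image="ubuntu-22.04")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # With real values filled in, urllib.request.urlopen(req) would send it.
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any capacity is allocated.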

Connectivity

High-speed networking between nodes for distributed training and inference workloads — with no shared network congestion from other tenants.

Not included

Velox is raw infrastructure. Signum, Aegis, Metis, and Maxis software platforms are not included in the base configuration. Teams that need the full AI platform on top of bare metal should consider Telox.

Core Capabilities

The hardware. Nothing in the way.

Dedicated Single-Tenant Servers

Your hardware, exclusively

Every server allocated through Velox is yours alone — no hypervisor, no shared resources, no other workloads competing for GPU memory, PCIe bandwidth, or NVLink throughput.

GPU + NPU Infrastructure

Heterogeneous accelerator support

Access to GPU servers (NVIDIA HGX configurations) and NPU infrastructure — enabling workloads to run on the right accelerator architecture for the task at hand.

Kubernetes on Bare Metal

Container orchestration without virtualization overhead

Run Kubernetes directly on bare-metal hardware for container workloads that require GPU access without the performance degradation of nested virtualization.
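For a sense of what a GPU container workload looks like on such a cluster, this sketch generates a Pod manifest that requests whole GPUs. It assumes the NVIDIA device plugin is installed on the nodes, which is what advertises the `nvidia.com/gpu` resource; the Pod name and container image are placeholders.

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int) -> dict:
    """Return a Pod manifest that requests `gpus` whole GPUs for one container."""
    if gpus <= 0:
        raise ValueError("gpus must be positive")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Extended resource exposed by the NVIDIA device plugin (assumed installed).
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Placeholder name and image; substitute your own training container.
    manifest = gpu_pod_manifest("trainer", "registry.example.com/trainer:latest", 8)
    print(json.dumps(manifest, indent=2))
```

Because `kubectl apply -f` accepts JSON as well as YAML, the printed output can be saved to a file and applied directly.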

Cloud-like Provisioning

Bare metal at cloud speed

Provision and deprovision servers through a unified console or API. The operational experience of a public cloud, with the performance characteristics of dedicated physical hardware.

High-Performance Networking

Built for distributed AI

Low-latency, high-bandwidth inter-node networking for distributed training runs and large-scale inference clusters — without the networking overhead of multi-tenant environments.
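One way to see why inter-node bandwidth matters for distributed training is the standard bandwidth-only cost model for a ring all-reduce, in which each node transfers 2(N - 1)/N of the gradient buffer per step. The sketch below applies that model; it ignores latency and overlap with compute, and the node count, buffer size, and link speed are illustrative inputs rather than Velox specifications.

```python
def ring_allreduce_seconds(num_nodes: int, message_bytes: float, link_bytes_per_s: float) -> float:
    """Bandwidth-only estimate of one ring all-reduce.

    Each node sends and receives 2 * (N - 1) / N of the message.
    Per-hop latency and compute/communication overlap are ignored.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    if message_bytes < 0 or link_bytes_per_s <= 0:
        raise ValueError("message_bytes must be >= 0 and link_bytes_per_s > 0")
    if num_nodes == 1:
        return 0.0  # nothing to exchange
    return 2.0 * (num_nodes - 1) / num_nodes * message_bytes / link_bytes_per_s


if __name__ == "__main__":
    # Illustrative inputs: 8 nodes, 10 GB of gradients, 50 GB/s per link.
    full = ring_allreduce_seconds(8, 10e9, 50e9)
    halved = ring_allreduce_seconds(8, 10e9, 25e9)
    print(f"full bandwidth: {full:.3f} s, half bandwidth: {halved:.3f} s")
```

In this model, halving the effective link bandwidth, for example through contention with other tenants, doubles the time spent in every gradient synchronization.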

Consistent Throughput

Predictable performance, every time

Without hypervisor scheduling interference or noisy-neighbor effects, Velox delivers consistent, predictable GPU throughput — critical for training runs where performance variance affects reproducibility and cost predictability.
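Throughput consistency can be checked empirically. A minimal sketch, assuming you record the wall-clock time of each training step yourself: the coefficient of variation (standard deviation divided by mean) summarizes how much step times fluctuate. The sample values below are placeholders to show usage, not benchmark results.

```python
import statistics
from typing import Sequence


def coefficient_of_variation(step_seconds: Sequence[float]) -> float:
    """Return sample standard deviation divided by mean for a series of step times."""
    if len(step_seconds) < 2:
        raise ValueError("need at least two samples")
    mean = statistics.mean(step_seconds)
    if mean <= 0:
        raise ValueError("step times must have a positive mean")
    return statistics.stdev(step_seconds) / mean


if __name__ == "__main__":
    # Placeholder step times in seconds; replace with your own measurements.
    samples = [1.02, 0.99, 1.01, 1.00, 0.98]
    print(f"CV = {coefficient_of_variation(samples):.4f}")
```

A lower value means steadier step times. Comparing the same job on two environments gives a direct, workload-specific view of variance.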

Use Cases

Built for workloads where hardware performance is the product.

When GPU memory bandwidth, interconnect throughput, and predictable performance are the deliverable, bare metal is the only foundation that makes sense.

Large-scale model training without hypervisor overhead

AI training: run distributed training workloads on dedicated bare-metal GPU clusters, reclaiming the performance lost to virtualization and eliminating the noisy-neighbor variability that makes training runs unpredictable.

Low-latency inference on dedicated hardware

High-performance inference: serve AI models on dedicated bare-metal GPU servers where throughput and latency are determined by the hardware allocated to you, not by shared-resource scheduling.

High-throughput computing without cloud virtualization penalties

HPC and scientific computing: execute HPC workloads — simulation, scientific modeling, genomics — on dedicated GPU and NPU infrastructure with the full memory bandwidth and interconnect performance of physical hardware.

Bare metal as the foundation for your own AI stack

AI infrastructure teams: build your own AI platform with Velox as the hardware foundation, adding your own software layers on top of dedicated, consistently performing bare-metal servers.

Velox vs. Telox

Bare metal, or the full platform on top of it.

Velox is bare infrastructure — GPU servers, networking, and OS. Teams bring their own software stack. Telox is the complete AI platform built on top of Velox — adding Aegis, Metis, and Signum to deliver a fully managed AI cloud with training, inference, and unified governance included.

	Bare Metal GPU Cloud (Velox)	AI-Native GPU Cloud (Telox)
Hardware	GPU/NPU servers, network, OS	GPU/NPU servers, network, OS
AI Platform	Not included	Aegis + Metis + Signum
Best for	Teams with their own software stack	Teams that need compute and platform
Provisioning	Console or API	Console or API

Ready to harness bare-metal GPU performance?

Contact our team