Bare Metal · Dedicated Single-Tenant Server · Kubernetes-Native

Reclaim the Performance Virtualization Has Been Stealing.

The moment you move to public cloud, up to 20% of your compute goes to the hypervisor. Velox eliminates this layer entirely. Run Kubernetes directly on bare metal and get container orchestration flexibility and raw physical performance together, provisioned in minutes.

No Hypervisor · Bare Metal xPU · Zero Noisy Neighbors · Predictable Fixed Performance

Why Velox

General-purpose cloud was not designed for high-performance workloads.

General-purpose public cloud often makes GPU workloads expensive to run, while virtualization overhead and shared infrastructure can limit the performance enterprises expect. Velox is dedicated bare metal infrastructure built from the ground up for large-scale AI workloads and enterprise databases. Velox runs workloads directly on bare-metal xPU infrastructure and combines high-speed parallel storage with InfiniBand-based ultra-low-latency networking. Deliver higher performance for AI and high-performance workloads with a more cost-efficient cloud model.

Architecture

Nothing between the hardware and your application.

Most general-purpose cloud services run on a VM architecture where multiple users share a single physical server. The hypervisor layer consumes up to 20% of available resources on its own and introduces significant network and storage I/O latency. Velox eliminates this layer entirely.
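
To put the "up to 20%" figure in perspective, here is a rough sketch of the arithmetic in Python. It assumes the overhead applies uniformly to a node's capacity, which is a simplification, and the function names are illustrative rather than part of any Velox tooling.

```python
def usable_percent(overhead_percent: int) -> int:
    """Percent of a node's raw capacity left for the application."""
    if not 0 <= overhead_percent < 100:
        raise ValueError("overhead_percent must be in the range 0..99")
    return 100 - overhead_percent


def vm_nodes_needed(bare_metal_nodes: int, overhead_percent: int) -> int:
    """Virtualized nodes needed to match the capacity of N bare metal nodes.

    Assumption: every virtualized node loses the same share to the hypervisor.
    """
    usable = usable_percent(overhead_percent)
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-bare_metal_nodes * 100 // usable)


# Worst case quoted above: 20% of resources consumed by the hypervisor.
print(usable_percent(20))       # 80
print(vm_nodes_needed(8, 20))   # 10
```

Under that worst case, matching eight bare metal nodes would take ten virtualized nodes of the same hardware. Real overhead varies by workload, so treat this as an upper bound, not a measurement.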

Velox Bare Metal

Application
Operating System (OS)
Hardware: ✓ 100% direct connection, zero performance loss

Standard Cloud (VM)

Application
Operating System (OS)
VM Layer: ⚠ Unnecessary abstraction
Hypervisor: ⚠ Up to 20% resource loss + I/O latency
Hardware

No Hypervisor Layer

Your application reaches the CPU, memory, and NIC through the operating system alone, with no virtualization layer in between. With no hypervisor in the stack, there is no hypervisor overhead to lose performance to.

Single-Tenant Hardware

You occupy the entire physical server exclusively. There is no performance jitter from shared resources, and no other tenant’s traffic spike can affect your service.

API-Driven Provisioning

Deploy in minutes via the Thaki Cloud console and API. The days-long wait of traditional IDC provisioning is over. Enjoy cloud agility and physical server performance at the same time.

Hardware-Level Isolation

Physical isolation, not logical separation. VM Escape vulnerabilities don’t exist without a hypervisor. Structural security assurance for compliance-critical industries including finance, healthcare, and defense.

Kubernetes on Bare Metal

Kubernetes runs directly on physical servers — no hypervisor required. You get the full flexibility of container orchestration alongside bare metal performance.
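
As an illustration of what a workload on such a cluster might look like, the sketch below builds a minimal Pod manifest that requests all eight GPUs of a node. It assumes the cluster exposes GPUs through the NVIDIA device plugin under the resource name `nvidia.com/gpu`; this page does not state how Velox exposes GPUs, and the pod name, container name, and image are hypothetical placeholders.

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int) -> dict:
    """Build a minimal Kubernetes Pod manifest that requests whole GPUs."""
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "trainer",  # hypothetical container name
                    "image": image,
                    # Assumption: GPUs are advertised by the NVIDIA device
                    # plugin, which uses the resource name "nvidia.com/gpu".
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


# Hypothetical job name and image; 8 matches the GPU count of one node.
manifest = gpu_pod_manifest("train-job", "registry.example.com/train:latest", 8)
print(json.dumps(manifest, indent=2))
```

Because `kubectl` accepts JSON as well as YAML, the printed manifest could be piped to `kubectl apply -f -` on a cluster where the assumptions above hold.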

Core Capabilities

Bare metal infrastructure built on four pillars.

COMPUTING

100% hardware performance, no virtualization overhead

Direct bare metal xPU access eliminates virtualization overhead entirely. Capture 100% of hardware performance for both foundation model training and ultra-low-latency inference. A single node with NVIDIA HGX B200 × 8 delivers 18 PFLOPS FP32 compute without compromise.

SECURITY

Physical isolation as standard

With a dedicated single-tenant server, VM Escape vulnerabilities structurally do not exist. Hardware-level complete isolation is provided as a baseline. CSAP low-to-mid grade requirements can be met.

STORAGE

Your GPU should never wait for data

High-speed parallel storage optimized for large-scale AI datasets. A 400Gbps data fabric based on NFS over RDMA removes data-loading bottlenecks at the source during training. The 10PB of cluster-level storage is 2.5× the industry-recommended capacity.
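
For a sense of scale, the following sketch converts the 400Gbps figure into bytes per second and estimates a theoretical lower bound on transfer time. It assumes decimal units, a fully saturated link, and no protocol overhead, so real transfers will be slower. The 1,440GB example size is the combined GPU memory of one node (8 × 180GB from the specification table); the function names are illustrative.

```python
def gbps_to_gigabytes_per_s(link_gbps: float) -> float:
    """Convert a line rate in gigabits per second to gigabytes per second."""
    return link_gbps / 8


def min_transfer_seconds(size_gb: float, link_gbps: float) -> float:
    """Lower bound on the time to move size_gb over a saturated link.

    Assumption: no protocol overhead and no contention on the link.
    """
    return size_gb / gbps_to_gigabytes_per_s(link_gbps)


print(gbps_to_gigabytes_per_s(400))     # 50.0
print(min_transfer_seconds(1440, 400))  # 28.8
```

At line rate, filling every GPU's memory on one node would take under half a minute, which is the kind of headroom that keeps GPUs from idling while data loads.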

AGILITY

Deploy physical servers like cloud

Days-long IDC provisioning is a thing of the past. Create and deploy bare metal servers to exact spec in minutes via the Thaki Cloud console and API. Run Kubernetes directly on physical servers for full container orchestration flexibility.

Reclaim 100% of your hardware performance — zero virtualization overhead

Technical Specifications

The numbers speak for themselves.

Velox standard node configuration. Custom configurations optimized for your specific workloads are available upon consultation.

Velox Standard Node

Item           Specification
Form           10U Rackmount
CPU            Intel Xeon 6900 Series (6960P) / 2-socket / 72-core
Memory         DDR5-6400 ECC RDIMM / 2.3TB Total (SK Hynix)
Storage        Samsung NVMe PCIe Gen4 / 30TB RAID 0
GPU            NVIDIA HGX B200 × 8 / HBM3e 180GB (SK Hynix)
GPU Fabric     InfiniBand NDR400G / ConnectX-7
DPU            BlueField-3 BF3220 / 200GbE Dual-Port
Power          5,250W × 6 Redundant (3+3)
Certification  NVIDIA Qualified

Datacenter — Gasan AI DC

Item            Detail
Location        Gasan-dong, Geumcheon-gu, Seoul
Tier            Tier-III
Floor Area      69,300㎡ (21,016 pyeong)
Power Capacity  80MW (IT: 46MW)
Rack Power      Up to 44kW per rack (air-cooled)
Seismic Design  Richter magnitude 7 (intensity 9)
Opened          July 2021

Network Fabric

Item        Detail
Compute     InfiniBand NDR / 400Gbps per GPU
vs. RoCE    5× lower latency
Storage     NFS over RDMA / 400Gbps
Management  NVIDIA UFM automation

Performance Metrics

Metric                Value             Note
FP32 Compute          18 PFLOPS / node  2.25× H100
GPU Memory            1.4TB / node      2.25× H100
GPU-to-GPU Bandwidth  1.8TB/s           2× H100
Cluster Storage       10PB              2.5× industry recommendation

B200 vs H100 Performance Comparison

Metric                 H100      B200 (Velox)
FP32 Compute           8 PFLOPS  18 PFLOPS
GPU Memory             640GB     1.4TB
GPU-to-GPU Bandwidth   900GB/s   1.8TB/s
Training Performance   Baseline  Up to 3× improvement
Inference Performance  Baseline  Up to 15× improvement

Source: NVIDIA, SMCI
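
The multipliers quoted in the tables can be reproduced from the listed values. The short sketch below does that arithmetic, taking the compute figures as labeled in the comparison table. One assumption: B200 node memory is taken as 8 × 180GB = 1,440GB from the node specification, since the rounded 1.4TB figure would give roughly 2.19× rather than 2.25×. The dictionary keys are illustrative names only.

```python
# Per-node figures as listed in the tables on this page.
h100 = {"fp32_pflops": 8, "gpu_memory_gb": 640, "gpu_to_gpu_gb_s": 900}
b200 = {
    "fp32_pflops": 18,
    "gpu_memory_gb": 8 * 180,  # 1,440GB, shown rounded as 1.4TB in the table
    "gpu_to_gpu_gb_s": 1800,
}


def ratios(new: dict, old: dict) -> dict:
    """Return new/old for every metric, rounded to two decimals."""
    return {key: round(new[key] / old[key], 2) for key in old}


print(ratios(b200, h100))
# {'fp32_pflops': 2.25, 'gpu_memory_gb': 2.25, 'gpu_to_gpu_gb_s': 2.0}

# The storage note implies a recommended baseline of 10PB / 2.5.
implied_recommended_pb = 10 / 2.5
print(implied_recommended_pb)  # 4.0
```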

Target Workloads

Built for the workloads that can’t afford limits.

Velox provides the optimal answer for mission-critical enterprise workloads that general-purpose VM-based cloud simply cannot handle.

Foundation Model Training

Large-scale AI research clusters that demand 100% GPU and HBM utilization without loss. Velox on B200 delivers up to 3× training performance and 15× inference performance over the previous generation.

High-Performance Databases

Large-scale RDBMS and real-time NoSQL workloads where I/O bottlenecks are unacceptable. Removing the virtualization layer eliminates storage I/O latency at the source.

High-Concurrency Real-Time Services

Mission-critical infrastructure that must deliver a seamless experience to millions of concurrent users. Optimal for environments where unpredictable performance variance cannot be tolerated.

Finance, Healthcare & Compliance

Hardware-level physical isolation meets regulatory requirements. Infrastructure proven in rigorous compliance environments governed by CSAP and NIS certification standards.

HPC & Parallel Scientific Computing

Ultra-low-latency GPU fabric on InfiniBand NDR400G — 5× lower latency than Ethernet-based RoCE. Maximize parallel computing efficiency at scale.

Sovereign AI Infrastructure

For public sector and large enterprise environments where data sovereignty is non-negotiable. Full physical isolation within domestic datacenters with an independent operating framework.

Velox by the Numbers

The case for adoption, in numbers.

By internalizing bare metal infrastructure, you maintain compliance while fundamentally reducing operational costs. The only fully dedicated infrastructure for AI training, inference, high-performance databases, and real-time services.

18PF

FP32 compute per node — 2.25× H100

0

Hypervisor overhead — zero performance loss

100%

Dedicated xPU occupancy — zero noisy neighbors

Minutes

Provisioning time — days faster than IDC

The convenience of cloud.
The raw power of bare metal.
Own both.