NVIDIA H100

NVIDIA H100 GPU Servers

The flagship GPU for AI training: up to 3x faster than A100, with FP8 precision and 3.35 TB/s of memory bandwidth.

80GB HBM3

HBM3 Memory

3.35 TB/s

Memory Bandwidth

3,958 TFLOPS

FP8 Performance

3x

Faster than A100

Specifications

H100 Technical Specs

Hopper architecture built for AI/ML workloads

Compute

  • CUDA Cores: 16,896
  • Tensor Cores: 528
  • FP32 Performance: 67 TFLOPS
  • FP16 Tensor: 1,979 TFLOPS (with sparsity)
  • FP8 Tensor: 3,958 TFLOPS (with sparsity)

Memory

  • Memory Size: 80GB HBM3
  • Memory Bandwidth: 3.35 TB/s
  • Memory Type: HBM3
  • ECC: Yes

Connectivity

  • Interconnect: NVLink 4.0 (900 GB/s)
  • PCIe: Gen5 x16
  • TDP: 700W
  • Form Factor: SXM5
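
Dividing peak compute by memory bandwidth gives a rough sense of how many operations a kernel must perform per byte of memory traffic before it is limited by compute rather than by bandwidth. The sketch below works that out from the figures in the spec list. The function name is illustrative, and the tensor figures are peak numbers, so real kernels will typically sit well below these ratios.

```python
# Peak figures taken from the spec list above (SXM5 variant).
PEAK_TFLOPS = {"FP32": 67, "FP16 Tensor": 1979, "FP8 Tensor": 3958}
MEMORY_BANDWIDTH_TBPS = 3.35  # TB/s


def ridge_point(peak_tflops: float, bandwidth_tbps: float) -> float:
    """FLOPs per byte of memory traffic at which compute becomes the limit."""
    # TFLOP/s divided by TB/s leaves FLOP/byte; the tera prefixes cancel.
    return peak_tflops / bandwidth_tbps


for name, tflops in PEAK_TFLOPS.items():
    intensity = ridge_point(tflops, MEMORY_BANDWIDTH_TBPS)
    print(f"{name}: about {intensity:.0f} FLOP per byte to be compute-bound")
```

Workloads with lower arithmetic intensity than these values (for example, small-batch inference) tend to be governed by the 3.35 TB/s bandwidth figure rather than the TFLOPS figure.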
Pricing

H100 Pricing

Flexible pricing for every workload

On-Demand

$3.75 /hr
  • No minimum commitment
  • Billed per minute
  • Start/stop anytime

Reserved (Monthly)

$1,500 /mo
  • ~17% savings vs on-demand
  • Guaranteed availability
  • Priority support
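
Whether the reserved plan pays off depends on how many hours per month the GPU actually runs. The following is a rough sketch of that arithmetic, assuming both prices apply to the same single-GPU instance and that a full month is about 730 hours (neither assumption is stated on this page). The function names are illustrative.

```python
ON_DEMAND_PER_HOUR = 3.75   # USD per hour, from the On-Demand plan
RESERVED_PER_MONTH = 1500   # USD per month, from the Reserved plan
HOURS_PER_MONTH = 730       # assumption: average hours in a month


def break_even_hours(hourly: float, monthly: float) -> float:
    """Hours of on-demand use per month that cost the same as reserving."""
    return monthly / hourly


def reserved_savings(hours_used: float, hourly: float, monthly: float) -> float:
    """Fraction saved by reserving; negative when on-demand would be cheaper."""
    on_demand_cost = hours_used * hourly
    return 1 - monthly / on_demand_cost


print(f"Break-even: {break_even_hours(ON_DEMAND_PER_HOUR, RESERVED_PER_MONTH):.0f} hours/month")
for hours in (200, 400, HOURS_PER_MONTH):
    saving = reserved_savings(hours, ON_DEMAND_PER_HOUR, RESERVED_PER_MONTH)
    print(f"{hours} hours/month: reserved saves {saving:.0%}")
```

Below the break-even point the on-demand plan is cheaper, so the savings percentage for a given team depends on its expected utilization.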
Use Cases

What Can You Build with H100?

Industry-leading performance for the most demanding AI workloads

LLM Training

Train large language models with 100B+ parameters. H100 delivers 3x faster training vs A100 for transformer architectures.

Foundation Model Fine-tuning

Fine-tune LLaMA, Mistral, Falcon, and other foundation models with full precision or LoRA/QLoRA.

Distributed Training

Scale across 8x H100 clusters with NVLink for near-linear scaling on large training jobs.
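
To get a feel for why large models need multi-GPU configurations, the sketch below estimates the memory taken by model weights alone and the minimum number of 80GB GPUs needed to hold them. It is a lower bound under stated assumptions: training also needs memory for gradients, optimizer state, and activations, so real requirements are several times higher. The function names are illustrative.

```python
import math

GPU_MEMORY_GB = 80  # per H100, from the spec list
BYTES_PER_PARAM = {"FP32": 4, "FP16/BF16": 2, "FP8/INT8": 1}


def weights_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory for model weights only, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def min_gpus_for_weights(num_params: float, bytes_per_param: int,
                         gpu_memory_gb: float = GPU_MEMORY_GB) -> int:
    """Smallest GPU count whose combined memory can hold the weights."""
    return math.ceil(weights_gb(num_params, bytes_per_param) / gpu_memory_gb)


for precision, nbytes in BYTES_PER_PARAM.items():
    size = weights_gb(100e9, nbytes)
    gpus = min_gpus_for_weights(100e9, nbytes)
    print(f"100B params at {precision}: {size:.0f} GB of weights, at least {gpus} GPUs")
```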

High-Throughput Inference

Deploy models for production inference with industry-leading throughput and low latency.

Comparison

H100 vs Other GPUs

H100 vs A100

3x faster training, 30% better inference throughput, FP8 support

H100 vs H200

H200 has 76% more memory (141GB vs 80GB) for larger models

H100 vs L40S

H100 is up to 4x faster for training; L40S is better suited to cost-effective inference

FAQ

H100 Questions

What is the difference between H100 SXM and H100 PCIe?

H100 SXM offers higher memory bandwidth (3.35 TB/s vs 2 TB/s) and NVLink 4.0 support for multi-GPU scaling. SXM is preferred for training workloads, while PCIe is suitable for inference.

How does H100 pricing compare to A100?

H100 costs approximately 50% more per hour than A100, but delivers 3x better training performance. For training workloads, H100 typically offers better cost-efficiency despite the higher hourly rate.
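
The cost-efficiency claim follows from simple arithmetic: a fixed training job costs (hourly price) x (hours to finish), so the relative cost is the price ratio divided by the speedup. A minimal sketch using the figures quoted above; the function name is illustrative, and the 3x speedup is the page's headline figure rather than a measured result for any particular model.

```python
def relative_cost_per_job(price_ratio: float, speedup: float) -> float:
    """Cost of a fixed job on the faster GPU, relative to the baseline GPU.

    price_ratio: hourly price of the faster GPU / hourly price of the baseline
    speedup:     baseline run time / run time on the faster GPU
    """
    return price_ratio / speedup


# H100 at roughly 1.5x the hourly price of A100, finishing 3x sooner.
ratio = relative_cost_per_job(price_ratio=1.5, speedup=3.0)
print(f"H100 job cost relative to A100: {ratio:.2f}")
```

The H100 stays cheaper per job as long as the speedup achieved on a given workload exceeds the price ratio, which here means more than 1.5x.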

Can I use H100 for inference workloads?

Yes. H100 excels at inference, especially with FP8 quantization, which offers up to 2x the peak tensor throughput of FP16 (3,958 vs 1,979 TFLOPS). However, for cost-sensitive inference, A100 or L40S may offer better value.

What ML frameworks support H100?

All major frameworks, including PyTorch 2.0+, TensorFlow, and JAX, as well as specialized libraries like DeepSpeed and Megatron-LM, support H100. FP8 is typically enabled through additional libraries such as NVIDIA Transformer Engine rather than by default.
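
Before enabling FP8 in a training script, it can help to confirm that the GPU actually has FP8 tensor cores. The sketch below checks the CUDA compute capability, assuming PyTorch is the framework in use; H100 (Hopper) reports 9.0, and A100 reports 8.0, which has no FP8 support. The helper names are hypothetical, and the check only covers the hardware, not whether a particular framework build uses FP8.

```python
from typing import Optional, Tuple

# FP8 tensor cores first appear at compute capability 8.9 (Ada);
# Hopper GPUs such as H100 report 9.0.
FP8_MIN_CAPABILITY = (8, 9)


def has_fp8_tensor_cores(capability: Tuple[int, int]) -> bool:
    """True if a (major, minor) compute capability includes FP8 tensor cores."""
    return tuple(capability) >= FP8_MIN_CAPABILITY


def detect_capability() -> Optional[Tuple[int, int]]:
    """Compute capability of GPU 0, or None if PyTorch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


capability = detect_capability()
if capability is None:
    print("No CUDA GPU detected (or PyTorch is not installed)")
else:
    print(f"Compute capability {capability}: "
          f"FP8 tensor cores = {has_fp8_tensor_cores(capability)}")
```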

Do you offer multi-GPU H100 instances?

Yes. We offer 1x, 2x, 4x, and 8x H100 configurations with NVLink interconnect. For larger clusters, contact our team for custom deployments.

Ready for H100?

Start Training on H100 Today

Get instant access to H100 GPUs with pre-configured ML environments.