NVIDIA H200

Maximum Memory for AI Workloads

141GB HBM3e memory, 1.8x the capacity of H100. Run larger models, longer contexts, and bigger batches without memory constraints.

141GB HBM3e Memory
4.8 TB/s Memory Bandwidth
1.8x More Memory vs H100
900 GB/s NVLink Bandwidth

Memory Where You Need It

H200 removes memory bottlenecks from your AI pipeline.

141GB HBM3e Memory

1.8x the memory of H100. Fit larger models and longer context lengths without memory constraints.

4.8 TB/s Bandwidth

1.4x higher memory bandwidth than H100 for faster data movement and improved throughput.

NVLink 4.0

900 GB/s GPU-to-GPU bandwidth enables efficient scaling across multi-GPU configurations.
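
For instance, NCCL collectives ride NVLink between GPUs on the same node. A minimal sketch, assuming an 8-GPU instance and a torchrun launch (the filename is a placeholder):

    import torch
    import torch.distributed as dist

    # Launch with: torchrun --nproc_per_node=8 allreduce_demo.py
    dist.init_process_group(backend="nccl")  # NCCL routes traffic over NVLink
    rank = dist.get_rank()
    torch.cuda.set_device(rank)

    # Each GPU contributes its rank; all_reduce sums across all 8 devices.
    x = torch.full((1024 * 1024,), float(rank), device="cuda")
    dist.all_reduce(x, op=dist.ReduceOp.SUM)  # every element becomes 0+1+...+7 = 28

    dist.destroy_process_group()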

Hopper Architecture

Same proven Hopper architecture as H100, with expanded memory for larger workloads.

Pre-configured Environment

PyTorch, TensorFlow, CUDA 12, and popular ML frameworks ready to use immediately.
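
A quick way to confirm the stack on first login, assuming the preinstalled PyTorch build:

    import torch

    # Sanity-check the preinstalled environment.
    print(torch.__version__, torch.version.cuda)  # PyTorch and its CUDA 12.x build
    print(torch.cuda.get_device_name(0))          # expect an NVIDIA H200 here
    props = torch.cuda.get_device_properties(0)
    print(f"{props.total_memory / 1e9:.0f} GB")   # roughly 140 GB of usable HBM3e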

Expert Support

Our ML infrastructure team helps with environment setup, optimization, and debugging.

Technical Specifications

H200 SXM specifications for reference.

Specification       Value
GPU Memory          141GB HBM3e
Memory Bandwidth    4.8 TB/s
FP8 Tensor Core     3,958 TFLOPS (with sparsity)
FP16 Tensor Core    1,979 TFLOPS (with sparsity)
TF32 Tensor Core    989 TFLOPS (with sparsity)
NVLink Bandwidth    900 GB/s
TDP                 700W
Form Factor         SXM5

What H200 Excels At

Memory-intensive workloads where H100's 80GB isn't enough.

Large Model Inference

Deploy 70B+ parameter models for production inference. 141GB of memory fits models that would need multiple GPUs on H100.
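
The arithmetic behind that claim, as a rough sketch (weights only; real deployments also need headroom for KV cache, activations, and framework overhead):

    # Back-of-the-envelope weight memory for a 70B-parameter model.
    params = 70e9
    fp16_gb = params * 2 / 1e9  # 2 bytes/param -> ~140 GB: needs 2x H100 80GB
    fp8_gb = params * 1 / 1e9   # 1 byte/param  -> ~70 GB: one H200, ~70 GB to spare
    print(f"FP16 weights: {fp16_gb:.0f} GB, FP8 weights: {fp8_gb:.0f} GB")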

Memory-Intensive Training

Train models with large batch sizes and long context lengths without the recomputation overhead of gradient checkpointing.
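
Gradient checkpointing is usually a per-block switch, so with 141GB per GPU it can often simply stay off. A minimal PyTorch sketch (the helper and its arguments are illustrative):

    import torch
    from torch.utils.checkpoint import checkpoint

    def run_block(block: torch.nn.Module, x: torch.Tensor, checkpointing: bool):
        # Checkpointing discards activations and recomputes them during the
        # backward pass, trading extra compute for memory. With ample HBM you
        # keep the activations and skip the recomputation cost.
        if checkpointing:
            return checkpoint(block, x, use_reentrant=False)
        return block(x)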

RAG Applications

Retrieval-augmented generation with large context windows benefits from H200's expanded memory.
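
The main memory consumer here is the KV cache, which grows linearly with context length. A rough sizing sketch, assuming a Llama-3-70B-like configuration (80 layers, 8 KV heads via GQA, head dim 128, FP16 cache):

    # KV-cache sizing for long-context inference (illustrative config).
    layers, kv_heads, head_dim, dtype_bytes = 80, 8, 128, 2
    bytes_per_token = 2 * layers * kv_heads * head_dim * dtype_bytes  # K and V
    cache_gb = bytes_per_token * 128 * 1024 / 1e9  # a 128k-token context
    print(f"{bytes_per_token / 1024:.0f} KiB/token, {cache_gb:.0f} GB at 128k")
    # ~320 KiB/token and ~43 GB per 128k sequence: comfortable alongside
    # FP8 weights on a 141GB card, very tight on an 80GB one.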

Scientific Computing

HPC workloads with large datasets that benefit from high memory capacity and bandwidth.
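
Many HPC kernels are bandwidth-bound, so a large device-to-device copy makes a useful sanity check. A hedged micro-benchmark sketch in PyTorch (indicative only; real kernels rarely reach the 4.8 TB/s peak):

    import time
    import torch

    n = 1024**3                      # 1Gi float32 elements, ~4 GB per buffer
    a = torch.randn(n, device="cuda")
    b = torch.empty_like(a)

    b.copy_(a)                       # warm-up
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    iters = 20
    for _ in range(iters):
        b.copy_(a)
    torch.cuda.synchronize()
    elapsed = time.perf_counter() - t0

    moved = iters * 2 * a.numel() * a.element_size()  # each copy reads + writes
    print(f"effective bandwidth: {moved / elapsed / 1e12:.2f} TB/s")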

On-Demand & Reserved Pricing

Flexible pricing for memory-intensive workloads.

1x H200 SXM

$3.99 /hr
or $2,395 /mo reserved
  • 1× NVIDIA H200 SXM
  • 141GB HBM3e
  • 24 vCPU
  • 240GB RAM
  • 1TB NVMe
  • Pre-configured ML Environment
Get Started

8x H200 SXM

$31.99 /hr
or $19,195 /mo reserved
  • 8× NVIDIA H200 SXM
  • 1.1TB HBM3e
  • 192 vCPU
  • 1.9TB RAM
  • 8TB NVMe
  • Pre-configured ML Environment
Get Started

Need a custom configuration? Contact us for a quote.

Frequently Asked Questions

How does H200 compare to H100?

H200 has 1.8x the memory of H100 (141GB vs 80GB) and 1.4x higher memory bandwidth (4.8 TB/s vs 3.35 TB/s). Compute performance is the same. Choose H200 when you need more memory capacity or your workload is memory-bound.
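
A rough roofline estimate shows where that 1.4x matters, using the FP16 Tensor Core figure from the spec table above (the crossover points are approximations):

    # Kernels below the compute/bandwidth crossover are memory-bound and
    # inherit H200's bandwidth advantage; those above it see no change.
    flops = 1979e12                     # FP16 TFLOPS, same on both GPUs
    bw_h100, bw_h200 = 3.35e12, 4.8e12  # bytes/s
    print(f"H100 crossover: {flops / bw_h100:.0f} FLOP/byte")  # ~591
    print(f"H200 crossover: {flops / bw_h200:.0f} FLOP/byte")  # ~412
    print(f"memory-bound speedup: {bw_h200 / bw_h100:.2f}x")   # ~1.43x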

When should I choose H200 over H100?

Choose H200 if you're running large model inference (70B+ parameters), training with large batch sizes, or working with long context lengths. If compute is your bottleneck, H100 offers better price/performance.

What frameworks are supported?

All major frameworks: PyTorch, TensorFlow, JAX, Hugging Face Transformers. H200 uses the same CUDA/cuDNN stack as H100, so existing code works without modification.

Is reserved pricing available?

Yes. Reserved instances offer 20-40% discounts compared to on-demand pricing. Contact our team for a quote based on your commitment term.

Can I mix H200 with H100 in a cluster?

For most workloads, we recommend homogeneous clusters (all H200 or all H100). Mixed configurations can work for certain pipelines—contact our team to discuss your architecture.

Ready for Maximum Memory?

Get Your H200 Instance Today

Talk to our team about your large-scale AI workload. We'll help you decide between H200 and H100.