NVIDIA L40S

Versatile AI & Media GPU

Ada Lovelace architecture for AI inference, video processing, and 3D rendering. Cost-effective performance for production workloads.

48GB GDDR6 Memory
864 GB/s Memory Bandwidth
3× NVENC Encoders
362 FP16 TFLOPS

AI Meets Media Processing

L40S excels at inference, rendering, and video encoding.

Ada Lovelace Architecture

Latest-generation architecture with hardware ray tracing and improved tensor cores for AI and graphics.

Hardware Video Encoding

Three NVENC encoders for real-time video transcoding, with AV1, H.264, and H.265 encoding at scale.
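For example, hardware transcoding is typically driven through ffmpeg's NVENC encoders (`h264_nvenc`, `hevc_nvenc`, `av1_nvenc`). A minimal sketch, assuming an ffmpeg build with NVENC support; the file names and bitrate are placeholders:

```python
# Sketch: build an ffmpeg command that transcodes on the L40S's NVENC
# hardware instead of the CPU. Assumes an ffmpeg build with NVENC enabled.
import subprocess

NVENC_CODECS = {"h264": "h264_nvenc", "hevc": "hevc_nvenc", "av1": "av1_nvenc"}

def nvenc_transcode_cmd(src: str, dst: str, codec: str = "av1",
                        bitrate: str = "5M") -> list[str]:
    """Return an ffmpeg argv that decodes on the GPU and encodes with NVENC."""
    if codec not in NVENC_CODECS:
        raise ValueError(f"unsupported codec: {codec}")
    return [
        "ffmpeg",
        "-hwaccel", "cuda",           # decode on the GPU as well
        "-i", src,
        "-c:v", NVENC_CODECS[codec],  # hardware encoder
        "-b:v", bitrate,
        "-c:a", "copy",               # pass audio through untouched
        dst,
    ]

cmd = nvenc_transcode_cmd("input.mp4", "output.mp4", codec="av1")
# subprocess.run(cmd, check=True)  # uncomment on a machine with an L40S
```

Because the encoders are dedicated silicon, several such jobs can run concurrently without consuming CUDA compute.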

48GB GDDR6 Memory

Ample memory for inference workloads and 3D rendering. Fits most production models.

Low-Latency Inference

Optimized for real-time AI inference with consistent low latency for interactive applications.

Pre-configured Environment

PyTorch, TensorFlow, TensorRT, and popular ML frameworks ready to use. Start deploying immediately.

Cost-Effective

Excellent price/performance for inference and rendering workloads. Lower cost than A100/H100.

Technical Specifications

L40S specifications for reference.

GPU Memory: 48GB GDDR6 with ECC
Memory Bandwidth: 864 GB/s
FP16 Performance: 362 TFLOPS
FP32 Performance: 91 TFLOPS
RT Cores: 3rd Generation
Tensor Cores: 4th Generation
NVENC: 3× AV1/HEVC/H.264
TDP: 350W
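These two numbers bound single-stream LLM decode, which is usually memory-bandwidth-bound: an upper limit on tokens per second is bandwidth divided by the bytes read per token (roughly the full set of weights). A back-of-envelope sketch, using the bandwidth figure above and an illustrative 13B FP16 model:

```python
# Back-of-envelope: batch-1 LLM decode is typically memory-bandwidth-bound,
# so tokens/s is capped near bandwidth / model size. Bandwidth is from the
# spec table; the 13B model is an illustrative assumption.
BANDWIDTH_GBS = 864        # L40S memory bandwidth, GB/s
PARAMS_B = 13              # hypothetical 13B-parameter model
BYTES_PER_PARAM = 2        # FP16

model_gb = PARAMS_B * BYTES_PER_PARAM          # 26 GB of weights
max_tokens_per_s = BANDWIDTH_GBS / model_gb    # ~33 tokens/s ceiling

print(f"Weights: {model_gb} GB, decode ceiling: ~{max_tokens_per_s:.0f} tok/s")
```

Batching requests amortizes the weight reads across users, which is why real deployments see much higher aggregate throughput.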

What L40S Excels At

Optimized for inference, rendering, and media workloads.

AI Inference

Deploy models for production inference with low latency. Great for LLM inference, embeddings, and classification.
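In practice such a deployment is usually fronted by an OpenAI-compatible HTTP API (as served by vLLM, TGI, and similar). A minimal client sketch; the endpoint URL and model name are placeholders for your own deployment:

```python
# Sketch: calling an OpenAI-compatible inference server running on an L40S
# instance. ENDPOINT and the model name are assumptions, not real services.
import json
import urllib.request

ENDPOINT = "http://localhost:8000/v1/chat/completions"  # placeholder

def build_request(prompt: str, model: str = "my-model", max_tokens: int = 128):
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.2,
    }
    req = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    return req, payload

req, payload = build_request("Summarize this ticket in one sentence.")
# resp = urllib.request.urlopen(req)  # uncomment against a live server
```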

Video Processing

Real-time video transcoding, streaming, and analysis with three hardware encoders.

3D Rendering

Ray-traced rendering for visualization, CAD, and media production with RTX acceleration.

Real-Time AI Applications

Interactive AI applications, chatbots, and real-time content generation with consistent latency.

Cost-Effective GPU Power

Excellent value for inference and media workloads.

1x L40S

$1.29 /hr
or $775 /mo reserved
  • 1× NVIDIA L40S
  • 48GB GDDR6
  • 16 vCPU
  • 120GB RAM
  • 500GB NVMe
  • Pre-configured Environment
Get Started

4x L40S

$4.99 /hr
or $2,995 /mo reserved
  • 4× NVIDIA L40S
  • 192GB GDDR6
  • 64 vCPU
  • 480GB RAM
  • 2TB NVMe
  • Pre-configured Environment
Get Started

Need more GPUs? Contact us for custom configurations.

Frequently Asked Questions

How does L40S compare to A100 for inference?

L40S offers excellent inference performance at a lower price point than A100. For inference-heavy workloads (not training), L40S often provides better price/performance. A100 has higher memory bandwidth and is better for memory-bound inference.
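One way to see the memory-bound distinction is the roofline "ridge point" (peak FLOPs ÷ bandwidth): the arithmetic intensity a kernel needs before it becomes compute-bound. A rough sketch; L40S figures are from the spec table, and the A100 figures are approximate datasheet numbers for the 80GB SXM variant:

```python
# Rough roofline comparison. The ridge point is the arithmetic intensity
# (FLOPs per byte) above which a kernel is compute-bound rather than
# memory-bound. Low-intensity work (e.g. batch-1 LLM decode) gains more
# from the A100's bandwidth; A100 numbers are approximate datasheet values.
specs = {
    "L40S":      {"fp16_tflops": 362, "bw_gbs": 864},
    "A100-80GB": {"fp16_tflops": 312, "bw_gbs": 2039},
}

ridge = {name: s["fp16_tflops"] * 1e12 / (s["bw_gbs"] * 1e9)
         for name, s in specs.items()}
for name, r in ridge.items():
    print(f"{name}: compute-bound above ~{r:.0f} FLOPs/byte")
```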

Is L40S good for training?

L40S can handle light training and fine-tuning, but for serious training workloads, A100 or H100 are better choices. L40S is optimized for inference, rendering, and video processing.

What video formats does the encoder support?

The three NVENC encoders support AV1, HEVC (H.265), and H.264 encoding. You can encode multiple streams simultaneously for live streaming and video processing pipelines.

Can I use L40S for LLM inference?

Yes. L40S is excellent for LLM inference, especially for models that fit in 48GB memory. For larger models (70B+), consider A100 80GB or H100/H200.
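A quick way to sanity-check the fit: serving memory is roughly weights plus KV cache plus runtime overhead. A sketch under stated assumptions; the model sizes, 4 GB KV-cache budget, and 20% overhead factor are illustrative, not measured:

```python
# Quick feasibility check: estimated GPU memory for serving an LLM =
# weights + KV cache + overhead. The KV-cache budget and 20% overhead
# factor are illustrative assumptions, not measured numbers.
def serving_memory_gb(params_b: float, bytes_per_param: int = 2,
                      kv_cache_gb: float = 4.0, overhead: float = 0.2) -> float:
    weights = params_b * bytes_per_param   # FP16 weights by default
    return (weights + kv_cache_gb) * (1 + overhead)

L40S_MEMORY_GB = 48
for params in (7, 13, 34, 70):
    need = serving_memory_gb(params)
    fits = "fits" if need <= L40S_MEMORY_GB else "does NOT fit"
    print(f"{params}B @ FP16: ~{need:.0f} GB -> {fits} in {L40S_MEMORY_GB} GB")
```

By this estimate, models up to roughly 13B serve comfortably at FP16; larger models need quantization or a bigger-memory GPU, consistent with the answer above.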

Is reserved pricing available?

Yes. Reserved instances offer 20-40% discounts compared to on-demand pricing. L40S reserved instances are very cost-effective for steady inference workloads.
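The break-even point follows directly from the 1× L40S prices in the pricing section, a quick sanity check you can adapt to your own utilization:

```python
# Break-even check: at what monthly usage does the reserved rate beat
# on-demand? Prices are the 1x L40S figures from the pricing section.
ON_DEMAND_PER_HR = 1.29
RESERVED_PER_MO = 775
HOURS_PER_MO = 730          # average hours in a month

break_even_hours = RESERVED_PER_MO / ON_DEMAND_PER_HR
utilization = break_even_hours / HOURS_PER_MO

print(f"Reserved wins above ~{break_even_hours:.0f} h/mo "
      f"(~{utilization:.0%} utilization)")
```

So a steady, always-on inference service clears the threshold easily, while a GPU used a few hours a day is cheaper on-demand.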

Ready for Cost-Effective AI?

Deploy Your L40S Instance Today

Talk to our team about your inference or media workload. We'll help you choose the right GPU.