Versatile AI & Media GPU
Ada Lovelace architecture for AI inference, video processing, and 3D rendering. Cost-effective performance for production workloads.
AI Meets Media Processing
L40S excels at inference, rendering, and video encoding.
Ada Lovelace Architecture
Latest-generation architecture with hardware ray tracing and improved tensor cores for AI and graphics.
Hardware Video Encoding
Dual NVENC encoders for real-time video transcoding. AV1, H.264, and H.265 encoding at scale.
48GB GDDR6 Memory
Ample memory for inference workloads and 3D rendering. Fits most production models.
Low-Latency Inference
Optimized for real-time AI inference with consistent low latency for interactive applications.
Pre-configured Environment
PyTorch, TensorFlow, TensorRT, and popular ML frameworks ready to use. Start deploying immediately.
Cost-Effective
Excellent price/performance for inference and rendering workloads. Lower cost than A100/H100.
Technical Specifications
L40S specifications for reference.
| Specification | Value |
|---|---|
| GPU Memory | 48GB GDDR6 with ECC |
| Memory Bandwidth | 864 GB/s |
| FP16 Tensor Performance | 362 TFLOPS (with sparsity) |
| FP32 Performance | 91 TFLOPS |
| RT Cores | 3rd Generation |
| Tensor Cores | 4th Generation |
| NVENC | 2× AV1/HEVC/H.264 |
| TDP | 350W |
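The bandwidth figure above sets a rough ceiling on memory-bound LLM decoding: generating each token streams the full weight set from memory, so tokens/sec can't exceed bandwidth divided by model size. A minimal back-of-the-envelope sketch (the 7B model size and FP16 weights are illustrative assumptions; real throughput will be lower once KV-cache reads and kernel overhead are counted):

```python
# Rough upper bound on single-stream decode throughput for a
# memory-bandwidth-bound LLM: every generated token reads all weights once.

def decode_tokens_per_sec(params: float, bytes_per_param: float,
                          bandwidth_gbps: float) -> float:
    """Upper-bound tokens/sec = memory bandwidth / bytes of weights streamed."""
    model_bytes = params * bytes_per_param
    return bandwidth_gbps * 1e9 / model_bytes

# L40S: 864 GB/s memory bandwidth; assume a 7B-parameter model in FP16.
bound = decode_tokens_per_sec(7e9, 2, 864)
print(f"~{bound:.0f} tokens/sec upper bound")  # ~62
```

This is why the A100's higher memory bandwidth matters for memory-bound inference, while the L40S remains strong wherever compute or batching dominates.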
What L40S Excels At
Optimized for inference, rendering, and media workloads.
AI Inference
Deploy models for production inference with low latency. Great for LLM inference, embeddings, and classification.
Video Processing
Real-time video transcoding, streaming, and analysis with dual hardware encoders.
3D Rendering
Ray-traced rendering for visualization, CAD, and media production with RTX acceleration.
Real-Time AI Applications
Interactive AI applications, chatbots, and real-time content generation with consistent latency.
Cost-Effective GPU Power
Excellent value for inference and media workloads.
1x L40S
- 1× NVIDIA L40S
- 48GB GDDR6
- 16 vCPU
- 120GB RAM
- 500GB NVMe
- Pre-configured Environment
2x L40S
- 2× NVIDIA L40S
- 96GB GDDR6
- 32 vCPU
- 240GB RAM
- 1TB NVMe
- Pre-configured Environment
4x L40S
- 4× NVIDIA L40S
- 192GB GDDR6
- 64 vCPU
- 480GB RAM
- 2TB NVMe
- Pre-configured Environment
Need more GPUs? Contact us for custom configurations.
Frequently Asked Questions
How does L40S compare to A100 for inference?
L40S offers excellent inference performance at a lower price point than A100. For inference-heavy workloads (not training), L40S often provides better price/performance. A100 has higher memory bandwidth and is better for memory-bound inference.
Is L40S good for training?
L40S can handle light training and fine-tuning, but for serious training workloads, A100 or H100 are better choices. L40S is optimized for inference, rendering, and video processing.
What video formats does the encoder support?
The dual NVENC encoders support AV1, HEVC (H.265), and H.264 encoding. You can encode multiple streams simultaneously for live streaming and video processing pipelines.
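As an illustration, a transcode job targets the hardware encoders by selecting ffmpeg's `h264_nvenc`, `hevc_nvenc`, or `av1_nvenc` codecs. A hedged sketch that only assembles the command line (file names and bitrate are placeholders, and it assumes an ffmpeg build compiled with NVENC support):

```python
# Build (but don't run) an ffmpeg command that uses NVENC hardware
# encoders. h264_nvenc / hevc_nvenc / av1_nvenc are ffmpeg's standard
# NVENC codec identifiers.

NVENC_CODECS = {"h264": "h264_nvenc", "hevc": "hevc_nvenc", "av1": "av1_nvenc"}

def nvenc_transcode_cmd(src: str, dst: str, codec: str = "hevc",
                        bitrate: str = "5M") -> list[str]:
    """Return an ffmpeg argv list for a hardware-encoded transcode."""
    return ["ffmpeg", "-i", src,
            "-c:v", NVENC_CODECS[codec],  # route encoding to NVENC
            "-b:v", bitrate,
            "-c:a", "copy",               # pass audio through untouched
            dst]

cmd = nvenc_transcode_cmd("input.mp4", "output.mp4", codec="hevc")
# To execute: subprocess.run(cmd, check=True)
```

With two NVENC engines, several such jobs can run concurrently for live-streaming and batch pipelines.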
Can I use L40S for LLM inference?
Yes. L40S is excellent for LLM inference, especially for models that fit within 48GB of memory. For larger models (70B+), consider A100 80GB or H100/H200.
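A quick way to check whether a model fits: weights take roughly params × bytes-per-param, plus headroom for KV cache and activations. The 1.2× headroom factor below is an illustrative assumption; real overhead depends on batch size and context length:

```python
# Estimate whether a model's weights (plus headroom for KV cache and
# activations) fit in a GPU's memory. The 1.2x headroom factor is a
# rough, illustrative assumption, not a measured figure.

def fits_in_memory(params: float, bytes_per_param: float,
                   gpu_mem_gb: float, headroom: float = 1.2) -> bool:
    needed_gb = params * bytes_per_param * headroom / 1e9
    return needed_gb <= gpu_mem_gb

# 13B in FP16: ~26 GB of weights -> fits comfortably in 48 GB.
print(fits_in_memory(13e9, 2, 48))   # True
# 70B in FP16: ~140 GB of weights -> needs larger-memory GPUs.
print(fits_in_memory(70e9, 2, 48))   # False
```

Quantizing to 8-bit or 4-bit weights (1 or 0.5 bytes per parameter) extends the range of models a single 48GB card can serve.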
Is reserved pricing available?
Yes. Reserved instances offer 20-40% discounts compared to on-demand pricing. L40S reserved instances are very cost-effective for steady inference workloads.
Deploy Your L40S Instance Today
Talk to our team about your inference or media workload. We'll help you choose the right GPU.