Versatile AI & Media GPU
Ada Lovelace architecture for AI inference, video processing, and 3D rendering. Cost-effective performance for production workloads.
AI Meets Media Processing
L40S excels at inference, rendering, and video encoding.
Ada Lovelace Architecture
Latest-generation architecture with hardware ray tracing and improved tensor cores for AI and graphics.
Hardware Video Encoding
Dual NVENC encoders for real-time video transcoding. AV1, H.264, and H.265 encoding at scale.
48GB GDDR6 Memory
Ample memory for inference workloads and 3D rendering. Fits most production models.
Low-Latency Inference
Optimized for real-time AI inference with consistent low latency for interactive applications.
Pre-configured Environment
PyTorch, TensorFlow, TensorRT, and popular ML frameworks ready to use. Start deploying immediately.
Cost-Effective
Excellent price/performance for inference and rendering workloads. Lower cost than A100/H100.
Technical Specifications
L40S specifications for reference.
| Specification | Value |
|---|---|
| GPU Memory | 48GB GDDR6 with ECC |
| Memory Bandwidth | 864 GB/s |
| FP16 Tensor Performance | 362 TFLOPS (with sparsity) |
| FP32 Performance | 91 TFLOPS |
| RT Cores | 3rd Generation |
| Tensor Cores | 4th Generation |
| NVENC | 2× AV1/HEVC/H.264 |
| TDP | 350W |
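The bandwidth figure above sets a rough ceiling on memory-bound LLM decoding: generating each token streams the full weight set from memory, so tokens/sec can't exceed bandwidth divided by model size. A minimal back-of-the-envelope sketch (the 7B model size and FP16 weights are illustrative assumptions; real throughput will be lower once KV-cache reads and kernel overhead are counted):

```python
# Rough upper bound on single-stream decode throughput for a
# memory-bandwidth-bound LLM: every generated token reads all weights once.

def decode_tokens_per_sec(params: float, bytes_per_param: float,
                          bandwidth_gbps: float) -> float:
    """Upper-bound tokens/sec = memory bandwidth / bytes of weights streamed."""
    model_bytes = params * bytes_per_param
    return bandwidth_gbps * 1e9 / model_bytes

# L40S: 864 GB/s memory bandwidth; assume a 7B-parameter model in FP16.
bound = decode_tokens_per_sec(7e9, 2, 864)
print(f"~{bound:.0f} tokens/sec upper bound")  # ~62
```

This is why the A100's higher memory bandwidth matters for memory-bound inference, while the L40S remains strong wherever compute or batching dominates.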
What L40S Excels At
Optimized for inference, rendering, and media workloads.
AI Inference
Deploy models for production inference with low latency. Great for LLM inference, embeddings, and classification.
Video Processing
Real-time video transcoding, streaming, and analysis with dual hardware encoders.
3D Rendering
Ray-traced rendering for visualization, CAD, and media production with RTX acceleration.
Real-Time AI Applications
Interactive AI applications, chatbots, and real-time content generation with consistent latency.
Cost-Effective GPU Power
Excellent value for inference and media workloads.
1x L40S
- 1× NVIDIA L40S
- 48GB GDDR6
- 16 vCPU
- 120GB RAM
- 500GB NVMe
- Pre-configured Environment
2x L40S
- 2× NVIDIA L40S
- 96GB GDDR6
- 32 vCPU
- 240GB RAM
- 1TB NVMe
- Pre-configured Environment
4x L40S
- 4× NVIDIA L40S
- 192GB GDDR6
- 64 vCPU
- 480GB RAM
- 2TB NVMe
- Pre-configured Environment
Need more GPUs? Contact us for custom configurations.
Frequently Asked Questions
How does L40S compare to A100 for inference?
L40S offers excellent inference performance at a lower price point than A100. For inference-heavy workloads (not training), L40S often provides better price/performance. A100 has higher memory bandwidth and is better for memory-bound inference.
Is L40S good for training?
L40S can handle light training and fine-tuning, but for serious training workloads, A100 or H100 are better choices. L40S is optimized for inference, rendering, and video processing.
What video formats does the encoder support?
The dual NVENC encoders support AV1, HEVC (H.265), and H.264 encoding. You can encode multiple streams simultaneously for live streaming and video processing pipelines.
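As an illustration, a transcode job targets the hardware encoders by selecting ffmpeg's `h264_nvenc`, `hevc_nvenc`, or `av1_nvenc` codecs. A hedged sketch that only assembles the command line (file names and bitrate are placeholders, and it assumes an ffmpeg build compiled with NVENC support):

```python
# Build (but don't run) an ffmpeg command that uses NVENC hardware
# encoders. h264_nvenc / hevc_nvenc / av1_nvenc are ffmpeg's standard
# NVENC codec identifiers.

NVENC_CODECS = {"h264": "h264_nvenc", "hevc": "hevc_nvenc", "av1": "av1_nvenc"}

def nvenc_transcode_cmd(src: str, dst: str, codec: str = "hevc",
                        bitrate: str = "5M") -> list[str]:
    """Return an ffmpeg argv list for a hardware-encoded transcode."""
    return ["ffmpeg", "-i", src,
            "-c:v", NVENC_CODECS[codec],  # route encoding to NVENC
            "-b:v", bitrate,
            "-c:a", "copy",               # pass audio through untouched
            dst]

cmd = nvenc_transcode_cmd("input.mp4", "output.mp4", codec="hevc")
# To execute: subprocess.run(cmd, check=True)
```

With two NVENC engines, several such jobs can run concurrently for live-streaming and batch pipelines.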
Can I use L40S for LLM inference?
Yes. L40S is excellent for LLM inference, especially for models that fit within 48GB of memory. For larger models (70B+), consider A100 80GB or H100/H200.
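A quick way to check whether a model fits: weights take roughly params × bytes-per-param, plus headroom for KV cache and activations. The 1.2× headroom factor below is an illustrative assumption; real overhead depends on batch size and context length:

```python
# Estimate whether a model's weights (plus headroom for KV cache and
# activations) fit in a GPU's memory. The 1.2x headroom factor is a
# rough, illustrative assumption, not a measured figure.

def fits_in_memory(params: float, bytes_per_param: float,
                   gpu_mem_gb: float, headroom: float = 1.2) -> bool:
    needed_gb = params * bytes_per_param * headroom / 1e9
    return needed_gb <= gpu_mem_gb

# 13B in FP16: ~26 GB of weights -> fits comfortably in 48 GB.
print(fits_in_memory(13e9, 2, 48))   # True
# 70B in FP16: ~140 GB of weights -> needs larger-memory GPUs.
print(fits_in_memory(70e9, 2, 48))   # False
```

Quantizing to 8-bit or 4-bit weights (1 or 0.5 bytes per parameter) extends the range of models a single 48GB card can serve.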
Is reserved pricing available?
Yes. Reserved instances offer 20-40% discounts compared to on-demand pricing. L40S reserved instances are very cost-effective for steady inference workloads.
Deploy Your L40S Instance Today
Talk to our team about your inference or media workload. We'll help you choose the right GPU.