NVIDIA L40S · Ada Lovelace

Rent NVIDIA L40S in India — ₹55,000/month ($599)

48GB for 13B models and image generation — the entry node at our Mumbai location.

Per node, per month · Mumbai location · ₹55,000 ($599)/mo · 1-month minimum · ≈ ₹75 ($0.82)/hr effective · managed ops +₹15,000 ($179)/mo

Get an L40S quote or talk to a GPU engineer

L40S pricing

An NVIDIA L40S 48GB node costs ₹55,000 ($599) per month in Mumbai as of July 2026, about ₹75 ($0.82) per hour effective, on a 1-month minimum. It is the entry tier of our GPU line and among the lowest published monthly L40S rates in India. Managed ops adds ₹15,000 ($179) per node per month.

Config	Total VRAM	Per Node / Month	≈ Effective / hr
1× L40S 48GB	48GB	₹55,000 ($599)	≈ ₹75/hr (≈ $0.82/hr)
2× L40S	96GB	On request	—
4× L40S	192GB	On request	—

* Prices checked July 2026. Monthly commitment, 1-month minimum; no hourly product. The ≈ hourly figure is the monthly price ÷ 730 hours, for comparison only. INR prices attract 18% GST, claimable as input tax credit. Managed ops add-on: ₹15,000 ($179) per node/month. Node CPU, RAM, and NVMe are sized at scoping; multi-GPU pricing is confirmed at scoping (the L40S has no NVLink, so multi-card nodes communicate over PCIe Gen4).
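The footnote's arithmetic can be reproduced in a few lines. This is an illustrative sketch only: the function names are hypothetical, the 730-hour month and 18% GST rate come from the footnote, and it assumes GST applies to the node price and the managed ops add-on alike.

```python
HOURS_PER_MONTH = 730  # divisor stated in the pricing footnote
GST_PERCENT = 18       # GST on INR invoices, per the footnote


def effective_hourly(monthly_price: float) -> float:
    """Monthly price spread over a 730-hour month (comparison only)."""
    return monthly_price / HOURS_PER_MONTH


def monthly_invoice_inr(nodes: int, managed_ops: bool = False,
                        node_price: int = 55_000,
                        ops_price: int = 15_000) -> dict:
    """Pre-tax subtotal, GST, and total for single-GPU L40S nodes.

    Assumption: GST is charged on the add-on as well as the node.
    """
    per_node = node_price + (ops_price if managed_ops else 0)
    subtotal = nodes * per_node
    gst = subtotal * GST_PERCENT / 100
    return {"subtotal": subtotal, "gst": gst, "total": subtotal + gst}


print(round(effective_hourly(55_000), 2))  # 75.34
print(round(effective_hourly(599), 2))     # 0.82
print(monthly_invoice_inr(1, managed_ops=True))
```

Because the GST is claimable as input tax credit, the pre-tax subtotal is the figure to compare against other providers.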

Will your model fit on one L40S?

Weight sizes are given at the stated precision; the KV cache needs headroom on top. If you are unsure, a benchmark run settles it.

Model	Params	Precision	Fits on 1× L40S?	Notes
Mistral 7B	7B	FP16	Yes	~14GB weights; full KV headroom
Llama 3.1 8B	8B	FP16	Yes	~16GB weights
Llama 2 13B	13B	FP16	Yes	~26GB weights
SDXL	3.5B	FP16	Yes	Image gen with batch headroom
Llama 3.1 70B	70B	INT4 (AWQ/GPTQ)	Tight	~40GB weights; thin KV cache
Qwen2.5 32B	32B	FP16	No	~64GB weights; A100 or RTX PRO 6000
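The weight figures in the table follow from parameter count times bytes per parameter. The sketch below reproduces them as a rough first check; the helper names are hypothetical, 1GB is taken as 10^9 bytes, and the estimate covers raw weights only. For 70B at INT4 it gives 35GB, below the ~40GB listed above; the difference plausibly comes from quantization scales and layers kept at higher precision, which this estimate ignores.

```python
L40S_VRAM_GB = 48  # from the chip reference table

# Bytes stored per parameter at each precision (raw weights only).
BYTES_PER_PARAM = {"FP16": 2.0, "FP8": 1.0, "INT4": 0.5}


def weight_gb(params_billion: float, precision: str) -> float:
    """Raw weight size in GB: billions of params x bytes per param."""
    return params_billion * BYTES_PER_PARAM[precision]


def headroom_gb(params_billion: float, precision: str,
                vram_gb: float = L40S_VRAM_GB) -> float:
    """VRAM left for KV cache and activations; negative means no fit."""
    return vram_gb - weight_gb(params_billion, precision)


models = [
    ("Mistral 7B", 7, "FP16"),
    ("Llama 3.1 8B", 8, "FP16"),
    ("Llama 2 13B", 13, "FP16"),
    ("Llama 3.1 70B", 70, "INT4"),
    ("Qwen2.5 32B", 32, "FP16"),
]
for name, params, precision in models:
    print(f"{name} {precision}: ~{weight_gb(params, precision):.0f}GB weights, "
          f"{headroom_gb(params, precision):+.0f}GB headroom")
```

A negative headroom, as with Qwen2.5 32B at FP16, means the weights alone exceed the card's 48GB.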

L40S — or something else?

What is the L40S good for?

Production inference on 7B–13B models (Mistral 7B, Llama 3.1 8B, Llama 2 13B), SDXL image generation, and video workloads, backed by 733 TFLOPS of FP8 and 3× NVENC encoders. At under a third of the H100’s monthly rate (₹55,000 vs ₹1.8L), it is enough for most small-model production APIs.

L40S vs A100

The L40S costs ₹55,000 vs the A100’s ₹97,000 per month, a ₹42,000 saving, and its 733 TFLOPS of FP8 serve 7B–13B models at comparable throughput. The A100 answers back with 80GB of HBM2e at 2 TB/s and NVLink. Serve small models on the L40S; move up when the model needs more than 48GB or the job is training.

L40S vs RTX PRO 6000

The RTX PRO 6000 doubles the VRAM (96GB vs 48GB) and roughly doubles the bandwidth on a newer Blackwell die, at ₹1,10,000 vs ₹55,000 per month. Stay on the L40S while your models fit in 48GB; the step up buys 70B INT4 fit, 32B FP16, and heavier generation pipelines.

NVIDIA L40S 48GB — chip reference

Architecture	Ada Lovelace
VRAM	48GB GDDR6 with ECC
Memory bandwidth	864 GB/s
CUDA cores	18,176
Tensor cores	568 (4th gen)
FP32	91.6 TFLOPS
FP16 Tensor	362 TFLOPS
FP8 Tensor	733 TFLOPS
Video encoders	3× NVENC
Interconnect	PCIe Gen4 x16 (no NVLink)
TDP	350W

L40S hosting questions

How much does an L40S server cost per month in India?

₹55,000 ($599) per node per month at our Mumbai location, about ₹75/hr effective, as of July 2026. It is the entry node of our GPU line. Monthly commitment, 1-month minimum; no hourly product. Managed ops adds ₹15,000 ($179) per node per month. 18% GST applies on INR invoices, claimable as input tax credit.

L40S vs A100: which should I choose?

Choose the L40S for 7B–13B inference and image generation: it saves ₹42,000 per month and FP8 throughput keeps token rates competitive. Choose the A100 80GB at ₹97,000 for 70B INT4, Mixtral, MIG partitioning, or multi-GPU training over NVLink. The trial node settles borderline cases with your actual model.

What models run well on one L40S?

Mistral 7B, Llama 3.1 8B, and Llama 2 13B at FP16 with full KV headroom, plus SDXL image generation with batch room. Llama 3.1 70B INT4 fits at ~40GB but leaves only a thin KV cache; take an A100 or RTX PRO 6000 if 70B is the production target.
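To put a rough number on "thin KV cache", the sketch below estimates how many tokens of cache fit beside ~40GB of weights. It is an estimate under stated assumptions, not a measurement: it uses Llama 3.1 70B's published architecture (80 layers, 8 KV heads, head dimension 128), an FP16 cache, 1GB as 10^9 bytes, and hypothetical helper names. The `reserve_gb` parameter stands in for activations and runtime overhead, which vary by serving stack.

```python
def kv_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                       bytes_per_value: int = 2) -> int:
    """KV cache bytes per token: one key and one value vector per layer."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value


def max_cached_tokens(vram_gb: float, weights_gb: float,
                      per_token_bytes: int, reserve_gb: float = 0.0) -> int:
    """Tokens of KV cache that fit in the VRAM left after weights."""
    free_bytes = (vram_gb - weights_gb - reserve_gb) * 1e9
    return max(0, int(free_bytes // per_token_bytes))


# Assumed Llama 3.1 70B shape: 80 layers, 8 KV heads (GQA), head dim 128.
per_token = kv_bytes_per_token(layers=80, kv_heads=8, head_dim=128)
print(per_token)                                           # 327680
print(max_cached_tokens(48, 40, per_token))                # 24414
print(max_cached_tokens(48, 40, per_token, reserve_gb=2))  # 18310
```

That token budget is shared across every concurrent request, so a handful of long-context sessions can exhaust it; this is the practical meaning of "tight" for 70B on one card.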

Is the L40S good for video workloads?

Yes. It carries 3× NVENC hardware encoders for transcoding and streaming alongside 733 TFLOPS of FP8 for video AI models. One node handles inference and encoding together, which is why video pipelines are one of its most common deployments here.

How long does L40S provisioning take?

A single L40S node is ready in 2–3 business days, the fastest class in our lineup alongside the RTX PRO 6000. Multi-card configurations are confirmed during scoping. We share the exact lead time before you commit.

Is there an L40S trial before the monthly commitment?

Yes. We provision an L40S trial node so you can validate throughput and fit on the actual hardware before the 1-month commitment. Nearly every client evaluates on a trial node first. Request one with your model name and expected traffic.

Ready to run on an L40S?

Tell us your workload and an engineer replies with a firm monthly quote within one business day. Qualified teams can benchmark on the exact hardware before committing.

Get an L40S quote or request a benchmark node

Other classes: H100 ₹1.8L ($2,099) · H200 ₹2.5L ($2,799) · B200 ₹3.95L ($4,499) · A100 80GB ₹97K ($1,099) · RTX PRO 6000 ₹1.1L ($1,249) · L4 on request
