GPU VRAM Calculator for LLMs
A GPU VRAM calculator estimates the video RAM required to run a large language model locally. VRAM requirements depend on model size (parameters) and numerical precision (quantization).
1. VRAM (bytes) = Parameters × bytes per parameter
2. FP32: 4 bytes/param; FP16/BF16: 2 bytes; INT8: 1 byte; INT4: 0.5 bytes
3. Add ~20% overhead for activations and KV cache
4. Example: a 7B model at FP16 = 7B × 2 bytes = 14 GB minimum
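The steps above can be sketched in a few lines of Python. This is a minimal illustration of the formula, not a production tool; the function name and the 20% default overhead are taken from the rules listed here.

```python
# Bytes of weight memory per parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}

def estimate_vram_gb(params_billions: float, precision: str, overhead: float = 0.20) -> float:
    """Estimate VRAM in GB: weights plus ~20% for activations and KV cache."""
    weights_gb = params_billions * BYTES_PER_PARAM[precision.lower()]
    return weights_gb * (1 + overhead)

# 7B at FP16: 14 GB of weights, ~16.8 GB with overhead.
print(estimate_vram_gb(7, "fp16"))
```

Note that "billions of parameters × bytes per parameter" lands directly in GB because both use a factor of 10⁹.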
- 7B params at FP16 ≈ 14 GB VRAM minimum (RTX 3090 or better)
- 70B params at INT4 ≈ 35 GB VRAM (2× A100 40GB)
- 13B params at INT8 ≈ 13 GB VRAM (RTX 4090)
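Mapping an estimate to hardware, as the examples above do, amounts to dividing by a card's VRAM and rounding up. A small sketch, where the GPU list is an illustrative assumption covering only the cards mentioned here:

```python
import math

# Illustrative VRAM capacities for the GPUs named above (not exhaustive).
GPU_VRAM_GB = {"RTX 3090": 24, "RTX 4090": 24, "A100 40GB": 40}

def gpus_needed(model_vram_gb: float, gpu: str) -> int:
    """Number of GPUs of the given type needed to hold the estimated VRAM."""
    return math.ceil(model_vram_gb / GPU_VRAM_GB[gpu])

# 70B at INT4: 35 GB of weights + 20% overhead = 42 GB, which needs 2× A100 40GB.
print(gpus_needed(35 * 1.2, "A100 40GB"))
```

This is why 70B at INT4 needs two A100 40GB cards even though the weights alone (35 GB) would fit on one: the ~20% overhead pushes the total past 40 GB.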
| Model Size | FP16 VRAM | INT8 VRAM | INT4 VRAM |
|---|---|---|---|
| 7B | 14 GB | 7 GB | 3.5 GB |
| 13B | 26 GB | 13 GB | 6.5 GB |
| 30B | 60 GB | 30 GB | 15 GB |
| 70B | 140 GB | 70 GB | 35 GB |
| 140B | 280 GB | 140 GB | 70 GB |
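The table lists weight-only figures (no overhead applied), so every cell is just model size × bytes per parameter. A short sketch that regenerates the rows:

```python
# Weight-only bytes per parameter for the precisions in the table.
BYTES_PER_PARAM = {"FP16": 2.0, "INT8": 1.0, "INT4": 0.5}

def weight_vram_gb(params_billions: float) -> dict:
    """Weight memory in GB per precision, overhead excluded."""
    return {p: params_billions * b for p, b in BYTES_PER_PARAM.items()}

for billions in (7, 13, 30, 70, 140):
    cells = " | ".join(f"{gb:g} GB" for gb in weight_vram_gb(billions).values())
    print(f"| {billions}B | {cells} |")
```

Remember to add the ~20% overhead from the formula above before choosing hardware.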