🚀 Zero-Config LLM Hardware Infrastructure Guide

Run Any LLM Without
Out-Of-Memory Errors.

A precise VRAM estimator built for open-source large language models. Instantly match models with the cheapest cloud GPU instances worldwide.

🧮

VRAM Estimation

Formula-based estimates that account for quantization precision and KV cache size.
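
As a rough illustration of what such an estimate involves, here is a minimal sketch. It is not this site's actual formula: the function name, its parameters, and the 10% overhead factor are assumptions made for the example. It adds the memory for the quantized weights to the memory for the KV cache (one key tensor and one value tensor per layer), then pads the total for runtime overhead.

```python
def estimate_vram_gb(
    params_billion: float,   # total parameter count, in billions
    bits_per_weight: float,  # e.g. 16 for FP16, 8 or 4 for quantized weights
    n_layers: int,           # number of transformer layers
    n_kv_heads: int,         # number of key/value heads (fewer than query heads under GQA)
    head_dim: int,           # dimension of each attention head
    context_len: int,        # tokens held in the KV cache
    batch_size: int = 1,
    kv_bits: int = 16,       # precision of the cached keys and values
    overhead_fraction: float = 0.10,  # assumed padding for activations and runtime buffers
) -> float:
    """Return an approximate VRAM requirement in GB (1 GB = 1e9 bytes)."""
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    # Factor of 2: one key tensor plus one value tensor per layer.
    kv_bytes = (2 * n_layers * n_kv_heads * head_dim
                * context_len * batch_size * kv_bits / 8)
    return (weight_bytes + kv_bytes) * (1 + overhead_fraction) / 1e9


# Example: an 8B model at 4-bit weights with an 8K-token FP16 KV cache.
# The architecture values (32 layers, 8 KV heads, head_dim 128) are assumed
# here; check them against the model's published config before relying on them.
print(f"{estimate_vram_gb(8, 4, 32, 8, 128, 8192):.2f} GB")  # about 5.58 GB
```

For Mixture-of-Experts models, `params_billion` should be the total parameter count rather than the active count, since all experts normally have to be resident in memory.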

⚖️

GPU Instance Pricing

Compare cost per hour across top cloud providers like RunPod and Vast.ai instantly.
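
The comparison itself can be sketched as a filter followed by a minimum: keep the offers with enough VRAM, then take the lowest hourly price. The snippet below is only an illustration. `GpuOffer`, `cheapest_fit`, the provider and GPU names, and every price are hypothetical placeholders, not real quotes from any provider.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class GpuOffer:
    provider: str
    gpu: str
    vram_gb: float
    usd_per_hour: float


def cheapest_fit(offers: Iterable[GpuOffer], required_vram_gb: float) -> Optional[GpuOffer]:
    """Return the lowest-priced offer with enough VRAM, or None if nothing fits."""
    fitting = [o for o in offers if o.vram_gb >= required_vram_gb]
    return min(fitting, key=lambda o: o.usd_per_hour, default=None)


# Placeholder data for illustration only; these are not real prices.
EXAMPLE_OFFERS = [
    GpuOffer("example-cloud-a", "GPU-24GB", 24, 0.40),
    GpuOffer("example-cloud-b", "GPU-24GB", 24, 0.35),
    GpuOffer("example-cloud-a", "GPU-48GB", 48, 0.80),
    GpuOffer("example-cloud-b", "GPU-80GB", 80, 1.90),
]

best = cheapest_fit(EXAMPLE_OFFERS, required_vram_gb=20.0)
print(best)
```

A model that needs more VRAM than any single listed GPU provides returns `None` in this sketch; a fuller version would also consider multi-GPU instances.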

Pure Static Speed

Built with Next.js static site generation (SSG) for near-instant loading and strong Google Core Web Vitals scores.

Supported Large Language Models

Select an open-source model below to estimate optimal server specifications.

Codestral 22B
22B Params
Mistral AI's specialized open-weights code generation model supporting more than 80 programming languages.
Configure Hardware Setup
Command R+ (104B)
104B Params
Cohere's enterprise-grade model built explicitly for highly advanced Retrieval-Augmented Generation (RAG) tasks and automated agents.
Configure Hardware Setup
DeepSeek V4 Pro
671B Params
DeepSeek's 2026 flagship Mixture-of-Experts (MoE) titan. It introduces a massive 1-million-token context window with deep architectural optimizations, offering frontier-level intelligence at an unprecedented 75% price reduction.
Configure Hardware Setup
DeepSeek V4 Flash
284B Params
An efficiency-optimized Mixture-of-Experts (MoE) variant featuring 284B total and 13B active parameters, specifically designed for high throughput and ultra-low latency while maintaining a 1M context length.
Configure Hardware Setup
DeepSeek V3.2 (MoE)
685B Params
An upgraded iteration of DeepSeek's V3 line, optimizing mathematical execution and multi-turn enterprise workflows for peak efficiency.
Configure Hardware Setup
DeepSeek V3 (671B MoE)
671B Params
A Mixture-of-Experts (MoE) monster from DeepSeek with 671B total and 37B active parameters, delivering world-class intelligence with extreme architectural efficiency.
Configure Hardware Setup
DeepSeek R1 (Reasoning)
671B Params
DeepSeek's flagship reasoning model utilizing advanced reinforcement learning. Specializes in chain-of-thought processing for complex math and coding.
Configure Hardware Setup
Gemma 4 31B It
31B Params
Google's premier dense open model from the April 2026 generation. Released under the permissive Apache 2.0 license, it incorporates Gemini 3 architecture to deliver cross-modal reasoning and advanced agentic multi-step planning.
Configure Hardware Setup
Gemma 4 26B MoE
26B Params
Google's highly flexible Mixture-of-Experts model utilizing a sparse architecture to deliver 31B-class text and vision performance at a fraction of the inference cost.
Configure Hardware Setup
Gemma 4 4B It
4B Params
Google's breakthrough edge-tier model that natively processes text, vision, and real-time audio waveforms, designed to bring fully autonomous multimodal agents to local consumer hardware.
Configure Hardware Setup
Gemma 2 9B
9B Params
Google's lightweight star, packing punchy, forward-thinking architecture that outperforms many models twice its size.
Configure Hardware Setup
Gemma 2 27B
27B Params
Google's highly advanced mid-sized model utilizing an innovative sliding window attention mechanism for high-efficiency processing.
Configure Hardware Setup
GLM-5 (744B MoE)
744B Params
Zhipu AI's (Z.ai) massive open-weights titan designed for complex systems engineering. Features 40B active parameters per token, a 200K input pipeline, and unprecedented multi-step agentic execution scoring.
Configure Hardware Setup
GPT-OSS 120B
117B Params
OpenAI's historic, highly disruptive release into the open-weights community. Released under the Apache 2.0 license, it brings OpenAI's advanced reasoning behaviors straight to self-hosted cloud environments.
Configure Hardware Setup
InternLM 2.5 20B
20B Params
Shanghai AI Lab's flagship model, outstanding in long-context processing with perfect information retrieval up to 128K tokens.
Configure Hardware Setup
Kimi K2.5
1000B Params
Moonshot AI's 1-trillion parameter Mixture-of-Experts (MoE) powerhouse (32B active). It stands at the absolute frontier of open-weights models, specifically dominating complex front-end visual coding and multi-agent systems.
Configure Hardware Setup
Llama 4 Scout (109B MoE)
109B Params
Meta's next-generation sparse Mixture-of-Experts (MoE) model. Activating 17B parameters per token, it brings a massive 10-million-token context window alongside native multimodal text-and-image understanding.
Configure Hardware Setup
Llama 3.1 8B
8B Params
Meta's highly optimized lightweight model with an expanded 128K context window, perfect for edge deployment and efficient fine-tuning.
Configure Hardware Setup
Llama 3 70B
70B Params
Meta's flagship 70B model, renowned for elite-tier reasoning, coding, and general knowledge capabilities.
Configure Hardware Setup
MiniMax M2.5
229B Params
An open-weight productivity powerhouse utilizing a sparse MoE layout (10B active parameters). It excels in real-world software engineering workflows and long-form context synthesis up to 196K tokens.
Configure Hardware Setup
Mistral Small 4 (119B MoE)
119B Params
Mistral AI's highly versatile 'all-in-one' model that merges advanced reasoning, vision (Pixtral), and agentic coding into a single highly optimized sparse MoE framework.
Configure Hardware Setup
Mistral Nemo 12B
12B Params
Collaborative model built by Mistral AI and NVIDIA, featuring a 128K context window and top-tier reasoning in a compact size.
Configure Hardware Setup
Mistral Large 2 (123B)
123B Params
Mistral's leading frontier model designed for complex multilingual enterprise tasks, coding, and function orchestration.
Configure Hardware Setup
Nemotron Super 49B
49B Params
NVIDIA's specialized mid-tier model designed specifically to optimize guardrails, RAG alignment, and prompt engineering orchestration.
Configure Hardware Setup
Phi-4 Reasoning
14B Params
Microsoft's heavily requested compact reasoning engine, fine-tuned from the Phi-4 base to handle advanced math and scientific deduction without the resource overhead of larger models.
Configure Hardware Setup
Phi-3 Medium 14B
14B Params
Microsoft's heavy-hitting small language model, trained on carefully curated, high-quality synthetic data and web text.
Configure Hardware Setup
Qwen 3 235B (MoE)
235B Params
Alibaba's premier open-source sparse MoE model (22B active parameters). Released under the Apache 2.0 license, it sets the enterprise benchmark for multilingual processing, advanced tool use, and visual reasoning.
Configure Hardware Setup
Qwen 2.5 7B
7B Params
Alibaba's efficient model packed with advanced coding skills and industry-leading multilingual capabilities.
Configure Hardware Setup
Qwen 2.5 72B
72B Params
Alibaba's top-tier open-weights titan, matching closed-source models in mathematics, multi-turn coding, and multilingual knowledge.
Configure Hardware Setup
SmolLM2 1.7B
1.7B Params
Hugging Face's meticulously trained ultra-compact model, engineered from curated high-quality datasets to provide unmatched instruction-following logic on local edge hardware.
Configure Hardware Setup
Yi 1.5 34B
34B Params
01.AI's highly optimized dense model built upon the acclaimed Yi architecture, offering enhanced coding, math, and reasoning capabilities for enterprise applications.
Configure Hardware Setup
Yi-Large (MoE)
104B Params
01.AI's elite Mixture-of-Experts engine, delivering outstanding cross-disciplinary academic and conversational responses.
Configure Hardware Setup