About Meta AI
Meta AI develops the Llama family of open-weight large language models. Unlike most frontier AI labs, Meta publishes the Llama model weights, so researchers and developers can download, modify, and deploy them under the terms of Meta’s Llama license.
The Llama family has rapidly evolved from Llama 2 (2023) through Llama 4 (2025). Llama 4 introduced the Scout and Maverick variants, competing directly with proprietary models. Meta reports MMLU-Pro and GPQA Diamond as primary benchmarks.
Llama Model Timeline
Verified scores from official Meta pages.
Llama 4 Maverick
Meta’s most capable released open-weight model. Achieves 80.5% on MMLU-Pro and 69.8% on GPQA Diamond, competing with proprietary frontier models.
Llama 4 Scout
The efficient variant of Llama 4. Achieves 74.3% on MMLU-Pro and 57.2% on GPQA Diamond.
Llama 3.2 90B Vision
Part of Meta’s first multimodal Llama release, adding image understanding capabilities.
Llama 3.1 405B
Meta’s largest open-weight model at release. At 405 billion parameters, it showed that open models could compete with proprietary systems.
Llama 3 70B
A significant quality leap over Llama 2, with a new tokenizer and a larger training dataset.
Llama 2 70B
Meta’s first widely released open model. Helped establish the open-weight paradigm.
Benchmark Performance
Llama 4 Maverick scores across verified benchmark categories.
About This Data
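Scores are sourced from official Meta AI pages. Meta reports MMLU-Pro (not standard MMLU), which uses harder, more reasoning-focused questions with ten answer options instead of four, so absolute scores are lower and not directly comparable to standard MMLU results. Llama 4 Behemoth (82.2% MMLU-Pro, 73.7% GPQA Diamond) has been announced but not yet released.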
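For readers who want to compare the figures quoted on this page, the following is a minimal Python sketch. It only restates the scores given above; the names SCORES, gap, and format_table are hypothetical helpers invented for this illustration, not part of any Meta tooling.

```python
# Scores (percent) exactly as quoted on this page.
# Llama 4 Behemoth is announced but not yet released.
SCORES = {
    "Llama 4 Behemoth": {"MMLU-Pro": 82.2, "GPQA Diamond": 73.7},
    "Llama 4 Maverick": {"MMLU-Pro": 80.5, "GPQA Diamond": 69.8},
    "Llama 4 Scout": {"MMLU-Pro": 74.3, "GPQA Diamond": 57.2},
}


def gap(model_a, model_b, benchmark):
    """Return the percentage-point difference (model_a minus model_b)."""
    return round(SCORES[model_a][benchmark] - SCORES[model_b][benchmark], 1)


def format_table():
    """Return one line of text per model, listing its quoted scores."""
    lines = []
    for model, results in SCORES.items():
        cells = ", ".join(f"{name}: {value:.1f}%" for name, value in results.items())
        lines.append(f"{model} ({cells})")
    return lines


if __name__ == "__main__":
    for line in format_table():
        print(line)
    # Maverick leads Scout by 6.2 points on MMLU-Pro
    # and by 12.6 points on GPQA Diamond.
    print(gap("Llama 4 Maverick", "Llama 4 Scout", "MMLU-Pro"))
    print(gap("Llama 4 Maverick", "Llama 4 Scout", "GPQA Diamond"))
```

The differences are percentage points on the same benchmark; they should not be compared across benchmarks or against standard MMLU scores.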