#41
Llama 4 Scout
llama-4-scout-17b-16e-instruct-fp8
Helpfulness
0.0
Instruction Following
0.0
Comprehension
0.0
Empathy
0.0
Creative Writing
0.0
Speed
Avg 97 tok/s
Release Date
April 5, 2025
Lab
Meta
Type
Open Source
Context Size
128K
Max Output Tokens
8.2K
Cost per 1 million tokens
$0.18 / $0.60
Model Inputs*
Text, Images
Model Outputs*
Text
Tool Calling*
Enabled
Overall Assistant Score
An average score combining the 5 main categories.
75.08 pts
Rank #41
17th Percentile
Score scale: 0 Novice, 33 Capable, 66 Proficient, 100 Expert
Llama 4 Scout is the efficiency-focused sibling in the Llama 4 family, built on a Mixture-of-Experts architecture with 17B active parameters and 16 experts. It is optimized for single-GPU deployment, and its multimodal performance and instruction following are competitive with much larger models. While slightly less nuanced than the Maverick variant, Scout balances speed, cost, and reasoning ability, which makes it a good fit for high-throughput applications that need reliable text and image understanding.
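To make the cost and throughput figures concrete, here is a minimal sketch of how a per-request estimate could be computed from the listed numbers. It assumes the listed "$0.18 / $0.60" means input price / output price per 1 million tokens, and it treats the listed average of 97 tok/s as a constant output rate; both are assumptions rather than something the listing confirms. The function names are hypothetical.

```python
# Figures copied from the listing; how they are interpreted is an assumption.
INPUT_USD_PER_MTOK = 0.18    # assumed: first listed price applies to input tokens
OUTPUT_USD_PER_MTOK = 0.60   # assumed: second listed price applies to output tokens
AVG_OUTPUT_TOK_PER_S = 97    # listed average speed, treated here as constant


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD from token counts."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


def estimate_generation_seconds(output_tokens: int) -> float:
    """Estimate generation time, ignoring network latency and prompt processing."""
    return output_tokens / AVG_OUTPUT_TOK_PER_S


if __name__ == "__main__":
    # Hypothetical request: 10,000 input tokens and 1,000 output tokens.
    print(f"cost: ${estimate_cost_usd(10_000, 1_000):.4f}")
    print(f"time: {estimate_generation_seconds(1_000):.1f} s")
```

Real throughput varies with provider, load, and prompt length, so the time estimate is only a rough guide.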
Intelligence: Overall Score (higher is better)
Speed: Output Tokens per Second (higher is better)
Price: USD per 1M Tokens (lower is better)
Meta Models: Overall Score (same-provider comparison)
Llama Family: Overall Score (same-model-family comparison)
Closest Rivals: Overall Score (nearest by overall rank)