#34
Llama 4 Scout
llama-4-scout-17b-16e-instruct-fp8

Llama 4 Scout is the efficiency-focused model in the Llama 4 family. It uses a Mixture-of-Experts architecture with 16 experts and 17B active parameters per token, and is designed to run on a single GPU. It accepts text and image inputs and follows instructions well for its size, scoring 82 on Instruction Following here, though it is somewhat less nuanced than the Maverick variant. Its balance of speed, cost, and reasoning ability makes it a good fit for high-throughput applications that need reliable text and image understanding.

Performance Metrics

Average score: 76.40
Helpfulness: 76
Empathy: 75
Instruction Following: 82
Creative Writing: 70
Comprehension: 79
Speed: 97 tok/s (average)
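
As a rough illustration of what these figures imply, the sketch below checks that the 76.40 headline figure matches the unweighted mean of the five category scores, then converts the average speed into an approximate generation time. The names `mean_score` and `estimate_generation_seconds` are hypothetical helpers introduced for this example. The time estimate assumes throughput stays at the 97 tok/s average and ignores time to first token, network latency, and load-dependent variation.

```python
# Category scores as listed on this page.
SCORES = {
    "Helpfulness": 76,
    "Empathy": 75,
    "Instruction Following": 82,
    "Creative Writing": 70,
    "Comprehension": 79,
}

# Average speed reported on this page, in tokens per second.
AVG_TOKENS_PER_SECOND = 97.0


def mean_score(scores: dict[str, float]) -> float:
    """Return the unweighted mean of the category scores."""
    return sum(scores.values()) / len(scores)


def estimate_generation_seconds(
    output_tokens: int,
    tokens_per_second: float = AVG_TOKENS_PER_SECOND,
) -> float:
    """Approximate decode time, assuming constant average throughput."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    print(f"Mean of category scores: {mean_score(SCORES):.2f}")
    print(f"970 output tokens: ~{estimate_generation_seconds(970):.1f} s")
```

Treat the result as a lower bound on wall-clock latency, since real requests also pay for prompt processing and queueing.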

Model Specifications

Release Date: April 5, 2025
Lab: Meta
Type: Open Source
Context Size: 128K
Max Output Tokens: 8.2K
Cost per 1M tokens: $0.18 / $0.60
Model Inputs: Text, Images
Model Outputs: Text
Tool Calling: Enabled
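
The per-token pricing can be turned into a per-request estimate with a few lines of arithmetic. The sketch below assumes the two prices are listed in the common input / output order, that is, $0.18 per 1M input tokens and $0.60 per 1M output tokens; the page itself does not label them, so verify this against the provider's pricing before relying on it. The function name `estimate_cost_usd` is hypothetical.

```python
# ASSUMPTION: the listed "$0.18 / $0.60" is read as input / output pricing.
INPUT_USD_PER_MILLION = 0.18
OUTPUT_USD_PER_MILLION = 0.60


def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price: float = INPUT_USD_PER_MILLION,
    output_price: float = OUTPUT_USD_PER_MILLION,
) -> float:
    """Estimate the cost of one request in US dollars."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Example: a 4,000-token prompt with a 1,000-token reply.
    cost = estimate_cost_usd(input_tokens=4_000, output_tokens=1_000)
    print(f"Estimated cost per request: ${cost:.6f}")
```

Under this reading, output tokens cost a little over three times as much as input tokens, so long generations dominate the bill for workloads with short prompts.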
