#37
Gemini 2.5 Flash-Lite
gemini-2.5-flash-lite

Gemini 2.5 Flash-Lite is the fastest and most cost-efficient model in the 2.5 family, optimized for high-throughput and latency-sensitive tasks. With lower latency than both 2.0 Flash-Lite and 2.0 Flash, it excels at classification, translation, and simple QA at massive scale. Despite its speed focus, it retains thinking, function calling, and search grounding capabilities.
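A minimal usage sketch with the google-genai Python SDK, assuming an API key is set in the environment; the prompt, thinking budget, and Google Search grounding tool are illustrative choices, not requirements of the model:

```python
from google import genai
from google.genai import types

# Reads GEMINI_API_KEY (or GOOGLE_API_KEY) from the environment.
client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash-lite",
    contents="Classify the sentiment of: 'The battery life is amazing.'",
    config=types.GenerateContentConfig(
        # Thinking is optional on Flash-Lite; a budget of 0 keeps it off for lowest latency.
        thinking_config=types.ThinkingConfig(thinking_budget=0),
        # Optional search grounding, one of the capabilities listed above.
        tools=[types.Tool(google_search=types.GoogleSearch())],
    ),
)
print(response.text)
```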

Performance Metrics

Average
73.40
Helpfulness
78
Instruction Following
80
Comprehension
79
Empathy
68
Creative Writing
62
Speed
Avg 97 tok/s

Model Specifications

Release Date
June 2025
Lab
Google
Type
Proprietary
Context Size
1M
Max Output Tokens
65.5K
Cost per 1M tokens (input / output)
$0.10 / $0.40 (see the worked cost example below)
Model Inputs
Text, Audio, Images, Video, PDFs
Model Outputs
Text
Tool Calling
Enabled
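
Since tool calling is enabled, here is a hedged sketch of function calling through the SDK's automatic function-calling path; get_order_status is a hypothetical stub used only for illustration:

```python
from google import genai
from google.genai import types

def get_order_status(order_id: str) -> dict:
    """Hypothetical lookup the model can invoke as a tool (stubbed for illustration)."""
    return {"order_id": order_id, "status": "shipped"}

client = genai.Client()  # reads GEMINI_API_KEY from the environment

response = client.models.generate_content(
    model="gemini-2.5-flash-lite",
    contents="Where is order A1234?",
    config=types.GenerateContentConfig(
        # Passing a Python callable lets the SDK declare the function and run the call loop.
        tools=[get_order_status],
    ),
)
print(response.text)
```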

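As a worked example of the listed pricing, a small helper that applies the $0.10 input / $0.40 output per-1M-token rates; the token counts are illustrative:

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_rate: float = 0.10, output_rate: float = 0.40) -> float:
    """Estimate request cost from per-1M-token rates."""
    return input_tokens / 1_000_000 * input_rate + output_tokens / 1_000_000 * output_rate

# A 10,000-token prompt with 2,000 output tokens:
# 10,000/1M * $0.10 + 2,000/1M * $0.40 = $0.0010 + $0.0008 = $0.0018
print(f"${estimate_cost_usd(10_000, 2_000):.4f}")  # -> $0.0018
```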