Name: o3
Author: openai

#10

o3-2025-04-16

Helpfulness

0.0

Instruction Following

0.0

Comprehension

0.0

Empathy

0.0

Creative Writing

0.0

Speed

Avg 58 tok/s

Release Date

April 16, 2025

Lab

OpenAI

Type

Proprietary

Context Size

200K

Max Output Tokens

100K

Cost per 1 million tokens

$2.00 input / $8.00 output
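
As a rough illustration of how the per-million-token prices translate into a per-request cost, here is a minimal sketch. It assumes the first listed figure is the input price and the second is the output price. The constant names and the `estimate_cost_usd` helper are hypothetical and not part of any provider API.

```python
# Hypothetical helper for illustration only.
# Assumption: the listed "$2.00 / $8.00" means input price / output price per 1M tokens.
INPUT_USD_PER_MILLION = 2.00
OUTPUT_USD_PER_MILLION = 8.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 100,000-token prompt that yields a 20,000-token response.
    print(f"${estimate_cost_usd(100_000, 20_000):.2f}")
```

Under these assumptions, output tokens cost four times as much as input tokens, so long responses can account for much of the bill even when the prompt is larger.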

Model Inputs*

Text, Images, Code

Model Outputs*

Text

Tool Calling*

Enabled

Overall Assistant Score

An average score combining the 5 main categories.

89.16 pts

Rank #10

82nd Percentile

Score scale (to 100): Novice, Capable, Proficient, Expert

OpenAI's o3 is a large-scale reasoning model released in April 2025, built to excel at math, science, coding, and other complex analytical tasks. It reaches state-of-the-art results on benchmarks like AIME and GPQA Diamond while maintaining strong general helpfulness and reliable instruction following compared to GPT-4-class models. Although slower and more compute-intensive than fast chat models, o3 offers significantly deeper comprehension and more rigorous problem solving, making it ideal for high-stakes analysis, research workflows, and difficult multi-step reasoning problems.

Intelligence

Overall Score; Higher is better

Models shown: Claude Sonn…, GPT-5.4, Gemini 3 Pro, Claude Sonn…, GPT 5.1, GPT 5.2, GLM 5, GPT 5, Grok 4.1 Fa…

Speed

Output Tokens per Second; Higher is better

Models shown: Grok 4.1 Fa…, Gemini 2.5 …, Claude Opus…, Gemini 3 Pro, Claude Opus…, DeepSeek V3…, Kimi K2 Thi…

Price

USD per 1M Tokens; Lower is better

Models shown: GPT 5.1, GPT 5, GPT 5.2, Gemini 3.1 …, Gemini 3 Pro, Gemini 2.5 …, GPT-5.4, GPT 4o, Claude Sonn…

OpenAI Models

Overall Score; Same provider comparison

Models shown: GPT-5.4, GPT 5.1, GPT 5.2, GPT 5, ChatGPT 4o, GPT-5 Mini, o4-mini, o3-mini

O-Series Family

Overall Score; Same model family comparison

Models shown: o4-mini, o3-mini

Closest Rivals

Overall Score; Nearest by overall rank

Models shown: Claude Sonn…, GPT-5.4, Gemini 3 Pro, Claude Sonn…, GPT 5.1, GPT 5.2, GLM 5, GPT 5, Grok 4.1 Fa…