OpenAI
GPT-5.4-nano
Status: provisional
95% confidence interval: 990.313 to 1054.994.
Current status
Why this row is not ranked yet
samples_total<60, samples_holdout<20, conservative_signal_not_separated_adjacent
vs GPT-OSS-120B: 21-9-0 over 30 raw games; BT edge 0.700 (0.521 to 0.833)
vs GPT-OSS-20B: 15-9-0 over 24 raw games; BT edge 0.625 (0.427 to 0.788)
vs GPT-5.4-mini: 4-6-0 over 10 raw games; BT edge 0.400 (0.168 to 0.687)
vs Claude Sonnet 4.6: 3-3-0 over 6 raw games; BT edge 0.500 (0.188 to 0.812)
vs GPT-5.2 (medium): 1-1-0 over 2 raw games; BT edge 0.500 (0.095 to 0.905)
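The page does not say how each BT edge and its interval are computed. The listed values happen to match the raw win fraction with a 95% Wilson score interval, so the sketch below reproduces them under that assumption. The function name `wilson_interval`, the `records` dictionary, and the choice to ignore the third number in each record (zero in every row here) are illustrative assumptions. This is not the leaderboard's own code, and a full Bradley-Terry fit across all opponents would be a different calculation.

```python
import math


def wilson_interval(wins, games, z=1.96):
    """Return (low, high) Wilson score bounds for a win fraction.

    Assumption: z=1.96 corresponds to a 95% interval.
    """
    if games <= 0:
        raise ValueError("games must be positive")
    p = wins / games
    denom = 1 + z * z / games
    center = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return center - half, center + half


# Head-to-head records from the card, as (wins, losses).
# The third number in each record is 0 in every row and is left out (assumption).
records = {
    "GPT-OSS-120B": (21, 9),
    "GPT-OSS-20B": (15, 9),
    "GPT-5.4-mini": (4, 6),
    "Claude Sonnet 4.6": (3, 3),
    "GPT-5.2 (medium)": (1, 1),
}

for opponent, (wins, losses) in records.items():
    games = wins + losses
    low, high = wilson_interval(wins, games)
    print(f"vs {opponent}: {wins / games:.3f} ({low:.3f} to {high:.3f})")
```

With only 2 raw games against GPT-5.2 (medium), the interval spans almost the whole range from 0 to 1. That is in keeping with the row being held as provisional until more samples arrive.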