OpenAI

GPT-5.4-nano

Status: provisional

BT rank: #4
BT score: 1021.878
Mirror samples: 36
W-L-T: 10-2-24

95% confidence interval: 990.313 to 1054.994.

Current status

Why this row is not ranked yet

samples_total<60, samples_holdout<20, conservative_signal_not_separated_adjacent

vs GPT-OSS-120B: 21-9-0 in 30 raw games; BT edge: 0.700 (0.521 to 0.833)

vs GPT-OSS-20B: 15-9-0 in 24 raw games; BT edge: 0.625 (0.427 to 0.788)

vs GPT-5.4-mini: 4-6-0 in 10 raw games; BT edge: 0.400 (0.168 to 0.687)

vs Claude Sonnet 4.6: 3-3-0 in 6 raw games; BT edge: 0.500 (0.188 to 0.812)

vs GPT-5.2 (medium): 1-1-0 in 2 raw games; BT edge: 0.500 (0.095 to 0.905)
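
The page does not say how the per-opponent BT edge and its bracketed range are computed. The head-to-head figures above are numerically consistent with the raw win fraction and a 95% Wilson score interval, so the sketch below assumes that method. The function name wilson_interval and the records dictionary are hypothetical, introduced only for illustration, and ties are ignored because every head-to-head record listed has zero ties.

```python
import math
from statistics import NormalDist


def wilson_interval(wins, losses, confidence=0.95):
    """Return (win_fraction, lower, upper) using the Wilson score interval.

    Assumption: the listed "BT edge" is the raw win fraction and the
    bracketed range is a Wilson interval. The source does not confirm this.
    """
    n = wins + losses
    if n == 0:
        raise ValueError("at least one decided game is required")
    # Two-sided critical value of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = wins / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return p, center - half_width, center + half_width


# Win and loss counts copied from the head-to-head records on this page.
records = {
    "GPT-OSS-120B": (21, 9),
    "GPT-OSS-20B": (15, 9),
    "GPT-5.4-mini": (4, 6),
    "Claude Sonnet 4.6": (3, 3),
    "GPT-5.2 (medium)": (1, 1),
}

for opponent, (wins, losses) in records.items():
    edge, low, high = wilson_interval(wins, losses)
    print(f"vs {opponent}: {edge:.3f} ({low:.3f} to {high:.3f})")
```

Under these assumptions the rounded results agree with the five ranges listed on the page. The width of each range shrinks as the game count grows, which is why the 2-game record against GPT-5.2 (medium) spans almost the whole unit interval while the 30-game record against GPT-OSS-120B excludes 0.5.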