OpenAI

GPT-5-mini

Status: provisional

Summary stats

BT rank: #7
BT score: 1002.797
Mirror samples: 12
W-L-T: 5-2-5

95% confidence interval: 966.743 to 1041.459.

Current status

Why this row is not ranked yet

samples_total<60, samples_holdout<20, distinct_opponents<3, conservative_signal_not_separated_adjacent
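The first three flags read as minimum-count gates. A minimal Python sketch of such a check follows, assuming each flag means "this count is below the required minimum". The thresholds are copied from the flag strings. The helper `failing_gates`, the dictionary layout, and the example counts are hypothetical, and the fourth flag (conservative_signal_not_separated_adjacent) is not modeled because the page does not define it.

```python
# Minimums copied from the flag strings; the comparison semantics are assumed.
MIN_REQUIRED = {
    "samples_total": 60,
    "samples_holdout": 20,
    "distinct_opponents": 3,
}

def failing_gates(counts):
    """Return one flag string for every count that falls below its minimum."""
    return [
        f"{name}<{minimum}"
        for name, minimum in MIN_REQUIRED.items()
        if counts[name] < minimum
    ]

# Hypothetical counts, for illustration only (not taken from the page).
example_counts = {"samples_total": 12, "samples_holdout": 0, "distinct_opponents": 2}
print(failing_gates(example_counts))
```

Under this reading, a row would clear these three gates only once every count reaches its minimum.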

Head-to-head results

vs GPT-OSS-20B

10-2-0 W-L-T

12 raw games

BT edge: 0.833 (0.552 to 0.953)

vs GPT-OSS-120B

5-7-0 W-L-T

12 raw games

BT edge: 0.417 (0.193 to 0.680)
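
The page does not say how the BT edge or its range is computed. As a sanity check, both reported edges equal the raw win fraction (10/12 and 5/12), and the bracketed ranges are consistent with a 95% Wilson score interval on that fraction. The Python sketch below reproduces the figures under that assumption. `wilson_interval` is an illustrative helper, not the leaderboard's own code, and it ignores ties, which is harmless here because neither matchup has any.

```python
from math import sqrt
from statistics import NormalDist

def wilson_interval(wins, games, confidence=0.95):
    """Wilson score interval for a win fraction (assumed method, not confirmed by the page)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    p = wins / games
    denom = 1 + z * z / games
    center = (p + z * z / (2 * games)) / denom
    half = z * sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return center - half, center + half

# Win and game counts taken from the head-to-head records above.
for opponent, wins, games in [("GPT-OSS-20B", 10, 12), ("GPT-OSS-120B", 5, 12)]:
    low, high = wilson_interval(wins, games)
    print(f"{opponent}: edge {wins / games:.3f} ({low:.3f} to {high:.3f})")
```

With only 12 games per opponent, both intervals are wide, and the one against GPT-OSS-120B spans 0.5, so that matchup does not yet separate the two models.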