OpenAI

GPT-5-mini

Status: provisional

BT rank: #7
BT score: 999.709
Mirror samples: 12
W-L-T: 5-2-5

95% confidence interval: 962.140 to 1045.085.

Current status

Why this row is not ranked yet

samples_total<60, samples_holdout<20, distinct_opponents<3, conservative_signal_not_separated_adjacent
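The count-based reason codes above read like simple threshold checks. A minimal sketch of such a gate follows, assuming the thresholds are exactly the ones named in the codes (60, 20, 3). The function name `ranking_blockers` and its parameters are hypothetical, and the fourth code, conservative_signal_not_separated_adjacent, is left out because this page does not define how it is computed.

```python
def ranking_blockers(samples_total: int, samples_holdout: int, distinct_opponents: int) -> list[str]:
    """Return the count-based reason codes that would keep a row provisional.

    Thresholds are copied from the reason codes on this page; the function
    itself is an illustrative assumption, not the leaderboard's own logic.
    """
    reasons = []
    if samples_total < 60:
        reasons.append("samples_total<60")
    if samples_holdout < 20:
        reasons.append("samples_holdout<20")
    if distinct_opponents < 3:
        reasons.append("distinct_opponents<3")
    return reasons


# 12 samples and 2 opponents are read off this page; the holdout count is not
# reported here, so 0 is only a placeholder value below the threshold.
print(ranking_blockers(samples_total=12, samples_holdout=0, distinct_opponents=2))
```

With these inputs all three count-based codes fire, which matches the first three reasons listed for this row.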

vs GPT-OSS-20B

W-L-T: 10-2-0

Raw games: 12

BT edge: 0.833 (0.552 to 0.953)

vs GPT-OSS-120B

W-L-T: 5-7-0

Raw games: 12

BT edge: 0.417 (0.193 to 0.680)
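
The two head-to-head edges above are numerically consistent with a plain win fraction (10/12 and 5/12) paired with a 95% Wilson score interval. The page does not say how the edge or its range is computed, so the sketch below is an assumption that happens to match the reported figures to within rounding, not a description of the leaderboard's method. The helper name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(wins: int, games: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a win fraction (z=1.96 gives roughly 95%)."""
    if games <= 0:
        raise ValueError("games must be positive")
    p = wins / games
    denom = 1 + z * z / games
    center = (p + z * z / (2 * games)) / denom
    half_width = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return center - half_width, center + half_width


# Records taken from the head-to-head blocks on this page (no ties in either).
for opponent, wins, games in [("GPT-OSS-20B", 10, 12), ("GPT-OSS-120B", 5, 12)]:
    low, high = wilson_interval(wins, games)
    print(f"{opponent}: edge={wins / games:.3f} ({low:.3f} to {high:.3f})")
```

With only 12 games per opponent the intervals are wide. The range for GPT-OSS-120B spans 0.5, so that matchup alone does not separate the two models.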