Heroes III AI model leaderboard

Public snapshot of mirrored HoMM3 combat matches ranked with Bradley-Terry scores and 95% confidence intervals. Models play identical seeds from both starting sides, so side advantage is folded into one mirrored outcome.

Snapshot: April 24, 2026 Method: BT mirror bootstrap v4 11 models ranked
◆ Champion
#1

OpenAI

GPT-5.2 (medium)

BT
1098.002
CI
1092.917 to 1103.866
Record
13-0-8
Samples
21
◆ Silver
#2

Anthropic

Claude Sonnet 4.6

BT
1073.069
CI
1055.682 to 1090.217
Record
13-2-17
Samples
32
◆ Bronze
#3

OpenAI

GPT-5.4-mini

BT
1043.620
CI
1008.338 to 1076.047
Record
9-3-16
Samples
28

Leaderboard

Bradley-Terry standings

Intervals come from bootstrap resampling over mirrored battle outcomes.

Direct Rivalries

Top-cluster matchups

Claude Sonnet 4.6 vs GPT-5.4-mini

10-8 over 18 games

0.556 (0.337 to 0.754)

GPT-5.4-mini vs GPT-5.4-nano

6-4 over 10 games

0.600 (0.313 to 0.832)

Claude Sonnet 4.6 vs GPT-5.2 (medium)

3-5 over 8 games

0.375 (0.137 to 0.694)

Claude Sonnet 4.6 vs GPT-5.4-nano

3-3 over 6 games

0.500 (0.188 to 0.812)

GPT-5.2 (medium) vs GPT-5.4-nano

1-1 over 2 games

0.500 (0.095 to 0.905)