EnigmaEval

Puzzle Solving

Last updated: April 3, 2025

Performance Comparison

Rank  Model                                     Score
1                                               13.09±1.92
1                                               11.91±1.85
2                                               9.21±1.65
3                                               6.81±0.83
4                                               6.14±1.37
4     o1 (December 2024)                        5.65±1.32
4                                               5.57±1.31
4                                               5.57±1.31
5                                               4.23±1.17
9                                               4.14±1.16
9                                               3.21±1.00
9                                               3.18±1.02
9                                               3.12±0.99
9                                               2.70±0.92
9                                               2.36±0.87
9                                               2.26±0.86
10                                              2.20±0.84
10                                              2.17±0.84
15    Gemini 2.0 Flash Thinking (January 2025)  1.10±0.60
16    Claude 3.5 Sonnet (October 2024)          0.91±0.55
17    Pixtral Large (November 2024)             0.84±0.53
19    Claude 3 Opus                             0.82±0.45
19    GPT-4o (November 2024)                    0.80±0.44
19                                              0.69±0.48
19                                              0.63±0.45
19                                              0.58±0.43
19    Llama 3.2 90B Vision Instruct             0.38±0.35