openbench includes math benchmarks ranging from multilingual grade school problems (MGSM) to competition-level mathematics from contests such as AIME and HMMT.

Available Benchmarks

AIME 2023

American Invitational Mathematics Examination 2023 - challenging high school competition problems.
bench eval aime_2023_I
bench eval aime_2023_II

AIME 2024

American Invitational Mathematics Examination 2024.
bench eval aime_2024
bench eval aime_2024_I
bench eval aime_2024_II

AIME 2025

American Invitational Mathematics Examination 2025.
bench eval aime_2025
bench eval aime_2025_II

HMMT Feb 2023

Harvard-MIT Mathematics Tournament February 2023.
bench eval hmmt_feb_2023

HMMT Feb 2024

Harvard-MIT Mathematics Tournament February 2024.
bench eval hmmt_feb_2024

HMMT Feb 2025

Harvard-MIT Mathematics Tournament February 2025.
bench eval hmmt_feb_2025

BRUMO 2025

Brown University Math Olympiad (BrUMO) 2025.
bench eval brumo_2025

MATH

Competition-level math problems covering algebra, geometry, number theory, and more.
bench eval math

MATH-500

A challenging 500-problem subset of the MATH dataset.
bench eval math_500

MGSM

Multilingual Grade School Math - elementary math problems in multiple languages.
bench eval mgsm

MGSM English

English-only version of Multilingual Grade School Math.
bench eval mgsm_en

MGSM Latin Scripts

MGSM covering 5 languages using Latin scripts.
bench eval mgsm_latin

MGSM Non-Latin Scripts

MGSM covering 6 languages using non-Latin scripts.
bench eval mgsm_non_latin

OTIS Mock AIME 2024

Mock AIME problems from OTIS 2024.
bench eval otis_mock_aime_2024

OTIS Mock AIME 2025

Mock AIME problems from OTIS 2025.
bench eval otis_mock_aime_2025
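
Running Several Benchmarks

To run several of the benchmarks above back to back, a small wrapper script can loop over their names. The sketch below is an illustration, not part of openbench: the `run_suite` function and the `DRY_RUN` variable are hypothetical names, and the benchmark list is just a sample of the identifiers listed on this page. It assumes `bench` is on your PATH and already configured with whatever model and credentials you normally use.

```shell
#!/bin/sh
# Sample of benchmark identifiers taken from the list above; edit as needed.
benchmarks="aime_2024 hmmt_feb_2024 math_500 mgsm_en"

# DRY_RUN defaults to 1, so the script only prints the commands it would run.
# Set DRY_RUN=0 to actually invoke the bench CLI.
DRY_RUN="${DRY_RUN:-1}"

# run_suite is a hypothetical helper, not an openbench command.
run_suite() {
  for name in $benchmarks; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "bench eval $name"
    else
      # Keep going if one benchmark fails, but report it on stderr.
      bench eval "$name" || echo "failed: $name" >&2
    fi
  done
}

run_suite
```

With the default dry run, the script prints one `bench eval` line per benchmark so you can check the list before starting a long evaluation.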