
Prerequisites

System Requirements

  • Python: 3.10 or higher
  • Operating System: macOS, Linux, or Windows
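
The Python requirement above can be checked before installing. The following is a minimal sketch, not part of openbench: the helper name version_ok is hypothetical, and it assumes the interpreter is reachable as python3 (on Windows it is often python or py instead).

```shell
# Hypothetical helper (not part of openbench): succeeds when a
# "major.minor[.patch]" version string is 3.10 or newer.
version_ok() {
  major=${1%%.*}   # text before the first dot
  rest=${1#*.}     # text after the first dot
  minor=${rest%%.*}
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 10 ]; }
}

# Assumption: the interpreter is on PATH as python3; fall back to "0.0" if it is missing.
py_version=$(python3 -c 'import platform; print(platform.python_version())' 2>/dev/null || echo "0.0")

if version_ok "$py_version"; then
  echo "Python $py_version meets the 3.10+ requirement"
else
  echo "Python 3.10 or higher is required (found: $py_version)" >&2
fi
```

Comparing major and minor as integers avoids the string-comparison trap where "3.9" sorts after "3.10".
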
We recommend using uv for Python package management. It’s significantly faster than pip and handles virtual environments automatically. On macOS or Linux, install it with:
curl -LsSf https://astral.sh/uv/install.sh | sh

Install openbench

Quick Install with uv

# Create a virtual environment and install openbench
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
uv pip install openbench

Alternative: Install with pip

# Create a virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install openbench
pip install openbench
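
After either install path, a quick check can confirm that the command-line tool is reachable from the active environment. This is a rough sketch that assumes the package provides the bench command referenced under Optional Plugins; the helper name have is hypothetical.

```shell
# Hypothetical helper (not part of openbench): true when a command is on PATH.
have() {
  command -v "$1" >/dev/null 2>&1
}

if have bench; then
  # `bench list` is the listing command referenced under Optional Plugins.
  bench list
else
  echo "bench not found; check that the virtual environment is activated" >&2
fi
```

If the command is missing, the usual cause is a shell where .venv has not been activated.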

Install from Source

For development or to get the latest features:
# Clone the repository
git clone https://github.com/groq/openbench.git
cd openbench

# Set up with uv (recommended)
uv venv && uv sync --dev
source .venv/bin/activate

# Or set up with pip
python -m venv .venv
source .venv/bin/activate
pip install -e .

Special Installation (Optional)

For safely running code execution benchmarks in a sandboxed environment:

Install Docker

Download and install Docker Desktop for Mac (on Linux or Windows, install the Docker build for your platform).

Verify Docker

docker --version
docker run hello-world
Code generation benchmarks execute model-generated code. We suggest running them in a Docker sandbox for safety (e.g. humaneval --model openai/gpt-4o --sandbox docker).
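
Before starting a sandboxed run, it can help to confirm that the Docker daemon is actually reachable, not just that the CLI is installed. The sketch below is one way to do that; the helper name docker_ready is hypothetical, and it relies on docker info exiting with a non-zero status when the daemon cannot be contacted.

```shell
# Hypothetical helper (not part of openbench): true only when the docker CLI
# exists on PATH and the daemon answers a `docker info` request.
docker_ready() {
  command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1
}

if docker_ready; then
  echo "Docker is available for sandboxed runs"
else
  echo "Docker is not installed or the daemon is not running" >&2
fi
```

Unlike docker --version, which only queries the client, this check fails when Docker Desktop or the daemon has not been started.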

Optional Plugins

Some benchmark suites ship as standalone plugins so they can iterate independently from the core distribution. Install them alongside openbench with uv pip and they will automatically appear in bench list via the plugin entry point system.
  • openbench-cyber: adds the CTI-Bench family plus CyBench (agentic CTF challenges). This plugin ships real exploit code and forensics artifacts that routinely trigger anti-malware scanners, so we require a deliberate, manual install after you read the security guidance.
    • Install explicitly: uv pip install "openbench-cyber @ git+https://github.com/groq/openbench-cyber.git@d93522ba70392cdceddb83f762c78a68923e70da"
    • Review the plugin README for sandbox requirements and risk acknowledgements before using it.