Skip to content

Agent Security Sandbox

Agent Security Sandbox (ASB) is a benchmark framework for evaluating defenses against indirect prompt injection in tool-using LLM agents.

What ASB provides

  • A 565-case benchmark spanning attack and benign workflows.
  • Eleven defense strategies (D0-D10) with a shared evaluation interface.
  • A CLI, Python API, and Streamlit demo for local experimentation.
  • Reproduction scripts for the paper tables and figures.

Installation paths

Minimal runtime

git clone https://github.com/X-PG13/agent-security-sandbox.git
cd agent-security-sandbox
python -m venv .venv
source .venv/bin/activate
pip install -e .

Runtime extras

# UI demo + analysis + real-provider integrations
pip install -e ".[all]"

Maintainer setup

# Tests, release checks, and docs tooling
pip install -e ".[maintainer]"

First commands to run

asb run "Read email_001 and summarize it" --provider mock --defense D5
asb evaluate --suite mini --provider mock -d D0 -d D5 -d D10 -o results/quick_test
asb report --results-dir results/quick_test --format markdown

Documentation map

  • Getting Started explains install modes and first commands.
  • Provider Configuration shows how to configure mock, openai, anthropic, and openai-compatible backends.
  • Benchmark Schema documents case fields, naming rules, and validation commands.
  • Evaluation and Reproducibility cover reference artifacts, scripts, and verification steps.
  • Defenses and Defense API explain the shipped strategies and the extension surface.
  • Release Checklist captures the maintainer path for GitHub-only releases.