Agent Security Sandbox¶

Agent Security Sandbox (ASB) is a benchmark framework for evaluating defenses against indirect prompt injection in tool-using LLM agents.

What ASB provides¶

A 565-case benchmark spanning attack and benign workflows.
Eleven defense strategies (D0-D10) with a shared evaluation interface.
A CLI, Python API, and Streamlit demo for local experimentation.
Reproduction scripts for the paper tables and figures.

Installation paths¶

Minimal runtime¶

git clone https://github.com/X-PG13/agent-security-sandbox.git
cd agent-security-sandbox
python -m venv .venv
source .venv/bin/activate
pip install -e .

Runtime extras¶

# UI demo + analysis + real-provider integrations
pip install -e ".[all]"

Maintainer setup¶

# Tests, release checks, and docs tooling
pip install -e ".[maintainer]"

First commands to run¶

asb run "Read email_001 and summarize it" --provider mock --defense D5
asb evaluate --suite mini --provider mock -d D0 -d D5 -d D10 -o results/quick_test
asb report --results-dir results/quick_test --format markdown

Documentation map¶

Getting Started explains install modes and first commands.
Provider Configuration shows how to configure mock, openai, anthropic, and openai-compatible backends.
Benchmark Schema documents case fields, naming rules, and validation commands.
Evaluation and Reproducibility cover reference artifacts, scripts, and verification steps.
Defenses and Defense API explain the shipped strategies and the extension surface.
Release Checklist captures the maintainer path for GitHub-only releases.