9 changes: 5 additions & 4 deletions README.md
@@ -2,9 +2,9 @@

A curated, implementation-first list of **agent harness engineering** resources, with GitHub projects as the primary focus.

-- Total entries: **268**
-- GitHub entries: **241 (89.9%)**
-- GitHub in project categories (excluding readings): **236/236 (100.0%)**
+- Total entries: **269**
+- GitHub entries: **242 (90.0%)**
+- GitHub in project categories (excluding readings): **237/237 (100.0%)**
- Categories: **9**
- Last verified: **2026-06-05**
- Language: [English](./README.md) | [中文](./README_zh.md)
@@ -48,7 +48,7 @@ A curated, implementation-first list of **agent harness engineering** resources,
| --- | ---: |
| Harness Architecture & Orchestration | 44 |
| Context & Working-State Engineering | 16 |
-| Execution Substrates & Sandboxing | 25 |
+| Execution Substrates & Sandboxing | 26 |
| Protocols, Tool Interfaces & Agent Contracts | 23 |
| Evaluation Harnesses & Benchmarks | 27 |
| Observability & Reliability Operations | 14 |
@@ -165,6 +165,7 @@ Notes:
| agentbox | [GitHub](https://github.com/mattolson/agent-sandbox) | [![star](https://img.shields.io/badge/star-176-f4b400?style=flat-square)](https://github.com/mattolson/agent-sandbox) | sandbox, coding-agents, network-policy | Locked-down local sandbox for AI coding agents with scoped filesystem access, egress policy, secret injection, firewalling, and persistent agent state. |
| HexAgent | [GitHub](https://github.com/UnicomAI/hexagent) | [![star](https://img.shields.io/badge/star-125-f4b400?style=flat-square)](https://github.com/UnicomAI/hexagent) | computer-layer, sandbox, runtime | Agent harness that separates the runtime from the computer it operates on through local, VM, and cloud sandbox backends. |
| terminal-bench-env | [GitHub](https://github.com/ucsb-mlsec/terminal-bench-env) | [![star](https://img.shields.io/badge/star-83-f4b400?style=flat-square)](https://github.com/ucsb-mlsec/terminal-bench-env) | terminal, benchmark-env, sandbox | Environment layer for terminal-agent benchmark execution. |
+| AgentBox (madarco) | [GitHub](https://github.com/madarco/agentbox) | [![star](https://img.shields.io/badge/star-41-f4b400?style=flat-square)](https://github.com/madarco/agentbox) | sandbox, parallel, claude-code | CLI that runs coding agents in parallel, each teleported into its own sandboxed box (local Docker or cloud VMs) with checkpoints and a per-box browser/VS Code. |

<a id="protocols-tool-interfaces-agent-contracts"></a>
### Protocols, Tool Interfaces & Agent Contracts
9 changes: 5 additions & 4 deletions README_zh.md
@@ -2,9 +2,9 @@

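An engineering-practice list for **Agent Harness Engineering**, prioritizing GitHub projects that can be put to use directly.

-- Current entry count: **268**
-- GitHub entries: **241 (89.9%)**
-- GitHub share in project categories (excluding readings): **236/236 (100.0%)**
+- Current entry count: **269**
+- GitHub entries: **242 (90.0%)**
+- GitHub share in project categories (excluding readings): **237/237 (100.0%)**
- Categories: **9**
- Last verified: **2026-06-05**
- Language: [English](./README.md) | [中文](./README_zh.md)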
@@ -48,7 +48,7 @@
| --- | ---: |
| Harness Architecture & Orchestration | 44 |
| Context & Working-State Engineering | 16 |
-| Execution Substrates & Sandboxing | 25 |
+| Execution Substrates & Sandboxing | 26 |
| Protocols, Tool Interfaces & Agent Contracts | 23 |
| Evaluation Harnesses & Benchmarks | 27 |
| Observability & Reliability Operations | 14 |
@@ -165,6 +165,7 @@
| agentbox | [GitHub](https://github.com/mattolson/agent-sandbox) | [![star](https://img.shields.io/badge/star-176-f4b400?style=flat-square)](https://github.com/mattolson/agent-sandbox) | sandbox, coding-agents, network-policy | Locked-down local sandbox for AI coding agents, providing scoped file access, egress policy, secret injection, firewalling, and persistent agent state. |
| HexAgent | [GitHub](https://github.com/UnicomAI/hexagent) | [![star](https://img.shields.io/badge/star-125-f4b400?style=flat-square)](https://github.com/UnicomAI/hexagent) | computer-layer, sandbox, runtime | Agent harness that separates the agent runtime from the computer it operates on, with support for local, VM, and cloud sandbox backends. |
| terminal-bench-env | [GitHub](https://github.com/ucsb-mlsec/terminal-bench-env) | [![star](https://img.shields.io/badge/star-83-f4b400?style=flat-square)](https://github.com/ucsb-mlsec/terminal-bench-env) | terminal, benchmark-env, sandbox | Provides the execution environment layer for terminal-agent benchmarks. |
+| AgentBox (madarco) | [GitHub](https://github.com/madarco/agentbox) | [![star](https://img.shields.io/badge/star-41-f4b400?style=flat-square)](https://github.com/madarco/agentbox) | sandbox, parallel, claude-code | CLI that runs multiple coding agents in parallel, isolated sandboxes (local Docker or cloud VMs), with support for checkpoints and a built-in browser/VS Code in each box. |

<a id="protocols-tool-interfaces-agent-contracts"></a>
### Protocols, Tool Interfaces & Agent Contracts
13 changes: 13 additions & 0 deletions data/projects.yaml
@@ -3725,3 +3725,16 @@ entries:
  updated_at: '2025-01-01'
  license: n/a
  why_included: High-signal practitioner framing for harness-first implementation.
+- name: AgentBox (madarco)
+  repo_url: https://github.com/madarco/agentbox
+  category: Execution Substrates & Sandboxing
+  summary_en: CLI that runs coding agents in parallel, each teleported into its own sandboxed box (local Docker or cloud VMs) with checkpoints and a per-box browser/VS Code.
+  summary_zh: 在并行的独立沙箱(本地 Docker 或云端虚拟机)中运行多个编码代理的 CLI,支持检查点以及每个盒子内置的浏览器/VS Code。
+  tags:
+  - sandbox
+  - parallel
+  - claude-code
+  stars_snapshot: 41
+  updated_at: '2026-06-06'
+  license: MIT
+  why_included: Self-hostable execution substrate that teleports a project into an isolated box per agent (local Docker FUSE overlay or Hetzner/Daytona/Vercel/E2B), with sub-1s checkpoint starts and host-held git credentials; works with Claude Code, Codex, and OpenCode.
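Reviewer note: the repository's own validation script is not part of this diff, so the following is only a minimal sketch of how an entry shaped like the one above might be sanity-checked before merging. The function name `validate_entry`, the `REQUIRED_FIELDS` tuple, and the rule that every field is mandatory are assumptions made for illustration; only the field names and values come from the added entry.

```python
from datetime import date

# Field names copied from the added entry; treating all of them as
# required is an assumption, not something the diff states.
REQUIRED_FIELDS = (
    "name", "repo_url", "category", "summary_en", "summary_zh",
    "tags", "stars_snapshot", "updated_at", "license", "why_included",
)

# Category names as they appear in the README category table.
KNOWN_CATEGORIES = {
    "Harness Architecture & Orchestration",
    "Context & Working-State Engineering",
    "Execution Substrates & Sandboxing",
    "Protocols, Tool Interfaces & Agent Contracts",
    "Evaluation Harnesses & Benchmarks",
    "Observability & Reliability Operations",
    "Guardrails, Security & Governance",
    "Reference Harness Implementations",
    "Essential Readings & Ecosystem Maps",
}


def validate_entry(entry: dict, known_categories: set) -> list:
    """Return a list of problems; an empty list means the entry passed."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in entry]
    if problems:
        return problems
    if entry["category"] not in known_categories:
        problems.append(f"unknown category: {entry['category']}")
    if not str(entry["repo_url"]).startswith("https://"):
        problems.append("repo_url is not an https URL")
    stars = entry["stars_snapshot"]
    if not isinstance(stars, int) or stars < 0:
        problems.append("stars_snapshot must be a non-negative integer")
    tags = entry["tags"]
    if not tags or not all(isinstance(t, str) for t in tags):
        problems.append("tags must be a non-empty list of strings")
    try:
        date.fromisoformat(entry["updated_at"])
    except (TypeError, ValueError):
        problems.append("updated_at is not an ISO date (YYYY-MM-DD)")
    return problems


# The entry added in this diff, written as a plain dict.
AGENTBOX = {
    "name": "AgentBox (madarco)",
    "repo_url": "https://github.com/madarco/agentbox",
    "category": "Execution Substrates & Sandboxing",
    "summary_en": "CLI that runs coding agents in parallel, each teleported "
                  "into its own sandboxed box (local Docker or cloud VMs) "
                  "with checkpoints and a per-box browser/VS Code.",
    "summary_zh": "在并行的独立沙箱(本地 Docker 或云端虚拟机)中运行多个编码代理的 CLI,"
                  "支持检查点以及每个盒子内置的浏览器/VS Code。",
    "tags": ["sandbox", "parallel", "claude-code"],
    "stars_snapshot": 41,
    "updated_at": "2026-06-06",
    "license": "MIT",
    "why_included": "Self-hostable execution substrate that teleports a "
                    "project into an isolated box per agent.",
}

if __name__ == "__main__":
    print(validate_entry(AGENTBOX, KNOWN_CATEGORIES))
```

Returning a list of problems rather than raising on the first one lets a reviewer see every issue with an entry in a single pass.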
67 changes: 67 additions & 0 deletions reports/verification/2026-06-06.md
@@ -0,0 +1,67 @@
# Verification Report

- Generated at: `2026-06-06T08:50:17.872197+00:00`
- Total entries: `269`
- GitHub entries: `242` (90.0%)
- GitHub in project categories (excluding `Essential Readings & Ecosystem Maps`): `237/237` (100.0%)
- Categories: `9`
- URL checks: `270` total, `270` reachable, `0` broken

## Category Counts

| Category | Entries |
| --- | ---: |
| Harness Architecture & Orchestration | 44 |
| Context & Working-State Engineering | 16 |
| Execution Substrates & Sandboxing | 26 |
| Protocols, Tool Interfaces & Agent Contracts | 23 |
| Evaluation Harnesses & Benchmarks | 27 |
| Observability & Reliability Operations | 14 |
| Guardrails, Security & Governance | 19 |
| Reference Harness Implementations | 68 |
| Essential Readings & Ecosystem Maps | 32 |
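
The headline figures in this report can be cross-checked against the category table with a few lines of arithmetic. The sketch below is not the report generator (which this diff does not show); the `summarize` helper and its return shape are hypothetical, and the only inputs are the counts from the table plus the reported number of GitHub entries.

```python
# Counts copied from the "Category Counts" table in this report.
CATEGORY_COUNTS = {
    "Harness Architecture & Orchestration": 44,
    "Context & Working-State Engineering": 16,
    "Execution Substrates & Sandboxing": 26,
    "Protocols, Tool Interfaces & Agent Contracts": 23,
    "Evaluation Harnesses & Benchmarks": 27,
    "Observability & Reliability Operations": 14,
    "Guardrails, Security & Governance": 19,
    "Reference Harness Implementations": 68,
    "Essential Readings & Ecosystem Maps": 32,
}
READINGS = "Essential Readings & Ecosystem Maps"


def summarize(counts: dict, github_entries: int) -> tuple:
    """Return (total entries, project-category entries, GitHub share in %)."""
    total = sum(counts.values())
    project_total = total - counts[READINGS]  # readings are excluded
    github_share = round(100 * github_entries / total, 1)
    return total, project_total, github_share


if __name__ == "__main__":
    total, project_total, share = summarize(CATEGORY_COUNTS, github_entries=242)
    print(f"total={total} project={project_total} github_share={share}%")
```

With the table above, the categories sum to 269, the eight project categories sum to 237, and 242 of 269 rounds to 90.0%, which matches the summary at the top of the report.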

## Structural Errors

- None

## Warnings

- None

## Broken URLs

- None

## Reachable URL Sample

- `HEAD 200` https://blog.langchain.com/agent-frameworks-runtimes-and-harnesses-oh-my/
- `HEAD 200` https://blog.langchain.com/evaluating-deep-agents-our-learnings/
- `HEAD 200` https://blog.langchain.com/improving-deep-agents-with-harness-engineering/
- `HEAD 200` https://blog.langchain.com/the-anatomy-of-an-agent-harness/
- `HEAD 200` https://claude.com/blog/building-agents-with-the-claude-agent-sdk
- `HEAD 200` https://cognition.ai/blog/what-we-learned-building-cloud-agents
- `HEAD 200` https://developers.openai.com/blog/eval-skills
- `HEAD 200` https://github.com/1jehuang/jcode
- `HEAD 200` https://github.com/21st-dev/1code
- `HEAD 200` https://github.com/2FastLabs/agent-squad
- `HEAD 200` https://github.com/AVIDS2/memorix
- `HEAD 200` https://github.com/AgentOps-AI/agentops
- `HEAD 200` https://github.com/Aider-AI/aider
- `HEAD 200` https://github.com/AndyMik90/Aperant
- `HEAD 200` https://github.com/Arize-ai/openinference
- `HEAD 200` https://github.com/Arize-ai/phoenix
- `HEAD 200` https://github.com/Atmosphere/atmosphere
- `HEAD 200` https://github.com/BerriAI/litellm
- `HEAD 200` https://github.com/BloopAI/vibe-kanban
- `HEAD 200` https://github.com/Chorus-AIDLC/Chorus
- `HEAD 200` https://github.com/ChromeDevTools/chrome-devtools-mcp
- `HEAD 200` https://github.com/ComposioHQ/agent-orchestrator
- `HEAD 200` https://github.com/DevAgentForge/Open-Claude-Cowork
- `HEAD 200` https://github.com/EleutherAI/lm-evaluation-harness
- `HEAD 200` https://github.com/EveryInc/compound-engineering-plugin
- `HEAD 200` https://github.com/FoundationAgents/OpenManus
- `HEAD 200` https://github.com/Git-on-my-level/codex-autorunner
- `HEAD 200` https://github.com/GoogleCloudPlatform/scion
- `HEAD 200` https://github.com/HKUDS/CLI-Anything
- `HEAD 200` https://github.com/HKUDS/OpenHarness
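
For readers who want to reproduce a line of the sample above, here is a minimal sketch of a `HEAD` reachability probe using only the Python standard library. How the report generator actually performs its checks is not shown in this diff; the `check_url` name, the 10-second timeout, and the injectable `opener` parameter (used so the function can be exercised without network access) are all assumptions.

```python
import urllib.error
import urllib.request


def check_url(url: str, timeout: float = 10.0, opener=urllib.request.urlopen) -> str:
    """Send a HEAD request and return a label such as 'HEAD 200'.

    `opener` defaults to urllib.request.urlopen; passing a stand-in makes
    the function testable offline (an illustrative design choice).
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return f"HEAD {response.status}"
    except urllib.error.HTTPError as err:
        # Raised for 4xx/5xx responses; must be caught before URLError,
        # because HTTPError is a subclass of URLError.
        return f"HEAD {err.code}"
    except (urllib.error.URLError, TimeoutError) as err:
        # DNS failures, refused connections, timeouts, and similar.
        return f"HEAD failed: {err}"


if __name__ == "__main__":
    # Performs a real network request when run directly.
    print(check_url("https://github.com/Aider-AI/aider"))
```

Some servers reject or mishandle `HEAD`, so a production checker would probably fall back to `GET` on failure; that fallback is omitted here to keep the sketch short.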