Safe Agent Harness Design for Claude Code and Codex: Permissions, Checks, and Rollback
Build a practical agent harness for Claude Code and Codex with policy, planning, verification, and recovery layers.
Stronger agents need stronger rails
When you first use Claude Code or Codex, it is tempting to think the main skill is writing better prompts. That works for small edits. It does not hold up once the agent starts touching deployment, SaaS APIs, files, publishing scripts, or outbound email.
At that point, the real problem is not the prompt. The real problem is the harness around the agent.
In this article, an Agent Harness means the external structure that lets an AI agent work safely: permission rules, execution plans, verification checks, logs, and rollback paths. The agent can still reason and write changes, but it does not get unlimited freedom to run every command it wants.
The practical goal is simple:
- let the agent do useful work
- block obviously dangerous actions
- require approval for irreversible actions
- verify the result before calling it done
- keep enough logs to recover when something fails
If you want the concept first, start with the harness engineering guide. For concrete failure patterns, read Claude Code security failure cases. For long-running project context, pair this with Claude Code and Obsidian integration.
This is useful whether you prefer Claude Code for local project work or Codex for delegated coding tasks and review workflows. The interface differs, but the operational shape is the same.
The four layers of an Agent Harness
A useful harness does not need to be a large framework. Start with four layers.
User request
|
v
Agent
|
v
[1] Policy layer        What is allowed, asked, or denied
[2] Plan layer          What steps will run, in what order
[3] Verification layer  What proves the result works
[4] Recovery layer      How to revert or repair failures
|
v
Files / shell / SaaS APIs / deploy
In practice, most agent failures in production trace back to one of these layers being missing.
| Failure mode | What happens | Missing layer |
|---|---|---|
| Permissions are too broad | The agent reads secrets or runs destructive commands | Policy |
| The workflow is vague | The article is written but translations or deploy are forgotten | Plan |
| Success is guessed | Build passes but the live URL is stale | Verification |
| Recovery is unclear | Nobody knows which files to revert | Recovery |
Claude Code’s official settings documentation describes scoped settings, permission rules, hooks, and project configuration. Start with the Claude Code settings, hooks reference, and MCP configuration pages. For Codex, the same mental model applies through task instructions, sandboxing, approvals, branches, and CI.
Example 1: a content publishing harness
For a content site, “publish one article” is not one action. It is a workflow.
Input: publish one article today
1. Read analytics for the last 7 days
2. Pick a topic adjacent to a high-performing cluster
3. Check existing articles to avoid duplication
4. Write the source article
5. Create all required locales
6. Validate frontmatter, slug, and internal links
7. Build the site
8. Check the live URL
9. Commit and push
If the plan is not explicit, the agent will often complete the visible part and miss the boring release work. The harness makes the boring work unavoidable.
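One way to make the boring work unavoidable is to let the harness, not the agent, decide when a run is finished. The sketch below is only an illustration: the step names and the `findMissingSteps` and `isRunComplete` helpers are hypothetical and are not part of Claude Code or Codex.

```javascript
// Hypothetical step names mirroring the nine-step publishing workflow above.
const REQUIRED_STEPS = [
  "analytics",
  "pick-topic",
  "dedupe-check",
  "write-source",
  "create-locales",
  "validate-content",
  "build",
  "check-live-url",
  "commit-push"
];

// Returns the required steps that were never reported, in workflow order.
function findMissingSteps(completedSteps, requiredSteps = REQUIRED_STEPS) {
  const done = new Set(completedSteps);
  return requiredSteps.filter((step) => !done.has(step));
}

// A run counts as finished only when no required step is missing.
function isRunComplete(completedSteps, requiredSteps = REQUIRED_STEPS) {
  return findMissingSteps(completedSteps, requiredSteps).length === 0;
}

// Example: the agent wrote and built the article but skipped the release work.
const reported = ["analytics", "pick-topic", "dedupe-check", "write-source", "build"];
console.log(findMissingSteps(reported));
```

The harness would refuse to report success while the returned list is non-empty, no matter what the agent says about its own progress.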
Example 2: a SaaS integration harness
SaaS integrations need even stricter boundaries. Imagine an agent that researches companies and drafts outreach emails.
1. Read public websites
2. Store company name, website, and email
3. Generate a sample landing page
4. Draft an email
5. Send the email
Only the first four should be automatic. Sending email affects another person and cannot be undone. It should require approval.
| Operation | Auto? | Reason |
|---|---|---|
| Read public pages | Yes | Low impact |
| Save a local CSV | Yes | Reviewable |
| Generate a sample page | Yes | Reviewable before publishing |
| Draft an email | Yes | Not sent yet |
| Send email | Approval required | External, irreversible action |
This is the core of harness design: separate reading, drafting, writing, publishing, and sending.
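The separation can be written down as a small lookup. This is a minimal sketch under assumptions: the operation kinds (`read`, `draft`, `write`, `publish`, `send`) and the `decideOperation` helper are illustrative names, and mapping local writes to "auto" follows the table above rather than any tool's built-in behavior.

```javascript
// Hypothetical operation kinds, ordered roughly from least to most impact.
const DECISION_BY_KIND = {
  read: "auto",        // read public pages
  draft: "auto",       // draft an email, nothing is sent
  write: "auto",       // save a local, reviewable file
  publish: "approval", // visible to other people
  send: "approval"     // external and irreversible
};

// Unknown kinds fall back to approval so a new operation is never
// silently treated as automatic.
function decideOperation(kind) {
  return Object.hasOwn(DECISION_BY_KIND, kind)
    ? DECISION_BY_KIND[kind]
    : "approval";
}

console.log(decideOperation("draft"));
console.log(decideOperation("send"));
```

Defaulting unknown kinds to approval is a deliberate choice here: the table grows over time, and a missing entry should cost a confirmation prompt, not an unreviewed side effect.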
Policy layer: start with allow, ask, and deny
A simple policy file is enough for many teams.
{
  "allowCommands": [
    "npm run build",
    "npm run test",
    "node scripts/analytics-report.mjs",
    "node scripts/content-trend-report.mjs"
  ],
  "askCommands": [
    "git push",
    "wrangler pages deploy",
    "node scripts/outreach-send-mails.mjs --send"
  ],
  "denyCommands": [
    "rm -rf",
    "git reset --hard",
    "curl * | sh",
    "npm publish"
  ],
  "protectedPaths": [
    ".env",
    ".env.local",
    "claudecode-lab-sheets-f54fc47c68f0.json"
  ]
}
For Claude Code, project settings can express a similar boundary.
{
  "$schema": "https://json.schemastore.org/claude-code-settings.json",
  "permissions": {
    "allow": [
      "Bash(npm run build)",
      "Bash(npm run test *)",
      "Bash(node scripts/content-trend-report.mjs *)"
    ],
    "ask": [
      "Bash(git push *)",
      "Bash(wrangler pages deploy *)"
    ],
    "deny": [
      "Bash(rm -rf *)",
      "Bash(git reset --hard *)",
      "Read(./.env)",
      "Read(./.env.*)",
      "Read(./claudecode-lab-sheets-f54fc47c68f0.json)"
    ]
  }
}
The important part is specificity. Do not write “be careful with secrets” and hope the model remembers. Put secret files, destructive commands, production deploys, and outbound send actions behind explicit rules.
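If you maintain your own policy file, a protected-path check could look roughly like the sketch below. The `normalizePath` and `isProtectedPath` helpers are hypothetical, and the matching rule (exact name, or the same name in any directory) is an assumption; it does not resolve symlinks, and it does not replace the tool's own permission rules.

```javascript
// Normalize separators and drop leading "./" so equivalent spellings compare equal.
function normalizePath(p) {
  return String(p).replace(/\\/g, "/").replace(/^(\.\/)+/, "");
}

// True when the candidate is a protected file, either at the project root
// or with the same file name inside any directory (including "../").
// Assumption: entries are plain file names or relative paths, not globs.
function isProtectedPath(candidate, protectedPaths) {
  const target = normalizePath(candidate);
  return protectedPaths.some((entry) => {
    const rule = normalizePath(entry);
    return target === rule || target.endsWith("/" + rule);
  });
}

const protectedPaths = [".env", ".env.local"];
console.log(isProtectedPath("./.env", protectedPaths));
console.log(isProtectedPath("src/env.js", protectedPaths));
```

Note that a file such as `.env.production` is not covered unless it is listed, which is one reason to keep the list explicit and reviewed.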
Plan layer: require a machine-readable plan
Before running work, ask the agent to produce a plan that can be inspected.
{
  "goal": "Publish one trend-informed article",
  "steps": [
    {
      "name": "analytics",
      "command": "node scripts/content-trend-report.mjs --days 7",
      "risk": "safe"
    },
    {
      "name": "write",
      "paths": ["site/src/content/blog/*.mdx"],
      "risk": "safe"
    },
    {
      "name": "build",
      "command": "npm run build",
      "risk": "safe"
    },
    {
      "name": "deploy",
      "command": "wrangler pages deploy dist --project-name claudecode-lab",
      "risk": "ask"
    }
  ]
}
This is not bureaucracy. It gives the harness a chance to block a risky operation before it happens.
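A plan is only inspectable if something actually inspects it. The following is a minimal sketch of such a review step, assuming the plan shape shown above (`goal`, `steps`, `name`, `command` or `paths`, `risk`). The `reviewPlan` function and the rule that `risk` must be either "safe" or "ask" are assumptions made for illustration.

```javascript
// Assumed risk vocabulary, taken from the example plan above.
const KNOWN_RISKS = new Set(["safe", "ask"]);

// Returns structural errors plus the names of steps that need human approval.
function reviewPlan(plan) {
  const errors = [];
  const needsApproval = [];

  if (typeof plan?.goal !== "string" || plan.goal.trim() === "") {
    errors.push("plan.goal must be a non-empty string");
  }

  const steps = Array.isArray(plan?.steps) ? plan.steps : [];
  if (steps.length === 0) {
    errors.push("plan.steps must be a non-empty array");
  }

  steps.forEach((step, index) => {
    const label = typeof step?.name === "string" ? step.name : `step ${index}`;
    if (typeof step?.name !== "string") {
      errors.push(`${label}: missing name`);
    }
    if (!KNOWN_RISKS.has(step?.risk)) {
      errors.push(`${label}: unknown risk "${step?.risk}"`);
    }
    // Every step must say what it touches: a command or a list of paths.
    if (!step?.command && !Array.isArray(step?.paths)) {
      errors.push(`${label}: needs a command or a paths list`);
    }
    if (step?.risk === "ask") {
      needsApproval.push(label);
    }
  });

  return { errors, needsApproval };
}

const review = reviewPlan({
  goal: "Publish one trend-informed article",
  steps: [
    { name: "build", command: "npm run build", risk: "safe" },
    { name: "deploy", command: "wrangler pages deploy dist", risk: "ask" }
  ]
});
console.log(review);
```

A harness built this way would stop on any error and pause for confirmation on each step listed in `needsApproval` before running anything.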
Verification layer: define success as commands
“Check it carefully” is not a verification strategy. Use commands.
// scripts/verify-published-page.mjs
// Fetches a published page and fails if required markers are missing.
const url = process.argv[2];
if (!url) {
  throw new Error("Usage: node scripts/verify-published-page.mjs <url>");
}

const response = await fetch(url, { redirect: "follow" });
if (!response.ok) {
  throw new Error(`Page returned ${response.status}: ${url}`);
}

const html = await response.text();
// Each check is a [name, pattern] pair; every pattern must match the HTML.
const checks = [
  ["title", /<title>.+<\/title>/i],
  ["description", /<meta name="description"/i],
  ["adsense", /ca-pub-2125588229998303/i],
  ["analytics", /G-3YR0LE68MJ/i]
];

for (const [name, pattern] of checks) {
  if (!pattern.test(html)) {
    throw new Error(`Missing ${name} on ${url}`);
  }
}

// An uncaught error above exits with a non-zero status, so CI can rely on it.
console.log(`OK: ${url}`);
For mobile layout, use Playwright.
// scripts/check-mobile-code-blocks.mjs
import { chromium } from "playwright";

const url = process.argv[2];
if (!url) {
  throw new Error("Usage: node scripts/check-mobile-code-blocks.mjs <url>");
}

const browser = await chromium.launch();
let overflowing;
try {
  // Phone-sized viewport (390x844).
  const page = await browser.newPage({ viewport: { width: 390, height: 844 } });
  await page.goto(url, { waitUntil: "networkidle" });
  // Collect elements whose content is wider than their box (4px tolerance).
  overflowing = await page.evaluate(() => {
    return [...document.querySelectorAll("pre, code, table")]
      .filter((el) => el.scrollWidth > el.clientWidth + 4)
      .map((el) => el.textContent?.slice(0, 80));
  });
} finally {
  // Close the browser even if navigation or evaluation fails.
  await browser.close();
}

if (overflowing.length > 0) {
  console.error(JSON.stringify(overflowing, null, 2));
  process.exit(1);
}
console.log("OK: no mobile overflow");
Recovery layer: log what changed
The dangerous failure is not “the run failed.” The dangerous failure is “the run failed and nobody knows what it changed.”
Keep a simple run log.
{
  "runId": "2026-05-19-article-001",
  "topic": "agent harness security",
  "changedFiles": [
    "site/src/content/blog/claude-code-codex-agent-harness-security.mdx",
    "site/src/content/blog-en/claude-code-codex-agent-harness-security.mdx"
  ],
  "commands": [
    "node scripts/content-trend-report.mjs --days 7",
    "npm run build",
    "wrangler pages deploy dist --project-name claudecode-lab"
  ],
  "status": "deployed"
}
For Git-based workflows, prefer targeted recovery over broad resets.
git status --short
git diff -- site/src/content/blog/target-article.mdx
git revert <bad-commit>
Avoid giving agents routine access to git reset --hard. It can erase unrelated work in a dirty repository.
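A run log becomes more useful when it can be turned directly into targeted review commands. The sketch below assumes a log entry with a `changedFiles` array, as in the example above. The `shellQuote` and `reviewCommandsFor` helpers are hypothetical, and the generated commands are read-only `git diff` calls that a person runs and inspects, not an automatic revert.

```javascript
// Wrap a value in single quotes so it can be pasted into a POSIX shell safely.
function shellQuote(value) {
  return `'${String(value).replace(/'/g, `'\\''`)}'`;
}

// Build one read-only "git diff" command per file recorded in the run log.
function reviewCommandsFor(runLog) {
  const files = Array.isArray(runLog?.changedFiles) ? runLog.changedFiles : [];
  return files.map((file) => `git diff -- ${shellQuote(file)}`);
}

const runLog = {
  runId: "2026-05-19-article-001",
  changedFiles: [
    "site/src/content/blog/claude-code-codex-agent-harness-security.mdx"
  ]
};
for (const command of reviewCommandsFor(runLog)) {
  console.log(command);
}
```

Because the output touches only the files the run recorded, unrelated work in a dirty repository stays out of the recovery path.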
Minimal Node.js command harness
The smallest useful harness is a command classifier.
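// scripts/agent-command-policy.mjs
// Rules are checked in order: deny first, then allow, then ask.
const policy = {
  allow: [
    /^node scripts\/content-trend-report\.mjs( .*)?$/,
    /^npm run build$/,
    /^npm run test$/
  ],
  ask: [
    /^git push( .*)?$/,
    /^wrangler pages deploy( .*)?$/
  ],
  deny: [
    /rm -rf/,
    /git reset --hard/,
    /curl .* \| sh/,
    /npm publish/,
    // The runner executes through a shell, so also block chaining and
    // substitution; otherwise "allowed-cmd; anything" matches an allow rule.
    /[;&|`\n]|\$\(/
  ]
};

export function classifyCommand(command) {
  if (policy.deny.some((rule) => rule.test(command))) return "deny";
  if (policy.allow.some((rule) => rule.test(command))) return "allow";
  if (policy.ask.some((rule) => rule.test(command))) return "ask";
  // Unknown commands default to "ask" rather than "allow".
  return "ask";
}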
// scripts/run-agent-command.mjs
import { spawn } from "node:child_process";
import { classifyCommand } from "./agent-command-policy.mjs";

const command = process.argv.slice(2).join(" ");
// Validate the input before classifying it.
if (!command) {
  throw new Error("Usage: node scripts/run-agent-command.mjs <command>");
}

const decision = classifyCommand(command);
if (decision === "deny") {
  throw new Error(`Denied by policy: ${command}`);
}
// "ask" commands run only when a human has set AGENT_APPROVED=1.
if (decision === "ask" && process.env.AGENT_APPROVED !== "1") {
  console.error(`Approval required: ${command}`);
  process.exit(2);
}

const child = spawn(command, {
  shell: true,
  stdio: "inherit"
});
// Propagate the child's exit code; a missing code (killed by signal) counts as failure.
child.on("exit", (code) => {
  process.exit(code ?? 1);
});
This wrapper is intentionally boring. That is the point. Good harness code is predictable.
The same design works for Claude Code and Codex
Claude Code and Codex are not identical. Claude Code is excellent for local, interactive work inside an existing repository. Codex is well suited to delegated coding tasks, cloud work, reviews, and branch-based workflows. OpenAI describes Codex as an AI coding partner for working on code tasks, while Claude Code exposes project settings, permissions, hooks, MCP, and memory through its own configuration model.
The harness pattern transfers across both.
| Concern | Claude Code | Codex |
|---|---|---|
| Project instructions | CLAUDE.md, .claude/settings.json | AGENTS.md, task instructions |
| Permissions | permissions, hooks, MCP settings | sandbox, approvals, task environment |
| Verification | local build, tests, Playwright, hooks | CI, tests, review checks |
| Recovery | git diff, revert, targeted file changes | branch diffs, PR review, revert |
The tool matters. The rails matter more.
Start with three rules
You do not need a perfect platform on day one.
First, block secrets. Deny .env, service account files, API keys, customer lists, and private exports.
Second, require approval for external side effects. Deploy, push, email, database writes, and publishing should not be fully automatic until the workflow has proven itself.
Third, turn success into commands. Build, test, check locale coverage, verify the live URL, and take a mobile screenshot when layout risk is high.
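As an illustration of the third rule, a locale coverage check can be reduced to a pure function over slugs. This is a sketch under assumptions: the `findMissingLocales` name and the input shape (locale code mapped to a list of article slugs) are hypothetical, and collecting slugs from the content directories is left to the caller.

```javascript
// Input:  { en: ["a", "b"], ja: ["a"] }
// Output: { ja: ["b"] }  (locales mapped to the slugs they are missing)
function findMissingLocales(localeSlugs) {
  // Every slug that exists in at least one locale should exist in all of them.
  const allSlugs = new Set(Object.values(localeSlugs).flat());
  const missing = {};
  for (const [locale, slugs] of Object.entries(localeSlugs)) {
    const present = new Set(slugs);
    const gaps = [...allSlugs].filter((slug) => !present.has(slug)).sort();
    if (gaps.length > 0) {
      missing[locale] = gaps;
    }
  }
  return missing;
}

const gaps = findMissingLocales({
  en: ["agent-harness", "subagent-patterns"],
  ja: ["agent-harness"]
});
console.log(JSON.stringify(gaps));
```

A verification script could exit with a non-zero status whenever the returned object has any keys, which turns "all locales exist" into a command that either passes or fails.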
Summary
Agent quality is not just prompt quality. In real workflows, quality comes from the harness around the model.
Use four layers:
- Policy: what is allowed, asked, or denied
- Plan: what will happen before it happens
- Verification: how success is proven
- Recovery: how failure is repaired
The more useful an agent becomes, the more important this structure becomes. It is the difference between “AI that can edit files” and “AI that can safely run part of a business workflow.”
About the Author
Masa
Engineer obsessed with Claude Code. Runs claudecode-lab.com, a 10-language tech media with 2,000+ pages.