“Just Clean This Up” Is How AI Wrecks Your Repo. Build It a Safety Frame Instead

“Hey, tidy up this repo a bit, would you?”

The next morning, the AI had “tidied” 40 files. There was working code in there, sure. But it had also swept away config files that absolutely should not have been touched — gone, clean as a whistle.

You know that stomach-drop feeling? Yeah. That one.

So why does a supposedly smart AI cause accidents without blinking? The answer is simple: being smart and being safe to work with are two completely different things. Picture the new hire who aces every test on paper, then crashes the register on their first shift. It’s not a brains problem. It’s a footing problem.

That footing has a name these days: the harness. Today I’ll walk you through it with as little jargon as I can manage.

So what actually is a harness?

A harness is a small program you wrap around the AI.

The easiest mental image is the safety line a construction worker clips on, or the training wheels on a kid’s bike. It leaves the rider’s ability untouched, but stops them from hitting the ground. For an AI, it plays roles like these:

Decides what the AI gets to read (not everything)
Decides what the AI is supposed to produce (a clear goal)
Decides how far it can act alone, and where it has to stop and ask a human
Mechanically checks whether the result is broken before anyone trusts it

“Writing a good prompt” is only one slice of that. The reason endless prompt-tuning doesn’t cut down on accidents is the same reason practicing on a unicycle with no training wheels doesn’t make you safer — you’re working on the wrong thing.

Why is everyone suddenly talking about this?

Not long ago, the things we asked AI to do were basically “write some text” or “write some code.” A human read the output and made the call. Simple.

Now we’re handing AI the work itself. Read the files, pick a topic that doesn’t overlap with existing posts, check the diff, register with an external service, report the cause when something fails — once you’re here, there’s no human standing in the loop on every step. Which is exactly why the risk of “it just went off and broke something” shot up overnight.

The reason Claude Code gets such good word of mouth isn’t really that the model is smart — it’s that the footing underneath it is well built. Tools to read files, tools to search, a mechanism to halt dangerous operations, a way to make it remember project rules. All that unglamorous plumbing is solid. What’s actually catching on isn’t a magic prompt. It’s the boring plumbing.

Let’s just run something: a minimal harness in about 50 lines

Talking about it is slower than running it. Let’s build the smallest possible frame: the AI is allowed to read and write, and is never allowed to touch anything outside one folder we choose. You’ll need Node.js and an Anthropic API key.

First, set things up.

mkdir harness-demo && cd harness-demo
npm init -y
npm install @anthropic-ai/sdk
mkdir sandbox
echo "# Notes" > sandbox/note.md

Next, write the “allow list” and save it as policy.json next to the main file. This is the heart of the harness: a declaration that nothing outside sandbox gets touched.

{
  "workspace": "./sandbox",
  "maxSteps": 6
}

Then the main file (harness.mjs). There’s only one spot you really need to remember: safePath, the gatekeeper that stops the AI the moment it tries to step outside the folder. With it in place, an accident like the “40-file” one from the intro stays contained to sandbox, because every read and write has to pass through that check.

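import Anthropic from "@anthropic-ai/sdk";
import { readFile, writeFile } from "node:fs/promises";
import path from "node:path";
import { fileURLToPath } from "node:url";

// The SDK reads ANTHROPIC_API_KEY from the environment
const client = new Anthropic();
const policy = JSON.parse(await readFile(new URL("./policy.json", import.meta.url), "utf8"));
// Resolve the workspace relative to this file, not the directory you happen to run from
const root = path.resolve(path.dirname(fileURLToPath(import.meta.url)), policy.workspace);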

// Gatekeeper: if it tries to leave the work folder, stop it right there
function safePath(p) {
  const resolved = path.resolve(root, p);
  if (resolved !== root && !resolved.startsWith(root + path.sep)) {
    throw new Error(`${p} is outside the work folder. You can only touch things inside sandbox.`);
  }
  return resolved;
}

const tools = [
  { name: "read_file", description: "Read text inside sandbox",
    input_schema: { type: "object", properties: { path: { type: "string" } }, required: ["path"] } },
  { name: "write_file", description: "Write text inside sandbox",
    input_schema: { type: "object", properties: { path: { type: "string" }, content: { type: "string" } }, required: ["path", "content"] } },
];

async function useTool(name, input) {
  if (name === "read_file") return await readFile(safePath(input.path), "utf8");
  if (name === "write_file") { await writeFile(safePath(input.path), input.content, "utf8"); return "Write OK"; }
  throw new Error(`Unknown tool: ${name}`);
}

const messages = [{ role: "user", content: process.argv.slice(2).join(" ") || "Read note.md and write a summary to summary.md." }];

for (let step = 0; step < policy.maxSteps; step++) {
  const res = await client.messages.create({
    model: process.env.ANTHROPIC_MODEL || "claude-sonnet-4-6",
    max_tokens: 1024,
    tools,
    system: "You are a careful file clerk. Use tools only when needed, and keep all work inside sandbox.",
    messages,
  });
  messages.push({ role: "assistant", content: res.content });

  const calls = res.content.filter((b) => b.type === "tool_use");
  if (calls.length === 0) { console.log(res.content.find((b) => b.type === "text")?.text ?? ""); break; }

  const results = [];
  for (const c of calls) {
    try { results.push({ type: "tool_result", tool_use_id: c.id, content: String(await useTool(c.name, c.input)).slice(0, 4000) }); }
    catch (e) { results.push({ type: "tool_result", tool_use_id: c.id, is_error: true, content: e.message }); }
  }
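  messages.push({ role: "user", content: results });
  // Say so when the step ceiling is what ended the run, instead of exiting silently
  if (step === policy.maxSteps - 1) console.error(`Stopped after ${policy.maxSteps} steps without a final answer.`);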
}

Running it is just this, with your API key exported as ANTHROPIC_API_KEY.

node harness.mjs

A few dozen lines, but you’ve already got the “AI itself,” the “tools it can use,” the “scope it’s allowed,” the “step ceiling,” and the “stop-when-broken mechanism” cleanly separated. That’s the skeleton of a harness. From here you bolt on search, test runs, approval gates, and notifications, and it grows into something shaped like Claude Code.
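One caveat about safePath as written: it compares path strings, so a symlink that lives inside sandbox but points somewhere else would still pass the check. Here is a minimal sketch of a stricter variant that resolves real paths first. The names safeRealPath and realResolve are my own, not part of the harness above, and I use the synchronous fs calls so it can be called the same way safePath is.

```javascript
import fs from "node:fs";
import path from "node:path";

// Resolve symlinks. For a file that does not exist yet (a new write),
// resolve its parent folder instead and re-attach the file name.
function realResolve(target) {
  try {
    return fs.realpathSync(target);
  } catch (e) {
    if (e.code !== "ENOENT") throw e;
    // A dangling symlink also reports ENOENT; refuse it rather than follow it on write.
    if (fs.lstatSync(target, { throwIfNoEntry: false })?.isSymbolicLink()) {
      throw new Error(`${target} is a dangling symlink.`);
    }
    return path.join(fs.realpathSync(path.dirname(target)), path.basename(target));
  }
}

// Hypothetical stricter gatekeeper: compares real paths, not path strings.
function safeRealPath(root, p) {
  const realRoot = fs.realpathSync(root);
  const resolved = realResolve(path.resolve(realRoot, p));
  if (resolved !== realRoot && !resolved.startsWith(realRoot + path.sep)) {
    throw new Error(`${p} resolves outside the work folder.`);
  }
  return resolved;
}
```

In the harness you would call safeRealPath(root, input.path) wherever safePath(input.path) appears. This narrows the gap rather than closing it completely: a file could still be swapped between the check and the actual read or write.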

Where this pays off (three cases)

1. Quality-gating mass-produced content

Stop at “write a blog post” and the AI will happily churn out thin articles and near-duplicate topics. So I give the harness a sequence: read the existing titles → pick an angle that doesn’t overlap → write the body → mechanically check word count and links. Before I even have to agonize over whether a draft is any good, the gatekeeper rejects the thin ones. These days several drafts a month get stopped before they ever go live. And honestly? I’m grateful when they do.
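To make the “mechanical check” concrete, here is a rough sketch of what such a gatekeeper could look like. It is not my production code: checkDraft, MIN_WORDS, and the draft shape ({ title, body }) are illustrative names, and it assumes drafts are written in Markdown. The link check only looks at syntax and does not fetch anything.

```javascript
// Hypothetical threshold; tune it to your own site.
const MIN_WORDS = 300;

// Lowercase and collapse punctuation so "Hello, World!" equals "hello world".
function normalizeTitle(title) {
  return title.toLowerCase().replace(/[^\p{L}\p{N}]+/gu, " ").trim();
}

function checkDraft(draft, existingTitles) {
  const problems = [];

  const words = draft.body.split(/\s+/).filter(Boolean).length;
  if (words < MIN_WORDS) problems.push(`Body has ${words} words, minimum is ${MIN_WORDS}.`);

  const taken = new Set(existingTitles.map(normalizeTitle));
  if (taken.has(normalizeTitle(draft.title))) {
    problems.push(`Title duplicates an existing post: ${draft.title}`);
  }

  // Markdown links: only absolute http(s) URLs and site-root paths pass.
  for (const [, text, url] of draft.body.matchAll(/\[([^\]]*)\]\(([^)]*)\)/g)) {
    if (!/^(https?:\/\/|\/)/.test(url.trim())) {
      problems.push(`Suspicious link target for "${text}": "${url}"`);
    }
  }

  return { ok: problems.length === 0, problems };
}
```

Returning the list of problems, not just a pass or fail flag, lets the harness feed the reasons straight back to the AI for another attempt.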

2. Triaging inbound inquiries

“Read the inquiries that came in and flag the ones that look like real sales leads.” The reading can be automatic, and that’s fine. But registering anyone in the customer list stays on hold until a human presses the button. I enforce that with the harness. Reading is automatic, writing is a dry-run draft, and only the final registration is human. The accident where a misclassified customer gets shoved straight into the production database? Gone.
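The split between “AI drafts” and “human registers” can be sketched in a few lines. Everything here is a stand-in: customers and pending are in-memory placeholders for a real customer list and review queue, and proposeRegistration and approveRegistration are names I made up for this example.

```javascript
// Hypothetical in-memory stand-ins for a real customer list and review queue.
const customers = [];
const pending = new Map();
let nextId = 1;

// The AI may call this. It only records a proposal; nothing is registered.
function proposeRegistration(lead) {
  const id = nextId++;
  pending.set(id, lead);
  return `Draft #${id} queued for human review: ${lead.email}`;
}

// Only a human-facing command calls this. It is never exposed as an AI tool.
function approveRegistration(id) {
  const lead = pending.get(id);
  if (!lead) throw new Error(`No pending draft #${id}.`);
  pending.delete(id);
  customers.push(lead);
  return lead;
}
```

The design choice that matters is which function goes into the tools array. If only proposeRegistration is exposed, the AI has no path to the customer list, whatever the prompt says.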

3. A breath before deploy

Before the publish button gets pressed, make it confirm: does the build pass, are the env vars all present, does the diff match expectations, is there a rollback path. The AI loves to read only the last line of a failure log and “fix” the wrong thing, so the trick is deciding where to look in advance. Don’t hand it the whole log; narrow it to the few dozen relevant lines. That alone kills most of the off-target fixes.
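Narrowing the log is easy to do mechanically. The sketch below finds the first line that looks like a failure and returns a small window around it with line numbers. ERROR_PATTERN and the window sizes are assumptions on my part; adjust them to whatever your build tool actually prints.

```javascript
// Hypothetical markers; adjust to whatever your build tool prints.
const ERROR_PATTERN = /\b(error|failed|exception)\b/i;

function relevantLines(log, { before = 5, after = 20 } = {}) {
  const lines = log.split(/\r?\n/);
  const first = lines.findIndex((line) => ERROR_PATTERN.test(line));

  // No marker found: fall back to the tail of the log.
  const start = first === -1
    ? Math.max(0, lines.length - (before + after + 1))
    : Math.max(0, first - before);
  const end = first === -1 ? lines.length : first + after + 1;

  // Prefix original line numbers so the AI can refer back to the full log.
  return lines
    .slice(start, end)
    .map((line, i) => `${start + i + 1}: ${line}`)
    .join("\n");
}
```

Anchoring on the first failure rather than the last line is deliberate: the last line of a log is usually a symptom, and the cause tends to sit further up.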

Three designs you can “steal” from Claude Code

When you build your own harness, you don’t have to invent it from scratch. Claude Code is a treasure chest of worked examples. You don’t need to copy all of it — but adopt these three early and things stabilize fast.

First, split your rules into layers. Promises that never change go in a config file; one-off instructions for this task go in an on-the-spot note; long-lived preferences live somewhere else. Cramming all of it into every prompt just makes the prompt long and the accuracy worse.

Second, let commands handle the deterministic work. Formatting, linting, tests — running npm test is faster and more reliable than asking the AI to do it. Leave the AI only the “thinking” part of the job.
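One way to wire that into a harness is a tool where the AI picks a check by name and never writes the command itself. This is a sketch under assumptions: runCheck and the checks table are hypothetical names, the npm scripts are placeholders for whatever your project defines, and it assumes npm resolves on your PATH (Windows may need adjustments).

```javascript
import { spawnSync } from "node:child_process";

// Hypothetical allowlist: the model chooses a name, never a command string.
const checks = {
  test: ["npm", ["test"]],
  lint: ["npm", ["run", "lint"]],
};

function runCheck(name, table = checks) {
  if (!Object.hasOwn(table, name)) {
    throw new Error(`Unknown check: ${name}. Available: ${Object.keys(table).join(", ")}`);
  }
  const [cmd, args] = table[name];
  // No shell is involved, so the name cannot smuggle in extra commands.
  const res = spawnSync(cmd, args, { encoding: "utf8", timeout: 120_000 });
  if (res.error) throw res.error;
  // Keep only the tail of the output so the conversation stays small.
  const output = (res.stdout + res.stderr).slice(-4000);
  return { ok: res.status === 0, output };
}
```

The exit code decides pass or fail. The AI only gets involved when the result is a failure that needs interpreting.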

Third, offload the heavy research to a separate worker. Pour long logs and bulk file-reads into your main conversation and the actual judgment gets blurry. Have a separate process do the legwork and hand you back only the conclusion. Just that brings the sharpness of your decisions right back.
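In code, “separate worker” can be as simple as a second model call with its own fresh context. The sketch below assumes you supply askModel, a hypothetical function that takes a prompt string and resolves to the reply text. In the harness above it could wrap a client.messages.create call with a brand-new messages array.

```javascript
// askModel is a hypothetical function you supply: (prompt) => Promise<string>.
async function delegateResearch(askModel, question, bulkText, maxChars = 500) {
  const prompt =
    `Answer in at most three sentences.\nQuestion: ${question}\n\nMaterial:\n${bulkText}`;
  const answer = await askModel(prompt);
  // Only this short conclusion goes back into the main conversation;
  // the bulk material never touches the main messages array.
  return answer.trim().slice(0, maxChars);
}
```

The hard cap on length is a blunt instrument, but it guarantees the main conversation grows by a bounded amount no matter how large the material was.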

Three accidents I caused myself

Let me be honest. My first harness was one accident after another.

The first was handing over too many tools. Figuring more is better, I gave it something like 30 tools — and the AI froze up going “which one do I use?” and made one weird choice after another. Now I start with five to ten.

The second was unhelpful error messages. When I returned only “Error: failed”, the AI couldn’t fix a thing. The moment I started returning the cause and the next move, like “README.md not found. sandbox only has note.md”, it suddenly began solving its own problems.
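A message like that is cheap to generate. Here is a small sketch; missingFileMessage is an illustrative helper I am making up here, and the list of available files is passed in. In the harness you could get that list from readdir on the work folder when a read fails with ENOENT.

```javascript
// Build an error message that states the cause and suggests the next move.
function missingFileMessage(requested, available, folder = "sandbox") {
  const listing = available.length
    ? `${folder} only has ${available.join(", ")}`
    : `${folder} is empty`;
  return `${requested} not found. ${listing}. Pick one of those or create the file with write_file.`;
}
```

The pattern generalizes: say what failed, say what exists, and name a tool the AI can use next.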

The third was relying on human eyes alone for checks. “I’ll just review it at the end” falls apart, guaranteed, on a busy day. Once I put in machine-checkable gatekeepers — word count, broken links, type errors — my late-night reviews dropped off a cliff.

If you’re starting out, start here

Do not build a “fully autonomous genius agent” right out of the gate. Pick one small job that’s safe to get wrong — checking draft articles, a first-pass PR review, triaging inquiries, a pre-staging sanity check. That’s the right size.

The order is always the same. (1) Narrow the scope of what it reads. (2) Make the goal — the deliverable — explicit. (3) Push verification onto commands as much as you can. (4) Set every dangerous operation (deletion, production DB, billing, force push) to “ask the human” at first. Only after an operation proves safe do you promote it to automatic. Just following that order cuts accidents down by a shocking amount.
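Step (4) can be expressed as a small table with a safe default. This is a sketch with made-up names: the operation labels, decide, promote, and the threshold of 20 clean approvals are all assumptions for illustration, not a recommendation.

```javascript
// Hypothetical operation names; anything not listed falls back to "ask".
const permissions = new Map([
  ["read_file", "auto"],
  ["write_file", "auto"],
  ["delete_file", "ask"],
  ["db_write_production", "ask"],
  ["git_force_push", "ask"],
]);

// Unknown operations are treated as dangerous until someone classifies them.
function decide(operation) {
  return permissions.get(operation) ?? "ask";
}

// Promote only after a run of human approvals with no incident.
function promote(operation, cleanApprovals, threshold = 20) {
  if (cleanApprovals < threshold) return false;
  permissions.set(operation, "auto");
  return true;
}
```

The important line is the fallback in decide. A new tool you forgot to classify ends up asking a human instead of running on its own.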

How to set up permissions is covered in the Claude Code permissions guide, and the groundwork for using this on a team is in CLAUDE.md best practices. If you want to split long jobs apart, subagent patterns pairs well too. For the official line, the Claude Agent SDK docs are the primary source.

What actually happened when I tried it

Ever since the “40-file accident,” I stopped agonizing over whether to trust the AI. What I look at instead is which gatekeeper stopped it. Adding a single safePath to a minimal harness took outside-the-folder accidents to zero. Adding automatic word-count and link checks meant thin articles got stopped before publishing. Rather than hunt for a smarter AI, build the footing that keeps you from getting hurt when you fall. It looks like the long way around, but it’s genuinely the fastest — that’s where I’ve landed.

Wrapping up

Harness engineering isn’t the art of decorating prompts. It’s the art of designing what the AI sees, what it does, where it stops, and how you confirm the result. Start by running the minimal harness above, then add one “gatekeeper” to a job in your own work. The quality of the AI’s work is decided less by the model’s smarts than by the footing you build around it.

If you want to fold AI into your own work more systematically, take a look at the materials and templates. And if you want to get permissions, reviews, and verification sorted across a whole team, peek at the training and onboarding consultation.
