Advanced (Updated: 6/7/2026)

A Verification Checklist So Claude Code Leaves Proof It Actually Finished

Stop trusting "done" reports. A practical checklist to verify Claude Code's work with build output, live URLs, and CTAs.

On a Friday night I told Claude Code, “Write one article and take it all the way through to publish,” and went to bed. The next morning the log read, confidently, “Done. The article has been published.” Relieved, I opened the live URL. What loaded was the title of the previous article.

The build had passed. The URL returned a 200. But the h1 tag still belonged to a different page. I’d handed the diff check entirely to the AI, and the only evidence I had was its word that it had worked.

That was the moment it clicked: the thing tripping me up wasn’t how smart Claude Code is. It was how I closed out the work. The more you hand to an AI, the more you need to keep proof in your own hands of what was actually verified afterward. Skip that, and you fall for a fake “all done” report every single time.

This article walks through the pattern I use to leave that proof behind, with copy-paste code you can run today.

Key takeaways

  • Don’t trust the AI’s “done.” Make machine-checked evidence (build, live URL, h1, CTAs) the only basis for calling work finished.
  • Before you start editing, decide in one sentence what this run has to prove, and pick the verification command up front.
  • After every publish, eyeball things in the same fixed order (h1 -> canonical -> opening paragraph -> CTAs). A fixed order kills blind spots.
  • Leave the evidence (screenshot, URL, the next number to watch) in a one-line note, so tomorrow’s you (or your automation) doesn’t redo the same judgment.
  • Shipping an article you can verify tomorrow beats shipping a “perfect” one in a single pass. As an operation, the former is far stronger.

Why “it’s done” alone gets you burned

Claude Code summarizes its progress in prose. That’s the trap: the summary comes back wearing roughly the same confident face whether things truly went well or not.

A passing build only proves “the syntax isn’t broken.” A 200 from the live URL only proves “the server returned something.” Whether that something is the article you just created is an entirely separate question.

My Friday accident cleared both the build and the 200. The only thing that had broken was a single point: whether the URL and the page’s contents actually matched. And nobody was looking at it.

So I now base the “is it done?” call on working through my own checklist, one item at a time, not on the AI’s words. If the foundations of how you run those checks feel shaky, line up the basics first with getting started with Claude Code, and the steps below will land more easily.

The 15-minute verification loop

Check things in a different order every time and, on a busy day, you’ll skip a step for sure. Fix the order so you can run it without thinking.

  1. Write, in one sentence, what counts as “done” for this run. Example: “This slug’s article appears at the live URL with the correct h1.”
  2. Before you start editing, decide the commands you’ll verify with (build, diff display). Don’t go hunting for them afterward.
  3. After a change, look in this order: diff -> build -> live URL. Even if you change your mind midway, don’t break the order.
  4. On the live URL, eyeball whether the h1, canonical, opening paragraph, and CTAs line up the way you expected.
  5. Leave the remaining risk and the “smallest next move” in a single line of notes.
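
The loop above comes down to "same order every time, stop at the first failure." Here is a minimal sketch of that idea, assuming each check is wrapped in a function that returns true or false. The runInOrder helper and the step names are my own illustration, not part of Claude Code or any library.

```javascript
// run-in-order.mjs (hypothetical helper for illustration)
// Runs checks in a fixed order and stops at the first failure,
// so a later green result can never hide an earlier red one.
function runInOrder(steps) {
  const results = [];
  for (const step of steps) {
    let ok = false;
    try {
      ok = step.run() === true; // anything other than true counts as a failure
    } catch {
      ok = false; // a check that throws is treated as failed, not skipped
    }
    results.push({ name: step.name, ok });
    if (!ok) break; // do not continue past a failed step
  }
  // Done only if every step ran and passed; an empty list proves nothing.
  const done =
    steps.length > 0 &&
    results.length === steps.length &&
    results.every((r) => r.ok);
  return { done, results };
}

// Placeholder checks. In real use each run() would call your own
// diff, build, and live-URL commands.
const report = runInOrder([
  { name: "diff", run: () => true },
  { name: "build", run: () => true },
  { name: "live URL", run: () => false },
]);
console.table(report.results);
console.log(report.done ? "done" : "not done");
```

Because the runner breaks on the first failure, the order itself becomes part of the evidence: a step that never ran shows up as missing from the results instead of silently passing.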

The key here is to separate, up front, what you let the AI handle from what a human decides.

| Stage | Safe to delegate to AI | A human decides |
| --- | --- | --- |
| Topic selection | Read existing titles, propose candidates | Which one we actually write |
| Writing | Draft body, code, headings | Whether anything false or stale crept in |
| Verification | Run the build, summarize the diff | Final call on whether the live URL’s content is correct |
| Publishing | Execute the publish command | Approval for irreversible actions (deletes, prod pushes) |

Start with every irreversible action set to “ask a human,” and promote an operation to automatic only after you’ve confirmed it’s safe. For the full thinking behind this line, see the permissions guide.
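
To make "ask a human by default" concrete, here is a small sketch of the idea as a lookup table. The action names and the decide and promote helpers are hypothetical, and this is not Claude Code's actual permission format; it only illustrates the default-deny shape.

```javascript
// Hypothetical policy table. The action names are examples only and this
// is NOT Claude Code's permission format.
const policy = new Map([
  ["read-titles", "auto"],
  ["run-build", "auto"],
  ["summarize-diff", "auto"],
  ["delete-file", "ask-human"],
  ["push-prod", "ask-human"],
]);

// Unknown actions fall through to "ask-human", so anything not yet
// reviewed is treated as irreversible by default.
function decide(action) {
  return policy.get(action) ?? "ask-human";
}

// Promote an action to automatic only with a written reason,
// so the promotion itself leaves evidence behind.
function promote(action, reason) {
  if (typeof reason !== "string" || reason.trim() === "") {
    throw new Error("Promotion needs a written reason.");
  }
  policy.set(action, "auto");
  return { action, reason: reason.trim() };
}

console.log(decide("run-build")); // auto
console.log(decide("drop-database")); // ask-human
```

The one design choice that matters is the fallback: an action nobody has thought about yet lands on the human side, not the automatic side.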

A copy-paste request template

Phrase your verification from scratch every time and gaps creep in depending on your mood that day. Turn the request you hand Claude Code into a template, and the checklist stays steady.

I just published this article. Before you report it done, confirm the following and return the results as a table.
- Did the build succeed? (Write the command and the exit code too.)
- Does the live URL's h1 match this slug's article title?
- Does the canonical point at the same slug?
- Is the opening paragraph this article's own, not a leftover reused from a previous article or the homepage?
- Are the CTAs (free PDF, course, consultation) ordered to fit the reader's situation?
For anything you could not confirm, write "unconfirmed" honestly. Do not write "OK" by guessing.

That last line is what does the work. Without spelling out “don’t write OK by guessing,” the AI will return a vaguely pleasant “no problems” even for items it never actually checked. If you want to raise your game on building the request itself, pair this with advanced prompt engineering.
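
If you paste the returned table into a script, you can make "unconfirmed" block the run mechanically. This is a rough sketch that assumes you have already turned the table into rows of item and status; that row shape and the gateReport name are my own assumptions, not something Claude Code outputs in a fixed format.

```javascript
// Hypothetical row shape: one { item, status } object per checklist line.
// Only an explicit "OK" passes. "unconfirmed", blanks, and anything else block.
function gateReport(rows) {
  const blocking = rows.filter(
    (row) => String(row.status ?? "").trim().toUpperCase() !== "OK"
  );
  // An empty report proves nothing, so it does not pass either.
  return { pass: rows.length > 0 && blocking.length === 0, blocking };
}

const verdict = gateReport([
  { item: "build", status: "OK" },
  { item: "h1 matches slug", status: "OK" },
  { item: "canonical", status: "unconfirmed" },
]);
console.log(verdict.pass); // false
console.log(verdict.blocking.map((row) => row.item)); // [ 'canonical' ]
```

Keep in mind that an "OK" in the table is still the AI's word. This gate only stops the obvious gaps; the machine-checked evidence is what actually closes the work.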

A verification script you can run as-is

This is the heart of it. We go fetch the live URL and mechanically decide whether the h1 is the title we expected. It runs on Node.js (18+) alone. Make this script’s verdict, not the AI’s “done,” the basis for calling things finished.

// verify-publish.mjs
// Usage: node verify-publish.mjs <live URL> "<expected h1 title>"
// Example: node verify-publish.mjs https://claudecode-lab.com/en/blog/foo/ "Article title"

const [url, expectedH1] = process.argv.slice(2);

if (!url || !expectedH1) {
  console.error("Pass two arguments: the URL and the expected h1 title.");
  process.exit(2);
}

// Fetch the published page
const res = await fetch(url, { redirect: "follow" });
const html = await res.text();

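// Roughly extract h1 and canonical (no strict parser needed)
const h1 = (html.match(/<h1[^>]*>(.*?)<\/h1>/is)?.[1] ?? "")
  .replace(/<[^>]+>/g, "")
  .replace(/\s+/g, " ") // collapse line breaks and indentation inside the h1
  .trim();
// Find the canonical <link> tag first, then read its href, so attribute order doesn't matter
const canonicalTag = html.match(/<link\b[^>]*\brel=["']canonical["'][^>]*>/i)?.[0] ?? "";
const canonical = canonicalTag.match(/\bhref=["']([^"']+)["']/i)?.[1] ?? "";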

// Knock out the checklist one item at a time
const checks = {
  http200: res.status === 200,
  h1Matches: h1 === expectedH1,
  canonicalMatches: canonical.includes(new URL(url).pathname),
};

console.table(checks);

const allOk = Object.values(checks).every(Boolean);
if (!allOk) {
  console.error("Some items are unconfirmed or mismatched. Do not call this published yet.");
  console.error(`Fetched h1: ${h1 || "(empty)"}`);
  process.exit(1);
}

console.log("Evidence is in. You may call this done.");

Running it is this simple.

node verify-publish.mjs https://claudecode-lab.com/en/blog/foo/ "Article title"

If h1Matches is false, you’re in exactly the state of my Friday accident: the URL is alive, but the contents are wrong. Unless the exit code is 0, don’t call publishing “done.” Make that one rule, and the fake completion report gets stopped before it reaches you.
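
If you want that rule enforced by code instead of by willpower, the sketch below shows one way to read the exit code. It assumes the result object has the shape returned by Node's child_process.spawnSync; the isPublished name is mine, and the real invocation is left as a comment so the sketch runs without a network or a deployed page.

```javascript
// `result` is assumed to look like the return value of spawnSync:
// { status: number | null, signal: string | null, error?: Error }
function isPublished(result) {
  if (result.error) return false; // the process could not be started
  if (result.status === null) return false; // killed by a signal, so no exit code
  return result.status === 0; // only a clean exit counts as done
}

// Real use, commented out on purpose:
// import { spawnSync } from "node:child_process";
// const result = spawnSync(
//   process.execPath,
//   ["verify-publish.mjs", url, expectedH1],
//   { stdio: "inherit" }
// );
// if (!isPublished(result)) process.exit(1);

console.log(isPublished({ status: 0, signal: null })); // true
console.log(isPublished({ status: 1, signal: null })); // false
console.log(isPublished({ status: null, signal: "SIGTERM" })); // false
```

Treating a missing exit code as a failure matters: a verification run that was killed halfway has proven nothing.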

Where this pays off

1. Publishing articles

This is the classic place where it’s easy to mistake an HTTP 200 alone for success. Confirm with the script above that h1 and canonical point at the same slug, and you’ll catch a reused previous article or a fallback to the homepage before it goes live.
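
The script covers h1 and canonical but not the opening paragraph. As a rough companion, the sketch below compares the first paragraph of the new page with that of another page, such as the previous article or the homepage. The helper names are hypothetical, and it assumes the first <p> in the HTML really is the article's opening, which may not hold if your template puts a <p> in the header.

```javascript
// Rough text of the first <p>, in the same spirit as the h1 regex
// in verify-publish.mjs (no strict parser).
function firstParagraph(html) {
  const inner = html.match(/<p\b[^>]*>(.*?)<\/p>/is)?.[1] ?? "";
  return inner
    .replace(/<[^>]+>/g, "") // drop inline tags such as <b> or <a>
    .replace(/\s+/g, " ") // collapse whitespace so formatting does not matter
    .trim();
}

// True when the new page opens with the same text as the other page.
// An empty paragraph never counts as a match.
function opensLike(newHtml, otherHtml) {
  const opening = firstParagraph(newHtml);
  return opening !== "" && opening === firstParagraph(otherHtml);
}

// Sample HTML for illustration only.
const previous = "<h1>Old</h1><p>Welcome   to the <b>old</b> post.</p>";
const fresh = "<h1>New</h1><p>Welcome to the old post.</p>";
console.log(opensLike(fresh, previous)); // true
```

A true here means the new page opens exactly like the old one, which is the reused-content state worth stopping for.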

2. Swapping out a CTA

When you move a button to the free PDF or a course, leave the screenshot and the “next number to watch” on the same line of notes. Later you can trace “did that change actually lift signups?” from a record instead of from memory.

3. Changing settings or permissions

Changing your CLAUDE.md or permission settings is exactly when you should run the same verification command before and after. If how you write those settings still worries you, square it away first with CLAUDE.md best practices, and the premise for your checks will be solid.
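
A before-and-after run is easiest to read as a list of what got worse. Here is a minimal sketch that assumes each run produces an object of check name to boolean, like the checks object in verify-publish.mjs. The regressions helper is my own naming.

```javascript
// Returns the names of checks that passed before the change
// but no longer pass after it.
function regressions(before, after) {
  return Object.keys(before).filter(
    (name) => before[name] === true && after[name] !== true
  );
}

// Sample results for illustration only.
const before = { http200: true, h1Matches: true, canonicalMatches: true };
const after = { http200: true, h1Matches: false, canonicalMatches: true };
console.log(regressions(before, after)); // [ 'h1Matches' ]
```

A check that disappears from the second run is counted as a regression too, since a missing result is not the same as a passing one.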

Pitfalls, and how to fix them

Honestly, I fell into the same holes many times before I settled on this pattern.

The first is trying to fix everything in one pass and creating a diff too big to verify. A diff that touches 40 files is unreadable for humans and AI alike. The fix is dull: narrow what each run has to verify down to a single sentence, and split the work when the diff grows beyond what you can confirm.
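
One way to notice an oversized diff before you start reviewing it is to count the changed files. The sketch below takes the text output of git diff --name-only (one path per line) and compares the count against a limit. The default of 10 files is an arbitrary assumption of mine; pick whatever you can honestly read in one sitting.

```javascript
// Input: the text printed by `git diff --name-only`, one path per line.
function changedFiles(nameOnlyOutput) {
  return nameOnlyOutput
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter(Boolean); // ignore blank lines
}

// maxFiles is an assumed threshold, not a recommendation from any tool.
function checkDiffSize(nameOnlyOutput, maxFiles = 10) {
  const files = changedFiles(nameOnlyOutput);
  return { count: files.length, ok: files.length <= maxFiles };
}

// Sample output for illustration only.
const sample = "src/a.md\nsrc/b.md\n\nsrc/c.md\n";
console.log(checkDiffSize(sample, 2)); // { count: 3, ok: false }
```

When ok comes back false, that is the cue to split the work, not to skim harder.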

The second is calling it done the moment the local build passes. “Works locally” and “appears correctly at the live URL” are two different things. The fix is to run the script above exactly once after every publish. Until I baked that into my routine, I shipped wrong-content pages more than once.

The third is piling on CTAs without telling the reader where to go next. Line up three buttons and, if the reader can’t choose, they’re meaningless. The fix is to add one line in the body, per reader situation (still nervous about commands / worn out by repetitive work / weighing a team rollout), naming which CTA fits.

FAQ

Q. If the build passes, isn’t that enough to be done?
A passing build is only proof the syntax isn’t broken. Whether the live URL’s contents are this article is a separate matter. You’re done only once you’ve checked the h1 and the canonical too.

Q. Do I run the verification script by hand every time?
By hand is fine at first. Once your routine stabilizes, grow it into something that runs automatically right after the publish command. Make it “if the exit code isn’t 0, halt the publish,” and accidents drop.

Q. Can’t I just let the AI do all the verification too?
Reading and summarizing, sure, delegate them. But the final call on “is the live URL’s content correct?” and approval of irreversible actions stay with a human. Hand those over and there’s no one left to stop a fake completion report.

Q. Can non-engineers run this check?
Yes. The script runs by copy-paste, and the visual steps are just a matter of memorizing the order. If the commands themselves feel daunting, ease in via the explainer for non-engineers.

Q. What’s enough to keep in the notes?
Three things suffice: “what I verified this time,” “remaining risk,” and “the smallest next move.” No long meeting minutes needed. The only goal is that tomorrow’s you doesn’t redo the same judgment.
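
If you want the three-item note to come out in the same shape every time, a tiny formatter is enough. This is only a sketch: the field names, the pipe-separated layout, and the formatNote helper are my own assumptions, so adjust them to whatever you already keep.

```javascript
// Builds a single-line note from the three items worth keeping.
// Refuses to produce a note if any of them is missing.
function formatNote({ date, verified, risk, nextMove }) {
  const fields = { verified, risk, nextMove };
  for (const [key, value] of Object.entries(fields)) {
    if (typeof value !== "string" || value.trim() === "") {
      throw new Error(`Missing note field: ${key}`);
    }
  }
  // Collapse line breaks so the note really stays on one line.
  const clean = (text) => text.replace(/\s+/g, " ").trim();
  return [
    date,
    `verified: ${clean(verified)}`,
    `risk: ${clean(risk)}`,
    `next: ${clean(nextMove)}`,
  ].join(" | ");
}

// Sample values for illustration only.
console.log(
  formatNote({
    date: "2026-06-07",
    verified: "h1 and canonical match the slug",
    risk: "CTA order not reviewed",
    nextMove: "check CTA order on mobile",
  })
);
```

Failing on an empty field is deliberate: a note with no "next move" is the kind that makes tomorrow's you redo the judgment.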

What happened when I actually tried it

After the wrong-content publish that Friday, I switched my definition of “done” from “what the AI said” to “whether the script’s exit code is 0.”

When I ran verify-publish.mjs against roughly ten publishes, two came back with h1Matches as false. Both returned a 200 and were the kind you’d never catch at a glance. Without the script, I’d have published a recycled page again.

Fixing the visual order helped too. Once I always looked in the same sequence (h1 -> canonical -> opening paragraph -> CTAs), the “wait, did I skip this last time too?” misses all but vanished. The more I moved judgment out of my head and into a procedure, the lighter the nightly check felt.

Rather than hunting for a smarter AI, put a system that catches you when you trip in place first. It looks like the long way around, but right now my conclusion is that it’s the fastest.

If you’re at the stage of making this verification pattern your team’s standard, or wiring it into your own publishing flow, I can help you design it through training and consultation. Claude Code’s official docs are available in Anthropic’s documentation.

#claude-code #verification #checklist #publishing #operations

About the Author

Masa

Engineer focused on practical Claude Code workflows. Runs claudecode-lab.com, a 10-language technical media site.