Baseline vs Minimal Harness

What You Do

Build a minimal Electron knowledge-base app shell — a window with a document list on the left, a Q&A panel on the right, and a local data directory. The task itself is not complex. What's complex is how you get the agent to complete it.

You run the task twice. The first time, the agent gets only a prompt, with no preparation. The second time, AGENTS.md, init.sh, and feature_list.json are already placed in the repo. Then you compare the two runs.
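The page does not show what feature_list.json contains. As a rough sketch only, assuming each entry carries an `id`, a `description`, and a `done` flag (all hypothetical field names), the scope of this app shell could be expressed and checked like this:

```typescript
// Hypothetical shape of one feature_list.json entry.
// The field names are assumptions, not taken from the source.
interface Feature {
  id: string;
  description: string;
  done: boolean;
}

// The four pieces of the app shell described in "What You Do".
const featureList: Feature[] = [
  { id: "window", description: "App window opens", done: false },
  { id: "doc-list", description: "Document list on the left", done: false },
  { id: "qa-panel", description: "Q&A panel on the right", done: false },
  { id: "data-dir", description: "Local data directory exists", done: false },
];

// Returns the ids of features that are not marked done yet.
function remainingFeatures(features: Feature[]): string[] {
  return features.filter((f) => !f.done).map((f) => f.id);
}
```

Keeping the scope in a machine-readable list like this gives both you and the agent one place to check what "finished" means, rather than leaving it implicit in the prompt.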

The core of this project is not writing code — it's figuring out how big the gap is between "spend 15 minutes preparing rules first" and "just let the agent go."
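One way to make "how big the gap is" concrete is to record the same few numbers for each run and subtract. The sketch below assumes you track wall-clock minutes and a hand-checked count of working features; the `RunResult` type and its fields are hypothetical, not something the project prescribes.

```typescript
// Hypothetical record of one run; the field names are assumptions.
interface RunResult {
  label: "baseline" | "harness";
  minutes: number;        // wall-clock duration from your timer
  featuresPassed: number; // features you verified by hand
  featuresTotal: number;  // features in scope for the task
}

// Fraction of in-scope features that work (0 when nothing is in scope).
function completionRate(run: RunResult): number {
  return run.featuresTotal === 0 ? 0 : run.featuresPassed / run.featuresTotal;
}

// Positive values mean the harness run did better than the baseline run.
function compareRuns(
  baseline: RunResult,
  harness: RunResult
): { minutesSaved: number; completionGain: number } {
  return {
    minutesSaved: baseline.minutes - harness.minutes,
    completionGain: completionRate(harness) - completionRate(baseline),
  };
}
```

Whatever metrics you pick, the point is to fix them before the first run so that both runs are judged the same way.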

Tools

  • Claude Code or Codex (pick one, use it for both runs)
  • Git (manage branches and compare)
  • Node.js + Electron (project stack)
  • A timer (record each run's duration)

Harness Mechanism

Minimal harness: AGENTS.md + init.sh + feature_list.json

Minimal Harness: rules-first

  • AGENTS.md prepared
  • init.sh initializes workspace
  • feature_list.json defines scope
  • Agent reads rules and follows
  • Stable, repeatable output

Baseline: prompt only

  • Prompt task
  • Agent guesses conventions
  • Reinvents rules each session
  • Inconsistent output
