teststop
Break it before your users do.
Trigger AI to test any software system the way a real adversarial user would — someone who never read the docs, retries when things are slow, and does what no spec ever imagined.
The Problem With Test Coverage
Every test ever written was written by someone who knew how the system works.
Real users don't know. That's why production still surprises us.
We call it test coverage. What we actually have is assumption coverage.
How It Works in 30 Seconds
Scan
teststop walks your project tree, detects the language and system type, and extracts routes, flows, and dependencies — all statically, no code execution.
Compose Mandate
Injects project context and accumulated memory into the base mandate — the adversarial instruction that tells the AI how a real user would break this specific system.
Generate Scenarios
Calls claude -p or copilot -p with the mandate. The AI returns structured JSON: scenario IDs, steps, chaos factors, failure modes, and priorities.
Update Memory
Confidence scores update per area. Higher confidence means less testing on the next run. Retirement kicks in at a confidence of 0.95 or higher with at least 15 passes. The system gets smarter with every run.
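The retirement rule can be illustrated with a minimal sketch. Only the two thresholds (0.95 confidence, 15 passes) come from the description above. The `AreaMemory` record, its field names, and the assumption that retirement applies per area are hypothetical and may not match teststop's actual memory format.

```python
from dataclasses import dataclass

# Thresholds as stated in the step description; the rest is illustrative.
RETIRE_CONFIDENCE = 0.95
RETIRE_MIN_PASSES = 15


@dataclass
class AreaMemory:
    """Hypothetical per-area record, not teststop's real schema."""
    name: str
    confidence: float  # expected range: 0.0 to 1.0
    passes: int        # number of passing runs recorded for this area


def is_retired(area: AreaMemory) -> bool:
    """Return True only when both retirement conditions hold."""
    return (
        area.confidence >= RETIRE_CONFIDENCE
        and area.passes >= RETIRE_MIN_PASSES
    )
```

Requiring both conditions means a single lucky high score does not retire an area, and neither does a long run of passes at moderate confidence.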
Report & Exit
Outputs JSON, text, or markdown. Exits with a machine-readable code: 0 = safe to deploy, 1 = review needed, 2 = critical failures found.
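Because the exit code is machine-readable, a deployment script can gate on it. The sketch below is one possible wrapper, not part of teststop itself. The helper names (`describe_exit`, `should_block_deploy`, `run_gate`) are hypothetical, and the policy of blocking on anything other than 0 is an assumption you may want to relax for exit code 1. The command line is passed in by the caller, since the exact teststop arguments are not covered here.

```python
import subprocess
from typing import Sequence

# Exit-code meanings as documented in the "Report & Exit" step.
EXIT_MEANINGS = {
    0: "safe to deploy",
    1: "review needed",
    2: "critical failures found",
}


def describe_exit(code: int) -> str:
    """Map an exit code to its documented meaning."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def should_block_deploy(code: int) -> bool:
    """Assumed policy: only exit code 0 lets the deploy proceed."""
    return code != 0


def run_gate(command: Sequence[str]) -> int:
    """Run the given command, print what its exit code means, return the code."""
    result = subprocess.run(list(command))
    print(f"{command[0]}: {describe_exit(result.returncode)}")
    return result.returncode
```

A CI step could call `run_gate` with its own teststop invocation and stop the pipeline when `should_block_deploy` returns `True`.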
Quick Install
Download the latest binary for your platform from GitHub Releases.
Prerequisite: claude or copilot CLI must be on your PATH.
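To confirm the prerequisite before a first run, a short check like the following may help. It relies only on the standard library's `shutil.which`. The function name `find_ai_cli` and the choice to look for `claude` before `copilot` are assumptions made for illustration, not teststop behavior.

```python
import shutil

# CLI names taken from the prerequisite above; the order is an assumption.
SUPPORTED_CLIS = ("claude", "copilot")


def find_ai_cli(candidates=SUPPORTED_CLIS):
    """Return (name, path) for the first candidate found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return name, path
    return None
```

If `find_ai_cli()` returns `None`, neither CLI is reachable from the current shell, and the PATH needs fixing before teststop can generate scenarios.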
Part of a Trilogy
DocuFlow → gives AI the context to act with purpose
Waymark → gives humans the reason to trust and step back
teststop → gives systems the confidence to prove themselves
| Tool | Role | Link |
|---|---|---|
| DocuFlow | MCP server — LLM wiki for AI context | GitHub |
| Waymark | MCP middleware — AI agent governance | GitHub |
| teststop | CLI — adversarial testing trigger | This repo |