The Engineering of Intent, Chapter 10: The Five-Layer Quality Gate Stack

This is Part 10 of a series walking through my book The Engineering of Intent. In the previous chapter, we covered context engineering — shaping what the agent reads at the start of every session. Part IV of the book turns to defense-in-depth: what catches the inevitable mistakes before they reach production.


Every AI-Generated Change Must Pass Five Gates Before a Human Sees It

Every AI-generated change must pass through a stack of automated checks before a human even sees it. The stack has five layers. None of them is optional.

Chapter 10 walks through each layer, tunes the strictness dial, and works through the fintech case study where this exact stack cut cycle time from two hours to twenty-two minutes and reduced production incidents threefold.


The Five Layers, Compressed

  1. Static Linting. Catches hallucinated imports, unused variables, obvious bugs. Modern linters run on every change without noticeable latency. The most valuable lint rule in AI-native projects is the “unknown import” rule — agents occasionally invent libraries or reference APIs that do not exist.
  2. Strict Type Checking. Types are the fastest form of executable documentation. When an agent changes a function signature, a strict type checker immediately identifies every caller that needs updating. Strict everywhere possible: TypeScript strict, Pyright strict, Sorbet with strict sigils.
  3. SAST and Security Scans. AI-generated code has characteristic vulnerabilities: string-concatenated SQL, sensitive data in debug logs, “temporarily” disabled CSRF. Semgrep, Snyk, CodeQL. Maintain a custom ruleset for your organization’s anti-patterns.
  4. Automated Test Synthesis. Agent-written tests cover the happy path; adversarial tests require human judgment. Require both. The combination covers more surface than either alone.
  5. Agentic End-to-End Testing. Autonomous browser agents exercise the application as users would. Use as a final gate before merge, not a per-commit check. Keep the scope small.
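Layer 1's unknown-import rule is simple enough to sketch. The snippet below is an illustrative check, not any particular linter's implementation: it walks a module's AST and flags imports that don't resolve in the current environment, which is exactly the symptom of an agent inventing a library.

```python
import ast
import importlib.util

def find_unknown_imports(source: str) -> list[str]:
    """Flag imports whose top-level module cannot be resolved in this
    environment -- a common symptom of an agent hallucinating a library."""
    unknown = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue
        for name in names:
            root = name.split(".")[0]  # resolve only the top-level package
            try:
                if importlib.util.find_spec(root) is None:
                    unknown.append(name)
            except (ImportError, ValueError):
                unknown.append(name)
    return unknown
```

A real deployment would run this against installed dependencies in CI rather than the linter's own environment, but the principle is the same: an import that resolves nowhere is a hard failure, not a warning.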
💡 Key idea: The layers aren’t interchangeable. They’re ordered by speed and determinism: fast deterministic checks run on every commit; slow probabilistic checks run on merge queues with explicit retry budgets. A team I worked with went from a 30% flake rate to under 5% by separating these two tiers, which made the signal trustworthy again.
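The two-tier split can be made concrete. This is a minimal sketch under assumed names (`Check`, `run_commit_tier`, `run_merge_tier` are illustrative, not from any CI product): deterministic checks run on every commit with no retries, while probabilistic checks run only at the merge queue with an explicit retry budget.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Check:
    name: str
    run: Callable[[], bool]
    deterministic: bool

def run_commit_tier(checks: list[Check]) -> bool:
    """Per-commit tier: fast deterministic checks only, zero retries.
    A failure here is a real failure, never a flake."""
    return all(c.run() for c in checks if c.deterministic)

def run_merge_tier(checks: list[Check], retry_budget: int = 2) -> bool:
    """Merge-queue tier: probabilistic checks get an explicit retry budget,
    so a transient flake does not block the queue -- but only up to the budget."""
    for check in checks:
        if check.deterministic:
            continue  # already gated at commit time
        if not any(check.run() for _ in range(1 + retry_budget)):
            return False
    return True
```

The key design choice is that the retry budget is explicit and bounded: retries hide flakes only in the tier where flakes are expected, and never in the deterministic tier.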

The Anti-Patterns That Quietly Kill Gates

  • The aspirational gate that exists in warning-only mode forever. Delete it or fix it.
  • The Swiss-army gate that bundles everything into one slow, serial job. Parallelize.
  • The override culture where the bypass button is used weekly. Instrument overrides; expect fewer than one a month on a healthy team.
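"Instrument overrides" can be as simple as grouping bypass events by month and flagging anything above the healthy baseline. A hedged sketch (the function name and threshold default are illustrative):

```python
from collections import Counter
from datetime import date

def flag_override_months(overrides: list[date], threshold: int = 1) -> list[str]:
    """Group gate bypasses by month and return months above the threshold.
    A healthy team should see fewer than one override per month."""
    by_month = Counter(d.strftime("%Y-%m") for d in overrides)
    return sorted(month for month, count in by_month.items() if count > threshold)
```

Feeding this from your CI audit log turns "override culture" from a vibe into a number you can review monthly.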

“Gate strictness is a dial, not a switch. Run a monthly gate review: classify every firing as true or false positive; tune any rule with over ten percent false positives. Do not add a new rule in response to a single incident — wait for a second occurrence to prove the class exists.”
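The monthly gate review described above can be mechanized. Assuming each gate firing has been classified as a true or false positive during the review, this sketch (illustrative names throughout) surfaces the rules that exceed the ten-percent false-positive cutoff:

```python
def rules_to_tune(firings: list[tuple[str, bool]],
                  max_fp_rate: float = 0.10) -> list[str]:
    """Given (rule_name, was_true_positive) pairs from one month of gate
    firings, return the rules whose false-positive rate exceeds the cutoff."""
    totals: dict[str, int] = {}
    false_pos: dict[str, int] = {}
    for rule, was_true in firings:
        totals[rule] = totals.get(rule, 0) + 1
        if not was_true:
            false_pos[rule] = false_pos.get(rule, 0) + 1
    return sorted(
        rule for rule, total in totals.items()
        if false_pos.get(rule, 0) / total > max_fp_rate
    )
```

The classification step still requires human judgment; the point of the script is only that the tuning decision becomes arithmetic rather than argument.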


The Reviewer Agent Pattern

A reviewer agent, run independently of the author agent, reads the diff and emits comments. Give it a different model when possible — different failure modes approximate classical pair review. Reviewer agents catch consistency violations across large diffs and Context Pack contradictions. They do not replace humans on product taste.
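The independence constraint is the load-bearing part of the pattern, so it is worth enforcing in code. This is a minimal sketch under an assumed "prompt in, text out" model interface (the `Model` type, function names, and prompt wording are all hypothetical, not any real provider's API):

```python
from typing import Callable

# Hypothetical interface: a model is just "prompt in, text out".
Model = Callable[[str], str]

def review_diff(diff: str, reviewer: Model,
                author_model_name: str, reviewer_model_name: str) -> list[str]:
    """Run a reviewer agent independent of the author agent. Refuses to run
    if both roles share a model -- same model, same blind spots."""
    if reviewer_model_name == author_model_name:
        raise ValueError("reviewer must use a different model than the author")
    prompt = (
        "You did not write this change. Review the diff for consistency "
        "violations across files and contradictions with the Context Pack:\n"
        + diff
    )
    # One comment per non-empty line of the reviewer's response.
    return [line for line in reviewer(prompt).splitlines() if line.strip()]
```

Making the model-diversity check a hard error, rather than a convention, mirrors the rest of the stack: constraints that live in documentation get skipped; constraints that live in code get enforced.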

⚠️ What not to do: Do not ship without types where the language supports them. Do not rely on code coverage as a quality signal. Do not let the author agent silence its own lint violations. Do not combine reviewer and author agents into one run. Do not let legacy exceptions accumulate without a retirement budget. Each of these failures invalidates a layer of the stack; each has cost a team I’ve worked with a production incident.

Next up — Chapter 11: The Art of Agentic Debugging. Gates catch the structural problems. But the interesting bugs in AI-native systems are the ones that slip through gates and then don’t reproduce locally. Chapter 11 is about the self-correction loop, bisection under velocity, observability as substrate, and the incident debugging practices that scale to AI-native throughput.


📖 Want the full picture?

The chapter walks each layer in depth with concrete tool recommendations and configuration snippets, the flaky-pipeline case study (30% to sub-5% flake rate), the regulated fintech rollout over a quarter with month-by-month sequencing, the full anti-pattern catalog, and the reviewer agent pattern with the model-diversity configuration that catches bugs a single model would miss.

Get The Engineering of Intent on Amazon →

2026-04-26

Sho Shimoda

I share and organize what I’ve learned and experienced.