Blog

Engineering notes on agentic QA

How we actually build, operate, and trust agentic test fleets in production, written from inside live engagements, with the numbers included.

Field Notes · Agentic QA Strategy

Data Privacy and Agentic AI in Testing: Aligning Your QA Stack With UK ICO Guidance

9 min read

Data privacy is the top barrier to agentic AI in QA, cited by 67%. How to align your agentic AI testing stack with the UK ICO's direction of travel.

Engineering Notes · Agentic QA

Don't Break Checkout: Agentic QA for Revenue-Critical Retail Funnels

12 min read

On a retail site the checkout funnel is revenue, and it's fragile — payment gateways you can't hit for real, inventory and pricing that change constantly, promo-rule combinatorics, and silent funnel regressions that still pass. Here's how an agent protects the path to purchase.

Field Notes · Agentic QA Strategy

Self-Healing Mobile Test Automation in CI: What Actually Works for iOS, Android, and React Native

10 min read

Self-healing mobile test automation in CI is harder than its web equivalent. What actually works for iOS, Android, and React Native in 2026: architecture and feedback timing.

Engineering Notes · Agentic QA

Ten Thousand Article Templates and Three Ad Networks: Agentic QA for Publishers

11 min read

Publisher sites render thousands of article permutations through a handful of templates, gate content behind paywalls, and load third-party ad scripts that break layout and Core Web Vitals. Here's how an agent tests content at scale — permutations, metering personas, ad determinism, and structured-data correctness.

Field Notes · Agentic QA Strategy

What the Big 4's Agentic Testing Playbook Gets Wrong for UK Mid-Market Teams

8 min read

Deloitte, KPMG, PwC and EY built agentic testing for eight-figure budgets. Why UK mid-market teams need a different model — and the leaner alternative.

Engineering Notes · Agentic QA

The Atomic Developer: Maintaining Balance in the Age of AI Agents

10 min read

AI agents multiply your output overnight, but your attention doesn't. The workflows, gates and habits that keep agent-speed work sustainable rather than a route to burnout.

Field Notes · Agentic QA Strategy

Hallucination, Flakiness, and Trust: How to Evaluate an Agentic AI Test Agent in 2026

9 min read

Agentic test agents are harder to buy than traditional tools. A vendor-neutral twelve-point checklist to evaluate an agentic AI test agent in 2026.

Engineering Notes · Agentic QA

The CRM That's Different in Every Org: Agentic QA for Configurable Enterprise Apps

12 min read

Enterprise CRMs are configured differently in every org, render differently per role, and fire workflows you can't see. Selector-based tests can't keep up. Here's how an agent that reasons by intent tests a CRM — across personas, test data, integrations, and the side-effects that hide in passing runs.

Field Notes · Agentic QA Strategy

Test Maintenance Is Eating Your QA Budget. Here's Where Self-Healing Actually Pays Back

8 min read

Test maintenance eats 60–80% of automation effort. Here's where self-healing test automation ROI is real, where it isn't, and the preconditions that decide the difference.

Engineering Notes · Agentic QA

The Self-Driving Codebase

18 min read

Running an AI coding agent as a near-autonomous engineering team for a year — the artefacts, eval gates, and adversarial review that make the autonomy safe.

Engineering Notes · Agentic QA

Write Once, Break Twice: Agentic QA Across React Native's Two Runtimes

11 min read

One React Native codebase, two native runtimes, and bugs that show up on only one of them. Here's how we run an agent across iOS and Android together — the testID contract, bridge synchronization, and detecting silent platform divergence before users do.

Field Notes · Agentic QA Strategy

Pilot Purgatory: Why Most Agentic AI QA Projects Stall Before Production

8 min read

Only 15% of organisations have scaled agentic AI in QA. Here's why teams stall when taking an agentic AI QA pilot to production, and how UK software teams escape.

Engineering Notes · Agentic QA

Espresso, UI Automator, and an Agent: Taming the Android Device Matrix

11 min read

Android's device matrix breaks deterministic test suites in ways iOS never does. Here's how we engineer an agent driving Espresso and UI Automator across fragmentation: state reset, OEM divergence, the resource-id contract, and telling device-specific failures apart from real ones.

Engineering Notes · Agentic QA

Pointing an Agent at XCUITest: The Seven Things That Decide Signal From Noise

11 min read

iOS is a harder target for agentic QA than the browser. Here's how we engineer the seven things that decide whether an agent driving XCUITest is signal or noise — state, network mocking, accessibility identifiers, screen actions, anomaly watching, versioned config, and the xcresult bundle.

Engineering Notes · Agentic QA

Making an Agentic Test Run Boring: Determinism, Retries, and the Flake Budget

9 min read

Agentic tests fail in a different shape from traditional end-to-end tests. Here's how to engineer a flake budget, a failure taxonomy, and the determinism levers that actually move the number.

Engineering Notes · Agentic QA

Evals Are the Test Suite for Your Test Suite: Running Agentic QA in Production

10 min read

Once you ship agentic QA, you have two systems that can regress — the product, and the agent. Most teams only instrument the first. Here's the eval harness, golden traces, and model-upgrade protocol that keep an agentic fleet honest in production.
