How the AI works

A deeper look at the orchestrator and the specialized AI agents behind a BugBrain test run — RECON, PLANNER, ACTOR, VERIFIER, CRITIC, TRIAGER, and CURATOR — and how their reasoning surfaces in the action log.

A BugBrain test run is not a single giant prompt to one model. It is a coordinated team of specialized AI agents, each with one clear job, working under an orchestrator that decides who does what and when. This division of labor is what lets a run stay accurate, stay on budget, and — crucially — stay inspectable: every decision is logged so you can read exactly why the agent did what it did.

The agents and their roles

Think of a run like a careful QA tester broken into focused specialists. The orchestrator drives the loop and passes context between them:

  • RECON maps the surface area. It looks at the app and figures out what's reachable — the pages, screens, and key interactions — so the rest of the run knows the territory.
  • PLANNER prioritizes flows. Given the mapped surface area, it decides which user journeys are worth exercising first (sign-up, checkout, search) so effort goes where it matters.
  • ACTOR drives the browser. It carries out a flow step by step in a real browser — clicking, typing, navigating — the way a person would.
  • VERIFIER checks pre- and post-conditions. Before and after each step it confirms the app is in the state it should be, so a "success" is actually a success and not a silent failure.
  • CRITIC judges failures. When something looks wrong, it weighs the evidence and decides whether it's a genuine problem or just noise.
  • TRIAGER finds root cause. For confirmed failures it reasons about why the failure happened, not just that it did.
  • CURATOR synthesizes the findings. It turns the run's raw observations into clean issues, removes duplicates, and updates test cases so the next run is smarter.

Each agent has a narrow remit on purpose. A single model asked to do everything tends to be overconfident and hard to debug. Several focused agents, each checked by the next, produce results you can trust and trace.
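The hand-offs described above can be pictured as a small loop. The following is a minimal sketch only, not BugBrain's actual code: the `RunContext` fields, every function body, and the rule that a flow containing "checkout" fails are hypothetical stubs chosen to make the control flow visible.

```python
from dataclasses import dataclass, field


@dataclass
class RunContext:
    """Hypothetical shared state the orchestrator hands from agent to agent."""
    pages: list = field(default_factory=list)
    flows: list = field(default_factory=list)
    failures: list = field(default_factory=list)
    issues: list = field(default_factory=list)
    log: list = field(default_factory=list)


def recon(ctx):
    # Stub: pretend two pages were discovered.
    ctx.pages = ["/signup", "/checkout"]


def planner(ctx):
    # Stub: one flow per discovered page, in discovery order.
    ctx.flows = [f"visit {page}" for page in ctx.pages]


def actor(ctx, flow):
    # Stub: "perform" the flow; the checkout flow is made to fail.
    return {"flow": flow, "ok": "checkout" not in flow}


def verifier(ctx, outcome):
    # Stub: a post-condition check on what the actor observed.
    return outcome["ok"]


def critic(ctx, outcome):
    # Stub: treat every failed check as a genuine problem, not noise.
    return True


def triager(ctx, outcome):
    # Stub: attach a placeholder root cause to the failure.
    return {"flow": outcome["flow"], "cause": "unknown (stub)"}


def curator(ctx):
    # Stub: de-duplicate failures by flow name and publish them as issues.
    unique = {}
    for failure in ctx.failures:
        unique.setdefault(failure["flow"], failure)
    ctx.issues = list(unique.values())


def run(ctx):
    """Call the agents in the order described above, logging each hand-off."""
    def call(name, agent, *args):
        ctx.log.append(name)
        return agent(ctx, *args)

    call("RECON", recon)
    call("PLANNER", planner)
    for flow in ctx.flows:
        outcome = call("ACTOR", actor, flow)
        if call("VERIFIER", verifier, outcome):
            continue  # post-conditions hold, so there is nothing to judge
        if call("CRITIC", critic, outcome):
            ctx.failures.append(call("TRIAGER", triager, outcome))
    call("CURATOR", curator)
    return ctx
```

Calling `run(RunContext())` leaves the sequence of agent names in `ctx.log`. In this sketch CRITIC and TRIAGER appear only for the flow whose verification failed, which mirrors the idea that each agent is checked by the next.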

What wraps the loop

Around this agent loop sits a layer of infrastructure that keeps runs safe, affordable, and honest:

  • A provider router picks the right AI model for each agent and plan tier, and falls back to an alternative if a provider is unavailable — so you never pick a model by hand.
  • Safety checks constrain what the agent is allowed to do and where it's allowed to go, so a run stays inside your app and inside its budget.
  • Oracles are the judgment functions that decide whether an outcome is correct — covering things like accessibility, data consistency, and visual regressions.
  • Rate limiting and circuit breakers protect both your app and the AI providers from being overwhelmed.
  • Memory lets the run carry context forward — what it has already seen, which elements it has located, what it has already learned about the app.

You don't configure these directly. They run quietly so that every run behaves predictably.
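To make the provider router's fallback behavior concrete, here is a minimal sketch under stated assumptions. The routing table, the provider names, the "pro" tier, and the `ProviderUnavailable` exception are all hypothetical; the source does not describe how BugBrain implements routing.

```python
class ProviderUnavailable(Exception):
    """Hypothetical error raised when a provider cannot serve a request."""


# Hypothetical routing table: (agent, plan tier) -> providers in preference order.
ROUTES = {
    ("ACTOR", "pro"): ["provider-a", "provider-b"],
    ("CRITIC", "pro"): ["provider-b", "provider-a"],
}


def route(agent, tier, call, routes=ROUTES):
    """Try each provider for this agent and tier in order; return the first success."""
    errors = []
    for provider in routes[(agent, tier)]:
        try:
            return provider, call(provider)
        except ProviderUnavailable as exc:
            # Remember why this provider was skipped, then fall back to the next.
            errors.append(f"{provider}: {exc}")
    # Every provider failed, so surface all the reasons together.
    raise ProviderUnavailable("; ".join(errors))


def fake_call(provider):
    # Simulated provider client: provider-a is down, provider-b answers.
    if provider == "provider-a":
        raise ProviderUnavailable("down")
    return f"response from {provider}"
```

With the simulated outage, `route("ACTOR", "pro", fake_call)` skips `provider-a` and returns the response from `provider-b`, so the caller never has to pick a model by hand.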

Why it matters: the action log

Because the work is split across named agents, BugBrain can show you the reasoning behind a result. The AI action log in the run viewer records each step as the agent took it — what it observed on the page, what it decided to do next, and what it saw happen as a result.

That means a finding is never a black box. If a run found nothing on a page you care about, the log shows whether the agent ever reached it. If a flow failed, the log shows the exact step that broke and what the agent expected instead. This traceability is the practical payoff of the multi-agent design.
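One way to picture an action log entry and the two questions above is the sketch below. The `LogEntry` fields and both helper functions are hypothetical illustrations; the real log's schema is not specified in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical shape of one action-log step."""
    step: int
    agent: str
    observed: str   # what the agent saw on the page
    decided: str    # what it chose to do next
    result: str     # what it saw happen as a result
    ok: bool = True


def reached(log, page):
    """Did any step observe the given page?"""
    return any(page in entry.observed for entry in log)


def first_failure(log):
    """Return the first step that did not go as expected, or None."""
    return next((entry for entry in log if not entry.ok), None)


# Made-up sample data for illustration only.
sample_log = [
    LogEntry(1, "ACTOR", "page /signup with email field", "type email", "field filled"),
    LogEntry(2, "ACTOR", "page /signup with submit button", "click submit",
             "error banner shown, expected redirect to /welcome", ok=False),
]
```

On this sample, `reached(sample_log, "/checkout")` is false, which would explain an empty result for that page, and `first_failure(sample_log)` points at step 2 together with what the agent expected instead.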

Read the log when a result surprises you

The fastest way to understand any run — a surprising pass, a confusing failure, or an empty result — is to open the run viewer and read the AI action log alongside the timeline screenshots. The two together tell the whole story.

Frequently asked questions

Is BugBrain just one big model prompt?

No. A run is coordinated by an orchestrator that hands work to several specialized agents, each with a focused job: mapping the app, choosing flows, driving the browser, verifying outcomes, judging failures, finding root causes, and synthesizing issues. Splitting the work keeps each step accurate and inspectable.

Can I see what the AI was thinking during a run?

Yes. The AI action log in the run viewer records each step as the agent took it (what it observed, what it decided to do next, and what it saw happen as a result), so a result is never a black box.

Which AI model does BugBrain use?

A provider router picks an appropriate model for each agent and plan tier. You don't choose models per run; BugBrain routes work to the right model and falls back safely if a provider is unavailable.

Does the AI ever guess at a result?

No. Safety checks, outcome oracles, and a confidence layer wrap the loop so the agent reports INCONCLUSIVE when it can't be sure, rather than guessing a pass. See Test-run scoring.
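The idea of reporting INCONCLUSIVE instead of guessing can be sketched as a simple decision rule. This is an illustration only: the `verdict` function, its parameters, and the 0.8 threshold are assumptions, and the actual rules are covered in Test-run scoring.

```python
def verdict(passed_checks, total_checks, confidence, threshold=0.8):
    """Hypothetical rule: only commit to PASS or FAIL when confidence is high enough."""
    if total_checks == 0 or confidence < threshold:
        # Not enough evidence either way, so do not guess.
        return "INCONCLUSIVE"
    return "PASS" if passed_checks == total_checks else "FAIL"
```

Under these assumptions, a run where every check passed but confidence is only 0.5 is reported as INCONCLUSIVE rather than PASS.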