AI Failure Analysis
When a BugBrain test fails, the platform doesn’t just say “test failed”. Instead, the AI failure analysis pipeline automatically classifies why it failed and suggests concrete fixes.
This saves hours of manual debugging and helps your team fix issues faster.
How Failure Analysis Works
When a test fails, BugBrain automatically analyzes the failure:
Step 1: Failure Classification
BugBrain recognizes the failure pattern and categorizes it into one of 8 failure types (see below).
Example:
Error message: "element not found"
→ Classified as: selector_changed (Element moved or was redesigned)
Step 2: Detailed Explanation
BugBrain generates a human-readable explanation of what went wrong and how to fix it.
Example explanation:
What happened: The 'Sign In' button was not found
Why it failed: The button selector changed.
The expected selector no longer exists on the page.
How to fix:
1. Update to a more stable selector like [data-testid="signin-button"]
2. Ask developers to add data-testid for better test stability
3. Use a more specific element selector
Results are displayed in the Execution Detail page with visual evidence (screenshots, logs).
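The rule-based step can be pictured as a pattern table applied to the error message. This is an illustrative sketch using the category names from this page; the regex patterns are examples drawn from the error messages below, not BugBrain's actual internal rules:

```javascript
// Illustrative sketch of rule-based failure classification.
// Patterns are examples from this page, not BugBrain's real rules.
const FAILURE_PATTERNS = [
  { category: 'selector_changed', pattern: /element not found|failed to find element|locator did not resolve/i },
  { category: 'auth_failure', pattern: /HTTP 40[13]|unauthorized|forbidden|invalid credentials|session expired/i },
  { category: 'timeout', pattern: /timeout|max iterations reached/i },
  { category: 'api_error', pattern: /HTTP [45]\d\d|failed with status \d{3}/i },
  { category: 'environment_issue', pattern: /ECONNREFUSED|DNS lookup failed|SSL certificate|ERR_NETWORK/i },
];

function classifyFailure(errorMessage) {
  // First matching pattern wins; more specific categories are checked first.
  for (const { category, pattern } of FAILURE_PATTERNS) {
    if (pattern.test(errorMessage)) return category;
  }
  return 'unknown'; // no rule matched: low confidence, escalate to LLM review
}
```

Note that order matters: 401/403 must be checked before the generic 4xx/5xx rule, or every auth failure would be classified as an API error.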
The 8 Failure Categories
1. selector_changed — UI Element Not Found
What it means: The test expected to find an element (button, input, etc.) but it doesn’t exist at the expected location.
Example error messages:
- “element not found”
- “failed to find element”
- “locator did not resolve to any elements”
- “can’t find element matching selector”
Why it happens:
- UI was redesigned and selectors changed
- Element is hidden (CSS display: none or visibility: hidden)
- Element is inside a shadow DOM
- Dynamic content hasn’t loaded yet
- Wrong page (navigation failed)
How to fix:
- Update selector to use stable attributes ([data-testid])
- Add explicit waits before clicking
- Ask developers to add data-testid attributes
- Increase element_wait_timeout_ms (wait longer for the element)
Example fix:
// ❌ Brittle (changed when designer renamed CSS class)
await page.click('.btn--primary');
// ✅ Stable (resistant to CSS changes)
await page.click('[data-testid="submit-button"]');
2. api_error — Backend Error (4xx or 5xx)
What it means: Your application made an API request and the server returned an error status code.
Example error messages:
- “HTTP 500 Internal Server Error”
- “HTTP 404 Not Found”
- “HTTP 400 Bad Request”
- “POST /api/orders failed with status 503”
Why it happens:
- Backend bug or crash
- Invalid request format
- Missing required fields
- Database error
- Service temporarily down
How to fix:
- Check server logs for the specific error
- Verify test data is valid (email format, required fields)
- Verify credentials are correct
- Check if backend service is running
- Retry the test (may be transient)
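During triage it helps to translate the raw status code into a likely cause, mirroring the list above. A hypothetical helper (the function name and messages are illustrative, not a BugBrain API):

```javascript
// Hypothetical helper: map an API failure to a likely cause for triage notes.
function describeApiError(method, path, status) {
  const causes = {
    400: 'Invalid request format or missing required fields',
    401: 'Credentials rejected — check persona setup',
    404: 'Endpoint or resource not found — check URL and test data',
    500: 'Backend bug or database error — check server logs',
    503: 'Service temporarily down — retry, may be transient',
  };
  return `${method} ${path} → ${status}: ${causes[status] ?? 'Check server logs'}`;
}
```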
Example investigation:
Test: "Complete checkout"
Failed at: "Submit payment"
API Error: POST /api/payments → 400 Bad Request
Next step:
1. Check server logs for 400 error details
2. Verify payment amount and format
3. Verify payment method is valid
3. timeout — Test Took Too Long
What it means: The test didn’t complete within the time limit, either because the page loads too slowly or because the AI took too many iterations to decide what to do.
Example error messages:
- “Timeout waiting for selector”
- “Navigation timeout after 30000ms”
- “Max iterations reached (50)”
- “Test exceeded 300s time limit”
Why it happens:
- Page loads slowly (large images, slow API)
- External resources are slow (CDN, third-party scripts)
- Test has too many steps (AI takes many iterations)
- Infinite loop (page doesn’t transition to next step)
- Network lag
How to fix:
- Increase the execution timeout (max_execution_time: 600)
- Optimize page performance (reduce image sizes, lazy-load content)
- Simplify test (fewer steps = fewer iterations)
- Use more specific instructions (“Click the red button” vs “Click a button”)
- Add explicit waits to help the AI understand the page state
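An explicit wait is just a bounded poll: retry a check until it passes or the time budget runs out. A minimal sketch of the idea (the `waitFor` helper and its defaults are illustrative, not a BugBrain or Playwright API):

```javascript
// Illustrative polling helper: retry a check until it passes or time runs out.
async function waitFor(check, { timeoutMs = 30000, intervalMs = 250 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await check()) return true; // condition met
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Timed out after ${timeoutMs}ms waiting for condition`);
}
```

Bounding the wait keeps a stuck page from consuming the whole test budget, while the interval avoids hammering the page with checks.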
Example fix:
// ❌ Vague (AI tries many actions)
"Click the button" (too many buttons on page)
// ✅ Specific (AI knows exactly what to do)
"Click the blue 'Submit' button at the bottom"
4. assertion_mismatch — Expected ≠ Actual
What it means: The test expected to see something on the page (text, value, URL) but saw something different.
Example error messages:
- “Expected ‘Success’ but found ‘Error’”
- “Expected page URL to contain ‘dashboard’ but got ‘login’”
- “Expected input to have value ‘100’ but got ‘99’”
Why it happens:
- Business logic changed (wrong value calculated)
- Data state is wrong (seed data missing, old state persists)
- Assertion is too strict (typo, case sensitivity)
- Timing issue (assertion checked before data loaded)
- API returned unexpected result
How to fix:
- Verify test data is correct (use fresh data, reset state)
- Check application logic (did developers change logic?)
- Make assertion less strict if acceptable (“contains” vs exact match)
- Add explicit wait before assertion
- Check if previous test step failed silently
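One way to make an assertion less strict is to normalize both sides before comparing, so case and whitespace differences stop causing false failures. An illustrative helper (`containsText` is hypothetical, not a BugBrain API):

```javascript
// Illustrative: normalize case and whitespace before comparing,
// so "  Payment   SUCCESS!\n" still matches "success".
function containsText(pageText, expected) {
  const normalize = (s) => s.toLowerCase().replace(/\s+/g, ' ').trim();
  return normalize(pageText).includes(normalize(expected));
}
```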
Example fix:
// ❌ Strict assertion (fails on whitespace, case)
assert page.contains("Success");
// ✅ Flexible assertion
assert page.includes("success") || page.includes("Success");
5. environment_issue — Network/Infrastructure Error
What it means: The browser couldn’t reach the server due to network or SSL issues. This is a connectivity problem, not an application error.
Example error messages:
- “ECONNREFUSED (connection refused)”
- “DNS lookup failed”
- “SSL certificate error”
- “net::ERR_NETWORK_CHANGED”
- “Failed to establish a new connection”
Why it happens:
- Firewall blocking connection
- DNS server is down or misconfigured
- SSL/TLS certificate invalid or expired
- Proxy misconfiguration
- Server is offline
- VPN connection dropped
How to fix:
- Verify app URL is accessible (test in browser manually)
- Check DNS resolution: nslookup app.example.com
- Verify the SSL certificate is valid: openssl s_client -connect app.example.com:443
- Check firewall rules (allow HTTPS port 443)
- If using a proxy, verify proxy configuration
- Test from a different network (rule out local connectivity issues)
- Check if the server is running: ping app.example.com
6. auth_failure — Login Rejected
What it means: Authentication failed. The credentials were rejected or the login process encountered a permissions issue.
Example error messages:
- “HTTP 401 Unauthorized”
- “HTTP 403 Forbidden”
- “Invalid credentials”
- “Password incorrect”
- “User not found”
- “Session expired”
Why it happens:
- Wrong credentials (user doesn’t exist or password is wrong)
- Persona not set up correctly
- User account is locked or disabled
- Session expired (auth took too long)
- MFA is blocking login
- User lacks required permissions
How to fix:
- Verify persona credentials are correct
- Check if user account exists in the app
- Check if account is active (not locked/disabled)
- Verify MFA setup if 2FA is enabled
- Check if user has required role/permissions
- If session expired, increase timeout or use session caching
- Review auth logs in the application
Example setup:
// Ensure persona is created with correct credentials
Persona: "QA Admin"
├─ Email: qa-admin@example.com
├─ Password: MySecurePass123!
├─ Auth Type: basic
└─ Status: Active ✓
// Then use in test
Use persona: "QA Admin"
7. data_state_issue — Required Data Missing
What it means: The test reached a state where expected data doesn’t exist. This is usually because the database state is wrong.
Example error messages:
- “No items found in cart”
- “User not found”
- “Order ID not in database”
- “Empty list when expecting 10 items”
Why it happens:
- Test data wasn’t created (missing setup step)
- Previous test deleted the data
- Data expires automatically
- Test ran on wrong environment (dev vs staging)
- Database was reset between tests
- Concurrent tests interfered with each other
How to fix:
- Create required data before running test (setup fixtures)
- Verify you’re testing on the right environment
- Use fresh, unique data per test run (avoid state dependencies)
- Run tests serially if they modify shared data
- Add explicit data creation steps at test start
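Fresh, unique data per run can come from a small helper, so concurrent runs never collide on the same record. An illustrative sketch (the field names and helper are hypothetical):

```javascript
// Illustrative: generate unique fixture data per test run so parallel
// runs and reruns never collide on the same email or product.
let fixtureCounter = 0;

function uniqueFixture(prefix = 'test') {
  // Timestamp plus a counter guarantees uniqueness within a process.
  const runId = `${Date.now().toString(36)}-${(fixtureCounter++).toString(36)}`;
  return {
    email: `${prefix}-${runId}@example.com`,
    productName: `${prefix}-product-${runId}`,
  };
}
```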
Example fix:
// ❌ Assumes product exists
"Navigate to product page"
"Add to cart"
// ✅ Ensures product exists
"Create product via API"
"Navigate to product page"
"Add to cart"
8. flaky_infrastructure — Rate Limiting, Service Degradation
What it means: The infrastructure is temporarily overloaded or rate limiting requests. This is a temporary system issue, not a test bug.
Example error messages:
- “HTTP 429 Too Many Requests”
- “HTTP 503 Service Unavailable”
- “Rate limit exceeded”
- “Quota exceeded”
- “Too many connections”
Why it happens:
- Too many tests running concurrently
- API rate limit hit (exceeded quota)
- Service is degraded or overloaded
- Database connection pool exhausted
- Redis/cache is full
How to fix:
- Reduce test concurrency (run fewer tests in parallel)
- Add delays between test runs
- Upgrade infrastructure (add more resources)
- Implement backoff/retry logic (exponential backoff)
- Wait for service to recover (check status page)
- Check if rate limit was exceeded; upgrade API plan if needed
- Stagger test plan runs (don’t all start at same time)
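Exponential backoff means the wait between retries doubles each attempt, up to a cap, so a rate-limited service gets progressively more breathing room. A minimal sketch (helper names and default values are illustrative):

```javascript
// Illustrative exponential backoff: delay doubles per attempt, capped.
function backoffDelayMs(attempt, { baseMs = 500, maxMs = 30000 } = {}) {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Retry a flaky operation, sleeping longer after each failure.
async function withRetry(fn, { retries = 5 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries) throw err; // budget exhausted: give up
      await new Promise((r) => setTimeout(r, backoffDelayMs(attempt)));
    }
  }
}
```

With these defaults the delays run 500 ms, 1 s, 2 s, 4 s, ..., capped at 30 s, which is usually enough for an HTTP 429 window to reset.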
Reading the Failure Intelligence Panel
After a test fails, BugBrain displays the failure analysis in the Execution Detail page:
┌─────────────────────────────────────────────────────────────┐
│ FAILURE INTELLIGENCE │
├─────────────────────────────────────────────────────────────┤
│ Category: selector_changed │
│ Confidence: 95% ████████░░ │
│ Classification: Rule-based + LLM verified │
├─────────────────────────────────────────────────────────────┤
│ WHAT HAPPENED │
│ The 'Checkout' button was not found on the page. │
├─────────────────────────────────────────────────────────────┤
│ WHY IT FAILED │
│ The button selector changed. The test was looking for │
│ element with ID '#checkout-btn', but the element no longer│
│ has that ID. The button was likely redesigned. │
├─────────────────────────────────────────────────────────────┤
│ HOW TO FIX │
│ 1. Ask developers to add [data-testid='checkout-btn'] │
│ 2. Update test step selector to: [data-testid='checkout'] │
│ 3. Add explicit wait: wait for element visible │
│ 4. Increase timeout to 60 seconds │
├─────────────────────────────────────────────────────────────┤
│ [Re-trigger Analysis] [Copy Failure Details] │
└─────────────────────────────────────────────────────────────┘
Confidence Scoring
BugBrain assigns a confidence score (0–100%) to its failure category classification:
| Confidence | Meaning | Action |
|---|---|---|
| 95–100% | Very confident | Trust the diagnosis completely |
| 80–94% | Confident | Trust the diagnosis with minor verification |
| 70–79% | Somewhat confident | Diagnosis is likely, but double-check |
| < 70% | Low confidence | Manual investigation recommended |
How confidence is calculated:
- Rule-based match gets base score (72–95%)
- Multiple matching patterns boost confidence (+5% each)
- LLM review provides second opinion
- Final score reflects combined confidence
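The steps above can be sketched as a small scoring function. The 72–95% base range and the +5% per-pattern boost are the figures quoted on this page; the penalty applied when the LLM disagrees is a hypothetical placeholder, not BugBrain's actual formula:

```javascript
// Sketch of the confidence scoring described above (illustrative values).
function confidenceScore(baseScore, extraPatternMatches, llmAgrees) {
  // baseScore: rule-based match score, 72–95 per the docs.
  let score = baseScore + 5 * extraPatternMatches; // +5% per extra pattern match
  if (!llmAgrees) score -= 20; // hypothetical penalty when the LLM disagrees
  return Math.max(0, Math.min(100, score)); // clamp to 0–100%
}
```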
Re-Triggering Analysis
If BugBrain’s analysis seems wrong, you can manually re-trigger analysis:
- Open the failed execution
- Click “Re-trigger Analysis” button
- BugBrain re-analyzes the failure with the latest LLM models
- The updated analysis appears within about 30 seconds
Useful if:
- Recent code changes invalidated the analysis
- Classification seems wrong
- You want to use a different LLM model
Viewing Failure Trends
To see if failures are systematic or flaky:
- Go to Test Case Detail
- View Execution History
- Filter by date range
- Look at Failure Rate Over Time
Last 7 days:
Mon: 1 pass, 0 fail (100%)
Tue: 1 pass, 1 fail (50%) ← Flaky?
Wed: 0 pass, 1 fail (0%) ← Broken?
Thu: 1 pass, 0 fail (100%)
Fri: 1 pass, 0 fail (100%)
→ Test is mostly passing but occasionally flaky
→ Likely a timing issue or test data state
Cost of Failure Analysis
The failure analysis pipeline is intelligent about cost:
- Rule-based classification — Free (< 1ms, no LLM)
- LLM refinement — Only if low confidence (saves 80% of LLM cost)
- LLM explanation — Slight cost (~$0.001–0.01 per failure)
Total cost per test failure: ~$0.001–0.02 (orders of magnitude cheaper than the test itself).
Best Practices
1. Use Stable Selectors
// ❌ Unstable
await page.click('.btn'); // CSS class can change
// ✅ Stable
await page.click('[data-testid="submit"]'); // Unlikely to change
2. Add Assertions Between Steps
// ❌ One big step
"Fill form and submit and verify success"
// ✅ Steps with assertions
"Fill email" → assert email field has value
"Fill password" → assert password field has value
"Click submit" → assert navigation to success page
3. Use Descriptive Step Names
// ❌ Vague
"Click"
// ✅ Descriptive
"Click the blue 'Checkout' button at the bottom of the cart"
4. Set Appropriate Timeouts
// ❌ Too strict (fails on slow pages)
timeout: 10 seconds
// ✅ Reasonable
timeout: 60 seconds (standard)
timeout: 120 seconds (slow pages)
Next Steps
- Browser Automation — How tests are executed
- Authentication Testing — Debug auth failures
- Getting Started — Create your first test