Discovery API
The Discovery API allows you to programmatically crawl web applications, detect user flows, and identify UI patterns and API endpoints.
Endpoints
Start Discovery Session
Begin a web application discovery crawl
POST /discovery/start
Request Body:
{
"project_id": "proj_abc123",
"target_url": "https://example.com",
"max_pages": 100,
"max_depth": 5,
"include_patterns": ["/app/*", "/dashboard/*"],
"exclude_patterns": ["/api/*", "/admin/*"],
"headless": true,
"wait_for_navigation": true,
"enable_ai_analysis": false
}
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| project_id | string | Yes | Project ID |
| target_url | string | Yes | Starting URL |
| max_pages | integer | No | Maximum pages to crawl (default: 100) |
| max_depth | integer | No | Maximum link depth (default: 5) |
| include_patterns | array | No | URL patterns to include (glob patterns) |
| exclude_patterns | array | No | URL patterns to exclude (glob patterns) |
| headless | boolean | No | Run browser headless (default: true) |
| wait_for_navigation | boolean | No | Wait for navigation events (default: true) |
| enable_ai_analysis | boolean | No | Enable AI pattern analysis (default: false) |
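The parameter table above can be turned into a small client-side check before a request is sent. The following is a minimal sketch, not part of the API: `build_start_payload` and `OPTIONAL_FIELDS` are hypothetical names, and the type rules are simply read off the table. Whether the server applies the same validation is an assumption.

```python
from typing import Any

# Optional fields and the Python types they are expected to carry,
# taken from the parameter table (hypothetical client-side mapping).
OPTIONAL_FIELDS: dict[str, type] = {
    "max_pages": int,
    "max_depth": int,
    "include_patterns": list,
    "exclude_patterns": list,
    "headless": bool,
    "wait_for_navigation": bool,
    "enable_ai_analysis": bool,
}


def build_start_payload(project_id: str, target_url: str, **options: Any) -> dict[str, Any]:
    """Assemble a /discovery/start body, rejecting unknown or mistyped fields."""
    if not project_id or not target_url:
        raise ValueError("project_id and target_url are required")
    payload: dict[str, Any] = {"project_id": project_id, "target_url": target_url}
    for name, value in options.items():
        expected = OPTIONAL_FIELDS.get(name)
        if expected is None:
            raise ValueError(f"unknown parameter: {name}")
        # bool is a subclass of int in Python, so reject True/False for integer fields.
        if not isinstance(value, expected) or (expected is int and isinstance(value, bool)):
            raise TypeError(f"{name} must be {expected.__name__}")
        payload[name] = value
    return payload
```

Omitted options are left out of the payload entirely, so the server-side defaults listed in the table still apply.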
Example Request:
curl -X POST https://api.bugbrain.tech/api/v1/discovery/start \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"project_id": "proj_abc123",
"target_url": "https://example.com",
"max_pages": 50,
"exclude_patterns": ["/admin/*", "/api/*"]
}'
Response:
{
"session_id": "disc_abc123xyz789",
"project_id": "proj_abc123",
"target_url": "https://example.com",
"status": "running",
"started_at": "2025-03-08T10:30:00Z",
"estimated_completion": "2025-03-08T10:40:00Z"
}
Get Discovery Progress
Monitor ongoing crawl progress
GET /discovery/{session_id}/progress
Example Request:
curl https://api.bugbrain.tech/api/v1/discovery/disc_abc123xyz789/progress \
-H "Authorization: Bearer $API_KEY"
Response:
{
"session_id": "disc_abc123xyz789",
"status": "running",
"progress": {
"percent": 35,
"pages_discovered": 35,
"pages_crawled": 35,
"pages_pending": 65,
"flows_detected": 3,
"patterns_found": 12,
"api_endpoints": 8
},
"current_page": "https://example.com/products",
"elapsed_seconds": 120,
"estimated_remaining_seconds": 240
}
Get Discovery Results
Retrieve completed crawl results
GET /discovery/{session_id}/results
Example Request:
curl https://api.bugbrain.tech/api/v1/discovery/disc_abc123xyz789/results \
-H "Authorization: Bearer $API_KEY"
Response:
{
"session_id": "disc_abc123xyz789",
"status": "completed",
"summary": {
"total_pages": 45,
"unique_flows": 5,
"ui_patterns": 12,
"api_endpoints": 8,
"discovery_confidence": 0.92
},
"pages": [
{
"url": "https://example.com/",
"title": "Home",
"status_code": 200,
"content_type": "text/html",
"response_time_ms": 250,
"links_found": 12,
"forms_found": 2,
"api_calls": 3
}
],
"flows": [
{
"flow_id": "flow_1",
"type": "authentication",
"pages": [
"https://example.com/login",
"https://example.com/dashboard"
],
"steps": [
{"action": "navigate", "url": "/login"},
{"action": "fill_form", "selector": "#login-form"},
{"action": "submit"},
{"action": "navigate", "url": "/dashboard"}
]
},
{
"flow_id": "flow_2",
"type": "checkout",
"pages": [
"https://example.com/products",
"https://example.com/cart",
"https://example.com/checkout",
"https://example.com/confirmation"
]
}
],
"patterns": [
{
"pattern_id": "pat_1",
"type": "form",
"occurrences": 8,
"average_fields": 5,
"examples": ["login-form", "search-form", "contact-form"]
},
{
"pattern_id": "pat_2",
"type": "navigation",
"location": "header",
"links": 6
}
],
"api_endpoints": [
{
"method": "GET",
"endpoint": "/api/products",
"response_time_ms": 125,
"status_code": 200
},
{
"method": "POST",
"endpoint": "/api/cart/add",
"response_time_ms": 200,
"status_code": 201
}
]
}
Pause Discovery
Temporarily pause a running crawl
POST /discovery/{session_id}/pause
Example Request:
curl -X POST https://api.bugbrain.tech/api/v1/discovery/disc_abc123xyz789/pause \
-H "Authorization: Bearer $API_KEY"
Response:
{
"session_id": "disc_abc123xyz789",
"status": "paused",
"pages_crawled": 30,
"paused_at": "2025-03-08T10:35:00Z"
}
Resume Discovery
Resume a paused crawl
POST /discovery/{session_id}/resume
Example Request:
curl -X POST https://api.bugbrain.tech/api/v1/discovery/disc_abc123xyz789/resume \
-H "Authorization: Bearer $API_KEY"
Response:
{
"session_id": "disc_abc123xyz789",
"status": "running",
"resumed_at": "2025-03-08T10:36:00Z"
}
Cancel Discovery
Stop and cancel a crawl session
DELETE /discovery/{session_id}
Example Request:
curl -X DELETE https://api.bugbrain.tech/api/v1/discovery/disc_abc123xyz789 \
-H "Authorization: Bearer $API_KEY"
Response:
{
"session_id": "disc_abc123xyz789",
"status": "cancelled",
"cancelled_at": "2025-03-08T10:36:00Z",
"pages_crawled": 30
}
Stream Discovery Events
Real-time event stream (Server-Sent Events)
GET /discovery/{session_id}/stream
Usage:
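// Note: the browser's built-in EventSource cannot send custom headers.
// This snippet assumes an implementation that accepts a `headers` option
// (for example, version 2.x of the eventsource npm package on Node.js).
// apiKey holds your API key.
const eventSource = new EventSource(
  'https://api.bugbrain.tech/api/v1/discovery/disc_abc123xyz789/stream',
  {
    headers: {
      'Authorization': `Bearer ${apiKey}`
    }
  }
);
// Fired for each newly discovered page
eventSource.addEventListener('page_discovered', (event) => {
  const page = JSON.parse(event.data);
  console.log(`Discovered: ${page.url}`);
});
// Fired when a user flow is identified
eventSource.addEventListener('flow_detected', (event) => {
  const flow = JSON.parse(event.data);
  console.log(`Flow detected: ${flow.type}`);
});
// Fired once when the crawl finishes; close the stream to stop reconnects
eventSource.addEventListener('completion', () => {
  console.log('Discovery completed');
  eventSource.close();
});
Discovery Configuration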
Include/Exclude Patterns
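Use glob patterns to control which URLs are crawled. The // comments in the example are annotations only; JSON does not allow comments, so omit them from real requests: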
{
"include_patterns": [
"/app/*", // Match /app/anything
"/dashboard/**", // Match /dashboard/anything/recursively
"*.pdf" // Match any PDF
],
"exclude_patterns": [
"/api/*", // Exclude API endpoints
"/admin/**", // Exclude admin section
"*test*", // Exclude URLs with "test"
"*.zip" // Exclude zip files
]
}
Rate Limiting
Discovery crawls are rate-limited to avoid overwhelming target servers:
- Default: 1 request per 500 ms
- Configurable: Via the request_delay_ms parameter
- Respect robots.txt: Automatically honored
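The rate limit puts a floor under how long a crawl can take, which helps when choosing max_pages. The sketch below works through that arithmetic. It assumes one rate-limited request per page and ignores page load and rendering time, so real crawls will take longer. `min_crawl_seconds` is a hypothetical helper, not part of the API.

```python
def min_crawl_seconds(max_pages: int, request_delay_ms: int = 500) -> float:
    """Lower bound on crawl duration implied by the request delay alone.

    Assumes exactly one rate-limited request per crawled page.
    """
    if max_pages < 0 or request_delay_ms < 0:
        raise ValueError("max_pages and request_delay_ms must be non-negative")
    return max_pages * request_delay_ms / 1000


# With the documented defaults (100 pages, 500 ms delay) the delay alone
# accounts for 100 * 0.5 s = 50 s of wall-clock time.
print(min_crawl_seconds(100))
```

Halving request_delay_ms halves this lower bound, at the cost of putting more load on the target server.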
Flow Detection Types
| Type | Description | Example Pages |
|---|---|---|
| authentication | Login/logout flows | login → dashboard |
| checkout | Purchase flows | cart → checkout → confirmation |
| search | Search and filter flows | search → results |
| form_submission | Form interactions | form page → confirmation |
| navigation | Navigation patterns | menu → subpages |
| crud | Create/Read/Update/Delete | list → detail → edit → delete |
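When post-processing results, it can be handy to bucket the flows array by the types in the table above. This is a rough sketch operating on the flows list from a results response. `group_flows_by_type` and `KNOWN_FLOW_TYPES` are hypothetical names, and routing unrecognised types to an "unknown" bucket is a choice made here, not documented API behaviour.

```python
from collections import defaultdict

# Flow types listed in the table above.
KNOWN_FLOW_TYPES = {
    "authentication", "checkout", "search",
    "form_submission", "navigation", "crud",
}


def group_flows_by_type(flows: list[dict]) -> dict[str, list[str]]:
    """Map each flow type to the flow_ids detected for it."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for flow in flows:
        flow_type = flow.get("type")
        # Anything outside the documented set lands in a catch-all bucket.
        key = flow_type if flow_type in KNOWN_FLOW_TYPES else "unknown"
        grouped[key].append(flow["flow_id"])
    return dict(grouped)


sample_flows = [
    {"flow_id": "flow_1", "type": "authentication"},
    {"flow_id": "flow_2", "type": "checkout"},
]
print(group_flows_by_type(sample_flows))
```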
Example: Crawl and Generate Test Cases
Python:
import time

import requests

API_KEY = 'bugbrain_sk_prod_...'
API_URL = 'https://api.bugbrain.tech/api/v1'
headers = {'Authorization': f'Bearer {API_KEY}'}


def convert_flow_to_steps(flow):
    # Placeholder (assumption): pass the discovered steps through unchanged.
    # Adapt this to the step schema your test cases expect.
    return flow.get('steps', [])


# Start discovery
response = requests.post(
    f'{API_URL}/discovery/start',
    headers=headers,
    json={
        'project_id': 'proj_abc123',
        'target_url': 'https://example.com',
        'max_pages': 50,
        'enable_ai_analysis': False
    }
)
response.raise_for_status()
session_id = response.json()['session_id']
print(f"Started discovery: {session_id}")

# Poll until complete
while True:
    status = requests.get(
        f'{API_URL}/discovery/{session_id}/progress',
        headers=headers
    ).json()
    if status['status'] == 'completed':
        break
    if status['status'] == 'cancelled':
        # Avoid polling forever if the session was cancelled elsewhere
        raise SystemExit('Discovery was cancelled before completing')
    progress = status['progress']
    print(f"Progress: {progress['percent']}% - "
          f"{progress['pages_discovered']} pages, "
          f"{progress['flows_detected']} flows")
    time.sleep(5)

# Get results
results = requests.get(
    f'{API_URL}/discovery/{session_id}/results',
    headers=headers
).json()

# Generate test cases from flows
for flow in results['flows']:
    test_case = {
        'project_id': 'proj_abc123',
        'name': f"{flow['type'].title()} Flow",
        'description': 'Auto-generated from discovery',
        'steps': convert_flow_to_steps(flow)
    }
    requests.post(
        f'{API_URL}/test-cases',
        headers=headers,
        json=test_case
    )
print(f"Created {len(results['flows'])} test cases from discovered flows")
Cost Optimization: Discovery crawls without AI analysis are much faster and cheaper. Set enable_ai_analysis to true only when you need AI-generated flow descriptions.