Troubleshooting¶
Common issues and how to fix them.
Intervention Points¶
Brat surfaces problems via brat status. When intervention is needed, you'll see:
Interventions needed:
- stuck_session: s-20250121-abc1 missed heartbeat for 5m
Actions:
brat session tail s-20250121-abc1 --lines 200
brat session stop s-20250121-abc1
brat witness run --once
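If several sessions are flagged at once, the session IDs can be pulled out of the status output with a small filter. This is a minimal sketch that assumes the line format shown in the sample above (`- stuck_session: <id> ...`). The function name `stuck_sessions` is hypothetical, and the real output format may differ between versions.

```shell
# Hypothetical helper: reads `brat status` output on stdin and prints
# the ID from every "stuck_session" line, one per line.
# Assumes the "- stuck_session: <id> <message>" format from the sample above.
stuck_sessions() {
  sed -n 's/^[[:space:]]*-[[:space:]]*stuck_session:[[:space:]]*\([^[:space:]]*\).*/\1/p'
}
```

It would typically be used as `brat status | stuck_sessions`, with each printed ID passed to the commands listed under Actions.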
Common Issues¶
Stuck Session¶
Symptoms: Session shows no heartbeat for 5+ minutes.
Diagnosis:
Solutions:
- Wait a bit longer (API rate limits can cause delays)
- Stop and retry:
- Check engine health and API keys
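The stop-and-retry path can be sketched as a shell function built only from the three commands listed under Intervention Points. The function name `recover_stuck_session` and the optional line-count argument are illustrative. Whether the witness pass dispatches the work again depends on your setup and is not confirmed here.

```shell
# Hypothetical wrapper around the suggested intervention actions.
# $1: session ID (required), $2: number of log lines to show (default 200).
recover_stuck_session() {
  session_id="$1"
  if [ -z "$session_id" ]; then
    echo "usage: recover_stuck_session <session-id> [lines]" >&2
    return 2
  fi
  # Inspect recent output first: the session may only be slow, not stuck.
  brat session tail "$session_id" --lines "${2:-200}" || return 1
  # Stop the session, then run a single witness pass.
  brat session stop "$session_id" || return 1
  brat witness run --once
}
```

For example, `recover_stuck_session s-20250121-abc1` runs the tail, stop, and witness steps in order and returns early if any step fails.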
Blocked Task¶
Symptoms: Task stuck in status:blocked.
Diagnosis:
Solutions:
- Add missing context:
- Reassign to a different agent:
- Requeue the task (stop current session first)
Merge Failed¶
Symptoms: Task shows merge:failed.
Diagnosis:
Solutions:
- Retry the merge:
- Check for merge conflicts manually
- Fix conflicts in the task branch
Lock Conflict¶
Symptoms: Operations blocked by existing locks.
Diagnosis:
Solutions:
- Wait for the lock holder to finish
- Coordinate with the lock holder
- Force release (if safe):
Config Error¶
Symptoms: Commands fail with config validation errors.
Diagnosis:
Solutions:
- Check .brat/config.toml for syntax errors
- Remove unknown keys
- Fix invalid values
Daemon Down¶
Symptoms: Commands fail to connect to daemon.
Diagnosis:
Solutions:
- Start the daemon:
- Use standalone mode:
- Check daemon logs:
Projection Drift¶
Symptoms: Inconsistent state, stale data.
Diagnosis:
Solutions:
Health Checks¶
Quick Check¶
Reports issues with:
- Grite WAL consistency
- Lock state
- Session health
- Configuration
Full Rebuild¶
Rebuilds all local projections from the WAL.
Common Error Messages¶
"No such convoy"¶
The convoy ID doesn't exist or was deleted.
"Engine spawn timeout"¶
The AI engine took too long to start.
Solutions:
- Check API credentials
- Increase timeout in config:
"Lock acquisition failed"¶
Another process holds a conflicting lock.
"WAL append failed"¶
Failed to write to the Grite event log.
Solutions:
- Check disk space
- Check Git permissions
- Run grite doctor --fix
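The first two checks can be scripted before reaching for the repair command. This is a rough sketch under two assumptions: the event log lives inside the repository's `.git` directory, and 100 MiB of free space is a reasonable minimum. The function name `check_wal_prereqs` and the threshold are hypothetical.

```shell
# Hypothetical pre-flight check for "WAL append failed".
# $1: repository directory (default: current directory)
# $2: minimum free space in KiB (default: 102400, an assumed threshold)
check_wal_prereqs() {
  repo_dir="${1:-.}"
  min_free_kb="${2:-102400}"
  if [ ! -d "$repo_dir" ]; then
    echo "not a directory: $repo_dir" >&2
    return 1
  fi
  # With -P, df prints one data line; column 4 is the available space in KiB.
  free_kb="$(df -Pk "$repo_dir" | awk 'NR==2 {print $4}')"
  if [ -z "$free_kb" ] || [ "$free_kb" -lt "$min_free_kb" ]; then
    echo "low disk space: ${free_kb:-unknown} KiB free" >&2
    return 1
  fi
  # Assumption: the log is written under .git, so it must be writable.
  if [ ! -w "$repo_dir/.git" ]; then
    echo "cannot write to $repo_dir/.git" >&2
    return 1
  fi
  echo "ok"
}
```

If both checks pass and appends still fail, `grite doctor --fix` is the next step.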
Getting Logs¶
Daemon Logs¶
Session Logs¶
Verbose Output¶
Reporting Issues¶
If you encounter a bug:
- Gather diagnostic info:
- Report at github.com/neul-labs/brat/issues
Include:
- What you were trying to do
- The error message
- Diagnostic output
- Steps to reproduce
Thresholds¶
Default intervention thresholds (configurable in .brat/config.toml):
| Threshold | Default | Config Key |
|---|---|---|
| Heartbeat interval | 30s | interventions.heartbeat_interval_ms |
| Stale session | 5m | interventions.stale_session_ms |
| Blocked task escalation | 24h | interventions.blocked_task_ms |
| Merge retry limit | 2 | refinery.merge_retry_limit |
Adjust for your workflow:
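The `_ms` keys take milliseconds, while the table lists the defaults as durations, so a small converter helps avoid arithmetic slips. The sketch below prints the table's defaults as dotted keys. `to_ms` is a hypothetical helper, and the output assumes that .brat/config.toml accepts standard TOML dotted keys.

```shell
# Hypothetical helper: convert a duration such as 30s, 5m, or 24h
# to milliseconds for the *_ms config keys.
to_ms() {
  value="${1%[smh]}"     # numeric part, e.g. "30"
  unit="${1#"$value"}"   # trailing unit, e.g. "s"
  case "$value" in
    ''|*[!0-9]*) echo "unsupported duration: $1" >&2; return 1 ;;
  esac
  case "$unit" in
    s) echo $((value * 1000)) ;;
    m) echo $((value * 60 * 1000)) ;;
    h) echo $((value * 60 * 60 * 1000)) ;;
    *) echo "unsupported duration: $1" >&2; return 1 ;;
  esac
}

# Defaults from the table above, expressed in milliseconds.
echo "interventions.heartbeat_interval_ms = $(to_ms 30s)"   # 30000
echo "interventions.stale_session_ms = $(to_ms 5m)"         # 300000
echo "interventions.blocked_task_ms = $(to_ms 24h)"         # 86400000
echo "refinery.merge_retry_limit = 2"
```

To change a threshold, replace the duration argument (for example `to_ms 10m` for a more tolerant stale-session limit) and copy the resulting line into the config file.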