Queue Configuration¶
Configure the job queue for workflow execution.
Overview¶
The job queue manages asynchronous workflow executions. When you execute a workflow asynchronously, a job is created, queued, and processed by worker threads.
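The create, queue, and process flow described above can be pictured with a small, self-contained sketch. It does not reproduce m9m's implementation; `run_jobs` and its job tuples are hypothetical names used only to illustrate a queue drained by a fixed pool of worker threads.

```python
import queue
import threading


def run_jobs(workflows, max_workers=4):
    """Queue one job per workflow callable and process them on worker threads.

    Hypothetical illustration only; not m9m's actual code.
    """
    jobs = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            job = jobs.get()
            if job is None:  # sentinel: no more work for this worker
                jobs.task_done()
                return
            job_id, workflow = job
            try:
                outcome = ("completed", workflow())
            except Exception as exc:  # a failing workflow marks the job failed
                outcome = ("failed", str(exc))
            with lock:
                results[job_id] = outcome
            jobs.task_done()

    threads = [threading.Thread(target=worker) for _ in range(max_workers)]
    for t in threads:
        t.start()

    # Each asynchronous execution becomes one queued job.
    for job_id, workflow in enumerate(workflows):
        jobs.put((job_id, workflow))

    # One sentinel per worker so every thread exits after the queue drains.
    for _ in threads:
        jobs.put(None)

    jobs.join()
    for t in threads:
        t.join()
    return results
```

With `max_workers=4`, at most four workflows run at the same time; the rest wait in the queue until a worker is free.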
Supported Queue Types¶
| Type | Persistence | Use Case |
|---|---|---|
| memory | No | Development, testing |
| sqlite | Yes | Production (single-node) |
SQLite Queue (Default)¶
Persistent queue with SQLite storage:
| Setting | Default | Description |
|---|---|---|
| sqlite.path | ~/.m9m/data/queue.db | Queue database path |
Benefits¶
- Jobs survive server restarts
- No external dependencies
- Recovery of failed jobs
CLI Override¶
Memory Queue¶
Fast in-memory queue (jobs lost on restart):
Benefits¶
- Fastest performance
- No disk I/O
Limitations¶
- Jobs lost on restart
- Not suitable for production
CLI Override¶
Worker Configuration¶
| Setting | Default | Description |
|---|---|---|
| max_workers | 4 | Concurrent workflow executions |
CLI Override¶
Worker Sizing¶
| Scenario | Workers |
|---|---|
| Development | 2-4 |
| Light production | 4-8 |
| Heavy production | 8-16 |
| CPU-bound workflows | Match CPU cores |
| I/O-bound workflows | 2-4x CPU cores |
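The last two rows of the table depend on the host's core count, so the range can be computed rather than guessed. The helper below is a hypothetical sketch (the name `suggest_workers` and the `"cpu"`/`"io"` labels are assumptions, not m9m settings); it only encodes the rules of thumb from the table.

```python
import os


def suggest_workers(workload, cpu_cores=None):
    """Return a (low, high) range for max_workers based on the sizing table.

    workload: "cpu" for CPU-bound workflows, "io" for I/O-bound workflows.
    cpu_cores: override for the detected core count (hypothetical parameter).
    """
    cores = cpu_cores or os.cpu_count() or 1
    if workload == "cpu":
        # CPU-bound: match the number of CPU cores.
        return (cores, cores)
    if workload == "io":
        # I/O-bound: 2-4x the number of CPU cores.
        return (2 * cores, 4 * cores)
    raise ValueError(f"unknown workload type: {workload!r}")
```

On a 4-core host this yields `(4, 4)` for CPU-bound workflows and `(8, 16)` for I/O-bound ones.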
Job Retry Configuration¶
execution:
  retry:
    enabled: true
    max_attempts: 3
    backoff: "exponential"
    initial_interval: "1s"
    max_interval: "5m"
| Setting | Default | Description |
|---|---|---|
| enabled | true | Enable automatic retry |
| max_attempts | 3 | Maximum retry attempts |
| backoff | exponential | Backoff strategy |
| initial_interval | 1s | First retry delay |
| max_interval | 5m | Maximum retry delay |
Backoff Strategies¶
| Strategy | Behavior |
|---|---|
| linear | Constant delay between retries |
| exponential | Doubling delay (1s, 2s, 4s, ...) |
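To see how `initial_interval`, `max_interval`, and the strategy interact, the delay schedule can be computed by hand. The function below is a minimal sketch under two assumptions: `max_attempts` counts retries, and delays are capped at `max_interval`. The function name and the use of seconds as floats are illustrative, not part of m9m.

```python
def retry_delays(max_attempts=3, backoff="exponential", initial=1.0, maximum=300.0):
    """Return the delay in seconds before each retry.

    Assumes the strategy semantics from the table: "linear" keeps a constant
    delay, "exponential" doubles it each time. Delays are capped at `maximum`.
    """
    delays = []
    for attempt in range(max_attempts):
        if backoff == "exponential":
            delay = initial * (2 ** attempt)
        elif backoff == "linear":
            delay = initial
        else:
            raise ValueError(f"unknown backoff strategy: {backoff!r}")
        delays.append(min(delay, maximum))
    return delays
```

With the defaults (`1s` initial, `5m` maximum, 3 attempts) this gives delays of 1, 2, and 4 seconds. With more attempts the exponential delay stops growing once it reaches the 300-second cap.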
Environment Variables¶
Job Lifecycle¶
Monitoring¶
Queue Stats API¶
Response:
{
  "pending": 5,
  "running": 2,
  "completed": 1500,
  "failed": 23,
  "workers": 4,
  "queueType": "sqlite"
}
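A monitoring script can turn this response into simple health checks. The sketch below works only on the JSON body shown above; how the body is fetched is left out, and the function name and warning texts are hypothetical.

```python
import json


def queue_warnings(stats_json):
    """Inspect a queue stats response body and return a list of warnings.

    Uses only the fields shown in the sample response; the checks themselves
    are illustrative rules of thumb.
    """
    stats = json.loads(stats_json)
    warnings = []
    if stats["workers"] == 0:
        warnings.append("no workers running")
    if stats["pending"] > 0 and stats["running"] == 0:
        warnings.append("jobs pending but none running")
    return warnings
```

For the sample response the list is empty, since four workers are active and two jobs are running.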
Prometheus Metrics¶
# HELP m9m_queue_pending_jobs Number of pending jobs
m9m_queue_pending_jobs 5
# HELP m9m_queue_running_jobs Number of running jobs
m9m_queue_running_jobs 2
# HELP m9m_queue_workers Number of active workers
m9m_queue_workers 4
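For quick checks without a full Prometheus setup, the metrics text can be parsed directly. This is a minimal sketch that handles only the simple form shown above (a metric name followed by a value, no labels or timestamps); `parse_metrics` is a hypothetical helper, not part of any client library.

```python
def parse_metrics(text):
    """Parse label-free Prometheus text lines into a {name: value} dict."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        name, _, value = line.rpartition(" ")
        if not name:
            continue  # not a "name value" pair
        metrics[name] = float(value)
    return metrics
```

Applied to the output above, the result maps `m9m_queue_pending_jobs` to `5.0`, `m9m_queue_running_jobs` to `2.0`, and `m9m_queue_workers` to `4.0`.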
Recovery¶
SQLite Queue Recovery¶
On server restart, the SQLite queue:
- Loads pending jobs from database
- Requeues jobs that were running (may have been interrupted)
- Resumes processing
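The recovery steps above can be sketched against a toy database. The schema here (a `jobs` table with `id` and `status` columns, and the status values `pending` and `running`) is an assumption made for illustration; the real layout of `queue.db` may differ.

```python
import sqlite3


def recover_jobs(conn):
    """Requeue interrupted jobs and return the ids of all pending jobs.

    Assumes a hypothetical table: jobs(id INTEGER PRIMARY KEY, status TEXT).
    """
    with conn:  # commit the requeue as one transaction
        # Jobs still marked as running were interrupted by the restart.
        conn.execute("UPDATE jobs SET status = 'pending' WHERE status = 'running'")
    rows = conn.execute(
        "SELECT id FROM jobs WHERE status = 'pending' ORDER BY id"
    ).fetchall()
    # These ids would be handed back to the workers to resume processing.
    return [row[0] for row in rows]
```

Because a requeued job may have partially run before the restart, workflows benefit from being safe to execute more than once.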
Manual Recovery¶
View stuck jobs:
Best Practices¶
Production Setup¶
queue:
  type: "sqlite"
  max_workers: 8
  sqlite:
    path: "/data/queue.db"
execution:
  retry:
    enabled: true
    max_attempts: 3
    backoff: "exponential"
High Throughput¶
Resource Constrained¶
Troubleshooting¶
Jobs Not Processing¶
Check workers are running:
If the queue stats report "workers": 0, restart the server.
Jobs Stuck in Pending¶
Check for errors in logs:
Queue Database Corruption¶
Recreate queue database:
Note: Recreating the database discards all pending jobs.