Skip to content

Alerting

Configure alerts for rninja infrastructure.

Alert Conditions

Condition Severity Action
Cache hit rate < 50% Warning Investigate cache misses
Cache server down Critical Restart server
Disk > 90% full Warning Run GC
Build timeout Error Check build logs

Simple Monitoring Script

#!/bin/bash
# /etc/cron.hourly/rninja-alerts

# Check cache health
if ! rninja -t cache-health > /dev/null 2>&1; then
    echo "rninja cache unhealthy" | mail -s "Alert: rninja" [email protected]
fi

# Check disk usage
USAGE=$(df ~/.cache/rninja | tail -1 | awk '{print $5}' | tr -d '%')
if [ "$USAGE" -gt 90 ]; then
    echo "rninja cache disk usage: ${USAGE}%" | mail -s "Warning: rninja" [email protected]
    rninja -t cache-gc
fi

Prometheus Alertmanager

# alertmanager.yml
route:
  receiver: 'ops-team'

receivers:
  - name: 'ops-team'
    email_configs:
      - to: '[email protected]'

PagerDuty Integration

For critical alerts:

receivers:
  - name: 'pagerduty'
    pagerduty_configs:
      - service_key: '<key>'