Tracks agent statuses in service-states.json alongside services.
Sends critical alert when agent status changes to failure (includes
result/error message from last-run.json). Sends recovery alert on
failure → success transition.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
HTTP service reachability checks for 12 services + agent watchdog.
Writes status dashboard to /opt/sites/varys/ and last-run.json for Citadel.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>