Initial Agent OS scaffolding — identity, brain, memory, infra-monitor skill
@@ -0,0 +1,41 @@
# NxM Agent OS

A personal agentic operating system built on plain markdown files. Tool-agnostic — works with Claude Code, Ollama, or any LLM harness. Based on the framework from the AI Daily Brief episode "How to Build a Personal Agentic Operating System" (Nufar Gaspar, 2026-04-25).

## How it works

Every agent interaction reads from and writes back to files in this repo. No databases, no APIs, no vendor lock-in. The files ARE the system.

## The seven layers

| Layer | File(s) | Purpose |
|---|---|---|
| Identity | `identity.md` | Who you are, communication style, values |
| Context | `context/` | Dated, task-specific working files |
| Brain | `brain.md` | Persistent facts — infra, people, decisions |
| Memory | `memory/` | Short and long-term session notes |
| Skills | `skills/` | Repeatable workflows, each self-improving |
| Processes | `skills/*/context/handoff.md` | Output passed between chained skills |
| Automation | cron on 172.27.40.3 | Scheduled skill execution |

## Adding a new skill

1. Create `skills/<skill-name>/skill.md` — what the skill does and how
2. Create `skills/<skill-name>/learnings.md` — starts empty, fills over time
3. Create `skills/<skill-name>/eval.json` — scoring criteria
4. Add a cron job on 172.27.40.3 calling the skill
5. The infra-monitor watchdog will automatically pick it up
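
Steps 1 to 3 above could be scripted roughly as follows. This is a minimal sketch, not part of the repo: the function name `scaffold_skill`, the placeholder heading written into `skill.md`, and the empty `criteria` list in `eval.json` are all assumptions made for illustration.

```python
import json
from pathlib import Path


def scaffold_skill(repo_root: Path, skill_name: str) -> Path:
    """Create the three files from steps 1-3 for a new skill (hypothetical helper)."""
    skill_dir = repo_root / "skills" / skill_name
    # exist_ok=False: refuse to overwrite a skill that already exists.
    skill_dir.mkdir(parents=True, exist_ok=False)

    # Step 1: skill.md, seeded with a placeholder heading (assumed layout).
    (skill_dir / "skill.md").write_text(f"# Skill: {skill_name}\n\n", encoding="utf-8")
    # Step 2: learnings.md starts empty.
    (skill_dir / "learnings.md").write_text("", encoding="utf-8")
    # Step 3: eval.json with an empty criteria list to fill in later (assumed shape).
    (skill_dir / "eval.json").write_text(
        json.dumps({"criteria": []}, indent=2) + "\n", encoding="utf-8"
    )
    return skill_dir
```

Calling `scaffold_skill(Path("/opt/agent-os"), "my-skill")` would create the directory on the server checkout; steps 4 and 5 (cron and the watchdog) remain manual.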

## Runtime

- Files live on server: `/opt/agent-os/` (cloned from this repo)
- LLM inference: Ollama at `http://172.27.6.139:11434`
- Scheduled jobs: cron on `172.27.40.3`
- Local editing: `/home/nxm/Documents/agent-os/` on Kubuntu (this machine)
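
For illustration, a skill running on the server might call the Ollama endpoint listed above like this. It is a sketch under assumptions: it uses Ollama's `/api/generate` route with streaming disabled, the default model name is taken from `brain.md` in this commit, and the helper names and the 120-second timeout are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://172.27.6.139:11434"  # from the Runtime list


def build_generate_request(prompt: str, model: str = "qwen2.5-coder:7b") -> urllib.request.Request:
    """Build a non-streaming POST to Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "qwen2.5-coder:7b", timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text (needs network access to Ollama)."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]
```

Only the standard library is used, so the sketch runs on the server without extra packages.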

## Infra reference

Cross-repo links to supporting documentation:
- [IP & Port Map](https://git.nxm.co.za/admin/nxm-infrastructure/src/branch/main/Quick%20Reference/IP%20%26%20Port%20Map.md)
- [Docker Stacks](https://git.nxm.co.za/admin/nxm-infrastructure/src/branch/main/Quick%20Reference/Docker%20Stacks.md)
- [Network Overview](https://git.nxm.co.za/admin/nxm-infrastructure/src/branch/main/Infrastructure/Network%20Overview.md)
@@ -0,0 +1,64 @@
# Brain

Core facts read by all skills. Keep under 1000 words. Update when infrastructure changes.
Last updated: 2026-04-30

---

## Infrastructure

**Primary server:** 172.27.40.3 — Ubuntu Server LTS, Docker host
**Kubuntu desktop:** 172.27.6.139 — NxM-AI, runs Ollama
**TrueNAS NAS:** 172.27.40.5
**Firewall:** OPNsense at 172.27.6.1

**VLANs:**
| VLAN | Name | Subnet |
|---|---|---|
| 40 | Servers40 | 172.27.40.0/24 |
| 20 | Workshop20 | 172.27.20.0/24 |
| 10 | IoT10 | 172.27.10.0/24 |

## Key Services (172.27.40.3)

| Service | Port | URL |
|---|---|---|
| Portainer | 9443 | https://172.27.40.3:9443 |
| Nginx Proxy Manager | 80/81/443 | http://172.27.40.3:81 |
| Uptime Kuma | 3002 | http://172.27.40.3:3002 |
| Gitea | 3000 | https://git.nxm.co.za |
| Headscale | 8080 | https://headscale.nxm.co.za |
| Netbird | 3479/udp | https://netbird.nxm.co.za |
| Vaultwarden | 8222 | https://vault.nxm.co.za |
| Flowise | 3010 | http://172.27.40.3:3010 |
| Plane | 8095 | https://plane.nxm.co.za |
| Zabbix | 8091 | https://zabbix.nxm.co.za |
| Homarr | 7575 | http://172.27.40.3:7575 |

## AI Stack

- **Ollama** on 172.27.6.139:11434 (bound to 0.0.0.0)
- **Models:** gemma4, qwen2.5-coder:7b
- **Flowise** on 172.27.40.3:3010 — visual agent/flow builder
- **Claude Code** — primary AI assistant, runs on Kubuntu

## Agent OS Runtime

- Files: `/opt/agent-os/` on 172.27.40.3
- Local edit path: `/home/nxm/Documents/agent-os/` on 172.27.6.139
- Repo: `https://git.nxm.co.za/admin/agent-os`
- Scheduled jobs: cron on 172.27.40.3
- LLM calls: `http://172.27.6.139:11434`

## Key Paths on Server

- Docker stacks: `/opt/stacks/`
- Agent OS: `/opt/agent-os/`

## Standing Decisions

- TrueNAS will move to a dedicated server — avoid hardcoding 172.27.40.5 in automation
- NPM handles all SSL termination — internal services use HTTP, NPM adds HTTPS
- NFS preferred for Linux-to-Linux file sharing
- Docker Compose only (no Kubernetes)
- All destructive actions require explicit confirmation before execution
@@ -0,0 +1,19 @@
# Identity

> **Status: PENDING** — To be completed via Claude interview session.
> Run the interview by saying: "Let's complete the Agent OS identity interview."

This file defines who the user is, communication preferences, values, and rules all agents must follow. Every skill reads this file before executing.

## What the interview will capture

- Professional role and responsibilities
- Communication style preferences
- Core values and priorities
- Things agents should never do
- How decisions should be escalated vs handled autonomously
- Preferred output formats

---

*This section will be replaced with the completed identity profile after the interview.*
@@ -0,0 +1,23 @@
# Active Projects

Current in-flight work. Update at the end of each session.
Last updated: 2026-04-30

---

## Agent OS — Phase 1 (NEXT)
Complete the foundation before building skills.
- [ ] Set up NFS export on 172.27.40.3 + mount on Kubuntu at /mnt/agent-os
- [ ] Run identity interview with Claude → populate identity.md
- [ ] Review the seeded brain.md and confirm accuracy
- [ ] Clone this repo to /opt/agent-os/ on server

## Agent OS — Phase 3 (PENDING Phase 1)
- [ ] Build infra-monitor skill
- [ ] Set up cron schedule (hourly heartbeat, daily digest)
- [ ] Wire up Home Assistant notifications

## Gitea documentation
- [x] nxm-infrastructure repo — Obsidian vault imported
- [x] nexum-projects repo — Obsidian vault imported
- [x] agent-os repo — scaffolding created
@@ -0,0 +1,13 @@
# Constraints

Hard limits agents must respect. Never work around these without explicit user confirmation.
Last updated: 2026-04-30

---

- Never take destructive or irreversible action without explicit confirmation (delete, overwrite, drop, reset, force push)
- Never store credentials in output files, logs, or generated markdown — reference their location instead
- Never skip git hooks or bypass signing
- TrueNAS (172.27.40.5) is being migrated to a new server — do not create hard dependencies on that IP
- Linux server (172.27.40.3) has no GPU — never schedule LLM inference to run locally there
- Docker Compose only — no Kubernetes, no Swarm
@@ -0,0 +1,8 @@
# Notes from Last Run

Populated automatically at the end of each skill run. Cleared at the start of each new session.
Last updated: —

---

*No runs yet — Agent OS not yet deployed.*
@@ -0,0 +1,18 @@
# Persistent Memory

Facts that don't expire. If you'd have to re-explain it to a new agent every time, it belongs here.
Last updated: 2026-04-30

---

## Infrastructure decisions
- RustDesk is self-hosted on 172.27.40.3 — clients connect to local server not public relay
- Netbird client management is on port 8443 via Caddy sidecar, NOT port 443
- Headscale v0.28: all write operations require numeric user ID, not username
- Tailscale on Windows overrides DNS — disconnect before testing split DNS changes
- Servers running Tailscale must run `sudo tailscale set --accept-dns=false` before joining Netbird

## Agent OS build state
- Phase 1-2 (file structure + NFS + identity interview): not yet started
- First skill to build: infra-monitor (Docker health + agent watchdog)
- Notifications target: Home Assistant at 172.27.10.6
@@ -0,0 +1,11 @@
# Recent Decisions

Decisions made in the last 30 days that affect current work. Archive when no longer relevant.
Last updated: 2026-04-30

---

- **2026-04-30:** Chose Gitea (self-hosted git) over Obsidian for documentation — AI-writable, browser-accessible, version controlled
- **2026-04-30:** Agent OS files to live on 172.27.40.3 at /opt/agent-os/, accessed from Kubuntu via NFS
- **2026-04-29:** Chose Syncthing-free approach for Obsidian migration — NFS for Linux, SMB for Windows
- **2026-04-29:** infra-monitor will be first Agent OS skill — covers Docker health and agent watchdog in one skill
@@ -0,0 +1,5 @@
# Handoff: infra-monitor → notification

Populated by infra-monitor when anomalies are found. Read by the notification skill (future).

*Empty — no anomalies from last run, or skill has not run yet.*
@@ -0,0 +1,8 @@
{
  "criteria": [
    { "name": "all_services_checked", "weight": 0.3, "description": "Every expected container and service was checked, none skipped" },
    { "name": "clear_status_summary", "weight": 0.3, "description": "Output leads with a plain-English summary line before detail" },
    { "name": "actionable_findings", "weight": 0.2, "description": "Any warning or critical item includes enough detail to act on immediately" },
    { "name": "agent_watchdog_complete", "weight": 0.2, "description": "All skills in /skills/ were checked for staleness and errors" }
  ]
}
@@ -0,0 +1,3 @@
# Last Output: infra-monitor

*Not yet populated — skill has not run.*
@@ -0,0 +1,12 @@
# Learnings: infra-monitor

Updated automatically after each run. The skill reads this before executing to improve its next output.

## What has worked well
*Not yet populated — skill has not run.*

## What missed the mark
*Not yet populated — skill has not run.*

## Adjustments for next run
*Not yet populated — skill has not run.*
@@ -0,0 +1,58 @@
# Skill: infra-monitor

Monitors server health and watches all Agent OS skills for staleness or errors. Runs on a cron schedule on 172.27.40.3.

## Inputs

Reads before executing:
- `../../identity.md`
- `../../brain.md`
- `../../memory/persistent.md`
- `learnings.md` (this skill's improvement notes)
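
One way a harness might gather these inputs into a single context string before calling the LLM is sketched below. The function name `load_inputs`, the HTML-comment separators, and the `(missing)` placeholder are assumptions for illustration; the skill itself does not prescribe how the files are combined.

```python
from pathlib import Path

# Paths are relative to the skill directory, in the order listed under Inputs.
INPUT_FILES = [
    "../../identity.md",
    "../../brain.md",
    "../../memory/persistent.md",
    "learnings.md",
]


def load_inputs(skill_dir: Path) -> str:
    """Concatenate the input files into one context string, in the listed order."""
    sections = []
    for rel in INPUT_FILES:
        path = (skill_dir / rel).resolve()
        # A missing file is noted rather than treated as fatal (assumed behaviour).
        text = path.read_text(encoding="utf-8") if path.exists() else "(missing)"
        sections.append(f"<!-- {rel} -->\n{text.strip()}")
    return "\n\n".join(sections)
```

Keeping the order fixed means identity and core facts always precede skill-specific learnings in the prompt.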

## What to check

### Docker health (on 172.27.40.3)
- All expected containers are running (not exited/restarting)
- Flag any container that has restarted more than 3 times in the last hour
- Expected containers: portainer, nginx-proxy-manager, uptime-kuma, gitea, headscale, netbird, vaultwarden, flowise, plane, zabbix, homarr
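
The running-state part of this check could look roughly like the sketch below. It assumes the container names reported by Docker match the expected names exactly (Compose-generated names often carry a prefix or suffix) and that the `docker` CLI is on the PATH. The restart-count rule is not covered here, since counting restarts within the last hour would need state kept between runs. All function names are hypothetical.

```python
import subprocess

EXPECTED = [
    "portainer", "nginx-proxy-manager", "uptime-kuma", "gitea", "headscale",
    "netbird", "vaultwarden", "flowise", "plane", "zabbix", "homarr",
]


def parse_docker_ps(output: str) -> dict[str, str]:
    """Map container name -> state from tab-separated `docker ps` output."""
    states = {}
    for line in output.splitlines():
        if "\t" in line:
            name, state = line.split("\t", 1)
            states[name.strip()] = state.strip()
    return states


def find_problems(states: dict[str, str], expected: list[str] = EXPECTED) -> dict[str, str]:
    """Return expected containers that are missing or not in the 'running' state."""
    return {
        name: states.get(name, "missing")
        for name in expected
        if states.get(name) != "running"
    }


def docker_states() -> dict[str, str]:
    """Query the local Docker daemon for every container, including stopped ones."""
    result = subprocess.run(
        ["docker", "ps", "-a", "--format", "{{.Names}}\t{{.State}}"],
        capture_output=True, text=True, check=True,
    )
    return parse_docker_ps(result.stdout)
```

Splitting parsing from the subprocess call keeps the logic testable without a Docker daemon.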

### Service reachability
Lightweight HTTP/HTTPS check (curl, timeout 5s) on each internal URL:
- https://172.27.40.3:9443 (Portainer)
- http://172.27.40.3:3002 (Uptime Kuma)
- http://172.27.40.3:3000 (Gitea)
- http://172.27.40.3:3010 (Flowise)
- http://172.27.40.3:7575 (Homarr)
- http://172.27.6.139:11434 (Ollama)
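
The skill names curl for this check; the sketch below swaps in Python's standard `urllib` as one possible equivalent with the same 5-second timeout. It treats any HTTP response, including 401 or 404, as "reachable", which is an assumption about what the check should mean. The `opener` parameter is a hypothetical hook so the function can be exercised without network access. Note that an HTTPS endpoint with a self-signed certificate would fail verification here and be reported as unreachable unless an SSL context is supplied.

```python
import urllib.error
import urllib.request


def check_url(url: str, timeout: float = 5.0, opener=urllib.request.urlopen) -> tuple[bool, str]:
    """Return (reachable, detail). Any HTTP response counts as reachable."""
    try:
        with opener(url, timeout=timeout) as resp:
            return True, f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, just not with a 2xx status.
        return True, f"HTTP {exc.code}"
    except (urllib.error.URLError, TimeoutError, OSError) as exc:
        # Connection refused, DNS failure, timeout, TLS failure, and similar.
        return False, f"unreachable: {exc}"
```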

### Agent watchdog
For each skill directory under `../../skills/`:
- Check `last-output.md` modification time — flag if older than expected schedule
- Check `../../logs/<skill-name>.log` for ERROR entries from the last run
- Report: healthy / stale / erroring
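
A possible classification routine for a single skill is sketched below. The log path and the maximum age are passed in by the caller, because the expected schedule per skill is not specified here. Treating only the last log line as "the last run" and ranking erroring above stale are both assumptions; `skill_status` is a hypothetical name.

```python
import time
from pathlib import Path
from typing import Optional


def skill_status(skill_dir: Path, log_file: Path, max_age_s: float,
                 now: Optional[float] = None) -> str:
    """Classify one skill as 'healthy', 'stale', or 'erroring'."""
    now = time.time() if now is None else now

    # Erroring: the most recent log line mentions ERROR.
    if log_file.exists():
        lines = log_file.read_text(encoding="utf-8").splitlines()
        if lines and "ERROR" in lines[-1]:
            return "erroring"

    # Stale: last-output.md is missing or older than the expected schedule.
    output = skill_dir / "last-output.md"
    if not output.exists() or now - output.stat().st_mtime > max_age_s:
        return "stale"
    return "healthy"
```

For an hourly skill, a `max_age_s` somewhat above 3600 would leave room for a slow run before the watchdog flags it.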

### System resources (on 172.27.40.3)
- Disk usage on / — warn if >80%, critical if >90%
- Memory usage — flag if >85%
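
The thresholds above can be expressed as a small sketch. The helper names are hypothetical, and memory usage is computed here as `1 - MemAvailable / MemTotal` from `/proc/meminfo`, which is one common definition and an assumption about what "memory usage" means. The disk percentage from `shutil.disk_usage` may differ slightly from the `Use%` column of `df`, which accounts for reserved blocks.

```python
import shutil


def disk_level(used_pct: float) -> str:
    """Apply the disk thresholds: warn above 80%, critical above 90%."""
    if used_pct > 90:
        return "critical"
    if used_pct > 80:
        return "warning"
    return "ok"


def disk_used_pct(path: str = "/") -> float:
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def memory_used_pct(meminfo_text: str) -> float:
    """Used memory as 100 * (1 - MemAvailable / MemTotal), from /proc/meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key] = int(parts[0])  # values are reported in kB
    return 100.0 * (1 - fields["MemAvailable"] / fields["MemTotal"])
```

On the server, `memory_used_pct(open("/proc/meminfo").read()) > 85` would trigger the memory flag.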

## Output

Write a digest to `last-output.md` in this format:
- Summary line: X healthy, Y warnings, Z critical
- Section per category: Docker, Services, Agent Watchdog, System
- Each item: ✓ OK / ⚠ Warning / ✗ Critical + one line detail

Pass anomalies to `context/handoff.md` for notification skill (future).
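
As a rough illustration of the digest format, the sketch below renders findings into markdown. The exact layout (the `Summary:` prefix, `##` section headings, and the `item, detail` ordering) is an assumption, as is the shape of the `findings` dictionary; only the summary counts, the four categories, and the three status markers come from the description above.

```python
SYMBOLS = {"ok": "✓ OK", "warning": "⚠ Warning", "critical": "✗ Critical"}
CATEGORIES = ["Docker", "Services", "Agent Watchdog", "System"]


def render_digest(findings: dict[str, list[tuple[str, str, str]]]) -> str:
    """Render findings (category -> list of (item, level, detail)) as a digest."""
    levels = [level for items in findings.values() for _, level, _ in items]
    summary = (
        f"Summary: {levels.count('ok')} healthy, "
        f"{levels.count('warning')} warnings, {levels.count('critical')} critical"
    )
    lines = [summary]
    for category in CATEGORIES:
        # Every category gets a section, even when it has no items.
        lines += ["", f"## {category}"]
        for item, level, detail in findings.get(category, []):
            lines.append(f"- {SYMBOLS[level]}: {item}, {detail}")
    return "\n".join(lines) + "\n"
```

Items whose level is not `ok` are the natural candidates to copy into `context/handoff.md`.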

## Wrap-up

After writing output:
1. Update `learnings.md` with anything that went wrong or could be improved
2. Append a one-line log entry to `../../logs/infra-monitor.log`: `YYYY-MM-DD HH:MM | status | summary`
3. Update `../../memory/notes-from-last-run.md`
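
Step 2 could be implemented along these lines. The sketch assumes local server time for the timestamp and collapses any newlines in the summary so each entry stays on one line; both helper names are hypothetical.

```python
from datetime import datetime
from pathlib import Path


def format_log_entry(when: datetime, status: str, summary: str) -> str:
    """Build one `YYYY-MM-DD HH:MM | status | summary` line."""
    one_line = " ".join(summary.split())  # collapse whitespace and newlines
    return f"{when:%Y-%m-%d %H:%M} | {status} | {one_line}"


def append_log_entry(log_path: Path, status: str, summary: str) -> None:
    """Append a timestamped entry, creating the logs directory if needed."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(format_log_entry(datetime.now(), status, summary) + "\n")
```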

## Schedule

- **Heartbeat:** every hour — checks Docker + Ollama only (fast, <30s)
- **Full digest:** daily at 07:00 — all checks