Initial Agent OS scaffolding — identity, brain, memory, infra-monitor skill
@@ -0,0 +1,41 @@
# NxM Agent OS

A personal agentic operating system built on plain markdown files. Tool-agnostic — works with Claude Code, Ollama, or any LLM harness. Based on the framework from the AI Daily Brief episode "How to Build a Personal Agentic Operating System" (Nufar Gaspar, 2026-04-25).

## How it works

Every agent interaction reads from and writes back to files in this repo. No databases, no APIs, no vendor lock-in. The files ARE the system.

## The seven layers

| Layer | File(s) | Purpose |
|---|---|---|
| Identity | `identity.md` | Who you are, communication style, values |
| Context | `context/` | Dated, task-specific working files |
| Brain | `brain.md` | Persistent facts — infra, people, decisions |
| Memory | `memory/` | Short and long-term session notes |
| Skills | `skills/` | Repeatable workflows, each self-improving |
| Processes | `skills/*/context/handoff.md` | Output passed between chained skills |
| Automation | cron on 172.27.40.3 | Scheduled skill execution |

## Adding a new skill

1. Create `skills/<skill-name>/skill.md` — what the skill does and how
2. Create `skills/<skill-name>/learnings.md` — starts empty, fills over time
3. Create `skills/<skill-name>/eval.json` — scoring criteria
4. Add a cron job on 172.27.40.3 calling the skill
5. The infra-monitor watchdog will automatically pick it up
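
Steps 1 to 3 of the list above can be scripted. The following is a minimal sketch and not part of the repo: the function name `scaffold_skill` and the placeholder file contents are assumptions, and only the three file names come from the list.

```python
import json
from pathlib import Path


def scaffold_skill(repo_root: Path, skill_name: str) -> Path:
    """Create the three files a new skill needs (steps 1-3 of the list).

    Existing files are left untouched, so the function is safe to re-run.
    """
    skill_dir = repo_root / "skills" / skill_name
    skill_dir.mkdir(parents=True, exist_ok=True)

    # Placeholder contents are assumptions; replace them with real text.
    defaults = {
        "skill.md": f"# Skill: {skill_name}\n",
        "learnings.md": "",  # starts empty, fills over time
        "eval.json": json.dumps({"criteria": []}, indent=2) + "\n",
    }
    for filename, content in defaults.items():
        path = skill_dir / filename
        if not path.exists():
            path.write_text(content, encoding="utf-8")
    return skill_dir
```

Calling `scaffold_skill(Path("/opt/agent-os"), "my-skill")` would create the directory and files on the server; the cron job in step 4 still has to be added by hand.
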

## Runtime

- Files live on server: `/opt/agent-os/` (cloned from this repo)
- LLM inference: Ollama at `http://172.27.6.139:11434`
- Scheduled jobs: cron on `172.27.40.3`
- Local editing: `/home/nxm/Documents/agent-os/` on Kubuntu (this machine)
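
As an illustration of the LLM inference entry, a skill could call Ollama's `/api/generate` endpoint with nothing but the standard library. This is a hedged sketch: the helper names are hypothetical, the default model is one of the models listed in `brain.md`, and the timeout is an arbitrary choice.

```python
import json
import urllib.request

OLLAMA_URL = "http://172.27.6.139:11434"  # from the Runtime list


def build_generate_request(prompt: str, model: str = "qwen2.5-coder:7b") -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to Ollama and return the generated text."""
    request = build_generate_request(prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        # With "stream": False the reply is a single JSON object.
        return json.loads(resp.read().decode("utf-8"))["response"]
```

Building the request separately from sending it keeps the payload easy to inspect without a running Ollama instance.
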

## Infra reference

Cross-repo links to supporting documentation:
- [IP & Port Map](https://git.nxm.co.za/admin/nxm-infrastructure/src/branch/main/Quick%20Reference/IP%20%26%20Port%20Map.md)
- [Docker Stacks](https://git.nxm.co.za/admin/nxm-infrastructure/src/branch/main/Quick%20Reference/Docker%20Stacks.md)
- [Network Overview](https://git.nxm.co.za/admin/nxm-infrastructure/src/branch/main/Infrastructure/Network%20Overview.md)
@@ -0,0 +1,64 @@
# Brain

Core facts read by all skills. Keep under 1000 words. Update when infrastructure changes.
Last updated: 2026-04-30

---

## Infrastructure

**Primary server:** 172.27.40.3 — Ubuntu Server LTS, Docker host
**Kubuntu desktop:** 172.27.6.139 — NxM-AI, runs Ollama
**TrueNAS NAS:** 172.27.40.5
**Firewall:** OPNsense at 172.27.6.1

**VLANs:**
| VLAN | Name | Subnet |
|---|---|---|
| 40 | Servers40 | 172.27.40.0/24 |
| 20 | Workshop20 | 172.27.20.0/24 |
| 10 | IoT10 | 172.27.10.0/24 |

## Key Services (172.27.40.3)

| Service | Port | URL |
|---|---|---|
| Portainer | 9443 | https://172.27.40.3:9443 |
| Nginx Proxy Manager | 80/81/443 | http://172.27.40.3:81 |
| Uptime Kuma | 3002 | http://172.27.40.3:3002 |
| Gitea | 3000 | https://git.nxm.co.za |
| Headscale | 8080 | https://headscale.nxm.co.za |
| Netbird | 3479/udp | https://netbird.nxm.co.za |
| Vaultwarden | 8222 | https://vault.nxm.co.za |
| Flowise | 3010 | http://172.27.40.3:3010 |
| Plane | 8095 | https://plane.nxm.co.za |
| Zabbix | 8091 | https://zabbix.nxm.co.za |
| Homarr | 7575 | http://172.27.40.3:7575 |

## AI Stack

- **Ollama** on 172.27.6.139:11434 (bound to 0.0.0.0)
- **Models:** gemma4, qwen2.5-coder:7b
- **Flowise** on 172.27.40.3:3010 — visual agent/flow builder
- **Claude Code** — primary AI assistant, runs on Kubuntu

## Agent OS Runtime

- Files: `/opt/agent-os/` on 172.27.40.3
- Local edit path: `/home/nxm/Documents/agent-os/` on 172.27.6.139
- Repo: `https://git.nxm.co.za/admin/agent-os`
- Scheduled jobs: cron on 172.27.40.3
- LLM calls: `http://172.27.6.139:11434`

## Key Paths on Server

- Docker stacks: `/opt/stacks/`
- Agent OS: `/opt/agent-os/`

## Standing Decisions

- TrueNAS will move to a dedicated server — avoid hardcoding 172.27.40.5 in automation
- NPM handles all SSL termination — internal services use HTTP, NPM adds HTTPS
- NFS preferred for Linux-to-Linux file sharing
- Docker Compose only (no Kubernetes)
- All destructive actions require explicit confirmation before execution
@@ -0,0 +1,19 @@
# Identity

> **Status: PENDING** — To be completed via Claude interview session.
> Run the interview by saying: "Let's complete the Agent OS identity interview."

This file defines who the user is, along with their communication preferences, values, and the rules all agents must follow. Every skill reads this file before executing.

## What the interview will capture

- Professional role and responsibilities
- Communication style preferences
- Core values and priorities
- Things agents should never do
- How decisions should be escalated vs handled autonomously
- Preferred output formats

---

*This section will be replaced with the completed identity profile after the interview.*
@@ -0,0 +1,23 @@
# Active Projects

Current in-flight work. Update at the end of each session.
Last updated: 2026-04-30

---

## Agent OS — Phase 1 (NEXT)
Complete the foundation before building skills.
- [ ] Set up NFS export on 172.27.40.3 + mount on Kubuntu at /mnt/agent-os
- [ ] Run identity interview with Claude → populate identity.md
- [ ] Review the seeded brain.md and confirm accuracy
- [ ] Clone this repo to /opt/agent-os/ on server

## Agent OS — Phase 3 (PENDING Phase 1)
- [ ] Build infra-monitor skill
- [ ] Set up cron schedule (hourly heartbeat, daily digest)
- [ ] Wire up Home Assistant notifications

## Gitea documentation
- [x] nxm-infrastructure repo — Obsidian vault imported
- [x] nexum-projects repo — Obsidian vault imported
- [x] agent-os repo — scaffolding created
@@ -0,0 +1,13 @@
# Constraints

Hard limits agents must respect. Never work around these without explicit user confirmation.
Last updated: 2026-04-30

---

- Never take destructive or irreversible action without explicit confirmation (delete, overwrite, drop, reset, force push)
- Never store credentials in output files, logs, or generated markdown — reference their location instead
- Never skip git hooks or bypass signing
- TrueNAS (172.27.40.5) is being migrated to a new server — do not create hard dependencies on that IP
- Linux server (172.27.40.3) has no GPU — never schedule LLM inference to run locally there
- Docker Compose only — no Kubernetes, no Swarm
@@ -0,0 +1,8 @@
# Notes from Last Run

Populated automatically at the end of each skill run. Cleared at the start of each new session.
Last updated: —

---

*No runs yet — Agent OS not yet deployed.*
@@ -0,0 +1,18 @@
# Persistent Memory

Facts that don't expire. If you'd have to re-explain it to a new agent every time, it belongs here.
Last updated: 2026-04-30

---

## Infrastructure decisions
- RustDesk is self-hosted on 172.27.40.3 — clients connect to the local server, not the public relay
- Netbird client management is on port 8443 via Caddy sidecar, NOT port 443
- Headscale v0.28: all write operations require numeric user ID, not username
- Tailscale on Windows overrides DNS — disconnect before testing split DNS changes
- Servers running Tailscale must run `sudo tailscale set --accept-dns=false` before joining Netbird

## Agent OS build state
- Phase 1-2 (file structure + NFS + identity interview): not yet started
- First skill to build: infra-monitor (Docker health + agent watchdog)
- Notifications target: Home Assistant at 172.27.10.6
@@ -0,0 +1,11 @@
# Recent Decisions

Decisions made in the last 30 days that affect current work. Archive when no longer relevant.
Last updated: 2026-04-30

---

- **2026-04-30:** Chose Gitea (self-hosted git) over Obsidian for documentation — AI-writable, browser-accessible, version controlled
- **2026-04-30:** Agent OS files to live on 172.27.40.3 at /opt/agent-os/, accessed from Kubuntu via NFS
- **2026-04-29:** Chose Syncthing-free approach for Obsidian migration — NFS for Linux, SMB for Windows
- **2026-04-29:** infra-monitor will be first Agent OS skill — covers Docker health and agent watchdog in one skill
@@ -0,0 +1,5 @@
# Handoff: infra-monitor → notification

Populated by infra-monitor when anomalies are found. Read by the notification skill (future).

*Empty — no anomalies from last run, or skill has not run yet.*
@@ -0,0 +1,8 @@
{
  "criteria": [
    { "name": "all_services_checked", "weight": 0.3, "description": "Every expected container and service was checked, none skipped" },
    { "name": "clear_status_summary", "weight": 0.3, "description": "Output leads with a plain-English summary line before detail" },
    { "name": "actionable_findings", "weight": 0.2, "description": "Any warning or critical item includes enough detail to act on immediately" },
    { "name": "agent_watchdog_complete", "weight": 0.2, "description": "All skills in /skills/ were checked for staleness and errors" }
  ]
}
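
The file does not say how the weights are applied. One plausible reading is a weighted average of per-criterion scores; the sketch below assumes each criterion is scored between 0.0 and 1.0, and the function name `weighted_score` is hypothetical.

```python
def weighted_score(criteria: list[dict], scores: dict[str, float]) -> float:
    """Combine per-criterion scores (each 0.0 to 1.0) using the eval.json weights."""
    total_weight = sum(c["weight"] for c in criteria)
    if total_weight <= 0:
        raise ValueError("criteria weights must sum to a positive number")
    # A criterion with no score counts as 0, so a skipped check lowers the result.
    weighted = sum(c["weight"] * scores.get(c["name"], 0.0) for c in criteria)
    # Dividing by the total keeps the result in 0.0-1.0 even if weights drift from 1.0.
    return weighted / total_weight


# Names and weights copied from the criteria in eval.json.
CRITERIA = [
    {"name": "all_services_checked", "weight": 0.3},
    {"name": "clear_status_summary", "weight": 0.3},
    {"name": "actionable_findings", "weight": 0.2},
    {"name": "agent_watchdog_complete", "weight": 0.2},
]
```
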
@@ -0,0 +1,3 @@
# Last Output: infra-monitor

*Not yet populated — skill has not run.*
@@ -0,0 +1,12 @@
# Learnings: infra-monitor

Updated automatically after each run. The skill reads this before executing to improve its next output.

## What has worked well
*Not yet populated — skill has not run.*

## What missed the mark
*Not yet populated — skill has not run.*

## Adjustments for next run
*Not yet populated — skill has not run.*
@@ -0,0 +1,58 @@
# Skill: infra-monitor

Monitors server health and watches all Agent OS skills for staleness or errors. Runs on a cron schedule on 172.27.40.3.

## Inputs

Reads before executing:
- `../../identity.md`
- `../../brain.md`
- `../../memory/persistent.md`
- `learnings.md` (this skill's improvement notes)
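
A harness could gather these inputs into a single context string before calling the LLM. The sketch below is one way to do it, not the repo's implementation: `load_inputs`, the HTML-comment separators, and the `(missing)` placeholder are all assumptions.

```python
from pathlib import Path

# Paths are relative to the skill directory, in the order listed under Inputs.
INPUT_FILES = [
    "../../identity.md",
    "../../brain.md",
    "../../memory/persistent.md",
    "learnings.md",
]


def load_inputs(skill_dir: Path) -> str:
    """Concatenate the skill's input files into one context string."""
    sections = []
    for rel in INPUT_FILES:
        path = (skill_dir / rel).resolve()
        # A missing file is noted rather than raised, so a run can still proceed.
        text = path.read_text(encoding="utf-8") if path.is_file() else "(missing)"
        sections.append(f"<!-- {rel} -->\n{text.strip()}")
    return "\n\n".join(sections)
```
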

## What to check

### Docker health (on 172.27.40.3)
- All expected containers are running (not exited/restarting)
- Flag any container that has restarted more than 3 times in the last hour
- Expected containers: portainer, nginx-proxy-manager, uptime-kuma, gitea, headscale, netbird, vaultwarden, flowise, plane, zabbix, homarr
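
The running/not-running part of this check can be sketched by parsing `docker ps` output. This is an illustration under assumptions: the function names are hypothetical, container names are assumed to match the list exactly (Compose may add project prefixes), and the restart-count rule is not covered here.

```python
import subprocess

EXPECTED = {
    "portainer", "nginx-proxy-manager", "uptime-kuma", "gitea", "headscale",
    "netbird", "vaultwarden", "flowise", "plane", "zabbix", "homarr",
}


def unhealthy_containers(ps_output: str, expected: set[str] = EXPECTED) -> dict[str, str]:
    """Map each expected container that is not up to a short reason.

    ps_output holds one "name<TAB>status" pair per line, as printed by docker_ps().
    """
    statuses = {}
    for line in ps_output.splitlines():
        name, _, status = line.partition("\t")
        statuses[name.strip()] = status.strip()

    problems = {}
    for name in sorted(expected):
        status = statuses.get(name)
        if status is None:
            problems[name] = "not found"
        elif not status.startswith("Up"):
            # Covers states such as "Exited (1) ..." and "Restarting (1) ...".
            problems[name] = status
    return problems


def docker_ps() -> str:
    """List all containers on the local Docker host as name<TAB>status lines."""
    result = subprocess.run(
        ["docker", "ps", "-a", "--format", "{{.Names}}\t{{.Status}}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Keeping the parser separate from the `docker` call means it can be exercised with canned output on a machine that has no Docker daemon.
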

### Service reachability
Lightweight HTTP check (curl, timeout 5s) on each internal URL:
- https://172.27.40.3:9443 (Portainer)
- http://172.27.40.3:3002 (Uptime Kuma)
- http://172.27.40.3:3000 (Gitea)
- http://172.27.40.3:3010 (Flowise)
- http://172.27.40.3:7575 (Homarr)
- http://172.27.6.139:11434 (Ollama)
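
The skill specifies curl; for a harness written in Python, an equivalent check might look like the sketch below. The function name `is_reachable` is hypothetical, and treating any HTTP status (including 4xx/5xx) as "reachable" is an assumption about what the check should mean.

```python
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with any HTTP response within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just with a 4xx/5xx status: still reachable.
        return True
    except OSError:
        # Connection refused, DNS failure, timeout, or TLS verification error.
        return False
```

One caveat: a service that presents a certificate Python does not trust is reported as unreachable by this sketch, because the TLS failure surfaces as a `URLError`.
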

### Agent watchdog
For each skill directory under `../../skills/`:
- Check `last-output.md` modification time — flag if older than expected schedule
- Check `../../logs/<skill-name>.log` for ERROR entries in last run
- Report: healthy / stale / erroring
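
The three-way classification could be implemented roughly as follows. This is a sketch under assumptions: `skill_status` is a hypothetical name, "erroring" is given priority over "stale", a missing `last-output.md` is treated as stale, and the caller supplies the log text and the expected interval.

```python
import time
from pathlib import Path
from typing import Optional


def skill_status(
    skill_dir: Path,
    max_age_seconds: float,
    log_text: str = "",
    now: Optional[float] = None,
) -> str:
    """Classify one skill as 'healthy', 'stale', or 'erroring'."""
    # An ERROR entry in the last run's log outranks staleness.
    if "ERROR" in log_text:
        return "erroring"

    output = skill_dir / "last-output.md"
    if not output.is_file():
        return "stale"  # never ran, or output was removed

    now = time.time() if now is None else now
    age = now - output.stat().st_mtime
    return "stale" if age > max_age_seconds else "healthy"
```

For an hourly skill, a `max_age_seconds` somewhat above 3600 leaves room for a run that starts a little late.
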

### System resources (on 172.27.40.3)
- Disk usage on / — warn if >80%, critical if >90%
- Memory usage — flag if >85%
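
The thresholds translate directly into code. In this sketch the function names are hypothetical, memory usage is derived from `MemTotal` and `MemAvailable` in `/proc/meminfo` (an assumption about how "memory usage" should be measured), and the memory flag is reported as a warning.

```python
import shutil


def disk_level(percent_used: float) -> str:
    """Apply the disk thresholds: warn above 80%, critical above 90%."""
    if percent_used > 90:
        return "critical"
    if percent_used > 80:
        return "warning"
    return "ok"


def memory_level(percent_used: float) -> str:
    """Flag memory usage above 85%."""
    return "warning" if percent_used > 85 else "ok"


def disk_percent(path: str = "/") -> float:
    """Percentage of the filesystem at `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def memory_percent(meminfo_text: str) -> float:
    """Percentage of memory in use, from the text of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if rest.strip():
            fields[key.strip()] = int(rest.split()[0])  # value in kB
    return 100.0 * (1 - fields["MemAvailable"] / fields["MemTotal"])
```

On the server this would be used as `disk_level(disk_percent("/"))` and `memory_level(memory_percent(open("/proc/meminfo").read()))`.
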

## Output

Write a digest to `last-output.md` in this format:
- Summary line: X healthy, Y warnings, Z critical
- Section per category: Docker, Services, Agent Watchdog, System
- Each item: ✓ OK / ⚠ Warning / ✗ Critical + one line detail

Pass anomalies to `context/handoff.md` for notification skill (future).
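
One way to render that format is sketched below. The `Finding` dataclass, the function names, and the exact line layout within each section are assumptions; the summary wording, category names, and status symbols come from the format described in this section.

```python
from dataclasses import dataclass

SYMBOLS = {"ok": "✓ OK", "warning": "⚠ Warning", "critical": "✗ Critical"}
CATEGORIES = ["Docker", "Services", "Agent Watchdog", "System"]


@dataclass
class Finding:
    category: str  # one of CATEGORIES
    name: str      # what was checked, e.g. a container name
    level: str     # "ok", "warning", or "critical"
    detail: str    # one-line detail


def render_digest(findings: list[Finding]) -> str:
    """Render the digest: summary line first, then one section per category."""
    counts = {level: sum(f.level == level for f in findings) for level in SYMBOLS}
    lines = [
        f"{counts['ok']} healthy, {counts['warning']} warnings, {counts['critical']} critical"
    ]
    for category in CATEGORIES:
        lines += ["", f"## {category}"]
        for f in findings:
            if f.category == category:
                lines.append(f"- {SYMBOLS[f.level]}: {f.name}, {f.detail}")
    return "\n".join(lines) + "\n"


def anomalies(findings: list[Finding]) -> list[Finding]:
    """Findings that should be passed on to the handoff file."""
    return [f for f in findings if f.level != "ok"]
```
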

## Wrap-up

After writing output:
1. Update `learnings.md` with anything that went wrong or could be improved
2. Append a one-line log entry to `../../logs/infra-monitor.log`: `YYYY-MM-DD HH:MM | status | summary`
3. Update `../../memory/notes-from-last-run.md`
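
The log entry from step 2 can be produced with a small helper. This is a sketch: the function names are hypothetical, and replacing `|` inside the summary is an added safeguard so the three fields stay parseable.

```python
from datetime import datetime
from pathlib import Path


def format_log_entry(status: str, summary: str, when: datetime) -> str:
    """Format one entry as `YYYY-MM-DD HH:MM | status | summary`."""
    # Keep the entry on one line: collapse whitespace and drop the field separator.
    clean = " ".join(summary.replace("|", "/").split())
    return f"{when:%Y-%m-%d %H:%M} | {status} | {clean}"


def append_log_entry(log_path: Path, status: str, summary: str) -> None:
    """Append a timestamped entry, creating the logs directory if needed."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(format_log_entry(status, summary, datetime.now()) + "\n")
```
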

## Schedule

- **Heartbeat:** every hour — checks Docker + Ollama only (fast, <30s)
- **Full digest:** daily at 07:00 — all checks