# NxM Infrastructure Project

## Primary Linux Server

- **IP:** 172.27.40.3
- **User:** nxm
- **Password:** 6589
- **OS:** Ubuntu Server (LTS)
- **Docker stack root:** `/opt/stacks/`

## Network

| VLAN | Name | Subnet | Gateway |
|---|---|---|---|
| 40 | Servers40 | 172.27.40.0/24 | 172.27.6.1 |
| 20 | Workshop20 | 172.27.20.0/24 | 172.27.6.1 |
| 10 | IoT10 | 172.27.10.0/24 | 172.27.6.1 |

## Key Devices

| Device | IP | Role |
|---|---|---|
| OPNsense Firewall | 172.27.6.1 | Firewall, router, DHCP |
| Ubuntu Server | 172.27.40.3 | Docker host, Headscale |
| TrueNAS | 172.27.40.5 | NAS storage |
| Home Assistant | 172.27.10.6 | Home automation (IoT10) |

## Docker Stacks & Ports

| Stack | Path | Port | Notes |
|---|---|---|---|
| Portainer | `/opt/stacks/portainer/` | 9443 (HTTPS) | |
| Nginx Proxy Manager | `/opt/stacks/nginx-proxy-manager/` | 80, 81, 443 | |
| Uptime Kuma | `/opt/stacks/uptime-kuma/` | 3002 | |
| Zabbix | `/opt/stacks/zabbix/` | 8091 | |
| Headscale | `/opt/stacks/headscale/` | 8080 (internal) | |
| Headplane | `/opt/stacks/headplane/` | 3001 | |
| Headscale UI | `/opt/stacks/headscale-ui/` | 3005 | |
| Homarr | `/opt/stacks/homarr/` | 7575 | |
| Dashy | `/opt/stacks/dashy/` | 4000 | |
| Vaultwarden | `/opt/stacks/vaultwarden/` | 8222 | |
| Netbird | `/opt/stacks/netbird/` | 3479/udp | STUN |
| Caddy (Netbird sidecar) | `/opt/stacks/caddy-netbird/` | 8443/tcp | gRPC proxy for Netbird clients |
| Plane | `/opt/stacks/plane/` | 8095 | HTTP, via NPM |
| Gitea | `/opt/stacks/gitea/` | 3000 (web), 2222 (SSH git) | Self-hosted git, infrastructure docs |
| Open WebUI | `/opt/stacks/open-webui/` | 3010 | Chat UI for Ollama + MCP (replaced Flowise 2026-05-01) |
| agent-sites | `/opt/stacks/sites/` | internal only (proxy network) | nginx:alpine serving `/opt/sites/` at agents.nxm.co.za |
| hodor-gateway | `/opt/stacks/hodor-gateway/` | 8200 | FastAPI agent gateway, POST /ask → Ollama |
| bran-changelog | `/opt/stacks/bran-changelog/` | | One-shot container, run.sh + cron 06:00 daily |
| citadel-mcp | `/opt/stacks/citadel-mcp/` | 8300 | MCP SSE+HTTP server; tools: `list_agents`, `get_agent_status`, `get_agent_output`, `web_search` |
| varys-monitor | `/opt/stacks/varys-monitor/` | | One-shot container, run.sh + cron every 15 min |
| sam-research | `/opt/stacks/sam-research/` | 8500 | Research agent, POST /research → SearXNG + Ollama |
| searxng | `/opt/stacks/searxng/` | 8600 | Self-hosted search backend (internal only, used by sam + citadel) |

## Public Subdomains (via NPM + Let's Encrypt)

| Subdomain | Internal Target |
|---|---|
| headscale.nxm.co.za | 172.27.40.3:8080 |
| vault.nxm.co.za | 172.27.40.3:8222 |
| kuma.nxm.co.za | 172.27.40.3:3002 |
| zabbix.nxm.co.za | 172.27.40.3:8091 |
| netbird.nxm.co.za | netbird-dashboard:80 via NPM (dashboard UI only) |
| netbird.nxm.co.za:8443 | Caddy sidecar → netbird-server:80 (client management, gRPC) |
| plane.nxm.co.za | 172.27.40.3:8095 |
| git.nxm.co.za | 172.27.40.3:3000 |
| rmm.nxm.co.za | 172.27.40.4:443 |
| api.nxm.co.za | 172.27.40.4:443 |
| mesh.nxm.co.za | 172.27.40.4:4430 |

## OPNsense Split DNS (Unbound Host Overrides)

All subdomains resolve to 172.27.40.3 internally via Unbound overrides.

If a subdomain isn't resolving internally, check:

1. OPNsense → Services → Unbound DNS → Overrides
2. Tailscale client on Windows: disconnect it, as it overrides local DNS

## Headscale (VPN)

- Version: v0.28
- Public URL: https://headscale.nxm.co.za
- Policy mode: database (live apply via API)
- **v0.28 breaking change:** All write operations require the numeric user ID, not the username
  - Get IDs: `headscale users list`
  - Example: `headscale preauthkeys create --user 13`
- Config: `/opt/stacks/headscale/config/config.yaml`
- After a restart, always check `tailscale status`; if the server node is offline, run `sudo systemctl restart tailscaled`

## Vaultwarden

- Admin panel: https://vault.nxm.co.za/admin (token in OneNote)
- Signups currently OPEN (disable after all 4 users register)
- To disable: set `SIGNUPS_ALLOWED=false` in `/opt/stacks/vaultwarden/.env` → `docker compose up -d`
- Admin token needs Argon2 upgrade: `docker exec -it vaultwarden /vaultwarden hash --preset owasp`

## Domain Knowledge

- **Networking:** VLANs, inter-VLAN routing, firewall rules, NAT, split DNS, DHCP; comfortable at OPNsense config level, not just GUI clicks
- **DNS:** Unbound overrides, split-horizon, DNS-over-TLS, troubleshooting resolution order (Tailscale/Netbird conflict patterns)
- **VPN:** WireGuard fundamentals, Headscale (self-hosted Tailscale), Netbird (self-hosted, embedded Dex IdP), relay vs P2P, ACL/policy models
- **Docker:** Compose stacks, container networking, volume mounts, healthchecks, reverse proxy patterns; not Kubernetes
- **Linux:** Ubuntu Server admin, systemd, cron, file permissions, basic shell scripting; not a kernel developer
- **Reverse proxy:** NPM (OpenResty), Caddy; knows the difference between HTTP/HTTPS termination and gRPC proxying
- **Self-hosted services:** IPAM (NetBox), monitoring (Zabbix, Uptime Kuma, VictoriaMetrics, Grafana), dashboards (Homarr, Dashy), secrets (Vaultwarden)
- **Not expert in:** Kubernetes, cloud platforms (AWS/Azure/GCP), advanced Python (learning), application development

## Agent Guidelines

- Never take destructive or irreversible action without explicit confirmation (delete, overwrite, drop, reset)
- Never store credentials in output, logs, or generated files; reference the credential location instead
- Always return structured output (JSON or markdown table) unless plain text is explicitly requested
- When suggesting shell commands, use PowerShell syntax for Windows-side tasks, bash for Linux server tasks
- SSH to Linux server always uses the Posh-SSH pattern defined in the Shell & Tools section
- Docker commands always use `docker compose` (not `docker-compose`) with `-q` on pulls
- If a task touches Headscale, Netbird, or NPM config, flag the relevant Known Issues entry before proceeding
- Prefer idempotent operations: scripts should be safe to run more than once
- When uncertain about current server state, ask rather than assume

## Credentials Location

- All service passwords: Jaco's OneNote + Nexum Password Spreadsheet
- No passwords stored in code or config files (except SSH for automation)

## Documentation (Gitea)

- Obsidian vaults migrated to Gitea on 2026-04-30
- **nxm-infrastructure** repo: `https://git.nxm.co.za/admin/nxm-infrastructure`
  - Local path: `/home/nxm/Documents/NxM Linux Server/`
  - Remote: `gitea-local:admin/nxm-infrastructure.git` (SSH)
- **nexum-projects** repo: `https://git.nxm.co.za/admin/nexum-projects`
  - Local path: `/home/nxm/Documents/Nexum Projects/`
  - Remote: `gitea-local:admin/nexum-projects.git` (SSH)
- **agent-os** repo: `https://git.nxm.co.za/admin/agent-os`
  - Local path: `/home/nxm/Documents/agent-os/`
  - Server runtime: `/opt/agent-os/` on 172.27.40.3
  - Remote: `gitea-local:admin/agent-os.git` (SSH)
- At the end of any session that changes infrastructure: update the relevant markdown files, commit, and push
- Key files to update after infrastructure changes:
  - `Home.md`: master index
  - `Quick Reference/IP & Port Map.md`: any new service
  - `Quick Reference/Docker Stacks.md`: any new stack
  - Relevant `Services/.md`: service-specific docs
  - `Troubleshooting/`: any new known issues

## Known Issues & Gotchas

- Docker pull logs are very verbose, so always use the `-q` flag
- Headscale YAML: never add a duplicate `policy:` block (causes a crash)
- Homarr board names must be slugs (no spaces)
- Vaultwarden requires HTTPS: the LAN IP (port 8222) shows a crypto error, so use vault.nxm.co.za
- Tailscale client on Windows overrides DNS; disconnect it when testing split DNS changes
- Chrome profile-specific links always open in the last active profile (Windows limitation)
- NPM forward scheme is HTTP even for HTTPS external, because NPM handles SSL termination
- Netbird STUN is on 3479/udp (not 3478, which Headscale owns)
- Netbird client management URL is port 8443 (Caddy sidecar), NOT 443
- NPM (OpenResty) has no gRPC module; the Caddy sidecar is the workaround until the Traefik migration
- Netbird config.yaml contains authSecret + encryptionKey; back this file up, because losing it breaks all peers
- Servers running Tailscale must run `sudo tailscale set --accept-dns=false` before joining Netbird (Tailscale DNS overrides Unbound and resolves via public IP, breaking gRPC hairpin)
- Open WebUI → Citadel MCP: auth_type must be `none`; an empty bearer key generates an illegal header and the connection silently fails
- Open WebUI connects via Streamable HTTP POST at `http://citadel-mcp:8300/mcp`; do NOT use /sse (Open WebUI 0.9+ only supports POST-based transport)
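
The duplicate `policy:` gotcha above can be caught before a Headscale restart with a small pre-flight check. The sketch below is a line-based heuristic, not a YAML parser: it counts keys that start in column 0 and reports any that repeat. The function name `duplicate_top_level_keys` and the sample config text are illustrative assumptions, not taken from the real `config.yaml`, and the scan would misfire on multi-line block scalars that contain unindented `key:` lines.

```python
import re
from collections import Counter

# A top-level YAML mapping key: starts in column 0 (so no indentation,
# no "#" comment, no "-" list item) and is followed by a colon.
_TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_][\w.-]*)\s*:(\s|$)")


def duplicate_top_level_keys(yaml_text: str) -> list[str]:
    """Return top-level keys that occur more than once, in first-seen order."""
    counts = Counter()
    for line in yaml_text.splitlines():
        match = _TOP_LEVEL_KEY.match(line)
        if match:
            counts[match.group(1)] += 1
    return [key for key, seen in counts.items() if seen > 1]


# Illustrative config text only (hypothetical values, not the live file).
sample = """\
server_url: https://headscale.nxm.co.za
policy:
  mode: database
dns:
  magic_dns: true
policy:
  mode: file
"""

print(duplicate_top_level_keys(sample))  # prints ['policy']
```

To use it against the server, read `/opt/stacks/headscale/config/config.yaml` into a string and pass it in; an empty list means no repeated top-level block was found. A plain text scan is used here because many YAML loaders silently keep only the last value for a repeated key, which would hide exactly the problem being checked.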