🔬 Vault Architecture: Stress Test & MoE

Workflow scenarios · Expert panel review · Write safety · Claude Code model · Git strategy

🧪 Workflow Stress Tests

Pat's real-world workflows and edge cases run against the proposed structure.

Scenario · Walkthrough · Result · Notes
1. Pat sends voice memo with 3 tasks in Slack
Most common workflow
1. OpenClaw receives → Shogun transcribes via Whisper
2. task-capture classifies each task
3. Captures → agents/shogun/memory/captures/2026-04-11.md
4. If any task is Forge-routable → mailbox to ops/mailbox/forge/inbox/
5. Evening brief surfaces captures for Pat validation
PASS Clean path. All files land in predictable locations.
2. Pat asks "what did Serge say last week?"
Knowledge retrieval
1. Shogun receives question
2. Query world-model API (:8081) for "Serge" + recent transcripts
3. API searches knowledge/transcripts/ + knowledge/people/serge-glushenko.md
4. Returns relevant snippets (not full files)
5. Shogun synthesizes and responds
PASS Clean separation: knowledge/ is the data lake, world-model API is the query layer.
3. Pat drops a PDF in staging
File triage workflow
1. PDF lands in ops/staging/from-pat/
2. Shogun detects on triage pass (or Pat says "check staging")
3. Classify: Is this a project artifact? Knowledge? Reference?
4. Process: Extract text, summarize if needed
5. Move to knowledge/reference/ or projects/*/artifacts/
6. Remove from staging
PASS With one open question: PDFs are .gitignored, so either convert them to Markdown before filing, or keep PDFs in knowledge/ and gitignore that directory.
4. Forge needs to deploy a code change
Engineering workflow
1. Forge (Claude Code) opens session at repo root
2. Reads root CLAUDE.md → identifies as Forge
3. Works in agents/forge/code/ or relevant project dir
4. Writes to agents/forge/architecture/ for ADRs
5. Logs to agents/forge/memory/
6. Commits via git
⚠️ But where does actual code live?
PARTIAL Gap identified. Code (Python scripts, configs, LaunchAgents) isn't clearly placed. See Claude Code section.
5. New agent "Dispatch" needs to be created
Agent onboarding
1. mkdir agents/dispatch/
2. Create SOUL.md + CLAUDE.md from template
3. mkdir agents/dispatch/memory/ skills/
4. mkdir ops/mailbox/dispatch/inbox/
5. Add to governance/ roster
6. Update root CLAUDE.md routing table
7. Configure in openclaw.json
PASS Simple, repeatable process. 7 steps, all scripted.
6. Pat says "update the Orcrist capture plan"
Cross-agent project work
1. Shogun receives → checks projects/orcrist-capture/README.md for owner
2. Reads current projects/orcrist-capture/plan.md
3. If Shogun owns: updates directly
4. If Forge owns: mailbox to Forge
5. Updates logged to projects/orcrist-capture/logs/
6. Decision captured in projects/orcrist-capture/history/
PASS Project dir has all the context. No searching across vaults.
7. Two agents edit the same project file simultaneously
Concurrency edge case
1. Shogun and Forge both open projects/orcrist/plan.md
2. Both make changes
3. Second writer overwrites first writer's changes
⚠️ No lock mechanism exists
RISK Real problem. Mitigations: (1) ownership rule in README.md, (2) agents check before writing, (3) git history as safety net. See Write Safety section.
8. Sentinel detects Shogun violated a constraint
Audit workflow
1. Sentinel reads gateway logs → detects violation
2. Writes checkpoint to ops/logs/checkpoint-reviews/
3. Posts alert to Slack #forge
4. Shogun reads alert on next heartbeat
5. Corrective action → logged in agents/shogun/memory/
PASS Clean separation of concerns. Sentinel writes to ops/, not to agents/shogun/.
9. Pat says "what's the status of everything?"
Status rollup
1. Shogun reads all projects/*/README.md (status field)
2. Reads agents/shogun/memory/captures/ for unchecked tasks
3. Reads recent ops/mailbox/shogun/inbox/ for pending items
4. Queries Linear API for ticket status
5. Synthesizes and responds
PASS All status info in predictable locations. ls projects/*/README.md gives the full portfolio.
10. Obsidian Sync conflict on Cockpit
Sync edge case
1. Pat edits governance/DECISIONS.md on Cockpit
2. Shogun appends to same file on Engine
3. Obsidian Sync creates conflict file
4. Need manual resolution
PARTIAL Mitigations: (1) governance/ changes go through ADR process (not ad-hoc edits), (2) Pat edits via Slack not Obsidian for governance, (3) Obsidian Sync conflict files are visible in Obsidian UI.
11. Pat travels with Nomad and needs project context
Remote access
1. Nomad has governance/ + agents/shogun/ via sparse checkout
2. Pat opens Obsidian → can read governance and Shogun notes
3. Needs a project file → projects/ not synced
4. Options: (a) SSH to Engine, (b) expand sparse checkout, (c) Obsidian Sync includes projects/
PARTIAL Need to decide: does Nomad sync projects/? Active projects only? All projects? Recommend: sync active projects only.
12. Knowledge gets stale: CRM contact changed jobs
Knowledge maintenance
1. Network SME runs enrichment pass
2. Detects stale contact in knowledge/people/john-smith.md
3. Updates file with new info
4. world-model API re-indexes on next cycle
5. Next time Shogun queries "John Smith" → gets current data
PASS Pipeline handles this. One file per person makes updates atomic.
13. Pat says "build me an investor portal"
Major new project
1. Shogun creates projects/dxd-investor-portal/
2. Writes README.md (owner: Forge, participants: Shogun + Forge)
3. Mailbox to Forge with requirements
4. Forge creates plan.md, starts in artifacts/
5. Code lives in... where?
⚠️ The code output isn't clearly addressed
PARTIAL Same gap as #4. A project that produces code needs a clear code location. See Claude Code & Git section.
14. Morning brief fails to generate
Service failure
1. 6:15 AM cron fires → Shogun starts brief generation
2. Linear API is down → partial data
3. Shogun writes what it can to ops/briefs/
4. Logs error to agents/shogun/memory/
5. Next heartbeat: retry?
PASS Failure is logged, partial output still serves. Structure doesn't hinder recovery.
15. Pat says "I told you about this last month"
Historical recall
1. Shogun checks agents/shogun/MEMORY.md (curated long-term)
2. If not there: checks agents/shogun/memory/2026-03-*.md (daily logs)
3. If not there: queries world-model API for transcripts
4. If not there: searches projects/*/history/ for Pat direction captures
PASS Four-layer recall: curated memory → daily logs → knowledge API → project history. Good coverage.
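
Scenario 5 describes agent onboarding as a short, repeatable sequence. The following is a minimal sketch of steps 1 through 4 only, assuming the vault root is passed in and that placeholder text stands in for the real SOUL.md and CLAUDE.md templates (the template location is not given above). The function name `scaffold_agent` is hypothetical. Steps 5 to 7 (governance roster, root CLAUDE.md routing table, openclaw.json) are left out because their formats are not described.

```python
from pathlib import Path


def scaffold_agent(root: Path, name: str) -> list:
    """Create the directories and identity files from scenario 5, steps 1-4.

    Returns the list of files that were newly created.
    """
    agent_dir = root / "agents" / name
    directories = [
        agent_dir / "memory",
        agent_dir / "skills",
        root / "ops" / "mailbox" / name / "inbox",
    ]
    for directory in directories:
        # parents=True also creates agents/<name>/ itself (step 1)
        directory.mkdir(parents=True, exist_ok=True)

    created = []
    for filename in ("SOUL.md", "CLAUDE.md"):
        target = agent_dir / filename
        if target.exists():
            continue  # never overwrite an existing identity file
        # Placeholder content (assumption): a real script would copy the template
        target.write_text(f"# {name}: {filename} (fill in from template)\n")
        created.append(target)
    return created
```

Running it twice is safe: the second call finds the files in place and creates nothing, which keeps the process repeatable.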

Summary

10 PASS Clean workflows: task capture, knowledge retrieval, file triage, agent onboarding, project work, status rollup, audit, maintenance, brief generation, historical recall.
4 PARTIAL Need refinement across three issues: Obsidian Sync conflicts (#10), Nomad sync scope (#11), code output location (#4 and #13).
2 GAPS Must address: (1) Where does code live? (#4, #13) (2) Concurrent write safety (#7, rated RISK).

🧠 Mixture of Experts: Panel Review

Five independent expert lenses evaluating the architecture. Each expert flags landmines and proposes fixes.

🏗️ Systems Architect

Verdict: Sound foundation, two structural gaps.

🔒 Security & Access Control

Verdict: Write safety is the biggest risk. Two critical recommendations.

📊 Operations & Cost

Verdict: Operationally clean. Watch the knowledge/ size in Git.

🤖 Agent UX & Developer Experience

Verdict: Excellent navigability. One UX concern.

👤 CEO / Stakeholder Experience

Verdict: Much better than current state. One gap for Pat's workflow.

MoE Consensus Findings

| Finding | Severity | Fix |
| --- | --- | --- |
| Code has no explicit home | High | Add code/ or infra/ convention (see Claude Code section) |
| Concurrent write risk | High | Write-access matrix + single-writer rules (see Write Safety section) |
| knowledge/ too large for Git | Medium | Define Git-tracked boundary (see Git section) |
| Agent skills/ naming ambiguity | Low | Add README.md to skills/ or rename to playbook/ |
| Entity contexts buried in agent dir | Medium | Move business context to knowledge/, keep agent stubs |
| Pat's quick notes have no home | Low | ops/staging/from-pat/ handles this |

💻 Claude Code Interaction Model

How Forge (Claude Code) interacts with the vault. The CLAUDE.md chain. Where code lives.

How Claude Code Works Today

The CLAUDE.md Chain

When Claude Code opens a session, it reads CLAUDE.md files from the working directory upward:

  1. Root ~/ShogunOS/CLAUDE.md: Currently says "You are Forge." This is the entry point for every Claude Code session.
  2. Subdirectory CLAUDE.md: If Claude Code cds into agents/forge/, it reads that CLAUDE.md too (extends parent).
  3. Project CLAUDE.md: A project could have its own CLAUDE.md with project-specific instructions.
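
The lookup order in this list can be illustrated with a short sketch. It only mirrors the chain as described here (working directory upward to the repo root); it is not Claude Code's actual implementation, and the function name `claude_md_chain` is hypothetical.

```python
from pathlib import Path


def claude_md_chain(workdir: Path, repo_root: Path) -> list:
    """Return the CLAUDE.md files between repo_root and workdir, root first."""
    workdir = workdir.resolve()
    repo_root = repo_root.resolve()
    if workdir != repo_root and repo_root not in workdir.parents:
        raise ValueError(f"{workdir} is not inside {repo_root}")

    found = []
    current = workdir
    while True:
        candidate = current / "CLAUDE.md"
        if candidate.is_file():
            found.append(candidate)
        if current == repo_root:
            break
        current = current.parent  # walk one level up toward the root
    # The walk collected deepest-first; reverse so the root file comes first
    return list(reversed(found))
```

For a session opened in agents/forge/, the result would list the root CLAUDE.md first and agents/forge/CLAUDE.md second, which matches the "extends parent" reading above.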

Key insight: Claude Code almost always IS Forge. Pat launches Claude Code at the repo root → it reads root CLAUDE.md → it becomes Forge. This is correct and should be preserved.

Proposed CLAUDE.md Architecture

| File | Contains | Who It Serves |
| --- | --- | --- |
| ~/ShogunOS/CLAUDE.md | System overview + Forge identity + vault map + operating principles. This is the PRIMARY identity file for Claude Code. | Forge (Claude Code) |
| agents/forge/CLAUDE.md | Forge-specific workspace details: project list, key files, current priorities | Forge when working in its own dir |
| agents/shogun/CLAUDE.md | Shogun-specific orientation (for when Claude Code reviews Shogun's workspace) | Any agent reviewing Shogun's dir |
| projects/*/CLAUDE.md | Optional: project-specific instructions for Claude Code sessions on that project | Forge when working on a specific project |

Design: Root CLAUDE.md stays Forge-oriented. Shogun runs on OpenClaw (reads SOUL.md/AGENTS.md, not CLAUDE.md). There's no conflict: they use different identity systems.

Where Does Code Live?

The Problem

The repo contains multiple types of executable code (Python scripts, configs, LaunchAgents) with no clear placement.

Recommendation: Keep Code in agents/forge/ with Clear Subdirs

Code is Forge's domain. Other agents don't write code. So code stays in Forge's workspace, organized by purpose:

| Code Type | Location | Example | Git-Tracked? |
| --- | --- | --- | --- |
| Service code (applications) | agents/forge/services/ | world-model app, brief server | Yes |
| Infrastructure configs | agents/forge/infra/ | LaunchAgent plists, nginx, Docker compose | Yes |
| Shared scripts | ops/scripts/ | mailbox-send.sh, log-rotate.sh | Yes |
| Agent-specific scripts | agents/*/scripts/ | Shogun's briefing generator | Yes |
| Project code outputs | projects/*/artifacts/ | Investor portal HTML | Yes (if text), No (if binary) |
| Runtime data (ChromaDB, venvs) | agents/forge/services/*/data/ | chromadb_store/, .venv/ | No (.gitignored) |
Why not top-level code/? Because code is Forge-owned. Putting it in agents/forge/ maintains the ownership principle. Shared scripts go in ops/scripts/ because any agent can use them. This matches the "who writes it?" principle throughout the system.

Claude Code Workflow

| Step | What Happens | Files Touched |
| --- | --- | --- |
| 1. Pat or Shogun requests work | Mailbox → ops/mailbox/forge/inbox/ | Mailbox message |
| 2. Forge (Claude Code) starts session | Reads root CLAUDE.md → becomes Forge | CLAUDE.md |
| 3. Forge reads the request | Checks inbox, reads project context | ops/mailbox/, projects/*/README.md |
| 4. Forge works | Writes code, configs, architecture docs | agents/forge/*, projects/*/artifacts/ |
| 5. Forge commits | git add + git commit with descriptive message | Git index |
| 6. Forge reports done | Writes DONE message to Shogun's mailbox | ops/mailbox/shogun/inbox/ |
| 7. Forge logs | Updates memory + changelog | agents/forge/memory/, agents/forge/CHANGELOG.md |

🔒 Write Safety: Who Can Write Where

The concurrent-write problem is real. Here's the access matrix and mitigation strategy.

Write Access Matrix

| Directory | Write Model | Who Writes | Concurrent Risk | Mitigation |
| --- | --- | --- | --- | --- |
| governance/ | Single-writer + approval | APEX proposes → Pat approves. No ad-hoc edits. | Low | ADR process gates all changes. Sentinel monitors for unauthorized writes. |
| governance/DECISIONS.md | Append-only, serialized | Any agent can append (via script) | Medium | Use decisions-append.sh with file locking (flock). Never edit existing entries. |
| knowledge/ | Pipeline-owned | Network SME, Librarian, ingestion pipelines | Low | Each agent owns specific subdirs. Network SME → people/. Transcript skill → transcripts/. No overlap. |
| agents/<name>/ | Owner-only | Only the named agent | None | Sentinel flags any write by non-owner. CLAUDE.md boundaries. |
| projects/*/README.md | Owner-only | Only the project owner (listed in README.md) | Low | Owner updates status. Others read only. |
| projects/*/plan.md | Owner-only | Only the project owner | Low | Same as README.md. |
| projects/*/artifacts/ | Multi-writer, file-level ownership | Any participating agent writes their own files | Medium | Each agent creates their own files (no two agents edit the same file). Naming convention: agent-slug.md. |
| projects/*/logs/ | Multi-writer, no overlap | Each agent writes YYYY-MM-DD-agentname.md | None | Filenames prevent collision by design. |
| projects/*/history/ | Append-only | Any agent adds new files (never edits old) | None | New files only. Old history files are immutable. |
| projects/*/memory/ | Owner-only | Project owner maintains context.md, open-questions.md | Low | Single maintainer. |
| ops/mailbox/ | Multi-writer, no overlap | Any agent writes to any inbox, unique filenames (timestamped) | None | Timestamp + sender in filename prevents collision. |
| ops/briefs/ | Single-writer | Shogun generates all briefs | None | One generator. |
| ops/staging/ | Multi-writer intake, single-writer triage | Anyone drops files. Shogun triages exclusively. | Low | Intake is additive. Triage (move/delete) is Shogun only. |
| ops/logs/ | Service-owned | Each service writes its own log file | None | One writer per log file by definition. |
| ops/scripts/ | Forge-only | Forge maintains all shared scripts | None | Single maintainer. |
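
A few rows of this matrix can be expressed as a path check, for example as something Sentinel or a pre-write hook might call. This is a sketch under assumptions: agent names are taken to equal their lowercase directory names, paths are vault-relative, only a subset of rows is covered, and anything not listed is denied. The function name `may_write` is hypothetical.

```python
from pathlib import PurePosixPath


def may_write(agent: str, path: str) -> bool:
    """Check a vault-relative path against a subset of the write-access matrix."""
    parts = PurePosixPath(path).parts
    if len(parts) < 2:
        return False  # top-level files are out of scope for this sketch

    top = parts[0]
    if top == "agents":
        return parts[1] == agent        # owner-only workspaces
    if top == "ops":
        if parts[1] == "briefs":
            return agent == "shogun"    # single generator
        if parts[1] == "scripts":
            return agent == "forge"     # single maintainer
        if parts[1] == "mailbox":
            return True                 # any agent, unique filenames
        return False
    if top == "projects" and len(parts) >= 4 and parts[2] == "logs":
        # Logs are named YYYY-MM-DD-agentname.md; the agent part starts at index 11
        filename = PurePosixPath(parts[3])
        return filename.suffix == ".md" and filename.stem[11:] == agent
    return False  # deny by default


print(may_write("forge", "agents/forge/memory/notes.md"))   # True
print(may_write("shogun", "agents/forge/memory/notes.md"))  # False
```

Denying by default keeps the sketch conservative: rows that depend on data outside the path (such as the project owner listed in README.md) would need extra lookups before they could be allowed.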

The Three Write Models

1. Owner-Only (safest)

One agent owns the file/dir. Nobody else writes.

Used for: agent workspaces, project README/plan, governance files, ops/scripts

Enforcement: CLAUDE.md instructions + Sentinel monitoring

2. Multi-Writer, No Overlap (safe by design)

Multiple agents write, but file naming prevents collision.

Used for: mailbox (timestamped), project logs (agent-named), project history (date+slug), staging (intake)

Enforcement: Naming convention makes collision impossible
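
As an illustration of the no-overlap model, the sketch below writes a mailbox message under a timestamp-plus-sender filename and uses exclusive-create mode so that a name clash fails loudly instead of overwriting. The exact filename format and the function name `drop_message` are assumptions; the text above only says that filenames carry a timestamp and the sender.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def drop_message(inbox: Path, sender: str, body: str,
                 now: Optional[datetime] = None) -> Path:
    """Write a new message file; raise rather than overwrite an existing one."""
    inbox.mkdir(parents=True, exist_ok=True)
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%S%fZ")  # microsecond resolution
    target = inbox / f"{stamp}-{sender}.md"
    # Mode "x" is exclusive create: FileExistsError if the name is already taken
    with open(target, "x", encoding="utf-8") as handle:
        handle.write(body)
    return target
```

The exclusive-create mode is the safety net: if two writers ever produce the same name, the second write is rejected and the first message survives intact.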

3. Append-Only, Serialized (requires tooling)

Multiple agents append to the same file. Requires locking script.

Used for: DECISIONS.md only

Enforcement: decisions-append.sh using flock for file-level locking. Never edit existing content; only append new entries.
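
The serialized-append idea can be sketched in Python with `fcntl.flock`, which provides the same kind of advisory lock as the flock tool named above. This is an illustration of the technique, not the contents of decisions-append.sh; the function name `append_decision` is hypothetical, and `fcntl` is available on Unix-like systems only.

```python
import fcntl
import os
from pathlib import Path


def append_decision(decisions_file: Path, entry: str) -> None:
    """Append one entry while holding an exclusive advisory lock."""
    # Mode "a" only ever writes at the end, so existing entries are untouched
    with open(decisions_file, "a", encoding="utf-8") as handle:
        fcntl.flock(handle.fileno(), fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            handle.write(entry.rstrip("\n") + "\n")
            handle.flush()
            os.fsync(handle.fileno())  # push the entry to disk before unlocking
        finally:
            fcntl.flock(handle.fileno(), fcntl.LOCK_UN)
```

Two caveats apply. The lock is advisory, so it only serializes writers that also take it; a direct edit through Obsidian bypasses it, which is why the governance rules matter alongside the script. Also, if the host runs macOS (the LaunchAgent references suggest it might), the flock command-line tool may need to be installed separately.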

Key insight: By designing the naming conventions correctly, most concurrent-write risk disappears. The only truly dangerous pattern (two agents editing the same file) is eliminated everywhere except DECISIONS.md, which gets a locking script.

📦 Git & GitHub Strategy

What's tracked, what's synced, how code ships.

Git-Tracked vs. Sync-Only

| Directory | Git-Tracked? | Size | Rationale |
| --- | --- | --- | --- |
| governance/ | ✅ Yes | ~1 MB | Constitutional docs need version history. Every change auditable. |
| knowledge/ | ❌ No (.gitignored) | ~475 MB | Too large for Git. Synced via Obsidian Sync + rclone. World-model API is primary access. |
| agents/ | ✅ Yes (except runtime data) | ~5 MB | Agent identity and skills need version history. .gitignore: chromadb_store/, .venv/, *.log |
| projects/ | ✅ Yes | ~10 MB | Project decisions and artifacts need audit trail. |
| ops/scripts/ | ✅ Yes | ~100 KB | Automation scripts are code and need version control. |
| ops/mailbox/ | ⚡ Optional | Variable | Historical value but high churn. Could .gitignore and rely on filesystem. |
| ops/briefs/ | ❌ No | Variable | Regenerated daily. No version history needed. |
| ops/staging/ | ❌ No | Variable | Transient. Files move to final home. |
| ops/logs/ | ❌ No | Variable | Rotated. Already .gitignored. |
| archive/ | ⚡ Selective | ~46 MB | Keep lightweight archives. .gitignore large binary archives. |

Estimated Git Repo Size After Restructure

~16 MB Git-tracked content (governance + agents + projects + ops/scripts). Down from current ~50MB+ with cleaner boundaries. The 475MB knowledge/ is out of Git entirely.
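
The ~16 MB figure can be checked against the per-directory sizes quoted in the tracking table. The sketch below only sums those approximate values; the helper name `tracked_total_mb` is hypothetical, and archive/ is left out because only a selective part of it would be tracked.

```python
def tracked_total_mb(sizes_mb: dict) -> float:
    """Sum the approximate sizes (in MB) of the Git-tracked directories."""
    return sum(sizes_mb.values())


# Approximate sizes taken from the tracking table (ops/scripts/ is ~100 KB)
sizes = {
    "governance/": 1.0,
    "agents/": 5.0,
    "projects/": 10.0,
    "ops/scripts/": 0.1,
}
print(f"Git-tracked total: ~{tracked_total_mb(sizes):.1f} MB")
```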

Git Workflow

| Who | Commits | Branch Strategy | Push Cadence |
| --- | --- | --- | --- |
| Forge | Code changes, infra configs, architecture docs | Direct to main for routine. Feature branches for major changes. | After every significant work session |
| Shogun | Does NOT commit directly. Shogun runs on OpenClaw, not Claude Code. | N/A | N/A: Shogun's writes are picked up in Forge's next commit, or auto-committed via a cron job |
| Auto-commit cron | Catches Shogun's memory writes, mailbox messages, captures | main | Every 30 min (if changes exist) |
| Pat (Obsidian) | Direct edits via Obsidian Sync → Engine | Obsidian Sync handles. Git picks up on next auto-commit. | Passive |
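
The auto-commit row can be sketched as a small function that commits only when the working tree has changes. This is an assumption-laden illustration rather than the planned auto-commit.sh: the function name, the default commit message, and the injectable `run` parameter (used so the logic can be exercised without a real repository) are all hypothetical. The Git commands themselves (`git status --porcelain`, `git add --all`, `git commit -m`) are standard.

```python
import subprocess
from pathlib import Path
from typing import Callable


def auto_commit(repo: Path, message: str = "auto: periodic snapshot",
                run: Callable = subprocess.run) -> bool:
    """Commit all pending changes in repo; return True if a commit was made."""
    status = run(["git", "status", "--porcelain"], cwd=repo,
                 capture_output=True, text=True, check=True)
    if not status.stdout.strip():
        return False  # clean tree: matches "if changes exist"
    # git add respects .gitignore, so ignored directories stay out of the commit
    run(["git", "add", "--all"], cwd=repo, check=True)
    run(["git", "commit", "-m", message], cwd=repo, check=True)
    return True
```

A cron entry or LaunchAgent would call this every 30 minutes. Because staging honors .gitignore, the tracked/untracked boundary described in this section is what decides which of Shogun's writes end up in history.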

GitHub Remote

Current State

Remote: github.com/patrickrmadden-byte/ShogunOS (private repo)

Issue: 34 local commits with no shared ancestry to GitHub. Forge flagged this as a pain point that needs a sync reconciliation.

Post-Restructure

Code vs. Docs: "Docs in the vault. Code in the repo." remains the Forge principle. The repo IS the vault. There's no separate code repo. All code (services, scripts, infra configs) is tracked in Git alongside docs. The .gitignore boundary separates tracked (small, auditable) from untracked (large, transient).

🔧 Proposed Improvements

Changes to the v3 blueprint based on stress testing and MoE review.

| # | Change | Why | Impact |
| --- | --- | --- | --- |
| 1 | Add agents/forge/services/ for application code and agents/forge/infra/ for infrastructure configs | Code currently has no explicit home. Forge owns all code. | Resolves stress test scenarios #4 and #13 |
| 2 | Move business entity operating contexts (DxD, HoldCo) to knowledge/. Keep only CLAUDE.md stubs in agents/shogun/entities/ | MoE CEO expert: Pat looks for DxD info in the knowledge base, not inside Shogun's workspace | Better discoverability for Pat |
| 3 | Create ops/scripts/decisions-append.sh with file locking for DECISIONS.md | Only append-only multi-writer file in the system. Needs locking. | Eliminates concurrent-write risk |
| 4 | .gitignore knowledge/ entirely. Sync via Obsidian Sync + rclone. | 475MB is too large for Git. World-model API is primary access anyway. | Repo drops from ~525MB to ~16MB |
| 5 | Add auto-commit cron (every 30 min) for Shogun's writes | Shogun runs on OpenClaw (no Claude Code, no git access). Its writes need to be committed. | Shogun's memory, captures, and mailbox messages get version-tracked |
| 6 | Add README.md to agents/*/skills/ clarifying it's operational reference, not OpenClaw SKILL.md files | MoE Agent UX expert: naming ambiguity with OpenClaw's skill system | Prevents agent confusion |
| 7 | Add project log rotation: 30 days, then consolidate to single archive file | MoE Ops expert: project logs could accumulate without bounds | Prevents project dir bloat |
| 8 | Add Nomad sync rule: governance/ + agents/shogun/ + active projects/ (via Git sparse checkout) | Stress test #11: Pat needs project context on travel | Resolves Nomad access gap |
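
Change #7 (log rotation) can be sketched as follows, assuming project logs follow the YYYY-MM-DD-agentname.md naming from the write-access matrix. The archive filename `archive.md`, the function name, and the heading written above each archived log are assumptions made for illustration.

```python
from datetime import date, timedelta
from pathlib import Path


def rotate_project_logs(logs_dir: Path, today: date, keep_days: int = 30) -> int:
    """Fold logs older than keep_days into one archive file; return the count."""
    cutoff = today - timedelta(days=keep_days)
    archive = logs_dir / "archive.md"  # hypothetical name for the consolidated file
    moved = 0
    # sorted() materializes the listing first and yields oldest logs first
    for log in sorted(logs_dir.glob("????-??-??-*.md")):
        try:
            log_date = date.fromisoformat(log.name[:10])
        except ValueError:
            continue  # not a dated log; leave it alone
        if log_date >= cutoff:
            continue  # still inside the retention window
        content = log.read_text(encoding="utf-8").rstrip()
        with open(archive, "a", encoding="utf-8") as out:
            out.write(f"## {log.name}\n\n{content}\n\n")
        log.unlink()  # delete only after the content is in the archive
        moved += 1
    return moved
```

Passing `today` explicitly keeps the function easy to test and lets a scheduled job decide which clock to trust. Under the ownership rules above, rotation would be run by a single designated writer so that the archive file itself never has two editors.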

Updated Structure (Incorporating Fixes)

Changes from v3 shown with NEW tags:

ShogunOS/
├── CLAUDE.md              ← Forge identity (Claude Code entry point)
├── README.md              ← Human onboarding
│
├── governance/            ← Git-tracked. Read every session.
│   ├── SYSTEM.md
│   ├── DECISIONS.md       ← append via decisions-append.sh (flock)
│   ├── SYSTEM_LESSONS.md
│   ├── THE_AGENT_DOCTRINE.md
│   ├── decisions/
│   ├── frameworks/
│   ├── integrations/
│   └── metrics/
│
├── knowledge/             ← .gitignored. Synced via Obsidian/rclone.
│   ├── people/
│   ├── orgs/
│   ├── transcripts/
│   ├── digests/
│   ├── reference/
│   ├── pat/
│   ├── dxd/               ← NEW: DxD business context (moved from entities)
│   ├── holdco/            ← NEW: HoldCo context (moved from entities)
│   ├── montauk/           ← NEW: Montauk context (moved from entities)
│   ├── graph/
│   └── staging/
│
├── agents/                ← Git-tracked.
│   ├── shogun/
│   │   ├── SOUL.md, CLAUDE.md, MEMORY.md, HEARTBEAT.md
│   │   ├── memory/
│   │   ├── skills/
│   │   │   └── README.md  ← NEW: clarifies these aren't SKILL.md files
│   │   ├── entities/      ← CLAUDE.md stubs only (business context in knowledge/)
│   │   └── scripts/
│   │
│   ├── forge/
│   │   ├── SOUL.md, CLAUDE.md, FORGE_DOCTRINE.md, CHANGELOG.md
│   │   ├── memory/
│   │   ├── skills/
│   │   ├── architecture/
│   │   ├── services/      ← NEW: application code (world-model, brief-server)
│   │   ├── infra/         ← NEW: LaunchAgents, nginx, Docker configs
│   │   ├── specs/
│   │   └── code/          ← symlinks to service code if needed
│   │
│   └── sentinel/
│
├── projects/              ← Git-tracked.
│   └── <name>/
│       ├── README.md, plan.md
│       ├── decisions/, artifacts/, research/
│       ├── history/, memory/, logs/
│       └── archive/
│
├── ops/                   ← scripts/ Git-tracked. Rest .gitignored.
│   ├── mailbox/
│   ├── briefs/
│   ├── staging/
│   ├── logs/
│   └── scripts/
│       ├── mailbox-send.sh
│       ├── decisions-append.sh  ← NEW: flock-based append for DECISIONS.md
│       ├── auto-commit.sh       ← NEW: 30-min cron for Shogun's writes
│       └── log-rotate.sh
│
└── archive/

Result: All MoE findings addressed. All stress test gaps closed. Write safety handled by design (naming conventions) + tooling (flock script). Code has a home. Git boundary is clean.