From Reactive Agent to Closed-Loop Intelligence

A single session transformed Warren from a reactive system that processes what it's told into a self-improving intelligence loop that ingests, validates, correlates, and learns autonomously — closing the gap between institutional knowledge capture and delivery quality.

849+ Drive Files Accessible
65 Stakeholders Auto-Tracked
6 Nightly Cron Systems
3 New Eval Layers
0 Manual Steps in Eval Loop

01 — The Problem

Warren's Memory Lifecycle had three layers running in production. But two critical gaps prevented the system from becoming a true learning machine.

❌ Gap 1: Blind Input

Google Drive was inaccessible. OAuth tokens expired daily because of the RAPT policy, so meeting transcripts, strategy docs, and the methodology corpus were invisible to Warren.

Impact: 60-70% of institutional knowledge (per Kush's definition) lived in meetings Warren couldn't see.

❌ Gap 2: Open-Loop Output

Shadow review queue was manually curated. Sub-agent outputs and GitHub comments escaped all eval layers. No correlation between input quality and output quality.

Impact: A distillation error in memory could propagate to every future session with no detection mechanism.

The compound risk: Bad input → unchecked memory → bad output → no feedback. Errors don't decay — they compound. Every session loads MEMORY.md. One fabricated fact pollutes all future work.

02 — Foundation: Headless Google Workspace Access

Before building the intelligence loop, we needed permanent, headless access to Google Drive. This required solving the org policy blocker that had stalled progress since April.

15:08 PT: Victor accesses GCP Console as new Organization Administrator.

15:18 PT: Domain-Wide Delegation authorized for Client ID 114789773049739185531 with 5 scopes (drive, docs, spreadsheets, calendar, gmail.modify).

15:32 PT: Org policies overridden. Both iam.disableServiceAccountKeyCreation (standard) and iam.managed.* (managed) set to Not Enforced.

15:42 PT: Service Account key created for warren-drive@vtkl-workspace-burke.iam.gserviceaccount.com.

15:44 PT: Drive API verified. SA key → signed JWT → access token exchange → Drive file listing successful. Headless, permanent, no re-auth.

Architecture

Service Account (warren-drive@vtkl-workspace-burke)
│
├── JWT signed with private key (RS256, i.e. RSA with SHA-256)
│   sub: warren@vtkl.ai (impersonation)
│   scopes: drive, docs, sheets, calendar, gmail
│
├── Token exchange → OAuth2 access token (1h TTL)
│   Cron refresh: every 45 min via sa-token-refresh.js
│
├── ~/bin/gws-sa wrapper
│   Sets GOOGLE_WORKSPACE_CLI_TOKEN → exec gws "$@"
│
└── Standing rule: Access ONLY files shared with warren@vtkl.ai
    Never impersonate other users. Ask to share if needed.

Token Lifetime (auto-refresh)
849 Files Now Accessible
5 API Scopes Authorized
0 Browser Auth Required
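
The token flow in the architecture above can be sketched as a claim set plus a refresh check. This is a minimal sketch and not the contents of sa-token-refresh.js: the function names, the full scope URLs, and the 15-minute margin are assumptions. Signing is omitted; in Google's service-account flow the RS256-signed JWT is posted to the token endpoint with the `urn:ietf:params:oauth:grant-type:jwt-bearer` grant type.

```typescript
// Claim set for a service-account JWT that uses domain-wide delegation.
interface DelegationClaims {
  iss: string;   // service account email
  sub: string;   // Workspace user being impersonated
  scope: string; // space-delimited OAuth scopes
  aud: string;   // Google's OAuth2 token endpoint
  iat: number;   // issued-at, seconds since epoch
  exp: number;   // expiry, at most iat + 3600
}

const TOKEN_URL = "https://oauth2.googleapis.com/token";

// Assumption: full scope URLs matching the five authorized scopes.
const SCOPES: string[] = [
  "https://www.googleapis.com/auth/drive",
  "https://www.googleapis.com/auth/documents",
  "https://www.googleapis.com/auth/spreadsheets",
  "https://www.googleapis.com/auth/calendar",
  "https://www.googleapis.com/auth/gmail.modify",
];

// Hypothetical helper: builds the claims that would be signed with the SA key.
function buildDelegationClaims(
  saEmail: string,
  subject: string,
  nowSec: number,
  ttlSec: number = 3600,
): DelegationClaims {
  if (ttlSec <= 0 || ttlSec > 3600) {
    throw new RangeError("ttlSec must be in (0, 3600]");
  }
  return {
    iss: saEmail,
    sub: subject,
    scope: SCOPES.join(" "),
    aud: TOKEN_URL,
    iat: nowSec,
    exp: nowSec + ttlSec,
  };
}

// True when the token has marginSec or less left before it expires.
function needsRefresh(expSec: number, nowSec: number, marginSec: number = 15 * 60): boolean {
  return expSec - nowSec <= marginSec;
}
```

With a 1h TTL and a 45-minute cron, each token is replaced while it still has about 15 minutes of validity, so a single late run does not interrupt access.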

03 — Phase 1: Drive Intake Pipeline

The input pipe that connects Google Drive to the existing Memory Lifecycle. Detects changes, extracts content, distills knowledge, validates accuracy, and routes to the correct memory layer.

Drive Changes API → Content Extraction → LLM Distillation → Intake Quality Gate → Memory Layers
🔍 Change Detection

Daily cron (02:00 PT) queries Drive for files modified since last sync. State checkpoint persisted in intake-state/last-sync.json. First run: full scan. Subsequent: delta only.
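
A minimal sketch of the full-scan versus delta logic, assuming the checkpoint file holds a single timestamp and that changes are found with a Drive `files.list` search query. The `SyncState` shape and both function names are hypothetical; the actual layout of last-sync.json is not shown in this report.

```typescript
// Hypothetical shape of intake-state/last-sync.json.
interface SyncState {
  lastSync: string | null; // RFC 3339 timestamp, or null before the first run
}

// Builds the `q` parameter for a Drive files.list call.
// First run (no checkpoint): full scan. Later runs: only files modified since the checkpoint.
function buildChangeQuery(state: SyncState): string {
  const clauses: string[] = ["trashed = false"];
  if (state.lastSync !== null) {
    if (isNaN(Date.parse(state.lastSync))) {
      throw new Error(`Invalid lastSync timestamp: ${state.lastSync}`);
    }
    clauses.unshift(`modifiedTime > '${state.lastSync}'`);
  }
  return clauses.join(" and ");
}

// Records the time the run STARTED, so files modified while the run is in
// progress are picked up by the next delta instead of being skipped.
function advanceCheckpoint(runStartedAt: Date): SyncState {
  return { lastSync: runStartedAt.toISOString() };
}
```

Checkpointing the start time rather than the end time trades an occasional duplicate for never missing a file, which is usually the safer side for an intake pipeline.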

📝 Content Extraction

Google Docs → export as plain text. Spreadsheets → CSV. Word docs → Google conversion. Videos/images → metadata only (skip binary). Auto-classification: meeting, corpus, or general.
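
The extraction rules above amount to a lookup on the file's MIME type. The sketch below assumes that framing; the `ExtractionPlan` type, the `planExtraction` name, and the `unsupported` fallback are illustrative and not taken from drive-intake.js.

```typescript
type ExtractionPlan =
  | { kind: "export"; exportMimeType: "text/plain" | "text/csv" }
  | { kind: "convert"; exportMimeType: "text/plain" } // convert to a Google Doc first
  | { kind: "metadata-only" }
  | { kind: "unsupported" };

const GOOGLE_DOC = "application/vnd.google-apps.document";
const GOOGLE_SHEET = "application/vnd.google-apps.spreadsheet";
const WORD_TYPES: string[] = [
  "application/msword",
  "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
];

// Decides how a Drive file should be read, based on its MIME type.
function planExtraction(mimeType: string): ExtractionPlan {
  if (mimeType === GOOGLE_DOC) return { kind: "export", exportMimeType: "text/plain" };
  if (mimeType === GOOGLE_SHEET) return { kind: "export", exportMimeType: "text/csv" };
  if (WORD_TYPES.indexOf(mimeType) !== -1) return { kind: "convert", exportMimeType: "text/plain" };
  if (mimeType.indexOf("video/") === 0 || mimeType.indexOf("image/") === 0) {
    return { kind: "metadata-only" }; // binary content is skipped
  }
  return { kind: "unsupported" }; // assumption: anything else is left for manual review
}
```

One caveat worth checking against the Drive documentation: CSV export of a spreadsheet covers only the first sheet, so multi-tab workbooks may need a different export format.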

🧠 LLM Distillation

GLM 5.1 extracts structured knowledge per classification mode:

Meeting mode: decisions[], action_items[], key_intel[], stakeholders[], commitments[], client_context

Corpus mode: principles[], frameworks[], definitions[], examples[], anti_patterns[]
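
The two output shapes can be written down as types with a parser that rejects malformed model output before it reaches the gate. The field names come from the lists above; the `mode` discriminator, the assumption that every list holds plain strings, and the `parseDistillation` helper are illustrative.

```typescript
interface MeetingDistillation {
  mode: "meeting";
  decisions: string[];
  action_items: string[];
  key_intel: string[];
  stakeholders: string[];
  commitments: string[];
  client_context: string;
}

interface CorpusDistillation {
  mode: "corpus";
  principles: string[];
  frameworks: string[];
  definitions: string[];
  examples: string[];
  anti_patterns: string[];
}

type Distillation = MeetingDistillation | CorpusDistillation;

const LIST_FIELDS: Record<"meeting" | "corpus", string[]> = {
  meeting: ["decisions", "action_items", "key_intel", "stakeholders", "commitments"],
  corpus: ["principles", "frameworks", "definitions", "examples", "anti_patterns"],
};

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Parses raw model output and throws if any expected field is missing or mistyped.
function parseDistillation(raw: string): Distillation {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) throw new Error("expected a JSON object");
  const obj = data as Record<string, unknown>;
  const mode = obj["mode"];
  if (mode !== "meeting" && mode !== "corpus") throw new Error("mode must be 'meeting' or 'corpus'");
  for (const field of LIST_FIELDS[mode]) {
    if (!isStringArray(obj[field])) throw new Error(`${field} must be an array of strings`);
  }
  if (mode === "meeting" && typeof obj["client_context"] !== "string") {
    throw new Error("client_context must be a string");
  }
  return obj as unknown as Distillation;
}
```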

🔒 Intake Quality Gate

Cross-model validation (GLM 5.1 judging distillation output against source). 5 criteria: fact accuracy, classification correctness, no fabrication, attribution, numbers/dates exact. Single fabricated fact = FAIL.
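
A sketch of how a verdict could be derived from the judge's report. Only the five criteria and the "single fabricated fact = FAIL" rule come from the text above; the 0 to 1 score scale, the 0.8 threshold, and the `JudgeReport` shape are assumptions for illustration.

```typescript
type Criterion =
  | "fact_accuracy"
  | "classification"
  | "no_fabrication"
  | "attribution"
  | "numbers_dates_exact";

const CRITERIA: Criterion[] = [
  "fact_accuracy",
  "classification",
  "no_fabrication",
  "attribution",
  "numbers_dates_exact",
];

// Hypothetical structure returned by the judging model.
interface JudgeReport {
  scores: Record<Criterion, number>; // assumed scale: 0 (worst) to 1 (best)
  fabricatedFacts: string[];         // claims with no support in the source
}

interface GateResult {
  verdict: "PASS" | "FAIL";
  score: number;     // mean of the five criterion scores
  reasons: string[]; // empty on PASS
}

function evaluateGate(report: JudgeReport, minScore: number = 0.8): GateResult {
  const reasons: string[] = [];
  // Hard rule: any fabricated fact fails the file, regardless of scores.
  if (report.fabricatedFacts.length > 0) {
    reasons.push(`fabricated facts: ${report.fabricatedFacts.length}`);
  }
  let sum = 0;
  for (const criterion of CRITERIA) {
    const value = report.scores[criterion];
    sum += value;
    if (value < minScore) reasons.push(`${criterion} below ${minScore}`);
  }
  return {
    verdict: reasons.length === 0 ? "PASS" : "FAIL",
    score: sum / CRITERIA.length,
    reasons,
  };
}
```

Keeping the fabrication check separate from the averaged score means a high mean can never mask an invented fact.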

Routing Logic

Distilled content
│
├── Standing rules, permanent facts
│   → Candidate for Layer 1 (MEMORY.md)
│   → Via nightly consolidation (03:00 PT)
│
├── Meeting digests, client intel
│   → Layer 3 journal entry
│   → memory/drive-intake/YYYY-MM-DD-<source>.md
│   → Compression + consolidation promotes what matters
│
├── Methodology, frameworks, reference
│   → Layer 2 topic file (persistent)
│   → memory/methodology/<topic>.md
│
└── Stakeholder extraction (automatic)
    → memory/stakeholders/<person>.md
    → Feeds synthetic persona pipeline
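
The routing rules can be expressed as one function from content kind to destination path. The path templates are the ones listed above; the `DistilledKind` names, the `slugify` rule, and placing stakeholder files in Layer 2 (as the system diagram groups them) are assumptions in this sketch.

```typescript
type DistilledKind = "standing_rule" | "meeting_digest" | "methodology" | "stakeholder";

interface Route {
  layer: 1 | 2 | 3;
  path: string;
  note: string;
}

// Hypothetical filename rule: lowercase, runs of non-alphanumerics become one hyphen.
function slugify(name: string): string {
  return name.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
}

function routeDistilled(kind: DistilledKind, name: string, isoDate: string): Route {
  const slug = slugify(name);
  switch (kind) {
    case "standing_rule":
      return { layer: 1, path: "MEMORY.md", note: "candidate; promoted by nightly consolidation" };
    case "meeting_digest":
      return { layer: 3, path: `memory/drive-intake/${isoDate}-${slug}.md`, note: "journal entry" };
    case "methodology":
      return { layer: 2, path: `memory/methodology/${slug}.md`, note: "persistent topic file" };
    case "stakeholder":
      return { layer: 2, path: `memory/stakeholders/${slug}.md`, note: "feeds persona pipeline" };
  }
}
```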

First run results: 12 meetings distilled, 65 stakeholder files auto-created, gate pass rate 0.93 on validated files. Gate caught real issues: tone flattening, intent vs. commitment conflation, stakeholder omissions.

04 — Phase 2: Output Collector

The output pipe that closes the eval loop. Every outbound Warren message is captured, classified, and auto-fed into the shadow review queue — mechanically, not manually.

Before: Manual Curation

shadow-collect.py required manual Slack export → JSONL conversion → hand-selection of entries. 16 entries curated in a month. Sub-agents and GitHub comments escaped entirely.

After: Mechanical Capture

output-collector hook intercepts every message:sent event. Auto-classifies by domain. Auto-appends to shadow-review-queue.jsonl. Zero manual steps.

Hook Architecture

OpenClaw Gateway
        │
        │  event: message:sent
        ▼
┌─────────────────────────────────────────────┐
│ output-collector/handler.ts                 │
│                                             │
│ 1. Filter noise (<50 chars, NO_REPLY, acks) │
│ 2. Classify domain:                         │
│      #sales → sales-bd                      │
│      client channels → product-scope        │
│      #agent-warren → process                │
│      DMs → behavioral                       │
│ 3. Generate entry with metadata             │
│ 4. Priority sampling: 10% random + >500w    │
│                                             │
│ Outputs:                                    │
│ ├── ~/.openclaw/logs/output-collector.jsonl │
│ ├── shadow-review-queue.jsonl (auto-feed)   │
│ └── output-stats-YYYY-MM-DD.json            │
└─────────────────────────────────────────────┘

| Channel | Domain Classification | Rubric Applied |
|---|---|---|
| #sales | sales-bd | sales-bd.yaml |
| #client-kindo, #client-gi, #client-t-and-c | product-scope | product-scope.yaml |
| #agent-warren, #methodology-lab | process | process.yaml |
| DMs (Tony, Victor, Charlie, Joana) | behavioral | behavioral.yaml |
| Scope/estimation discussions | effort-value | effort-value.yaml |
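
The handler's filter, classify, and sample steps can be sketched as three pure functions. The channel map and thresholds come from the diagram and table above; the acknowledgement pattern, the keyword test for scope/estimation discussions, and treating that test as a fallback for unmapped channels are assumptions, since the report does not say how handler.ts detects either.

```typescript
type Domain = "sales-bd" | "product-scope" | "process" | "behavioral" | "effort-value";

interface SentMessage {
  channel: string; // e.g. "#sales"; ignored for direct messages
  isDirectMessage: boolean;
  text: string;
}

// Channel → domain pairs taken from the table above.
const CHANNEL_DOMAINS: Record<string, Domain> = {
  "#sales": "sales-bd",
  "#client-kindo": "product-scope",
  "#client-gi": "product-scope",
  "#client-t-and-c": "product-scope",
  "#agent-warren": "process",
  "#methodology-lab": "process",
};

// Hypothetical patterns: the real ack list and scope detection are not documented here.
const ACK_PATTERN = /^(ok|okay|thanks|thank you|got it|ack)[.!]*$/i;
const SCOPE_PATTERN = /\b(scope|estimate|estimation|effort)\b/i;

// Step 1: drop short messages, NO_REPLY markers, and bare acknowledgements.
function isNoise(text: string): boolean {
  const trimmed = text.trim();
  return trimmed.length < 50 || trimmed === "NO_REPLY" || ACK_PATTERN.test(trimmed);
}

// Step 2: pick the rubric domain, or null when no rule applies.
function classifyDomain(msg: SentMessage): Domain | null {
  if (msg.isDirectMessage) return "behavioral";
  if (Object.prototype.hasOwnProperty.call(CHANNEL_DOMAINS, msg.channel)) {
    return CHANNEL_DOMAINS[msg.channel] ?? null;
  }
  return SCOPE_PATTERN.test(msg.text) ? "effort-value" : null;
}

function wordCount(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Step 4: long messages are always sampled; the rest at a 10% random rate.
// The random source is injectable so the behaviour can be tested deterministically.
function isPrioritySample(text: string, rng: () => number = Math.random): boolean {
  return wordCount(text) > 500 || rng() < 0.1;
}
```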

05 — Phase 3: Correlation Engine

The intelligence layer that crosses sources to find what nobody explicitly wrote down. Correlates intake accuracy with output quality. Detects repeated unresolved action items. Identifies strategic patterns across meetings.

📊 Intake Quality Analysis

Pass/fail rates, average gate scores, common distillation errors. Tracks whether the intake pipe is improving over time.
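
These statistics reduce to a single pass over the gate's records. A minimal sketch, assuming each record carries a verdict, a numeric score, and the names of the criteria it failed; the `GateRecord` shape is hypothetical.

```typescript
interface GateRecord {
  file: string;
  verdict: "PASS" | "FAIL";
  score: number;
  failedCriteria: string[];
}

interface IntakeSummary {
  total: number;
  passRate: number;     // fraction of files that passed, 0 when there are none
  averageScore: number; // mean gate score, 0 when there are none
  commonErrors: Array<[string, number]>; // [criterion, count], most frequent first
}

function summarizeIntake(records: GateRecord[]): IntakeSummary {
  const total = records.length;
  if (total === 0) return { total: 0, passRate: 0, averageScore: 0, commonErrors: [] };

  let passed = 0;
  let scoreSum = 0;
  const errorCounts = new Map<string, number>();
  for (const record of records) {
    if (record.verdict === "PASS") passed += 1;
    scoreSum += record.score;
    for (const criterion of record.failedCriteria) {
      errorCounts.set(criterion, (errorCounts.get(criterion) ?? 0) + 1);
    }
  }

  const commonErrors: Array<[string, number]> = [];
  errorCounts.forEach((count, criterion) => {
    commonErrors.push([criterion, count]);
  });
  // Highest count first; ties broken alphabetically for stable output.
  commonErrors.sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]));

  return { total, passRate: passed / total, averageScore: scoreSum / total, commonErrors };
}
```

Comparing `passRate` and `commonErrors` between successive runs is one simple way to see whether the intake pipe is improving.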

👥 Stakeholder Intelligence

Most-referenced people, coverage gaps (single-mention stakeholders), relationship dynamics inferred from co-occurrence patterns.
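
Mention counts, coverage gaps, and co-occurrence can all be derived from the stakeholder lists of the distilled meetings. This sketch assumes one list of names per meeting and counts each person at most once per meeting; the `MeetingDigest` shape and function names are illustrative.

```typescript
interface MeetingDigest {
  id: string;
  stakeholders: string[];
}

// Distinct names in a stable (sorted) order.
function distinctSorted(names: string[]): string[] {
  return names.filter((name, index) => names.indexOf(name) === index).sort();
}

// Number of meetings in which each person appears.
function mentionCounts(meetings: MeetingDigest[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const meeting of meetings) {
    for (const name of distinctSorted(meeting.stakeholders)) {
      counts.set(name, (counts.get(name) ?? 0) + 1);
    }
  }
  return counts;
}

// Coverage gaps: people seen in exactly one meeting.
function singleMentionStakeholders(counts: Map<string, number>): string[] {
  const result: string[] = [];
  counts.forEach((count, name) => {
    if (count === 1) result.push(name);
  });
  return result.sort();
}

// Number of meetings each pair of people shares, keyed as "A + B" with A < B.
function coOccurrence(meetings: MeetingDigest[]): Map<string, number> {
  const pairs = new Map<string, number>();
  for (const meeting of meetings) {
    const people = distinctSorted(meeting.stakeholders);
    for (let i = 0; i < people.length; i++) {
      for (let j = i + 1; j < people.length; j++) {
        const key = `${people[i]} + ${people[j]}`;
        pairs.set(key, (pairs.get(key) ?? 0) + 1);
      }
    }
  }
  return pairs;
}
```

Co-occurrence only shows who tends to be in the same room; any inference about relationship dynamics on top of it would still need the LLM pass or a human read.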

📈 Topic Drift Detection

Recurring themes across meetings with frequency analysis. Detects when topics appear, peak, and fade — surfacing strategic momentum or stalls.
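
One way to capture "appear, peak, and fade" is to bucket mentions by week and read three points off the histogram. The weekly bucketing, the two-week quiet threshold, and the `TopicTrajectory` shape are assumptions; the report does not describe how the engine measures drift.

```typescript
interface TopicTrajectory {
  topic: string;
  firstWeek: number; // week the topic first appeared
  peakWeek: number;  // week with the most mentions (earliest on ties)
  lastWeek: number;  // most recent week with a mention
  mentions: number;
  faded: boolean;    // no mentions for at least quietWeeks
}

// mentionWeeks holds one week index per mention, in any order.
function topicTrajectory(
  topic: string,
  mentionWeeks: number[],
  currentWeek: number,
  quietWeeks: number = 2,
): TopicTrajectory {
  if (mentionWeeks.length === 0) throw new Error(`no mentions recorded for "${topic}"`);

  const perWeek = new Map<number, number>();
  for (const week of mentionWeeks) {
    perWeek.set(week, (perWeek.get(week) ?? 0) + 1);
  }

  const firstWeek = Math.min(...mentionWeeks);
  const lastWeek = Math.max(...mentionWeeks);
  let peakWeek = firstWeek;
  let peakCount = 0;
  perWeek.forEach((count, week) => {
    if (count > peakCount || (count === peakCount && week < peakWeek)) {
      peakCount = count;
      peakWeek = week;
    }
  });

  return {
    topic,
    firstWeek,
    peakWeek,
    lastWeek,
    mentions: mentionWeeks.length,
    faded: currentWeek - lastWeek >= quietWeeks,
  };
}
```

A topic that peaked early and has `faded` set is a candidate stall; one whose peak week equals the current week suggests momentum.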

🔗 Intake→Output Correlation

Maps facts extracted from Drive meetings to Warren's actual outputs. If an intake error propagated to a delivered output, the correlation engine catches it.
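
As a rough illustration of the propagation check, the sketch below looks for claims the intake gate rejected inside the text of delivered outputs. It uses normalized whole-phrase matching, which is a stand-in chosen for simplicity; the engine's actual matching method is not described, and paraphrased errors would slip past this version. All type and function names are hypothetical.

```typescript
interface RejectedClaim {
  sourceFile: string;
  claim: string; // a statement the intake gate flagged as wrong or unsupported
}

interface DeliveredOutput {
  id: string;
  text: string;
}

interface Propagation {
  sourceFile: string;
  claim: string;
  outputId: string;
}

// Lowercases and collapses punctuation/whitespace so formatting differences do not hide a match.
// Limitation: non-ASCII characters are treated as separators.
function normalize(text: string): string {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

function findPropagations(rejected: RejectedClaim[], outputs: DeliveredOutput[]): Propagation[] {
  // Pad with spaces so a claim only matches on whole-word boundaries.
  const haystacks = outputs.map((output) => ({ id: output.id, text: ` ${normalize(output.text)} ` }));
  const hits: Propagation[] = [];
  for (const item of rejected) {
    const needle = normalize(item.claim);
    if (needle === "") continue; // an empty claim would match everything
    for (const haystack of haystacks) {
      if (haystack.text.indexOf(` ${needle} `) !== -1) {
        hits.push({ sourceFile: item.sourceFile, claim: item.claim, outputId: haystack.id });
      }
    }
  }
  return hits;
}
```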

⚠️ Unacted Pattern Detection

Action items appearing in 2+ meetings without resolution. Proactive alert: "This was assigned 3 meetings ago and still unresolved."
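
The rule above (same item in 2+ meetings, never resolved) can be sketched as a grouping step followed by a filter. Matching items by case-insensitive exact text is an assumption made for brevity; a real implementation would likely need fuzzier matching. The `ActionItemMention` shape is hypothetical.

```typescript
interface ActionItemMention {
  item: string;
  meetingIndex: number; // position of the meeting in chronological order
  resolved: boolean;    // true if this mention records the item as done
}

interface UnactedAlert {
  item: string;
  meetingCount: number;
  firstMeeting: number;
  message: string;
}

interface ItemGroup {
  label: string;
  meetings: number[];
  resolved: boolean;
}

function findUnacted(
  mentions: ActionItemMention[],
  latestMeeting: number,
  minMeetings: number = 2,
): UnactedAlert[] {
  // Group mentions by normalized item text.
  const groups = new Map<string, ItemGroup>();
  for (const mention of mentions) {
    const key = mention.item.trim().toLowerCase();
    const group = groups.get(key) ?? { label: mention.item.trim(), meetings: [], resolved: false };
    if (group.meetings.indexOf(mention.meetingIndex) === -1) group.meetings.push(mention.meetingIndex);
    group.resolved = group.resolved || mention.resolved;
    groups.set(key, group);
  }

  // Keep items seen in enough distinct meetings that were never marked resolved.
  const alerts: UnactedAlert[] = [];
  groups.forEach((group) => {
    if (group.resolved || group.meetings.length < minMeetings) return;
    const firstMeeting = Math.min(...group.meetings);
    const age = latestMeeting - firstMeeting;
    alerts.push({
      item: group.label,
      meetingCount: group.meetings.length,
      firstMeeting,
      message: `"${group.label}" was assigned ${age} meeting(s) ago and is still unresolved.`,
    });
  });
  return alerts;
}
```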

🤖 LLM Cross-Source Analysis

Weekly: GLM 5.1 analyzes all correlation data to surface strategic patterns, risks, and relationship dynamics nobody stated explicitly.

First Production Run — LLM-Detected Patterns

Patterns nobody explicitly wrote down:

1. "Trojan Horse" Market Entry
   Using smaller markets as stepping stones to larger verticals
   SDLC → MedDev, Corporate → PE, Product → Service

2. "Calculated Mediocrity as Velocity Hack"
   Quality set "below Tony's expectations but above client capability" = forced constraint to break perfectionism bottleneck

3. Mo Containment Risk
   Outflanking strategy creates fracture point at Mo→Joana handoff
   If Mo realizes he's being routed around → sabotage or flight risk

4. Charlie as Single Point of Failure
   3 days/connector estimate from Charlie alone
   No backup, no cross-training, critical path dependency

5. Zero Unresolved Items = System Failure
   13 meetings, 0 tracked follow-ups
   Items verbally agreed and evaporating → strategic amnesia risk

06 — The Integrated System

Six subsystems running on coordinated cron schedules. The data flows in a closed loop — intake validates input, output collector captures output, correlation connects the two.

THE CLOSED-LOOP INTELLIGENCE SYSTEM

INTAKE (Drive → Memory)
  Google Drive (Meet Recordings, AI DRIVEN PMO, all shared folders)
      │
      ▼
  drive-intake.js
    Content extraction
    LLM distillation (GLM 5.1)
      │
      ▼
  Intake Quality Gate (GLM 5.1 cross-model)
    PASS → route
    FAIL → needs-review
      │
      ▼
  MEMORY LIFECYCLE
    🟢 Layer 1: MEMORY.md (always loaded, 100%)
    🟣 Layer 2: Topic files + Methodology + Stakeholders
    🟡 Layer 3: Journals + Drive intake digests
    Consolidation (03:00 PT)
    Compression + Dreaming
      │
      ▼
  Correlation Engine

OUTPUT (Memory → Delivery)
  Slack/GitHub (all channels)
      │
      ▼
  output-collector hook (message:sent)
    Domain classification
    10% priority sample
      │
      ▼
  shadow-review-queue (auto-fed, not manual)
      │
      ▼
  Shadow Review (Sat 3AM, GLM 5.1)
      │
      ▼
  Correlation Engine

CORRELATION ENGINE
  Daily (lightweight)
  Weekly (full LLM)
  Crosses:
    • Intake ↔ Output
    • Topics ↔ Actions
    • Stakeholders ↔ Meeting frequency

Cron Schedule

| Time (PT) | System | Status | Function |
|---|---|---|---|
| Every 45m | SA Token Refresh | existing | Google API access token |
| 02:00 daily | Drive Intake | new | Sync → extract → distill → validate → route |
| 03:00 daily | Memory Consolidation | enhanced | Now includes drive-intake entries |
| 03:30 daily | Dreaming | enhanced | Now cross-source (Drive + Slack + GitHub) |
| 04:00 Mon-Fri | Correlation Daily | new | Lightweight stats + pattern detection |
| 03:00 Saturday | Shadow Review | enhanced | Queue now auto-fed by output-collector |
| 04:30 Saturday | Correlation Weekly | new | Full LLM cross-source analysis |
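
The schedule can also be held as data, which makes overlaps easy to query. This is an illustrative sketch: the `ScheduledJob` structure and `jobsAt` helper are hypothetical and say nothing about how the crons are actually configured. The token refresh is left out because it runs on an interval and not at a fixed time of day.

```typescript
interface ScheduledJob {
  system: string;
  status: "existing" | "new" | "enhanced";
  hour: number;   // PT, 24-hour clock
  minute: number;
  days: number[]; // 0 = Sunday ... 6 = Saturday
}

const EVERY_DAY = [0, 1, 2, 3, 4, 5, 6];
const WEEKDAYS = [1, 2, 3, 4, 5];
const SATURDAY = [6];

// The six fixed-time jobs from the schedule table.
const JOBS: ScheduledJob[] = [
  { system: "Drive Intake", status: "new", hour: 2, minute: 0, days: EVERY_DAY },
  { system: "Memory Consolidation", status: "enhanced", hour: 3, minute: 0, days: EVERY_DAY },
  { system: "Dreaming", status: "enhanced", hour: 3, minute: 30, days: EVERY_DAY },
  { system: "Correlation Daily", status: "new", hour: 4, minute: 0, days: WEEKDAYS },
  { system: "Shadow Review", status: "enhanced", hour: 3, minute: 0, days: SATURDAY },
  { system: "Correlation Weekly", status: "new", hour: 4, minute: 30, days: SATURDAY },
];

// Names of the jobs that start at the given day and time.
function jobsAt(day: number, hour: number, minute: number): string[] {
  return JOBS
    .filter((job) => job.days.indexOf(day) !== -1 && job.hour === hour && job.minute === minute)
    .map((job) => job.system);
}
```

For example, `jobsAt(6, 3, 0)` shows that Memory Consolidation and Shadow Review both start at 03:00 on Saturdays, which may be worth a look if the two compete for the same model or files.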

07 — Before vs. After

| Dimension | Before (May 18) | After (May 19) |
|---|---|---|
| Google Drive access | ❌ Expired OAuth, RAPT blocked | ✅ SA headless, permanent |
| Drive files accessible | 5 (manual share) | 849+ (auto-sync) |
| Intake validation | None | GLM 5.1 quality gate (6th rubric) |
| Stakeholders tracked | 0 | 65 (auto-extracted) |
| Output capture | Manual (16 entries/month) | Mechanical (every message) |
| Shadow review queue | Hand-curated | Auto-fed (closed loop) |
| Cross-source correlation | None | Daily + weekly LLM analysis |
| Data sources in memory | 2 (Slack + GitHub) | 3 (+ Google Drive) |
| Nightly cron systems | 3 | 6 |
| Eval coverage | 5 SOPs inline | 5 SOPs + all outputs + intake |
| Proactive alerts | None | Unacted items + topic drift + strategic patterns |
| System mode | Reactive (told → process) | Proactive (ingest → correlate → alert) |

08 — The Flywheel Effect

This isn't a one-time improvement. It's a compound learning system that gets better every cycle.

📅 Month 1

"What was decided Thursday?" → Warren knows, because the Gemini Notes were ingested and distilled overnight.

📈 Month 2

"Is there a pattern in Kush's requests?" → Dreaming detects recurring themes across 8 meetings that nobody wrote down.

🔗 Month 3

"Prepare the Ron briefing." → Warren connects signals from 20 meetings + Slack + GitHub + methodology automatically.

Month 6

Warren proactively says: "Victor, Kush mentioned packages in 4 meetings — nobody followed up." Strategic amnesia prevented.

This is Kush's institutional knowledge definition made real:

Level 1 (User) — Warren learns each person's patterns from stakeholder files.
Level 2 (Agent) — Warren improves by processing more meetings, more docs, more cases.
Level 3 (Organizational) — Accumulated learning across all sources becomes organizational intelligence that the entire team can access.

The flywheel compounds. Month 3 is faster than Month 2. And the correlation engine self-calibrates: if intake errors propagate to outputs, the rubrics get tighter.

09 — Cost & Infrastructure

~$15 Monthly Cost (all systems)
0 New Services Required
0 External Dependencies
~650 Lines of Code Added

| Component | Monthly Cost | Infrastructure |
|---|---|---|
| Drive API calls | Free (within quotas) | Google Cloud SA |
| Intake distillation (GLM 5.1) | ~$5-8 | Together API |
| Intake quality gate (GLM 5.1) | ~$3-5 | Together API |
| Correlation Engine weekly LLM | ~$1-2 | Together API |
| Memory system (existing) | ~$5 | GPT-4.1-mini |
| Output Collector hook | $0 | OpenClaw event system |
| Total | ~$15/mo | DGX Spark crons |
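
As a quick arithmetic check on the table, the line items can be summed as low and high bounds. The figures are the ones listed above; the `CostRange` structure and `totalRange` helper are illustrative only.

```typescript
interface CostRange {
  component: string;
  lowUsd: number;
  highUsd: number;
}

// Monthly cost estimates copied from the table above (free items are 0).
const MONTHLY_COSTS: CostRange[] = [
  { component: "Drive API calls", lowUsd: 0, highUsd: 0 },
  { component: "Intake distillation (GLM 5.1)", lowUsd: 5, highUsd: 8 },
  { component: "Intake quality gate (GLM 5.1)", lowUsd: 3, highUsd: 5 },
  { component: "Correlation Engine weekly LLM", lowUsd: 1, highUsd: 2 },
  { component: "Memory system (existing)", lowUsd: 5, highUsd: 5 },
  { component: "Output Collector hook", lowUsd: 0, highUsd: 0 },
];

// Sums the lower and upper bounds independently.
function totalRange(costs: CostRange[]): { lowUsd: number; highUsd: number } {
  return costs.reduce(
    (acc, cost) => ({ lowUsd: acc.lowUsd + cost.lowUsd, highUsd: acc.highUsd + cost.highUsd }),
    { lowUsd: 0, highUsd: 0 },
  );
}
```

The bounds come to $14 and $20 per month, so the ~$15 headline figure sits near the low end of the listed ranges.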