From Reactive Agent to Closed-Loop Intelligence

A single session transformed Warren from a reactive system that processes what it's told into a self-improving intelligence loop that ingests, validates, correlates, and learns autonomously — closing the gap between institutional knowledge capture and delivery quality.

849+ Drive Files Accessible
65 Stakeholders Auto-Tracked
6 Nightly Cron Systems
3 New Eval Layers
0 Manual Steps in Eval Loop

01 — The Problem

Warren's Memory Lifecycle had three layers running in production. But two critical gaps prevented the system from becoming a true learning machine.

❌ Gap 1: Blind Input

Google Drive was inaccessible. OAuth tokens expired daily under Google's reauthentication (RAPT) policy. Meeting transcripts, strategy docs, and the methodology corpus were invisible to Warren.

Impact: 60-70% of institutional knowledge (per Kush's definition) lived in meetings Warren couldn't see.

❌ Gap 2: Open-Loop Output

Shadow review queue was manually curated. Sub-agent outputs and GitHub comments escaped all eval layers. No correlation between input quality and output quality.

Impact: A distillation error in memory could propagate to every future session with no detection mechanism.

The compound risk: Bad input → unchecked memory → bad output → no feedback. Errors don't decay — they compound. Every session loads MEMORY.md. One fabricated fact pollutes all future work.

02 — Foundation: Headless Google Workspace Access

Before building the intelligence loop, we needed permanent, headless access to Google Drive. This required solving the org policy blocker that had stalled progress since April.

15:08 PT: Victor accesses GCP Console as new Organization Administrator

15:18 PT: Domain-Wide Delegation authorized. Client ID 114789773049739185531 with 5 scopes (drive, docs, spreadsheets, calendar, gmail.modify)

15:32 PT: Org policies overridden. Both iam.disableServiceAccountKeyCreation (standard) and iam.managed.* (managed) set to Not Enforced

15:42 PT: Service Account key created. warren-drive@vtkl-workspace-burke.iam.gserviceaccount.com

15:44 PT: Drive API verified. SA token → JWT exchange → Drive file listing successful. Headless, permanent, no re-auth.

Architecture

Service Account (warren-drive@vtkl-workspace-burke)
│
├── JWT signed with private key (RS256: RSA with SHA-256)
│     sub: warren@vtkl.ai (impersonation)
│     scopes: drive, docs, sheets, calendar, gmail
│
├── Token exchange → OAuth2 access token (1h TTL)
│     Cron refresh: every 45 min via sa-token-refresh.js
│
├── ~/bin/gws-sa wrapper
│     Sets GOOGLE_WORKSPACE_CLI_TOKEN → exec gws "$@"
│
└── Standing rule: Access ONLY files shared with warren@vtkl.ai
      Never impersonate other users. Ask to share if needed.
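The signing step in the tree above can be sketched as follows. This is a minimal illustration, not the contents of sa-token-refresh.js: the helper names (encodeSegment, buildSignedJwt) are hypothetical, the key pair is generated on the fly purely for demonstration, and only one scope URL is shown. The claim layout (iss, sub, scope, aud, iat, exp) and the RS256 algorithm follow Google's documented service-account flow.

```javascript
const crypto = require("node:crypto");

// Base64url-encode a JSON value, as JWT headers and claim sets require.
function encodeSegment(value) {
  return Buffer.from(JSON.stringify(value)).toString("base64url");
}

// Hypothetical helper: build and sign a service-account JWT that
// impersonates `subject` through Domain-Wide Delegation.
function buildSignedJwt({ serviceAccount, subject, scopes, privateKey, nowMs }) {
  const iat = Math.floor(nowMs / 1000);
  const header = { alg: "RS256", typ: "JWT" };
  const claims = {
    iss: serviceAccount,
    sub: subject, // the Workspace user being impersonated
    scope: scopes.join(" "),
    aud: "https://oauth2.googleapis.com/token",
    iat,
    exp: iat + 3600, // one hour, the maximum assertion lifetime Google accepts
  };
  const signingInput = `${encodeSegment(header)}.${encodeSegment(claims)}`;
  const signature = crypto
    .sign("RSA-SHA256", Buffer.from(signingInput), privateKey)
    .toString("base64url");
  return `${signingInput}.${signature}`;
}

// Demo only: a throwaway key pair stands in for the real SA key file.
const demoKeys = crypto.generateKeyPairSync("rsa", { modulusLength: 2048 });
const demoJwt = buildSignedJwt({
  serviceAccount: "warren-drive@vtkl-workspace-burke.iam.gserviceaccount.com",
  subject: "warren@vtkl.ai",
  scopes: ["https://www.googleapis.com/auth/drive"],
  privateKey: demoKeys.privateKey,
  nowMs: Date.now(),
});
console.log(demoJwt.split(".").length); // header.claims.signature
```

In the real flow the signed JWT would be POSTed to the token endpoint with grant_type urn:ietf:params:oauth:grant-type:jwt-bearer, and the returned access token is what the gws-sa wrapper exports as GOOGLE_WORKSPACE_CLI_TOKEN.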

∞ Token Lifetime (auto-refresh)
849 Files Now Accessible
5 API Scopes Authorized
0 Browser Auth Required

03 — Phase 1: Drive Intake Pipeline

The input pipe that connects Google Drive to the existing Memory Lifecycle. Detects changes, extracts content, distills knowledge, validates accuracy, and routes to the correct memory layer.

Drive Changes API → Content Extraction → LLM Distillation → Intake Quality Gate → Memory Layers

🔍

Change Detection

Daily cron (02:00 PT) queries Drive for files modified since last sync. State checkpoint persisted in intake-state/last-sync.json. First run: full scan. Subsequent: delta only.
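A minimal sketch of the full-scan versus delta decision, assuming the checkpoint file holds a single RFC 3339 timestamp under a hypothetical lastSync field. The query syntax (trashed, modifiedTime) is the Drive v3 files.list search syntax; the function names are illustrative and not taken from drive-intake.js.

```javascript
// Build the `q` parameter for a Drive v3 files.list call.
// `state` mirrors a hypothetical intake-state/last-sync.json:
//   { "lastSync": "<RFC 3339 timestamp>" }
function buildChangeQuery(state) {
  const clauses = ["trashed = false"];
  if (state && state.lastSync) {
    // Delta run: only files modified since the checkpoint.
    clauses.push(`modifiedTime > '${state.lastSync}'`);
  }
  // First run (no checkpoint): no time filter, so the scan is full.
  return clauses.join(" and ");
}

// Checkpoint the moment the run started, so edits made while the run
// is in progress are picked up by the next delta instead of skipped.
function nextState(runStartedAt) {
  return { lastSync: runStartedAt.toISOString() };
}

const firstRunQuery = buildChangeQuery(null);
const deltaQuery = buildChangeQuery(nextState(new Date(Date.UTC(2000, 0, 1, 10, 0, 0))));
console.log(firstRunQuery);
console.log(deltaQuery);
```

Recording the run start time rather than the finish time is a design choice of this sketch: it can re-process a file occasionally but should not miss one.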

📝

Content Extraction

Google Docs → export as plain text. Spreadsheets → CSV. Word docs → Google conversion. Videos/images → metadata only (skip binary). Auto-classification: meeting, corpus, or general.
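The extraction rules above amount to a dispatch on MIME type. The sketch below assumes that shape; the action labels and the fallback for unlisted types are hypothetical, while the MIME strings themselves are the standard Google Workspace and Office identifiers.

```javascript
const GOOGLE_DOC = "application/vnd.google-apps.document";
const GOOGLE_SHEET = "application/vnd.google-apps.spreadsheet";
const WORD_DOCX =
  "application/vnd.openxmlformats-officedocument.wordprocessingml.document";

// Decide how to pull text out of a Drive file, given its MIME type.
function extractionPlan(mimeType) {
  if (mimeType === GOOGLE_DOC) {
    return { action: "export", exportMimeType: "text/plain" };
  }
  if (mimeType === GOOGLE_SHEET) {
    return { action: "export", exportMimeType: "text/csv" };
  }
  if (mimeType === WORD_DOCX) {
    // Word files are converted to a Google Doc first, then exported.
    return { action: "convert-then-export", exportMimeType: "text/plain" };
  }
  if (mimeType.startsWith("video/") || mimeType.startsWith("image/")) {
    return { action: "metadata-only" }; // binary content is skipped
  }
  // Assumption: anything not listed in the report is left alone.
  return { action: "skip" };
}

console.log(extractionPlan(GOOGLE_SHEET));
```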

🧠

LLM Distillation

GLM 5.1 extracts structured knowledge per classification mode:

Meeting mode: decisions[], action_items[], key_intel[], stakeholders[], commitments[], client_context

Corpus mode: principles[], frameworks[], definitions[], examples[], anti_patterns[]
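Before a distillation reaches the quality gate, its shape can be checked mechanically. The sketch below takes the field names from the two modes above; the validator itself, and the assumption that client_context is a plain string, are illustrative.

```javascript
// Array-valued fields expected from each distillation mode.
const MODE_FIELDS = {
  meeting: ["decisions", "action_items", "key_intel", "stakeholders", "commitments"],
  corpus: ["principles", "frameworks", "definitions", "examples", "anti_patterns"],
};

// Return a list of shape problems; an empty list means the output is well-formed.
function validateDistillation(mode, output) {
  const fields = MODE_FIELDS[mode];
  if (!fields) return [`unknown mode: ${mode}`];
  const problems = fields
    .filter((field) => !Array.isArray(output[field]))
    .map((field) => `${field} must be an array`);
  // Assumption: client_context is free text rather than a list.
  if (mode === "meeting" && typeof output.client_context !== "string") {
    problems.push("client_context must be a string");
  }
  return problems;
}

const sampleMeeting = {
  decisions: [],
  action_items: [],
  key_intel: [],
  stakeholders: [],
  commitments: [],
  client_context: "",
};
console.log(validateDistillation("meeting", sampleMeeting).length);
```

A shape check like this only catches malformed output. Whether the content is accurate is the quality gate's job.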

🔒

Intake Quality Gate

Cross-model validation (GLM 5.1 judging distillation output against source). 5 criteria: fact accuracy, classification correctness, no fabrication, attribution, numbers/dates exact. Single fabricated fact = FAIL.
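The verdict logic can be sketched as a hard veto followed by a score threshold. The criterion keys, the judgement shape, and the 0.8 threshold are assumptions for illustration; the only rule taken from the report is that a single fabricated fact fails the file outright.

```javascript
// The five gate criteria, under hypothetical key names.
const CRITERIA = [
  "fact_accuracy",
  "classification",
  "no_fabrication",
  "attribution",
  "numbers_dates_exact",
];

// `judgement` is the judge model's parsed output (shape assumed):
//   { scores: { <criterion>: 0..1 }, fabricatedFacts: string[] }
function gateVerdict(judgement, passThreshold = 0.8) {
  // Hard veto: any fabricated fact fails the file, whatever the scores say.
  if (judgement.fabricatedFacts.length > 0) {
    return { verdict: "FAIL", score: 0, reason: "fabricated fact" };
  }
  const total = CRITERIA.reduce((sum, name) => sum + judgement.scores[name], 0);
  const score = total / CRITERIA.length;
  return score >= passThreshold
    ? { verdict: "PASS", score, reason: "all checks passed" }
    : { verdict: "FAIL", score, reason: "score below threshold" };
}

const perfectScores = Object.fromEntries(CRITERIA.map((name) => [name, 1]));
console.log(gateVerdict({ scores: perfectScores, fabricatedFacts: [] }).verdict);
```

Putting the veto before the average means a fluent but invented detail cannot be rescued by high marks on the other four criteria.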

Routing Logic

Distilled content
│
├── Standing rules, permanent facts
│     → Candidate for Layer 1 (MEMORY.md)
│     → Via nightly consolidation (03:00 PT)
│
├── Meeting digests, client intel
│     → Layer 3 journal entry
│     → memory/drive-intake/YYYY-MM-DD-<source>.md
│     → Compression + consolidation promotes what matters
│
├── Methodology, frameworks, reference
│     → Layer 2 topic file (persistent)
│     → memory/methodology/<topic>.md
│
└── Stakeholder extraction (automatic)
      → memory/stakeholders/<person>.md
      → Feeds synthetic persona pipeline
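The routing tree maps naturally onto a switch over content kind. In the sketch below the kind labels and the slug helper are hypothetical; the target paths follow the patterns shown in the tree, and stakeholder files are placed in Layer 2 as in the integrated diagram.

```javascript
// Turn a free-form label into a filename-safe slug.
function slug(label) {
  return label
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Map one distilled item to its destination. The `kind` values are
// hypothetical labels for the four branches of the routing tree.
function routeDistilled(item) {
  switch (item.kind) {
    case "standing-rule":
    case "permanent-fact":
      // Only a candidate: promotion happens in nightly consolidation.
      return { layer: 1, path: "MEMORY.md", via: "nightly consolidation" };
    case "meeting-digest":
    case "client-intel":
      return { layer: 3, path: `memory/drive-intake/${item.date}-${slug(item.source)}.md` };
    case "methodology":
    case "framework":
    case "reference":
      return { layer: 2, path: `memory/methodology/${slug(item.topic)}.md` };
    case "stakeholder":
      return { layer: 2, path: `memory/stakeholders/${slug(item.person)}.md` };
    default:
      throw new Error(`No route for kind: ${item.kind}`);
  }
}

console.log(routeDistilled({ kind: "meeting-digest", date: "2000-01-01", source: "Weekly Sync" }).path);
```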

First run results: 12 meetings distilled, 65 stakeholder files auto-created, gate pass rate 0.93 on validated files. Gate caught real issues: tone flattening, intent vs. commitment conflation, stakeholder omissions.

04 — Phase 2: Output Collector

The output pipe that closes the eval loop. Every outbound Warren message is captured, classified, and auto-fed into the shadow review queue — mechanically, not manually.

Before: Manual Curation

shadow-collect.py required manual Slack export → JSONL conversion → hand-selection of entries. 16 entries curated in a month. Sub-agents and GitHub comments escaped entirely.

After: Mechanical Capture

output-collector hook intercepts every message:sent event. Auto-classifies by domain. Auto-appends to shadow-review-queue.jsonl. Zero manual steps.

Hook Architecture

OpenClaw Gateway
        │
        │  event: message:sent
        ▼
┌─────────────────────────────────────────────┐
│  output-collector/handler.ts                │
│                                             │
│  1. Filter noise (<50 chars, NO_REPLY, acks)│
│  2. Classify domain:                        │
│     #sales → sales-bd                       │
│     client channels → product-scope         │
│     #agent-warren → process                 │
│     DMs → behavioral                        │
│  3. Generate entry with metadata            │
│  4. Priority sampling: 10% random + >500w   │
│                                             │
│  Outputs:                                   │
│  ├── ~/.openclaw/logs/output-collector.jsonl│
│  ├── shadow-review-queue.jsonl (auto-feed)  │
│  └── output-stats-YYYY-MM-DD.json           │
└─────────────────────────────────────────────┘
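Steps 1 and 4 of the hook (noise filtering and priority sampling) can be sketched as below. This is not the code in handler.ts: the acknowledgement list, the entry shape, and the reading of ">500w" as "more than 500 words" are assumptions. The random source is injectable so the sampling rule can be tested deterministically.

```javascript
// Hypothetical list of bare acknowledgements to drop.
const ACKS = new Set(["ok", "thanks", "got it", "done"]);

// Step 1: drop short messages, NO_REPLY markers, and bare acks.
function isNoise(text) {
  const trimmed = text.trim();
  return trimmed.length < 50 || trimmed === "NO_REPLY" || ACKS.has(trimmed.toLowerCase());
}

function wordCount(text) {
  return text.trim().split(/\s+/).filter(Boolean).length;
}

// Step 4: every long message is prioritized, plus a 10% random sample.
function shouldSample(text, random = Math.random) {
  return wordCount(text) > 500 || random() < 0.1;
}

// Returns a log entry, or null when the message is filtered out.
function collect(event, random = Math.random) {
  if (isNoise(event.text)) return null;
  return {
    channel: event.channel,
    text: event.text,
    words: wordCount(event.text),
    priority: shouldSample(event.text, random),
  };
}

console.log(collect({ channel: "#sales", text: "ok" }));
```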

Channel	Domain Classification	Rubric Applied
#sales	sales-bd	sales-bd.yaml
#client-kindo, #client-gi, #client-t-and-c	product-scope	product-scope.yaml
#agent-warren, #methodology-lab	process	process.yaml
DMs (Tony, Victor, Charlie, Joana)	behavioral	behavioral.yaml
Scope/estimation discussions	effort-value	effort-value.yaml
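The channel rows of the table reduce to a lookup. The sketch below covers only the channel-based and DM rows; the effort-value row is content-based, and since the report does not say how scope or estimation discussions are detected, it is left out. Returning null for unmapped channels is an assumption.

```javascript
// Channel → domain, transcribed from the table above.
const CHANNEL_DOMAINS = new Map([
  ["#sales", "sales-bd"],
  ["#client-kindo", "product-scope"],
  ["#client-gi", "product-scope"],
  ["#client-t-and-c", "product-scope"],
  ["#agent-warren", "process"],
  ["#methodology-lab", "process"],
]);

// DMs are always behavioral; channels go through the lookup.
function classifyDomain({ channel, isDm }) {
  if (isDm) return "behavioral";
  return CHANNEL_DOMAINS.get(channel) ?? null; // null: no rubric applies (assumed)
}

// Each domain's rubric file shares the domain's name.
function rubricFor(domain) {
  return domain ? `${domain}.yaml` : null;
}

console.log(rubricFor(classifyDomain({ channel: "#client-gi", isDm: false })));
```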

05 — Phase 3: Correlation Engine

The intelligence layer that crosses sources to find what nobody explicitly wrote down. Correlates intake accuracy with output quality. Detects repeated unresolved action items. Identifies strategic patterns across meetings.

📊

Intake Quality Analysis

Pass/fail rates, average gate scores, common distillation errors. Tracks whether the intake pipe is improving over time.

👥

Stakeholder Intelligence

Most-referenced people, coverage gaps (single-mention stakeholders), relationship dynamics inferred from co-occurrence patterns.
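All three stakeholder signals can be derived from the stakeholders[] arrays that meeting-mode distillation already produces. The sketch below assumes one array per meeting and uses placeholder names; the function and its return shape are hypothetical.

```javascript
// Count mentions per person and co-occurrence per pair of people.
function stakeholderStats(meetings) {
  const mentions = new Map();
  const pairs = new Map();
  for (const meeting of meetings) {
    // De-duplicate within a meeting and sort so pair keys are stable.
    const people = [...new Set(meeting.stakeholders)].sort();
    for (const person of people) {
      mentions.set(person, (mentions.get(person) ?? 0) + 1);
    }
    for (let i = 0; i < people.length; i++) {
      for (let j = i + 1; j < people.length; j++) {
        const key = `${people[i]}|${people[j]}`;
        pairs.set(key, (pairs.get(key) ?? 0) + 1);
      }
    }
  }
  // Most-referenced first; ties broken alphabetically.
  const ranked = [...mentions.entries()].sort(
    (a, b) => b[1] - a[1] || a[0].localeCompare(b[0])
  );
  // Coverage gaps: people who appear in exactly one meeting.
  const coverageGaps = ranked.filter(([, count]) => count === 1).map(([name]) => name);
  return { ranked, coverageGaps, pairs };
}

const demoStats = stakeholderStats([
  { stakeholders: ["A", "B"] },
  { stakeholders: ["A", "C"] },
  { stakeholders: ["A", "B"] },
]);
console.log(demoStats.coverageGaps);
```

Co-occurrence counts are only a proxy for relationship dynamics: they show who tends to be in the same room, not what passes between them.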

📈

Topic Drift Detection

Recurring themes across meetings with frequency analysis. Detects when topics appear, peak, and fade — surfacing strategic momentum or stalls.
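A minimal sketch of the appear, peak, fade lifecycle, assuming each meeting carries a period label that sorts lexicographically (for example an ISO week) and a list of topics. The field names and the "faded" rule (absent from the latest period) are assumptions.

```javascript
// For each topic, find when it first appeared, when it peaked, when it
// was last seen, and whether it is missing from the latest period.
function topicTimeline(meetings) {
  const periods = [...new Set(meetings.map((m) => m.period))].sort();
  const byTopic = new Map();
  for (const meeting of meetings) {
    for (const topic of new Set(meeting.topics)) {
      const counts = byTopic.get(topic) ?? new Map();
      counts.set(meeting.period, (counts.get(meeting.period) ?? 0) + 1);
      byTopic.set(topic, counts);
    }
  }
  const latest = periods[periods.length - 1];
  return [...byTopic].map(([topic, counts]) => {
    const seen = periods.filter((p) => counts.has(p));
    // Strict comparison keeps the earliest period when counts tie.
    const peak = seen.reduce(
      (best, p) => (counts.get(p) > counts.get(best) ? p : best),
      seen[0]
    );
    return {
      topic,
      first: seen[0],
      peak,
      last: seen[seen.length - 1],
      faded: !counts.has(latest),
    };
  });
}

const demoTimeline = topicTimeline([
  { period: "W1", topics: ["pricing"] },
  { period: "W1", topics: ["pricing", "hiring"] },
  { period: "W2", topics: ["pricing"] },
  { period: "W3", topics: ["hiring"] },
]);
console.log(demoTimeline);
```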

🔗

Intake→Output Correlation

Maps facts extracted from Drive meetings to Warren's actual outputs. If an intake error propagated to a delivered output, the correlation engine catches it.
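In its simplest form, propagation detection is a search for flagged intake facts inside delivered outputs. The sketch below uses naive case-insensitive substring matching, which is an assumption; paraphrased facts would need fuzzier matching. The record shapes are hypothetical.

```javascript
// Find outputs that repeat a fact the intake gate flagged as wrong.
// flaggedFacts: [{ id, text }]   outputs: [{ id, text }]
function findPropagatedErrors(flaggedFacts, outputs) {
  const hits = [];
  for (const fact of flaggedFacts) {
    const needle = fact.text.trim().toLowerCase();
    if (!needle) continue; // an empty needle would match everything
    for (const output of outputs) {
      if (output.text.toLowerCase().includes(needle)) {
        hits.push({ factId: fact.id, outputId: output.id });
      }
    }
  }
  return hits;
}

const demoHits = findPropagatedErrors(
  [{ id: "f1", text: "Launch moved to Q3" }],
  [
    { id: "o1", text: "Reminder: launch moved to Q3, plan accordingly." },
    { id: "o2", text: "Agenda for Thursday attached." },
  ]
);
console.log(demoHits);
```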

⚠️

Unacted Pattern Detection

Action items appearing in 2+ meetings without resolution. Proactive alert: "This was assigned 3 meetings ago and still unresolved."
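The 2+ meetings rule can be sketched as a grouping pass over action_items[]. Matching items by normalized text and the resolved flag are both assumptions made for illustration; the report does not say how items are matched across meetings or how resolution is recorded.

```javascript
// Collapse case and punctuation so near-identical wordings match.
function normalizeItem(text) {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

// `meetings` must be in chronological order.
function findUnacted(meetings, minMeetings = 2) {
  const items = new Map();
  meetings.forEach((meeting, index) => {
    for (const item of meeting.action_items) {
      const key = normalizeItem(item.text);
      const entry = items.get(key) ?? {
        text: item.text,
        firstIndex: index,
        meetingIndexes: new Set(),
        resolved: false,
      };
      entry.meetingIndexes.add(index);
      if (item.resolved) entry.resolved = true;
      items.set(key, entry);
    }
  });
  const lastIndex = meetings.length - 1;
  return [...items.values()]
    .filter((entry) => !entry.resolved && entry.meetingIndexes.size >= minMeetings)
    .map((entry) => ({
      text: entry.text,
      mentions: entry.meetingIndexes.size,
      alert: `"${entry.text}" was assigned ${lastIndex - entry.firstIndex} meetings ago and is still unresolved.`,
    }));
}

const demoAlerts = findUnacted([
  { action_items: [{ text: "Send pricing deck" }] },
  { action_items: [] },
  { action_items: [{ text: "send pricing deck!" }, { text: "Book venue", resolved: true }] },
  { action_items: [{ text: "Book venue" }] },
]);
console.log(demoAlerts);
```

In this sketch an item stays resolved once it has been marked so, even if it is mentioned again later. A production version might reopen it instead.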

🤖

LLM Cross-Source Analysis

Weekly: GLM 5.1 analyzes all correlation data to surface strategic patterns, risks, and relationship dynamics nobody stated explicitly.

First Production Run — LLM-Detected Patterns

Patterns nobody explicitly wrote down:

1. "Trojan Horse" Market Entry
   Using smaller markets as stepping stones to larger verticals
   SDLC → MedDev, Corporate → PE, Product → Service

2. "Calculated Mediocrity as Velocity Hack"
   Quality set "below Tony's expectations but above client capability"
   = forced constraint to break perfectionism bottleneck

3. Mo Containment Risk
   Outflanking strategy creates fracture point at Mo→Joana handoff
   If Mo realizes he's being routed around → sabotage or flight risk

4. Charlie as Single Point of Failure
   3 days/connector estimate from Charlie alone
   No backup, no cross-training, critical path dependency

5. Zero Unresolved Items = System Failure
   13 meetings, 0 tracked follow-ups
   Items verbally agreed and evaporating → strategic amnesia risk

06 — The Integrated System

Six subsystems running on coordinated cron schedules. The data flows in a closed loop — intake validates input, output collector captures output, correlation connects the two.

┌─────────────────────────────────────────────────────────────────────┐
│                 THE CLOSED-LOOP INTELLIGENCE SYSTEM                 │
│                                                                     │
│  INTAKE (Drive → Memory)               OUTPUT (Memory → Delivery)   │
│                                                                     │
│  Google Drive ──────────┐        ┌────────────── Slack/GitHub       │
│  (Meet Recordings,      │        │               (all channels)     │
│   AI DRIVEN PMO,        │        │                                  │
│   all shared folders)   │        │                                  │
│          │              │        │                   │              │
│          ▼              │        │                   ▼              │
│  drive-intake.js        │        │         output-collector         │
│  Content extraction     │        │         hook (message:sent)      │
│  LLM distillation       │        │         Domain classification    │
│  (GLM 5.1)              │        │         10% priority sample      │
│          │              │        │                   │              │
│          ▼              │        │                   ▼              │
│  Intake Quality Gate    │        │         shadow-review-queue      │
│  (GLM 5.1 cross-model)  │        │         (auto-fed, not manual)   │
│  PASS → route           │        │                   │              │
│  FAIL → needs-review    │        │                   ▼              │
│          │              │        │         Shadow Review            │
│          ▼              │        │         (Sat 3AM, GLM 5.1)       │
│  ┌──────────────────────┴────────┴───┐               │              │
│  │  MEMORY LIFECYCLE                 │               │              │
│  │                                   │               │              │
│  │  🟢 Layer 1: MEMORY.md            │               │              │
│  │     (always loaded, 100%)         │               │              │
│  │  🟣 Layer 2: Topic files          │               │              │
│  │     + Methodology + Stakeholders  │               │              │
│  │  🟡 Layer 3: Journals             │               │              │
│  │     + Drive intake digests        │               │              │
│  │                                   │               │              │
│  │  Consolidation (03:00 PT)         │               │              │
│  │  Compression + Dreaming           │               │              │
│  └───────────────────────────────────┘               │              │
│                     │                                │              │
│                     └──────────┐ ┌───────────────────┘              │
│                                ▼ ▼                                  │
│                     ┌──────────────────────┐                        │
│                     │  CORRELATION ENGINE  │                        │
│                     │  Daily (lightweight) │                        │
│                     │  Weekly (full LLM)   │                        │
│                     │                      │                        │
│                     │  Crosses:            │                        │
│                     │  • Intake ↔ Output   │                        │
│                     │  • Topics ↔ Actions  │                        │
│                     │  • Stakeholders ↔    │                        │
│                     │    Meeting frequency │                        │
│                     └──────────────────────┘                        │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Cron Schedule

Time (PT)	System	Status	Function
Every 45m	SA Token Refresh	existing	Google API access token
02:00 daily	Drive Intake	new	Sync → extract → distill → validate → route
03:00 daily	Memory Consolidation	enhanced	Now includes drive-intake entries
03:30 daily	Dreaming	enhanced	Now cross-source (Drive + Slack + GitHub)
04:00 Mon-Fri	Correlation Daily	new	Lightweight stats + pattern detection
03:00 Saturday	Shadow Review	enhanced	Queue now auto-fed by output-collector
04:30 Saturday	Correlation Weekly	new	Full LLM cross-source analysis

07 — Before vs. After

Dimension	Before (May 18)	After (May 19)
Google Drive access	❌ Expired OAuth, RAPT blocked	✅ SA headless, permanent
Drive files accessible	5 (manual share)	849+ (auto-sync)
Intake validation	None	GLM 5.1 quality gate (6th rubric)
Stakeholders tracked	0	65 (auto-extracted)
Output capture	Manual (16 entries/month)	Mechanical (every message)
Shadow review queue	Hand-curated	Auto-fed (closed loop)
Cross-source correlation	None	Daily + weekly LLM analysis
Data sources in memory	2 (Slack + GitHub)	3 (+ Google Drive)
Nightly cron systems	3	6
Eval coverage	5 SOPs inline	5 SOPs + all outputs + intake
Proactive alerts	None	Unacted items + topic drift + strategic patterns
System mode	Reactive (told → process)	Proactive (ingest → correlate → alert)

08 — The Flywheel Effect

This isn't a one-time improvement. It's a compound learning system that gets better every cycle.

📅

Month 1

"What was decided Thursday?" → Warren knows, because the Gemini Notes were ingested and distilled overnight.

📈

Month 2

"Is there a pattern in Kush's requests?" → Dreaming detects recurring themes across 8 meetings that nobody wrote down.

🔗

Month 3

"Prepare the Ron briefing." → Warren connects signals from 20 meetings + Slack + GitHub + methodology automatically.

⚡

Month 6

Warren proactively says: "Victor, Kush mentioned packages in 4 meetings — nobody followed up." Strategic amnesia prevented.

This is Kush's institutional knowledge definition made real:

Level 1 (User) — Warren learns each person's patterns from stakeholder files.
Level 2 (Agent) — Warren improves by processing more meetings, more docs, more cases.
Level 3 (Organizational) — Accumulated learning across all sources becomes organizational intelligence that the entire team can access.

The flywheel compounds. Month 3 is faster than Month 2. And the correlation engine self-calibrates: if intake errors propagate to outputs, the rubrics get tighter.

09 — Cost & Infrastructure

~$15 Monthly Cost (all systems)
0 New Services Required
0 External Dependencies
~650 Lines of Code Added

Component	Monthly Cost	Infrastructure
Drive API calls	Free (within quotas)	Google Cloud SA
Intake distillation (GLM 5.1)	~$5-8	Together API
Intake quality gate (GLM 5.1)	~$3-5	Together API
Correlation Engine weekly LLM	~$1-2	Together API
Memory system (existing)	~$5	GPT-4.1-mini
Output Collector hook	$0	OpenClaw event system
Total	~$15/mo	DGX Spark crons
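As a quick check, the per-component estimates in the table can be summed as ranges. The sketch below treats "~$5" as a point estimate and free items as zero; on that reading the components add up to between $14 and $20 per month, so the ~$15 headline sits near the low end of the range.

```javascript
// Monthly cost estimates in USD, transcribed from the table above.
const COSTS = [
  { component: "Drive API calls", low: 0, high: 0 },
  { component: "Intake distillation (GLM 5.1)", low: 5, high: 8 },
  { component: "Intake quality gate (GLM 5.1)", low: 3, high: 5 },
  { component: "Correlation Engine weekly LLM", low: 1, high: 2 },
  { component: "Memory system (existing)", low: 5, high: 5 },
  { component: "Output Collector hook", low: 0, high: 0 },
];

// Sum the low and high ends separately to get a total range.
const totalCost = COSTS.reduce(
  (acc, row) => ({ low: acc.low + row.low, high: acc.high + row.high }),
  { low: 0, high: 0 }
);

console.log(`$${totalCost.low}-$${totalCost.high} per month`); // prints "$14-$20 per month"
```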
