VEKTOR Memory as Your Developer Second Brain: Complete Setup Guide

Step-by-step tutorial for setting up VEKTOR Slipstream as a persistent, agent-maintained knowledge base via Claude MCP — AES-256 encrypted secrets, SKILL.md routing, stealth web traversal, and approval gates. One afternoon to set up, running forever.

12 min read · Tutorial · MCP · Claude Desktop · Setup · SKILL.md



Why this article exists

I spent five months building automation on OpenClaw before it collapsed. The Roy trading bot, the Rachel research agent — they were useful, and they broke in all the ways the previous article described. Token blow-outs. Silent cron failures. Credentials in plaintext configs. A ClawHub marketplace that was 11.93% malware.

But the most persistent failure wasn't security or cost. It was amnesia.

Every session started from zero. The agent didn't know what decisions we'd already made. It didn't know which APIs had broken and why. It didn't know that we'd benchmarked three LLM providers last week and settled on one. Every time a conversation ended, the context window closed, and everything in it disappeared.

I was the only memory in the system. And I'm not a good memory system. I forget things, lose context, repeat mistakes I've already debugged. The agent was capable of doing real work — and it was being bottlenecked by the fact that it couldn't remember doing it.

VEKTOR Memory solves this. Not by keeping a chat log — that's not memory, that's a transcript. It solves it through a layered, namespace-isolated, AES-256 encrypted knowledge store that survives across sessions, compounds with use, and surfaces context the moment it's relevant. Combined with Claude Desktop via MCP, it turns Claude from a capable-but-stateless assistant into something that actually accumulates understanding of your work over time.

This tutorial is the technical how-to. By the end, you'll have a working harness where Claude remembers your projects across sessions, keeps secrets in an encrypted vault, routes behaviour through your SKILL.md rules, and gates irreversible actions behind your explicit approval.

The setup takes one afternoon. The value compounds for years.


The mental model before we touch a terminal

Most people try to use an LLM as a second brain by giving it a long system prompt. That's not a second brain — it's a briefing note. It doesn't update. It doesn't cross-reference. It doesn't get smarter as you use it.

VEKTOR Memory treats memory the way a human brain actually treats it: layered, associative, and time-aware. There are three layers in the system:

LAYER 1 — WORKING MEMORY (the active session)
The current conversation context. Fast, temporary. Cleared on session end.
Equivalent: what's in your head right now.

LAYER 2 — EPISODIC MEMORY (vektor_store / vektor_recall)
Facts, decisions, preferences stored from past sessions.
Retrieved by semantic relevance, not exact keyword match.
Equivalent: "I remember we discussed this last month."

LAYER 3 — SEMANTIC MEMORY (vektor_recall_rrf)
Dual-channel retrieval: BM25 keyword search + semantic vector search,
fused via Reciprocal Rank Fusion. The smartest retrieval path.
Equivalent: "This reminds me of three other things you've mentioned."
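Reciprocal Rank Fusion is simple enough to sketch in a few lines. The function below is an illustrative implementation of the general RRF formula (score = Σ 1/(k + rank) across result lists, conventionally k = 60) — not VEKTOR's internal code:

```javascript
// Sketch of Reciprocal Rank Fusion: merge a BM25 ranking and a vector
// ranking into one list. Documents ranked highly in BOTH channels win.
function rrfFuse(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((docId, idx) => {
      const rank = idx + 1; // ranks are 1-based
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([docId]) => docId);
}

const bm25Results = ["doc3", "doc1", "doc7"];   // keyword channel
const vectorResults = ["doc1", "doc9", "doc3"]; // semantic channel
console.log(rrfFuse([bm25Results, vectorResults]));
// → ["doc1", "doc3", "doc9", "doc7"]
```

Note that `doc1` wins despite never ranking first in either channel — agreement across channels outweighs a single top rank, which is exactly why dual-channel retrieval beats either search mode alone.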

On top of these three layers sits a fourth process that runs in the background between sessions — the REM consolidation loop (via vektor_ingest). It deduplicates redundant memories, resolves contradictions, decays stale facts, and surfaces higher-order patterns. After six months of use, you don't have 1,000 raw memories. You have a compressed, accurate model of how you think about your work.

This is what makes VEKTOR different from a note-taking app connected to an LLM. The knowledge gets cleaner with use, not noisier.


Part 1 — The three memory zones (and why separation matters)

Before installing anything, understand the data architecture. VEKTOR organises memory into namespaces — isolated partitions with different access rules and encryption contexts.

MEMORY ARCHITECTURE
──────────────────────────────────────────────────────────────────────
NAMESPACE: "private"
  Encryption: AES-256, key from your passphrase + PBKDF2
  Contents:  personal preferences, context, private notes
  Access:    explicit namespace reference only
  Example:   "I prefer deploy windows on Tuesday evenings"

NAMESPACE: "credentials"  (via cloak_passport vault)
  Encryption: AES-256 separate vault, never appears in recall results
  Contents:  API keys, SSH credentials, OAuth tokens, secrets
  Access:    explicit get/set/list only — values never exposed in search
  Example:   vps-vektor (SSH key), anthropic-key, x-bearer-token

NAMESPACE: "work:{project}"
  Encryption: AES-256
  Contents:  project decisions, architecture notes, technical context
  Access:    scoped to project queries
  Example:   "work:roy-bot", "work:rachel-agent", "work:vektormemory"

NAMESPACE: "public"  (or no namespace)
  Encryption: none
  Contents:  general knowledge, non-sensitive patterns, tool configs
  Access:    default recall results
  Example:   "pgvector has better latency under 1M vectors than Qdrant"

Why does this matter in practice? When you ask "what do I know about the trading bot?" you get work:roy-bot memories — not your private notes, not your credentials. When you run a general query like "what LLM providers do I have configured?", the credentials namespace never bleeds into the answer. The vault and the memory store are separate subsystems that never cross.

This is the architectural gap OpenClaw and Hermes never filled. They had capability. They had no boundary enforcement.


Part 2 — The three connection paths (and which to pick)

Before the step-by-step, you need to decide how Claude physically connects to VEKTOR. Three viable paths exist in 2026:

PATH COMPARISON
─────────────────────────────────────────────────────────────────────────
PATH 1 — Claude Desktop via MCP (recommended starting point)
  How:   Install VEKTOR globally via npm. Run setup wizard. VEKTOR 
         registers as MCP server in claude_desktop_config.json. 
         Claude Desktop picks it up on next launch.
  Cost:  5 minutes setup.
  Best:  Daily use, personal knowledge base, credential vault,
         web traversal, SSH automation with approval gates.
  Limit: Tied to Claude Desktop being open.

PATH 2 — Direct API calls (for artifact/app builders)
  How:   Call api.anthropic.com directly with VEKTOR tools in
         mcp_servers parameter. No Desktop required.
  Cost:  10 minutes to wire up first call.
  Best:  Building AI-powered apps that need persistent context,
         multi-session workflows, automated pipelines.
  Limit: You manage the API key and request lifecycle yourself.

PATH 3 — Hybrid (MCP for interactive + API for automation)
  How:   Desktop MCP for daily use; separate API key for cron/scheduler.
         Both write to same VEKTOR database — shared memory.
  Cost:  15 minutes. Two config files.
  Best:  Power users who need both interactive and automated modes.
  Limit: Two credential sets to manage (but both through cloak_passport).

My recommendation: start with Path 1. It's the fastest to set up, produces immediate value in your daily Claude sessions, and you can debug it when things go wrong. When you hit "I need this to run at 3 AM without Desktop open," migrate the automation layer to Path 2 while keeping Path 1 for interactive work. The memory database is shared — context from your interactive sessions is available to automated scripts, and vice versa.

The rest of this tutorial assumes Path 1. I'll note where Paths 2 and 3 diverge.


Part 3 — Step by step setup

3.1 — Prerequisites

You need:

- Node.js (a recent LTS) and npm
- Claude Desktop installed and signed in
- A VEKTOR Slipstream licence key
- An Anthropic API key (plus keys for any optional extra providers)

If you've never used Claude Desktop, open it and have one conversation first — this tutorial assumes you can start a session.

3.2 — Install VEKTOR globally

npm install -g vektor-slipstream

Verify the install:

vektor --version
# vektor-slipstream v1.4.9

3.3 — Activate your licence and run the setup wizard

vektor activate YOUR-LICENCE-KEY-HERE

The wizard walks through five steps:

VEKTOR SETUP WIZARD
─────────────────────────────────────────────────
[1/5] Licence verified ✓

[2/5] LLM Provider configuration
      Primary provider: anthropic
      Enter your Anthropic API key: sk-ant-...
      (Stored encrypted — not written to any config file)

[3/5] Additional providers (optional)
      OpenAI API key: (enter or skip)
      MiniMax API key: (enter or skip)

[4/5] Claude Desktop MCP setup
      Found Claude Desktop at: C:\Users\you\AppData\Roaming\Claude\
      Configure VEKTOR as MCP server? [Y/n]: Y
      ✓ claude_desktop_config.json updated

[5/5] Playwright browser (for web traversal tools)
      Install Playwright headless browser? [Y/n]: Y
      ✓ Playwright installed

Setup complete. Restart Claude Desktop to activate VEKTOR tools.
─────────────────────────────────────────────────

The wizard writes claude_desktop_config.json safely via PowerShell on Windows or direct write on macOS/Linux. Never edit this file manually — the JSON structure is sensitive to trailing commas and whitespace that text editors introduce silently.

What the config looks like after wizard completes:

{
  "mcpServers": {
    "vektor": {
      "command": "node",
      "args": ["/path/to/vektor-slipstream/vektor.mjs", "mcp"],
      "env": {
        "VEKTOR_LICENCE_KEY": "YOUR-KEY-HERE",
        "CLOAK_PROJECT_PATH": "/path/to/vektor-slipstream"
      }
    }
  }
}

3.4 — Verify tools are loading in Claude Desktop

Restart Claude Desktop. In a new conversation, look for the tools icon (⚙️ or the hammer icon depending on your version). VEKTOR should appear as a connected MCP server with 49 tools available.

Quick verification — ask Claude:

What VEKTOR tools do you have access to?

Expected: a list that includes vektor_store, vektor_recall, vektor_recall_rrf, vektor_status, cloak_fetch, cloak_ssh_exec, cloak_passport, and others. If you see 49 tools, you're live.

Run the health check:

Run vektor_status and tell me what it shows.

Expected:

Memory count: 0 (new installation)
Namespace: default
Database: healthy
Last store: never
Licence: active

3.5 — The SKILL.md system: the routing brain

Here's the part most tutorials skip — and it's the difference between an agent that interrupts you constantly and one that knows what to do.

VEKTOR's cloak_cortex tool scans your project directories and builds a token-aware skill index. Any .md file in a project directory or a designated skills folder becomes part of how Claude routes requests — which tools to use, what not to touch, how to behave in specific contexts.

Create your personal harness skill file. This is your CLAUDE.md equivalent — the file that tells Claude how to behave in every session:

mkdir -p ~/.claude/skills/personal-harness

Create ~/.claude/skills/personal-harness/SKILL.md:

---
name: personal-harness
description: Personal knowledge and workflow rules. Load this on every session
  start. Defines memory namespaces, credential access patterns, and what
  requires approval before executing.
---

# Personal Harness — Session Rules

## Session start (always, silently)

On every session start, run without announcing:
1. `vektor_status` — health check
2. `vektor_recall` with query matching the user's first message topic
3. Load any relevant project namespace memories

Report only if something is wrong. Otherwise just use the context.

## Memory namespaces

- Personal preferences and context → namespace: "private"
- Project-specific decisions → namespace: "work:{project-name}"
- General knowledge and patterns → no namespace (default)
- Credentials and secrets → cloak_passport vault ONLY (never vektor_store)

## Credential rules

NEVER store API keys, passwords, or SSH credentials via vektor_store.
ALL secrets go through cloak_passport:

  cloak_passport set <key-name>      ← store
  cloak_passport get <key-name>      ← retrieve
  cloak_passport list                ← see what exists (names only)

If I ask "what's my API key for X?", retrieve via cloak_passport get,
not from memory recall results.

## Approval gates

The following ALWAYS require explicit confirmation before executing:
- Any cloak_ssh_exec with write, delete, restart, or rm commands
- Any email or message sent on my behalf
- Any file deleted or overwritten
- Any external API call that modifies state (POST/PUT/DELETE)

Read-only operations (grep, cat, ls, curl GET, log reads) → proceed without asking.

## VPS access pattern

Host: [your-server-ip]
User: server
Key: stored in cloak_passport as "vps-vektor"
Pattern:
  cloak_ssh_exec({ host: "your-server-ip", username: "server",
                   keyName: "vps-vektor", command: "..." })

## Memory at session end

When conversation winds down, store a consolidated note:
  vektor_store({
    content: "Session summary: [what was decided/changed/pending]",
    namespace: "work:{relevant-project}",
    tags: ["session", "handover"],
    importance: 5
  })

This skill file is the equivalent of CLAUDE.md in the Obsidian setup. Claude reads it, loads the rules, and operates within them — without you having to re-explain your setup every conversation.
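The read-only/write split in the approval gates section comes down to classifying commands before execution. A sketch of that classification logic — the pattern list is my assumption derived from the rules above, not VEKTOR's actual implementation:

```javascript
// Commands matching any of these patterns are treated as state-changing
// and must be queued for approval; everything else proceeds read-only.
const WRITE_PATTERNS = [
  /\brm\b/,                                  // file deletion
  /\bsystemctl\s+(restart|stop|start)\b/,    // service control
  /\bdelete\b/,
  />\s*\S/,                                  // shell output redirection
];

function requiresApproval(command) {
  return WRITE_PATTERNS.some((re) => re.test(command));
}
```

So `sudo journalctl -u roy-bot | tail -50` proceeds silently, while `sudo systemctl restart roy-bot` gets queued. Real-world classifiers need to handle quoting and command chaining (`&&`, `;`) too — regexes alone are a floor, not a ceiling.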

3.6 — Store your first credentials

Before anything else, move your API keys out of .env files and into the encrypted vault:

# In Claude Desktop — ask Claude to run:

Store my Anthropic API key in the credential vault as "anthropic-key"
Store my VPS SSH key content as "vps-vektor"
Store my OpenAI key as "openai-key"

Claude will call:

// What Claude runs under the hood
await cloak_passport({ action: "set", key: "anthropic-key", value: "sk-ant-..." })
await cloak_passport({ action: "set", key: "vps-vektor", value: "-----BEGIN..." })

Verify they're stored:

await cloak_passport({ action: "list" })
// → ["anthropic-key", "vps-vektor", "openai-key"]
// Values are never shown in list — names only

Your .env file can now be deleted or emptied. Credentials live in an AES-256 encrypted SQLite vault that only VEKTOR can access with your passphrase-derived key.

3.7 — Store your first memories

Have a project in flight? Give VEKTOR the context it needs to help immediately:

Tell VEKTOR:
- My main project right now is [project name]
- We're using [stack/tech decisions]
- The last three things I worked on were [list]
- My preferred deploy window is [time]
- I use [LLM providers] for different task types

Claude will translate this into structured memory calls:

await vektor_store({
  content: "Primary project: Roy trading bot. Stack: Node.js, PostgreSQL, 
            Anthropic API. Currently migrating from OpenClaw to direct API.",
  namespace: "work:roy-bot",
  tags: ["project", "stack", "context"],
  importance: 8
})

await vektor_store({
  content: "Deploy preference: Tuesday evenings, never Friday. VPS is 
            production — always use approval gate before write commands.",
  namespace: "private",
  tags: ["preferences", "deployment"],
  importance: 7
})

Three sessions from now, you won't need to repeat any of this. Claude will recall it the moment a relevant topic comes up.

3.8 — Setup verification

Test the full loop:

You: What do you know about my current projects?

Expected: Claude runs vektor_recall silently, retrieves project context, answers with specifics — without you having to re-explain your setup.

You: Can you check the VPS logs for errors?

Expected: Claude reads the personal-harness SKILL.md, sees the VPS access pattern, calls cloak_ssh_exec with the right parameters (key from vault, not hardcoded), and returns log output — all without asking you for the VPS IP, username, or key location.

If both of those work, the harness is running.


Part 4 — The real workflows: what this looks like in daily use

Real workflow 1: The research → decision → memory pipeline

Suppose you're evaluating two approaches to rate-limiting your API and want to make a documented decision.

You: I need to decide between token bucket and sliding window
     rate limiting for the Roy bot. What do we know in memory?

Claude runs vektor_recall_rrf — dual-channel search across both keyword and semantic dimensions — and surfaces whatever prior context exists in the namespace: earlier benchmarks, past rate-limit incidents, related architecture notes.

It reports what it found, with context. You discuss. You decide on token bucket.

You: Decision made: token bucket rate limiting. Simpler to reason about,
     predictable burst behaviour, fits the current traffic profile.
     Store this and link it to the Roy bot project.

Claude stores:

await vektor_store({
  content: "Rate limiting decision (Roy bot): Token bucket selected over 
            sliding window. Rationale: simpler burst reasoning, predictable 
            behaviour, lower implementation complexity. Traffic profile 
            doesn't justify sliding window precision at current scale.",
  namespace: "work:roy-bot",
  tags: ["architecture", "rate-limiting", "decision"],
  importance: 8
})

Six months later: "Why did we choose token bucket?" — Claude recalls the decision, the rationale, and the date, without you keeping a decision log anywhere.
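For reference, the algorithm the decision selects is small enough to show. A minimal token-bucket limiter — capacity and refill rate here are illustrative numbers, not the Roy bot's config:

```javascript
// Token bucket: requests spend tokens; tokens refill at a steady rate.
// Bursts up to `capacity` are allowed, then throughput settles to refillPerSec.
class TokenBucket {
  constructor(capacity, refillPerSec) {
    this.capacity = capacity;
    this.tokens = capacity;       // start full — allows an initial burst
    this.refillPerSec = refillPerSec;
    this.last = Date.now();
  }

  tryRemove(n = 1) {
    const now = Date.now();
    const elapsed = (now - this.last) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerSec);
    this.last = now;
    if (this.tokens >= n) {
      this.tokens -= n;
      return true;  // request allowed
    }
    return false;   // rate limited
  }
}

const limiter = new TokenBucket(10, 2); // burst of 10, sustained 2 req/s
if (limiter.tryRemove()) {
  // proceed with the API call
}
```

The "simpler burst reasoning" rationale from the stored decision is visible here: the entire state is one number, versus a sliding window's timestamp log.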


Real workflow 2: Web research without prompt injection risk

The Rachel bot originally fetched web content and fed it directly into prompts. That's a prompt injection surface.

Here's the correct pattern with VEKTOR:

You: Research the current state of pgvector performance vs Qdrant
     for datasets under 5M vectors. Use web search.

Claude:

  1. Calls cloak_fetch_smart — checks target sites for llms.txt agent-native access first
  2. If no llms.txt, falls back to cloak_fetch with a mature browser identity
  3. Wraps all retrieved content in <untrusted_source> tags before passing to the model
  4. Extracts relevant information only — never executes instructions found in page content
  5. Stores key findings:
await vektor_store({
  content: "pgvector benchmark finding (May 2026): Sub-50ms p99 latency 
            at 1M vectors with IVFFlat index, HNSW shows 30% improvement 
            at cost of higher build time. Qdrant shows marginal advantage 
            only above 5M vectors. Source: multiple benchmarks via web.",
  namespace: "work:roy-bot",
  tags: ["pgvector", "performance", "research"],
  importance: 6
})

The difference from the OpenClaw pattern: the web content was treated as untrusted data, not instruction. The injection surface is explicitly managed. The findings persist in memory so you don't pay to research the same question twice.


Real workflow 3: Server automation with approval gates and rollback

This is where the harness earns its architecture. The Rachel bot's OpenClaw predecessor would execute SSH commands immediately, with no gate. Here's what correct looks like:

You: The Roy bot hasn't posted in 6 hours. Check what's wrong
     and prepare a fix but don't execute yet.

Claude:

  1. Calls cloak_ssh_exec with read-only commands (no approval needed per SKILL.md):
// Read-only — proceeds automatically
await cloak_ssh_exec({
  host: "your-server-ip",
  username: "server", 
  keyName: "vps-vektor",  // ← retrieved from vault, not hardcoded
  command: "sudo journalctl -u roy-bot --since '6 hours ago' | tail -50"
})
  2. Reads logs. Finds: service crashed due to OOM at 02:17. Node process not restarted.
  3. Calls cloak_ssh_plan to queue the fix — not execute it:
await cloak_ssh_plan({
  host: "your-server-ip",
  username: "server",
  keyName: "vps-vektor",
  commands: [
    "sudo systemctl restart roy-bot",   // ← write op — queued
    "sudo systemctl status roy-bot"     // ← verify
  ]
})
  4. Reports back:
Found the issue. Roy bot OOM'd at 02:17 — the market data fetch was 
loading the full instrument list into memory on each cycle. 

Prepared fix: restart the service. 
Plan ID: plan_a8f2c1 — ready for your approval.

Also stored the OOM pattern in memory so we can track frequency.

You say: approve. Claude calls cloak_ssh_approve. Service restarts. Every step logged with a rollback_key.

The bot went dark for 6 hours because of a memory leak. You found the cause, fixed it, and the fix is logged in VEKTOR memory tagged as a known failure pattern — so next time the agent checks memory first before escalating to you.

BEFORE (OpenClaw pattern)
─────────────────────────────────────────────────────────
Noticed issue → asked agent to fix → agent runs restart
command immediately → no gate, no log, no rollback key
Discovery if it makes things worse: next human check

AFTER (VEKTOR pattern)
─────────────────────────────────────────────────────────
Noticed issue → agent reads logs (auto, no approval) 
→ agent queues fix → you review plan → you approve
→ rollback_key generated for every write operation
→ incident stored in memory as known failure pattern
→ next OOM: agent recalls fix, proposes same plan faster

Part 5 — The memory consolidation loop: your knowledge gets smarter over time

VEKTOR's vektor_ingest does something no other persistent memory tool does: it runs active consolidation on stored memories.

Every week or two (or whenever you ask), run:

You: Run a memory consolidation pass on the work:roy-bot namespace.
     Identify contradictions, stale facts, and patterns worth surfacing.

Claude runs vektor_ingest, which scans the namespace, resolves contradictions, decays stale facts, and surfaces recurring patterns — producing a report like this:

CONSOLIDATION PASS — work:roy-bot
─────────────────────────────────────────────────────────
Memories scanned:          47
Contradictions found:       2
  - Memory 12: "Using OpenClaw for Claude access" 
    conflicts with 
    Memory 38: "Migrated to VEKTOR direct API"
    Resolution: SESSION 38 supersedes SESSION 12
    
  - Memory 19: "Deploying Tuesday evenings"
    conflicts with
    Memory 44: "New deploy window: Thursday mornings"
    Resolution: SESSION 44 supersedes SESSION 19

Stale facts (>90 days, not reinforced):  3
  - "Watching Qdrant 2.0 release" (resolved — decided on pgvector)
  → Marked for decay

Patterns surfaced:
  - OOM events: 3 incidents in 4 months. Pattern: always during
    market-open data fetch cycle. Suggest architecture review.
  - Rate limit hits: 7 events, all between 09:00-09:30 UTC.
    Consistent enough to be worth an explicit backoff rule.

Memories after consolidation: 41 (6 compressed/merged)
─────────────────────────────────────────────────────────

You now have a memory store that got more accurate and more useful over time — not by adding more information, but by removing noise and surfacing signal.
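The contradiction-resolution rule visible in the report ("SESSION 44 supersedes SESSION 19") is recency-wins within a topic. A minimal sketch of that rule — the record shape and topic key are my assumptions for illustration:

```javascript
// Supersede rule: when two memories cover the same topic,
// keep the more recently stored one; older facts decay.
function consolidate(memories) {
  const byTopic = new Map();
  for (const m of memories) {
    const prev = byTopic.get(m.topic);
    if (!prev || m.storedAt > prev.storedAt) byTopic.set(m.topic, m);
  }
  return [...byTopic.values()];
}

const memories = [
  { topic: "deploy-window", storedAt: 12, content: "Deploying Tuesday evenings" },
  { topic: "deploy-window", storedAt: 44, content: "New deploy window: Thursday mornings" },
  { topic: "llm-access",    storedAt: 38, content: "Migrated to VEKTOR direct API" },
];
// consolidate(memories) keeps the Thursday memory and drops the Tuesday one.
```

The real consolidation loop does more (semantic dedup, importance decay, pattern mining), but recency-supersedes is the core invariant that keeps old decisions from contradicting new ones.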


Part 6 — Security, cost, and governance

Building an agent harness with real credentials and real server access has real implications. This section isn't optional reading.

6.1 — The actual risk model

A VEKTOR-connected Claude agent with full configuration can read and write your memory store, retrieve secrets from the credential vault, execute SSH commands on your servers, and fetch arbitrary web content. Each of those is a real capability with a real blast radius.

Most of these risks are governed by the SKILL.md you wrote in 3.5 — the approval-gate rules are enforced at the tool level, not just as instructions. cloak_ssh_plan physically queues commands that don't execute until cloak_ssh_approve is called. This is not a prompt asking the agent to be careful. It's an API that requires a second call.

6.2 — What the ClawHub disaster teaches us

When we covered the ClawHub marketplace in Part Two of this series, the root cause was trust boundary collapse: external content (fake skills) was given the same access level as trusted system configuration. The agent had no way to distinguish "legitimate skill from developer" from "malicious payload from threat actor."

VEKTOR's trust model is explicit:

TRUST HIERARCHY
──────────────────────────────────────────────────────────────────
LEVEL 1 — SKILL.md files (you wrote these)
  Trust: full. These are your operational rules.
  Location: ~/.claude/skills/ or project directories
  Access: read by cloak_cortex, applied as policy

LEVEL 2 — Stored memories (agent + you wrote these)
  Trust: high. Namespace-scoped. Encrypted. No external write path.
  Access: vektor_recall / vektor_store — internal only

LEVEL 3 — cloak_passport vault (you wrote these)
  Trust: full, separately encrypted. Never appears in recall results.
  Access: explicit get/set/list calls only

LEVEL 4 — External web content (untrusted by definition)
  Trust: zero until processed. Wrapped as <untrusted_source>.
  Access: read-only. Never executed as instruction.

LEVEL 5 — External "skills" or packages (not a VEKTOR concept)
  VEKTOR has no marketplace. No third-party skill installs.
  This attack surface does not exist in this architecture.

The ClawHub attack vector — malicious third-party skills with C2 infrastructure — simply doesn't exist in VEKTOR because there's no skill marketplace. Your SKILL.md files are text files you wrote. Nothing else loads.

6.3 — Cost model: what this actually costs to run

Unlike OpenClaw's subscription-arbitrage model (which blew up), VEKTOR runs on direct API billing. What that means in practice:

TYPICAL COST BREAKDOWN — personal harness daily use
────────────────────────────────────────────────────────────────
Interactive sessions (3-5/day, ~2,000 tokens each):
  ~30,000 tokens/day × $3/MTok (claude-sonnet) = ~$0.09/day

Memory recall operations (automatic, small):
  ~50 operations/day × ~200 tokens = ~10,000 tokens
  = ~$0.03/day

Web fetch + research (occasional):
  ~5 fetches/day × ~3,000 tokens = ~15,000 tokens
  = ~$0.045/day

Total typical daily cost:               ~$0.17/day (~$5/month)
Total with heavy research days:         ~$0.50/day (~$15/month)

Compare to OpenClaw community reports:  $300-750/month
Compare to one blow-out incident:       $200+ in a single day

The circuit breaker prevents blow-outs:

CIRCUIT BREAKER DEFAULTS
────────────────────────────────────────────
Hard spend limit per session:  configurable (default $5)
Hard call limit per session:   configurable (default 200)
On limit hit:                  HALT + notify (not silent death)
Notification path:             console + optional Slack/webhook

Set your limits on first run. A session that hits the call limit doesn't silently hang — it stops, reports what happened, and waits for you to continue or abort.
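The breaker logic is worth seeing concretely. A sketch matching the defaults in the table ($5 spend, 200 calls, halt-and-notify) — class and hook names are illustrative, not VEKTOR's internals:

```javascript
// Circuit breaker: count every LLM call and its cost; on breach,
// notify loudly and halt by throwing — never die silently.
class CircuitBreaker {
  constructor({ maxSpendUsd = 5, maxCalls = 200, onHalt = console.error } = {}) {
    this.maxSpendUsd = maxSpendUsd;
    this.maxCalls = maxCalls;
    this.onHalt = onHalt; // could post to Slack/webhook instead
    this.spend = 0;
    this.calls = 0;
  }

  record(costUsd) {
    this.calls += 1;
    this.spend += costUsd;
    if (this.calls > this.maxCalls || this.spend > this.maxSpendUsd) {
      this.onHalt(`HALT: ${this.calls} calls, $${this.spend.toFixed(2)} spent this session`);
      throw new Error("circuit breaker tripped");
    }
  }
}
```

The crucial design choice is the throw: a runaway loop hits an exception it cannot swallow and surfaces to you, instead of a process that hangs (or keeps billing) in the background.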

6.4 — Multi-LLM routing: not locked to one provider

Because VEKTOR calls providers directly via API, you're not tied to Claude for everything. vektor_providers shows what's configured:

await vektor_providers()
// → anthropic (claude-sonnet-4-20250514, claude-opus-4-20250514)
// → openai (gpt-4o, gpt-4o-mini)
// → minimax (abab6.5s)
// → nvidia-nim (llama-3.1-70b)

Different tasks route to different providers:

TASK                          OPTIMAL PROVIDER
────────────────────────────────────────────────────────
Complex reasoning, analysis   claude-opus-4 (best quality)
Code generation, daily work   claude-sonnet-4 (fast + accurate)
High-volume summarisation     minimax-abab6.5s (lowest cost/token)
Vision + image analysis       gpt-4o (strong multimodal)
Latency-critical automation   nvidia-nim (near-local speed)

When Anthropic has an outage — which happens — VEKTOR fails over automatically to the next configured provider. The memory context travels with the request. Your session continues with a different model, not a silent failure.

This is what the OpenClaw/Hermes era couldn't deliver: provider resilience built into the architecture, not bolted on as a workaround.
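The failover behaviour described above reduces to a simple ordered-fallback loop. A sketch — the provider object shape is an assumption for illustration:

```javascript
// Try providers in priority order; on error, fall through to the next.
// Only if every configured provider fails does the error reach the caller.
async function withFailover(providers, request) {
  let lastErr;
  for (const provider of providers) {
    try {
      return await provider.call(request); // success — stop here
    } catch (err) {
      lastErr = err; // provider down or rate-limited — try the next one
    }
  }
  throw lastErr; // all providers exhausted
}
```

Because the memory context rides inside `request`, the fallback model receives the same recalled context the primary would have — which is what "your session continues" means in practice.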


Part 7 — What comes next: harness evolution

This setup is the foundation. Common evolutions as your use deepens:

Add project-specific SKILL.md files for each major project. A work:roy-bot/SKILL.md that tells Claude exactly how the bot is structured, what the known failure modes are, and which files are sensitive. Claude loads it automatically when the topic comes up.

Migrate automation to Path 2 when you need things running at 3 AM without Desktop open. The same cloak_passport vault and VEKTOR memory database is accessible via direct API call with mcp_servers parameter. Memory from your interactive sessions is available to your automation scripts.

Add debrief patterns to your SKILL.md for incidents. When the Roy bot crashes, the session that debugs it automatically stores a structured incident memory — cause, fix, time-to-resolution — without you having to write it up. Six months of incident memories become a failure pattern library.

Session start hooks via SKILL.md — the vektor_status + initial vektor_recall pattern in your harness skill means every session starts with relevant context pre-loaded. As your memory database grows past 500 entries, add a vektor_briefing call that summarises the most recent 7 days of stored context before the first response.

Team memory with shared namespaces — if you're working with other developers, VEKTOR supports a shared namespace model where both parties can read/write a common memory store. Decisions, architecture choices, and known failure patterns become team knowledge, not individual memory.


Closing

You now have a harness where:

Memory persists. Every decision, preference, and failure pattern survives session close and is available in the next conversation without re-explanation.

Credentials are isolated. API keys, SSH credentials, and OAuth tokens live in an AES-256 encrypted vault that never appears in prompt context, never gets committed to git, and never shows up in recall results.

Skills route intelligently. SKILL.md files tell Claude how to behave for your specific setup — VPS access patterns, approval rules, namespace routing — without you repeating the same briefing every session.

Web content is treated as untrusted. Everything fetched by cloak_fetch is wrapped as untrusted data before being passed to a model. The prompt injection surface that took down Rachel is explicitly managed.

Irreversible actions require approval. cloak_ssh_plan queues. cloak_ssh_approve executes. The gate is in the API, not in a prompt instruction the model might ignore under pressure.

Cost is bounded and predictable. Circuit breakers halt runaway loops before they become incidents. You pay roughly $5/month for daily use. The bill doesn't spike 47× overnight.

This is what the agentic age looks like when it's built correctly — not as a demo that works once, but as infrastructure that accumulates value every day you use it.

The difference between what Marcus lost in one night and what this harness costs in a month is not a feature set. It's an architecture.


VEKTOR Slipstream SDK — vektormemory.com

npm install -g vektor-slipstream



Tags: AI Agents · Personal Knowledge Management · Claude MCP · LLM Memory · Developer Tools · Node.js · AES-256 · Second Brain · VEKTOR · Automation